Showing 1–6 of 6 results for author: Sarkar, S D

  1. arXiv:2305.17690 [pdf, other]

    cs.CL

    HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language

    Authors: Shantipriya Parida, Idris Abdulmumin, Shamsuddeen Hassan Muhammad, Aneesh Bose, Guneet Singh Kohli, Ibrahim Said Ahmad, Ketan Kotwal, Sayan Deb Sarkar, Ondřej Bojar, Habeebah Adamu Kakudi

    Abstract: This paper presents HaVQA, the first multimodal dataset for visual question-answering (VQA) tasks in the Hausa language. The dataset was created by manually translating 6,022 English question-answer pairs, which are associated with 1,555 unique images from the Visual Genome dataset. As a result, the dataset provides 12,044 gold standard English-Hausa parallel sentences that were translated in a fa…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023 as a long paper (Findings)

  2. arXiv:2304.14880 [pdf, other]

    cs.CV

    SGAligner: 3D Scene Alignment with Scene Graphs

    Authors: Sayan Deb Sarkar, Ondrej Miksik, Marc Pollefeys, Daniel Barath, Iro Armeni

    Abstract: Building 3D scene graphs has recently emerged as a topic in scene representation for several embodied AI applications to represent the world in a structured and rich manner. With their increased use in solving downstream tasks (e.g., navigation and room rearrangement), can we leverage and recycle them for creating 3D maps of environments, a pivotal step in agent operation? We focus on the fundamenta…

    Submitted 26 September, 2023; v1 submitted 28 April, 2023; originally announced April 2023.

    Comments: Accepted at ICCV 2023

  3. arXiv:2107.00887 [pdf, other]

    cs.CV cs.HC

    HO-3D_v3: Improving the Accuracy of Hand-Object Annotations of the HO-3D Dataset

    Authors: Shreyas Hampali, Sayan Deb Sarkar, Vincent Lepetit

    Abstract: HO-3D is a dataset providing image sequences of various hand-object interaction scenarios annotated with the 3D pose of the hand and the object and was originally introduced as HO-3D_v2. The annotations were obtained automatically using an optimization method, 'HOnnotate', introduced in the original paper. HO-3D_v3 provides more accurate annotations for both the hand and object poses thus resultin…

    Submitted 2 July, 2021; originally announced July 2021.

  4. arXiv:2104.14639 [pdf, other]

    cs.CV

    Keypoint Transformer: Solving Joint Identification in Challenging Hands and Object Interactions for Accurate 3D Pose Estimation

    Authors: Shreyas Hampali, Sayan Deb Sarkar, Mahdi Rad, Vincent Lepetit

    Abstract: We propose a robust and accurate method for estimating the 3D poses of two hands in close interaction from a single color image. This is a very challenging problem, as large occlusions and many confusions between the joints may happen. State-of-the-art methods solve this problem by regressing a heatmap for each joint, which requires solving two problems simultaneously: localizing the joints and re…

    Submitted 19 April, 2022; v1 submitted 29 April, 2021; originally announced April 2021.

    Comments: Accepted at CVPR 2022

  5. arXiv:2103.07969 [pdf, other]

    cs.CV cs.AI cs.LG

    Monte Carlo Scene Search for 3D Scene Understanding

    Authors: Shreyas Hampali, Sinisa Stekovic, Sayan Deb Sarkar, Chetan Srinivasa Kumar, Friedrich Fraundorfer, Vincent Lepetit

    Abstract: We explore how a general AI algorithm can be used for 3D scene understanding to reduce the need for training data. More exactly, we propose a modification of the Monte Carlo Tree Search (MCTS) algorithm to retrieve objects and room layouts from noisy RGB-D scans. While MCTS was developed as a game-playing algorithm, we show it can also be used for complex perception problems. Our adapted MCTS algo…

    Submitted 5 May, 2021; v1 submitted 14 March, 2021; originally announced March 2021.

    Comments: To be presented at CVPR 2021

  6. arXiv:2001.02149 [pdf, other]

    cs.CV

    General 3D Room Layout from a Single View by Render-and-Compare

    Authors: Sinisa Stekovic, Shreyas Hampali, Mahdi Rad, Sayan Deb Sarkar, Friedrich Fraundorfer, Vincent Lepetit

    Abstract: We present a novel method to reconstruct the 3D layout of a room (walls, floors, ceilings) from a single perspective view in challenging conditions, by contrast with previous single-view methods restricted to cuboid-shaped layouts. This input view can consist of a color image only, but considering a depth map results in a more accurate reconstruction. Our approach is formalized as solving a constr…

    Submitted 21 July, 2020; v1 submitted 7 January, 2020; originally announced January 2020.