Showing 1–8 of 8 results for author: Deitke, M

  1. arXiv:2307.05663  [pdf, other]

    cs.CV cs.AI

    Objaverse-XL: A Universe of 10M+ 3D Objects

    Authors: Matt Deitke, Ruoshi Liu, Matthew Wallingford, Huong Ngo, Oscar Michel, Aditya Kusupati, Alan Fan, Christian Laforte, Vikram Voleti, Samir Yitzhak Gadre, Eli VanderBilt, Aniruddha Kembhavi, Carl Vondrick, Georgia Gkioxari, Kiana Ehsani, Ludwig Schmidt, Ali Farhadi

    Abstract: Natural language processing and 2D vision models have attained remarkable proficiency on many tasks primarily by escalating the scale of training data. However, 3D vision tasks have not seen the same progress, in part due to the challenges of acquiring high-quality 3D data. In this work, we present Objaverse-XL, a dataset of over 10 million 3D objects. Our dataset comprises deduplicated 3D objects…

    Submitted 11 July, 2023; originally announced July 2023.

  2. arXiv:2212.08051  [pdf, other]

    cs.CV cs.AI cs.GR cs.RO

    Objaverse: A Universe of Annotated 3D Objects

    Authors: Matt Deitke, Dustin Schwenk, Jordi Salvador, Luca Weihs, Oscar Michel, Eli VanderBilt, Ludwig Schmidt, Kiana Ehsani, Aniruddha Kembhavi, Ali Farhadi

    Abstract: Massive data corpora like WebText, Wikipedia, Conceptual Captions, WebImageText, and LAION have propelled recent dramatic progress in AI. Large neural models trained on such datasets produce impressive results and top many of today's benchmarks. A notable omission within this family of large-scale datasets is 3D data. Despite considerable interest and potential applications in 3D vision, datasets…

    Submitted 15 December, 2022; originally announced December 2022.

    Comments: Website: objaverse.allenai.org

  3. arXiv:2212.04819  [pdf, other]

    cs.RO cs.AI cs.CV

    Phone2Proc: Bringing Robust Robots Into Our Chaotic World

    Authors: Matt Deitke, Rose Hendrix, Luca Weihs, Ali Farhadi, Kiana Ehsani, Aniruddha Kembhavi

    Abstract: Training embodied agents in simulation has become mainstream for the embodied AI community. However, these agents often struggle when deployed in the physical world due to their inability to generalize to real-world environments. In this paper, we present Phone2Proc, a method that uses a 10-minute phone scan and conditional procedural generation to create a distribution of training scenes that are…

    Submitted 8 December, 2022; originally announced December 2022.

    Comments: https://allenai.org/project/phone2proc

  4. arXiv:2210.06849  [pdf, other]

    cs.CV

    Retrospectives on the Embodied AI Workshop

    Authors: Matt Deitke, Dhruv Batra, Yonatan Bisk, Tommaso Campari, Angel X. Chang, Devendra Singh Chaplot, Changan Chen, Claudia Pérez D'Arpino, Kiana Ehsani, Ali Farhadi, Li Fei-Fei, Anthony Francis, Chuang Gan, Kristen Grauman, David Hall, Winson Han, Unnat Jain, Aniruddha Kembhavi, Jacob Krantz, Stefan Lee, Chengshu Li, Sagnik Majumder, Oleksandr Maksymets, Roberto Martín-Martín, Roozbeh Mottaghi, et al. (14 additional authors not shown)

    Abstract: We present a retrospective on the state of Embodied AI research. Our analysis focuses on 13 challenges presented at the Embodied AI Workshop at CVPR. These challenges are grouped into three themes: (1) visual navigation, (2) rearrangement, and (3) embodied vision-and-language. We discuss the dominant datasets within each theme, evaluation metrics for the challenges, and the performance of state-of…

    Submitted 4 December, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

  5. arXiv:2206.06994  [pdf, other]

    cs.AI cs.CV cs.RO

    ProcTHOR: Large-Scale Embodied AI Using Procedural Generation

    Authors: Matt Deitke, Eli VanderBilt, Alvaro Herrasti, Luca Weihs, Jordi Salvador, Kiana Ehsani, Winson Han, Eric Kolve, Ali Farhadi, Aniruddha Kembhavi, Roozbeh Mottaghi

    Abstract: Massive datasets and high-capacity models have driven many recent advancements in computer vision and natural language understanding. This work presents a platform to enable similar success stories in Embodied AI. We propose ProcTHOR, a framework for procedural generation of Embodied AI environments. ProcTHOR enables us to sample arbitrarily large datasets of diverse, interactive, customizable, an…

    Submitted 14 June, 2022; originally announced June 2022.

    Comments: ProcTHOR website: https://procthor.allenai.org

  6. arXiv:2103.16544  [pdf, other]

    cs.CV cs.RO

    Visual Room Rearrangement

    Authors: Luca Weihs, Matt Deitke, Aniruddha Kembhavi, Roozbeh Mottaghi

    Abstract: There has been significant recent progress in the field of Embodied AI with researchers developing models and algorithms enabling embodied agents to navigate and interact within completely unseen environments. In this paper, we propose a new dataset and baseline models for the task of Rearrangement. We particularly focus on the task of Room Rearrangement: an agent begins by exploring a room and…

    Submitted 30 March, 2021; originally announced March 2021.

    Comments: CVPR 2021 - Oral Presentation

  7. arXiv:2004.06799  [pdf, other]

    cs.CV cs.RO

    RoboTHOR: An Open Simulation-to-Real Embodied AI Platform

    Authors: Matt Deitke, Winson Han, Alvaro Herrasti, Aniruddha Kembhavi, Eric Kolve, Roozbeh Mottaghi, Jordi Salvador, Dustin Schwenk, Eli VanderBilt, Matthew Wallingford, Luca Weihs, Mark Yatskar, Ali Farhadi

    Abstract: Visual recognition ecosystems (e.g. ImageNet, Pascal, COCO) have undeniably played a prevailing role in the evolution of modern computer vision. We argue that interactive and embodied visual AI has reached a stage of development similar to visual recognition prior to the advent of these ecosystems. Recently, various synthetic environments have been introduced to facilitate research in embodied AI.…

    Submitted 14 April, 2020; originally announced April 2020.

    Comments: CVPR 2020

  8. arXiv:1712.05474  [pdf, other]

    cs.CV cs.AI cs.LG

    AI2-THOR: An Interactive 3D Environment for Visual AI

    Authors: Eric Kolve, Roozbeh Mottaghi, Winson Han, Eli VanderBilt, Luca Weihs, Alvaro Herrasti, Matt Deitke, Kiana Ehsani, Daniel Gordon, Yuke Zhu, Aniruddha Kembhavi, Abhinav Gupta, Ali Farhadi

    Abstract: We introduce The House Of inteRactions (THOR), a framework for visual AI research, available at http://ai2thor.allenai.org. AI2-THOR consists of near photo-realistic 3D indoor scenes, where AI agents can navigate in the scenes and interact with objects to perform tasks. AI2-THOR enables research in many different domains including but not limited to deep reinforcement learning, imitation learning,…

    Submitted 26 August, 2022; v1 submitted 14 December, 2017; originally announced December 2017.