droidlet: modular, heterogenous, multi-modal agents
Authors:
Anurag Pratik,
Soumith Chintala,
Kavya Srinet,
Dhiraj Gandhi,
Rebecca Qian,
Yuxuan Sun,
Ryan Drew,
Sara Elkafrawy,
Anoushka Tiwari,
Tucker Hart,
Mary Williamson,
Abhinav Gupta,
Arthur Szlam
Abstract:
In recent years, there have been significant advances in building end-to-end Machine Learning (ML) systems that learn at scale. But most of these systems are (a) isolated (perception, speech, or language only) and (b) trained on static datasets. In robotics, on the other hand, large-scale learning has always been difficult: supervision is hard to gather, and real-world physical interactions are expensive. In this work we introduce and open-source droidlet, a modular, heterogeneous agent architecture and platform. It allows us to exploit both large-scale static datasets in perception and language and the sophisticated heuristics often used in robotics, and it provides tools for interactive annotation. Furthermore, it brings together perception, language, and action on one platform, providing a path towards agents that learn from the richness of real-world interactions.
Submitted 25 January, 2021;
originally announced January 2021.