Showing 1–50 of 223 results for author: Pavone, M

  1. Locomotion as Manipulation with ReachBot

    Authors: Tony G. Chen, Stephanie Newdick, Julia Di, Carlo Bosio, Nitin Ongole, Mathieu Lapotre, Marco Pavone, Mark R. Cutkosky

    Abstract: Caves and lava tubes on the Moon and Mars are sites of geological and astrobiological interest but consist of terrain that is inaccessible with traditional robot locomotion. To support the exploration of these sites, we present ReachBot, a robot that uses extendable booms as appendages to manipulate itself with respect to irregular rock surfaces. The booms terminate in grippers equipped with micro…

    Submitted 1 July, 2024; originally announced July 2024.

    Journal ref: Science Robotics 2024

  2. arXiv:2407.00959  [pdf, other]

    cs.AI cs.RO

    Tokenize the World into Object-level Knowledge to Address Long-tail Events in Autonomous Driving

    Authors: Ran Tian, Boyi Li, Xinshuo Weng, Yuxiao Chen, Edward Schmerling, Yue Wang, Boris Ivanovic, Marco Pavone

    Abstract: The autonomous driving industry is increasingly adopting end-to-end learning from sensory inputs to minimize human biases in system design. Traditional end-to-end driving models, however, suffer from long-tail events due to rare or unseen inputs within their training distributions. To address this, we propose TOKEN, a novel Multi-Modal Large Language Model (MM-LLM) that tokenizes the world into ob…

    Submitted 1 July, 2024; originally announced July 2024.

  3. arXiv:2406.15349  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking

    Authors: Daniel Dauner, Marcel Hallgarten, Tianyu Li, Xinshuo Weng, Zhiyu Huang, Zetong Yang, Hongyang Li, Igor Gilitschenski, Boris Ivanovic, Marco Pavone, Andreas Geiger, Kashyap Chitta

    Abstract: Benchmarking vision-based driving policies is challenging. On one hand, open-loop evaluation with real data is easy, but these results do not reflect closed-loop performance. On the other, closed-loop evaluation is possible in simulation, but is hard to scale due to its significant computational demands. Further, the simulators available today exhibit a large domain gap to real data. This has resu…

    Submitted 21 June, 2024; originally announced June 2024.

  4. arXiv:2406.13857  [pdf, other]

    cs.RO

    Martian Exploration of Lava Tubes (MELT) with ReachBot: Scientific Investigation and Concept of Operations

    Authors: Julia Di, Sara Cuevas-Quinones, Stephanie Newdick, Tony G. Chen, Marco Pavone, Mathieu G. A. Lapotre, Mark Cutkosky

    Abstract: As natural access points to the subsurface, lava tubes and other caves have become premier targets of planetary missions for astrobiological analyses. Few existing robotic paradigms, however, are able to explore such challenging environments. ReachBot is a robot that enables navigation in planetary caves by using extendable and retractable limbs to locomote. This paper outlines the potential scien…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: In International Conference on Space Robotics 2024

  5. arXiv:2406.12095  [pdf, other]

    cs.CV cs.AI cs.RO

    DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features

    Authors: Letian Wang, Seung Wook Kim, Jiawei Yang, Cunjun Yu, Boris Ivanovic, Steven L. Waslander, Yue Wang, Sanja Fidler, Marco Pavone, Peter Karkus

    Abstract: We propose DistillNeRF, a self-supervised learning framework addressing the challenge of understanding 3D environments from limited 2D observations in autonomous driving. Our method is a generalizable feedforward model that predicts a rich neural scene representation from sparse, single-frame multi-view camera inputs, and is trained self-supervised with differentiable rendering to reconstruct RGB,…

    Submitted 17 June, 2024; originally announced June 2024.

  6. arXiv:2406.10789  [pdf, other]

    cs.CV

    Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses

    Authors: Zhiwen Fan, Pu Wang, Yang Zhao, Yibo Zhao, Boris Ivanovic, Zhangyang Wang, Marco Pavone, Hao Frank Yang

    Abstract: The increasing rate of road accidents worldwide not only results in significant loss of life but also imposes billions in financial burdens on societies. Current research in traffic crash frequency modeling and analysis has predominantly approached the problem as classification tasks, focusing mainly on learning-based classification or ensemble learning methods. These approaches often overlook the in…

    Submitted 15 June, 2024; originally announced June 2024.

  7. arXiv:2406.01814  [pdf, other]

    cs.RO

    ZAPP! Zonotope Agreement of Prediction and Planning for Continuous-Time Collision Avoidance with Discrete-Time Dynamics

    Authors: Luca Paparusso, Shreyas Kousik, Edward Schmerling, Francesco Braghin, Marco Pavone

    Abstract: The past few years have seen immense progress on two fronts that are critical to safe, widespread mobile robot deployment: predicting uncertain motion of multiple agents, and planning robot motion under uncertainty. However, the numerical methods required on each front have resulted in a mismatch of representation for prediction and planning. In prediction, numerical tractability is usually achiev…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 8 pages, 3 figures, 1 table, submitted to 2024 IEEE International Conference on Robotics and Automation (ICRA)

  8. arXiv:2405.17187  [pdf, other]

    cs.CV cs.AI cs.RO

    Memorize What Matters: Emergent Scene Decomposition from Multitraverse

    Authors: Yiming Li, Zehong Wang, Yue Wang, Zhiding Yu, Zan Gojcic, Marco Pavone, Chen Feng, Jose M. Alvarez

    Abstract: Humans naturally retain memories of permanent elements, while ephemeral moments often slip through the cracks of memory. This selective retention is crucial for robotic perception, localization, and mapping. To endow robots with this capability, we introduce 3D Gaussian Mapping (3DGM), a self-supervised, camera-only offline mapping framework grounded in 3D Gaussian Splatting. 3DGM converts multitr…

    Submitted 29 May, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: Project page: https://3d-gaussian-mapping.github.io; Code and data: https://github.com/NVlabs/3DGM

  9. arXiv:2405.16034  [pdf, other]

    cs.CV

    DiffuBox: Refining 3D Object Detection with Point Diffusion

    Authors: Xiangyu Chen, Zhenzhen Liu, Katie Z Luo, Siddhartha Datta, Adhitya Polavaram, Yan Wang, Yurong You, Boyi Li, Marco Pavone, Wei-Lun Chao, Mark Campbell, Bharath Hariharan, Kilian Q. Weinberger

    Abstract: Ensuring robust 3D object detection and localization is crucial for many applications in robotics and autonomous driving. Recent models, however, face difficulties in maintaining high performance when applied to domains with differing sensor setups or geographic locations, often resulting in poor localization accuracy due to domain shift. To overcome this challenge, we introduce a novel diffusion-…

    Submitted 24 May, 2024; originally announced May 2024.

  10. arXiv:2405.15005  [pdf, other]

    cs.RO

    ReachBot Field Tests in a Mojave Desert Lava Tube as a Martian Analog

    Authors: Tony G. Chen, Julia Di, Stephanie Newdick, Mathieu Lapotre, Marco Pavone, Mark R. Cutkosky

    Abstract: ReachBot is a robot concept for the planetary exploration of caves and lava tubes, which are often inaccessible with traditional robot locomotion methods. It uses extendable booms as appendages, with grippers mounted at the end, to grasp irregular rock surfaces and traverse these difficult terrains. We have built a partial ReachBot prototype consisting of a single boom and gripper, mounted on a tr…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Accepted to the IEEE ICRA Workshop on Field Robotics 2024; 4 pages

  11. arXiv:2405.11139  [pdf, other]

    cs.RO cs.AI cs.LG

    RuleFuser: Injecting Rules in Evidential Networks for Robust Out-of-Distribution Trajectory Prediction

    Authors: Jay Patrikar, Sushant Veer, Apoorva Sharma, Marco Pavone, Sebastian Scherer

    Abstract: Modern neural trajectory predictors in autonomous driving are developed using imitation learning (IL) from driving logs. Although IL benefits from its ability to glean nuanced and multi-modal human driving behaviors from large datasets, the resulting predictors often struggle with out-of-distribution (OOD) scenarios and with traffic rule compliance. On the other hand, classical rule-based predicto…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 9 pages, 3 figures

  12. arXiv:2405.03685  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Language-Image Models with 3D Understanding

    Authors: Jang Hyun Cho, Boris Ivanovic, Yulong Cao, Edward Schmerling, Yue Wang, Xinshuo Weng, Boyi Li, Yurong You, Philipp Krähenbühl, Yan Wang, Marco Pavone

    Abstract: Multi-modal large language models (MLLMs) have shown incredible capabilities in a variety of 2D vision and language tasks. We extend MLLMs' perceptual capabilities to ground and reason about images in 3-dimensional space. To that end, we first develop a large-scale pre-training dataset for 2D and 3D called LV3D by combining multiple existing 2D and 3D recognition datasets under a common task formu…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: Project page: https://janghyuncho.github.io/Cube-LLM

  13. arXiv:2404.03336  [pdf, other]

    cs.RO

    Scaling Population-Based Reinforcement Learning with GPU Accelerated Simulation

    Authors: Asad Ali Shahid, Yashraj Narang, Vincenzo Petrone, Enrico Ferrentino, Ankur Handa, Dieter Fox, Marco Pavone, Loris Roveda

    Abstract: In recent years, deep reinforcement learning (RL) has shown its effectiveness in solving complex continuous control tasks like locomotion and dexterous manipulation. However, this comes at the cost of an enormous amount of experience required for training, exacerbated by the sensitivity of learning efficiency and the policy performance to hyperparameter selection, which often requires numerous tri…

    Submitted 24 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Submitted for publication to IEEE-RAS 23rd International Conference on Humanoid Robots

  14. arXiv:2404.01550  [pdf, other]

    cs.RO eess.SY math.OC

    Perfecting Periodic Trajectory Tracking: Model Predictive Control with a Periodic Observer ($Π$-MPC)

    Authors: Luis Pabon, Johannes Köhler, John Irvin Alora, Patrick Benito Eberhard, Andrea Carron, Melanie N. Zeilinger, Marco Pavone

    Abstract: In Model Predictive Control (MPC), discrepancies between the actual system and the predictive model can lead to substantial tracking errors and significantly degrade performance and reliability. While such discrepancies can be alleviated with more complex models, this often complicates controller design and implementation. By leveraging the fact that many trajectories of interest are periodic, we…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 8 pages, 3 figures, Submitted to the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

  15. arXiv:2403.20309  [pdf, other]

    cs.CV

    InstantSplat: Unbounded Sparse-view Pose-free Gaussian Splatting in 40 Seconds

    Authors: Zhiwen Fan, Wenyan Cong, Kairun Wen, Kevin Wang, Jian Zhang, Xinghao Ding, Danfei Xu, Boris Ivanovic, Marco Pavone, Georgios Pavlakos, Zhangyang Wang, Yue Wang

    Abstract: While novel view synthesis (NVS) from a sparse set of images has advanced significantly in 3D computer vision, it relies on precise initial estimation of camera parameters using Structure-from-Motion (SfM). For instance, the recently developed Gaussian Splatting depends heavily on the accuracy of SfM-derived points and poses. However, SfM processes are time-consuming and often prove unreliable in…

    Submitted 30 June, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

    Comments: Project Page: https://instantsplat.github.io/

  16. arXiv:2403.16439  [pdf, other]

    cs.RO cs.CV cs.LG

    Producing and Leveraging Online Map Uncertainty in Trajectory Prediction

    Authors: Xunjiang Gu, Guanyu Song, Igor Gilitschenski, Marco Pavone, Boris Ivanovic

    Abstract: High-definition (HD) maps have played an integral role in the development of modern autonomous vehicle (AV) stacks, albeit with high associated labeling and maintenance costs. As a result, many recent works have proposed methods for estimating HD maps online from sensor data, enabling AVs to operate outside of previously-mapped regions. However, current online map estimation approaches are develop…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 14 pages, 14 figures, 6 tables. CVPR 2024

  17. arXiv:2403.10768  [pdf, other]

    cs.RO

    Task-Driven Manipulation with Reconfigurable Parallel Robots

    Authors: Daniel Morton, Mark Cutkosky, Marco Pavone

    Abstract: ReachBot, a proposed robotic platform, employs extendable booms as limbs for mobility in challenging environments, such as Martian caves. When attached to the environment, ReachBot acts as a parallel robot, with reconfiguration driven by the ability to detach and re-place the booms. This ability enables manipulation-focused scientific objectives: for instance, through operating tools, or handling…

    Submitted 15 March, 2024; originally announced March 2024.

  18. arXiv:2403.08125  [pdf, other]

    cs.CV

    Q-SLAM: Quadric Representations for Monocular SLAM

    Authors: Chensheng Peng, Chenfeng Xu, Yue Wang, Mingyu Ding, Heng Yang, Masayoshi Tomizuka, Kurt Keutzer, Marco Pavone, Wei Zhan

    Abstract: Monocular SLAM has long grappled with the challenge of accurately modeling 3D geometries. Recent advances in Neural Radiance Fields (NeRF)-based monocular SLAM have shown promise, yet these methods typically focus on novel view synthesis rather than precise 3D geometry modeling. This focus results in a significant disconnect between NeRF applications, i.e., novel-view synthesis and the requirement…

    Submitted 12 March, 2024; originally announced March 2024.

  19. arXiv:2403.07076  [pdf, other]

    cs.RO cs.AI cs.CV

    Mapping High-level Semantic Regions in Indoor Environments without Object Recognition

    Authors: Roberto Bigazzi, Lorenzo Baraldi, Shreyas Kousik, Rita Cucchiara, Marco Pavone

    Abstract: Robots require a semantic understanding of their surroundings to operate in an efficient and explainable way in human environments. In the literature, there has been an extensive focus on object labeling and exhaustive scene graph generation; less effort has been focused on the task of purely identifying and mapping large semantic regions. The present work proposes a method for semantic region map…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted by IEEE International Conference on Robotics and Automation (ICRA 2024)

  20. arXiv:2403.04057  [pdf, other]

    cs.GT econ.TH

    To Spend or to Gain: Online Learning in Repeated Karma Auctions

    Authors: Damien Berriaud, Ezzat Elokda, Devansh Jalota, Emilio Frazzoli, Marco Pavone, Florian Dörfler

    Abstract: Recent years have seen a surge of artificial currency-based mechanisms in contexts where monetary instruments are deemed unfair or inappropriate, e.g., in allocating food donations to food banks, course seats to students, and, more recently, even for traffic congestion management. Yet the applicability of these mechanisms remains limited in repeated auction settings, as it is challenging for users…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: Manuscript submitted for review to the 25th ACM Conference on Economics & Computation (EC'24)

  21. arXiv:2402.17077  [pdf, other]

    cs.LG cs.CV

    Parallelized Spatiotemporal Binding

    Authors: Gautam Singh, Yue Wang, Jiawei Yang, Boris Ivanovic, Sungjin Ahn, Marco Pavone, Tong Che

    Abstract: While modern best practices advocate for scalable architectures that support long-range interactions, object-centric models are yet to fully embrace these architectures. In particular, existing object-centric models for handling sequential inputs, due to their reliance on RNN-based implementation, show poor stability and capacity and are slow to train on long sequences. We introduce Parallelizable…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: See project page at https://parallel-st-binder.github.io

  22. arXiv:2402.16342  [pdf, other]

    cs.AI cs.RO

    Contingency Planning Using Bi-level Markov Decision Processes for Space Missions

    Authors: Somrita Banerjee, Edward Balaban, Mark Shirley, Kevin Bradner, Marco Pavone

    Abstract: This work focuses on autonomous contingency planning for scientific missions by enabling rapid policy computation from any off-nominal point in the state space in the event of a delay or deviation from the nominal mission plan. Successful contingency planning involves managing risks and rewards, often probabilistically associated with actions, in stochastic scenarios. Markov Decision Processes (MD…

    Submitted 26 February, 2024; originally announced February 2024.

  23. arXiv:2402.16162  [pdf, other]

    eess.SY cs.GT

    Catch Me If You Can: Combatting Fraud in Artificial Currency Based Government Benefits Programs

    Authors: Devansh Jalota, Matthew Tsao, Marco Pavone

    Abstract: Artificial currencies have grown in popularity in many real-world resource allocation settings, gaining traction in government benefits programs like food assistance and transit benefits programs. However, such programs are susceptible to misreporting fraud, wherein users can misreport their private attributes to gain access to more artificial currency (credits) than they are entitled to. To addre…

    Submitted 25 February, 2024; originally announced February 2024.

  24. arXiv:2402.11209  [pdf, other]

    cs.GT cs.CC econ.TH math.OC

    When Simple is Near-Optimal in Security Games

    Authors: Devansh Jalota, Michael Ostrovsky, Marco Pavone

    Abstract: Fraudulent or illegal activities are ubiquitous across applications and involve users bypassing the rule of law, often with the strategic aim of obtaining some benefit that would otherwise be unattainable within the bounds of lawful conduct. However, user fraud is detrimental, as it may compromise safety or impose disproportionate negative externalities on particular population groups. To mitiga…

    Submitted 21 February, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

  25. arXiv:2402.10255  [pdf, other]

    quant-ph cs.ET stat.CO stat.ME

    Benchmarking the Operation of Quantum Heuristics and Ising Machines: Scoring Parameter Setting Strategies on Optimization Applications

    Authors: David E. Bernal Neira, Robin Brown, Pratik Sathe, Filip Wudarski, Marco Pavone, Eleanor G. Rieffel, Davide Venturelli

    Abstract: We discuss guidelines for evaluating the performance of parameterized stochastic solvers for optimization problems, with particular attention to systems that employ novel hardware, such as digital quantum processors running variational algorithms, analog processors performing quantum annealing, or coherent Ising Machines. We illustrate through an example a benchmarking procedure grounded in the st…

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: 13 pages, 6 figures

  26. arXiv:2402.05932  [pdf, other]

    cs.RO cs.AI cs.CL

    Driving Everywhere with Large Language Model Policy Adaptation

    Authors: Boyi Li, Yue Wang, Jiageng Mao, Boris Ivanovic, Sushant Veer, Karen Leung, Marco Pavone

    Abstract: Adapting driving behavior to new environments, customs, and laws is a long-standing problem in autonomous driving, precluding the widespread deployment of autonomous vehicles (AVs). In this paper, we present LLaDA, a simple yet powerful tool that enables human drivers and autonomous vehicles alike to drive everywhere by adapting their tasks and motion plans to traffic rules in new locations. LLaDA…

    Submitted 10 April, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: CVPR 2024, featured in GTC 2024: https://www.youtube.com/watch?v=t-UPlPlrYgQ&t=51s

  27. arXiv:2402.02769  [pdf, other]

    cs.LG cs.AI

    Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate

    Authors: Can Jin, Tong Che, Hongwu Peng, Yiyuan Li, Marco Pavone

    Abstract: Generalization remains a central challenge in machine learning. In this work, we propose Learning from Teaching (LoT), a novel regularization technique for deep neural networks to enhance generalization. Inspired by the human ability to capture concise and abstract patterns, we hypothesize that generalizable correlations are expected to be easier to teach. LoT operationalizes this concept to impro…

    Submitted 5 February, 2024; originally announced February 2024.

  28. arXiv:2401.12135  [pdf, other]

    math.OC cs.ET quant-ph

    Accelerating Continuous Variable Coherent Ising Machines via Momentum

    Authors: Robin Brown, Davide Venturelli, Marco Pavone, David E. Bernal Neira

    Abstract: The Coherent Ising Machine (CIM) is a non-conventional architecture that takes inspiration from physical annealing processes to solve Ising problems heuristically. Its dynamics are naturally continuous and described by a set of ordinary differential equations that have proven useful for solving continuous-variable non-convex quadratic optimization problems. The dynamics of…

    Submitted 22 January, 2024; originally announced January 2024.

  29. arXiv:2401.11371  [pdf, other]

    cs.RO eess.SY

    Modeling Considerations for Developing Deep Space Autonomous Spacecraft and Simulators

    Authors: Christopher Agia, Guillem Casadesus Vila, Saptarshi Bandyopadhyay, David S. Bayard, Kar-Ming Cheung, Charles H. Lee, Eric Wood, Ian Aenishanslin, Steven Ardito, Lorraine Fesq, Marco Pavone, Issa A. D. Nesnas

    Abstract: To extend the limited scope of autonomy used in prior missions for operation in distant and complex environments, there is a need to further develop and mature autonomy that jointly reasons over multiple subsystems, which we term system-level autonomy. System-level autonomy establishes situational awareness that resolves conflicting information across subsystems, which may necessitate the refineme…

    Submitted 20 January, 2024; originally announced January 2024.

    Comments: Project page: https://sites.google.com/stanford.edu/spacecraft-models. 20 pages, 8 figures. Accepted to the IEEE Conference on Aerospace (AeroConf) 2024

    ACM Class: I.2.8; I.2.9; I.6.1; I.6.3; I.6.4; I.6.6; J.2

  30. arXiv:2312.13303  [pdf, other]

    cs.LG cs.AI

    RealGen: Retrieval Augmented Generation for Controllable Traffic Scenarios

    Authors: Wenhao Ding, Yulong Cao, Ding Zhao, Chaowei Xiao, Marco Pavone

    Abstract: Simulation plays a crucial role in the development of autonomous vehicles (AVs) due to the potential risks associated with real-world testing. Although significant progress has been made in the visual aspects of simulators, generating complex behavior among agents remains a formidable challenge. It is not only imperative to ensure realism in the scenarios generated but also essential to incorporat…

    Submitted 19 December, 2023; originally announced December 2023.

  31. arXiv:2312.05873  [pdf, other]

    eess.SY cs.AI cs.LG cs.RO cs.SC

    Learning for CasADi: Data-driven Models in Numerical Optimization

    Authors: Tim Salzmann, Jon Arrizabalaga, Joel Andersson, Marco Pavone, Markus Ryll

    Abstract: While real-world problems are often challenging to analyze analytically, deep learning excels in modeling complex processes from data. Existing optimization frameworks like CasADi facilitate seamless usage of solvers but face challenges when integrating learned process models into numerical optimizations. To address this gap, we present the Learning for CasADi (L4CasADi) framework, enabling the se…

    Submitted 10 December, 2023; originally announced December 2023.

  32. arXiv:2312.04658  [pdf, other]

    cs.LG stat.ML

    PAC-Bayes Generalization Certificates for Learned Inductive Conformal Prediction

    Authors: Apoorva Sharma, Sushant Veer, Asher Hancock, Heng Yang, Marco Pavone, Anirudha Majumdar

    Abstract: Inductive Conformal Prediction (ICP) provides a practical and effective approach for equipping deep learning models with uncertainty estimates in the form of set-valued predictions which are guaranteed to contain the ground truth with high probability. Despite the appeal of this coverage guarantee, these sets may not be efficient: the size and contents of the prediction sets are not directly contr…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023

  33. arXiv:2312.00438  [pdf, other]

    cs.CV

    Dolphins: Multimodal Language Model for Driving

    Authors: Yingzi Ma, Yulong Cao, Jiachen Sun, Marco Pavone, Chaowei Xiao

    Abstract: The quest for fully autonomous vehicles (AVs) capable of navigating complex real-world scenarios with human-like understanding and responsiveness continues. In this paper, we introduce Dolphins, a novel vision-language model architected to imbibe human-like abilities as a conversational driving assistant. Dolphins is adept at processing multimodal inputs comprising video (or image) data, text instructions,…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: The project page is available at https://vlm-driver.github.io/

  34. arXiv:2311.18307  [pdf, other]

    cs.LG cs.CV cs.RO

    Categorical Traffic Transformer: Interpretable and Diverse Behavior Prediction with Tokenized Latent

    Authors: Yuxiao Chen, Sander Tonkens, Marco Pavone

    Abstract: Adept traffic models are critical to both planning and closed-loop simulation for autonomous vehicles (AV), and key design objectives include accuracy, diverse multimodal behaviors, interpretability, and downstream compatibility. Recently, with the advent of large language models (LLMs), an additional desirable feature for traffic models is LLM compatibility. We present Categorical Traffic Transfo…

    Submitted 30 November, 2023; originally announced November 2023.

  35. arXiv:2311.10813  [pdf, other]

    cs.CV cs.AI cs.CL cs.RO

    A Language Agent for Autonomous Driving

    Authors: Jiageng Mao, Junjie Ye, Yuxi Qian, Marco Pavone, Yue Wang

    Abstract: Human-level driving is an ultimate goal of autonomous driving. Conventional approaches formulate autonomous driving as a perception-prediction-planning framework, yet their systems do not capitalize on the inherent reasoning ability and experiential knowledge of humans. In this paper, we propose a fundamental paradigm shift from current pipelines, exploiting Large Language Models (LLMs) as a cogni…

    Submitted 27 November, 2023; v1 submitted 17 November, 2023; originally announced November 2023.

    Comments: Project Page: https://usc-gvl.github.io/Agent-Driver/

  36. arXiv:2311.05780  [pdf, other]

    eess.SY cs.LG cs.RO

    Real-time Control of Electric Autonomous Mobility-on-Demand Systems via Graph Reinforcement Learning

    Authors: Aaryan Singhal, Daniele Gammelli, Justin Luke, Karthik Gopalakrishnan, Dominik Helmreich, Marco Pavone

    Abstract: Operators of Electric Autonomous Mobility-on-Demand (E-AMoD) fleets need to make several real-time decisions such as matching available vehicles to ride requests, rebalancing idle vehicles to areas of high demand, and charging vehicles to ensure sufficient range. While this problem can be posed as a linear program that optimizes flows over a space-charge-time graph, the size of the resulting optim…

    Submitted 3 April, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

    Comments: 9 pages, revised SF travel data, includes additional experimental results, content and clarification revisions per reviewer feedback, and typo fixes

  37. arXiv:2311.04079  [pdf, other]

    cs.CV

    Augmenting Lane Perception and Topology Understanding with Standard Definition Navigation Maps

    Authors: Katie Z Luo, Xinshuo Weng, Yan Wang, Shuang Wu, Jie Li, Kilian Q Weinberger, Yue Wang, Marco Pavone

    Abstract: Autonomous driving has traditionally relied heavily on costly and labor-intensive High Definition (HD) maps, hindering scalability. In contrast, Standard Definition (SD) maps are more affordable and have worldwide coverage, offering a scalable alternative. In this work, we systematically explore the effect of SD maps for real-time lane-topology understanding. We propose a novel framework to integr…

    Submitted 7 November, 2023; originally announced November 2023.

  38. arXiv:2311.02077  [pdf, other]

    cs.CV

    EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision

    Authors: Jiawei Yang, Boris Ivanovic, Or Litany, Xinshuo Weng, Seung Wook Kim, Boyi Li, Tong Che, Danfei Xu, Sanja Fidler, Marco Pavone, Yue Wang

    Abstract: We present EmerNeRF, a simple yet powerful approach for learning spatial-temporal representations of dynamic driving scenes. Grounded in neural fields, EmerNeRF simultaneously captures scene geometry, appearance, motion, and semantics via self-bootstrapping. EmerNeRF hinges upon two core components: First, it stratifies scenes into static and dynamic fields. This decomposition emerges purely from…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: See the project page for code, data, and request pre-trained models: https://emernerf.github.io

  39. arXiv:2310.18301  [pdf, other]

    cs.RO cs.AI eess.SY

    Interactive Joint Planning for Autonomous Vehicles

    Authors: Yuxiao Chen, Sushant Veer, Peter Karkus, Marco Pavone

    Abstract: In highly interactive driving scenarios, the actions of one agent greatly influence those of its neighbors. Planning safe motions for autonomous vehicles in such interactive environments, therefore, requires reasoning about the impact of the ego's intended motion plan on nearby agents' behavior. Deep-learning-based models have recently achieved great success in trajectory prediction and many mode…

    Submitted 22 November, 2023; v1 submitted 27 October, 2023; originally announced October 2023.

  40. arXiv:2310.13831  [pdf, other]

    cs.RO

    Transformers for Trajectory Optimization with Application to Spacecraft Rendezvous

    Authors: Tommaso Guffanti, Daniele Gammelli, Simone D'Amico, Marco Pavone

    Abstract: Reliable and efficient trajectory optimization methods are a fundamental need for autonomous dynamical systems, effectively enabling applications including rocket landing, hypersonic reentry, spacecraft rendezvous, and docking. Within such safety-critical application areas, the complexity of the emerging trajectory optimization problems has motivated the application of AI-based techniques to enhan…

    Submitted 5 January, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

    Comments: Presented in 2024 IEEE Aerospace Conference

  41. arXiv:2310.05885  [pdf, other]

    cs.RO

    DTPP: Differentiable Joint Conditional Prediction and Cost Evaluation for Tree Policy Planning in Autonomous Driving

    Authors: Zhiyu Huang, Peter Karkus, Boris Ivanovic, Yuxiao Chen, Marco Pavone, Chen Lv

    Abstract: Motion prediction and cost evaluation are vital components in the decision-making system of autonomous vehicles. However, existing methods often ignore the importance of cost learning and treat them as separate modules. In this study, we employ a tree-structured policy planner and propose a differentiable joint training framework for both ego-conditioned prediction and cost models, resulting in a…

    Submitted 23 February, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: 2024 IEEE International Conference on Robotics and Automation

  42. arXiv:2309.08603  [pdf, other]

    eess.SY cs.RO

    Closing the Loop on Runtime Monitors with Fallback-Safe MPC

    Authors: Rohan Sinha, Edward Schmerling, Marco Pavone

    Abstract: When we rely on deep-learned models for robotic perception, we must recognize that these models may behave unreliably on inputs dissimilar from the training data, compromising the closed-loop system's safety. This raises fundamental questions on how we can assess confidence in perception systems and to what extent we can take safety-preserving actions when external environmental changes degrade ou…

    Submitted 17 September, 2023; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: Accepted to the 2023 IEEE Conference on Decision and Control

  43. arXiv:2309.05746  [pdf, other]

    eess.SY cs.RO math.OC

    Robust Nonlinear Reduced-Order Model Predictive Control

    Authors: John Irvin Alora, Luis A. Pabon, Johannes Köhler, Mattia Cenedese, Ed Schmerling, Melanie N. Zeilinger, George Haller, Marco Pavone

    Abstract: Real-world systems are often characterized by high-dimensional nonlinear dynamics, making them challenging to control in real time. While reduced-order models (ROMs) are frequently employed in model-based control schemes, dimensionality reduction introduces model uncertainty which can potentially compromise the stability and safety of the original high-dimensional system. In this work, we propose…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: 9 pages, 3 figures, To be presented at Conference for Decision and Control 2023

  44. arXiv:2309.00709  [pdf, other]

    cs.AI cs.LG cs.RO

    Reinforcement Learning with Human Feedback for Realistic Traffic Simulation

    Authors: Yulong Cao, Boris Ivanovic, Chaowei Xiao, Marco Pavone

    Abstract: In light of the challenges and costs of real-world testing, autonomous vehicle developers often rely on testing in simulation for the creation of reliable systems. A key element of effective simulation is the incorporation of realistic traffic models that align with human knowledge, an aspect that has proven challenging due to the need to balance realism and diversity. This work aims to address t…

    Submitted 1 September, 2023; originally announced September 2023.

    Comments: 9 pages, 4 figures

  45. arXiv:2308.06337  [pdf, other]

    cs.RO

    Refining Obstacle Perception Safety Zones via Maneuver-Based Decomposition

    Authors: Sever Topan, Yuxiao Chen, Edward Schmerling, Karen Leung, Jonas Nilsson, Michael Cox, Marco Pavone

    Abstract: A critical task for developing safe autonomous driving stacks is to determine whether an obstacle is safety-critical, i.e., poses an imminent threat to the autonomous vehicle. Our previous work showed that Hamilton Jacobi reachability theory can be applied to compute interaction-dynamics-aware perception safety zones that better inform an ego vehicle's perception module which obstacles are conside…

    Submitted 11 August, 2023; originally announced August 2023.

    Comments: * indicates equal contribution. Accepted into the IEEE Intelligent Vehicles Symposium 2023

  46. arXiv:2307.13924  [pdf, other]

    cs.CV cs.LG cs.RO

    trajdata: A Unified Interface to Multiple Human Trajectory Datasets

    Authors: Boris Ivanovic, Guanyu Song, Igor Gilitschenski, Marco Pavone

    Abstract: The field of trajectory forecasting has grown significantly in recent years, partially owing to the release of numerous large-scale, real-world human trajectory datasets for autonomous vehicles (AVs) and pedestrian motion tracking. While such datasets have been a boon for the community, they each use custom and unique data formats and APIs, making it cumbersome for researchers to train and evaluat…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: 15 pages, 15 figures, 3 tables

  47. arXiv:2307.07947  [pdf, other]

    cs.CV

    Language Conditioned Traffic Generation

    Authors: Shuhan Tan, Boris Ivanovic, Xinshuo Weng, Marco Pavone, Philipp Kraehenbuehl

    Abstract: Simulation forms the backbone of modern self-driving development. Simulators help develop, test, and improve driving systems without putting humans, vehicles, or their environment at risk. However, simulators face a major challenge: They rely on realistic, scalable, yet interesting content. While recent advances in rendering and scene reconstruction make great strides in creating static scene asse…

    Submitted 16 July, 2023; originally announced July 2023.

    Comments: Technical Report. Website available at https://ariostgx.github.io/lctgen

  48. arXiv:2307.03167  [pdf, other]

    cs.RO eess.SY math.OC

    Risk-Averse Trajectory Optimization via Sample Average Approximation

    Authors: Thomas Lew, Riccardo Bonalli, Marco Pavone

    Abstract: Trajectory optimization under uncertainty underpins a wide range of applications in robotics. However, existing methods are limited in terms of reasoning about sources of epistemic and aleatoric uncertainty, space and time correlations, nonlinear dynamics, and non-convex constraints. In this work, we first introduce a continuous-time planning formulation with an average-value-at-risk constraint ov…

    Submitted 26 September, 2023; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: Added numerical comparisons

  49. arXiv:2307.01408  [pdf, other]

    cs.RO cs.LG eess.SY

    Multi-Predictor Fusion: Combining Learning-based and Rule-based Trajectory Predictors

    Authors: Sushant Veer, Apoorva Sharma, Marco Pavone

    Abstract: Trajectory prediction modules are key enablers for safe and efficient planning of autonomous vehicles (AVs), particularly in highly interactive traffic scenarios. Recently, learning-based trajectory predictors have experienced considerable success in providing state-of-the-art performance due to their ability to learn multimodal behaviors of other agents from data. In this paper, we present an alg…

    Submitted 3 July, 2023; originally announced July 2023.

  50. arXiv:2306.06344  [pdf, other]

    cs.RO cs.AI cs.LG

    Language-Guided Traffic Simulation via Scene-Level Diffusion

    Authors: Ziyuan Zhong, Davis Rempe, Yuxiao Chen, Boris Ivanovic, Yulong Cao, Danfei Xu, Marco Pavone, Baishakhi Ray

    Abstract: Realistic and controllable traffic simulation is a core capability that is necessary to accelerate autonomous vehicle (AV) development. However, current approaches for controlling learning-based traffic models require significant domain expertise and are difficult for practitioners to use. To remedy this, we present CTG++, a scene-level conditional diffusion model that can be guided by language in…

    Submitted 18 October, 2023; v1 submitted 10 June, 2023; originally announced June 2023.