doi 10.1109/LRA.2022.3192205

Safe Reinforcement Learning Using Black-Box Reachability Analysis

Authors: Mahmoud Selim, Amr Alanwar, Shreyas Kousik, Grace Gao, Marco Pavone, Karl H. Johansson

Abstract: Reinforcement learning (RL) is capable of sophisticated motion planning and control for robots in uncertain environments. However, state-of-the-art deep RL approaches typically lack safety guarantees, especially when the robot and environment models are unknown. To justify widespread deployment, robots must respect safety constraints without sacrificing performance. Thus, we propose a Black-box Reachability-based Safety Layer (BRSL) with three main components: (1) data-driven reachability analysis for a black-box robot model, (2) a trajectory rollout planner that predicts future actions and observations using an ensemble of neural networks trained online, and (3) a differentiable polytope collision check between the reachable set and obstacles that enables correcting unsafe actions. In simulation, BRSL outperforms other state-of-the-art safe RL methods on a Turtlebot 3, a quadrotor, a trajectory-tracking point mass, and a hexarotor in wind with an unsafe set adjacent to the area of highest reward.

Submitted 21 November, 2022; v1 submitted 15 April, 2022; originally announced April 2022.

Comments: This paper is accepted at IEEE Robotics and Automation Letters and International Conference on Robotics and Automation (ICRA)

arXiv:2204.06716 [pdf, other]

Control-oriented meta-learning

Authors: Spencer M. Richards, Navid Azizan, Jean-Jacques Slotine, Marco Pavone

Abstract: Real-time adaptation is imperative to the control of robots operating in complex, dynamic environments. Adaptive control laws can endow even nonlinear systems with good trajectory tracking performance, provided that any uncertain dynamics terms are linearly parameterizable with known nonlinear features. However, it is often difficult to specify such features a priori, such as for aerodynamic disturbances on rotorcraft or interaction forces between a manipulator arm and various objects. In this paper, we turn to data-driven modeling with neural networks to learn, offline from past data, an adaptive controller with an internal parametric model of these nonlinear features. Our key insight is that we can better prepare the controller for deployment with control-oriented meta-learning of features in closed-loop simulation, rather than regression-oriented meta-learning of features to fit input-output data. Specifically, we meta-learn the adaptive controller with closed-loop tracking simulation as the base-learner and the average tracking error as the meta-objective. With both fully-actuated and underactuated nonlinear planar rotorcraft subject to wind, we demonstrate that our adaptive controller outperforms other controllers trained with regression-oriented meta-learning when deployed in closed-loop for trajectory tracking control.

Submitted 13 April, 2022; originally announced April 2022.

Comments: First published in Robotics: Science and Systems (RSS) 2021. This extended version is under review for a special issue in the International Journal of Robotics Research (IJRR). arXiv admin note: substantial text overlap with arXiv:2103.04490

arXiv:2203.17150 [pdf, other]

Online Learning for Traffic Routing under Unknown Preferences

Authors: Devansh Jalota, Karthik Gopalakrishnan, Navid Azizan, Ramesh Johari, Marco Pavone

Abstract: In transportation networks, users typically choose routes in a decentralized and self-interested manner to minimize their individual travel costs, which, in practice, often results in inefficient overall outcomes for society. As a result, there has been a growing interest in designing road tolling schemes to cope with these efficiency losses and steer users toward a system-efficient traffic pattern. However, the efficacy of road tolling schemes often relies on having access to complete information on users' trip attributes, such as their origin-destination (O-D) travel information and their values of time, which may not be available in practice. Motivated by this practical consideration, we propose an online learning approach to set tolls in a traffic network to drive heterogeneous users with different values of time toward a system-efficient traffic pattern. In particular, we develop a simple yet effective algorithm that adjusts tolls at each time period solely based on the observed aggregate flows on the roads of the network without relying on any additional trip attributes of users, thereby preserving user privacy. In the setting where the O-D pairs and values of time of users are drawn i.i.d. at each period, we show that our approach obtains an expected regret and road capacity violation of $O(\sqrt{T})$, where $T$ is the number of periods over which tolls are updated. Our regret guarantee is relative to an offline oracle that has complete information on users' trip attributes. We further establish an $\Omega(\sqrt{T})$ lower bound on the regret of any algorithm, which establishes that our algorithm is optimal up to constants. Finally, we demonstrate the superior performance of our approach relative to several benchmarks on a real-world transportation network, thereby highlighting its practical applicability.

Submitted 31 March, 2022; originally announced March 2022.

arXiv:2203.07747 [pdf, other]

doi 10.1109/LRA.2023.3246839

Real-time Neural-MPC: Deep Learning Model Predictive Control for Quadrotors and Agile Robotic Platforms

Authors: Tim Salzmann, Elia Kaufmann, Jon Arrizabalaga, Marco Pavone, Davide Scaramuzza, Markus Ryll

Abstract: Model Predictive Control (MPC) has become a popular framework in embedded control for high-performance autonomous systems. However, to achieve good control performance using MPC, an accurate dynamics model is key. To maintain real-time operation, the dynamics models used on embedded systems have been limited to simple first-principle models, which substantially limits their representative power. In contrast to such simple models, machine learning approaches, specifically neural networks, have been shown to accurately model even complex dynamic effects, but their large computational complexity has hindered combination with fast real-time iteration loops. With this work, we present Real-time Neural MPC, a framework to efficiently integrate large, complex neural network architectures as dynamics models within a model-predictive control pipeline. Our experiments, performed in simulation and the real world onboard a highly agile quadrotor platform, demonstrate the capabilities of the described system to run learned models with previously infeasible, large modeling capacity using gradient-based online optimization MPC. Compared to prior implementations of neural networks in online optimization MPC, we can leverage models of over 4000 times larger parametric capacity in a 50Hz real-time window on an embedded platform. Further, we show the feasibility of our framework on real-world problems by reducing the positional tracking error by up to 82% when compared to state-of-the-art MPC approaches without neural network dynamics.

Submitted 25 July, 2023; v1 submitted 15 March, 2022; originally announced March 2022.

Journal ref: IEEE Robotics and Automation Letters (Volume: 8, Issue: 4, April 2023)

arXiv:2203.04132 [pdf, other]

Motron: Multimodal Probabilistic Human Motion Forecasting

Authors: Tim Salzmann, Marco Pavone, Markus Ryll

Abstract: Autonomous systems and humans are increasingly sharing the same space. Robots work side by side or even hand in hand with humans to balance each other's limitations. Such cooperative interactions are ever more sophisticated. Thus, the ability to reason not just about a human's center of gravity position, but also its granular motion is an important prerequisite for human-robot interaction. However, many algorithms ignore the multimodal nature of humans or neglect uncertainty in their motion forecasts. We present Motron, a multimodal, probabilistic, graph-structured model that captures human multimodality using probabilistic methods while being able to output deterministic maximum-likelihood motions and corresponding confidence values for each mode. Our model aims to be tightly integrated with the robotic planning-control-interaction loop, outputting physically feasible human motions and being computationally efficient. We demonstrate the performance of our model on several challenging real-world motion forecasting datasets, outperforming a wide array of generative/variational methods while providing state-of-the-art single-output motions if required, in both cases using significantly less computational power than state-of-the-art algorithms.

Submitted 25 March, 2022; v1 submitted 8 March, 2022; originally announced March 2022.

Comments: CVPR 2022

arXiv:2203.03034 [pdf, other]

A Unified View of SDP-based Neural Network Verification through Completely Positive Programming

Authors: Robin Brown, Edward Schmerling, Navid Azizan, Marco Pavone

Abstract: Verifying that input-output relationships of a neural network conform to prescribed operational specifications is a key enabler towards deploying these networks in safety-critical applications. Semidefinite programming (SDP)-based approaches to Rectified Linear Unit (ReLU) network verification transcribe this problem into an optimization problem, where the accuracy of any such formulation reflects the level of fidelity in how the neural network computation is represented, as well as the relaxations of intractable constraints. While the literature contains much progress on improving the tightness of SDP formulations while maintaining tractability, comparatively little work has been devoted to the other extreme, i.e., how to most accurately capture the original verification problem before SDP relaxation. In this work, we develop an exact, convex formulation of verification as a completely positive program (CPP), and provide analysis showing that our formulation is minimal -- the removal of any constraint fundamentally misrepresents the neural network computation. We leverage our formulation to provide a unifying view of existing approaches, and give insight into the source of large relaxation gaps observed in some cases.

Submitted 6 March, 2022; originally announced March 2022.

arXiv:2202.13305 [pdf, other]

Private Location Sharing for Decentralized Routing services

Authors: Matthew Tsao, Kaidi Yang, Karthik Gopalakrishnan, Marco Pavone

Abstract: Data-driven methodologies offer many exciting upsides, but they also introduce new challenges, particularly in the realm of user privacy. Specifically, the way data is collected can pose privacy risks to end users. In many routing services, a single entity (e.g., the routing service provider) collects and manages user trajectory data. When it comes to user privacy, these systems have a central point of failure since users have to trust that this entity will not sell or use their data to infer sensitive private information. Unfortunately, in practice many advertising companies offer to buy such data for the sake of targeted advertisements. With this as motivation, we study the problem of using location data for routing services in a privacy-preserving way. Rather than having users report their location to a central operator, we present a protocol in which users participate in a decentralized and privacy-preserving computation to estimate travel times for the roads in the network in a way that no individual's location is ever observed by any other party. The protocol uses the Laplace mechanism in conjunction with secure multi-party computation to ensure that it is cryptographically secure and that its output is differentially private. A natural question is if privacy necessitates degradation in accuracy or system performance. We show that if a road has sufficiently high capacity, then the travel time estimated by our protocol is provably close to the ground truth travel time. We validate the protocol through numerical experiments which show that using the protocol as a routing service provides privacy guarantees with minimal overhead to user travel time.

Submitted 14 March, 2022; v1 submitted 27 February, 2022; originally announced February 2022.
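The Laplace mechanism named in the abstract above can be illustrated in isolation. The sketch below is not the paper's protocol: it omits the secure multi-party computation entirely and assumes a trusted aggregator, a replace-one-record notion of neighbouring datasets, and travel times clipped to a known range. The function names `laplace_noise` and `private_mean_travel_time` and all parameters are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    Uses the fact that the difference of two independent Exp(1)
    variables follows a standard Laplace distribution.
    """
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_mean_travel_time(times, lower, upper, epsilon, rng):
    """Release an epsilon-differentially-private mean of travel times.

    Assumptions (illustrative, not from the paper): each time is clipped
    to [lower, upper], and neighbouring datasets differ by replacing one
    record, so the mean has sensitivity (upper - lower) / n.
    """
    if not times or epsilon <= 0 or upper <= lower:
        raise ValueError("need data, epsilon > 0 and upper > lower")
    clipped = [min(max(t, lower), upper) for t in times]
    n = len(clipped)
    sensitivity = (upper - lower) / n
    # Laplace mechanism: noise scale = sensitivity / epsilon.
    return sum(clipped) / n + laplace_noise(sensitivity / epsilon, rng)
```

A smaller `epsilon` gives stronger privacy and noisier estimates; because the sensitivity shrinks as `1 / n`, roads observed by many users tolerate the noise better than sparsely observed ones.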

arXiv:2202.07147 [pdf, other]

Graph Meta-Reinforcement Learning for Transferable Autonomous Mobility-on-Demand

Authors: Daniele Gammelli, Kaidi Yang, James Harrison, Filipe Rodrigues, Francisco C. Pereira, Marco Pavone

Abstract: Autonomous Mobility-on-Demand (AMoD) systems represent an attractive alternative to existing transportation paradigms, currently challenged by urbanization and increasing travel needs. By centrally controlling a fleet of self-driving vehicles, these systems provide mobility service to customers and are currently starting to be deployed in a number of cities around the world. Current learning-based approaches for controlling AMoD systems are limited to the single-city scenario, whereby the service operator is allowed to take an unlimited amount of operational decisions within the same transportation system. However, real-world system operators can hardly afford to fully re-train AMoD controllers for every city they operate in, as this could result in a high number of poor-quality decisions during training, making the single-city strategy a potentially impractical solution. To address these limitations, we propose to formalize the multi-city AMoD problem through the lens of meta-reinforcement learning (meta-RL) and devise an actor-critic algorithm based on recurrent graph neural networks. In our approach, AMoD controllers are explicitly trained such that a small amount of experience within a new city will produce good system performance. Empirically, we show how control policies learned through meta-RL are able to achieve near-optimal performance on unseen cities by learning rapidly adaptable policies, thus making them more robust not only to novel environments, but also to distribution shifts common in real-world operations, such as special events, unexpected congestion, and dynamic pricing schemes.

Submitted 14 February, 2022; originally announced February 2022.

Comments: 11 pages, 4 figures

arXiv:2202.05232 [pdf, other]

Matching with Transfers under Distributional Constraints

Authors: Devansh Jalota, Michael Ostrovsky, Marco Pavone

Abstract: We study two-sided many-to-one matching markets with transferable utilities, e.g., labor and rental housing markets, in which money can change hands between agents, subject to distributional constraints on the set of feasible allocations. In such markets, we establish the efficiency of equilibrium arrangements, specified by an assignment and transfers between agents on the two sides of the market, and study the conditions on the distributional constraints and agent preferences under which equilibria exist and can be computed efficiently. To this end, we first consider the setting when the number of institutions (e.g., firms in a labor market) is one and show that equilibrium arrangements exist irrespective of the nature of the constraint structure or the agents' preferences. However, equilibrium arrangements may not exist in markets with multiple institutions even when agents on each side have linear (or additively separable) preferences over agents on the other side. Thus, for markets with linear preferences, we study sufficient conditions on the constraint structure that guarantee the existence of equilibria using linear programming duality. Our linear programming approach not only generalizes that of Shapley and Shubik (1971) in the one-to-one matching setting to the many-to-one matching setting under distributional constraints but also provides a method to compute market equilibria efficiently.

Submitted 24 April, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

arXiv:2202.04193 [pdf, other]

Data-Driven Chance Constrained Control using Kernel Distribution Embeddings

Authors: Adam J. Thorpe, Thomas Lew, Meeko M. K. Oishi, Marco Pavone

Abstract: We present a data-driven algorithm for efficiently computing stochastic control policies for general joint chance constrained optimal control problems. Our approach leverages the theory of kernel distribution embeddings, which allows representing expectation operators as inner products in a reproducing kernel Hilbert space. This framework enables approximately reformulating the original problem using a dataset of observed trajectories from the system without imposing prior assumptions on the parameterization of the system dynamics or the structure of the uncertainty. By optimizing over a finite subset of stochastic open-loop control trajectories, we relax the original problem to a linear program over the control parameters that can be efficiently solved using standard convex optimization techniques. We demonstrate our proposed approach in simulation on a system with nonlinear non-Markovian dynamics navigating in a cluttered environment.

Submitted 8 February, 2022; originally announced February 2022.

Comments: Submitted to 4th Annual Learning for Dynamics & Control Conference (L4DC) 2022

arXiv:2202.01997 [pdf, other]

Semi-Supervised Trajectory-Feedback Controller Synthesis for Signal Temporal Logic Specifications

Authors: Karen Leung, Marco Pavone

Abstract: There are spatio-temporal rules that dictate how robots should operate in complex environments, e.g., road rules govern how (self-driving) vehicles should behave on the road. However, seamlessly incorporating such rules into a robot control policy remains challenging, especially for real-time applications. In this work, given a desired spatio-temporal specification expressed in the Signal Temporal Logic (STL) language, we propose a semi-supervised controller synthesis technique that is attuned to human-like behaviors while satisfying desired STL specifications. Offline, we synthesize a trajectory-feedback neural network controller via an adversarial training scheme that summarizes past spatio-temporal behaviors when computing controls, and then online, we perform gradient steps to improve specification satisfaction. Central to the offline phase is an imitation-based regularization component that fosters better policy exploration and helps induce naturalistic human behaviors. Our experiments demonstrate that having imitation-based regularization leads to higher qualitative and quantitative performance compared to optimizing an STL objective only, as done in prior work. We demonstrate the efficacy of our approach with an illustrative case study and show that our proposed controller outperforms a state-of-the-art shooting method in both performance and computation time.

Submitted 4 February, 2022; originally announced February 2022.

Comments: Accepted to American Control Conference 2022

arXiv:2112.05745 [pdf, other]

A Simple and Efficient Sampling-based Algorithm for General Reachability Analysis

Authors: Thomas Lew, Lucas Janson, Riccardo Bonalli, Marco Pavone

Abstract: In this work, we analyze an efficient sampling-based algorithm for general-purpose reachability analysis, which remains a notoriously challenging problem with applications ranging from neural network verification to safety analysis of dynamical systems. By sampling inputs, evaluating their images in the true reachable set, and taking their $ε$-padded convex hull as a set estimator, this algorithm applies to general problem settings and is simple to implement. Our main contribution is the derivation of asymptotic and finite-sample accuracy guarantees using random set theory. This analysis informs algorithmic design to obtain an $ε$-close reachable set approximation with high probability, provides insights into which reachability problems are most challenging, and motivates safety-critical applications of the technique. On a neural network verification task, we show that this approach is more accurate and significantly faster than prior work. Informed by our analysis, we also design a robust model predictive controller that we demonstrate in hardware experiments.

Submitted 13 April, 2022; v1 submitted 10 December, 2021; originally announced December 2021.

Comments: 4th Annual Learning for Dynamics & Control Conference (L4DC) 2022. Section V: added the assumption $\partial\mathcal{Y}\subseteq f(\partial\mathcal{X})$. If $\partial\mathcal{Y}\nsubseteq f(\partial\mathcal{X})$, then one should sample over the entire set $\mathcal{X}$ to obtain finite-sample bounds
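The estimator described in the abstract above (sample inputs, map them through the system, take the $ε$-padded convex hull of the images) can be sketched in two dimensions with the standard library alone. This is an illustration under stated assumptions, not the authors' implementation: it is restricted to planar outputs, it represents the padded hull implicitly through a membership test rather than constructing it, and every name (`estimate_reachable_set`, `in_padded_hull`, `example_map`, `sample_unit_square`) is hypothetical.

```python
import math
import random


def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def dist_to_segment(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    denom = dx * dx + dy * dy
    t = 0.0
    if denom > 0:
        t = max(0.0, min(1.0, ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / denom))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def in_padded_hull(p, hull, eps):
    """True if p is within distance eps of the (non-empty) convex polygon."""
    n = len(hull)
    edges = [(hull[i], hull[(i + 1) % n]) for i in range(n)]
    inside = n >= 3 and all(cross(a, b, p) >= 0 for a, b in edges)
    return inside or min(dist_to_segment(p, a, b) for a, b in edges) <= eps


def estimate_reachable_set(f, sample_input, n_samples, rng):
    """Sample inputs, push them through f, return the hull of the images."""
    return convex_hull([f(sample_input(rng)) for _ in range(n_samples)])


def sample_unit_square(rng):
    """Illustrative input set: the unit square [0, 1]^2."""
    return (rng.random(), rng.random())


def example_map(x):
    """Illustrative system: a linear shear of the plane."""
    return (2.0 * x[0] + x[1], x[1])
```

For example, `estimate_reachable_set(example_map, sample_unit_square, 500, random.Random(0))` returns a polygon approximating the sheared square, and `in_padded_hull(y, hull, eps)` then answers whether a candidate output `y` lies in the padded estimate. How large `eps` and the sample count must be for a given accuracy is the subject of the paper's analysis and is not reproduced here.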

arXiv:2112.00956 [pdf, other]

Personalized Federated Learning of Driver Prediction Models for Autonomous Driving

Authors: Manabu Nakanoya, Junha Im, Hang Qiu, Sachin Katti, Marco Pavone, Sandeep Chinchali

Abstract: Autonomous vehicles (AVs) must interact with a diverse set of human drivers in heterogeneous geographic areas. Ideally, fleets of AVs should share trajectory data to continually re-train and improve trajectory forecasting models from collective experience using cloud-based distributed learning. At the same time, these robots should ideally avoid uploading raw driver interaction data in order to protect proprietary policies (when sharing insights with other companies) or protect driver privacy from insurance companies. Federated learning (FL) is a popular mechanism to learn models in cloud servers from diverse users without divulging private local data. However, FL is often not robust -- it learns sub-optimal models when user data comes from highly heterogeneous distributions, which is a key hallmark of human-robot interactions. In this paper, we present a novel variant of personalized FL to specialize robust robot learning models to diverse user distributions. Our algorithm outperforms standard FL benchmarks by up to 2x in real user studies that we conducted where human-operated vehicles must gracefully merge lanes with simulated AVs in the standard CARLA and CARLO AV simulators.

Submitted 1 December, 2021; originally announced December 2021.

arXiv:2111.06084 [pdf, other]

On the Problem of Reformulating Systems with Uncertain Dynamics as a Stochastic Differential Equation

Authors: Thomas Lew, Apoorva Sharma, James Harrison, Edward Schmerling, Marco Pavone

Abstract: We identify an issue in recent approaches to learning-based control that reformulate systems with uncertain dynamics using a stochastic differential equation. Specifically, we discuss the approximation that replaces a model with fixed but uncertain parameters (a source of epistemic uncertainty) with a model subject to external disturbances modeled as a Brownian motion (corresponding to aleatoric uncertainty).

Submitted 11 November, 2021; originally announced November 2021.
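The two model classes contrasted in the abstract above can be written side by side. The notation is assumed here for illustration and is not taken from the listing: $x$ is the state, $u$ the control, $\theta$ the uncertain parameter with prior $p(\theta)$, $\bar{\theta}$ a nominal value, $\sigma$ a diffusion term, and $W_t$ a Brownian motion.

```latex
% Epistemic: theta is drawn once and then held fixed along the trajectory.
\dot{x}(t) = f\bigl(x(t), u(t), \theta\bigr), \qquad \theta \sim p(\theta)

% Aleatoric: a fresh, independent disturbance acts at every instant.
\mathrm{d}x_t = f\bigl(x_t, u_t, \bar{\theta}\bigr)\,\mathrm{d}t
              + \sigma(x_t, u_t)\,\mathrm{d}W_t
```

In the first model the same unknown $\theta$ acts at every time, so its effect is fully correlated along a trajectory. In the second, increments of $W_t$ over disjoint time intervals are independent. Replacing one model with the other therefore changes the time-correlation structure of the uncertainty, which is why the substitution is an approximation rather than an equivalence.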

arXiv:2110.10829 [pdf, other]

ReachBot: A Small Robot for Large Mobile Manipulation Tasks

Authors: Stephanie Schneider, Andrew Bylard, Tony G. Chen, Preston Wang, Mark Cutkosky, Marco Pavone

Abstract: Robots are widely deployed in space environments because of their versatility and robustness. However, adverse gravity conditions and challenging terrain geometry expose the limitations of traditional robot designs, which are often forced to sacrifice one of mobility or manipulation capabilities to attain the other. Prospective climbing operations in these environments reveal a need for small, compact robots capable of versatile mobility and manipulation. We propose a novel robotic concept called ReachBot that fills this need by combining two existing technologies: extendable booms and mobile manipulation. ReachBot leverages the reach and tensile strength of extendable booms to achieve an outsized reachable workspace and wrench capability. Through their lightweight, compactable structure, these booms also reduce mass and complexity compared to traditional rigid-link articulated-arm designs. Using these advantages, ReachBot excels in mobile manipulation missions in low gravity or that require climbing, particularly when anchor points are sparse. After introducing the ReachBot concept, we discuss modeling approaches and strategies for increasing stability and robustness. We then develop a 2D analytical model for ReachBot's dynamics inspired by grasp models for dexterous manipulators. Next, we introduce a waypoint-tracking controller for a planar ReachBot in microgravity. Our simulation results demonstrate the controller's robustness to disturbances and modeling error. Finally, we briefly discuss next steps that build on these initially promising results to realize the full potential of ReachBot.

Submitted 20 October, 2021; originally announced October 2021.

Comments: 12 pages, 13 figures

arXiv:2110.09481 [pdf, other]

MTP: Multi-Hypothesis Tracking and Prediction for Reduced Error Propagation

Authors: Xinshuo Weng, Boris Ivanovic, Marco Pavone

Abstract: Recently, there has been tremendous progress in developing each individual module of the standard perception-planning robot autonomy pipeline, including detection, tracking, prediction of other agents' trajectories, and ego-agent trajectory planning. Nevertheless, there has been less attention given to the principled integration of these components, particularly in terms of the characterization and mitigation of cascading errors. This paper addresses the problem of cascading errors by focusing on the coupling between the tracking and prediction modules. First, by using state-of-the-art tracking and prediction tools, we conduct a comprehensive experimental evaluation of how severely errors stemming from tracking can impact prediction performance. On the KITTI and nuScenes datasets, we find that predictions consuming tracked trajectories as inputs (the typical case in practice) can experience a significant (even order of magnitude) drop in performance in comparison to the idealized setting where ground truth past trajectories are used as inputs. To address this issue, we propose a multi-hypothesis tracking and prediction framework. Rather than relying on a single set of tracking results for prediction, our framework simultaneously reasons about multiple sets of tracking results, thereby increasing the likelihood of including accurate tracking results as inputs to prediction. We show that this framework improves overall prediction performance over the standard single-hypothesis tracking-prediction pipeline by up to 34.2% on the nuScenes dataset, with even more significant improvements (up to ~70%) when restricting the evaluation to challenging scenarios involving identity switches and fragments -- all with an acceptable computation overhead.

Submitted 18 October, 2021; originally announced October 2021.

Comments: Project page: https://www.xinshuoweng.com/projects/MTP

arXiv:2110.08802 [pdf, other]

Coordinated Multi-Agent Pathfinding for Drones and Trucks over Road Networks

Authors: Shushman Choudhury, Kiril Solovey, Mykel Kochenderfer, Marco Pavone

Abstract: We address the problem of routing a team of drones and trucks over large-scale urban road networks. To conserve their limited flight energy, drones can use trucks as temporary modes of transit en route to their own destinations. Such coordination can yield significant savings in total vehicle distance traveled, i.e., truck travel distance and drone flight distance, compared to operating drones and trucks independently. But it comes at the potentially prohibitive computational cost of deciding which trucks and drones should coordinate and when and where it is most beneficial to do so. We tackle this fundamental trade-off by decoupling our overall intractable problem into tractable sub-problems that we solve stage-wise. The first stage solves only for trucks, by computing paths that make them more likely to be useful transit options for drones. The second stage solves only for drones, by routing them over a composite of the road network and the transit network defined by truck paths from the first stage. We design a comprehensive algorithmic framework that frames each stage as a multi-agent path-finding problem and implement two distinct methods for solving them. We evaluate our approach on extensive simulations with up to $100$ agents on the real-world Manhattan road network containing nearly $4500$ vertices and $10000$ edges. Our framework saves on more than $50\%$ of vehicle distance traveled compared to independently solving for trucks and drones, and computes solutions for all settings within $5$ minutes on commodity hardware.

Submitted 10 February, 2022; v1 submitted 17 October, 2021; originally announced October 2021.

Comments: Accepted to Autonomous Agents and Multiagent Systems, 2022

arXiv:2110.03270 [pdf, other]

Injecting Planning-Awareness into Prediction and Detection Evaluation

Authors: Boris Ivanovic, Marco Pavone

Abstract: Detecting other agents and forecasting their behavior is an integral part of the modern robotic autonomy stack, especially in safety-critical scenarios entailing human-robot interaction such as autonomous driving. Due to the importance of these components, there has been a significant amount of interest and research in perception and trajectory forecasting, resulting in a wide variety of approaches. Common to most works, however, is the use of the same few accuracy-based evaluation metrics, e.g., intersection-over-union, displacement error, log-likelihood, etc. While these metrics are informative, they are task-agnostic and outputs that are evaluated as equal can lead to vastly different outcomes in downstream planning and decision making. In this work, we take a step back and critically assess current evaluation metrics, proposing task-aware metrics as a better measure of performance in systems where they are deployed. Experiments on an illustrative simulation as well as real-world autonomous driving data validate that our proposed task-aware metrics are able to account for outcome asymmetry and provide a better estimate of a model's closed-loop performance.

Submitted 7 October, 2021; originally announced October 2021.

Comments: 8 pages, 9 figures. arXiv admin note: substantial text overlap with arXiv:2107.10297

arXiv:2110.03267 [pdf, other]

Propagating State Uncertainty Through Trajectory Forecasting

Authors: Boris Ivanovic, Yifeng Lin, Shubham Shrivastava, Punarjay Chakravarty, Marco Pavone

Abstract: Uncertainty pervades the modern robotic autonomy stack, with nearly every component (e.g., sensors, detection, classification, tracking, behavior prediction) producing continuous or discrete probabilistic distributions. Trajectory forecasting, in particular, is surrounded by uncertainty as its inputs are produced by (noisy) upstream perception and its outputs are predictions that are often probabilistic for use in downstream planning. However, most trajectory forecasting methods do not account for upstream uncertainty, instead taking only the most-likely values. As a result, perceptual uncertainties are not propagated through forecasting and predictions are frequently overconfident. To address this, we present a novel method for incorporating perceptual state uncertainty in trajectory forecasting, a key component of which is a new statistical distance-based loss function which encourages predicting uncertainties that better match upstream perception. We evaluate our approach both in illustrative simulations and on large-scale, real-world data, demonstrating its efficacy in propagating perceptual state uncertainty through prediction and producing more calibrated predictions.

Submitted 12 July, 2022; v1 submitted 7 October, 2021; originally announced October 2021.

Comments: IEEE International Conference on Robotics and Automation (ICRA) 2022 -- 8 pages, 6 figures, 4 tables

arXiv:2109.14675 [pdf, other]

Data Sharing and Compression for Cooperative Networked Control

Authors: Jiangnan Cheng, Marco Pavone, Sachin Katti, Sandeep Chinchali, Ao Tang

Abstract: Sharing forecasts of network timeseries data, such as cellular or electricity load patterns, can improve independent control applications ranging from traffic scheduling to power generation. Typically, forecasts are designed without knowledge of a downstream controller's task objective, and thus simply optimize for mean prediction error. However, such task-agnostic representations are often too large to stream over a communication network and do not emphasize salient temporal features for cooperative control. This paper presents a solution to learn succinct, highly-compressed forecasts that are co-designed with a modular controller's task objective. Our simulations with real cellular, Internet-of-Things (IoT), and electricity load data show we can improve a model predictive controller's performance by at least $25\%$ while transmitting $80\%$ less data than the competing method. Further, we present theoretical compression results for a networked variant of the classical linear quadratic regulator (LQR) control problem.

Submitted 5 October, 2021; v1 submitted 29 September, 2021; originally announced September 2021.

Comments: Accepted by 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

arXiv:2109.14152 [pdf, other]

Lyapunov-stable neural-network control

Authors: Hongkai Dai, Benoit Landry, Lujie Yang, Marco Pavone, Russ Tedrake

Abstract: Deep learning has had a far reaching impact in robotics. Specifically, deep reinforcement learning algorithms have been highly effective in synthesizing neural-network controllers for a wide range of tasks. However, despite this empirical success, these controllers still lack theoretical guarantees on their performance, such as Lyapunov stability (i.e., all trajectories of the closed-loop system are guaranteed to converge to a goal state under the control policy). This is in stark contrast to traditional model-based controller design, where principled approaches (like LQR) can synthesize stable controllers with provable guarantees. To address this gap, we propose a generic method to synthesize a Lyapunov-stable neural-network controller, together with a neural-network Lyapunov function to simultaneously certify its stability. Our approach formulates the Lyapunov condition verification as a mixed-integer linear program (MIP). Our MIP verifier either certifies the Lyapunov condition, or generates counter examples that can help improve the candidate controller and the Lyapunov function. We also present an optimization program to compute an inner approximation of the region of attraction for the closed-loop system. We apply our approach to robots including an inverted pendulum, a 2D and a 3D quadrotor, and showcase that our neural-network controller outperforms a baseline LQR controller. The code is open sourced at https://github.com/StanfordASL/neural-network-lyapunov.

Submitted 28 September, 2021; originally announced September 2021.

Comments: Published at Robotics: Science and Systems (RSS) in July, 2021

arXiv:2109.14082 [pdf, other]

doi 10.1177/02783649231221580

Sample-Efficient Safety Assurances using Conformal Prediction

Authors: Rachel Luo, Shengjia Zhao, Jonathan Kuck, Boris Ivanovic, Silvio Savarese, Edward Schmerling, Marco Pavone

Abstract: When deploying machine learning models in high-stakes robotics applications, the ability to detect unsafe situations is crucial. Early warning systems can provide alerts when an unsafe situation is imminent (in the absence of corrective action). To reliably improve safety, these warning systems should have a provable false negative rate; i.e. of the situations that are unsafe, fewer than $ε$ will occur without an alert. In this work, we present a framework that combines a statistical inference technique known as conformal prediction with a simulator of robot/environment dynamics, in order to tune warning systems to provably achieve an $ε$ false negative rate using as few as $1/ε$ data points. We apply our framework to a driver warning system and a robotic grasping application, and empirically demonstrate guaranteed false negative rate while also observing low false detection (positive) rate.

Submitted 2 January, 2024; v1 submitted 28 September, 2021; originally announced September 2021.

Comments: International Journal of Robotics Research, 2023

arXiv:2109.08706 [pdf, other]

Online Traffic Routing: Deterministic Limits and Data-driven Enhancements

Authors: Devansh Jalota, Dario Paccagnan, Maximilian Schiffer, Marco Pavone

Abstract: Over the past decade, GPS enabled traffic applications, such as Google Maps and Waze, have become ubiquitous and have had a significant influence on billions of daily commuters' travel patterns. A consequence of the online route suggestions of such applications, e.g., via greedy routing, has often been an increase in traffic congestion since the induced travel patterns may be far from the system optimum. Spurred by the widespread impact of traffic applications on travel patterns, this work studies online traffic routing in the context of capacity-constrained parallel road networks and analyzes this problem from two perspectives. First, we perform a worst-case analysis to identify the limits of deterministic online routing and show that the ratio between the total travel cost of the online solution of any deterministic algorithm and that of the optimal offline solution is unbounded, even in simple settings. This result motivates us to move beyond worst-case analysis. Here, we consider algorithms that exploit knowledge of past problem instances and show how to design data-driven algorithms whose performance can be quantified and formally generalized to unseen future instances. We present numerical experiments based on an application case for the San Francisco Bay Area to evaluate the performance of our approach. Our results show that the data-driven algorithms we develop outperform commonly used greedy online-routing algorithms.

Submitted 17 September, 2021; originally announced September 2021.

arXiv:2109.04453 [pdf, other]

doi 10.1109/LRA.2022.3153712

Tube-Certified Trajectory Tracking for Nonlinear Systems With Robust Control Contraction Metrics

Authors: Pan Zhao, Arun Lakshmanan, Kasey Ackerman, Aditya Gahlawat, Marco Pavone, Naira Hovakimyan

Abstract: This paper presents an approach towards guaranteed trajectory tracking for nonlinear control-affine systems subject to external disturbances based on robust control contraction metrics (CCM) that aims to minimize the $\mathcal L_\infty$ gain from the disturbances to nominal-actual trajectory deviations. The guarantee is in the form of invariant tubes, computed offline and valid for any nominal trajectories, in which the actual states and inputs of the system are guaranteed to stay despite disturbances. Under mild assumptions, we prove that the proposed robust CCM (RCCM) approach yields tighter tubes than an existing approach based on CCM and input-to-state stability analysis. We show how the RCCM-based tracking controller together with tubes can be incorporated into a feedback motion planning framework to plan safe trajectories for robotic systems. Simulation results illustrate the effectiveness of the proposed method and empirically demonstrate reduced conservatism compared to the CCM-based approach.

Submitted 6 July, 2023; v1 submitted 9 September, 2021; originally announced September 2021.

Comments: Extended version of a paper published in IEEE Robotics and Automation Letters (2022). 13 pages, 6 figures

arXiv:2109.01627 [pdf, other]

doi 10.1109/TCNS.2023.3338248

On the Interplay between Self-Driving Cars and Public Transportation

Authors: Nicolas Lanzetti, Maximilian Schiffer, Michael Ostrovsky, Marco Pavone

Abstract: Cities worldwide struggle with overloaded transportation systems and their externalities. The emerging autonomous transportation technology has the potential to alleviate these issues, but the decisions of profit-maximizing operators running large autonomous fleets could negatively impact other stakeholders and the transportation system. An analysis of these tradeoffs requires modeling the modes of transportation in a unified framework. In this paper, we propose such a framework, which allows us to study the interplay among mobility service providers (MSPs), public transport authorities, and customers. Our framework combines a graph-theoretic network model for the transportation system with a game-theoretic market model in which MSPs are profit maximizers while customers select individually-optimal transportation options. We apply our framework to data for the city of Berlin and present sensitivity analyses to study parameters that MSPs or municipalities can strategically influence. We show that autonomous ride-hailing systems may cannibalize a public transportation system, serving between 7% and 80% of all customers, depending on market conditions and policy restrictions.

Submitted 23 December, 2023; v1 submitted 3 September, 2021; originally announced September 2021.

Comments: Accepted for publication in the IEEE Transactions on Control of Network Systems

arXiv:2108.11456 [pdf, other]

Vision-based Autonomous Disinfection of High Touch Surfaces in Indoor Environments

Authors: Sean Roelofs, Benoit Landry, Myra Kurosu Jalil, Adrian Martin, Saisneha Koppaka, Sindy K. Y. Tang, Marco Pavone

Abstract: Autonomous systems have played an important role in response to the Covid-19 pandemic. Notably, there have been multiple attempts to leverage Unmanned Aerial Vehicles (UAVs) to disinfect surfaces. Although recent research suggests that surface transmission is less significant than airborne transmission in the spread of Covid-19, surfaces and fomites can play, and have played, critical roles in the transmission of Covid-19 and many other viruses, especially in settings such as child daycares, schools, offices, and hospitals. Employing UAVs for mass spray disinfection offers several potential advantages, including high-throughput application of disinfectant, large scale deployment, and the minimization of health risks to sanitation workers. Despite these potential benefits and preliminary usage of UAVs for disinfection, there has been little research into their design and effectiveness. In this work, we present an autonomous UAV capable of effectively disinfecting indoor surfaces. We identify relevant parameters such as disinfectant type and concentration, and application time and distance required of the UAV to disinfect high-touch surfaces such as door handles. Finally, we develop a robotic system that enables the fully autonomous disinfection of door handles in an unstructured and previously unknown environment. To our knowledge, this is the smallest untethered UAV ever built with both full autonomy and spraying capabilities, allowing it to operate in confined indoor settings, and the first autonomous UAV to specifically target high-touch surfaces on an individual basis with spray disinfectant, resulting in more efficient use of disinfectant.

Submitted 16 September, 2021; v1 submitted 25 August, 2021; originally announced August 2021.

arXiv:2107.14412 [pdf, other]

Towards Data-Driven Synthesis of Autonomous Vehicle Safety Concepts

Authors: Karen Leung, Andrea Bajcsy, Edward Schmerling, Marco Pavone

Abstract: As safety-critical autonomous vehicles (AVs) will soon become pervasive in our society, a number of safety concepts for trusted AV deployment have recently been proposed throughout industry and academia. Yet, achieving consensus on an appropriate safety concept is still an elusive task. In this paper, we advocate for the use of Hamilton-Jacobi (HJ) reachability as a unifying mathematical framework for comparing existing safety concepts, and through elements of this framework propose ways to tailor safety concepts (and thus expand their applicability) to scenarios with implicit expectations on agent behavior in a data-driven fashion. Specifically, we show that (i) existing predominant safety concepts can be embedded in the HJ reachability framework, thereby enabling a common language for comparing and contrasting modeling assumptions, and (ii) HJ reachability can serve as an inductive bias to effectively reason, in a learning context, about two critical, yet often overlooked aspects of safety: responsibility and context-dependency.

Submitted 20 June, 2022; v1 submitted 29 July, 2021; originally announced July 2021.

arXiv:2107.13682 [pdf, other]

Bayesian Embeddings for Few-Shot Open World Recognition

Authors: John Willes, James Harrison, Ali Harakeh, Chelsea Finn, Marco Pavone, Steven Waslander

Abstract: As autonomous decision-making agents move from narrow operating environments to unstructured worlds, learning systems must move from a closed-world formulation to an open-world and few-shot setting in which agents continuously learn new classes from small amounts of information. This stands in stark contrast to modern machine learning systems that are typically designed with a known set of classes and a large number of examples for each class. In this work we extend embedding-based few-shot learning algorithms to the open-world recognition setting. We combine Bayesian non-parametric class priors with an embedding-based pre-training scheme to yield a highly flexible framework which we refer to as few-shot learning for open world recognition (FLOWR). We benchmark our framework on open-world extensions of the common MiniImageNet and TieredImageNet few-shot learning datasets. Our results show, compared to prior methods, strong classification accuracy performance and up to a 12% improvement in H-measure (a measure of novel class detection) from our non-parametric open-world few-shot learning scheme.

Submitted 5 October, 2022; v1 submitted 28 July, 2021; originally announced July 2021.

arXiv:2107.10297 [pdf, other]

Rethinking Trajectory Forecasting Evaluation

Authors: Boris Ivanovic, Marco Pavone

Abstract: Forecasting the behavior of other agents is an integral part of the modern robotic autonomy stack, especially in safety-critical scenarios with human-robot interaction, such as autonomous driving. In turn, there has been a significant amount of interest and research in trajectory forecasting, resulting in a wide variety of approaches. Common to all works, however, is the use of the same few accuracy-based evaluation metrics, e.g., displacement error and log-likelihood. While these metrics are informative, they are task-agnostic and predictions that are evaluated as equal can lead to vastly different outcomes, e.g., in downstream planning and decision making. In this work, we take a step back and critically evaluate current trajectory forecasting metrics, proposing task-aware metrics as a better measure of performance in systems where prediction is being deployed. We additionally present one example of such a metric, incorporating planning-awareness within existing trajectory forecasting metrics.

Submitted 21 July, 2021; originally announced July 2021.

Comments: 4 pages, 2 figures

arXiv:2107.08143 [pdf, other]

CoCo: Online Mixed-Integer Control via Supervised Learning

Authors: A. Cauligi, P. Culbertson, E. Schmerling, M. Schwager, B. Stellato, M. Pavone

Abstract: Many robotics problems, from robot motion planning to object manipulation, can be modeled as mixed-integer convex programs (MICPs). However, state-of-the-art algorithms are still unable to solve MICPs for control problems quickly enough for online use and existing heuristics can typically only find suboptimal solutions that might degrade robot performance. In this work, we turn to data-driven methods and present the Combinatorial Offline, Convex Online (CoCo) algorithm for quickly finding high quality solutions for MICPs. CoCo consists of a two-stage approach. In the offline phase, we train a neural network classifier that maps the problem parameters to a logical strategy, which we define as the discrete arguments and relaxed big-M constraints associated with the optimal solution for that problem. Online, the classifier is applied to select a candidate logical strategy given new problem parameters; applying this logical strategy allows us to solve the original MICP as a convex optimization problem. We show through numerical experiments how CoCo finds near optimal solutions to MICPs arising in robot planning and control with 1 to 2 orders of magnitude solution speedup compared to other data-driven approaches and solvers.

Submitted 16 July, 2021; originally announced July 2021.

arXiv:2107.00165 [pdf, other]

Joint Optimization of Autonomous Electric Vehicle Fleet Operations and Charging Station Siting

Authors: Justin Luke, Mauro Salazar, Ram Rajagopal, Marco Pavone

Abstract: Charging infrastructure is the coupling link between power and transportation networks, thus determining charging station siting is necessary for planning of power and transportation systems. While previous works have either optimized for charging station siting given historic travel behavior, or optimized fleet routing and charging given an assumed placement of the stations, this paper introduces a linear program that optimizes for station siting and macroscopic fleet operations in a joint fashion. Given an electricity retail rate and a set of travel demand requests, the optimization minimizes total cost for an autonomous EV fleet comprising travel costs, station procurement costs, fleet procurement costs, and electricity costs, including demand charges. Specifically, the optimization returns the number of charging plugs for each charging rate (e.g., Level 2, DC fast charging) at each candidate location, as well as the optimal routing and charging of the fleet. From a case-study of an electric vehicle fleet operating in San Francisco, our results show that, albeit with range limitations, small EVs with low procurement costs and high energy efficiencies are the most cost-effective in terms of total ownership costs. Furthermore, the optimal siting of charging stations is more spatially distributed than the current siting of stations, consisting mainly of high-power Level 2 AC stations (16.8 kW) with a small share of DC fast charging stations and no standard 7.7 kW Level 2 stations. Optimal siting reduces the total costs, empty vehicle travel, and peak charging load by up to 10%.

Submitted 21 July, 2021; v1 submitted 30 June, 2021; originally announced July 2021.

Comments: 9 pages, 7 figures. Corrected typos, revised text for clarification purposes, results unchanged. A version of this submission, with minor formatting changes, is to be published in the proceedings of the 24th IEEE International Conference on Intelligent Transportation Systems (ITSC 2021)

arXiv:2106.14827 [pdf]

doi 10.1146/annurev-control-042920-012811

Analysis and Control of Autonomous Mobility-on-Demand Systems

Authors: Gioele Zardini, Nicolas Lanzetti, Marco Pavone, Emilio Frazzoli

Abstract: Challenged by urbanization and increasing travel needs, existing transportation systems need new mobility paradigms. In this article, we present the emerging concept of autonomous mobility-on-demand, whereby centrally orchestrated fleets of autonomous vehicles provide mobility service to customers. We provide a comprehensive review of methods and tools to model and solve problems related to autonomous mobility-on-demand systems. Specifically, we first identify problem settings for their analysis and control, from both operational and planning perspectives. We then review modeling aspects, including transportation networks, transportation demand, congestion, operational constraints, and interactions with existing infrastructure. Thereafter, we provide a systematic analysis of existing solution methods and performance metrics, highlighting trends and trade-offs. Finally, we present various directions for further research.

Submitted 18 November, 2021; v1 submitted 28 June, 2021; originally announced June 2021.

Comments: To appear in Annual Review of Control, Robotics, and Autonomous Systems

arXiv:2106.10412 [pdf, other]

Fisher Markets with Linear Constraints: Equilibrium Properties and Efficient Distributed Algorithms

Authors: Devansh Jalota, Marco Pavone, Qi Qi, Yinyu Ye

Abstract: The Fisher market is one of the most fundamental models for resource allocation problems in economic theory, wherein agents spend a budget of currency to buy goods that maximize their utilities, while producers sell capacity constrained goods in exchange for currency. However, the consideration of only two types of constraints, i.e., budgets of individual buyers and capacities of goods, makes Fisher markets less amenable for resource allocation settings when agents have additional linear constraints, e.g., knapsack and proportionality constraints. In this work, we introduce a modified Fisher market, where each agent may have additional linear constraints and show that this modification to classical Fisher markets fundamentally alters the properties of the market equilibrium as well as the optimal allocations. These properties of the modified Fisher market prompt us to introduce a budget perturbed social optimization problem (BP-SOP) and set prices based on the dual variables of BP-SOP's capacity constraints. To compute the budget perturbations, we develop a fixed point iterative scheme and validate its convergence through numerical experiments. Since this fixed point iterative scheme involves solving a centralized problem at each step, we propose a new class of distributed algorithms to compute equilibrium prices. In particular, we develop an Alternating Direction Method of Multipliers (ADMM) algorithm with strong convergence guarantees for Fisher markets with homogeneous linear constraints as well as for classical Fisher markets. In this algorithm, the prices are updated based on the tatonnement process, with a step size that is completely independent of the utilities of individual agents. Thus, our mechanism, both theoretically and computationally, overcomes a fundamental limitation of classical Fisher markets, which only consider capacity and budget constraints.

Submitted 18 June, 2021; originally announced June 2021.

arXiv:2106.10407 [pdf, other]

When Efficiency meets Equity in Congestion Pricing and Revenue Refunding Schemes

Authors: Devansh Jalota, Kiril Solovey, Karthik Gopalakrishnan, Stephen Zoepf, Hamsa Balakrishnan, Marco Pavone

Abstract: Congestion pricing has long been hailed as a means to mitigate traffic congestion; however, its practical adoption has been limited due to the resulting social inequity issue, e.g., low-income users are priced out of certain roads. This issue has spurred interest in the design of equitable mechanisms that aim to refund the collected toll revenues as lump-sum transfers to users. Although revenue refunding has been extensively studied for over three decades, there has been no thorough characterization of how such schemes can be designed to simultaneously achieve system efficiency and equity objectives. In this work, we bridge this gap through the study of congestion pricing and revenue refunding (CPRR) schemes in non-atomic congestion games. We first develop CPRR schemes, which in comparison to the untolled case, simultaneously increase system efficiency without worsening wealth inequality, while being user-favorable: irrespective of their initial wealth or values-of-time (which may differ across users), users would experience a lower travel cost after the implementation of the proposed scheme. We then characterize the set of optimal user-favorable CPRR schemes that simultaneously maximize system efficiency and minimize wealth inequality. Finally, we provide a concrete methodology for computing optimal CPRR schemes and also highlight additional equilibrium properties of these schemes under different models of user behavior. Overall, our work demonstrates that through appropriate refunding policies we can design user-favorable CPRR schemes that maximize system efficiency while reducing wealth inequality.

Submitted 30 March, 2023; v1 submitted 18 June, 2021; originally announced June 2021.

Comments: This paper was accepted to the 1st ACM conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO)

arXiv:2106.09125 [pdf, other]

Convex Optimization for Trajectory Generation

Authors: Danylo Malyuta, Taylor P. Reynolds, Michael Szmuk, Thomas Lew, Riccardo Bonalli, Marco Pavone, Behcet Acikmese

Abstract: Reliable and efficient trajectory generation methods are a fundamental need for autonomous dynamical systems of tomorrow. The goal of this article is to provide a comprehensive tutorial of three major convex optimization-based trajectory generation methods: lossless convexification (LCvx), and two sequential convex programming algorithms known as SCvx and GuSTO. In this article, trajectory generation is the computation of a dynamically feasible state and control signal that satisfies a set of constraints while optimizing key mission objectives. The trajectory generation problem is almost always nonconvex, which typically means that it is not readily amenable to efficient and reliable solution onboard an autonomous vehicle. The three algorithms that we discuss use problem reformulation and a systematic algorithmic strategy to nonetheless solve nonconvex trajectory generation tasks through the use of a convex optimizer. The theoretical guarantees and computational speed offered by convex optimization have made the algorithms popular in both research and industry circles. To date, the list of applications includes rocket landing, spacecraft hypersonic reentry, spacecraft rendezvous and docking, aerial motion planning for fixed-wing and quadrotor vehicles, robot motion planning, and more. Among these applications are high-profile rocket flights conducted by organizations like NASA, Masten Space Systems, SpaceX, and Blue Origin. This article aims to give the reader the tools and understanding necessary to work with each algorithm, and to know what each method can and cannot do. A publicly available source code repository supports the provided numerical examples. By the end of the article, the reader should be ready to use the methods, to extend them, and to contribute to their many exciting modern applications.

Submitted 16 June, 2021; originally announced June 2021.

Comments: 68 pages, 42 figures, 5 tables. This work has been submitted to the IEEE for possible publication

arXiv:2104.14250 [pdf, other]

Control Barrier Functions for Cyber-Physical Systems and Applications to NMPC

Authors: Jan Schilliger, Thomas Lew, Spencer M. Richards, Severin Hänggi, Marco Pavone, Christopher Onder

Abstract: Tractable safety-ensuring algorithms for cyber-physical systems are important in critical applications. Approaches based on Control Barrier Functions assume continuous enforcement, which is not possible in an online fashion. This paper presents two tractable algorithms to ensure forward invariance of discrete-time controlled cyber-physical systems. Both approaches are based on Control Barrier Functions to provide strict mathematical safety guarantees. The first algorithm exploits Lipschitz continuity and formulates the safety condition as a robust program which is subsequently relaxed to a set of affine conditions. The second algorithm is inspired by tube-NMPC and uses an affine Control Barrier Function formulation in conjunction with an auxiliary controller to guarantee safety of the system. We combine an approximate NMPC controller with the second algorithm to guarantee strict safety despite approximated constraints and show its effectiveness experimentally on a mini-Segway.

Submitted 29 April, 2021; originally announced April 2021.

arXiv:2104.12446 [pdf, other]

Heterogeneous-Agent Trajectory Forecasting Incorporating Class Uncertainty

Authors: Boris Ivanovic, Kuan-Hui Lee, Pavel Tokmakov, Blake Wulfe, Rowan McAllister, Adrien Gaidon, Marco Pavone

Abstract: Reasoning about the future behavior of other agents is critical to safe robot navigation. The multiplicity of plausible futures is further amplified by the uncertainty inherent to agent state estimation from data, including positions, velocities, and semantic class. Forecasting methods, however, typically neglect class uncertainty, conditioning instead only on the agent's most likely class, even though perception models often return full class distributions. To exploit this information, we present HAICU, a method for heterogeneous-agent trajectory forecasting that explicitly incorporates agents' class probabilities. We additionally present PUP, a new challenging real-world autonomous driving dataset, to investigate the impact of Perceptual Uncertainty in Prediction. It contains challenging crowded scenes with unfiltered agent class probabilities that reflect the long-tail of current state-of-the-art perception systems. We demonstrate that incorporating class probabilities in trajectory forecasting significantly improves performance in the face of uncertainty, and enables new forecasting capabilities such as counterfactual predictions.

Submitted 2 March, 2022; v1 submitted 26 April, 2021; originally announced April 2021.

Comments: 15 pages, 15 figures, 6 tables

arXiv:2104.11434 [pdf, other]

Graph Neural Network Reinforcement Learning for Autonomous Mobility-on-Demand Systems

Authors: Daniele Gammelli, Kaidi Yang, James Harrison, Filipe Rodrigues, Francisco C. Pereira, Marco Pavone

Abstract: Autonomous mobility-on-demand (AMoD) systems represent a rapidly developing mode of transportation wherein travel requests are dynamically handled by a coordinated fleet of robotic, self-driving vehicles. Given a graph representation of the transportation network - one where, for example, nodes represent areas of the city, and edges the connectivity between them - we argue that the AMoD control problem is naturally cast as a node-wise decision-making problem. In this paper, we propose a deep reinforcement learning framework to control the rebalancing of AMoD systems through graph neural networks. Crucially, we demonstrate that graph neural networks enable reinforcement learning agents to recover behavior policies that are significantly more transferable, generalizable, and scalable than policies learned through other approaches. Empirically, we show how the learned policies exhibit promising zero-shot transfer capabilities when faced with critical portability tasks such as inter-city generalization, service area expansion, and adaptation to potentially complex urban topologies.

Submitted 16 August, 2021; v1 submitted 23 April, 2021; originally announced April 2021.

arXiv:2104.08261 [pdf, other]

Adaptive Robust Model Predictive Control with Matched and Unmatched Uncertainty

Authors: Rohan Sinha, James Harrison, Spencer M. Richards, Marco Pavone

Abstract: We propose a learning-based robust predictive control algorithm that compensates for significant uncertainty in the dynamics for a class of discrete-time systems that are nominally linear with an additive nonlinear component. Such systems commonly model the nonlinear effects of an unknown environment on a nominal system. We optimize over a class of nonlinear feedback policies inspired by certainty equivalent "estimate-and-cancel" control laws pioneered in classical adaptive control to achieve significant performance improvements in the presence of uncertainties of large magnitude, a setting in which existing learning-based predictive control algorithms often struggle to guarantee safety. In contrast to previous work in robust adaptive MPC, our approach allows us to take advantage of structure (i.e., the numerical predictions) in the a priori unknown dynamics learned online through function approximation. Our approach also extends typical nonlinear adaptive control methods to systems with state and input constraints even when we cannot directly cancel the additive uncertain function from the dynamics. Moreover, we apply contemporary statistical estimation techniques to certify the system's safety through persistent constraint satisfaction with high probability. Finally, we show in simulation that our method can accommodate more significant unknown dynamics terms than existing methods.

Submitted 13 October, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

Comments: Major revision

arXiv:2104.07768 [pdf, other]

Trust but Verify: Cryptographic Data Privacy for Mobility Management

Authors: Matthew Tsao, Kaidi Yang, Stephen Zoepf, Marco Pavone

Abstract: The era of Big Data has brought with it a richer understanding of user behavior through massive data sets, which can help organizations optimize the quality of their services. In the context of transportation research, mobility data can provide Municipal Authorities (MA) with insights on how to operate, regulate, or improve the transportation network. Mobility data, however, may contain sensitive information about end users and trade secrets of Mobility Providers (MP). Due to this data privacy concern, MPs may be reluctant to contribute their datasets to MA. Using ideas from cryptography, we propose an interactive protocol between an MA and an MP in which MA obtains insights from mobility data without MP having to reveal its trade secrets or sensitive data of its users. This is accomplished in two steps: a commitment step, and a computation step. In the first step, Merkle commitments and aggregated traffic measurements are used to generate a cryptographic commitment. In the second step, MP extracts insights from the data and sends them to MA. Using the commitment and zero-knowledge proofs, MA can certify that the information received from MP is accurate, without needing to directly inspect the mobility data. We also present a differentially private version of the protocol that is suitable for the large query regime. The protocol is verifiable for both MA and MP in the sense that dishonesty from one party can be detected by the other. The protocol can be readily extended to the more general setting with multiple MPs via secure multi-party computation.

Submitted 15 November, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

arXiv:2104.02213 [pdf, other]

Particle MPC for Uncertain and Learning-Based Control

Authors: Robert Dyro, James Harrison, Apoorva Sharma, Marco Pavone

Abstract: As robotic systems move from highly structured environments to open worlds, incorporating uncertainty from dynamics learning or state estimation into the control pipeline is essential for robust performance. In this paper we present a nonlinear particle model predictive control (PMPC) approach to control under uncertainty, which directly incorporates any particle-based uncertainty representation, such as those common in robotics. Our approach builds on scenario methods for MPC, but in contrast to existing approaches, which either constrain all or only the first timestep to share actions across scenarios, we investigate the impact of a partial consensus horizon. Implementing this optimization for nonlinear dynamics by leveraging sequential convex optimization, our approach yields an efficient framework that can be tuned to the particular information gain dynamics of a system to mitigate both over-conservatism and over-optimism. We investigate our approach for two robotic systems across three problem settings: time-varying, partially observed dynamics; sensing uncertainty; and model-based reinforcement learning, and show that our approach improves performance over baselines in all settings.

Submitted 12 September, 2021; v1 submitted 5 April, 2021; originally announced April 2021.

Comments: Accepted to International Conference on Intelligent Robots and Systems (IROS) 2021

arXiv:2104.00098 [pdf, other]

Balancing Fairness and Efficiency in Traffic Routing via Interpolated Traffic Assignment

Authors: Devansh Jalota, Kiril Solovey, Matthew Tsao, Stephen Zoepf, Marco Pavone

Abstract: System optimum (SO) routing, wherein the total travel time of all users is minimized, is a holy grail for transportation authorities. However, SO routing may discriminate against users who incur much larger travel times than others to achieve high system efficiency, i.e., low total travel times. To address the inherent unfairness of SO routing, we study the $β$-fair SO problem whose goal is to minimize the total travel time while guaranteeing a $β \geq 1$ level of unfairness, which specifies the maximum possible ratio between the travel times of different users with shared origins and destinations. To obtain feasible solutions to the $β$-fair SO problem while achieving high system efficiency, we develop a new convex program, the Interpolated Traffic Assignment Problem (I-TAP), which interpolates between a fairness-promoting and an efficiency-promoting traffic-assignment objective. We evaluate the efficacy of I-TAP through theoretical bounds on the total system travel time and level of unfairness in terms of its interpolation parameter, as well as present a numerical comparison between I-TAP and a state-of-the-art algorithm on a range of transportation networks. The numerical results indicate that our approach is faster by several orders of magnitude as compared to the benchmark algorithm, while achieving higher system efficiency for all desirable levels of unfairness. We further leverage the structure of I-TAP to develop two pricing mechanisms to collectively enforce the I-TAP solution in the presence of selfish homogeneous and heterogeneous users, respectively, that independently choose routes to minimize their own travel costs. We mention that this is the first study of pricing in the context of fair routing for general road networks (as opposed to, e.g., parallel road networks).

Submitted 8 February, 2022; v1 submitted 31 March, 2021; originally announced April 2021.

arXiv:2103.11067 [pdf, other]

Multi-Agent Algorithms for Collective Behavior: A structural and application-focused atlas

Authors: Federico Rossi, Saptarshi Bandyopadhyay, Michael T. Wolf, Marco Pavone

Abstract: The goal of this paper is to provide a survey and application-focused atlas of collective behavior coordination algorithms for multi-agent systems. We survey the general family of collective behavior algorithms for multi-agent systems and classify them according to their underlying mathematical structure. In doing so, we aim to capture fundamental mathematical properties of algorithms (e.g., scalability with respect to the number of agents and bandwidth use) and to show how the same algorithm or family of algorithms can be used for multiple tasks and applications. Collectively, this paper provides an application-focused atlas of algorithms for collective behavior of multi-agent systems, with three objectives: 1. to act as a tutorial guide to practitioners in the selection of coordination algorithms for a given application; 2. to highlight how mathematically similar algorithms can be used for a variety of tasks, ranging from low-level control to high-level coordination; 3. to explore the state-of-the-art in the field of control of multi-agent systems and identify areas for future research.

Submitted 19 March, 2021; originally announced March 2021.

Comments: Under review for journal publication. Revised and extended version of arXiv:1803.05464

arXiv:2103.04490 [pdf, other]

Adaptive-Control-Oriented Meta-Learning for Nonlinear Systems

Authors: Spencer M. Richards, Navid Azizan, Jean-Jacques Slotine, Marco Pavone

Abstract: Real-time adaptation is imperative to the control of robots operating in complex, dynamic environments. Adaptive control laws can endow even nonlinear systems with good trajectory tracking performance, provided that any uncertain dynamics terms are linearly parameterizable with known nonlinear features. However, it is often difficult to specify such features a priori, such as for aerodynamic disturbances on rotorcraft or interaction forces between a manipulator arm and various objects. In this paper, we turn to data-driven modeling with neural networks to learn, offline from past data, an adaptive controller with an internal parametric model of these nonlinear features. Our key insight is that we can better prepare the controller for deployment with control-oriented meta-learning of features in closed-loop simulation, rather than regression-oriented meta-learning of features to fit input-output data. Specifically, we meta-learn the adaptive controller with closed-loop tracking simulation as the base-learner and the average tracking error as the meta-objective. With a nonlinear planar rotorcraft subject to wind, we demonstrate that our adaptive controller outperforms other controllers trained with regression-oriented meta-learning when deployed in closed-loop for trajectory tracking control.

Submitted 19 June, 2021; v1 submitted 7 March, 2021; originally announced March 2021.

Comments: Robotics: Science and Systems, Virtual, 2021

arXiv:2102.12567 [pdf, other]

Sketching Curvature for Efficient Out-of-Distribution Detection for Deep Neural Networks

Authors: Apoorva Sharma, Navid Azizan, Marco Pavone

Abstract: In order to safely deploy Deep Neural Networks (DNNs) within the perception pipelines of real-time decision making systems, there is a need for safeguards that can detect out-of-training-distribution (OoD) inputs both efficiently and accurately. Building on recent work leveraging the local curvature of DNNs to reason about epistemic uncertainty, we propose Sketching Curvature of OoD Detection (SCOD), an architecture-agnostic framework for equipping any trained DNN with a task-relevant epistemic uncertainty estimate. Offline, given a trained model and its training data, SCOD employs tools from matrix sketching to tractably compute a low-rank approximation of the Fisher information matrix, which characterizes which directions in the weight space are most influential on the predictions over the training data. Online, we estimate uncertainty by measuring how much perturbations orthogonal to these directions can alter predictions at a new test input. We apply SCOD to pre-trained networks of varying architectures on several tasks, ranging from regression to classification. We demonstrate that SCOD achieves comparable or better OoD detection performance with lower computational burden relative to existing baselines.

Submitted 24 February, 2021; originally announced February 2021.

arXiv:2102.10809 [pdf, other]

Local Calibration: Metrics and Recalibration

Authors: Rachel Luo, Aadyot Bhatnagar, Yu Bai, Shengjia Zhao, Huan Wang, Caiming Xiong, Silvio Savarese, Stefano Ermon, Edward Schmerling, Marco Pavone

Abstract: Probabilistic classifiers output confidence scores along with their predictions, and these confidence scores should be calibrated, i.e., they should reflect the reliability of the prediction. Confidence scores that minimize standard metrics such as the expected calibration error (ECE) accurately measure the reliability on average across the entire population. However, it is in general impossible to measure the reliability of an individual prediction. In this work, we propose the local calibration error (LCE) to span the gap between average and individual reliability. For each individual prediction, the LCE measures the average reliability of a set of similar predictions, where similarity is quantified by a kernel function on a pretrained feature space and by a binning scheme over predicted model confidences. We show theoretically that the LCE can be estimated sample-efficiently from data, and empirically find that it reveals miscalibration modes that are more fine-grained than the ECE can detect. Our key result is a novel local recalibration method, LoRe, to improve confidence scores for individual predictions and decrease the LCE. Experimentally, we show that our recalibration method produces more accurate confidence scores, which improves downstream fairness and decision making on classification tasks with both image and tabular data.

Submitted 18 August, 2022; v1 submitted 22 February, 2021; originally announced February 2021.

arXiv:2101.12086 [pdf, other]

Risk-sensitive safety analysis using Conditional Value-at-Risk

Authors: Margaret P. Chapman, Riccardo Bonalli, Kevin M. Smith, Insoon Yang, Marco Pavone, Claire J. Tomlin

Abstract: This paper develops a safety analysis method for stochastic systems that is sensitive to the possibility and severity of rare harmful outcomes. We define risk-sensitive safe sets as sub-level sets of the solution to a non-standard optimal control problem, where a random maximum cost is assessed via Conditional Value-at-Risk (CVaR). The objective function represents the maximum extent of constraint violation of the state trajectory, averaged over a given percentage of worst cases. This problem is well-motivated but difficult to solve tractably because the temporal decomposition for CVaR is history-dependent. Our primary theoretical contribution is to derive computationally tractable under-approximations to risk-sensitive safe sets. Our method provides a novel, theoretically guaranteed, parameter-dependent upper bound to the CVaR of a maximum cost without the need to augment the state space. For a fixed parameter value, the solution to only one Markov decision process problem is required to obtain the under-approximations for any family of risk-sensitivity levels. In addition, we propose a second definition for risk-sensitive safe sets and provide a tractable method for their estimation without using a parameter-dependent upper bound. The second definition is expressed in terms of a new coherent risk functional, which is inspired by CVaR. We demonstrate our primary theoretical contribution via numerical examples.

Submitted 25 June, 2022; v1 submitted 28 January, 2021; originally announced January 2021.

Journal ref: IEEE Transactions on Automatic Control, 2022

arXiv:2101.01297 [pdf, other]

Composable Geometric Motion Policies using Multi-Task Pullback Bundle Dynamical Systems

Authors: Andrew Bylard, Riccardo Bonalli, Marco Pavone

Abstract: Despite decades of work in fast reactive planning and control, challenges remain in developing reactive motion policies on non-Euclidean manifolds and enforcing constraints while avoiding undesirable potential function local minima. This work presents a principled method for designing and fusing desired robot task behaviors into a stable robot motion policy, leveraging the geometric structure of non-Euclidean manifolds, which are prevalent in robot configuration and task spaces. Our Pullback Bundle Dynamical Systems (PBDS) framework drives desired task behaviors and prioritizes tasks using separate position-dependent and position/velocity-dependent Riemannian metrics, respectively, thus simplifying individual task design and modular composition of tasks. For enforcing constraints, we provide a class of metric-based tasks, eliminating local minima by imposing non-conflicting potential functions only for goal region attraction. We also provide a geometric optimization problem for combining tasks inspired by Riemannian Motion Policies (RMPs) that reduces to a simple least-squares problem, and we show that our approach is geometrically well-defined. We demonstrate the PBDS framework on the sphere $\mathbb S^2$ and at 300-500 Hz on a manipulator arm, and we provide task design guidance and an open-source Julia library implementation. Overall, this work presents a fast, easy-to-use framework for generating motion policies without unwanted potential function local minima on general manifolds.

Submitted 24 March, 2021; v1 submitted 4 January, 2021; originally announced January 2021.

arXiv:2012.06739 [pdf, other]

Sampling Training Data for Continual Learning Between Robots and the Cloud

Authors: Sandeep Chinchali, Evgenya Pergament, Manabu Nakanoya, Eyal Cidon, Edward Zhang, Dinesh Bharadia, Marco Pavone, Sachin Katti

Abstract: Today's robotic fleets are increasingly measuring high-volume video and LIDAR sensory streams, which can be mined for valuable training data, such as rare scenes of road construction sites, to steadily improve robotic perception models. However, re-training perception models on growing volumes of rich sensory data in central compute servers (or the "cloud") places an enormous time and cost burden on network transfer, cloud storage, human annotation, and cloud computing resources. Hence, we introduce HarvestNet, an intelligent sampling algorithm that resides on-board a robot and reduces system bottlenecks by only storing rare, useful events to steadily improve perception models re-trained in the cloud. HarvestNet significantly improves the accuracy of machine-learning models on our novel dataset of road construction sites, field testing of self-driving cars, and streaming face recognition, while reducing cloud storage, dataset annotation time, and cloud compute time by 65.7% to 81.3%. Further, it is 1.05x to 2.58x more accurate than baseline algorithms and runs scalably on embedded deep learning hardware. We provide a suite of compute-efficient perception models for the Google Edge Tensor Processing Unit (TPU), an extended technical report, and a novel video dataset to the research community at https://sites.google.com/view/harvestnet.

Submitted 12 December, 2020; originally announced December 2020.

Comments: International Symposium on Experimental Robotics (ISER) 2020, Malta

arXiv:2012.03390 [pdf, other]

On Infusing Reachability-Based Safety Assurance within Planning Frameworks for Human-Robot Vehicle Interactions

Authors: Karen Leung, Edward Schmerling, Mengxuan Zhang, Mo Chen, John Talbot, J. Christian Gerdes, Marco Pavone

Abstract: Action anticipation, intent prediction, and proactive behavior are all desirable characteristics for autonomous driving policies in interactive scenarios. Paramount, however, is ensuring safety on the road -- a key challenge in doing so is accounting for uncertainty in human driver actions without unduly impacting planner performance. This paper introduces a minimally-interventional safety controller operating within an autonomous vehicle control stack with the role of ensuring collision-free interaction with an externally controlled (e.g., human-driven) counterpart while respecting static obstacles such as a road boundary wall. We leverage reachability analysis to construct a real-time (100 Hz) controller that serves the dual role of (i) tracking an input trajectory from a higher-level planning algorithm using model predictive control, and (ii) assuring safety by maintaining the availability of a collision-free escape maneuver as a persistent constraint regardless of whatever future actions the other car takes. A full-scale steer-by-wire platform is used to conduct traffic weaving experiments wherein two cars, initially side-by-side, must swap lanes in a limited amount of time and distance, emulating cars merging onto/off of a highway. We demonstrate that, with our control stack, the autonomous vehicle is able to avoid collision even when the other car defies the planner's expectations and takes dangerous actions, either carelessly or with the intent to collide, and otherwise deviates minimally from the planned trajectory to the extent required to maintain safety.

Submitted 6 December, 2020; originally announced December 2020.

Comments: arXiv admin note: text overlap with arXiv:1812.11315

Journal ref: International Journal of Robotics Research, vol. 39, no. 10-11, pp. 1326--1345, 2020

Showing 101–150 of 262 results for author: Pavone, M