-
Control of Microrobots Using Model Predictive Control and Gaussian Processes for Disturbance Estimation
Authors:
Mehdi Kermanshah,
Logan E. Beaver,
Max Sokolich,
Sambeeta Das,
Ron Weiss,
Roberto Tron,
Calin Belta
Abstract:
This paper presents a control framework for magnetically actuated micron-scale robots ($μ$bots) designed to mitigate disturbances and improve trajectory tracking. To address the challenges posed by unmodeled dynamics and environmental variability, we combine data-driven modeling with model-based control to accurately track desired trajectories using a relatively small amount of data. The system is represented with a simple linear model, and Gaussian Processes (GP) are employed to capture and estimate disturbances. This disturbance-enhanced model is then integrated into a Model Predictive Controller (MPC). Our approach demonstrates promising performance in both simulation and experimental setups, showcasing its potential for precise and reliable microrobot control in complex environments.
Submitted 4 June, 2024;
originally announced June 2024.
-
Interpretable Generative Adversarial Imitation Learning
Authors:
Wenliang Liu,
Danyang Li,
Erfan Aasi,
Roberto Tron,
Calin Belta
Abstract:
Imitation learning methods have demonstrated considerable success in teaching autonomous systems complex tasks through expert demonstrations. However, a limitation of these methods is their lack of interpretability, particularly in understanding the specific task the learning agent aims to accomplish. In this paper, we propose a novel imitation learning method that combines Signal Temporal Logic (STL) inference and control synthesis, enabling the explicit representation of the task as an STL formula. This approach not only provides a clear understanding of the task but also allows for the incorporation of human knowledge and adaptation to new scenarios through manual adjustments of the STL formulae. Additionally, we employ a Generative Adversarial Network (GAN)-inspired training approach for both the inference and the control policy, effectively narrowing the gap between the expert and learned policies. The effectiveness of our algorithm is demonstrated through two case studies, showcasing its practical applicability and adaptability.
Submitted 15 February, 2024;
originally announced February 2024.
-
Robustness Measures and Monitors for Time Window Temporal Logic
Authors:
Ahmad Ahmad,
Cristian-Ioan Vasile,
Roberto Tron,
Calin Belta
Abstract:
Temporal logics (TLs) have been widely used to formalize interpretable tasks for cyber-physical systems. Time Window Temporal Logic (TWTL) has been recently proposed as a specification language for dynamical systems. In particular, it can easily express robotic tasks, and it allows for efficient, automata-based verification and synthesis of control policies for such systems. In this paper, we define two quantitative semantics for this logic, and two corresponding monitoring algorithms, which allow for real-time quantification of satisfaction of formulas by trajectories of discrete-time systems. We demonstrate the new semantics and their runtime monitors on numerical examples.
Submitted 13 April, 2023;
originally announced April 2023.
-
Learning Robust and Correct Controllers from Signal Temporal Logic Specifications Using BarrierNet
Authors:
Wenliang Liu,
Wei Xiao,
Calin Belta
Abstract:
In this paper, we consider the problem of learning a neural network controller for a system required to satisfy a Signal Temporal Logic (STL) specification. We exploit STL quantitative semantics to define a notion of robust satisfaction. Guaranteeing the correctness of a neural network controller, i.e., ensuring the satisfaction of the specification by the controlled system, is a difficult problem that has received a lot of attention recently. We provide a general procedure to construct a set of trainable High Order Control Barrier Functions (HOCBFs) enforcing the satisfaction of formulas in a fragment of STL. We use the BarrierNet, implemented by a differentiable Quadratic Program (dQP) with HOCBF constraints, as the last layer of the neural network controller, to guarantee the satisfaction of the STL formulas. We train the HOCBFs together with other neural network parameters to further improve the robustness of the controller. Simulation results demonstrate that our approach ensures satisfaction and outperforms existing algorithms.
Submitted 12 April, 2023;
originally announced April 2023.
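The abstract above relies on STL quantitative semantics. As a rough illustration only, the sketch below computes robustness for a predicate x > c and for the bounded "always" and "eventually" operators on a discrete-time signal, using the standard min/max semantics. It is not the paper's construction; the function names (`predicate_gt`, `always`, `eventually`) and the sample signal are hypothetical.

```python
from typing import List, Sequence


def predicate_gt(signal: Sequence[float], c: float) -> List[float]:
    """Robustness of the predicate x > c at each time step: x[t] - c."""
    return [x - c for x in signal]


def _window(rho: Sequence[float], a: int, b: int) -> Sequence[float]:
    """Return rho[a..b] (inclusive); reject empty or out-of-range windows."""
    if a < 0 or b < a or b >= len(rho):
        raise ValueError("time window [a, b] must lie inside the signal")
    return rho[a:b + 1]


def always(rho: Sequence[float], a: int, b: int) -> float:
    """Robustness of G_[a,b] phi at time 0: the worst case over the window."""
    return min(_window(rho, a, b))


def eventually(rho: Sequence[float], a: int, b: int) -> float:
    """Robustness of F_[a,b] phi at time 0: the best case over the window."""
    return max(_window(rho, a, b))


if __name__ == "__main__":
    rho = predicate_gt([0.0, 1.0, 3.0, 2.0], 1.5)
    print(eventually(rho, 0, 3))  # positive: x > 1.5 holds at some step
    print(always(rho, 0, 3))      # negative: x > 1.5 is violated at some step
```

A positive value means the formula is satisfied with that margin; a negative value means it is violated by that margin.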
-
LQR-CBF-RRT*: Safe and Optimal Motion Planning
Authors:
Guang Yang,
Mingyu Cai,
Ahmad Ahmad,
Amanda Prorok,
Roberto Tron,
Calin Belta
Abstract:
We present LQR-CBF-RRT*, an incremental sampling-based algorithm for offline motion planning. Our framework leverages the strength of Control Barrier Functions (CBFs) and Linear Quadratic Regulators (LQR) to generate safety-critical and optimal trajectories for a robot with dynamics described by an affine control system. CBFs are used for safety guarantees, while LQRs are employed for optimal control synthesis during edge extensions. Popular CBF-based formulations for safety critical control require solving Quadratic Programs (QPs), which can be computationally expensive. Moreover, LQR-based controllers require repetitive applications of first-order Taylor approximations for nonlinear systems, which can also create an additional computational burden. To improve the motion planning efficiency, we verify the satisfaction of the CBF constraints directly in edge extension to avoid the burden of solving the QPs. We store computed optimal LQR gain matrices in a hash table to avoid re-computation during the local linearization of the rewiring procedure. Lastly, we utilize the Cross-Entropy Method for importance sampling to improve sampling efficiency. Our results show that the proposed planner surpasses its counterparts in computational efficiency and performs well in an experimental setup.
Submitted 27 September, 2023; v1 submitted 3 April, 2023;
originally announced April 2023.
-
Auxiliary-Variable Adaptive Control Barrier Functions for Safety Critical Systems
Authors:
Shuo Liu,
Wei Xiao,
Calin A. Belta
Abstract:
This paper studies safety guarantees for systems with time-varying control bounds. It has been shown that optimizing quadratic costs subject to state and control constraints can be reduced to a sequence of Quadratic Programs (QPs) using Control Barrier Functions (CBFs). One of the main challenges in this method is that the CBF-based QP could easily become infeasible under tight control bounds, especially when the control bounds are time-varying. The recently proposed adaptive CBFs have addressed such infeasibility issues, but require extensive and non-trivial hyperparameter tuning for the CBF-based QP and may introduce overshooting control near the boundaries of safe sets. To address these issues, we propose a new type of adaptive CBFs called Auxiliary-Variable Adaptive CBFs (AVCBFs). Specifically, we introduce an auxiliary variable that multiplies each CBF itself, and define dynamics for the auxiliary variable to adapt it in constructing the corresponding CBF constraint. In this way, we can improve the feasibility of the CBF-based QP while avoiding extensive parameter tuning with non-overshooting control since the formulation is identical to classical CBF methods. We demonstrate the advantages of using AVCBFs and compare them with existing techniques on an Adaptive Cruise Control (ACC) problem with time-varying control bounds.
Submitted 19 April, 2024; v1 submitted 1 April, 2023;
originally announced April 2023.
-
Learning Feasibility Constraints for Control Barrier Functions
Authors:
Wei Xiao,
Christos G. Cassandras,
Calin A. Belta
Abstract:
It has been shown that optimizing quadratic costs while stabilizing affine control systems to desired (sets of) states subject to state and control constraints can be reduced to a sequence of Quadratic Programs (QPs) by using Control Barrier Functions (CBFs) and Control Lyapunov Functions (CLFs). In this paper, we employ machine learning techniques to ensure the feasibility of these QPs, which is a challenging problem, especially for high relative degree constraints where High Order CBFs (HOCBFs) are required. To this end, we propose a sampling-based learning approach to learn a new feasibility constraint for CBFs; this constraint is then enforced by another HOCBF added to the QPs. The accuracy of the learned feasibility constraint is recursively improved by a recurrent training algorithm. We demonstrate the advantages of the proposed learning approach to constrained optimal control problems with specific focus on a robot control problem and on autonomous driving in an unknown environment.
Submitted 10 March, 2023;
originally announced March 2023.
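Several abstracts in this list reduce safe control to a sequence of CBF-based QPs. As a minimal sketch under strong simplifying assumptions, the code below handles the special case of a single-integrator system (x' = u) with one relative-degree-one CBF h(x) = ||x - c||^2 - r^2 and no control bounds. In that case the QP min ||u - u_ref||^2 subject to a·u >= b has a closed-form solution (a projection onto a half-space), so no QP solver is needed. The names `cbf_filter` and `circle_cbf_constraint`, the class-K gain `alpha`, and the numbers are hypothetical; none of this reproduces the learning method of the paper above.

```python
from typing import List, Sequence, Tuple


def circle_cbf_constraint(
    x: Sequence[float], center: Sequence[float], radius: float, alpha: float
) -> Tuple[List[float], float]:
    """CBF constraint a.u >= b for x' = u and h(x) = ||x - c||^2 - r^2.

    dh/dt = 2 (x - c).u, so dh/dt + alpha * h >= 0 gives
    a = 2 (x - c) and b = -alpha * h(x).
    """
    diff = [xi - ci for xi, ci in zip(x, center)]
    h = sum(d * d for d in diff) - radius ** 2
    return [2.0 * d for d in diff], -alpha * h


def cbf_filter(u_ref: Sequence[float], a: Sequence[float], b: float) -> List[float]:
    """Solve min ||u - u_ref||^2 s.t. a.u >= b in closed form."""
    dot = sum(ai * ui for ai, ui in zip(a, u_ref))
    if dot >= b:
        return list(u_ref)  # reference input already satisfies the constraint
    norm2 = sum(ai * ai for ai in a)
    if norm2 == 0.0:
        raise ValueError("constraint is infeasible: a = 0 and b > 0")
    lam = (b - dot) / norm2  # smallest correction along a
    return [ui + lam * ai for ui, ai in zip(u_ref, a)]


if __name__ == "__main__":
    a, b = circle_cbf_constraint([2.0, 0.0], [0.0, 0.0], 1.0, alpha=1.0)
    print(cbf_filter([-2.0, 0.0], a, b))  # input heading at the obstacle is slowed
```

With control bounds or several constraints the closed form no longer applies and a QP solver is required, which is where the feasibility issues discussed in these abstracts arise.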
-
CatlNet: Learning Communication and Coordination Policies from CaTL+ Specifications
Authors:
Wenliang Liu,
Kevin Leahy,
Zachary Serlin,
Calin Belta
Abstract:
In this paper, we propose a learning-based framework to simultaneously learn the communication and distributed control policies for a heterogeneous multi-agent system (MAS) under complex mission requirements from Capability Temporal Logic plus (CaTL+) specifications. Both policies are trained, implemented, and deployed using a novel neural network model called CatlNet. Taking advantage of the robustness measure of CaTL+, we train CatlNet centrally to maximize it where network parameters are shared among all agents, allowing CatlNet to scale to large teams easily. CatlNet can then be deployed in a distributed manner. A plan repair algorithm is also introduced to guide CatlNet's training and improve both training efficiency and the overall performance of CatlNet. The CatlNet approach is tested in simulation and results show that, after training, CatlNet can steer the decentralized MAS online to satisfy a CaTL+ specification with a high success rate.
Submitted 30 November, 2022;
originally announced December 2022.
-
Learning a Tracking Controller for Rolling $μ$bots
Authors:
Logan E Beaver,
Max Sokolich,
Suhail Alsalehi,
Ron Weiss,
Sambeeta Das,
Calin Belta
Abstract:
Micron-scale robots ($μ$bots) have recently shown great promise for emerging medical applications. Accurately controlling $μ$bots, while critical to their successful deployment, is challenging. In this work, we consider the problem of tracking a reference trajectory using a $μ$bot in the presence of disturbances and uncertainty. The disturbances primarily come from Brownian motion and other environmental phenomena, while the uncertainty originates from errors in the model parameters. We model the $μ$bot as an uncertain unicycle that is controlled by a global magnetic field. To compensate for disturbances and uncertainties, we develop a nonlinear mismatch controller. We define the model mismatch error as the difference between our model's predicted velocity and the actual velocity of the $μ$bot. We employ a Gaussian Process to learn the model mismatch error as a function of the applied control input. Then we use a least-squares minimization to select a control action that minimizes the difference between the actual velocity of the $μ$bot and a reference velocity. We demonstrate the online performance of our joint learning and control algorithm in simulation, where our approach accurately learns the model mismatch and improves tracking performance. We also validate our approach in an experiment and show that certain error metrics are reduced by up to $40\%$.
Submitted 13 August, 2023; v1 submitted 30 November, 2022;
originally announced December 2022.
-
Iterative Convex Optimization for Model Predictive Control with Discrete-Time High-Order Control Barrier Functions
Authors:
Shuo Liu,
Jun Zeng,
Koushil Sreenath,
Calin A. Belta
Abstract:
Safety is one of the fundamental challenges in control theory. Recently, multi-step optimal control problems for discrete-time dynamical systems were formulated to enforce stability, while subject to input constraints as well as safety-critical requirements using discrete-time control barrier functions within a model predictive control (MPC) framework. Existing work usually focuses on the feasibility or the safety of the optimization problem, and the majority of the existing work restricts the discussion to relative-degree one control barrier functions. Additionally, the real-time computation is challenging when a large horizon is considered in the MPC problem for relative-degree one or high-order control barrier functions. In this paper, we propose a framework that solves the safety-critical MPC problem in an iterative optimization, which is applicable to control barrier functions of any relative degree. In the proposed formulation, the nonlinear system dynamics as well as the safety constraints modeled as discrete-time high-order control barrier functions (DHOCBF) are linearized at each time step. Our formulation is generally valid for any control barrier function with an arbitrary relative degree. The advantages of fast computational performance with safety guarantee are analyzed and validated with numerical results.
Submitted 13 July, 2023; v1 submitted 9 October, 2022;
originally announced October 2022.
-
Adaptive Sampling-based Motion Planning with Control Barrier Functions
Authors:
Ahmad Ahmad,
Calin Belta,
Roberto Tron
Abstract:
Sampling-based algorithms, such as Rapidly Exploring Random Trees (RRT) and its variants, have been used extensively for motion planning. Control barrier functions (CBFs) have been recently proposed to synthesize controllers for safety-critical systems. In this paper, we combine the effectiveness of RRT-based algorithms with the safety guarantees provided by CBFs in a method called CBF-RRT$^\ast$. CBFs are used for local trajectory planning for RRT$^\ast$, avoiding explicit collision checking of the extended paths. We prove that CBF-RRT$^\ast$ preserves the probabilistic completeness of RRT$^\ast$. Furthermore, in order to improve the sampling efficiency of the algorithm, we equip the algorithm with an adaptive sampling procedure, which is based on the cross-entropy method (CEM) for importance sampling (IS). The procedure exploits the tree of samples to focus the sampling in promising regions of the configuration space. We demonstrate the efficacy of the proposed algorithms through simulation examples.
Submitted 1 June, 2022;
originally announced June 2022.
-
Control Barrier Functions for Systems with Multiple Control Inputs
Authors:
Wei Xiao,
Christos G. Cassandras,
Calin A. Belta,
Daniela Rus
Abstract:
Control Barrier Functions (CBFs) are becoming popular tools in guaranteeing safety for nonlinear systems and constraints, and they can reduce a constrained optimal control problem into a sequence of Quadratic Programs (QPs) for affine control systems. The recently proposed High Order Control Barrier Functions (HOCBFs) work for arbitrary relative degree constraints. One of the challenges in a HOCBF is to address the relative degree problem when a system has multiple control inputs, i.e., the relative degree could be defined with respect to different components of the control vector. This paper proposes two methods for HOCBFs to deal with systems with multiple control inputs: a general integral control method and a method which is simpler but limited to specific classes of physical systems. When control bounds are involved, the feasibility of the above-mentioned QPs can also be significantly improved with the proposed methods. We illustrate our approaches on a unicycle model with two control inputs, and compare the two proposed methods to demonstrate their effectiveness and performance.
Submitted 15 March, 2022;
originally announced March 2022.
-
Distributed Control using Reinforcement Learning with Temporal-Logic-Based Reward Shaping
Authors:
Ningyuan Zhang,
Wenliang Liu,
Calin Belta
Abstract:
We present a computational framework for synthesis of distributed control strategies for a heterogeneous team of robots in a partially observable environment. The goal is to cooperatively satisfy specifications given as Truncated Linear Temporal Logic (TLTL) formulas. Our approach formulates the synthesis problem as a stochastic game and employs a policy graph method to find a control strategy with memory for each agent. We construct the stochastic game on the product between the team transition system and a finite state automaton (FSA) that tracks the satisfaction of the TLTL formula. We use the quantitative semantics of TLTL as the reward of the game, and further reshape it using the FSA to guide and accelerate the learning process. Simulation results demonstrate the efficacy of the proposed solution under demanding task specifications and the effectiveness of reward shaping in significantly accelerating the speed of learning.
Submitted 6 April, 2022; v1 submitted 8 March, 2022;
originally announced March 2022.
-
High Order Robust Adaptive Control Barrier Functions and Exponentially Stabilizing Adaptive Control Lyapunov Functions
Authors:
Max H. Cohen,
Calin Belta
Abstract:
This paper studies the problem of utilizing data-driven adaptive control techniques to guarantee stability and safety of uncertain nonlinear systems with high relative degree. We first introduce the notion of a High Order Robust Adaptive Control Barrier Function (HO-RaCBF) as a means to compute control policies guaranteeing satisfaction of high relative degree safety constraints in the face of parametric model uncertainty. The developed approach guarantees safety by initially accounting for all possible parameter realizations but adaptively reduces uncertainty in the parameter estimates leveraging data recorded online. We then introduce the notion of an Exponentially Stabilizing Adaptive Control Lyapunov Function (ES-aCLF) that leverages the same data as the HO-RaCBF controller to guarantee exponential convergence of the system trajectory. The developed HO-RaCBF and ES-aCLF are unified in a quadratic programming framework, whose efficacy is showcased via two numerical examples that, to our knowledge, cannot be addressed by existing adaptive control barrier function techniques.
Submitted 3 March, 2022;
originally announced March 2022.
-
Overcoming Exploration: Deep Reinforcement Learning for Continuous Control in Cluttered Environments from Temporal Logic Specifications
Authors:
Mingyu Cai,
Erfan Aasi,
Calin Belta,
Cristian-Ioan Vasile
Abstract:
Model-free continuous control for robot navigation tasks using Deep Reinforcement Learning (DRL) that relies on noisy policies for exploration is sensitive to the density of rewards. In practice, robots are usually deployed in cluttered environments, containing many obstacles and narrow passageways. Designing dense effective rewards is challenging, resulting in exploration issues during training. Such a problem becomes even more serious when tasks are described using temporal logic specifications. This work presents a deep policy gradient algorithm for controlling a robot with unknown dynamics operating in a cluttered environment when the task is specified as a Linear Temporal Logic (LTL) formula. To overcome the environmental challenge of exploration during training, we propose a novel path planning-guided reward scheme by integrating sampling-based methods to effectively complete goal-reaching missions. To facilitate LTL satisfaction, our approach decomposes the LTL mission into sub-goal-reaching tasks that are solved in a distributed manner. Our framework is shown to significantly improve performance (effectiveness, efficiency) and exploration of robots tasked with complex missions in large-scale cluttered environments. A video demonstration can be found on YouTube Channel: https://youtu.be/yMh_NUNWxho.
Submitted 23 February, 2023; v1 submitted 28 January, 2022;
originally announced January 2022.
-
Time-Incremental Learning from Data Using Temporal Logics
Authors:
Erfan Aasi,
Mingyu Cai,
Cristian Ioan Vasile,
Calin Belta
Abstract:
Real-time and human-interpretable decision-making in cyber-physical systems is a significant but challenging task, which usually requires predictions of possible future events from limited data. In this paper, we introduce a time-incremental learning framework: given a dataset of labeled signal traces with a common time horizon, we propose a method to predict the label of a signal that is received incrementally over time, referred to as a prefix signal. Prefix signals are the signals that are being observed as they are generated, and their time length is shorter than the common horizon of signals. We present a novel decision-tree based approach to generate a finite number of Signal Temporal Logic (STL) specifications from the given dataset, and construct a predictor based on them. Each STL specification, as a binary classifier of time-series data, captures the temporal properties of the dataset over time. The predictor is constructed by assigning time-variant weights to the STL formulas. The weights are learned by using neural networks, with the goal of minimizing the misclassification rate for the prefix signals defined over the given dataset. The learned predictor is used to predict the label of a prefix signal, by computing the weighted sum of the robustness of the prefix signal with respect to each STL formula. The effectiveness and classification performance of our algorithm are evaluated on urban-driving and naval-surveillance case studies.
Submitted 28 December, 2021;
originally announced December 2021.
-
Learning Spatio-Temporal Specifications for Dynamical Systems
Authors:
Suhail Alsalehi,
Erfan Aasi,
Ron Weiss,
Calin Belta
Abstract:
Learning dynamical systems properties from data provides important insights that help us understand such systems and mitigate undesired outcomes. In this work, we propose a framework for learning spatio-temporal (ST) properties as formal logic specifications from data. We introduce SVM-STL, an extension of Signal Temporal Logic (STL), capable of specifying spatial and temporal properties of a wide range of dynamical systems that exhibit time-varying spatial patterns. Our framework utilizes machine learning techniques to learn SVM-STL specifications from system executions given by sequences of spatial patterns. We present methods to deal with both labeled and unlabeled data. In addition, given system requirements in the form of SVM-STL specifications, we provide an approach for parameter synthesis to find parameters that maximize the satisfaction of such specifications. Our learning framework and parameter synthesis approach are showcased in an example of a reaction-diffusion system.
Submitted 20 December, 2021;
originally announced December 2021.
-
Classification of Time-Series Data Using Boosted Decision Trees
Authors:
Erfan Aasi,
Cristian Ioan Vasile,
Mahroo Bahreinian,
Calin Belta
Abstract:
Time-series data classification is central to the analysis and control of autonomous systems, such as robots and self-driving cars. Temporal logic-based learning algorithms have been proposed recently as classifiers of such data. However, current frameworks are either inaccurate for real-world applications, such as autonomous driving, or they generate long and complicated formulae that lack interpretability. To address these limitations, we introduce a novel learning method, called Boosted Concise Decision Trees (BCDTs), to generate binary classifiers that are represented as Signal Temporal Logic (STL) formulae. Our algorithm leverages an ensemble of Concise Decision Trees (CDTs) to improve the classification performance, where each CDT is a decision tree that is empowered by a set of techniques to generate simpler formulae and improve interpretability. The effectiveness and classification performance of our algorithm are evaluated on naval surveillance and urban-driving case studies.
Submitted 7 July, 2022; v1 submitted 1 October, 2021;
originally announced October 2021.
-
The Reasonable Crowd: Towards evidence-based and interpretable models of driving behavior
Authors:
Bassam Helou,
Aditya Dusi,
Anne Collin,
Noushin Mehdipour,
Zhiliang Chen,
Cristhian Lizarazo,
Calin Belta,
Tichakorn Wongpiromsarn,
Radboud Duintjer Tebbens,
Oscar Beijbom
Abstract:
Autonomous vehicles must balance a complex set of objectives. There is no consensus on how they should do so, nor on a model for specifying a desired driving behavior. We created a dataset to help address some of these questions in a limited operating domain. The data consists of 92 traffic scenarios, with multiple ways of traversing each scenario. Multiple annotators expressed their preference between pairs of scenario traversals. We used the data to compare an instance of a rulebook, carefully hand-crafted independently of the dataset, with several interpretable machine learning models such as Bayesian networks, decision trees, and logistic regression trained on the dataset. To compare driving behavior, these models use scores indicating by how much different scenario traversals violate each of 14 driving rules. The rules are interpretable and designed by subject-matter experts. First, we found that these rules were enough for these models to achieve a high classification accuracy on the dataset. Second, we found that the rulebook provides high interpretability without excessively sacrificing performance. Third, the data pointed to possible improvements in the rulebook and the rules, and to potential new rules. Fourth, we explored the interpretability vs performance trade-off by also training non-interpretable models such as a random forest. Finally, we make the dataset publicly available to encourage a discussion from the wider community on behavior specification for AVs. Please find it at github.com/bassam-motional/Reasonable-Crowd.
Submitted 28 July, 2021;
originally announced July 2021.
-
Rule-based Evaluation and Optimal Control for Autonomous Driving
Authors:
Wei Xiao,
Noushin Mehdipour,
Anne Collin,
Amitai Y. Bin-Nun,
Emilio Frazzoli,
Radboud Duintjer Tebbens,
Calin Belta
Abstract:
We develop optimal control strategies for autonomous vehicles (AVs) that are required to meet complex specifications imposed as rules of the road (ROTR) and locally specific cultural expectations of reasonable driving behavior. We formulate these specifications as rules, and specify their priorities by constructing a priority structure, called Total ORder over eQuivalence classes (TORQ). We propose a recursive framework, in which the satisfaction of the rules in the priority structure is iteratively relaxed in reverse order of priority.
Central to this framework is an optimal control problem, where convergence to desired states is achieved using Control Lyapunov Functions (CLFs) and clearance with other road users is enforced through Control Barrier Functions (CBFs). We present offline and online approaches to this problem. In the latter, the AV has limited sensing range that affects the activation of the rules, and the control is generated using a receding horizon (Model Predictive Control, MPC) approach. We also show how the offline method can be used for after-the-fact (offline) pass/fail evaluation of trajectories: a given trajectory is rejected if we can find a controller producing a trajectory that leads to less violation of the rule priority structure. We present case studies with multiple driving scenarios to demonstrate the effectiveness of the algorithms, and to compare the offline and online versions of our proposed framework.
Submitted 15 July, 2021;
originally announced July 2021.
-
Inferring Temporal Logic Properties from Data using Boosted Decision Trees
Authors:
Erfan Aasi,
Cristian Ioan Vasile,
Mahroo Bahreinian,
Calin Belta
Abstract:
Many autonomous systems, such as robots and self-driving cars, involve real-time decision making in complex environments, and require prediction of future outcomes from limited data. Moreover, their decisions are increasingly required to be interpretable to humans for safe and trustworthy co-existence. This paper is a first step towards interpretable learning-based robot control. We introduce a novel learning problem, called incremental formula and predictor learning, to generate binary classifiers with temporal logic structure from time-series data. The classifiers are represented as pairs of Signal Temporal Logic (STL) formulae and predictors for their satisfaction. The incremental property provides prediction of labels for prefix signals that are revealed over time. We propose a boosted decision-tree algorithm that leverages weak, but computationally inexpensive, learners to increase prediction and runtime performance. The effectiveness and classification accuracy of our algorithms are evaluated on autonomous-driving and naval surveillance case studies.
Submitted 24 May, 2021;
originally announced May 2021.
-
A Control Architecture for Provably-Correct Autonomous Driving
Authors:
Erfan Aasi,
Cristian Ioan Vasile,
Calin Belta
Abstract:
This paper presents a novel two-level control architecture for a fully autonomous vehicle in a deterministic environment, which can handle traffic rules as specifications and low-level vehicle control with real-time performance. At the top level, we use a simple representation of the environment and vehicle dynamics to formulate a linear Model Predictive Control (MPC) problem. We describe the traffic rules and safety constraints using Signal Temporal Logic (STL) formulas, which are mapped to mixed-integer linear constraints in the optimization problem. The solution obtained at the top level is used at the bottom level to determine the best control command for satisfying the constraints in a more detailed framework. At the bottom level, specification-based runtime monitoring techniques, together with detailed representations of the environment and vehicle dynamics, are used to compensate for the mismatch between the simple models used in the MPC and the real complex models. We obtain substantial improvements over existing approaches in the literature in the sense of runtime performance and we validate the effectiveness of our proposed control approach in the simulator CARLA.
Submitted 6 May, 2021;
originally announced May 2021.
-
Safe Exploration in Model-based Reinforcement Learning using Control Barrier Functions
Authors:
Max H. Cohen,
Calin Belta
Abstract:
This paper develops a model-based reinforcement learning (MBRL) framework for learning online the value function of an infinite-horizon optimal control problem while obeying safety constraints expressed as control barrier functions (CBFs). Our approach is facilitated by the development of a novel class of CBFs, termed Lyapunov-like CBFs (LCBFs), that retain the beneficial properties of CBFs for developing minimally-invasive safe control policies while also possessing desirable Lyapunov-like qualities such as positive semi-definiteness. We show how these LCBFs can be used to augment a learning-based control policy to guarantee safety and then leverage this approach to develop a safe exploration framework in a MBRL setting. We demonstrate that our approach can handle more general safety constraints than comparative methods via numerical examples.
Submitted 19 September, 2022; v1 submitted 16 April, 2021;
originally announced April 2021.
-
Neural Network-based Control for Multi-Agent Systems from Spatio-Temporal Specifications
Authors:
Suhail Alsalehi,
Noushin Mehdipour,
Ezio Bartocci,
Calin Belta
Abstract:
We propose a framework for solving control synthesis problems for multi-agent networked systems required to satisfy spatio-temporal specifications. We use Spatio-Temporal Reach and Escape Logic (STREL) as a specification language. For this logic, we define smooth quantitative semantics, which captures the degree of satisfaction of a formula by a multi-agent team. We use the novel quantitative semantics to map control synthesis problems with STREL specifications to optimization problems and propose a combination of heuristic and gradient-based methods to solve such problems. As this method might not meet the requirements of a real-time implementation, we develop a machine learning technique that uses the results of the off-line optimizations to train a neural network that gives the control inputs at current states. We illustrate the effectiveness of the proposed framework by applying it to a model of a robotic team required to satisfy a spatio-temporal specification under communication constraints.
Submitted 6 April, 2021;
originally announced April 2021.
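The STREL abstract above relies on a smooth quantitative semantics so that gradient-based optimization can be applied. The paper's exact definition is not given in this listing; the sketch below only illustrates one common smoothing device, a log-sum-exp approximation of `min`, which is an assumption here rather than the authors' construction. The function name `soft_min` and the sharpness parameter `k` are hypothetical.

```python
import math

def soft_min(values, k=10.0):
    """Differentiable under-approximation of min(values).

    Uses -1/k * log(sum(exp(-k * v))), shifted by the true minimum for
    numerical stability. Larger k gives a tighter approximation.
    """
    m = min(values)
    # The sum is >= 1 (the minimum contributes exp(0)), so the result is <= m.
    return m - math.log(sum(math.exp(-k * (v - m)) for v in values)) / k

margins = [1.0, 2.0, 3.0]          # e.g. per-agent satisfaction margins
approx = soft_min(margins, k=10.0)
print(approx <= min(margins))      # the approximation never exceeds the true min
```

Because the approximation never exceeds the true minimum, a positive `soft_min` value still implies that every margin is positive.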
-
Safe Model-based Control from Signal Temporal Logic Specifications Using Recurrent Neural Networks
Authors:
Wenliang Liu,
Mirai Nishioka,
Calin Belta
Abstract:
We propose a policy search approach to learn controllers from specifications given as Signal Temporal Logic (STL) formulae. The system model, which is unknown but assumed to be an affine control system, is learned together with the control policy. The model is implemented as two feedforward neural networks (FNNs) - one for the drift, and one for the control directions. To capture the history dependency of STL specifications, we use a recurrent neural network (RNN) to implement the control policy. In contrast to prevalent model-free methods, the learning approach proposed here takes advantage of the learned model and is more efficient. We use control barrier functions (CBFs) with the learned model to improve the safety of the system. We validate our algorithm via simulations and experiments. The results show that our approach can satisfy the given specification within very few system runs, and can be used for on-line control.
Submitted 16 November, 2022; v1 submitted 29 March, 2021;
originally announced March 2021.
-
Event-Triggered Safety-Critical Control for Systems with Unknown Dynamics
Authors:
Wei Xiao,
Calin Belta,
Christos G. Cassandras
Abstract:
This paper addresses the problem of safety-critical control for systems with unknown dynamics. It has been shown that stabilizing affine control systems to desired (sets of) states while optimizing quadratic costs subject to state and control constraints can be reduced to a sequence of quadratic programs (QPs) by using Control Barrier Functions (CBFs) and Control Lyapunov Functions (CLFs). Our recently proposed High Order CBFs (HOCBFs) can accommodate constraints of arbitrary relative degree. One of the main challenges in this approach is obtaining accurate system dynamics, which is especially difficult for systems that require online model identification given limited computational resources and system data. In order to approximate the real unmodelled system dynamics, we define adaptive affine control dynamics which are updated based on the error states obtained by real-time sensor measurements. We define a HOCBF for a safety requirement on the unmodelled system based on the adaptive dynamics and error states, and reformulate the safety-critical control problem as the above mentioned QP. Then, we determine the events required to solve the QP in order to guarantee safety. We also derive a condition that guarantees the satisfaction of the HOCBF constraint between events. We illustrate the effectiveness of the proposed framework on an adaptive cruise control problem and compare it with the classical time-driven approach.
Submitted 29 March, 2021;
originally announced March 2021.
-
Experimental Validation of Linear and Nonlinear MPC on an Articulated Unmanned Ground Vehicle
Authors:
Erkan Kayacan,
Wouter Saeys,
Herman Ramon,
Calin Belta,
Joshua M. Peschel
Abstract:
This paper focuses on the trajectory tracking control problem for an articulated unmanned ground vehicle. We propose and compare two approaches in terms of performance and computational complexity. The first uses a nonlinear mathematical model derived from first principles and combines a nonlinear model predictive controller (NMPC) with a nonlinear moving horizon estimator (NMHE) to produce a control strategy. The second is based on an input-state linearization (ISL) of the original model followed by linear model predictive control (LMPC). A fast real-time iteration scheme is proposed, implemented for the NMHE-NMPC framework and benchmarked against the ISL-LMPC framework, which is a traditional and cheap method. The experimental results for a time-based trajectory show that the NMHE-NMPC framework with the proposed real-time iteration scheme gives better trajectory tracking performance than the ISL-LMPC framework and the required computation time is feasible for real-time applications. Moreover, the ISL-LMPC produces results of a quality comparable to the NMHE-NMPC framework at a significantly reduced computational cost.
Submitted 25 March, 2021;
originally announced March 2021.
-
High Order Control Lyapunov-Barrier Functions for Temporal Logic Specifications
Authors:
Wei Xiao,
Calin A. Belta,
Christos G. Cassandras
Abstract:
Recent work has shown that stabilizing an affine control system to a desired state while optimizing a quadratic cost subject to state and control constraints can be reduced to a sequence of Quadratic Programs (QPs) by using Control Barrier Functions (CBFs) and Control Lyapunov Functions (CLFs). In our own recent work, we defined High Order CBFs (HOCBFs) for systems and constraints with arbitrary relative degrees. In this paper, in order to accommodate initial states that do not satisfy the state constraints and constraints with arbitrary relative degree, we generalize HOCBFs to High Order Control Lyapunov-Barrier Functions (HOCLBFs). We also show that the proposed HOCLBFs can be used to guarantee the Boolean satisfaction of Signal Temporal Logic (STL) formulae over the state of the system. We illustrate our approach on a safety-critical optimal control problem (OCP) for a unicycle.
Submitted 12 February, 2021;
originally announced February 2021.
-
Rule-based Optimal Control for Autonomous Driving
Authors:
Wei Xiao,
Noushin Mehdipour,
Anne Collin,
Amitai Bin-Nun,
Emilio Frazzoli,
Radboud Duintjer Tebbens,
Calin Belta
Abstract:
We develop optimal control strategies for Autonomous Vehicles (AVs) that are required to meet complex specifications imposed by traffic laws and cultural expectations of reasonable driving behavior. We formulate these specifications as rules, and specify their priorities by constructing a priority structure. We propose a recursive framework, in which the satisfaction of the rules in the priority structure is iteratively relaxed based on their priorities. Central to this framework is an optimal control problem, where convergence to desired states is achieved using Control Lyapunov Functions (CLFs), and safety is enforced through Control Barrier Functions (CBFs). We also show how the proposed framework can be used for after-the-fact, pass/fail evaluation of trajectories: a given trajectory is rejected if we can find a controller producing a trajectory that leads to less violation of the rule priority structure. We present case studies with multiple driving scenarios to demonstrate the effectiveness of the proposed framework.
Submitted 14 January, 2021;
originally announced January 2021.
-
Sufficient Conditions for Feasibility of Optimal Control Problems Using Control Barrier Functions
Authors:
Wei Xiao,
Calin Belta,
Christos G. Cassandras
Abstract:
It has been shown that satisfying state and control constraints while optimizing quadratic costs subject to desired (sets of) state convergence for affine control systems can be reduced to a sequence of quadratic programs (QPs) by using Control Barrier Functions (CBFs) and Control Lyapunov Functions (CLFs). One of the main challenges in this approach is ensuring the feasibility of these QPs, especially under tight control bounds and safety constraints of high relative degree. In this paper, we provide sufficient conditions for guaranteed feasibility. The sufficient conditions are captured by a single constraint that is enforced by a CBF, which is added to the QPs such that their feasibility is always guaranteed. The additional constraint is designed to be always compatible with the existing constraints; therefore, it cannot make a feasible set of constraints infeasible and can only increase the overall feasibility. We illustrate the effectiveness of the proposed approach on an adaptive cruise control problem.
Submitted 16 November, 2020;
originally announced November 2020.
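The abstract above refers to QPs that enforce CBF constraints. As a minimal sketch of that idea, and not the paper's method or its feasibility condition, consider a single integrator x' = u with the safe set h(x) = x_max - x >= 0. The CBF condition dh/dt + alpha * h >= 0 becomes u <= alpha * (x_max - x), and the one-dimensional QP "minimize (u - u_ref)^2 subject to that bound" has a closed-form solution. The system, the names `cbf_filter` and `simulate`, and all numeric values are assumptions chosen for illustration.

```python
def cbf_filter(x, u_ref, x_max=10.0, alpha=1.0):
    """Closed-form solution of: min (u - u_ref)^2  s.t.  u <= alpha * (x_max - x).

    The constraint is the CBF condition for x' = u and h(x) = x_max - x.
    """
    u_bound = alpha * (x_max - x)
    # With one scalar constraint, the QP minimizer is the reference input
    # clipped to the bound.
    return min(u_ref, u_bound)

def simulate(x0=0.0, u_ref=5.0, dt=0.01, steps=1000):
    """Forward-Euler rollout of the filtered single integrator (illustrative)."""
    x, traj = x0, [x0]
    for _ in range(steps):
        x += dt * cbf_filter(x, u_ref)
        traj.append(x)
    return traj

traj = simulate()
print(max(traj) <= 10.0)  # the state never crosses x_max in this rollout
```

With bounded inputs or several constraints the QP can become infeasible, which is the situation the sufficient conditions in the abstract are meant to rule out; this sketch has no such bounds.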
-
Recurrent Neural Network Controllers for Signal Temporal Logic Specifications Subject to Safety Constraints
Authors:
Wenliang Liu,
Noushin Mehdipour,
Calin Belta
Abstract:
We propose a framework based on Recurrent Neural Networks (RNNs) to determine an optimal control strategy for a discrete-time system that is required to satisfy specifications given as Signal Temporal Logic (STL) formulae. RNNs can store information of a system over time and thus enable us to determine satisfaction of the dynamic temporal requirements specified in STL formulae. Given an STL formula, a dataset of satisfying system executions and corresponding control policies, we can use RNNs to predict a control policy at each time based on the current and previous states of the system. We use Control Barrier Functions (CBFs) to guarantee the safety of the predicted control policy. We validate our theoretical formulation and demonstrate its performance in an optimal control problem subject to partially unknown safety constraints through simulations.
Submitted 23 September, 2020;
originally announced September 2020.
-
Bridging the Gap between Optimal Trajectory Planning and Safety-Critical Control with Applications to Autonomous Vehicles
Authors:
Wei Xiao,
Christos G. Cassandras,
Calin A. Belta
Abstract:
We address the problem of optimizing the performance of a dynamic system while satisfying hard safety constraints at all times. Implementing an optimal control solution is limited by the computational cost required to derive it in real time, especially when constraints become active, as well as the need to rely on simple linear dynamics, simple objective functions, and ignoring noise. The recently proposed Control Barrier Function (CBF) method may be used for safety-critical control at the expense of sub-optimal performance. In this paper, we develop a real-time control framework that combines optimal trajectories generated through optimal control with the computationally efficient CBF method providing safety guarantees. We use Hamiltonian analysis to obtain a tractable optimal solution for a linear or linearized system, then employ High Order CBFs (HOCBFs) and Control Lyapunov Functions (CLFs) to account for constraints with arbitrary relative degrees and to track the optimal state, respectively. We further show how to deal with noise in arbitrary relative degree systems. The proposed framework is then applied to the optimal traffic merging problem for Connected and Automated Vehicles (CAVs) where the objective is to jointly minimize the travel time and energy consumption of each CAV subject to speed, acceleration, and speed-dependent safety constraints. In addition, when considering more complex objective functions, nonlinear dynamics and passenger comfort requirements for which analytical optimal control solutions are unavailable, we adapt the HOCBF method to such problems. Simulation examples are included to compare the performance of the proposed framework to optimal solutions (when available) and to a baseline provided by human-driven vehicles with results showing significant improvements in all metrics.
Submitted 17 August, 2020;
originally announced August 2020.
-
Adaptive Control Barrier Functions for Safety-Critical Systems
Authors:
Wei Xiao,
Calin Belta,
Christos G. Cassandras
Abstract:
Recent work showed that stabilizing affine control systems to desired (sets of) states while optimizing quadratic costs and observing state and control constraints can be reduced to quadratic programs (QPs) by using control barrier functions (CBFs) and control Lyapunov functions (CLFs). In our own recent work, we defined high order CBFs (HOCBFs) to accommodate systems and constraints with arbitrary relative degrees, and a penalty method to increase the feasibility of the corresponding QPs. In this paper, we introduce adaptive CBFs (AdaCBFs) that can accommodate time-varying control bounds and dynamics noise, and also address the feasibility problem. Central to our approach is the introduction of penalty functions in the definition of an AdaCBF and the definition of auxiliary dynamics for these penalty functions that are HOCBFs and are stabilized by CLFs. We demonstrate the advantages of the proposed method by applying it to a cruise control problem with different road surfaces, tires slipping, and dynamics noise.
Submitted 11 February, 2020;
originally announced February 2020.
-
Feasibility-Guided Learning for Robust Control in Constrained Optimal Control Problems
Authors:
Wei Xiao,
Calin A. Belta,
Christos G. Cassandras
Abstract:
Optimal control problems with constraints ensuring safety and convergence to desired states can be mapped onto a sequence of real time optimization problems through the use of Control Barrier Functions (CBFs) and Control Lyapunov Functions (CLFs). One of the main challenges in these approaches is ensuring the feasibility of the resulting quadratic programs (QPs) if the system is affine in controls. The recently proposed penalty method has the potential to improve the existence of feasible solutions to such problems. In this paper, we further improve the feasibility robustness (i.e., feasibility maintenance in the presence of time-varying and unknown unsafe sets) through the definition of a High Order CBF (HOCBF) that works for arbitrary relative degree constraints; this is achieved by a proposed feasibility-guided learning approach. Specifically, we apply machine learning techniques to classify the parameter space of a HOCBF into feasible and infeasible sets, and get a differentiable classifier that is then added to the learning process. The proposed feasibility-guided learning approach is compared with the gradient-descent method on a robot control problem. The simulation results show an improved ability of the feasibility-guided learning approach over the gradient-descent method to determine the optimal parameters in the definition of a HOCBF for the feasibility robustness, as well as show the potential of the CBF method for robot safe navigation in an unknown environment.
Submitted 6 December, 2019;
originally announced December 2019.
-
Distributed and Consistent Multi-Image Feature Matching via QuickMatch
Authors:
Zachary Serlin,
Guang Yang,
Brandon Sookraj,
Calin Belta,
Roberto Tron
Abstract:
In this work we consider the multi-image object matching problem, extend a centralized solution of the problem to a distributed solution, and present an experimental application of the centralized solution. Multi-image feature matching is a keystone of many applications, including simultaneous localization and mapping, homography, object detection, and structure from motion. We first review the QuickMatch algorithm for multi-image feature matching. We then present a scheme for distributing sets of features across computational units (agents) that largely preserves feature match quality and minimizes communication between agents (avoiding, in particular, the need of flooding all data to all agents). Finally, we show how QuickMatch performs on an object matching test with low quality images. The centralized QuickMatch algorithm is compared to other standard matching algorithms, while the Distributed QuickMatch algorithm is compared to the centralized algorithm in terms of preservation of match consistency. The presented experiment shows that QuickMatch matches features across a large number of images and features in larger numbers and more accurately than standard techniques.
Submitted 29 October, 2019;
originally announced October 2019.
-
Average-based Robustness for Continuous-Time Signal Temporal Logic
Authors:
Noushin Mehdipour,
Cristian-Ioan Vasile,
Calin Belta
Abstract:
We propose a new robustness score for continuous-time Signal Temporal Logic (STL) specifications. Instead of considering only the most severe point along the evolution of the signal, we use average scores to extract more information from the signal, emphasizing robust satisfaction of all the specifications' subformulae over their entire time interval domains. We demonstrate the advantages of this new score in falsification and control synthesis problems in systems with complex dynamics and multi-agent systems.
Submitted 2 September, 2019;
originally announced September 2019.
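The abstract above contrasts a score driven by the most severe point of a signal with one based on averages. The sketch below illustrates only that contrast on a sampled signal for the predicate x > c; it is a discrete-time simplification, and the naive mean in `average_margin` is an assumption for illustration, not the continuous-time score defined in the paper. All function names are hypothetical.

```python
def robustness_always_gt(signal, c):
    """Traditional robustness of 'always (x > c)': the worst-case margin."""
    return min(x - c for x in signal)

def robustness_eventually_gt(signal, c):
    """Traditional robustness of 'eventually (x > c)': the best-case margin."""
    return max(x - c for x in signal)

def average_margin(signal, c):
    """Mean margin over all samples (illustrative only, not the paper's score)."""
    return sum(x - c for x in signal) / len(signal)

signal = [3.0, 2.0, 5.0, 4.0]
print(robustness_always_gt(signal, 1.0))      # 1.0, set by the single worst sample
print(robustness_eventually_gt(signal, 1.0))  # 4.0, set by the single best sample
print(average_margin(signal, 1.0))            # 2.5, uses every sample
```

The min- and max-based scores depend on one sample each, so changing any other sample leaves them unchanged, whereas the mean responds to every sample. A plain mean can be positive even when some samples violate the predicate, so it does not by itself certify satisfaction.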
-
Sampling-based Motion Planning via Control Barrier Functions
Authors:
Guang Yang,
Bee Vang,
Zachary Serlin,
Calin Belta,
Roberto Tron
Abstract:
Robot motion planning is central to real-world autonomous applications, such as self-driving cars, persistent surveillance, and robotic arm manipulation. One challenge in motion planning is generating control signals for nonlinear systems that result in obstacle-free paths through dynamic environments. In this paper, we propose Control Barrier Function guided Rapidly-exploring Random Trees (CBF-RRT), a sampling-based motion planning algorithm for continuous-time nonlinear systems in dynamic environments. The algorithm focuses on two objectives: efficiently generating feasible controls that steer the system toward a goal region, and handling environments with dynamical obstacles in continuous time. We formulate the control synthesis problem as a Quadratic Program (QP) that enforces Control Barrier Function (CBF) constraints to achieve obstacle avoidance. Additionally, CBF-RRT does not require nearest-neighbor or collision checks when sampling, which greatly reduces the run-time overhead compared to standard RRT variants.
Submitted 15 July, 2019;
originally announced July 2019.
-
Temporal Logic Guided Safe Reinforcement Learning Using Control Barrier Functions
Authors:
Xiao Li,
Calin Belta
Abstract:
Using reinforcement learning to learn control policies is a challenge when the task is complex with potentially long horizons. Ensuring adequate but safe exploration is also crucial for controlling physical systems. In this paper, we use temporal logic to facilitate specification and learning of complex tasks. We combine temporal logic with control Lyapunov functions to improve exploration. We incorporate control barrier functions to safeguard the exploration and deployment process. We develop a flexible and learnable system that allows users to specify task objectives and constraints in different forms and at various levels. The framework is also able to take advantage of known system dynamics and handle unknown environmental dynamics by integrating model-free learning with model-based planning.
Submitted 23 March, 2019;
originally announced March 2019.
-
Arithmetic-Geometric Mean Robustness for Control from Signal Temporal Logic Specifications
Authors:
Noushin Mehdipour,
Cristian-Ioan Vasile,
Calin Belta
Abstract:
We present a new average-based robustness score for Signal Temporal Logic (STL) and a framework for optimal control of a dynamical system under STL constraints. By averaging the scores of different specifications or subformulae at different time points, our new definition highlights the frequency of satisfaction, as well as how robustly each specification is satisfied at each time point. We show that this definition provides a better score for how well a specification is satisfied. Its usefulness in monitoring and control synthesis problems is illustrated through case studies.
Submitted 12 March, 2019;
originally announced March 2019.
-
Control Barrier Functions for Systems with High Relative Degree
Authors:
Wei Xiao,
Calin Belta
Abstract:
This paper extends control barrier functions (CBFs) to high order control barrier functions (HOCBFs) that can be used for high relative degree constraints. The proposed HOCBFs are more general than recently proposed (exponential) HOCBFs. We introduce high order barrier functions (HOBFs), and show that their satisfaction of Lyapunov-like conditions implies the forward invariance of the intersection of a series of sets. We then introduce HOCBFs, and show that any control input that satisfies the HOCBF constraints renders the intersection of a series of sets forward invariant. We formulate optimal control problems with constraints given by HOCBFs and control Lyapunov functions (CLFs) and analyze the influence of the choice of the class $\mathcal{K}$ functions used in the definition of the HOCBF on the size of the feasible control region. We also provide a promising method to address the conflict between HOCBF constraints and control limitations by penalizing the class $\mathcal{K}$ functions. We illustrate the proposed method on an adaptive cruise control problem.
Submitted 13 March, 2019; v1 submitted 11 March, 2019;
originally announced March 2019.
-
Reactive Control Meets Runtime Verification: A Case Study of Navigation
Authors:
Dogan Ulus,
Calin Belta
Abstract:
This paper presents an application of specification-based runtime verification techniques to control mobile robots in a reactive manner. In our case study, we develop a layered control architecture where runtime monitors constructed from formal specifications are embedded into the navigation stack. We use temporal logic and regular expressions to describe safety requirements and mission specifications, respectively. An immediate benefit of our approach is that it extends the simple requirements and objectives of traditional control applications to more complex specifications in a non-intrusive and compositional way. Finally, we demonstrate a simulation of robots controlled by the proposed architecture and discuss further extensions of our approach.
Submitted 11 February, 2019;
originally announced February 2019.
-
Automata Guided Reinforcement Learning With Demonstrations
Authors:
Xiao Li,
Yao Ma,
Calin Belta
Abstract:
Tasks with complex temporal structures and long horizons pose a challenge for reinforcement learning agents due to the difficulty in specifying the tasks in terms of reward functions as well as large variances in the learning signals. We propose to address these problems by combining temporal logic (TL) with reinforcement learning from demonstrations. Our method automatically generates intrinsic rewards that align with the overall task goal given a TL task specification. The policy resulting from our framework has an interpretable and hierarchical structure. We validate the proposed method experimentally on a set of robotic manipulation tasks.
Submitted 25 September, 2018; v1 submitted 17 September, 2018;
originally announced September 2018.
-
Metrics for Signal Temporal Logic Formulae
Authors:
Curtis Madsen,
Prashant Vaidyanathan,
Sadra Sadraddini,
Cristian-Ioan Vasile,
Nicholas A. DeLateur,
Ron Weiss,
Douglas Densmore,
Calin Belta
Abstract:
Signal Temporal Logic (STL) is a formal language for describing a broad range of real-valued, temporal properties in cyber-physical systems. While there has been extensive research on verification and control synthesis from STL requirements, there is no formal framework for comparing two STL formulae. In this paper, we show that under mild assumptions, STL formulae admit a metric space. We propose two metrics over this space based on i) the Pompeiu-Hausdorff distance and ii) the symmetric difference measure, and present algorithms to compute them. Alongside illustrative examples, we present applications of these metrics for two fundamental problems: a) design quality measures: to compare all the temporal behaviors of a designed system, such as a synthetic genetic circuit, with the "desired" specification, and b) loss functions: to quantify errors in Temporal Logic Inference (TLI) as a first step to establish formal performance guarantees of TLI algorithms.
Submitted 1 August, 2018;
originally announced August 2018.
-
Automata-Guided Hierarchical Reinforcement Learning for Skill Composition
Authors:
Xiao Li,
Yao Ma,
Calin Belta
Abstract:
Skills learned through (deep) reinforcement learning often generalize poorly across domains, and re-training is necessary when presented with a new task. We present a framework that combines techniques in formal methods with reinforcement learning (RL). The methods we provide allow for convenient specification of tasks with logical expressions, learn hierarchical policies (meta-controller and low-level controllers) with well-defined intrinsic rewards, and construct new skills from existing ones with little to no additional exploration. We evaluate the proposed methods in a simple grid world simulation as well as a more complicated kitchen environment in AI2Thor.
Submitted 20 May, 2018; v1 submitted 31 October, 2017;
originally announced November 2017.
-
A Policy Search Method For Temporal Logic Specified Reinforcement Learning Tasks
Authors:
Xiao Li,
Yao Ma,
Calin Belta
Abstract:
Reward engineering is an important aspect of reinforcement learning. Whether or not the user's intentions can be correctly encapsulated in the reward function can significantly impact the learning outcome. Current methods rely on manually crafted reward functions that often require parameter tuning to obtain the desired behavior. This operation can be expensive when exploration requires systems to interact with the physical world. In this paper, we explore the use of temporal logic (TL) to specify tasks in reinforcement learning. A TL formula can be translated to a real-valued function that measures its level of satisfaction against a trajectory. We take advantage of this function and propose temporal logic policy search (TLPS), a model-free learning technique that finds a policy that satisfies the TL specification. A set of simulated experiments is conducted to evaluate the proposed approach.
Submitted 27 September, 2017;
originally announced September 2017.
-
Reinforcement Learning With Temporal Logic Rewards
Authors:
Xiao Li,
Cristian-Ioan Vasile,
Calin Belta
Abstract:
Reinforcement learning (RL) depends critically on the choice of reward functions used to capture the desired behavior and constraints of a robot. Usually, these are handcrafted by an expert designer and represent heuristics for relatively simple tasks. Real-world applications typically involve more complex tasks with rich temporal and logical structure. In this paper we take advantage of the expressive power of temporal logic (TL) to specify complex rules the robot should follow, and incorporate domain knowledge into learning. We propose Truncated Linear Temporal Logic (TLTL) as a specification language that is arguably well suited for robotics applications, together with quantitative semantics, i.e., robustness degree. We propose an RL approach to learn tasks expressed as TLTL formulae that uses their associated robustness degree as reward functions, instead of manually crafted heuristics trying to capture the same specifications. We show in simulated trials that learning is faster and policies obtained using the proposed approach outperform the ones learned using heuristic rewards in terms of the robustness degree, i.e., how well the tasks are satisfied. Furthermore, we demonstrate the proposed RL approach in a toast-placing task learned by a Baxter robot.
Submitted 2 March, 2017; v1 submitted 11 December, 2016;
originally announced December 2016.
-
Robotic Swarm Control from Spatio-Temporal Specifications
Authors:
Iman Haghighi,
Sadra Sadraddini,
Calin Belta
Abstract:
In this paper, we study the problem of controlling a two-dimensional robotic swarm with the purpose of achieving high-level and complex spatio-temporal patterns. We use a rich spatio-temporal logic that is capable of describing a wide range of time-varying and complex spatial configurations, and develop a method to encode such formal specifications as a set of mixed-integer linear constraints, which are incorporated into a mixed-integer linear programming problem. We plan trajectories for each individual robot such that the whole swarm satisfies the spatio-temporal requirements, while optimizing total robot movement and/or a metric that shows how strongly the swarm trajectory resembles given spatio-temporal behaviors. An illustrative case study is included.
Submitted 20 September, 2016;
originally announced September 2016.
-
A Hierarchical Reinforcement Learning Method for Persistent Time-Sensitive Tasks
Authors:
Xiao Li,
Calin Belta
Abstract:
Reinforcement learning has been applied to many interesting problems such as the famous TD-Gammon and the inverted helicopter flight. However, little effort has been put into developing methods to learn policies for complex persistent tasks and tasks that are time-sensitive. In this paper, we take a step towards solving this problem by using signal temporal logic (STL) as task specification, and taking advantage of the temporal abstraction feature that the options framework provides. We show via simulation that a relatively easy-to-implement algorithm that combines STL and options can learn a satisfactory policy with a small number of training cases.
Submitted 20 June, 2016;
originally announced June 2016.
-
Time Window Temporal Logic
Authors:
Cristian-Ioan Vasile,
Derya Aksaray,
Calin Belta
Abstract:
This paper introduces time window temporal logic (TWTL), a richly expressive language for describing various time-bounded specifications. In particular, the syntax and semantics of TWTL enable the compact representation of serial tasks, which are typically seen in robotics and control applications. This paper also discusses the relaxation of TWTL formulae with respect to deadlines of tasks. Efficient automata-based frameworks to solve synthesis, verification and learning problems are also presented. The key ingredient of the presented solution is an algorithm to translate a TWTL formula to an annotated finite state automaton that encodes all possible temporal relaxations of the specification. Case studies illustrating the expressivity of the logic and the proposed algorithms are included.
Submitted 13 February, 2016;
originally announced February 2016.
-
Control with Probabilistic Signal Temporal Logic
Authors:
Chanyeol Yoo,
Calin Belta
Abstract:
Autonomous agents often operate in uncertain environments where their decisions are made based on beliefs over states of targets. We are interested in controller synthesis for complex tasks defined over belief spaces. Designing such controllers is challenging due to computational complexity and the lack of expressivity of existing specification languages. In this paper, we propose a probabilistic extension to signal temporal logic (STL) that expresses tasks over continuous belief spaces. We present an efficient synthesis algorithm to find a control input that maximises the probability of satisfying a given task. We validate our algorithm through simulations of an unmanned aerial vehicle deployed for surveillance and search missions.
Submitted 28 October, 2015;
originally announced October 2015.