License: arXiv.org perpetual non-exclusive license
arXiv:2404.05952v1 [cs.RO] 09 Apr 2024

Robot Safe Planning In Dynamic Environments Based On Model Predictive Control Using Control Barrier Function

Zetao Lu, Kaijun Feng, Jun Xu, , Haoyao Chen,  and Yunjiang Lou This work was supported in part by the National Natural Science Foundation of China under Grant 62173113, and in part by the Science and Technology Innovation Committee of Shenzhen Municipality under Grant GXWD20231129101652001, and in part by Natural Science Foundation of Guangdong Province of China under Grant 2022A1515011584. (Corresponding author: Jun Xu.)The authors are with the School of Mechanical Engineering and Automation, Harbin Institute of Technology Shenzhen, Shenzhen, Guangdong, China, 518055 (email: {22S153114, 21S153137}@stu.hit.edu.cn; {xujunqgy, hychen5, louyj}@hit.edu.cn).
Abstract

Implementing obstacle avoidance in dynamic environments is a challenging problem for robots. Model predictive control (MPC) is a popular strategy for dealing with this type of problem, and recent work mainly uses control barrier function (CBF) as hard constraints to ensure that the system state remains in the safe set. However, in crowded scenarios, effective solutions may not be obtained due to infeasibility problems, resulting in degraded controller performance. We propose a new MPC framework that integrates CBF to tackle the issue of obstacle avoidance in dynamic environments, in which the infeasibility problem induced by hard constraints operating over the whole prediction horizon is solved by softening the constraints and introducing exact penalty, prompting the robot to actively seek out new paths. At the same time, generalized CBF is extended as a single-step safety constraint of the controller to enhance the safety of the robot during navigation. The efficacy of the proposed method is first shown through simulation experiments, in which a double-integrator system and a unicycle system are employed, and the proposed method outperforms other controllers in terms of safety, feasibility, and navigation efficiency. Furthermore, real-world experiment on an MR1000 robot is implemented to demonstrate the effectiveness of the proposed method.

Index Terms:
Collision avoidance, model predictive control, control barrier function, autonomous vehicle navigation.

I Introduction

In recent years, with the continuous development of robot technology, the application scope of robots is no longer limited to industrial manufacturing scenarios, but has also been expanding to other industries, such as autonomous driving, inspection robots, disinfection robots and delivery robots [1]. In order to achieve safe navigation of robots in dynamic and shared environments, it is of great significance to design a safety-critical controller to enable the autonomous system to achieve optimal performance while ensuring safety. Some recent work combines the control barrier function (CBF) with model predictive control (MPC) [2] to implement such a safety-critical controller and applies it to dynamic environments by extending CBF to dynamic CBF (D-CBF) [3]. However, due to the existence of state constraints, applying CBF as hard constraints to the entire prediction horizon may lead to failure in solving the optimization problem, which is particularly obvious in complex dynamic environments.

In order to solve the aforementioned problems and achieve better control effects, we transformed the CBF hard constraints into soft constraints and incorporated them into the penalty function of the optimization problem. Besides, a single-step CBF is imposed to enhance safety. Through our approach, the robot is able to significantly reduce the probability of solution failure in dynamic environments and reach its destination with higher efficiency and safety.

I-A Related Work

Existing robot navigation work in dynamic environments can be divided into three categories: 1) reactive based; 2) learning based; 3) optimization based. In reactive-based approaches, the robot makes one-step optimal action based on information about dynamic obstacles in the current environment, including velocity obstacle (VO) [4] and its variants [5, 6]. However, these types of methods usually do not take into account the robot’s kinematic constraints. Relying solely on current state information can result in short-term, oscillatory, and unnatural behavior, which does not facilitate pedestrian understanding of the robot’s movement intentions[7]. In learning-based approaches, robots endeavor to emulate appropriate navigation strategies. By imitating the interactive movements of pedestrians, they aim to navigate through dense crowds in a manner that is more socially acceptable. Deep reinforcement learning is often used to train computationally efficient navigation strategies [8, 9], which implicitly encodes interactions and collaborations between pedestrians to generate paths with behavioral patterns more consistent with humans [10]. However, learning-based methods are dependent on offline training and are then restricted by environmental characteristics, which may encounter generalization issues when transitioning from simulation to real world, i.e., the performance in scenarios not covered by the training data can not be guaranteed. Optimization-based methods usually consist of two consecutive steps of prediction and planning at each time step, first using a motion model to predict dynamic obstacles in the environment [11], and then formulating robot navigation as an optimal control problem [7]. Such methods are usually based on MPC, as it is able to integrate the kinematic constraints and static/dynamic collision constraints of the robot while combining planning with control to find an ideal trajectory [12].

However, a significant concern with optimization-based methods in dynamic environments is the safety of the generated trajectories. CBF has recently been introduced as an effective method, combined with MPC [2], to design safety-critical controllers that can guarantee effective safety margins under a short prediction horizon. In dynamic environments, [3] implemented obstacle avoidance with a safety-critical controller built based on lidar and dynamic CBF. In a static maze scenario, [13] successfully navigated different robot shapes using relaxation technology [14]. However, applying CBF as hard constraints to the entire prediction horizon may lead to failure in solving the optimization problem [2]. [15] proposed generalized CBF (GCBF) to use CBF constraint as a one-step constraint to improve feasibility, but there is still a trade-off between feasibility and safety. In [14], the trade-off is handled by incorporating slack variables into the CBF constraints to enhance feasibility, although this approach inherently increases the solution time and diminishes the safety margin.

Inspired by these studies, we contemplate the conversion of CBF hard constraints into soft constraints. The goal is to maintain control effects that are comparable to those of hard constraints while minimizing the likelihood of solution failure. This approach inspires robots to actively seek feasible paths in dynamic environments. Concurrently, it’s essential to introduce an effective safety guarantee, fulfilled by integrating dynamic GCBF (D-GCBF) constraint. The paper’s main focus lies on the application of CBF within the framework of MPC, aiming to enable robots to navigate through crowded and complex dynamic environments efficiently while ensuring safety.

I-B Contribution

The contributions of this paper are as follows.

  • We propose an MPC framework based on CBF soft constraints for generating safe collision-free trajectories in dynamic environments;

  • We incorporate D-GCBF within this framework as a single-step hard constraint to enhance safety;

  • Simulation experiments and real-world tests were carried out to validate the real-time capability, effectiveness, and stability of the algorithm.

I-C Paper Structure

This paper is structured as follows. In Section II, we provide an overview of the definition of the CBF and its associated optimization problem construction when used as hard constraints. In Section III, we transform CBF hard constraints into soft constraints and derive conditions for exact penalty. Besides, single-step D-GCBF hard constraint is imposed as safety guarantee. In order to verify the effectiveness of the controller design and algorithm, examples of obstacle avoidance of our algorithm in the simulation environment and the real world are demonstrated in Section IV. Section V concludes the paper.

II PRELIMINARIES

In this section, we present the preliminaries related to the CBF and propose the basic form of the optimization problem based on MPC and CBF. This lays the foundation for subsequent controller design.

II-A Problem Formulation

Consider the robot’s motion model as a discrete-time control system

𝐱k+1=f(𝐱k,𝐮k),subscript𝐱𝑘1𝑓subscript𝐱𝑘subscript𝐮𝑘\mathbf{x}_{k+1}=f(\mathbf{x}_{k},\mathbf{u}_{k}),bold_x start_POSTSUBSCRIPT italic_k + 1 end_POSTSUBSCRIPT = italic_f ( bold_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) , (1)

where 𝐱k𝒳nsubscript𝐱𝑘𝒳superscript𝑛\mathbf{x}_{k}\in\mathcal{X}\subset\mathbb{R}^{n}bold_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ∈ caligraphic_X ⊂ blackboard_R start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT is the state of the system, 𝐮k𝒰msubscript𝐮𝑘𝒰superscript𝑚\mathbf{u}_{k}\in\mathcal{U}\subset\mathbb{R}^{m}bold_u start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ∈ caligraphic_U ⊂ blackboard_R start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT is the control input. In a dynamic environment, assuming that the motion equation of a moving obstacle 𝐨isuperscript𝐨𝑖\mathbf{o}^{i}bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT is

𝐨k+1i=ξ(𝐨k),subscriptsuperscript𝐨𝑖𝑘1𝜉subscript𝐨𝑘\mathbf{o}^{i}_{k+1}=\xi(\mathbf{o}_{k}),bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k + 1 end_POSTSUBSCRIPT = italic_ξ ( bold_o start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) , (2)

where 𝐨kinosubscriptsuperscript𝐨𝑖𝑘superscriptsubscript𝑛𝑜\mathbf{o}^{i}_{k}\in\mathbb{R}^{n_{o}}bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_o end_POSTSUBSCRIPT end_POSTSUPERSCRIPT represents the state of the moving obstacle at time k𝑘kitalic_k, the superscript i{1,2,,No}𝑖12subscript𝑁𝑜i\in\{1,2,\dots,N_{o}\}italic_i ∈ { 1 , 2 , … , italic_N start_POSTSUBSCRIPT italic_o end_POSTSUBSCRIPT } represents the i𝑖iitalic_i-th moving obstacle, ξ()𝜉\xi(\cdot)italic_ξ ( ⋅ ) is the state transition function. In robot obstacle avoidance scenarios, the obstacle avoidance problem is usually described using an optimal control problem based on distance constraints[16]. Assuming that the robot and the moving obstacles are approximated by circles with center points (xt,yt)subscript𝑥𝑡subscript𝑦𝑡(x_{t},y_{t})( italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , italic_y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) and (xti,yti)subscriptsuperscript𝑥𝑖𝑡subscriptsuperscript𝑦𝑖𝑡(x^{i}_{t},y^{i}_{t})( italic_x start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , italic_y start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ), radii rrsubscript𝑟𝑟r_{r}italic_r start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT and r𝐨isubscript𝑟superscript𝐨𝑖r_{\mathbf{o}^{i}}italic_r start_POSTSUBSCRIPT bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT end_POSTSUBSCRIPT respectively on the two-dimensional plane, then the safe distance between the robot and the moving obstacle 𝐨isuperscript𝐨𝑖\mathbf{o}^{i}bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT is defined as ri=rr+r𝐨i+ϵsubscript𝑟𝑖subscript𝑟𝑟subscript𝑟superscript𝐨𝑖italic-ϵr_{i}=r_{r}+r_{\mathbf{o}^{i}}+\epsilonitalic_r start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = italic_r start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT + italic_r start_POSTSUBSCRIPT bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT end_POSTSUBSCRIPT + italic_ϵ, where ϵitalic-ϵ\epsilonitalic_ϵ is the additional safety margin. So at time step t𝑡titalic_t, the MPC problem based on distance constraints (MPC-DC) is as follows

min𝐮t+k|tsubscriptsubscript𝐮𝑡conditional𝑘𝑡\displaystyle\min_{\mathbf{u}_{t+k|t}}\ roman_min start_POSTSUBSCRIPT bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT p(𝐱t+N|t)+k=0N1q(𝐱t+k|t,𝐮t+k|t)𝑝subscript𝐱𝑡conditional𝑁𝑡subscriptsuperscript𝑁1𝑘0𝑞subscript𝐱𝑡conditional𝑘𝑡subscript𝐮𝑡conditional𝑘𝑡\displaystyle p(\mathbf{x}_{t+N|t})+\sum^{N-1}_{k=0}q(\mathbf{x}_{t+k|t},% \mathbf{u}_{t+k|t})italic_p ( bold_x start_POSTSUBSCRIPT italic_t + italic_N | italic_t end_POSTSUBSCRIPT ) + ∑ start_POSTSUPERSCRIPT italic_N - 1 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k = 0 end_POSTSUBSCRIPT italic_q ( bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ) (3a)
s.t.formulae-sequencest\displaystyle\mathrm{s.t.}\ roman_s . roman_t . forallk=0,,N1::forall𝑘0𝑁1absent\displaystyle\mathrm{for}\ \mathrm{all}\ k=0,\dots,N-1:roman_for roman_all italic_k = 0 , … , italic_N - 1 :
𝐱t+k+1|t=f(𝐱t+k|t,𝐮t+k|t),subscript𝐱𝑡𝑘conditional1𝑡𝑓subscript𝐱𝑡conditional𝑘𝑡subscript𝐮𝑡conditional𝑘𝑡\displaystyle\mathbf{x}_{t+k+1|t}=f(\mathbf{x}_{t+k|t},\mathbf{u}_{t+k|t}),bold_x start_POSTSUBSCRIPT italic_t + italic_k + 1 | italic_t end_POSTSUBSCRIPT = italic_f ( bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ) , (3b)
𝐱t+k|t𝒳,subscript𝐱𝑡conditional𝑘𝑡𝒳\displaystyle\mathbf{x}_{t+k|t}\in\mathcal{X},bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ∈ caligraphic_X , (3c)
𝐮t+k|t𝒰,subscript𝐮𝑡conditional𝑘𝑡𝒰\displaystyle\mathbf{u}_{t+k|t}\in\mathcal{U},bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ∈ caligraphic_U , (3d)
𝐱t|t=𝐱t,subscript𝐱conditional𝑡𝑡subscript𝐱𝑡\displaystyle\mathbf{x}_{t|t}=\mathbf{x}_{t},bold_x start_POSTSUBSCRIPT italic_t | italic_t end_POSTSUBSCRIPT = bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , (3e)
gi(𝐱t+k|t,𝐨t+k|ti)0,subscript𝑔𝑖subscript𝐱𝑡conditional𝑘𝑡subscriptsuperscript𝐨𝑖𝑡conditional𝑘𝑡0\displaystyle g_{i}(\mathbf{x}_{t+k|t},\mathbf{o}^{i}_{t+k|t})\geq 0,italic_g start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ) ≥ 0 , (3f)

where N𝑁Nitalic_N is the prediction horizon. The vectors 𝐱t+k|tsubscript𝐱𝑡conditional𝑘𝑡\mathbf{x}_{t+k|t}bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT and 𝐮t+k|tsubscript𝐮𝑡conditional𝑘𝑡\mathbf{u}_{t+k|t}bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT represent the predicted state and designed input at time step t+k𝑡𝑘t+kitalic_t + italic_k, respectively. The first term of the cost function (3a) is the terminal cost, and the latter one is the stage cost. (3c) and (3d) represent the state constraints and input constraints along the prediction horizon, respectively. The safety limit distance constraints are represented by (3f), where gi(𝐱t+k|t,𝐨t+k|ti)=(xt+k|txt+k|ti)2+(yt+k|tyt+k|ti)2risubscript𝑔𝑖subscript𝐱𝑡conditional𝑘𝑡subscriptsuperscript𝐨𝑖𝑡conditional𝑘𝑡superscriptsubscript𝑥𝑡conditional𝑘𝑡subscriptsuperscript𝑥𝑖𝑡conditional𝑘𝑡2superscriptsubscript𝑦𝑡conditional𝑘𝑡subscriptsuperscript𝑦𝑖𝑡conditional𝑘𝑡2subscript𝑟𝑖g_{i}(\mathbf{x}_{t+k|t},\mathbf{o}^{i}_{t+k|t})=\sqrt{(x_{t+k|t}-x^{i}_{t+k|t% })^{2}+(y_{t+k|t}-y^{i}_{t+k|t})^{2}}-r_{i}italic_g start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ) = square-root start_ARG ( italic_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT - italic_x start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + ( italic_y start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT - italic_y start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG - italic_r start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT.

The optimal solution of this problem at time step t𝑡titalic_t is an input sequence i.e. {𝐮t|t*,,𝐮t+N1|t*}subscriptsuperscript𝐮conditional𝑡𝑡subscriptsuperscript𝐮𝑡𝑁conditional1𝑡\{\mathbf{u}^{*}_{t|t},\dots,\mathbf{u}^{*}_{t+N-1|t}\}{ bold_u start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t | italic_t end_POSTSUBSCRIPT , … , bold_u start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t + italic_N - 1 | italic_t end_POSTSUBSCRIPT }. Only the first element 𝐮t|t*subscriptsuperscript𝐮conditional𝑡𝑡\mathbf{u}^{*}_{t|t}bold_u start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t | italic_t end_POSTSUBSCRIPT of the optimal solution will become the control input of the system (1). Then the above optimization problem is solved repeatedly at the new state 𝐱t+1subscript𝐱𝑡1\mathbf{x}_{t+1}bold_x start_POSTSUBSCRIPT italic_t + 1 end_POSTSUBSCRIPT.

II-B Control Barrier Function

In control theory, CBF is a continuously differentiable function used to ensure forward invariance of the system state. When the system state is on the boundary of the invariant set, CBF can adjust the input of the control system to keep it within the invariant set. The definition of CBF is given below based on the concept of safety set in [17]. Assume that the safe set 𝒞𝒞\mathcal{C}caligraphic_C is a super-level set of a continuously differentiable function h:n:superscript𝑛h:\mathbb{R}^{n}\rightarrow\mathbb{R}italic_h : blackboard_R start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT → blackboard_R:

𝒞𝒞\displaystyle\mathcal{C}caligraphic_C ={𝐱n:h(𝐱)0},absentconditional-set𝐱superscript𝑛𝐱0\displaystyle=\{\mathbf{x}\in\mathbb{R}^{n}:h(\mathbf{x})\geq 0\},= { bold_x ∈ blackboard_R start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT : italic_h ( bold_x ) ≥ 0 } , (4a)
𝒞𝒞\displaystyle\partial\mathcal{C}∂ caligraphic_C ={𝐱n:h(𝐱)=0},absentconditional-set𝐱superscript𝑛𝐱0\displaystyle=\{\mathbf{x}\in\mathbb{R}^{n}:h(\mathbf{x})=0\},= { bold_x ∈ blackboard_R start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT : italic_h ( bold_x ) = 0 } , (4b)
Int(𝒞)Int𝒞\displaystyle\mathrm{Int}(\mathcal{C})roman_Int ( caligraphic_C ) ={𝐱n:h(𝐱)>0}.absentconditional-set𝐱superscript𝑛𝐱0\displaystyle=\{\mathbf{x}\in\mathbb{R}^{n}:h(\mathbf{x})>0\}.= { bold_x ∈ blackboard_R start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT : italic_h ( bold_x ) > 0 } . (4c)

And h(𝐱)𝐱0𝐱𝐱0\frac{\partial h(\mathbf{x})}{\partial\mathbf{x}}\neq 0divide start_ARG ∂ italic_h ( bold_x ) end_ARG start_ARG ∂ bold_x end_ARG ≠ 0 holds for all points on the boundary of the safe set. Then if and only if h˙(𝐱)=h(𝐱)𝐱𝐱˙0,𝐱𝒞formulae-sequence˙𝐱𝐱𝐱˙𝐱0for-all𝐱𝒞\dot{h}(\mathbf{x})=\frac{\partial h(\mathbf{x})}{\partial\mathbf{x}}\dot{% \mathbf{x}}\geq 0,\forall\mathbf{x}\in\partial\mathcal{C}over˙ start_ARG italic_h end_ARG ( bold_x ) = divide start_ARG ∂ italic_h ( bold_x ) end_ARG start_ARG ∂ bold_x end_ARG over˙ start_ARG bold_x end_ARG ≥ 0 , ∀ bold_x ∈ ∂ caligraphic_C, the set 𝒞𝒞\mathcal{C}caligraphic_C is a forward invariant set, that is a safe set.

Definition 1 (Discrete-time CBF[2]).

Consider the discrete-time system (1). Given a set 𝒞𝒞\mathcal{C}caligraphic_C defined by (II-B) for a function h:nnormal-:normal-→superscript𝑛h:\mathbb{R}^{n}\rightarrow\mathbb{R}italic_h : blackboard_R start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT → blackboard_R, the function hhitalic_h is a discrete-time CBF if there exists a function γ𝒦𝛾subscript𝒦\gamma\in\mathcal{K}_{\infty}italic_γ ∈ caligraphic_K start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT s.t.

Δh(𝐱k,𝐮k)γ(h(𝐱k)),Δsubscript𝐱𝑘subscript𝐮𝑘𝛾subscript𝐱𝑘\Delta h(\mathbf{x}_{k},\mathbf{u}_{k})\geq-\gamma(h(\mathbf{x}_{k})),roman_Δ italic_h ( bold_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) ≥ - italic_γ ( italic_h ( bold_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) ) , (5)

where Δh(𝐱k,𝐮k):=h(𝐱k+1)h(𝐱k)assignnormal-Δsubscript𝐱𝑘subscript𝐮𝑘subscript𝐱𝑘1subscript𝐱𝑘\Delta h(\mathbf{x}_{k},\mathbf{u}_{k}):=h(\mathbf{x}_{k+1})-h(\mathbf{x}_{k})roman_Δ italic_h ( bold_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) := italic_h ( bold_x start_POSTSUBSCRIPT italic_k + 1 end_POSTSUBSCRIPT ) - italic_h ( bold_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ).

When discrete-time CBF is used as constraints in the safety-critical MPC design, the safety of the system can be fully guaranteed while avoiding static obstacles[2]. In order to better apply it to dynamic scenes, [3] proposed D-CBF on this basis. Similarly, we assume that the shape of the moving obstacle does not change and modify (5) to the following form

Δhi(𝐱k,𝐮k,𝐨ki)γ(hi(𝐱k,𝐨ki)),Δsubscript𝑖subscript𝐱𝑘subscript𝐮𝑘subscriptsuperscript𝐨𝑖𝑘𝛾subscript𝑖subscript𝐱𝑘subscriptsuperscript𝐨𝑖𝑘\Delta h_{i}(\mathbf{x}_{k},\mathbf{u}_{k},\mathbf{o}^{i}_{k})\geq-\gamma(h_{i% }(\mathbf{x}_{k},\mathbf{o}^{i}_{k})),roman_Δ italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT , bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) ≥ - italic_γ ( italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT , bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) ) , (6)

where Δhi(𝐱k,𝐮k,𝐨ki):=h(𝐱k+1,𝐨k+1i)h(𝐱k,𝐨ki)assignΔsubscript𝑖subscript𝐱𝑘subscript𝐮𝑘subscriptsuperscript𝐨𝑖����subscript𝐱𝑘1subscriptsuperscript𝐨𝑖𝑘1subscript𝐱𝑘subscriptsuperscript𝐨𝑖𝑘\Delta h_{i}(\mathbf{x}_{k},\mathbf{u}_{k},\mathbf{o}^{i}_{k}):=h(\mathbf{x}_{% k+1},\mathbf{o}^{i}_{k+1})-h(\mathbf{x}_{k},\mathbf{o}^{i}_{k})roman_Δ italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT , bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) := italic_h ( bold_x start_POSTSUBSCRIPT italic_k + 1 end_POSTSUBSCRIPT , bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k + 1 end_POSTSUBSCRIPT ) - italic_h ( bold_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT , bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ). We follow the result of [2] and select the 𝒦subscript𝒦\mathcal{K}_{\infty}caligraphic_K start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT function γ()𝛾\gamma(\cdot)italic_γ ( ⋅ ) as a constant γ(0,1]𝛾01\gamma\in(0,1]italic_γ ∈ ( 0 , 1 ].

Therefore, assuming that the states of the robot and the moving obstacles at time t𝑡titalic_t are known, the MPC problem based on D-CBF (6) constraints (MPC-D-CBF) is as follows

min𝐮t+k|tsubscriptsubscript𝐮𝑡conditional𝑘𝑡\displaystyle\min_{\mathbf{u}_{t+k|t}}\ roman_min start_POSTSUBSCRIPT bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT p(𝐱t+N|t)+k=0N1q(𝐱t+k|t,𝐮t+k|t)𝑝subscript𝐱𝑡conditional𝑁𝑡subscriptsuperscript𝑁1𝑘0𝑞subscript𝐱𝑡conditional𝑘𝑡subscript𝐮𝑡conditional𝑘𝑡\displaystyle p(\mathbf{x}_{t+N|t})+\sum^{N-1}_{k=0}q(\mathbf{x}_{t+k|t},% \mathbf{u}_{t+k|t})italic_p ( bold_x start_POSTSUBSCRIPT italic_t + italic_N | italic_t end_POSTSUBSCRIPT ) + ∑ start_POSTSUPERSCRIPT italic_N - 1 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k = 0 end_POSTSUBSCRIPT italic_q ( bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ) (7a)
s.t.formulae-sequencest\displaystyle\mathrm{s.t.}\ roman_s . roman_t . forallk=0,,N1::forall𝑘0𝑁1absent\displaystyle\mathrm{for}\ \mathrm{all}\ k=0,\dots,N-1:roman_for roman_all italic_k = 0 , … , italic_N - 1 :
𝐱t+k+1|t=f(𝐱t+k|t,𝐮t+k|t),subscript𝐱𝑡𝑘conditional1𝑡𝑓subscript𝐱𝑡conditional𝑘𝑡subscript𝐮𝑡conditional𝑘𝑡\displaystyle\mathbf{x}_{t+k+1|t}=f(\mathbf{x}_{t+k|t},\mathbf{u}_{t+k|t}),bold_x start_POSTSUBSCRIPT italic_t + italic_k + 1 | italic_t end_POSTSUBSCRIPT = italic_f ( bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ) , (7b)
𝐱t+k|t𝒳,subscript𝐱𝑡conditional𝑘𝑡𝒳\displaystyle\mathbf{x}_{t+k|t}\in\mathcal{X},bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ∈ caligraphic_X , (7c)
𝐮t+k|t𝒰,subscript𝐮𝑡conditional𝑘𝑡𝒰\displaystyle\mathbf{u}_{t+k|t}\in\mathcal{U},bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ∈ caligraphic_U , (7d)
𝐱t|t=𝐱t,subscript𝐱conditional𝑡𝑡subscript𝐱𝑡\displaystyle\mathbf{x}_{t|t}=\mathbf{x}_{t},bold_x start_POSTSUBSCRIPT italic_t | italic_t end_POSTSUBSCRIPT = bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , (7e)
hi(𝐱t+k+1|t,𝐨t+k+1|ti)(1γ)hi(𝐱t+k|t,𝐨t+k|ti),subscript𝑖subscript𝐱𝑡𝑘conditional1𝑡subscriptsuperscript𝐨𝑖𝑡𝑘conditional1𝑡1𝛾subscript𝑖subscript𝐱𝑡conditional𝑘𝑡subscriptsuperscript𝐨𝑖𝑡conditional𝑘𝑡\displaystyle h_{i}(\mathbf{x}_{t+k+1|t},\mathbf{o}^{i}_{t+k+1|t})\geq(1-% \gamma)h_{i}(\mathbf{x}_{t+k|t},\mathbf{o}^{i}_{t+k|t}),italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_t + italic_k + 1 | italic_t end_POSTSUBSCRIPT , bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t + italic_k + 1 | italic_t end_POSTSUBSCRIPT ) ≥ ( 1 - italic_γ ) italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ) , (7f)

where the constraint (7f) guarantees the forward invariance of the safe set 𝒞𝒞\mathcal{C}caligraphic_C [2], and 𝒞𝒞\mathcal{C}caligraphic_C is defined in (4). When γ=1𝛾1\gamma=1italic_γ = 1, constraint (7f) degenerates into constraint (3f), which will lead to a decrease in safety. However, the smaller the value of γ𝛾\gammaitalic_γ is, the more stringent the constraints will become, which makes the optimization problem difficult to solve or even unsolvable.

Refer to caption
Figure 1: Feasibility of optimization problem (II-B). The reachable set propagates along the prediction horizon, starting from the initial state 𝐱t|tsubscript𝐱conditional𝑡𝑡\mathbf{x}_{t|t}bold_x start_POSTSUBSCRIPT italic_t | italic_t end_POSTSUBSCRIPT. The definition of the safety set 𝒞𝒞\mathcal{C}caligraphic_C is derived from equation (II-B). 𝒮𝒮\mathcal{S}caligraphic_S represents the state space set that satisfies constraints (7c) and (7f). The red arc represents the boundary of 𝒮𝒮\mathcal{S}caligraphic_S, and its interior is depicted by a red arrow. The optimization problem is feasible only if the intersection of the reachable set and 𝒮𝒮\mathcal{S}caligraphic_S is not empty.

The CBF hard constraints (7f) in the optimization problem act on the entire prediction horizon. Thus, if there is no feasible region in any step of the prediction horizon, it renders the entire optimization problem infeasible. As seen in Fig. 1, when the CBF constraints are satisfied, the system state must be within the safe set 𝒞𝒞\mathcal{C}caligraphic_C. However, violating the CBF constraints does not imply that the system state is outside the safe set 𝒞𝒞\mathcal{C}caligraphic_C. Specifically, even if there is no feasible region at step k𝑘kitalic_k, it does not mean that the current state lacks a suitable input to avoid collision. Although the optimization problem (II-B) proves efficient and safe in simple scenarios, its performance may degrade in crowded environments. The reason is the application of CBF hard constraints (7f) over the entire prediction horizon could lead to frequent solution failures.

III CONTROLLER DESIGN

When the optimization problem (II-B) is infeasible, optimal control cannot be obtained, which may reduce the navigation efficiency of the mobile robot in obstacle avoidance scenarios. Hence, in the following, we formulate the relaxed safety control logic to alleviate this problem.

III-A Soft Constrained Predictive Control With CBF

According to [2], the infeasibility problem encountered in MPC arises from the intersection of the reachable set and set 𝒮𝒮\mathcal{S}caligraphic_S, which satisfies CBF constraints at horizon step k𝑘kitalic_k, being empty. In dynamic scenarios, this problem is especially serious as the set 𝒮𝒮\mathcal{S}caligraphic_S is mainly determined by dynamic obstacles and the initial state of the robot, which is dynamically changing. The mobile robot is then more likely to enter a state that is infeasible, so new control inputs cannot be obtained by solving (II-B). In this case, methods such as repeating the control input from the previous moment or calculating the control input without constraints will violate the controller requirements and may lead to unpredictable and dangerous behavior. The most direct way to solve this problem is to adjust the prediction horizon N𝑁Nitalic_N, but this will also change the prediction ability of the controller. When a robot’s ability to predict is not adequate, it tends to explore riskier areas more often. This behaviour, in turn, worsens its initial state at future time instants.

In the optimization problem (II-B), the scalar γ𝛾\gammaitalic_γ in the state constraints (7e) is also called a conservative coefficient. This is because we can find a trade-off between safety and feasibility by choosing the value of γ𝛾\gammaitalic_γ. However, gaining feasibility by reducing safety is not our original intention. Some state constraint softening methods for MPC have been proposed, such as [18, 19], which can ensure the feasibility of online optimization problems under unexpected disturbances. Inspired by these studies, we modify (II-B) to the soft-constrained MPC problem based on CBF (SCMPC-CBF)

min𝐮t+k|t,ζt+k|tsubscriptsubscript𝐮𝑡conditional𝑘𝑡subscript𝜁𝑡conditional𝑘𝑡\displaystyle\min_{\mathbf{u}_{t+k|t},\mathbf{\zeta}_{t+k|t}}\ roman_min start_POSTSUBSCRIPT bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , italic_ζ start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT p(𝐱t+N|t)+k=0N1q(𝐱t+k|t,𝐮t+k|t)𝑝subscript𝐱𝑡conditional𝑁𝑡subscriptsuperscript𝑁1𝑘0𝑞subscript𝐱𝑡conditional𝑘𝑡subscript𝐮𝑡conditional𝑘𝑡\displaystyle p(\mathbf{x}_{t+N|t})+\sum^{N-1}_{k=0}q(\mathbf{x}_{t+k|t},% \mathbf{u}_{t+k|t})italic_p ( bold_x start_POSTSUBSCRIPT italic_t + italic_N | italic_t end_POSTSUBSCRIPT ) + ∑ start_POSTSUPERSCRIPT italic_N - 1 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k = 0 end_POSTSUBSCRIPT italic_q ( bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT )
+αk=0N1ζt+k|t𝛼subscriptsuperscript𝑁1𝑘0normsubscript𝜁𝑡conditional𝑘𝑡\displaystyle+\alpha\sum^{N-1}_{k=0}\|\mathbf{\zeta}_{t+k|t}\|+ italic_α ∑ start_POSTSUPERSCRIPT italic_N - 1 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k = 0 end_POSTSUBSCRIPT ∥ italic_ζ start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ∥ (8a)
s.t.formulae-sequencest\displaystyle\mathrm{s.t.}\ roman_s . roman_t . forallk=0,,N1::forall𝑘0𝑁1absent\displaystyle\mathrm{for}\ \mathrm{all}\ k=0,\dots,N-1:roman_for roman_all italic_k = 0 , … , italic_N - 1 :
𝐱t+k+1|t=f(𝐱t+k|t,𝐮t+k|t),subscript𝐱𝑡𝑘conditional1𝑡𝑓subscript𝐱𝑡conditional𝑘𝑡subscript𝐮𝑡conditional𝑘𝑡\displaystyle\mathbf{x}_{t+k+1|t}=f(\mathbf{x}_{t+k|t},\mathbf{u}_{t+k|t}),bold_x start_POSTSUBSCRIPT italic_t + italic_k + 1 | italic_t end_POSTSUBSCRIPT = italic_f ( bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ) , (8b)
𝐱t+k|t𝒳,subscript𝐱𝑡conditional𝑘𝑡𝒳\displaystyle\mathbf{x}_{t+k|t}\in\mathcal{X},bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ∈ caligraphic_X , (8c)
𝐮t+k|t𝒰,subscript𝐮𝑡conditional𝑘𝑡𝒰\displaystyle\mathbf{u}_{t+k|t}\in\mathcal{U},bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ∈ caligraphic_U , (8d)
𝐱t|t=𝐱t,subscript𝐱conditional𝑡𝑡subscript𝐱𝑡\displaystyle\mathbf{x}_{t|t}=\mathbf{x}_{t},bold_x start_POSTSUBSCRIPT italic_t | italic_t end_POSTSUBSCRIPT = bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , (8e)
𝐱t+k|t𝒳(ζt+k|t),ζt+k|t0,formulae-sequencesubscript𝐱𝑡conditional𝑘𝑡𝒳subscript𝜁𝑡conditional𝑘𝑡subscript𝜁𝑡conditional𝑘𝑡0\displaystyle\mathbf{x}_{t+k|t}\in\mathcal{X}(\mathbf{\zeta}_{t+k|t}),\mathbf{% \zeta}_{t+k|t}\geq 0,bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ∈ caligraphic_X ( italic_ζ start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ) , italic_ζ start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ≥ 0 , (8f)

where ζNo𝜁superscriptsubscript𝑁𝑜\mathbf{\zeta}\in\mathbb{R}^{N_{o}}italic_ζ ∈ blackboard_R start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_o end_POSTSUBSCRIPT end_POSTSUPERSCRIPT is the slack variable and can be expressed as ζt+k|t=[ζt+k|t1,,ζt+k|tNo]Tsubscript𝜁𝑡conditional𝑘𝑡superscriptsuperscriptsubscript𝜁𝑡conditional𝑘𝑡1superscriptsubscript𝜁𝑡conditional𝑘𝑡subscript𝑁𝑜𝑇\mathbf{\zeta}_{t+k|t}=[\zeta_{t+k|t}^{1},\ldots,\zeta_{t+k|t}^{N_{o}}]^{T}italic_ζ start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT = [ italic_ζ start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT , … , italic_ζ start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_o end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ] start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT. The soft constraint (8f) can be then described as,

𝒳(ζt+k|t)={𝐱n|ζt+k|ti(1γ)hi(𝐱t+k|t,𝐨t+k|ti)hi(𝐱t+k+1|t,𝐨t+k+1|ti),i=1,,No}.\begin{array}[]{rl}\mathcal{X}(\zeta_{t+k|t})&=\{\mathbf{x}\in\mathbb{R}^{n}|% \zeta_{t+k|t}^{i}\geq(1-\gamma)h_{i}(\mathbf{x}_{t+k|t},\mathbf{o}_{t+k|t}^{i}% )\\ &-h_{i}(\mathbf{x}_{t+k+1|t},\mathbf{o}_{t+k+1|t}^{i}),\forall i=1,\ldots,N_{o% }\}.\end{array}start_ARRAY start_ROW start_CELL caligraphic_X ( italic_ζ start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ) end_CELL start_CELL = { bold_x ∈ blackboard_R start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT | italic_ζ start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ≥ ( 1 - italic_γ ) italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , bold_o start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ) end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL - italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_t + italic_k + 1 | italic_t end_POSTSUBSCRIPT , bold_o start_POSTSUBSCRIPT italic_t + italic_k + 1 | italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ) , ∀ italic_i = 1 , … , italic_N start_POSTSUBSCRIPT italic_o end_POSTSUBSCRIPT } . end_CELL end_ROW end_ARRAY (9)

In (8a), α𝛼\alphaitalic_α is the constraint violation penalty weight. This ensures that the optimization problem is feasible for any input sequence in 𝒰𝒰\mathcal{U}caligraphic_U. Even if 𝐱t+k|tsubscript𝐱𝑡conditional𝑘𝑡\mathbf{x}_{t+k|t}bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT does not satisfy the constraint (7f), these violations are penalized in the cost function to determine the value of the slack variables. By constructing in this manner, we can provide an optimal solution that is not only close to that of (II-B) but also ensures feasibility.

Theorem 1.

Given a state 𝐱tsubscript𝐱𝑡\mathbf{x}_{t}bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT, if 𝐮t+k|t*,k=0,,N1formulae-sequencesuperscriptsubscript𝐮𝑡conditional𝑘𝑡𝑘0normal-…𝑁1\mathbf{u}_{t+k|t}^{*},k=0,\ldots,N-1bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT , italic_k = 0 , … , italic_N - 1 is the optimal solution to (II-B), then there exists a Lagrange vector λ*superscript𝜆\mathbf{\lambda}^{*}italic_λ start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT such that

(𝐮t+k|t*,λ*)=0,superscriptsubscript𝐮𝑡conditional𝑘𝑡superscript𝜆0\mathcal{L}(\mathbf{u}_{t+k|t}^{*},\mathbf{\lambda}^{*})=0,caligraphic_L ( bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT , italic_λ start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT ) = 0 ,

in which (𝐮t+k|t,λ)subscript𝐮𝑡conditional𝑘𝑡𝜆\mathcal{L}(\mathbf{u}_{t+k|t},\mathbf{\lambda})caligraphic_L ( bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , italic_λ ) is the Lagrangian of the optimization problem (II-B), i.e.,

(𝐮t+k,λ)=Jt(𝐱t,𝐮t|t,,𝐮t+N1|t)+λTct(𝐱t,𝐮t|t,,𝐮t+N1|t).subscript𝐮𝑡𝑘𝜆absentsubscript𝐽𝑡subscript𝐱𝑡subscript𝐮conditional𝑡𝑡subscript𝐮𝑡𝑁conditional1𝑡missing-subexpressionsuperscript𝜆𝑇subscript𝑐𝑡subscript𝐱𝑡subscript𝐮conditional𝑡𝑡subscript𝐮𝑡𝑁conditional1𝑡\begin{array}[]{rl}\mathcal{L}(\mathbf{u}_{t+k},\mathbf{\lambda})&=J_{t}(% \mathbf{x}_{t},\mathbf{u}_{t|t},\ldots,\mathbf{u}_{t+N-1|t})\\ &+\mathbf{\lambda}^{T}c_{t}(\mathbf{x}_{t},\mathbf{u}_{t|t},\ldots,\mathbf{u}_% {t+N-1|t}).\end{array}start_ARRAY start_ROW start_CELL caligraphic_L ( bold_u start_POSTSUBSCRIPT italic_t + italic_k end_POSTSUBSCRIPT , italic_λ ) end_CELL start_CELL = italic_J start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_t | italic_t end_POSTSUBSCRIPT , … , bold_u start_POSTSUBSCRIPT italic_t + italic_N - 1 | italic_t end_POSTSUBSCRIPT ) end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL + italic_λ start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_c start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_t | italic_t end_POSTSUBSCRIPT , … , bold_u start_POSTSUBSCRIPT italic_t + italic_N - 1 | italic_t end_POSTSUBSCRIPT ) . end_CELL end_ROW end_ARRAY (10)

In (10), the expressions Jt(𝐱t,𝐮t|t,,𝐮t+N1|t)subscript𝐽𝑡subscript𝐱𝑡subscript𝐮conditional𝑡𝑡normal-…subscript𝐮𝑡𝑁conditional1𝑡J_{t}(\mathbf{x}_{t},\mathbf{u}_{t|t},\ldots,\mathbf{u}_{t+N-1|t})italic_J start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_t | italic_t end_POSTSUBSCRIPT , … , bold_u start_POSTSUBSCRIPT italic_t + italic_N - 1 | italic_t end_POSTSUBSCRIPT ) and ct(𝐱t,𝐮t|t,,𝐮t+N1|t)subscript𝑐𝑡subscript𝐱𝑡subscript𝐮conditional𝑡𝑡normal-…subscript𝐮𝑡𝑁conditional1𝑡c_{t}(\mathbf{x}_{t},\mathbf{u}_{t|t},\ldots,\mathbf{u}_{t+N-1|t})italic_c start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_t | italic_t end_POSTSUBSCRIPT , … , bold_u start_POSTSUBSCRIPT italic_t + italic_N - 1 | italic_t end_POSTSUBSCRIPT ) are the cost and constraint corresponding to (II-B), respectively. Besides, if we choose the penalty α𝛼\alphaitalic_α such that

α>λ*D,𝛼subscriptnormsuperscript𝜆𝐷\alpha>\|\mathbf{\lambda}^{*}\|_{D},italic_α > ∥ italic_λ start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT italic_D end_POSTSUBSCRIPT ,

where D\|\cdot\|_{D}∥ ⋅ ∥ start_POSTSUBSCRIPT italic_D end_POSTSUBSCRIPT denotes the dual norm with respect to the norm for ζt+k|tsubscript𝜁𝑡conditional𝑘𝑡\mathbf{\zeta}_{t+k|t}italic_ζ start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT in (8a) and 𝐮t+k|t*superscriptsubscript𝐮𝑡conditional𝑘𝑡\mathbf{u}_{t+k|t}^{*}bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT satisfies (7b)-(7f), then the optimal solutions to (II-B) and (III-A) are equivalent.

Proof.

See [20] Thm 14.2.1, 14.3.1. ∎

Remark 1.

It is noted that if we choose the penalty weight α𝛼\alphaitalic_α such that

α>max𝐱tλ*D,𝛼subscriptsubscript𝐱𝑡subscriptnormsuperscript𝜆𝐷\alpha>\max\limits_{\mathbf{x}_{t}}\|\lambda^{*}\|_{D},italic_α > roman_max start_POSTSUBSCRIPT bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT ∥ italic_λ start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT italic_D end_POSTSUBSCRIPT ,

then for all initial states 𝐱tsubscript𝐱𝑡\mathbf{x}_{t}bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT, if the optimization problem (II-B) is feasible, then the optimal solution of (II-B) and (III-A) are equivalent.

In practice, it is not easy to determine max𝐱tλ*Dsubscriptsubscript𝐱𝑡subscriptnormsuperscript𝜆𝐷\max_{\mathbf{x}_{t}}\|\lambda^{*}\|_{D}roman_max start_POSTSUBSCRIPT bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT ∥ italic_λ start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT italic_D end_POSTSUBSCRIPT, as the terms Jt,ctsubscript𝐽𝑡subscript𝑐𝑡J_{t},c_{t}italic_J start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , italic_c start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT in the Lagrangian (10) change with respect to the state 𝐱tsubscript𝐱𝑡\mathbf{x}_{t}bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT. Hence, before designing a soft constrained predictive control problem, we select a number of initial states 𝐱tsubscript𝐱𝑡\mathbf{x}_{t}bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT to estimate different α𝛼\alphaitalic_α and then choose the maximum one as the penalty factor.

After the softening, if the optimization problem (II-B) is infeasible, we can use the optimization problem (III-A) to find a solution. However, this solution may not necessarily be feasible for (II-B).

III-B Safety Enhancement with D-GCBF

In Section III-A, the infeasibility problem of the obstacle avoidance problem (II-B) is solved by softening the CBF hard constraints. In this way, the softened problem (III-A) is always feasible. However, the solutions of (III-A) and (II-B) are equivalent only when (II-B) is feasible. In safety-critical situations, conflicting with obstacles is not desirable, and further efforts should be devoted to enhancing safety in SCMPC-CBF.

Common methods for ensuring safety are to impose a control invariant set [21], and in our previous work [22], by constructing a safety filter [23], which is basically a control invariant set, the safety of reinforcement-learning generated controller can be improved. Here, we adopt similar ideas of the safety filter, and considering that control invariant sets are difficult to calculate for high-dimensional nonlinear systems, the dynamic generalized CBF (D-GCBF) is employed. In order to design the D-GCBF, the relative degree of the state constraint to the system (8b) is first considered.

Definition 2 (Relative-degree [24]).

The state constraint h(𝐱t)subscript𝐱𝑡h(\mathbf{x}_{t})italic_h ( bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) of system (1) has relative-degree d𝑑ditalic_d with respect to control input 𝐮tsubscript𝐮𝑡\mathbf{u}_{t}bold_u start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT if

h(𝐱t+j)𝐮t=0,h(𝐱t+d)𝐮t0,formulae-sequencesubscript𝐱𝑡𝑗subscript𝐮𝑡0subscript𝐱𝑡𝑑subscript𝐮𝑡0\frac{\partial h(\mathbf{x}_{t+j})}{\partial\mathbf{u}_{t}}=0,\ \frac{\partial h% (\mathbf{x}_{t+d})}{\partial\mathbf{u}_{t}}\neq 0,divide start_ARG ∂ italic_h ( bold_x start_POSTSUBSCRIPT italic_t + italic_j end_POSTSUBSCRIPT ) end_ARG start_ARG ∂ bold_u start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_ARG = 0 , divide start_ARG ∂ italic_h ( bold_x start_POSTSUBSCRIPT italic_t + italic_d end_POSTSUBSCRIPT ) end_ARG start_ARG ∂ bold_u start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_ARG ≠ 0 , (11)

for j{0,1,,d1}for-all𝑗01normal-…𝑑1\forall j\in\{0,1,\dots,d-1\}∀ italic_j ∈ { 0 , 1 , … , italic_d - 1 }, 𝐱nfor-all𝐱superscript𝑛\forall\mathbf{x}\in\mathbb{R}^{n}∀ bold_x ∈ blackboard_R start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT.

That is, the relative degree d𝑑ditalic_d is the delay step at which the control input 𝐮tsubscript𝐮𝑡\mathbf{u}_{t}bold_u start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT appears in 𝐲tsubscript𝐲𝑡\mathbf{y}_{t}bold_y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT. Therefore, it is valid to impose safety constraints at time step d𝑑ditalic_d but not at time step j𝑗jitalic_j.

By incorporating one-step state constraint at time step d𝑑ditalic_d, the optimization problem can benefit from a wider feasible region and improved computational efficiency, as opposed to including state constraints across d𝑑ditalic_d steps [15]. We proposed D-GCBF in dynamic scenarios based on the results of [15] and the definition of relative degree.

Definition 3 (D-GCBF).

Consider the discrete-time system (1). Given a set 𝒞𝒞\mathcal{C}caligraphic_C defined by (II-B) for a function h:n×nonormal-:normal-→superscript𝑛superscriptsubscript𝑛𝑜h:\mathbb{R}^{n}\times\mathbb{R}^{n_{o}}\rightarrow\mathbb{R}italic_h : blackboard_R start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT × blackboard_R start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_o end_POSTSUBSCRIPT end_POSTSUPERSCRIPT → blackboard_R, the function hhitalic_h is a dynamic generalized CBF if

hi(𝐱t+d,𝐨t+di)(1η)dhi(𝐱t,𝐨ti),subscript𝑖subscript𝐱𝑡𝑑subscriptsuperscript𝐨𝑖𝑡𝑑superscript1𝜂𝑑subscript𝑖subscript𝐱𝑡subscriptsuperscript𝐨𝑖𝑡h_{i}(\mathbf{x}_{t+d},\mathbf{o}^{i}_{t+d})\geq(1-\eta)^{d}h_{i}(\mathbf{x}_{% t},\mathbf{o}^{i}_{t}),italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_t + italic_d end_POSTSUBSCRIPT , bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t + italic_d end_POSTSUBSCRIPT ) ≥ ( 1 - italic_η ) start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) , (12)

where the constant η(γ,1]𝜂𝛾1\eta\in(\gamma,1]italic_η ∈ ( italic_γ , 1 ].

Refer to caption
Figure 2: Comparison between our obstacle avoidance constraints and those previously used. (a) shows that the CBF is imposed as hard constraints across the entire prediction horizon, and (b) represents that the CBF soft constraints functioning over this whole prediction horizon. Based on the formulation for (b), (c) introduces an additional D-GCBF constraint, which is hard and can enhance feasibility of the robot.

The reason η𝜂\etaitalic_η is lower bounded by γ𝛾\gammaitalic_γ is that we don’t want the hard constraint (12) to be stricter than the soft constraint (8f). Upon integrating the D-GCBF constraint, a comparison between our obstacle avoidance constraints and those previously used is depicted in Fig. 2. As can be seen from Fig. 2, when utilizing CBF as hard constraints, it can potentially lead to infeasibility issues. By converting these into soft constraints, the optimization problem will always remain solvable. Simultaneously, the addition of a one-step D-GCBF hard constraint could possibly result in solution failure as well, but it is comparatively less stringent. The complete optimization problem, i.e., SCMPC-CBF with D-GCBF, is constructed as follows

min𝐮t+k|t,ζt+k|tsubscriptsubscript𝐮𝑡conditional𝑘𝑡subscript𝜁𝑡conditional𝑘𝑡\displaystyle\min_{\mathbf{u}_{t+k|t},\mathbf{\zeta}_{t+k|t}}\ roman_min start_POSTSUBSCRIPT bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , italic_ζ start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT p(𝐱t+N|t)+k=0N1q(𝐱t+k|t,𝐮t+k|t)𝑝subscript𝐱𝑡conditional𝑁𝑡subscriptsuperscript𝑁1𝑘0𝑞subscript𝐱𝑡conditional𝑘𝑡subscript𝐮𝑡conditional𝑘𝑡\displaystyle p(\mathbf{x}_{t+N|t})+\sum^{N-1}_{k=0}q(\mathbf{x}_{t+k|t},% \mathbf{u}_{t+k|t})italic_p ( bold_x start_POSTSUBSCRIPT italic_t + italic_N | italic_t end_POSTSUBSCRIPT ) + ∑ start_POSTSUPERSCRIPT italic_N - 1 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k = 0 end_POSTSUBSCRIPT italic_q ( bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT )
+αk=0N1ζt+k|t𝛼subscriptsuperscript𝑁1𝑘0normsubscript𝜁𝑡conditional𝑘𝑡\displaystyle+\alpha\sum^{N-1}_{k=0}\|\mathbf{\zeta}_{t+k|t}\|+ italic_α ∑ start_POSTSUPERSCRIPT italic_N - 1 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k = 0 end_POSTSUBSCRIPT ∥ italic_ζ start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ∥ (13a)
s.t.formulae-sequencest\displaystyle\mathrm{s.t.}\ roman_s . roman_t . forallk=0,,N1::forall𝑘0𝑁1absent\displaystyle\mathrm{for}\ \mathrm{all}\ k=0,\dots,N-1:roman_for roman_all italic_k = 0 , … , italic_N - 1 :
𝐱t+k+1|t=f(𝐱t+k|t,𝐮t+k|t),subscript𝐱𝑡𝑘conditional1𝑡𝑓subscript𝐱𝑡conditional𝑘𝑡subscript𝐮𝑡conditional𝑘𝑡\displaystyle\mathbf{x}_{t+k+1|t}=f(\mathbf{x}_{t+k|t},\mathbf{u}_{t+k|t}),bold_x start_POSTSUBSCRIPT italic_t + italic_k + 1 | italic_t end_POSTSUBSCRIPT = italic_f ( bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ) , (13b)
𝐱t+k|t𝒳,subscript𝐱𝑡conditional𝑘𝑡𝒳\displaystyle\mathbf{x}_{t+k|t}\in\mathcal{X},bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ∈ caligraphic_X , (13c)
𝐮t+k|t𝒰,subscript𝐮𝑡conditional𝑘𝑡𝒰\displaystyle\mathbf{u}_{t+k|t}\in\mathcal{U},bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ∈ caligraphic_U , (13d)
𝐱t|t=𝐱t,subscript𝐱conditional𝑡𝑡subscript𝐱𝑡\displaystyle\mathbf{x}_{t|t}=\mathbf{x}_{t},bold_x start_POSTSUBSCRIPT italic_t | italic_t end_POSTSUBSCRIPT = bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , (13e)
𝐱t+k|t𝒳(ζt+k|t),ζt+k|t0,formulae-sequencesubscript𝐱𝑡conditional𝑘𝑡𝒳subscript𝜁𝑡conditional𝑘𝑡subscript𝜁𝑡conditional𝑘𝑡0\displaystyle\mathbf{x}_{t+k|t}\in\mathcal{X}(\mathbf{\zeta}_{t+k|t}),\mathbf{% \zeta}_{t+k|t}\geq 0,bold_x start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ∈ caligraphic_X ( italic_ζ start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ) , italic_ζ start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT ≥ 0 , (13f)
hi(𝐱t+d|t,𝐨t+d|ti)(1η)dhi(𝐱t|t,𝐨t|ti).subscript𝑖subscript𝐱𝑡conditional𝑑𝑡subscriptsuperscript𝐨𝑖𝑡conditional𝑑𝑡superscript1𝜂𝑑subscript𝑖subscript𝐱conditional𝑡𝑡subscriptsuperscript𝐨𝑖conditional𝑡𝑡\displaystyle h_{i}(\mathbf{x}_{t+d|t},\mathbf{o}^{i}_{t+d|t})\geq(1-\eta)^{d}% h_{i}(\mathbf{x}_{t|t},\mathbf{o}^{i}_{t|t}).italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_t + italic_d | italic_t end_POSTSUBSCRIPT , bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t + italic_d | italic_t end_POSTSUBSCRIPT ) ≥ ( 1 - italic_η ) start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_t | italic_t end_POSTSUBSCRIPT , bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t | italic_t end_POSTSUBSCRIPT ) . (13g)

The addition of the constraint (13g) can guarantee safety for static obstacles, which is stated in the following theorem.

Theorem 2.

For a relative-degree d𝑑ditalic_d state constraints h(𝐱)0𝐱0h(\mathbf{x})\geq 0italic_h ( bold_x ) ≥ 0 and the corresponding safe set (4a), assume the system satisfies h(𝐱t+j)0subscript𝐱𝑡𝑗0h(\mathbf{x}_{t+j})\geq 0italic_h ( bold_x start_POSTSUBSCRIPT italic_t + italic_j end_POSTSUBSCRIPT ) ≥ 0 for all j{1,,d1}𝑗1normal-…𝑑1j\in\{1,\ldots,d-1\}italic_j ∈ { 1 , … , italic_d - 1 }. Then by solving (III-B) at time t𝑡titalic_t, if feasible solutions 𝐮t+k|t,k=0,,N1formulae-sequencesubscript𝐮𝑡conditional𝑘𝑡𝑘0normal-…𝑁1\mathbf{u}_{t+k|t},k=0,\ldots,N-1bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , italic_k = 0 , … , italic_N - 1 and ζt+k|t,k=1,,Nformulae-sequencesubscript𝜁𝑡conditional𝑘𝑡𝑘1normal-…𝑁\zeta_{t+k|t},k=1,\ldots,Nitalic_ζ start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT , italic_k = 1 , … , italic_N to the problem (III-B) can be found, then in the next time steps, the state can be guaranteed to be within the safe set (4a).

Proof.

According to [15], if the system satisfies h(𝐱t+j)0subscript𝐱𝑡𝑗0h(\mathbf{x}_{t+j})\geq 0italic_h ( bold_x start_POSTSUBSCRIPT italic_t + italic_j end_POSTSUBSCRIPT ) ≥ 0 for all j{1,,d1}𝑗1𝑑1j\in\{1,\ldots,d-1\}italic_j ∈ { 1 , … , italic_d - 1 }, then the set (4a) defines a forward invariant safe set. Besides, as there is some control policy 𝐮t+k|tsubscript𝐮𝑡conditional𝑘𝑡\mathbf{u}_{t+k|t}bold_u start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT such that (13g) holds, the state is guaranteed to be with the safe set (4a), i.e., at time t+1𝑡1t+1italic_t + 1, we have

h(𝐱t+1)0,subscript𝐱𝑡10h(\mathbf{x}_{t+1})\geq 0,italic_h ( bold_x start_POSTSUBSCRIPT italic_t + 1 end_POSTSUBSCRIPT ) ≥ 0 ,

i.e., 𝐱t+i𝒞subscript𝐱𝑡𝑖𝒞\mathbf{x}_{t+i}\in\mathcal{C}bold_x start_POSTSUBSCRIPT italic_t + italic_i end_POSTSUBSCRIPT ∈ caligraphic_C. ∎

Therefore, for a fixed CBF h(𝐱)𝐱h(\mathbf{x})italic_h ( bold_x ), if the problem (III-B) is feasible, the system is always safe, and the system state 𝐱𝐱\mathbf{x}bold_x is always in the safe set 𝒞𝒞\mathcal{C}caligraphic_C.

Remark 2.

It is noted that D-GCBF simplifies multistep constraints into a single step, reducing computational complexity and enhancing feasibility. However, a one-step constraint may only partially ensure system safety[14], as the forward invariance of the set 𝒞𝒞\mathcal{C}caligraphic_C (4a) is guaranteed provided that h(𝐱t+j)0subscript𝐱𝑡𝑗0h(\mathbf{x}_{t+j})\geq 0italic_h ( bold_x start_POSTSUBSCRIPT italic_t + italic_j end_POSTSUBSCRIPT ) ≥ 0 holds for j{1,,d1}𝑗1normal-…𝑑1j\in\{1,\ldots,d-1\}italic_j ∈ { 1 , … , italic_d - 1 }, which are not included in the constraints of (III-B). Hence, solving (III-B) alone in general cannot guarantee the safety of the system. Therefore, we use D-GCBF as a single-step safeguard to reinforce the system’s safety after obtaining a potential safe trajectory through the optimization problem (III-A).

Remark 3.

In practice, the moving obstacles 𝐨i,i=1,,Noformulae-sequencesuperscript𝐨𝑖𝑖1normal-…superscript𝑁𝑜\mathbf{o}^{i},i=1,\ldots,N^{o}bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT , italic_i = 1 , … , italic_N start_POSTSUPERSCRIPT italic_o end_POSTSUPERSCRIPT cause the changing of the CBF hi(𝐱,𝐨i)subscript𝑖𝐱superscript𝐨𝑖h_{i}(\mathbf{x},\mathbf{o}^{i})italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x , bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ); thus, the safe set also evolves with time. And if we can show that

𝒞t𝒞t+1,tsubscript𝒞𝑡subscript𝒞𝑡1for-all𝑡\mathcal{C}_{t}\subset\mathcal{C}_{t+1},\forall tcaligraphic_C start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ⊂ caligraphic_C start_POSTSUBSCRIPT italic_t + 1 end_POSTSUBSCRIPT , ∀ italic_t (14)

where 𝒞tsubscript𝒞𝑡\mathcal{C}_{t}caligraphic_C start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT is the safe set at time t𝑡titalic_t, then according to Theorem 2 we have

𝐱t+1𝒞t𝒞t+1,subscript𝐱𝑡1subscript𝒞𝑡subscript𝒞𝑡1\mathbf{x}_{t+1}\in\mathcal{C}_{t}\subset\mathcal{C}_{t+1},bold_x start_POSTSUBSCRIPT italic_t + 1 end_POSTSUBSCRIPT ∈ caligraphic_C start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ⊂ caligraphic_C start_POSTSUBSCRIPT italic_t + 1 end_POSTSUBSCRIPT ,

i.e., the system is always safe at future times.

Although (14) is not always satisfied, and the safety of future times can not be guaranteed, we find in experiments that D-GCBF actually enhances safety.

It is noted that the hard constraint (13g) can also bring the problem of infeasibility. However, we can see that the constraint (13g) is far weaker than (7f), as (7f) is imposed on the entire prediction horizon, and besides, ηγ𝜂𝛾\eta\geq\gammaitalic_η ≥ italic_γ, hence the chance of infeasibility of (III-B) is far less than that of (II-B). To avoid possible damage, when (III-B) is infeasible, we ensure that the robot comes to a complete stop by activating the brakes instead of merely setting the control input to zero. The complete algorithm is presented as Algorithm 1.

Algorithm 1 SCMPC-CBF with D-GCBF
0:  Initial state 𝐱(t)𝐱𝑡\mathbf{x}(t)bold_x ( italic_t ), state constraints 𝒳𝒳\mathcal{X}caligraphic_X, input constraints 𝒰𝒰\mathcal{U}caligraphic_U, system dynamic (1), obstacles state 𝐨i(t)superscript𝐨𝑖𝑡\mathbf{o}^{i}(t)bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ( italic_t ), system dynamic (2) of obstacle, goal state 𝐱goalsubscript𝐱𝑔𝑜𝑎𝑙\mathbf{x}_{goal}bold_x start_POSTSUBSCRIPT italic_g italic_o italic_a italic_l end_POSTSUBSCRIPT.
0:  Optimal control 𝐮(t)𝐮𝑡\mathbf{u}(t)bold_u ( italic_t ).
1:  𝐱t=𝐱(t)subscript𝐱𝑡𝐱𝑡\mathbf{x}_{t}=\mathbf{x}(t)bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = bold_x ( italic_t ).
2:  𝐨ti=𝐨i(t)subscriptsuperscript𝐨𝑖𝑡superscript𝐨𝑖𝑡\mathbf{o}^{i}_{t}=\mathbf{o}^{i}(t)bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = bold_o start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ( italic_t ).
3:  Solve (III-B).
4:  if (III-B) is solved successfully then
5:     return  𝐮(t)=𝐮t|t*𝐮t+k|t*𝐮𝑡subscriptsuperscript𝐮conditional𝑡𝑡subscriptsuperscript𝐮𝑡conditional𝑘𝑡\mathbf{u}(t)=\mathbf{u}^{*}_{t|t}\in\mathbf{u}^{*}_{t+k|t}bold_u ( italic_t ) = bold_u start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t | italic_t end_POSTSUBSCRIPT ∈ bold_u start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t + italic_k | italic_t end_POSTSUBSCRIPT
6:  else
7:     Activate the braking mechanism on the robot.
8:  end if

IV EXPERIMENTS

In this section, experiments in simulation environments and real scenarios are conducted to verify the effectiveness of our work.

IV-A Simulation Setup

All controllers are implemented in Python with Casadi[25] as modeling language, solved with IPOPT [26]. The simulation experiments were conducted on a computer running Ubuntu 20.04, which used an Intel Core i5-12490f processor with 16 GB RAM.

Following [8], each agent must stay within a 10m×10m two-dimensional space. The simulated pedestrians are controlled by ORCA [5], and their initial positions are randomly sampled on a circle with a radius of 4m, and their target positions are on the other side of the same circle. The robot has the same radius of 0.3m and the same preferred speed of 1m/s as the pedestrian. And the simulation time step is 0.2s.

In order to fully evaluate the safety and effectiveness of the proposed algorithm, we set the robot to be invisible to other humans. That is, the simulated human only reacts to humans and turns a blind eye to the robot. All controller algorithms are evaluated using 500 random test cases with five pedestrians.

IV-B Quantitative Evaluation

All controller performances are evaluated under the same prediction horizon N𝑁Nitalic_N and the same form of stage and terminal cost. The motion model (2) of all obstacles is approximated by a linear model. We use the following indicators to compare the five controllers: S (the rate of the robot reaching its destination without collision), C (the rate of the robot colliding with moving obstacles), T (the robot’s average navigation time in seconds), FS (the average number of solution failures), ST (the robot’s average solution time in milliseconds).

IV-B1 Double-integrator system

Consider the robot’s motion model (1) as a discrete-time linear double-integrator system,

𝐱k+1=A𝐱k+B𝐮k,subscript𝐱𝑘1𝐴subscript𝐱𝑘𝐵subscript𝐮𝑘\mathbf{x}_{k+1}=A\mathbf{x}_{k}+B\mathbf{u}_{k},bold_x start_POSTSUBSCRIPT italic_k + 1 end_POSTSUBSCRIPT = italic_A bold_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT + italic_B bold_u start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT , (15)

where 𝐱=[x,y,vx,vy]T𝐱superscript𝑥𝑦subscript𝑣𝑥subscript𝑣𝑦𝑇\mathbf{x}=[x,y,v_{x},v_{y}]^{T}bold_x = [ italic_x , italic_y , italic_v start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT , italic_v start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT ] start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT and 𝐮=[ax,ay]T𝐮superscriptsubscript𝑎𝑥subscript𝑎𝑦𝑇\mathbf{u}=[a_{x},a_{y}]^{T}bold_u = [ italic_a start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT , italic_a start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT ] start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT represent position (x,y)𝑥𝑦(x,y)( italic_x , italic_y ), velocity (vx,vy)subscript𝑣𝑥subscript𝑣𝑦(v_{x},v_{y})( italic_v start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT , italic_v start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT ) and acceleration (ax,ay)subscript𝑎𝑥subscript𝑎𝑦(a_{x},a_{y})( italic_a start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT , italic_a start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT ), respectively. We set ϵ=0.2italic-ϵ0.2\epsilon=0.2italic_ϵ = 0.2 in MPC-DC (II-A) and ϵ=0italic-ϵ0\epsilon=0italic_ϵ = 0 in other methods. The results are shown in Table I.

TABLE I: Quantitative Results With Double-integrator System
Controller γ𝛾\gammaitalic_γ S \uparrow C \downarrow T FS ST
ORCA [5] - 0.470 0.526 11.04 - -
MPC-DC(II-A) - 0.362 0.636 12.97 6.486 45.11
MPC-D-CBF(II-B) 0.08 0.736 0.262 16.73 21.734 42.47
0.10 0.684 0.316 14.91 16.602 42.78
0.12 0.624 0.376 13.80 13.544 44.12
SCMPC-CBF(III-A) 0.08 0.952 0.048 13.45 0 53.01
0.10 0.776 0.224 12.19 0 53.48
0.12 0.640 0.360 11.61 0 53.33
Ours(III-B) 0.08 0.996 0.004 14.35 0.374 55.08
0.10 0.966 0.034 13.61 0.794 55.76
0.12 0.954 0.046 13.34 1.242 55.51

The results show that in a dynamic environment, short-term ORCA performs poorly in the invisible setting due to a violation of the reciprocal assumption. MPC-DC still performs poorly even if an additional safety margin ϵitalic-ϵ\epsilonitalic_ϵ is added. This is because it cannot avoid dynamic obstacles in advance and easily enters high-risk areas. The safety of MPC-D-CBF will improve as γ𝛾\gammaitalic_γ decreases, but correspondingly, the navigation efficiency will decrease due to tighter hard constraints. Robots driven by soft constraints will more actively explore new paths in crowded areas, and single-step safety constraints will also enhance the safety of robot navigation. Although the introduction of slack variables may slightly increase the complexity of the solution, it can prevent the extra time consumption caused by solution failure. As shown in Fig. 3, we compared the navigation paths of these controllers in the 330th test case.

Refer to caption
(a) ORCA
Refer to caption
(b) MPC-DC
Refer to caption
(c) MPC-D-CBF
Refer to caption
(d) SCMPC-CBF
Refer to caption
(e) Ours
Figure 3: Trajectories of different controllers with double-integrator kinematics system. The circles in the picture are the locations of the agents at the labeled time. In the 330th test case, our method can make the robot successfully reach the destination. More test results are shown in Table I.

IV-B2 Unicycle system

Consider the robot’s motion model (1) as a discrete-time nonlinear unicycle system,

𝐱k+1=f(𝐱k,𝐮k),subscript𝐱𝑘1𝑓subscript𝐱𝑘subscript𝐮𝑘\mathbf{x}_{k+1}=f(\mathbf{x}_{k},\mathbf{u}_{k}),bold_x start_POSTSUBSCRIPT italic_k + 1 end_POSTSUBSCRIPT = italic_f ( bold_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) , (16)

where 𝐱=[x,y,θ]T𝐱superscript𝑥𝑦𝜃𝑇\mathbf{x}=[x,y,\theta]^{T}bold_x = [ italic_x , italic_y , italic_θ ] start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT and 𝐮=[v,ω]T𝐮superscript𝑣𝜔𝑇\mathbf{u}=[v,\omega]^{T}bold_u = [ italic_v , italic_ω ] start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT represent position (x,y)𝑥𝑦(x,y)( italic_x , italic_y ), heading angle θ𝜃\thetaitalic_θ, line speed v𝑣vitalic_v and angular velocity ω𝜔\omegaitalic_ω, respectively. VO-based methods such as ORCA are suitable for robots that can move in any direction but are not suitable for robots with non-holonomic kinematics[27]. Therefore, we do not compare with ORCA. We set ϵ=0.2italic-ϵ0.2\epsilon=0.2italic_ϵ = 0.2 in MPC-DC (II-A) and ϵ=0italic-ϵ0\epsilon=0italic_ϵ = 0 in other methods. The results are shown in Table II.

TABLE II: Quantitative Results With Unicycle System
Controller γ𝛾\gammaitalic_γ S \uparrow C \downarrow T FS ST
MPC-DC(II-A) - 0.186 0.808 10.87 6.116 50.78
MPC-D-CBF(II-B) 0.08 0.848 0.152 14.40 8.862 49.07
0.10 0.764 0.236 13.27 7.756 49.13
0.12 0.724 0.276 12.58 6.976 49.27
SCMPC-CBF(III-A) 0.08 0.980 0.020 14.44 0 60.59
0.10 0.936 0.064 13.27 0 60.75
0.12 0.900 0.100 12.68 0 60.83
Ours(III-B) 0.08 0.982 0.018 14.56 0.032 61.54
0.10 0.960 0.040 13.34 0.128 62.37
0.12 0.902 0.098 12.71 0.388 61.03

The results demonstrate that our method achieves a high success rate despite the reduced action space of non-holonomic kinematics systems. Given the underactuated characteristics of non-holonomic mobile robots [28] (3 degrees of freedom (x,y,θ)𝑥𝑦𝜃(x,y,\theta)( italic_x , italic_y , italic_θ ), 2 controls (v,ω)𝑣𝜔(v,\omega)( italic_v , italic_ω )), the obstacle avoidance capability of these robots is inevitably limited. As a result, the superiority of our improved method over other methods is not so apparent as that in simulation experiments for double-integrator system. Similar to the previous simulation experiment for double-integrator system, as the value of γ𝛾\gammaitalic_γ decreases, the system’s safety during navigation improves. However, it also yields a higher probability of solution failure. The increase in average solution time can be attributed to the system’s nonlinearity. The simulation experiment conducted using the unicycle model lays the groundwork for subsequent experiments in real-world scenarios. Fig. 4 illustrates a comparison of the navigation paths of these controllers in the 116th test case. Our code and further examples can be found at http://https://github.com/Zetao-Lu/CrowdNav_MPCCBF

Refer to caption
(a) MPC-DC
Refer to caption
(b) MPC-D-CBF
Refer to caption
(c) SCMPC-CBF
Refer to caption
(d) Ours
Figure 4: Trajectories of different controllers with unicycle system. The circles in the picture are the locations of the agents at the labeled time. In the 116th test case, our method can make the robot successfully reach the destination. More test results are shown in Table II.

IV-C Real-world Experiments

In real-world experiments, we deployed our method on an MR1000 robot. We set the sensing range of the 64-line lidar to be 8 meters and used PointPillars[29] for pedestrian detection within this range. Additionally, we utilized AB3DMOT[30] to track the detection results and estimate their relative positions and velocities. After building a map of real-world environment, we estimated the robot’s state using the AMCL package in ROS and Kalman filter. As shown in Fig. 5, in real-world experiments, the MR1000 robot successfully navigated to the target location without colliding with pedestrians by utilizing our method as a local planning controller. The results demonstrate the successful transferability of our method from simulation to real robots as a local planning module, ensuring safety for the robot.

Refer to caption
(a) start time
Refer to caption
(b) after 4 seconds
Refer to caption
(c) after 8 seconds
Refer to caption
(d) after 12 seconds
Refer to caption
(e) after 16 seconds
Refer to caption
(f) end time
Figure 5: Real-world experiments have confirmed the effectiveness of the proposed collision avoidance method. The robot’s navigation sequence is illustrated by these six sub-images. When pedestrians enter the sensing range of the lidar, the robot detects them. As depicted on the right side of the sub-image, perceived pedestrians are depicted by red boxes.

V CONCLUSIONS

In this paper, we propose a new MPC framework that integrates the CBF to address the challenge of obstacle avoidance in dynamic environments while avoiding the infeasibility problem caused by hard constraints acting on the entire predictive horizon. Additionally, we design a D-GCBF based on the relative degree of constraints to the system, enhancing the robot’s obstacle avoidance capability under soft constraints by employing single-step safety constraints. Experimental results demonstrate that our method achieves a higher navigation success rate, lower collision rate, and lower solution failure probability compared to other baseline methods. Furthermore, we deploy the method as a local planning controller on the MR1000 robot using the ROS platform and validate the effectiveness of our approach in real-world environments.

References

  • [1] A. J. Lee, W. Song, B. Yu, D. Choi, C. Tirtawardhana, and H. Myung, “Survey of robotics technologies for civil infrastructure inspection,” Journal of Infrastructure Intelligence and Resilience, vol. 2, no. 1, p. 100018, 2023.
  • [2] J. Zeng, B. Zhang, and K. Sreenath, “Safety-critical model predictive control with discrete-time control barrier function,” in 2021 American Control Conference (ACC).   IEEE, 2021, pp. 3882–3889.
  • [3] Z. Jian, Z. Yan, X. Lei, Z. Lu, B. Lan, X. Wang, and B. Liang, “Dynamic control barrier function-based model predictive control to safety-critical obstacle-avoidance of mobile robot,” in 2023 IEEE International Conference on Robotics and Automation (ICRA).   IEEE, 2023, pp. 3679–3685.
  • [4] P. Fiorini and Z. Shiller, “Motion planning in dynamic environments using velocity obstacles,” The International Journal of Robotics Research, vol. 17, no. 7, pp. 760–772, 1998.
  • [5] J. Van Den Berg, S. J. Guy, M. Lin, and D. Manocha, “Reciprocal n-body collision avoidance,” in Robotics Research: The 14th International Symposium ISRR.   Springer, 2011, pp. 3–19.
  • [6] D. J. Gonon, D. Paez-Granados, and A. Billard, “Reactive navigation in crowds for non-holonomic robots with convex bounding shape,” IEEE Robotics and Automation Letters, vol. 6, no. 3, pp. 4728–4735, 2021.
  • [7] Y. Chen, F. Zhao, and Y. Lou, “Interactive model predictive control for robot navigation in dense crowds,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 52, no. 4, pp. 2289–2301, 2021.
  • [8] C. Chen, Y. Liu, S. Kreiss, and A. Alahi, “Crowd-robot interaction: Crowd-aware robot navigation with attention-based deep reinforcement learning,” in 2019 International Conference on Robotics and Automation (ICRA), 2019, pp. 6015–6022.
  • [9] H. Yang, C. Yao, C. Liu, and Q. Chen, “RMRL: Robot navigation in crowd environments with risk map-based deep reinforcement learning,” IEEE Robotics and Automation Letters, 2023.
  • [10] M. Boldrer, A. Antonucci, P. Bevilacqua, L. Palopoli, and D. Fontanelli, “Multi-agent navigation in human-shared environments: A safe and socially-aware approach,” Robotics and Autonomous Systems, vol. 149, p. 103979, 2022.
  • [11] B. Lindqvist, S. S. Mansouri, A.-a. Agha-mohammadi, and G. Nikolakopoulos, “Nonlinear MPC for collision avoidance and control of UAVs with dynamic obstacles,” IEEE Robotics and Automation Letters, vol. 5, no. 4, pp. 6001–6008, 2020.
  • [12] B. Brito, B. Floor, L. Ferranti, and J. Alonso-Mora, “Model predictive contouring control for collision avoidance in unstructured dynamic environments,” IEEE Robotics and Automation Letters, vol. 4, no. 4, pp. 4459–4466, 2019.
  • [13] A. Thirugnanam, J. Zeng, and K. Sreenath, “Safety-critical control and planning for obstacle avoidance between polytopes with control barrier functions,” in 2022 International Conference on Robotics and Automation (ICRA).   IEEE, 2022, pp. 286–292.
  • [14] J. Zeng, Z. Li, and K. Sreenath, “Enhancing feasibility and safety of nonlinear model predictive control with discrete-time control barrier functions,” in 2021 60th IEEE Conference on Decision and Control (CDC).   IEEE, 2021, pp. 6137–6144.
  • [15] H. Ma, X. Zhang, S. E. Li, Z. Lin, Y. Lyu, and S. Zheng, “Feasibility enhancement of constrained receding horizon control using generalized control barrier function,” in 2021 4th IEEE International Conference on Industrial Cyber-Physical Systems (ICPS).   IEEE, 2021, pp. 551–557.
  • [16] X. Zhang, A. Liniger, and F. Borrelli, “Optimization-based collision avoidance,” IEEE Transactions on Control Systems Technology, vol. 29, no. 3, pp. 972–983, 2020.
  • [17] A. D. Ames, S. Coogan, M. Egerstedt, G. Notomista, K. Sreenath, and P. Tabuada, “Control barrier functions: Theory and applications,” in 2019 18th European Control Conference (ECC), 2019, pp. 3420–3431.
  • [18] G. Di Pillo and L. Grippo, “Exact penalty functions in constrained optimization,” SIAM Journal on Control and Optimization, vol. 27, no. 6, pp. 1333–1360, 1989.
  • [19] E. C. Kerrigan and J. M. Maciejowski, “Soft constraints and exact penalty functions in model predictive control,” Proc. UKACC International Conference (Control 2000), 2000.
  • [20] R. Fletcher, Practical methods of optimization.   John Wiley & Sons, 2000.
  • [21] T. Anevlavis, Z. Liu, N. Ozay, and P. Tabuada, “Controlled invariant sets: implicit closed-form representations and applications,” arXiv preprint arXiv:2107.08566, 2021.
  • [22] K. Feng, Z. Lu, J. Xu, H. Chen, and Y. Lou, “A safety filter for realizing safe robot navigation in crowds,” in 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023, pp. 9729–9736.
  • [23] B. Tearle, K. P. Wabersich, A. Carron, and M. N. Zeilinger, “A predictive safety filter for learning-based racing control,” IEEE Robotics and Automation Letters, vol. 6, no. 4, pp. 7635–7642, 2021.
  • [24] M. Sun and D. Wang, “Initial shift issues on discrete-time iterative learning control with system relative degree,” IEEE Transactions on Automatic Control, vol. 48, no. 1, pp. 144–148, 2003.
  • [25] J. A. Andersson, J. Gillis, G. Horn, J. B. Rawlings, and M. Diehl, “CasADi: a software framework for nonlinear optimization and optimal control,” Mathematical Programming Computation, vol. 11, pp. 1–36, 2019.
  • [26] L. T. Biegler and V. M. Zavala, “Large-scale nonlinear programming using IPOPT: An integrating framework for enterprise-wide dynamic optimization,” Computers & Chemical Engineering, vol. 33, no. 3, pp. 575–582, 2009.
  • [27] J. Alonso-Mora, A. Breitenmoser, M. Rufli, P. Beardsley, and R. Siegwart, “Optimal reciprocal collision avoidance for multiple non-holonomic robots,” in Distributed autonomous robotic systems: The 10th international symposium.   Springer, 2013, pp. 203–216.
  • [28] C. M. Pappalardo and D. Guida, “Forward and inverse dynamics of a unicycle-like mobile robot,” Machines, vol. 7, no. 1, p. 5, 2019.
  • [29] A. H. Lang, S. Vora, H. Caesar, L. Zhou, J. Yang, and O. Beijbom, “PointPillars: Fast encoders for object detection from point clouds,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
  • [30] X. Weng, J. Wang, D. Held, and K. Kitani, “3d multi-object tracking: A baseline and new evaluation metrics,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).   IEEE, 2020, pp. 10 359–10 366.