Robot Safe Planning In Dynamic Environments Based On Model Predictive Control Using Control Barrier Function

Zetao Lu, Kaijun Feng, Jun Xu, Haoyao Chen, and Yunjiang Lou

This work was supported in part by the National Natural Science Foundation of China under Grant 62173113, in part by the Science and Technology Innovation Committee of Shenzhen Municipality under Grant GXWD20231129101652001, and in part by the Natural Science Foundation of Guangdong Province of China under Grant 2022A1515011584. (Corresponding author: Jun Xu.) The authors are with the School of Mechanical Engineering and Automation, Harbin Institute of Technology Shenzhen, Shenzhen, Guangdong, China, 518055 (email: {22S153114, 21S153137}@stu.hit.edu.cn; {xujunqgy, hychen5, louyj}@hit.edu.cn).

Abstract

Implementing obstacle avoidance in dynamic environments is a challenging problem for robots. Model predictive control (MPC) is a popular strategy for dealing with this type of problem, and recent work mainly uses control barrier function (CBF) as hard constraints to ensure that the system state remains in the safe set. However, in crowded scenarios, effective solutions may not be obtained due to infeasibility problems, resulting in degraded controller performance. We propose a new MPC framework that integrates CBF to tackle the issue of obstacle avoidance in dynamic environments, in which the infeasibility problem induced by hard constraints operating over the whole prediction horizon is solved by softening the constraints and introducing exact penalty, prompting the robot to actively seek out new paths. At the same time, generalized CBF is extended as a single-step safety constraint of the controller to enhance the safety of the robot during navigation. The efficacy of the proposed method is first shown through simulation experiments, in which a double-integrator system and a unicycle system are employed, and the proposed method outperforms other controllers in terms of safety, feasibility, and navigation efficiency. Furthermore, real-world experiment on an MR1000 robot is implemented to demonstrate the effectiveness of the proposed method.

Index Terms:

Collision avoidance, model predictive control, control barrier function, autonomous vehicle navigation.

I Introduction

In recent years, with the continuous development of robot technology, the application scope of robots is no longer limited to industrial manufacturing scenarios, but has also been expanding to other industries, such as autonomous driving, inspection robots, disinfection robots and delivery robots [1]. In order to achieve safe navigation of robots in dynamic and shared environments, it is of great significance to design a safety-critical controller to enable the autonomous system to achieve optimal performance while ensuring safety. Some recent work combines the control barrier function (CBF) with model predictive control (MPC) [2] to implement such a safety-critical controller and applies it to dynamic environments by extending CBF to dynamic CBF (D-CBF) [3]. However, due to the existence of state constraints, applying CBF as hard constraints to the entire prediction horizon may lead to failure in solving the optimization problem, which is particularly obvious in complex dynamic environments.

In order to solve the aforementioned problems and achieve better control effects, we transformed the CBF hard constraints into soft constraints and incorporated them into the penalty function of the optimization problem. Besides, a single-step CBF is imposed to enhance safety. Through our approach, the robot is able to significantly reduce the probability of solution failure in dynamic environments and reach its destination with higher efficiency and safety.

I-A Related Work

Existing robot navigation work in dynamic environments can be divided into three categories: 1) reactive based; 2) learning based; 3) optimization based. In reactive-based approaches, which include velocity obstacle (VO) [4] and its variants [5, 6], the robot takes a one-step optimal action based on information about dynamic obstacles in the current environment. However, these methods usually do not take into account the robot’s kinematic constraints. Relying solely on current state information can result in short-sighted, oscillatory, and unnatural behavior, which makes it harder for pedestrians to understand the robot’s movement intentions [7]. In learning-based approaches, robots endeavor to emulate appropriate navigation strategies. By imitating the interactive movements of pedestrians, they aim to navigate through dense crowds in a manner that is more socially acceptable. Deep reinforcement learning is often used to train computationally efficient navigation strategies [8, 9], which implicitly encode interactions and collaborations between pedestrians to generate paths with behavioral patterns more consistent with those of humans [10]. However, learning-based methods depend on offline training and are therefore restricted by the characteristics of the training environment, so they may encounter generalization issues when transitioning from simulation to the real world, i.e., the performance in scenarios not covered by the training data cannot be guaranteed. Optimization-based methods usually consist of two consecutive steps of prediction and planning at each time step, first using a motion model to predict dynamic obstacles in the environment [11], and then formulating robot navigation as an optimal control problem [7]. Such methods are usually based on MPC, as it is able to integrate the kinematic constraints and static/dynamic collision constraints of the robot while combining planning with control to find an ideal trajectory [12].

However, a significant concern with optimization-based methods in dynamic environments is the safety of the generated trajectories. CBF has recently been introduced as an effective method, combined with MPC [2], to design safety-critical controllers that can guarantee effective safety margins under a short prediction horizon. In dynamic environments, [3] implemented obstacle avoidance with a safety-critical controller built based on lidar and dynamic CBF. In a static maze scenario, [13] successfully navigated different robot shapes using relaxation technology [14]. However, applying CBF as hard constraints to the entire prediction horizon may lead to failure in solving the optimization problem [2]. [15] proposed generalized CBF (GCBF) to use CBF constraint as a one-step constraint to improve feasibility, but there is still a trade-off between feasibility and safety. In [14], the trade-off is handled by incorporating slack variables into the CBF constraints to enhance feasibility, although this approach inherently increases the solution time and diminishes the safety margin.

Inspired by these studies, we contemplate the conversion of CBF hard constraints into soft constraints. The goal is to maintain control effects that are comparable to those of hard constraints while minimizing the likelihood of solution failure. This approach inspires robots to actively seek feasible paths in dynamic environments. Concurrently, it’s essential to introduce an effective safety guarantee, fulfilled by integrating dynamic GCBF (D-GCBF) constraint. The paper’s main focus lies on the application of CBF within the framework of MPC, aiming to enable robots to navigate through crowded and complex dynamic environments efficiently while ensuring safety.

I-B Contribution

The contributions of this paper are as follows.

•

We propose an MPC framework based on CBF soft constraints for generating safe collision-free trajectories in dynamic environments;
•

We incorporate D-GCBF within this framework as a single-step hard constraint to enhance safety;
•

Simulation experiments and real-world tests were carried out to validate the real-time capability, effectiveness, and stability of the algorithm.

I-C Paper Structure

This paper is structured as follows. In Section II, we provide an overview of the definition of the CBF and its associated optimization problem construction when used as hard constraints. In Section III, we transform CBF hard constraints into soft constraints and derive conditions for exact penalty. Besides, single-step D-GCBF hard constraint is imposed as safety guarantee. In order to verify the effectiveness of the controller design and algorithm, examples of obstacle avoidance of our algorithm in the simulation environment and the real world are demonstrated in Section IV. Section V concludes the paper.

II PRELIMINARIES

In this section, we present the preliminaries related to the CBF and propose the basic form of the optimization problem based on MPC and CBF. This lays the foundation for subsequent controller design.

II-A Problem Formulation

Consider the robot’s motion model as a discrete-time control system

\mathbf{x}_{k+1}=f(\mathbf{x}_{k},\mathbf{u}_{k}),

(1)

where $\mathbf{x}_{k}\in\mathcal{X}\subset\mathbb{R}^{n}$ is the state of the system, $\mathbf{u}_{k}\in\mathcal{U}\subset\mathbb{R}^{m}$ is the control input. In a dynamic environment, assuming that the motion equation of a moving obstacle $\mathbf{o}^{i}$ is

\mathbf{o}^{i}_{k+1}=\xi(\mathbf{o}^{i}_{k}),

(2)

where $\mathbf{o}^{i}_{k}\in\mathbb{R}^{n_{o}}$ represents the state of the moving obstacle at time $k$ , the superscript $i\in\{1,2,\dots,N_{o}\}$ represents the $i$ -th moving obstacle, $\xi(\cdot)$ is the state transition function. In robot obstacle avoidance scenarios, the obstacle avoidance problem is usually described using an optimal control problem based on distance constraints[16]. Assuming that the robot and the moving obstacles are approximated by circles with center points $(x_{t},y_{t})$ and $(x^{i}_{t},y^{i}_{t})$ , radii $r_{r}$ and $r_{\mathbf{o}^{i}}$ respectively on the two-dimensional plane, then the safe distance between the robot and the moving obstacle $\mathbf{o}^{i}$ is defined as $r_{i}=r_{r}+r_{\mathbf{o}^{i}}+\epsilon$ , where $\epsilon$ is the additional safety margin. So at time step $t$ , the MPC problem based on distance constraints (MPC-DC) is as follows

\begin{aligned}
\min_{\mathbf{u}_{t+k|t}}\quad & p(\mathbf{x}_{t+N|t})+\sum^{N-1}_{k=0}q(\mathbf{x}_{t+k|t},\mathbf{u}_{t+k|t}) && \text{(3a)}\\
\mathrm{s.t.}\quad & \mathrm{for}\ \mathrm{all}\ k=0,\dots,N-1: &&\\
& \mathbf{x}_{t+k+1|t}=f(\mathbf{x}_{t+k|t},\mathbf{u}_{t+k|t}), && \text{(3b)}\\
& \mathbf{x}_{t+k|t}\in\mathcal{X}, && \text{(3c)}\\
& \mathbf{u}_{t+k|t}\in\mathcal{U}, && \text{(3d)}\\
& \mathbf{x}_{t|t}=\mathbf{x}_{t}, && \text{(3e)}\\
& g_{i}(\mathbf{x}_{t+k|t},\mathbf{o}^{i}_{t+k|t})\geq 0, && \text{(3f)}
\end{aligned}

where $N$ is the prediction horizon. The vectors $\mathbf{x}_{t+k|t}$ and $\mathbf{u}_{t+k|t}$ represent the predicted state and designed input at time step $t+k$, respectively. The first term of the cost function (3a) is the terminal cost, and the second is the stage cost. (3c) and (3d) represent the state constraints and input constraints along the prediction horizon, respectively. The safety limit distance constraints are represented by (3f), where $g_{i}(\mathbf{x}_{t+k|t},\mathbf{o}^{i}_{t+k|t})=\sqrt{(x_{t+k|t}-x^{i}_{t+k|t})^{2}+(y_{t+k|t}-y^{i}_{t+k|t})^{2}}-r_{i}$.

The optimal solution of this problem at time step $t$ is an input sequence i.e. $\{\mathbf{u}^{*}_{t|t},\dots,\mathbf{u}^{*}_{t+N-1|t}\}$ . Only the first element $\mathbf{u}^{*}_{t|t}$ of the optimal solution will become the control input of the system (1). Then the above optimization problem is solved repeatedly at the new state $\mathbf{x}_{t+1}$ .
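The distance constraint (3f) and the receding-horizon procedure described above can be illustrated with a minimal Python sketch. It is not the implementation used in our experiments: `solve_ocp` and `step` are hypothetical callables standing in for a solver of (3) and for the dynamics (1), and all function names are introduced here for illustration only.

```python
import math


def safe_distance(r_robot, r_obs, eps=0.0):
    """Safe distance r_i = r_r + r_o + eps between robot and obstacle i."""
    return r_robot + r_obs + eps


def g(robot_xy, obs_xy, r_i):
    """Distance constraint (3f): non-negative when the centers are at least r_i apart."""
    return math.hypot(robot_xy[0] - obs_xy[0], robot_xy[1] - obs_xy[1]) - r_i


def receding_horizon(x0, solve_ocp, step, n_steps):
    """Apply only the first input of each optimal sequence, then re-solve.

    solve_ocp(x) is a hypothetical solver returning [u*_{t|t}, ..., u*_{t+N-1|t}];
    step(x, u) is a hypothetical implementation of the dynamics (1).
    """
    x, applied = x0, []
    for _ in range(n_steps):
        u_seq = solve_ocp(x)
        u0 = u_seq[0]          # only the first element is applied
        applied.append(u0)
        x = step(x, u0)        # the problem is solved again from the new state
    return x, applied
```

The remaining elements of each optimal sequence are discarded, which is what allows the controller to react to updated obstacle states at every time step.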

II-B Control Barrier Function

In control theory, CBF is a continuously differentiable function used to ensure forward invariance of the system state. When the system state is on the boundary of the invariant set, CBF can adjust the input of the control system to keep it within the invariant set. The definition of CBF is given below based on the concept of safety set in [17]. Assume that the safe set $\mathcal{C}$ is a super-level set of a continuously differentiable function $h:\mathbb{R}^{n}\rightarrow\mathbb{R}$ :

$\displaystyle\mathcal{C}$	$\displaystyle=\{\mathbf{x}\in\mathbb{R}^{n}:h(\mathbf{x})\geq 0\},$	(4a)
$\displaystyle\partial\mathcal{C}$	$\displaystyle=\{\mathbf{x}\in\mathbb{R}^{n}:h(\mathbf{x})=0\},$	(4b)
$\displaystyle\mathrm{Int}(\mathcal{C})$	$\displaystyle=\{\mathbf{x}\in\mathbb{R}^{n}:h(\mathbf{x})>0\}.$	(4c)

Assume further that $\frac{\partial h(\mathbf{x})}{\partial\mathbf{x}}\neq 0$ holds for all points on the boundary of the safe set. Then the set $\mathcal{C}$ is forward invariant, that is, a safe set, if and only if $\dot{h}(\mathbf{x})=\frac{\partial h(\mathbf{x})}{\partial\mathbf{x}}\dot{\mathbf{x}}\geq 0,\forall\mathbf{x}\in\partial\mathcal{C}$.

Definition 1 (Discrete-time CBF[2]).

Consider the discrete-time system (1). Given a set $\mathcal{C}$ defined by (4) for a function $h:\mathbb{R}^{n}\rightarrow\mathbb{R}$, the function $h$ is a discrete-time CBF if there exists a function $\gamma\in\mathcal{K}_{\infty}$ s.t.

\Delta h(\mathbf{x}_{k},\mathbf{u}_{k})\geq-\gamma(h(\mathbf{x}_{k})),

(5)

where $\Delta h(\mathbf{x}_{k},\mathbf{u}_{k}):=h(\mathbf{x}_{k+1})-h(\mathbf{x}_{k})$ .

When discrete-time CBF is used as constraints in the safety-critical MPC design, the safety of the system can be fully guaranteed while avoiding static obstacles[2]. In order to better apply it to dynamic scenes, [3] proposed D-CBF on this basis. Similarly, we assume that the shape of the moving obstacle does not change and modify (5) to the following form

\Delta h_{i}(\mathbf{x}_{k},\mathbf{u}_{k},\mathbf{o}^{i}_{k})\geq-\gamma(h_{i}(\mathbf{x}_{k},\mathbf{o}^{i}_{k})),

(6)

where $\Delta h_{i}(\mathbf{x}_{k},\mathbf{u}_{k},\mathbf{o}^{i}_{k}):=h_{i}(\mathbf{x}_{k+1},\mathbf{o}^{i}_{k+1})-h_{i}(\mathbf{x}_{k},\mathbf{o}^{i}_{k})$. We follow the result of [2] and select the $\mathcal{K}_{\infty}$ function $\gamma(\cdot)$ as a constant $\gamma\in(0,1]$.
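To make the role of the constant $\gamma$ explicit, the following short derivation rearranges (6) and unrolls it over $k$ steps. It is only a sketch: the shorthand $h_{i}^{k}:=h_{i}(\mathbf{x}_{k},\mathbf{o}^{i}_{k})$ is introduced here for brevity, and the second line uses $1-\gamma\geq 0$.

```latex
\begin{aligned}
h_{i}^{k+1}-h_{i}^{k}\geq-\gamma h_{i}^{k}
\quad&\Longleftrightarrow\quad h_{i}^{k+1}\geq(1-\gamma)h_{i}^{k},\\
&\Longrightarrow\quad h_{i}^{k}\geq(1-\gamma)^{k}h_{i}^{0}\geq 0
\quad\text{whenever } h_{i}^{0}\geq 0 .
\end{aligned}
```

A smaller $\gamma$ therefore forces $h_{i}$ to decay more slowly toward zero, i.e., the robot is allowed to approach the boundary of the safe set only gradually.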

Therefore, assuming that the states of the robot and the moving obstacles at time $t$ are known, the MPC problem based on D-CBF (6) constraints (MPC-D-CBF) is as follows

\begin{aligned}
\min_{\mathbf{u}_{t+k|t}}\quad & p(\mathbf{x}_{t+N|t})+\sum^{N-1}_{k=0}q(\mathbf{x}_{t+k|t},\mathbf{u}_{t+k|t}) && \text{(7a)}\\
\mathrm{s.t.}\quad & \mathrm{for}\ \mathrm{all}\ k=0,\dots,N-1: &&\\
& \mathbf{x}_{t+k+1|t}=f(\mathbf{x}_{t+k|t},\mathbf{u}_{t+k|t}), && \text{(7b)}\\
& \mathbf{x}_{t+k|t}\in\mathcal{X}, && \text{(7c)}\\
& \mathbf{u}_{t+k|t}\in\mathcal{U}, && \text{(7d)}\\
& \mathbf{x}_{t|t}=\mathbf{x}_{t}, && \text{(7e)}\\
& h_{i}(\mathbf{x}_{t+k+1|t},\mathbf{o}^{i}_{t+k+1|t})\geq(1-\gamma)h_{i}(\mathbf{x}_{t+k|t},\mathbf{o}^{i}_{t+k|t}), && \text{(7f)}
\end{aligned}

where the constraint (7f) guarantees the forward invariance of the safe set $\mathcal{C}$ [2], and $\mathcal{C}$ is defined in (4). When $\gamma=1$ , constraint (7f) degenerates into constraint (3f), which will lead to a decrease in safety. However, the smaller the value of $\gamma$ is, the more stringent the constraints will become, which makes the optimization problem difficult to solve or even unsolvable.

Figure 1: Feasibility of optimization problem (7). The reachable set propagates along the prediction horizon, starting from the initial state $\mathbf{x}_{t|t}$. The definition of the safety set $\mathcal{C}$ is derived from equation (4). $\mathcal{S}$ represents the state space set that satisfies constraints (7c) and (7f). The red arc represents the boundary of $\mathcal{S}$, and its interior is depicted by a red arrow. The optimization problem is feasible only if the intersection of the reachable set and $\mathcal{S}$ is not empty.

The CBF hard constraints (7f) in the optimization problem act on the entire prediction horizon. Thus, if there is no feasible region at even a single step of the prediction horizon, the entire optimization problem becomes infeasible. As seen in Fig. 1, when the CBF constraints are satisfied, the system state must be within the safe set $\mathcal{C}$. However, violating the CBF constraints does not imply that the system state is outside the safe set $\mathcal{C}$. Specifically, even if there is no feasible region at step $k$, it does not mean that the current state lacks a suitable input to avoid collision. Although the optimization problem (7) proves efficient and safe in simple scenarios, its performance may degrade in crowded environments, because applying the CBF hard constraints (7f) over the entire prediction horizon can lead to frequent solution failures.

III CONTROLLER DESIGN

When the optimization problem (7) is infeasible, optimal control cannot be obtained, which may reduce the navigation efficiency of the mobile robot in obstacle avoidance scenarios. Hence, in the following, we formulate the relaxed safety control logic to alleviate this problem.

III-A Soft Constrained Predictive Control With CBF

According to [2], the infeasibility problem encountered in MPC arises when the intersection of the reachable set and the set $\mathcal{S}$, which satisfies the CBF constraints at horizon step $k$, is empty. In dynamic scenarios, this problem is especially serious as the set $\mathcal{S}$ is mainly determined by dynamic obstacles and the initial state of the robot, both of which change over time. The mobile robot is then more likely to enter a state that is infeasible, so new control inputs cannot be obtained by solving (7). In this case, methods such as repeating the control input from the previous moment or calculating the control input without constraints will violate the controller requirements and may lead to unpredictable and dangerous behavior. The most direct way to solve this problem is to adjust the prediction horizon $N$, but this will also change the prediction ability of the controller. When a robot’s ability to predict is not adequate, it tends to explore riskier areas more often. This behaviour, in turn, worsens its initial state at future time instants.

In the optimization problem (7), the scalar $\gamma$ in the CBF constraints (7f) is also called a conservative coefficient. This is because we can find a trade-off between safety and feasibility by choosing the value of $\gamma$. However, gaining feasibility by reducing safety is not our original intention. Some state constraint softening methods for MPC have been proposed, such as [18, 19], which can ensure the feasibility of online optimization problems under unexpected disturbances. Inspired by these studies, we modify (7) to the soft-constrained MPC problem based on CBF (SCMPC-CBF)

\begin{aligned}
\min_{\mathbf{u}_{t+k|t},\mathbf{\zeta}_{t+k|t}}\quad & p(\mathbf{x}_{t+N|t})+\sum^{N-1}_{k=0}q(\mathbf{x}_{t+k|t},\mathbf{u}_{t+k|t}) &&\\
& +\alpha\sum^{N-1}_{k=0}\|\mathbf{\zeta}_{t+k|t}\| && \text{(8a)}\\
\mathrm{s.t.}\quad & \mathrm{for}\ \mathrm{all}\ k=0,\dots,N-1: &&\\
& \mathbf{x}_{t+k+1|t}=f(\mathbf{x}_{t+k|t},\mathbf{u}_{t+k|t}), && \text{(8b)}\\
& \mathbf{x}_{t+k|t}\in\mathcal{X}, && \text{(8c)}\\
& \mathbf{u}_{t+k|t}\in\mathcal{U}, && \text{(8d)}\\
& \mathbf{x}_{t|t}=\mathbf{x}_{t}, && \text{(8e)}\\
& \mathbf{x}_{t+k|t}\in\mathcal{X}(\mathbf{\zeta}_{t+k|t}),\ \mathbf{\zeta}_{t+k|t}\geq 0, && \text{(8f)}
\end{aligned}

where $\mathbf{\zeta}\in\mathbb{R}^{N_{o}}$ is the slack variable and can be expressed as $\mathbf{\zeta}_{t+k|t}=[\zeta_{t+k|t}^{1},\ldots,\zeta_{t+k|t}^{N_{o}}]^{T}$ . The soft constraint (8f) can be then described as,

\begin{array}{rl}\mathcal{X}(\mathbf{\zeta}_{t+k|t})&=\{\mathbf{x}\in\mathbb{R}^{n}\,|\,\zeta_{t+k|t}^{i}\geq(1-\gamma)h_{i}(\mathbf{x}_{t+k|t},\mathbf{o}_{t+k|t}^{i})\\ &\quad-h_{i}(\mathbf{x}_{t+k+1|t},\mathbf{o}_{t+k+1|t}^{i}),\ \forall i=1,\ldots,N_{o}\}.\end{array}

(9)

In (8a), $\alpha$ is the constraint violation penalty weight. The slack variables ensure that the optimization problem is feasible for any input sequence in $\mathcal{U}$. Even if $\mathbf{x}_{t+k|t}$ does not satisfy the constraint (7f), these violations are penalized in the cost function, which determines the value of the slack variables. By constructing the problem in this manner, we can provide an optimal solution that is not only close to that of (7) but also ensures feasibility.
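The relation between the slack variables in (9) and the penalty term in (8a) can be sketched numerically. The snippet below is an illustration rather than our solver code: it assumes that the norm in (8a) is the 1-norm (the norm is not fixed above), and the names `slack` and `penalty` as well as the layout of `h_traj` are hypothetical.

```python
def slack(h_k, h_k1, gamma):
    """Smallest zeta >= 0 satisfying (9) for one obstacle at one horizon step."""
    return max(0.0, (1.0 - gamma) * h_k - h_k1)


def penalty(h_traj, gamma, alpha):
    """Penalty term of (8a), assuming the 1-norm on each slack vector.

    h_traj[i][k] holds h_i evaluated at predicted step k, for k = 0, ..., N.
    """
    n_steps = len(h_traj[0]) - 1
    total = 0.0
    for k in range(n_steps):
        # Sum of per-obstacle slacks equals the 1-norm because slacks are >= 0.
        total += sum(slack(h[k], h[k + 1], gamma) for h in h_traj)
    return alpha * total
```

When every step satisfies (7f), all slacks are zero and the penalty vanishes, so the softened cost coincides with the original cost of (7).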

Theorem 1.

Given a state $\mathbf{x}_{t}$, if $\mathbf{u}_{t+k|t}^{*},k=0,\ldots,N-1$ is the optimal solution to (7), then there exists a Lagrange vector $\mathbf{\lambda}^{*}$ such that

\mathcal{L}(\mathbf{u}_{t+k|t}^{*},\mathbf{\lambda}^{*})=0,

in which $\mathcal{L}(\mathbf{u}_{t+k|t},\mathbf{\lambda})$ is the Lagrangian of the optimization problem (7), i.e.,

\begin{array}{rl}\mathcal{L}(\mathbf{u}_{t+k|t},\mathbf{\lambda})&=J_{t}(\mathbf{x}_{t},\mathbf{u}_{t|t},\ldots,\mathbf{u}_{t+N-1|t})\\ &+\mathbf{\lambda}^{T}c_{t}(\mathbf{x}_{t},\mathbf{u}_{t|t},\ldots,\mathbf{u}_{t+N-1|t}).\end{array}

(10)

In (10), the expressions $J_{t}(\mathbf{x}_{t},\mathbf{u}_{t|t},\ldots,\mathbf{u}_{t+N-1|t})$ and $c_{t}(\mathbf{x}_{t},\mathbf{u}_{t|t},\ldots,\mathbf{u}_{t+N-1|t})$ are the cost and constraint corresponding to (7), respectively. Besides, if we choose the penalty $\alpha$ such that

\alpha>\|\mathbf{\lambda}^{*}\|_{D},

where $\|\cdot\|_{D}$ denotes the dual norm with respect to the norm for $\mathbf{\zeta}_{t+k|t}$ in (8a) and $\mathbf{u}_{t+k|t}^{*}$ satisfies (7b)-(7f), then the optimal solutions to (7) and (8) are equivalent.

Proof.

See [20] Thm 14.2.1, 14.3.1. ∎

Remark 1.

It is noted that if we choose the penalty weight $\alpha$ such that

\alpha>\max\limits_{\mathbf{x}_{t}}\|\lambda^{*}\|_{D},

then for all initial states $\mathbf{x}_{t}$, if the optimization problem (7) is feasible, then the optimal solutions of (7) and (8) are equivalent.

In practice, it is not easy to determine $\max_{\mathbf{x}_{t}}\|\lambda^{*}\|_{D}$ , as the terms $J_{t},c_{t}$ in the Lagrangian (10) change with respect to the state $\mathbf{x}_{t}$ . Hence, before designing a soft constrained predictive control problem, we select a number of initial states $\mathbf{x}_{t}$ to estimate different $\alpha$ and then choose the maximum one as the penalty factor.
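The sampling procedure described above can be sketched as follows. This is an assumption-laden illustration: it supposes that the 1-norm is used for the slack variables in (8a), so that the dual norm $\|\cdot\|_{D}$ is the infinity-norm, and that the multiplier vectors $\mathbf{\lambda}^{*}$ have already been obtained from a solver for each sampled initial state. The function names and the `margin` and `floor` parameters are hypothetical.

```python
def linf_norm(lam):
    """Infinity-norm, i.e. the dual norm of the 1-norm (assumed for the slacks)."""
    return max(abs(v) for v in lam)


def estimate_alpha(multiplier_samples, margin=1.1, floor=1e-6):
    """Pick a penalty weight strictly above the largest sampled dual norm.

    multiplier_samples: one multiplier vector lambda* per sampled initial state.
    margin > 1 keeps the inequality strict; floor guards the all-zero case.
    """
    worst = max(linf_norm(lam) for lam in multiplier_samples)
    return max(margin * worst, floor)
```

Because only finitely many initial states are sampled, the resulting $\alpha$ is an estimate of the bound in Remark 1 rather than a guarantee.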

After the softening, if the optimization problem (7) is infeasible, we can use the optimization problem (8) to find a solution. However, this solution may not necessarily be feasible for (7).

III-B Safety Enhancement with D-GCBF

In Section III-A, the infeasibility problem of the obstacle avoidance problem (7) is solved by softening the CBF hard constraints. In this way, the softened problem (8) is always feasible. However, the solutions of (8) and (7) are equivalent only when (7) is feasible. In safety-critical situations, colliding with obstacles is not acceptable, and further efforts should be devoted to enhancing safety in SCMPC-CBF.

A common method for ensuring safety is to impose a control invariant set [21]. In our previous work [22], a safety filter [23], which is essentially a control invariant set, was constructed to improve the safety of a controller generated by reinforcement learning. Here, we adopt a similar safety-filter idea; however, because control invariant sets are difficult to compute for high-dimensional nonlinear systems, the dynamic generalized CBF (D-GCBF) is employed instead. In order to design the D-GCBF, the relative degree of the state constraint with respect to the system (8b) is first considered.

Definition 2 (Relative-degree [24]).

The state constraint $h(\mathbf{x}_{t})$ of system (1) has relative-degree $d$ with respect to control input $\mathbf{u}_{t}$ if

\frac{\partial h(\mathbf{x}_{t+j})}{\partial\mathbf{u}_{t}}=0,\ \frac{\partial h(\mathbf{x}_{t+d})}{\partial\mathbf{u}_{t}}\neq 0,

(11)

for $\forall j\in\{0,1,\dots,d-1\}$ , $\forall\mathbf{x}\in\mathbb{R}^{n}$ .

That is, the relative degree $d$ is the number of delay steps after which the control input $\mathbf{u}_{t}$ first appears in the constraint function $h$. Therefore, it is valid to impose safety constraints at step $d$ but not at any earlier step $j<d$.
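As an illustration of Definition 2, the relative degree can be probed numerically by perturbing only $\mathbf{u}_{t}$ and checking when $h$ first changes. The sketch below assumes a forward-Euler double integrator with time step 0.2 s and a distance-type $h$ with a hypothetical obstacle at $(2,0)$; the actual discretization is not specified here, and an exact discretization (with a $\frac{1}{2}\Delta t^{2}$ term in the position update) would give a different answer. The probe uses a finite perturbation rather than a derivative.

```python
def step(x, u, dt=0.2):
    """Forward-Euler double integrator (assumed): x = [px, py, vx, vy], u = [ax, ay]."""
    px, py, vx, vy = x
    return [px + dt * vx, py + dt * vy, vx + dt * u[0], vy + dt * u[1]]


def h(x, obs=(2.0, 0.0), r=0.6):
    """Distance-type constraint function; obstacle position and radius are illustrative."""
    return ((x[0] - obs[0]) ** 2 + (x[1] - obs[1]) ** 2) ** 0.5 - r


def relative_degree(x0, max_d=5, du=1.0):
    """Smallest d such that h(x_{t+d}) changes when only u_t is perturbed."""
    xa = step(x0, [0.0, 0.0])   # nominal first input
    xb = step(x0, [du, 0.0])    # perturbed first input
    for d in range(1, max_d + 1):
        if abs(h(xa) - h(xb)) > 1e-12:
            return d
        # All later inputs are identical, so any difference is due to u_t alone.
        xa = step(xa, [0.0, 0.0])
        xb = step(xb, [0.0, 0.0])
    return None
```

Under these assumptions the acceleration first changes the velocity and only one step later the position, so the probe reports a relative degree of two.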

By incorporating a one-step state constraint at time step $d$, the optimization problem can benefit from a wider feasible region and improved computational efficiency, as opposed to including state constraints across $d$ steps [15]. We propose D-GCBF for dynamic scenarios based on the results of [15] and the definition of relative degree.

Definition 3 (D-GCBF).

Consider the discrete-time system (1). Given a set $\mathcal{C}$ defined by (4) for a function $h:\mathbb{R}^{n}\times\mathbb{R}^{n_{o}}\rightarrow\mathbb{R}$, the function $h$ is a dynamic generalized CBF if

h_{i}(\mathbf{x}_{t+d},\mathbf{o}^{i}_{t+d})\geq(1-\eta)^{d}h_{i}(\mathbf{x}_{t},\mathbf{o}^{i}_{t}),

(12)

where the constant $\eta\in(\gamma,1]$ .
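Condition (12) can be checked with a few lines of Python. This is a minimal sketch with hypothetical function names; it also evaluates the bound obtained by composing the per-step decay rate $(1-\gamma)$ over $d$ steps, to show numerically that a larger rate $\eta$ gives a looser requirement.

```python
def decay_bound(h_now, rate, d):
    """Right-hand side of (12) for a given rate: (1 - rate)^d * h_now."""
    return (1.0 - rate) ** d * h_now


def dgcbf_satisfied(h_now, h_at_d, eta, d):
    """True if the single-step D-GCBF condition (12) holds for one obstacle."""
    return h_at_d >= decay_bound(h_now, eta, d)
```

For example, with $h=1$ now and $d=2$, the rate $0.2$ requires $h\geq 0.64$ at step $d$, whereas the rate $0.1$ would require $h\geq 0.81$.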

The reason $\eta$ is lower bounded by $\gamma$ is that we don’t want the hard constraint (12) to be stricter than the soft constraint (8f). Upon integrating the D-GCBF constraint, a comparison between our obstacle avoidance constraints and those previously used is depicted in Fig. 2. As can be seen from Fig. 2, when utilizing CBF as hard constraints, it can potentially lead to infeasibility issues. By converting these into soft constraints, the optimization problem will always remain solvable. Simultaneously, the addition of a one-step D-GCBF hard constraint could possibly result in solution failure as well, but it is comparatively less stringent. The complete optimization problem, i.e., SCMPC-CBF with D-GCBF, is constructed as follows

\begin{aligned}
\min_{\mathbf{u}_{t+k|t},\mathbf{\zeta}_{t+k|t}}\quad & p(\mathbf{x}_{t+N|t})+\sum^{N-1}_{k=0}q(\mathbf{x}_{t+k|t},\mathbf{u}_{t+k|t}) &&\\
& +\alpha\sum^{N-1}_{k=0}\|\mathbf{\zeta}_{t+k|t}\| && \text{(13a)}\\
\mathrm{s.t.}\quad & \mathrm{for}\ \mathrm{all}\ k=0,\dots,N-1: &&\\
& \mathbf{x}_{t+k+1|t}=f(\mathbf{x}_{t+k|t},\mathbf{u}_{t+k|t}), && \text{(13b)}\\
& \mathbf{x}_{t+k|t}\in\mathcal{X}, && \text{(13c)}\\
& \mathbf{u}_{t+k|t}\in\mathcal{U}, && \text{(13d)}\\
& \mathbf{x}_{t|t}=\mathbf{x}_{t}, && \text{(13e)}\\
& \mathbf{x}_{t+k|t}\in\mathcal{X}(\mathbf{\zeta}_{t+k|t}),\ \mathbf{\zeta}_{t+k|t}\geq 0, && \text{(13f)}\\
& h_{i}(\mathbf{x}_{t+d|t},\mathbf{o}^{i}_{t+d|t})\geq(1-\eta)^{d}h_{i}(\mathbf{x}_{t|t},\mathbf{o}^{i}_{t|t}). && \text{(13g)}
\end{aligned}

The addition of the constraint (13g) can guarantee safety for static obstacles, which is stated in the following theorem.

Theorem 2.

For a state constraint $h(\mathbf{x})\geq 0$ of relative degree $d$ and the corresponding safe set (4a), assume the system satisfies $h(\mathbf{x}_{t+j})\geq 0$ for all $j\in\{1,\ldots,d-1\}$. Then, if feasible solutions $\mathbf{u}_{t+k|t}$ and $\zeta_{t+k|t}$, $k=0,\ldots,N-1$, to the problem (13) can be found at time $t$, the state is guaranteed to be within the safe set (4a) in the next time steps.

Proof.

According to [15], if the system satisfies $h(\mathbf{x}_{t+j})\geq 0$ for all $j\in\{1,\ldots,d-1\}$, then the set (4a) defines a forward invariant safe set. Besides, as there is some control policy $\mathbf{u}_{t+k|t}$ such that (13g) holds, the state is guaranteed to be within the safe set (4a), i.e., at time $t+1$, we have

h(\mathbf{x}_{t+1})\geq 0,

i.e., $\mathbf{x}_{t+1}\in\mathcal{C}$. ∎

Therefore, for a fixed CBF $h(\mathbf{x})$, if the problem (13) is feasible, the system is always safe, and the system state $\mathbf{x}$ is always in the safe set $\mathcal{C}$.

Remark 2.

It is noted that D-GCBF simplifies multistep constraints into a single step, reducing computational complexity and enhancing feasibility. However, a one-step constraint may only partially ensure system safety [14], as the forward invariance of the set $\mathcal{C}$ (4a) is guaranteed provided that $h(\mathbf{x}_{t+j})\geq 0$ holds for $j\in\{1,\ldots,d-1\}$, and these conditions are not included in the constraints of (13). Hence, solving (13) alone in general cannot guarantee the safety of the system. Therefore, we use D-GCBF as a single-step safeguard to reinforce the system’s safety after obtaining a potential safe trajectory through the optimization problem (8).

Remark 3.

In practice, the moving obstacles $\mathbf{o}^{i},i=1,\ldots,N_{o}$ cause the CBF $h_{i}(\mathbf{x},\mathbf{o}^{i})$ to change over time; thus, the safe set also evolves with time. If we can show that

\mathcal{C}_{t}\subset\mathcal{C}_{t+1},\forall t

(14)

where $\mathcal{C}_{t}$ is the safe set at time $t$ , then according to Theorem 2 we have

\mathbf{x}_{t+1}\in\mathcal{C}_{t}\subset\mathcal{C}_{t+1},

i.e., the system is always safe at future times.

Although (14) is not always satisfied, and safety at future times therefore cannot be guaranteed, we find in experiments that D-GCBF does enhance safety.

It is noted that the hard constraint (13g) can also bring the problem of infeasibility. However, we can see that the constraint (13g) is far weaker than (7f), as (7f) is imposed on the entire prediction horizon, and besides, $\eta\geq\gamma$, hence the chance of infeasibility of (13) is far less than that of (7). To avoid possible damage, when (13) is infeasible, we ensure that the robot comes to a complete stop by activating the brakes instead of merely setting the control input to zero. The complete algorithm is presented as Algorithm 1.

Algorithm 1 SCMPC-CBF with D-GCBF

Input: Initial state $\mathbf{x}(t)$, state constraints $\mathcal{X}$, input constraints $\mathcal{U}$, system dynamics (1), obstacle states $\mathbf{o}^{i}(t)$, obstacle dynamics (2), goal state $\mathbf{x}_{goal}$

Output: Optimal control $\mathbf{u}(t)$

1: $\mathbf{x}_{t}=\mathbf{x}(t)$

2: $\mathbf{o}^{i}_{t}=\mathbf{o}^{i}(t)$

3: Solve (13).

4: if (13) is solved successfully then

5: return $\mathbf{u}(t)=\mathbf{u}^{*}_{t|t}\in\mathbf{u}^{*}_{t+k|t}$

6: else

7: Activate the braking mechanism on the robot.

8: end if
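The control flow of Algorithm 1 can be sketched in Python as follows. This is an outline only: `solve_13` is a hypothetical callable that returns the optimal input sequence of (13), or `None` when the solver fails, and `brake` is a hypothetical callback standing in for the robot's braking mechanism.

```python
def control_step(x_t, obstacles_t, solve_13, brake):
    """One iteration of Algorithm 1.

    solve_13(x_t, obstacles_t): hypothetical solver for (13); returns the list
        [u*_{t|t}, ..., u*_{t+N-1|t}] or None if no solution is found.
    brake(): hypothetical callback that engages the brakes.
    """
    u_seq = solve_13(x_t, obstacles_t)
    if u_seq is not None:
        return u_seq[0]   # apply only the first optimal input
    brake()               # infeasible: stop the robot instead of sending zero input
    return None
```

Returning `None` after braking lets the caller distinguish a braking event from a valid zero input and count it as a solution failure.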

IV EXPERIMENTS

In this section, experiments in simulation environments and real scenarios are conducted to verify the effectiveness of our work.

IV-A Simulation Setup

All controllers are implemented in Python with CasADi [25] as the modeling language and solved with IPOPT [26]. The simulation experiments were conducted on a computer running Ubuntu 20.04 with an Intel Core i5-12490F processor and 16 GB of RAM.

Following [8], each agent must stay within a 10m×10m two-dimensional space. The simulated pedestrians are controlled by ORCA [5], and their initial positions are randomly sampled on a circle with a radius of 4m, and their target positions are on the other side of the same circle. The robot has the same radius of 0.3m and the same preferred speed of 1m/s as the pedestrian. And the simulation time step is 0.2s.

In order to fully evaluate the safety and effectiveness of the proposed algorithm, we set the robot to be invisible to other humans. That is, the simulated human only reacts to humans and turns a blind eye to the robot. All controller algorithms are evaluated using 500 random test cases with five pedestrians.

IV-B Quantitative Evaluation

All controller performances are evaluated under the same prediction horizon $N$ and the same form of stage and terminal cost. The motion model (2) of all obstacles is approximated by a linear model. We use the following indicators to compare the five controllers: S (the rate of the robot reaching its destination without collision), C (the rate of the robot colliding with moving obstacles), T (the robot’s average navigation time in seconds), FS (the average number of solution failures), ST (the robot’s average solution time in milliseconds).
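For concreteness, the five indicators can be aggregated from per-episode records as in the sketch below. The record fields (`outcome`, `nav_time_s`, `solver_failures`, `mean_solve_ms`) are hypothetical, and two averaging conventions are assumptions made here rather than statements about our evaluation code: T is averaged over successful episodes only, and ST is the mean of per-episode average solve times.

```python
def summarize(episodes):
    """Aggregate per-episode records into the indicators S, C, T, FS and ST."""
    n = len(episodes)
    wins = [e for e in episodes if e["outcome"] == "success"]
    return {
        "S": len(wins) / n,
        "C": sum(e["outcome"] == "collision" for e in episodes) / n,
        # Assumption: navigation time is averaged over successful episodes only.
        "T": sum(e["nav_time_s"] for e in wins) / len(wins) if wins else float("nan"),
        "FS": sum(e["solver_failures"] for e in episodes) / n,
        # Assumption: each record already stores its mean solve time in ms.
        "ST": sum(e["mean_solve_ms"] for e in episodes) / n,
    }
```

Episodes that end in neither success nor collision (for example, a timeout) count toward neither S nor C, which is why the two rates need not sum to one.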

IV-B1 Double-integrator system

Consider the robot’s motion model (1) as a discrete-time linear double-integrator system,

\mathbf{x}_{k+1}=A\mathbf{x}_{k}+B\mathbf{u}_{k},

(15)

where $\mathbf{x}=[x,y,v_{x},v_{y}]^{T}$ and $\mathbf{u}=[a_{x},a_{y}]^{T}$ represent position $(x,y)$, velocity $(v_{x},v_{y})$, and acceleration $(a_{x},a_{y})$, respectively. We set $\epsilon=0.2$ in MPC-DC (II-A) and $\epsilon=0$ in the other methods. The results are shown in Table I.
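The matrices $A$ and $B$ in (15) are not written out in this section. A minimal sketch, assuming the exact zero-order-hold discretization of a planar double integrator with the 0.2 s simulation step, is given below; the helper names are illustrative.

```python
DT = 0.2  # s, simulation time step from the setup


def double_integrator_matrices(dt):
    """Return (A, B) for state [x, y, vx, vy] and input [ax, ay].

    Assumption: zero-order hold on the acceleration over one step.
    """
    A = [
        [1.0, 0.0, dt, 0.0],
        [0.0, 1.0, 0.0, dt],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]
    B = [
        [0.5 * dt ** 2, 0.0],
        [0.0, 0.5 * dt ** 2],
        [dt, 0.0],
        [0.0, dt],
    ]
    return A, B


def matvec(M, v):
    """Multiply a nested-list matrix by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def step(state, control, dt=DT):
    """One step of x_{k+1} = A x_k + B u_k."""
    A, B = double_integrator_matrices(dt)
    return [a + b for a, b in zip(matvec(A, state), matvec(B, control))]
```

Because the model is linear, the same `A` and `B` apply at every step of the prediction horizon.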

TABLE I: Quantitative Results With Double-integrator System

Controller	$\gamma$	S $\uparrow$	C $\downarrow$	T (s)	FS	ST (ms)
ORCA [5]	-	0.470	0.526	11.04	-	-
MPC-DC(II-A)	-	0.362	0.636	12.97	6.486	45.11
MPC-D-CBF(II-B)	0.08	0.736	0.262	16.73	21.734	42.47
	0.10	0.684	0.316	14.91	16.602	42.78
	0.12	0.624	0.376	13.80	13.544	44.12
SCMPC-CBF(III-A)	0.08	0.952	0.048	13.45	0	53.01
	0.10	0.776	0.224	12.19	0	53.48
	0.12	0.640	0.360	11.61	0	53.33
Ours(III-B)	0.08	0.996	0.004	14.35	0.374	55.08
	0.10	0.966	0.034	13.61	0.794	55.76
	0.12	0.954	0.046	13.34	1.242	55.51

The results show that, in a dynamic environment, the short-term planner ORCA performs poorly in the invisible setting because the reciprocity assumption is violated. MPC-DC still performs poorly even with an additional safety margin $\epsilon$, because it cannot avoid dynamic obstacles in advance and easily enters high-risk areas. The safety of MPC-D-CBF improves as $\gamma$ decreases, but navigation efficiency drops correspondingly because the hard constraints become tighter. Robots driven by soft constraints explore new paths in crowded areas more actively, and the single-step safety constraint further enhances navigation safety. Although introducing slack variables may slightly increase the complexity of the optimization problem, it avoids the extra time consumed by solution failures. Fig. 3 compares the navigation paths of these controllers in the 330th test case.
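To see why a smaller $\gamma$ tightens the constraint, assume the barrier condition takes the standard discrete-time form used in [2], with $h_{k}$ as shorthand for the barrier value at prediction step $k$. The numbers below are a worked illustration under that assumption, not results from the experiments.

```latex
h_{k+1} \geq (1-\gamma)\,h_{k}, \quad 0<\gamma\leq 1
\;\;\Longrightarrow\;\;
h_{k} \geq (1-\gamma)^{k}\,h_{0},
\qquad
(1-0.08)^{5}\approx 0.66, \quad (1-0.12)^{5}\approx 0.53 .
```

With the 0.2 s time step, five steps correspond to one second: for $\gamma=0.08$ the barrier value may decay to no less than about 66% of its initial value, whereas $\gamma=0.12$ permits a decay to about 53%, so the smaller $\gamma$ forces the robot to start avoiding earlier.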

IV-B2 Unicycle system

Consider the robot’s motion model (1) as a discrete-time nonlinear unicycle system,

\mathbf{x}_{k+1}=f(\mathbf{x}_{k},\mathbf{u}_{k}),

(16)

where $\mathbf{x}=[x,y,\theta]^{T}$ and $\mathbf{u}=[v,\omega]^{T}$ represent position $(x,y)$, heading angle $\theta$, linear velocity $v$, and angular velocity $\omega$, respectively. VO-based methods such as ORCA suit robots that can move in any direction but not robots with non-holonomic kinematics [27]. Therefore, we do not compare with ORCA. We set $\epsilon=0.2$ in MPC-DC (II-A) and $\epsilon=0$ in the other methods. The results are shown in Table II.
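The function $f$ in (16) is not spelled out in this section. One common choice, assumed here purely for illustration, is a forward-Euler discretization of the continuous unicycle kinematics with the 0.2 s simulation step; the function name `unicycle_step` and the angle wrapping are likewise assumptions.

```python
import math

DT = 0.2  # s, simulation time step from the setup


def unicycle_step(state, control, dt=DT):
    """One forward-Euler step of the unicycle model.

    state   = (x, y, theta): position and heading angle
    control = (v, omega):    linear and angular velocity
    """
    x, y, theta = state
    v, omega = control
    x_next = x + v * math.cos(theta) * dt
    y_next = y + v * math.sin(theta) * dt
    theta_next = theta + omega * dt
    # Wrap the heading into [-pi, pi] to keep it bounded.
    theta_next = math.atan2(math.sin(theta_next), math.cos(theta_next))
    return (x_next, y_next, theta_next)
```

The position update depends on $\theta$ through sine and cosine, which is what makes the resulting optimization problem nonlinear in contrast to the double-integrator case.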

TABLE II: Quantitative Results With Unicycle System

Controller	$\gamma$	S $\uparrow$	C $\downarrow$	T (s)	FS	ST (ms)
MPC-DC(II-A)	-	0.186	0.808	10.87	6.116	50.78
MPC-D-CBF(II-B)	0.08	0.848	0.152	14.40	8.862	49.07
	0.10	0.764	0.236	13.27	7.756	49.13
	0.12	0.724	0.276	12.58	6.976	49.27
SCMPC-CBF(III-A)	0.08	0.980	0.020	14.44	0	60.59
	0.10	0.936	0.064	13.27	0	60.75
	0.12	0.900	0.100	12.68	0	60.83
Ours(III-B)	0.08	0.982	0.018	14.56	0.032	61.54
	0.10	0.960	0.040	13.34	0.128	62.37
	0.12	0.902	0.098	12.71	0.388	61.03

The results demonstrate that our method achieves a high success rate despite the reduced action space of non-holonomic systems. Because non-holonomic mobile robots are underactuated [28] (3 degrees of freedom $(x,y,\theta)$ but only 2 controls $(v,\omega)$), their obstacle avoidance capability is inevitably limited. As a result, the advantage of our method over the other methods is less pronounced than in the double-integrator experiments. As in those experiments, the system’s safety during navigation improves as $\gamma$ decreases. However, a smaller $\gamma$ also yields a higher probability of solution failure. The increase in average solution time can be attributed to the system’s nonlinearity. The simulation experiments with the unicycle model lay the groundwork for the subsequent real-world experiments. Fig. 4 compares the navigation paths of these controllers in the 116th test case. Our code and further examples can be found at https://github.com/Zetao-Lu/CrowdNav_MPCCBF

IV-C Real-world Experiments

In the real-world experiments, we deployed our method on an MR1000 robot. We set the sensing range of the 64-line lidar to 8 meters and used PointPillars [29] for pedestrian detection within this range. Additionally, we used AB3DMOT [30] to track the detections and estimate the pedestrians’ relative positions and velocities. After building a map of the real-world environment, we estimated the robot’s state using the AMCL package in ROS and a Kalman filter. As shown in Fig. 5, the MR1000 robot successfully navigated to the target location without colliding with pedestrians, using our method as a local planning controller. The results demonstrate that our method transfers from simulation to a real robot as a local planning module while keeping the robot safe.

V CONCLUSIONS

In this paper, we propose a new MPC framework that integrates the CBF to address the challenge of obstacle avoidance in dynamic environments while avoiding the infeasibility problem caused by hard constraints acting over the entire prediction horizon. Additionally, we design a D-GCBF based on the relative degree of the constraints with respect to the system, which enhances the robot’s obstacle avoidance capability under soft constraints by imposing a single-step safety constraint. Experimental results demonstrate that our method achieves a higher navigation success rate, a lower collision rate, and a lower probability of solution failure than the baseline methods. Furthermore, we deploy the method as a local planning controller on the MR1000 robot using the ROS platform and validate its effectiveness in real-world environments.

References

[1] A. J. Lee, W. Song, B. Yu, D. Choi, C. Tirtawardhana, and H. Myung, “Survey of robotics technologies for civil infrastructure inspection,” Journal of Infrastructure Intelligence and Resilience, vol. 2, no. 1, p. 100018, 2023.
[2] J. Zeng, B. Zhang, and K. Sreenath, “Safety-critical model predictive control with discrete-time control barrier function,” in 2021 American Control Conference (ACC). IEEE, 2021, pp. 3882–3889.
[3] Z. Jian, Z. Yan, X. Lei, Z. Lu, B. Lan, X. Wang, and B. Liang, “Dynamic control barrier function-based model predictive control to safety-critical obstacle-avoidance of mobile robot,” in 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023, pp. 3679–3685.
[4] P. Fiorini and Z. Shiller, “Motion planning in dynamic environments using velocity obstacles,” The International Journal of Robotics Research, vol. 17, no. 7, pp. 760–772, 1998.
[5] J. Van Den Berg, S. J. Guy, M. Lin, and D. Manocha, “Reciprocal n-body collision avoidance,” in Robotics Research: The 14th International Symposium ISRR. Springer, 2011, pp. 3–19.
[6] D. J. Gonon, D. Paez-Granados, and A. Billard, “Reactive navigation in crowds for non-holonomic robots with convex bounding shape,” IEEE Robotics and Automation Letters, vol. 6, no. 3, pp. 4728–4735, 2021.
[7] Y. Chen, F. Zhao, and Y. Lou, “Interactive model predictive control for robot navigation in dense crowds,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 52, no. 4, pp. 2289–2301, 2021.
[8] C. Chen, Y. Liu, S. Kreiss, and A. Alahi, “Crowd-robot interaction: Crowd-aware robot navigation with attention-based deep reinforcement learning,” in 2019 International Conference on Robotics and Automation (ICRA), 2019, pp. 6015–6022.
[9] H. Yang, C. Yao, C. Liu, and Q. Chen, “RMRL: Robot navigation in crowd environments with risk map-based deep reinforcement learning,” IEEE Robotics and Automation Letters, 2023.
[10] M. Boldrer, A. Antonucci, P. Bevilacqua, L. Palopoli, and D. Fontanelli, “Multi-agent navigation in human-shared environments: A safe and socially-aware approach,” Robotics and Autonomous Systems, vol. 149, p. 103979, 2022.
[11] B. Lindqvist, S. S. Mansouri, A.-a. Agha-mohammadi, and G. Nikolakopoulos, “Nonlinear MPC for collision avoidance and control of UAVs with dynamic obstacles,” IEEE Robotics and Automation Letters, vol. 5, no. 4, pp. 6001–6008, 2020.
[12] B. Brito, B. Floor, L. Ferranti, and J. Alonso-Mora, “Model predictive contouring control for collision avoidance in unstructured dynamic environments,” IEEE Robotics and Automation Letters, vol. 4, no. 4, pp. 4459–4466, 2019.
[13] A. Thirugnanam, J. Zeng, and K. Sreenath, “Safety-critical control and planning for obstacle avoidance between polytopes with control barrier functions,” in 2022 International Conference on Robotics and Automation (ICRA). IEEE, 2022, pp. 286–292.
[14] J. Zeng, Z. Li, and K. Sreenath, “Enhancing feasibility and safety of nonlinear model predictive control with discrete-time control barrier functions,” in 2021 60th IEEE Conference on Decision and Control (CDC). IEEE, 2021, pp. 6137–6144.
[15] H. Ma, X. Zhang, S. E. Li, Z. Lin, Y. Lyu, and S. Zheng, “Feasibility enhancement of constrained receding horizon control using generalized control barrier function,” in 2021 4th IEEE International Conference on Industrial Cyber-Physical Systems (ICPS). IEEE, 2021, pp. 551–557.
[16] X. Zhang, A. Liniger, and F. Borrelli, “Optimization-based collision avoidance,” IEEE Transactions on Control Systems Technology, vol. 29, no. 3, pp. 972–983, 2020.
[17] A. D. Ames, S. Coogan, M. Egerstedt, G. Notomista, K. Sreenath, and P. Tabuada, “Control barrier functions: Theory and applications,” in 2019 18th European Control Conference (ECC), 2019, pp. 3420–3431.
[18] G. Di Pillo and L. Grippo, “Exact penalty functions in constrained optimization,” SIAM Journal on Control and Optimization, vol. 27, no. 6, pp. 1333–1360, 1989.
[19] E. C. Kerrigan and J. M. Maciejowski, “Soft constraints and exact penalty functions in model predictive control,” Proc. UKACC International Conference (Control 2000), 2000.
[20] R. Fletcher, Practical methods of optimization. John Wiley & Sons, 2000.
[21] T. Anevlavis, Z. Liu, N. Ozay, and P. Tabuada, “Controlled invariant sets: implicit closed-form representations and applications,” arXiv preprint arXiv:2107.08566, 2021.
[22] K. Feng, Z. Lu, J. Xu, H. Chen, and Y. Lou, “A safety filter for realizing safe robot navigation in crowds,” in 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023, pp. 9729–9736.
[23] B. Tearle, K. P. Wabersich, A. Carron, and M. N. Zeilinger, “A predictive safety filter for learning-based racing control,” IEEE Robotics and Automation Letters, vol. 6, no. 4, pp. 7635–7642, 2021.
[24] M. Sun and D. Wang, “Initial shift issues on discrete-time iterative learning control with system relative degree,” IEEE Transactions on Automatic Control, vol. 48, no. 1, pp. 144–148, 2003.
[25] J. A. Andersson, J. Gillis, G. Horn, J. B. Rawlings, and M. Diehl, “CasADi: a software framework for nonlinear optimization and optimal control,” Mathematical Programming Computation, vol. 11, pp. 1–36, 2019.
[26] L. T. Biegler and V. M. Zavala, “Large-scale nonlinear programming using IPOPT: An integrating framework for enterprise-wide dynamic optimization,” Computers & Chemical Engineering, vol. 33, no. 3, pp. 575–582, 2009.
[27] J. Alonso-Mora, A. Breitenmoser, M. Rufli, P. Beardsley, and R. Siegwart, “Optimal reciprocal collision avoidance for multiple non-holonomic robots,” in Distributed autonomous robotic systems: The 10th international symposium. Springer, 2013, pp. 203–216.
[28] C. M. Pappalardo and D. Guida, “Forward and inverse dynamics of a unicycle-like mobile robot,” Machines, vol. 7, no. 1, p. 5, 2019.
[29] A. H. Lang, S. Vora, H. Caesar, L. Zhou, J. Yang, and O. Beijbom, “PointPillars: Fast encoders for object detection from point clouds,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
[30] X. Weng, J. Wang, D. Held, and K. Kitani, “3D multi-object tracking: A baseline and new evaluation metrics,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020, pp. 10359–10366.