• No results found

Optimization of the Parking Manoeuvre for a 1-Trailer Truck

N/A
N/A
Protected

Academic year: 2022

Share "Optimization of the Parking Manoeuvre for a 1-Trailer Truck"

Copied!
75
0
0

Loading.... (view fulltext now)

Full text

(1)

IN

DEGREE PROJECT COMPUTER SCIENCE AND ENGINEERING, SECOND CYCLE, 30 CREDITS

STOCKHOLM SWEDEN 2020,

Optimization of the Parking Manoeuvre for a 1-Trailer Truck

HECTOR ESTEBAN CABEZOS

KTH ROYAL INSTITUTE OF TECHNOLOGY

(2)
(3)

Authors

Hector Esteban Cabezos <hectorec@kth.se>

School of Electrical Engineering and Computer Science KTH Royal Institute of Technology

29/02/2020

Place for Project

Stockholm, Sweden Munich, Germany

Examiner

Joakim Gustafsson Stockholm

KTH Royal Institute of Technology

Supervisor KTH

Patric Jensfelt Stockholm

KTH Royal Institute of Technology

Supervisor Company

Arne-Christoph Hildebrandt Munich

MAN Truck Bus SE

(4)

Abstract

Motion planning for autonomous heavy duty vehicles has been intensively studied during the last years due to its promising benefits for industry in terms of money savings and safety. The system treated in this thesis is a 1-trailer truck. The goal was to develop and evaluate different methodologies in order to optimize the reverse parking manoeuvre for this specific system, while considering its kinematic constraints and collision avoidance. So far most of the methodologies developed for the system at hand focus on finding a feasible path connecting a start configuration to a goal configuration, but very little attention has been paid to the optimality of the solution given. Along the report the author assumes that a feasible solution is available and the four methodologies are used to optimize the initial path. The algorithms were tested in different environments in order to assess their strengths and weaknesses. Results show that a good description of the obstacles in the map is required to obtain a good solution in all class of environments, and if that description is scarce the optimization for narrow and spatially constrained environments does not provide the desired results.

Furthermore, it was discovered that using a good feasible solution as a starting

point reduces the convergence time of the algorithms presented.

(5)

Sammanfattning

Rörelseplanering för autonoma tunga fordon har studerats intensivt under de

senaste åren tack vare dess fördelar för industrin när det gäller besparingar och

säkerhet. Systemet som behandlas i detta examensarbete är en lastbil med ett

släp. Målet var att utveckla och utvärdera olika metoder för att optimera den

omvända parkeringsmanövern för detta specifika system, med beaktande av dess

kinematiska begränsningar och undvikande av kollision. Hittills har de flesta av

de metoder som utvecklats för systemet fokuserat på att hitta en genomförbar

väg som kopplar en startkonfiguration till en målkonfiguration, men mycket liten

uppmärksamhet har ägnats åt optimaliteten i den angivna lösningen. I denna

rapport antas att en möjlig lösning är tillgänglig och fyra olika metoder undersöks

för att optimera den initiala vägen. Algoritmerna testades i olika miljöer för att

bedöma deras styrkor och svagheter. Resultaten visar att en bra beskrivning av

hinder i kartan krävs för att få en bra lösning i alla miljöklasser, och om den

beskrivningen är bristfällig ger optimeringen för smala och rumsligt begränsade

miljöer inte de önskade resultaten. Vidare upptäcktes att användning av en

god genomförbar lösning som utgångspunkt minskar konvergenstiden för de

presenterade algoritmerna.

(6)

Acknowledgements

First of all I wish to thank my supervisor at KTH, Patric Jensfelt, my supervisor at UPC, Luis J. de la Cruz Llopis, and my supervisor at MAN Truck and Bus AG Arne-Christoph Hildebrandt. Moreover, I would like to thank the other students from my peer review for their attention and ideas.

On a personal level I would like to thank my parents, whom I do not see as often as

I would like to, and my friends who have suffered me during these 5 months.

(7)

Contents

1 Introduction 1

1.1 Problem Formulation . . . . 2

1.2 Sustainability, Societal and Economic Aspects . . . . 2

1.3 Thesis Outline . . . . 3

2 Background 4 2.1 Modelling a Single Trailer Truck System . . . . 4

2.2 Optimal Path Planning . . . . 8

2.3 Sampling-Based Motion Planning for Truck-Trailer Systems . . . . 11

2.4 Distance Heuristics . . . 14

2.5 Local Planning for Truck-Trailer Systems . . . . 15

2.6 Collision Avoidance . . . 16

3 Related Work 19 4 Optimization Algorithms 22 4.1 Shortcut Algorithm . . . 22

4.2 Coordinate Points Optimization . . . 29

4.3 DDP and iLQR . . . 33

4.4 Non-Linear Optimization . . . 42

5 Evaluation and Results 50 5.1 Evaluation Implementation . . . 50

5.2 Test Scenarios . . . . 51

5.3 Results . . . 52

5.4 Results Evaluation . . . 54

6 Conclusion 55 6.1 Future Work . . . 55

References 57

(8)

1 Introduction

During this last decade a lot of research has been conducted, both by academia and industry, in the area of autonomous vehicles (AV). AV technology ranges from Driver Assistance systems, where some components of the vehicle support the driver but do not take control over the vehicle (Level 1 [2] of vehicle automation), to Full Automation vehicles where the whole control of the vehicle is automated and any person inside the vehicle is just a passenger (Level 5 of vehicle automation). With this technology breakthrough different benefits are accomplished in terms of human safety, reduction in the environmental footprint and time savings. Vehicles nowadays are equipped with numerous assisting functions like Cruise control, Lane Keeping Assist, Collision Avoidance which improved the overall efficiency and safety of the transportation [57]. In order for the reader to get a grasp on AV capabilities, according to the US Department of Transportation (USDOT) 94% of fatal vehicle accidents are accountable to human errors, like human distractions or driving under the influence of alcohol.

Even though most of the news around autonomous vehicles focus on cars, there is plenty of research being conducted for all kind of systems. AV are expected to provide a huge impact for industrial applications, where most of the concerns around autonomous cars, such as legality or ethics, are not applied. In this scenarios the introduction of autonomous vehicles would be safer and easier to test.

Driving automation operates through the output provided by several sensors, that allow the vehicle to understand its surrounding and to be able to safely drive through it. These systems can be grouped depending on their use into:

• Perception where different sensors based on visual data, LIDAR or Radar extract information from the environment and allow the other modules to understand it better.

• Localization and Mapping where the location of the vehicle with

respect its environment is provided, as well as a map with the different

obstacles from its surroundings. This module makes use of the output from

Perception in order to locate the vehicle with respect to the created map.

(9)

• Planning where a safe path is provided based on the inputs from the previous modules. It needs to be fast enough to adapt to sudden changes, like the appearance of a dynamic obstacle.

• Control executes the planned driving trajectories set by the planning system.

1.1 Problem Formulation

This Master Thesis is enclosed in the Planning aspect of an AV, described in Section 1. More concretely, a collision free optimal path to go from an initial pose to a goal pose is to be provided for a single-trailer truck. The term optimal will be discussed and tackled along this report.

The problem at hand is restricted to low-speed scenarios due to its simplicity when considering the dynamics of the vehicle and because it is reasonable that while a parking manoeuvre is being performed the vehicle needs to move at low-speed, due to security reasons. Moreover, the environment is assumed to be static, as if any change in the map occurs, i.e. a new obstacle is detected by the Perception module, due to the low-speed of the vehicle there is enough time to plan a new path.

Furthermore, a feasible solution is supposed to be available as a starting point for our optimization pipeline. Most of the time, it does not matter how the feasible path was obtained as we are just interested in the pose at each point of the discretized path along with the corresponding control inputs.

As a summary, this report focus on how to determine an optimized path for the reverse parking manoeuvre of a single truck-trailer system while respecting its kinematics and avoiding the obstacles in the environment.

1.2 Sustainability, Societal and Economic Aspects

The arrival of autonomous vehicles may provoke a big change in the job market, as some positions will probably disappear, i.e. drivers, while other ones will be created, i.e. perception engineers.

The use of autonomous vehicles in dangerous areas, such as mines, helps to reduce

human accidents. Moreover, the optimization of the path of an autonomous

(10)

vehicle helps to reduce its environmental footprint, as it reduces the distances covered by the vehicle and consequently the amount of gas consumed.

1.3 Thesis Outline

This Master Thesis report will be structured the following way:

• Chapter 1 Introduces the general topic of autonomous driving paying special attention to the problem at hand and presents the structure of the report.

• Chapter 2 Provides a background into the dynamics of the system considered in this thesis (single trailer truck) as well as required theory about Planning Algorithms and Optimization that may be useful for the following chapters.

• Chapter 3 The different algorithms developed in this project are presented.

• Chapter 4 The described algorithms in the previous chapter are compared and results to assess their quality for our task are given.

• Chapter 5 Finally, the report ends with conclusions obtained based on the

results from Chapter 4 and topics to be further analyzed in future work.

(11)

2 Background

This chapter will introduce some theoretical points that will be used along the report as well as providing an overview of the current state of reverse-autonomous parking for any kind of vehicle, emphasizing the available work done on single trailer trucks.

2.1 Modelling a Single Trailer Truck System

In this section a mathematical model of the concrete truck and trailer system used in this report is presented along with some of its properties. The system belongs to a class of systems called general n-trailer systems, where there is one pulling vehicle and n trailers being pulled by it. The system considered is the simplest one on this class, a general 1-trailer system.

Besides number of trailers, truck-trailer vehicles are also categorized based on where the trailer is hitched into the truck, having simple and general hitching.

Simple hitching is when the trailer is hitched to the center of the rear axle of the vehicle pulling it, in our case the truck, while general hitching is when the trailer is hitched away from the truck pulling, which is referred in the literature as off-axle.

The study developed in this report applies to the simple hitching truck. In order to translate the presented results into the general hitching problem, changes in the kinematics and system equations are required. Results for the general case get more complicated due to the extra degree of freedom, as explained in [44] . A similar study for the general hitching trailer can be found in [60].

The chosen dynamical model for the vehicle is the well known kinematic unicycle model [36]. This model simplifies the kinematics of the truck by replacing the four wheels by a single wheel that rolls without friction in the direction that it is pointing. Again, under the assumption of low speed this approximation is considered accurate enough. If the dynamical system is desired to be obtained, models such as Vertical Dynamics or Longitudinal and Lateral Dynamics [47]

can be used. Nevertheless, many degrees of freedom need to be added accounting

for the different contributions into the dynamics, making the calculus harder and

with no significant improvement. so the simplest model was taken.

(12)

2.1.1 The kinematic 1-Trailer System

The simplest model to describe the motion of a truck pulling a trailer is the kinematic single track truck model depicted in Fig. 2.1, where the trailer is hitched to the center of truck’s rear axle.

There are many ways to define the state vector and control inputs of the system.

In this report the convention used in [33] will be followed, where the state and control input vectors are:

x =

⎡ ⎢

⎢ ⎢

⎣ 𝑥 𝑦 𝜃 𝛼

⎤ ⎥

⎥ ⎥

(1a)

u = ⎡

⎣ 𝑣 𝜙

⎤ ⎥

(1b)

being (𝑥, 𝑦) ∈ ℝ

2

the coordinates of the rear axle of the truck, 𝜃 the orientation of the truck bounded by [−𝜋, 𝜋), 𝛼 ∈ [−𝛼

𝑚𝑎𝑥

, 𝛼

𝑚𝑎𝑥

] the hitch angle of the system, which can be seen as well as 𝛼 = 𝜃 − 𝜃

𝑡

, being 𝜃

𝑡

the orientation angle of the trailer and 𝛼 the difference between the orientation of the truck and the trailer. The control input vector from Eq. (1b) is formed by the velocity 𝑣 and 𝜙 ∈ [−𝜙

𝑚𝑎𝑥

, 𝜙

𝑚𝑎𝑥

] the steering angle of the front wheels of the truck.

Under the assumption that the system performs the parking manoeuvre at low speed the wheels roll without slip and the lateral acceleration is negligible. Thus, the equations of motion of a single track vehicle model are described by the following equations:

̇

𝑥 = 𝑣 ⋅ 𝑐𝑜𝑠(𝜃) (2a)

̇

𝑦 = 𝑣 ⋅ 𝑠𝑖𝑛(𝜃) (2b)

̇ 𝜃 = 𝑣

𝑙 ⋅ 𝑡𝑎𝑛(𝜙) (2c)

̇

𝛼 = −𝑣 ⋅ ( 𝑠𝑖𝑛(𝛼)

𝑙

𝑡

+ 𝑡𝑎𝑛( 𝜙

𝑙 )) (2d)

where 𝑙 is the wheelbase of the truck and 𝑙

𝑡

the hitch length of the trailer. The first

(13)

term in Eq. (2d) represents the change of 𝜃

𝑡

while the second term accounts for change in 𝜃.

For the truck and trailer system under consideration the maximum steering and hitch angle were set based on mechanical specifications to 𝜙

𝑚𝑎𝑥

= 0.55 𝑟𝑎𝑑 and 𝛼

𝑚𝑎𝑥

= 0.1 𝑟𝑎𝑑.

Figure 2.1: Kinematic single track truck model.

2.1.2 Time Independent Form

The reader may notice from the equations of motion in Eq. (2) that 𝑣 appears as a multiplicative factor, which allow us to represent the system in time-independent form (independent of the actual velocity) using Leibniz’s notation as in [33]:

̇

x = 𝜕x

𝜕𝑡 = 𝜕x

𝜕𝑠

𝜕𝑠

𝜕𝑡 = 𝜕x

𝜕𝑠 ⋅ |𝑣| = x

⋅ |𝑣| (3)

where |𝑣| =

𝑠𝑖𝑔𝑛(𝑣)𝑣

and x

represents the derivative of state vector x with respect

arc length. This allows to plan a time independent trajectory, which becomes a

(14)

path, making use of x

and caring only about the speed sign:

x

= 𝑠𝑖𝑔𝑛(𝑣) ⋅

⎡ ⎢

⎢ ⎢

𝑐𝑜𝑠(𝜃) 𝑠𝑖𝑛(𝜃)

𝑡𝑎𝑛(𝜙) 𝑙

−(

𝑠𝑖𝑛(𝛼)𝑙

𝑡

+

𝑡𝑎𝑛(𝜙)𝑙

)

⎤ ⎥

⎥ ⎥

(4)

In Chapter 4 some of the algorithms presented will make use of this property to optimize a time-independent trajectory, or path, just based on the set of results from Eq. (7) while others will still require to consider the original kinematics from Eq. (2).

2.1.3 Differential Flatness

Differential flatness has become a key tool in the development of steering methods for systems where one vehicle pulls another vehicle. It was introduced in [18]. A system is differentially flat if all the variables in its state space can be obtained from a set of variables called flat output. Considering a system ̇ 𝑥 = 𝑓(𝑥, 𝑢) where 𝑥 ∈ ℝ

𝑛

and 𝑢 ∈ ℝ

𝑚

there will be flat outputs of the form:

𝑦 = ℎ(𝑥, 𝑢), ̇ 𝑢, ..., 𝑢

(𝑠)

) (5)

where 𝑠 accounts for the maximum derivative in the controls. Moreover, making use of these flat outputs, it need to exist functions 𝑔 and 𝑎 so that:

𝑥 = 𝑔(𝑦, ̇ 𝑦, ..., 𝑦

(𝑗)

) 𝑢 = 𝑎(𝑦, ̇ 𝑦, ..., 𝑦

(𝑗)

)

(6)

For a 1-trailer system, the flat output is located in the center of the rear axle of the trailer [3]. Thus, the state space can change from (𝑥, 𝑦, 𝜃, 𝛼) to (𝑥

𝑓

, 𝑦

𝑓

, 𝜃

𝑓

, 𝜅

𝑓

):

x

𝑓

=

⎡ ⎢

⎢ ⎢

⎣ 𝑥

𝑓

𝑦

𝑓

𝜃

𝑓

𝜅

𝑓

⎤ ⎥

⎥ ⎥

(7)

(15)

being (𝑥

𝑓

, 𝑦

𝑓

) the flat output coordinates, 𝜃

𝑓

the orientation of the flat point and 𝜅

𝑓

the curvature of the flat point. The new state vector x

𝑓

fulfills the differentially flat system property for the flat output (𝑥

𝑓

, 𝑦

𝑓

) because 𝜃

𝑓

and 𝜅

𝑓

can be obtained as:

𝜃

𝑓

= 𝑎𝑡𝑎𝑛

2

(𝑦

𝑓

, 𝑥

𝑓

) 𝜅

𝑓

= 𝑥

𝑓

𝑦

𝑓

− 𝑥

𝑓

𝑦

𝑓

(𝑥

𝑓2

+ 𝑦

𝑓2

)

32

(8)

They only depend on the flat output and its derivatives with respect to the arc length. The trailer can be seen as a car-like vehicle obeying the kinematic unicycle model with steering angle 𝜙

𝑓

= −𝛼. The curvature of the trailer is related to 𝛼 by:

𝜃

𝑓

= 𝜅

𝑓

= 𝑠𝑖𝑔𝑛(𝑣) 𝑡𝑎𝑛(−𝛼)

𝑙

𝑡

(9)

For the single trailer truck system the relationship between the flat system state and the system state is:

x

𝑓

=

⎡ ⎢

⎢ ⎢

𝑥 − 𝑙

𝑡

𝑐𝑜𝑠(𝜃 + 𝛼) 𝑦 − 𝑙

𝑡

𝑠𝑖𝑛(𝜃 + 𝛼)

𝜃 + 𝛼

−𝑡𝑎𝑛(𝛼)/𝑙

𝑡

⎤ ⎥

⎥ ⎥

x =

⎡ ⎢

⎢ ⎢

𝑥

𝑓

+ 𝑙

𝑡

𝑐𝑜𝑠(𝜃

𝑓

) 𝑦

𝑓

− 𝑙

𝑡

𝑠𝑖𝑛(𝜃

𝑓

) 𝜃

𝑓

+ 𝑎𝑟𝑐𝑡𝑎𝑛(𝑙

𝑡

𝜅

𝑓

)

−𝑎𝑟𝑐𝑡𝑎𝑛(𝑙

𝑡

𝜅

𝑓

)

⎤ ⎥

⎥ ⎥

(10)

This property converts the steering problem for that system to constructing a smooth curve connecting the flat points of the start and goal configurations. The algorithm presented in Section 4.2 optimizes a set of coordinates < 𝑥

𝑓

, 𝑦

𝑓

> and uses the differential flatness property in order to find the corresponding set of

< 𝜃

𝑓

, 𝜅

𝑓

>.

2.2 Optimal Path Planning

Path planning consists on finding a path from an initial configuration to a specific

goal configuration, while respecting the constraints of the system. Terms as

(16)

feasible or optimal are used to describe the path based on the performance of the solution given. Feasible path planning determines a path that satisfies a set of constraints without considering how adequate the resulting path is, based on distance covered, steering effort or changes of direction. On the other hand, optimal path planning tries to find a path that fulfills the constraints but that at the same time minimizes a cost function.

As stated in [46] the shortest collision free path for a holonomic vehicle in a 2-D environment where the obstacles are polygons can be found in polynomial time 𝑂(𝑛

2

) [41], being 𝑛 the higher number of vertices of any obstacle. This path though is not suitable for our task as it is not bounded in curvature and in each vertex there is discontinuities in 𝜃, as Fig. 2.2 depicts. On the other hand, the problem where

Figure 2.2: Path going from S to G where curvature is non-continuous.

a bounded curvature and collision free path for polygonal obstacles is searched is NP-hard, which implies that there is no systematic algorithm that can solve our specific problem in polynomial time. This result is quite disappointing as the latter problem is the one tried to be solved in this report. Since exact algorithms that compute results in tractable time are non-existent one needs to turn ones attention into more general numerical methods.

Numerical methods for path and trajectory optimization (which can be

transformed from one to the other following [19] and [25]) divide broadly in three

groups: Variational methods, Graph-search methods and Incremental search

methods. The latter is the family of the sampling based approaches explained in

(17)

Section 2.3. The focus in this report will be on variational methods but the reader can find information about the three categories in [19], [36].

2.2.1 Variational Methods

These kind of methods represent the path that is tried to be optimized as a finite dimensional vector. Its parameters will be optimized using non-linear continuous optimization techniques. The results tend to converge to a local optima, instead of global. However, due to computational infeasibility of finding global optima in highly non-convex problems it is considered good enough. Numerical solvers require to state the problem as:

argmin

𝑥∈ℝ𝑛

𝐽 (𝑥)

subject to x[0] = x

𝑖𝑛𝑖𝑡

x[𝑁 ] = x

𝑔𝑜𝑎𝑙

𝑓(𝑥[𝑛], 𝑥

[𝑛], ...) = 0 ∀𝑛 ∈ [0, 𝑁 ] 𝑔(𝑥[𝑛], 𝑥

[𝑛], ...) ≤ 0 ∀𝑛 ∈ [0, 𝑁 ]

(11)

where the nonholonomic and differential contraints are represented by 𝑓(.) and 𝑔(.). In some examples due to the infeasibility of obtaining a solution for the constraint problem, it can be softened by converting it to an unconstrained problem in which barrier functions are used in place of the constrained conditions [1].

̄ 𝐽 (𝑥) = 𝐽 (𝑥) + 𝜖

𝑁

𝑛=0

ℎ(𝑥[𝑛], 𝑥

[𝑛], ...) (12) The barrier function ℎ(.) should satisfy: 𝑔 ≤ 0 → ℎ < ∞ and 𝑔 > 0 → ℎ = ∞ and 𝜖 is a factor accounting for the importance of fulfilling the soft constrains.

This formulation has the drawback that local minima may violate the problem constraints, as they are not guaranteed when using Eq. (12).

Two groups of variational methods are presented:

• Direct methods: The optimization problem is transformed from the infinite-

dimensional space of all possible paths/trajectories available into a finite

dimensional subspace. Policy 𝜋(𝑥), which determines the optimal next pose

(18)

based on current index, will be approximated as:

𝜋

𝑥

[𝑛] ≈ ̄ 𝜋[𝑛] =

𝑁

𝑖=0

𝜋

𝑖

𝜙

𝑖

[𝑛] (13)

where 𝜋

𝑖

∈ ℝ is a weighting factor for each discretized basis function 𝜙

𝑖

. Different methods are used to find this basis functions, the most important ones are collocation points and pseudoespectral methods [59][8].

• Indirect methods: The optimality conditions are taken from Pontryagin’s minimum principle. It adds the kinematic constraints of the system to the Lagrangian ℒ along with the time-varying Lagrange multiplier vector 𝜆 , whose elements are called the co-states of the system. Indirect methods [56] will solve the system of ODEs generated by the Pontryagin’s principle resulting in optimal states and co-states. As solving the ODEs can be numerically untractable, there is one technique called shooting method in which the free initial conditions of the problem are iteratively tuned. After each iteration the system is forward simulated. The goal is to find the set of initial conditions that allow the system to reach the desired final state. The shooting method is highly used in optimal control environments to solve optimal control problems, such as ACADO [4].

2.3 Sampling-Based Motion Planning for Truck-Trailer Systems

Using the time-independent kinematic form presented in Eq. (7) a time-

independent path from 𝑥

𝑠𝑡𝑎𝑟𝑡

to 𝑥

𝑔𝑜𝑎𝑙

can be constructed. We are interested in

finding an optimal path going from 𝑥

𝑠𝑡𝑎𝑟𝑡

to 𝑥

𝑔𝑜𝑎𝑙

, based on a cost function to be

(19)

tuned for our specific purpose. The problem at hand has the form:

minimize 𝐽 =

𝑠𝑔𝑜𝑎𝑙

𝑠=𝑠𝑠𝑡𝑎𝑟𝑡

𝐶(𝑥

𝑠

, 𝑢

𝑠

) subject to x

(𝑠) = x

(𝑥

𝑠

, 𝑢

𝑠

)

x(𝑠

𝑔𝑜𝑎𝑙

) = x

𝑔𝑜𝑎𝑙

x(𝑠

𝑠𝑡𝑎𝑟𝑡

) = x

𝑠𝑡𝑎𝑟𝑡

x(𝑠) ∈ ℂ

𝑓𝑟𝑒𝑒

(14)

being ℂ

𝑓𝑟𝑒𝑒

the region of space free from obstacles. Finding an optimal path while respecting the nonholonomic constraints of the vehicle has been found to be PSPACE-hard [36], which means that there exists no algorithm that can solve the problem in polynomial time [9]. Due to this computational limitation, the research community has focused on studying approximate methods and trying to solve sub-problems of the general motion planning problem. The common approach followed in the literature is to obtain a relatively good path through sampling-base methods and afterwards optimize the resulting path, which is faster than trying to solve Eq. (14) from scratch. For highly non-convex problems, the initialization of the interested parameters, such as the set of state vectors and control inputs, is of high importance.

The main advantage of sampling-based motion planning is that avoids having to represent the whole ℂ

𝑓𝑟𝑒𝑒

space and instead checks the space by randomly sampling all over it. Moreover, a collision detection module is responsible of determine whether or not the sampled configuration is collision free. Considering the collision module as a black box allows us to focus on developing better planning algorithms, with disregard of the environment.

There are two main approaches to construct a sampling-based planner: regularly- spaced lattice sampling and random state sampling.

2.3.1 Spaced Lattice Sampling

The continuous nonholonomic motion planning problem is converted into a

discretized planning problem as the state space is divided in cells of a certain

area. The smaller the area the higher the resolution but the computational time

(20)

increases. The problem is solved using search algorithms like A* or Dijkstra [11]. Once the ℂ

𝑓𝑟𝑒𝑒

map has been discretized a search over the graph of all the discretized cells is conducted. It starts at the vertex belonging to 𝑥

𝑠𝑡𝑎𝑟𝑡

and explores adjacent nodes until the destination node, the one that 𝑥

𝑔𝑜𝑎𝑙

belongs to, is reached. These methods have been extensively tested in holonomic problems [12] where the adjacent nodes for a specific node are all its neighbour nodes. For a nonholonomic vehicle the adjacent nodes are reformulated as the set of nodes that are kinematically feasible to be reached from the current node. For example, in Dijkstra algorithm just the nodes that can be reached from the node that is currently being examined is saved into the open space, so spatial proximity makes no sense when dealing with a nonholomic system. A* is a variant of Dijkstra in which assigns a weight to each open node equal to weight of the edge to that node plus the approximate distance between that node and the goal. This approximate distance is found by an heuristic, and represents a minimum possible distance between that node and the end. In path planning for robotics the common heuristic used is Euclidian distance to the goal without considering obstacles. This is far from being accurate but gives a certain extra weight to those nodes that get closer in distance to the goal. The biggest drawback of the two methods presented so far for our problem is that the system is constrained to move to the middle point of the lattice cells, which may be hard to reach exactly. Hybrid A* removes these constraints by forward simulating the system based on a set of controls chosen randomly and the end pose is saved so that after the graph algorithm can determine the shortest path to reach the goal.

All lattice-based methods have the disadvantage of not being able to exactly reach a given goal state. Then, an optimization algorithm or a local planner (Section 2.5) are required to get exactly to the goal state.

2.3.2 Sampling Based

The main idea is to randomly sample states from the continuous state space, different from the discretization of the space done in the spaced lattice approach.

In the original sampling based algorithm, called Probabilistic Road Maps (PRM),

a graph was first created after sampling different states and connecting them

making use of a local planner. Afterwards, a graph search algorithm is applied

(21)

to the resulting graph to determine a path between the starting and goal configurations. Later in time, algorithms that tried to reach the goal state while creating the graph were presented, such as Rapidly-exploring Random Tree (RRT) algorithm.

Every time a new node is sampled it is checked to see if it belongs to the ℂ

𝑓𝑟𝑒𝑒

using a collision avoidance algorithm [33]. Once a node ∈ ℂ

𝑓𝑟𝑒𝑒

has been sampled, the nearest node previously saved needs to be found and try to connect both.

In a holonomic procedure, this step is trivial as neighbour nodes are adjacent nodes in space and to move from one node to its adjacent a straight line works.

Nevertheless, in nonholonomic scenarios in order to find the closest node to the sampled one a steering method should be used to calculate the distance between nodes.

2.4 Distance Heuristics

A steering method constructs a path from a start state towards exactly a goal state. The curve connecting both nodes needs to fulfill the kinematic constraints of the vehicle so a boundary problem is required to be solved, as explained in Section 2.5. As doing this from every node/cell saved towards the new sampled one is prohibitively expensive, some heuristic distance functions are normally used. Kvarnfors [33] obtained a discretized cost map for going to all the near positions around the single trailer truck system in the discretized grid. It was obtained using the exact local planner explained in Section 2.5 to go to all the cells from the discretized grid. Fig. 2.3 depicts one of the tables obtained. The figure presented shows the interpolated results when 𝜃

𝑔𝑜𝑎𝑙

= 0, 𝜅

𝑠𝑡𝑎𝑟𝑡

= 0 and 𝜅

𝑒𝑛𝑑

= 0.

Different values for any of these terms means that a new lookup table needs to be created. Moreover, each lookup table is only valid for the specific system in which the local planner was used. If a new system wants to make use of the table, some parameters from the local planner, such as length of the vehicle, need to be changed in order to obtain a new lookup table.

Once the lookup table is available, it is easier to evaluate neighbour nodes. After

the neighbour node is found, the system tries to steer from the neighbour node

towards the newly sampled node. Depending on the RRT algorithm used, the steer

(22)

step may change.

Figure 2.3: This Look-Up table represents the distance from configuration (0, 0, 0, 0, 0) to (Δ

𝑥

, Δ

𝑦

, 0, 0, 0) using the local planner from Section 2.5.

2.5 Local Planning for Truck-Trailer Systems

An exact local planning method is intended to drive exactly the system from

an initial start configuration to a specific goal configuration. As there is no

known optimal solutions for the 1-trailer system, it needs to be assessed either

numerically or using an heuristic. The local planner used along this thesis (in

Section 4.1) was firstly developed in [33] where using the differential flatness

property of the single trailer truck a geometrical curve from the starting pose until

the end pose is computed. This curve is constraint to be continuous on curvature

and sharpness, which is the derivative of curvature with respect to arc length

(𝜎 =

𝛿𝜅𝛿𝑠

). It uses Continuous Sharpness Turns [45] where the curvature profile

is represented as 𝜅(𝑠) = 𝑎

3

𝑠

3

+ 𝑎

2

𝑠

2

+ 𝑎

1

𝑠 + 𝑎

0

which produces a path that is

continuous in curvature and sharpness. The coefficients of the curve are found by

setting the initial and final curvature of the segment. In order to generate a path

one of these turns is placed at the start state and another turn at the end state

and they are connected with a straight line. This line requires that the 2 lines

on its extremes end with a curvature of 0. Due to the limitation of forcing both

paths to be connected through a straight line it may result in suboptimal paths.

(23)

The other solution proposed in [33] is to use seventh order degree Bézier curves to connect both configurations. This approach extends the work done in [34], where fifth order Bézier curves are used, to seventh order curves. This decision is made as it is the minimum degree that makes the curvature and sharpness of the path continuous. This is a requirement to make the connection between multiple paths drivable. Once a continuous sharpness turn and a Bézier curve have been generated the planner will save the option with the smaller cost. Figure 2.4 compares different paths obstained making use of the local planner from [33].

Figure 2.4: All paths start at [0, 0, 0, 0] (black system) being the state vector the one presented in Eq. 1a. The final state for the orange path is [25, 40, 0, 0], for the blue is [30, −5, 0, 0] and for the green [−5, −10, 0, 0].

2.6 Collision Avoidance

Along this report, two different approaches to deal with the collision avoidance problem are treated. They will be used depending on the kind of optimization algorithm being used. In here just one method will be explained while in Section 4.4.1 the other method, which is specifically used in one of the optimization algorithms, will be developed.

The first collision avoidance method is based on the SAT- algorithm for object-

oriented bounding boxes (OBBs) [52],[14]. Considering a pair of convex polygons,

(24)

a set of lines are considered for testing whether they are separated or not. That set includes the normal vectors to the edges of the polygons. As in our case both obstacle and truck-trailer are represented as rectangles, just two normal vectors need to be tested for each obstacle and for the truck or the trailer, being those two orthogonal between each other as depicted in Fig. 2.5. Both objects (the obstacle and the truck or trailer) are then projected into the axis generated by the normal vector. The interval occupied by the projection of a compact set 𝐶 onto a line with unit length referred as 𝐷 is:

𝐼 = [𝜆

𝑚𝑖𝑛

(D, 𝜆

𝑚𝑎𝑥

(D)] = [𝑚𝑖𝑛{𝐷 ⋅ 𝑋 ∶ 𝑋 ∈ 𝐶}, 𝑚𝑎𝑥{𝐷 ⋅ 𝑋 ∶ 𝑋 ∈ 𝐶}] (15)

Two compact convex sets 𝐶

𝑎

and 𝐶

𝑏

are separated if there exists a direction 𝐷 such that the projection intervals 𝐼

𝑎

and 𝐼

𝑏

do not intersect. Analytically, they do not intersect when:

𝜆(𝑎)

𝑚𝑖𝑛

(D) > 𝜆(𝑏)

𝑚𝑎𝑥

(D) or 𝜆(𝑎)

𝑚𝑎𝑥

(D) < 𝜆(𝑏)

𝑚𝑖𝑛

(D) (16)

All the combinations are tested and it is just needed that in one of the normal

axis the projected objects do not intersect each other to determine they are not

colliding. If they intersect in all the axis it will be considered that there is a collision

between both objects. This test will be done for each coordinate of both truck and

trailer against each obstacle in the map.

(25)

Figure 2.5: Trailer and obstacle projected into axis D1,t (green lines) where they

do not interfere

(26)

3 Related Work

Research on motion planning for truck and trailer systems started in the 90s where the concept of differential flatness [3] was discovered, which lead to the creation of a new set of planners [48] that used this property to solve the motion planning problem in a geometrical manner. For the specific single trailer problem, an exact path formed by a group of translation, rotation and transition segments each with constant speed and steering angle connecting an initial and goal configuration was provided in [54].

The inability of planning a path in environments with higher complexity and when dealing with nonholonomic constraints was the reason why the attention was turned into sampled lattice planner and sampled based motion planner algorithms, introduced by LaValle [35] as new ways of motion planning. The original RRT algorithm is probabilistically complete, which means that if enough time is given it will converge to a solution in case it exists, but no attention is given to the optimality of it. Later on Karaman et al. [28] introduced RRT*

which is not only probabilistically complete but also probabilistically optimal.

Nonetheless when used with non-holonomic systems a two point boundary value

problem is to be solved in the re-wire step which is a hard problem. Later on CL-

RRT [32] was introduced which used RRT for the sampling step and a closed-loop

controller for connecting the nodes, meaning that the exact goal node could not be

reached but it allowed to speed up the procedure with respect using local planners

for the connection between the nodes. Another RRT variation introduced in the

original work from LaValle et al. [37] is the Bidirectional RRT where one tree

starts at the start pose and another one at the final pose and both are extended

and tried to be connected between each other. This approach avoids a typical

drawback of RRT called bug traps. Bug traps are planning problems that are

significantly easier to solve in one direction than in the other one. Evestedt

et al. studied the general 2-trailer system making use of RRT based motion

planners [16] and grid based approaches [40] while connecting the nodes or

vertexes through forward simulation. Kvarnfors [33] developed the Bidirectional

CL-RRT, where the Bidirectional RRT was combined with a closed loop controller

for the connection between adjacent nodes. As a closed loop controller does

(27)

not lead to the exact goal configuration, the actual configuration achieved was saved in the graph, instead of the one sampled in the sampling based step. This approach returns a feasible path if enough time is given, but the solution may have unnecessary turns due to the random sampling behaviour.

Until now, none of the presented procedures, except RRT*, dealt with the optimality of the given path. Holmer [25] developed a CL-RRT motion planner to approximately reach the goal position combined with an optimal control problem for the last part of the path to exactly reach the desired position, solved making use of a direct multiple shooting method [6], for a 2-trailer system. This approach did not assess the optimality of the whole path or the unnecessary turns due to the sampling step but just focused on reaching exactly the goal position optimizing just the final segment of the path. A more complete procedure from the optimality point of view is presented in [60], where a simple path planner to obtain an initial path, which does not need to be perfectly feasible, for a general hitch truck and trailer is combined with an optimization approach where the obstacles are considered as repelling springs, which does not totally guarantee that the resulting path is collision free.

More recently optimality has being studied for different systems than the one treated in this report but those results could be expanded for our purpose. In [15]

a simple RRT is combined with a local planner in order to shortcut unnecessary nodes. Dolgov et al. [13] developed a path-planning algorithm to compete in the DARPA Urban Challenge consisting on first a Hybrid-state A*, a modification of the traditional A* [24], combined with a trajectory optimization approach using conjugate-gradient (CG) descent divided into two steps. At first the vertices obtained from Hybrid A* would be moved to improve smoothness and latter on another pass of the optimization would reduce the discretization of the path.

This same approach was used by Kurzer [31] during his Master Thesis at KTH.

Both of these approaches did not represent the obstacles explicitly but by using

potential barrier functions or the Voronoi Field [29] which are not suitable for

high non-holonomic system such as the single-trailer truck, as it is not clear how

the derivatives of the Voronoi Filed are calculated with respect the 𝜃 and 𝛼 of

the truck. Some ways to include the obstacles into a Non Linear Optimization

(28)

framework are presented in [26] making use of Mixed Integer Representation but

it is unclear of its speediness to be run online. A more suitable way is presented

in [58] where the whole autonomous parking problem, including the obstacles,

for a car is formulated as a smooth non-convex optimization problem solved

making use of interior point methods [10]. Their algorithm is first initialized

using Hybrid A*, which computes a coarse trajectory using a simplified vehicle

model and by discretizing the state-input space, and after an optimization-based

collision avoidance algorithm generates smooth, collision-free, and dynamically

feasible path. Their Hybrid A* approach uses Reeds-Shepp paths to connect the

different nodes and this cannot be applied for the truck and trailer system as it

produces paths that violate the kinematic constraints of the system.

(29)

4 Optimization Algorithms

In this section all the different algorithms developed during this thesis are explained in detail. All these methods have in common that they do not start their search from scratch but from a collision free kinematically feasible solution. This may have been obtained through the methods presented in Section 2.3 and 2.5, which may not be very efficient in terms of changes of directions, steering effort or distance covered. Consequently, it is not the focus of this thesis to obtain better paths from the beginning, let’s say by applying state-of-the-art random sampling approaches, but to optimize on top of any given solution.

4.1 Shortcut Algorithm

The algorithm expects as an input a discretized path where the different parent nodes from the random-sampling methods are known in advance. Based on that, connections between these nodes are tested where the objective is to remove unnecessary nodes [20][15], as depicted in Fig. 4.1. In order to achieve that the

Figure 4.1: The short-cut algorithm applied to a suboptimal path (black). All the unnecessary nodes are eliminated, resulting in the optimized path (green) that has smaller cost than the original one.

algorithm does two iterations in which the first one (Iteration A) starts in the

starting node and the other one (Iteration B) starts in the end node. A connection

using the exact local planner described in Section 2.5 is intended from 𝐴 to 𝐵 or

(30)

vice-versa, using the Time Reversibility principle explained in [25]. If the local planner does not provide a collision free solution, Iteration B decrements one node and the procedure gets repeated until a shortcut can be done or Iteration B reaches parent node 𝑁

𝐴

+ 1, being 𝑁

𝐴

the current node of Iteration A. Next step Iteration A will move to the next parent node while Iteration B will start back again from the end node.

Figure 4.2 depicts two different iterations of the algorithm, one in which the shortcut is not successful and one that it is. In case the connection between two non-adjacent nodes provided by the local planner is collision free, it needs to be determined if the new sub-path is better or worse than the original one. Many features from the curve can be taken into account but for simplicity reasons the new sub-path is considered better if the distance covered by the trailer is shorter.

As can be seen from Fig. 4.2b, the orange line connecting two parent nodes is shorter than the blue line connecting those same parent nodes. Other more complex heuristics can be used that account for 𝜅, 𝜙 or jerk. Fig. 4.3 compares the curvature and steering of the resulting path and the original one. For the test scenario chosen both features are smoothed, its absolute average value gets smaller, but it could be the opposite, as none of this parameters are included into the cost function.

If the new sub-path is better based on the cost function the original one will be substituted by the new one and the algorithm will continue. Algorithm 1 depicts the pseudocode for the shortcut algorithm.

Another variant of this algorithm is to randomly chose a interval from the original

path and try to connect the initial point of the interval with the final point using the

same exact local planner as before (Algorithm 2). The size of the interval chosen

as well as the number of iterations can be tuned by the user and they may depend

on the size of the original path (the more points it has the more chances of finding

shortcuts in bigger intervals) or the amount of time available to optimize the path

(if little time fewer iterations should be used). Figure 4.4 depicts two different

iterations of the non-deterministic shortcut algorithm. Another pre-processing

step that can be done if time is limited is to set a threshold in the curvature of

the trailer 𝜅

𝑓

so that if max(|𝜅

𝑓

|) from the interval is smaller than a threshold

manually chosen an optimization will not be performed, as the current interval is

(31)

(a) A connection between parent node 0 and parent node 45 is done (orange line) but it is not successful as the hitch angle required to execute the proposed solution is higher than 𝛼𝑚𝑎𝑥.

(b) A connection between parent node 0 and parent node 28 is successfully done (orange line) due to there is no collision and the cost function of the new path is less than the one from the original path.

Figure 4.2: Plotting of two different iterations of the deterministic shortcut

algorithm for the optimization of the path of the trailer.

(32)

(a) Curvature comparison between original path and result obtained at Fig.

4.2b.

(b) Steering comparison between original path and result obtained at Fig. 4.2b.

Figure 4.3: Plotting of two different iterations of the non-deterministic shortcut

algorithm for M=1000 samples.

(33)

considered good enough.

Algorithm 1 Deterministic Shortcut Algorithm

1:

procedure Shortcut(path) ▷ path is the feasible path provided by RRT methods

2:

N ← Number of parent nodes from RRT

3:

for 𝑖 = 0 ∶ 𝑁 − 1 do

4:

for 𝑗 = 𝑁 − 1 ∶ 𝑖 do

5:

success, newPath := Steer(𝑖, 𝑗) ▷ Steers from node 𝑖 to node 𝑗

6:

if success = TRUE then

7:

costOld := cost(path

𝑖

→ path

𝑗

) ▷ Cost current path going from 𝑖 to 𝑗

8:

costNew := cost(newPath)

9:

if costNew < costOld then

10:

path[𝑖 → 𝑗] = newPath ▷ The path going from 𝑖 to 𝑗 is updated

11:

𝑖 = 𝑗

12:

break

13:

end if

14:

end if

15:

end for

16:

end for

17:

return path ▷ The updated path after shortcutting

18:

end procedure

(34)

(a) The shortcut intended is not successful because the segment length is higher than the original sub-path and it collides with an obstacle.

(b) The shortcut try is successful due to there is no collision and the length of the new path is less than the one from the original path.

Figure 4.4: Plotting of two different iterations of the non-deterministic shortcut

algorithm for M=1000 samples.

(35)

Algorithm 2 Non-Deterministic Shortcut Algorithm

1:

procedure Shortcut(path) ▷ does not matter how the path was obtained

2:

M ← Number of iterations to be performed

3:

L ← Length of the intervals to consider

4:

for 𝑖 = 0 ∶ 𝑀 do

5:

indexStart, indexEnd := samplePoint(path, L)

6:

success, newPath := Steer(path

indexStart

, path

indexEnd

)

7:

if success = TRUE then

8:

costOld := costPath(path

indexStart

→ path

indexEnd

)

9:

costNew := costPath(newPath)

10:

if costNew < costOld then

11:

path[indexStart → indexEnd] = newPath

12:

end if

13:

end if

14:

end for

15:

return path ▷ The updated path after shortcutting

16:

end procedure

(36)

4.2 Coordinate Points Optimization

Another approach taken to deal with the optimization problem of the path subject to constraints is explained in [13], which will be referred as Coordinate Points Optimization (CPO). There, a sub-optimal path is first obtained using A*, see Section 2.3, and afterwards it is optimized. Their original path is considered drivable but suffers from unnatural turns that require a high steering effort. In order to optimize the path the original set of < 𝑥

𝑖

, 𝑦

𝑖

> is provided and this set of points will be moved during the optimization based on the gradient of the cost function in order to shorten and smooth the final path. In the original paper they use conjugate-gradient (CG) descent as numerical-optimization technique but in the presented implementation better results were obtained applying Broyden–Fletcher–Goldfarb–Shanno (BFGS) [17] algorithm. The cost function used is:

minimize

x0,…,x𝑁

𝐽 = 𝑤

𝑜

𝑁

𝑖=0

𝜎

0

(|x

𝑖

− o

𝑖

| − 𝑑

𝑚𝑎𝑥

) + 𝑤

𝑘

𝑁−1

𝑖=1

𝜎

𝑘

( Δ𝜙

𝑖

|Δx

𝑖

| − 𝜅

𝑚𝑎𝑥

) + 𝑤

𝑠

𝑁−1

𝑖=1

(Δx

𝑖+1

− Δx

𝑖

)

2

+ 𝑤

𝑔

𝑁−1

𝑖=2

𝜎

𝑝

( Δ𝜙

𝑖

|Δx

𝑖

| − Δ𝜙

𝑖−1

|Δx

𝑖−1

| )

(17)

where 𝜎

0

, 𝜎

𝑘

and 𝜎

𝑝

are quadratic penalty functions,

|∆x∆𝜙𝑖

𝑖|

= 𝜅

𝑖

the curvature at index 𝑖 and Δ𝜙

𝑖

= |𝑡𝑎𝑛

−1

(

∆y∆x𝑖+1

𝑖+1

) − 𝑡𝑎𝑛

−1

(

∆y∆x𝑖

𝑖

)| the displacement vector at index 𝑖.

The first cost term accounts for the distance of the vehicle to the closest obstacle, the second term forces the curvature to be lower than 𝜅

𝑚𝑎𝑥

, the third term tries to reduce the distance between points and the last term penalizes the sharpness of the path. The first three terms were exactly taken from the original cost function used in [13], in order to optimize the path for a car, but as the truck and trailer path requires continuity in the curvature the forth term is required. The distance to the obstacles is accounted by creating a discretized cost map where the smaller the distance between a cell and its closest obstacle the bigger the cost. Cells inside obstacles are assigned a very big cost. The function used to determine the cost of each cell is:

cost cell = 1000(1 − 𝑑

𝑜𝑏𝑠𝑡𝑎𝑐𝑙𝑒

𝑑

𝑚𝑎𝑥

) (18)

(37)

if 𝑑

𝑜𝑏𝑠𝑡𝑎𝑐𝑙𝑒

< 𝑑

𝑚𝑎𝑥

. As it is required to have the derivatives with respect to 𝑥 and 𝑦 for all cells in the map, these are the unit vectors pointing away from the closest obstacle of each cell. As can be seen, the distance function considered in the cost map is the Euclidean distance, which does not consider neither the orientation of the truck nor the hitch angle, which may induce errors in finding the real closest obstacle. Figure 4.5 depicts the cost map for the benchmark map.

Figure 4.5: Cost inside the obstacles is set to 10

3

. Each cell from the map has an area of 0.25 𝑚

2

.

Including the distance to the obstacles in the cost function does not guarantee that the resulting path will be collision-free and that is why the authors after each optimization run, check if all the points are collision free and in case some of them are not, these are anchored, so they will not be modified in successive optimization runs. Moreover, the BFGS algorithm requires an analytical representation of the derivatives with respect to 𝑥 and 𝑦 of the three cost terms in order to decide in which direction are the points updated. The derivatives for the first three terms in Eq. (17) can be directly found in [13] while the derivative of the last term follows directly from the derivative of the second term. Results for the same path used in Section 4.1 are presented in Fig. 4.6.

The main two drawbacks of this procedure are:

• No kinematics are considered, so the resulting path is not guaranteed to be

(38)

(a) Comparison between original path and new path following procedure in [13].

(b) Comparison between original curvature of the trailer 𝜅 and the new curvature obtained using CPO.

Figure 4.6: Results for the path optimization method presented. As can be seen

the point of change of direction does not change.

(39)

able to be followed. As the resulting path is just a set of < 𝑥

𝑡

, 𝑦

𝑡

> reverse engineering is needed to obtain the set of steering angles or curvatures (depending what is used as control input) required to reach each of the coordinate points. Both controls are obtained from Eq. (57). As can be seen in Fig 4.6b obtaining the curvature from the coordinates induces to numerical issues that make the curvature profile noisy. In order to solve this problem a controller to follow those points might be used.

• No optimization of the complete manoeuvre is possible as the direction of the vehicle is not included in the algorithm. We can just optimize the forward and the backward direction separately, without modifying the point of change of direction, as can be seen in Fig. 4.6. Moreover, the hitch angle 𝛼 from the truck and trailer system needs to be considered in the point of change of direction. Forward and backward segments need to meet with a common curvature 𝜅

𝑐

so that the path can be followed in reality. This problem does not occur in car type vehicles as they can instantly steer.

In order to tackle this problem, some nodes at the beginning and at the

end of each unidirectional segment are anchored during the optimization

procedure so that they will not move. Considering the curvature formula

from Eq. (57) three coordinates (x

𝑖

, x

𝑖+1

, x

𝑖+2

) are required to define the

curvature at index 𝑖, so considering that the original path is drivable, if two

samples are anchored at the beginning and at the end of each interval the

curvature in the extreme points should remain the same that the one from

the original path.

(40)

4.3 DDP and iLQR

Differential Dynamic Programming (DDP) [42] is a shooting method which belongs to the family of indirect methods (Section 2.2.1). As has been introduced before indirect methods just parametrize the controls of the system and by forward integration, also called shooting, the corresponding states are obtained.

The term indirect comes from obtaining the states indirectly from the controls.

4.3.1 Basics of DDP

DDP is a trajectory optimization method which uses locally linearized and discretized dynamics of the kinematics of the system and the cost function. First of all the discrete system dynamics and cost function are represented as:

x

𝑖+1

= 𝑓(x

𝑖

, u

𝑖

) 𝐽

𝑜

(x

0

, U) =

𝑁−1

𝑖=0

𝑙(x

𝑖

, u

𝑖

) + 𝑙

𝑓

(x

𝑁

)

(19)

where 𝑖 ∈ [0, 𝑁 − 1] being 𝑁 the size of the control vector that is applied, known as horizon. 𝐽

𝑜

(x

0

, U) represents the cost function of starting at state x

0

and apply the set of controls U = {u

0

, ..., u

𝑁−1

}. The reader may notice that a special cost for the last state is included, as DDP does not accept hard constraints in the states, it may be reasonable to set a high cost for the deviation between the obtained last pose and the desired one. DDP works by iteratively computing a forward pass by integrating the dynamics in Eq. (19) for the current U in order to obtain the resulting state vector, followed by a backward pass which estimates the value function and dynamics for each (x, u) in the state-space and control signal trajectories [55]. This step uses Bellman’s Principle of Optimality [5] to obtain the value function. The value function of a state for a timestep i is the optimal cost-to-go from that state:

𝑉

𝑖

(x) = min

U𝑖

𝐽 (x, U

𝑖

) (20)

(41)

(a) Comparison between original path and new path obtained using DDP.

(b) Comparison between original curvature of the trailer 𝜅 and new curvature obtained using DDP.

Figure 4.7: Comparison between benchmark trajectory and the optimization

performed using DDP. 𝕎

𝑎

= 𝑑𝑖𝑎𝑔(1, 0.3), 𝕎

𝑏

= 𝕎

𝑐

= 𝐼, 𝕎

𝑑

= 10

−5

.

References

Related documents

The five stakeholders selected have been Brighton and Hove City Council, Our Brighton Hippodrome and Brighton Hippodrome Community Interest Company (CIC), The

Slutsatsen som dras är att en variant av discrete differential evolution presterar betydligt bättre än en generisk genetisk algoritm på detta problem, men inget generellt antagande

Therefore, the integration of branded retail apps in physical stores in relation to the smart customer experience and purchase decision process constitutes the

If you bike every day to the lab, you reduce your emissions with 1200 kg/year, compared to a petrol-driven car (10 km one way).. DID

relevant theories that pertained to the problem we wanted to study. Many facets of the focus area have been increasingly difficult to acquire information about. We have read

In this section the statistical estimation and detection algorithms that in this paper are used to solve the problem of detection and discrimination of double talk and change in

Looking at the dynamic design process Stolterman and Löwgren (2004) presents it could be said that we where still in- between the vision and operative image at

A Compositional Framework for End-to-End Path Delay Calculation of Automotive Systems under Different Path Semantics.. Nico Feiertag and