
Master of Science Thesis in Electrical Engineering

Department of Electrical Engineering, Linköping University, 2020

Path Following Using Gain

Scheduled LQR Control

with applications to a labyrinth game

Emil Frid and Fredrik Nilsson


Master of Science Thesis in Electrical Engineering

Path Following Using Gain Scheduled LQR Control: with applications to a labyrinth game

Emil Frid and Fredrik Nilsson
LiTH-ISY-EX–20/5305–SE

Supervisor: Doctoral student Fredrik Ljungberg, ISY, Linköping University
Site Manager Rikard Hagman, Combine Control Systems, Linköping

Examiner: Associate Professor Johan Löfberg, ISY, Linköping University

Division of Automatic Control
Department of Electrical Engineering
Linköping University
SE-581 83 Linköping, Sweden


Abstract

This master’s thesis aims to make the BRIO Labyrinth Game autonomous, with the main focus on the development of a path following controller. A test-bench system is built using a modern edition of the classic game with the addition of a Raspberry Pi, a camera and two servos. A mathematical model of the ball and plate system is derived to be used in model based controllers. A method of using path projection on a cubic spline interpolated path to derive the reference states is explained. After that, three path following controllers are presented: a modified LQR, a Gain Scheduled LQR and a Gain Scheduled LQR with obstacle avoidance. The performances of these controllers are compared on an easy and a hard labyrinth level, both with respect to the ability to follow the reference path and with respect to the success rate of controlling the ball from start to finish without falling into any hole. All three controllers achieved a success rate over 90 % on the easy level. On the hard level the Gain Scheduled LQR achieved the highest success rate, 78.7 %, while the modified LQR achieved the lowest deviation from the reference path. The Gain Scheduled LQR with obstacle avoidance performed the worst in both regards. Overall, the results are promising and some insights gained when designing the controllers can possibly be useful for development of controllers in other applications as well.


Acknowledgments

Firstly, we would like to thank Rikard Hagman for the opportunity to work with such a fun system and for letting us freely choose the direction of the thesis. We are also thankful for the nice office space he lent us and for inviting us to all company activities.

We would like to thank our supervisor Fredrik Ljungberg and examiner Johan Löfberg for helping us with theory questions and showing interest in the project by coming up with their own ideas and suggestions.

We would also like to thank Anders Narbrink from Linköping University for helping us with the construction of the system and letting us use the woodwork classroom.

Lastly, we want to thank Erik Frisk for the course Autonomous Vehicles - Planning, Control, and Learning Systems, which inspired us to use a spline path for generating reference states to the controller.

Linköping, June 2020 Emil Frid and Fredrik Nilsson


Contents

Notation

1 Introduction
1.1 Aim
1.2 Problem Specification
1.3 Delimitation
1.4 Related Research

2 Theoretical Background
2.1 Cubic Spline
2.2 Kalman Filter
2.3 Linear-Quadratic Controller

3 Implementation
3.1 Test Platform
3.1.1 Hardware and Software
3.1.2 Construction
3.1.3 Electronic Schematic
3.2 Image Processing
3.3 System Model
3.3.1 Model Derivation
3.3.2 Friction Modeling
3.4 Spline Path
3.5 Path Following Controller
3.5.1 LQR
3.5.2 Gain Scheduling
3.5.3 Obstacle Avoidance
3.5.4 Integration of θref

4 Results
4.1 Friction Modeling
4.2 Controller Performances
4.2.1 Easy Labyrinth
4.2.2 Hard Labyrinth

5 Discussion
5.1 Results
5.1.1 Friction Modeling
5.1.2 Controller Performances
5.2 Method
5.2.1 Weaknesses
5.2.2 Reliability and Validity

6 Conclusions
6.1 Spline Interpolated Path
6.2 Control Algorithms
6.3 Future Work

A Controller Tuning
A.1 LQR
A.2 GS
A.3 GSOA

B Failed Attempts
B.1 LQR
B.2 GS
B.3 GSOA

C Estimated Success Rate of GSOA Without Hole 12


Notation

Abbreviations

Abbreviation Meaning

GS Gain Scheduling

GSOA Gain Scheduling with Obstacle Avoidance

IPC Inter Process Communication

LMI Linear Matrix Inequalities

LPV Linear Parameter-Varying

LQR Linear-Quadratic Regulator

OpenCV Open Source Computer Vision Library

PID Proportional, Integral, Differential (regulator)

PWM Pulse-width Modulation

RMSE Root Mean Square Error

ROS Robot Operating System

RPi Raspberry Pi 3B+

TS Takagi-Sugeno

Definitions

Notation Meaning

ẋ, ẍ First- and second-order derivative of x with respect to time

f′(x), f″(x) First- and second-order derivative of f(x)


1

Introduction

Since 1946 the BRIO Labyrinth Game has challenged players with the task of balancing a steel ball from start to finish, without falling into holes. This problem can be simplified to the classical control problem called ’ball and plate’ [1], which is a nonlinear multivariable system often used to evaluate control methods. The addition of both walls and holes makes it an extra challenging problem that requires advanced algorithms to solve autonomously.

This project is a collaboration with Combine Control Systems AB and serves as a continuation of a previous master’s thesis [2], which managed to solve the labyrinth using both proportional, integral, differential (PID) control and a linear-quadratic regulator (LQR). The aim is to improve upon this work by integrating new hardware and implementing more advanced methods for solving the labyrinth.

1.1

Aim

The aim of this master’s thesis is to develop an autonomous BRIO Labyrinth Game. The product can be used by Combine Control Systems at business fairs to illustrate a controller applied to a system which is well known to be challenging. Further, the development process will yield knowledge about applications of robust control methods for nonlinear systems.


1.2

Problem Specification

The main focus of the project is to investigate a control method for path following in a labyrinth game. A camera will be used for ball tracking and servos for controlling the angle of the board. A computer vision system and a control system will be developed and then implemented on a Raspberry Pi 3B+ (RPi) which acts as the computational unit for the system.

When solving the main problem the aim is to examine and answer the following questions:

1. Can path following using a spline path together with a state feedback controller perform well enough to control the ball from start to finish without falling into any hole? - Instead of designing the controller to reach a position and treating the path as a set of discrete reference points, as was done in [2], it will take both a reference position and a reference velocity derived from path projection on a cubic spline interpolation of the path.

2. Can a gain scheduled LQR increase the control performance of this system compared to a normal LQR? - The system has conflicting state references: the controller wants to position the ball on the closest point of the path, but at the same time it wants the ball to have a positive velocity in the path direction. This may be hard to achieve with a normal LQR but could be handled by a gain scheduled LQR. Both methods have been implemented and compared to each other.

3. Can the addition of a weighted hole avoidance to the controller lower the risk of failing the labyrinth? - Working in parallel with the gain scheduled LQR, an obstacle avoidance algorithm can be used to change the controller output in order to steer away from holes. The question is if this will increase the likelihood of completing the labyrinth, or if it will interfere too much with the path following part of the controller.

1.3

Delimitation

Since the focus of this project was on control theory, a computer vision programming library called Open Source Computer Vision Library (OpenCV) was used for the ball tracking. The computer vision is intended to operate only in a well lit environment.

This project does not compensate for curved labyrinth plates, which proved to be challenging in [2]. To make sure that the plates were straight, they were constructed from a sturdy flat material.

The path that the ball is expected to follow was predefined for each labyrinth and closely resembles the paths drawn on the original BRIO labyrinths. That is, no


path planning algorithm was used to determine the optimal, or easiest path to follow.

Online computations were limited by the computing power of the RPi and the delay before the servos adjust to the requested angle was not compensated for. Furthermore, the ball is always assumed to have contact with the board and no slipping was modeled. Other simplifications in the model were also made (see Section 3.3).

1.4

Related Research

This project is a continuation of a master’s thesis carried out in 2019 [2] on behalf of Combine Control Systems AB. In that project the ball was successfully tracked, using computer vision, and controlled through the labyrinth using both PID and LQR control. A Kalman filter was used to estimate the position and speed of the ball.

Unlike [2], this thesis does not treat the path as a set of discrete reference points. The reference points are instead interpolated using cubic splines, from which a continuous reference position and heading angle can be derived. This method is based on an assignment [3] given in the course Autonomous Vehicles - Planning, Control, and Learning Systems at Linköping University. In the assignment, the path following was treated as a state stabilization problem based on a path deviation dynamics model. Another widely used path stabilization technique is Pure Pursuit [4].

In [5] the BRIO Labyrinth was solved using machine learning. A learning controller based on Locally Weighted Projection Regression (LWPR) was combined with a PID controller in order to generate training data online using a platform without walls or holes. The LWPR algorithm was then trained offline and surpassed the performance of the PID, but was not able to complete the labyrinth with walls and holes.

As mentioned earlier, the labyrinth system can be simplified to a ball and plate system. This problem is well studied and mathematical models of the system have been derived either by using the Lagrangian method [6] or the Newton-Euler method [1]. When it comes to controlling the system there have been a number of different approaches.

Different kinds of fuzzy control have been tested for the ball and plate system. In [7] a fuzzy gain scheduling controller was designed to stabilize and follow a reference trajectory. Further, an LQR algorithm with fuzzy Takagi-Sugeno (TS) design was used in a two level system [8]. The first level controlled the ball towards the goal and the second handled obstacle avoidance by scaling the control signal with a risk factor determined by TS type fuzzy rules based on circular high and low risk zones surrounding the obstacles.

Scale factors based on position deviation [9] have proven to increase the accuracy of the LQR algorithm for the ball and plate system. When the deviation is small, the scale factor is large and vice versa.

Other methods for solving the ball and plate system include sliding mode control [10–12], model predictive control [13] and neural networks [14–16].

Gain scheduling is a central part of this project. It means varying the controller coefficients according to operating conditions called scheduling signals. In [17] extensive research on gain scheduling is presented. Specifically, one section reviews how H∞ optimal controllers based on linear matrix inequalities (LMI) can be adapted to compute gain-scheduled controllers for linear parameter-varying (LPV) models. This is taken a step further by describing the LPV model as a dynamic TS fuzzy model in [18, 19]. They introduce a method to design a gain scheduled state feedback controller for these models using LMI, which guarantees stability and minimizes an upper bound on a quadratic performance measure.


2

Theoretical Background

This chapter presents some underlying theory needed for the implementation in chapter 3. It includes introductions to cubic spline interpolation, the Kalman filter and LQR.

2.1

Cubic Spline

A spline is a type of function defined piecewise by polynomials and is commonly used for solving interpolation problems [20]. Interpolation of $f(x)$ through every point $a = x_1 < x_2 < \dots < x_n = b$ is achieved if the spline, $s(x)$, is composed of $n-1$ polynomials, $s_i(x)$, such that $s(x_i) = f(x_i)$ for $i = 1, 2, \dots, n$.

Splines may consist of polynomials of different orders. Cubic splines, which are commonly used to obtain smooth curves, use third-degree polynomials and require that $s$, $s'$ and $s''$ are continuous [20]. Both $s$ and $s'$ are continuous if the spline satisfies the following equations for $i = 1, \dots, n-1$

$$s_i(x_i) = y_i \quad \text{and} \quad s_i(x_{i+1}) = y_{i+1} \tag{2.1}$$

$$s_i'(x_i) = y_i' \quad \text{and} \quad s_i'(x_{i+1}) = y_{i+1}' \tag{2.2}$$

where $y_i = f(x_i)$. The second-order derivative is continuous if

$$s_{i-1}''(x_i) = s_i''(x_i), \quad i = 2, \dots, n-1 \tag{2.3}$$

One way of expressing the cubic polynomials is

$$s_i(x) = a_i + b_i\left(\frac{x - x_i}{h_i}\right) + c_i\left(\frac{x - x_i}{h_i}\right)^2 + d_i\left(\frac{x - x_i}{h_i}\right)^3 \tag{2.4}$$


where

$$h_i = x_{i+1} - x_i, \qquad f_i = f(x_i)$$

and $i = 1, 2, \dots, n-1$. Here $a_i$, $b_i$, $c_i$ and $d_i$ are parameters that have to be chosen so that (2.1), (2.2) and (2.3) are satisfied. They satisfy these requirements if they are set to

$$a_i = f_i \tag{2.5a}$$
$$b_i = h_i y_i' \tag{2.5b}$$
$$c_i = 3(f_{i+1} - f_i) - h_i(2y_i' + y_{i+1}') \tag{2.5c}$$
$$d_i = 2(f_i - f_{i+1}) + h_i(y_i' + y_{i+1}') \tag{2.5d}$$

where $y_i'$ is the solution to the following system of equations

$$h_i y_{i-1}' + 2(h_i + h_{i-1}) y_i' + h_{i-1} y_{i+1}' = 3\left(h_i \frac{f_i - f_{i-1}}{h_{i-1}} + h_{i-1} \frac{f_{i+1} - f_i}{h_i}\right), \quad i = 2, \dots, n-1$$

By adding the following conditions at the endpoints of the interval $[a, b]$

$$s_1''(x_1) = \frac{2c_1}{h_1^2} = 0, \qquad s_{n-1}''(x_n) = \frac{2c_{n-1} + 6d_{n-1}}{h_{n-1}^2} = 0$$

a natural cubic spline is obtained. This means that the spline continues as a straight line outside the interval. In combination with (2.5c) and (2.5d) the conditions for a natural cubic spline are

$$\begin{cases} 2y_1' + y_2' = 3\,\dfrac{f_2 - f_1}{h_1} \\[4pt] y_{n-1}' + 2y_n' = 3\,\dfrac{f_n - f_{n-1}}{h_{n-1}} \end{cases}$$
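The construction above can be sketched in plain Python: solve the tridiagonal system for the knot slopes $y_i'$ (here with the Thomas algorithm), then form the coefficients of (2.5). The function name and structure are illustrative, not the thesis implementation:

```python
def natural_cubic_spline(xs, fs):
    """Natural cubic spline through the points (xs[i], fs[i]).

    Solves the tridiagonal system for the knot slopes y'_i and
    returns a callable s(x) built from the coefficients in (2.5).
    """
    n = len(xs)
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]

    # Tridiagonal system for the slopes, with natural end conditions.
    low, diag, up, rhs = [0.0] * n, [0.0] * n, [0.0] * n, [0.0] * n
    diag[0], up[0], rhs[0] = 2.0, 1.0, 3.0 * (fs[1] - fs[0]) / h[0]
    for i in range(1, n - 1):
        low[i] = h[i]
        diag[i] = 2.0 * (h[i] + h[i - 1])
        up[i] = h[i - 1]
        rhs[i] = 3.0 * (h[i] * (fs[i] - fs[i - 1]) / h[i - 1]
                        + h[i - 1] * (fs[i + 1] - fs[i]) / h[i])
    low[n - 1], diag[n - 1], rhs[n - 1] = 1.0, 2.0, 3.0 * (fs[-1] - fs[-2]) / h[-1]

    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n):
        w = low[i] / diag[i - 1]
        diag[i] -= w * up[i - 1]
        rhs[i] -= w * rhs[i - 1]
    yp = [0.0] * n
    yp[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        yp[i] = (rhs[i] - up[i] * yp[i + 1]) / diag[i]

    def s(x):
        # Locate the interval containing x, then evaluate (2.4) with (2.5).
        i = n - 2
        for j in range(n - 1):
            if x <= xs[j + 1]:
                i = j
                break
        a = fs[i]
        b = h[i] * yp[i]
        c = 3.0 * (fs[i + 1] - fs[i]) - h[i] * (2.0 * yp[i] + yp[i + 1])
        d = 2.0 * (fs[i] - fs[i + 1]) + h[i] * (yp[i] + yp[i + 1])
        t = (x - xs[i]) / h[i]
        return a + b * t + c * t * t + d * t * t * t

    return s
```

Because the end conditions force $s'' = 0$ at the boundary, the spline reproduces straight-line data exactly, which is a convenient sanity check.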

2.2

Kalman Filter

A Kalman filter is an algorithm used to estimate unknown states from measurements. It does this recursively by first predicting what the states should be, using a so-called motion model, and then adjusting these estimates based on measurements. These estimates can be more accurate than estimates that are only based on measurements [21].

The relation between inputs, u, states, x, and measurements, y, is described by a discrete-time linear state space model

$$x_{k+1} = Fx_k + Gu_k + w_k \tag{2.6a}$$
$$y_k = Hx_k + v_k \tag{2.6b}$$


where $u$ is the control signal, and the process noise $w_k \sim N(0, Q_k)$ and measurement noise $v_k \sim N(0, R_k)$ are normally distributed and additive. The matrices $F$ and $G$ are called the motion model and $H$ is called the measurement model.

The Kalman filter algorithm is shown in Algorithm 1 and is also described in [21].

Algorithm 1: Kalman filter

The Kalman filter for a discrete-time state space model (2.6) consists of two steps. The initial state estimate $\hat{x}_{1|0}$ and covariance $P_{1|0}$ are used in the first iteration.

1. Time update
$$\hat{x}_{k|k-1} = F\hat{x}_{k-1|k-1} + Gu_k$$
$$P_{k|k-1} = FP_{k-1|k-1}F^T + Q$$

2. Measurement update
$$\epsilon_k = y_k - H\hat{x}_{k|k-1}$$
$$S_k = HP_{k|k-1}H^T + R$$
$$K_k = P_{k|k-1}H^T S_k^{-1}$$
$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k\epsilon_k$$
$$P_{k|k} = P_{k|k-1} - K_k H P_{k|k-1}$$

Here the covariance matrices Q and R are non-time-varying estimates of Cov(wk) and Cov(vk).

The covariances of the process noise, $Q$, and measurement noise, $R$, are usually tuned to achieve desired characteristics of the Kalman gain $K$. State estimates are weighted by the Kalman gain to different degrees by varying the coefficients in $Q$ or $R$ corresponding to the covariance of that state.

A high Kalman gain (close to one) leads to faster, but more noise-sensitive estimates. Lower gains (close to zero) give less responsive and less noise-sensitive estimates. The proportional relationship between the Kalman gain and the covariance matrices is

$$K \propto \frac{\|Q\|}{\|R\|}$$
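As a minimal illustration of Algorithm 1, a single iteration for a scalar state is sketched below; the matrix case replaces the products with matrix multiplications and the division with $S_k^{-1}$. The function and argument names are illustrative, not the thesis code:

```python
def kalman_step(x_hat, P, u, y, F, G, H, Q, R):
    """One iteration of the Kalman filter for a scalar state."""
    # 1. Time update: propagate the estimate through the motion model.
    x_pred = F * x_hat + G * u
    P_pred = F * P * F + Q
    # 2. Measurement update: correct with the innovation eps.
    eps = y - H * x_pred
    S = H * P_pred * H + R          # innovation covariance
    K = P_pred * H / S              # Kalman gain
    x_new = x_pred + K * eps
    P_new = P_pred - K * H * P_pred
    return x_new, P_new
```

With $R$ small relative to $Q$ the gain approaches one and the estimate tracks the measurement closely; with $R$ large the gain approaches zero and the measurement is largely ignored, matching the relationship above.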


2.3

Linear-Quadratic Controller

The LQR described in this section is the same as in [22]. The discrete-time LQR aims to minimize the criterion

$$\min\left(\|e\|^2_{Q_1} + \|u - u^{(r)}\|^2_{Q_2}\right) = \tag{2.7}$$
$$\min \sum_k e_k^T Q_1 e_k + (u_k - u^{(r)})^T Q_2 (u_k - u^{(r)}) \tag{2.8}$$

where $e = z - r$ is the deviation between controlled states, $z$, and reference states $r$. Furthermore $Q_1$ and $Q_2$ are positive semidefinite matrices and $u^{(r)}$ is the constant system input signal which in stationarity and without noise would yield $z = r$. The criterion is to minimize a weighting of the size of the state deviation $e$ and the size of the system input signal $u$. The subtraction of $u^{(r)}$ removes the state bias that would occur if the reference state $r$ requires $u_k \neq 0$.

The discrete-time system model is the same as for the Kalman filter (2.6), but instead of using all states in $x_k$, a matrix $M$ is used to select which states should be controlled

$$z_k = Mx_k \tag{2.9}$$

If $(F, G)$ is controllable the LQR control law is given by

$$u_k = -L\hat{x}_{k|k} + L_r r_k \tag{2.10}$$

where $\hat{x}_{k|k}$ is the Kalman filter estimate and

$$L = (G^T S G + Q_2)^{-1} G^T S F \tag{2.11}$$
$$L_r = \left(M(I + GL - F)^{-1} G\right)^{-1} \tag{2.12}$$

where $S$ is the unique, positive semidefinite, symmetric solution to the matrix equation

$$S = F^T S F + M^T Q_1 M - F^T S G (G^T S G + Q_2)^{-1} G^T S F$$
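One practical way to obtain $S$ is to iterate the Riccati equation above until it reaches a fixed point. The scalar sketch below does exactly that and returns the gain $L$ of (2.11); the function name is illustrative, and value iteration here stands in for whatever Riccati solver an implementation might actually use:

```python
def dlqr_scalar(F, G, Q1, Q2, M=1.0, iters=200):
    """Fixed-point iteration of the discrete-time Riccati equation
    (scalar case); returns the feedback gain L of (2.11) and S."""
    S = M * Q1 * M
    for _ in range(iters):
        S = F * S * F + M * Q1 * M \
            - (F * S * G) * (G * S * F) / (G * S * G + Q2)
    L = (G * S * F) / (G * S * G + Q2)
    return L, S
```

For $F = G = Q_1 = Q_2 = M = 1$ the fixed point satisfies $S^2 = S + 1$, i.e. $S = (1+\sqrt{5})/2 \approx 1.618$ and $L = 1/S \approx 0.618$, so the closed loop $F - GL \approx 0.382$ is stable.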


3

Implementation

This chapter presents what was developed and implemented during this thesis.

3.1

Test Platform

The test platform was constructed from a BRIO labyrinth game. This section will list and describe the components the test platform consists of and how it was constructed.

3.1.1

Hardware and Software

The system components are:

• BRIO labyrinth game
• Easy and hard labyrinth plate
• 2 x Futaba S9154 Digital High Speed Servo
• Adafruit 16-Channel 12-bit PWM/Servo Driver - PCA9685
• 4.8V 2500mAh NiMH Instant cub (4 cell AA)
• Raspberry Pi 3B+
• Raspberry Pi Camera Module v2.1


The RPi is the only computational unit of the system and is in charge of running the programs. The Adafruit Servo Driver is used to create and output the pulse-width modulated (PWM) signals to the two servos. The NiMH battery is only used to power the servos since they are high torque and high speed and therefore demand a steady power supply. This means that the RPi still has to be plugged in to a power outlet.

The software consists of three programs that communicate with each other via what is referred to as topics in Robot Operating System (ROS) [23]. The first program handles the ball tracking; it was written in Python and uses the image processing library OpenCV [24]. The reason the program was written in Python is that the master’s thesis this thesis continues used Python, and its ball tracking worked well after some small optimizations. A justification for the usage of Python in this application is that OpenCV-Python is a Python wrapper for the original OpenCV C++ implementation, which means that well written code will yield programs which are not much slower than if they were implemented in C++ [24].

The second program is the controller, which also includes the Kalman filter and the spline path implementation. This program was written in C++ and, because many matrices and matrix operations would be used, the linear algebra template library Eigen, found in [25], was installed.

The third program sends the requested angles to the Adafruit Servo Driver. The only reason this task could not be included in the controller program was that the dependencies of the Adafruit Servo Driver library, found in [26], could not be found for C++. Therefore this task had to be written in Python.

Figure 3.1 illustrates the hardware and software and how everything is connected in a diagram.

Figure 3.1: A diagram of the hardware and software. The sharp-cornered boxes illustrate hardware and the rounded boxes software.


3.1.2

Construction

Some changes had to be made to the BRIO labyrinth game so that the labyrinth plate angles could be controlled by two servos instead of the steering wheels. The steering wheels, and the metal rods they were connected to, were removed. Then the servos could be mounted inside the labyrinth box and connected to the plate via an arm with ball joints as shown in Figure 3.2. The connection points were carefully chosen so that a servo only caused a rotation about one axis.

Figure 3.2: Servo mounted to the labyrinth plate.

A camera stand was installed on the game so that the camera module could get a top down view of the whole labyrinth. Figure 3.3 shows the labyrinth game and how the camera stand was mounted to it.

Figure 3.3: The labyrinth game.

It was noticed that the available labyrinth plates, produced by BRIO, were bent. In order to make the system easier to control, two plates, one with a simple labyrinth and one with a more complex one, were constructed from a thicker wooden board and wooden rods. They are shown in Figure 3.4. A drilling machine in a woodwork classroom was used in order to drill clear-cut holes. To obtain a


good contact surface for the glue between the rods and the plate, the rods were polished flat on one side.

To make it easier to detect the ball with image processing, the steel ball was colored with hobby spray paint. Red was chosen because it was thought to be easily distinguished from the surrounding light brown wood. The ball was sprayed carefully so that no bumps were created by excessive paint.

Figure 3.4: The (a) easy and (b) hard labyrinth level.

3.1.3

Electronic Schematic

Figure 3.5 shows the connections between the electrical components. The numbered components in the figure correspond to those listed in Table 3.1. Some connections have been left out of the figure. For instance, a screen can be connected to the RPi using the HDMI port or the DSI Display Connector, while a mouse and keyboard can be connected via USB ports. The RPi should preferably be powered via the 5V micro USB port.


Table 3.1: Electronic components.

Nr. Component

1 Servo motors

2 Battery

3 Servo driver

4 Raspberry Pi

5 Raspberry Pi Camera Module

3.2

Image Processing

The image processing method was mostly the same as the one used in the previous master’s thesis [2]. One key difference is that the HSV color space was replaced with the LAB color space. This color space has three parameters: L for lightness, A for green to red and B for blue to yellow. After coloring the ball red it was possible to mask out the ball in a frame by simply adjusting one parameter in this color space. The L and B parameters included the whole spectrum while the minimum value for A, initialized as a very high value, was decreased until one filled circle of the same size as the ball was masked. This method was implemented in a script that could be run to calibrate the color masking before starting the autonomous labyrinth.

Another modification of the image processing was predictive cropping. A simple prediction of the ball position was made based on the ball’s previous position and velocity. The reason the state estimates from the Kalman filter were not used for this was that they were thought to be delayed, since they would have had to be sent via a ROS topic. This was however never tested. The frame was then cropped around the predicted position before any image processing was performed, making the computations less time consuming. By doing this the cropped image could be as small as 50x50 pixels, which is around 3.3 % of the number of pixels in the original 320x240 image.
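The predictive-cropping step can be sketched as follows. The constant-velocity prediction, the 50x50 window and the 320x240 resolution follow the description above, while the function and parameter names are illustrative, not the thesis code:

```python
def predictive_crop(pos, vel, dt, size=50, img_w=320, img_h=240):
    """Predict the ball position one frame ahead (constant velocity)
    and return a crop window (x0, y0, x1, y1) clamped to the image."""
    px = pos[0] + vel[0] * dt
    py = pos[1] + vel[1] * dt
    half = size // 2
    # Clamp so the window always lies fully inside the image.
    x0 = min(max(int(round(px)) - half, 0), img_w - size)
    y0 = min(max(int(round(py)) - half, 0), img_h - size)
    return x0, y0, x0 + size, y0 + size
```

Cropping before any color masking means the per-frame processing cost scales with the window area rather than the full frame, which is the source of the speedup described above.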

3.3

System Model

Instead of deriving a model of the ball and plate system directly, a model of the simpler ball on beam system was derived. The model was then duplicated to describe the dynamics along the x- and y-axes independently, omitting the coupling effects that appear in the 3D system.

3.3.1

Model Derivation

The 3D ball and plate system and the definition of the plate coordinate system are presented in Figure 3.6. Notice that a positive α rotation is defined as a right-handed rotation around x̂ while a positive β rotation is a left-handed rotation around ŷ. This was done so that the two models, in x̂ and ŷ, would be identical. A positive rotation means that the corresponding coordinate vector points upwards in both models.

Figure 3.6: The ball and plate system. During the ball on beam model derivation the ŷ dimension is neglected.

Figure 3.7 shows the ball on beam system and the free body diagram of the ball and Table 3.2 presents all the model parameters.

Figure 3.7: The ball on beam system and a free body diagram of the ball.

Table 3.2: Model parameters.

Parameter | Value | Description
$g$ | 9.82 m/s² | Standard gravity
$J_b$ | 1.152 × 10⁻⁷ kg·m² | Inertia of ball
$m_b$ | 8 × 10⁻³ kg | Mass of ball
$r_b$ | 6 × 10⁻³ m | Radius of ball

From Figure 3.7, Euler’s first and second laws can be written as

$$\hat{x}: \quad -m_b g \sin(\beta) - F_{fr} = m_b \ddot{x} \tag{3.1a}$$
$$\hat{z}: \quad -m_b g \cos(\beta) + N = 0 \tag{3.1b}$$
$$\hat{y}: \quad F_{fr} r_b - f_r(\dot{x}) N r_b = J_b \dot{\omega} \tag{3.1c}$$


where $f_r(\dot{x}) N r_b$ is introduced as the rolling resistance torque and $f_r(\dot{x})$ is the rolling resistance coefficient. Inserting (3.1b) and (3.1c) in (3.1a) yields

$$-m_b g \sin(\beta) - f_r(\dot{x})\, m_b g \cos(\beta) - \frac{J_b \dot{\omega}}{r_b} = m_b\ddot{x} \;\Rightarrow\; \{\dot{\omega} r_b = \ddot{x}\} \;\Rightarrow$$
$$-m_b g \sin(\beta) - f_r(\dot{x})\, m_b g \cos(\beta) - \frac{J_b \ddot{x}}{r_b^2} = m_b\ddot{x} \;\Rightarrow$$
$$m_b\left(1 + \frac{J_b}{m_b r_b^2}\right)\ddot{x} = -f_r(\dot{x})\, m_b g \cos(\beta) - m_b g \sin(\beta) \;\Rightarrow\; \left\{k_b := 1 + \frac{J_b}{m_b r_b^2}\right\} \;\Rightarrow \tag{3.2}$$
$$\ddot{x} = -f_r(\dot{x})\frac{g}{k_b}\cos(\beta) - \frac{g}{k_b}\sin(\beta) \approx \left\{|\beta| < \frac{5\pi}{180}\right\} \approx -f_r(\dot{x})\frac{g}{k_b} - \frac{g}{k_b}\beta$$

Since the system only operates with plate angles smaller than five degrees, the small angle approximations $\cos(\beta) \approx 1$, $\sin(\beta) \approx \beta$ could be applied.

The next step is to calculate the fictitious forces Ff ic which appear because the plate is a rotating frame of reference, illustrated in Figure 3.8.

Figure 3.8: The plate frame of reference in the inertial frame of reference, where $\omega_k$ and $\alpha_k$ are the angular velocity and angular acceleration of the body $k$. $r_{CA}$ is the position vector of the ball in the defined plate coordinate system and $F$ is the net force on the ball.

Fictitious forces, from [27], are calculated as

$$F_{fic} = -m_b a_C - m_b \alpha_k \times r_{CA} - m_b \omega_k \times (\omega_k \times r_{CA}) - 2 m_b \omega_k \times v_{A/k} =$$
$$= 0 + m_b \ddot{\beta}\,\hat{y} \times x\hat{x} + m_b \dot{\beta}\,\hat{y} \times (-\dot{\beta}\,\hat{y} \times x\hat{x}) + 2 m_b \dot{\beta}\,\hat{y} \times \dot{x}\hat{x} = \tag{3.3}$$
$$= -m_b \ddot{\beta} x\,\hat{z} + m_b \dot{\beta}^2 x\,\hat{x} - 2 m_b \dot{\beta}\dot{x}\,\hat{z}$$


These terms are the Euler, centrifugal and Coriolis forces respectively. They would be added to the left side of (3.1) but will instead be omitted since $\dot{\beta}$, $\ddot{\beta}$ are assumed to be small. Thereby, it is assumed that $F_{fic} \approx 0$.

From the result of (3.2) the dynamic model of the ball on beam system can be written as

$$\dot{q}_1 = q_2 \tag{3.4a}$$
$$\dot{q}_2 = -f_r(q_2)\frac{g}{k_b} - \frac{g}{k_b}\beta \tag{3.4b}$$

where $q_1 = x$ and $q_2 = \dot{x}$.

Euler’s forward method, $\dot{q}(kT_s) \approx \frac{q_{k+1} - q_k}{T_s}$ where $q_{k+1} = q(kT_s + T_s)$, yields the discrete-time system model

$$q_{1,k+1} = q_{1,k} + T_s q_{2,k} \tag{3.5a}$$
$$q_{2,k+1} = q_{2,k} - T_s f_r(q_{2,k})\frac{g}{k_b} - T_s\frac{g}{k_b}\beta_k \tag{3.5b}$$

where $q_{1,k} = x_k$ and $q_{2,k} = \dot{x}_k$ at time instance $k$ and $T_s$ is a constant sample time.
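A minimal sketch of one step of (3.5), using the parameters of Table 3.2; note that with those values $k_b = 1 + J_b/(m_b r_b^2) = 1.4$, the familiar factor for a solid sphere. The names below are illustrative, and the rolling resistance term defaults to zero in this frictionless sketch:

```python
G_ACC = 9.82                                  # standard gravity [m/s^2]
KB = 1.0 + 1.152e-7 / (8e-3 * 6e-3 ** 2)      # k_b = 1 + J_b/(m_b*r_b^2)

def step(q1, q2, beta, Ts, fr=lambda v: 0.0):
    """One Euler-forward step of (3.5). q1 is the ball position [m],
    q2 its velocity [m/s], beta the plate angle [rad]; fr is the
    rolling resistance coefficient (zero here)."""
    q1_next = q1 + Ts * q2
    q2_next = q2 - Ts * fr(q2) * G_ACC / KB - Ts * (G_ACC / KB) * beta
    return q1_next, q2_next
```

Starting from rest, a positive plate angle gives a negative acceleration of the ball, consistent with the sign convention in (3.4b).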

3.3.2

Friction Modeling

Tests on the system revealed that the ball was affected by rolling resistance when it stood still. Inserting $q_2 = \dot{q}_2 = 0$ in (3.4b) and defining $\beta_0 := \beta$ in that state yields

$$0 = -f_r(0)\frac{g}{k_b} - \frac{g}{k_b}\beta_0 \;\Rightarrow\; f_r(0) = -\beta_0 \tag{3.6}$$

In order to measure $\beta_0$ the plate was set to discretely increasing angles until the ball started to roll.

The next step was to measure the rolling resistance coefficient $f_r(q_2)$ when the ball was moving. Once the ball tracking was working, an experiment could be set up where the ball was given an initial push on a horizontally leveled plate. The position of the ball was then measured as it slowed down and eventually stopped. The rolling resistance coefficient could then be calculated from (3.5a) substituted into (3.5b) with $\beta_k = 0$

$$\frac{q_{1,k+2} - q_{1,k+1}}{T_{s,k+1}} = \frac{q_{1,k+1} - q_{1,k}}{T_{s,k}} - T_{s,k} f_r(q_{2,k})\frac{g}{k_b} \;\Rightarrow\; f_r(q_{2,k}) = -\frac{k_b}{g\, T_{s,k}}\left(\frac{q_{1,k+2} - q_{1,k+1}}{T_{s,k+1}} - \frac{q_{1,k+1} - q_{1,k}}{T_{s,k}}\right) \tag{3.7}$$

Observe that $T_s$ is now given a time instance index as well. This is because the ball tracking code does not operate with a constant sample time. Low-pass filtering of the differentiated signals is needed since differentiation amplifies noise in the measurements. This experiment could only be performed on a different plate than the ones the labyrinths are built on. This was because the plates used for the labyrinths were not flat enough and the ball would accelerate in different directions rather than slow down. The different plate had a noticeably higher friction constant and thus a higher rolling resistance.

Figure 3.9: Graphical path drawing interface for the (a) easy level and (b) hard level. On the left, the program is set to display the placed points with lines drawn between them, and on the right it is set to display the resulting spline interpolation.
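Equation (3.7) amounts to differentiating the measured positions twice with forward differences. A sketch, with illustrative names and the low-pass filtering discussed above omitted:

```python
def rolling_resistance(x, ts, kb=1.4, g=9.82):
    """Estimate f_r(q2_k) from (3.7) using three consecutive ball
    positions x = (x_k, x_k+1, x_k+2) measured on a level plate
    (beta_k = 0) with sample times ts = (Ts_k, Ts_k+1)."""
    v_next = (x[2] - x[1]) / ts[1]     # forward difference at k+1
    v_curr = (x[1] - x[0]) / ts[0]     # forward difference at k
    return -kb / (g * ts[0]) * (v_next - v_curr)
```

For positions generated by a constant deceleration $a = -f_r\, g / k_b$ the forward differences recover $a$ exactly, so the estimator returns the underlying coefficient; on real measurements the noise amplification makes the low-pass filtering essential.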

3.4

Spline Path

The ball is set to follow a continuous path made using cubic spline interpolation as described in Section 2.1. A graphical user interface for constructing the path was developed, see Figure 3.9. It allows users to place points on an image of the labyrinth, which are then converted from pixels to points in the plate coordinate system. These points are then interpolated with two cubic splines, one for the x-coordinates and another for the y-coordinates, denoted xsand ys respectively. Let the distance traveled by the ball on the path be called s ∈ [0, L], where L is the length of the path, and where xs and ys are spline functions of s. Each

(28)

piece-18 3 Implementation

wise polynomials, the derivatives (dxs

ds, dys

ds) at s can easily be computed. Note that the notation in this chapter differs from the conventional spline notation in Section 2.1. Here xs and ys are two spline functions each corresponding to the piecewise polynomial s(x) and s corresponds to the input x.

In order to know which point on the path the ball is currently closest to, a line-search method is used to find an orthogonal projection of the ball’s position $P_b$ onto the path. This method is explained in Algorithm 2 and illustrated in Figure 3.10.

Algorithm 2: Path projection

Let n(s) be the normal vector of the path at spline distance s and dp(s) be the vector from the projecting point Pbto the spline at s. There are three main steps when projecting Pbon the spline path.

1. Guess projection point

Set the initially guessed projection point, s0, to the previously found

projection point. On the first iteration, set s0 = 0.

2. Find an interval containing the projection point

The interval s ∈ [smin, smax] in which there is a root to the function

f (s) = n(s) × dp(s), is found by recursively decreasing/increasing sminand

smax, initialized as s0, until a sign change is found.

s_min , s_max = s0 i t e r = 0

while sign( f ( s_min ) ) = sign ( f ( s_max ) ) and i t e r < max_iter decrement s_min max( 0 , s_min ) increment s_max min( s_max , L ) increment i t e r end

3. Find the projection point

Use Brent's method [28] to find the root of n(s) × dp(s) on the interval s ∈ [smin, smax]. The root corresponds to a projection from the point Pb onto the spline path. If step 2 was interrupted by the iteration limit, use whichever of smin and smax results in the smallest distance to the path.

    if iter < max_iter
        find root using Brent's method
    else
        d_min = distance from the ball to the path at s_min
        d_max = distance from the ball to the path at s_max
        if d_min < d_max
            s = s_min
        else
            s = s_max
        end
    end
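The procedure above can be sketched in a few lines of Python. This is only an illustration: the spline functions and their derivatives are assumed to be available as plain callables, the step size and iteration limit are placeholders rather than the thesis tuning, and plain bisection is used for the final root finding to keep the sketch self-contained (the thesis uses Brent's method, which converges faster on the same bracketed interval).

```python
import math

def project_on_path(xs, ys, dxs, dys, L, pb, s0,
                    step=0.5, max_iter=50, tol=1e-8):
    """Find the arc-length parameter s of the orthogonal projection of the
    ball position pb onto the path (xs(s), ys(s)), s in [0, L], starting
    from the previous projection s0 (Algorithm 2)."""

    def f(s):
        # 2D cross product n(s) x dp(s); zero at an orthogonal projection
        nx, ny = -dys(s), dxs(s)                  # normal from tangent
        dpx, dpy = pb[0] - xs(s), pb[1] - ys(s)   # dp = Pb - Ps
        return nx * dpy - ny * dpx

    def dist(s):
        return math.hypot(xs(s) - pb[0], ys(s) - pb[1])

    # Step 2: expand [s_min, s_max] around the previous projection s0
    # until f changes sign (the interval then brackets a root)
    s_min = s_max = s0
    it = 0
    while f(s_min) * f(s_max) > 0 and it < max_iter:
        s_min = max(0.0, s_min - step)
        s_max = min(L, s_max + step)
        it += 1
    if it >= max_iter:
        # No bracket found: fall back to the closer interval end
        return s_min if dist(s_min) < dist(s_max) else s_max

    # Step 3: root finding on [s_min, s_max]. The thesis uses Brent's
    # method; plain bisection is used here to stay self-contained.
    a, b = s_min, s_max
    while b - a > tol:
        m = 0.5 * (a + b)
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
    return 0.5 * (a + b)
```

On a straight path xs(s) = s, ys(s) = 0, projecting the point (2, 1) from s0 = 0 returns s ≈ 2, as expected.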


Figure 3.10: Example of the projection point finding method. (a) The initially guessed position on the spline, s0, does not correspond to the projection point, Ps, because n × dp ≠ 0. (b) smin and smax (initialized as s0) are decreased and increased respectively. No sign change occurs in n × dp when calculated at smin and smax. (c) A sign change has occurred after further decreasing/increasing smin and smax. The interval [smin, smax] must therefore contain a minimum. (d) Brent's method is used to find the root of n × dp and thereby the path position s corresponding to the projection point Ps.

When an s corresponding to the projection from the ball is found, the path tangent angle θs and the distance to the path d can be calculated. An illustration of the relationship between the ball and the path is shown in Figure 3.11. The distance is calculated as

d = ||dp||

where

dp = Pb − Ps

while Pb and Ps are the vectors from the origin to Pb and Ps respectively. The path tangent angle θs is calculated as follows

θs = atan2(dys/ds, dxs/ds)    (3.9)

Figure 3.11: Geometry of the path following method. The angles are relative to the plate coordinate system, depicted with the vectors x̂ and ŷ. t̂ and n̂ are the tangent and normal vectors of the spline.

3.5 Path Following Controller

The principle of the path following controller was to steer towards the closest point of the path when the ball was far away from it, and to yield a velocity in the forward direction of the path when the ball was close to it. This was implemented in three ways: with a modified LQR, a Gain Scheduled LQR and a Gain Scheduled LQR with obstacle avoidance. All three controllers got state estimates from the same Kalman filter with a constant velocity motion model. This section also covers a method to free the ball when it gets stuck behind walls, which was used in every controller.

3.5.1 LQR

Since the system model handles the x- and y-dynamics separately, the controller was chosen to be implemented in the same way.

The rolling resistance term Ts fr(q2,k) g/kb in the discrete-time system model (3.5) is highly nonlinear because of sign shifts. Therefore, the rolling resistance was chosen to be handled with disturbance compensation instead of being included in the model for the LQR. Equation (3.5b) can be written as

q2,k+1 = q2,k − Ts fr(q2,k) g/kb − Ts (g/kb) βk
       = q2,k − Ts (g/kb) (fr(q2,k) + βk)
       = {uk ≜ fr(q2,k) + βk}
       = q2,k − Ts (g/kb) uk    (3.10)

Note that the plate angle β is determined by a summation of the control signal and the friction disturbance as βk = uk + fr(q2,k). Therefore, the model used by the LQR is

q1,k+1 = q1,k + Ts q2,k    (3.11a)
q2,k+1 = q2,k − Ts (g/kb) uk    (3.11b)

which can be written as

qk+1 = F qk + G uk    (3.12a)
yk = H qk    (3.12b)

F = [1 Ts; 0 1],  G = [0; −Ts g/kb],  H = [1 0]    (3.12c)
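For reference, the per-axis LQR gain for a model of this form can be computed by iterating the discrete-time Riccati equation to a fixed point. The sketch below is a generic illustration, not the thesis implementation; the values of Ts, g/kb and the weights Q and R are placeholders, not the actual tuning from Appendix A.

```python
import numpy as np

def dlqr(F, G, Q, R, iters=500):
    """Discrete-time LQR gain via fixed-point iteration of the
    Riccati equation: P = Q + F'P(F - GK), K = (R + G'PG)^-1 G'PF."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + G.T @ P @ G, G.T @ P @ F)
        P = Q + F.T @ P @ (F - G @ K)
    return K  # feedback law u = -K q

# Illustrative per-axis model (3.12); Ts, g/kb, Q and R are placeholder
# values, not the tuning used in the thesis
Ts, g_over_kb = 0.035, 700.0
F = np.array([[1.0, Ts], [0.0, 1.0]])
G = np.array([[0.0], [-Ts * g_over_kb]])
K = dlqr(F, G, Q=np.diag([1.0, 0.1]), R=np.array([[10.0]]))
```

The resulting closed-loop matrix F − GK has all eigenvalues inside the unit circle, i.e. the gain is stabilizing.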

The complete controller consists of two of these LQRs, as illustrated in Figure 3.12. Since the rolling resistance now had to be considered on a 2D plate, rather than on the previously discussed 1D system, it had to be scaled depending on the heading of the ball as

(β α)T = (ux uy)T − fr(v) (cos(θ) sin(θ))T,  if v > vtol
(β α)T = (ux uy)T,  otherwise    (3.13)

where v = ||v|| = ||(ẋ ẏ)T|| is the speed of the ball. The rolling resistance compensation was only active when it was certain that the ball had a velocity in the θ direction, v > vtol, to prevent compensation in the wrong direction.
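As a concrete illustration of (3.13), the compensation per control update can be sketched as below. The values for fr and vtol are placeholders (the thesis ends up treating fr as a tuning parameter, bounded by fr,max = 0.006).

```python
import math

def compensate_rolling_resistance(ux, uy, vx, vy, fr=0.006, v_tol=1.0):
    """Feed-forward rolling-resistance compensation (eq. 3.13).
    Returns the plate angles (beta, alpha). fr and v_tol are
    illustrative values, not the thesis tuning."""
    v = math.hypot(vx, vy)            # speed of the ball
    if v > v_tol:                     # only compensate when clearly moving
        theta = math.atan2(vy, vx)    # heading of the ball
        return ux - fr * math.cos(theta), uy - fr * math.sin(theta)
    return ux, uy                     # below tolerance: no compensation
```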

Figure 3.12: A block diagram of the LQR.

The state reference was calculated each iteration from the results of the path projection as

r = (xs, vref cos(θref), ys, vref sin(θref))T    (3.14)

where vref is the constant reference speed of the ball, (xs ys)T is the projection of the ball position onto the spline path and θref = θs in most cases, see Section 3.5.4. This state reference entails steering towards the closest point of the path and keeping a velocity in the forward direction of the path.

The two LQRs were tuned identically. The tuning of the weight matrices Q and R was done iteratively and manually through test runs on the real system. M in (2.9) was set to the identity matrix since both the position and velocity had to be controlled.

An important deviation from the LQR theory had to be made for this controller to be able to follow a path. A requirement when calculating an Lr matrix is that the number of control signals is at least equal to the number of controlled states. These LQRs had only one control signal each, but two states to control. The LQR was only able to control the ball to a reference position, which means that the velocity would be zero when the ball reached that position. This caused the ball to frequently stop on the path, and since (xs ys)T can move in both directions along the path, there was nothing pushing the ball either forward or backward. Therefore, the element of Lr relating to the velocity had to be tuned manually.

3.5.2 Gain Scheduling

Instead of tweaking Lr in an LQR that is really only designed to control position, one might want to have two LQRs, one that controls position and one that controls velocity, and weight their control laws together. This weighting can then be situation dependent, giving one LQR more influence to control its state. This way of using a weighted sum of multiple controllers is an example of a gain scheduled controller. Figure 3.13 illustrates the gain scheduled LQR that was designed and tested in this project.

Figure 3.13: A block diagram of the gain scheduled LQR.

The distance to the path, d, obtained from the path projection was used to calculate the LQR weighting as

w = min(d/dmax, 1)    (3.15a)
u = w upos + (1 − w) uvel    (3.15b)

where dmax is a tuning parameter scaling the influence of each LQR and determining at what distance d the position LQR takes over completely.
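The scheduling in (3.15) amounts to a couple of lines per axis; a minimal sketch, assuming the two LQR outputs upos and uvel have already been computed:

```python
def blend_control(u_pos, u_vel, d, d_max):
    """Weighted sum of the position- and velocity-LQR outputs (eq. 3.15).
    w goes from 0 on the path to 1 at distance d_max and beyond, so the
    position LQR takes over completely far from the path."""
    w = min(d / d_max, 1.0)
    return w * u_pos + (1.0 - w) * u_vel
```

On the path (d = 0) the velocity LQR acts alone; beyond d = dmax the position LQR acts alone.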

The system model for the velocity LQR is just (3.11b), for which the state matrices are

F = 1,  G = −Ts g/kb    (3.16)

The state references were calculated each iteration from the results of the path projection as

rpos = (xs, 0, ys, 0)T,  rvel = vref (cos(θref) sin(θref))T    (3.17)


This means that the position controller steered towards the closest point of the path and the velocity controller kept a velocity in the forward direction of the path.

3.5.3 Obstacle Avoidance

Figure 3.14: Obstacle avoidance block diagram. F is an arbitrary path stabilizing controller.

Neither of the controllers described above, nor the model, takes obstacles into account. Holes are therefore only avoided to the extent to which the path is followed. In order to actively steer away from holes when needed, an obstacle avoidance control law can be added to the already existing control law, see Figure 3.14. In this project, obstacle avoidance was added to the gain scheduling controller described in Section 3.5.2, but it could be added to any path stabilizing controller. Let

ri = (xb, yb)T − (xobs,i, yobs,i)T = (xb − xobs,i, yb − yobs,i)T = (rx,i, ry,i)T    (3.18)

be the vector to the ball position (xb, yb) from the position (xobs,i, yobs,i) of obstacle i = 1, . . . , Nobs, where Nobs is the number of holes in the labyrinth. Figure 3.15 illustrates the geometry of the obstacle avoidance algorithm.

The distance to obstacle i and the speed at which the ball is approaching it are given by

ri = ||ri||    (3.19)
ṙi = vT r̂i = (ẋb ẏb) ri / ||ri||    (3.20)

Figure 3.15: Obstacle avoidance. The red and gray circles represent the ball and the holes respectively.

The total control law, utot, can be obtained by adding the sum of every obstacle's contribution to the control signal to the existing control law, u, from any controller

utot = u + Σ_{i=1}^{Nobs} αi uobs,i    (3.21)

Here αi is the risk value associated with obstacle i, which is given by

αi = D1 (1 − ri/R) − D2 ṙi,  if ri < R and ṙi < 0
αi = 0,  otherwise    (3.22)

where D1 and D2 are parameters tuned to adjust the level of influence ri and ṙi have on the steering signal respectively. Note that the risk value is zero either if the ball moves away from the obstacle, ṙi > 0, or if it is outside a predefined circular risk zone, ri > R, where R is the radius of that zone.

The direction in which the counteracting control law is applied is determined by

uobs,i = −(cos(θsafe,i), sin(θsafe,i))T    (3.23)

where θsafe,i is the angle of the vector ri relative to the plate coordinate system

θsafe,i = atan2(ry,i, rx,i) = atan2(yb − yobs,i, xb − xobs,i)    (3.24)
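Equations (3.18)-(3.24) combine into a small additive term on top of any controller output. A sketch, with placeholder values for R, D1 and D2 (not the thesis tuning):

```python
import math

def obstacle_avoidance(u, ball, v, obstacles, R=3.0, D1=0.05, D2=0.01):
    """Add per-hole repulsive terms to an existing control law u
    (eqs. 3.18-3.24). R, D1 and D2 are placeholder tuning values.
    Assumes the ball is never exactly at a hole center (ri > 0)."""
    ux, uy = u
    xb, yb = ball
    vx, vy = v
    for (xo, yo) in obstacles:
        rx, ry = xb - xo, yb - yo            # r_i: vector from hole to ball
        ri = math.hypot(rx, ry)              # distance to the hole (3.19)
        ri_dot = (vx * rx + vy * ry) / ri    # approach rate (3.20), < 0 when closing
        if ri < R and ri_dot < 0:            # inside risk zone and approaching
            alpha = D1 * (1 - ri / R) - D2 * ri_dot   # risk value (3.22)
            theta = math.atan2(ry, rx)       # theta_safe,i (3.24)
            ux -= alpha * math.cos(theta)    # u_obs,i = -(cos, sin)^T (3.23)
            uy -= alpha * math.sin(theta)
    return ux, uy
```

A ball approaching a hole from the right gets a negative x-contribution (tilting away from the hole), while a ball moving away from the hole leaves u unchanged.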

3.5.4 Integration of θref

In Section 3.4, a method of projecting the ball position onto the path was presented, along with the expression (3.9) for the path tangent θs at that point. This tangent could in most cases be used as the reference heading for the ball, θref = θs. There are however scenarios where this can lead to unwanted behaviours.

For example, if the ball deviates too much from the path and ends up bumping into a wall that is orthogonal to the reference heading, the controller would try to steer the ball through the wall. Although parts of the controller would still try to minimize d, and therefore try to steer the ball directly toward the projected point, there is a risk that the static friction from the wall is enough to make the ball become stuck.

In cases like this, integration of θref can be used to slowly turn the reference heading toward the path. Since the controller has no awareness of the walls, this is simply done after a predefined time during which the ball has been still. This assumes that every time the ball is still for that amount of time, it is stuck against a wall.

The direction in which to turn θref is determined based on which side of the spline the ball is. Let t̂ be the tangent of the path in the forward direction and n̂ the normal, as shown in Figure 3.16. In this spline path coordinate system, the only variable that determines whether the ball is to the left or right of the spline is the one in the n̂ direction. The transformation from the plate coordinate system to the n̂ component of the spline coordinate system is given by

n̂: yn = −sin(θs)(xb − xs) + cos(θs)(yb − ys)    (3.25)

where (xb, yb) and (xs, ys) are the coordinates of the ball and the projected spline position respectively, in the plate coordinate system.

Figure 3.16: Integration of θref. x̂ and ŷ represent the plate coordinate system, while t̂ and n̂ represent the spline path coordinate system. The reference heading θref is turned from the path tangent θs toward the path.

Now the integration of θref can be defined as

θref = θs − Iθ,  if yn > 0
θref = θs + Iθ,  otherwise    (3.26)

where Iθ is the integrating factor that is increased by ∆θ every iteration, k, of the control signal update. This integrating factor is given by

Iθ = Iθ,k = Iθ,k−1 + ∆θ,  if tstill > ttol
Iθ = Iθ,k = 0,  if tstill ≤ ttol    (3.27)

where tstill is the time the ball has been still, or more specifically the time during which |v| < vtol, where vtol is a predefined tolerance speed for being essentially still, and ttol is the time required before the integration kicks in. Notice that Iθ = 0 as soon as the ball is set in motion.
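One update step of (3.25)-(3.27) can be sketched as follows; ∆θ and ttol here are placeholder values, not the thesis tuning:

```python
import math

def update_theta_ref(theta_s, ball, proj, I_theta, t_still,
                     d_theta=0.02, t_tol=1.0):
    """One update of the reference heading (eqs. 3.25-3.27).
    ball = (xb, yb), proj = (xs, ys); returns (theta_ref, I_theta)."""
    xb, yb = ball
    xs, ys = proj
    # Signed coordinate along the path normal: which side of the spline
    # the ball is on (eq. 3.25)
    y_n = -math.sin(theta_s) * (xb - xs) + math.cos(theta_s) * (yb - ys)
    # Integrating factor grows only while the ball has been still (eq. 3.27)
    I_theta = I_theta + d_theta if t_still > t_tol else 0.0
    # Turn the heading toward the path (eq. 3.26)
    theta_ref = theta_s - I_theta if y_n > 0 else theta_s + I_theta
    return theta_ref, I_theta
```

While the ball keeps moving, tstill stays below ttol and θref = θs; once it has been still longer than ttol, the heading is rotated by ∆θ per iteration toward the path.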

4 Results

Results from friction modeling and the evaluated controllers are presented in this chapter.

4.1 Friction Modeling

The experiment to measure β0 was performed a number of times and the average estimate was β0 = 0.013, implying that the plate must be tilted at least (180/π) · 0.013 ≈ 0.745 degrees for the ball to begin rolling.

Figure 4.1 presents one set of measurement data and the step-wise procedure of estimating the rolling resistance coefficient.

It can be seen that the measurement noise is amplified during the differentiation steps and that low-pass filtering and outlier rejection were beneficial. Since this experiment could not be performed on the same plate material as the labyrinths were built on, but was instead done on a plate with a higher friction constant, the experiment was not repeated and the result was only used as an upper limit of the rolling resistance coefficient. The upper limit was chosen as fr,max = 0.006 from Figure 4.1d.

Figure 4.1: The process of estimating the rolling resistance coefficient from position data. (a) Position measurements, raw versus cropped and filtered. (b) Velocity estimates, raw versus filtered. (d) Estimates of the rolling resistance coefficient fr as a function of velocity.

4.2 Controller Performances

This section presents the performance of three different controllers. The first controller was the LQR from Section 3.5.1, the second used the gain scheduling method from Section 3.5.2 and the third was gain scheduling with the addition of obstacle avoidance as described in Section 3.5.3. From now on, these controllers will be referred to as LQR, GS and GSOA respectively. All three controllers used the reference heading integration described in Section 3.5.4. The GS and GSOA controllers used the same tuning, except that the GSOA controller also had an obstacle avoidance part. All tuning can be found in Appendix A.

Each controller was tested 60 times on both the easy and the hard labyrinth level. A test was ended either when the controller had moved the ball from start to finish, or when the ball had fallen into a hole. The root mean square error (RMSE) between the path and the ball position was logged in order to evaluate the path following performance, and statistics were kept on which holes the ball fell into. The results for each labyrinth level are presented separately.

4.2.1 Easy Labyrinth

The general performance of each controller on the easy labyrinth level is presented in Figure 4.2. It can be seen that the GS and GSOA controllers have a slightly higher average RMSE from the reference path than the LQR. Out of the 60 runs, the LQR also has a somewhat higher success rate, although every controller completed the labyrinth more than 90 % of the time.

Figure 4.2: The performance results of the different controllers on the easy labyrinth level after 60 runs. (a) presents the average RMSE from the reference path; the black bars show the standard deviation of the average RMSE. (b) presents the percentage of runs where the ball succeeded in traveling from start to finish without falling into any hole (LQR 96.7 %, GS 91.7 %, GSOA 91.7 %).

Figure 4.3 shows a picture of the easy level with each of the eight holes numbered. The number of times the ball fell into these holes is shown in Figure 4.4. Hole three is the hole which the ball fell into most frequently, and it was the only hole the ball fell into using the LQR. Despite this, no significant trend can be seen because of the small number of falls. Figure 4.5 shows three separate successful runs with each controller.

Figure 4.3: Holes on the easy labyrinth level.

Figure 4.4: Number of times the ball fell into the different holes after running the easy level 60 times with each controller. The holes left out had zero recorded falls.

Figure 4.5: Successful runs on the easy labyrinth level using the (a) LQR, (b) GS and (c) GSOA controllers.

4.2.2 Hard Labyrinth

The general performance of each controller on the hard labyrinth level is presented in Figure 4.6. Once again the LQR has a slightly lower average RMSE from the reference path, but unlike on the easy level, the GS controller has the highest success rate.

Figure 4.7 shows a picture of the hard level with each of the 16 holes numbered. The number of times the ball fell into these holes is shown in Figure 4.8. The statistic that stands out the most concerns hole 12, which the ball never fell into using the LQR and GS controllers, but fell into eleven times when using the GSOA. Moreover, the LQR often failed at passing holes 6 and 7, and the GS controller failed at passing hole 7 in more than half of the runs where the ball fell into a hole.

Figure 4.9 shows three separate runs with each controller. Especially note the behaviour of the GSOA controller in Figure 4.9c near hole 12, where there is a "bouncing effect" back and forth. This behaviour is also illustrated in Appendix B.3, where a number of unsuccessful runs are shown. In Appendix B.1 and B.2, unsuccessful runs for the LQR and GS controllers are also shown.

Figure 4.6: The performance results of the different controllers on the hard labyrinth level after 60 runs. (a) presents the average RMSE from the reference path; the black bars show the standard deviation of the average RMSE. (b) presents the percentage of runs where the ball succeeded in traveling from start to finish without falling into any hole (LQR 70.5 %, GS 78.7 %, GSOA 63.3 %).

Figure 4.7: Holes on the hard labyrinth level.

Figure 4.8: Number of times the ball fell into the different holes after running the hard level 60 times with each controller. The holes left out had zero recorded falls.

Figure 4.9: Successful runs on the hard labyrinth level using the (a) LQR, (b) GS and (c) GSOA controllers.

5 Discussion

The discussion in this chapter will first focus on the results from Chapter 4, followed by a discussion regarding the methods chosen in this thesis.

5.1 Results

In this section, discussions regarding the experimental results from friction modeling and the evaluated controllers are presented.

5.1.1 Friction Modeling

Estimation of the rolling resistance was a time consuming part of the project that did not lead to any direct improvements of the labyrinth control. In the end, fr(v) was used as a tuning parameter because the estimates were not deemed sufficiently accurate.

Since the plates the labyrinth levels were built on were not completely flat, measurements of the ball decelerating could not be performed on them. Instead the experiment was performed on a different plate with a noticeably greater friction constant. Moreover, the result presented in Figure 4.1 fluctuates heavily, because it is a second order derivative of the measurements. An external sensor measuring the acceleration of the ball would therefore have been needed to accurately estimate fr(v).

With that said, simply tuning a constant value for fr(v) made the ball stop less often. Other ways of reducing this problem could be to increase vref and tune the controllers to be slightly more aggressive. Experiments to verify this were never performed, but the rolling resistance when moving could possibly have been ignored.

5.1.2 Controller Performances

On both the easy and hard labyrinth levels, the RMSE was highest when using the GSOA controller, followed by the GS controller. The reason why the LQR on average managed to follow the path best can be attributed to tuning. Both the GS and GSOA controllers had more tuning parameters and were therefore more difficult to tune. It is also possible that the implemented gain scheduling never would have been able to reach the path following performance of the LQR.

Since the GS and GSOA had the same tuning of the gain scheduling part, the reason why GSOA had a higher RMSE is probably the obstacle avoidance. It was observed that the path following in some cases became unstable after the obstacle avoidance kicked in near a hole, which can be seen in Figure 4.9c near hole 12. This could be because the maneuver to avoid a hole caused the speed to exceed the reference speed, or caused the ball to move far away from the reference path, which in turn would lead to oscillations when the path following controller tries to compensate for the sudden disturbance.

Although there is a clear trend between the controllers' RMSE on both labyrinth levels, the success rate is not as consistent and does not always correlate with the RMSE. One would expect that a low positional RMSE should result in a higher success rate, which is the case for the easy labyrinth level in Figure 4.2. However, the result from the hard level in Figure 4.6 shows that the GS controller has both a higher success rate and a higher RMSE than the LQR. This might be explained by the fact that the gain scheduling controller prioritizes keeping the reference speed when it is close to the path and prioritizes position when it is far away from it. Thereby, a larger deviation from the path is accepted, resulting in a higher RMSE, while the speed is more closely held, which seems to be a better strategy on the hard labyrinth level.

The reason why the GSOA controller performed much worse in terms of successful runs on the hard level can with high certainty be attributed to the difficulties this controller had with hole 12. As mentioned, an oscillatory behavior was observed when the ball approached this hole, which in more than 18 % of the runs led to the ball falling into the hole. Interestingly, the number of falls in holes before hole 12 was in most cases lower compared to the LQR and GS controllers. This would suggest that the GSOA controller would have a similar success rate as the other controllers, if not for hole 12. A simple calculation, which takes into account the number of falls that statistically would occur after hole 12, is given in Appendix C. It suggests that the success rate would be 80 % without hole 12, meaning that it would be on par with the GS controller's 78.7 % success rate. In addition to the problem occurring around hole 12, the GSOA controller also fell into holes as a result of trying to avoid other holes. Some examples of this are shown in Appendix B.3.

5.2 Method

This section acknowledges some weaknesses of the method and discusses the reliability and validity of the results.

5.2.1 Weaknesses

The choice of using Python as the programming language for the image processing may be questionable. Since this is a fast system, execution speed is crucial, and a programming language like Python is not as fast as C++. However, the bottleneck of the whole program was specifically the capturing of the image and not the image processing itself. It was never investigated whether image capturing is as slow in C++, but it is very likely that this is a hardware question, with the RPi Camera being the limiting component. The requested frame rate was set to 28 Hz, and both a slower and a faster requested frame rate resulted in a slower actual frame rate. This means that when a too fast frame rate is requested and cannot be met, either the RPi or the camera module performs worse.

A major disadvantage of the system was that it lacked feedback of the servo angles. As a consequence, it was assumed that the angles the controller requested became active immediately. This made it hard to use the system model as a motion model in the Kalman filter, and a simpler constant velocity model had to be used instead. If the active angles were known, a more accurate motion model could have been used, yielding better state estimates. Furthermore, the delay before the requested angles became active could have been studied and perhaps been accounted for by the controller.

An effect that was considered early on, but in the end ignored since it did not seem to affect the measurements, was the distortion of the plate coordinate system when the plate was tilted. When the ball was found in the image, its position was transformed to the plate coordinate system before the measurement was sent to the Kalman filter. This transformation did not consider the angle of the plate. The simplification did not seem to affect the measurements, since the system only operated at angles close to zero, but no experiments were performed to confirm this. The measurements might have been more accurate if this distortion had been considered, but for that the active plate angles would have needed to be known.

In order for the controller to compensate for the uneven areas of the labyrinth plate, an integrating part would have been needed. A traditional integrating part was not experimented with, since the stability of the controller was already uncertain; it was thought that it would ruin the performance rather than help. Only a special integrating part was implemented to handle static friction and make the ball move when it was detected to be standing still, and as soon as the ball began to move again it was set to zero. However, a well working traditional integrating part would definitely have helped in preventing the ball from becoming stuck on uneven surfaces, and would have enabled the controller to be less aggressively tuned.

5.2.2 Reliability and Validity

It is important to mention that the reliability of the results is low. This is due to a number of reasons, one being that this was a home built system: a replication of it would probably behave quite differently and would thereby need different tuning. The performance of the system was also very sensitive to servo angle calibration and to the brightness in the room. Since the controller did not have an integrating part that was active while the ball moved, it could not handle angle offsets well. Furthermore, the ball detection worked best in a well lit room, and it is not guaranteed that the ball would be detected well enough in other environments.

Regarding the validity of the RMSE measure, it should be explained how the final values were derived. The RMSE was calculated every iteration during a run, and when the run was finished the mean value was computed. This means that if the ball got stuck away from the path, the RMSE accumulated. This might be reasonable, since becoming stuck is not a desirable outcome, but it can yield an undeservedly bad RMSE. Another deceptive aspect of the RMSE measure is that a tighter labyrinth yields a lower RMSE, because the walls of the labyrinth limit the ball from deviating too far from the reference path. This is most likely why the controllers have a higher RMSE on the easy labyrinth compared to the hard.

6 Conclusions

The following chapter presents conclusions that can be drawn from this master's thesis, based on the results in Chapter 4 and the discussion in Chapter 5. It is evaluated whether the aim stated in Section 1.1 has been reached, and the questions asked in Section 1.2 are answered. Suggestions for future work are also discussed.

6.1 Spline Interpolated Path

The use of a spline interpolated reference path, as opposed to discrete reference points, proved to work really well. The method for projecting the ball's position onto the spline worked in most scenarios, provided that a reasonable starting position for the root finding algorithm was given. The only downside of this method was that the ball could get stuck if the reference heading given by the spline tangent pointed into a wall. This was solved by adding an integrating part to the reference heading, as described in Section 3.5.4.

6.2 Control Algorithms

When it comes to following a path, the LQR proved to perform best, as it had the lowest RMSE on both labyrinth levels. It was also the easiest to tune, which is an important factor to consider when choosing a controller design. Regarding the success rate, the gain scheduled controller managed to outperform the LQR on the hard level, but not by much.

One advantage of the gain scheduling controller design is that it is intuitive. If two different actions are to be accomplished simultaneously, then use two

References
