• No results found

DanielAnkelhed Onlowordercontrollersynthesisusingrationalconstraints

N/A
N/A
Protected

Academic year: 2021

Share "DanielAnkelhed Onlowordercontrollersynthesisusingrationalconstraints"

Copied!
105
0
0

Loading.... (view fulltext now)

Full text

(1)

Linköping studies in science and technology. Thesis. No. 1398

On low order controller synthesis

using rational constraints

Daniel Ankelhed

REGLERTEKNIK

AU

TOMATIC CONTROL LINKÖPING

Division of Automatic Control Department of Electrical Engineering Linköping University, SE-581 83 Linköping, Sweden

http://www.control.isy.liu.se ankelhed@isy.liu.se

(2)

Swedish postgraduate education leads to a Doctor’s degree and/or a Licentiate’s degree. A Doctor’s Degree comprises 240 ECTS credits (4 years of full-time studies).

A Licentiate’s degree comprises 120 ECTS credits, of which at least 60 ECTS credits constitute a Licentiate’s thesis.

Linköping studies in science and technology. Thesis. No. 1398

On low order controller synthesis using rational constraints Daniel Ankelhed

ankelhed@isy.liu.se www.control.isy.liu.se Department of Electrical Engineering

Linköping University SE-581 83 Linköping

Sweden

ISBN 978-91-7393-669-9 ISSN 0280-7971 LiU-TEK-LIC-2009:6 Copyright c 2009 Daniel Ankelhed

(3)
(4)
(5)

Abstract

In order to design robust controllers, H∞synthesis is a common tool to use. The

con-trollers that result from these algorithms are typically of very high order, which compli-cates implementation. However, if a constraint on the maximum order of the controller is set, that is lower than the order of the plant, the problem is no longer convex and it is then relatively hard to solve. These problems become very complex, even when the order of the system to be controlled is low.

The approach used in the thesis is based on formulating the constraint on the maxi-mum order of the plant as a polynomial equation. By using the fact that the polynomial is non-negative on the feasible set, the problem is reformulated as an optimization problem where the nonconvex polynomial function is to be minimized over a convex set defined by linear matrix inequalities.

To solve this optimization problem, two methods have been proposed. The first method is a barrier method and the second one is a method based on a primal-dual frame-work. These methods have been evaluated on several problems and compared with a well-known method found in the literature. To motivate this choice of method, we have made a brief survey of available methods available for solving the same or related prob-lems.

The proposed methods emerged as the best methods among the three for finding lower order controllers with the same or similar performance as the full order controller. When the aim is to find the lowest order controller with no worse than +50 % increase in the closed loop H∞norm, then the three compared methods perform equally well.

(6)
(7)

Populärvetenskaplig sammanfattning

Robust reglering är ett verktyg som används för att konstruera regulatorer till system som har stora krav för tålighet mot parametervariationer och störningar. Ett vanligt verktyg för att konstruera dessa typer av robusta regulatorer är H∞-syntes. För att använda detta

verktyg utgår man vanligen från en nominell modell av systemet som man utökar med olika viktfunktioner som motsvaras av de krav på robusthet för parameterosäkerheter och störningar som ställs på systemet. Detta får som följd att det utökade systemet ofta får mycket högre komplexitet än det nominella systemet. Regulatorns komplexitet blir i regel lika hög som det utökade systemets komplexitet. Inför man en övre gräns för den grad av komplexitet som tolereras, resulterar detta i mycket komplexa optimeringsproblem som måste lösas för att en regulator ska kunna konstrueras, även i de fall då det nominella systemet har låg komplexitet. Att ha en regulator med låg komplexitet är önskvärt då det förenklar implementering.

I denna avhandling presenteras två metoder för att lösa denna typen av optimerings-problem. Dessa två metoder jämförs med en tidigare metod beskriven i litteraturen och har utvärderats på ett antal system med avseende på två aspekter. Den första är att hitta regu-latorer med låg komplexitet men samtidigt behålla samma eller liknande prestanda. Den andra aspekten är att hitta regulatorer med så låg ordning som möjligt, men då prestanda prioriteras i andra hand. De två metoder som föreslås i avhandlingen lyckas bättre än me-toden från litteraturen med avseende på den första aspekten, medan likvärdigt resultat fås med avseende på den andra.

(8)
(9)

Acknowledgments

First, I would like to thank my supervisors, Prof. Anders Hansson and Prof. Anders Helmersson for your guidance and support during my years as a PhD student. You have been a great source of ideas and inspiration. I would also like to thank Prof. Lennart Ljung for letting me join the automatic control group, and thank you Ulla Salaneck for always being so helpful and keeping track of everything here.

I would like to thank all the people at the automatic control group in Linköping for contributing to the nice and friendly atmosphere. The coffee break discussions, sometimes about very unpredictable topics, are always entertaining.

A special thanks goes to Janne Harju Johansson. We starting our PhD studies at basically the same time and shared the office for about three years. I really enjoyed our discussions about research, rock music, shooting, fiction literature or about just anything. A big thank you goes to the people who have proofread various parts this thesis: Rikard Falkeborn, Dr. Gustaf Hendeby, Lic. Janne Harju Johansson, Christian Lyzell, Alfio Masi, Daniel Petersson and Lic. Johanna Wallén. Your comments have been really valuable. Dr. Gustaf Hendeby deserves an extra thank you for always answering my questions about LATEX, MATLABor other computer related questions.

I would like to take the opportunity to thank my family. My parents Yvonne and Göte, always trying to cheer me up and supporting me. Lars, always willing to help with anything, Erik we always have fun together. Joakim, Johanna, and Jacob, I hope we will be better at keeping in touch in the future.

Last, but not least, I would like to thank the Swedish research council for financial support under contract no. 60519401.

Linköping, February 2009

Daniel Ankelhed

(10)
(11)

Contents

1 Introduction 5 1.1 Background . . . 5 1.2 Contributions . . . 6 1.3 Thesis outline . . . 6

I

Preliminaries

7

2 Linear systems and H∞synthesis 9 2.1 Linear system representation . . . 9

2.1.1 The controller . . . 9

2.1.2 The closed loop system . . . 10

2.2 Performance bounds . . . 11

2.2.1 H∞controller synthesis . . . 12

2.3 Reformulation of LMI constraints . . . 16

2.4 Gramians and balanced realizations . . . 17

2.5 Recovering the matrix P from X and Y . . . 18

2.6 Obtaining the controller . . . 19

2.7 A General algorithm for H∞synthesis . . . 19

2.8 Formulating an ROF problem as an SOF problem . . . 20

3 Optimization 21 3.1 Nonconvex optimization . . . 21

3.1.1 Local methods . . . 21

3.2 Convex optimization . . . 22

3.2.1 Convex sets and functions . . . 22

3.2.2 Generalized inequalities . . . 23

(12)

3.3 A problem with matrix inequalities . . . 24

3.4 First order optimality conditions . . . 24

3.5 Unconstrained optimization . . . 24

3.5.1 Newton’s method . . . 25

3.5.2 Line search . . . 26

3.5.3 Newton’s method with Hessian modification . . . 27

3.6 Constrained optimization . . . 27

3.6.1 Newton’s method for nonlinear equations . . . 28

3.6.2 Central path . . . 29

3.6.3 A generic Primal-Dual interior point method . . . 29

II

Overview

31

4 Methods for reduced order H∞problems 33 4.1 Specific methods for reduced order H∞problems . . . 33

4.1.1 Alternating Projections . . . 33

4.1.2 Solving BMIs using Branch and bound . . . 34

4.1.3 LMIRANK . . . 34

4.1.4 HIFOO . . . 34

4.1.5 Nonsmooth, multidirectional search . . . 35

4.1.6 Nonsmooth H∞synthesis . . . 35

4.1.7 Barrier method for solving BMIs . . . 35

4.1.8 Augmented Lagrangian method . . . 36

4.1.9 Sequential semidefinite programming (SSDP) . . . 36

4.1.10 Concave minimization . . . 37

4.1.11 Cone Complementarity Linearization Method . . . 37

4.2 General methods for solving related problems . . . 37

4.2.1 PENNON . . . 37

4.2.2 Predictor-corrector method for nonconvex SDPs . . . 38

4.2.3 Successive linearization methods for nonlinear SDPs . . . 38

4.2.4 A Sequential semidefinite programming method, SSP . . . 38

4.3 Controller reduction methods . . . 38

4.4 Summary . . . 39

III

Applications

41

5 A barrier method 43 5.1 Problem formulation . . . 43 5.2 A semidefinite program, (SDP) . . . 44 5.2.1 Optimality conditions . . . 44 5.2.2 Definitions . . . 45 5.2.3 Central path . . . 45

5.3 An extra barrier function . . . 46

(13)

xiii

5.4.1 Initial point calculation . . . 47

5.4.2 Solving the main, nonconvex problem . . . 47

5.4.3 Stopping criteria . . . 48

5.4.4 The algorithm . . . 49

5.5 Optimality of the solution . . . 50

5.6 Finding the controller . . . 50

5.7 Algorithm performance . . . 50 6 A Primal-Dual method 51 6.1 Problem formulation . . . 51 6.2 Optimality conditions . . . 51 6.2.1 KKT conditions . . . 52 6.2.2 Central path . . . 52

6.3 Solving the equations . . . 52

6.3.1 Symmetry transformation . . . 53

6.3.2 Matrix-vector form . . . 54

6.4 Handling nonconvexity and singularity . . . 55

6.5 Uniqueness of symmetric search directions . . . 55

6.6 Computing step lengths . . . 56

6.7 Initial point calculation . . . 56

6.8 A Mehrotra-like method . . . 57

6.8.1 The predictor step . . . 57

6.8.2 The corrector step . . . 58

6.9 The algorithm . . . 59

6.10 Utilization of structure . . . 59

6.11 Finding the controller . . . 60

6.12 Algorithm performance . . . 60

7 Evaluation of the methods 61 7.1 Benchmarking criteria . . . 61

7.2 Constrained matrix-optimization problem library, COMPleib . . . 62

7.2.1 Aircraft (AC) problems . . . 62

7.2.2 Reduced order controller (ROC) feedback problems . . . 62

7.3 Evaluation details . . . 62

7.3.1 How the evaluation was performed . . . 62

7.3.2 The tests . . . 64

7.4 Evaluated methods . . . 64

7.4.1 The Barrier method . . . 64

7.4.2 The Primal-Dual method . . . 65

7.4.3 HIFOO . . . 65

7.5 An in-depth study of AC8 and ROC8 . . . 65

7.5.1 AC8 . . . 66

7.5.2 ROC8 . . . 67

7.6 Concluding the results . . . 68

7.6.1 AC problems . . . 69

(14)

7.6.3 Quantifying the results . . . 72 7.7 Concluding remarks on the evaluation . . . 73

8 Conclusions and further work 75

Bibliography 77

A Additional tables with results 81

A.1 Contents . . . 81 A.2 Notation . . . 81

(15)

Notational conventions

Abbreviations and Acronyms

Abbreviation Meaning

BFGS Broyden, Fletcher, Goldfarb, Shannon

KKT Karush-Kuhn-Tucker

KYP Kalman-Yakubovic-Popov

LMI Linear Matrix Inequality LTI Linear Time-Invariant NLP NonLinear Problem/Program

NT Nesterov-Todd

ROF Reduced Order Feedback

SDP SemiDefinite Program

SOCP Second Order Cone Program SOF Static Output Feedback SVD Singular Value Decomposition

(16)

Symbols, Operators and Functions

Notation Meaning

A  () 0 The matrix A is positive (semi)definite A ≺ () 0 The matrix A is negative (semi)definite A K(K) 0 Generalized (strict) inequality in the cone K

x ≥ (>) 0 Element-wise (strict) inequality

AT Transpose of matrix A

A−1 Inverse of matrix A

A⊥ Orthogonal complement of matrix A Aij Component (i, j) of matrix A

xi The ith element of vector x

det(A) Determinant of a matrix A ∈ Rn×n In Identity matrix of size n × n

hA, Bi Inner product of A and B ||x||p p-norm of vector x

||A||p p-norm of matrix A

tr(A) Trace of a matrix A

range A Range of a matrix A

ker A Kernel of a matrix A

A ⊗ B Kronecker product of matrix A and B

A ⊗sB Symmetric Kronecker product of matrix A and B

vec(A) Vectorization of matrix A

svec(A) Symmetric vectorization of matrix A ∈ S vech(A) Half-vectorization of matrix A ∈ Rn×n

∇xf (x) The gradient of function f (x) with respect to x

∇2

xxf (x) The hessian of function f (x) with respect to x

minxf (x) Minimize f (x) with respect to x

maxxf (x) Maximize f (x) with respect to x

argminxf (x) The minimizing argument of f (x)

blkdiag(A, B, . . .) Block diagonal matrix with submatrices A, B, . . . Re(a) The real part of the complex number a

(17)

3

Sets

Notation Meaning

R Set of real numbers

R+ Set of nonnegative real numbers

R++ Set of positive real numbers

Rn Set of real-valued vectors with n rows

Rn×m Set of real-valued matrices with n rows and m columns Sn Set of symmetric matrices of size n × n

Sn+ The positive semidefinite cone of symmetric matrices

Sn++ The positive definite cone of symmetric matrices

(18)
(19)

1

Introduction

1.1

Background

The development of robust control theory emerged during the 80s and and a contributory factor certainly was the fact that the robustness of Linear Quadratic Gaussian (LQG) con-trollers can be arbitrarily bad as reported in Doyle (1978). A few years later, in Zames (1981), an important step in the development towards a robust control theory was taken, where the concept of H∞theory was introduced. The H∞synthesis, which is an

impor-tant tool when solving robust control problems, was a cumbersome problem to solve until a technique was presented in Doyle et al. (1989), which is based on solving two Riccati equations. Using this method, the robust design tools became much easier to use and gained popularity. Quite soon thereafter, linear matrix inequalities (LMIs) were found to be a suitable tool for solving these kinds of problems by using reformulations of the Riccati equations. Also related problems, such as gain scheduling synthesis, fit into the LMI framework. In parallel to the theory for solving problems using LMIs, numerical methods for this purpose was being developed and made available.

Typical applications for robust control include systems that have high requirements for robustness to parameter variations and for disturbance rejection. The controllers that result from these algorithms are typically of very high order, which complicates imple-mentation. However, if a constraint on the maximum order of the controller is set, that is lower than the order of the plant, the problem is no longer convex and is then relatively hard to solve. These problems become very complex, even when the order of the system to be controlled is low. This motivates the use of efficient algorithms that can solve these kinds of problems.

We have devoted Chapter 4 to briefly present methods that has been found in the liter-ature which solve the problem of finding a reduced order H∞controller. The conclusion

from that chapter is that a method named HIFOOperforms best among all the methods. A drawback with this method is that the resulting controller is sensitive to the choices of

(20)

initial points which are randomized, hence the results from running this method can be unreliable, unless a big batch of runs are performed.

1.2

Contributions

Two methods for solving the problem of finding low order H∞ controllers have been

proposed. An evaluation of the methods and a comparison with an existing method in the literature have been made. The evaluation of the method shows that the proposed methods perform better than the existing method when the aim is to find lower order controllers with no or little degradation of the closed loop performance. When the lowest order controller is to be found that has no more than +50 % performance loss, then the three methods perform equally well.

1.3

Thesis outline

The thesis is structured as follows. In Chapter 2, we present theorems related to linear systems and H∞ synthesis. In Chapter 3 optimization preliminaries are presented. In

Chapter 4, we present some existing methods for reduced order H∞synthesis and some

general optimization methods that could be used in this application. In Chapter 5, we present a new method for low order H∞synthesis which is based on a primal log-barrier

formulation and in Chapter 6 we another method is presented which is based on a primal-dual framework. In Chapter 7 the two mentioned methods are evaluated and compared with an existing method found in the literature named HIFOO. Additional tables with results can be found in Appendix A. We summarize and conclude the results and suggest topics for future studies in the last chapter of the thesis, Chapter 8.

(21)

Part I

Preliminaries

(22)
(23)

2

Linear systems and

H

synthesis

In this chapter, basic theorems related to linear systems and H∞synthesis are presented.

This includes defining the performance measure for a system, reformulating this to an LMI (Linear Matrix Inequality) framework and formulating the nonlinear semidefinite program, which eventually will be the problem that we will solve. Those familiar with the subject may want to skip directly to the next chapter.

2.1

Linear system representation

We begin by describing a linear system, H, with state vector, x ∈ Rnx. The input vector contains the disturbance signal, w ∈ Rnw, and the control signal, u ∈ Rnu. The output vector contains the measurement, y ∈ Rny, and the performance measure, z ∈ Rnz. In terms of its system matrices, we can represent the linear system as

H :   ˙ x z y  =   A B1 B2 C1 D11 D12 C2 D21 D22     x w u  , (2.1)

where D22 is assumed to be zero, i.e., the system is strictly proper from u → y. For

simplicity, it is assumed that the system is on minimal form, i.e., it is both observable and controllable. However, in order to find a controller, it is enough to assume detectability and stabilizability (non observable and non controllable modes are stable), but the for-mulas will become more complex in this case. The system is illustrated in Figure 2.1.

2.1.1

The controller

The linear controller is denoted K. It takes the system measurement, y, as input and the output vector is the control signal, u. The system matrices for the controller are defined

(24)

H

w

u

z

y

Figure 2.1: A system with two inputs and two outputs. The signals are: disturbance, w, control input, u, performance measure, z, and output, y.

by the equation K : ˙xK u  =KA KB KC KD  xK y  , (2.2)

where xK ∈ Rnkis the state vector of the controller. How the controller is connected to

the system is illustrated in Figure 2.2.

H

K

w

y

u

z

Figure 2.2: A standard setup for H∞ controller synthesis, with the plant H

con-trolled through feedback by the controller K.

In the literature, it is common to distinguish between three different cases of control problems depending on how many states the controller has. These are:

1. The Static Output Feedback (SOF) problem where nk = 0, i.e., the controller has

no states.

2. The Reduced Order Feedback (ROF) problem where 0 < nk< nx.

3. The Full order controller problem where nk = nx.

The first two problems are considered difficult to solve efficiently while the third one is easy to solve in comparison with the other two. The is due to the third problem being convex, while the other two are nonconvex. See Chapter 3 for details about convexity.

2.1.2

The closed loop system

Now we want to derive expressions for the closed loop system in terms of the system matrices of the plant and the controller. Let us denote the closed loop system by Hc. The

(25)

2.2 Performance bounds 11

system matrices of the closed loop system can be derived by combining (2.1) and (2.2): u = KCxK+ KDy = KCxK+ KDC2x + KDD21w ˙ xK= KAxK+ KBy = KAxK+ KBC2x + KBD21w ˙ x = Ax + B1w + B2u = (A + B2KDC2)x + B2KCxK+ (B1+ B2KDD21)w z = C1x + D11w + D12u = (C1+ D12KDC2)x + D12KCxK+ (D11+ D12KDD21)w

From the equations above we obtain the matrix-vector form expressions as

Hc:   ˙ x ˙ xK z  =   A + B2KDC2 B2KC B1+ B2KDD21 KBC2 KA KBD21 C1+ D12KDC2 D12KC D11+ D12KDD21     x xK w  . (2.3) Denoting the closed loop system states xc = xxK and using the above matrix partition-ing, we can write (2.3) as

 ˙xC z  =AC BC CC DC  xC w  , (2.4) where xC∈ Rnx+nk.

2.2

Performance bounds

Performance bounds are used to specify the properties of a dynamic system in terms of the maximum gain. Using the Bounded real lemma, which is introduced below, we can state an equivalent condition for a system to have H∞norm less than γ, which is more

suitable for our purpose. We have chosen not to state the problem of finding the optimal H∞norm of the system. The reason is that it is more computationally and theoretically

difficult than finding a suboptimal controller given a certain value of γ, see (Postlethwaite and Skogestad, 1996, page 367). This is also motivated by the fact that we can scale the problem in such a way that a specific value of γ corresponds to the desired performance of the closed loop system.

Lemma 2.1 (Bounded real lemma, Scherer (1990))

For anyγ > 0 we have that A is Hurwitz and ||H(s)||∞ < γ hold if and only if there

exists aP ∈ S++such that

ATP + P A + CTC P B + CTD

BTP + DTC DTD − γ2 

≺ 0. (2.5)

We can rewrite the inequality (2.5) as P A + ATP P B BTP −γ2I  +C T DT  I C D ≺ 0. (2.6)

(26)

Multiply (2.6) by γ−1and let P1= γ−1P and we get P1A + ATP1 P1B BTP 1 −γI  +C T DT  γ−1I C D ≺ 0. (2.7)

From now on, we will drop the index on P for convenience. To continue from here, the following lemma is useful.

Lemma 2.2 (Schur complement, Boyd et al. (1994)) Assume thatR ∈ Sn

, S ∈ Sm

, G ∈ Rn×m. Then the following conditions are equivalent:

R ≺ 0, S − GTR−1G ≺ 0 and  S GT G R  ≺ 0. (2.8)

Proof: Perform a congruence transformation of (2.8) using the non-singular matrix

U =  I 0 −R−1G I  , (2.9)

i.e., post-multiply (2.8) by U and pre-multiply it by UT and we get I −GTR−1 0 I   S GT G R   I 0 −R−1G I  =S − G TR−1G 0 0 R  ≺ 0, (2.10)

which proves the lemma.

Now, by using Lemma 2.2 the inequality (2.7) can be written as   P A + ATP P B CT BTP −γI DT C D −γI  ≺ 0, (2.11)

which is, given P and γ, an LMI in the system matrices A, B, C, D. Given A, B, C, D, it is an LMI in P and γ. The inequality (2.5) can now be replaced by (2.11) in Lemma 2.1.

2.2.1

H

controller synthesis

Next, we want to use Lemma 2.1, with (2.5) rewritten as (2.11), in order to find a controller such that the H∞ norm of the closed loop system is less than γ. Let the matrix P be

partitioned as

P = X X2 X2T X3



(27)

2.2 Performance bounds 13

and insert this together with the closed loop system matrices (AC, BC, CC, DC) into

(2.11). After some rearrangements we get the matrix inequality     XA + ATX ATX 2 XB1 C1T XT 2A 0 X2TB1 0 BT 1X B1TX2 −γI DT11 C1 0 D11 −γI     | {z } Q + +     XB2 X2 X2TB2 X3 0 0 D12 0     | {z } U KD KC KB KA  | {z } K C2 0 D21 0 0 I 0 0  | {z } VT +C2 0 D21 0 0 I 0 0 T | {z } V KD KC KB KA T | {z } KT     XB2 X2 XT 2B2 X3 0 0 D12 0     T | {z } UT ≺ 0, (2.13)

which is bilinear in the controller variables, (KA, KB, KC, KD) and the matrices (X,

X2, X3). Here we have given the matrices names (Q, U, K, V ), which correspond to

the matrices in the next lemma, which states two conditions on the existence of a K that satisfies (2.13). First, we need to define the concept of an orthogonal complement. Definition 2.1 (Orthogonal complement). Let V⊥denote any full rank matrix such that ker V⊥= range V . Then V⊥is an orthogonal complement of V .

Remark 2.1. Note that for V⊥ to exist, V needs to have linearly dependent rows. We have that V⊥V = 0. There exist infinitely many choices of V⊥.

In the following lemma, the matrix K is eliminated, which explains the name of the lemma.

Lemma 2.3 (Elimination Lemma, Gahinet and Apkarian (1994))

Given matricesQ ∈ Rn×n,U ∈ Rn×m,V ∈ Rn×p, there exists aK ∈ Rm×psuch that

Q + U KVT + V KTUT ≺ 0 (2.14)

if and only if

U⊥QU⊥T ≺ 0 and V⊥QV⊥T ≺ 0, (2.15)

whereU⊥is an orthogonal complement ofU and V⊥is an orthogonal complement ofV . IfU⊥orV⊥does not exist, the corresponding inequality disappears.

Proof: The necessity part is obtained by pre- and post-multiplying (2.14) by U⊥ (V⊥) and U⊥T (V⊥T), respectively. For the sufficiency part, see Gahinet and Apkarian (1994).

(28)

The next step is to derive the orthogonal complements U⊥and V⊥. Note that U =     XB2 X2 XT 2B2 X3 0 0 D12 0     =  P 0 0 I      B2 0 0 I 0 0 D12 0     . Additionally, let P−1= Y Y2 YT 2 Y3  . (2.16)

An orthogonal complement U⊥can now be constructed as

U⊥=     B2 0 0 I 0 0 D21 0     ⊥  P−1 0 0 I 

and by using Lemma 2.3 and performing some rearrangements, (2.13) is now equivalent to the two LMIs

NX 0 0 I T   XA + ATX XB 1 C1T BT 1X −γI DT11 C1 D11 −γI   NX 0 0 I  ≺ 0 NY 0 0 I T   AY + Y AT Y CT 1 B1 Y C1 −γI D11 BT 1 D11T −γI   NY 0 0 I  ≺ 0, (2.17)

where NXand NY denote any base of the null-spaces of C2D21 and B2T DT12 respec-tively. Now, the LMIs (2.17) are coupled by the relation of X and Y through (2.12) and (2.16), which is not very convenient. Therefore, consider the following lemma.

Lemma 2.4 (Packard (1994)) SupposeX, Y ∈ Snx

++ andnk is a non-negative integer. Then the following statements

are equivalent.

1. There existX2, Y2∈ Rnx×nkandX3, Y3∈ Snksuch that

P = X X2 XT 2 X3   0 and P−1= Y Y2 YT 2 Y3   0. (2.18) 2. X I I Y   0 and rankX I I Y  ≤ nx+ nk. (2.19)

Remark 2.2. By using the relations (2.9) and (2.10), we note that

X I I Y  =I Y −1 0 I  X − Y−1 0 0 Y   I 0 Y−1 I  (2.20)

(29)

2.2 Performance bounds 15

which implies that rankX I

I Y



= nx+ rank(X − Y−1) = nx+ rank(XY − I)

⇔ rank(XY − I) ≤ nk. (2.21)

We would like to replace the rank constraint in (2.19) with a smooth function to be able to apply gradient methods for optimization. To do this, we use the following lemma. Lemma 2.5 (Helmersson (2009))

Assume that the inequality

X I I Y   0 (2.22) holds. Let det(λI − (I − XY )) = nx X i=0 ciλi = λnx+ cnx−1λ nx−1+ . . . + c 1λ + c0 (2.23)

be the characteristic polynomial of(I − XY ). Then the following statements are equiv-alent ifnk< nx:

1. rank(XY − I) ≤ nk

2. cnx−nk−1(X, Y ) = 0 Additionally,

ci(X, Y ) ≥ 0, ∀i. (2.24)

Remark 2.3. The computation of cican be done in MATLABusing the command poly.

The first and second order derivatives of ci cannot be calculated directly by using that

command. Instead, we can use a procedure described in Helmersson (2009).

The solvability conditions for the H∞problem, which is essential for this thesis, will

now be stated.

Lemma 2.6 (H∞controllers for continous plants)

The problem of finding a linear controller such that the closed loop systemHc is stable

and such that ||Hc||∞ < γ, is solvable if and only if there exist X, Y ∈ S++nx, which

satisfy NX 0 0 I    XA + ATX XB 1 C1T BT 1X −γI DT11 C1 D11 −γI   NX 0 0 I T ≺ 0 (2.25a) NY 0 0 I    AY + Y AT Y C1T B1 Y C1 −γI D11 BT1 D11T −γI   NY 0 0 I T ≺ 0 (2.25b) X I I Y   0 (2.25c) cnx−nk−1(X, Y ) = 0. (2.25d)

(30)

Proof: Combine Lemma 2.1–2.5, or equivalently, combine the LMI reformulation of Theorem 4.3 in Gahinet and Apkarian (1994) with Lemma 2.5.

2.3

Reformulation of LMI constraints

Throughout this thesis, we will interchangeably use different reformulations of (2.25), depending on the context. A useful concept when vectorizing symmetric matrix variables is half-vectorization, which will be defined next.

Definition 2.2 (Half-vectorization). Let

X =       x11 x12 . . . x1n x21 x22 ... .. . . .. xn1 xn2 . . . xnn       . Then vech(X) = x11 x21 . . . xn1 x22 . . . xn2 x33 . . . xnn T , i.e., vech stacks the columns of X from the principal diagonal downwards in a column vector. See Lütkepohl (1996) for properties and details.

Remark 2.4. Note that vech(X) does not require X to be symmetric, but if it is not, obviously some information will be lost and the operation is not invertible.

By choosing appropriate symmetric matrices A(j)i and C(j)and using vech, we can

rewrite (2.25) equivalently as m X i=1 A(j)i xi≺ C(j), j = 1, 2, 3 (2.26a) ¯ cnx−nk−1(x) = 0, (2.26b) where x = vech(X) vech(Y )  ∈ Rm and m = n

x(nx+ 1). The index j corresponds to

each of the three LMIs. Note that (2.26a) replaces the LMIs (2.25a)–(2.25c) and that (2.26b) replaces (2.25d). The LMIs (2.26a) can be written in an even more compact way by constructing block diagonal matrices Ai = blkdiag(A

(1) i , A (2) i , A (3) i ) and C =

blkdiag(C(1), C(2), C(3)) which lets us write (2.26) as

m X i=1 Aixi≺ C ¯ cnx−nk−1(x) = 0. (2.27)

The formulation (2.27) will consist of bigger matrices than (2.26). These three representa-tions of the problem, (2.25)–(2.27), will be used interchangeably in this thesis, depending on the context.

(31)

2.4 Gramians and balanced realizations 17

2.4

Gramians and balanced realizations

It is well known that for a linear system, there exist infinitely many realizations and for two realizations of the same system, there exists a transformation matrix T that connects the representations. Let (A, B, C, D) and ( ¯A, ¯B, ¯C, ¯D) be the system matrices for two realizations of the same system. Then there exist a transformation matrix T such that

¯

A = T AT−1, B = T B,¯ C = CT¯ −1, D = D.¯ (2.28) The two systems have the same input-output properties, but are represented differently. The state vectors are connected by the relation ¯x = T x.

Definition 2.3 (Controllability and Observability gramians). Let (A, B, C, D) be the system matrices for a stable linear system of order n. Then there exist X, Y ∈ Sn+that

satisfy

XA + ATX + BBT = 0, AY + Y AT + CTC = 0, (2.29) where X, Y are called the Controllability and Observability gramians, respectively.

The gramians are often used in model reduction algorithms and can be interpreted as a measure of controllability and observability of a system. For more details, see e.g. Glover (1984) or Postlethwaite and Skogestad (1996).

Definition 2.4 (Balanced realization). A system is said to be in a Balanced realization if the Controllability and Observability gramians are equal and diagonal matrices, i.e., X = Y = Σ, where Σ = diag(σ1, . . . , σn) and where {σ1, . . . , σn} are called the

Hankel singular values.

For any stable linear system, a transformation matrix T can be computed that brings the system to a balanced realization by using (2.28). The procedure is sometimes called Balancing and described in Glover (1984), but stated here in Algorithm 1 for convenience. Actually, we can perform a more general type of a balanced realization around any X0, Y0∈ Sn++which is described in Definition 2.5.

Definition 2.5 (Balanced realization (around (X0, Y0)). Given any X0, Y0 ∈ Sn++

which need not necessary be solutions to (2.29), we can calculate a transformation T such that (2.31) holds. Then the realization described by (2.30) is a Balanced realization around X0, Y0.

The balancing procedure around X0, Y0is the same as normal balancing, thus

Algo-rithm 1 can be used for the same purpose. The balanced realizations of a system has shown to have some nice numerical properties. We will come back to the impact of balancing in Chapter 5.

(32)

Algorithm 1 Balancing Given X, Y ∈ Sn++.

1: Perform a Cholesky factorization of X and Y :

X = RTXRX, Y = RTYRY

2: Then perform a Singular Value Decomposition (SVD) RYRTX = U ΣV

T,

where Σ is a diagonal matrix with positive elements (the singular values) along the diagonal.

3: The state transformation can now be calculated as T = Σ−1/2VTRX.

4: The new system matrices are then given by ¯

A = T AT−1, B = T B,¯ C = CT¯ −1, D = D¯ (2.30) and

¯

X = T−TXT−1= Σ, Y = T Y T¯ T = Σ. (2.31)

2.5

Recovering the matrix

P

from

X

and

Y

Assume that we have found a pair X, Y that satisfy (2.25). We now wish to construct a P such that (2.18) holds. First note the equality

 Y Y2 YT 2 Y3  =  (X − X2X3−1X2T)−1 −X−1X2(X3− X2TX−1X2)−1 −X3−1XT 2(X − X2X3−1X2T)−1 (X3− X2TX−1X2)−1  , (2.32) which is easily verified by multiplying by



X X2

XT2 X3 

from the left. Now we want to find X2 ∈ Rnx×nk and X3 ∈ Rnk×nk such that X − Y−1 = X2X3−1X

T

2, i.e., using the

fact that the (1, 1) elements in (2.32) are equal. Perform a Cholesky factorization of X = RT

XRXand Y = RTYRY. Then we have

RTXRX− R−1Y R−TY = X2X3−1X T 2,

which can be rewritten as

RYRTXRXRYT − I = RYX2X3−1X T 2R

T Y.

Then use a singular value decomposition RYRTX= U ΣVT and we obtain

U (Σ2− I)UT = U Γ2UT = RYX2X3−1X T 2R

T

(33)

2.6 Obtaining the controller 19 where Σ =Σnk 0 0 Inx−nk  , Γ2= Σ2− Inx, Γ = Γnk 0 0 0  .

Similar to the previous section, we get the transformation matrix T = Σ−1/2VTRX

which balances the system, such that T−TXT−1= T Y TT = Σ. Now we can choose X3= Σnk and X2= T TΓnk 0  , which satisfies (2.18).

2.6

Obtaining the controller

In the previous section we recovered the matrix variable P = 

X X2

X2T X3 

, and we can obtain the controller system matrices KA, KB, KC, KDby solving the following optimization

problem, which was suggested in Beran (1997). min

d,KA,KB,KC,KD d subject to LMI (2.13) ≺ dI

(2.34)

Since we have used Lemma 2.6, we know that the solution satisfies d∗ ≤ 0. Numerical issues can however complicate things, that is why we try to obtain a margin by minimizing d. Normally, this should give us a d < 0, but if not, we may have to modify the problem by increasing γ and try to solve again.

The resulting closed loop H∞norm is calculated by using the MATLAB command norm(Hc,inf,tol), where Hc refers to the closed loop system and tol is the desired tolerance. From Lemma 2.1 we know that the norm of the plant will be less than γ, i.e., ||Hc||∞< γ.

2.7

A General algorithm for

H

synthesis

Now we outline an algorithm for H∞ synthesis using an LMI approach, presented as

Algorithm 2. The first step in this algorithm requires a solution of a nonconvex problem, and algorithms that solves that is what we will focus on in this thesis.

Algorithm 2 Algorithm for H∞synthesis using LMIs

Given performance criteria γ and system matrices (A, B, C, D).

1: Find X, Y that satisfy (2.25) or equivalently, x that satisfies (2.26) or (2.27). 2: Balance the system around (X, Y ), as described in Section 2.4.

3: Recover P from X and Y as described in Section 2.5.

4: Solve (2.34) to get the controller system matrices (KA, KB, KC, KD).

(34)

2.8

Formulating an ROF problem as an SOF problem

The procedure formulating an ROF problem as an SOF problem is described in El Ghaoui et al. (1997) but we will restate it here and provide some additional details. Study the plant in (2.1). Assume we augment the state vector with xK ∈ Rnc, the control input vector

with uK ∈ Rnc and the measured output with yK ∈ Rnc and rename them ˜x, ˜u and ˜y,

respectively, i.e., ˜ x = x xK  ˜ u =uK u  ˜ y =yK y  . (2.35)

In plain words, we add nc states, nc control signals and nc outputs to the system. The

system matrices for the new augmented plant, we choose as ˜ A =  A 0nx×nc 0nc×nx 0nc×nc  , B˜1=  B1 0nc×nw  , B˜2= 0nx×nc B2 Inc×nc 0nc×nu  ˜ C1= C1 0nz×nc , D˜11= D11, D˜12= 0nz×nc D12  (2.36) ˜ C2= 0nc×nx Inc×nc C2 0ny×nc  , D˜21= 0nc×nw D21  , ˜D22= 0(nc+ny)×(nc+nu). Now we want to use feedback on this system using a static output feedback controller, i.e., a controller with no states. The control signal is calculated as

˜ u = ˜K ˜u =KA KB KC KD  ˜ y, (2.37)

where we have given names to the submatrices in ˜K, for reasons that will be apparent later. This results in the closed loop system equations

˙˜ x = ˜A˜x + ˜B1w + ˜B2u˜ =A + B2KDC2 B2KC KBC2 KA  ˜ x +B1+ B2KDD21 KBD21  w (2.38) z = ˜C1x + ˜˜ D11w + ˜D12u˜ = C1+ D12KDC2 D12KC ˜x + D11+ D12KDD21w. (2.39)

These equations are equal to the ones in (2.3), which means that by augmenting the plant this way and synthesizing an H∞SOF controller u = ˜K ˜u we can get an ROF controller

for the original plant using the exact same matrices in the controller, but arranged dif-ferently. The augmented system contains more states than the original plant, which may not be a desired property, since it may increase the computational burden. However, the control parameters do not increase in number, which may benefit certain methods. We will come back to this in Chapter 4.

(35)

3

Optimization

In optimization one wants to maximize or minimize a function value and at the same time satisfy given constraints. In this chapter, optimization preliminaries are presented and some methods that can be used to solve optimization problems will be outlined. The presentation closely follows relevant sections in Boyd and Vandenberghe (2004) and No-cedal and Wright (2006). Those familiar with the subject may want to skip directly to the next chapter.

3.1

Nonconvex optimization

In nonconvex optimization, the objective function or the constraint or both are nonconvex. There is no efficient method to solve a general nonconvex optimization problem, so spe-cialized methods are often used to solve them. Even small problems in few variables may be hard to solve, and larger problems with many variables may be impossible to solve. Nonconvex optimization is a very active research topic today.

3.1.1

Local methods

One approach to use when solving nonconvex optimization problems is to use local meth-ods. A local method searches among the feasible points in a local neighborhood for the optimal value. The drawback is that the solution we find may not be the global optimum, and that it may be very sensitive to the choice of the initial starting point, which in gen-eral must be provided or found using some heuristics. The parameters in the algorithm may have to be tuned for these algorithms to work. Both the Barrier method and the Primal-Dual method, which are presented in this thesis, are local methods.

(36)

3.2

Convex optimization

Convex optimization problems are a subgroup of nonlinear optimization problems, where the objective function and the constraint set are convex. This in turn gives some very useful properties. Convex problems are in general considered easy to solve, in comparison to nonconvex ones, but it may not always be so due to e.g. large scale or numerical issues or both. Surprisingly many nonconvex problems can be rewritten as equivalent convex ones.

3.2.1

Convex sets and functions

In this section, convex sets and functions are defined, which are important concepts in optimization. We begin by defining a convex set.

Definition 3.1 (Convex set). A set C ⊆ Rnis a convex set if the line segment between two arbitrary points x1, x2∈ C lies within C, i.e.,

θx1+ (1 − θ)x2∈ C, θ ∈ [0, 1], (3.1)

which means that all the points on the line segment must be in the set for it to be convex. The concept is illustrated in Figure 3.1.

θ x1+ (1−θ)x2

C

Figure 3.1: Illustrations of a convex set as defined in Definition 3.1.

With the definition of a convex set, we can continue with the definition of a convex function.

Definition 3.2 (Convex function). A function f : Rn → R is a convex function if

dom f is a convex set and if for all x1, x2∈ dom f we have that

f θx1+ (1 − θ)x2 ≤ θf (x1) + (1 − θ)f (x2), θ ∈ [0, 1]. (3.2)

In plain words, this means that the line segment between x1, f (x1) and x2, f (x2) lies

above the graph of f as illustrated in Figure 3.2.

Definition 3.3 (Cone). A set K ⊆ Rnis a cone, if for every x ∈ K we have that

(37)

3.2 Convex optimization 23 x1, f(x1) x2, f(x2)  θ f(x1) + (1−θ)f(x2) f θx1+ (1−θ)x2  f x1 x2

Figure 3.2: Illustrations of a convex function as defined in Definition 3.2.

Definition 3.4 (Convex cone). A cone K ⊆ Rn is a convex cone, if it is convex or

equivalently, if for arbitrary points x1, x2∈ K we have

θ1x1+ θ2x2∈ K, θ1, θ2≥ 0. (3.4)

3.2.2

Generalized inequalities

In this section, we will explain the concept of generalized inequalities. To do that, we first need to define a proper cone.

Definition 3.5 (Proper cone). A convex cone K is a proper cone, if the following prop-erties are satisfied:

• K is closed.

• K is solid, i.e., it has nonempty interior.

• K is pointed, i.e., x ∈ K and −x ∈ K implies x = 0.

By using the definition of proper cones, we can define the concept of generalized inequalities.

Definition 3.6 (Generalized inequality). A generalized inequality Kwith respect to a

proper cone K is defined as

x1Kx2⇔ x1− x2∈ K. (3.5)

The strict generalized inequality (K) is defined analogously. From now on, the index

K is sometimes dropped when the cone is implied from context. We remark that the set of semidefinite matrices is a proper cone.

(38)

3.3

A problem with matrix inequalities

Now we define an optimization problem with matrix inequality constraints using the con-cepts described in the previous section.

min

x f0(x)

subject to fi(x)  0, i ∈ I,

hi(x) = 0, i ∈ E ,

(3.6)

where f0 : Rn → R, fi : Rn → Smi, hi : Rn → Rmi, and where  denotes positive

semidefiniteness. The functions are smooth and real-valued and E and I are two finite sets of indices.

3.4

First order optimality conditions

In this section, the first order necessary conditions for x∗ to be a local minimizer are stated. When presenting these conditions, the following definition is useful.

Definition 3.7 (Lagrangian function). The Lagrangian function (or just Lagrangian for short) for the problem in (3.6) is

L(x, Z, v) = f0(x) − X i∈I hZi, fi(x)i − X i∈E hvi, hi(x)i (3.7)

where Zi∈ Sn+and hA, Bi = tr(A T

, B) denotes the inner product between A and B. We are now ready to state the first-order necessary conditions for (3.6) that must hold at x∗for it to be an optimum.

Theorem 3.1 (First order necessary conditions, Boyd and Vandenberghe (2004)) Supposex∗is any local solution of (3.6), and that the functionsfi, hi in (3.6) are

con-tinuously differentiable. Then there exists Lagrange multipliersZi∗, i ∈ E and v∗i, i ∈ I, such that the following conditions are satisfied at(x∗, Z∗, v∗)

∇xL(x∗, Z∗, v∗) = 0, (3.8a)

hi(x∗) = 0, ∀i ∈ E, (3.8b)

fi(x∗)  0, ∀i ∈ I, (3.8c)

Zi∗fi(x∗) = 0, ∀i ∈ I. (3.8d)

Zi∗ 0, ∀i ∈ I, (3.8e)

The conditions (3.8) are sometimes referred to as the Karush-Kuhn-Tucker (KKT) conditions.

3.5

Unconstrained optimization

In this section we will present the background theory for solving the unconstrained opti-mization problem

min

(39)

3.5 Unconstrained optimization 25

where f : Rn→ R is twice continuously differentiable. We assume that the problem has

at least one local optimum x∗. The following theorem will be helpful later on. Theorem 3.2 (First order necessary conditions, Nocedal and Wright (2006))

Ifx∗is a local minimizer andf (x) is continuously differentiable in an open neighborhood ofx∗, then∇f (x∗) = 0.

There exist several kinds of methods to solve problems like (3.9), among these are the gradient methods. We are going to describe Newton’s method, which is thoroughly described in e.g. Nocedal and Wright (2006); Boyd and Vandenberghe (2004), where also convergence proofs are provided.

3.5.1

Newton’s method

Assume we have a point xk ∈ dom f (x). Then we can create a second order Taylor

approximation (or model) Mk(p) of f (x) at xkas

Mk(p) = f (xk) + ∇f (xk)Tp + 1 2p T2f (x k)p. (3.10) xk, f(xk)  Mk(p) f(x) f(x) xk+pNk, f(xk+pNk)  x, f(x∗ ) x

Figure 3.3: The function f (x) and its second order Taylor approximation Mk(p).

The Newton step pNk is the minimizer of Mk(p). The optimum x∗minimizes f (x).

Definition 3.8. The Newton direction pN

Kis defined by

∇2f (x

k)pNk = −∇f (xk). (3.11)

The Newton direction has some interesting properties. If ∇2f (xk)  0:

• pN

k minimizes Mk(p) in (3.10), as illustrated in Figure 3.3.

• ∇f (xk)TpNk = −∇f (xk)T ∇2f (xk)

−1

∇f (xk) < 0 unless ∇f (xk) = 0, i.e.,

(40)

If ∇2

f (x)  0, the Newton step does not minimize Mk(p) and is not guaranteed be a

decent direction. If that is the case, let Bk ∈ S++be an approximation of ∇2f (x) and

choose the search direction by solving BkpMk = −∇f (xk). Then pMk is guaranteed to be

a descent direction and minimizes the convex quadratic model MkB(p) = f (xk) + ∇f (xk)Tp +

1 2p

TB

kp. (3.12)

An approximation Bk can for example be chosen by adding a constant times the identity

matrix, such that

Bk= ∇2f (xk) + dI  0, d ≥ 0. (3.13)

Definition 3.9 (Newton decrement). The Newton decrement Λ(xk) is defined as

Λ(xk) = ∇f (xk)TBk−1∇f (xk) 1/2

= (−∇f (xk)TpMk )

1/2. (3.14)

The Newton decrement is useful as a stopping criterion. Let MkB(p) be the convex quadratic model of f at xk. Then

f (xk) − inf p M B k (p) = f (xk) − MkB(p M k ) = 1 2Λ(xk) 2, (3.15) and we see that Λ(xk)2/2 can be interpreted as an approximation of f (xk) − p∗, i.e., a

measure of distance from optimality based on the quadratic convex modified approxima-tion of f at xk.

3.5.2

Line search

When a descent direction pM

k is determined, we want to find the minimum of the function

along that direction. Since MB

k (p) is only an approximation, pMk will not necessarily

minimize f (xk+ p). Therefore we want to solve

min

α∈(0,1]

f (xk+ αpMk ). (3.16)

This can be done in several ways. The approach we will describe here is called Back-tracking line search, which is an approximate method, but cheaper in computation than exact methods. The idea is not to solve (3.16), but just to reduce f enough, according to some criterion. The procedure is outlined in Algorithm 3 and illustrated in Figure 3.4. Algorithm 3 Backtracking line search

Given a descent direction pkfor f (xk), β ∈ (0, 0.5), γ ∈ (0, 1).

α := 1

while f (xk+ αpk) > f (xk) + βα∇f (xk)T do

α := γα end while

(41)

3.6 Constrained optimization 27 α=0 α0 f(xk) +α∇f(xk)Tpk f(xk) +αβ∇f(xk)Tpk f(xk+α pk) α f

Figure 3.4: The figure illustrates Backtracking line search. The curve shows the function f along the descent direction. The lower dashed line shows the linear ex-trapolation of f and the upper dashed line has a slope of factor β smaller. The condition for accepting a point is that α ≤ α0.

3.5.3

Newton’s method with Hessian modification

Now we outline a method in Algorithm 4, which is based on Algorithm 9.5 in Boyd and Vandenberghe (2004), but with a Hessian modification as suggested in Nocedal and Wright (2006). The stopping criterion used is the Newton decrement, defined in Defini-tion 3.9.

Algorithm 4 Newton’s method with Hessian modification Given a starting point x0∈ dom f , tolerance  > 0

loop

Compute the Newton step by solving BkpMk = −∇f (xk) with minimum value

d ≥ 0 such that Bk= ∇2f (xk) + dI  0.

Compute Newton decrement Λ(xk).

Compute stopping criterion. quit if Λ(xk) < Λtol.

Choose step size αkby using Backtracking line search.

Update iterate. xk+1:= xk+ αkpMk .

Set k := k + 1. end loop

3.6

Constrained optimization

We now turn to constrained optimization. The approach we will present here is the primal-dual approach. As the name suggests, the approach uses both the original, primal variables as well as introduces additional ones, the dual variables. The foundation of the method is

(42)

to start from the KKT conditions (3.8) and add slack variables Sito the inequalities. We

get the following equations

∇xL(x∗, Z∗, v∗) = 0, (3.17a)

hi(x∗) = 0, ∀i ∈ E, (3.17b)

fi(x∗) − S∗i = 0, ∀i ∈ I, (3.17c)

Zi∗S∗i = 0, ∀i ∈ I. (3.17d) Zi∗  0, S∗i  0, ∀i ∈ I, (3.17e) and we note that this is now a collection of equality constraints, except for the inequality constraints (3.17e). This system of equations can be solved using Newton’s method for equations, but (3.17e) must be monitored manually and measures must be taken, such as backtracking line search to maintain the positive semidefiniteness.

3.6.1

Newton’s method for nonlinear equations

Recall from Section 3.5 that Newton’s method is based on the principle of minimizing a second order model, which is created by taking the first three terms of the Taylor series approximation, see (3.10), of the function around the current iterate xk. The Newton step

is the vector that minimizes that model. When solving nonlinear equations, Newton’s method is derived in a similar way, the difference is that a linear model is used.

Assume that r : Rn → R and that we want to find x∗such that r(x) = 0. Assume

we have a point xk such that r(xk) ≈ 0. Denote the Jacobian of r as J (x) and we can

create a linear model Mk(p) of r(xk+ p) as

Mk(p) = r(xk) + J (xk)p. (3.18)

If we choose pk = −J (xk)−1r(xk) we see that Mk(pk) = 0 and this choice of pk is

Newton’s method in its pure form. A structured description is given in Algorithm 5. If Algorithm 5 Newton’s method for nonlinear equations

Given x0

for k = 0, 1, 2, . . . do

Calculate a solution pkto the Newton equations

J (xk)pk = −r(xk).

xk+1:= xk+ pk

end for

any positivity constraints are present, such as (3.17e), we can use line search techniques to find a feasible point along the search direction.

Remark 3.1. Note that Newton’s method for nonlinear equations can be extended to func-tions r : Rn → Sn

or r : Sn→ Snby applying standard differential calculus for matrices.

(43)

3.6 Constrained optimization 29

3.6.2

Central path

As mentioned above, an algorithm that tries to solve the KKT system (3.17), could apply Newton’s method and ignore (3.17e) but take it into consideration when computing the step length. Unfortunately, only very short steps can be taken because of (3.17d), and the convergence will therefore be very slow. We can make it possible to take longer steps by relaxing the condition in (3.17d). A way to do this relaxation is to use the central path. Definition 3.10 (Central path). The central path is defined as the set of solution points for which ∇xL(x, Z, v) = 0, (3.19a) hi(x) = 0, ∀i ∈ E, (3.19b) fi(x) − Si= 0, ∀i ∈ I, (3.19c) ZiSi= νIni, ∀i ∈ I. (3.19d) Zi 0, Si 0, ∀i ∈ I, (3.19e)

where ν ≥ 0 and where L(x, Z, v) is defined in Definition 3.7.

The idea is to start with a ν = ν0 and let ν tend to zero, in order to make the limit

solution satisfy the KKT conditions. Methods following this principle are called path-following methods. A useful concept in these kinds of methods, is the duality measure. Definition 3.11 (Duality measure). The duality measure µ for (3.6), is defined as

µ =X i∈I hZi, Sii  X i∈I ni. (3.20)

The duality measure can be interpreted as a measure of optimality.

3.6.3

A generic Primal-Dual interior point method

Using what has been presented in recent sections, we can outline a simple algorithm that produces iterates that tend towards a solution of the KKT conditions.

Define ν = σµ, where 0 ≤ σ ≤ 1 usually is referred to as the Centering parameter. Write the equations (3.19a)–(3.19d) on the form r(x, Z, v, S, ν) = 0.

Assume that an initial point (x, Z, v, S) is available that satisfies the condition (3.19e). Compute the next iterate as

(x+, Z+, v+, S+) = (x, Z, v, S) + α(∆x, ∆Z, ∆v, ∆S), (3.21) where (∆x, ∆Z, ∆v, ∆S) is the search direction and α is a step length such that the next iterate S+, Z+satisfies (3.19e). We outline what is just described in Algorithm 6.

The choice of centering parameter σ is a tricky issue, which has no obvious solution. We will present one way to choose σ in Chapter 6.

In this chapter some optimization preliminaries have been presented, on which the methods described in Chapter 5 and Chapter 6 are based. In those chapters we will discuss the details more thoroughly.

(44)

Algorithm 6 A generic interior point method

Choose an initial point z = (x, Z, v, S) that satisfies (3.19e). Choose error tolerance .

Compute the duality measure µ, and choose σ. Set ν = σµ. while ||r(x, Z, v, S, ν)||2>  do

Choose a search direction (∆x, ∆Z, ∆v, ∆S) Find α > 0 such that Z+, S+are feasible in (3.21).

Update iterate using (3.21).

Compute µ using (3.20), and choose σ. Set ν = σµ. end while

(45)

Part II

Overview

(46)
(47)

4

Methods for reduced order

H

problems

Several methods that can be used for solving reduced order H∞problems have emerged

during the last twenty years. The purpose of this chapter is to give an overview of what has been done in this field of research during the last two decades.

The methods that are described here are divided in two different groups. These are specific methods for finding reduced order H∞controllers and general methods that can

be applied to solve related optimization problems. We will also briefly mention controller reduction methods, which also could be used in this context.

4.1

Specific methods for reduced order

H

problems

We begin by presenting a collection of methods which have been designed especially for solving low order H∞problems. All of the methods mentioned here are local methods

unless otherwise stated.

4.1.1

Alternating Projections

The alternating projection method, as described in Grigoriadis and Skelton (1996) and Beran (1997), appears in two versions. The first one is the Standard alternating projection with quite slow convergence and the second one is the Directional alternating projection, which is somewhat faster. The method is based on an LMI approach with symmetric matrix variables, similar to what is done in this thesis. In those methods, in each itera-tion a projecitera-tion on the rank constraint, a projecitera-tion on convex constraints and another projection on rank constraints are performed. The latter version uses algebra to improve convergence by choosing a clever direction. The method minimizes the maximum real part of the poles of the system, which is a related problem to H∞synthesis, because it

can be formulated using the same framework. The convergence of these algorithms are not always guaranteed.

(48)

4.1.2

Solving BMIs using Branch and bound

In Beran (1997), the author presents a branch and bound algorithm for solving BMIs, which divides the set into several smaller ones where upper and lower bounds on the optimal value are calculated. The upper bounds are found by fixing a parameter and the lower bound either by relaxing the bilinear constraints or using Lagrangian duality. Calculating the lower bounds in each region comes down to solving an SDP for each region.

In Tuan and Apkarian (2000) another branch and bound method is presented. It is also based on the BMI and LMI framework using symmetric matrix variables. The space consisting of the variables that are connected via bilinear expressions is divided. In each part of that space, convex LMI problems are solved to get lower and upper bounds on the objective function. The viability of the optimization algorithm is shown by solving relatively small problems in robust control. The effectiveness of the algorithm is highly dependent on the number of bilinear terms which may be high even for small problems.

Both these methods are global methods which may seem very appealing to use. How-ever the number of branches grows very fast with the number of variables, which will be a very limiting factor.

4.1.3

LMIRANK

This method is similar to the Alternating projections method described in Section 4.1.1, but it is more general and not designed specifically for solving control problems. It can solve problems involving LMIs and rank constraints on matrices. It uses a Newton like approach called Tangent and lift, in the directional step. Even though it is a Newton like approach, it does not always have local quadratic convergence and no actual convergence analysis has been done yet. It is implemented in a solver called LMIRANK1, which is described in Orsi et al. (2006), where numerical experiments, including an application to an output feedback stabilization problem, show the effectiveness of the algorithm. The solver LMIRANK can be interfaced by YALMIP, Löfberg (2004).

We have used this method on a couple of examples. However, our experience using this package is that applying this in a straight-forward way to a LMI formulation and rank constraint such as (2.17) and (2.19) has shown to fail on several examples due to numerical problems.

4.1.4

HIFOO

HIFOO2 (H-Infinity Fixed-Order Optimization) is a complete package that can be run in MATLAB. HIFOOis described in Burke et al. (2006). The supporting package HANSO, which is used for the nonconvex, nonsmooth optimization, uses a hybrid algorithm which combines a quasi-Newton algorithm (BFGS) for the initial phase and a local bundle phase which tries to verify local optimality for the best points that BFGS finds. If the bundle phase does not succeed, a gradient sampling phase, see Burke et al. (2005), is used to

1Available from: http://users.rsise.anu.edu.au/˜robert/lmirank/. 2Available from: http://www.cs.nyu.edu/overton/software/hifoo/.

(49)

4.1 Specific methods for reduced order H∞problems 35

improve the approximation of the local minimizer and it returns a rough local optimality measure.

Due to randomization of the initial points, evaluation of the algorithm is not straight-forward, since the experiments are not repeatable with the same result each time. Though, a statistical evaluation can be done by performing the experiments several times.

In Gumussoy and Overton (2008), this method is shown to perform very well com-pared to several other methods. We will use this method as a reference when we evaluate the methods that are suggested in this thesis. The evaluation is presented in Chapter 7.

4.1.5 Nonsmooth, multidirectional search

This method, presented in Apkarian and Noll (2006a), solves the problem by first minimizing the spectral abscissa of the closed loop system to find parameters for a stabilizing controller. These parameters are then used as a starting point when optimizing locally to minimize the H∞ norm. Both optimization problems are nonsmooth and nonconvex.

The optimization is done by propagating a simplex in the parameter space using reflection, expansion and contraction of the simplex. The simplex can have different forms that have different advantages depending on the problem. The method avoids symmetric matrix variables entirely, and is said to be applicable to the design of small or medium sized controllers for very large systems, as long as the decision variables are not too many. According to the authors, the method has managed to synthesize a stabilizing controller for an aircraft model with 55 states. In Gumussoy and Overton (2008), the method is compared to HIFOO when synthesizing static output feedback controllers for some small systems (four or five states). For those systems, HIFOO managed to find controllers with better performance.
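The two-phase structure can be sketched as follows, assuming a hypothetical user-supplied function closed_loop(k) that returns the closed loop state-space matrices (A, B, C, D) for controller parameters k. Apkarian and Noll (2006a) use a tailored multidirectional simplex search; the sketch below substitutes SciPy's Nelder-Mead and a coarse frequency grid for the H∞ norm, so it illustrates the strategy rather than their algorithm.

import numpy as np
from scipy.optimize import minimize

def spectral_abscissa(k, closed_loop):
    A = closed_loop(k)[0]
    return np.max(np.linalg.eigvals(A).real)

def hinf_grid(k, closed_loop, omegas=np.logspace(-3, 3, 200)):
    # Coarse frequency-gridded approximation of the closed loop H-infinity norm.
    A, B, C, D = closed_loop(k)
    I = np.eye(A.shape[0])
    return max(np.linalg.norm(C @ np.linalg.solve(1j * w * I - A, B) + D, 2)
               for w in omegas)

def two_phase(k0, closed_loop):
    # Phase 1: minimize the spectral abscissa to find a stabilizing point.
    r1 = minimize(spectral_abscissa, k0, args=(closed_loop,),
                  method='Nelder-Mead')
    assert r1.fun < 0, "no stabilizing controller found"
    # Phase 2: locally minimize the (approximate) H-infinity norm from there.
    r2 = minimize(hinf_grid, r1.x, args=(closed_loop,), method='Nelder-Mead')
    return r2.x, r2.fun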

4.1.6 Nonsmooth H∞ synthesis

In Apkarian and Noll (2006b), a method is presented that uses the same approach to the problem as the nonsmooth, multidirectional search; the difference lies in how the problem is solved. Two variants of a first order method based on subgradients are presented, which exploit the structure of the H∞ norm. The method is evaluated numerically and shown to be very efficient even for systems with hundreds of states. In the article, numerical examples show that it outperforms the Augmented Lagrangian method (see Section 4.1.8) in all examples but one. In Gumussoy and Overton (2008), HIFOO is applied to the same examples and obtains better results than both of these methods.
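For orientation, a generic subgradient iteration with a diminishing step size looks as follows; the methods of Apkarian and Noll (2006b) are considerably more refined, since they exploit the structure of the H∞ norm when forming subgradients and steps. The oracles f and subgrad are assumptions for illustration.

import numpy as np

def subgradient_descent(f, subgrad, x0, iters=500):
    x = np.asarray(x0, dtype=float)
    best_x, best_f = x.copy(), f(x)
    for t in range(1, iters + 1):
        g = subgrad(x)
        # Normalized subgradient step with a diminishing step size.
        x = x - (1.0 / t) * g / max(np.linalg.norm(g), 1e-12)
        fx = f(x)
        if fx < best_f:                      # subgradient methods are not
            best_x, best_f = x.copy(), fx    # descent methods; track the best
    return best_x, best_f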

4.1.7 Barrier method for solving BMIs

In Kocvara et al. (2005), a nonlinear SDP algorithm for static output feedback is presented. It is based on the generalized augmented Lagrangian method. The nonconvex unconstrained sub-problems are solved either by a variant of the modified Newton method or by a trust-region algorithm. The faster Newton method is used in the beginning of the algorithm, after which a switch to the more robust trust-region method is made.
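The switching logic can be sketched as below, assuming hypothetical oracles f, grad and hess for the nonconvex subproblem: take modified-Newton steps with a convexified Hessian while they make sufficient progress, and hand over to a trust-region method otherwise. The fallback here is SciPy's trust-ncg; the actual solver uses its own trust-region implementation.

import numpy as np
from scipy.optimize import minimize

def solve_subproblem(f, grad, hess, x, newton_iters=20, tol=1e-6):
    for _ in range(newton_iters):
        g, H = grad(x), hess(x)
        lam_min = np.linalg.eigvalsh(H).min()
        if lam_min < 1e-8:                    # convexify an indefinite Hessian
            H = H + (1e-8 - lam_min) * np.eye(len(x))
        step = np.linalg.solve(H, -g)
        if f(x + step) <= f(x) + 1e-4 * (g @ step):   # Armijo-type test
            x = x + step
            if np.linalg.norm(grad(x)) < tol:
                return x                      # Newton succeeded
        else:
            break                             # insufficient progress: switch
    # Robust fallback: a trust-region Newton method.
    return minimize(f, x, jac=grad, hess=hess, method='trust-ncg').x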

(50)

The algorithm gave rise to a more general method for solving problems with BMIs, which is implemented in the code PENBMI, which can be interfaced through YALMIP.

The numerical evaluation of the algorithm shows that it is efficient in most cases, but it sometimes suffers from numerical problems. One of the sets of examples they use (the AC systems) is the same as we use in Chapter 7. However, since only static output feedback controllers are searched for, the evaluations are not really comparable; in our evaluation we aim at preserving the closed loop performance as well as possible while at the same time trying to find a low order controller.

In Gumussoy and Overton (2008), Table 1 shows the results from HIFOO being applied to seven systems to find static output feedback controllers. PENBMI managed to find controllers for three of these systems, where HIFOO was better in two cases. In the other four cases, PENBMI failed for reasons explained in Kocvara et al. (2005).

In El Ghaoui et al. (1997) and in Section 2.8 it is shown that only being able to solve SOF problems is not a restriction, since ROF problems can be equivalently reformulated as SOF problems. This means that the method could be used to synthesize controllers for ROF problems, but this is not done in Kocvara et al. (2005).

4.1.8 Augmented Lagrangian method

In Apkarian et al. (2003), a method is presented that is based on the concept of symmetric matrix variables, similar to the approach in this thesis. In principle, to find the next step, the algorithm iteratively minimizes the second order Taylor approximation of the augmented Lagrangian function, where the nonconvex constraint is relaxed. The authors use a special scheme to update the Lagrange multipliers in each iteration, after which a line search is performed. The method requires computation of a gradient and a (Gauss-Newton) Hessian. Due to the nature of the non-convexity, the Hessian is not always positive definite and sometimes needs to be convexified by adding a positive diagonal correction term. The method is robust, but converges linearly in the worst case. A trust-region framework is also proposed, but is claimed to be more computationally demanding. In Fares et al. (2001) the framework is extended to problems in robust control theory, and a proof of global convergence for the augmented Lagrangian method with matrix inequality constraints is found in Noll et al. (2004).
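The basic augmented Lagrangian outer loop, written for a generic equality constraint h(x) = 0, is sketched below; the matrix-inequality version used by Apkarian et al. (2003), with its specialized multiplier update, is more involved. Here inner_solve is a hypothetical unconstrained minimizer, for instance the Newton/trust-region subproblem solver sketched in Section 4.1.7.

import numpy as np

def augmented_lagrangian(f, h, x0, inner_solve, iters=30, mu=10.0):
    x = np.asarray(x0, dtype=float)
    lam = np.zeros_like(h(x))
    prev_inf = np.linalg.norm(h(x))
    for _ in range(iters):
        La = lambda z, lam=lam, mu=mu: (f(z) + lam @ h(z)
                                        + 0.5 * mu * h(z) @ h(z))
        x = inner_solve(La, x)            # minimize the augmented Lagrangian
        lam = lam + mu * h(x)             # first-order multiplier update
        inf = np.linalg.norm(h(x))
        if inf > 0.25 * prev_inf:         # infeasibility not reduced enough:
            mu *= 10.0                    # tighten the penalty
        prev_inf = inf
        if inf < 1e-8:
            break
    return x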

4.1.9 Sequential semidefinite programming (SSDP)

In Fares et al. (2002) a primal-dual method is presented which uses symmetric matrix variables and is an extension of SQP (Sequential Quadratic Programming). Either the BMI formulation or a formulation with a rank constraint can be used; both are reformulated using a matrix equality constraint. This constraint is put into the objective function when the Lagrangian is constructed, but is at the same time kept as a linearized constraint. The Lagrangian is then approximated by a second order Taylor expansion at the current point to get an SDP, which is solved to find the step. The Hessian is not always positive definite and has to be convexified. This method is reported to have better convergence properties (local super-linear and quadratic convergence) than the Augmented Lagrangian method, but is said not to be as robust. The numerical examples indicate that the two methods obtain roughly the same H∞ norm, and thus have the same performance. However, in Gumussoy and Overton (2008), the authors claim that HIFOO outperforms the Augmented Lagrangian method, and thus the SSDP method as well.
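The SSDP step described above can be written schematically as follows, assuming the gradient g and a convexified Hessian H of the Lagrangian, the matrix equality linearized to h(x) + J dx = 0, and the convex LMI part abstracted into a hypothetical callback lmi_constraints(dx) that returns CVXPY constraints. The actual method works directly with symmetric matrix variables and its own multiplier update; this only shows the shape of the subproblem.

import numpy as np
import cvxpy as cp

def ssdp_step(g, H, h_val, J, lmi_constraints):
    # Convexify the (Gauss-Newton) Hessian by shifting its spectrum.
    lam_min = np.linalg.eigvalsh(H).min()
    if lam_min < 1e-6:
        H = H + (1e-6 - lam_min) * np.eye(H.shape[0])
    dx = cp.Variable(H.shape[0])
    # Second order model of the Lagrangian, minimized subject to the
    # linearized matrix equality and the (convex) LMI constraints.
    model = g @ dx + 0.5 * cp.quad_form(dx, H)
    cons = [h_val + J @ dx == 0] + lmi_constraints(dx)
    cp.Problem(cp.Minimize(model), cons).solve()
    return dx.value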

4.1.10 Concave minimization

This method, as presented in Apkarian and Tuan (1999, 2000), is based on a general LMI framework for robust control. The rank constraint is formulated as an algebraic matrix expression that is concave, which is then minimized. The problem is solved using the Frank and Wolfe method. In Fares et al. (2001) it is reported that the method sometimes fails to converge, even in the neighborhood of a local solution.
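The Frank and Wolfe method itself is simple to state in generic form: at each iterate, minimize the linearized objective over the convex feasible set and take a convex-combination step. In the sketch below, lmo is an assumed linear-minimization oracle, which for these problems amounts to an SDP solve over the LMI set.

def frank_wolfe(x0, grad, lmo, iters=100):
    # lmo(g) must return an element of the convex feasible set that
    # minimizes the inner product with g.
    x = x0
    for t in range(iters):
        s = lmo(grad(x))
        gamma = 2.0 / (t + 2.0)       # classical step size schedule
        x = (1 - gamma) * x + gamma * s
    return x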

4.1.11 Cone Complementarity Linearization Method

This method, from El Ghaoui et al. (1997), is based on the LMI formulation described in e.g. Gahinet and Apkarian (1994). It considers the problem of finding an SOF controller by solving the related problem of minimizing tr(XY), which is linearized in each step, i.e., min_{X,Y} tr(V_k X + W_k Y) is solved subject to the LMIs. For an SOF controller, if the minimum is 2n, the original problem has a solution. It seems quite probable that this method suffers from the same drawbacks as the concave minimization approach described in the previous section, since it is solved using the same method, i.e., the Frank and Wolfe method.
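A bare-bones version of the iteration can be sketched in CVXPY as below, keeping only the coupling constraint [X I; I Y] ⪰ 0 and leaving the problem-specific LMIs to a hypothetical callback extra_lmis. With V_k = Y_k and W_k = X_k, the linearized objective tr(Y_k X + X_k Y) approaches 2n exactly when XY approaches the identity, which is the rank condition.

import numpy as np
import cvxpy as cp

def ccl(n, extra_lmis=lambda X, Y: [], iters=50):
    Xk, Yk = np.eye(n), np.eye(n)
    for _ in range(iters):
        X = cp.Variable((n, n), symmetric=True)
        Y = cp.Variable((n, n), symmetric=True)
        M = cp.bmat([[X, np.eye(n)], [np.eye(n), Y]])
        # M is symmetric by construction; symmetrize explicitly so the
        # modeling layer accepts the PSD constraint.
        cons = [0.5 * (M + M.T) >> 0] + extra_lmis(X, Y)
        prob = cp.Problem(cp.Minimize(cp.trace(Yk @ X + Xk @ Y)), cons)
        prob.solve()                    # requires an SDP-capable solver
        Xk, Yk = X.value, Y.value
        if prob.value <= 2 * n + 1e-6:  # linearized objective reaches 2n:
            break                       # X Y is (close to) the identity
    return Xk, Yk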

In the article, it is shown that computing reduced order feedback (ROF) controllers can be seen as a special case of computing SOF controllers by augmenting the plant with extra states, inputs and outputs. We explained this procedure and provided some additional details in Section 2.8.
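A sketch of the standard augmentation is given below; the precise construction used in this thesis, including the performance channels, is the one in Section 2.8, so the following only conveys the idea. The plant is padded with k integrators together with k extra inputs and outputs, after which a static gain F = [[D_K, C_K], [B_K, A_K]] for the augmented plant realizes an order-k dynamic controller for the original plant.

import numpy as np
from scipy.linalg import block_diag

def augment_for_sof(A, B, C, k):
    Aa = block_diag(A, np.zeros((k, k)))   # k integrators appended
    Ba = block_diag(B, np.eye(k))          # k extra inputs drive the new states
    Ca = block_diag(C, np.eye(k))          # k extra outputs expose the new states
    return Aa, Ba, Ca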

A related method called SLPMM is presented in Leibfritz (2001); it has better convergence properties, since it always generates a strictly decreasing sequence of function values. The method considers a mixed H∞/H2 static output feedback control design.

No comparison with other methods is made in this article.

4.2 General methods for solving related problems

While much work has been done in nonlinear optimization, far less is available once we restrict attention to methods that can handle semidefinite constraints. Below, some methods are described that could be applied to the problem formulation (5.2) used in this thesis. It is very hard to judge how well these methods would actually perform on this problem; our intention is merely to point out that they could be used to solve it.

4.2.1 PENNON

PENNON (Penalty method for nonlinear and semidefinite programming), which is described in Kocvara and Stingl (2003), is a computer program for solving convex nonlinear and semidefinite programming problems.
