H -NormBasedLPVModellingandControl NonlinearOptimizationApproachesto

(1)

Linköping studies in science and technology. Thesis. No. 1453

Nonlinear Optimization Approaches

to

H

2 -Norm Based LPV Modelling and

Control

Daniel Petersson

REGLERTEKNIK

AU

TOM_{ATIC CONT}ROL

LINKÖPING

Division of Automatic Control Department of Electrical Engineering Linköping University, SE-581 83 Linköping, Sweden

http://www.control.isy.liu.se petersson@isy.liu.se

(2)

This is a Swedish Licentiate’s Thesis.

Swedish postgraduate education leads to a Doctor’s degree and/or a Licentiate’s degree. A Doctor’s Degree comprises 240 ECTS credits (4 years of full-time studies).

A Licentiate’s degree comprises 120 ECTS credits, of which at least 60 ECTS credits constitute a Licentiate’s thesis.

Linköping studies in science and technology. Thesis. No. 1453

Nonlinear Optimization Approaches to H₂-Norm Based LPV Modelling and

Control

Daniel Petersson petersson@isy.liu.se www.control.isy.liu.se Department of Electrical Engineering

Linköping University SE-581 83 Linköping

Sweden

(3)

(4)

(5)

Abstract

To be able to analyze certain classes of non-linear systems, it is necessary to try to represent them as linear parameter varying systems or even as linear fractional representations. For linear parameter varying systems and linear fractional rep-resentations of systems there exists many advanced analysis methods such as IQC-analysis and µ-analysis. This means that an important intermediate step in all this is to generate a linear parameter varying model that describes these non-linear system sufficiently well.

The first contribution in this thesis is a novel method that tries, through non-linear programming and a quasi-Newton framework, to generate a non-linear param-eter varying model given linearized state space models. The idea behind the method is to preserve the input-output relations of the given linearized systems and, in an H2-measure, find the best one. To handle uncertainties in data an extension of the proposed method is presented. It is shown how the computa-tionally hard robust optimization approach to the uncertain case can be approxi-mated using a problem specific regularization.

The second contribution in this thesis is a method for synthesizing output-feed-back H2controllers of arbitrary order. This method also uses non-linear program-ming and a quasi-Newton framework to achieve this. One great benefit with this method is that it also possible to impose structure in the controller.

Both of the methods described above tries to solve non-linear and non-convex problems, which means that the problem of finding a good initial estimate is an important problem. For both methods an initialization procedure is proposed to try to find an initial estimate.

The methods are evaluated on several examples and show promising results. A contributing factor is that significant effort has been spent on utilizing the struc-ture of the optimization problems to make the methods efficient.

(6)

(7)

Populärvetenskaplig sammanfattning

För att det ska vara möjligt att analysera vissa klasser av olinjära system kan det vara nödvändigt att beskriva dem som linjärt parametervarierande system. Då kan man använda sig av de analysmetoder som finns för linjärt parameter-varierande system. Ett väldigt viktigt mellansteg i denna process är alltså att generera linjärt parametervarierande system som kan beskriva dessa olinjära sys-tem bra.

Det första bidraget i denna avhandling är en ny metod som försöker att gener-era ett linjärt parametervarigener-erande system givet linjärisgener-erade tillståndsmodeller. Idén bakom metoden är att försöka bevara insignal-utsignal förhållandet från de givna linjäriserade systemen och hitta en bra approximation till det ursprungli-ga systemet. En utökning av denna metod, för att ta hand om fallet då det finns osäkerheter i det data som är givet, presenteras också. Denna utökning använder sig av regularisering för att skapa en mer robust lösning.

Det andra bidraget är en metod för att skapa regulatorer av godtycklig storlek. Med hjälp av denna metod är det också möjligt att införa struktur i regulatorn. Båda dessa problem är klassiskt svåra men viktiga problem inom reglerteknik. De metoder som föreslås baseras på lösning av icke-konvexa optimeringsprob-lem, vilket betyder att det är väldigt viktigt att hitta en bra initial gissning. I avhandlingen ges även förslag på hur detta skall lösas.

Metoderna är testade på ett antal exempel och visar stor potential.

(8)

(9)

Acknowledgments

First of all I would like to thank my supervisors Professor Lennart Ljung and Dr. Johan Löfberg for your support and all the guidance I’ve got. Especially Johan, for your wast knowledge in optimization, our discussions and for always answering my questions.

I would like to thank Professor Lennart Ljung again, as the former head of the Division of Automatic Control, for letting me join the Automatic Control group and our former administrator Ulla Salaneck for always keeping track of every-thing and always being helpful. My gratitude also goes to the current head of the Division of Automatic Control, Professor Svante Gunnarsson, and our current administrator, Åsa Karmelind. Both of you got very big shoes to fill, and you are doing a great job filling them.

A big thank you goes to the people who have proofread various parts of this the-sis: Lic. Rikard Falkeborn, Lic. Henrik Ohlsson and Dr. Henrik Tidefelt. Your comments have been very valuable. Dr. Henrik Tidefelt deserves an extra thanks together with Dr. Gustaf Hendeby for doing a very good job with LA_{TEX and all} sorts of other computer related stuff. Lic. Rikard Falkeborn and Dr. Ragnar Wallin also deserves an extra thanks for always being there to discuss both rele-vant and (mostly) irrelerele-vant subjects.

For financial support, I would like to thank the European Commission under contract No. AST5-CT-2006-030768-COFCLUO.

Finally, I would like to thank the two most important persons in my life, Maria and Wilmer. Thank you for always being there!

Linköping, November 2010 Daniel Petersson

(10)

(11)

5.4 General Nonlinear Programming Approach to Controller Synthesis 57 5.4.1 Cost Function . . . 57 5.4.2 Gradient . . . 57 5.5 Examples . . . 58 6 Computational Aspects 67 6.1 Initialization . . . 67 6.2 Computations . . . 69 6.2.1 A Complete Solver . . . 70 7 Concluding Remarks 71 7.1 Conclusions . . . 71 7.2 Future research . . . 72 Bibliography 73

(13)

Notation

Symbols and operators Notation Meaning

p Vector of scheduling parameters

A Bold, capitalized latin letters denote matrices

aij Denote element (i, j) in the matrix A

A ≺() 0 Ais negative (semi) definite A () 0 Ais positive (semi) definite

AT Transpose of matrix A A−1 Inverse of matrix A tr A Trace of matrix A O _Ordo Abbreviations Abbreviation Meaning

ltv _{Linear time varying} lpv Linear parameter varying

lti Linear time invariant sdp Semidefinite programming nlp Nonlinear programming

lft Linear fractional transformation lfr Linear fractional representation lmi _{Linear matrix inequality} bmi _{Bilinear matrix inequality}

(14)

(15)

1

Introduction

1.1 Background

Linear parameter varying systems first emerged as gain-scheduling controllers. The basic idea of gain-scheduling is that you have a number of linearized mod-els from different operating points in a nonlinear system. For each of these linear models a controller is designed and then interpolated to obtain a con-troller for the whole parameter region. The parameters that describe the oper-ating points are called scheduling parameters. Hence, the resulting controller depends on these parameters and is parameter dependent. The development of gain-scheduling emerged during the 80s and became popular in industrial ap-plication, this due to the wide applicability to systems in industry. It was soon realized that many nonlinear systems can also be represented as linear parame-ter varying systems. Laparame-ter, various methods have been developed for analysis and synthesis in the context of linear parameter varying systems. Many of these meth-ods are based on a certain class of optimization problems, namely semidefinite programming. Semidefinite programs have been a hot research area for the last two decades, but one drawback with semidefinite programming is, often, that the complexity of the problems grows very rapidly with the size of the problem,

e.g. in control synthesis the amount of states in a system. In this thesis we will

instead take a step back and try to formalize our problems as a general nonlinear program and solve it by using a quasi-Newton framework.

(16)

2 1 Introduction

1.2 Contributions

This thesis is based on both published, Petersson and Löfberg [2009, 2010], and unpublished results.

The first contribution, Petersson and Löfberg [2009, 2010], is the part that in-volves generating lpv-models given state space models. This contributes to ma-jor parts of chapters 3 and 4.

Daniel Petersson and Johan Löfberg. Optimization based LPV-approximation of multi-model systems. In Proceedings of the Eu-ropean Control Conference, pages 3172–3177, Budapest, Hungary, 2009.

Daniel Petersson and Johan Löfberg. Robust generation of LPV state-space models using a regularized H2-cost. In Proceedings of the Multi-Conference on Systems and Control, pages 1170–1175, Yoko-hama, Japan, 2010.

The second contribution is about controller synthesis, more precisely about gen-erating static and reduced-order output-feedback H2 controllers. These results are not yet published elsewhere.

1.3 Thesis Outline

The thesis is organized as follows. Chapter 2 contains some basic optimization background and basic system theory and also explains the two main system norms, H2and H∞. In Chapter 3 we start by presenting an overview of methods for generating lpv-models, then a novel method for generating lpv-models given a set of state space models is presented. In Chapter 4 we extend the idea from Chapter 3 by adding a regularization term, and use the regularization as a proxy for robust optimization. Chapter 5 also start with a brief overview of existing methods and then presents a method for synthesizing static and reduced-order output-feedback H2controllers. In Chapter 6 we present our suggestion for ini-tialization for the problem introduced in Chapter 3, we also show how to utilize the structure in the problem to speed up the computations. Chapter 7 concludes the thesis and presents some directions for future research.

(17)

2

Prerequisites

In this chapter some basic optimization background and basic system theory, in-cluding the two main system norms, H2and H∞, is presented.

2.1 Optimization

This section will give a brief presentation of optimization and some methods that can be used to solve optimization problems. The presentation will closely follow relevant sections in Nocedal and Wright [1999].

A general optimization problem can mathematically be written as min

x f (x)

subject to gI,i(x) ≤ 0, i = 1, . . . , mI gE,i(x) = 0, i = 1, . . . , mE

where f (x) is the cost function, f : Rn → R_{and x ∈ R}n_{, and g}_I,i_{(x), g}_E,i_{(x) are} the constraint functions. A vector x∗is called optimal if it produces the smallest value of the cost function of all the x that satisfies the constraints. In this thesis we will mostly work with problems that are unconstrained,i.e., problems where

we do not have any gI,i(x) or gE,i(x). The solution to the optimization problem, f (x∗), is called a minimum. This can either be a local or global minimum and the point where this value is attained, x∗

is called a minimizer (local or global). Theorem 2.1 (First order necessary condition). If x∗is a local minimizer and f is continuously differentiable in an open neighborhood of x∗

, then ∇f (x∗) = 0.

(18)

4 2 Prerequisites

Optimization problems can be divided into two classes, convex optimization problems and non-convex optimization problems. The problems of interest in this thesis will be non-convex. To explain what a non-convex problem is, we start by explaining what a convex problem is.

Let us first define a convex set, which is a set, such that if you take any two points in the set and draw a line between them, that line should also lie in the set. A convex function is defined in the same manner. A function is convex if it satisfies

f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)

for all x, y ∈ N and θ ∈ [0, 1], where N is a convex set.

A convex optimization problem is an optimization problem where both the cost function and the feasible set (the set of x’s defined by the constraints) are convex. Convex optimization problems have the very nice feature that a local minimizer is always a global minimizer. This means that when solving a convex optimization problem you know that when you have found a minimum, you have found the global minimum. In a non-convex optimization problem you do not have this guarantee, and you have the problem of not knowing if you have found a local or a global minimum. For further reading seee.g., Nocedal and Wright [1999].

2.1.1 Local Methods

One approach to solve non-convex optimization problems is to use local methods, methods which seek for a local minimizer,i.e., a point that in its neighborhood

of feasible points has the smallest value of the cost function. A class of local methods which is widely used today in solving nonlinear non-convex problems is quasi-Newton line search methods. These methods typically require that the cost function is twice continuously differentiable, at least for the convergence theory to hold.

The line search strategy is to find a direction pk, and a step αk, such that f (xk) = fk > f (xk+ αkpk). The obvious questions that arise are, “In which direction?” and

“How long step?”. To these two questions there are many answers and a few of which will be presented here. The most obvious answer to the first question is to take the steepest descent direction, which is pk = −

∇_f_k

||∇_f_k|| and choose α as αk = arg min

α f (xk

−_αp_k_).

A benefit with this choice of pk is that we only need information about the

gra-dient and no second-order information,i.e., information about the Hessian. The

problem is that the convergence with this choice can be extremely slow. By ex-ploiting second order information about the cost function we can produce a bet-ter search direction. Assume a model function

mk(p) = fk+ pT∇fk+ pT∇2fkp,

(19)

2.1 Optimization 5 the solution to pk = arg min p mk(p) is pk = −(∇2fk) −₁_∇

fk. This entity both contains information about the direction

and the length of the step, the length of the step is only adjusted if the step does not produce a sufficiently large decrease in f due to discrepancy between mk(pk)

and f (xk + pk). This choice of direction and step length is called the Newton

method. There are however two major drawbacks with this method, we need to calculate the Hessian which can be very time consuming, and we need the Hessian to be positive definite.

Quasi-Newton Methods

Quasi-Newton methods are methods that resemble Newton methods but in some way tries to approximate the Hessian in a computationally efficient manner. As in the Newton method we start with a model function

mk(p) = fk+ ∇fkTp +

1 2p

T_B

kp,

where Bk is a symmetric positive definite matrix. Instead of calculating a new

Bk for every iteration we want to update Bk to obtain Bk+1. As for the Newton

method, the minimizer to the model function is pk = −B

−₁

k ∇fk, which is then used

to calculate xk+1as

xk+1= xk+ αkpk.

Here αk is chosen according to some conditions which will not be further

dis-cussed here, seee.g., Nocedal and Wright [1999] for further reading.

One way of updating Bkis to let Bk+1be the solution to the problem

Bk+1= arg min

B

||_{B − B}_k||

G−_k1 (2.1a)

subject to B = BT, Bsk = yk (2.1b)

where sk = αkpkand yk = ∇fk+1− ∇fk. The norm that is used in the optimization

problem is the weighted Frobenius norm,

||_B|| G−_k1 = G− 1 2 k BG −1 2 k _F , G_k = 1 Z 0 ∇_{f (x}_k_{+ τ α}_k_p_k_)dτ.

The structure of the optimization problem (2.1) can be explained like this. The constraint that B, which is an approximation of the Hessian, should be symmet-ric is obvious for the case when we have a twice differentiable function. The sec-ond constraint, the secant equation, makes B an approximation of the Hessian. To determine Bk+1 uniquely we choose to take a B that in some sense is closest

to Bk, and also we want to make the minimization problem scale-invariant and

non-dimensional, which explains the minimization and the choice of norm and weights.

(20)

6 2 Prerequisites

The optimization problem 2.1 has a closed form solution B_k+1= (I − ρkykskT)Bk(I − ρkskykT) + ρkykykT, ρk =

1

y_kTsk .

This update of Bkis called the dfp (Davidon-Fletcher-Powell) updating formula.

Now if we look back on what we wanted to do, we see that we wanted to calculate

pk = −B

−₁

k ∇fk, so we want the inverse of Bk. Since Bk+1is a rank two update of

Bk, we can find a closed form expression for the inverse of Bk+1= H−_k+11 as

H_k+1= Hk− H_kykykTHk y_kTH_kyk + sks T k y_kTsk .

An even better updating formula is the bfgs (Broyden-Fletcher-Goldfarb-Shanno) updating formula where we solve a similar optimization problem as before but for Hk+1instead. H_k+1 = arg min H ||H − H_k||_G k subject to H = HT, Hyk = sk

which has the solution

H_k+1= (I − ρkskyTk)Hk(I − ρkyksTk) + ρkskskT.

The benefit with quasi-Newton methods is that every iteration in the optimiza-tion scheme now can be performed with complexity O(n2), not including func-tion and gradient evaluafunc-tions.

The bfgs-scheme will be used extensively in the strategies we propose in this thesis.

2.2 System Theory

This section reviews some of the standard system theoretical concepts and ex-plain some system norms that will be used in the thesis.

2.2.1 Basic Theory and Notation

A linear time invariant system, lti-system, can be described as

˙x(t) = Ax(t) + Bu(t) (2.2a)

y(t) = Cx(t) + Du(t) (2.2b)

where x(t) ∈ Rnx_{is the state of the system, u(t) ∈ R}nu _{is the input to the system} and y(t) ∈ Rny _{is the output of the system and A, B, C and D constant matrices} of suitable dimensions. The system in (2.2) is expressed in state space form, the corresponding input-output relation from u(t) to y(t) is

(21)

2.2 System Theory 7

where U (s) and Y (s) are the Laplace transforms of u(t) and y(t) and

G(s) = C(sI − A)−1B+ D = " A B C D # .

Here we also introduce the notation "

A B

C D

#

as the transfer function from u(t) to y(t).

A natural generalization of lti-systems are linear time varying systems, ltv-systems, where the state space matrices can be dependent on time. The draw-back is that ltv-systems are very hard to analyze and work with. This raises the need of an intermediate step to represent systems, this is where linear parameter varying systems, lpv-systems, comes in. lpv-systems depend on scheduling pa-rameters, p, that can vary with time, but are measurable. An lpv-system can be written, in state space representation, as

˙x(t) = A(p)x(t) + B(p)u(t),

y(t) = C(p)x(t) + D(p)u(t),

where p is the vector of scheduling parameters. Note that there is no restriction on how the lpv-system depends on the scheduling parameters, hence it can be highly nonlinear. lpv-systems have the property that if the scheduling parame-ters in the lpv-system are frozen the system becomes a regular lti-system.

2.2.2 Gramians

We now introduce the controllability Gramian P and observability Gramian Q, for a stable continuous time system defined as

P= ∞ Z 0 eAτBBTeATτdτ, (2.3a) Q= ∞ Z 0 eATτCTCeAτdτ. (2.3b)

The Gramians have nice physical interpretations. The controllability Gramian describes how much each of the states is affected by the inputs. To realize this, first assume that input i is an impulse, i.e., ui(t) = δ(t). The state vector becomes, x(t) = eAtBi. Now do this with all inputs and let Bi be the i:th column of the

matrix B and construct the matrix X(t) = eAtB. Now looking at the definition of the controllability Gramian

P= ∞ Z 0 eAτBBTeATτdτ = ∞ Z 0 X(τ)XT(τ)dτ,

(22)

8 2 Prerequisites

The observability Gramian describes how the different states contributes to the energy in the output. This can be realized if we assume that the input is zero, then the output signal is y(t) = CeAtx0, where x0is the initial state. Now measure the energy in the output as

∞ Z 0 yT(τ)y(τ)dτ = xT₀ ∞ Z 0 eATτCTCeAτdτ x0= xT0Qx0. (2.4)

Note that the Gramians P and Q satisfy the Lyapunov equations

AP+ PAT + BBT = 0, (2.5a)

ATQ+ QA + CTC= 0. (2.5b)

For a stable discrete time system, equations (2.3) and (2.5) are replaced by P= ∞ X k=0 AkBBT(Ak)T, (2.6a) Q= ∞ X k=0 (Ak)TCTCAk (2.6b)

where P and Q satisfy the discrete Lyapunov equations

APAT −_P_{+ BB}T _{= 0,} _(2.7a)

ATQA − Q+ CTC= 0. (2.7b)

The Gramians are useful tools when it comes to,e.g., model reduction and

com-puting the H2-norm, as we will see in this thesis.

2.2.3 System Norms

System norms are important tools when it comes to analyzing systems. We will in this section try to explain two system norms, the H2- and the H∞-norms. These system norms can be interpreted as norms that answer the question: “given information about the allowed input, how large can the output be?”.

Continuous Time H₂-Norm

In the time domain the H2-norm of a system G can be interpreted as ||_G||₂₌ _max u(t)=unit impulses y (t) ₂, (2.8)

and is defined as (see Skogestad and Postlethwaite [2007])

||_G||₂_{= ||g(t)||}₂₌ v u u u t ∞ Z 0 X ij |_g_ij_(τ)|2_dτ, _(2.9)

(23)

2.2 System Theory 9

where gij is the ij:th element of the impulse response matrix, g(t). In the

fre-quency domain the definition of the H2-norm of a continuous time lti-system, i.e., (2.2), is ||_G||_H 2= v u u u t tr         1 2π ∞ Z −∞ G(jω)G(jω)H_dω         . (2.10)

For (2.10) to be defined we see that the system needs to be strictly proper,i.e.,

we need D = 0 in (2.2). To compute the norm efficiently, the expression can be rewritten to a numerically more suitable form. This can be done using the Gramians for the system (2.3). Using Parseval’s identity we rewrite (2.10) as

||_G||H₂= v u u u t tr          ∞ Z 0 (CeAt_B)T _(CeAt_{B) dt}          = v u u u t tr          ∞ Z 0 (CeAt_{B) (Ce}At_B)T_dt          . (2.11)

Note that the system needs to be stable or the H2-norm of the system will be infinite. By using (2.3) it is possible to rewrite (2.11) as

||_G||H₂= p

tr BT_QB₌

√

tr CPCT_, _(2.12)

where Q and P are the observability and controllability Gramians respectively for the system G, satisfying the equations in (2.5).

Discrete Time H₂-Norm

The H2-norm of a discrete time lti-system, see Skelton et al. [1998], is defined as

||_G||_H 2 = v u u u t tr          1 2π π Z −_π G(ejω_)G(ejω₎H_dω          . (2.13)

Note that for the discrete time H2-norm we do not need D = 0. Using (2.6) and Parseval’s identity we rewrite the discrete H2-norm as

||_G||_H 2=

q

trDT_D_{+ B}T_{QB =}q_tr_DDT _{+ CPC}T _(2.14) where Q and P are the observability and controllability Gramians respectively, for the system G, satisfying the equations in (2.7).

The H∞-Norm

In the time domain the H∞-norm of a system G is defined as ||_G||_∞₌ _sup ||_u(t)|| 2=1 y (t) ₂.

In the equation above it can be seen that the H∞-norm of a system is the max-imum gain possible given an input ||u(t)||₂ = 1. In the frequency domain the

(24)

10 2 Prerequisites H_∞_{-norm becomes} ||_G||_∞ _{= sup} ω∈R ¯ σ (G(jω))

and for discrete time systems

||_G||_∞₌ _sup

ω∈[−π,π]

¯

σ (G(ejω)),

where ¯σ (A) means the largest singular value of the matrix A. Computing the H∞ -norm for a continuous time system is not as easy as computing the H2-norm, you need to solve an optimization problem. That is calculating the smallest value of

γ such that the matrix H has no eigenvalues on the imaginary axis, where

H= A+ BR −₁ DTC BR−1BT −_CT_{(I + DR}−1_DT_)C −_{(A + BR}−1_DT_C)T ! and R = γ2_{I − D}T_D.

(25)

3

Generation of

LPV

State Space

Models Using

H

2 -Minimization

In this chapter we will present a novel method for generating linear parameter varying models (lpv-models). The method uses an approach that tries to preserve the input-output relations from the given models in the resulting lpv-model. Es-sentially, lpv-models are nonlinear systems, that are described by linear differ-ential equations whose coefficients depend on scheduling parameters. We begin this chapter with a review of some existing methods for generating lpv-models.

3.1 Overview of Existing Methods for Generating

LPV

-Models

The behaviour of an lpv-model can be described by ˙x(t) = A(p(t))x(t) + B(p(t))u(t),

y(t) = C(p(t))x(t) + D(p(t))u(t),

where x(t) is the state, u(t) and y(t) are the input and output signals and p(t) is the vector of scheduling parameters. In flight control applications, the components of p(t) are typically mass, position of centre of gravity and various aerodynamic coefficients, but can also include state dependent parameters such as altitude and velocity, specifying current flight conditions.

Some advanced robustness analysis methods, such as IQC-analysis and µ-analysis, seee.g., Megretski and Rantzer [1997] and Zhou et al. [1996], require a conversion

of the lpv-model into a linear fractional representation (LFR), seee.g., Zhou et al.

[1996]. For this to be possible it is necessary that the parametric matrices A(p), B(p), C(p) and D(p) of the lpv-model are rational in p. This requirement is often violated in lpv-models generated directly from a nonlinear model description,

(26)

12 3 Generation of lpv State Space Models Using H2-Minimization

ther due to presence of nonlinear parametric expressions or tabulated data in the model. In both cases, rational approximations must be used to obtain a suitable model.

In the last decades intensive research has been carried out on lpv-models, seee.g.,

Rugh and Shamma [2000], Leith and Leithead [2000]. An important reason for this interest is that it is a powerful tool for modeling of nonlinear systems, for ex-ample aircrafts (Marcos and Balas [2004]) or wafer stages (Wassink et al. [2005]). This is because it is often not sufficient to use only one linear controller to stabi-lize and achieve good performance over the operating space of interest. In these cases it is necessary to use some sort of gain scheduling. The gain scheduling ap-proach can be summarized as follows: find one or more scheduling parameters which can completely parametrize the operating space of interest. Then define a family of linearized models for the model associated with the set of operating points and design a parametric controller which ensures sufficient performance over the whole region.

In this chapter we concentrate on finding a tool that helps us to model a nonlinear system as an lpv-system and use lpv-analysis and synthesis. The generation of lpv-models can simplistically be divided into two main families of methods, global methods (see e.g., Nemani et al. [1995], Lee and Poolla [1999], Bamieh

and Giarre [2002], Felici et al. [2007], Tóth [2008]) and local methods (seee.g.,

Steinbuch et al. [2003], Wassink et al. [2005], Lovera and Mercere [2007], Caigny et al. [2008], Pfifer and Hecker [2008]). An excellent survey of existing methods can be found in Tóth [2008].

3.1.1 Global Methods

In the global approach a global identification experiment is performed by excit-ing the system, while the schedulexcit-ing parameters change the dynamics of the sys-tem. An advantage with this approach of generating lpv-models is that it is also possible to capture the rate of change of the parameters and how they can vary be-tween different operating points. However one drawback is that it is sometimes, for example in flight applications, not possible to perform such an experiment.

3.1.2 Local Methods

In the local approach, the lpv-models are generated by interpolating, or in some other way combining, several local lti-models. The local models can for example have been identified using a set of measurements for every local model, for which there exists several methods, seee.g., Ljung [1999], or by linearizing a nonlinear

model in different operating points. In this class of methods the assumption that the system can operate at different fixed operating points, where the scheduling parameters are “frozen”, is assumed. There are of course systems where this is not possible and where this class of methods is not applicable, requiring the use of global methods. Another drawback with this class of methods is that it does not take time variations of the scheduling parameters into account, thus limit-ing local methods to systems where the schedullimit-ing parameters vary slowly in

(27)

3.2 An Input-Output Preserving Approach to lpv-Modeling 13

time, which is a commonly used assumption when gain scheduling is used, see Shamma and Athans [1992]. A good paper that explains the pitfalls of interpo-lation is Tóth et al. [2007]. There are several different variants of methods that use the local approach, the method presented in this chapter is one of them. One drawback that many of the local methods suffers from is that they need the lo-cal models to be given in the same state space basis to be able to be used, see

e.g., Pfifer and Hecker [2008]. Some methods have remedies by transforming the

system to a common basis, usually some canonical form, seee.g., Steinbuch et al.

[2003]. However, they can suffer from bad numerics in that form. The method presented in this thesis does not have the drawback of having to use local models with the same state space basis or even the same amount of states.

3.2 An Input-Output Preserving Approach to

LPV

-Modeling

Our goal with the method proposed in this chapter is to try to preserve the input-output behavior of the model. Ideally the goal would be, given a nonlinear model

GNL(p), to find an lpv-model, ˆG(p), that is optimal with respect to some global discrepancy measure on the model error, for instance the following integral

min ˆ A, ˆB, ˆC, ˆD Z GNL(p) − ˆG(p) d p, where ˆ G(p) = " _ˆ A(p) B(p)ˆ ˆ C(p) D(p)ˆ # .

This is not always practical or even tractable. For example in many applications,

e.g., flight applications, one often only have a simulation model available and not

an analytical nonlinear model and it is only possible to extract linearized models for discrete values of the scheduling parameters. Having this in mind the cost function is changed into a discrete version

min ˆ A, ˆB, ˆC, ˆD X i Gi − ˆG(pi) ,

where Gi is a linearized version of GNL(pi). Looking at this equation we see that

the systems Gi and G(p) do not need to have the same ordering in the states or

even the same amount of states,i.e., a method based on this measure can also

(28)

The system matrices, ˆA, ˆB, ˆCand ˆD will be a linear combination of some basis functions wk(p), e.g., in the polynomial case, monomials. The system matrices in

the lpv-model will then depend on p as ˆ A(p) =X k wk(p) ˆA(k) (3.1a) ˆ B(p) =X k wk(p) ˆB(k) (3.1b) ˆ C(p) =X k wk(p) ˆC(k) (3.1c) ˆ D(p) =X k wk(p) ˆD(k). (3.1d)

Our goal is to find the coefficient matrices ˆA(k)_{, ˆ}_B(k)_{, ˆ}_C(k)_{and ˆ}_D(k)_.

The choice of which norm to use is still not decided. The two most widely used norms in system theory are H2- and H∞-norm. They will both capture the input-output relation of the system, see Section 2.2.3. The one that will be chosen in this chapter is the H2-norm, for reasons that will be explained later. A benefit of formulating the problem as proposed is that, independent of the choice of the norm, the optimization will be invariant to state transformations.

3.2.1 Invariance to State Transformations

As said before the idea of the method proposed in this chapter is to capture the input-output behavior of the given linear models. An advantage with the method is that even though some elements can depend polynomially or even non-continuously on the parameters, it can make use of the fact that the realization is non-unique. The model ˆG(p) =

" _ˆ

A(p) B(p)ˆ ˆ

C(p) D(p)ˆ #

has the same transfer function and input-output behavior as ˆGT(p), with

ˆ GT(p) = " _ˆ A_T(p) Bˆ_T(p) ˆ C_T(p) D(p)ˆ # = " T(p) ˆA(p)T(p)−1 T(p) ˆB(p) ˆ C(p)T(p)−1 _D(p)_ˆ # .

where T(p) is a non-singular transformation matrix that can depend on p. This means that for every model Gi we are not only limited to find the best

approx-imation between Gi = " Ai Bi C_i D_i # and ˆGi = " _ˆ Ai Bˆi ˆ C_i Dˆ_i #

, but to find the match between Gi = " T_iA_iT−_i1 T_iB_i CiT −₁ i Dˆi # and ˆGi = " _ˆ A_i Bˆ_i ˆ Ci Dˆi #

. We illustrate that this property can come in handy in some cases with an example.

(29)

3.2 An Input-Output Preserving Approach to lpv-Modeling 15 Example 3.1: Effect of State Transformations

Assume we are given samples from the lpv-model

ˆ A(p) =               0.4p2+ 3p − 3.6 −0.4(p3−24p−40) p 0.2(27p3+55p2+37p−160) p 0.4p2+ 3.6p − 3.2 −0.2(2p3+3p2−46p−10) p 0.2(27p3+23p2−_96p−20) p 1.6p − 1.6 −0.2(8p2−33p−5) p 0.2(23p2−_68p−10) p               , ˆ B(p) =         8 + 7p + p2 6 + 2p + p2 3         , ˆ C(p) = 0.2 + 0.2p −0.2(−9p+p2−10) p − 0.8(−p+4p2−₅₎ p , ˆ D(p) = 0.

It would be difficult to use an element-wise method, i.e., a method where we interpolate the individual elements in the system matrices independently, with low order polynomials to identify this lpv-model due to the rational functions. However, a different realization of this model is given by

T(p) =          0.2 −_0.2 _0.2 0 1_p −2 p 0 0 1          , ˆ A_T(p) = T(p) ˆA(p)T−1(p) =         −_{2 + p} _{3 + p} _{5 + 2p} 2 + 2p −_{4 + 3p} _{1 + 5p} −_{8 + 8p} _{1 + 5p} −_{2 + 3p}         , ˆ B_T(p) = T(p) ˆB(p) =         1 + p 2 + p 3         , ˆ CT(p) = ˆC(p)T −₁ (p) =1 + p 2 + 2p 3 + 3p, ˆ DT(p) = ˆD(p) = 0.

Obviously, this model is affine in p. In Figure 3.1 we see how the individual elements in ˆAand ˆA_T vary for p ∈ [−10, −1].

The example above illustrates the important property that when searching in the class of lpv-models with low order dependency on p, it is some times possible to find an equivalent model with respect to the input-output relation, even though the given model depends non-polynomially on p. One important thing to note is that the D-matrix is not affected by a state transformation. Thus state transfor-mations do not help to simplify the dependence of p in D.

(30)

16 3 Generation of lpv State Space Models Using H2-Minimization −10 −5 0 −10 −5 0 −15 −10 −5 0 5 −15 −10 −5 0 5 −20 −10 0 10 −20 0 20 −40 −20 0 20 40 −20 0 20 −20 −10 0 10 (a) ˆAT −10 −5 −10 −5 −10 −5 −10 −5 −10 −5 −10 −5 −10 −5 −10 −5 −10 −5 (b) ˆA

Figure 3.1:The elements in the 3 × 3 matrices ˆAand ˆA_T, from Example 3.1, as a function of p. In the transformed state basis the dependence is affine and in the original state basis it is nonlinear.

3.3 LPV

-Modeling Using the

H

₂

-Norm

In this section we continue with the idea presented in the previous section but specify the norm to be the H2-norm and also taking the squared norm,i.e., the optimization problem is formulated as

min ˆ A, ˆB, ˆC, ˆD X i Gi− ˆG(p (i)₎ 2 H₂ = min_ˆ A, ˆB, ˆC, ˆD X i ||_E_i||2_H 2 = min_A_ˆ_{, ˆ}_B_{, ˆ}_C_{, ˆ}_DV . (3.2)

To study the problem, (3.2), let us start by looking at the case when we only have one model and later generalize this to the case where we have multiple models. To introduce some structure in the problem we define the error systems as

E = G − ˆG(p),

which can be realized in state space form as

E = " A_E B_E CE DE # =           A 0 0 Aˆ ! B ˆ B ! C −_Cˆ _{D − ˆ}_D           . (3.3)

This realization of the error system will later prove beneficial when rewriting the optimization problem. Notice that for a continuous-time model the H2-norm is unbounded if the model is not strictly proper, i.e., D = ˆD is needed for all models (which is trivially true in the case when both D = 0 and ˆD = 0). Thus the problem of finding ˆDcan hence be seen as a separate problem which is not addressed here. Throughout the chapter it is assume that the models we are given are stable; otherwise the H2-norm is not defined.

(31)

3.3 lpv_{-Modeling Using the H}₂_-Norm 17

By using the result in Section 2.2.3 we can rewrite the cost function as ||_E||2_H

2 = tr B

T

EQEBE= tr CEPECTE. (3.4)

where QEand PEare the observability and controllability Gramians respectively,

for the error system E, satisfying the equations

AEPE+ PEATE+ BEBTE = 0, (3.5a)

AT_EQE+ QEAE+ CTECE = 0. (3.5b)

With the realization (3.3) of E and the partitioning of the Gramians PEand QEas

PE = P X XT ˆP ! , QE= Q Y YT Qˆ ! , (3.6)

six Sylvester and Lyapunov equations

AP+ PAT + BBT = 0, (3.7a) AX+ X ÂT + B ˆBT = 0, (3.7b) ˆ A ˆP+ ˆP ÂT + ˆB ˆBT = 0, (3.7c) ATQ+ QA + CTC= 0, (3.7d) ATY+ Y Â − CTCˆ = 0, (3.7e) ˆ ATQˆ + ˆQ Â+ ˆCTCˆ = 0, (3.7f) are obtained.

Note that P and Q satisfy the Lyapunov equations for the controllability and the observability Gramians for the given system, and ˆP and ˆQsatisfy the Lyapunov equations for the controllability and the observability Gramians for the sought system. With the partitioning (3.6) of PE and QEit is possible to rewrite the cost

function, (3.4), as ||_E||2_H 2 = tr BTQB+ 2BTY ˆB+ ˆBTQ ˆˆB, (3.8a) ||_E||2_H 2 = tr CPCT −_{2CX ˆ}_CT _{+ ˆ}_{C ˆ}_{P ˆ}_CT_. _(3.8b) The equations in (3.8) are equivalent and will later be useful in simplifying the derivations for the gradients of the cost function (3.4) or more generally (3.2). One major drawback with this approach though, is that the cost function, in the general case, is non-convex. This can easily be realized with the following exam-ple.

Example 3.2: Non-Convexity Begin with the system

Gtrue= " ₋ 1 1 1 0 # .

(32)

Now we want to find a system, Gapprox, Gapprox= " a b c 0 # ,

that approximates the system Gtrue, where a, b and c are the decision variables. Consider an initial guess in an optimization formulation to be the system

G0= " −₂ −₁ −₁ ₀ # .

If we now do a line search from the system G0along a descent direction, Gapprox= " ₋ 2 + t −_{1 + 2t} −_{1 + 2t} ₀ # , t ∈ [0, 1],

then the value of the cost function, E(t) = Gapprox(t) − Gtrue 2

H₂, along the search direction is non-convex, see Figure 3.2.

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.16 0.18 0.2 0.22 0.24 0.26 0.28 t E(t)

Figure 3.2: The value of the cost function along the search direction de-scribed in Example 3.2. The function clearly demonstrates the presence of local minima along the search direction.

The non-convexity of the problem is of course problematic and with it the initial-ization of the problem becomes important, which will be addressed in Section 6.1. In the examples in Section 3.6 we however see that the method is able to find good model approximations despite the non-convexity.

(33)

3.4 General Nonlinear Programming Approach to lpv-Modeling 19

3.4 General Nonlinear Programming Approach to

LPV

-Modeling

In this section, we try to solve the optimization problem by addressing it as a general nonlinear optimization problem, taking great care to develop expressions that can be evaluated and differentiated efficiently.

3.4.1 Cost Function

In this subsection the cost function for the minimization problem (3.2) for con-tinuous time is expressed. The discrete time case is analogous and shown in Appendix 3.B.

The equations in (3.8) are equivalent and can both be used to calculate the H2 -norm. They are both useful to simplify the derivations for the gradients later, but for now only (3.8a) is used to compute the cost function. It is now straightfor-ward to express the cost function for the more general case when having multiple models,i.e., rewrite the cost function V in (3.2) with the new partitioning (3.6)

V =X i ||_E_i||2_H 2 = X i trBT_i Q_iB_i + 2BT_i Y_iBˆ_i+ ˆBT_i Qˆ_iBˆ_i. (3.9) The optimization problem (3.2) can now be written as

min ˆ

A(k)_{, ˆ}_B(k)_{, ˆ}_C(k)V . (3.10) Keep in mind the parametrization of the system matrices introduced in (3.1). Ad-ditionally, Pi, Qi, ˆPi, ˆQi, Xi and Yi satisfy the equations

A_iP_i+ PiATi + BiBTi = 0, (3.11a) AiXi+ XiAîT + BiBˆTi = 0, (3.11b) ˆ AiˆPi+ ˆPiAîT + ˆBiBˆTi = 0, (3.11c) AT_i Q_i+ QiAi+ CTi Ci = 0, (3.11d) AT_i Y_i+ YiAî −CTi Cî = 0, (3.11e) ˆ AT_i Qˆ_i+ ˆQ_iAˆ_i+ ˆCT_i Cˆ_i = 0. (3.11f) The cost function (3.9) to the optimization problem (3.10) is now expressed in the sought variables Â, ˆB, ˆC, the given data Ai, Bi, Ci and in the different partitions of

the Gramians for the error system,i.e., the solution to the equations in (3.11) that

can easily be calculated. Important to stress is that the Gramians are not decision variables, but auxiliary variables used to evaluate the cost function.

3.4.2 Gradient of the Cost Function

An appealing feature of the proposed nonlinear optimization approach to solve the problem using the H2-norm is that the equations (3.9) are differentiable in

(34)

the system matrices, ˆA, ˆBand ˆC(see Dooren et al. [2008], Wilson [1970]). In addi-tion, the closed form expression obtained when differentiating the cost function is expressed in the given data (A, B and C), the optimization variables ( ˆA, ˆBand

ˆ

C) and solutions to equations (3.11).

The calculations are lengthy, so we will concentrate on calculating the gradient with respect to ˆA, i.e., ∂||E||

2 H2

∂ ˆA , and leave the remainder for Appendix 3.A. The notation " ∂||E||2_H2 ∂ ˆA # ij = ∂||E|| 2 H2

∂ âij is used, and âij denote the individual elements in Â.

We start by considering the case when only given one linear model and then later extend the result to multiple models and an lpv-model. ˆQand Y depend on

ˆ

Awhich we need to keep in mind when differentiating (3.8a) with respect to Â. Hence, " ∂||E||2_H2 ∂ ˆA # ij becomes        ∂ ||E||2H₂ ∂ ˆA        ij = tr 2 ˆBBT ∂Y ∂ âij + ˆB ˆBT ∂ ˆQ ∂ âij ! , (3.12)

where_{∂ ˆa}∂Y ij and

∂ ˆQ

∂ âijdepend on Âvia the differentiated versions of equations (3.7e) and (3.7f), AT ∂Y ∂ âij + ∂Y ∂ âij ˆ A+ Y∂ Â ∂ âij = 0, (3.13a) ˆ AT ∂ ˆQ ∂ âij + ∂ ˆQ ∂ âij ˆ A+∂ ˆA T ∂ âij ˆ Q+ ˆQ ∂ ˆA ∂ âij = 0. (3.13b)

To simplify the calculations of the gradient we need the following lemma (see Yan and Lam [1999]).

Lemma 3.1. If M and N satisfy the following Sylvester equations AM+ MB + C = 0, NA+ BN + D = 0, then tr CN = tr DM.

Now applying Lemma 3.1 on (3.7) and (3.13) yields tr ˆBBT ∂Y ∂ âij = tr XTY∂ Â ∂ âij , tr ˆB ˆBT ∂ ˆQ ∂ âij = tr ˆP ∂ Â T ∂ âij ˆ Q+ ˆQ∂ Â ∂ âij ! .

(35)

3.4 General Nonlinear Programming Approach to lpv-Modeling 21

Inserting this in (3.12) yields        ∂ ||E||2H₂ ∂ ˆA        ij = tr 2XTY∂ Â ∂ âij + ˆP ∂ Â T ∂ âij ˆ Q+ ˆQ∂ Â ∂ âij !! = = 2 tr ∂ Â T ∂ âij ˆQ ˆP+ YTX ! .

It follows that ∂||E|| 2 H2

∂ ˆA = 2 ˆQ ˆP+ Y

T_X

. Analogously we can calculate the gradi-ents with respect to ˆBand ˆC(see Appendix 3.A).

∂ ||E||2H₂ ∂ ˆA = 2 ˆ_{Q ˆ}_P_{+ Y}T_X (3.14a) ∂ ||E||2H₂ ∂ ˆB = 2 ˆQ ˆB+ Y T_B (3.14b) ∂ ||E||2H₂ ∂ ˆC = 2 ˆ_{C ˆ}_{P − CX} (3.14c)

The closed form expressions obtained are expressed in the given data (A, B and C), the optimization variables ( ˆA, ˆBand ˆC) and solutions to equations (3.11), some of them already computed when calculating the cost function. To be more precise, the computational effort of computing the derivative is within a small constant factor from the computational effort required to compute the cost function. See Section 6.2 for a more detailed discussion on computational aspects.

The results are easily extended to the more general form when given multiple linearized models and when the lpv-model depends on some basis functions as in (3.1),i.e., ˆA(p) = P

kwk(p) ˆA(k). The gradient of (3.9) with respect to the

coefficient matrices ˆA(k)_{, ˆ}_B(k)_{and ˆ}_C(k)_becomes ∂V ∂ ˆA(k) = 2 X i wk(pi) ˆQiˆPi+ YTi Xi , ∂V ∂ ˆB(k) = 2 X i wk(pi) ˆQiBˆi+ YTi Bi , ∂V ∂ ˆC(k) = 2 X i wk(pi) ˆCiˆPi−CiXi .

Remark 3.1. As can be seen in the derivation of the gradient, it is never assumed that all of the parameters in the system matrices, ˆA, ˆB, ˆC, should be free. The elementwise

differ-entiation derivation shows that one can choose any element to be either free or constant. With this fact it is possible to impose arbitrary sparsity structure in the elements of the system matrices.

(36)

3.5 Semidefinite Programming Approach to

LPV

-Modeling

In this section a semidefinite programming (sdp) approach is presented as an alternative method to solve the lpv-modeling problem. This method was de-veloped hoping that, when formulating the problem as a bilinear optimization problem, it could create a structure whose non-convexity had better properties than the problem in the previous section. This, however, was not the case and the method is only presented here as a suggestion as an alternative method and will not be pursued and undergo the same analysis as the general nonlinear program-ming approach. Even worse, the complexity of the optimization problem grows very rapidly with the dimensions of the models and with the number of given linearized models and soon becomes computationally intractable.

With the realization (3.3) and equations (3.4) and (3.5) we rewrite the H2-norm for a system as a minimization problem

min QE

tr BT_EQ_EB_E

s.t. AT_EQ_E+ QEAE+ CTECE≺0, QE 0.

In this optimization problem Q has been made a semidefinite variable and also the Lyapunov equation is changed into a linear matrix inequality (lmi). This optimization problem can be rewritten using a Schur complement to obtain an equivalent problem (see Boyd et al. [1994])

min γ,QE γ s.t. QEAE+ A T EQE CTE CE −I ! ≺₀_, γI B T EQE QEBE QE ! ₀_. (3.15)

The optimization problem above calculates the H2-norm of the system, but our goal was to find the variables ˆA, ˆBand ˆC. To do this we let also these variables free. Rewriting the minimization problem (3.15) using the partitioning (3.6) of QEand the realization (3.3) of E and optimizing over the sought system matrices,

introducing bilinear matrix inequalities (bmis), min γ,Q,Y, ˆQ, Â, ˆB, ˆC γ s.t.          QA+ ATQ Y Â+ ATY CT YTA+ ÂTY Q ˆÂ+ ÂTQˆ −_CˆT C −_Cˆ −_I          ≺₀_,          γI BTQ+ ˆBTYT BTY+ ˆBTQˆ QB+ Y ˆB Q Y YTB+ ˆQ ˆB YT Qˆ          ₀_. (3.16)

Generalizing this to the case when we are given multiple models and want to generate an lpv-model is straightforward. Now rewriting (3.16) for the problem

(37)

3.6 Examples 23

when having multiple models we obtain min γi,Qi,Yi, ˆQi, Â(k), ˆB(k), ˆC(k) N X i=1 γi s.t.          QiAi+ ATi Qi YiAî+ ATi Yi CTi Y_iTAi+ ÂTi Yi QîAî+ ÂTi Qî −CˆTi C_i −_Cˆ_i −_I          ≺₀_,          γiI BTi Qi+ ˆBTi YTi BTi Yi+ ˆBTi Qî QiBi+ YiBî Qi Yi YT_i Bi+ ˆQiBˆ YTi Qî          ₀_, (3.17)

where ˆA(k), ˆB(k)and ˆC(k)are the coefficient matrices in (3.1). To try to solve this type of problem, a local iterative two-step algorithm was proposed in Helmersson [1994]. Start by keeping ˆA(k), ˆB(k) constant, and solve (3.17) for Qi, ˆC(k). Then

keep Yi, ˆQi constant and solve (3.17) for Qi, ˆA(k), ˆB(k), ˆC(k). Continue doing this

until convergence.

Generally bilinear semidefinite programs are very hard to solve, see Mesbahi et al. [1995]. The reason for accepting this additional cost in complexity for introduc-ing bmis in this problem was with hopes of that the bilinearity could create a structure in the problem whose non-convexity had better properties. No such structure was found and utilized and in addition note that in the sdp case both

ˆ

A(k), ˆB(k), ˆC(k)and Qi, Yi, ˆQi are optimization variables. This will make the

com-plexity of the optimization problem grow rapidly with the dimensions of the models and with the number of given models and soon become computationally intractable.

3.6 Examples

In this section we start with an example to show that the method can work as a model reduction tool. Then we present an illustrative example to shed light on some properties of the proposed method, and continue with a more practical ex-ample illustrating the applicability of the proposed method to realistically sized problems.

When solving the examples, the function fminunc in Matlab was used as the quasi-Newton solver framework. To generate a starting point for the solver, which is a an extremely important problem in need of significant amounts of research, the initialization procedure explained in Section 6.1 was used.

As a comparison in the examples, where we generate an lpv-model, we will use a method that will be called element-wise approximation. This method has, for example, been used in Varga et al. [1998]. The main idea of the method is very simple but works very well. Given a number of linearized models, expressed in the same state space basis, one interpolates all the individual elements in the system matrices independently.

(38)

Example 3.3: Model Reduction

In this example we generate 500 random stable systems with 20 states and 2 in-puts and 2 outin-puts using the function rss in Control System Toolbox in Matlab. On each of these systems we reduce the number of states to 15 with the proposed method and then compare with the function hankelmr in Robust Control Tool-box in Matlab, which does model reduction via truncation of Hankel singular values. When we reduce the system with the proposed method, the Hankel re-duction is used as an initial point. In this case the proposed method will work as a refinement method ontop of the Hankel reduction. In Figure 3.3 we see a histogram of how much we can refine, in H2-norm, the system reduced using Hankel reduction with the proposed method. We see that the proposed method works well as an model reduction method and can in most cases decrease the model reduction error 2-6 times, measured in the H2-norm.

0 2 4 6 8 10 12 0 5 10 15 20 25 No. of systems Quotient

Figure 3.3: The figure illustrates, in a histogram, how much a system reduced using Hankel reduction has been improved using the proposed method. The x-axis is the quotient between the H2-norm of the error system using Hankel reduction and of the error system for the proposed method.

Example 3.4: Academic Example

Here a small academic example is presented to show the potential of the new method and to show the importance of addressing system properties.

(39)

3.6 Examples 25

The system in this example is

G = G1G2, where G1= 1 s2_{+ 2ζ} 1s + 1 , G2= 9 s2_{+ 6ζ} 2s + 9 , (3.18a) ζ1= 0.1 + 0.9p, ζ2= 0.1 + 0.9(1 − p), p ∈ [0, 1]. (3.18b) The system was sampled by selecting 30 equidistant points in [0, 1], i.e., we are given 30 linear models with four states each.

The data is given in a state basis where all the elements in the system matrices happen to depend nonlinearly on the parameter p, see Figure 3.4. In this ba-sis it will undoubtedly be hard to find a good low order approximation with an element-wise approach with polynomial dependence of p. The interesting and obvious property of this example is that there exists a state basis for which the model has affine dependence on p; in fact only two elements of the system matrix Aare affine in p while all other matrix elements in A, B and C are constants.

0 0.5 1 −0.4 −0.2 0 0 0.5 1 −1 −0.9 −0.8 0 0.5 1 −0.4 −0.2 0 0 0.5 1 0 0.2 0.4 0 0.5 1 0.8 0.9 1 0 0.5 1 −1 −0.5 0 0 0.5 1 −4 −2 0 0 0.5 1 0 0.5 1 0 0.5 1 −0.4 −0.2 0 0 0.5 1 0 2 4 0 0.5 1 −4 −2 0 0 0.5 1 1 2 3 0 0.5 1 −0.4 −0.2 0 0 0.5 1 0 0.5 1 0 0.5 1 −3 −2 −1 0 0.5 1 −4 −2 0

Figure 3.4: The elements in the A-matrix as function of p. The system is given in a representation where the elements in the matrix depends nonlin-early on p.

To test the algorithm, 15 validation points were generated from (3.18). From the results in Table 3.1 we see that when the proposed nlp-method is used, a high accuracy low order (indeed affine) lpv-model of the system can be found. If we try to obtain a model using an element-wise method with first order polynomials we, of course, obtain a much worse model. Achieving comparable results using an element-wise strategy requires polynomials of order 9. We also see that the

(40)

sdp_{-method is even worse than the element-wise strategy and converges to a local} minimum. To further illustrate the accuracy in the validation points the H2-norm for the error model in the 15 validation points is shown in Figure 3.5.

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

0 0.2 0.4

Error

Error in H2−norm for 15 validations points, element−wise, degree 1

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 1 2x 10 −3 Error

Error in H2−norm for 15 validations points, element−wise, degree 9

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 1 2x 10 −5 Error

Error in H2−norm for 15 validations points, NLP−method, degree 1

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.5 1 Error p

Error in H2−norm for 15 validations points, SDP−method, degree 1

Figure 3.5: The figure illustrates the H2-norm of the error system in 15 val-idation points for the different methods. Note the different scales and that it takes a polynomial of order 9 using the element-wise approach to obtain similar results as with the proposed method using an affine function.

Table 3.1:Numerical results for Example 3.4.

Method P i||Ei||2H₂ Degree Time (s) Proposed nlp-method 6.45 · 10−8 1 40 Element-wise 3.79 · 10−5 9 0.049 Element-wise 0.514 1 0.013 Proposed sdp-method 2.75 1 673

This example illustrates what was said in Section 3.2.1. Even though the realiza-tion of the model given is non-polynomial in the parameters, we are able to find an underlying model with only affine dependence. It thus illustrates the impor-tance to look at the input-output relation and not only the individual elements.

(41)

3.6 Examples 27

To also show that the proposed method, using the nlp-approach, is applicable to real-world examples we will now show how the method performs on larger examples. The example below uses an lpv-model of an aircraft. There are three models of different complexity; all models have 22 states but depend on one, two or three parameters. From the given lpv-model we create a number of lti-models and try to generate an lpv-model that resembles the original one.

Example 3.5: Nonlinear Aircraft Model

The models used in this example were developed and used in the EU-project called cofcluo, see http://cofcluo.isy.liu.se/. The three models de-scribe, with different complexity, an airplane in closed-loop in the longitudinal direction. These models were used to evaluate different performance criteria for the airplane. The models are given as lfrs, which makes it possible to extract any number of linear models in the parameter region. This is how estimation and val-idation data is created. The methods used to evaluate the performance criteria of the models demands the models to be represented as lfrs and preferably lfrs of low order, with respect to the repetitions of the parameters in the ∆-block. This is because the computational complexity grows rapidly for the evaluation methods with the size of the ∆-block. In the following examples we will try to generate an lpv-model that resembles the original lfr, but that can be transformed into an lfr with smaller ∆-block,i.e., reduce the repetitions of the parameters in the

∆-block. We will do this by utilizing the property that you can induce structure in the system matrices and the property that the lpv-generating method can be used to find accurate models of low order.

To reduce the size of the resulting lfr we choose the resulting model structure to be affine in the parameters. We also choose the rank of the coefficient matrix for the constant term to be full and the coefficient matrices for the linear terms to have low rank. The reason for these choices are that the size of the smallest lfr for this kind of lpv-system is known on beforehand, seee.g., Hecker [2006].

One problem with the models are that they are not strictly proper, i.e., D , 0,

which is necessary for the lpv-generating method to be applicable. This fact is circumvented by first ignoring the D-matrix and generate a model for the model without the D-matrix. Then do a polynomial interpolation of the D-matrix, which is a scalar function because it is a single-input-single-output system, and after-words incorporate this function for the D-matrix in the lpv-model before making the lfr.

All the parameters are normed and can vary between −1 and 1. The resulting models are validated using the relative H2-norm for the system without the D-matrix. The models are also validated using the relative H∞-norm for the com-plete system, including the D-matrix.

(42)

The results from the reduction of the different models can be seen in Table 3.2 and Figures 3.6, 3.7 and 3.8. In Table 3.2 we see that the size of the ∆-block have been reduced substantially and in Figures 3.6, 3.7 and 3.8 we see that the reduced models represents the original models well.

Table 3.2: Numerical results for the airplane model in Example 3.5. n∆is the size of the ∆-block before the reduction and n∆,aafter.

No. of n∆ n∆,a Computational No. of models used

parameters time for estimation

1 20 10 31m 55s 10 2 62 24 9h 10m 55s 100 3 98 42 5h 01m 26s 125 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 2.5 3 3.5 4 4.5 5

x 10−5 Relative error in H2−norm for 100 random points

Error p −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 3.5 4 4.5 5

x 10−4 Relative error in Hinf−norm for 100 random points

Error

p

Figure 3.6: Relative error in H2- and H∞-norm at 100 validation points, in the one parameter case.

H -NormBasedLPVModellingandControl NonlinearOptimizationApproachesto

Nonlinear Optimization Approaches

to

H

2

-Norm Based LPV Modelling and

Control

Daniel Petersson

Abstract

Populärvetenskaplig sammanfattning

Acknowledgments

Contents

Notation

1

Introduction

1.1

Background

1.2

Contributions

1.3

Thesis Outline

2

Prerequisites

2.1

Optimization

2.1.1

Local Methods

2.2

System Theory

2.2.1

Basic Theory and Notation

2.2.2

Gramians

2.2.3

System Norms

3

Generation of

LPV

State Space

Models Using

H

2

-Minimization

3.1

Overview of Existing Methods for Generating

LPV

-Models

3.1.1

Global Methods

3.1.2

Local Methods

3.2

An Input-Output Preserving Approach to

LPV

-Modeling

3.2.1

Invariance to State Transformations

3.3

LPV

-Modeling Using the

H

-Norm

3.4

General Nonlinear Programming Approach to

LPV

-Modeling

3.4.1

Cost Function

3.4.2

Gradient of the Cost Function

3.5

Semidefinite Programming Approach to

LPV

-Modeling

3.6

Examples