Technical report from Automatic Control at Linköpings universitet
On Polynomial Coefficients and Rank Constraints
Anders Helmersson
Division of Automatic Control
E-mail: andersh@isy.liu.se
15th October 2008
Report no.: LiTH-ISY-R-2867
Address:
Department of Electrical Engineering, Linköpings universitet
SE-581 83 Linköping, Sweden
WWW: http://www.control.isy.liu.se
Technical reports from the Automatic Control group in Linköping are available from http://www.control.isy.liu.se/publications.
Abstract
Rank constraints on matrices emerge in many automatic control applications. In this short document we discuss how to rewrite such a constraint as a polynomial equation in the elements of the matrix. If additional semidefinite matrix constraints are included, the polynomial equation can be turned into an inequality. We also briefly discuss how to implement these polynomial constraints.
ISSN 1400-3902
1 Introduction
We employ the coefficients of the characteristic polynomial of a matrix Z as a closeness measure to a rank constraint, under the assumption that Z has no negative eigenvalues.
Rank constraints are for instance used when searching for reduced-order H∞ controllers. The existence of such a controller can be described in terms of two linear matrix inequalities (LMIs) in two symmetric matrix variables, X, Y ∈ R^{n×n}, where n denotes the order of the system to be controlled; see [2]. A third LMI connects X and Y:
$$\begin{bmatrix} X & I \\ I & Y \end{bmatrix} \succeq 0. \tag{1}$$
The existence of a controller of reduced order, r < n, is given by

$$\operatorname{rank}\begin{bmatrix} X & I \\ I & Y \end{bmatrix} \le n + r. \tag{2}$$
The rank condition (2) can also be rewritten (using the Schur complement) as

$$\operatorname{rank}(XY - I) \le r. \tag{3}$$
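As a quick numerical sanity check (not part of the report; the construction of X and Y below is a hypothetical example), the equivalence of the rank conditions (2) and (3) can be observed with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 4, 2

# Hypothetical example: X = I and Y = I + U U^T with U of rank r,
# so that XY - I = U U^T has rank exactly r.
X = np.eye(n)
U = rng.standard_normal((n, r))
Y = np.eye(n) + U @ U.T

# Block matrix of the LMI (1)
M = np.block([[X, np.eye(n)], [np.eye(n), Y]])

# Rank condition (2): rank M = n + r here
assert np.linalg.matrix_rank(M) == n + r
# Rank condition (3): rank(XY - I) = r here
assert np.linalg.matrix_rank(X @ Y - np.eye(n)) == r
```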
2 Polynomial Criterion
The characteristic polynomial of a matrix Z ∈ R^{n×n} is defined by

$$\det(\lambda I - Z) = \sum_{i=0}^{n} c_i(Z)\,\lambda^i, \tag{4}$$
where the coefficients, c_i(Z), are polynomial functions of the elements in Z. For instance, c_0(Z) = (−1)^n det(Z), c_{n−1}(Z) = −tr(Z), and c_n(Z) = 1. Note that when Z = I − XY, the characteristic polynomial can also be expressed in terms of the Hankel singular values, σ_i, of (X, Y) as
$$\det\bigl((\lambda - 1)I + \Sigma^2\bigr) = \prod_{i=1}^{n} \bigl(\lambda + \sigma_i^2 - 1\bigr) = \sum_{i=0}^{n} c_i \lambda^i.$$
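As a sketch of these coefficient conventions in code (NumPy's poly returns the coefficients of det(λI − Z) in descending powers of λ, so array index j corresponds to c_{n−j}; the matrix below is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
Z = rng.standard_normal((n, n))

c = np.poly(Z)  # c[0] = c_n, c[1] = c_{n-1}, ..., c[n] = c_0

assert np.isclose(c[0], 1.0)                            # c_n = 1
assert np.isclose(c[1], -np.trace(Z))                   # c_{n-1} = -tr Z
assert np.isclose(c[-1], (-1) ** n * np.linalg.det(Z))  # c_0 = (-1)^n det Z
```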
Lemma 1. Let Z ∈ R^{n×n} be a matrix with real non-positive eigenvalues, λ_i(Z) ≤ 0, and let c_i(Z) be the coefficients of the characteristic polynomial of Z as defined in (4). Then, the following statements are equivalent if r < n:

(i) c_{n−r−1}(Z) = 0;

(ii) rank Z ≤ r.
Proof. Showing (ii) ⇒ (i) is trivial, since (ii) is equivalent to λ_{r+1}(Z) = . . . = λ_n(Z) = 0, where λ_i(Z) denotes the ith eigenvalue of Z. To show (i) ⇒ (ii), we note that

$$c_{n-r-1}(Z) = \sum_{J \in C_{r+1}([1,n])} \; \prod_{i \in J} \lambda_i(-Z), \tag{5}$$

where C_i(U) denotes the set of all combinations of i elements from the set U.¹ Since every λ_i(−Z) ≥ 0, we conclude that c_{n−r−1}(Z) = 0 implies that every product in (5) must be zero, and consequently at least one factor, λ_i(Z), in each product must be zero. Thus, at least n − r factors λ_i(Z) must be zero, and consequently (ii) holds.
As a measure of how close the matrix Z is to satisfying the rank constraint rank Z ≤ r, we can use c_{n−r−1}(Z)/c_{n−r}(Z), which seems to have some nice properties. For instance, if we are close to rank Z ≤ r, such that (n − r) eigenvalues, λ_i, i = r + 1, . . . , n, are close to 0 and the remaining ones are large, then

$$\frac{c_{n-r-1}(Z)}{c_{n-r}(Z)} \approx \sum_{i=r+1}^{n} \lambda_i(-Z).$$
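To illustrate (a hypothetical numerical example, not from the report), take n = 4, r = 2 and a symmetric Z with eigenvalues −5, −4, −0.01, −0.02; the ratio c_{n−r−1}/c_{n−r} then comes out close to 0.01 + 0.02 = 0.03, the sum of the small eigenvalue magnitudes:

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 4, 2

# Arbitrary orthogonal basis; eigenvalues chosen so that two are near zero
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
Z = Q @ np.diag([-5.0, -4.0, -0.01, -0.02]) @ Q.T

c = np.poly(Z)            # c[j] = c_{n-j}
ratio = c[r + 1] / c[r]   # c_{n-r-1} / c_{n-r}

# ratio approximates 0.01 + 0.02 = 0.03
assert abs(ratio - 0.03) < 1e-3
```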
3 Derivatives of c_i(Z)
Let C(Z) denote the characteristic polynomial of Z(x) and let Z_i = ∂Z(x)/∂x_i:

$$C(Z) = \det(\lambda I - Z) = \sum_{i=0}^{n} c_i(Z)\lambda^i.$$

We can use the fact that ∂/∂x_i log det Z(x) = tr(Z(x)^{−1} Z_i). Applying this to C(Z),

$$\frac{\partial}{\partial x_i} \det(\lambda I - Z(x)) = -\det(\lambda I - Z(x)) \operatorname{tr}\bigl((\lambda I - Z(x))^{-1} Z_i\bigr).$$
Next, introduce the polynomial

$$P(\lambda) = c_n I \lambda^{n-1} + (c_n Z + c_{n-1} I)\lambda^{n-2} + \ldots + \bigl(c_n Z^{n-1} + c_{n-1} Z^{n-2} + \ldots + c_1 I\bigr).$$

Then

$$\begin{aligned}
P(\lambda)(\lambda I - Z(x)) &= c_n I \lambda^{n-1}(\lambda I - Z(x)) + (c_n Z + c_{n-1} I)\lambda^{n-2}(\lambda I - Z(x)) + \ldots \\
&\quad + \bigl(c_n Z^{n-1} + c_{n-1} Z^{n-2} + \ldots + c_1 I\bigr)(\lambda I - Z(x)) \\
&= \bigl(c_n \lambda^n + c_{n-1}\lambda^{n-1} + \ldots + c_1 \lambda\bigr) I - \underbrace{\bigl(c_n Z^n + c_{n-1} Z^{n-1} + \ldots + c_1 Z\bigr)}_{-c_0 I} \\
&= \bigl(c_n \lambda^n + c_{n-1}\lambda^{n-1} + \ldots + c_1 \lambda + c_0\bigr) I = \det(\lambda I - Z(x)) I,
\end{aligned}$$
¹Formally, we can define C_i(U) by C_0(U) = {∅}, and C_i(U) = {{x} ∪ y : x ∈ U, y ∈ C_{i−1}(U \ {x})} for i > 0.
where we have used the fact that the characteristic polynomial of Z applied to Z itself yields zero (the Cayley–Hamilton theorem).
Consequently, (λI − Z(x))^{−1} = P(λ)/det(λI − Z(x)), and

$$\frac{\partial}{\partial x_i} C(Z) = \frac{\partial}{\partial x_i} \det(\lambda I - Z(x)) = -\operatorname{tr}\bigl(P(\lambda) Z_i\bigr),$$

and

$$\frac{\partial c_k}{\partial x_i} = \begin{cases} 0, & k = n, \\ -\operatorname{tr}\bigl((c_n Z^{n-k-1} + c_{n-1} Z^{n-k-2} + \ldots + c_{k+1} I) Z_i\bigr), & \text{otherwise.} \end{cases} \tag{6}$$
Note that c_k(Z) can be computed as a polynomial function of tr Z, tr Z², . . . , tr Z^{n−k}. For instance,

$$c_n = 1, \quad c_{n-1} = -\operatorname{tr} Z, \quad c_{n-2} = \tfrac{1}{2}\bigl((\operatorname{tr} Z)^2 - \operatorname{tr} Z^2\bigr),$$
$$c_{n-3} = \tfrac{1}{6}\bigl(-2 \operatorname{tr} Z^3 + 3 (\operatorname{tr} Z)(\operatorname{tr} Z^2) - (\operatorname{tr} Z)^3\bigr).$$
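These trace expressions are easy to check numerically (a sketch; the random matrix is arbitrary and the coefficient array follows NumPy's descending-power convention):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
Z = rng.standard_normal((n, n))

c = np.poly(Z)  # c[j] = c_{n-j}
t1, t2, t3 = np.trace(Z), np.trace(Z @ Z), np.trace(Z @ Z @ Z)

assert np.isclose(c[1], -t1)                                    # c_{n-1}
assert np.isclose(c[2], 0.5 * (t1 ** 2 - t2))                   # c_{n-2}
assert np.isclose(c[3], (-2 * t3 + 3 * t1 * t2 - t1 ** 3) / 6)  # c_{n-3}
```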
3.1 Computing First-Order Derivatives
Higher-order derivatives of c_k(Z) can be derived analogously. Here we try to find an efficient implementation of the computation of the first- and second-order derivatives. It is possible to compute the first-order derivatives in O(n⁴) operations (multiplications and additions). Here, the main effort lies in the computation of Z, Z², . . . , Z^{n−k−1}.
First, we compute the coefficients c_i(Z) of the characteristic polynomial of Z. This can be done using the poly function in Matlab, but an algorithm due to Berkowitz [1] is more efficient and requires O(n⁴) operations. A simple Matlab implementation is given below:

function p = berkowitz (A)
  [n, m] = size (A);
  if n ~= m, error ('A must be square'); end;
  p = 1;
  for k = n:-1:1,
    R = A(k,k+1:n);
    ci = [1 -A(k,k)];
    for i = k+1:n
      ci = [ci -R*A(k+1:n,k)];
      R = R*A(k+1:n,k+1:n);
    end
    p = toeplitz (ci, [1 zeros(1,n-k)]) * p;
  end
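For readers without Matlab, a direct NumPy translation of the routine above may be useful (a sketch; the multiplication by the lower-triangular Toeplitz matrix is realized as a truncated convolution):

```python
import numpy as np

def berkowitz(A):
    """Coefficients of det(lambda*I - A), highest power first (like np.poly)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if A.shape != (n, n):
        raise ValueError('A must be square')
    p = np.array([1.0])
    for k in range(n - 1, -1, -1):      # rows from last to first (0-based)
        R = A[k, k + 1:]
        ci = [1.0, -A[k, k]]
        for _ in range(n - k - 1):
            ci.append(-(R @ A[k + 1:, k]))
            R = R @ A[k + 1:, k + 1:]
        # Toeplitz multiplication = convolution truncated to len(ci) terms
        p = np.convolve(np.array(ci), p)[:len(ci)]
    return p

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))
assert np.allclose(berkowitz(A), np.poly(A))
```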
Next, using the coefficients c_i(Z), we can compute the following series of matrices:

$$C_k = c_n(Z) Z^{n-k-1} + c_{n-1}(Z) Z^{n-k-2} + \ldots + c_{k+1}(Z) I. \tag{7}$$

This is most efficiently computed iteratively as

$$C_n = 0, \quad C_{n-1} = I, \quad C_{n-2} = Z + c_{n-1}(Z) I, \quad \ldots, \quad C_k = C_{k+1} Z + c_{k+1}(Z) I, \quad \ldots, \quad C_0 = C_1 Z + c_1(Z) I,$$

which in total requires O(n⁴) operations.
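A small NumPy check that the recursion reproduces the direct formula (7) (a sketch; indices follow the paper, with c_k read from NumPy's descending-order coefficient array):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
Z = rng.standard_normal((n, n))
coeff = np.poly(Z)
c = lambda k: coeff[n - k]  # c_k in the paper's indexing

# Recursion: C_n = 0, C_{n-1} = I, C_k = C_{k+1} Z + c_{k+1} I
C = [None] * (n + 1)
C[n] = np.zeros((n, n))
C[n - 1] = np.eye(n)
for k in range(n - 2, -1, -1):
    C[k] = C[k + 1] @ Z + c(k + 1) * np.eye(n)

# Direct formula (7): C_k = sum_{m=k+1}^{n} c_m Z^{m-k-1}
for k in range(n):
    direct = sum(c(m) * np.linalg.matrix_power(Z, m - k - 1)
                 for m in range(k + 1, n + 1))
    assert np.allclose(C[k], direct)
```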
Now, the first-order derivatives of c_k(Z) can be computed as

$$\frac{\partial c_k}{\partial x_i} = -\operatorname{tr}(C_k Z_i),$$

where the computation of each coefficient requires O(n²) operations. Consequently, n² derivatives can be computed in O(n⁴) operations.
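The formula can be checked against central finite differences. Here, as an assumed parameterization (not from the report), Z is parameterized by its own entries, so that Z_i is a single-entry matrix E_ab and ∂c_k/∂Z_ab = −tr(C_k E_ab) = −(C_k)_ba:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
Z = rng.standard_normal((n, n))
coeff = np.poly(Z)

# C_k via the recursion C_k = C_{k+1} Z + c_{k+1} I
C = [None] * (n + 1)
C[n] = np.zeros((n, n))
C[n - 1] = np.eye(n)
for k in range(n - 2, -1, -1):
    C[k] = C[k + 1] @ Z + coeff[n - (k + 1)] * np.eye(n)

k, h = 1, 1e-6
for a in range(n):
    for b in range(n):
        Zp, Zm = Z.copy(), Z.copy()
        Zp[a, b] += h
        Zm[a, b] -= h
        # central difference of c_k w.r.t. the (a, b) entry of Z
        fd = (np.poly(Zp)[n - k] - np.poly(Zm)[n - k]) / (2 * h)
        assert abs(fd - (-C[k][b, a])) < 1e-5
```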
3.2 Computing Second-Order Derivatives
The second-order derivatives are given by

$$\frac{\partial^2 c_k}{\partial x_i \partial x_j} = -\frac{\partial}{\partial x_j} \operatorname{tr}(C_k Z_i) = -\operatorname{tr}\Bigl(\frac{\partial C_k}{\partial x_j} Z_i\Bigr) - \operatorname{tr}(C_k Z_{ij}).$$
The derivatives of C_k can be computed iteratively as

$$\frac{\partial C_n}{\partial x_j} = 0, \quad \frac{\partial C_{n-1}}{\partial x_j} = 0, \quad \frac{\partial C_{n-2}}{\partial x_j} = Z_j + \frac{\partial c_{n-1}}{\partial x_j} I, \quad \ldots, \quad \frac{\partial C_k}{\partial x_j} = \frac{\partial C_{k+1}}{\partial x_j} Z + C_{k+1} Z_j + \frac{\partial c_{k+1}}{\partial x_j} I, \quad \ldots, \quad \frac{\partial C_0}{\partial x_j} = \frac{\partial C_1}{\partial x_j} Z + C_1 Z_j + \frac{\partial c_1}{\partial x_j} I.$$
Consequently,

$$\begin{aligned}
\frac{\partial^2 c_k}{\partial x_i \partial x_j}
&= -\operatorname{tr}\Bigl(\Bigl(\frac{\partial C_{k+1}}{\partial x_j} Z + C_{k+1} Z_j + \frac{\partial c_{k+1}}{\partial x_j} I\Bigr) Z_i\Bigr) - \operatorname{tr}(C_k Z_{ij}) \\
&= -\operatorname{tr}\Bigl(\frac{\partial C_{k+1}}{\partial x_j} Z Z_i\Bigr) - \operatorname{tr}(C_{k+1} Z_j Z_i) - \frac{\partial c_{k+1}}{\partial x_j} \operatorname{tr} Z_i - \operatorname{tr}(C_k Z_{ij}) \\
&= -\operatorname{tr}\Bigl(\Bigl(\frac{\partial C_{k+2}}{\partial x_j} Z + C_{k+2} Z_j + \frac{\partial c_{k+2}}{\partial x_j} I\Bigr) Z Z_i\Bigr) - \operatorname{tr}(C_{k+1} Z_j Z_i) + (\operatorname{tr} C_{k+1} Z_j)(\operatorname{tr} Z_i) - \operatorname{tr}(C_k Z_{ij}) \\
&= -\operatorname{tr}\Bigl(\frac{\partial C_{k+2}}{\partial x_j} Z^2 Z_i\Bigr) + (\operatorname{tr} C_{k+2} Z_j)(\operatorname{tr} Z Z_i) - \operatorname{tr}(C_{k+2} Z_j Z Z_i) \\
&\quad - \operatorname{tr}(C_{k+1} Z_j Z_i) + (\operatorname{tr} C_{k+1} Z_j)(\operatorname{tr} Z_i) - \operatorname{tr}(C_k Z_{ij}).
\end{aligned}$$
After reordering of terms we get

$$\begin{aligned}
\frac{\partial^2 c_k}{\partial x_i \partial x_j}
&= (\operatorname{tr} C_{n-1} Z_j)(\operatorname{tr} Z^{n-k-2} Z_i) - \operatorname{tr}(C_{n-1} Z_j Z^{n-k-2} Z_i) + \ldots \\
&\quad + (\operatorname{tr} C_{k+2} Z_j)(\operatorname{tr} Z Z_i) - \operatorname{tr}(C_{k+2} Z_j Z Z_i) \\
&\quad + (\operatorname{tr} C_{k+1} Z_j)(\operatorname{tr} Z_i) - \operatorname{tr}(C_{k+1} Z_j Z_i) - \operatorname{tr}(C_k Z_{ij}).
\end{aligned}$$
Without any structure in Z_i, we need as much as O(n⁶) operations to compute ∂²c_k/∂x_i∂x_j, since the computation of all C_k Z_j and Z^k Z_j requires O(n⁶) operations. This is based on the assumption that there are n² elements in Z, such that i and j range from 1 to n².
3.3 Improving Computational Efficiency
With structure in Z, we can reduce this. Assume that Z_i = L_i R_i and Z_j = L_j R_j are rank-one matrices; for instance, each Z_i may contain a single nonzero element equal to one, in which case L_i is a column vector and R_i a row vector. We can then factor each term as

$$\operatorname{tr}(C_m Z_j Z^k Z_i) = \operatorname{tr}(R_i C_m L_j R_j Z^k L_i) = (\operatorname{tr} R_i C_m L_j)(\operatorname{tr} R_j Z^k L_i).$$

Also,

$$(\operatorname{tr} C_m Z_j)(\operatorname{tr} Z^k Z_i) = (\operatorname{tr} R_j C_m L_j)(\operatorname{tr} R_i Z^k L_i).$$
Since we can compute C_n, C_{n−1}, . . . , C_1 and Z, Z², . . . , Z^{n−1} using 2(n − 2) matrix n × n multiplications, we need O(n⁴) operations. Using the structure of Z, the trace operation tr(R_i C_m L_j) can be replaced by a simple extraction of the appropriate element in C_m, where R_i defines the row and L_j defines the column. If Z is diagonal, we can reduce this to O(n²) operations, not including the operations needed to compute c_i.
In the case when Z = I − XY, the number of operations will be of the same order. Note that Z_i = −X_i Y − X Y_i, where one of X_i and Y_i is zero, and Z_{ij} = −X_i Y_j. For instance, the second-order derivatives of c_{n−1} and c_{n−2} become

$$\frac{\partial^2 c_{n-1}}{\partial x_i \partial x_j} = -\operatorname{tr} Z_{ij}, \qquad \frac{\partial^2 c_{n-2}}{\partial x_i \partial x_j} = (\operatorname{tr} Z_i)(\operatorname{tr} Z_j) - \operatorname{tr}(Z_i Z_j) - \operatorname{tr}(C_{n-2} Z_{ij}).$$
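As a final sanity check (a sketch with an assumed linear parameterization of Z, so that Z_{ij} = 0 and the last term drops out), the second-order formula for c_{n−2} can be compared with a finite-difference Hessian:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 3
Z0 = rng.standard_normal((n, n))

def c_nm2(Z):  # c_{n-2} = ((tr Z)^2 - tr Z^2) / 2
    return 0.5 * (np.trace(Z) ** 2 - np.trace(Z @ Z))

h = 1e-4
Ei = np.zeros((n, n)); Ei[0, 1] = 1.0   # Z_i (single-entry direction)
Ej = np.zeros((n, n)); Ej[1, 0] = 1.0   # Z_j

# Predicted mixed derivative: (tr Z_i)(tr Z_j) - tr(Z_i Z_j), with Z_ij = 0
pred = np.trace(Ei) * np.trace(Ej) - np.trace(Ei @ Ej)

# Four-point finite-difference approximation of the mixed derivative
fd = (c_nm2(Z0 + h * Ei + h * Ej) - c_nm2(Z0 + h * Ei - h * Ej)
      - c_nm2(Z0 - h * Ei + h * Ej) + c_nm2(Z0 - h * Ei - h * Ej)) / (4 * h * h)

assert abs(fd - pred) < 1e-6
```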
References
[1] Stuart J. Berkowitz. On computing the determinant in small parallel time using a small number of processors. Information Processing Letters, 18(3):147–150, 1984.

[2] P. Gahinet and P. Apkarian. A linear matrix inequality approach to H∞ control. International Journal of Robust and Nonlinear Control, 4(4):421–448, 1994.