On Polynomial Coefficients and Rank Constraints

(1)

Technical report from Automatic Control at Linköpings universitet

On Polynomial Coefficients and Rank

Constraints

Anders Helmersson

Division of Automatic Control

E-mail: andersh@isy.liu.se

4th February 2009

Report no.: LiTH-ISY-R-2878

Address:

Department of Electrical Engineering Linköpings universitet

SE-581 83 Linköping, Sweden

WWW: http://www.control.isy.liu.se

AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

Technical reports from the Automatic Control group in Linköping are available from http://www.control.isy.liu.se/publications.

(2)

Abstract

Rank constraints on matrices emerge in many automatic control applica-tions. In this short document we discuss how to rewrite the constraint into a polynomial equations of the elements in a the matrix. If addition semidefinite matrix constraints are included, the polynomial equations can be turned into an inequality. We also briefly discuss how to implement these polynomial constraints.

(3)

Avdelning, Institution Division, Department

Division of Automatic Control Department of Electrical Engineering

Datum Date 2009-02-04 Språk Language 2 Svenska/Swedish 2 Engelska/English 2 Rapporttyp Report category 2 Licentiatavhandling 2 Examensarbete 2 C-uppsats 2 D-uppsats 2 Övrig rapport 2

URL för elektronisk version http://www.control.isy.liu.se

ISBN — ISRN

—

Serietitel och serienummer Title of series, numbering

ISSN 1400-3902

LiTH-ISY-R-2878

Titel Title

On Polynomial Coefficients and Rank Constraints

Författare Author

Anders Helmersson

Sammanfattning Abstract

Rank constraints on matrices emerge in many automatic control applications. In this short document we discuss how to rewrite the constraint into a polynomial equations of the elements in a the matrix. If addition semidefinite matrix constraints are included, the polynomial equations can be turned into an inequality. We also briefly discuss how to implement these polynomial constraints.

Nyckelord

(4)

1 Introduction

We employ the coefficients in characteristic polynomial of a matrix Z as closeness measure to a rank constraint.We assume that Z has no negative eigenvalues.

Rank constraints are for instance used when searching for reduced-order H∞ controller. The existence of such a controller can be described in terms

of two linear matrix inequalities (LMIs) in two symmetric matrix variables, X, Y ∈ Rn×n, where n denotes the order of the system to be controlled, see [2]. A third LMI connects X and Y :

X I

I Y

0. (1)

The existence of a controller of reduced order, r < n, is given by rank X I I Y ≤ n + r. (2)

The rank condition (2) can also be rewritten (using Schur complement) as

rank(XY − I) ≤ r. (3)

This report is an extension of LiTH-ISY-R-2867. New upper and lower bounds of cn−r−1/cn−r with proofs are provided.

2 Polynomial Criterion

The characteristic polynomial of a matrix Z ∈ Rn×n _{is defined by}

det(λI − Z) =

n

X

i=0

ci(Z)λi. (4)

where the coefficients, ci(Z), are polynomial functions of the elements in Z. For

instance, c0(Z) = det(Z), cn−1(Z) = tr(Z) and cn(Z) = 1. Note that when

Z = I − XY , the characteristic polynomial can also be defined in terms of the Hankel singular values, σi, of (X, Y ) as

det (λ − 1)I + Σ2) = n Y i=1 λ + σ2_i − 1 = n X i=0 ciλi.

Lemma 1. Let Z ∈ Rn×n _{be a matrix with real non-negative eigenvalues,}

λi(Z) ≥ 0, and let ci(−Z) be the coefficients of the characteristic polynomial

of −Z as defined in (4). Then, the following statements are equivalent if r < n: (i) cn−r−1(−Z) = 0;

(ii) rank Z ≤ r.

Proof. Showing (ii) ⇒ (i) is trivial, since (ii) is equivalent to that λr+1(Z) =

. . . = λn(Z) = 0, where λi(Z) denotes the ith eigenvalue of Z. To show (i) ⇒

(ii), we note that

cn−r−1(−Z) = X J ∈Cr+1([1,n]) Y i∈J λi(Z), (5)

(5)

where Ci(U ) denotes the set of all combinations of i elements from the set U1;

for instance, C2({1, 2, 3}) = {{1, 2}, {1, 3}, {2, 3}}.

Since, every λi(Z) ≥ 0, we conclude that cn−r−1(−Z) = 0 implies that every

product of (5) must be zero, and consequently at least one factor, λi(Z), in each

product must be zero. Thus, at least n − r factors, λi(Z) must be zero, and

consequently (ii) holds.

Instead of using the coefficient cn−r−1(−Z) directly as a measure of the

matrix’s closeness to the rank condition, rank Z ≤ r, we instead use cn−r−1(−Z)

cn−r(−Z)

as a measure of closeness. It has the nice property that if n − r eigenvalues values, λr+1, . . . , λn, are close to 0 and the remaining ones are large, then

cn−r−1(−Z) cn−r(−Z) ≈ n X i=r+1 λi(Z).

More specifically, the following relations hold in general.

Lemma 2. Let Z be a matrix with non-negative eigenvalues ordered by λ1(Z) ≥

λ2(Z) ≥ . . . ≥ λn(Z) ≥ 0. Then, the following relations hold:

1 r + 1 n X i=r+1 λi(Z) ≤ cn−r−1(−Z) cn−r(−Z) ≤ n X i=r+1 λi(Z), (6) or, equivalently, cn−r−1(−Z) cn−r(−Z) ≤ n X i=r+1 λi(Z) ≤ (r + 1) cn−r−1(−Z) cn−r(−Z) . (7)

Proof. The upper bound in (6) (the lower bound in (7)) is obtained by observing that Cr([1, k − 1]) ⊆ Cr([1, n]) assuming k ≤ n, and consequently

n X k=r+1 λk ! X J ∈Cr([1,n]) Y i∈J λi | {z } cn−r(−Z) ≥ n X k=r+1  λk X J ∈Cr([1,k−1]) Y i∈J λi   = X J ∈Cr+1([1,n]) Y i∈J λi | {z } cn−r−1(−Z) ,

where we have dropped the argument (Z).

The lower bound in (6) (the upper bound in (7)) is obtained by first observing that (r + 1) X J ∈Cr+1([1,n]) Y i∈J λi | {z } cn−r−1(−Z) , = X J ∈Cr([1,n])   X k∈[1,n]\J λk Y i∈J λi  .

1 _{Formally, we can define C}

i(U ) recursively by C0 = {∅}, and Ci+1(U ) = {{x} ∪ y : x ∈ U, y ∈ Ci(U \ {x})} for i ≥ 0.

(6)

Every term in the left-hand side is generated by the right-hand side in (r + 1) copies, which is the reason for the factor in the left-hand side. Next using the fact that the eigenvalues, λk are ordered we can use

X k∈[1,n]\J λk ≥ n X k=r+1 λk, and consequently, (r + 1) X J ∈Cr+1([1,n]) Y i∈J λi | {z } cn−r−1(−Z) , ≥ X J ∈Cr([1,n]) n X k=r+1 λk Y i∈J λi ! = n X k=r+1 λk ! X J ∈Cr([1,n]) Y i∈J λi | {z } cn−r(−Z) ,

from which the relation follows.

3 Derivatives of c

i

(Z)

Let C(Z) denote the characteristic polynomical of Z(x) and let Zi =_∂x∂

iZ(x): C(Z) = det(λI − Z) = n X i=0 ci(Z)λi.

We can use the fact that ∂ ∂xi log det Z(x) = tr Z−1(x)Zi. We apply this on C(Z), ∂ ∂xi

det(λI − Z(x)) = − det(λI − Z(x)) tr(λI − Z(x))−1Zi.

Next, introduce the polynomial

P (λ) =cnIλn−1+ (cnZ + cn−1I) λn−2+ . . . + cnZn−1+ cn−1Zn−2+ . . . + c1I .

Then

P (λ)(λI − Z(x)) = cnIλn−1(λI − Z(x))

+ (cnZ + cn−1I) λn−2(λI − Z(x)) + . . . + cnZn−1+ cn−1Zn−2+ . . . + c1I (λI − Z(x)) = cnλn+ cn−1λn−1+ . . . + c1λ I − cnZn+ cn−1Zn−1+ . . . + c1Z | {z } −c0I = cnλn+ cn−1λn−1+ . . . + c1λ + c0 I = det(λI − Z(x))I 4

(7)

where we have used the fact that the charactestic polynomial of Z applied to itself yields zero.

Consequently, (λI − Z(x))−1= P (λ)/ det(λI − Z(x)), and ∂ ∂xi C(Z) = ∂ ∂xi det(λI − Z(x)) = − tr P (λ)Zi, and ∂ck ∂xi = ( 0, k = n − tr cnZn−k−1+ cn−1Zn−k−2+ . . . + ck+1I Zi, otherwise (8)

Note that ck(Z) can be computed as a polynomial function of tr Z, tr Z2, . . . ,

tr Zn−k_{. For instance, c}

n = 1, cn−1 = − tr Z, cn−2 = 1₂ (tr Z)2− tr Z2, and

cn−3= 1₆ −2 tr Z3+ 3(tr Z)(tr Z2) − (tr Z)3.

3.1 Computing First-Order Derivatives

Higher-order derivatives of ck(Z) can be derived analogously. Here we try to

find an efficient implementation of the computation of the first and second-order derivatives. It is possible to compute the first-second-order derivatives in O(n4) operations (multiplications and additions). Here, the main effort lies in the computation of Z, Z2_{, . . . , Z}n−k−1_.

First, we compute the coefficients ci(Z) in the characteristic polynomial of

Z. This can be done using the poly function in Matlab, but a more efficient algorithm due to Berkowitz [1] is faster and requires O(n4_{) operations. A simple}

Matlab implementation is given below function p = berkowitz (A) [n, m] = size (A);

if n ~= m, error (’A must be square’); end; p = 1; for k = n: -1: 1, R = A(k,k+1:n); ci = [1 -A(k,k)]; for i = k+1:n ci = [ci -R*A(k+1:n,k)]; R = R*A(k+1:n,k+1:n); end

p = toeplitz (ci, [1 zeros(1,n-k)]) * p; end

Next, using the coefficients, ci(Z), we can compute the following series of

matrices

Ck= cn(Z)Zn−k−1+ cn−1(Z)Zn−k−2+ . . . + ck+1(Z)I. (9)

(8)

This is most efficiently computed iteratively as Cn= 0, Cn−1= I, Cn−2= Z + cn−1(Z)I, .. . Ck= Ck+1Z + ck+1(Z)I, .. . C0= C1Z + c1(Z)I,

which in total requires O(n4_{) operations.}

Now, the first-order derivatives of ck(Z) can be computed as

∂ck

∂xi

= − tr CkZi

where the computation of each coefficient requires O(n2_{) operations.}

Conse-quently, n2_{derivatives can be computed in O(n}4_{) operations.}

3.2 Computing Second-Order Derivatives

The second-order derivatives are given by ∂2_c k ∂xi∂xj = − ∂ ∂xj tr CkZi = − tr∂Ck ∂xj Zi− tr CkZij

The derivatives of Ci can be computed iteratively as

∂Cn ∂xj = 0, ∂Cn−1 ∂xj = 0, ∂Cn−2 ∂xj = Zj+ ∂cn−1 ∂xj I, .. . ∂Ck ∂xj = ∂Ck+1 ∂xj Z + Ck+1Zj+ ∂ck+1 ∂xj I, .. . ∂C0 ∂xj = ∂C1 ∂xj Z + C1Zj+ ∂c1 ∂xj I, 6

(9)

Consequently, ∂2ck ∂xi∂xj = − tr ∂Ck+1 ∂xj Z + Ck+1Zj+ ∂ck+1 ∂xj I Zi− tr CkZij = − tr∂Ck+1 ∂xj ZZi− tr Ck+1ZjZi− ∂ck+1 ∂xj tr Zi− tr CkZij = − tr ∂Ck+2 ∂xj Z + Ck+2Zj+ ∂ck+2 ∂xj ZZi − tr Ck+1ZjZi+ tr Ck+1Zjtr Zi− tr CkZij = − tr∂Ck+2 ∂xj Z2Zi+ (tr Ck+2Zj) (tr ZZi) − tr Ck+2ZjZZi − tr Ck+1ZjZi+ tr Ck+1Zjtr Zi− tr CkZij.

After reordering of terms we get, ∂2_c k ∂xi∂xj = (tr Cn−1Zj) tr Zn−k−2Zi − tr Cn−1ZjZn−k−2Zi + . . . + (tr Ck+2Zj) (tr ZZi) − tr Ck+2ZjZZi + (tr Ck+1Zj) (tr Zi) − tr Ck+1ZjZi − tr CkZik.

Without any structure in Ziwe need as much as O(n6) operations to compute ∂2_c

k

∂xi∂xj, since the computation of all CkZj and Z

k_Z

j requires O(n6) operations.

This is based on the assumption that there are n2_{elements in Z such that i and}

j are from 1 to n2_.

3.3 Improving Computational Efficiency

With structure in Z, we can reduce this. If Zi = LiRi and Zj = LjRj are

assumed to contain single elements (ones). Here we assume that L and R are single-rank matrices. We can factor each terms as

tr CmZjZkZi= tr RiCmLjRjZkLi= (tr RiCmLj) (tr RjZnLi) .

Also,

(tr CmZj) tr ZkZi = (tr RjCmLj) tr RiZkLi

Since we can compute Cn, Cn−1, . . . , C1 and Z, Z2, . . . , Zn−1 using 2(n − 2)

matrix n × n multiplications we need O(n4_{) operations. Using the structure of}

Z, the trace operation, tr RiCmLj can be replaced by a simple extraction of the

appropriate element in Cmwhere Ri defines the row and Lj defines the column.

If Z is diagonal, we can reduce this to O(n2) operations not including the operations needed to compute ci.

In the case when Z = I − XY , the number of operations will be of the same order. Note that Zi= −XiY − XYi, where one of Xi and Yi is zero, and

Zij = −XiYj.

(10)

For instance, the second-order derivatives of cn−1and cn−2become ∂2_c n−1 ∂xi∂xj = − tr Zij, and ∂2_c n−2 ∂xi∂xj = (tr Zi)(tr Zj) − tr ZiZj− tr Cn−2Zij

References

[1] Stuart J. Berkowitz. On computing the determinant in small parallel time using a small number of processors. Inf. Process. Lett., 18(3):147–150, 1984. [2] P. Gahinet and P. Apkarian. A linear matrix inequality approach to H∞

control. International Journal of Robust and Nonlinear Control, 4(4):421– 448, 1994.