Technical report from Automatic Control at Linköpings universitet
On Polynomial Coefficients and Rank
Constraints
Anders Helmersson
Division of Automatic Control
E-mail: andersh@isy.liu.se
4th February 2009
Report no.: LiTH-ISY-R-2878
Address:
Department of Electrical Engineering Linköpings universitet
SE-581 83 Linköping, Sweden
WWW: http://www.control.isy.liu.se
AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET
Technical reports from the Automatic Control group in Linköping are available from http://www.control.isy.liu.se/publications.
Abstract
Rank constraints on matrices emerge in many automatic control applica-tions. In this short document we discuss how to rewrite the constraint into a polynomial equations of the elements in a the matrix. If addition semidefinite matrix constraints are included, the polynomial equations can be turned into an inequality. We also briefly discuss how to implement these polynomial constraints.
Avdelning, Institution Division, Department
Division of Automatic Control Department of Electrical Engineering
Datum Date 2009-02-04 Språk Language 2 Svenska/Swedish 2 Engelska/English 2 Rapporttyp Report category 2 Licentiatavhandling 2 Examensarbete 2 C-uppsats 2 D-uppsats 2 Övrig rapport 2
URL för elektronisk version http://www.control.isy.liu.se
ISBN — ISRN
—
Serietitel och serienummer Title of series, numbering
ISSN 1400-3902
LiTH-ISY-R-2878
Titel Title
On Polynomial Coefficients and Rank Constraints
Författare Author
Anders Helmersson
Sammanfattning Abstract
Rank constraints on matrices emerge in many automatic control applications. In this short document we discuss how to rewrite the constraint into a polynomial equations of the elements in a the matrix. If addition semidefinite matrix constraints are included, the polynomial equations can be turned into an inequality. We also briefly discuss how to implement these polynomial constraints.
Nyckelord
1
Introduction
We employ the coefficients in characteristic polynomial of a matrix Z as closeness measure to a rank constraint.We assume that Z has no negative eigenvalues.
Rank constraints are for instance used when searching for reduced-order H∞ controller. The existence of such a controller can be described in terms
of two linear matrix inequalities (LMIs) in two symmetric matrix variables, X, Y ∈ Rn×n, where n denotes the order of the system to be controlled, see [2]. A third LMI connects X and Y :
X I
I Y
0. (1)
The existence of a controller of reduced order, r < n, is given by rank X I I Y ≤ n + r. (2)
The rank condition (2) can also be rewritten (using Schur complement) as
rank(XY − I) ≤ r. (3)
This report is an extension of LiTH-ISY-R-2867. New upper and lower bounds of cn−r−1/cn−r with proofs are provided.
2
Polynomial Criterion
The characteristic polynomial of a matrix Z ∈ Rn×n is defined by
det(λI − Z) =
n
X
i=0
ci(Z)λi. (4)
where the coefficients, ci(Z), are polynomial functions of the elements in Z. For
instance, c0(Z) = det(Z), cn−1(Z) = tr(Z) and cn(Z) = 1. Note that when
Z = I − XY , the characteristic polynomial can also be defined in terms of the Hankel singular values, σi, of (X, Y ) as
det (λ − 1)I + Σ2) = n Y i=1 λ + σ2i − 1 = n X i=0 ciλi.
Lemma 1. Let Z ∈ Rn×n be a matrix with real non-negative eigenvalues,
λi(Z) ≥ 0, and let ci(−Z) be the coefficients of the characteristic polynomial
of −Z as defined in (4). Then, the following statements are equivalent if r < n: (i) cn−r−1(−Z) = 0;
(ii) rank Z ≤ r.
Proof. Showing (ii) ⇒ (i) is trivial, since (ii) is equivalent to that λr+1(Z) =
. . . = λn(Z) = 0, where λi(Z) denotes the ith eigenvalue of Z. To show (i) ⇒
(ii), we note that
cn−r−1(−Z) = X J ∈Cr+1([1,n]) Y i∈J λi(Z), (5)
where Ci(U ) denotes the set of all combinations of i elements from the set U1;
for instance, C2({1, 2, 3}) = {{1, 2}, {1, 3}, {2, 3}}.
Since, every λi(Z) ≥ 0, we conclude that cn−r−1(−Z) = 0 implies that every
product of (5) must be zero, and consequently at least one factor, λi(Z), in each
product must be zero. Thus, at least n − r factors, λi(Z) must be zero, and
consequently (ii) holds.
Instead of using the coefficient cn−r−1(−Z) directly as a measure of the
matrix’s closeness to the rank condition, rank Z ≤ r, we instead use cn−r−1(−Z)
cn−r(−Z)
as a measure of closeness. It has the nice property that if n − r eigenvalues values, λr+1, . . . , λn, are close to 0 and the remaining ones are large, then
cn−r−1(−Z) cn−r(−Z) ≈ n X i=r+1 λi(Z).
More specifically, the following relations hold in general.
Lemma 2. Let Z be a matrix with non-negative eigenvalues ordered by λ1(Z) ≥
λ2(Z) ≥ . . . ≥ λn(Z) ≥ 0. Then, the following relations hold:
1 r + 1 n X i=r+1 λi(Z) ≤ cn−r−1(−Z) cn−r(−Z) ≤ n X i=r+1 λi(Z), (6) or, equivalently, cn−r−1(−Z) cn−r(−Z) ≤ n X i=r+1 λi(Z) ≤ (r + 1) cn−r−1(−Z) cn−r(−Z) . (7)
Proof. The upper bound in (6) (the lower bound in (7)) is obtained by observing that Cr([1, k − 1]) ⊆ Cr([1, n]) assuming k ≤ n, and consequently
n X k=r+1 λk ! X J ∈Cr([1,n]) Y i∈J λi | {z } cn−r(−Z) ≥ n X k=r+1 λk X J ∈Cr([1,k−1]) Y i∈J λi = X J ∈Cr+1([1,n]) Y i∈J λi | {z } cn−r−1(−Z) ,
where we have dropped the argument (Z).
The lower bound in (6) (the upper bound in (7)) is obtained by first observing that (r + 1) X J ∈Cr+1([1,n]) Y i∈J λi | {z } cn−r−1(−Z) , = X J ∈Cr([1,n]) X k∈[1,n]\J λk Y i∈J λi .
1 Formally, we can define C
i(U ) recursively by C0 = {∅}, and Ci+1(U ) = {{x} ∪ y : x ∈ U, y ∈ Ci(U \ {x})} for i ≥ 0.
Every term in the left-hand side is generated by the right-hand side in (r + 1) copies, which is the reason for the factor in the left-hand side. Next using the fact that the eigenvalues, λk are ordered we can use
X k∈[1,n]\J λk ≥ n X k=r+1 λk, and consequently, (r + 1) X J ∈Cr+1([1,n]) Y i∈J λi | {z } cn−r−1(−Z) , ≥ X J ∈Cr([1,n]) n X k=r+1 λk Y i∈J λi ! = n X k=r+1 λk ! X J ∈Cr([1,n]) Y i∈J λi | {z } cn−r(−Z) ,
from which the relation follows.
3
Derivatives of c
i(Z)
Let C(Z) denote the characteristic polynomical of Z(x) and let Zi =∂x∂
iZ(x): C(Z) = det(λI − Z) = n X i=0 ci(Z)λi.
We can use the fact that ∂ ∂xi log det Z(x) = tr Z−1(x)Zi. We apply this on C(Z), ∂ ∂xi
det(λI − Z(x)) = − det(λI − Z(x)) tr(λI − Z(x))−1Zi.
Next, introduce the polynomial
P (λ) =cnIλn−1+ (cnZ + cn−1I) λn−2+ . . . + cnZn−1+ cn−1Zn−2+ . . . + c1I .
Then
P (λ)(λI − Z(x)) = cnIλn−1(λI − Z(x))
+ (cnZ + cn−1I) λn−2(λI − Z(x)) + . . . + cnZn−1+ cn−1Zn−2+ . . . + c1I (λI − Z(x)) = cnλn+ cn−1λn−1+ . . . + c1λ I − cnZn+ cn−1Zn−1+ . . . + c1Z | {z } −c0I = cnλn+ cn−1λn−1+ . . . + c1λ + c0 I = det(λI − Z(x))I 4
where we have used the fact that the charactestic polynomial of Z applied to itself yields zero.
Consequently, (λI − Z(x))−1= P (λ)/ det(λI − Z(x)), and ∂ ∂xi C(Z) = ∂ ∂xi det(λI − Z(x)) = − tr P (λ)Zi, and ∂ck ∂xi = ( 0, k = n − tr cnZn−k−1+ cn−1Zn−k−2+ . . . + ck+1I Zi, otherwise (8)
Note that ck(Z) can be computed as a polynomial function of tr Z, tr Z2, . . . ,
tr Zn−k. For instance, c
n = 1, cn−1 = − tr Z, cn−2 = 12 (tr Z)2− tr Z2, and
cn−3= 16 −2 tr Z3+ 3(tr Z)(tr Z2) − (tr Z)3.
3.1
Computing First-Order Derivatives
Higher-order derivatives of ck(Z) can be derived analogously. Here we try to
find an efficient implementation of the computation of the first and second-order derivatives. It is possible to compute the first-second-order derivatives in O(n4) operations (multiplications and additions). Here, the main effort lies in the computation of Z, Z2, . . . , Zn−k−1.
First, we compute the coefficients ci(Z) in the characteristic polynomial of
Z. This can be done using the poly function in Matlab, but a more efficient algorithm due to Berkowitz [1] is faster and requires O(n4) operations. A simple
Matlab implementation is given below function p = berkowitz (A) [n, m] = size (A);
if n ~= m, error (’A must be square’); end; p = 1; for k = n: -1: 1, R = A(k,k+1:n); ci = [1 -A(k,k)]; for i = k+1:n ci = [ci -R*A(k+1:n,k)]; R = R*A(k+1:n,k+1:n); end
p = toeplitz (ci, [1 zeros(1,n-k)]) * p; end
Next, using the coefficients, ci(Z), we can compute the following series of
matrices
Ck= cn(Z)Zn−k−1+ cn−1(Z)Zn−k−2+ . . . + ck+1(Z)I. (9)
This is most efficiently computed iteratively as Cn= 0, Cn−1= I, Cn−2= Z + cn−1(Z)I, .. . Ck= Ck+1Z + ck+1(Z)I, .. . C0= C1Z + c1(Z)I,
which in total requires O(n4) operations.
Now, the first-order derivatives of ck(Z) can be computed as
∂ck
∂xi
= − tr CkZi
where the computation of each coefficient requires O(n2) operations.
Conse-quently, n2derivatives can be computed in O(n4) operations.
3.2
Computing Second-Order Derivatives
The second-order derivatives are given by ∂2c k ∂xi∂xj = − ∂ ∂xj tr CkZi = − tr∂Ck ∂xj Zi− tr CkZij
The derivatives of Ci can be computed iteratively as
∂Cn ∂xj = 0, ∂Cn−1 ∂xj = 0, ∂Cn−2 ∂xj = Zj+ ∂cn−1 ∂xj I, .. . ∂Ck ∂xj = ∂Ck+1 ∂xj Z + Ck+1Zj+ ∂ck+1 ∂xj I, .. . ∂C0 ∂xj = ∂C1 ∂xj Z + C1Zj+ ∂c1 ∂xj I, 6
Consequently, ∂2ck ∂xi∂xj = − tr ∂Ck+1 ∂xj Z + Ck+1Zj+ ∂ck+1 ∂xj I Zi− tr CkZij = − tr∂Ck+1 ∂xj ZZi− tr Ck+1ZjZi− ∂ck+1 ∂xj tr Zi− tr CkZij = − tr ∂Ck+2 ∂xj Z + Ck+2Zj+ ∂ck+2 ∂xj ZZi − tr Ck+1ZjZi+ tr Ck+1Zjtr Zi− tr CkZij = − tr∂Ck+2 ∂xj Z2Zi+ (tr Ck+2Zj) (tr ZZi) − tr Ck+2ZjZZi − tr Ck+1ZjZi+ tr Ck+1Zjtr Zi− tr CkZij.
After reordering of terms we get, ∂2c k ∂xi∂xj = (tr Cn−1Zj) tr Zn−k−2Zi − tr Cn−1ZjZn−k−2Zi + . . . + (tr Ck+2Zj) (tr ZZi) − tr Ck+2ZjZZi + (tr Ck+1Zj) (tr Zi) − tr Ck+1ZjZi − tr CkZik.
Without any structure in Ziwe need as much as O(n6) operations to compute ∂2c
k
∂xi∂xj, since the computation of all CkZj and Z
kZ
j requires O(n6) operations.
This is based on the assumption that there are n2elements in Z such that i and
j are from 1 to n2.
3.3
Improving Computational Efficiency
With structure in Z, we can reduce this. If Zi = LiRi and Zj = LjRj are
assumed to contain single elements (ones). Here we assume that L and R are single-rank matrices. We can factor each terms as
tr CmZjZkZi= tr RiCmLjRjZkLi= (tr RiCmLj) (tr RjZnLi) .
Also,
(tr CmZj) tr ZkZi = (tr RjCmLj) tr RiZkLi
Since we can compute Cn, Cn−1, . . . , C1 and Z, Z2, . . . , Zn−1 using 2(n − 2)
matrix n × n multiplications we need O(n4) operations. Using the structure of
Z, the trace operation, tr RiCmLj can be replaced by a simple extraction of the
appropriate element in Cmwhere Ri defines the row and Lj defines the column.
If Z is diagonal, we can reduce this to O(n2) operations not including the operations needed to compute ci.
In the case when Z = I − XY , the number of operations will be of the same order. Note that Zi= −XiY − XYi, where one of Xi and Yi is zero, and
Zij = −XiYj.
For instance, the second-order derivatives of cn−1and cn−2become ∂2c n−1 ∂xi∂xj = − tr Zij, and ∂2c n−2 ∂xi∂xj = (tr Zi)(tr Zj) − tr ZiZj− tr Cn−2Zij
References
[1] Stuart J. Berkowitz. On computing the determinant in small parallel time using a small number of processors. Inf. Process. Lett., 18(3):147–150, 1984. [2] P. Gahinet and P. Apkarian. A linear matrix inequality approach to H∞
control. International Journal of Robust and Nonlinear Control, 4(4):421– 448, 1994.