Technical report from Automatic Control at Linköpings universitet
On Polynomial Coefficients and Rank Constraints
Anders Helmersson
Division of Automatic Control
E-mail: andersh@isy.liu.se
15th October 2008
Report no.: LiTH-ISY-R-2867
Address:
Department of Electrical Engineering, Linköpings universitet
SE-581 83 Linköping, Sweden
WWW: http://www.control.isy.liu.se
Technical reports from the Automatic Control group in Linköping are available from http://www.control.isy.liu.se/publications.
Abstract
Rank constraints on matrices emerge in many automatic control applications. In this short document we discuss how to rewrite such a constraint as a polynomial equation in the elements of the matrix. If additional semidefinite matrix constraints are included, the polynomial equation can be turned into an inequality. We also briefly discuss how to implement these polynomial constraints.
ISSN 1400-3902
1 Introduction
We employ the coefficients of the characteristic polynomial of a matrix Z as a closeness measure to a rank constraint, under the assumption that Z has no negative eigenvalues.
Rank constraints are for instance used when searching for reduced-order H∞ controllers. The existence of such a controller can be described in terms of two linear matrix inequalities (LMIs) in two symmetric matrix variables, X, Y ∈ R^{n×n}, where n denotes the order of the system to be controlled; see [2]. A third LMI connects X and Y:
$$\begin{bmatrix} X & I \\ I & Y \end{bmatrix} \succeq 0. \tag{1}$$
The existence of a controller of reduced order, r < n, is given by

$$\operatorname{rank}\begin{bmatrix} X & I \\ I & Y \end{bmatrix} \le n + r. \tag{2}$$
The rank condition (2) can also be rewritten (using the Schur complement) as

$$\operatorname{rank}(XY - I) \le r. \tag{3}$$
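As a quick numerical sanity check (not part of the report; the construction of X and Y below is a hypothetical example), the equivalence of the rank conditions (2) and (3) can be observed with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 4, 2

# Hypothetical example: X = I and Y = I + U U^T with U of rank r,
# so that XY - I = U U^T has rank exactly r.
X = np.eye(n)
U = rng.standard_normal((n, r))
Y = np.eye(n) + U @ U.T

# Block matrix of the LMI (1)
M = np.block([[X, np.eye(n)], [np.eye(n), Y]])

# Rank condition (2): rank M = n + r here
assert np.linalg.matrix_rank(M) == n + r
# Rank condition (3): rank(XY - I) = r here
assert np.linalg.matrix_rank(X @ Y - np.eye(n)) == r
```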
2 Polynomial Criterion
The characteristic polynomial of a matrix Z ∈ R^{n×n} is defined by

$$\det(\lambda I - Z) = \sum_{i=0}^{n} c_i(Z)\,\lambda^i, \tag{4}$$
where the coefficients, c_i(Z), are polynomial functions of the elements in Z. For instance, c_0(Z) = (−1)^n det(Z), c_{n−1}(Z) = −tr(Z), and c_n(Z) = 1. Note that when Z = I − XY, the characteristic polynomial can also be expressed in terms of the Hankel singular values, σ_i, of (X, Y) as
$$\det\bigl((\lambda - 1)I + \Sigma^2\bigr) = \prod_{i=1}^{n} \bigl(\lambda + \sigma_i^2 - 1\bigr) = \sum_{i=0}^{n} c_i \lambda^i.$$
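As a sketch of these coefficient conventions in code (NumPy's poly returns the coefficients of det(λI − Z) in descending powers of λ, so array index j corresponds to c_{n−j}; the matrix below is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
Z = rng.standard_normal((n, n))

c = np.poly(Z)  # c[0] = c_n, c[1] = c_{n-1}, ..., c[n] = c_0

assert np.isclose(c[0], 1.0)                            # c_n = 1
assert np.isclose(c[1], -np.trace(Z))                   # c_{n-1} = -tr Z
assert np.isclose(c[-1], (-1) ** n * np.linalg.det(Z))  # c_0 = (-1)^n det Z
```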
Lemma 1. Let Z ∈ R^{n×n} be a matrix with real non-positive eigenvalues, λ_i(Z) ≤ 0, and let c_i(Z) be the coefficients of the characteristic polynomial of Z as defined in (4). Then, the following statements are equivalent if r < n:

(i) c_{n−r−1}(Z) = 0;

(ii) rank Z ≤ r.
Proof. Showing (ii) ⇒ (i) is trivial, since (ii) is equivalent to λ_{r+1}(Z) = . . . = λ_n(Z) = 0, where λ_i(Z) denotes the ith eigenvalue of Z. To show (i) ⇒ (ii), we note that

$$c_{n-r-1}(Z) = \sum_{J \in C_{r+1}([1,n])} \; \prod_{i \in J} \lambda_i(-Z), \tag{5}$$

where C_i(U) denotes the set of all combinations of i elements from the set U.¹ Since every λ_i(−Z) ≥ 0, we conclude that c_{n−r−1}(Z) = 0 implies that every product in (5) must be zero, and consequently at least one factor, λ_i(Z), in each product must be zero. Thus, at least n − r factors λ_i(Z) must be zero, and consequently (ii) holds.
As a measure of how close the matrix Z is to satisfying the rank constraint rank Z ≤ r, we can use c_{n−r−1}(Z)/c_{n−r}(Z), which seems to have some nice properties. For instance, if we are close to rank Z ≤ r, such that (n − r) eigenvalues, λ_i, i = r + 1, . . . , n, are close to 0 and the remaining ones are large, then

$$\frac{c_{n-r-1}(Z)}{c_{n-r}(Z)} \approx \sum_{i=r+1}^{n} \lambda_i(-Z).$$
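To illustrate (a hypothetical numerical example, not from the report), take n = 4, r = 2 and a symmetric Z with eigenvalues −5, −4, −0.01, −0.02; the ratio c_{n−r−1}/c_{n−r} then comes out close to 0.01 + 0.02 = 0.03, the sum of the small eigenvalue magnitudes:

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 4, 2

# Arbitrary orthogonal basis; eigenvalues chosen so that two are near zero
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
Z = Q @ np.diag([-5.0, -4.0, -0.01, -0.02]) @ Q.T

c = np.poly(Z)            # c[j] = c_{n-j}
ratio = c[r + 1] / c[r]   # c_{n-r-1} / c_{n-r}

# ratio approximates 0.01 + 0.02 = 0.03
assert abs(ratio - 0.03) < 1e-3
```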
3 Derivatives of c_i(Z)
Let C(Z) denote the characteristic polynomial of Z(x) and let Z_i = ∂Z(x)/∂x_i:

$$C(Z) = \det(\lambda I - Z) = \sum_{i=0}^{n} c_i(Z)\lambda^i.$$

We can use the fact that ∂/∂x_i log det Z(x) = tr(Z(x)^{−1} Z_i). Applying this to C(Z),

$$\frac{\partial}{\partial x_i} \det(\lambda I - Z(x)) = -\det(\lambda I - Z(x)) \operatorname{tr}\bigl((\lambda I - Z(x))^{-1} Z_i\bigr).$$
Next, introduce the polynomial

$$P(\lambda) = c_n I \lambda^{n-1} + (c_n Z + c_{n-1} I)\lambda^{n-2} + \ldots + \bigl(c_n Z^{n-1} + c_{n-1} Z^{n-2} + \ldots + c_1 I\bigr).$$

Then

$$\begin{aligned}
P(\lambda)(\lambda I - Z(x)) &= c_n I \lambda^{n-1}(\lambda I - Z(x)) + (c_n Z + c_{n-1} I)\lambda^{n-2}(\lambda I - Z(x)) + \ldots \\
&\quad + \bigl(c_n Z^{n-1} + c_{n-1} Z^{n-2} + \ldots + c_1 I\bigr)(\lambda I - Z(x)) \\
&= \bigl(c_n \lambda^n + c_{n-1}\lambda^{n-1} + \ldots + c_1 \lambda\bigr) I - \underbrace{\bigl(c_n Z^n + c_{n-1} Z^{n-1} + \ldots + c_1 Z\bigr)}_{-c_0 I} \\
&= \bigl(c_n \lambda^n + c_{n-1}\lambda^{n-1} + \ldots + c_1 \lambda + c_0\bigr) I = \det(\lambda I - Z(x)) I,
\end{aligned}$$
¹Formally, we can define C_i(U) by C_0(U) = {∅}, and C_i(U) = {{x} ∪ y : x ∈ U, y ∈ C_{i−1}(U \ {x})} for i > 0.
where we have used the fact that the characteristic polynomial of Z applied to Z itself yields zero (the Cayley–Hamilton theorem).
Consequently, (λI − Z(x))^{−1} = P(λ)/det(λI − Z(x)), and

$$\frac{\partial}{\partial x_i} C(Z) = \frac{\partial}{\partial x_i} \det(\lambda I - Z(x)) = -\operatorname{tr}\bigl(P(\lambda) Z_i\bigr),$$

and

$$\frac{\partial c_k}{\partial x_i} = \begin{cases} 0, & k = n, \\ -\operatorname{tr}\bigl((c_n Z^{n-k-1} + c_{n-1} Z^{n-k-2} + \ldots + c_{k+1} I) Z_i\bigr), & \text{otherwise.} \end{cases} \tag{6}$$
Note that c_k(Z) can be computed as a polynomial function of tr Z, tr Z², . . . , tr Z^{n−k}. For instance,

$$c_n = 1, \quad c_{n-1} = -\operatorname{tr} Z, \quad c_{n-2} = \tfrac{1}{2}\bigl((\operatorname{tr} Z)^2 - \operatorname{tr} Z^2\bigr),$$
$$c_{n-3} = \tfrac{1}{6}\bigl(-2 \operatorname{tr} Z^3 + 3 (\operatorname{tr} Z)(\operatorname{tr} Z^2) - (\operatorname{tr} Z)^3\bigr).$$
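These trace expressions are easy to check numerically (a sketch; the random matrix is arbitrary and the coefficient array follows NumPy's descending-power convention):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
Z = rng.standard_normal((n, n))

c = np.poly(Z)  # c[j] = c_{n-j}
t1, t2, t3 = np.trace(Z), np.trace(Z @ Z), np.trace(Z @ Z @ Z)

assert np.isclose(c[1], -t1)                                    # c_{n-1}
assert np.isclose(c[2], 0.5 * (t1 ** 2 - t2))                   # c_{n-2}
assert np.isclose(c[3], (-2 * t3 + 3 * t1 * t2 - t1 ** 3) / 6)  # c_{n-3}
```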
3.1 Computing First-Order Derivatives
Higher-order derivatives of c_k(Z) can be derived analogously. Here we try to find an efficient implementation of the computation of the first- and second-order derivatives. It is possible to compute the first-order derivatives in O(n⁴) operations (multiplications and additions). Here, the main effort lies in the computation of Z, Z², . . . , Z^{n−k−1}.
First, we compute the coefficients c_i(Z) of the characteristic polynomial of Z. This can be done using the poly function in Matlab, but an algorithm due to Berkowitz [1] is more efficient and requires O(n⁴) operations. A simple Matlab implementation is given below:

function p = berkowitz (A)
  [n, m] = size (A);
  if n ~= m, error ('A must be square'); end;
  p = 1;
  for k = n:-1:1,
    R = A(k,k+1:n);
    ci = [1 -A(k,k)];
    for i = k+1:n
      ci = [ci -R*A(k+1:n,k)];
      R = R*A(k+1:n,k+1:n);
    end
    p = toeplitz (ci, [1 zeros(1,n-k)]) * p;
  end
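For readers without Matlab, a direct NumPy translation of the routine above may be useful (a sketch; the multiplication by the lower-triangular Toeplitz matrix is realized as a truncated convolution):

```python
import numpy as np

def berkowitz(A):
    """Coefficients of det(lambda*I - A), highest power first (like np.poly)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if A.shape != (n, n):
        raise ValueError('A must be square')
    p = np.array([1.0])
    for k in range(n - 1, -1, -1):      # rows from last to first (0-based)
        R = A[k, k + 1:]
        ci = [1.0, -A[k, k]]
        for _ in range(n - k - 1):
            ci.append(-(R @ A[k + 1:, k]))
            R = R @ A[k + 1:, k + 1:]
        # Toeplitz multiplication = convolution truncated to len(ci) terms
        p = np.convolve(np.array(ci), p)[:len(ci)]
    return p

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))
assert np.allclose(berkowitz(A), np.poly(A))
```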
Next, using the coefficients c_i(Z), we can compute the following series of matrices:

$$C_k = c_n(Z) Z^{n-k-1} + c_{n-1}(Z) Z^{n-k-2} + \ldots + c_{k+1}(Z) I. \tag{7}$$

This is most efficiently computed iteratively as

$$C_n = 0, \quad C_{n-1} = I, \quad C_{n-2} = Z + c_{n-1}(Z) I, \quad \ldots, \quad C_k = C_{k+1} Z + c_{k+1}(Z) I, \quad \ldots, \quad C_0 = C_1 Z + c_1(Z) I,$$

which in total requires O(n⁴) operations.
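A small NumPy check that the recursion reproduces the direct formula (7) (a sketch; indices follow the paper, with c_k read from NumPy's descending-order coefficient array):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
Z = rng.standard_normal((n, n))
coeff = np.poly(Z)
c = lambda k: coeff[n - k]  # c_k in the paper's indexing

# Recursion: C_n = 0, C_{n-1} = I, C_k = C_{k+1} Z + c_{k+1} I
C = [None] * (n + 1)
C[n] = np.zeros((n, n))
C[n - 1] = np.eye(n)
for k in range(n - 2, -1, -1):
    C[k] = C[k + 1] @ Z + c(k + 1) * np.eye(n)

# Direct formula (7): C_k = sum_{m=k+1}^{n} c_m Z^{m-k-1}
for k in range(n):
    direct = sum(c(m) * np.linalg.matrix_power(Z, m - k - 1)
                 for m in range(k + 1, n + 1))
    assert np.allclose(C[k], direct)
```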
Now, the first-order derivatives of c_k(Z) can be computed as

$$\frac{\partial c_k}{\partial x_i} = -\operatorname{tr}(C_k Z_i),$$

where the computation of each coefficient requires O(n²) operations. Consequently, n² derivatives can be computed in O(n⁴) operations.
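The formula can be checked against central finite differences. Here, as an assumed parameterization (not from the report), Z is parameterized by its own entries, so that Z_i is a single-entry matrix E_ab and ∂c_k/∂Z_ab = −tr(C_k E_ab) = −(C_k)_ba:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
Z = rng.standard_normal((n, n))
coeff = np.poly(Z)

# C_k via the recursion C_k = C_{k+1} Z + c_{k+1} I
C = [None] * (n + 1)
C[n] = np.zeros((n, n))
C[n - 1] = np.eye(n)
for k in range(n - 2, -1, -1):
    C[k] = C[k + 1] @ Z + coeff[n - (k + 1)] * np.eye(n)

k, h = 1, 1e-6
for a in range(n):
    for b in range(n):
        Zp, Zm = Z.copy(), Z.copy()
        Zp[a, b] += h
        Zm[a, b] -= h
        # central difference of c_k w.r.t. the (a, b) entry of Z
        fd = (np.poly(Zp)[n - k] - np.poly(Zm)[n - k]) / (2 * h)
        assert abs(fd - (-C[k][b, a])) < 1e-5
```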
3.2 Computing Second-Order Derivatives
The second-order derivatives are given by

$$\frac{\partial^2 c_k}{\partial x_i \partial x_j} = -\frac{\partial}{\partial x_j} \operatorname{tr}(C_k Z_i) = -\operatorname{tr}\Bigl(\frac{\partial C_k}{\partial x_j} Z_i\Bigr) - \operatorname{tr}(C_k Z_{ij}).$$
The derivatives of C_k can be computed iteratively as

$$\frac{\partial C_n}{\partial x_j} = 0, \quad \frac{\partial C_{n-1}}{\partial x_j} = 0, \quad \frac{\partial C_{n-2}}{\partial x_j} = Z_j + \frac{\partial c_{n-1}}{\partial x_j} I, \quad \ldots, \quad \frac{\partial C_k}{\partial x_j} = \frac{\partial C_{k+1}}{\partial x_j} Z + C_{k+1} Z_j + \frac{\partial c_{k+1}}{\partial x_j} I, \quad \ldots, \quad \frac{\partial C_0}{\partial x_j} = \frac{\partial C_1}{\partial x_j} Z + C_1 Z_j + \frac{\partial c_1}{\partial x_j} I.$$
Consequently,

$$\begin{aligned}
\frac{\partial^2 c_k}{\partial x_i \partial x_j}
&= -\operatorname{tr}\Bigl(\Bigl(\frac{\partial C_{k+1}}{\partial x_j} Z + C_{k+1} Z_j + \frac{\partial c_{k+1}}{\partial x_j} I\Bigr) Z_i\Bigr) - \operatorname{tr}(C_k Z_{ij}) \\
&= -\operatorname{tr}\Bigl(\frac{\partial C_{k+1}}{\partial x_j} Z Z_i\Bigr) - \operatorname{tr}(C_{k+1} Z_j Z_i) - \frac{\partial c_{k+1}}{\partial x_j} \operatorname{tr} Z_i - \operatorname{tr}(C_k Z_{ij}) \\
&= -\operatorname{tr}\Bigl(\Bigl(\frac{\partial C_{k+2}}{\partial x_j} Z + C_{k+2} Z_j + \frac{\partial c_{k+2}}{\partial x_j} I\Bigr) Z Z_i\Bigr) - \operatorname{tr}(C_{k+1} Z_j Z_i) + (\operatorname{tr} C_{k+1} Z_j)(\operatorname{tr} Z_i) - \operatorname{tr}(C_k Z_{ij}) \\
&= -\operatorname{tr}\Bigl(\frac{\partial C_{k+2}}{\partial x_j} Z^2 Z_i\Bigr) + (\operatorname{tr} C_{k+2} Z_j)(\operatorname{tr} Z Z_i) - \operatorname{tr}(C_{k+2} Z_j Z Z_i) \\
&\quad - \operatorname{tr}(C_{k+1} Z_j Z_i) + (\operatorname{tr} C_{k+1} Z_j)(\operatorname{tr} Z_i) - \operatorname{tr}(C_k Z_{ij}).
\end{aligned}$$
After reordering of terms we get

$$\begin{aligned}
\frac{\partial^2 c_k}{\partial x_i \partial x_j}
&= (\operatorname{tr} C_{n-1} Z_j)(\operatorname{tr} Z^{n-k-2} Z_i) - \operatorname{tr}(C_{n-1} Z_j Z^{n-k-2} Z_i) + \ldots \\
&\quad + (\operatorname{tr} C_{k+2} Z_j)(\operatorname{tr} Z Z_i) - \operatorname{tr}(C_{k+2} Z_j Z Z_i) \\
&\quad + (\operatorname{tr} C_{k+1} Z_j)(\operatorname{tr} Z_i) - \operatorname{tr}(C_{k+1} Z_j Z_i) - \operatorname{tr}(C_k Z_{ij}).
\end{aligned}$$
Without any structure in Z_i, we need as much as O(n⁶) operations to compute ∂²c_k/∂x_i∂x_j, since the computation of all C_k Z_j and Z^k Z_j requires O(n⁶) operations. This is based on the assumption that there are n² elements in Z, such that i and j range from 1 to n².
3.3 Improving Computational Efficiency
With structure in Z, we can reduce this. Assume that Z_i = L_i R_i and Z_j = L_j R_j are rank-one matrices; for instance, each Z_i may contain a single nonzero element equal to one, in which case L_i is a column vector and R_i a row vector. We can then factor each term as

$$\operatorname{tr}(C_m Z_j Z^k Z_i) = \operatorname{tr}(R_i C_m L_j R_j Z^k L_i) = (\operatorname{tr} R_i C_m L_j)(\operatorname{tr} R_j Z^k L_i).$$

Also,

$$(\operatorname{tr} C_m Z_j)(\operatorname{tr} Z^k Z_i) = (\operatorname{tr} R_j C_m L_j)(\operatorname{tr} R_i Z^k L_i).$$
Since we can compute C_n, C_{n−1}, . . . , C_1 and Z, Z², . . . , Z^{n−1} using 2(n − 2) matrix n × n multiplications, we need O(n⁴) operations. Using the structure of Z, the trace operation tr(R_i C_m L_j) can be replaced by a simple extraction of the appropriate element in C_m, where R_i defines the row and L_j defines the column. If Z is diagonal, we can reduce this to O(n²) operations, not including the operations needed to compute c_i.
In the case when Z = I − XY, the number of operations will be of the same order. Note that Z_i = −X_i Y − X Y_i, where one of X_i and Y_i is zero, and Z_{ij} = −X_i Y_j. For instance, the second-order derivatives of c_{n−1} and c_{n−2} become

$$\frac{\partial^2 c_{n-1}}{\partial x_i \partial x_j} = -\operatorname{tr} Z_{ij}, \qquad \frac{\partial^2 c_{n-2}}{\partial x_i \partial x_j} = (\operatorname{tr} Z_i)(\operatorname{tr} Z_j) - \operatorname{tr}(Z_i Z_j) - \operatorname{tr}(C_{n-2} Z_{ij}).$$
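As a final sanity check (a sketch with an assumed linear parameterization of Z, so that Z_{ij} = 0 and the last term drops out), the second-order formula for c_{n−2} can be compared with a finite-difference Hessian:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 3
Z0 = rng.standard_normal((n, n))

def c_nm2(Z):  # c_{n-2} = ((tr Z)^2 - tr Z^2) / 2
    return 0.5 * (np.trace(Z) ** 2 - np.trace(Z @ Z))

h = 1e-4
Ei = np.zeros((n, n)); Ei[0, 1] = 1.0   # Z_i (single-entry direction)
Ej = np.zeros((n, n)); Ej[1, 0] = 1.0   # Z_j

# Predicted mixed derivative: (tr Z_i)(tr Z_j) - tr(Z_i Z_j), with Z_ij = 0
pred = np.trace(Ei) * np.trace(Ej) - np.trace(Ei @ Ej)

# Four-point finite-difference approximation of the mixed derivative
fd = (c_nm2(Z0 + h * Ei + h * Ej) - c_nm2(Z0 + h * Ei - h * Ej)
      - c_nm2(Z0 - h * Ei + h * Ej) + c_nm2(Z0 - h * Ei - h * Ej)) / (4 * h * h)

assert abs(fd - pred) < 1e-6
```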
References
[1] Stuart J. Berkowitz. On computing the determinant in small parallel time using a small number of processors. Information Processing Letters, 18(3):147–150, 1984.

[2] P. Gahinet and P. Apkarian. A linear matrix inequality approach to H∞ control. International Journal of Robust and Nonlinear Control, 4(4):421–448, 1994.