DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS
STOCKHOLM, SWEDEN 2018

Quadrature-based Methods for Sylvester Equations

JOSEPHINE THUVANDER

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ENGINEERING SCIENCES
Bachelor's Programme in Simulation Technology and Virtual Design
Date: May 28, 2018
Supervisor: Emil Ringh
Co-supervisor: Elias Jarlebring
Examiner: Katarina Gustavsson
School of Engineering Sciences
Abstract
The aim of this thesis is to investigate properties of quadrature-based methods for approximating a solution to the Sylvester equation AX + XB^T = C. The Sylvester equation is used to model large systems, hence the need for efficient numerical methods. Two classical methods, the trapezoidal rule and Simpson's rule, are implemented, as well as a method here called quadrature by Stenger. The results presented here show that quadrature by Stenger is the most promising method.
Acknowledgements
I would like to thank my two supervisors: Emil Ringh, for taking the time to answer questions, for good explanations during our meetings, and for giving valuable feedback on the thesis; and Elias Jarlebring, for feedback on preparations for the thesis presentation.
Contents

1 Introduction
  1.1 Purpose
  1.2 Preliminaries
    1.2.1 Matrix Definitions
    1.2.2 Matrix Functions
    1.2.3 Matrix Norms
    1.2.4 Kronecker Product and Vectorization Operator
2 The Sylvester Equation
  2.1 Numerical Methods
    2.1.1 The Kronecker Product Method
    2.1.2 Bartels-Stewart Algorithm
  2.2 Integral Formulation
3 Quadrature Methods for Matrix Equations
  3.1 Numerical Integration
  3.2 Trapezoidal Rule
  3.3 Simpson's Rule
  3.4 Quadrature by Stenger
  3.5 Approaches for Improper Integrals
    3.5.1 Truncated Upper Limit
    3.5.2 Substitution of Variables
4 Numerical Experiments and Results
  4.1 Implementation on Random Matrices
    4.1.1 Distribution of Eigenvalues
    4.1.2 Problem Sizes
  4.2 The Poisson Equation in Two Dimensions
    4.2.1 Applying the Quadrature Methods
5 Conclusions
Bibliography
Chapter 1

Introduction
1.1 Purpose
The Sylvester equation is a linear matrix equation written as

    AX + XB^T = C,    (1.1)

with given matrices A ∈ R^{m×m}, B ∈ R^{n×n} and C ∈ R^{m×n}, and where we want to determine the matrix X ∈ R^{m×n}. It is common in many application areas, such as linear dynamical systems [1], partial differential equations, systems theory and model reduction, and as a preconditioner [11]. The Sylvester equation (1.1) models systems that can be large, leading to the need for efficient methods for computing an approximate solution. The purpose of this study is to investigate properties of three quadrature-based methods for computing an approximate solution to the Sylvester equation (1.1). The methods included are the trapezoidal rule, Simpson's rule and quadrature by Stenger.
Figure 1.1: Solution of the Poisson equation in two dimensions, see Section 4.2.
1.2 Preliminaries
In this section, the theory needed in upcoming sections is presented. All matrices considered are real, though many of the definitions and theorems are valid in the complex case as well; see e.g. [6].
1.2.1 Matrix Definitions
Definition 1.1. (Spectrum, [12, p. 181])
The spectrum of a matrix A ∈ R^{m×m}, denoted by σ(A), is the set of all eigenvalues of A.
Definition 1.2. (Stable matrix, [6, 10])
A matrix A ∈ R^{m×m} is stable if all of its eigenvalues have negative real parts.
Lemma 1. [12, p. 8]
The matrix A ∈ R^{m×m} is nonsingular if and only if 0 ∉ σ(A).
Definition 1.3. (Diagonalization operator, [9])
Let F_1, ..., F_k be square matrices. Then the diagonalization operator is defined by

    diag(F_1, ..., F_k) = [ F_1           ]
                          [      ⋱        ]
                          [           F_k ],

i.e. it yields a matrix with a block diagonal structure.
Definition 1.4. (Jordan Decomposition, [6, p. 354, (9.1.1)-(9.1.2)])
A Jordan decomposition of a matrix A ∈ R^{m×m} is such that

    A = X diag(J_1, ..., J_q) X^{-1},

where X ∈ R^{m×m} is nonsingular and each Jordan block J_i is defined by

    J_i = [ λ_i  1            ]
          [      λ_i  ⋱       ]
          [           ⋱    1  ]
          [               λ_i ]  ∈ R^{m_i×m_i},

where the λ_i are eigenvalues of A.
Lemma 2. [6, 9]
A matrix A ∈ R^{m×m} is diagonalizable if and only if

    A = X diag(λ_1, ..., λ_m) X^{-1},

where X ∈ R^{m×m} is nonsingular with the eigenvectors of A as columns and the λ_i are the corresponding eigenvalues of A.
Definition 1.5. (Schur decomposition, [12, p. 187])
A Schur decomposition of a matrix A ∈ R^{m×m} is defined by

    A = QTQ^T,    (1.2)

where Q ∈ R^{m×m} is orthogonal and T ∈ R^{m×m} is upper triangular.
1.2.2 Matrix Functions
Assume A ∈ R^{m×m} and let the scalar function f(z) be defined on σ(A). Then f(z) can be generalized to a matrix function f(A) such that f : R^{m×m} → R^{m×m}, i.e. we get a function that maps a matrix to a matrix [6, 9]. An example of a matrix function is the matrix exponential e^A. One way to define matrix functions is by using the Jordan decomposition. For other ways to define matrix functions, see [6, Ch. 9]. Using Definition 1.4, we can now define the matrix function f(A) corresponding to f(z) as

    f(A) = X diag(F_1, ..., F_q) X^{-1},

with X ∈ R^{m×m} and

    F_i = [ f(λ_i)  f'(λ_i)/1!  ...  f^{(m_i−1)}(λ_i)/(m_i−1)! ]
          [ 0       f(λ_i)      ⋱    ⋮                         ]
          [ ⋮               ⋱   ⋱    f'(λ_i)/1!                ]
          [ 0       ...         0    f(λ_i)                    ]  ∈ R^{m_i×m_i},

where the λ_i are eigenvalues of A; see e.g. [6, (9.1.3) and (9.1.4)]. If A is a diagonalizable matrix, then by Lemma 2 we get a matrix function of the form

    f(A) = X diag(f(λ_1), ..., f(λ_m)) X^{-1},

where X has the eigenvectors of A as columns.
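The diagonalizable case above translates directly into code. The thesis's experiments use MATLAB; the following is a minimal Python/NumPy sketch of the same idea (the function name fun_of_matrix and the test matrix are illustrative, not from the thesis), checked against a truncated Taylor series for the matrix exponential.

```python
import numpy as np

def fun_of_matrix(A, f):
    """Evaluate f(A) = X diag(f(l_1), ..., f(l_m)) X^{-1} for diagonalizable A."""
    lam, X = np.linalg.eig(A)                 # eigenvalues and eigenvectors
    return X @ np.diag(f(lam)) @ np.linalg.inv(X)

# Example: e^A for a small stable matrix with real eigenvalues,
# compared with the truncated Taylor series sum_k A^k / k!.
A = np.array([[-2.0, 1.0],
              [ 0.5, -3.0]])
expA = np.real(fun_of_matrix(A, np.exp))

taylor = np.zeros_like(A)
term = np.eye(2)
for k in range(1, 30):
    taylor += term                            # accumulates A^{k-1} / (k-1)!
    term = term @ A / k
print(np.allclose(expA, taylor))
```

The eigendecomposition approach is cheap when the eigenvector matrix is well conditioned, which is why it is convenient for evaluating e^{At} at many values of t.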
1.2.3 Matrix Norms
Definition 1.6. (Frobenius norm, [12, p. 22])
Let A ∈ R^{m×n}. Then the Frobenius norm of A is defined by

    ‖A‖_F = √( Σ_{i=1}^{m} Σ_{j=1}^{n} |a_ij|² ).
The Frobenius norm satisfies the submultiplicative property

    ‖AB‖_F ≤ ‖A‖_F ‖B‖_F,

where the matrices A and B are of conforming sizes; see e.g. [12, p. 23]. This property will later be used to bound integrands.
Definition 1.7. (2-Norm, [6, p. 73])
Let A ∈ R^{m×n}. The 2-norm of A is then defined by

    ‖A‖_2 = √( λ_max(A^T A) ).
Lemma 3. [12, p. 20]
Let D ∈ R^{m×m} be a diagonal matrix and denote the diagonal elements by d_1, ..., d_m. Then

    ‖D‖_2 = max_{1≤i≤m} |d_i|.
1.2.4 Kronecker Product and Vectorization Operator
Definition 1.8. (Kronecker product, [3, p. 181])
Let A ∈ R^{m×n} and B ∈ R^{p×q}. The Kronecker product of A and B, denoted by A ⊗ B, is defined by

    A ⊗ B = [ a_11 B  a_12 B  ...  a_1n B ]
            [ a_21 B  a_22 B  ...  a_2n B ]
            [   ⋮       ⋮             ⋮   ]
            [ a_m1 B  a_m2 B  ...  a_mn B ]  ∈ R^{mp×nq}.
Definition 1.9. (Vectorization operator, [3, p. 183])
Let B ∈ R^{m×n} and denote the columns of B by b_1, b_2, ..., b_n. The vectorization operator is defined by

    vec(B) = [ b_1 ]
             [ b_2 ]
             [  ⋮  ]
             [ b_n ]  ∈ R^{mn}.    (1.3)
Lemma 4. [3, 6]
Let A ∈ R^{m×n}, B ∈ R^{p×q} and X ∈ R^{n×p}. Then

    vec(AXB) = (B^T ⊗ A) vec(X).    (1.4)

As we will see later, this property can be used to rewrite the Sylvester equation (1.1).
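Identity (1.4) is easy to sanity-check numerically. A short NumPy sketch (not from the thesis; note that vec is column stacking, i.e. Fortran-order flattening):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))        # A in R^{3x4}
X = rng.standard_normal((4, 5))        # X in R^{4x5}
B = rng.standard_normal((5, 2))        # B in R^{5x2}

vec = lambda M: M.flatten(order="F")   # stack columns, as in (1.3)

lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)         # (B^T kron A) vec(X), identity (1.4)
print(np.allclose(lhs, rhs))
```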
Chapter 2
The Sylvester Equation
Assume A ∈ R^{m×m}, B ∈ R^{n×n} and C ∈ R^{m×n}. The Sylvester equation (1.1) is a linear matrix equation of the form

    AX + XB^T = C,

where A, B and C are given and we want to determine the matrix X ∈ R^{m×n} satisfying (1.1). The Lyapunov equation is a special case of equation (1.1), where B = A, and it takes the form

    AX + XA^T = C,

see [10, eq. (5)]. The Sylvester equation (1.1) can be rewritten using the vectorization operator (1.3) and the identity (1.4). We get

    vec(AX + XB^T) = vec(C)
    vec(AX) + vec(XB^T) = vec(C)
    (I_n ⊗ A + B ⊗ I_m) vec(X) = vec(C),    (2.1)

where I_n and I_m are identity matrices of dimension n × n and m × m respectively. Equation (2.1), which is equivalent to equation (1.1), is a linear system of equations of the form

    𝒜x = c,    (2.2)

where

    𝒜 = I_n ⊗ A + B ⊗ I_m ∈ R^{mn×mn},    (2.3)
    x = vec(X) ∈ R^{mn},
    c = vec(C) ∈ R^{mn}.
The linear system (2.2) has a unique solution for every c if and only if the coefficient matrix 𝒜 = I_n ⊗ A + B ⊗ I_m from (2.3) is nonsingular [9, 11].

We now state some theorems for the existence and uniqueness of a solution to equation (1.1). The following theorem uses the spectra of A and −B^T, i.e. σ(A) and σ(−B^T).
Theorem 5. (Existence and uniqueness, [3, p. 448])
Let A ∈ R^{m×m}, B ∈ R^{n×n} and C ∈ R^{m×n}. Then the Sylvester equation (1.1) has a unique solution if and only if σ(A) ∩ σ(−B^T) = ∅.

Proof. Let the eigenvalues of A and B^T be denoted by λ_1, ..., λ_m and µ_1, ..., µ_n respectively. The eigenvalues of 𝒜 = I_n ⊗ A + B ⊗ I_m are λ_i + µ_j; see [8, Theorem 4.4.5]. If σ(A) ∩ σ(−B^T) = ∅, we get

    λ_i + µ_j ≠ 0,  for i = 1, ..., m and j = 1, ..., n,

and by Lemma 1 it follows that 𝒜 is nonsingular. Thus the linear system (2.2) has a unique solution [8, 9].
Theorem 6. Let A ∈ R^{m×m} and B^T ∈ R^{n×n} be stable matrices. Then the Sylvester equation (1.1) has a unique solution.

Proof. The stability of A and B^T implies that the eigenvalues of A lie in the open left half-plane while the eigenvalues of −B^T lie in the open right half-plane. Hence σ(A) ∩ σ(−B^T) = ∅, and it follows from Theorem 5 that equation (1.1) has a unique solution.
2.1 Numerical Methods
There are many numerical methods available for computing an approximate solution to the Sylvester equation. Two fundamental direct methods, the Kronecker product method and the Bartels-Stewart algorithm, are presented in what follows. For more thorough reviews of available methods, see e.g. [1, 11].
2.1.1 The Kronecker Product Method
As mentioned in the previous section, the Sylvester equation (1.1) can be rewritten as the linear system of equations (2.1). Assuming A and B^T to be stable matrices, this system has a unique solution and can be solved using e.g. Gaussian elimination. This is a direct and naive method that can be applied to small linear systems, for instance as a part of other algorithms, but it is not suited for large linear systems [9], since Gaussian elimination requires (2/3)m³n³ flops for a matrix such as (2.3); see e.g. [3, 12].
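The Kronecker product method fits in a few lines. The thesis implements it in MATLAB; the following NumPy sketch (function name and test data are illustrative) forms the mn × mn system (2.3) explicitly and solves it, which is why the method only scales to small problems.

```python
import numpy as np

def sylvester_kron(A, B, C):
    """Solve AX + XB^T = C via the Kronecker product method (2.1)-(2.3).

    Naive: forms an (mn x mn) dense system, so only usable for small m, n."""
    m, n = C.shape
    K = np.kron(np.eye(n), A) + np.kron(B, np.eye(m))   # matrix (2.3)
    x = np.linalg.solve(K, C.flatten(order="F"))        # Gaussian elimination
    return x.reshape((m, n), order="F")

# Stable random test matrices: shift the spectrum into the left half-plane.
rng = np.random.default_rng(1)
m, n = 6, 4
A = rng.standard_normal((m, m))
A -= (np.abs(np.linalg.eigvals(A)).max() + 1.0) * np.eye(m)
B = rng.standard_normal((n, n))
B -= (np.abs(np.linalg.eigvals(B)).max() + 1.0) * np.eye(n)
C = rng.standard_normal((m, n))

X = sylvester_kron(A, B, C)
print(np.linalg.norm(A @ X + X @ B.T - C))   # residual near machine precision
```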
2.1.2 Bartels-Stewart Algorithm
The Bartels-Stewart algorithm is another direct method to compute an approximate solution to the Sylvester equation (1.1). The method is based on the Schur decomposition (1.2). Assuming the Schur decompositions of A and B are

    A = QTQ^T,  B = WUW^T,

equation (1.1) can be reformulated into an equivalent equation of the form

    TY + YU^T = D,    (2.4)

where

    Y = Q^T X W,  D = Q^T C W,

see [9]. Implementing the Bartels-Stewart algorithm on (2.4) with a "recycling" strategy, as suggested in [6, Algorithm 7.6.2], where the matrix D is overwritten in each step, the algorithm requires mn(m + n) flops [6]. For a full presentation of the algorithm, see e.g. [2, 6]. The lyap command in MATLAB, which is used to compute the reference solutions in Chapter 4, is based on the Bartels-Stewart algorithm [9].
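Outside MATLAB, a commonly available Bartels-Stewart implementation is scipy.linalg.solve_sylvester, which plays the role of lyap in the sketch below (an assumption of this illustration, not a tool used in the thesis). Note that SciPy's routine solves AX + XB = Q, so B.T is passed to match the convention AX + XB^T = C used here.

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(2)
m, n = 8, 5
A = rng.standard_normal((m, m)) - 10.0 * np.eye(m)   # stable via diagonal shift
B = rng.standard_normal((n, n)) - 10.0 * np.eye(n)
C = rng.standard_normal((m, n))

# solve_sylvester solves AX + XB = Q; pass B.T for AX + XB^T = C.
X = solve_sylvester(A, B.T, C)
print(np.linalg.norm(A @ X + X @ B.T - C))   # small residual
```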
2.2 Integral Formulation
The solution X to the Sylvester equation (1.1) can be formulated as an improper integral using the matrix exponential. For other possible closed-form formulations, see e.g. [10, 11].
Theorem 7. ([10, p. 554])
Let A ∈ R^{m×m} and B^T ∈ R^{n×n} be stable matrices. Then the unique solution to the Sylvester equation (1.1) is

    X = − ∫_0^∞ e^{At} C e^{B^T t} dt.    (2.5)
Proof. Since A and B^T are stable matrices, the existence and uniqueness follow from Theorem 6. Let

    X̂ = − ∫_0^∞ e^{At} C e^{B^T t} dt    (2.6)

and assume A and B are diagonalizable matrices such that

    A = UDU^{-1},  B = WΛW^{-1},

and denote the eigenvalues of A and B^T by λ_1, ..., λ_m and µ_1, ..., µ_n respectively. X̂ exists if the integral on the righthand side of (2.6) converges. Consider the inequality

    ‖e^{At} C e^{B^T t}‖ ≤ ‖C‖ ‖e^{At}‖ ‖e^{B^T t}‖,

where ‖·‖ denotes the 2-norm. Using Lemma 3, the norm of the matrix exponential e^{At} is bounded by

    ‖e^{At}‖ = ‖U diag(e^{λ_1 t}, ..., e^{λ_m t}) U^{-1}‖
             ≤ ‖U‖ ‖diag(e^{λ_1 t}, ..., e^{λ_m t})‖ ‖U^{-1}‖
             = ‖U‖ |e^{λ_max t}| ‖U^{-1}‖
             = ‖U‖ ‖U^{-1}‖ |e^{Re(λ_max) t}| |e^{i Im(λ_max) t}|
             = ‖U‖ ‖U^{-1}‖ |e^{Re(λ_max) t}|,    (2.7)

since |e^{i Im(λ_max) t}| = 1, where λ_max is the largest eigenvalue of A. Similarly, denoting the largest eigenvalue of B^T by µ_max, we get

    ‖e^{B^T t}‖ ≤ ‖(W^T)^{-1}‖ ‖W^T‖ |e^{Re(µ_max) t}|.    (2.8)

Using (2.7)-(2.8) and the stability of A and B^T, we get

    ∫_0^∞ ‖e^{At} C e^{B^T t}‖ dt ≤ ‖C‖ ∫_0^∞ ‖e^{At}‖ ‖e^{B^T t}‖ dt    (2.9)
                                  ≤ γ ∫_0^∞ |e^{(Re(λ_max) + Re(µ_max)) t}| dt
                                  = γ [ e^{(Re(λ_max) + Re(µ_max)) t} / (Re(λ_max) + Re(µ_max)) ]_0^∞
                                  = −γ / (Re(λ_max) + Re(µ_max)),

where γ = ‖C‖ ‖U‖ ‖U^{-1}‖ ‖(W^T)^{-1}‖ ‖W^T‖. Hence the integral on the lefthand side of (2.9) converges, and the convergence of the integral in (2.6) follows from this. Thus X̂ exists. Finally, inserting X̂ into the lefthand side of the Sylvester equation (1.1) gives

    A X̂ + X̂ B^T = A ( − ∫_0^∞ e^{At} C e^{B^T t} dt ) + ( − ∫_0^∞ e^{At} C e^{B^T t} dt ) B^T
                 = − ∫_0^∞ ( A e^{At} C e^{B^T t} + e^{At} C e^{B^T t} B^T ) dt
                 = − [ e^{At} C e^{B^T t} ]_0^∞
                 = C,

which shows that X̂ is a solution to (1.1).
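Formula (2.5) can be checked numerically: truncating the improper integral and discretizing it finely should give a matrix that nearly satisfies (1.1). The sketch below (Python/NumPy stand-in for the thesis's MATLAB; the truncation point R = 10 is an ad hoc illustrative choice) diagonalizes once so that e^{At} is cheap to evaluate at many t.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 5, 4
A = rng.standard_normal((m, m)) - 8.0 * np.eye(m)   # stable
B = rng.standard_normal((n, n)) - 8.0 * np.eye(n)   # so B^T is stable too
C = rng.standard_normal((m, n))

# Diagonalize once; then e^{At} = U diag(e^{lam t}) U^{-1} is cheap per t.
lamA, U = np.linalg.eig(A)
lamB, W = np.linalg.eig(B.T)
Ui, Wi = np.linalg.inv(U), np.linalg.inv(W)

def integrand(t):
    eAt = np.real(U @ np.diag(np.exp(lamA * t)) @ Ui)
    eBt = np.real(W @ np.diag(np.exp(lamB * t)) @ Wi)
    return eAt @ C @ eBt

# Trapezoidal approximation of X = -int_0^R e^{At} C e^{B^T t} dt.
R, N = 10.0, 20000
h = R / N
F = [integrand(i * h) for i in range(N + 1)]
X = -(h / 2 * (F[0] + F[-1]) + h * sum(F[1:-1]))

print(np.linalg.norm(A @ X + X @ B.T - C))   # small residual
```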
Chapter 3
Quadrature Methods for Matrix Equations
Quadrature methods defined for scalar-valued functions can, in accordance with Section 1.2.2, be extended to matrix functions. The aim in this chapter is to formulate numerical solutions to the Sylvester equation (1.1) by applying quadrature methods to the integral formulation (2.5).
3.1 Numerical Integration
Let f(t) be a function defined on an interval [a, b]. A numerical approximation of the integral of f(t) is given by

    I = ∫_a^b f(t) dt ≈ Σ_{i=1}^{n} w_i f(t_i),    (3.1)

where different choices of weights and quadrature points (w_i, t_i) give different numerical methods [4, p. 521-522]. Extending the scalar-valued function to a matrix function f(At) defined on the interval [a, b], we get the result

    I = ∫_a^b f(At) dt ≈ Σ_{i=1}^{n} w_i f(At_i),

where f(At_i) can be evaluated as discussed in Section 1.2.2.
3.2 Trapezoidal Rule
The trapezoidal rule is a method using a first-order polynomial to approximate the integral (3.1). For a uniform grid, the trapezoidal rule is defined by

    ∫_a^b f(t) dt ≈ T(h) = (h/2)(f(t_0) + f(t_n)) + h Σ_{i=1}^{n−1} f(t_i),

where the step size h is

    h = (b − a)/n,

i.e. the step size is constant; see [4, p. 527-528]. The method has an order of accuracy of O(h²) [4, p. 528]. Using the definition of matrix functions, the generalization of the trapezoidal rule to matrix functions becomes

    ∫_a^b f(At) dt ≈ T(h) = (h/2)(f_0 + f_n) + h Σ_{i=1}^{n−1} f_i,    (3.2)

where f_i = f(At_i); see [6, p. 527-528].
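For matrix exponentials, rule (3.2) can be checked against a closed form: if A is nonsingular, ∫_0^1 e^{At} dt = A^{-1}(e^A − I). A short NumPy sketch (illustrative, not from the thesis):

```python
import numpy as np

def expm_eig(M, t=1.0):
    """e^{Mt} via eigendecomposition (M assumed diagonalizable)."""
    lam, V = np.linalg.eig(M)
    return np.real(V @ np.diag(np.exp(lam * t)) @ np.linalg.inv(V))

def trap_matrix_exp(A, a, b, n):
    """Composite trapezoidal rule (3.2) applied to f(At) = e^{At}."""
    h = (b - a) / n
    fs = [expm_eig(A, a + i * h) for i in range(n + 1)]
    return h / 2 * (fs[0] + fs[-1]) + h * sum(fs[1:-1])

A = np.array([[-1.0, 0.3],
              [ 0.2, -2.0]])
approx = trap_matrix_exp(A, 0.0, 1.0, 2000)
exact = np.linalg.solve(A, expm_eig(A) - np.eye(2))   # A^{-1}(e^A - I)
print(np.linalg.norm(approx - exact))   # O(h^2) quadrature error
```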
3.3 Simpson's Rule

For scalar functions, Simpson's rule is defined by

    ∫_a^b f(t) dt ≈ S(h) = (h/3) Σ_{i=0}^{n} w_i f(t_i),    (3.3)

where n is even, the step size h is

    h = (b − a)/n,

and

    w_i = 1  if i = 0, n,
          4  if i is odd,
          2  if i is even and i ≠ 0, n.

The order of accuracy of (3.3) is O(h⁴) [4, p. 531]. Generalizing (3.3) to matrix functions, we get

    ∫_a^b f(At) dt ≈ S(h) = (h/3) Σ_{i=0}^{n} w_i f_i,

where f_i = f(At_i); see [6, p. 528]. Since Simpson's rule can also be formulated using the trapezoidal rule with an extrapolation, the matrix generalization is equivalent to

    S(h) = (1/3)(T(h) + 2M(h)),

where T(h) is the trapezoidal value from (3.2) and

    M(h) = h Σ_{i=0}^{n−1} f_{i+1/2},

see [4, p. 528, 530].
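Both formulations of Simpson's rule can be verified on a scalar integral. In the sketch below (illustrative, pure Python), the weight pattern 1, 4, 2, ..., 2, 4, 1 of (3.3) is compared with the trapezoidal-plus-midpoint combination (T(h) + 2M(h))/3; the two agree to rounding error, and the error against the exact integral decays like O(h⁴).

```python
import math

def trap(f, a, b, n):
    h = (b - a) / n
    return h / 2 * (f(a) + f(b)) + h * sum(f(a + i * h) for i in range(1, n))

def mid(f, a, b, n):
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def simpson(f, a, b, n):
    """Composite Simpson's rule (3.3); n must be even."""
    h = (b - a) / n
    w = [1 if i in (0, n) else (4 if i % 2 else 2) for i in range(n + 1)]
    return h / 3 * sum(wi * f(a + i * h) for i, wi in enumerate(w))

f, exact = math.exp, math.e - 1.0          # int_0^1 e^t dt = e - 1
s = simpson(f, 0.0, 1.0, 80)
combo = (trap(f, 0.0, 1.0, 40) + 2 * mid(f, 0.0, 1.0, 40)) / 3
print(abs(s - exact), abs(s - combo))
```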
3.4 Quadrature by Stenger

Instead of choosing a fixed step size, a non-uniform grid given by

    h = π/√(2k),
    t_i = log( e^{ih} + √(1 + e^{2ih}) ),    (3.4)
    w_i = h/√(1 + e^{−2ih}),    (3.5)

is proposed in [7], where h is the step size, k ∈ Z\{0}, the t_i are quadrature points and the w_i are weights. Using the above, quadrature by Stenger for the function f(t) = e^{−t} on the infinite interval [0, ∞] is defined by

    ∫_0^∞ e^{−t} dt ≈ Σ_{i=−k}^{k} w_i e^{−t_i},    (3.6)

with t_i and w_i given by (3.4)-(3.5). The quadrature points and weights are chosen such that the error is bounded by

    | ∫_0^∞ e^{−t} dt − Σ_{i=−k}^{k} w_i e^{−t_i} | ≤ C_st e^{−π√(2k)},

where C_st is a constant; see [7, Lemma 5]. Assuming A is a stable matrix and f(At) = e^{At}, the corresponding matrix version of (3.6) is

    ∫_0^∞ e^{At} dt ≈ Σ_{i=−k}^{k} w_i e^{At_i},

where t_i and w_i are defined by (3.4)-(3.5).
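The scalar rule (3.6) is compact enough to implement directly. The sketch below (Python stand-in for the thesis's MATLAB; the step size h = π/√(2k) follows the reconstruction above) evaluates the sum for ∫_0^∞ e^{−t} dt = 1 and illustrates the fast error decay in k.

```python
import math

def stenger(g, k):
    """Quadrature (3.6) with the grid (3.4)-(3.5) for int_0^inf g(t) dt."""
    h = math.pi / math.sqrt(2 * k)
    total = 0.0
    for i in range(-k, k + 1):
        t = math.log(math.exp(i * h) + math.sqrt(1.0 + math.exp(2 * i * h)))  # (3.4)
        w = h / math.sqrt(1.0 + math.exp(-2 * i * h))                         # (3.5)
        total += w * g(t)
    return total

approx = stenger(lambda t: math.exp(-t), 60)
print(abs(approx - 1.0))   # exact value of int_0^inf e^{-t} dt is 1
```

Note how few points are needed: 2k + 1 = 121 evaluations already give errors far below what the trapezoidal rule achieves with thousands of points on a truncated interval.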
3.5 Approaches for Improper Integrals
With quadrature by Stenger, we have a direct way of handling improper integrals such as (2.5). For the trapezoidal rule and Simpson's rule, two approaches for improper integrals, applicable to both methods, are presented in this section.
3.5.1 Truncated Upper Limit
One strategy for implementing the trapezoidal rule and Simpson's rule is to truncate the integral at an upper limit R chosen so that the truncation error is bounded. We start by considering the norm of the integral in (2.5), i.e.

    ‖ ∫_0^∞ e^{At} C e^{B^T t} dt ‖ = ‖ ∫_0^R e^{At} C e^{B^T t} dt + ∫_R^∞ e^{At} C e^{B^T t} dt ‖
                                    ≤ ∫_0^R ‖e^{At} C e^{B^T t}‖ dt + ∫_R^∞ ‖e^{At} C e^{B^T t}‖ dt.    (3.7)
Assume A ∈ R^{m×m} and B ∈ R^{n×n} are stable and diagonalizable matrices such that

    A = UDU^{-1},  B = WΛW^{-1},

and denote the eigenvalues of A and B^T by λ_1, ..., λ_m and µ_1, ..., µ_n respectively. Then the righthand side integral in (3.7) is bounded by

    ∫_R^∞ ‖e^{At} C e^{B^T t}‖ dt ≤ ‖C‖ ∫_R^∞ ‖e^{At}‖ ‖e^{B^T t}‖ dt
                                  ≤ ‖C‖ ∫_R^∞ ‖U e^{Dt} U^{-1}‖ ‖(W^T)^{-1} e^{Λt} W^T‖ dt
                                  ≤ ‖C‖ ‖U‖ ‖U^{-1}‖ ‖(W^T)^{-1}‖ ‖W^T‖ ∫_R^∞ ‖e^{Dt}‖ ‖e^{Λt}‖ dt.    (3.8)

The improper integral in (3.8) can be bounded using the Frobenius norm defined in Section 1.2.3. For the matrix exponential e^{Dt} we get

    ‖e^{Dt}‖_F = √( Σ_{i=1}^{m} |e^{d_ii t}|² )
               ≤ √( Σ_{i=1}^{m} |e^{λ_max t}|² )
               = √m |e^{λ_max t}|
               = √m |e^{Re(λ_max) t}| |e^{i Im(λ_max) t}|
               = √m |e^{Re(λ_max) t}|,    (3.9)

where λ_max is the largest eigenvalue of A, i.e. the eigenvalue closest to zero, since we assume A to be stable. Analogously, denoting the largest eigenvalue of B^T by µ_max, we get

    ‖e^{Λt}‖_F ≤ √n |e^{Re(µ_max) t}|.    (3.10)
Combining (3.9)-(3.10) we get

    ∫_R^∞ ‖e^{Dt}‖ ‖e^{Λt}‖ dt ≤ ∫_R^∞ √(mn) |e^{Re(λ_max) t}| |e^{Re(µ_max) t}| dt
                               = √(mn) [ e^{(Re(λ_max) + Re(µ_max)) t} / (Re(λ_max) + Re(µ_max)) ]_R^∞
                               = −√(mn) e^{(Re(λ_max) + Re(µ_max)) R} / (Re(λ_max) + Re(µ_max)).

If we require this bound to be at most e.g. 10^{−d}, where d is the number of accurate decimals, we get

    −√(mn) e^{(Re(λ_max) + Re(µ_max)) R} / (Re(λ_max) + Re(µ_max)) < 10^{−d},

and a value for the limit R is given by

    R > −1/(Re(λ_max) + Re(µ_max)) · log( −√(mn) · 10^d / (Re(λ_max) + Re(µ_max)) ).    (3.11)

Thus R depends on the largest eigenvalues and on the size of the problem. Finally, an approximate solution to (1.1) is given by

    X̂ = − ∫_0^R e^{At} C e^{B^T t} dt.    (3.12)
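Formula (3.11) is a one-liner in code. The sketch below (illustrative Python; the sample eigenvalues and sizes are assumptions, not the thesis's data) computes R and then checks that the analytic tail bound indeed drops below 10^{−d}.

```python
import math

def truncation_limit(lam_max, mu_max, m, n, d):
    """Lower bound on R from (3.11); lam_max + mu_max must be negative."""
    s = lam_max + mu_max
    return -1.0 / s * math.log(-math.sqrt(m * n) * 10.0**d / s)

# Illustrative values: 10x10 matrices with largest eigenvalues -2, d = 16.
lam_max, mu_max, m, n, d = -2.0, -2.0, 10, 10, 16
R = 1.01 * truncation_limit(lam_max, mu_max, m, n, d)  # (3.11) is strict; add margin

s = lam_max + mu_max
tail = -math.sqrt(m * n) * math.exp(s * R) / s          # analytic tail bound
print(R, tail <= 10.0**(-d))
```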
3.5.2 Substitution of Variables
Another possibility for handling improper integrals is to use a substitution of variables. Let t = x/(1 − x). Then we can write the solution (2.5) as

    X = − ∫_0^1 e^{A x/(1−x)} C e^{B^T x/(1−x)} · 1/(1 − x)² dx.

This means that we go from integrating over an infinite interval to a smaller interval, and therefore the quadrature points are less spread out. Notice that the factors in the integrand are unbounded at x = 1, hence integration up to x = 1 is not possible.
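The substitution is easy to illustrate on the scalar model integral ∫_0^∞ e^{−t} dt = 1. The sketch below (illustrative Python) applies t = x/(1 − x), dt = dx/(1 − x)², and integrates the transformed integrand with the trapezoidal rule up to x = 0.99999, mirroring the setup used in the experiments of Chapter 4.

```python
import math

# After t = x/(1-x), int_0^inf e^{-t} dt becomes
# int_0^1 e^{-x/(1-x)} / (1-x)^2 dx.
def g(x):
    return math.exp(-x / (1.0 - x)) / (1.0 - x) ** 2

a, b, n = 0.0, 0.99999, 200_000
h = (b - a) / n
approx = h / 2 * (g(a) + g(b)) + h * sum(g(a + i * h) for i in range(1, n))
print(abs(approx - 1.0))   # small error: the integrand decays fast near x = 1
```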
Chapter 4
Numerical Experiments and Results
In this chapter, the quadrature methods from Chapter 3 are applied to random matrices and to the classical problem of the Poisson equation in two dimensions. The value of the truncated upper limit R in (3.12) is determined using (3.11) with d = 16. For the substitution of variables, the integrations are done using the upper limit x = 0.99999. The step size is halved in every iteration, hence the number of quadrature points is doubled. The maximum number of iterations is set to 15 and an error tolerance of 10^{−16} is used.

The implementations are done in MATLAB version R2018a and the results are compared with the built-in lyap function, which is a solver for matrix equations formulated as equation (1.1). CPU times are measured using the stopwatch timing commands tic-toc. The matrix exponentials are evaluated using the expm command. The simulations are carried out on a computer with an Intel Xeon E5-1650 3.20 GHz processor and 12 GB RAM.
4.1 Implementation on Random Matrices
Using stable random matrices, we can investigate what effects the distribution of the eigenvalues and the problem size have on the errors. The eigenvalues are uniformly distributed random numbers generated by the rand command in MATLAB.
4.1.1 Distribution of Eigenvalues
The numerical experiments in this part are done using randomized dense matrices A, B ∈ R^{10×10}. The eigenvalues are distributed on the intervals [−0.1, −0.02], [−1, −0.2] and [−10, −2] on the real axis in the complex plane, i.e. the intervals are scaled by a factor of 10 between the three cases. The results are shown in Figures 4.1-4.3.
[Figure 4.1: (a) eigenvalues of A and B on the real axis; (b) relative errors versus number of quadrature points for the trapezoidal rule and Simpson's rule (each with truncation and with substitution) and quadrature by Stenger.]

Figure 4.1: The eigenvalues are close to zero in (a). The relative errors are shown in (b).
[Figure 4.2: (a) eigenvalues of A and B; (b) relative errors for the five method variants.]

Figure 4.2: Improved convergence for quadrature by Stenger when the interval is shifted from the origin.
[Figure 4.3: (a) eigenvalues of A and B; (b) relative errors for the five method variants.]

Figure 4.3: Results for the third interval.
The distribution of the eigenvalues is essential for quadrature by Stenger to converge, with poor results if the eigenvalues are close to zero. In the second interval, Simpson's rule implemented with substitution of variables converges when using around 1000 quadrature points, but the number of quadrature points needed for convergence increases to 10000 for the third interval. The convergence of the trapezoidal rule implemented with substitution of variables is also affected negatively when the interval is shifted to the left. The trapezoidal rule using a truncated upper limit seems to be less affected by the distribution of the eigenvalues, but overall both versions of the trapezoidal rule converge slowly.
4.1.2 Problem Sizes
Another way to investigate the properties is to look at the size of the problem. In this part we look at how the problem size affects the approximation errors and the execution time. The study is done on both dense and diagonal matrices. The matrices A ∈ R^{m×m} and B ∈ R^{m×m}, with m = 1, 2, 4, 8, 10, 20, ..., 100, have eigenvalues in the interval [−10, −2] on the real axis in the complex plane. For each increase in problem size, the matrices are extended with new randomized values for the enlarged part. The relative errors presented are those obtained at the last iteration for the corresponding problem size. The diagonal matrices are implemented using the sparse structure in MATLAB.

Figures 4.4-4.5 show the relative errors and the CPU times as functions of the problem size for dense and diagonal matrices respectively.
[Figure 4.4: (a) relative errors and (b) CPU times versus problem size for the five method variants.]

Figure 4.4: Results for dense matrices.
For dense matrices, the CPU time increases for all methods as the problem size grows. Quadrature by Stenger is the method obtaining the best results, both with respect to the relative errors and to CPU time. Simpson's rule implemented using substitution also shows small relative errors.
[Figure 4.5: (a) relative errors and (b) CPU times versus problem size for the five method variants.]

Figure 4.5: Results for diagonal matrices.
For the diagonal matrices, the execution times for quadrature by Stenger are larger than for the other methods but, as in the dense case, it achieves the smallest relative errors. The trapezoidal rule implemented using a truncated upper limit does not perform well in either of the cases.
4.2 The Poisson Equation in Two Dimensions
The Poisson equation in two dimensions with Dirichlet boundary conditions is a problem where we want to determine a function u = u(x, y) that satisfies

    ∂²u/∂x² + ∂²u/∂y² = −f(x, y),  (x, y) ∈ Ω,    (4.1a)
    u = 0,  (x, y) ∈ ∂Ω,    (4.1b)

where Ω is defined by Ω = {(x, y) : α_x ≤ x ≤ β_x, α_y ≤ y ≤ β_y}, i.e. a rectangular domain, and f(x, y) is a function defined on this region; see [6, p. 224]. The boundary of Ω is denoted by ∂Ω. The Poisson equation is an elliptic and time-independent partial differential equation, used e.g. to model heat conduction and in the field of fluid dynamics [5, p. 130]. We discretize the domain Ω, equation (4.1a) and the boundary conditions (4.1b) in three steps, as suggested in [5].
First, we divide the domain Ω into an equidistant grid, i.e. we have the step size h = h_x = h_y. Let m and n be the number of inner grid points in the x- and y-directions respectively. Then h is defined by

    h = (β_x − α_x)/(m + 1),

and the discretized points are

    x_i = α_x + ih,  i = 1, ..., m,
    y_j = α_y + jh,  j = 1, ..., n,

see [5, p. 135].
Figure 4.6: Discretization of the domain Ω using m = 14 and n = 9. The inner points are marked with circles and the points at the boundary with dots.
To discretize equation (4.1a), let u(x_i, y_j) ≈ u_{i,j}. The second-order partial derivatives are approximated using the central finite-difference approximations

    ∂²u/∂x²(x_i, y_j) ≈ (u_{i+1,j} − 2u_{i,j} + u_{i−1,j})/h²,    (4.2a)
    ∂²u/∂y²(x_i, y_j) ≈ (u_{i,j+1} − 2u_{i,j} + u_{i,j−1})/h²,    (4.2b)

see [5, (7.19)-(7.20)].

Finally, using the boundary conditions (4.1b), the values for the discretized points on the boundary ∂Ω are

    u_{i,0} = 0,  u_{i,n+1} = 0,  i = 1, ..., m,
    u_{0,j} = 0,  u_{m+1,j} = 0,  j = 1, ..., n,

see [5, (7.12)].
Using (4.2a)-(4.2b) and imposing the boundary conditions, we get a system of equations of the form

    𝒜u = c,    (4.3)

where the coefficient matrix 𝒜 ∈ R^{mn×mn} has the structure

    𝒜 = (1/h²) diag(T_n, ..., T_n) + (1/h²) [ 2I_n  −I_n              ]
                                            [ −I_n  2I_n   ⋱          ]
                                            [        ⋱     ⋱     −I_n ]
                                            [              −I_n  2I_n ].
The coefficient matrix in (4.3) can be rewritten using the Kronecker product from Section 1.2.4, i.e. 𝒜 = I_m ⊗ A + B ⊗ I_n; see [6, p. 224]. Further, with f(x_i, y_j) = f_{i,j}, we get

    u = (u_{1,1}, ..., u_{n,1}, u_{1,2}, ..., u_{n,2}, ..., u_{1,m}, ..., u_{n,m})^T ∈ R^{mn}

and

    c = (f_{1,1}, ..., f_{n,1}, f_{1,2}, ..., f_{n,2}, ..., f_{1,m}, ..., f_{n,m})^T ∈ R^{mn}.

If we let u = vec(U) and c = vec(C), then we get the equation

    (I_m ⊗ A + B ⊗ I_n) vec(U) = vec(C),

which is the vectorized form of the Sylvester equation

    AU + UB^T = C,    (4.4)

where A ∈ R^{n×n}, B ∈ R^{m×m}, C ∈ R^{n×m} and we want to determine U ∈ R^{n×m}.
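The Sylvester form of the discrete Poisson problem can be checked on a small example. The sketch below (illustrative NumPy; it uses the unit square with the known exact solution u = sin(πx)sin(πy) for f = 2π² sin(πx)sin(πy), not the thesis's 1.5 × 1.0 domain) builds the 1D second-difference matrices, solves the vectorized Sylvester system (4.3), and compares with the analytic solution.

```python
import numpy as np

def lap1d(p, h):
    """1D second-difference matrix tridiag(-1, 2, -1)/h^2 for inner points."""
    T = 2.0 * np.eye(p) - np.eye(p, k=1) - np.eye(p, k=-1)
    return T / h**2

m = n = 40                       # inner points in x and y on the unit square
h = 1.0 / (m + 1)
x = h * np.arange(1, m + 1)
y = h * np.arange(1, n + 1)
A, B = lap1d(n, h), lap1d(m, h)  # A acts in the y-direction, B in x

# AU + UB^T = C with C_{j,i} = f(x_i, y_j); rows index y, columns index x.
C = 2 * np.pi**2 * np.outer(np.sin(np.pi * y), np.sin(np.pi * x))
K = np.kron(np.eye(m), A) + np.kron(B, np.eye(n))               # system (4.3)
U = np.linalg.solve(K, C.flatten(order="F")).reshape((n, m), order="F")

exact = np.outer(np.sin(np.pi * y), np.sin(np.pi * x))
print(np.abs(U - exact).max())   # O(h^2) discretization error
```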
4.2.1 Applying the Quadrature Methods
In this section, the methods from Sections 3.2-3.4 are applied to the Sylvester equation (4.4) representing the Poisson equation (4.1a)-(4.1b). The simulations are done using α_x = 0, β_x = 1.5, α_y = 0 and β_y = 1.0. The domain Ω is discretized using m = 200 and n = 133, giving a problem size of 26600. The matrices A and B are sparse, hence they are implemented using the sparse structure in MATLAB.
Since the matrices need to be stable and the location of the eigenvalues is essential, (4.4) is rewritten in the equivalent form

    −(1/h²)(AU + UB^T) = −C.    (4.5)

This ensures that the matrices of the rewritten equation are stable and have nice eigenvalues, i.e. eigenvalues not too close to zero.
[Figure 4.7: eigenvalues of A and B in the complex plane.]

Figure 4.7: All eigenvalues have negative real parts.
The results when applying the methods to the Poisson equation are shown in Figures 4.8-4.12.
[Figure 4.8: (a) relative errors for the trapezoidal rule (truncation and substitution) compared with the theoretical error; (b) the same for Simpson's rule.]

Figure 4.8: Relative errors compared with theoretical errors for the trapezoidal rule and Simpson's rule.
[Figure: relative error for quadrature by Stenger compared with the theoretical error.]