Examensarbete

Boundary Summation Equation Preconditioning for Ordinary Differential Equations with Constant Coefficients on Locally Refined Meshes

Maimaitiyiming Guzainuer

Boundary Summation Equation Preconditioning for Ordinary Differential Equations with Constant Coefficients on Locally Refined Meshes

Department of Mathematics, Linköping University

Maimaitiyiming Guzainuer

LiTH - MAT - EX - 2012/09 - SE

Examensarbete: 30 hp

Level: D

Supervisor: Henrik Brandén, Department of Mathematics, Linköping University

Examiner: Henrik Brandén, Department of Mathematics, Linköping University


Abstract

This thesis deals with the numerical solution of ordinary differential equations (ODEs) using finite difference (FD) methods. In particular, boundary summation equation (BSE) preconditioning of FD approximations of ODEs with constant coefficients on locally refined meshes is studied. Firstly, the BSE for FD approximations of ODEs with constant coefficients is derived on a locally refined mesh. Secondly, the resulting linear system of equations is solved by the iterative method GMRES. Then, the arithmetic complexity and convergence rate of the iterative solution of the BSE formulation are discussed. Finally, numerical experiments are performed to compare the new approach with the FD approach. The results show that the BSE formulation has low arithmetic complexity and that the convergence rate of the iterative solver is fast and independent of the number of grid points.

Keywords: Ordinary differential equations, constant coefficients, finite difference, boundary summation equation, GMRES, convergence rate.


Acknowledgments

First of all, I would like to thank my supervisor, Henrik Brandén, for his excellent and patient guidance and support during my thesis work. From the bottom of my heart, I thank my husband for his constant support and my awesome son for making my life more meaningful. Finally, I express my deepest thanks to my beloved parents and sister for their unconditional love and support during my entire life.


Nomenclature

Symbols

u_xx       Second derivative of the function u(x).
u_j        An approximate solution at x_j.
G(x, t)    The Green's function.
E(x_j)     Fundamental solution at x_j.
D_+D_−     Difference operator.

Abbreviations

FD Finite Difference

BSE Boundary Summation Equation

BIE Boundary Integral Equation

GMRES Generalized Minimal Residual Method

FFT Fast Fourier Transform

Contents

1 Introduction
2 Background
  2.1 Model problem
  2.2 Finite difference approximation
  2.3 Preconditioning
  2.4 Green's function
  2.5 Fundamental solution
  2.6 Boundary integral equation
  2.7 GMRES
  2.8 FFT
3 BSE on uniform and refined mesh
  3.1 Uniform mesh
    3.1.1 FD method
    3.1.2 BSE
  3.2 Locally refined mesh
    3.2.1 FD method
    3.2.2 BSE
4 Arithmetic complexity and convergence rate
  4.1 Arithmetic complexity
    4.1.1 Uniform mesh
    4.1.2 Refined mesh
  4.2 Convergence rate
    4.2.1 Uniform mesh
    4.2.2 Refined mesh
5 Experimental results
6 Conclusions and future work
  6.1 Conclusion
  6.2 Possible extensions
A BSE on the 2D uniform and refined mesh
  A.1 Uniform mesh
    A.1.1 FD method
    A.1.2 BSE
  A.2 Locally refined mesh
    A.2.1 FD method
    A.2.2 BSE

Chapter 1

Introduction

Almost every phenomenon in nature, from electromagnetism to turbulence to population dynamics to particle movements, is modelled and described by differential equations (DEs). There are two types of DEs: ordinary differential equations (ODEs) and partial differential equations (PDEs). ODEs are equations involving derivatives of functions of a single variable, while PDEs involve partial derivatives of functions of more than one variable. Most often, finding the exact solution of such equations, regardless of whether they are ODEs or PDEs, using analytical methods is a difficult task. Approximate solutions of such equations are therefore required. The most commonly used techniques for finding approximate solutions of ODEs and PDEs are numerical methods, such as the finite difference (FD) method, the finite element method (FEM), the finite volume method (FVM), and so on. The system resulting from discretization is usually large and sparse, and it is solved by iterative solvers such as Conjugate Gradients (CG) or the Generalized Minimal Residual method (GMRES). However, for many important problems, the convergence rate of an iterative solver is rather slow. Therefore, some convergence acceleration technique is required. This is relevant in particular when solving DEs.

In this thesis, we consider linear ODEs with constant coefficients. An FD approximation of an ODE gives rise to a difference equation. Inspired by the theory of boundary integral equations (BIEs), we use a fundamental solution of the FD operator to rewrite the ordinary difference equation into an equivalent boundary summation equation (BSE). This transformation may be viewed as a preconditioning technique, and the number of iterations required by an iterative solver is hopefully significantly reduced.

This thesis is organized as follows. In Chapter 2, we briefly introduce DEs, FD approximation, preconditioning, fundamental solutions, BIEs, the GMRES algorithm and the FFT algorithm. In the next chapter, the BSE is derived on a locally refined mesh. Then in Chapter 4, we investigate properties of BSE preconditioning for FD approximations on a locally refined mesh. In Chapter 5, the experimental results regarding the convergence rate of GMRES for solving the derived BSE are presented. Finally, we conclude the thesis and discuss possible future extensions in Chapter 6, and we include the derivation of the BSE on the uniform and refined mesh in 2D in the Appendix, to serve those who want to experiment with this in 2D.


Chapter 2

Background

Every thesis provides the reader with some related background, and this thesis is no exception. The following will be presented in order: the model equation, FD approximation, preconditioning, Green's function, fundamental solution, BIE, GMRES and FFT.

2.1 Model problem

In this thesis we consider the following boundary value problem:

    u_xx(x) = f(x),   0 < x < 1,
    u(0) = c_0,
    u_x(1) = c_1.    (2.1)

2.2 Finite difference approximation

Let N > 1 be a positive integer and let h = 1/N. Also let x_j = jh, j = 1, 2, ..., N − 1, be the discretization of the interval (0, 1). The function u will be computed at the grid points x_j, where u_j is an approximation to the true solution u(x_j). Then, using the central difference approximation

    D_+D_− u_j = (u_{j+1} − 2u_j + u_{j−1}) / h²    (2.2)

for u_xx(x_j), and the backward difference approximation

    D_− u_N = (u_N − u_{N−1}) / h    (2.3)

for u_x(x_N), we obtain the following finite difference equations:

    u_{j+1} − 2u_j + u_{j−1} = h² f(x_j),   j = 1, 2, ..., N − 1,
    u_0 = c_0,
    u_N − u_{N−1} = h c_1.    (2.4)

In matrix-vector form, (2.4) can be written as

    A U = b,    (2.5)

where

    A = ( −2   1   0  ⋯   0 )      U = ( u_1     )      b = ( h² f_1 − c_0        )
        (  1  −2   1  ⋯   0 )          ( u_2     )          ( h² f_2              )
        (  ⋮   ⋱   ⋱   ⋱   ⋮ )          (  ⋮      )          (  ⋮                  )
        (  0   ⋯   1  −2   1 )          ( u_{N−2} )          ( h² f_{N−2}          )
        (  0   ⋯   0   1  −1 )          ( u_{N−1} )          ( h² f_{N−1} − h c_1 ).
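To make the discretization concrete, the following is a minimal sketch (not from the thesis) that assembles A and b from (2.5) and solves the system directly; the test problem u_xx = e^x, u(0) = 1, u_x(1) = e and the use of SciPy are assumptions of the illustration.

```python
# Sketch: assemble the FD system (2.5) for u_xx = f on (0,1) with
# u(0) = c0, u_x(1) = c1, and solve it directly as a reference.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

N = 100
h = 1.0 / N
x = np.arange(1, N) * h              # interior grid points x_1, ..., x_{N-1}
f = np.exp(x)                        # assumed test problem: f(x) = e^x
c0, c1 = 1.0, np.e                   # exact solution is then u(x) = e^x

A = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(N - 1, N - 1)).tolil()
A[-1, -1] = -1.0                     # backward-difference row, cf. (2.5)
b = h * h * f
b[0] -= c0                           # Dirichlet condition folded into b
b[-1] -= h * c1                      # Neumann condition folded into b

u = spsolve(A.tocsr(), b)
print(np.max(np.abs(u - np.exp(x)))) # discretization error
```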

2.3 Preconditioning

The solution of large sparse linear systems of the form

    A x = b,    (2.6)

where A is an N × N matrix and b a given right-hand side vector, is central to many numerical simulations in science and engineering, and it is often the most time consuming part of a computation. Generally, system (2.6) is solved by iterative methods, such as CG and GMRES, since they require less storage and fewer operations than direct methods (especially when an approximate solution of relatively low accuracy is sought). For many important problems, the convergence rate of iterative solvers is rather slow, and preconditioning is required. The term preconditioning refers to transforming system (2.6) into an equivalent preconditioned system

    M A x = M b,    (2.7)

where the matrix M is a preconditioner with the following three properties:

1. The memory required for storing M should at most be of the same order as that for storing A.

2. The number of arithmetic operations required for constructing and applying M should be comparable with that of A.

3. The convergence rate of an iterative solver should be significantly improved.

Most preconditioning techniques use some approximate inverse of A as a preconditioner. If M is an approximation of A⁻¹, then MA is close to the identity matrix, suggesting fast convergence for many iterative methods. Equivalently, one can replace A with some approximation Ã for which M = Ã⁻¹ can be applied efficiently. The inverse is usually not constructed explicitly; instead, y = Ã⁻¹z is computed by solving Ãy = z with some efficient algorithm; see, for example, [10] for details.
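As an illustration of how a preconditioner of this kind is applied in practice, here is a minimal sketch (not from the thesis), assuming a recent SciPy (≥ 1.12 for the rtol keyword) and using an incomplete LU factorization as the approximation Ã; applying M = Ã⁻¹ then amounts to sparse triangular solves.

```python
# Sketch: preconditioned GMRES, where applying M means solving with an
# approximation A~ of A (here an ILU factorization of the matrix in (2.5)).
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import gmres, spilu, LinearOperator

N = 200
h = 1.0 / N
A = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(N - 1, N - 1)).tolil()
A[-1, -1] = -1.0
A = A.tocsc()
b = h * h * np.ones(N - 1)

ilu = spilu(A)                                  # A~ in factored form
M = LinearOperator(A.shape, matvec=ilu.solve)   # y = A~^{-1} z
x, info = gmres(A, b, M=M, rtol=1e-8)
print(info)                                     # 0 indicates convergence
```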

In this thesis, we consider a new preconditioning technique that uses a fundamental solution of the differential operator to construct a good preconditioner. The concept of fundamental solutions of linear differential operators is introduced in Section 2.5 below.

2.4 Green's function

To construct the analytical solution of the ODE (2.1), we present the method of Green's functions, which applies to ODEs with homogeneous boundary conditions. To solve (2.1) by this method, we first transform it into a problem with homogeneous boundary conditions. Let v(x) = u(x) − c_0 − c_1 x; then v(x) satisfies

    v_xx(x) = f(x),
    v(0) = 0,
    v_x(1) = 0.    (2.8)

Then, if G(x, t) is a Green's function of (2.8), one can show that

    v(x) = ∫_0^1 G(x, t) f(t) dt,    (2.9)

where the function G(x, t) has the following properties [8]:

(1) G_xx(x, t) = 0, for 0 < x < t and for t < x < 1.
(2) G satisfies the boundary conditions G(0, t) = G_x(1, t) = 0.
(3) G is continuous, that is, G(t+, t) = G(t−, t).
(4) G has a jump in derivative of size 1 at x = t, that is, G_x(t+, t) − G_x(t−, t) = 1.

Now we determine G(x, t) by using the above properties. From (1), we obtain the following general form of the Green's function of (2.8):

    G(x, t) = { a + bx,  0 < x < t,
              { c + dx,  t < x < 1,    (2.10)

where the coefficients a, b, c, d can be determined from (2), (3) and (4); that is,

    a = 0,
    d = 0,
    a + bt = c + dt,
    d − b = 1.    (2.11)

From this we obtain a = 0, b = −1, c = −t, and d = 0, so

    G(x, t) = { −x,  0 < x < t,
              { −t,  t < x < 1.    (2.12)

Thus, the solution of (2.1) is

    u(x) = ∫_0^1 G(x, t) f(t) dt + c_0 + c_1 x.    (2.13)

Therefore, once the Green's function is computed for the differential operator d²/dx² and the boundary conditions, we can find the analytical solution to (2.1) using (2.13), for any f(x).

As an example, let us consider the following problem:

    u_xx = e^x,   0 < x < 1,
    u(0) = 1,
    u_x(1) = e.    (2.14)

Then by using (2.13), we have

    u(x) = ∫_0^x (−t) e^t dt + ∫_x^1 (−x) e^t dt + 1 + e x = ··· = e^x.    (2.15)
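The representation (2.13) is easy to check numerically. The following sketch (not from the thesis) evaluates the two integrals in (2.15) by quadrature and compares with the exact solution e^x:

```python
# Sketch: the Green's function representation (2.13) should reproduce the
# exact solution u(x) = e^x of problem (2.14). The split at t = x follows
# (2.12). Illustrative only.
import numpy as np
from scipy.integrate import quad

def u_green(x, f=np.exp, c0=1.0, c1=np.e):
    # G(x, t) = -t for t < x and G(x, t) = -x for t > x, see (2.12)
    left, _ = quad(lambda t: -t * f(t), 0.0, x)    # region t < x
    right, _ = quad(lambda t: -x * f(t), x, 1.0)   # region t > x
    return left + right + c0 + c1 * x

for x in [0.0, 0.3, 0.7, 1.0]:
    print(x, u_green(x), np.exp(x))  # the two columns should agree
```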

2.5 Fundamental solution

The fundamental solution E of a linear differential operator P is defined by

    P E = δ(x),    (2.16)

where δ(x) is Dirac's delta function, which satisfies

    δ(x) = { ∞,  x = 0,
           { 0,  x ≠ 0,    (2.17)

and

    ∫_{−∞}^{∞} δ(x) dx = 1.    (2.18)

In fact, the importance of the fundamental solution is due to the following relations:

    (P E) ∗ u = δ ∗ u = u,
    E ∗ (P u) = E ∗ f = u,    (2.19)

where ∗ is the convolution operator; if P u = f, then the convolution of E(x) and f(x) is defined by the relation

    (E ∗ f)(x) = ∫_{−∞}^{∞} E(x − y) f(y) dy.

Thus, E can be used to construct both the right and left inverse of P, if the convolutions are well defined [12].

The model problem (2.1) is a second order ODE, so (2.16) has the form

    d²E(x)/dx² = δ(x).    (2.20)

To determine E(x), we need to integrate (2.20) twice with respect to x. From the property of the Heaviside step function H(x), we have

    dH(x)/dx = δ(x),    (2.21)

where

    H(x) = { 0,  x < 0,
           { 1,  x ≥ 0.    (2.22)

Therefore, the first integration of (2.20) with respect to x yields

    dE(x)/dx = H(x) + C.    (2.23)

Here, C is an arbitrary constant. For convenience, let us choose C = −1/2. Integrating (2.23) with respect to x then gives

    E(x) = ∫ H(x) dx − x/2 + D = x H(x) − x/2 + D = |x|/2,    (2.24)

where we have taken the constant D = 0 in the third equality. For later reference, the discrete version of (2.20) is

    D_+D_− E_j = (E_{j+1} − 2E_j + E_{j−1}) / h² = δ_j / h,    (2.25)

where δ_j is the Kronecker delta, δ_0 = 1 and δ_j = 0 for j ≠ 0. One can show that E_j = |j| h / 2.
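The discrete relation (2.25) can likewise be verified directly; the following small sketch (an assumed illustration, not from the thesis) checks that E_j = |j|h/2 satisfies D_+D_− E_j = δ_j/h:

```python
# Sketch: the discrete fundamental solution E_j = |j| h / 2 satisfies
# D+D- E_j = delta_j / h, as in (2.25).
import numpy as np

N, h = 16, 1.0 / 16
j = np.arange(-N, N + 1)          # indices j = -N, ..., N
E = np.abs(j) * h / 2.0           # E_j = |j| h / 2

# centered second difference (E_{j+1} - 2 E_j + E_{j-1}) / h^2 at interior j
d2E = (E[2:] - 2 * E[1:-1] + E[:-2]) / h**2

# expected: 1/h at j = 0, zero elsewhere
expected = np.where(j[1:-1] == 0, 1.0 / h, 0.0)
print(np.allclose(d2E, expected))  # True
```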

2.6 Boundary integral equation

The target problem can be converted into an integral equation since it is a linear ODE with constant coefficients [3]. As derived in the previous section, (2.1) has the fundamental solution

    E(x) = |x| / 2.    (2.26)

It satisfies

    d²E(x)/dx² = δ(x),    (2.27)

where δ(x) is Dirac's delta function.

From Green's second identity [4], we have

    ∫_0^1 [ u(x) d²E(x)/dx² − E(x) d²u(x)/dx² ] dx = [ u(x) dE(x)/dx − E(x) du(x)/dx ]_{x=0}^{x=1}.    (2.28)

Substituting (2.1) and (2.27) into (2.28) yields

    u(x) = ∫_0^1 f(x) E(x) dx + [ u(x) dE(x)/dx − E(x) du(x)/dx ]_{x=0}^{x=1}.    (2.29)

Hence, (2.29) is the BIE of (2.1). BIEs are often solved by the so-called Boundary Element Method (BEM).

2.7 GMRES

The GMRES algorithm [14] is an iterative method that belongs to the family of Krylov methods; it is an extension of the MINRES (minimal residual) method to non-symmetric linear systems.

GMRES solves a linear system

    A x = b

for x by iteratively minimizing the l₂-norm of the residual,

    ‖ b − A x_k ‖₂,

where x_k is the approximation to x in step k. In exact arithmetic, at each step GMRES solves the least squares problem

    ‖ r_k ‖ = min_{x_k ∈ K_k} ‖ b − A x_k ‖    (2.30)

in the Krylov subspace

    K_k = span{ r_0, A r_0, ..., A^{k−1} r_0 },

where r_0 = b − A x_0 and x_0 is some initial guess. One can show that the Krylov vectors are linearly independent, so the dimension of K_k is k.

As a consequence of the residual minimization property [13],

    ‖ r_k ‖ = ‖ p_k(A) r_0 ‖,    (2.31)

where p_k(z) is a polynomial of degree k with p_k(0) = 1. If A is diagonalizable, X⁻¹AX = Λ, where Λ is a diagonal matrix, then it follows from (2.31) that the GMRES residuals satisfy

    ‖ r_k ‖ ≤ κ(X) min_{p_k ∈ P_k, p_k(0)=1} max_{λ ∈ σ(A)} | p_k(λ) | ‖ r_0 ‖,    (2.32)

where κ(X) is the condition number of a matrix of eigenvectors of A. If κ(X) is not too large, it is a reasonable approximation to say that the convergence of GMRES is determined by the eigenvalues of A.

For later reference, we state the following two theorems.

Theorem 2.7.1 An N × N matrix A is diagonalizable if and only if it has N linearly independent eigenvectors.

Proof: We begin by showing that A has N linearly independent eigenvectors if it is diagonalizable. Assume that A is diagonalizable, i.e., D = X⁻¹AX, where D = diag(λ_1, ..., λ_N) and X is an N × N invertible matrix with columns X = [X_1 X_2 ··· X_N]. Since X is invertible, its columns form a basis for R^N and are hence linearly independent. Therefore, X_1, X_2, ..., X_N are linearly independent column vectors.

We can write D = X⁻¹AX as AX = XD, or equivalently,

    A [X_1, X_2, ..., X_N] = [X_1, X_2, ..., X_N] diag(λ_1, λ_2, ..., λ_N).    (2.33)

The rules of matrix arithmetic tell us that the jth column of XD is λ_j X_j and the jth column of AX is AX_j, that is,

    A X_1 = λ_1 X_1,   A X_2 = λ_2 X_2,   ...,   A X_N = λ_N X_N.

Hence, the diagonal entries of D are the eigenvalues of A, their corresponding eigenvectors are the columns of X, and these are linearly independent.

Now we show that A is diagonalizable if it has N linearly independent eigenvectors. Assume that A has N linearly independent eigenvectors X_1, X_2, ..., X_N, and let X = [X_1 X_2 ··· X_N]. As noted above, the columns of AX are given by

    A X_1, A X_2, ..., A X_N.

Since X_1, X_2, ..., X_N are eigenvectors of A, we have

    A X_j = λ_j X_j,   j = 1, 2, ..., N.

Therefore, AX can be written as

    A X = [A X_1, A X_2, ..., A X_N] = [λ_1 X_1, λ_2 X_2, ..., λ_N X_N] = X D.

Pre-multiplying this expression by X⁻¹, where X is invertible since its columns are linearly independent, we obtain X⁻¹AX = D. Therefore A is diagonalizable.

Theorem 2.7.2 If A is diagonalizable and has only k distinct eigenvalues, then the GMRES iteration will terminate at the solution in at most k iterations.

Proof: Suppose that the eigenvalues λ_1, λ_2, ..., λ_n take on the k distinct values τ_1 < τ_2 < ··· < τ_k. We define a polynomial Q_k(λ) by

    Q_k(λ) = ((−1)^k / (τ_1 τ_2 ··· τ_k)) (λ − τ_1)(λ − τ_2) ··· (λ − τ_k),

and note that Q_k(0) = 1 and Q_k(λ_i) = 0 for i = 1, 2, ..., n. Then (2.32) implies

    ‖ r_k ‖ ≤ κ(X) min_{p_k ∈ P_k, p_k(0)=1} max_{λ ∈ σ(A)} | p_k(λ) | ‖ r_0 ‖ ≤ κ(X) max_{λ ∈ σ(A)} | Q_k(λ) | ‖ r_0 ‖ = 0.    (2.34)
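Theorem 2.7.2 is easy to observe numerically. The following sketch (not from the thesis, and assuming SciPy ≥ 1.12 for the rtol keyword) constructs a diagonalizable matrix with k = 3 distinct eigenvalues and counts the GMRES iterations:

```python
# Sketch: for a diagonalizable matrix with k = 3 distinct eigenvalues,
# full GMRES should reach the solution in at most 3 iterations.
import numpy as np
from scipy.sparse.linalg import gmres

rng = np.random.default_rng(0)
n = 60
eigs = np.repeat([1.0, 2.0, 5.0], 20)       # k = 3 distinct eigenvalues
X = rng.standard_normal((n, n))             # generic invertible basis
A = X @ np.diag(eigs) @ np.linalg.inv(X)    # diagonalizable by construction
b = rng.standard_normal(n)

residuals = []
x, info = gmres(A, b, restart=n, rtol=1e-8,
                callback=residuals.append, callback_type='pr_norm')
print(len(residuals))  # iteration count; at most 3 in exact arithmetic
```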

2.8 FFT

The FFT algorithm is a particular way of computing the Discrete Fourier Transform (DFT) of a sequence in a very efficient way. To understand the working principle of the FFT, we first give a brief introduction to the DFT. The DFT is useful as a computational tool that provides an efficient means for computing other quantities. Basically, the computational problem for the DFT is to compute the sequence y = [y_0, y_1, ..., y_{N−1}]^T from another given sequence x = [x_0, x_1, ..., x_{N−1}]^T, according to the formula

    y_k = Σ_{n=0}^{N−1} x_n ω_N^{kn},   k = 0, 1, ..., N − 1,    (2.35)

where ω_N = e^{−2πi/N} has the following properties:

    symmetry:     ω_N^{k+N/2} = −ω_N^k,
    periodicity:  ω_N^{k+N} = ω_N^k.    (2.36)

In general, the DFT of a sequence x_n is complex valued. Similarly, the inverse DFT is

    x_n = (1/N) Σ_{k=0}^{N−1} y_k ω_N^{−kn},   n = 0, 1, ..., N − 1,    (2.37)

which represents the components of x as a linear combination of sines and cosines, with coefficients given by the components of y. Generally, a direct computation of y_k requires N² complex multiplications and N(N − 1) complex additions.

To illustrate how the FFT algorithm works in general, we consider the case N = 4. From (2.35) we have

    y_k = Σ_{n=0}^{3} x_n ω_N^{kn},   k = 0, 1, 2, 3.    (2.38)

Writing the four equations in full, we have

    y_0 = x_0 ω_N^0 + x_1 ω_N^0 + x_2 ω_N^0 + x_3 ω_N^0,
    y_1 = x_0 ω_N^0 + x_1 ω_N^1 + x_2 ω_N^2 + x_3 ω_N^3,
    y_2 = x_0 ω_N^0 + x_1 ω_N^2 + x_2 ω_N^4 + x_3 ω_N^6,
    y_3 = x_0 ω_N^0 + x_1 ω_N^3 + x_2 ω_N^6 + x_3 ω_N^9.    (2.39)

From (2.36), we obtain ω_N^0 = ω_N^4 = 1, ω_N^2 = ω_N^6 = −1 and ω_N^1 = ω_N^9. After regrouping, we have

    y_0 = (x_0 + ω_N^0 x_2) + ω_N^0 (x_1 + ω_N^0 x_3),
    y_1 = (x_0 − ω_N^0 x_2) + ω_N^1 (x_1 − ω_N^0 x_3),
    y_2 = (x_0 + ω_N^0 x_2) + ω_N^2 (x_1 + ω_N^0 x_3),
    y_3 = (x_0 − ω_N^0 x_2) + ω_N^3 (x_1 − ω_N^0 x_3).    (2.40)

We observe that the transform can be computed with only 8 additions or subtractions and 6 multiplications, instead of the expected (4 − 1) · 4 = 12 additions and 4² = 16 multiplications. The main point is that computing the DFT of the original 4-point sequence has been reduced to two 2-point DFTs of the sub-sequences corresponding to its even numbered and odd numbered components. This property holds in general: the DFT of an N-point sequence can be computed by splitting it into two DFTs of half the length, provided N is even. If there are log₂N levels of recursion, and if the weights ω^n are precomputed, the total number of arithmetic operations required by the FFT algorithm is 5N log₂N. The same arithmetic complexity is required for the computation of the inverse DFT.
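The even/odd splitting described above translates directly into a short recursive implementation. The following sketch (not from the thesis) assumes N is a power of two and checks the result against numpy.fft.fft:

```python
# Sketch: minimal radix-2 Cooley-Tukey FFT with omega_N = exp(-2*pi*i/N),
# illustrating the even/odd splitting of (2.38)-(2.40).
import numpy as np

def fft(x):
    N = len(x)
    if N == 1:
        return x.astype(complex)
    even = fft(x[0::2])               # DFT of even-indexed samples
    odd = fft(x[1::2])                # DFT of odd-indexed samples
    w = np.exp(-2j * np.pi * np.arange(N // 2) / N)  # twiddle factors
    # omega^{k+N/2} = -omega^k gives the two halves of the output
    return np.concatenate([even + w * odd, even - w * odd])

x = np.random.default_rng(1).standard_normal(16)
print(np.allclose(fft(x), np.fft.fft(x)))  # True
```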

Chapter 3

BSE on uniform and refined mesh

In this chapter, we derive the BSE for the target problem (2.1) on uniform and refined meshes. For purposes of comparison, we also present the FD counterpart.

3.1 Uniform mesh

3.1.1 FD method

To repeat what is in Section 2.2, the FD approximation of (2.1) on the uniform mesh gives us the following system of equations:

    A U = b,    (3.1)

where A, U and b are as in (2.5); that is, A is the tridiagonal matrix with rows (1, −2, 1) and last row (1, −1), U = (u_1, ..., u_{N−1})^T, and b = (h² f_1 − c_0, h² f_2, ..., h² f_{N−2}, h² f_{N−1} − h c_1)^T. The solution can be computed by a suitable iterative method.

3.1.2 BSE

This section presents the derivation of an alternative representation of (3.1) that leads to a new way of solving (3.1).

The starting point is the following formula, obtained by performing integration by parts twice:

    ∫_0^1 v(x) u_xx(x) dx = v(1) u_x(1) − u(1) v_x(1) + u(0) v_x(0) − v(0) u_x(0) + ∫_0^1 v_xx(x) u(x) dx.    (3.2)

Equation (3.2) has several different discrete counterparts. We choose the following version since it suits our problem well:

    Σ_{k=1}^{N−1} v_k D_+D_− u_k h
      = Σ_{k=1}^{N−1} v_k (u_{k+1} − 2u_k + u_{k−1})/h² · h
      = Σ_{k=1}^{N−1} v_k u_{k+1}/h² · h − 2 Σ_{k=1}^{N−1} v_k u_k/h² · h + Σ_{k=1}^{N−1} v_k u_{k−1}/h² · h
      = Σ_{k=2}^{N} v_{k−1} u_k/h² · h − 2 Σ_{k=1}^{N−1} v_k u_k/h² · h + Σ_{k=0}^{N−2} v_{k+1} u_k/h² · h
      = Σ_{k=0}^{N} (v_{k+1} − 2v_k + v_{k−1})/h² · u_k h − v_{−1} u_0/h − v_0 u_1/h + 2v_0 u_0/h
        + 2v_N u_N/h − v_N u_{N−1}/h − v_{N+1} u_N/h
      = Σ_{k=0}^{N} u_k D_+D_− v_k h + v_0 (u_0 − u_1)/h − u_0 (v_{−1} − v_0)/h
        + v_N (u_N − u_{N−1})/h − u_N (v_{N+1} − v_N)/h.    (3.3)

Therefore, we have the following rule of summation by parts as a discrete form of integration by parts.

Lemma 3.1.1 Let u_j and v_j be two mesh functions. Then

    Σ_{k=1}^{N−1} v_k D_+D_− u_k h = Σ_{k=0}^{N} u_k D_+D_− v_k h + v_0 (u_0 − u_1)/h − u_0 (v_{−1} − v_0)/h
                                     + v_N (u_N − u_{N−1})/h − u_N (v_{N+1} − v_N)/h.    (3.4)

Proof: See the computation (3.3) above.

Now, let u_j be a solution of (3.1) and let D_+D_− E_{j−k} = δ_{j−k}/h. From Lemma 3.1.1 it follows that

    Σ_{k=1}^{N−1} E_{j−k} f_k h = Σ_{k=1}^{N−1} E_{j−k} D_+D_− u_k h
      = Σ_{k=0}^{N} u_k D_+D_− E_{j−k} h + E_j (u_0 − u_1)/h − u_0 (E_{j+1} − E_j)/h
        + E_{j−N} (u_N − u_{N−1})/h − u_N (E_{j−N−1} − E_{j−N})/h
      = u_j + E_j (u_0 − u_1)/h − c_0 (E_{j+1} − E_j)/h + E_{j−N} c_1 − u_N (E_{j−N−1} − E_{j−N})/h,    (3.5)

or equivalently,

    u_j = c_0 (E_{j+1} − E_j)/h − E_j (u_0 − u_1)/h − E_{j−N} c_1 + u_N (E_{j−N−1} − E_{j−N})/h
          + Σ_{k=1}^{N−1} E_{j−k} f_k h,   j = 0, 1, ..., N.    (3.6)

An equation of the form (3.6) is called a BSE, and (3.6) is the BSE formulation of (3.1). For the target problem, the boundary consists of two points, and the unknowns in the BSE (3.6) are approximations of the function or its derivatives at these boundary points. For PDEs, the boundaries are curves or surfaces, and the unknowns are sums of function values or approximations of derivatives over these curves or surfaces. Hence the name BSE.

To solve (3.6), we first need to find the values of (u_1 − u_0)/h and u_N. Note that (3.6) implies that

    (u_1 − u_0)/h = c_0 (E_2 − 2E_1 + E_0)/h² + (E_1 − E_0)/h · (u_1 − u_0)/h − (E_{1−N} − E_{−N})/h · c_1
                    + u_N (2E_{−N} − E_{1−N} − E_{−N−1})/h² + Σ_{k=1}^{N−1} (E_{1−k} − E_{−k})/h · f_k h,    (3.7)

and

    u_N = c_0 (E_{N+1} − E_N)/h − E_N (u_0 − u_1)/h − E_0 c_1 + u_N (E_{−1} − E_0)/h + Σ_{k=1}^{N−1} E_{N−k} f_k h.    (3.8)

With φ_1 = (u_1 − u_0)/h, φ_2 = u_N and E_j = |x_j|/2, this is a system of two coupled equations in fixed point form,

    φ = A φ + b,    (3.9)

where

    φ = ( φ_1 )     A = (  1/2    0  )     b = ( c_1/2 − (1/2) Σ_{k=1}^{N−1} f_k h          )
        ( φ_2 ),        ( x_N/2  1/2 ),        ( c_0/2 + (1/2) Σ_{k=1}^{N−1} x_{N−k} f_k h ).

Using (3.6), the solution u_j can now be computed from φ_1 and φ_2 according to

    u_j = c_0/2 + (|j| h/2) φ_1 − (|j − N| h/2) c_1 + (1/2) φ_2 + Σ_{k=1}^{N−1} (|j − k| h/2) f_k h,
          j = 0, 1, ..., N.    (3.10)
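To make the procedure concrete, the following sketch (not from the thesis) solves the 2 × 2 fixed point system (3.9) and reconstructs u_j from (3.10), for the assumed test problem u_xx = e^x, u(0) = 1, u_x(1) = e:

```python
# Sketch: uniform-mesh BSE solve via (3.9)-(3.10); the exact solution of
# the assumed test problem is u(x) = e^x.
import numpy as np

N = 64
h = 1.0 / N
x = np.arange(N + 1) * h
f = np.exp(x)                              # f(x) = e^x
c0, c1 = 1.0, np.e

fk = f[1:N]                                # interior values f_1, ..., f_{N-1}
S1 = 0.5 * np.sum(fk) * h                  # (1/2) sum f_k h
S2 = 0.5 * np.sum(x[N - 1:0:-1] * fk) * h  # (1/2) sum x_{N-k} f_k h

# fixed point system (3.9): phi = A phi + b  =>  (I - A) phi = b
A = np.array([[0.5, 0.0], [x[N] / 2.0, 0.5]])
b = np.array([c1 / 2.0 - S1, c0 / 2.0 + S2])
phi1, phi2 = np.linalg.solve(np.eye(2) - A, b)

# reconstruct u_j from (3.10)
j = np.arange(N + 1)
u = (c0 / 2.0 + np.abs(j) * h / 2.0 * phi1
     - np.abs(j - N) * h / 2.0 * c1 + phi2 / 2.0)
u += np.array([np.sum(np.abs(jj - np.arange(1, N)) * h / 2.0 * fk * h)
               for jj in j])
print(np.max(np.abs(u - np.exp(x))))       # small discretization error
```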

3.2 Locally refined mesh

For DEs whose solution changes rapidly only in a part of the computational domain, it is probably a good idea to use different step sizes in different parts of the domain. By adding extra mesh points only where high resolution is needed, one minimizes the total number of mesh points, which results in shorter execution time and less allocated memory [6].

In the following sections, we will solve (2.1) in the domain Ω := {0 ≤ x ≤ 1}, which we split into two sub-domains, 0 ≤ x ≤ 1/2 and 1/2 ≤ x ≤ 1. We discretize these two sub-domains as

    Ω_h  := { ih : i = 0, 1, ..., N },
    Ω_hp := { Nh + (i − N) h_p : i = N, N + 1, ..., 3N }.    (3.11)

For simplicity, we choose h_p = h/2.

Figure 3.1 illustrates the discretization of the computational domain Ω.

[Figure 3.1: Discretization of the computational domain Ω, where h denotes the step size of Ω_h and h_p denotes the step size of Ω_hp.]

3.2.1 FD method

In this section, we solve (2.1) with the FD method in the domain Ω on the locally refined mesh presented in the previous section. By using the FD operators we obtain the following equations:

    u_0 = c_0,
    u_{j+1} − 2u_j + u_{j−1} = h² f_j,   j = 1, 2, ..., N − 1,
    u_{N+2} − 2u_N + u_{N−1} = h² f_N,
    u_{j+1} − 2u_j + u_{j−1} = (h²/4) f_j,   j = N + 1, N + 2, ..., 3N − 1,
    u_{3N} − u_{3N−1} = (h/2) c_1.    (3.12)

This can be written in matrix-vector form as

    A u = b,    (3.13)

where

    A = ( −2   1                         )
        (  1  −2   1                     )
        (      ⋱   ⋱   ⋱                 )
        (      1  −2   1                 )
        (      1  −2   0   1             )
        (          1  −2   1             )
        (              ⋱   ⋱   ⋱         )
        (              1  −2   1         )
        (                  1  −1         ),

where the row (1, −2, 0, 1) corresponds to the interface equation at x_N, and

    b = ( h² f_1 − c_0, h² f_2, ..., h² f_{N−1}, h² f_N,
          (h²/4) f_{N+1}, (h²/4) f_{N+2}, ..., (h²/4) f_{3N−2}, (h²/4) f_{3N−1} − (h/2) c_1 )^T.

The solution can be found by using a suitable iterative method.
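A sketch of the assembly of (3.13) is given below (not from the thesis; the helper name assemble_refined and the choice h = 1/(2N), which makes Ω_h cover [0, 1/2], are assumptions of the illustration):

```python
# Sketch: assemble the locally refined FD system (3.13); the interface
# stencil and scalings follow (3.12), and the Neumann condition is folded
# into the last row as in the uniform case.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def assemble_refined(N, f, c0, c1):
    h = 0.5 / N                   # assumed: N cells of size h cover [0, 1/2]
    x = np.concatenate([np.arange(N) * h,
                        0.5 + np.arange(2 * N + 1) * h / 2])  # x_0, ..., x_{3N}
    n = 3 * N - 1                 # unknowns u_1, ..., u_{3N-1}
    A = sp.lil_matrix((n, n))
    b = np.zeros(n)
    for row, j in enumerate(range(1, 3 * N)):
        hj = h if j <= N else h / 2                # local step size
        if j == N:                                 # interface: u_{N-1}, u_N, u_{N+2}
            A[row, j - 2], A[row, j - 1], A[row, j + 1] = 1, -2, 1
            b[row] = h * h * f(x[j])
        elif j == 3 * N - 1:                       # Neumann condition folded in
            A[row, j - 2], A[row, j - 1] = 1, -1
            b[row] = hj * hj * f(x[j]) - hj * c1
        else:
            if j > 1:
                A[row, j - 2] = 1                  # u_{j-1}
            else:
                b[row] -= c0                       # u_0 = c0 moved to the RHS
            A[row, j - 1] = -2                     # u_j
            A[row, j] = 1                          # u_{j+1}
            b[row] += hj * hj * f(x[j])
    return A.tocsr(), b, x

# usage: solve the refined system directly as a reference
A, b, x = assemble_refined(8, np.exp, 1.0, np.e)
u = spsolve(A, b)
print(np.max(np.abs(u - np.exp(x[1:-1]))))         # small discretization error
```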

3.2.2 BSE

In this section, we derive the BSE version of (3.13). According to the domain discretization (3.11), from the FD approximation it follows that

    u_0 = c_0,
    D_+^h D_−^h u_j = f_j,   j = 1, 2, ..., N − 1,
    (u_{N+2} − 2u_N + u_{N−1}) / h² = f_N,
    D_+^{h/2} D_−^{h/2} u_j = f_j,   j = N + 1, N + 2, ..., 3N − 1,
    D_−^{h/2} u_{3N} = c_1.    (3.14)

If D_+^h D_−^h E_{j−k} = δ_{j−k}/h, then from (3.14) and Lemma 3.1.1 it follows that

    u_j = c_0 (E_{j+1} − E_j)/h + E_j (u_1 − u_0)/h − E_{j−N} (u_N − u_{N−1})/h
          + u_N (E_{j−N−1} − E_{j−N})/h + Σ_{k=1}^{N−1} E_{j−k} f_k h,    (3.15)

where j = 0, 1, ..., N and E_j is a fundamental solution. Similarly, if D_+^{h/2} D_−^{h/2} F_{j−k} = δ_{j−k}/(h/2), it follows that

    u_j = u_N (F_{j−N+1} − F_{j−N})/(h/2) + F_{j−N} (u_{N+1} − u_N)/(h/2) − F_{j−3N} c_1
          + u_{3N} (F_{j−3N−1} − F_{j−3N})/(h/2) + Σ_{k=N+1}^{3N−1} F_{j−k} f_k (h/2),    (3.16)

where j = N, N + 1, ..., 3N. From (3.14), it follows that

    u_N = (1/3) ( 2u_{N+1} + u_{N−1} − h² f(x_N) + (h²/4) f(x_{N+1}) ).    (3.17)

This equation will be used to connect the two sub-domains.

When the solution of a linear system of equations is computed by an iterative method, the grid dependency of the convergence is estimated by the behaviour of the coefficient matrix of the linear system. If the coefficient matrix does not depend on the step size h and the number of grid points N, then the number of iterations required by an iterative method will also not depend on h and N; that is, the convergence will be grid independent. However, it is possible to obtain grid independent convergence also for matrices that do depend on h and N. In line with this argument, we choose φ_1 = (u_1 − u_0)/h, φ_2 = (u_N − u_{N−1})/h, φ_3 = u_N for (3.15), and choose φ_4 = u_N, φ_5 = (u_{N+1} − u_N)/(h/2) and φ_6 = u_{3N} for (3.16). Then (3.15) and (3.16) imply

    φ_1 = (1/2)φ_1 + (1/2)φ_2 − Σ_{k=1}^{N−1} (1/2) f_k h,    (3.18)
    φ_2 = (1/2)φ_1 + (1/2)φ_2 + Σ_{k=1}^{N−1} (1/2) f_k h,    (3.19)
    φ_3 = (1/4)φ_1 + (1/2)φ_3 + c_0/2 + Σ_{k=1}^{N−1} E_{N−k} f_k h,    (3.20)
    φ_4 = (1/2)φ_4 + (1/2)φ_6 − c_1/4 + Σ_{k=N+1}^{3N−1} F_{N−k} f_k (h/2),    (3.21)
    φ_5 = (1/2)φ_5 + c_1/2 − Σ_{k=N+1}^{3N−1} (h/4) f_k,    (3.22)
    φ_6 = (1/2)φ_4 + (1/4)φ_5 + (1/2)φ_6 + Σ_{k=N+1}^{3N−1} F_{3N−k} f_k (h/2).    (3.23)

We are looking for an alternative, equivalent formulation of (3.14). When deriving equations (3.18)–(3.23) we used all of the equations except (3.17). In fact, the contribution of (3.17) is that it connects the two sub-domains, so it also needs to be included.

By substituting φ_2 and φ_5 for (u_N − u_{N−1})/h and (u_{N+1} − u_N)/(h/2) in (3.17), we obtain the following equation:

    φ_2 = φ_5 + (h/4) f_{N+1} − h f_N.    (3.24)

In addition, for later reference, we note the identity

    φ_3 = φ_4.    (3.25)

By imposing (3.24) and (3.25), we eliminate φ_2 and φ_3 from (3.18), (3.19) and (3.20), and then eliminate φ_4 and φ_5 from (3.21), (3.22) and (3.23). After elimination we obtain the following six equations:

    φ_1 = (1/2)φ_1 + (1/2)φ_5 + (h/8)f_{N+1} − (h/2)f_N − Σ_{k=1}^{N−1} (1/2) f_k h,
    φ_2 = (1/2)φ_1 + (1/2)φ_5 + (h/8)f_{N+1} − (h/2)f_N + Σ_{k=1}^{N−1} (1/2) f_k h,
    φ_3 = (1/4)φ_1 + (1/2)φ_4 + c_0/2 + Σ_{k=1}^{N−1} E_{N−k} f_k h,
    φ_4 = (1/2)φ_3 + (1/2)φ_6 − c_1/4 + Σ_{k=N+1}^{3N−1} F_{N−k} f_k (h/2),
    φ_5 = (1/2)φ_2 − (h/8)f_{N+1} + (h/2)f_N + c_1/2 − Σ_{k=N+1}^{3N−1} (h/4) f_k,
    φ_6 = (1/4)φ_2 + (1/2)φ_3 + (1/2)φ_6 − (h/16)f_{N+1} + (h/4)f_N + Σ_{k=N+1}^{3N−1} F_{3N−k} f_k (h/2).    (3.26)

The fixed point form of (3.26) is

    φ = A φ + b,    (3.27)

where

    A = ( 1/2   0    0    0   1/2   0  )      b = ( (h/8)f_{N+1} − (h/2)f_N − Σ_{k=1}^{N−1} (1/2) f_k h      )
        ( 1/2   0    0    0   1/2   0  )          ( (h/8)f_{N+1} − (h/2)f_N + Σ_{k=1}^{N−1} (1/2) f_k h      )
        ( 1/4   0    0   1/2   0    0  )          ( c_0/2 + Σ_{k=1}^{N−1} E_{N−k} f_k h                      )
        (  0    0   1/2   0    0   1/2 )          ( −c_1/4 + Σ_{k=N+1}^{3N−1} F_{N−k} f_k h/2                )
        (  0   1/2   0    0    0    0  )          ( −(h/8)f_{N+1} + (h/2)f_N + c_1/2 − Σ_{k=N+1}^{3N−1} (h/4) f_k )
        (  0   1/4  1/2   0    0   1/2 ),         ( −(h/16)f_{N+1} + (h/4)f_N + Σ_{k=N+1}^{3N−1} F_{3N−k} f_k h/2 ).

Using (3.15) and (3.16), the solution u_j can now be computed from φ_1, φ_2, φ_3, φ_4, φ_5 and φ_6 according to

    u_j = (jh/2)φ_1 + c_0/2 − (|N − j| h/2)φ_2 + (1/2)φ_3 + Σ_{k=1}^{N−1} (|j − k| h/2) f_k h,
          for j = 0, 1, ..., N,

    u_j = (1/2)φ_4 + (|N − j| h/4)φ_5 − (|3N − j| h/4) c_1 + (1/2)φ_6 + Σ_{k=N+1}^{3N−1} (|j − k| h/4) f_k (h/2),
          for j = N, N + 1, ..., 3N.    (3.28)
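The following sketch (not from the thesis) carries out the corresponding computation on the refined mesh: it solves the 6 × 6 fixed point system (3.27) and reconstructs u_j from (3.28). The step size h = 1/(2N), the discrete fundamental solutions E_j = |j|h/2 and F_j = |j|h/4, and the test problem u_xx = e^x, u(0) = 1, u_x(1) = e are all assumptions of the illustration.

```python
# Sketch: refined-mesh BSE solve via (3.27)-(3.28).
import numpy as np

N = 32
h = 0.5 / N
x = np.concatenate([np.arange(N) * h, 0.5 + np.arange(2 * N + 1) * h / 2])
f = np.exp(x)
c0, c1 = 1.0, np.e

kc = np.arange(1, N)                 # coarse indices k = 1, ..., N-1
kf = np.arange(N + 1, 3 * N)         # fine indices k = N+1, ..., 3N-1
Sc = 0.5 * np.sum(f[kc]) * h
SE = np.sum(np.abs(N - kc) * h / 2 * f[kc] * h)           # sum E_{N-k} f_k h
SFN = np.sum(np.abs(N - kf) * h / 4 * f[kf] * h / 2)      # sum F_{N-k} f_k h/2
SF3 = np.sum(np.abs(3 * N - kf) * h / 4 * f[kf] * h / 2)  # sum F_{3N-k} f_k h/2
Sf = 0.25 * h * np.sum(f[kf])

A = np.array([[.5, 0, 0, 0, .5, 0],
              [.5, 0, 0, 0, .5, 0],
              [.25, 0, 0, .5, 0, 0],
              [0, 0, .5, 0, 0, .5],
              [0, .5, 0, 0, 0, 0],
              [0, .25, .5, 0, 0, .5]])
b = np.array([h/8*f[N+1] - h/2*f[N] - Sc,
              h/8*f[N+1] - h/2*f[N] + Sc,
              c0/2 + SE,
              -c1/4 + SFN,
              -h/8*f[N+1] + h/2*f[N] + c1/2 - Sf,
              -h/16*f[N+1] + h/4*f[N] + SF3])
phi = np.linalg.solve(np.eye(6) - A, b)       # (I - A) phi = b

j = np.arange(3 * N + 1)
u = np.where(j <= N,
     j*h/2*phi[0] + c0/2 - np.abs(N - j)*h/2*phi[1] + phi[2]/2
       + np.array([np.sum(np.abs(jj - kc)*h/2*f[kc]*h) for jj in j]),
     phi[3]/2 + np.abs(N - j)*h/4*phi[4] - np.abs(3*N - j)*h/4*c1 + phi[5]/2
       + np.array([np.sum(np.abs(jj - kf)*h/4*f[kf]*h/2) for jj in j]))
print(np.max(np.abs(u - np.exp(x))))          # small discretization error
```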

Chapter 4

Arithmetic complexity and convergence rate

In this chapter, we investigate the arithmetic complexity and convergence rate of GMRES applied to the BSE and the FD method.

4.1 Arithmetic complexity

This section presents the arithmetic complexity of GMRES for the FD method and the BSE, respectively, on uniform and refined meshes.

4.1.1 Uniform mesh

FD method

We have solved (2.1) with the FD method, and it yielded the linear system of equations

    A U = b,    (4.1)

with A, U and b as in (3.1). Here, A is an (N − 1) × (N − 1), symmetric, tridiagonal matrix. To compute the solution of (4.1), we choose GMRES. In general, the computational cost of the GMRES algorithm can be divided in two parts:

• The computation of the residual r_k = b − A u_k.

• Minimization of the residual, ‖ r_k ‖ = min_{u_k ∈ K_k} ‖ b − A u_k ‖, which is a least squares problem of size k. Here, the required number of arithmetic operations for step k is O(kN).

Since the second part of the total computational cost of GMRES is the same for the FD method and the BSE, in the rest of the thesis we only consider the first part of the total number of arithmetic operations of GMRES. For (4.1), the cost of one residual evaluation of GMRES is O(N).

BSE

In this section we consider the iterative solution of (3.6). For simplicity, let us rewrite it as

    u_j + E_j (u_0 − u_1)/h − u_N (E_{j−N−1} − E_{j−N})/h = Σ_{k=−1}^{N+1} E_{j−k} f̃_k h,   j = 0, 1, ..., N,    (4.2)

where

    f̃_k = c_0 (δ_{k−1} − δ_k)/h² − δ_k (u_k − u_{k+1})/h² + (u_{k−1} δ_{k−N−1} − u_k δ_{k−N})/h² − δ_{k−N} c_1/h + χ_k f_k,    (4.3)

and

    χ_k = { 1,  k = 1, 2, ..., N − 1,
          { 0,  elsewhere.    (4.4)

The values of f̃_k for k = −1, 0, ..., N + 1 are thus

    f̃_{−1}  = 0,
    f̃_0    = −c_0/h²,
    f̃_1    = c_0/h² + f_1,
     ⋮
    f̃_{N−1} = f_{N−1},
    f̃_N    = −c_1/h,
    f̃_{N+1} = 0.    (4.5)

The matrix-vector formulation of (4.2) is

    A u = b,    (4.6)

where

    A = ( −E_1 + h    0    0   ⋯   −h/2 )      b = ( Σ_{k=−1}^{N+1} E_{1−k} f̃_k h² − c_0 E_1     )
        ( −E_2        h    0   ⋯   −h/2 )          ( Σ_{k=−1}^{N+1} E_{2−k} f̃_k h² − c_0 E_2     )
        (   ⋮              ⋱         ⋮  )          (   ⋮                                          )
        ( −E_{N−1}    0   ⋯    h   −h/2 )          ( Σ_{k=−1}^{N+1} E_{N−1−k} f̃_k h² − c_0 E_{N−1} )
        ( −E_N        0   ⋯    0    h/2 ),         ( Σ_{k=−1}^{N+1} E_{N−k} f̃_k h² − c_0 E_N     ).

It can be seen from (4.6) that A is an N × N, non-symmetric, sparse matrix. To compute the solution of (4.6), the GMRES solver is chosen; it needs O(N) operations for one residual evaluation. In addition, to compute the vector b, the FFT uses O(N log₂ N) arithmetic operations.

4.1.2 Refined mesh

FD method

By applying the FD method to (2.1), we obtained the linear system of equations

    A u = b,    (4.7)

with A and b as in (3.13). Here, A is a (3N − 1) × (3N − 1), non-symmetric, sparse matrix. To compute the solution of (4.7), we use the GMRES solver; it needs O(N) operations for evaluating one residual.

BSE

In this section we consider the solution of the locally refined system (3.15)–(3.16). For simplicity, let us write (3.15) and (3.16) in the following form:

    2u_j − j u_1 − (N − j) u_{N−1} + (N − j − 1) u_N
        = c_0 − j c_0 + Σ_{k=1}^{N−1} (j − k) f_k h²,   j ∈ Ω_h,    (4.8)

    2u_j + (N − j − 1) u_N − (N − j) u_{N+1} − u_{3N}
        = −((3N − j)/2) h c_1 + Σ_{k=N+1}^{3N−1} ((j − k)/4) f_k h²,   j ∈ Ω_hp.    (4.9)

At j = N, (4.8) and (4.9) read

    u_N − N u_1 = c_0 − N c_0 + Σ_{k=1}^{N−1} (N − k) f_k h²,
    u_N − u_{3N} = −N h c_1 + Σ_{k=N+1}^{3N−1} ((N − k)/4) f_k h².    (4.10)

Subtraction yields

    u_{3N} = N u_1 + (1 − N) c_0 + N h c_1 + Σ_{k=1}^{N−1} (N − k) f_k h² − Σ_{k=N+1}^{3N−1} ((N − k)/4) f_k h² = N u_1 + T,    (4.11)

where T = (1 − N) c_0 + N h c_1 + Σ_{k=1}^{N−1} (N − k) f_k h² − Σ_{k=N+1}^{3N−1} ((N − k)/4) f_k h². Substitution of (4.11) into (4.8) and (4.9) gives

    2u_j − (N − j) u_{N−1} + (N − j − 1) u_N − (j/N) u_{3N}
        = −(j/N) T + c_0 − j c_0 + Σ_{k=1}^{N−1} (j − k) f_k h²,   j ∈ Ω_h,

    2u_j − N u_1 + (N − j − 1) u_N − (N − j) u_{N+1}
        = T − ((3N − j)/2) h c_1 + Σ_{k=N+1}^{3N−1} ((j − k)/4) f_k h²,   j ∈ Ω_hp.    (4.12)

Together with (3.17), the matrix-vector form can be written as

    A u = b,    (4.13)

where

    A = (  2   0   ⋯   0   −(N−1)    N−2    0  ⋯  0   −1/N     )
        (  0   2   ⋯   0   −(N−2)    N−3    0  ⋯  0   −2/N     )
        (  ⋮           ⋱      ⋮       ⋮                  ⋮      )
        (  0   ⋯   2  −2      1       0     0  ⋯  0  −(N−2)/N  )
        (  0   ⋯   0   1      0       0     0  ⋯  0  −(N−1)/N  )
        (  0   ⋯   0  −1/3    1     −2/3    0  ⋯  0     0      )
        ( −N   0   ⋯   0     −2       3     0  ⋯  0     0      )
        ( −N   0   ⋯   0     −3       2     2  0  ⋯     0      )
        (  ⋮                  ⋮       ⋮        ⋱                )
        ( −N   0   ⋯   0    −2N     2N−1    0  ⋯  2     0      )
        (  N   0   ⋯   0  −(2N+1)    2N     0  ⋯  0     0      ),

and

    b = ( Σ_{k=1}^{N−1} (1 − k) f_k h² − (1/N) T
          Σ_{k=1}^{N−1} (2 − k) f_k h² − c_0 − (2/N) T
           ⋮
          Σ_{k=1}^{N−1} (N − 2 − k) f_k h² + c_0 − (N − 2) c_0 − ((N − 2)/N) T
          Σ_{k=1}^{N−1} (N − 1 − k) f_k h² + c_0 − (N − 1) c_0 − ((N − 1)/N) T
          −(1/3) h² f(x_N) + (1/12) h² f(x_{N+1})
          Σ_{k=N+1}^{3N−1} ((N + 1 − k)/4) f_k h² − ((2N − 1)/2) h c_1 + T
          Σ_{k=N+1}^{3N−1} ((N + 2 − k)/4) f_k h² − ((2N − 2)/2) h c_1 + T
           ⋮
          Σ_{k=N+1}^{3N−1} ((3N − 1 − k)/4) f_k h² − (1/2) h c_1 + T
          Σ_{k=N+1}^{3N−1} ((3N − k)/4) f_k h² − T ).

Here, A is a 3N × 3N non-symmetric matrix. To compute the solution of (4.13), the GMRES solver is chosen; it uses O(N) operations for one residual evaluation. In addition, for computing the vector b, the FFT uses O(N log₂ N) operations.

Table 4.1: Arithmetic complexity of GMRES for total residual evaluations for the FDM and BSE, respectively, on uniform and refined meshes, where N is the number of grid points and M = 3N.

                  Uniform mesh                 Refined mesh
                  FDM      BSE                 FDM      BSE
    Complexity    O(N²)    O(N log₂ N)         O(M²)    O(M log₂ M)

As can be seen from Table 4.1, the iterative solution of the BSE needs fewer arithmetic operations than that of the FDM.

4.2 Convergence rate

In the previous section, GMRES was chosen for computing the solutions of the linear systems (4.1), (4.6), (4.7) and (4.13). In this section, we analyse the convergence rate of GMRES for those linear systems. As stated in Chapter 2, the number of GMRES iterations required for convergence is determined by the number of distinct eigenvalues of a diagonalizable matrix. We therefore first prove the diagonalizability of the matrix A of each linear system, and then determine the convergence of GMRES by applying Theorem 2.7.2.

4.2.1 Uniform mesh

First, we consider (4.1), where A is an (N − 1) × (N − 1) matrix whose eigenvalues are all distinct [15]. Since eigenvectors corresponding to distinct eigenvalues are linearly independent, A has a full set of linearly independent eigenvectors. It follows from Theorem 2.7.1 that A is diagonalizable. Hence, Theorem 2.7.2 implies that GMRES needs at most N − 1 iterations to solve (4.1).

Next we estimate the number of GMRES iterations required for convergence for (3.9). To do so, we first convert it into matrix-vector form:

    (  1    0 ) ( φ_1 )   ( c_1 − Σ_{k=1}^{N−1} f_k h          )
    ( −x_N  1 ) ( φ_2 ) = ( c_0 + Σ_{k=1}^{N−1} x_{N−k} f_k h ),    (4.14)

where the coefficient matrix A has only one eigenvector,

    v = ( 0, 1 )^T,

corresponding to the doubly repeated eigenvalue λ = 1. Since there is no second linearly independent eigenvector of this matrix, Theorem 2.7.1 indicates that A is non-diagonalizable. However, A is a 2 × 2 matrix, so it is safe to say that GMRES will converge to the solution in at most two iterations.

Finally, we consider (4.6). In order to find out whether A is diagonalizable, we find its eigenvalues and associated eigenvectors. The eigenvalues can be found by computing the determinant of A − λI. Expanding along the first row and simplifying,

    det(A − λI) = (−E_1 + h − λ)(h − λ)^{N−2} (h/2 − λ) + (−h/2)(−1)^{N+1} (−E_N)(−1)^N (h − λ)^{N−2}
                = (h − λ)^{N−2} [ (−E_1 + h − λ)(h/2 − λ) − (h/2) E_N ].    (4.15)

By setting det(A − λI) = 0 and using E_1 = h/2 and E_N = Nh/2, we obtain three distinct eigenvalues: λ_1 = h with algebraic multiplicity N − 2, λ_2 = h(1 + √N)/2 with algebraic multiplicity one, and λ_3 = h(1 − √N)/2 with algebraic multiplicity one. After that, to find the eigenvectors associated with λ_1, λ_2 and λ_3, we solve the equation (A − λ_j I) X = 0 for j = 1, 2, 3.

Using the eigenvalue λ_1 = h, we have

    (A − hI) X = ( −E_1      0  ⋯  0  −h/2 ) ( x_1     )   ( 0 )
                 ( −E_2      0  ⋯  0  −h/2 ) ( x_2     )   ( 0 )
                 (   ⋮                  ⋮  ) (  ⋮      ) = ( ⋮ )
                 ( −E_{N−1}  0  ⋯  0  −h/2 ) ( x_{N−1} )   ( 0 )
                 ( −E_N      0  ⋯  0  −h/2 ) ( x_N     )   ( 0 ).    (4.16)

Letting x_2 = t_1, x_3 = t_2, ..., x_{N−1} = t_{N−2} (with x_1 = x_N = 0), we obtain the associated eigenvectors of λ_1 = h:

    X = e_2 t_1 + e_3 t_2 + ··· + e_{N−1} t_{N−2},    (4.17)

where e_j denotes the jth unit vector. Thus, λ_1 = h has N − 2 linearly independent eigenvectors. From the algebraic multiplicities of the eigenvalues, it then follows that each of λ_2 and λ_3 has one linearly independent eigenvector.

To sum up, A has a full set of linearly independent eigenvectors. From Theorem 2.7.1 it follows that A is diagonalizable. Hence, according to Theorem 2.7.2, GMRES uses at most three iterations to solve (4.6).

4.2.2 Refined mesh

To begin with, we consider (4.7). In order to find out whether A is diagonalizable, we need to know the number of linearly independent eigenvectors. For this, we find the eigenvalues of A by computing the determinant of A − λI; for simplicity, we take N = 3. Expanding the 8 × 8 determinant along the first row and simplifying gives

    det(A − λI) = ··· = (−2 − λ)^7 (−1 − λ) − (−2 − λ)^6 − 2(−2 − λ)^5 (−1 − λ) + 3(−2 − λ)^4
                  + 5(−2 − λ)^3 (−1 − λ) + (−2 − λ)^2 (3 + λ).

By setting det(A − λI) = 0, we obtain eight distinct eigenvalues. This means there are eight linearly independent eigenvectors, since the number of distinct eigenvalues equals the order of A. In terms of Theorem 2.7.1, A is diagonalizable. Numerical experiments indicate that also for a general N, all the eigenvalues are distinct and the matrix is diagonalizable, so it does not seem possible to prove convergence in less than 3N − 1 iterations.
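This distinctness is easy to probe numerically; the sketch below reuses the hypothetical assemble_refined helper from the sketch in Section 3.2.1 and simply counts distinct (rounded) eigenvalues:

```python
# Sketch: count distinct eigenvalues of the refined-mesh FD matrix (4.7).
import numpy as np

A, _, _ = assemble_refined(3, np.exp, 1.0, np.e)    # N = 3: an 8 x 8 matrix
lam = np.linalg.eigvals(A.toarray())
print(len(lam), len(np.unique(np.round(lam, 10))))  # equal counts => distinct
```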

Then we consider the fixed point form (3.27) of the refined BSE. First of all, let us convert it into matrix-vector form,

    A φ = b,    (4.18)

where

    A = (  1/2    0     0     0   −1/2    0  )
        ( −1/2    1     0     0   −1/2    0  )
        ( −1/4    0     1   −1/2    0     0  )
        (   0     0   −1/2    1     0   −1/2 )
        (   0   −1/2    0     0     1     0  )
        (   0   −1/4  −1/2    0     0    1/2 ),

and b is the right-hand side vector of (3.27). Similarly, the eigenvalues of A can be found by computing det(A − λI). Expanding the determinant and simplifying, we obtain

    det(A − λI) = ··· = (1 − λ)² [ (1 − λ)(1/2 − λ) − 1/4 ]².    (4.19)

By setting det(A − λI) = 0, we find three distinct eigenvalues: λ_1 = 1 with algebraic multiplicity two, λ_2 = (3 − √5)/4 with algebraic multiplicity two, and λ_3 = (3 + √5)/4 with algebraic multiplicity two.

For λ_1 = 1,

    (A − I) X = ( −1/2    0     0     0   −1/2    0  ) ( x_1 )   ( 0 )
                ( −1/2    0     0     0   −1/2    0  ) ( x_2 )   ( 0 )
                ( −1/4    0     0   −1/2    0     0  ) ( x_3 ) = ( 0 )
                (   0     0   −1/2    0     0   −1/2 ) ( x_4 )   ( 0 )
                (   0   −1/2    0     0     0     0  ) ( x_5 )   ( 0 )
                (   0   −1/4  −1/2    0     0   −1/2 ) ( x_6 )   ( 0 ).    (4.20)

Letting x_5 = s and x_6 = t, we have x_1 = −s, x_2 = 0, x_3 = −t and x_4 = s/2. From these relations we can write the associated eigenvectors in the form

    X = (−1, 0, 0, 1/2, 1, 0)^T s + (0, 0, −1, 0, 0, 1)^T t.

Thus λ_1 = 1 has two linearly independent eigenvectors.

For λ_2 = (3 − √5)/4, the matrix A − λ_2 I has diagonal entries ((−1+√5)/4, (1+√5)/4, (1+√5)/4, (1+√5)/4, (1+√5)/4, (−1+√5)/4) and the same off-diagonal entries as A. Solving (A − λ_2 I) X = 0 and letting x_5 = t, we have

    x_1 = ((1+√5)/2) t,   x_2 = ((−1−√5)/2) t,   x_3 = ((1+√5)/4) t,   x_4 = (1/2) t,   x_6 = 0.    (4.22)

The eigenvector corresponding to the eigenvalue λ_2 = (3 − √5)/4 is

    X = ( (1+√5)/2, (−1−√5)/2, (1+√5)/4, 1/2, 1, 0 )^T t.

Thus λ_2 = (3 − √5)/4 has one linearly independent eigenvector.

For λ_3 = (3 + √5)/4, the matrix A − λ_3 I has diagonal entries ((−1−√5)/4, (1−√5)/4, (1−√5)/4, (1−√5)/4, (1−√5)/4, (−1−√5)/4) and the same off-diagonal entries as A. Letting x_1 = t and x_3 = s, we have

    x_2 = (2√5/(1 − √5)) t,   x_4 = −(1/2) t + ((1 − √5)/2) s,   x_5 = ((−1−√5)/2) t,   x_6 = (√5/2) t + (2/(−√5 − 1)) s.    (4.24)

The eigenvectors corresponding to the eigenvalue λ_3 = (3 + √5)/4 are

    X = ( 1, 2√5/(1−√5), 0, −1/2, (−1−√5)/2, √5/2 )^T t + ( 0, 0, 1, (1−√5)/2, 0, 2/(−√5−1) )^T s.

Thus, λ_3 = (3 + √5)/4 has two linearly independent eigenvectors. To conclude, the number of linearly independent eigenvectors is five. It follows from Theorem 2.7.1 that A is non-diagonalizable, and it does not seem possible to prove convergence of GMRES in fewer than six iterations.
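The 6 × 6 matrix is small enough that the analysis above can be cross-checked numerically; the following sketch (an assumed illustration) reports the eigenvalues of A = I − Ã, where Ã is the fixed point matrix of (3.27), and the geometric multiplicity of each eigenvalue:

```python
# Sketch: numerical cross-check of the 6 x 6 analysis; A is the
# coefficient matrix of (4.18), i.e. I minus the fixed point matrix (3.27).
import numpy as np

A_fp = np.array([[.5, 0, 0, 0, .5, 0],
                 [.5, 0, 0, 0, .5, 0],
                 [.25, 0, 0, .5, 0, 0],
                 [0, 0, .5, 0, 0, .5],
                 [0, .5, 0, 0, 0, 0],
                 [0, .25, .5, 0, 0, .5]])
A = np.eye(6) - A_fp
print(np.sort_complex(np.linalg.eigvals(A)))  # expect 1 and (3 +- sqrt(5))/4

for mu in [1.0, (3 - 5**0.5) / 4, (3 + 5**0.5) / 4]:
    rank = np.linalg.matrix_rank(A - mu * np.eye(6), tol=1e-10)
    print(mu, 6 - rank)    # geometric multiplicity of each eigenvalue
```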

Lastly, we consider (4.13). For simplicity, we consider the case N = 3, for which

    A = (  2   −2    1    0   0   0   0   0  −1/3 )
        (  0    1    0    0   0   0   0   0  −2/3 )
        (  0  −1/3   1  −2/3  0   0   0   0   0   )
        ( −3    0   −2    3   0   0   0   0   0   )
        ( −3    0   −3    2   2   0   0   0   0   )
        ( −3    0   −4    3   0   2   0   0   0   )
        ( −3    0   −5    4   0   0   2   0   0   )
        ( −3    0   −6    5   0   0   0   2   0   )
        ( −3    0   −7    6   0   0   0   0   2   ).    (4.25)

As above, we first find the eigenvalues by computing the determinant of A − λI. Expanding and simplifying,

    det(A − λI) = ··· = (2 − λ)^4 (−9λ^5 + 81λ^4 − 261λ^3 + 346λ^2 − 347λ + 57).    (4.26)

By setting det(A − λI) = 0, we obtain six distinct eigenvalues, that is,

    λ_1 = 2, with multiplicity four,
    λ_2 = 3.7683 + 0.9173i, with multiplicity one,
    λ_3 = 3.7683 − 0.9173i, with multiplicity one,
    λ_4 = 0.6915 + 1.416i, with multiplicity one,
    λ_5 = 0.6915 − 1.416i, with multiplicity one,
    λ_6 = 0.0803, with multiplicity one.    (4.27)

In order to determine whether A is diagonalizable, we only concentrate our attention on the eigenvalue λ_1 = 2, since we already know that each of the other five distinct eigenvalues has one linearly independent eigenvector. The eigenvectors of λ_1 = 2 can be found by solving

    (A − 2I) X = 0.    (4.28)

Writing the reduced row echelon form of A − 2I, we obtain

    ( 1 0 0 0 0 0 0 0 0 )
    ( 0 1 0 0 0 0 0 0 0 )
    ( 0 0 1 0 0 0 0 0 0 )
    ( 0 0 0 1 0 0 0 0 0 )
    ( 0 0 0 0 0 0 0 0 1 )
    ( 0 0 0 0 0 0 0 0 0 )
    ( 0 0 0 0 0 0 0 0 0 )
    ( 0 0 0 0 0 0 0 0 0 )
    ( 0 0 0 0 0 0 0 0 0 ).

Then, letting x_5 = t_1, x_6 = t_2, x_7 = t_3 and x_8 = t_4 (with x_1 = x_2 = x_3 = x_4 = x_9 = 0), we obtain the associated eigenvectors of λ_1 = 2:

    X = e_5 t_1 + e_6 t_2 + e_7 t_3 + e_8 t_4,

where e_j denotes the jth unit vector. Hence λ_1 = 2 has four linearly independent eigenvectors. To conclude, there are nine linearly independent eigenvectors associated with the six distinct eigenvalues. From Theorem 2.7.1 we know that A is diagonalizable. Numerical experiments indicate that also for a general N, the matrix is diagonalizable with six distinct eigenvalues. Hence, according to Theorem 2.7.2, GMRES needs at most six iterations to solve (4.13).
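As a numerical cross-check (a sketch, not from the thesis), the explicit matrix (4.25) can be fed to a dense eigensolver; the eigenvalues should match (4.27), and a full-rank eigenvector matrix indicates diagonalizability:

```python
# Sketch: eigenvalues and eigenvector rank of the N = 3 matrix (4.25).
import numpy as np

A = np.array([
    [2, -2, 1, 0, 0, 0, 0, 0, -1/3],
    [0, 1, 0, 0, 0, 0, 0, 0, -2/3],
    [0, -1/3, 1, -2/3, 0, 0, 0, 0, 0],
    [-3, 0, -2, 3, 0, 0, 0, 0, 0],
    [-3, 0, -3, 2, 2, 0, 0, 0, 0],
    [-3, 0, -4, 3, 0, 2, 0, 0, 0],
    [-3, 0, -5, 4, 0, 0, 2, 0, 0],
    [-3, 0, -6, 5, 0, 0, 0, 2, 0],
    [-3, 0, -7, 6, 0, 0, 0, 0, 2]])

lam, V = np.linalg.eig(A)
print(np.round(np.sort_complex(lam), 4))  # eigenvalues, cf. (4.27)
print(np.linalg.matrix_rank(V))           # 9 suggests a full eigenvector set
```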

Chapter 5

Experimental results

In this chapter, results from numerical experiments testing the convergence rate of the BSE method are presented. In particular, the number of GMRES iterations required for convergence of the BSE is presented, and the results are compared with the GMRES iteration counts for the FDM. Results for both the uniform mesh and the refined mesh are presented. We consider (2.1) for f(x) = e^x, f(x) = −sin x, and f(x) = x², respectively, with appropriate boundary conditions. We used full GMRES with the zero vector as initial guess and tolerance 10⁻⁶. The numerical experiments were performed on Mac OS X 10.6.8 with a 2.4 GHz Intel Core Duo processor and 4 GB memory. The results are summarized in Table 5.1.

Table 5.1: The number of GMRES iterations required for convergence for the FDM and BSE, respectively, on uniform and refined meshes, where N is the number of grid points.

          Uniform mesh        Refined mesh
    N     FDM     BSE         FDM     BSE
    12     10      3            7      6
    24     22      3           19      6
    33     31      3           28      6
    63     61      3           58      6
   129    127      3          124      6
   255    253      3          250      6

As can be seen from Table 5.1, the GMRES iteration count required for convergence is constant and mesh independent for the BSE, while it is mesh dependent for the FDM.

As can be seen from Table 5.2, the refined mesh requires less execution time and allocated memory than the uniform mesh.

Table 5.2: The number of mesh points and execution time needed to compute the solution of the BSE with the same accuracy on a uniform mesh (BSE) and a refined mesh (RBSE).

    Mesh points           l∞ error     Time (seconds)
    BSE       RBSE                     BSE       RBSE
    128        96         1.07E-2      .0068     .0065
    256       195         5.2E-3       .0082     .0078
    512       390         2.6E-3       .0117     .0098

Chapter 6

Conclusions and future work

6.1 Conclusion

In this thesis, a new convergence acceleration technique is presented for the iterative solution of linear systems of equations obtained from FD discretizations of ODEs with constant coefficients on locally refined meshes. Benefiting from the theory of BIEs, the difference equations are transformed into an equivalent BSE by using the discrete fundamental solution of the difference operator. Since the BSE system is well suited to iterative methods, the GMRES solver is applied. The numerical results showed that (a) the required memory for storing the coefficient matrix of the BSE system is O(N), where N is the number of grid points; (b) the convergence rate of the GMRES method for the BSE is significantly improved: the number of iterations required for convergence is constant and independent of the number of grid points; and (c) solving the BSE linear system is computationally inexpensive: when the computational cost of the least squares problem is neglected, one residual evaluation of GMRES is performed in O(N log₂ N) arithmetic operations.

6.2 Possible extensions

For possible future extensions we propose the following:

• In this thesis, the new preconditioning technique is applied to a second order ODE with constant coefficients, but it can be applied to higher order ODEs and PDEs with constant coefficients.

• The BSE is formulated from the FD discretization, but it is applicable to other discretization methods as long as the grids are structured.

• The new approach can be applied to higher dimensional PDEs.

References