
IT 15 023

Degree project (Examensarbete) 30 hp, March 2015

Performance comparisons of preconditioned iterative methods for problems arising in PDE-constrained optimization

Shiraz Farouq


Faculty of Science and Technology (Teknisk-naturvetenskaplig fakultet), UTH unit
Visiting address: Ångströmlaboratoriet, Lägerhyddsvägen 1, Hus 4, Plan 0
Postal address: Box 536, 751 21 Uppsala
Telephone: 018 – 471 30 03
Fax: 018 – 471 30 00
Web page: http://www.teknat.uu.se/student

Abstract

Performance comparisons of preconditioned iterative methods for problems arising in PDE-constrained optimization

Shiraz Farouq

The governing dynamics of simple and complex processes, whether physical, biological, social, economic, engineering, or even purely imagined, can be studied via numerical simulations of mathematical models. These models in many cases consist of one or, frequently, several coupled partial differential equations (PDEs). In many applications, the aim of such simulations is not only to study the behavior of the underlying processes, but also to optimize or control them in some optimal way. These are referred to as optimal control problems constrained by PDEs and are stated in the form of a constrained minimization problem. The general framework under which such problems are studied is referred to as PDE-constrained optimization.

In this thesis, we aim to solve three benchmark optimal control problems, namely, the optimal control of the Poisson equation, the optimal control of the convection-diffusion equation, and the optimal control of the Stokes system. Numerically tackling these problems leads to large optimality systems with a saddle point structure. Systems with a saddle point structure are indefinite and, in general, ill-conditioned, thus posing great challenges for iterative solvers seeking to find their solution. Preconditioning the optimality system is a possible strategy to deal with the issue. The main focus of the thesis is therefore to solve the resulting optimality systems with various preconditioners available in the literature and to compare their efficiency. Moreover, additional challenges arise when dealing with convection-diffusion control problems, which we handle effectively by employing the local projection stabilization (LPS) scheme. Furthermore, Axelsson and Neytcheva [40] proposed a preconditioner for efficiently solving large nonlinear coupled multiphysics problems. We successfully apply this preconditioner to the first two benchmark problems with promising results.

Examiner: Jarmo Rantakokko. Subject reviewer: Sverker Holmgren. Supervisor: Maya Neytcheva


perhaps my readers will say that "I had perhaps something to say, but did not know how to express it."

— Fyodor Dostoyevsky, The Idiot


Contents

1 Introduction 5

1.1 Major goal of the study. . . 8

1.2 Problem setting and available tools . . . 8

1.3 Layout of the Thesis . . . 9

2 Mathematical framework for PDE-constrained optimization problems 10

2.1 The saddle point theory . . . 10

2.1.1 The discretized saddle point system . . . 11

2.2 Distributed optimal control of the Poisson equation . . . 12

2.2.1 The weak formulation of the state equation . . . 12

2.2.2 The finite element discretization of the state equation . . . 13

2.2.3 The finite element discretization of the minimization problem . . . 14

2.2.4 First order optimality conditions . . . 15

2.2.5 The reduced optimality system . . . 16

2.3 Distributed optimal control of the convection-diffusion equation . . . 16

2.3.1 Weak formulation of the state equation. . . 17

2.3.2 Stabilization Techniques . . . 18

2.3.2.1 The Petrov-Galerkin formulation: Streamline Upwind Petrov-Galerkin (SUPG) Stabilization . . . 18

2.3.2.2 Local Projection Stabilization (LPS) scheme . . . 19

2.3.3 Finite element discretization of the state equation. . . 21

2.3.4 The finite element discretization of the minimization problem . . . 22

2.3.5 First order optimality conditions . . . 22

2.3.6 The reduced optimality system . . . 22

2.3.7 Solving the convection-diffusion optimality system using the LPS scheme . . . 23

2.4 Distributed optimal control of Stokes equations . . . 23

2.4.1 Weak formulation of the state equation. . . 25

2.4.2 The finite element discretization of the state equation . . . 26

2.4.3 Solvability of the Stokes system . . . 26

2.4.4 The finite element discretization of the minimization problem . . . 27

2.4.5 First order optimality conditions . . . 27

2.4.6 The reduced optimality system . . . 28

2.5 Error estimates for homogeneous distributed optimal control problems . . . 28

3 An overview on iterative methods 29

3.1 Iterative methods . . . 29

3.1.1 Relaxation scheme: Richardson iteration . . . 30

3.1.2 Chebyshev semi-iteration . . . 30

3.2 Projection Methods . . . 30

3.2.1 General framework. . . 31

3.3 Krylov subspace iterative methods . . . 32

3.3.1 The Generalized Minimal Residual (GMRES) Method . . . 32

3.3.2 The Minimal Residual (MINRES) Method . . . 34


4 Preconditioned Krylov subspace iterative methods and multigrid (MG) preconditioners 36

4.1 The idea of preconditioned Krylov subspace iterative methods . . . 37

4.2 Multigrid methods as a preconditioner . . . 38

5 Preconditioners for saddle point systems arising in PDE-constrained optimization problems 39

5.1 Preconditioner construction techniques for saddle point systems . . . 39

5.1.1 (Negative) Schur complement approximation . . . 39

5.1.2 Inexact Schur complement approximation . . . 40

5.1.3 Operator preconditioning with norms . . . 41

5.1.4 Preconditioners based on interpolation theory. . . 42

5.1.5 Schur complement approximation using the commutator argument . . . 43

5.1.6 Operator factorization . . . 43

5.1.6.1 Efficient solution for P_F . . . 44

5.2 Preconditioners for PDE-constrained optimization problems . . . 46

5.2.1 Preconditioners for the distributed optimal control problem constrained by the Poisson equation. . . 46

5.2.1.1 Operator preconditioning with standard norms . . . 46

5.2.1.2 Operator preconditioning with non-standard norms. . . 47

5.2.1.3 Operator preconditioning with interpolation operator . . . 47

5.2.1.4 Schur complement approximation . . . 48

5.2.1.5 Operator factorization . . . 48

5.2.2 Preconditioners for the distributed optimal control problem constrained by the convection-diffusion equation . . . 49

5.2.2.1 (Negative) Schur complement approximation . . . 50

5.2.2.2 Operator preconditioning with non-standard norms. . . 51

5.2.2.3 Operator factorization . . . 51

5.2.3 Preconditioners for distributed optimal control problem constrained by the Stokes system . . . 52

5.2.3.1 Operator preconditioning with standard norms . . . 52

5.2.3.2 Operator preconditioning with non-standard norms. . . 53

5.2.3.3 Inexact Schur complement approximation . . . 53

5.2.3.4 Mesh independent Schur complement approximation . . . 54

5.2.3.5 Parameter independent Schur complement approximation . . . . 55

5.2.3.6 Schur complement approximation using the commutator argument . . . 56

5.3 Overview of spectrally equivalent numerical approximation of discrete differential operators . . . 57

5.3.1 Approximation of the discrete differential operators arising in distributed optimal control of the Poisson equation . . . 58

5.3.2 Approximation of the discrete differential operators arising in distributed optimal control of the convection-diffusion equation . . . 58

5.3.3 Approximation of the discrete differential operators arising in distributed optimal control of the Stokes system . . . 59

6 Numerical results 60

6.1 Distributed optimal control problem constrained by the Poisson equation . . . 60

6.1.1 Block diagonal preconditioner psonP̃_bd1, Pearson and Wathen . . . 61

6.1.2 Block lower triangular preconditioner psonP̃_blt2, Pearson and Wathen . . . 62

6.1.3 Block diagonal preconditioner psonP̃_bd2, Pearson and Wathen . . . 62

6.1.4 Block diagonal preconditioner psonP̃_nsn, W. Zulehner . . . 63

6.1.5 Block factorization preconditioner psonP̃_F, Axelsson and Neytcheva . . . 63

6.1.6 Discussion . . . 64


6.2 Distributed optimal control problem constrained by the convection-diffusion equation 67

6.2.1 Block diagonal preconditioner cdP̃_bd1, Pearson and Wathen . . . 68

6.2.2 Block lower triangular preconditioner cdP̃_blt, Pearson and Wathen . . . 69

6.2.3 Block diagonal preconditioner cdP̃_bd2, Pearson and Wathen . . . 69

6.2.4 Block diagonal preconditioner cdP̃_nsn, W. Zulehner . . . 70

6.2.5 Block factorization preconditioner cdP̃_F, Axelsson and Neytcheva . . . 70

6.2.6 Discussion . . . 71

6.3 Distributed optimal control problem constrained by the Stokes system . . . 74

6.3.1 Block diagonal preconditioner stkP̃_bd1, Rees and Wathen . . . 75

6.3.2 Block diagonal preconditioner stkP̃_comm, Pearson . . . 76

6.3.3 Block diagonal preconditioner psonP̃_nsn, W. Zulehner . . . 77

6.3.4 Discussion . . . 77

7 Summary and conclusion 82

A Results for distributed convection-diffusion control problem for ε = 1/1500 88

B Local projection stabilization (LPS) 91

B.1 The double-glazing problem . . . 92


Mad Hatter: Why is a raven like a writing-desk?

Lewis Carroll, Alice in Wonderland

1 Introduction

PDE-constrained optimization problems appear in a variety of social, scientific, industrial, medical and engineering applications, including optimal control, optimal design and parameter identification. The general formulation of a PDE-constrained optimization problem is given by

min_{y,u} J(y, u)
s.t. C(y, u) = 0,   (1.1)

where J represents the cost functional, C is the PDE constraint, y is the state variable, and u is the decision variable. The PDE constraint, which models the underlying process that is to be controlled, is referred to as the state equation. Depending on the nature of the objective function J and the decision variable u, the PDE-constrained optimization problem (1.1) assumes the nature of an optimal control, optimal design or parameter identification problem.

Correspondingly, u is then referred to as a control, design or parameter identification variable. For an overview on a wide range of PDE-constrained optimization problems, the reader is referred to [19,20,21].

In this thesis, we are concerned with optimal control problems constrained by PDEs. As a brief introduction to the topic, consider a heating source u applied to the surface of a domain Ω to control its temperature y. Let us assume that the temperature distribution y in Ω satisfies the following partial differential equation:

−∆y=u in Ω. (1.2)

Now suppose that we want the domain Ω to acquire a target (desired) temperature distribution ŷ for some forcing term u. In other words, we want to control the heating source u such that the domain under consideration acquires a temperature y that is as close to the target temperature ŷ as possible. This target tracking must be measured in some norm, which in our case is the L²(Ω) norm. The process can now be described in the form of a constrained minimization problem.

Hence we write:

min_y J(y) = (1/2)‖y − ŷ‖²_{L²(Ω)}
s.t. −∆y = u,   (1.3)


where u now represents the control variable.

Let us further generalize the problem by including boundary conditions. Let ∂Ω denote the boundary of Ω. Then

y = g_D on ∂Ω_D,
∂y/∂n = g_N on ∂Ω_N,

where ∂Ω_D represents the Dirichlet boundary and ∂Ω_N the Neumann boundary, ∂y/∂n is the derivative in the outward normal direction of Ω, and g_D and g_N are the Dirichlet and Neumann boundary data, respectively. Further,

∂Ω = ∂Ω_D ∪ ∂Ω_N,  ∂Ω_D ∩ ∂Ω_N = ∅.

The above problem is, in general, ill-posed. Theory suggests adding the norm of the control u, weighted by the so-called Tikhonov regularization parameter (also called the cost parameter) β > 0, to the cost functional. This makes the solution to the problem well-defined. So we now write:

min_{y,u} J(y, u) = (1/2)‖y − ŷ‖²_{L²(Ω)} + (β/2)‖u‖²_{L²(Ω)}
s.t. −∆y = u,
y = g_D on ∂Ω_D,
∂y/∂n = g_N on ∂Ω_N,
∂Ω = ∂Ω_D ∪ ∂Ω_N,  ∂Ω_D ∩ ∂Ω_N = ∅.   (1.4)

One way to deal with this minimization problem and find its optimal solution is through the first order optimality conditions, also known as the Karush-Kuhn-Tucker (KKT) conditions. This introduces another function, the Lagrange multiplier λ (also referred to in the literature as the dual or adjoint state variable). The solutions of the minimization problem for the state function y and the control function u are called the optimal state and the optimal control, respectively. The existence and uniqueness of the optimal solution is not discussed here; the reader is referred to [21].

Numerically dealing with PDE-constrained optimization problems requires two steps, namely, discretization and optimization. Which step is taken first depends on which of the following two approaches is pursued:

• The discretize-then-optimize approach.

• The optimize-then-discretize approach.

In the first approach, we discretize the objective function and the state equation(s), formulate the Lagrangian and the corresponding first order optimality conditions, and then build an optimality system to solve the problem numerically. In the second approach, we first formulate the Lagrangian and its corresponding first order optimality conditions, discretize them, and then solve the problem numerically. For many PDE-constrained optimization problems, especially where the PDE is self-adjoint, both approaches lead to the same discrete optimality system. However, certain PDEs are not self-adjoint, for example, the convection-diffusion equation. Heinkenschloss et al. [23] showed that when streamline upwind Petrov-Galerkin (SUPG) stabilization1 is applied to the optimal control of the convection-diffusion equation, the optimize-then-discretize and

1 If a pure Galerkin approach is used, both approaches will result in a similar algebraic system. However, the solution will have unwanted oscillations.


discretize-then-optimize approaches can lead to different discrete algebraic2 systems, resulting in different solution outcomes. This also poses a challenge in the context of designing efficient preconditioners for such problems.

Whichever approach is taken, the resulting optimality system acquires a saddle point structure, given by:

Ax = b,   (1.5)

where we solve for x, with

A = [ A   B₁ᵀ ]
    [ B₂  −C  ] ∈ R^{n×n}   and   b ∈ R^n.   (1.6)

The problem that we have discussed so far is called a distributed optimal control problem, where the control u is distributed throughout the domain Ω. A related problem is the boundary control problem, where the control acts only on the boundary, so that u ∈ L²(∂Ω). For example, a typical Dirichlet boundary control problem is of the form

min_{y,u} (1/2)∫_Ω (y − ŷ)² dx + (β/2)∫_{∂Ω} u² ds
s.t. −∆y = g in Ω,
y = u on ∂Ω.   (1.7)

Similarly, a Neumann boundary control problem is given by

min_{y,u} (1/2)∫_Ω (y − ŷ)² dx + (β/2)∫_{∂Ω} u² ds
s.t. −∆y = g in Ω,
∂y/∂n = u on ∂Ω.   (1.8)

We consider only distributed optimal control problems. Moreover, we limit ourselves to distributed optimal control problems without constraints on the control and/or the state; such problems are considered in [52,53,54]. Furthermore, time-dependent optimal control problems are not considered here; the interested reader is referred to [56,57].

Remark (The parameter identification problem). In many other applications, reconstruction of the state of a dynamical system from measured data is required. These are referred to as parameter identification or inverse problems. Examples include weather forecasting, oil reservoir modeling and various models of cognitive motor control. Such problems can also be stated as PDE-constrained optimization problems. While these problems are not the focus of this thesis, we give a brief sketch of them.

Consider a thin membrane3 whose deflection y under some force u is described by the Poisson equation

−∇·(σ∇y) = u in Ω,

where σ represents the unobservable (spatially varying) material property of the system, i.e., the coefficient that we want to recover. Suppose we have some observed measurements ŷ of the

2We use the term algebraic here, for in the case of optimize-then-discretize, where the convection-diffusion equation is discretized using the SUPG, there is no finite dimensional problem for which the resulting system can be considered as an optimality system, cf. [23].

3This example is taken from [44].


displacements caused under the influence of certain forces u. The task is to find σ for which the output state y of the system best matches the observations ˆy. This task takes the form of an optimization problem:

min_{y,σ} J(y, σ) = (1/2)‖y − ŷ‖² + βR(σ)
s.t. −∇·(σ∇y) = u in Ω,   (1.9)

where R(σ) is a regularization operator introduced to make the solution well-behaved.

We assumed that the dimensions of the solution space for y and the observation space for ŷ match. In most situations, however, only a few observations are available, which requires introducing an operator Ψ that maps the solution space to the observation space. In this case, we write:

min_{y,σ} J(y, σ) = (1/2)‖Ψy − ŷ‖² + βR(σ)
s.t. −∇·(σ∇y) = u in Ω.   (1.10)

Interested readers can refer to, for instance, [44,45,46].

1.1 Major goal of the study

Due to their robustness, direct solvers are the methods of choice when it comes to solving discrete saddle point systems. However, as the mesh size h gets smaller, the discrete saddle point system becomes large. This results in increased computational requirements which direct solvers, in general, find difficult to cope with. In these circumstances, due to their modest demands on computer resources, iterative solution methods become the feasible choice. These methods are usually based on projection processes onto Krylov subspaces. Examples include the conjugate gradient (CG) method, the minimal residual (MINRES) method, and the generalized minimal residual (GMRES) method. However, due to the indefiniteness and, in general, unfavorable spectral properties of a large discrete saddle point system, iterative methods suffer from a slow rate of convergence, or in some cases fail to converge. In the context of PDE-constrained optimization problems, where some regularization parameter, say β, approaches zero, the issue is further exacerbated. Efficient design of iterative solvers therefore becomes necessary. One way to achieve efficiency in iterative methods is to use a suitable preconditioning strategy to improve the spectral properties of the underlying saddle point system; see, for instance, Axelsson and Neytcheva [1]. Further, saddle point systems possess a certain block structure, which has been utilized for designing efficient preconditioners in various studies such as [17,24,26,35].
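The mesh dependence can be illustrated with a short sketch (ours, not from the thesis): a 1D Laplacian stands in for a discretized PDE, and the iteration count of unpreconditioned CG grows as the mesh is refined.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def cg_iters(n):
    """Unpreconditioned CG iteration count for the 1D Laplacian on n interior nodes."""
    h = 1.0 / (n + 1)
    A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)) / h**2
    b = np.ones(n)
    it = [0]
    spla.cg(A, b, callback=lambda xk: it.__setitem__(0, it[0] + 1))
    return it[0]

# The count grows with n, mirroring the deteriorating spectral properties;
# a good preconditioner makes it (nearly) independent of the mesh size.
print([cg_iters(n) for n in (50, 100, 200, 400)])
```

The same qualitative behavior, only worse, is observed for the indefinite saddle point systems studied in this thesis.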

We will discuss and implement various preconditioning strategies available in the literature for the saddle point systems arising in PDE-constrained optimization problems. Three benchmark problems are considered: the distributed optimal control of the Poisson equation, the distributed optimal control of the convection-diffusion equation, and the distributed optimal control of the Stokes system. We aim to summarize the proposed preconditioning techniques and test the most promising ones on the defined benchmark problems, to provide a fair comparison of the efficiency of the solvers both in terms of the number of iterations and the required computing time.

1.2 Problem setting and available tools

We limit ourselves to two space dimensions. To discretize and create the matrices, the open source FEM library deal.II [58] is used. The package provides tools for constructing the discretization meshes as well as for assembling the necessary FEM matrices. In addition, deal.II


provides a number of preconditioners and iterative solvers, as well as interfaces to other well-established numerical linear algebra packages, such as Trilinos [59], Hypre [60] and PETSc [61].

1.3 Layout of the Thesis

This thesis is organized in the following manner:

• In Chapter 2, using the discretize-then-optimize approach, we describe the mathematical models for our three benchmark distributed optimal control problems. We describe the process of discretizing the objective function and its constraints. Finally, we describe the formulation of the first order optimality conditions, concluding that the process leads to a saddle point system.

• In Chapter 3, we discuss the basic theory of different iterative methods.

• In Chapter 4, we discuss the basic idea and motivation behind preconditioning iterative methods based on Krylov subspaces. The topic of multigrid methods is also touched upon in this chapter.

• In Chapter 5, we give a literature overview of various techniques for constructing preconditioners for saddle point systems arising in different PDE-constrained optimization problems.

• In Chapter 6, we present and discuss the results of our numerical experiments.

• In Chapter 7, we summarize the results and draw conclusions.


"Curiouser and curiouser!" cried Alice

Lewis Carroll, Alice in Wonderland

2 Mathematical framework for PDE-constrained optimization problems

The benchmark PDE-constrained optimization problems that we consider in our study are:

1. The distributed optimal control of the Poisson equation.

2. The distributed optimal control of the convection-diffusion equation.

3. The distributed optimal control of the Stokes system.

We start with the continuous formulation of the minimization problem. Each problem is then tackled using the discretize-then-optimize approach. The starting point is to state the weak formulation of the state equations and then use an appropriate finite element method for discretization1. The continuous minimization problem is then replaced by its discrete counterpart. Introducing the Lagrange multiplier and solving for the optimality conditions yields an optimality system with a saddle point structure. In many cases, this optimality system can be reduced, more specifically, by eliminating the control u. The resulting system is then called the reduced optimality system.

2.1 The saddle point theory

We follow the introductory discussion in [35, 5]. Let V and Λ be Hilbert spaces with inner products (·,·)_V and (·,·)_Λ, associated with the norms ‖·‖_V and ‖·‖_Λ, respectively. The mixed variational formulation reads:

Find y ∈ V and λ ∈ Λ such that

a(y, v) + b(v, λ) = f(v)  ∀v ∈ V,
b(y, µ) − c(λ, µ) = g(µ)  ∀µ ∈ Λ,   (2.1)

1 Although we take the discretize-then-optimize approach here, note that for both the Poisson control problem and the Stokes control problem, the second approach, i.e., optimize-then-discretize, yields the same discrete optimality system. For the convection-diffusion control problem this is not the case; see Heinkenschloss et al. [23]. We elaborate further on the issue in Section 2.3.


where a, b and c are bounded bilinear forms on V×V, V×Λ and Λ×Λ, respectively, whereas f ∈ V′ and g ∈ Λ′ are bounded linear functionals, with V′ and Λ′ being the dual spaces of V and Λ, respectively.

Let us consider the following assumptions:

1. a and c are symmetric, i.e.,

a(w, v) = a(v, w) ∀v, w ∈ V,  c(r, µ) = c(µ, r) ∀µ, r ∈ Λ.   (2.2)

2. a and c are nonnegative, i.e.,

a(v, v) ≥ 0 ∀v ∈ V,  c(µ, µ) ≥ 0 ∀µ ∈ Λ.   (2.3)

If condition (1) holds, then the functional

S(v, µ) = (1/2)a(v, v) + b(v, µ) − (1/2)c(µ, µ) − f(v) − g(µ)   (2.4)

is a convex function of v ∈ V and a concave function of µ ∈ Λ; more on this can be found in [6]. Such a functional S is referred to as a saddle point function. If, additionally, condition (2) holds, then (y, λ) is a solution to (2.1) if and only if (y, λ) is a saddle point of S, i.e.,

S(y, µ) ≤ S(y, λ) ≤ S(v, λ)  ∀v ∈ V, µ ∈ Λ.   (2.5)

Let us now reformulate (2.1) as a (non-mixed) variational problem. Let X be the product space V×Λ, equipped with the inner product ((y, λ), (v, µ))_X = (y, v)_V + (λ, µ)_Λ and the norm ‖(y, λ)‖_X = √((y, λ), (y, λ))_X. Further, let X′ be the dual space of X. Then the variational formulation reads:

Find x₁ = (y, λ) ∈ X such that

B(x₁, x₂) = F(x₂)  ∀x₂ = (v, µ) ∈ X,   (2.6)

where

B(z, x₂) = a(w, v) + b(v, r) + b(w, µ) − c(r, µ),  F(x₂) = f(v) + g(µ),   (2.7)

for x₂ = (v, µ) and z = (w, r).

Associating a linear operator A ∈ L(X, X′) with B, we write

⟨Ax₁, x₂⟩_{X′,X} = B(x₁, x₂).   (2.8)

Finally, using the operator notation, we rewrite (2.6), which now reads:

Ax = F  in X′.   (2.9)

The existence and uniqueness of a solution to (2.9) and (2.6) follows from the book by Babuška and Aziz [2]. In the special case c(·,·) = 0, existence and uniqueness follows from the classical result by Brezzi [3].

2.1.1 The discretized saddle point system

The discretized version of (2.9) leads to a large saddle point system represented by

Ax = F,²   (2.10)

2 Instead of writing A_h x_h = F_h, we find it convenient to reuse the notation of the continuous case (2.9).


or

[ A   B₁ᵀ ] [ x₁ ]   [ f ]
[ B₂  −C  ] [ x₂ ] = [ g ],   (2.11)

where f ∈ R^{n₁}, g ∈ R^{m₁}, A ∈ R^{n₁×n₁}, B₁, B₂ ∈ R^{m₁×n₁}, C ∈ R^{m₁×m₁}, m₁ ≤ n₁. Let us consider the following conditions:

1. A is symmetric, i.e., A=AT.

2. The symmetric part of A, i.e., H ≡ (1/2)(A + Aᵀ), is positive semidefinite.

3. B₁ = B₂ = B, or B is symmetric, i.e., B₁ = B₂ᵀ.

4. C is symmetric, i.e., C = Cᵀ, and positive semidefinite.

If all the above conditions are satisfied, then A is symmetric and indefinite. Indefiniteness implies A has both positive and negative eigenvalues. If either A ≠ Aᵀ (condition 1) or B₁ ≠ B₂ (condition 3), or both, then A is nonsymmetric, in which case A is referred to as a generalized saddle point system3. Moreover, the discretized version of X introduces an additional parameter into the picture, namely the mesh size h. As h decreases, the saddle point system becomes large, while its spectral properties deteriorate. In practice, larger saddle point systems are typically ill-conditioned. Saddle point problems are discussed in more detail in [5].
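A small numerical sketch of this indefiniteness (our own illustration, not thesis code): with A symmetric positive definite, B₁ = B₂ = B of full row rank and C = 0, conditions (1)-(4) hold and the block matrix has exactly n₁ positive and m₁ negative eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, m1 = 8, 3
G = rng.standard_normal((n1, n1))
A = G @ G.T + n1 * np.eye(n1)                       # symmetric positive definite
B = rng.standard_normal((m1, n1))                   # generically full row rank
S = np.block([[A, B.T], [B, np.zeros((m1, m1))]])   # C = 0

eigs = np.linalg.eigvalsh(S)                        # real, since S is symmetric
print((eigs > 0).sum(), (eigs < 0).sum())           # inertia (8, 3): indefinite
```

This is the classical inertia result for saddle point matrices: the system is congruent to diag(A, −BA⁻¹Bᵀ), hence never definite.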

2.2 Distributed optimal control of the Poisson equation

Consider the distributed optimal control problem constrained by the Poisson equation:

min_{y,u} J(y, u) = (1/2)‖y − ŷ‖²_{L²(Ω)} + (β/2)‖u‖²_{L²(Ω)}
s.t. −∆y = u in Ω,
y = g_D on ∂Ω_D,
∂y/∂n = g_N on ∂Ω_N,
∂Ω = ∂Ω_D ∪ ∂Ω_N,  ∂Ω_D ∩ ∂Ω_N = ∅,   (2.12)

where y is the state of the system, ŷ is the desired state of the system to be achieved, and β > 0 is a regularization parameter. Further, ∂Ω represents the boundary of the domain Ω.

2.2.1 The weak formulation of the state equation

The starting point for a finite element discretization is the variational, or weak, formulation of the equations involved. Let us first start with the state equation, which in this case is the Poisson equation. For a weak formulation, we need an appropriate set of test functions v, and a natural function space in which v and the weak solution y can live. For Ω ⊂ R², this space is the Sobolev space

H¹(Ω) = { y : Ω → R | y, ∂y/∂x₁, ∂y/∂x₂ ∈ L²(Ω) },

3 The linearized Navier-Stokes equations are an example.


with the corresponding solution and test spaces given by

H¹_E = { y ∈ H¹(Ω) | y = g_D on ∂Ω_D },
H¹_{E0} = { v ∈ H¹(Ω) | v = 0 on ∂Ω_D }.   (2.13)

Note that while the Dirichlet condition is part of the solution space H¹_E, it vanishes in the test space H¹_{E0}, i.e., the test functions v take the value zero on the Dirichlet boundary. The Neumann boundary condition, in contrast, is not imposed in the solution or test spaces. The weak formulation now reads:

Find y ∈ H¹_E such that

a(y, v) = l(v)  ∀v ∈ H¹_{E0},   (2.14)

where the bilinear and linear forms a(·,·) and l(·) are given by

a(y, v) = ∫_Ω ∇y·∇v,  l(v) = ∫_Ω vu + ∫_{∂Ω_N} v g_N.

2.2.2 The finite element discretization of the state equation

We use the Galerkin approach for the finite element discretization. Let Y^h_0 ⊂ H¹_{E0} be the n-dimensional space of test functions spanned by {φ₁, ..., φ_n}, and let Y^h_E ⊂ H¹_E be the space of trial functions with the same basis {φ₁, ..., φ_n}, extended by {φ_{n+1}, ..., φ_{n+n∂}} to satisfy the Dirichlet boundary data on ∂Ω_D (n∂ being the number of Dirichlet boundary nodes). The finite element approximation y_h ∈ Y^h_E is uniquely determined by the vector of coefficients y = {y₁, ..., y_n}. Thus

y_h = Σ_{i=1}^{n} y_i φ_i + Σ_{i=n+1}^{n+n∂} y_i φ_i.   (2.15)

Note that the space Y^h_E is constructed such that the trial (shape) functions in (2.15) coincide with the corresponding test functions in Y^h_0. This is referred to as the Galerkin method. Using this formulation for our state equation, the finite element discretization reads:

Find y_h ∈ Y^h_E such that

a(y_h, v_h) = l(v_h)  ∀v_h ∈ Y^h_0,   (2.16)

where the bilinear and linear forms a(·,·) and l(·) are given by

a(y_h, v_h) = ∫_Ω ∇y_h·∇v_h,  l(v_h) = ∫_Ω v_h u_h + ∫_{∂Ω_N} v_h g_N.

We discretize the control u using a trial space of its own. Let ψ_i be basis functions such that span{ψ₁, ..., ψ_m} = U^h ⊂ L²(Ω); note that the ψ_i may differ from the φ_i. The Galerkin finite element approximation of the control u is

u_h = Σ_{i=1}^{m} u_i ψ_i,   (2.17)

where the discretized control u_h is uniquely determined by the vector of coefficients u = {u₁, ..., u_m}.


The Galerkin finite element approximation of the state equation can also be written as a system of algebraic equations,

Ky = Qu + d,   (2.18)

with

K = {k_{i,j}} ∈ R^{n×n},  k_{i,j} = ∫_Ω ∇φ_j·∇φ_i,
Q = {q_{i,j}} ∈ R^{n×m},  q_{i,j} = ∫_Ω φ_i ψ_j,
d = {d_i}_{i=1,...,n},  d_i = ∫_{∂Ω_N} φ_i g_N − Σ_{j=n+1}^{n+n∂} y_j ∫_Ω ∇φ_i·∇φ_j,

where K is the stiffness matrix, Q is the mass matrix and d is a vector containing the boundary terms.
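As a concrete, much simplified illustration of these definitions (our own toy code, not the deal.II implementation used in the thesis), the following assembles K and Q for piecewise linear elements on a uniform mesh of Ω = (0, 1), taking ψ_i = φ_i:

```python
import numpy as np

def assemble_1d(num_elems):
    """Assemble the 1D linear-FEM stiffness matrix K (k_ij = ∫ φ_j' φ_i')
    and mass matrix Q (q_ij = ∫ φ_i φ_j) on a uniform mesh of (0, 1)."""
    h = 1.0 / num_elems
    n = num_elems + 1                        # nodes, boundary included
    K = np.zeros((n, n)); Q = np.zeros((n, n))
    Ke = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h        # element stiffness
    Qe = np.array([[2.0, 1.0], [1.0, 2.0]]) * h / 6.0    # element mass
    for e in range(num_elems):               # sum the element contributions
        idx = [e, e + 1]
        K[np.ix_(idx, idx)] += Ke
        Q[np.ix_(idx, idx)] += Qe
    return K, Q

K, Q = assemble_1d(16)
print(np.allclose(K.sum(axis=1), 0.0))   # constants lie in the kernel of K
print(np.isclose(Q.sum(), 1.0))          # mass entries integrate to |Ω| = 1
```

The two printed checks reflect general properties of the stiffness and mass matrices that are used throughout the thesis.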

2.2.3 The finite element discretization of the minimization problem

We replace the continuous minimization problem (2.12) with its discrete counterpart:

min_{y_h,u_h} J(y_h, u_h) = (1/2)‖y_h − ŷ‖²_{L²(Ω)} + (β/2)‖u_h‖²_{L²(Ω)}   (2.19)

s.t. ∫_Ω ∇y_h·∇v_h = ∫_Ω v_h u_h  ∀v_h ∈ Y^h_0.   (2.20)

Now,

‖y_h − ŷ‖² = ∫_Ω (y_h − ŷ)²
           = Σ_i Σ_j y_i y_j ∫_Ω φ_i φ_j − 2 Σ_j y_j ∫_Ω φ_j ŷ + ∫_Ω ŷ²
           = yᵀM_y y − 2yᵀb + C,

where

M_y = {m^y_{i,j}} ∈ R^{n×n},  m^y_{i,j} = ∫_Ω φ_i φ_j  is the state mass matrix,
b = {b_i} ∈ R^n,  b_i = ∫_Ω ŷ φ_i,  and C is a constant.

Moreover,

‖u_h‖² = uᵀM_u u,

where

M_u = {m^u_{i,j}} ∈ R^{m×m},  m^u_{i,j} = ∫_Ω ψ_i ψ_j  is the control mass matrix.

Further, we know that (2.20) can be written in the form (2.18), i.e.,

Ky = Qu + d.   (2.21)


Thus, we have the minimization problem

min_{y,u} J(y, u) = (1/2)yᵀM_y y − yᵀb + (β/2)uᵀM_u u
s.t. Ky = Qu + d,   (2.22)

which is an equivalent representation of the minimization problem (2.19).

Finally, we introduce the Lagrange multiplier λ. The finite element discretization of λ uses the same function space as the state y, so we write4

λ_h = Σ_{i=1}^{n} λ_i φ_i + Σ_{i=n+1}^{n+n∂} λ_i φ_i.   (2.23)

The discretized Lagrange multiplier λ_h is uniquely determined by the vector of coefficients λ = {λ₁, ..., λ_n}. λ is also called the adjoint state variable, since it satisfies

Kᵀλ = b − M_y y.   (2.24)

2.2.4 First order optimality conditions

We write the discrete Lagrangian function:

L(y, u, λ) = (1/2)yᵀM_y y − yᵀb + (β/2)uᵀM_u u − λᵀ(Ky − Qu − d).   (2.25)

The first order optimality conditions, or Karush-Kuhn-Tucker (KKT) conditions, are obtained by differentiating L with respect to the state y, the control u and the Lagrange multiplier λ, resulting in the so-called adjoint, gradient and state equations:

∇_y L(y, u, λ) = M_y y − b + Kᵀλ = 0,   (2.26)
∇_u L(y, u, λ) = βM_u u − Qᵀλ = 0,   (2.27)
∇_λ L(y, u, λ) = Ky − Qu − d = 0.   (2.28)

The resulting optimality system is given by

A_F [ y ]   [ M_y  0     Kᵀ  ] [ y ]   [ b ]
    [ u ] ≡ [ 0    βM_u  −Qᵀ ] [ u ] = [ 0 ]
    [ λ ]   [ K    −Q    0   ] [ λ ]   [ d ].   (2.29)

If the state y and the control u are discretized using the same basis, then M = M_y = M_u = Q and the stiffness matrix K is symmetric, i.e., K = Kᵀ. We can then write

A_F [ y ]   [ M  0   Kᵀ ] [ y ]   [ b ]
    [ u ] ≡ [ 0  βM  −M ] [ u ] = [ 0 ]
    [ λ ]   [ K  −M  0  ] [ λ ]   [ d ].   (2.30)

Clearly (2.30) has a saddle point structure, i.e., it has the form

A_F = [ A  Bᵀ ]
      [ B  −C ],

�A BT B −C

4 Note, however, that Σ_{i=n+1}^{n+n∂} λ_i φ_i = 0 follows from [55].


where

A = [ M  0  ]
    [ 0  βM ],   B = [ K  −M ]   and   C = 0.

Remark. Under the assumption that the test and trial spaces are the same, the stiffness matrix K is symmetric, K = Kᵀ. Then, as discussed in (Chapter 4, [18]), for any vector v ∈ R^n,

vᵀKv ≥ 0,   (2.31)

implying that K is at least positive semidefinite. Moreover, if ∂Ω_D ≠ ∅, then K is positive definite. In the case of pure Neumann boundary conditions, K is only positive semidefinite, and for a solution of (2.18) to exist, the compatibility relation

Σ_i ( ∫_Ω φ_i u + ∫_{∂Ω} φ_i g_N ) = 0   (2.32)

must hold.

Further, for any vector v ∈ R^n, v ≠ 0, there holds

vᵀMv > 0,   (2.33)

i.e., the mass matrix M is symmetric positive definite.

2.2.5 The reduced optimality system

If the state $y$, the control $u$ and the adjoint $\lambda$ are discretized using the same finite element spaces, then it is possible to eliminate the gradient equation using the relation
$$u = \frac{1}{\beta} \lambda.$$
This gives us
$$\mathcal{A}_R \begin{bmatrix} y \\ \lambda \end{bmatrix} \equiv \begin{bmatrix} M & K^T \\ K & -\frac{1}{\beta} M \end{bmatrix} \begin{bmatrix} y \\ \lambda \end{bmatrix} = \begin{bmatrix} b \\ d \end{bmatrix}. \qquad (2.34)$$
Note that $\mathcal{A}_R$ also has the saddle point structure, i.e., it has the form
$$\mathcal{A}_R = \begin{bmatrix} A & B^T \\ B & -C \end{bmatrix},$$
where
$$A = M, \quad B = K \quad \text{and} \quad C = \frac{1}{\beta} M.$$

2.3 Distributed optimal control of the convection-diffusion equation

Convection-diffusion models describe physical processes involving both diffusion and transport; one such example is the spread of a pollutant carried by wind. The process is described through the following equation:
$$-\varepsilon \Delta y + (\vec{w} \cdot \nabla) y = u \ \text{in } \Omega, \quad y = g_D \ \text{on } \partial\Omega_D, \quad \frac{\partial y}{\partial n} = g_N \ \text{on } \partial\Omega_N,$$
$$\partial\Omega = \partial\Omega_D \cup \partial\Omega_N, \quad \partial\Omega_D \cap \partial\Omega_N = \emptyset, \qquad (2.35)$$
where $\vec{w}$ is some vector field (for instance, the direction of the wind) and $g_D$ and $g_N$ are the given boundary data. Further, we assume that $\vec{w}$ is divergence free, i.e., $\nabla \cdot \vec{w} = 0$. The scalar quantity $\varepsilon$ (for instance, viscosity) satisfies $0 < \varepsilon \leq 1$; the smaller its value, the more convection-dominated the problem becomes.

Note that when $\vec{w} \neq 0$, the problem is not self-adjoint. This simply means that if we define the convection-diffusion operator by
$$\mathcal{L} = -\varepsilon \nabla^2 + \vec{w} \cdot \nabla, \qquad (2.36)$$
then
$$\int_\Omega (\mathcal{L}u)\, v \neq \int_\Omega u\, (\mathcal{L}v). \qquad (2.37)$$
The consequence is that the coefficient matrix obtained through discretization is no longer symmetric. Further, when it comes to solving convection-diffusion control problems, discretize-then-optimize and optimize-then-discretize yield different outcomes. One way to mitigate this issue is to employ the subgrid modeling and local projection stabilization (LPS) schemes, as discussed in [30, 29, 28].

In a convection-diffusion control problem, the aim is to find a control $u$ such that the state $y$ attains a desired state $\hat{y}$. The problem can be defined as the minimization of a cost functional with a convection-diffusion PDE stated as a constraint. The problem can be formulated as:
$$\min_{y,u}\; J(y,u) = \tfrac{1}{2} \| y - \hat{y} \|^2_{L^2(\Omega)} + \tfrac{\beta}{2} \| u \|^2_{L^2(\Omega)}$$
$$\text{s.t.} \quad -\varepsilon \Delta y + (\vec{w} \cdot \nabla) y = u \ \text{in } \Omega, \quad y = g_D \ \text{on } \partial\Omega_D, \quad \frac{\partial y}{\partial n} = g_N \ \text{on } \partial\Omega_N,$$
$$\partial\Omega = \partial\Omega_D \cup \partial\Omega_N, \quad \partial\Omega_D \cap \partial\Omega_N = \emptyset. \qquad (2.38)$$

2.3.1 Weak formulation of the state equation

The state equation in this case is the convection-diffusion equation. Before we define the weak formulation, let us first define the space where the trial and test functions live. If $\Omega \subset \mathbb{R}^2$, this space is the Sobolev space
$$\mathcal{H}^1(\Omega) = \left\{ y : \Omega \to \mathbb{R} \;\middle|\; y,\, \frac{\partial y}{\partial x_1},\, \frac{\partial y}{\partial x_2} \in L^2(\Omega) \right\},$$
whereas its residents, i.e., the trial and test spaces, are
$$\mathcal{H}^1_E = \{ y \in \mathcal{H}^1(\Omega) \mid y = g_D \ \text{on } \partial\Omega_D \}, \quad \mathcal{H}^1_{E_0} = \{ v \in \mathcal{H}^1(\Omega) \mid v = 0 \ \text{on } \partial\Omega_D \}.$$
The weak formulation of the convection-diffusion equation reads: find $y \in \mathcal{H}^1_E$ such that
$$a(y, v) = l(v) \quad \forall v \in \mathcal{H}^1_{E_0}. \qquad (2.39)$$
The bilinear form $a(\cdot,\cdot)$ on the left side is given by
$$a(y, v) = \varepsilon \int_\Omega \nabla y \cdot \nabla v + \int_\Omega (\vec{w} \cdot \nabla y)\, v. \qquad (2.40)$$
Similarly, the linear form $l(\cdot)$ on the right side is given by
$$l(v) = \int_\Omega u\, v + \varepsilon \int_{\partial\Omega_N} g_N\, v. \qquad (2.41)$$


2.3.2 Stabilization Techniques

The finite element discretization of the convection-diffusion equation is not straightforward. As discussed in [18], when the convection term dominates, the discrete solution is inaccurate due to the presence of non-physical oscillations. To further understand the issue, we quantify the relative effect of the convection and diffusion terms by the Peclet number. Let the velocity $\vec{w}$ be represented by
$$\vec{w} = C_0 \vec{w}^*, \qquad (2.42)$$
where $C_0$ is a positive constant and $\vec{w}^*$ is normalized to have a value of one in some norm. Then the Peclet number is given by
$$\mathcal{P} = \frac{C_0 \mathcal{L}}{\varepsilon}, \qquad (2.43)$$
where $\mathcal{L}$ is the characteristic length scale of the problem.

If $\mathcal{P} \leq 1$, the problem is diffusion dominated and the solution may be within an acceptable range of accuracy; nonetheless, it is best to have $\mathcal{P} \to 0$.

If $\mathcal{P} \gg 1$, the solution can have steep gradients, requiring a very fine mesh to resolve them correctly. If the mesh is not fine enough, the standard Galerkin finite element discretization will not work desirably.

Different approaches have been proposed to tame the unwarranted oscillations in processes governed by a convection-diffusion equation. One such method is the streamline upwind Petrov-Galerkin (SUPG) method. However, when SUPG is used in the context of optimal control of the convection-diffusion equation, optimize-then-discretize and discretize-then-optimize result in structurally different algebraic systems, leading to different solution outcomes and preconditioning strategies. In order to overcome this issue, Becker and Vexler in [29] proposed a method referred to as the local projection stabilization (LPS) scheme, which borrows some ideas from the subgrid modeling method discussed in [30, 31]. Pearson and Wathen in [28] used the LPS scheme, resulting in an optimality system that has the same structure whether optimize-then-discretize or discretize-then-optimize is used.

2.3.2.1 The Petrov-Galerkin formulation: Streamline Upwind Petrov-Galerkin (SUPG) Stabilization

In this formulation, the trial and test spaces are different: the trial space is given by $Y^h_E \subset \mathcal{H}^1_E$, and the test space consists of functions of the form
$$v_h + \delta\, \vec{w} \cdot \nabla v_h, \qquad (2.44)$$
where $v_h \in V^h_{E_0}$ and $\delta$ is a constant that needs to be calculated a priori. Using this formulation, the bilinear form $a_{pg}(\cdot,\cdot)$ is given by
$$a_{pg}(y_h, v_h) = \varepsilon \int_\Omega \nabla y_h \cdot \nabla v_h + \int_\Omega (\vec{w} \cdot \nabla y_h)\, v_h + \delta \int_\Omega (\vec{w} \cdot \nabla y_h)(\vec{w} \cdot \nabla v_h) - \varepsilon \delta \int_\Omega (\nabla^2 y_h)(\vec{w} \cdot \nabla v_h). \qquad (2.45)$$
In the case of linear and bilinear elements, the higher order term is zero. Thus
$$a_{pg}(y_h, v_h) = \varepsilon \int_\Omega \nabla y_h \cdot \nabla v_h + \int_\Omega (\vec{w} \cdot \nabla y_h)\, v_h + \delta \int_\Omega (\vec{w} \cdot \nabla y_h)(\vec{w} \cdot \nabla v_h). \qquad (2.46)$$
Similarly, the linear form $l_{pg}(\cdot)$ is given by
$$l_{pg}(v_h) = \int_\Omega u\, v_h + \varepsilon \int_{\partial\Omega_N} g_N\, v_h + \delta \int_\Omega u\, (\vec{w} \cdot \nabla v_h). \qquad (2.47)$$

The Petrov-Galerkin finite element method, also called the streamline upwind Petrov-Galerkin (SUPG) scheme, reads: find $y_h \in Y^h_E$ such that
$$a_{pg}(y_h, v_h) = l_{pg}(v_h) \quad \forall v_h \in V^h_{E_0}. \qquad (2.48)$$
To find the value of $\delta$, we let $\delta_k$ be defined locally on individual elements. As suggested in [18, 24], we have
$$\delta_k = \begin{cases} \dfrac{h_k}{2\,\|\vec{w}\|_2} \left( 1 - \dfrac{1}{\mathcal{P}^h_k} \right) & \text{if } \mathcal{P}^h_k > 1, \\[1ex] 0 & \text{otherwise}, \end{cases} \qquad (2.49)$$
where
$$\mathcal{P}^h_k = \frac{\|\vec{w}\|_2\, h_k}{2\varepsilon}$$
and $h_k$ is measured in the direction of the wind. For example, as described in [24], if we have a rectangular domain with sides given by $h_x$ and $h_y$ and $\vec{w} = [\cos\theta, \sin\theta]$, then
$$h_k = \min\left( \frac{h_x}{|\cos\theta|},\, \frac{h_y}{|\sin\theta|} \right).$$
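A small helper evaluating (2.49) for a rectangular element could look as follows; the function name and arguments are illustrative, and the wind is taken as $\vec{w} = C_0(\cos\theta, \sin\theta)$ as in the example above.

```python
import numpy as np

def supg_delta(hx, hy, theta, eps, C0=1.0):
    """SUPG parameter of (2.49) for a rectangular element with sides
    hx, hy, wind w = C0*(cos(theta), sin(theta)) and diffusion eps.
    (Hypothetical helper following the choices in [18, 24].)"""
    w = C0 * np.array([np.cos(theta), np.sin(theta)])
    wn = np.linalg.norm(w)                  # ||w||_2
    # element length h_k measured in the wind direction
    cands = []
    if abs(w[0]) > 1e-14:
        cands.append(hx / abs(np.cos(theta)))
    if abs(w[1]) > 1e-14:
        cands.append(hy / abs(np.sin(theta)))
    hk = min(cands)
    Pk = wn * hk / (2 * eps)                # element Peclet number
    return (hk / (2 * wn)) * (1 - 1 / Pk) if Pk > 1 else 0.0

d_conv = supg_delta(0.1, 0.1, np.pi / 4, eps=1e-3)  # convection-dominated
d_diff = supg_delta(0.1, 0.1, np.pi / 4, eps=1.0)   # diffusion-dominated
print(d_conv, d_diff)
```

Note that $\delta_k$ vanishes on elements where diffusion dominates, so the stabilization switches itself off exactly where the standard Galerkin method is already adequate.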

2.3.2.2 Local Projection Stabilization (LPS) scheme

We follow the discussion of the LPS scheme in [29]. Consider a two-dimensional mesh $\mathcal{T}_H$ consisting of open cells, which in the present case are quadrilaterals.

Figure 2.1: $\mathcal{T}_H$ with 4 cells.

Uniform refinement of the mesh $\mathcal{T}_H$ gives us a refined mesh $\mathcal{T}_h$. For instance, if $\mathcal{T}_H$ consists of 4 cells, a uniform refinement of it will create 16 cells on $\mathcal{T}_h$. Each refined cell of $\mathcal{T}_H$ creates 4 cells in $\mathcal{T}_h$, referred to as a patch $P$; so we will have 4 patches in this case.


Figure 2.2: Refined mesh $\mathcal{T}_h$ with 16 cells and 4 patches (each patch portrayed by a different color).

LPS uses a standard finite element discretization with stabilization based on local projections. The sequel requires us to define an $L^2(P)$ orthogonal projection operator (see footnote 5)
$$P_h : L^2(\Omega) \to V^{const}_H, \qquad (2.50)$$
on the patches of the domain, with $V^{const}_H$ being the space of cell-wise (see footnote 6) constant functions on the patches.

Further, this operator satisfies the following stability and approximation properties on the patches, cf. [29]:
$$\| P_h v \|_{L^2(P)} \leq c\, \| v \|_{L^2(P)} \quad \forall v \in L^2(P), \qquad (2.51)$$
$$\| v - P_h v \|_{L^2(P)} \leq c\, h\, \| \nabla v \|_{L^2(P)} \quad \forall v \in \mathcal{H}^1(P). \qquad (2.52)$$

We now introduce a positive stabilization parameter $\delta$ associated with a symmetric bilinear stabilization form $\tau^\delta_h : V_h \times V_h \to \mathbb{R}$ given by
$$\tau^\delta_h(u_h, v_h) = \delta \int_\Omega \big( \vec{w} \cdot \nabla u_h - P_h(\vec{w} \cdot \nabla u_h) \big)\, \big( \vec{w} \cdot \nabla v_h - P_h(\vec{w} \cdot \nabla v_h) \big). \qquad (2.53)$$
The stabilization parameter $\delta$ is defined locally on individual elements and depends on the Peclet number
$$\mathcal{P}^h_k = \frac{h_k\, \|\vec{w}\|}{\varepsilon}.$$
Following Becker and Vexler in [29], we have
$$\delta_k = \begin{cases} \dfrac{h_k}{\|\vec{w}\|} & \text{if } \mathcal{P}^h_k > 1, \\[1ex] 0 & \text{otherwise}. \end{cases} \qquad (2.54)$$

An algorithm to solve the convection-diffusion equation based on the LPS scheme is discussed in Appendix B.

Footnote 5: Note that an orthogonal projection operator gives a good average approximation of a function, compared to an interpolation operator. However, both operators have difficulty approximating highly oscillatory or discontinuous functions.

Footnote 6: There are four cells in a patch in our considered example.

Remark. The $L^2(P)$ orthogonal projection is computed on the gradient of the approximated solution $\tilde{u}_h$ on each patch $P \in \mathcal{T}_H$, via
$$P_h(\vec{w} \cdot \nabla \tilde{u}_h): \quad M_{P_h} \xi = (\vec{w} \cdot \nabla \tilde{u}_H, v_H), \qquad (2.55)$$
where $M_{P_h}$ is the standard mass matrix assembled on $\mathcal{T}_H$ and $\xi \in V^{const}_H$ is the coefficient vector of the cell-wise constant function on the patches. The idea here is to capture sharp oscillations in the gradient of the approximated solution $\tilde{u}_H$ on the patches. These captured oscillations $\xi$ are then interpolated to the fine mesh $\mathcal{T}_h$ and added to the computed solution $\tilde{u}_h$.
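On a uniform mesh with equal-area cells, the $L^2$-projection onto patch-wise constants reduces to a plain patch average, so $M_{P_h}$ need not be formed explicitly in that case. The sketch below illustrates the operator $P_h$ acting on cell data of a small structured mesh; the function name and the 4-by-4 example data are illustrative.

```python
import numpy as np

def lps_patch_projection(g, p=2):
    """L2-projection of cell-wise data g (e.g. values of w . grad(u_h)
    on the cells of T_h, uniform grid) onto functions constant on each
    p x p patch; for equal cell areas this is just the patch mean."""
    ny, nx = g.shape
    assert ny % p == 0 and nx % p == 0
    means = g.reshape(ny // p, p, nx // p, p).mean(axis=(1, 3))
    return np.kron(means, np.ones((p, p)))   # patch means on fine cells

g = np.arange(16, dtype=float).reshape(4, 4)  # illustrative cell values
Pg = lps_patch_projection(g)
fluct = g - Pg          # fluctuation penalized by the form (2.53)
print(Pg[0, 0])         # mean of cells {0, 1, 4, 5} -> 2.5
```

The fluctuation $g - P_h g$ has zero mean on every patch, which is the quantity that (2.53) penalizes: smooth components pass through unchanged while sharp sub-patch oscillations are damped.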

2.3.3 Finite element discretization of the state equation

Let $V^h_0 \subset \mathcal{H}^1_{E_0}$ be the $n$-dimensional vector space of test functions spanned by $\{\varphi_1, \ldots, \varphi_n\}$. Let $Y^h_E \subset \mathcal{H}^1_E$ be the vector space of trial functions spanned by $\{\varphi_1, \ldots, \varphi_n\}$ and extended further by $\{\varphi_{n+1}, \ldots, \varphi_{n+n_\partial}\}$ to satisfy the Dirichlet boundary data on $\partial\Omega_D$. The finite element approximation $y_h \in Y^h_E$ is uniquely determined by the vector of coefficients $y = \{y_1, \ldots, y_n\}$. So we write
$$y_h = \sum_{i=1}^{n} y_i \varphi_i + \sum_{i=n+1}^{n+n_\partial} y_i \varphi_i. \qquad (2.56)$$
We discretize the control $u$ using a trial space of its own. Let $\psi_i$ be the set of basis functions such that $\mathrm{span}\{\psi_1, \ldots, \psi_m\} =: U_h \subset L^2(\Omega)$. Then the finite element approximation of the control $u$ is given by
$$u_h = \sum_{i=1}^{m} u_i \psi_i, \qquad (2.57)$$
where $u_h$ is uniquely determined by the vector of coefficients $u = \{u_1, \ldots, u_m\}$.

The finite element approximation of the state equation can be written as a system of algebraic equations given by
$$F y = M u + d, \qquad (2.58)$$
where $F = \varepsilon K + N + T$. Further,
$$M = \{m_{ij}\}_{i,j=1,\ldots,n}, \quad m_{ij} = \int_\Omega \varphi_i\, \varphi_j,$$
$$K = \{k_{ij}\}_{i,j=1,\ldots,n}, \quad k_{ij} = \int_\Omega \nabla\varphi_i \cdot \nabla\varphi_j,$$
$$N = \{n_{ij}\}_{i,j=1,\ldots,n}, \quad n_{ij} = \int_\Omega (\vec{w} \cdot \nabla\varphi_j)\, \varphi_i,$$
$$T = \{\tau^\delta_{h,ij}\}_{i,j=1,\ldots,n}, \quad \tau^\delta_{h,ij} = \delta \int_\Omega \big( \vec{w} \cdot \nabla\varphi_i - P_h(\vec{w} \cdot \nabla\varphi_i) \big)\, \big( \vec{w} \cdot \nabla\varphi_j - P_h(\vec{w} \cdot \nabla\varphi_j) \big),$$
$$d = \{d_i\}_{i=1,\ldots,n}, \quad d_i = -\sum_{j=n+1}^{n+n_\partial} y_j \int_\Omega \nabla\varphi_i \cdot \nabla\varphi_j.$$
Here, $M$ is the mass matrix, $K$ is the stiffness matrix, $N$ contains the convection term (skew-symmetric), $d$ contains the boundary terms (assuming $g_N = 0$), and $T$ is the LPS stabilization term.

2.3.4 The finite element discretization of the minimization problem

We replace the continuous minimization problem (2.38) with its discrete counterpart:
$$\min_{y,u}\; \tfrac{1}{2}\, y^T M y - y^T b + \tfrac{\beta}{2}\, u^T M u \quad \text{s.t.} \quad F y = M u + d, \qquad (2.59)$$
where $b$ is given by
$$b = \{b_i\}_{i=1,\ldots,n}, \quad b_i = \int_\Omega \hat{y}\, \varphi_i \, d\Omega.$$
We now introduce the Lagrange multiplier $\lambda$, discretizing it using the functional space of the state $y$. So we obtain
$$\lambda_h = \sum_{i=1}^{n} \lambda_i \varphi_i + \sum_{i=n+1}^{n+n_\partial} \lambda_i \varphi_i. \qquad (2.60)$$
The discretized Lagrange multiplier (adjoint state) $\lambda_h$ is uniquely determined by the vector of coefficients $\lambda = \{\lambda_1, \ldots, \lambda_n\}$.

2.3.5 First order optimality conditions

We formulate the discrete Lagrangian function given by
$$\mathcal{L}(y, u, \lambda) = \tfrac{1}{2}\, y^T M y - y^T b + \tfrac{\beta}{2}\, u^T M u - \lambda^T (F y - M u - d). \qquad (2.61)$$
By definition, the optimal point of $\mathcal{L}$ is obtained when its partial derivatives with respect to $y$, $u$ and $\lambda$ are set to zero. This leads to the following optimality system:
$$\mathcal{A}_F \begin{bmatrix} y \\ u \\ \lambda \end{bmatrix} \equiv \begin{bmatrix} M & 0 & F^T \\ 0 & \beta M & -M \\ F & -M & 0 \end{bmatrix} \begin{bmatrix} y \\ u \\ \lambda \end{bmatrix} = \begin{bmatrix} b \\ 0 \\ d \end{bmatrix}. \qquad (2.62)$$

Note the difference between the systems (2.30) and (2.62). Whereas $K$ is a discrete Laplacian and thus symmetric positive definite (spd), $F$ is non-symmetric, as it contains the discrete counterparts of the convection (skew-symmetric) and stabilization terms.

We can see that (2.62) has a saddle point structure, i.e., it has the form
$$\mathcal{A}_F = \begin{bmatrix} A & B^T \\ B & 0 \end{bmatrix},$$
where
$$A = \begin{bmatrix} M & 0 \\ 0 & \beta M \end{bmatrix}, \quad B = \begin{bmatrix} F & -M \end{bmatrix} \quad \text{and} \quad C = 0.$$

2.3.6 The reduced optimality system

If the state $y$, the control $u$ and the adjoint $\lambda$ are discretized using the same finite element spaces, then it is possible to eliminate the gradient equation in (2.62) using the relation
$$u = \frac{1}{\beta} \lambda,$$
