
IT 15 023

Degree project (Examensarbete) 30 hp, March 2015

Performance comparisons of preconditioned iterative methods for problems arising in PDE-constrained optimization

Shiraz Farouq


Faculty of Science and Technology (Teknisk-naturvetenskaplig fakultet), UTH unit
Visiting address: Ångströmlaboratoriet, Lägerhyddsvägen 1, Hus 4, Plan 0
Postal address: Box 536, 751 21 Uppsala
Telephone: 018 – 471 30 03
Fax: 018 – 471 30 00
Web page: http://www.teknat.uu.se/student

Abstract

Performance comparisons of preconditioned iterative methods for problems arising in PDE-constrained optimization

Shiraz Farouq

The governing dynamics of simple and complex processes, whether physical, biological, social, economic, engineering, or even purely imagined, can be studied via numerical simulations of mathematical models. These models in many cases consist of one or, frequently, several coupled partial differential equations (PDEs). In many applications, the aim of such simulations is not only to study the behavior of the underlying processes, but also to optimize or control them in some optimal way. These are referred to as optimal control problems constrained by PDEs and are stated in the form of a constrained minimization problem. The general framework under which such problems are studied is referred to as PDE-constrained optimization.

In this thesis, we aim to solve three benchmark optimal control problems, namely, the optimal control of the Poisson equation, the optimal control of the convection-diffusion equation, and the optimal control of the Stokes system. Numerically tackling these problems leads to large optimality systems with a saddle point structure. Systems with a saddle point structure are indefinite and, in general, ill-conditioned, thus posing great challenges for iterative solvers seeking to find their solution. Preconditioning the optimality system is a possible strategy to deal with the issue. The main focus of the thesis is therefore to solve the resulting optimality systems with various preconditioners available in the literature and to compare their efficiency. Moreover, additional challenges arise when dealing with convection-diffusion control problems, which we handle effectively by employing the local projection stabilization (LPS) scheme. Furthermore, Axelsson and Neytcheva [40] proposed a preconditioner for efficiently solving large nonlinear coupled multiphysics problems. We successfully apply this preconditioner to the first two benchmark problems with promising results.

Examiner: Jarmo Rantakokko. Subject reviewer: Sverker Holmgren. Supervisor: Maya Neytcheva


perhaps my readers will say that "I had perhaps something to say, but did not know how to express it."

— Fyodor Dostoyevsky, The Idiot


Contents

1 Introduction 5

1.1 Major goal of the study. . . 8

1.2 Problem setting and available tools . . . 8

1.3 Layout of the Thesis . . . 9

2 Mathematical framework for PDE-constrained optimization problems 10

2.1 The saddle point theory . . . 10

2.1.1 The discretized saddle point system . . . 11

2.2 Distributed optimal control of the Poisson equation . . . 12

2.2.1 The weak formulation of the state equation . . . 12

2.2.2 The finite element discretization of the state equation . . . 13

2.2.3 The finite element discretization of the minimization problem . . . 14

2.2.4 First order optimality conditions . . . 15

2.2.5 The reduced optimality system . . . 16

2.3 Distributed optimal control of the convection-diffusion equation . . . 16

2.3.1 Weak formulation of the state equation. . . 17

2.3.2 Stabilization Techniques . . . 18

2.3.2.1 The Petrov-Galerkin formulation: Streamline Upwind Petrov-Galerkin (SUPG) Stabilization . . . 18

2.3.2.2 Local Projection Stabilization (LPS) scheme . . . 19

2.3.3 Finite element discretization of the state equation. . . 21

2.3.4 The finite element discretization of the minimization problem . . . 22

2.3.5 First order optimality conditions . . . 22

2.3.6 The reduced optimality system . . . 22

2.3.7 Solving the convection-diffusion optimality system using the LPS scheme . . . 23

2.4 Distributed optimal control of Stokes equations . . . 23

2.4.1 Weak formulation of the state equation. . . 25

2.4.2 The finite element discretization of the state equation . . . 26

2.4.3 Solvability of the Stokes system . . . 26

2.4.4 The finite element discretization of the minimization problem . . . 27

2.4.5 First order optimality conditions . . . 27

2.4.6 The reduced optimality system . . . 28

2.5 Error estimates for homogeneous distributed optimal control problems . . . 28

3 An overview on iterative methods 29

3.1 Iterative methods . . . 29

3.1.1 Relaxation scheme: Richardson iteration . . . 30

3.1.2 Chebyshev semi-iteration . . . 30

3.2 Projection Methods . . . 30

3.2.1 General framework. . . 31

3.3 Krylov subspace iterative methods . . . 32

3.3.1 The Generalized Minimal Residual (GMRES) Method . . . 32

3.3.2 The Minimal Residual (MINRES) Method . . . 34


4 Preconditioned Krylov subspace iterative methods and multigrid (MG) preconditioners 36

4.1 The idea of preconditioned Krylov subspace iterative methods . . . 37

4.2 Multigrid methods as a preconditioner . . . 38

5 Preconditioners for saddle point systems arising in PDE-constrained optimization problems 39

5.1 Preconditioner construction techniques for saddle point systems . . . 39

5.1.1 (Negative) Schur complement approximation . . . 39

5.1.2 Inexact Schur complement approximation . . . 40

5.1.3 Operator preconditioning with norms . . . 41

5.1.4 Preconditioners based on interpolation theory. . . 42

5.1.5 Schur complement approximation using the commutator argument . . . 43

5.1.6 Operator factorization . . . 43

5.1.6.1 Efficient solution for P_F . . . 44

5.2 Preconditioners for PDE-constrained optimization problems . . . 46

5.2.1 Preconditioners for the distributed optimal control problem constrained by the Poisson equation. . . 46

5.2.1.1 Operator preconditioning with standard norms . . . 46

5.2.1.2 Operator preconditioning with non-standard norms. . . 47

5.2.1.3 Operator preconditioning with interpolation operator . . . 47

5.2.1.4 Schur complement approximation . . . 48

5.2.1.5 Operator factorization . . . 48

5.2.2 Preconditioners for the distributed optimal control problem constrained by the convection-diffusion equation . . . 49

5.2.2.1 (Negative) Schur complement approximation . . . 50

5.2.2.2 Operator preconditioning with non-standard norms. . . 51

5.2.2.3 Operator factorization . . . 51

5.2.3 Preconditioners for distributed optimal control problem constrained by the Stokes system . . . 52

5.2.3.1 Operator preconditioning with standard norms . . . 52

5.2.3.2 Operator preconditioning with non-standard norms. . . 53

5.2.3.3 Inexact Schur complement approximation . . . 53

5.2.3.4 Mesh independent Schur complement approximation . . . 54

5.2.3.5 Parameter independent Schur complement approximation . . . . 55

5.2.3.6 Schur complement approximation using the commutator argument . . . 56

5.3 Overview of spectrally equivalent numerical approximation of discrete differential operators . . . 57

5.3.1 Approximation of the discrete differential operators arising in distributed optimal control of the Poisson equation . . . 58

5.3.2 Approximation of the discrete differential operators arising in distributed optimal control of the convection-diffusion equation . . . 58

5.3.3 Approximation of the discrete differential operators arising in distributed optimal control of the Stokes system . . . 59

6 Numerical results 60

6.1 Distributed optimal control problem constrained by the Poisson equation . . . 60

6.1.1 Block diagonal preconditioner psonP̃_bd1, Pearson and Wathen . . . 61

6.1.2 Block lower triangular preconditioner psonP̃_blt2, Pearson and Wathen . . . 62

6.1.3 Block diagonal preconditioner psonP̃_bd2, Pearson and Wathen . . . 62

6.1.4 Block diagonal preconditioner psonP̃_nsn, W. Zulehner . . . 63

6.1.5 Block factorization preconditioner psonP̃_F, Axelsson and Neytcheva . . . 63

6.1.6 Discussion . . . 64


6.2 Distributed optimal control problem constrained by the convection-diffusion equation 67

6.2.1 Block diagonal preconditioner cdP̃_bd1, Pearson and Wathen . . . 68

6.2.2 Block lower triangular preconditioner cdP̃_blt, Pearson and Wathen . . . 69

6.2.3 Block diagonal preconditioner cdP̃_bd2, Pearson and Wathen . . . 69

6.2.4 Block diagonal preconditioner cdP̃_nsn, W. Zulehner . . . 70

6.2.5 Block factorization preconditioner cdP̃_F, Axelsson and Neytcheva . . . 70

6.2.6 Discussion . . . 71

6.3 Distributed optimal control problem constrained by the Stokes system . . . 74

6.3.1 Block diagonal preconditioner stkP̃_bd1, Rees and Wathen . . . 75

6.3.2 Block diagonal preconditioner stkP̃_comm, Pearson . . . 76

6.3.3 Block diagonal preconditioner psonP̃_nsn, W. Zulehner . . . 77

6.3.4 Discussion . . . 77

7 Summary and conclusion 82

A Results for distributed convection-diffusion control problem for ε = 1/1500 88

B Local projection stabilization (LPS) 91

B.1 The double-glazing problem . . . 92


Mad Hatter: Why is a raven like a writing-desk?

Lewis Carroll, Alice in Wonderland

1 Introduction

PDE-constrained optimization problems appear in a variety of social, scientific, industrial, medical and engineering applications, including optimal control, optimal design and parameter identification. The general formulation of a PDE-constrained optimization problem is given by

min_{y,u} J(y, u)
s.t. C(y, u) = 0,   (1.1)

where J represents the cost functional, C is the PDE constraint, y is the state variable, and u is the decision variable. The PDE constraint, which models the underlying process that is to be controlled, is referred to as the state equation. Depending on the nature of the objective function J and the decision variable u, the PDE-constrained optimization problem (1.1) assumes the nature of an optimal control, optimal design or parameter identification problem.

Correspondingly, u is then referred to as a control, design or parameter identification variable. For an overview on a wide range of PDE-constrained optimization problems, the reader is referred to [19,20,21].

In this thesis, we are concerned with optimal control problems constrained by PDEs. As a brief introduction to the topic, consider a heating source u applied to the surface of a domain Ω to control its temperature y. Let us assume that the temperature distribution y in Ω satisfies the following partial differential equation:

−∆y=u in Ω. (1.2)

Now suppose that we want the domain Ω to acquire a target (desired) temperature distribution ŷ for some forcing term u. In other words, we want to control the heating source u such that the domain under consideration acquires a temperature y that is as close to the target temperature ŷ as possible. This target tracking must be measured in some norm, which in our case is the L²(Ω) norm. The process can now be described in the form of a constrained minimization problem.

Hence we write:

min_y J(y) = (1/2)‖y − ŷ‖²_{L²(Ω)}
s.t. −∆y = u,   (1.3)


where u now represents the control variable.

Let us further generalize the problem by including boundary conditions. Let ∂Ω denote the boundary of Ω. Then

y = g_D on ∂Ω_D,
∂y/∂n = g_N on ∂Ω_N,

where ∂Ω_D represents the Dirichlet boundary and ∂Ω_N the Neumann boundary, ∂y/∂n is the derivative in the outward normal direction of Ω, and g_D and g_N are the Dirichlet and Neumann boundary data, respectively. Further,

∂Ω = ∂Ω_D ∪ ∂Ω_N,  ∂Ω_D ∩ ∂Ω_N = ∅.

The above problem is, in general, ill-posed. Theory suggests adding the norm of the control u, weighted by the so-called Tikhonov regularization parameter (also called the cost parameter) β > 0, to the cost functional. This makes the solution to the problem well-defined. So we now write:

min_{y,u} J(y, u) = (1/2)‖y − ŷ‖²_{L²(Ω)} + (β/2)‖u‖²_{L²(Ω)}
s.t. −∆y = u,
y = g_D on ∂Ω_D,
∂y/∂n = g_N on ∂Ω_N,
∂Ω = ∂Ω_D ∪ ∂Ω_N,  ∂Ω_D ∩ ∂Ω_N = ∅.   (1.4)

One way to deal with this minimization problem and find its optimal solution is through the first order optimality conditions, also known as the Karush-Kuhn-Tucker (KKT) conditions. This introduces another function, the Lagrange multiplier λ (also referred to in the literature as the dual or adjoint state variable). The solutions of the minimization problem for the state function y and the control function u are called the optimal state and the optimal control, respectively. The existence and uniqueness of the optimal solution is not discussed here; the reader is referred to [21].

Numerically dealing with PDE-constrained optimization problems requires two steps, namely, discretization and optimization. Which step is taken first depends on which of the following two approaches is pursued:

• The discretize-then-optimize approach.

• The optimize-then-discretize approach.

In the first approach, we discretize the objective function and the state equation(s), formulate the Lagrangian and the corresponding first order optimality conditions, and then build an optimality system to solve the problem numerically. In the second approach, we first formulate the Lagrangian and its corresponding first order optimality conditions, discretize them, and then solve the problem numerically. For many PDE-constrained optimization problems, especially where the PDE is self-adjoint, both approaches lead to the same discrete optimality system. However, certain PDEs are not self-adjoint, for example, the convection-diffusion equation. Heinkenschloss et al. [23] showed that when streamline upwind Petrov-Galerkin (SUPG) stabilization1 is applied to the optimal control of the convection-diffusion equation, the optimize-then-discretize and

1 If a pure Galerkin approach is used, both approaches will result in a similar algebraic system. However, the solution will have unwanted oscillations.


discretize-then-optimize approaches can lead to different discrete algebraic2 systems, resulting in different solution outcomes. This also poses a challenge in the context of designing efficient preconditioners for such problems.

Whichever approach is taken, the resulting optimality system acquires a saddle point structure, given by:

Ax = b,   (1.5)

where we solve for x, with

A = [ A   B₁ᵀ ]
    [ B₂  −C  ] ∈ R^{n×n}   and   b ∈ R^n.   (1.6)

The problem that we have discussed so far is called a distributed optimal control problem, where the control u is distributed throughout the domain Ω. A related problem is the boundary control problem, where the control acts only on the boundary, so that u ∈ L²(∂Ω). For example, a typical Dirichlet boundary control problem is of the form

min_{y,u} (1/2)∫_Ω (y − ŷ)² dx + (β/2)∫_{∂Ω} u² ds
s.t. −∆y = g in Ω,
y = u on ∂Ω.   (1.7)

Similarly, a Neumann boundary control problem is given by

min_{y,u} (1/2)∫_Ω (y − ŷ)² dx + (β/2)∫_{∂Ω} u² ds
s.t. −∆y = g in Ω,
∂y/∂n = u on ∂Ω.   (1.8)

We consider only distributed optimal control problems. Moreover, we limit ourselves to distributed optimal control problems without constraints on the control and/or the state; such problems are considered in [52,53,54]. Furthermore, time-dependent optimal control problems are not considered here; the interested reader is referred to [56,57].

Remark (The parameter identification problem). In many other applications, reconstruction of the state of a dynamical system from measured data is required. These are referred to as parameter identification or inverse problems. Examples include weather forecasting, oil reservoir modeling and various models of cognitive motor control. Such problems can also be stated as PDE-constrained optimization problems. While these problems are not the focus of this thesis, we give a brief sketch of them.

Consider a thin membrane3 whose deflection y under some force u is described by the Poisson equation

−∇·(σ∇y) = u in Ω,

where σ represents the unobservable (spatially varying) material property of the system, i.e., the coefficient that we want to recover. Suppose we have some observed measurements ŷ of the

2We use the term algebraic here, for in the case of optimize-then-discretize, where the convection-diffusion equation is discretized using the SUPG, there is no finite dimensional problem for which the resulting system can be considered as an optimality system, cf. [23].

3This example is taken from [44].


displacements caused under the influence of certain forces u. The task is to find σ for which the output state y of the system best matches the observations ˆy. This task takes the form of an optimization problem:

min_{y,σ} J(y, σ) = (1/2)‖y − ŷ‖² + βR(σ)
s.t. −∇·(σ∇y) = u in Ω,   (1.9)

where R(σ) is a regularization operator introduced to make the solution well-behaved.

We assumed that the dimensions of the solution space for y and the observation space for ŷ match. In most situations, however, only a few observations are available, which requires introducing an operator Ψ that maps the solution space to the observation space. In this case, we write:

min_{y,σ} J(y, σ) = (1/2)‖Ψy − ŷ‖² + βR(σ)
s.t. −∇·(σ∇y) = u in Ω.   (1.10)

Interested readers can refer to, for instance, [44,45,46].

1.1 Major goal of the study

Due to their robustness, direct solvers are the methods of choice when it comes to solving discrete saddle point systems. However, as the mesh size h gets smaller, the discrete saddle point system becomes large. This results in increased computational requirements which direct solvers, in general, find difficult to cope with. In these circumstances, due to their modest demands on computer resources, iterative solution methods become the feasible choice. These methods are usually based on projection processes onto Krylov subspaces. Examples include the conjugate gradient (CG) method, the minimal residual (MINRES) method, and the generalized minimal residual (GMRES) method. However, due to the indefiniteness and, in general, unfavorable spectral properties of a large discrete saddle point system, iterative methods suffer from a slow rate of convergence, or in some cases fail to converge. In the context of PDE-constrained optimization problems, where some regularization parameter, say β, approaches zero, the issue is further exacerbated. Efficient design of iterative solvers therefore becomes necessary. One way to achieve efficiency in iterative methods is to use a suitable preconditioning strategy to improve the spectral properties of the underlying saddle point system; see, for instance, Axelsson and Neytcheva [1]. Further, saddle point systems possess a certain block structure, which has been utilized for designing efficient preconditioners in various studies such as [17,24,26,35].
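The mesh dependence can be illustrated with a short sketch (ours, not from the thesis): a 1D Laplacian stands in for a discretized PDE, and the iteration count of unpreconditioned CG grows as the mesh is refined.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def cg_iters(n):
    """Unpreconditioned CG iteration count for the 1D Laplacian on n interior nodes."""
    h = 1.0 / (n + 1)
    A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)) / h**2
    b = np.ones(n)
    it = [0]
    spla.cg(A, b, callback=lambda xk: it.__setitem__(0, it[0] + 1))
    return it[0]

# The count grows with n, mirroring the deteriorating spectral properties;
# a good preconditioner makes it (nearly) independent of the mesh size.
print([cg_iters(n) for n in (50, 100, 200, 400)])
```

The same qualitative behavior, only worse, is observed for the indefinite saddle point systems studied in this thesis.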

We will discuss and implement various preconditioning strategies available in the literature for the saddle point systems arising in PDE-constrained optimization problems. Three benchmark problems are considered: the distributed optimal control of the Poisson equation, the distributed optimal control of the convection-diffusion equation, and the distributed optimal control of the Stokes system. We aim to summarize the proposed preconditioning techniques and test the most promising ones on the defined benchmark problems, to provide a fair comparison of the efficiency of the solvers both in terms of the number of iterations and the required computing time.

1.2 Problem setting and available tools

We limit ourselves to two space dimensions. To discretize and create the matrices, the open source FEM library deal.II [58] is used. The package provides tools for constructing the discretization meshes as well as for assembling the necessary FEM matrices. In addition, deal.II


provides a number of preconditioners and iterative solvers, as well as interfaces to other well-established numerical linear algebra packages, such as Trilinos [59], Hypre [60] and PETSc [61].

1.3 Layout of the Thesis

This thesis is organized in the following manner:

• In Chapter 2, using the discretize-then-optimize approach, we describe the mathematical models for our three benchmark distributed optimal control problems. We describe the process of discretizing the objective function and its constraints. Finally, we describe the formulation of the first order optimality conditions, concluding that the process leads to a saddle point system.

• In Chapter 3, we discuss the basic theory of different iterative methods.

• In Chapter 4, we discuss the basic idea and motivation behind preconditioning iterative methods based on Krylov subspaces. The topic of multigrid methods is also touched upon in this chapter.

• In Chapter 5, we give a literature overview of various techniques for constructing preconditioners for saddle point systems arising in different PDE-constrained optimization problems.

• In Chapter 6, we present and discuss the results of our numerical experiments.

• In Chapter 7, we summarize the results and draw conclusions.


"Curiouser and curiouser!" cried Alice

Lewis Carroll, Alice in Wonderland

2 Mathematical framework for PDE-constrained optimization problems

The benchmark PDE-constrained optimization problems that we consider in our study are:

1. The distributed optimal control of the Poisson equation.

2. The distributed optimal control of the convection-diffusion equation.

3. The distributed optimal control of the Stokes system.

We start with the continuous formulation of the minimization problem. Each problem is then tackled using the discretize-then-optimize approach. The starting point is to state the weak formulation of the state equations and then use an appropriate finite element method for discretization1. The continuous minimization problem is then replaced by its discrete counterpart. Introducing the Lagrange multiplier and solving for the optimality conditions yields an optimality system with a saddle point structure. In many cases, this optimality system can be reduced, more specifically, by eliminating the control u. The resulting system is then called the reduced optimality system.

2.1 The saddle point theory

We follow the introductory discussion in [35, 5]. Let V and Λ be Hilbert spaces with inner products (·,·)_V and (·,·)_Λ, associated with the norms ‖·‖_V and ‖·‖_Λ, respectively. The mixed variational formulation reads:

Find y ∈ V and λ ∈ Λ such that

a(y, v) + b(v, λ) = f(v)  ∀v ∈ V,
b(y, µ) − c(λ, µ) = g(µ)  ∀µ ∈ Λ,   (2.1)

1 Although we take the discretize-then-optimize approach here, note that for both the Poisson control problem and the Stokes control problem, the second approach, i.e., optimize-then-discretize, yields the same discrete optimality system. For the convection-diffusion control problem this is not the case; see Heinkenschloss et al. [23]. We elaborate further on the issue in Section 2.3.


where a, b and c are bounded bilinear forms on V×V, V×Λ and Λ×Λ, respectively, whereas f ∈ V′ and g ∈ Λ′ are bounded linear functionals, with V′ and Λ′ being the dual spaces of V and Λ, respectively.

Let us consider the following assumptions:

1. a and c are symmetric, i.e.,

a(w, v) = a(v, w) ∀v, w ∈ V,  c(r, µ) = c(µ, r) ∀µ, r ∈ Λ.   (2.2)

2. a and c are nonnegative, i.e.,

a(v, v) ≥ 0 ∀v ∈ V,  c(µ, µ) ≥ 0 ∀µ ∈ Λ.   (2.3)

If condition (1) holds, then the functional

S(v, µ) = (1/2)a(v, v) + b(v, µ) − (1/2)c(µ, µ) − f(v) − g(µ)   (2.4)

is a convex function of v ∈ V and a concave function of µ ∈ Λ; more on this can be found in [6]. Such a functional S is referred to as a saddle point function. If, additionally, condition (2) holds, then (y, λ) is a solution to (2.1) if and only if (y, λ) is a saddle point of S, i.e.,

S(y, µ) ≤ S(y, λ) ≤ S(v, λ)  ∀v ∈ V, µ ∈ Λ.   (2.5)

Let us now reformulate (2.1) as a (non-mixed) variational problem. Let X be the product space V×Λ, equipped with the inner product ((y, λ), (v, µ))_X = (y, v)_V + (λ, µ)_Λ and the norm ‖(y, λ)‖_X = √((y, λ), (y, λ))_X. Further, let X′ be the dual space of X. Then the variational formulation reads:

Find x₁ = (y, λ) ∈ X such that

B(x₁, x₂) = F(x₂)  ∀x₂ = (v, µ) ∈ X,   (2.6)

where

B(z, x₂) = a(w, v) + b(v, r) + b(w, µ) − c(r, µ),  F(x₂) = f(v) + g(µ),   (2.7)

for x₂ = (v, µ) and z = (w, r).

Associating a linear operator A ∈ L(X, X′) with B, we write

⟨Ax₁, x₂⟩_{X′,X} = B(x₁, x₂).   (2.8)

Finally, using the operator notation, we rewrite (2.6), which now reads:

Ax = F  in X′.   (2.9)

The existence and uniqueness of a solution to (2.9) and (2.6) follows from the book by Babuška and Aziz [2]. In the special case c(·,·) = 0, existence and uniqueness follows from the classical result by Brezzi [3].

2.1.1 The discretized saddle point system

The discretized version of (2.9) leads to a large saddle point system represented by

Ax = F,²   (2.10)

2 Instead of writing A_h x_h = F_h, we find it convenient to reuse the notation of the continuous case (2.9).


or

[ A   B₁ᵀ ] [ x₁ ]   [ f ]
[ B₂  −C  ] [ x₂ ] = [ g ],   (2.11)

where f ∈ R^{n₁}, g ∈ R^{m₁}, A ∈ R^{n₁×n₁}, B₁, B₂ ∈ R^{m₁×n₁}, C ∈ R^{m₁×m₁}, m₁ ≤ n₁. Let us consider the following conditions:

1. A is symmetric, i.e., A=AT.

2. The symmetric part of A, i.e., H ≡ (1/2)(A + Aᵀ), is positive semidefinite.

3. B₁ = B₂ = B, or B is symmetric, i.e., B₁ = B₂ᵀ.

4. C is symmetric, i.e., C = Cᵀ, and positive semidefinite.

If all the above conditions are satisfied, then A is symmetric and indefinite. Indefiniteness implies A has both positive and negative eigenvalues. If either A ≠ Aᵀ (condition 1) or B₁ ≠ B₂ (condition 3), or both, then A is nonsymmetric, in which case A is referred to as a generalized saddle point system3. Moreover, the discretized version of X introduces an additional parameter into the picture, namely the mesh size h. As h decreases, the saddle point system becomes large, while its spectral properties deteriorate. In practice, larger saddle point systems are typically ill-conditioned. Saddle point problems are discussed in more detail in [5].
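A small numerical sketch of this indefiniteness (our own illustration, not thesis code): with A symmetric positive definite, B₁ = B₂ = B of full row rank and C = 0, conditions (1)-(4) hold and the block matrix has exactly n₁ positive and m₁ negative eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, m1 = 8, 3
G = rng.standard_normal((n1, n1))
A = G @ G.T + n1 * np.eye(n1)                       # symmetric positive definite
B = rng.standard_normal((m1, n1))                   # generically full row rank
S = np.block([[A, B.T], [B, np.zeros((m1, m1))]])   # C = 0

eigs = np.linalg.eigvalsh(S)                        # real, since S is symmetric
print((eigs > 0).sum(), (eigs < 0).sum())           # inertia (8, 3): indefinite
```

This is the classical inertia result for saddle point matrices: the system is congruent to diag(A, −BA⁻¹Bᵀ), hence never definite.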

2.2 Distributed optimal control of the Poisson equation

Consider the distributed optimal control problem constrained by the Poisson equation:

min_{y,u} J(y, u) = (1/2)‖y − ŷ‖²_{L²(Ω)} + (β/2)‖u‖²_{L²(Ω)}
s.t. −∆y = u in Ω,
y = g_D on ∂Ω_D,
∂y/∂n = g_N on ∂Ω_N,
∂Ω = ∂Ω_D ∪ ∂Ω_N,  ∂Ω_D ∩ ∂Ω_N = ∅,   (2.12)

where y is the state of the system, ŷ is the desired state of the system to be achieved, and β > 0 is a regularization parameter. Further, ∂Ω represents the boundary of the domain Ω.

2.2.1 The weak formulation of the state equation

The starting point for a finite element discretization is the variational, or weak, formulation of the equations involved. Let us first start with the state equation, which in this case is the Poisson equation. For a weak formulation, we need an appropriate set of test functions v, and a natural function space in which v and the weak solution y can live. For Ω ⊂ R², this space is the Sobolev space

H¹(Ω) = { y : Ω → R | y, ∂y/∂x₁, ∂y/∂x₂ ∈ L²(Ω) },

3 The linearized Navier-Stokes equations are an example.


with the corresponding solution and test spaces given by

H¹_E = { y ∈ H¹(Ω) | y = g_D on ∂Ω_D },
H¹_{E0} = { v ∈ H¹(Ω) | v = 0 on ∂Ω_D }.   (2.13)

Note that while the Dirichlet condition is part of the solution space H¹_E, it vanishes in the test space H¹_{E0}, i.e., the test functions v take the value zero on the Dirichlet boundary. The Neumann boundary condition, in contrast, is not imposed in the solution or test spaces. The weak formulation now reads:

Find y ∈ H¹_E such that

a(y, v) = l(v)  ∀v ∈ H¹_{E0},   (2.14)

where the bilinear and linear forms a(·,·) and l(·) are given by

a(y, v) = ∫_Ω ∇y·∇v,  l(v) = ∫_Ω vu + ∫_{∂Ω_N} v g_N.

2.2.2 The finite element discretization of the state equation

We use the Galerkin approach for the finite element discretization. Let Y^h_0 ⊂ H¹_{E0} be the n-dimensional space of test functions spanned by {φ₁, ..., φ_n}, and let Y^h_E ⊂ H¹_E be the space of trial functions with the same basis {φ₁, ..., φ_n}, extended by {φ_{n+1}, ..., φ_{n+n∂}} to satisfy the Dirichlet boundary data on ∂Ω_D (n∂ being the number of Dirichlet boundary nodes). The finite element approximation y_h ∈ Y^h_E is uniquely determined by the vector of coefficients y = {y₁, ..., y_n}. Thus

y_h = Σ_{i=1}^{n} y_i φ_i + Σ_{i=n+1}^{n+n∂} y_i φ_i.   (2.15)

Note that the space Y^h_E is constructed such that the trial (shape) functions in (2.15) coincide with the corresponding test functions in Y^h_0. This is referred to as the Galerkin method. Using this formulation for our state equation, the finite element discretization reads:

Find y_h ∈ Y^h_E such that

a(y_h, v_h) = l(v_h)  ∀v_h ∈ Y^h_0,   (2.16)

where the bilinear and linear forms a(·,·) and l(·) are given by

a(y_h, v_h) = ∫_Ω ∇y_h·∇v_h,  l(v_h) = ∫_Ω v_h u_h + ∫_{∂Ω_N} v_h g_N.

We discretize the control u using a trial space of its own. Let ψ_i be basis functions such that span{ψ₁, ..., ψ_m} = U^h ⊂ L²(Ω); note that the ψ_i may differ from the φ_i. The Galerkin finite element approximation of the control u is

u_h = Σ_{i=1}^{m} u_i ψ_i,   (2.17)

where the discretized control u_h is uniquely determined by the vector of coefficients u = {u₁, ..., u_m}.


The Galerkin finite element approximation of the state equation can also be written as a system of algebraic equations,

Ky = Qu + d,   (2.18)

with

K = {k_{i,j}} ∈ R^{n×n},  k_{i,j} = ∫_Ω ∇φ_j·∇φ_i,
Q = {q_{i,j}} ∈ R^{n×m},  q_{i,j} = ∫_Ω φ_i ψ_j,
d = {d_i}_{i=1,...,n},  d_i = ∫_{∂Ω_N} φ_i g_N − Σ_{j=n+1}^{n+n∂} y_j ∫_Ω ∇φ_i·∇φ_j,

where K is the stiffness matrix, Q is the mass matrix and d is a vector containing the boundary terms.
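As a concrete, much simplified illustration of these definitions (our own toy code, not the deal.II implementation used in the thesis), the following assembles K and Q for piecewise linear elements on a uniform mesh of Ω = (0, 1), taking ψ_i = φ_i:

```python
import numpy as np

def assemble_1d(num_elems):
    """Assemble the 1D linear-FEM stiffness matrix K (k_ij = ∫ φ_j' φ_i')
    and mass matrix Q (q_ij = ∫ φ_i φ_j) on a uniform mesh of (0, 1)."""
    h = 1.0 / num_elems
    n = num_elems + 1                        # nodes, boundary included
    K = np.zeros((n, n)); Q = np.zeros((n, n))
    Ke = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h        # element stiffness
    Qe = np.array([[2.0, 1.0], [1.0, 2.0]]) * h / 6.0    # element mass
    for e in range(num_elems):               # sum the element contributions
        idx = [e, e + 1]
        K[np.ix_(idx, idx)] += Ke
        Q[np.ix_(idx, idx)] += Qe
    return K, Q

K, Q = assemble_1d(16)
print(np.allclose(K.sum(axis=1), 0.0))   # constants lie in the kernel of K
print(np.isclose(Q.sum(), 1.0))          # mass entries integrate to |Ω| = 1
```

The two printed checks reflect general properties of the stiffness and mass matrices that are used throughout the thesis.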

2.2.3 The finite element discretization of the minimization problem

We replace the continuous minimization problem (2.12) with its discrete counterpart:

min_{y_h,u_h} J(y_h, u_h) = (1/2)‖y_h − ŷ‖²_{L²(Ω)} + (β/2)‖u_h‖²_{L²(Ω)}   (2.19)

s.t. ∫_Ω ∇y_h·∇v_h = ∫_Ω v_h u_h  ∀v_h ∈ Y^h_0.   (2.20)

Now,

‖y_h − ŷ‖² = ∫_Ω (y_h − ŷ)²
           = Σ_i Σ_j y_i y_j ∫_Ω φ_i φ_j − 2 Σ_j y_j ∫_Ω φ_j ŷ + ∫_Ω ŷ²
           = yᵀM_y y − 2yᵀb + C,

where

M_y = {m^y_{i,j}} ∈ R^{n×n},  m^y_{i,j} = ∫_Ω φ_i φ_j  is the state mass matrix,
b = {b_i} ∈ R^n,  b_i = ∫_Ω ŷ φ_i,  and C is a constant.

Moreover,

‖u_h‖² = uᵀM_u u,

where

M_u = {m^u_{i,j}} ∈ R^{m×m},  m^u_{i,j} = ∫_Ω ψ_i ψ_j  is the control mass matrix.

Further, we know that (2.20) can be written in the form (2.18), i.e.,

Ky = Qu + d.   (2.21)


Thus, we have the minimization problem

min_{y,u} J(y, u) = (1/2)yᵀM_y y − yᵀb + (β/2)uᵀM_u u
s.t. Ky = Qu + d,   (2.22)

which is an equivalent representation of the minimization problem (2.19).

Finally, we introduce the Lagrange multiplier λ. The finite element discretization of λ uses the same function space as the state y, so we write4

λ_h = Σ_{i=1}^{n} λ_i φ_i + Σ_{i=n+1}^{n+n∂} λ_i φ_i.   (2.23)

The discretized Lagrange multiplier λ_h is uniquely determined by the vector of coefficients λ = {λ₁, ..., λ_n}. λ is also called the adjoint state variable, since it satisfies

Kᵀλ = b − M_y y.   (2.24)

2.2.4 First order optimality conditions

We write the discrete Lagrangian function:

L(y, u, λ) = (1/2)yᵀM_y y − yᵀb + (β/2)uᵀM_u u − λᵀ(Ky − Qu − d).   (2.25)

The first order optimality conditions, or Karush-Kuhn-Tucker (KKT) conditions, are obtained by differentiating L with respect to the state y, the control u and the Lagrange multiplier λ, resulting in the so-called adjoint, gradient and state equations:

∇_y L(y, u, λ) = M_y y − b + Kᵀλ = 0,   (2.26)
∇_u L(y, u, λ) = βM_u u − Qᵀλ = 0,   (2.27)
∇_λ L(y, u, λ) = Ky − Qu − d = 0.   (2.28)

The resulting optimality system is given by

A_F [ y ]   [ M_y  0     Kᵀ  ] [ y ]   [ b ]
    [ u ] ≡ [ 0    βM_u  −Qᵀ ] [ u ] = [ 0 ]
    [ λ ]   [ K    −Q    0   ] [ λ ]   [ d ].   (2.29)

If the state y and the control u are discretized using the same basis, then M = M_y = M_u = Q and the stiffness matrix K is symmetric, i.e., K = Kᵀ. We can then write

A_F [ y ]   [ M  0   Kᵀ ] [ y ]   [ b ]
    [ u ] ≡ [ 0  βM  −M ] [ u ] = [ 0 ]
    [ λ ]   [ K  −M  0  ] [ λ ]   [ d ].   (2.30)

Clearly (2.30) has a saddle point structure, i.e., it has the form

A_F = [ A  Bᵀ ]
      [ B  −C ],

�A BT B −C

4 Note, however, that Σ_{i=n+1}^{n+n∂} λ_i φ_i = 0 follows from [55].


where

A = [ M  0  ]
    [ 0  βM ],   B = [ K  −M ]   and   C = 0.

Remark. Under the assumption that the test and trial spaces are the same, the stiffness matrix K is symmetric, K = Kᵀ. Then, as discussed in (Chapter 4, [18]), for any vector v ∈ R^n,

vᵀKv ≥ 0,   (2.31)

implying that K is at least positive semidefinite. Moreover, if ∂Ω_D ≠ ∅, then K is positive definite. In the case of pure Neumann boundary conditions, K is only positive semidefinite, and for a solution of (2.18) to exist, the compatibility relation

Σ_i ( ∫_Ω φ_i u + ∫_{∂Ω} φ_i g_N ) = 0   (2.32)

must hold.

Further, for any vector v ∈ R^n, v ≠ 0, there holds

vᵀMv > 0,   (2.33)

i.e., the mass matrix M is symmetric positive definite.

2.2.5 The reduced optimality system

If the state $y$, the control $u$ and the adjoint $\lambda$ are discretized using the same finite element spaces, then it is possible to eliminate the gradient equation using the relation
$$u = \frac{1}{\beta} \lambda.$$
This gives us
$$\mathcal{A}_R \begin{bmatrix} y \\ \lambda \end{bmatrix} \equiv \begin{bmatrix} M & K^T \\ K & -\frac{1}{\beta} M \end{bmatrix} \begin{bmatrix} y \\ \lambda \end{bmatrix} = \begin{bmatrix} b \\ d \end{bmatrix}. \qquad (2.34)$$
Note that $\mathcal{A}_R$ also has the saddle point structure, i.e., it has the form
$$\mathcal{A}_R = \begin{bmatrix} A & B^T \\ B & -C \end{bmatrix},$$
where
$$A = M, \quad B = K \quad \text{and} \quad C = \frac{1}{\beta} M.$$

2.3 Distributed optimal control of the convection-diffusion equation

Convection-diffusion models describe physical processes involving both diffusion and transport; one such example is the spread of a pollutant carried by wind. The process is described through the following equation:
$$-\varepsilon \Delta y + (\vec{w} \cdot \nabla) y = u \ \text{in } \Omega, \quad y = g_D \ \text{on } \partial\Omega_D, \quad \frac{\partial y}{\partial n} = g_N \ \text{on } \partial\Omega_N,$$
$$\partial\Omega = \partial\Omega_D \cup \partial\Omega_N, \quad \partial\Omega_D \cap \partial\Omega_N = \emptyset, \qquad (2.35)$$
where $\vec{w}$ is some vector field (for instance, the direction of the wind) and $g_D$ and $g_N$ are the given boundary data. Further, we assume that $\vec{w}$ is divergence free, i.e., $\nabla \cdot \vec{w} = 0$. The scalar quantity $\varepsilon$ (for instance, viscosity) satisfies $0 < \varepsilon \leq 1$; the smaller its value, the more convection-dominated the problem becomes.

Note that when $\vec{w} \neq 0$, the problem is not self-adjoint. This simply means that if we define the convection-diffusion operator by
$$\mathcal{L} = -\varepsilon \nabla^2 + \vec{w} \cdot \nabla, \qquad (2.36)$$
then
$$\int_\Omega (\mathcal{L}u)\, v \neq \int_\Omega u\, (\mathcal{L}v). \qquad (2.37)$$
The consequence is that the coefficient matrix obtained through discretization is no longer symmetric. Further, when it comes to solving convection-diffusion control problems, discretize-then-optimize and optimize-then-discretize yield different outcomes. One way to mitigate this issue is to employ the subgrid modeling and local projection stabilization (LPS) schemes, as discussed in [30, 29, 28].

In a convection-diffusion control problem, the aim is to find a control $u$ such that the state $y$ attains a desired state $\hat{y}$. The problem can be defined as the minimization of a cost functional with a convection-diffusion PDE stated as a constraint. The problem can be formulated as:
$$\min_{y,u}\; J(y,u) = \tfrac{1}{2} \| y - \hat{y} \|^2_{L^2(\Omega)} + \tfrac{\beta}{2} \| u \|^2_{L^2(\Omega)}$$
$$\text{s.t.} \quad -\varepsilon \Delta y + (\vec{w} \cdot \nabla) y = u \ \text{in } \Omega, \quad y = g_D \ \text{on } \partial\Omega_D, \quad \frac{\partial y}{\partial n} = g_N \ \text{on } \partial\Omega_N,$$
$$\partial\Omega = \partial\Omega_D \cup \partial\Omega_N, \quad \partial\Omega_D \cap \partial\Omega_N = \emptyset. \qquad (2.38)$$

2.3.1 Weak formulation of the state equation

The state equation in this case is the convection-diffusion equation. Before we define the weak formulation, let us first define the space where the trial and test functions live. If $\Omega \subset \mathbb{R}^2$, this space is the Sobolev space
$$\mathcal{H}^1(\Omega) = \left\{ y : \Omega \to \mathbb{R} \;\middle|\; y,\, \frac{\partial y}{\partial x_1},\, \frac{\partial y}{\partial x_2} \in L^2(\Omega) \right\},$$
whereas its residents, i.e., the trial and test spaces, are
$$\mathcal{H}^1_E = \{ y \in \mathcal{H}^1(\Omega) \mid y = g_D \ \text{on } \partial\Omega_D \}, \quad \mathcal{H}^1_{E_0} = \{ v \in \mathcal{H}^1(\Omega) \mid v = 0 \ \text{on } \partial\Omega_D \}.$$
The weak formulation of the convection-diffusion equation reads: find $y \in \mathcal{H}^1_E$ such that
$$a(y, v) = l(v) \quad \forall v \in \mathcal{H}^1_{E_0}. \qquad (2.39)$$
The bilinear form $a(\cdot,\cdot)$ on the left side is given by
$$a(y, v) = \varepsilon \int_\Omega \nabla y \cdot \nabla v + \int_\Omega (\vec{w} \cdot \nabla y)\, v. \qquad (2.40)$$
Similarly, the linear form $l(\cdot)$ on the right side is given by
$$l(v) = \int_\Omega u\, v + \varepsilon \int_{\partial\Omega_N} g_N\, v. \qquad (2.41)$$


2.3.2 Stabilization Techniques

The finite element discretization of the convection-diffusion equation is not straightforward. As discussed in [18], when the convection term dominates, the discrete solution is inaccurate due to the presence of non-physical oscillations. To further understand the issue, we quantify the relative effect of the convection and diffusion terms by the Peclet number. Let the velocity $\vec{w}$ be represented by
$$\vec{w} = C_0 \vec{w}^*, \qquad (2.42)$$
where $C_0$ is a positive constant and $\vec{w}^*$ is normalized to have a value of one in some norm. Then the Peclet number is given by
$$\mathcal{P} = \frac{C_0 \mathcal{L}}{\varepsilon}, \qquad (2.43)$$
where $\mathcal{L}$ is the characteristic length scale of the problem.

If $\mathcal{P} \leq 1$, the problem is diffusion dominated and the solution may be within an acceptable range of accuracy; nonetheless, it is best to have $\mathcal{P} \to 0$.

If $\mathcal{P} \gg 1$, the solution can have steep gradients, requiring a very fine mesh to resolve them correctly. If the mesh is not fine enough, the standard Galerkin finite element discretization will not work desirably.

Different approaches have been proposed to tame the unwarranted oscillations in processes governed by a convection-diffusion equation. One such method is the streamline upwind Petrov-Galerkin (SUPG) method. However, when SUPG is used in the context of optimal control of the convection-diffusion equation, optimize-then-discretize and discretize-then-optimize result in structurally different algebraic systems, leading to different solution outcomes and preconditioning strategies. In order to overcome this issue, Becker and Vexler in [29] proposed a method referred to as the local projection stabilization (LPS) scheme, which borrows some ideas from the subgrid modeling method discussed in [30, 31]. Pearson and Wathen in [28] used the LPS scheme, resulting in an optimality system that has the same structure whether optimize-then-discretize or discretize-then-optimize is used.

2.3.2.1 The Petrov-Galerkin formulation: Streamline Upwind Petrov-Galerkin (SUPG) Stabilization

In this formulation, the trial and test spaces are different: the trial space is given by $Y^h_E \subset \mathcal{H}^1_E$, and the test space consists of functions of the form
$$v_h + \delta\, \vec{w} \cdot \nabla v_h, \qquad (2.44)$$
where $v_h \in V^h_{E_0}$ and $\delta$ is a constant that needs to be calculated a priori. Using this formulation, the bilinear form $a_{pg}(\cdot,\cdot)$ is given by
$$a_{pg}(y_h, v_h) = \varepsilon \int_\Omega \nabla y_h \cdot \nabla v_h + \int_\Omega (\vec{w} \cdot \nabla y_h)\, v_h + \delta \int_\Omega (\vec{w} \cdot \nabla y_h)(\vec{w} \cdot \nabla v_h) - \varepsilon \delta \int_\Omega (\nabla^2 y_h)(\vec{w} \cdot \nabla v_h). \qquad (2.45)$$
In the case of linear and bilinear elements, the higher order term is zero. Thus
$$a_{pg}(y_h, v_h) = \varepsilon \int_\Omega \nabla y_h \cdot \nabla v_h + \int_\Omega (\vec{w} \cdot \nabla y_h)\, v_h + \delta \int_\Omega (\vec{w} \cdot \nabla y_h)(\vec{w} \cdot \nabla v_h). \qquad (2.46)$$
Similarly, the linear form $l_{pg}(\cdot)$ is given by
$$l_{pg}(v_h) = \int_\Omega u\, v_h + \varepsilon \int_{\partial\Omega_N} g_N\, v_h + \delta \int_\Omega u\, (\vec{w} \cdot \nabla v_h). \qquad (2.47)$$

The Petrov-Galerkin finite element method, also called the streamline upwind Petrov-Galerkin (SUPG) scheme, reads: find $y_h \in Y^h_E$ such that
$$a_{pg}(y_h, v_h) = l_{pg}(v_h) \quad \forall v_h \in V^h_{E_0}. \qquad (2.48)$$
To find the value of $\delta$, we let $\delta_k$ be defined locally on individual elements. As suggested in [18, 24], we have
$$\delta_k = \begin{cases} \dfrac{h_k}{2\,\|\vec{w}\|_2} \left( 1 - \dfrac{1}{\mathcal{P}^h_k} \right) & \text{if } \mathcal{P}^h_k > 1, \\[1ex] 0 & \text{otherwise}, \end{cases} \qquad (2.49)$$
where
$$\mathcal{P}^h_k = \frac{\|\vec{w}\|_2\, h_k}{2\varepsilon}$$
and $h_k$ is measured in the direction of the wind. For example, as described in [24], if we have a rectangular domain with sides given by $h_x$ and $h_y$ and $\vec{w} = [\cos\theta, \sin\theta]$, then
$$h_k = \min\left( \frac{h_x}{|\cos\theta|},\, \frac{h_y}{|\sin\theta|} \right).$$
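A small helper evaluating (2.49) for a rectangular element could look as follows; the function name and arguments are illustrative, and the wind is taken as $\vec{w} = C_0(\cos\theta, \sin\theta)$ as in the example above.

```python
import numpy as np

def supg_delta(hx, hy, theta, eps, C0=1.0):
    """SUPG parameter of (2.49) for a rectangular element with sides
    hx, hy, wind w = C0*(cos(theta), sin(theta)) and diffusion eps.
    (Hypothetical helper following the choices in [18, 24].)"""
    w = C0 * np.array([np.cos(theta), np.sin(theta)])
    wn = np.linalg.norm(w)                  # ||w||_2
    # element length h_k measured in the wind direction
    cands = []
    if abs(w[0]) > 1e-14:
        cands.append(hx / abs(np.cos(theta)))
    if abs(w[1]) > 1e-14:
        cands.append(hy / abs(np.sin(theta)))
    hk = min(cands)
    Pk = wn * hk / (2 * eps)                # element Peclet number
    return (hk / (2 * wn)) * (1 - 1 / Pk) if Pk > 1 else 0.0

d_conv = supg_delta(0.1, 0.1, np.pi / 4, eps=1e-3)  # convection-dominated
d_diff = supg_delta(0.1, 0.1, np.pi / 4, eps=1.0)   # diffusion-dominated
print(d_conv, d_diff)
```

Note that $\delta_k$ vanishes on elements where diffusion dominates, so the stabilization switches itself off exactly where the standard Galerkin method is already adequate.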

2.3.2.2 Local Projection Stabilization (LPS) scheme

We follow the discussion of the LPS scheme in [29]. Consider a two-dimensional mesh $\mathcal{T}_H$ consisting of open cells, which in the present case are quadrilaterals.

Figure 2.1: $\mathcal{T}_H$ with 4 cells.

Uniform refinement of the mesh $\mathcal{T}_H$ gives us a refined mesh $\mathcal{T}_h$. For instance, if $\mathcal{T}_H$ consists of 4 cells, a uniform refinement of it will create 16 cells on $\mathcal{T}_h$. Each refined cell of $\mathcal{T}_H$ creates 4 cells in $\mathcal{T}_h$, referred to as a patch $P$; so we will have 4 patches in this case.


Figure 2.2: Refined mesh $\mathcal{T}_h$ with 16 cells and 4 patches (each patch portrayed by a different color).

LPS uses a standard finite element discretization with stabilization based on local projections. The sequel requires us to define an $L^2(P)$ orthogonal projection operator (see footnote 5)
$$P_h : L^2(\Omega) \to V^{const}_H, \qquad (2.50)$$
on the patches of the domain, with $V^{const}_H$ being the space of cell-wise (see footnote 6) constant functions on the patches.

Further, this operator satisfies the following stability and approximation properties on the patches, cf. [29]:
$$\| P_h v \|_{L^2(P)} \leq c\, \| v \|_{L^2(P)} \quad \forall v \in L^2(P), \qquad (2.51)$$
$$\| v - P_h v \|_{L^2(P)} \leq c\, h\, \| \nabla v \|_{L^2(P)} \quad \forall v \in \mathcal{H}^1(P). \qquad (2.52)$$

We now introduce a positive stabilization parameter $\delta$ associated with a symmetric bilinear stabilization form $\tau^\delta_h : V_h \times V_h \to \mathbb{R}$ given by
$$\tau^\delta_h(u_h, v_h) = \delta \int_\Omega \big( \vec{w} \cdot \nabla u_h - P_h(\vec{w} \cdot \nabla u_h) \big)\, \big( \vec{w} \cdot \nabla v_h - P_h(\vec{w} \cdot \nabla v_h) \big). \qquad (2.53)$$
The stabilization parameter $\delta$ is defined locally on individual elements and depends on the Peclet number
$$\mathcal{P}^h_k = \frac{h_k\, \|\vec{w}\|}{\varepsilon}.$$
Following Becker and Vexler in [29], we have
$$\delta_k = \begin{cases} \dfrac{h_k}{\|\vec{w}\|} & \text{if } \mathcal{P}^h_k > 1, \\[1ex] 0 & \text{otherwise}. \end{cases} \qquad (2.54)$$

An algorithm to solve the convection-diffusion equation based on the LPS scheme is discussed in Appendix B.

Footnote 5: Note that an orthogonal projection operator gives a good average approximation of a function, compared to an interpolation operator. However, both operators have difficulty approximating highly oscillatory or discontinuous functions.

Footnote 6: There are four cells in a patch in our considered example.

Remark. The $L^2(P)$ orthogonal projection is computed on the gradient of the approximated solution $\tilde{u}_h$ on each patch $P \in \mathcal{T}_H$, via
$$P_h(\vec{w} \cdot \nabla \tilde{u}_h): \quad M_{P_h} \xi = (\vec{w} \cdot \nabla \tilde{u}_H, v_H), \qquad (2.55)$$
where $M_{P_h}$ is the standard mass matrix assembled on $\mathcal{T}_H$ and $\xi \in V^{const}_H$ is the coefficient vector of the cell-wise constant function on the patches. The idea here is to capture sharp oscillations in the gradient of the approximated solution $\tilde{u}_H$ on the patches. These captured oscillations $\xi$ are then interpolated to the fine mesh $\mathcal{T}_h$ and added to the computed solution $\tilde{u}_h$.
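On a uniform mesh with equal-area cells, the $L^2$-projection onto patch-wise constants reduces to a plain patch average, so $M_{P_h}$ need not be formed explicitly in that case. The sketch below illustrates the operator $P_h$ acting on cell data of a small structured mesh; the function name and the 4-by-4 example data are illustrative.

```python
import numpy as np

def lps_patch_projection(g, p=2):
    """L2-projection of cell-wise data g (e.g. values of w . grad(u_h)
    on the cells of T_h, uniform grid) onto functions constant on each
    p x p patch; for equal cell areas this is just the patch mean."""
    ny, nx = g.shape
    assert ny % p == 0 and nx % p == 0
    means = g.reshape(ny // p, p, nx // p, p).mean(axis=(1, 3))
    return np.kron(means, np.ones((p, p)))   # patch means on fine cells

g = np.arange(16, dtype=float).reshape(4, 4)  # illustrative cell values
Pg = lps_patch_projection(g)
fluct = g - Pg          # fluctuation penalized by the form (2.53)
print(Pg[0, 0])         # mean of cells {0, 1, 4, 5} -> 2.5
```

The fluctuation $g - P_h g$ has zero mean on every patch, which is the quantity that (2.53) penalizes: smooth components pass through unchanged while sharp sub-patch oscillations are damped.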

2.3.3 Finite element discretization of the state equation

Let $V^h_0 \subset \mathcal{H}^1_{E_0}$ be the $n$-dimensional vector space of test functions spanned by $\{\varphi_1, \ldots, \varphi_n\}$. Let $Y^h_E \subset \mathcal{H}^1_E$ be the vector space of trial functions spanned by $\{\varphi_1, \ldots, \varphi_n\}$ and extended further by $\{\varphi_{n+1}, \ldots, \varphi_{n+n_\partial}\}$ to satisfy the Dirichlet boundary data on $\partial\Omega_D$. The finite element approximation $y_h \in Y^h_E$ is uniquely determined by the vector of coefficients $y = \{y_1, \ldots, y_n\}$. So we write
$$y_h = \sum_{i=1}^{n} y_i \varphi_i + \sum_{i=n+1}^{n+n_\partial} y_i \varphi_i. \qquad (2.56)$$
We discretize the control $u$ using a trial space of its own. Let $\psi_i$ be the set of basis functions such that $\mathrm{span}\{\psi_1, \ldots, \psi_m\} =: U_h \subset L^2(\Omega)$. Then the finite element approximation of the control $u$ is given by
$$u_h = \sum_{i=1}^{m} u_i \psi_i, \qquad (2.57)$$
where $u_h$ is uniquely determined by the vector of coefficients $u = \{u_1, \ldots, u_m\}$.

The finite element approximation of the state equation can be written as a system of algebraic equations given by
$$F y = M u + d, \qquad (2.58)$$
where $F = \varepsilon K + N + T$. Further,
$$M = \{m_{ij}\}_{i,j=1,\ldots,n}, \quad m_{ij} = \int_\Omega \varphi_i\, \varphi_j,$$
$$K = \{k_{ij}\}_{i,j=1,\ldots,n}, \quad k_{ij} = \int_\Omega \nabla\varphi_i \cdot \nabla\varphi_j,$$
$$N = \{n_{ij}\}_{i,j=1,\ldots,n}, \quad n_{ij} = \int_\Omega (\vec{w} \cdot \nabla\varphi_j)\, \varphi_i,$$
$$T = \{\tau^\delta_{h,ij}\}_{i,j=1,\ldots,n}, \quad \tau^\delta_{h,ij} = \delta \int_\Omega \big( \vec{w} \cdot \nabla\varphi_i - P_h(\vec{w} \cdot \nabla\varphi_i) \big)\, \big( \vec{w} \cdot \nabla\varphi_j - P_h(\vec{w} \cdot \nabla\varphi_j) \big),$$
$$d = \{d_i\}_{i=1,\ldots,n}, \quad d_i = -\sum_{j=n+1}^{n+n_\partial} y_j \int_\Omega \nabla\varphi_i \cdot \nabla\varphi_j.$$
Here, $M$ is the mass matrix, $K$ is the stiffness matrix, $N$ contains the convection term (skew-symmetric), $d$ contains the boundary terms (assuming $g_N = 0$), and $T$ is the LPS stabilization term.

2.3.4 The finite element discretization of the minimization problem

We replace the continuous minimization problem (2.38) with its discrete counterpart:
$$\min_{y,u}\; \tfrac{1}{2}\, y^T M y - y^T b + \tfrac{\beta}{2}\, u^T M u \quad \text{s.t.} \quad F y = M u + d, \qquad (2.59)$$
where $b$ is given by
$$b = \{b_i\}_{i=1,\ldots,n}, \quad b_i = \int_\Omega \hat{y}\, \varphi_i \, d\Omega.$$
We now introduce the Lagrange multiplier $\lambda$, discretizing it using the functional space of the state $y$. So we obtain
$$\lambda_h = \sum_{i=1}^{n} \lambda_i \varphi_i + \sum_{i=n+1}^{n+n_\partial} \lambda_i \varphi_i. \qquad (2.60)$$
The discretized Lagrange multiplier (adjoint state) $\lambda_h$ is uniquely determined by the vector of coefficients $\lambda = \{\lambda_1, \ldots, \lambda_n\}$.

2.3.5 First order optimality conditions

We formulate the discrete Lagrangian function given by
$$\mathcal{L}(y, u, \lambda) = \tfrac{1}{2}\, y^T M y - y^T b + \tfrac{\beta}{2}\, u^T M u - \lambda^T (F y - M u - d). \qquad (2.61)$$
By definition, the optimal point of $\mathcal{L}$ is obtained when its partial derivatives with respect to $y$, $u$ and $\lambda$ are set to zero. This leads to the following optimality system:
$$\mathcal{A}_F \begin{bmatrix} y \\ u \\ \lambda \end{bmatrix} \equiv \begin{bmatrix} M & 0 & F^T \\ 0 & \beta M & -M \\ F & -M & 0 \end{bmatrix} \begin{bmatrix} y \\ u \\ \lambda \end{bmatrix} = \begin{bmatrix} b \\ 0 \\ d \end{bmatrix}. \qquad (2.62)$$

Note the difference between the systems (2.30) and (2.62). Whereas $K$ is a discrete Laplacian and thus symmetric positive definite (spd), $F$ is non-symmetric, as it contains the discrete counterparts of the convection (skew-symmetric) and stabilization terms.

We can see that (2.62) has a saddle point structure, i.e., it has the form
$$\mathcal{A}_F = \begin{bmatrix} A & B^T \\ B & 0 \end{bmatrix},$$
where
$$A = \begin{bmatrix} M & 0 \\ 0 & \beta M \end{bmatrix}, \quad B = \begin{bmatrix} F & -M \end{bmatrix} \quad \text{and} \quad C = 0.$$

2.3.6 The reduced optimality system

If the state $y$, the control $u$ and the adjoint $\lambda$ are discretized using the same finite element spaces, then it is possible to eliminate the gradient equation in (2.62) using the relation
$$u = \frac{1}{\beta} \lambda,$$
