MASTER’S THESIS
NILS-ERIK SVANGÅRD
Implementation of an Advanced Time Stepping Algorithm in Computational
Fluid Dynamics Software
MASTER OF SCIENCE PROGRAMME Mathematics
Luleå University of Technology
Department of Mathematics
MASTER THESIS REPORT
Nils-Erik Svangård (nilserik@gmail.com)
Under the supervision of Ph.D. Niklas Andersson
and Ph.D. Bo Kjellmert
November 21, 2006
This master thesis contains an investigation of the Portable, Extensible Toolkit for Scientific Computation (PETSc) and a time-stepping implicit solver for computational fluid dynamics problems. The work discusses two different approaches to achieving a time-stepping solver with PETSc and what other possibilities PETSc holds for Computational Fluid Dynamics (CFD) codes. An implementation using PETSc's Ordinary Differential Equation (ODE) solver is presented, as well as a solver using PETSc's nonlinear solver (SNES). The code is implemented in the Volvo Aero in-house CFD software G3D.
The way PETSc was implemented in the present work did not result in a CFD code with satisfactory performance. This thesis discusses the PETSc framework in general, the present PETSc implementation, reasons for the unsatisfactory results and suggestions for future work.
Contents

1 Introduction
2 G3D
  2.1 Basic overview
  2.2 Governing equations
3 PETSc
  3.1 Basic overview
  3.2 Compiling the PETSc Library
    3.2.1 General compilation
    3.2.2 Volvo Aero compilation
  3.3 Using Matlab with PETSc
  3.4 Basic data types in PETSc
  3.5 The non-linear solver: SNES
  3.6 The time stepper: TS
  3.7 Basic PETSc program
4 Adding PETSc solver to G3D
  4.1 Basic overview
  4.2 Handling mesh data
5 SNES Backward-Euler Implementation
  5.1 Backward-Euler
  5.2 The code
  5.3 The FormFunction()
6 TS ODE solver Implementation
  6.1 Backward-Euler
  6.2 The code
    6.2.1 The function
7 Building and Running the Implicit Solver
  7.1 Basic overview
  7.2 Program specifics
  7.3 Executing
8 Conclusions and Discussion
  8.1 Problems encountered
  8.2 Future work
9 Acronyms
Bibliography
A TS and SNES implementations in FORTRAN
  A.1 snes.F
  A.2 ts.f
B The modified Bratu example
  B.1 ex5f.F
  B.2 ex5f.h
C The makefile
1 Introduction
The thermo- and aerodynamics department at Volvo Aero has during the past years performed substantial refinement of its in-house CFD codes, VolSol, in order to meet the demand for more advanced methods in the field of space and aircraft propulsion.
With the introduction of state-of-the-art methods, such as unique algorithms for non-ideal-gas formulations and conjugate heat transfer simulations, there is now a need for a new implicit formulation as well as an implementation of advanced parallelization in the code. This work should be seen as an initial investigation into increasing the convergence rate to a steady flow solution with the in-house CFD code by a factor of at least 10. All work has been done on a code similar to the actual production code, through the use of PETSc. Furthermore, an investigation into PETSc's built-in parallelization and dynamic multigrid methods was conducted. The ABSOFT FORTRAN 8.0 compiler is used throughout unless otherwise stated.
2 G3D
2.1 Basic overview
G3D is a CFD software developed at Chalmers University of Technology and Volvo Aero by Eriksson (1995). The software works with general structured, boundary-fitted, curvilinear, non-orthogonal multi-block meshes. It solves the Navier-Stokes equations using an explicit 3-stage Runge-Kutta algorithm; by introducing an implicit solver, larger time steps can be taken even in low-speed cases, hopefully speeding up the calculation. G3D uses the finite-volume method. The version of G3D used in this thesis was not parallelized; however, there exist versions where the calculations are blockwise parallel, meaning that the mesh blocks are distributed over the available nodes and calculations are performed on the local grids.
2.2 Governing equations
In this section the equations governing the fluid dynamics of the flows used in G3D are presented in tensor form. This should be seen as background information; if more information is needed, please read Andersson (2005) [1].
The Reynolds-averaged Navier-Stokes (RANS) equations constitute a set of transport equations governing transport of mass, momentum and energy:

$$\frac{\partial Q}{\partial t} + \frac{\partial F_j}{\partial x_j} = 0 \qquad (1)$$
where

$$Q = \begin{pmatrix} \rho \\ \rho \tilde{u}_i \\ \rho \tilde{e}_0 \end{pmatrix} \qquad (2)$$

and

$$F_j = \begin{pmatrix} \rho \tilde{u}_j \\ \rho \tilde{u}_i \tilde{u}_j + p\delta_{ij} - \tau_{ij} \\ \rho \tilde{u}_j \tilde{h}_0 + q_j - \tilde{u}_i \tau_{ij} \end{pmatrix} \qquad (3)$$
where $\tau_{ij}$ is the viscous stress obtained using the Boussinesq assumption, defined by

$$\tau_{ij} = (\mu + \mu_t)\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} - \frac{2}{3}\frac{\partial u_k}{\partial x_k}\delta_{ij}\right) - \frac{2}{3}\rho k \delta_{ij} \qquad (4)$$
The eddy viscosity is defined by

$$\mu_t = C_\mu \rho \frac{k^2}{\varepsilon} \qquad (5)$$
Fourier's heat law gives the heat flux expressed as

$$q_j = -\lambda \frac{\partial \tilde{T}}{\partial x_j} = -C_p\left(\frac{\mu}{Pr} + \frac{\mu_t}{Pr_t}\right)\frac{\partial \tilde{T}}{\partial x_j} \qquad (6)$$
Turbulent kinetic energy and dissipation of turbulent kinetic energy are modeled with a standard k-ε model:

$$\frac{\partial k}{\partial t} + \frac{\partial}{\partial x_j}\left[\rho u_j k - \left(\mu + \frac{\mu_t}{\sigma_k}\right)\frac{\partial k}{\partial x_j}\right] = P_k - \rho\varepsilon \qquad (7)$$

$$\frac{\partial \varepsilon}{\partial t} + \frac{\partial}{\partial x_j}\left[\rho u_j \varepsilon - \left(\mu + \frac{\mu_t}{\sigma_\varepsilon}\right)\frac{\partial \varepsilon}{\partial x_j}\right] = \left(C_{\varepsilon 1} P_k - C_{\varepsilon 2}\,\rho\varepsilon\right)\frac{\varepsilon}{k} \qquad (8)$$
The expression for the production of turbulent kinetic energy in the standard k-ε model is written as

$$P_k = \left[\mu_t\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} - \frac{2}{3}\frac{\partial u_k}{\partial x_k}\delta_{ij}\right) - \frac{2}{3}\rho k \delta_{ij}\right]\frac{\partial u_i}{\partial x_j} \qquad (9)$$

The gas mixture behaves as a perfect gas:

$$P = \rho R T \qquad (10)$$
Eqn. 1 is discretized on a structured non-orthogonal boundary-fitted mesh. Integrating Eqn. 1 over an arbitrary volume Ω gives

$$\int_\Omega \frac{\partial Q}{\partial t}\,dV + \int_\Omega \frac{\partial F_j}{\partial x_j}\,dV = 0 \qquad (11)$$
Using Gauss' theorem to rewrite the second volume integral as a surface integral, and introducing $\bar{Q}$ as the cell average of Q over Ω, gives

$$\frac{\partial \bar{Q}}{\partial t} V + \int_{\partial\Omega} F_j \cdot dS_j = 0 \qquad (12)$$
In Eqn. 12, $S_j = n_j S$ is the face area normal vector, an area with a direction. The integral of the fluxes is approximated using the areas of all faces and the face-average fluxes:

$$\int_{\partial\Omega} F_j \cdot dS_j = \sum_{i=1}^{\text{all faces}} F_j^i \cdot S_j^i \qquad (13)$$
Eqn. 11 can now be written as

$$\frac{\partial \bar{Q}}{\partial t} V + \sum_{i=1}^{\text{all faces}} F_j^i \cdot S_j^i = 0 \qquad (14)$$
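The cell-update rule in Eqn. 14 can be made concrete with a minimal one-dimensional sketch (plain Python, not G3D code; the periodic domain and first-order upwind face flux are assumptions chosen purely for illustration):

```python
# One finite-volume update per Eqn. 14: each cell average changes by the sum
# of its face fluxes divided by the cell volume (here dx in 1D).
# Model problem: linear advection q_t + a*q_x = 0, a > 0, periodic domain,
# first-order upwind face fluxes. This is an illustration, not G3D code.
def fv_step(q, a, dx, dt):
    n = len(q)
    # Upwind flux at face i, the face between cells i-1 and i.
    flux = [a * q[(i - 1) % n] for i in range(n)]
    # Right face of cell i is face i+1, left face is face i.
    return [q[i] - (dt / dx) * (flux[(i + 1) % n] - flux[i]) for i in range(n)]

q0 = [0.0, 0.0, 1.0, 0.0, 0.0]
q1 = fv_step(q0, a=1.0, dx=1.0, dt=1.0)  # CFL = 1: an exact one-cell shift
```

Because every interior face flux appears once with each sign, the face-flux sum telescopes and the scheme is conservative by construction, which is the main point of the finite-volume form.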
The total flux, $F_j$, is divided into a convective part and a diffusive part:

$$F_j = \begin{pmatrix} \rho \tilde{u}_j \\ \rho \tilde{u}_i \tilde{u}_j + p\delta_{ij} \\ \rho \tilde{e}_0 \tilde{u}_j + p \tilde{u}_j \end{pmatrix} + \begin{pmatrix} 0 \\ -\sigma_{ij} - \tau_{ij} \\ -C_p\left(\dfrac{\mu}{Pr} + \dfrac{\mu_t}{Pr_t}\right)\dfrac{\partial \tilde{T}}{\partial x_j} - \tilde{u}_i(\sigma_{ij} + \tau_{ij}) \end{pmatrix} \qquad (15)$$
where the first part is the convective and inviscid part and the second part is the diffusive and viscous part. In the code, fluxes are evaluated for each cell face in the computational domain by moving a four-point stencil around the domain cells, i.e. they are evaluated in a "method of lines" manner. Interface faces and boundary faces are treated separately using special versions of the flux routines. In the boundary-face routines, ghost-cell values needed for the evaluation of the four-point-stencil flux expression are extrapolated using the method of characteristics (MOC) or similar.
As mentioned earlier, G3D uses a three-stage second-order low-storage Runge-Kutta time-marching method. When all convective and diffusive fluxes have been calculated (described in detail in Andersson [1]), the temporal derivative of the flow variables for a specific cell is estimated as

$$\frac{\partial Q^n}{\partial t} = F^n \qquad (16)$$

$$Q^{*} = Q^n + \Delta t F^n, \qquad Q^{**} = \frac{1}{2}\left[Q^n + Q^{*} + \Delta t F^{*}\right], \qquad Q^{n+1} = \frac{1}{2}\left[Q^n + Q^{*} + \Delta t F^{**}\right] \qquad (17)$$

where superscript n denotes the present time step and n + 1 the next time step. Superscripts * and ** denote the sub-time steps, and Δt is the solver time step, the time between n and n + 1.
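The three stages of Eqn. 17 translate directly into code; the following sketch (illustrative Python, not the G3D implementation) applies them to a scalar model ODE:

```python
import math

# Three-stage second-order low-storage Runge-Kutta step, following Eqn. 17.
# F is the right-hand side of dQ/dt = F(Q) (Eqn. 16), here for a scalar Q.
def rk3_step(F, qn, dt):
    qs = qn + dt * F(qn)                  # Q*    = Q^n + dt*F^n
    qss = 0.5 * (qn + qs + dt * F(qs))    # Q**   = [Q^n + Q* + dt*F*]/2
    return 0.5 * (qn + qs + dt * F(qss))  # Q^n+1 = [Q^n + Q* + dt*F**]/2

# Integrate dq/dt = -q from q(0) = 1 to t = 1; the exact answer is exp(-1).
q, dt = 1.0, 0.001
for _ in range(1000):
    q = rk3_step(lambda u: -u, q, dt)
```

Three residual evaluations per step, but only the previous state and the latest stage need to be stored, which is what makes the scheme low-storage.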
3 PETSc
3.1 Basic overview
PETSc is short for Portable, Extensible Toolkit for Scientific Computation and is used for advanced numerical computations. PETSc is a set of libraries and data structures for solving linear and nonlinear systems of equations in parallel; parallelization is built in through the Message Passing Interface (MPI). PETSc is developed by scientists at the Mathematics and Computer Science Division at Argonne National Laboratory and is mainly funded by the U.S. Department of Energy.
Each library works with a specific area and is used by higher-level libraries; for example, the time stepping library uses the vector library and the non-linear solver library to solve time stepping problems. The libraries support a multitude of types and algorithms: vectors, Krylov subspace methods, matrices, nonlinear solvers, time steppers, multigrid methods, distributed data types, automatic differentiation and more.
PETSc is by design well suited to plug into existing code; for example, the vector implementation has been made so that it is easy to convert the data structures used by old code to the PETSc framework. All types in PETSc have been designed with parallel operations in mind. This means that it is relatively easy to make software that performs calculations in parallel with little prior knowledge of parallel programming. The code is written to promote code reuse and flexibility.
The PETSc libraries are modern and object-oriented. Functions for handling errors and for managing and displaying data are built in. PETSc is regularly tested with different compilers by the PETSc development team, using the current version of each compiler; the ABSOFT brand of FORTRAN compilers is among those tested.
The version of PETSc used in this work is 2.3.0; when starting a new project it is always recommended to use the latest version. The PETSc website [2] is the main resource for documentation and support. The PETSc team provides limited support for the current version on public mailing lists. These lists, together with the documentation, are very good sources of experience and information, and the people answering are very competent.
Function Operation
Create() Create the object.
Get/SetName() Name the object.
Get/SetType() Set the implementation type.
Get/SetOptionsPrefix() Set the prefix for all options.
SetFromOptions() Use command line options.
SetUp() Perform other initialization.
View() View the object.
Destroy() Cleanup object allocation.
Table 1: Almost every object in PETSc supports a basic interface shown above.
3.2 Compiling The PETSc Library
3.2.1 General compilation
The installation procedure depends on which architecture the library is intended to be used on, detailed information on the installation procedure is available at the PETSc website. The information given here should be seen as a guide for quick installation.
The primary source of information should always be the website. The instructions assume that the PETSc source code has been downloaded and unpacked in an appropriate directory where the person intending to install has the correct permissions, and that a supported environment meeting all the requirements stated for the PETSc library is available.
3.2.2 Volvo Aero compilation
The procedure for compiling PETSc at Volvo Aero differs from the installation procedure described in the PETSc installation manual. This is due to the limitations of the ABSOFT FORTRAN 8.0 compiler, which is the compiler available on the Volvo Aero computers. This compiler is old and has many flag mismatches that are not compatible with newer conventions, forcing the user to manually control the behaviour of the PETSc installation scripts by editing configuration files and appending the appropriate compiler flags.
Before trying to compile any PETSc application in the Volvo Aero environment, the file petscconf has to be edited. It can be found in the directory $PETSC_DIR/bmake/linux-gnu/. All occurrences of -Wl,-rpath have to be changed to -L, otherwise the compiler will complain. For an explanation of the compiler flags, the ABSOFT FORTRAN 8.0 manual [6] is recommended.
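The substitution itself is mechanical; assuming the offending flag takes the -Wl,-rpath, form, it can be sketched as a plain text replacement (the example line below is invented for illustration and is not the real file contents):

```python
# Sketch of the petscconf edit: every '-Wl,-rpath,' style linker flag is
# replaced with '-L', which the ABSOFT 8.0 compiler accepts. The example
# line is illustrative; the real file is $PETSC_DIR/bmake/linux-gnu/petscconf.
def fix_flags(text):
    return text.replace("-Wl,-rpath,", "-L")

line = "CLINKER_FLAGS = -Wl,-rpath,/opt/petsc/lib -lpetsc"
fixed = fix_flags(line)  # -> 'CLINKER_FLAGS = -L/opt/petsc/lib -lpetsc'
```

In practice the same substitution can of course be done directly in petscconf with a text editor or sed.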
The PETSc configuration scripts provide functionality to download and unpack external packages directly from the command line. But since the Volvo Aero computers access the Internet via a proxy, there is no good way to reach the Internet from the command line. If Basic Linear Algebra Subprograms (BLAS) and MPI are wanted, they have to be downloaded separately, unpacked and put in $PETSC_DIR/externalpackages/; then run configure.py with the --download-* options and PETSc will use the downloaded packages.
3.3 Using Matlab with PETSc
There are many ways to view and manipulate data in PETSc. An interesting option is Matlab, which is not used in this work but could potentially be useful when performing calculations. There are a couple of ways to use Matlab with PETSc; you can for example:
• Dump data for use in Matlab
• Use Matlab interactively from PETSc
• Automatically send data back and forth between PETSc and Matlab
A detailed explanation of how to do these operations is given in the PETSc manual [3].
It is, however, trivial to output all vectors or matrices from the command line in Matlab format. For example, -vec_view_matlab will print every vector to standard output (generally the monitor) each time it is assembled.
3.4 Basic Data types in PETSc
PETSc has data types that replace int, real and others, with names like PetscInt and PetscScalar. The PETSc users manual gives instructions on how to use them. Note, however, that by default PETSc uses double precision while FORTRAN by default uses single precision (at least the Volvo Aero ABSOFT FORTRAN installation does). This means that, for instance, a PetscScalar is the same as real*8; if this needs to be remedied, the PETSc library must be recompiled with the proper flags set. One of the simplest PETSc objects is Vec, a standard vector; it can be defined as a sequential or a parallel vector (using MPI) and has the same interface in both cases. The vector object, like all other PETSc objects, has predefined functions for the basic manipulations needed.
There are also objects to handle the management of structured and unstructured grids. The PETSc Distributed Array (DA) object handles distributed arrays for structured grids; IS and VecScatter handle vectors for unstructured grids. There is also support for an arbitrary number of degrees of freedom per vector entry and cell.
3.5 The non-linear solver: SNES
The SNES object contains methods for solving systems of nonlinear equations of the form

$$F(x) = 0 \qquad (18)$$
SNES uses Newton-like methods and includes both line-search and trust-region techniques. The interface to the object is the same as for almost all PETSc objects (see Table 1). SNES works by computing the step

$$p = -J^{-1} F(u_{old}) \qquad (19)$$

(with the inverse of the Jacobian J applied only approximately) and then performing a line search on

$$u_{new} = u_{old} + \lambda p \qquad (20)$$

to get the new u. It first tries the value λ = 1, i.e. it computes $F(u_{old} + p)$. It is possible that $u_{old} + p$ contains "non-physical" values. There are two ways to handle such values:
• Before doing the line search, SNES calls a "pre-check" function that can shorten the step if it decides that the step must be decreased. A custom pre-check function can be defined with SNESLineSearchSetPreCheck().
• In the function that calculates the new values, add a check that exits the function when a non-physical value is detected. Such a solution should make sure that no calculations are done with the non-physical values.
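The update of Eqns. 19-20, together with a guard against non-physical states, can be sketched on a scalar toy problem (illustrative Python, not SNES itself; the step-halving rule and the positivity test merely stand in for SNES's real line-search and pre-check logic):

```python
# Damped Newton iteration in the spirit of Eqns. 19-20:
#   p = -J(u)^{-1} F(u),  u_new = u + lambda*p, starting from lambda = 1.
# The 'physical' test plays the role of a SNES pre-check: steps that leave
# the physical region (here u <= 0) are halved before being accepted.
def newton_ls(F, J, u, physical=lambda u: u > 0.0, tol=1e-12, max_it=50):
    for _ in range(max_it):
        f = F(u)
        if abs(f) < tol:
            break
        p = -f / J(u)                  # Newton direction
        lam = 1.0                      # first try the full step
        while not physical(u + lam * p) or abs(F(u + lam * p)) >= abs(f):
            lam *= 0.5                 # backtrack on bad or non-physical steps
            if lam < 1e-10:
                raise RuntimeError("line search failed")
        u = u + lam * p
    return u

# Solve u^2 - 2 = 0 with the constraint u > 0; the answer is sqrt(2).
root = newton_ls(lambda u: u * u - 2.0, lambda u: 2.0 * u, 5.0)
```

The backtracking loop enforces a strict residual decrease, so accepted steps can never make the solution worse, at the cost of extra residual evaluations.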
The SNES object also supports matrix-free methods, which do not need an explicitly formed Jacobian. Matrix-free methods are used in this work because the Jacobian was hard to construct. The matrix-free methods in PETSc are generally slower than those using a Jacobian.
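The idea behind a matrix-free method is that a Krylov solver only ever needs Jacobian-vector products, and these can be approximated by a one-sided finite difference of the residual; a sketch follows (PETSc chooses the differencing parameter h more carefully than this fixed value):

```python
# Matrix-free Jacobian-vector product: J(u) v is approximated by a one-sided
# finite difference of the residual, J(u) v ~ (F(u + h*v) - F(u)) / h, so the
# Jacobian matrix itself is never formed or stored.
def jac_vec(F, u, v, h=1e-7):
    up = [ui + h * vi for ui, vi in zip(u, v)]
    return [(a - b) / h for a, b in zip(F(up), F(u))]

# Toy residual F(u) = (u0^2 - u1, u0 + u1); its Jacobian at u = (2, 1) is
# [[4, -1], [1, 1]], so for v = (1, 3) the exact product is (1, 4).
def F(u):
    return [u[0] ** 2 - u[1], u[0] + u[1]]

Jv = jac_vec(F, [2.0, 1.0], [1.0, 3.0])
```

Each product costs one extra residual evaluation, which is why matrix-free Newton-Krylov iterations are typically slower per step than ones with an assembled Jacobian.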
Figure 3 shows a basic SNES application, including the parts of the program that the programmer must implement. The SNES object is implemented on top of smaller components, as shown in Figure 1.
Figure 1: The SNES object is built of smaller components: a preconditioner (PC), a linear solver (SLES), a profiling interface and communication kernels.
3.6 The time stepper: TS
The PETSc Time Stepper (TS) object is an ODE integrator. It solves ODEs of the form

$$\frac{du}{dt} = F(u) \qquad (21)$$

where the user provides F and its Jacobian. The time stepper uses the PETSc nonlinear solver to solve the equations and provides an easy way to change the time stepping algorithm. See Table 2 for a list of currently supported time stepping algorithms.
The TS object is also built on top of smaller modules, as shown in Figure 2.

Figure 2: The TS object is built on top of smaller modules: the nonlinear solver (SNES), a preconditioner (PC), a linear solver (SLES), a profiling interface and communication kernels.
Define              Type             Description
TS_EULER            euler            The explicit Forward-Euler method
TS_BEULER           beuler           The implicit Backward-Euler method
TS_PSEUDO           pseudo           Pseudo-time stepping
TS_CRANK_NICHOLSON  crank-nicholson  The implicit Crank-Nicholson method
TS_SUNDIALS         sundials         The LLNL CVODE/SUNDIALS package
TS_RUNGE_KUTTA      runge-kutta      The explicit Runge-Kutta methods

Table 2: ODE solvers to use with PETSc's TS object.
3.7 Basic PETSc program
To fully utilize PETSc's capabilities, much time must be spent reading the documentation and compiling the tutorials and examples. PETSc has a high entry threshold, mainly because it is designed to be scalable and mathematically correct.
Application code that uses the PETSc libraries must be bracketed by an initialize routine and a finalize routine that create and destroy the framework; in effect, all code relating to PETSc must be contained within:
      call PetscInitialize(PETSC_NULL_CHARACTER, ierr)
      call PetscUserCustomRoutines(ierr)
      call PetscFinalize(ierr)

Option         Description
-ts_max_steps  Maximum number of iterations to use
-ts_max_time   Set maximum time
-ts_type       Set which ODE solver to use, see Table 2

Table 3: Command-line options for PETSc's TS object
There are routines to access the command line from within a PETSc-enabled program. For instance, to read the command-line option -dt and transfer it to the dt variable, write:

      call PetscOptionsGetReal(PETSC_NULL_CHARACTER, '-dt', dt, flg, ierr)
4 Adding PETSc solver to G3D
4.1 Basic overview
When implementing an implicit solver we have to realise that even the best implicit solver step will take much more computation time than an explicit step, at least a factor of 10 and often closer to a factor of 100. Thus, to get a faster solver than the explicit one, the time step must be increased by at least a factor of 100 or more; anything less is not worth the effort of making an implicit code. If the time step cannot be increased by this amount it might be better to concentrate on making the explicit code as fast as possible.
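The break-even argument above is plain arithmetic: the net gain is the time-step increase divided by the per-step cost factor, as a quick check shows:

```python
# Break-even arithmetic for an implicit solver: the net gain over the
# explicit solver is the time-step increase divided by the per-step cost.
def net_speedup(dt_gain, cost_factor):
    return dt_gain / cost_factor

break_even = net_speedup(100.0, 100.0)  # 100x bigger steps at 100x the cost
goal = net_speedup(1000.0, 100.0)       # the factor-10 goal needs ~1000x steps
```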
To make sure that no new calculation errors are introduced, as few changes as possible to the original G3D code are desirable. This is best done by keeping all the original calculation code and replacing only the code that performs the step. Since many prospective users of the toolkit have a lot of old code, the toolkit has been designed to be easy to integrate.
For instance, the PETSc Vec object is designed to be an easily used replacement for an old vector representation, and automatically adds parallel properties to the vector.
During the course of this work, two possible ways of solving time-dependent differential equations were explored. The first solution employs the functionality of the PETSc nonlinear solver library SNES with a custom-made time stepper attached. This solution provides direct manipulation of the step size and a way to handle errors.
The second solution uses the functionality of PETSc time stepping library TS, which in turn uses the nonlinear solver library. One of the benefits of using the TS library is the ability to change the time stepping algorithm easily from the command line.
4.2 Handling mesh data
Since G3D uses a block-structured mesh and PETSc does not support that type directly, the mesh data must be handled explicitly. There are two main alternatives:
• Use old code to manage the parallel blocks and communication for them, but use a big SNES around it to solve the implicit systems.
• Use multiple PETSc DA’s to manage the collection of blocks and use a big SNES around it.
The choice of implementation depends on how difficult/flexible the G3D code is. In
theory, not that much communication code has to be written for either approach, it is
PETSc