
Institutionen för systemteknik

Department of Electrical Engineering

Master's Thesis

Ambient Occlusion for Dynamic Objects and

Procedural Environments

Master's thesis carried out in Information Coding at the Institute of Technology at Linköping University

by

Joel Jansson

LiTH-ISY-EX--13/4658--SE

Linköping 2013

Department of Electrical Engineering, Linköpings tekniska högskola, Linköpings universitet, SE-581 83 Linköping, Sweden


Ambient Occlusion for Dynamic Objects and

Procedural Environments

Master’s Thesis

by

Joel Jansson

LiTH-ISY-EX--13/4658--SE

Supervisor: Gustav Taxén

Avalanche Studios

Examiner: Ingemar Ragnemalm

ISY, Linköpings universitet


Division, Department: Division of Information Coding, Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden

Date: 2013-06-10

Language: English

Report category: Master's thesis (Examensarbete)

URL for electronic version: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-91918

ISRN: LiTH-ISY-EX--13/4658--SE

Title: Ambient Occlusion for Dynamic Objects and Procedural Environments

Author: Joel Jansson

Abstract

In computer graphics, lighting is an important area. To simulate shadows from area light sources, indirect lighting and shadows from indirect light, a class of algorithms commonly known as global illumination algorithms can be used. Ambient occlusion is an approximation to global illumination that can emulate shadows from area light sources and shadows from indirect light, giving very soft shadows. For real-time applications, ambient occlusion can be precomputed and stored in maps or per vertex. However, that can only be done with good results if the geometry is static. Therefore, a number of methods that can handle more or less dynamic scenes have been introduced in recent years.

In this thesis, a collection of ambient occlusion methods for dynamic objects and procedural environments will be described. The main contribution is the introduction of a novel method that handles ambient occlusion for procedural environments. Another contribution is a description of an implementation of Screen Space Ambient Occlusion (SSAO).

SSAO is an algorithm that calculates approximate ambient occlusion in real-time by using the depths of surrounding pixels. It handles completely dynamic scenes with good performance.

The method for procedural environments handles the scenario where a number of building blocks are procedurally assembled at run-time. The idea is to precompute an ambient occlusion map for each building block where the self-occlusion is stored. In addition, an ambient occlusion grid is precomputed for each block to accommodate the inter-block occlusion. At run-time, after the building blocks have been assembled, the ambient occlusion from the grids is blended with the ambient occlusion from the maps to generate new maps, valid for the procedural environment. Following that, the environment can be rendered with high-quality ambient occlusion at almost no cost, in the same fashion as for a static environment where the ambient occlusion maps can be completely precomputed.

Keywords: SSAO, Ambient Occlusion, Global Illumination, Real-Time Rendering, Computer Graphics


Abstract

In computer graphics, lighting is an important area. To simulate shadows from area light sources, indirect lighting and shadows from indirect light, a class of algorithms commonly known as global illumination algorithms can be used. Ambient occlusion is an approximation to global illumination that can emulate shadows from area light sources and shadows from indirect light, giving very soft shadows. For real-time applications, ambient occlusion can be precomputed and stored in maps or per vertex. However, that can only be done with good results if the geometry is static. Therefore, a number of methods that can handle more or less dynamic scenes have been introduced in recent years.

In this thesis, a collection of ambient occlusion methods for dynamic objects and procedural environments will be described. The main contribution is the introduction of a novel method that handles ambient occlusion for procedural environments. Another contribution is a description of an implementation of Screen Space Ambient Occlusion (SSAO).

SSAO is an algorithm that calculates approximate ambient occlusion in real-time by using the depths of surrounding pixels. It handles completely dynamic scenes with good performance.

The method for procedural environments handles the scenario where a number of building blocks are procedurally assembled at run-time. The idea is to precompute an ambient occlusion map for each building block where the self-occlusion is stored. In addition, an ambient occlusion grid is precomputed for each block to accommodate the inter-block occlusion. At run-time, after the building blocks have been assembled, the ambient occlusion from the grids is blended with the ambient occlusion from the maps to generate new maps, valid for the procedural environment. Following that, the environment can be rendered with high-quality ambient occlusion at almost no cost, in the same fashion as for a static environment where the ambient occlusion maps can be completely precomputed.


Acknowledgments

I would like to thank my friend Linda for reviewing this thesis. I would also like to thank my mother for letting me stay at her place and write on this thesis on numerous occasions. Finally, I would like to thank Ulf Assarsson for kindly and swiftly answering my question about his article.

The bunny model is courtesy of Stanford Computer Graphics Laboratory.


Contents

1 Introduction
  1.1 Background
    1.1.1 Ambient Occlusion
    1.1.2 Lighting in Video Games
  1.2 Problem
  1.3 Goal
  1.4 Solution
  1.5 Structure of This Thesis
  1.6 Abbreviations

2 Background
  2.1 Review of Some Computer Graphics
    2.1.1 The Rendering Pipeline
    2.1.2 Other Concepts
  2.2 Ambient Occlusion
    2.2.1 Derivation of the Normalization Factor
    2.2.2 Obscurance
    2.2.3 Interpretation of Ambient Occlusion
  2.3 Related Work
    2.3.1 The Early Work
    2.3.2 Handling Dynamic Objects and Special Cases
    2.3.3 Screen Space Methods
    2.3.4 Conclusion

3 Screen Space Ambient Occlusion
  3.1 Overview of the Method
  3.2 Details of the Algorithm
    3.2.1 Avoiding Incorrect Self-Occlusion
    3.2.2 The Depth Buffer and Reconstructing the Position
  3.3 Strengths and Weaknesses

4 Ambient Occlusion Grids
  4.1 Overview of the Method
  4.2 Precomputation
    4.2.1 Recalculation of Grid Values
  4.3 Run-time
  4.4 Calculating the World to Grid Matrix
  4.5 Self-Occlusion
  4.6 Strengths and Weaknesses

5 Ambient Occlusion For Procedural Environments
  5.1 First Version
  5.2 Improved Version
    5.2.1 Generating the Position and Normal Maps
  5.3 Adding Gutter
    5.3.1 Precomputation
    5.3.2 Run-time
    5.3.3 Remaining Problems
  5.4 Strengths and Weaknesses

6 Implementation
  6.1 SSAO
  6.2 AO Grids
    6.2.1 Precomputation
  6.3 AO for Procedural Environments
  6.4 Ray Casting AO
    6.4.1 Acceleration
    6.4.2 AO Map Generation

7 Results
  7.1 SSAO
  7.2 AO Grids
  7.3 AO for Procedural Environments

8 Conclusion and Future Work
  8.1 SSAO
  8.2 AO Grids
  8.3 AO for Procedural Environments
  8.4 Future Work


Chapter 1

Introduction

In this chapter the background is explained, which leads to the problem and goal formulations. The solution is then briefly presented. After that, the structure of this thesis is described, and finally a list of abbreviations is given.

1.1 Background

In applications such as video games and visualization, real-time computer graphics is used to display a series of images to convey the illusion of looking into another world. To generate these images, usually called frames, a lot of calculations have to be performed. One thing that has to be calculated is the amount of light at each point. In reality, light is very complex. Simulating the paths of individual photons, while also taking into account the wave properties of light, would be extremely computationally expensive. Therefore, simplified models are employed, which are approximations of reality. In real-time computer graphics the models are usually rather simple, and it is also common to rely on precomputation.

A quite simple model of lighting is to determine the illumination of a point by only taking into account its position and orientation relative to the light sources and optionally the viewer. This is called a local illumination model.

A phenomenon that local illumination models fail to capture is shadowing. A point is in shadow if the light source is not visible from the point in question. In addition, if a point is in shadow, that usually does not mean that it receives no light at all. This is because light is (mostly diffusely) reflected from other surfaces. This is called indirect light, shown in Figure 1.1. Models that take into account how light is blocked by obstacles and reflected onto other objects, lighting them indirectly, are referred to as global illumination models.

Global illumination models are naturally more computationally expensive than local models. A common strategy is therefore to use a local illumination model in combination with special algorithms that handle specific global phenomena such as shadows from direct light and mirror-like reflections.

Figure 1.1: Indirect light

A commonly used local illumination model is the Phong illumination model. It has a diffuse term, a specular term and an ambient term. The diffuse term models direct diffuse lighting and is view-independent. The specular term models view-dependent lighting. The ambient term is there to account for all the indirect light in the scene.

If one uses a constant ambient term everywhere, a very flat and unnatural look is achieved. A solution to this is to use fill lighting [Akenine-Möller and Haines 2002, p. 81]: strategically placed pseudo lights that simulate the indirect light. It is used to minimize the cases where an object is only lit by the ambient light.

1.1.1 Ambient Occlusion

Ambient Occlusion (AO) is often used as an approximation to global illumination. More specifically, it is used as an approximation to shadows from indirect light or from area light sources. Ambient occlusion is roughly defined as how much the nearby geometry is occluding a point. It is a scalar between zero and one that only depends on the geometry, not on any light sources. The ambient occlusion value can then be used to scale the lighting, for example the constant ambient term in the Phong illumination model. It can also be used with fill lights or lighting from environment maps. The result is proximity shadows between objects that are close to each other, and corners and crevices that are darker, which can be interpreted as these points receiving less indirect light or as shadows from area light sources. In reality, the light, including the indirect light, usually comes more from some specific directions, something that is not taken into account by ambient occlusion. A real-world lighting situation that looks similar to the results of ambient occlusion is that of an overcast or cloudy day. Figure 1.2 shows an image rendered with ambient occlusion.
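As a minimal illustration of the scaling described above, the following sketch applies an AO value to a constant ambient term. The function name and color representation are illustrative, not from the thesis, and it assumes the convention used later in the text where an AO value of one means fully occluded:

```python
# Illustrative sketch: scale a constant Phong-style ambient term by ambient
# occlusion. Assumes ao = 1 means fully occluded, so lighting is scaled
# by (1 - ao).
def apply_ambient_occlusion(ambient_color, ao):
    return tuple(c * (1.0 - ao) for c in ambient_color)

# An unoccluded point keeps the full ambient term; a point in a crevice
# (high ao) gets a darker ambient contribution.
```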

The traditional way of calculating ambient occlusion is expensive in terms of processing power and not suitable to do at run-time in a real-time application. One way to handle that is to precompute ambient occlusion and store it per vertex of a model or in ambient occlusion maps. This of course only works well for static models. If the ambient occlusion from dynamic objects is to be accounted for, and the ambient occlusion of animated characters properly handled, another method has to be used.



Figure 1.2: Ambient Occlusion

1.1.2 Lighting in Video Games

Early video games used very simple lighting. A breakthrough was the use of lightmaps for the lighting of the static world in Quake [Abrash 1997]. In Quake direct lighting including shadows was precomputed and stored in lightmaps (at run-time the lightmaps could also be modified by some simple dynamic lighting). Later games have used global illumination algorithms such as radiosity to generate lightmaps. In this way very realistic lighting can be produced including indirect diffuse illumination. This lighting is static, however. It is not possible to move the lights that were used in the precomputation and moving objects do not affect the lighting in terms of shadows and color bleeding (colored diffuse interreflections).

With the introduction of programmable GPUs, real-time dynamic per-pixel lighting became possible. Thus some games got rid of the static lightmaps and computed dynamic shadows from direct light in real-time. If that is done, however, the indirect lighting part of the lightmaps is lost. One way to get at least some sort of indirect shadows, which still fits well with dynamic direct lighting and shadows, is to use ambient occlusion. This is because ambient occlusion is only dependent on geometry and not the positions of lights. Thus, precomputed ambient occlusion maps have been used to great success, for example in the game Just Cause by Avalanche Studios.


1.2 Problem

The ambient occlusion cast by moving objects cannot be handled by precomputed AO maps, nor can the self-occlusion of animated characters be properly handled by them. Avalanche Studios wanted to use some sort of dynamic ambient occlusion to handle those cases.

Avalanche Studios also wanted to use procedurally generated environments. The idea is that various building blocks, such as walls, ceilings, panels and furniture, are hand-modeled. Then, at run-time, the building blocks are assembled in a procedural fashion to generate a large variety of environments automatically. When that is the case, AO maps can only be precomputed for the self-occlusion of the building blocks; the inter-object occlusion cannot be known until after the procedural assembly, which occurs at run-time.

1.3 Goal

The goal of this thesis was to have a working implementation of methods that can handle ambient occlusion for dynamic (moving or deformable) objects and/or procedural environments. The methods could be previously published, extensions of previously published ones, or newly invented ones.

1.4 Solution

A Screen Space Ambient Occlusion (SSAO) algorithm has been implemented that works well for moving and deformable objects. SSAO is an algorithm that calculates approximate ambient occlusion in real-time by using the depths of surrounding pixels.

The method presented in [Malmer et al. 2007] has also been implemented; it is called Ambient Occlusion Grids in this thesis. It is a method for the ambient occlusion cast by moving rigid objects. Ambient occlusion is precomputed at each point of a 3D grid around each such object. These AO grids are then used at run-time to determine the amount of AO that is cast on nearby surfaces. In the original algorithm the AO grids are also used for self-occlusion, i.e. the AO from a grid is also cast on the object that the grid corresponds to. A new way has been devised to combine AO grids with precomputed ambient occlusion maps, letting the high-frequency self-occlusion be stored in maps and the low-frequency inter-object occlusion be stored in the AO grids. This gives higher quality self-occlusion than the original Ambient Occlusion Grids method.

Finally, a new technique has been devised that handles ambient occlusion for procedural environments. It works by first precomputing an AO grid for each building block, in the same way as in the method in the previous paragraph. An AO map with the self-occlusion is also precomputed for each building block. Then at run-time, when the building blocks have been assembled into an environment, the AO from the AO grids is combined with the precomputed AO maps and saved to new AO maps that are valid for the procedurally generated environment.
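The run-time blending step can be sketched as follows. This is a hedged illustration, not the exact formula used later in the thesis: it assumes AO values in [0, 1] with one meaning fully occluded, and that occlusion from the two independent sources combines by multiplying their visibilities; the function name is made up:

```python
# Illustrative sketch: combine precomputed self-occlusion (from an AO map)
# with inter-object occlusion (sampled from an AO grid). Visibility is
# (1 - ao), and independent occluders are assumed to combine multiplicatively.
def combine_occlusion(ao_map, ao_grid):
    return 1.0 - (1.0 - ao_map) * (1.0 - ao_grid)
```

Multiplying visibilities keeps the result in [0, 1] and reduces to either input when the other is zero.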



The main part of this work was carried out at Avalanche Studios in the autumn of 2007 and the beginning of 2008. However, this thesis was not finished back then. This is worth keeping in mind regarding which methods have been implemented and which related work is referenced.

1.5 Structure of This Thesis

In this chapter the background, the problem and the goal were presented. The solution was also briefly described.

The next chapter gives a more thorough background, including a review of some computer graphics, a precise definition of ambient occlusion and an analysis of what it is. This is also where the related work is reviewed. After that come three chapters that describe SSAO, Ambient Occlusion Grids and Ambient Occlusion for Procedural Environments in depth. Following these chapters there is a description of the implementation. Then comes a chapter that presents the results, which contains screenshots and performance numbers. Finally, the thesis is concluded and some ideas for future work are presented.

1.6 Abbreviations

AO Ambient Occlusion
API Application Programming Interface
CPU Central Processing Unit
GPU Graphics Processing Unit
PCA Principal Component Analysis
PRT Precomputed Radiance Transfer
SSAO Screen Space Ambient Occlusion


Chapter 2

Background

In this chapter we will first review some relevant concepts in real-time computer graphics. The next section describes ambient occlusion. Then we will review the work others have done regarding ambient occlusion.

2.1 Review of Some Computer Graphics

It is assumed that the reader has at least taken a basic academic course in computer graphics or has equivalent knowledge.

This section starts with a review of the rendering pipeline. Then some slightly more advanced concepts are described that may not have been covered in a basic course in computer graphics. These are concepts that are good to be familiar with to understand the rest of the text.

2.1.1 The Rendering Pipeline

See figure 2.1 for a schematic description of a programmable rendering pipeline. It starts with the vertices being sent to the vertex shader, which executes a vertex shader program on each vertex. Then triangles are assembled from the vertices, clipped and mapped to the viewport. The triangles are then rasterized, and the output from the rasterizer is fragments. Those fragments are then sent to the pixel shader (or fragment shader), where a pixel shader program is executed on the fragments. The shaded fragments are then tested or blended with regard to alpha, stencil and depth. The result is colored pixels that can be shown on the screen. [Akenine-Möller and Haines 2002, p. 214].

2.1.2 Other Concepts

Here, some slightly more advanced relevant concepts are described. First comes Render to Texture, which is used in all the techniques that have been implemented. Early Z Pass and Deferred Shading are described next; these are techniques that are often combined with SSAO and AO Grids, respectively. Last comes a short section about using textures as lookup tables, which is a technique that is used in the AO Grids method in the variant with average occluded direction. It is also used in a number of the other references.

Figure 2.1: Programmable rendering pipeline, with vertex and pixel shaders

Render to Texture

Instead of showing the pixels on the screen, they can be written to a texture. This texture could then be used in a later pass. In modern GPUs there is hardware support for this and the texture does not have to go back through the CPU. This is called render to texture. [Akenine-Möller and Haines 2002, p. 223].

Early Z Pass

In the conventional rendering pipeline described above, fragments are shaded before they are depth tested. Hence, if we have a scene with a lot of overdraw, i.e. a lot of objects are drawn that overlap each other, many fragments will be shaded and then discarded. If there are a lot of complex pixel shaders, and the pixel shading hence is the bottleneck, this is a serious problem. However, newer GPUs have functionality called early-z rejection that performs the depth test already in the rasterization stage [Sekulic 2004]. To take advantage of this, the scene can be rendered in front-to-back order, to ensure that most of the fragments that will not be visible in the final scene are rejected early. Alternatively, the scene can first be rendered with pixel shading disabled. Then, without clearing the depth buffer in between, the scene is rendered a second time, now with pixel shading enabled but with depth buffer writes disabled [Akenine-Möller and Haines 2002, pp. 422-423]. This is called early z pass in [Mittring 2007]. Another term for the same concept is z pre-pass. A disadvantage with this technique is that the geometry is rendered twice.

Deferred Shading

Deferred shading takes this concept even further. First a pass is done where the world space positions, normals, diffuse colors, specular parameters, etc. of all pixels are rendered to multiple render targets. Then a second pass is made where the final color of each pixel is determined with the help of the previously generated buffers. As in early z pass, overdraw of shaded pixels is eliminated, but in deferred shading the geometry is only rendered once. [Akenine-Möller and Haines 2002, p. 232]

Using Textures as Lookup Tables

Some functions are computationally expensive to evaluate. A way to handle that is to precompute a lookup table. In pixel shader programs, this can be implemented by using a texture as a lookup table. This way we also get averaging for free by using filtered lookups. This can be used with 1-, 2- or 3-dimensional textures.
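On the CPU, the same idea can be sketched in a few lines; here a 1D table with linear interpolation stands in for a filtered 1D texture fetch (the tabulated function and table size are arbitrary examples):

```python
import math

def build_lut(f, n):
    """Tabulate f over [0, 1] at n samples (like texels of a 1D texture)."""
    return [f(i / (n - 1)) for i in range(n)]

def sample_lut(lut, x):
    """Linearly filtered lookup at normalized coordinate x in [0, 1]."""
    t = x * (len(lut) - 1)
    i = min(int(t), len(lut) - 2)
    frac = t - i
    return lut[i] * (1.0 - frac) + lut[i + 1] * frac

# Precompute once (the "texture"), then do cheap filtered lookups.
lut = build_lut(lambda x: math.exp(-x), 256)
```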

2.2 Ambient Occlusion

In section 1.1.1 a pragmatic motivation for ambient occlusion was presented, accompanied by a rough explanation of what it is. What follows here is a more rigorous definition and an analysis of what it is.

Ambient occlusion A at a point p with normal n̂ is defined as [Shanmugam and Arikan 2007]:

A(p, \hat{n}) = \frac{1}{\pi} \int_{\Omega} V(\hat{\omega}, p) \max(\hat{n} \cdot \hat{\omega}, 0) \, d\hat{\omega} \qquad (2.1)

Ω is the unit hemisphere above the point p. V(ω̂, p) is a visibility function, defined as zero if no geometry is visible in direction ω̂ from point p and one otherwise. Since Ω is the hemisphere in the direction of n̂, n̂ · ω̂ will always be nonnegative. Hence equation 2.1 can be simplified to:

A(p, \hat{n}) = \frac{1}{\pi} \int_{\Omega} V(\hat{\omega}, p) \, \hat{n} \cdot \hat{\omega} \, d\hat{\omega} \qquad (2.2)

The reason why there is a factor 1/π before the integral is that it is desirable for A to be normalized so that it varies between zero and one, and the maximum value of the integral happens to be π. Why the maximum value of the integral is π is explained in section 2.2.1.

The ambient occlusion integral is traditionally computed with a Monte Carlo method, namely ray casting. This is done by casting a number of rays in different directions in the hemisphere above the point and checking if they hit something, see figure 2.2.


Figure 2.2: Illustration of the ambient occlusion integral as well as how to compute it with ray casting
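A sketch of this ray-casting estimator, assuming a surface normal of (0, 0, 1) and a stand-in visibility predicate (a real implementation would trace rays against scene geometry and could apply a maximum ray length):

```python
# Monte Carlo estimate of the AO integral (eq. 2.2) with cosine-weighted
# hemisphere sampling. `hits_geometry` is an illustrative stand-in for a
# real ray/scene intersection test.
import math
import random

def cosine_sample_hemisphere(rng):
    """Cosine-distributed direction on the hemisphere around n = (0, 0, 1)."""
    u1, u2 = rng.random(), rng.random()
    r = math.sqrt(u1)
    phi = 2.0 * math.pi * u2
    return (r * math.cos(phi), r * math.sin(phi), math.sqrt(1.0 - u1))

def ambient_occlusion(hits_geometry, n_rays=4096, seed=1):
    """Fraction of cosine-weighted rays that hit geometry (1 = fully occluded)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_rays) if hits_geometry(cosine_sample_hemisphere(rng)))
    return hits / n_rays
```

With cosine-weighted samples, the cosine factor and the 1/π normalization of equation 2.2 are absorbed into the sampling distribution, so the estimator reduces to the plain hit fraction, matching the description attributed to [Landis 2002] below.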

Ambient occlusion was first used widely in the film industry. [Landis 2002] is sometimes credited with having introduced ambient occlusion. However, in [Bredow 2002], another chapter of the same course notes, ambient occlusion is described as well, by another author from a different company. Both these articles describe ambient occlusion as a replacement for fill or bounce lights or for more detailed global illumination algorithms.

[Landis 2002] defines ambient occlusion in terms of how to compute it, not in the form of an integral equation. It is hence described as follows: to compute AO at a point, cast rays in a hemisphere around the normal at that point, check how many rays hit some geometry and divide by the number of rays cast. The rays are also distributed with a cosine distribution, motivated by stating that the strongest diffuse light contribution is in the direction of the surface normal. The use case is placing a computer-generated object in a real-world film scene and getting realistic-looking lighting for that object. The AO is used to scale the lighting contribution from an environment map. This environment map is generated by photographing a sphere that is placed at the location where the computer-generated object will be placed.

[Bredow 2002] also describes ambient occlusion in terms of how to compute it, although the computation is slightly different. In this description it is computed by placing two area lights in the scene, one representing the sky and the other one representing the ground. These area lights are then sampled with a number of rays, taking into account blocking geometry.

[Christensen 2003] describes ambient occlusion in the same way as [Landis 2002]. In addition, he describes a maximum distance for objects to cause occlusion. In terms of calculating AO with ray casting, that translates to a maximum ray length; hits beyond this length are not registered.

2.2.1 Derivation of the Normalization Factor

The reason why the maximum value of the AO integral is π, and hence why the normalization factor is 1/π, has not been found in the related work referenced in section 2.3; therefore it is derived here.

The maximum value of the integral in equation 2.2 occurs when V(ω̂, p) = 1 for all ω̂, in other words, when p is fully occluded. The surface area of a sphere is 4πr² [Nordling and Österman 2004, p. 401]. The area of a unit sphere is hence 4π and the area of a hemisphere 2π. What we are calculating in equation 2.2 is, however, not just the integrated visibility over the hemisphere, but the cosine-weighted visibility (due to the dot product n̂ · ω̂), and this does matter for the maximum value. If V(ω̂, p) = 1 for all ω̂, the integral in equation 2.2 becomes:

\int_{\Omega} \hat{n} \cdot \hat{\omega} \, d\hat{\omega}

This integral can instead be expressed in spherical coordinates, see figure 2.3. When integrating over the surface area of a sphere in spherical coordinates, the differential area element is R² sin θ dθ dϕ [Persson and Böiers 1988, p. 271]. Since we are integrating over the unit hemisphere, R = 1. We can thus simplify the area element to sin θ dθ dϕ. The dot product n̂ · ω̂ can be expressed as cos θ and the integral hence becomes:

\int_{0}^{2\pi} \int_{0}^{\pi/2} \cos\theta \sin\theta \, d\theta \, d\varphi \qquad (2.3)

Figure 2.3: Spherical coordinates.

This double integral can be computed by first noting that the primitive function of cos x sin x can be calculated as follows:

\int \cos x \sin x \, dx = \left[\, t = \sin x, \; dt = \cos x \, dx \,\right] = \int t \, dt = \frac{t^2}{2} + C = \frac{\sin^2 x}{2} + C \qquad (2.4)

Using equation 2.4 we can compute the double integral in equation 2.3 as:

\int_{0}^{2\pi} \int_{0}^{\pi/2} \cos\theta \sin\theta \, d\theta \, d\varphi = \int_{0}^{2\pi} \left[ \frac{\sin^2\theta}{2} \right]_{0}^{\pi/2} d\varphi = \int_{0}^{2\pi} \left( \frac{1}{2} - 0 \right) d\varphi = \left[ \frac{\varphi}{2} \right]_{0}^{2\pi} = \frac{2\pi}{2} - 0 = \pi
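The result can also be checked numerically. The following sketch (function name and resolution are arbitrary) evaluates the double integral in equation 2.3 with the midpoint rule and should approach π:

```python
import math

def hemisphere_cosine_integral(n_theta=1000):
    """Midpoint-rule evaluation of the cosine-weighted hemisphere integral."""
    d_theta = (math.pi / 2) / n_theta
    inner = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta  # midpoint of the i-th interval
        # cos(theta) is the n·w term; sin(theta) comes from the area element
        inner += math.cos(theta) * math.sin(theta) * d_theta
    return 2.0 * math.pi * inner  # the phi integral contributes a factor 2*pi
```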

2.2.2 Obscurance

A concept that is very similar to ambient occlusion is obscurance. It was introduced before ambient occlusion, although it seems that the original inventors of ambient occlusion were unaware of it. In contrast to ambient occlusion, it is described in the framework of radiosity, and it is related more to indirect light than to lighting from area light sources, which is how the earliest descriptions of ambient occlusion framed it.

Obscurance w of a point P was introduced by [Zhukov et al. 1998] as:

w(P) = \frac{2}{\pi} \iint_{x \in hS^2} \rho(L(P, x)) \cos\alpha \, dx \qquad (2.5)

where:

hS² is the unit hemisphere above P.

L(P, x) is distance(P, C), where C is the first intersection point of the ray Px with the scene, or +∞ if Px does not intersect the scene.

ρ(L(P, x)) is an empirical mapping function that maps the distance L(P, x) to the energy coming from the obscuring patch in that direction.

α is the angle between the direction Px and the patch normal.

It is assumed that the farther away the first obscuring patch is, the higher the energy coming from that patch, although with an upper limit. Hence ρ(L) is a monotone increasing function for L < L_max and ρ(L) = 1 for L > L_max.
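A sketch of a mapping function with the stated properties of ρ; the linear ramp is an illustrative choice, not the function used in the obscurance papers:

```python
import math

# Illustrative rho(L): monotone increasing for L below L_max and saturating
# at one beyond it, as required in the obscurance definition.
def rho(L, L_max=1.0):
    return min(L / L_max, 1.0)

# A miss (L = infinity) maps to one, i.e. full energy assumed to arrive
# from that direction.
```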

The important difference between AO and obscurance is hence that in AO the visibility function is either zero or one, whereas the corresponding function in obscurance is continuous between zero and one, scaled with the distance to the occluder. It is also defined in the opposite way: one means not occluded at all in obscurance, whereas it means fully occluded in AO. The reason for the scaling factor 2 before the integral is not explained in the article and is likely a mistake, since it is not present in [Iones et al. 2003], which is a later paper about obscurance by two of the same authors (plus two additional ones).

2.2.3 Interpretation of Ambient Occlusion

Ambient occlusion is hence the cosine-weighted fraction of the hemisphere from which geometry is visible. So how can that be useful? Visibility is associated with shadows. But in the general case, to generate physically correct shadows, visibility cannot be decoupled from the light sources with respect to which the visibility is computed. Nevertheless, if we look at figure 1.2 we can see that it contains a lot of soft shadows. Soft shadows can occur because of two things: either direct lighting from area light sources or indirect light, which can also be seen as light from an area light source, namely the whole surrounding environment. There is, however, no light source mentioned in the ambient occlusion equation. Therefore, a number of different interpretations of what ambient occlusion is, and of what light source or light sources we could be in shadow from at point p, are presented in the following.

The first one is that it is a shading and shadowing factor for direct light coming from an area light source with uniform intensity over the positive hemisphere. In this case, the decoupling of visibility from light source location can be done without approximations, because the light is uniform over the hemisphere. A real-world situation where similar lighting occurs is, for example, the sky dome on an overcast day.

The second interpretation is that it is an aggregated visibility factor for all the diffuse light coming from the surrounding environment, except from the object itself. That is, both direct and indirect light. For this interpretation to be useful, an environment map around the object has to be generated. The ambient occlusion also has to be computed for the object in isolation, otherwise objects around that are meant to reflect indirect light are actually counted as occluders. It is clearly an approximation, because the light is not uniform over the hemisphere. This is the interpretation used in e.g. [Landis 2002] and parts of [Malmer et al. 2007].

The third interpretation is that it is a factor that determines how much of the ambient light reaches the point p. This means that it is assumed that the surfaces close to p contribute nothing in terms of indirect illumination at p and hence nothing in terms of ambient light. This is hence a factor that determines how much of the ambient light is shadowed. For this to work, we have to set the visibility function to zero beyond a certain distance. For this interpretation, as well as for the first one, it is relevant to talk about self-occlusion and inter-object occlusion; self-occlusion being how the object shadows itself, and inter-object occlusion being contact shadows and proximity shadows. This interpretation is also an approximation, because in most real-world situations the indirect light does not come equally distributed from all directions. This is the interpretation used in [Zhukov et al. 1998], in other parts of [Malmer et al. 2007], as well as in [Shanmugam and Arikan 2007] and [Mittring 2007].

2.3 Related Work

In this section the related work is reviewed. First comes a subsection describing early work that in some way relates to ambient occlusion. Then comes a description of methods for handling dynamic objects and different kinds of special cases. After that, the screen space (or image space) class of ambient occlusion methods is reviewed. The section ends with a conclusion.

2.3.1 The Early Work

[Miller 1994] introduces accessibility shading, a precursor to ambient occlusion. It is used to shade surfaces according to aging. When man-made objects age, dust and tarnish build up; they are then cleaned, but it is hard to reach corners and crevices, so some dust and tarnish remain there. It is this process that accessibility shading aims to simulate. It does so by calculating how accessible the surface is to a sphere-shaped probe. This gives results similar to calculating how much of the hemisphere is visible.

[Zhukov et al. 1998] introduces obscurance, which essentially is the same as ambient occlusion. The difference is that the obscurance of a point P is defined as the integration over the hemisphere of an approximation of the energy coming from the patches visible from P. It is assumed that more energy is coming from patches further away, and hence the energy coming from a specific patch is taken as a monotone increasing function of the distance to the patch. This function has an upper bound of one, when the distance is larger than some maximum. The visibility in a direction is defined in terms of the first patch hit by a ray in that direction from P.

[Landis 2002] explains the use of ambient occlusion in the movie industry. It is precomputed and stored in ambient occlusion maps. These are then combined with lighting stored in environment maps. He also describes the use of an extension to ambient occlusion called the "bent normal". When the ambient occlusion is computed, an average light direction vector is also calculated. This is then used to bend the normals when they are used to look up the ambient light in the environment maps.

[Sloan et al. 2002] introduces precomputed radiance transfer (PRT). Radiance transfer is the transfer function from incident to exiting radiance. This can be precomputed for static objects. In the article it is stored in a Spherical Harmonics (SH) basis. At run-time, distant light expressed as environment maps in SH basis is simply dotted with the PRT to solve the rendering equation. Radiance neighborhood-transfer is also introduced to model the radiance transfer from an object onto a receiver. In this case a matrix multiplication instead of a dot product has to be performed. This can be used to cast shadows, reflections and caustics.

A strength of this method is that it completely handles low frequency radiance transfer, not just the visibility as ambient occlusion does. It seems to be quite fast for self-transfer. A drawback is that it depends on a costly precomputation step. It is also not as fast for neighborhood-transfer.

[Pharr and Green 2004] popularized ambient occlusion for real-time applications such as video games. They present the method of precomputing ambient occlusion and storing it as vertex attributes or in ambient occlusion maps. Furthermore, they describe the idea of illuminating the object with an environment map and using the occlusion value as well as a bent normal (average unoccluded direction) to find the final lighting value.

To handle dynamically deforming objects such as the meshes in character animation, they introduce the idea of storing the AO values for a series of reference poses and interpolating between them at run-time. They do not give an implementation of that method but refer to the NVIDIA technology demo "Ogre" as proof of its feasibility.

They also describe the method of generating AO maps by unwrapping the model. They do this by having a vertex program that outputs the UV coordinate as screen space position.

2.3.2 Handling Dynamic Objects and Special Cases

[Bunnell 2005] describes a method for dynamic ambient occlusion and indirect lighting. It handles ambient occlusion for both deforming and moving objects. In the solution the geometry is approximated with discs, and ambient occlusion is then calculated between the discs. It can also be used to calculate the indirect lighting transferred between the discs. A disadvantage of this method is that the performance depends on the complexity of the geometry. By utilizing a hierarchy, the time complexity is nevertheless kept at O(n log n), where n is the number of vertices.

[Kontkanen and Laine 2005] introduces ambient occlusion fields, which are used to calculate ambient occlusion cast on a receiver from an occluder. Ambient occlusion is precomputed and stored in a field around the occluder object. This is practically implemented as two cubemaps that store functions. The first map stores the spherical cap as a function of direction and distance. The second map stores the average occluding direction. At run-time an integral has to be calculated, but it is precomputed and stored in a lookup table, also stored as a cubemap. The strong point of this method is that it has quite low memory requirements. A drawback is that it needs a long precomputation time. It also has no self-occlusion.

[Zhou et al. 2005] presents a method to handle direct illumination giving soft shadows in dynamic scenes. For each scene entity a shadow field is precomputed: Source Radiance Fields (SRF) for light sources and Object Occlusion Fields (OOF) for objects. The shadow fields are a set of concentric spheres with a cube map at each sample point. The cube maps are compressed by representing them with spherical harmonics for low frequency shadows or wavelets for high frequency shadows. At run-time the SRFs and OOFs are combined according to the scene configuration.

[Kontkanen and Aila 2006] presents a method for rendering animated characters with dynamic ambient occlusion. They store ambient occlusion per vertex as a function of pose. The pose is expressed as a number of animation parameters, for example the joint angles. For each pose the ambient occlusion value at every vertex is stored. They do not compress the data, and they have not implemented a hardware version of the run-time evaluation. It gives visually good results for the self-occlusion of animated characters. The disadvantages are that it needs very long precomputation time and that performance and storage are geometry dependent and dependent on the number of poses. It also only handles self-occlusion of the characters, not the occlusion that is cast onto the environment.

[Kirk and Arikan 2006] presents another method for dynamic ambient occlusion for character animation. They also store ambient occlusion as a function of pose, but the pose representation is different and the data is also compressed. Poses are clustered and PCA is performed on the pose space. The strengths of this method are that it gives visually good results and probably has fast run-time evaluation (no numbers are given). The disadvantages are that it requires a very long precomputation step and needs to store quite a lot of data even though it is compressed.

[Hegeman et al. 2006] present an approximation to ambient occlusion in outdoor scenes with a lot of vegetation. The main method, approximate ambient occlusion for trees, is based on properties of trees: deeper into the tree and closer to the ground, occlusion is higher. Ambient occlusion cast from trees onto the ground or onto moving objects is approximated by the solid angle of the occluder (the tree). More specifically, the tree is approximated by spheres, or only one sphere, which according to the authors works well in practice.

[Malmer et al. 2007] presents a method for the ambient occlusion cast by moving rigid objects. Ambient occlusion is precomputed in each point of a 3D grid around each such object. These AO grids are then used at run-time to determine the amount of AO that is cast on nearby surfaces. A more elaborate variant is also described where in addition to the occlusion value the average occluded direction is stored as well.

This method is similar to the neighborhood transfer in [Sloan et al. 2002], although not as general and therefore much faster. The main strong point of this method is its good run-time performance. The disadvantages are that it needs precomputation and storage. In addition, it only accounts for low frequency ambient occlusion and does not handle deformable geometry.

2.3.3 Screen Space Methods

[Shanmugam and Arikan 2007] calculates completely dynamic ambient occlusion by separating it into two domains: high frequency ambient occlusion caused by nearby occluders and low frequency ambient occlusion caused by distant occluders. Both parts require access to an ND-buffer where the normals and depths of all pixels are stored in a pre-pass.

The high frequency part is a full-screen pixel shader pass. For each pixel an approximate AO value is calculated from the geometry that is close in image space. This is done by sampling a number of pixels in the ND-buffer around the current pixel and then approximately reconstructing the world space surfaces that these pixels correspond to. The surfaces are approximated by spheres, and the ambient occlusion in the current pixel is calculated as the ambient occlusion caused by these spheres.

The distant occluder part uses an approximation of the geometry as a collection of spheres. For each sphere the maximum distance it would influence is determined. These radii of influence in addition to the center points of the spheres are sent to the vertex shader that creates a billboard corresponding to each sphere. The billboards are then rasterized and sent to the pixel shader. For every billboard the pixel shader calculates the AO caused by the corresponding sphere in each pixel that the billboard covers. Then the AO values from the different billboards are combined in the blender stage.

The two parts each result in a full screen buffer with AO values. These are combined additively to get a final combined AO buffer. The major strong point of this method is that it generates completely general and dynamic AO. The biggest drawback is that it is quite slow according to the performance numbers given, so some sort of optimization is needed for it to be practically usable.

In [Mittring 2007] a method similar to the high frequency part in [Shanmugam and Arikan 2007] is described. It is called Screen Space Ambient Occlusion (SSAO). Instead of assuming that the geometry corresponding to the pixels can be approximated with spheres and then doing a real ambient occlusion computation, only the depths of the surrounding pixels are taken into account, and a simple depth comparison is performed to calculate the ambient occlusion contribution. A trick to reduce the sample count is also introduced: the samples are initially located in a sphere around the pixel and are then reflected against a random plane. The section that describes SSAO is, however, just half a page of text and some screenshots, so a lot of detail is omitted.

[Quilez 2007] was one of the first more detailed descriptions of SSAO. One of the main characteristics is that the positions of the samples are taken in world space and then projected back to screen space.

[Fox 2007] introduced another variant of SSAO that the author calls Ambient Occlusion Crease Shading. It utilizes the normals instead of the depths and handles the problem of incorrect self-occlusion (described in section 3.2.1 on page 21) very well.

2.3.4 Conclusion

A clear path of development can be seen. Earlier, precomputation-based algorithms were dominating, then more dynamic methods appeared, and finally run-time-only methods arrived. This of course also goes hand in hand with the development of hardware. With the arrival of programmable GPUs a whole new class of algorithms became possible.

What can also be noted is that earlier methods were in general more accurate, but later accuracy has been traded for the ability to handle dynamic scenes. Precomputed accurate values of course become inaccurate if the scene is changed so that the precomputed values are no longer valid. But it also has to do with a growing realization that, because ambient occlusion is a rather subtle and fairly low frequency effect, the human eye is quite forgiving when it comes to approximations, and even more so because ambient occlusion is in itself an approximation.


Chapter 3

Screen Space Ambient Occlusion

The term Screen Space Ambient Occlusion (SSAO) was first introduced by Crytek in [Mittring 2007]. However, the idea of calculating ambient occlusion in a full screen pass with the help of a depth buffer was first introduced by [Shanmugam and Arikan 2007].

This chapter starts with an overview of the method. Then comes the detailed description. The chapter is concluded with a discussion of some strengths and weaknesses of SSAO.

3.1 Overview of the Method

First of all the scene is rendered once to get a full screen buffer with the depth of all pixels on the screen. Then a full screen quad is drawn to invoke the pixel shader on all pixels of the screen. For each pixel some depth samples around that pixel are retrieved from the depth buffer. The difference between each depth sample and the current pixel's depth is calculated. The average of these depth differences is an approximation to ambient occlusion for the current pixel. Figure 3.1 shows the case when the pixel for which AO is to be calculated has a higher depth value than the surrounding samples; hence it is occluded according to SSAO. Figure 3.2 shows the case when the depth value is lower than that of the surrounding samples, which gives no occlusion.

The result is a full screen buffer with ambient occlusion values for all pixels on the screen. This buffer can then be used in the lighting shader.

3.2 Details of the Algorithm

There are a number of different variants of the method. The first one is a full-screen pass whose output is a full-screen buffer with ambient occlusion values ready to be fetched by the conventional renderer. This is the classic way to do it and is what was described in section 3.1. The advantages are that we have small variations in the cost associated with the operation, and that we can easily apply post filters such as blur. Another option is to apply it on just a subset of the screen, for example by drawing bounding volumes of the objects that should have SSAO applied to them. The advantage with this is a reduced cost if the screen is not completely covered by such objects. Another advantage is that we may not even want the effect over the full screen but just on some objects, such as animated characters. Some method to eliminate overdraw should be employed, however. A drawback with this approach is that the cost is highly variable, depending on how much screen space the SSAO-enabled objects cover. A third option is to apply it per pixel as part of the lighting pixel shader. This is pleasing due to its simplicity but has the drawback that one cannot easily apply post-filters on the SSAO part only.

Figure 3.1: Occluded pixel according to SSAO.

Figure 3.2: Unoccluded pixel according to SSAO.

The following is roughly the same method as the one described by [Quilez 2007].

First the scene is rendered to get a full screen buffer with the depths of all pixels on the screen. If an early z pass is performed this is already done (or almost done, depending on whether the Z-buffer can be accessed or not, see section 3.2.2). If deferred shading is used there may also already be a depth buffer.

Then comes the SSAO pass. A full screen quad is rendered to invoke the pixel shader on all pixels on the screen. The input to this pass is the depth buffer output in the previous pass. A sampling pattern is also taken as input; here it is a set of vectors uniformly distributed in a sphere. The last input is a set of random vectors stored as a texture.
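The thesis does not specify how the sampling pattern is generated; a minimal sketch of one common way to do it (rejection sampling in the enclosing cube, function name my own) could look as follows:

```python
import random

def sample_offsets_in_unit_ball(n, seed=0):
    """Generate n offset vectors uniformly distributed inside the unit
    sphere by rejection sampling in the enclosing cube."""
    rng = random.Random(seed)
    offsets = []
    while len(offsets) < n:
        v = (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
        if v[0] ** 2 + v[1] ** 2 + v[2] ** 2 <= 1.0:
            offsets.append(v)  # inside the sphere: keep it
    return offsets

kernel = sample_offsets_in_unit_ball(16)
```

In a real implementation this pattern would be computed once on the CPU and uploaded as a shader constant or texture.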

Pseudo Code for the SSAO Pixel Shader

1. Retrieve the depth of the current pixel position from the depth buffer.

2. Calculate the view space position from the depth.

3. Then for each sample:

(a) Retrieve a sample point offset.
(b) Do the random reflect as in [Mittring 2007] on the offset vector.
(c) Add the randomized offset vector to the view space position.
(d) Project it back to screen space.
(e) Retrieve the depth of this screen space coordinate.
(f) Find the difference between this depth and the depth of the current pixel.
(g) Take the AO contribution as depth_difference / (1 + depth_difference^2) as an approximation to distance fall-off.
(h) Add this to the total AO sum.

4. Multiply the AO sum with some constant and divide by the number of samples.
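The steps above can be sketched in plain Python on a 2D depth image. Note the simplifications: the projection steps (c)-(d) are replaced by using the offset's x/y components directly as pixel offsets, and all names and constants are illustrative, not the thesis implementation:

```python
def reflect(v, n):
    # Reflect the offset v against the plane with unit normal n
    # (the random-reflect trick from [Mittring 2007], step (b)).
    d = sum(a * b for a, b in zip(v, n))
    return tuple(a - 2.0 * d * b for a, b in zip(v, n))

def ssao_pixel(depth, x, y, offsets, plane_normal, radius=2, scale=2.0):
    """One invocation of the SSAO pixel shader on a 2D depth image."""
    h, w = len(depth), len(depth[0])
    center = depth[y][x]
    total = 0.0
    for off in offsets:
        ox, oy, _ = reflect(off, plane_normal)
        sx = min(max(x + int(round(ox * radius)), 0), w - 1)  # clamp to screen
        sy = min(max(y + int(round(oy * radius)), 0), h - 1)
        diff = center - depth[sy][sx]        # > 0: sample is closer, it occludes
        total += diff / (1.0 + diff * diff)  # distance fall-off, step (g)
    return max(scale * total / len(offsets), 0.0)

offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
depth = [[1.0] * 5 for _ in range(5)]
depth[2][2] = 3.0  # a pixel at the bottom of a pit: should come out occluded
```

For the pit pixel (2, 2) all samples are closer than the center, so a large AO value results; for a pixel on the flat surrounding surface the differences are zero and so is the occlusion.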

Last comes a blur pass. The reason for this is that random sampling produces a noisy image, which can be remedied with a blur filter. This blur should ideally be edge-aware so that the AO does not get smeared out to places where there should not be any AO.

3.2.1 Avoiding Incorrect Self-Occlusion

If a flat surface is oriented at an angle towards the viewer, some samples will be closer to the viewer and will get a lower depth value. Hence they will contribute to the occlusion, even though in reality they are on the same flat surface. This is a problem if only positive occlusion is accounted for, that is, if samples on the other side of the pixel, those further from the viewer and thus with a higher depth value, are discarded. In that case a flat surface not orthogonal to the viewer will occlude itself, even though it is flat. Figure 3.3 shows a plane that is orthogonal to the viewer; the depth values will be approximately equal and there will be no occlusion. Figure 3.4 shows a plane at an angle to the viewer, which can give incorrect self-occlusion.

One way to get around this problem, if the normals are accessible, is to shoot the rays only in the hemisphere above the plane of the point, and not in all directions of a sphere around the point. Another, simpler, way to solve it that does not require the normals is simply to keep the negative occlusion values. This way the self-occlusion contributions cancel each other out on a flat surface.
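A small numeric illustration of why keeping the negative values works (the depth differences below are made-up sample values for a tilted plane, not thesis data):

```python
def ao_from_diffs(depth_diffs, keep_negative):
    """Average the per-sample contributions; discarding negative depth
    differences creates false self-occlusion on tilted flat surfaces."""
    vals = [d if keep_negative else max(d, 0.0) for d in depth_diffs]
    return sum(vals) / len(vals)

# Depth differences for a flat plane tilted towards the viewer: samples on
# one side are closer (positive), on the other side farther (negative),
# in equal measure, so they should cancel.
tilted_plane = [0.5, -0.5, 0.25, -0.25]
```

With `keep_negative=True` the contributions cancel to zero, as expected for a flat surface; with `keep_negative=False` a spurious positive occlusion remains.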

Screen plane / depth buffer Pixel that AO is to be calculated for

Other depth samples

Figure 3.3: Plane orthogonal to the viewer, no occlusion.

Screen plane / depth buffer

This depth sample will be counted as occluder

Figure 3.4: Plane at an angle to the viewer, incorrect self-occlusion.

One problem remains with this approach, though: at the silhouette edge there are no samples on the negative side, because the object ends there. This will cause objects to have a dark silhouette. A way to fix this is to detect the silhouette and in that case set the occlusion to zero. This also yields incorrect results in quite a lot of cases, but missing occlusion where there should be occlusion is harder to detect than occlusion where there should not be any at all.


3.2.2 The Depth Buffer and Reconstructing the Position

If an API is used that permits access to the Z-buffer, such as DirectX 10, it can be used as the depth buffer in the algorithm. However, the depth values in the Z-buffer are not linear if perspective projection is used. When they are used in the SSAO algorithm it will probably be necessary to rescale them, otherwise the result may be strange when the differences between depths are calculated. In this work DirectX 9 was used, which cannot access the Z-buffer directly. Therefore render to texture was used to write the linear depth to a separate buffer in the depth pass.

The view space position can be reconstructed in a number of different ways. One option is to use the screen space (x, y) coordinates, which are readily available in the pixel shader, combine them with the depth from the depth buffer and un-project into view space using an inverse projection matrix. Another option is to do as in [Wenzel 2006]: first generate a vector from the view origin through the current pixel to the far clipping plane, then scale this vector with the depth of the current pixel. The vector is generated by calculating the view space positions of the view frustum corners and outputting them as an additional output in the vertex shader for the full screen quad. This vector is then interpolated, so that in the pixel shader we have a vector from the view origin through the current pixel to the far clipping plane. The view space position lies along this vector, since the vector goes from the view origin through the current pixel. So if we scale this vector with the linear depth, the result is the view space position.
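The frustum-ray variant can be sketched as follows. This is a simplified CPU illustration under two assumptions of my own: +z is taken as the view direction, and the stored linear depth is the view-axis depth normalized by the far plane distance (z / far):

```python
import math

def view_ray_to_far_plane(x_ndc, y_ndc, fov_y, aspect, far):
    """Vector from the view origin through the pixel (normalized device
    coordinates in [-1, 1]) to the far clipping plane."""
    half_h = math.tan(fov_y / 2.0) * far  # half-height of far plane
    return (x_ndc * half_h * aspect, y_ndc * half_h, far)

def reconstruct_view_pos(x_ndc, y_ndc, lin_depth, fov_y, aspect, far):
    # Scale the (interpolated) far-plane ray with the normalized linear
    # depth to recover the view space position.
    rx, ry, rz = view_ray_to_far_plane(x_ndc, y_ndc, fov_y, aspect, far)
    return (rx * lin_depth, ry * lin_depth, rz * lin_depth)

# Center pixel, half-way to the far plane of a 90-degree frustum:
p = reconstruct_view_pos(0.0, 0.0, 0.5, math.pi / 2, 1.0, 100.0)
q = reconstruct_view_pos(1.0, 0.0, 0.25, math.pi / 2, 1.0, 100.0)
```

In the actual shader the ray is computed per frustum corner in the vertex shader and arrives interpolated in the pixel shader, as described above.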

3.3 Strengths and Weaknesses

The main strength of SSAO is that it handles completely dynamic scenes and deforming objects since everything is calculated at run-time. The lack of need for a precomputation step is of course something positive in itself. Precomputation slows down the asset creation process and complicates content creation pipelines. If an early z pass is done anyway or if deferred shading is used, an additional strength is that the performance is independent of geometric complexity.

Since this algorithm is performed in screen space, it only takes into account occlusion that is caused by triangles currently visible. First of all, that means that triangles outside the view frustum do not contribute to occlusion. Secondly, triangles that are behind something in the current view do not contribute to occlusion. This fact is noted in [Shanmugam and Arikan 2007], where ways to overcome it with depth peeling are also discussed. A third failure case is when the occluder is parallel, or close to parallel, to the viewing direction. If there is a limit to how far away an occluder can be (ray length), or a distance falloff, the cap, or other end, of the occluder may very well be too far away to influence the point in question. See figures 3.5 and 3.6. If there were no limit or distance falloff, the result would instead be a dark glow around objects.

These three problems will make the image have incorrect ambient occlusion. Furthermore, it means that the amount of occlusion at a point can change when an object or the viewpoint is moved. This causes the image to not be entirely stable, something that is usually more disturbing than the sole fact that the image is incorrect.


Figure 3.5: Occluder is parallel to the view direction.


Figure 3.6: Same scene but now the viewer has moved and the occluder is not parallel to the view direction.

Another drawback is that it is a quite performance hungry pixel shader, especially in terms of the number of texture fetches.

Finally it can be noted that although the principle of SSAO is fairly simple, a lot of tweaking is required to obtain good results.


Chapter 4

Ambient Occlusion Grids

In this chapter we will review Ambient Occlusion Grids from [Malmer et al. 2007] along with some improvements. We start with an overview of the method. Then, as this is a method that depends on precomputation, we proceed with more detailed sections describing first the precomputation and then what happens at run-time. After that, there is a section describing how to calculate the world to grid matrix; it merits a section of its own because it is not described in the original article. Next, there is a section that describes the main extension, which is improved handling of self-occlusion. The chapter is concluded with a discussion of some strengths and weaknesses of Ambient Occlusion Grids.

4.1 Overview of the Method

This is a method to display ambient occlusion cast by moving rigid objects. The ambient occlusion is precomputed in each point of a 3D grid around each object. The values of the grid are stored in a volume texture. At run-time the values from the grid are retrieved to determine the ambient occlusion cast from that object. This is done by first rendering the world space position of each pixel on the screen to a texture in one pass. Then in a second pass the bounding box of each grid is drawn to invoke the pixel shader on those pixels that are potentially influenced by the grid. In the pixel shader the world space position of the pixel is used to access the ambient occlusion value in the volume texture corresponding to the current grid. This is done by transforming the world space position with a transformation matrix calculated for that grid. The ambient occlusion values from the different occluders are blended in the blending stage of the GPU.

There is also an extended variant that uses the average occluded direction. This is precomputed along with the percentage of occlusion and stored as an accompanying vector for each AO value. At run-time the normals are rendered along with the positions. They are then used along with the average occluded direction to compute a more accurate resulting occlusion value, which depends on the normal of the receiving surface.


4.2 Precomputation

First the coordinates of all grid points of a 3D grid around the object are determined. The grid points are evenly distributed in an axis aligned box that contains the object. Figure 4.1 visualizes the grid corresponding to an object; each grid point is shown as a red dot. The details of how to determine the extent of the grid can be found in the original article.

Figure 4.1: The grid

In each grid point the ambient occlusion caused by the object in this point is calculated. This is different from the conventional ambient occlusion calculation: in that case ambient occlusion is calculated at a point on a surface, as the percentage of the hemisphere around the surface normal that is occluded. Now, however, there is no surface; the ambient occlusion is calculated in a void. Thus, what is calculated is the percentage of the whole sphere that is occluded at that point:

A(p) = (1/(4π)) ∫_Ω V(ω̂, p) dω̂        (4.1)

In this equation Ω is the unit sphere around p. It can also be noted that the scaling factor is 1/(4π): since there is no cosine factor, the maximum value of the integral is the area of the unit sphere, which is 4π.
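The integral in equation (4.1) can be estimated with Monte Carlo sampling. A minimal sketch, using a single sphere as the occluding object so the estimate can be checked against the closed-form solid-angle value (function names and the sphere occluder are illustrative assumptions, not the thesis implementation):

```python
import math, random

def occluded_fraction_mc(p, center, radius, n=20000, seed=1):
    """Monte Carlo estimate of A(p): the fraction of the full sphere of
    directions around p in which a ray hits the occluding sphere."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        # Uniform direction on the unit sphere: normalized Gaussian vector.
        d = [rng.gauss(0, 1) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in d))
        d = [c / norm for c in d]
        # Ray-sphere intersection test (ray origin p, direction d).
        oc = [center[i] - p[i] for i in range(3)]
        t = sum(oc[i] * d[i] for i in range(3))       # closest approach along ray
        dist2 = sum(c * c for c in oc) - t * t        # squared distance to center
        if t > 0 and dist2 <= radius * radius:
            hits += 1
    return hits / n

# Sphere of radius 1 at distance 2: analytic A = (1 - sqrt(1 - (r/d)^2)) / 2.
est = occluded_fraction_mc((0.0, 0.0, 0.0), (0.0, 0.0, 2.0), 1.0)
exact = (1 - math.sqrt(1 - 0.25)) / 2
```

In practice the precomputation would ray trace the actual object mesh at every grid point; the sphere here just makes the estimate verifiable.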


If the method with average occluded direction is to be used, that has to be calculated and stored too. The pair of average occluded direction vector and occlusion value can be seen as representing a cone, a set of occluded directions. This cone is defined by an axis d, the average occluded direction, and an aperture α, linked to the percentage of occlusion A. This is illustrated in figure 4.2.


Figure 4.2: The cone of occlusion, defined by its direction d and its aperture α.

At run-time this cone will be clipped by the receiving surface. If the surface is approximated by its tangent plane the calculations are simplified. This is shown in figure 4.3. The calculations are, however, still complex and are therefore precomputed and stored in a lookup table Tclip. Because of rotational symmetry around the normal n of the plane, how much of the cone is above the plane is a function of two variables: the angle between n and d, called β, and the cone aperture α. Because of what is most easily available at run-time, Tclip is instead stored as a function of cos(β) and the occlusion A.


Figure 4.3: The cone of occlusion clipped by the tangent plane of the receiver, defined by the normal n.

How to compute Tclip is not described in the article, but correspondence with one of the authors revealed that it is best solved numerically [U. Assarsson, personal communication, March 4, 2008]. More specifically, a number of points are uniformly generated on the unit sphere; then for each α and β, the percentage of points above the plane and inside the cone is calculated. The condition that a point p is inside the cone can be formulated as arccos(p · d) < α if d is a unit vector.
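The numerical procedure just described can be sketched as follows. Note one assumption of this sketch: the result is normalized as the fraction of the cone's directions that lie above the plane (the article does not spell out the normalization), and the function name is my own:

```python
import math, random

def tclip_entry(alpha, beta, n=20000, seed=2):
    """Fraction of the cone (axis d, aperture alpha) that lies above the
    plane with normal n, where beta is the angle between n and d."""
    rng = random.Random(seed)
    d = (math.sin(beta), 0.0, math.cos(beta))   # cone axis, angle beta from n
    in_cone = above = 0
    for _ in range(n):
        # Uniform point on the unit sphere via a normalized Gaussian vector.
        v = [rng.gauss(0, 1) for _ in range(3)]
        l = math.sqrt(sum(c * c for c in v))
        v = [c / l for c in v]
        dot = max(-1.0, min(1.0, v[0] * d[0] + v[1] * d[1] + v[2] * d[2]))
        if math.acos(dot) < alpha:              # inside the cone
            in_cone += 1
            if v[2] > 0.0:                      # above the plane (v . n > 0)
                above += 1
    return above / in_cone if in_cone else 0.0
```

Two sanity checks: a cone aligned with the normal (β = 0) is entirely above the plane, and a cone pointing straight into the plane (β = π) is entirely below it.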

4.2.1 Recalculation of Grid Values

Since the extent of the grid is finite, the values on the border of the grid are not guaranteed to be zero. If they are not zero there will be a visible discontinuity in the shading. This is fixed by recalculating all the values of the grid. If Am is the maximum value on the grid border, each value A of the grid is recalculated to a new value A′ as:

A′ = max((A − Am) / (1 − Am), 0)

This ensures that the values on the border are zero.

The values for grid points that are inside the object will be one, meaning that they are completely hidden. This is correct, but since the grid will be stored in a volume texture, and filtering will be used when addressing into the texture, the AO value for a point just at the surface of the model will be taken as an interpolation of a grid value just outside the model and a grid value inside the model. If the grid value inside the model is one, incorrect results would be obtained: it would be too dark at the surface. This can, however, easily be solved by substituting every value that is equal to one, and that has neighbours that are not one, with the average of those neighbours.
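The border remapping can be sketched directly from the formula (a tiny made-up grid is used for illustration; the interior-value substitution is omitted):

```python
def remap_border(grid, a_m):
    """Remap every AO value so the grid border becomes zero:
    A' = max((A - A_m) / (1 - A_m), 0), with A_m the maximum border value."""
    return [[[max((a - a_m) / (1.0 - a_m), 0.0) for a in row]
             for row in plane] for plane in grid]

g = [[[0.2, 0.6], [1.0, 0.2]]]   # a tiny 1x2x2 grid with border maximum 0.2
r = remap_border(g, 0.2)
```

Border values equal to A_m map to zero, fully occluded values stay at one, and intermediate values are rescaled proportionally.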


4.3 Run-time

The method consists of two render passes and the result is a full-screen buffer containing the ambient occlusion values for every pixel. This buffer can then be used in the lighting calculations of the scene.

The first pass consists of rendering the positions to an off-screen buffer. If deferred shading is used, this is done as part of the usual first pass. Instead of positions, it is possible to render the depths and then reconstruct the position from the depth with a calculation.

The second pass consists of doing the following for each object that has an ambient occlusion grid:

1. Calculate the world to grid matrix and set the corresponding pixel shader constant.

2. Set the volume texture pixel shader constant.

3. Draw the back faces of the bounding box of the grid (with depth testing disabled) to invoke the pixel shader on all pixels potentially influenced by the grid.

4. In the pixel shader:

(a) Retrieve the world space position of the current pixel from the position buffer.
(b) Multiply the world space position with the world to grid matrix.
(c) Use this value as the address for the volume texture lookup.
(d) Output the retrieved ambient occlusion value.

The reason to draw the back faces of the bounding box instead of the front faces is that the front faces may be clipped by the near plane, yielding incorrect results.

If the method with average occluded direction is used, the difference from the above is that the normals are rendered along with the positions in the first pass. Then in the second pass, the average occluded direction is retrieved along with the AO value from the volume texture. The average occluded direction is multiplied with the world matrix and then dotted with the normal. The result is then used together with the AO value as the address into the Tclip texture. The advantage of this extended variant is that the visual quality is better, and also that when an object is behind a thin wall, the AO does not shine through to the other side of the wall.

If the AO grids of two or more objects overlap in screen space, a combination of the AO values has to be calculated. This is done by multiplicative blending: 1 − AO_combined = (1 − AO_1)(1 − AO_2). As stated in the article, this is just a crude approximation to the combined effect of different occluders, but it works well in practice.
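A minimal sketch of this blending, generalized to any number of overlapping grids (the function name is illustrative):

```python
def combine_ao(ao_values):
    """Multiplicative blending of AO from several overlapping grids:
    1 - AO_combined = product over i of (1 - AO_i)."""
    unoccluded = 1.0
    for ao in ao_values:
        unoccluded *= (1.0 - ao)
    return 1.0 - unoccluded
```

On the GPU this product is obtained for free by rendering each grid with multiplicative frame-buffer blending into a buffer that stores 1 − AO.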


In this scene the bunny has an AO grid. The grid points are shown as red dots. At each such point an AO value has been precomputed; those values are stored in a volume texture.

The first step in the runtime AO grids algorithm is to generate a position buffer and a normal buffer.

[Images: position buffer, normal buffer]

Then the back faces of the bounding box of the grid are drawn to invoke the pixel shader on those pixels potentially affected by the AO grid.

[Image: the back faces of the bounding box. This is not an actual buffer in the algorithm; it is just an illustration of which pixels the pixel shader will be invoked on.]

For each pixel the position and normal are fetched and used to fetch the value from the AO grid volume texture.

[Image: resulting AO buffer (containing 1 − AO)]

The AO buffer can then be used together with direct lighting.

[Image: final scene]

Figure 4.4: Explanation of the AO grids algorithm, visualizing the intermediate buffers.


4.4 Calculating the World to Grid Matrix

One thing not mentioned in the article is how to calculate the world to grid matrix. The matrix is used to change coordinate system from the world coordinate system to the texture coordinate system of the volume texture; in other words, going from world space to the texture space of the volume texture. This can be decomposed into three different parts. First, there is a change of coordinate system from world to model. Then from model to grid. Finally, there is a transformation to the texture coordinate system of the volume texture. This will be elaborated in the following.

All the objects have an associated transformation matrix that is used to position and orient the model in the scene, i.e. it transforms the model from model coordinates to world coordinates. This matrix is called the world matrix in DirectX and the model matrix in OpenGL; from this point on it is called T_World.

Now we want to apply the inverse transform, from world space to model space. This can be computed by inverting T_World.¹

Then there is the transform from model space to grid space, T_Grid. In the implementation that has been done in this work, the grid coordinate system has its origin in one of the corners of the grid box, and the grid is axis aligned in model space. This means that T_Grid is just a translation. The origin is placed in the corner that has the lowest coordinate value in each component. This, in combination with the fact that the grid is axis aligned, ensures that the grid is situated in the positive octant of the grid coordinate system.

The last thing is going from grid space to volume texture space, i.e. transforming to the texture coordinate system. Since coordinates inside the texture are between zero and one in each component, each component should be normalized with the grid size in the corresponding direction. Hence the grid to volume texture transform (T_GridToVolTex) becomes a scaling matrix with 1/size_X, 1/size_Y and 1/size_Z as the scale factors. All this combined gives:

T_WorldToGrid = T_World^(-1) T_Grid T_GridToVolTex    (4.2)

T_Grid and T_GridToVolTex can be precomputed and concatenated into T_ModelToVolTex. Hence, at run-time, T_World^(-1) is computed and then multiplied with T_ModelToVolTex so that:

T_WorldToGrid = T_World^(-1) T_ModelToVolTex    (4.3)

With the terminology used in this section, a better name for T_WorldToGrid would be T_WorldToVolTex. Nevertheless, the name from the original article is kept so that it is clear that it is the same matrix that is referred to.

¹ If T_World is only composed of rigid body transforms, T_World^(-1) can be computed more efficiently as in [Akenine-Möller and Haines 2002, p. 34].
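The matrix composition above can be sketched as follows, assuming a row-vector (DirectX-style) convention so that matrices are applied left to right; grid_min is the grid corner with the lowest coordinates in model space, grid_size its extents, and all names are illustrative:

```python
# Sketch of composing T_ModelToVolTex and T_WorldToGrid with 4x4
# row-major matrices and row-vector points. Illustrative assumptions,
# not the thesis implementation.

def mat_mul(a, b):
    """4x4 matrix product a*b (row vectors: p' = p * a * b)."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def translation(tx, ty, tz):
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [tx, ty, tz, 1]]

def scaling(sx, sy, sz):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]

def transform_point(p, m):
    x, y, z = p
    return tuple(x * m[0][c] + y * m[1][c] + z * m[2][c] + m[3][c]
                 for c in range(3))

def model_to_voltex(grid_min, grid_size):
    """T_Grid * T_GridToVolTex, precomputed once per object:
    translate the grid corner to the origin, then normalize by the size."""
    t_grid = translation(-grid_min[0], -grid_min[1], -grid_min[2])
    t_norm = scaling(1.0 / grid_size[0], 1.0 / grid_size[1],
                     1.0 / grid_size[2])
    return mat_mul(t_grid, t_norm)

def world_to_grid(t_world_inv, t_model_to_voltex):
    """Equation (4.3): T_WorldToGrid = T_World^(-1) * T_ModelToVolTex."""
    return mat_mul(t_world_inv, t_model_to_voltex)
```

In the test below the object is translated by (10, 0, 0) in world space, so the inverse world matrix is simply translation(−10, 0, 0).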
