
Master of Science in Digital Game Development June 2017

Comparison Between Two Different Screen Space Ambient Occlusion Techniques

Johan Törn

Faculty of Computing


This thesis is submitted to the Faculty of Computing at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Bachelor of Science in Digital Game Development. The thesis is equivalent to 10 weeks of full time studies.

Contact Information:

Author:
Johan Törn
E-mail: jotb13@student.bth.se

University advisor:
Francisco Lopez Luro
Department of Creative Technologies

Faculty of Computing
Internet: www.bth.se

Abstract

Context. In this project a comparison between two screen space ambient occlusion techniques is presented. The techniques are Scalable AO (SAO) and Multiresolution SSAO (MSSAO), chosen because both use mipmaps to accelerate their calculations.

Objectives. The aim is to determine how large the difference is between the results of these two techniques and a golden reference, an object space ray traced texture created with mental ray in Maya, and how long the computations take.

Methods. The comparisons between the AO textures that these techniques produce and the golden references are performed using the Structural Similarity Index (SSIM) and the Perceptual Image Difference (PDIFF).

Results. At the lowest resolution, both techniques execute in about the same time on average, except that SAO with the shortest distance is faster. The only effect of the shorter distance, in this case, is that more samples are taken from higher resolution mipmap levels than when longer distances are used. MSSAO achieved a better SSIM value, meaning that MSSAO is more similar to the golden reference than SAO. As the resolution increases, the SSIM values of the two techniques become more similar, with SAO getting a better value and MSSAO a slightly worse one, while the execution time for MSSAO increases more than for SAO.

Conclusions. It is concluded that MSSAO is better than SAO at lower resolutions while SAO is better at higher resolutions. I would recommend SAO for indoor scenes that do not contain many small geometry parts close to each other that should occlude each other, and MSSAO for outdoor scenes with a lot of vegetation, which has many small geometry parts close to each other that should occlude. At higher resolutions MSSAO takes longer computational time than SAO, while at lower resolutions the computational time is similar.

Keywords: Ambient Occlusion, Comparison, Visual Difference


Contents

Abstract i

1 Introduction 1

2 Related Work 3
2.1 Scalable Ambient Obscurance . . . 5
2.2 Multiresolution Screen Space Ambient Occlusion . . . 6

3 Method 7
3.1 Structural Similarity Index . . . 8
3.2 Perceptual Image Difference . . . 8
3.3 Settings in mental ray . . . 8
3.4 Maya to DirectX 11 Application . . . 9
3.5 Time measurement . . . 10
3.6 Implementing the two AO techniques . . . 10

4 Results 11
4.1 Visuals . . . 11
4.2 Performance . . . 13


Chapter 1

Introduction

In this thesis, occlusion will be used for both occlusion and obscurance, as they are seen as interchangeable: both ambient occlusion and ambient obscurance are ways to describe how much ambient lighting should be applied at a specific point. Screen space ambient occlusion (SSAO) is a technique that approximates ambient occlusion (AO) in screen space. The key advantage that SSAO techniques have over other AO techniques, like ray tracing in object space [6], is that they are fast and make it possible to predict how long the computation will take, since it takes about the same amount of time independent of how complex the geometry in the scene is. There are many different techniques, but a lot of them share several disadvantages, like only being able to compute AO from nearby objects and/or missing AO from objects that are currently not seen by the camera. One way to increase the performance of taking distant objects into account has been to use multiple resolutions, like mipmaps [8, 9].

A problematic issue is that there are many different techniques but almost no comparisons between them, making it hard to know which technique is the best. Different AO techniques only compare themselves against a limited number of other techniques, and usually Horizon-Based AO is one of them.

To make the comparison, the Structural Similarity Index (SSIM) and the Perceptual Image Difference (PDIFF) are used. SSIM is a method that measures the quality of one image when the other is assumed to be of perfect quality [14]. PDIFF is a program that uses a computational model of the human visual system to compare two images [3].

The aim of this project is to compare two similar SSAO techniques, Scalable AO [9] and Multiresolution SSAO [8], which both use mipmaps to accelerate calculations. The first aim is to evaluate the difference between these techniques and a golden reference. The golden reference is an object space ray traced texture created with mental ray in Maya [2]. A second aim is to understand how these techniques perform in terms of computation time.

The specific objectives of this project are:

• Implement Scalable AO and Multiresolution AO using DirectX 11.

• Render AO textures in Maya and with Scalable AO and Multiresolution AO using different resolutions and AO radii.

• Time Scalable AO and Multiresolution AO when they are rendered.

• Compare the resulting AO textures from the two techniques with the golden references from Maya, using the SSIM index and PDIFF.

• Compare the timing of Scalable AO and Multiresolution AO.

The research questions are:

• Which of Scalable AO and Multiresolution SSAO produces the fewest errors according to the metrics SSIM and PDIFF when compared to the reference ambient occlusion texture computed in object space?

• How do these techniques compare in computational time?


Chapter 2

Related Work

Figure 2.1: A and B are two different points being occluded. No AO will be applied on point A since there is nothing obscuring it. Point B will have AO applied to it since there is geometry on the side obscuring the point.

Ambient occlusion is a part of the lighting calculation of a scene where you determine how much indirect lighting can hit a point. This is done by looking at how much the nearby surfaces occlude the possible directions from which indirect light could come, as illustrated in figure 2.1. The idea is that when surfaces are closer to each other, less indirect lighting is able to reach certain points, creating darker areas, as seen in figure 2.2 where there are darker areas on the wall behind the cloth when AO is applied. Ray tracing in object space is a calculation-heavy task and is not suitable for real-time rendering. It can, however, be used for offline pre-computation whose results are later used in a real-time application [6]. This only covers static objects that do not change position, although animations can be handled by running the pre-computation step for every frame of the animation and applying the result in the same way as for any static object.
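The quantity all of these techniques approximate is the hemispherical visibility integral. A common formulation from the AO literature (added here for reference, it is not spelled out in the thesis) for a point $p$ with normal $n$ is

$$AO(p) = \frac{1}{\pi} \int_{\Omega} V(p, \omega)\,(n \cdot \omega)\, d\omega,$$

where $V(p, \omega)$ is 1 if a ray from $p$ in direction $\omega$ is blocked within the chosen radius and 0 otherwise.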

Figure 2.2: The difference ambient occlusion makes to a scene. The scene with ambient occlusion is to the left and the scene without ambient occlusion to the right. The difference is clearest when looking at the wall behind the cloth hanging in front of it.

A big disadvantage of this approach is that AO cannot be applied between static objects and dynamic objects with unknown movement, like a player's character, since it cannot be pre-computed. Ray tracing in object space can be done as described by [6], where the center point of each triangle in an object shoots rays out in a hemisphere centered around the triangle's normal. Each ray is then followed to find out if it intersects any geometry before it reaches the limit of the hemisphere. By storing how many of the rays intersect geometry, and the average direction of the rays that did not intersect, the rest of the calculation can later be done in a real-time application. A way to eliminate the disadvantage of not being able to apply AO between static and dynamic objects is to do all calculations in real time. This is done by simplifying the AO calculations. One way to accomplish this is to do the calculation in screen space, similar to how the ray tracing in object space is done. This removes the intersection testing of rays against individual triangles; instead, the test is done against values stored in pixels. A pixel only stores the most recent value written to it, so if several triangles overlap a single pixel, only one triangle's value will be stored.
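As a small illustration of the object space approach, a Monte Carlo estimate in the spirit of [6] can be sketched as follows. This is illustrative C++ under the assumption that the scene is abstracted as a ray-blocking predicate; none of the names come from the thesis code.

#include <cmath>
#include <functional>
#include <random>

struct Vec3 { float x, y, z; };

// Rejection-sample a direction on the unit sphere, flipped into n's hemisphere.
static Vec3 sampleHemisphereDirection(const Vec3& n, std::mt19937& rng) {
    std::uniform_real_distribution<float> u(-1.0f, 1.0f);
    for (;;) {
        Vec3 d{u(rng), u(rng), u(rng)};
        float len2 = d.x * d.x + d.y * d.y + d.z * d.z;
        if (len2 < 1e-6f || len2 > 1.0f) continue;
        float len = std::sqrt(len2);
        d = {d.x / len, d.y / len, d.z / len};
        if (d.x * n.x + d.y * n.y + d.z * n.z < 0.0f) d = {-d.x, -d.y, -d.z};
        return d;
    }
}

// Fraction of hemisphere rays from 'center' that hit geometry within 'radius'.
float occlusionAtPoint(const Vec3& center, const Vec3& normal, float radius, int rayCount,
                       const std::function<bool(Vec3, Vec3, float)>& intersectsScene) {
    std::mt19937 rng(1234);
    int blocked = 0;
    for (int i = 0; i < rayCount; ++i)
        if (intersectsScene(center, sampleHemisphereDirection(normal, rng), radius))
            ++blocked;
    return float(blocked) / float(rayCount);
}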

To determine how much a pixel is occluded, nearby pixels are sampled and a depth comparison is performed. The occlusion value is determined by how many of the sampled points are occluded out of the total number of samples. A big gain of this approach compared with earlier AO techniques is that it does not require any pre-computation and executes in constant time independent of scene complexity. After this first technique, several different SSAO techniques have been introduced, each addressing different aspects to improve. One example of a newer technique is the StarCraft II AO, which decreases the amount of AO applied [7]. This technique uses a falloff function where points that are further away contribute less occlusion, which solves self-occlusion problems. In contrast, Horizon-Based Ambient Occlusion (HBAO) uses more computation by walking pixel by pixel in certain directions and calculating the horizon angle instead of randomly sampling pixels [5]. This makes HBAO produce a higher-quality result than earlier SSAO techniques, but at the cost of longer computational time compared to other SSAO techniques. In order to reduce computation time, HBAO is usually rendered at a lower resolution. The Alchemy AO took a different path [10]. Instead of trying to create the best looking SSAO, it was designed to match other requirements, such as being fast on Xbox 360 and on DirectX 11 compatible hardware of different capabilities. This was done by limiting the number of samples taken for each pixel: instead of stepping through pixels like HBAO, it samples a few pixels in a spiral pattern and uses those sampling points even if other, closer pixels could occlude more. The further away from the camera the occluded pixel is, the fewer samples are used. Another desirable feature was to give the artist control over how the AO looks, by exposing variables that can be changed to give a different but predictable look.

There have been comparisons between SSAO techniques; however, they usually do not use any perceptual metrics to quantify the differences, but rather discuss them informally [10, 12]. In these publications there are discussions about the implementations and comparisons to HBAO. In the paper where the Multiresolution SSAO (MSSAO) is presented, a comparison between their implementation and four other techniques is reported, using SSIM to compare the techniques against a reference image created with Blender; the authors concluded that theirs was the most similar to the reference [8].

2.1 Scalable Ambient Obscurance

Scalable AO (SAO) builds on the sampling scheme of the Alchemy AO, taking samples close to the pixel being shaded first and then proceeding with samples further and further away [10]. The advantage of Scalable AO is that a mipmap hierarchy with information about the depth of the scene is generated every frame. This hierarchy is used when sampling the depth of the scene to calculate the occlusion. Samples that are close to the pixel being occluded are taken from the highest resolution mipmap level, while positions further away are sampled from lower resolution mipmap levels. This procedure increases the number of cache hits, meaning that the data that is needed is already in the cache memory, which is very small but gives very fast access, since the probability of sampling the same texture pixel increases when sampling lower resolution mipmap levels. A higher cache hit rate increases the calculation speed, since fewer reads go to the slower but larger memory on the graphics card [9].
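As a rough sketch of the mip selection idea (the actual shader is reproduced in Listing C.3; the constants below mirror its LOG_MAX_OFFSET and MAX_MIP_LEVEL, but this C++ function is only an illustration):

#include <algorithm>
#include <cmath>

// Pick a depth-buffer mip level from the screen-space distance to a sample, so
// that distant samples read from coarser, more cache-friendly mip levels.
int mipLevelForSample(float sampleDistancePixels, int logMaxOffset = 3, int maxMipLevel = 5) {
    // floor(log2(distance)) is the index of the highest set bit, which is what
    // firstbithigh computes in the HLSL version.
    int highestBit = sampleDistancePixels >= 1.0f
                   ? static_cast<int>(std::floor(std::log2(sampleDistancePixels)))
                   : 0;
    return std::clamp(highestBit - logMaxOffset, 0, maxMipLevel);
}

Reading coarser mip levels for distant samples keeps neighbouring pixels' lookups within the same small set of texels, which is what drives the cache hit rate up.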

2.2 Multiresolution Screen Space Ambient Occlusion

Chapter 3

Method

SSIM and PDIFF are the two metrics used to measure the similarity between two images. Here, they are used to compare the AO textures from the implemented techniques with the golden references. SSIM and PDIFF are used since they do not rely heavily on single pixel values, but take surrounding pixels into account. The SSIM paper shows that SSIM gives a better result than the mean square error, which does a pixel-to-pixel comparison but could have been used instead [14]. This is desirable since neither SAO nor MSSAO will produce a pixel-perfect gradient from dark to light; the metrics therefore give a good value as long as the pixels in a neighborhood are, overall, close to the golden reference values. It also makes the experiment easy to reproduce.

The camera information from Maya is saved when the textures are rendered in order to allow the view to be recreated later in the application. The implemented AO techniques are tested separately against the golden references created in Maya. The execution time is measured for the implemented techniques, and the AO textures that are created are saved in PNG format. Everything uses the same resolution, position and projection matrices. The final step is to compare the AO textures from the DirectX 11 application with the golden references using the SSIM index and PDIFF metrics.

The object space AO texture is used for the quality comparison since object space methods give higher visual quality than screen space methods, as there is less information about the scene in screen space. There have been suggestions on utilizing several cameras or several depth layers when doing AO calculations in real time in order to produce results more similar to object space AO [13, 4]. However, these techniques introduce longer rendering times. The object space technique was not used in the time comparison since it is not suitable for real-time rendering as the screen space techniques are.

For testing, the Crytek Sponza scene [1] is used for all experiments. This scene is used because it has both large smooth surfaces and areas where there are a lot of small meshes close to each other. This makes it possible to see how the techniques behave in different situations within one scene. It is also publicly available, making it easier to compare against other techniques that use the same scene.

3.1 Structural Similarity Index

SSIM is a method that measures the quality of one image when the other is assumed to be of perfect quality [14]. The implementation by [14] is used with GNU Octave. The technique assumes that the human visual system is adapted for extracting structural information. It does not only compare single pixels against each other, but includes the pixels within an 11×11 window in the suggested implementation.
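For reference, the SSIM index from [14] between two image windows $x$ and $y$ is computed from their means, variances and covariance, with small constants $C_1$ and $C_2$ for numerical stability:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$

The mean SSIM values reported in chapter 4 are this index averaged over all window positions.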

3.2 Perceptual Image Difference

PDIFF is a program that uses a computational model of the human visual system to compare two images [3]. It is a tool that can be used to automatically assess rendering software when updates are made that could result in small changes which may not be noticeable to a human. A reference image is rendered using one implementation; a newer implementation can then be tested with PDIFF against the reference image to see whether people would notice any difference between the older and newer implementation.

For the settings, the field of view was set to 45 degrees and the illumination to 250 cd/m². A higher field of view would cause more failed pixels; 45 degrees was chosen since it is a little larger than the field of view of the computer setup used when this work was done. An illumination of 250 cd/m² was chosen since that was the highest illumination the screen in the setup could produce. Higher illumination values would give slightly more failed pixels, since the tool then expects people to be able to notice changes more easily.

3.3 Settings in mental ray

The default settings are used for mental ray in Maya for creating the golden reference, but with the following changes:

• Overall quality is 1.0

• Number of samples for AO is 256

• Spread for AO is 1.0

Number of samples is a setting for the AO pass that determines how many rays are used to decide how much a specific point is occluded, and spread is how much of the hemisphere above the point that is used for the samples, going from 0 to 1, where 0 is a single direction and 1 is the entire hemisphere. The last settings that change between renders are the distance in the AO settings and the output resolution. These two settings are equal to the ones used for the images produced in the DirectX 11 application.

3.4 Maya to DirectX 11 Application

In order to compare the AO textures from Maya and the DirectX 11 application, the cameras in both applications have to point in the same direction and project points to the same locations in screen space. To make this possible, the necessary information about the camera is saved when the images are rendered in Maya: the camera position, its rotation, and how it performs projection. The camera in Maya uses an almost standard OpenGL projection matrix. In this thesis, the assumption of not having a view frustum offset means that B and D are 0. What distinguishes Maya's projection matrix from a normal OpenGL matrix is that the E and G elements are negated.

$$ProjectionMatrix = \begin{pmatrix} A & 0 & B & 0 \\ 0 & C & D & 0 \\ 0 & 0 & E & F \\ 0 & 0 & G & 0 \end{pmatrix}$$

The application with the AO implementations uses DirectX 11, whose projection matrix is a little different from OpenGL's: the camera in the application views along the positive Z-axis while the other axes are the same. To get the same projection in Maya and the application, a matrix is recreated in the application from the information from Maya. The A and C elements from Maya can be reused in the new projection matrix, since these two elements only affect the X and Y coordinates of a vertex when projected. The F element in a DirectX 11 matrix will be 1, and the E and G elements are calculated from the near and far clip planes obtained from Maya.
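A minimal sketch of this reconstruction in C++, assuming the row layout of the matrix shown above with vertices multiplied as row vectors; the function name and plain array type are illustrative, not from the thesis code (A, C, nearZ and farZ are the values exported from Maya):

#include <array>

// Rebuild a DirectX 11 style projection matrix from values exported from Maya.
// A and C are reused directly; E and G are recomputed for a camera that looks
// along +Z, with F = 1 so that the projected w equals the view-space depth.
std::array<std::array<float, 4>, 4> buildDx11Projection(float A, float C,
                                                        float nearZ, float farZ) {
    float E = farZ / (farZ - nearZ);          // depth scale
    float G = -nearZ * farZ / (farZ - nearZ); // depth offset
    return {{
        { A, 0, 0, 0 },   // B = 0: no view frustum offset
        { 0, C, 0, 0 },   // D = 0
        { 0, 0, E, 1 },   // F = 1 copies view-space z into w
        { 0, 0, G, 0 },
    }};
}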

For the object transformations, it is assumed that nothing has been done other than transforming each object directly under the world node. Therefore, no scaling, pivot rotation and translation, shearing, or parenting of the transformation to another transformation has been used. Furthermore, the transformation matrix only includes rotation around the X-, Y- and Z-axes and the translation of the object. Because the Z-axis points in the opposite direction while the X- and Y-axes point in the same directions, the corresponding transformation matrix in DirectX 11 is the mirror against the XY-plane multiplied with the transformation matrix from Maya, multiplied with the mirror against the XY-plane once more.
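Written out, with $S$ denoting the mirror against the XY-plane, the conversion described above is the conjugation

$$M_{DX11} = S \, M_{Maya} \, S, \qquad S = \mathrm{diag}(1,\ 1,\ -1,\ 1),$$

which flips the signs of the Z-related rotation terms and the Z translation while leaving the X and Y parts intact.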

3.5 Time measurement

The time measured is the shader execution time for the different steps in the AO calculation for the Sponza scene. The ID3D11Query interface is used with D3D11_QUERY_TIMESTAMP to query a tick number before changing shaders, updating variables on the GPU and setting render targets. After the shader has executed, another tick number is queried. D3D11_QUERY_TIMESTAMP_DISJOINT is also queried over every frame to get the frequency, which is needed to convert the number of ticks to milliseconds, and to be able to discard any result that happens to be disjoint, since such results are not reliable.
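A minimal sketch of this query pattern, assuming an existing device and immediate context (COM error handling omitted; the function name is illustrative, not from the thesis code):

#include <d3d11.h>

// Measure the GPU execution time of one shader pass with timestamp queries.
double timeShaderPassMs(ID3D11Device* device, ID3D11DeviceContext* context) {
    D3D11_QUERY_DESC desc = {};
    desc.Query = D3D11_QUERY_TIMESTAMP_DISJOINT;
    ID3D11Query *disjoint = nullptr, *start = nullptr, *end = nullptr;
    device->CreateQuery(&desc, &disjoint);
    desc.Query = D3D11_QUERY_TIMESTAMP;
    device->CreateQuery(&desc, &start);
    device->CreateQuery(&desc, &end);

    context->Begin(disjoint);
    context->End(start);   // tick number before the pass
    // ... change shader, update variables, set render targets, draw ...
    context->End(end);     // tick number after the pass
    context->End(disjoint);

    // Block until the results are available (acceptable for measurements).
    D3D11_QUERY_DATA_TIMESTAMP_DISJOINT dj = {};
    while (context->GetData(disjoint, &dj, sizeof(dj), 0) != S_OK) {}
    UINT64 t0 = 0, t1 = 0;
    while (context->GetData(start, &t0, sizeof(t0), 0) != S_OK) {}
    while (context->GetData(end, &t1, sizeof(t1), 0) != S_OK) {}

    double ms = -1.0;      // disjoint results are unreliable and discarded
    if (!dj.Disjoint)
        ms = double(t1 - t0) / double(dj.Frequency) * 1000.0;

    disjoint->Release(); start->Release(); end->Release();
    return ms;
}

Stalling on GetData right after the pass is fine here because the goal is measurement rather than frame throughput.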

3.6 Implementing the two AO techniques

Chapter 4

Results

Both SAO and MSSAO produce AO textures that are compared with golden references from Maya using the two metrics. The timing of both techniques is measured to estimate their performance on a GTX 680. The golden reference from Maya is computed in object space. Different distances and resolutions have been used for all the textures. The distances are in scene units, because the SAO technique has the requirement that 1 scene unit corresponds to a 1 m object. The scene has been scaled to match that requirement and is used in all the tests.

4.1 Visuals

Table 4.2 shows the SSIM map and mean SSIM between the two techniques and a golden reference image. SSIM ranges from 0 to 1, where 1 means that the images are identical. Table 4.1 shows how many pixels are visually different according to PDIFF.

       Dist.  1280×720         1920×1080        2560×1440        3840×2160
SAO    0.5    75200 (8.2%)     147157 (7.1%)    201541 (5.5%)    379057 (4.6%)
       1.0    125312 (13.6%)   217955 (10.5%)   289989 (7.9%)    620085 (7.5%)
       1.5    186319 (20.2%)   316332 (15.3%)   420519 (11.4%)   897530 (10.8%)
       2.0    256461 (27.8%)   434020 (20.9%)   556086 (15.1%)   1230833 (14.8%)
       3.0    401873 (43.6%)   688385 (33.2%)   869361 (23.6%)   1899327 (22.9%)
MSSAO  0.5    100139 (10.9%)   205380 (9.9%)    247161 (6.7%)    443069 (5.3%)
       1.0    177097 (19.2%)   328483 (15.8%)   372397 (10.1%)   658382 (7.9%)
       1.5    252766 (27.4%)   446188 (21.5%)   486183 (13.2%)   882043 (10.6%)
       2.0    318569 (34.6%)   545771 (26.3%)   582989 (15.8%)   1138504 (13.7%)
       3.0    411762 (44.7%)   731989 (34.7%)   809844 (22.0%)   1590430 (19.2%)

Table 4.1: The number of pixels that are different in the PDIFF comparison when different resolutions and distances are used. The distance is in scene units.

       Dist.  1280×720   1920×1080   2560×1440   3840×2160
MSSAO  0.5    0.84931    0.84452     0.83855     0.84099
       1.0    0.81980    0.81367     0.80624     0.80971
       1.5    0.80618    0.79351     0.78525     0.78953
       2.0    0.79045    0.78130     0.77244     0.77720
       3.0    0.77505    0.76501     0.75505     0.76030

Table 4.2: SSIM map and mean SSIM value between the AO textures and the golden references at different resolutions and distances. Darker parts on the map indicate less similarity. The distances are in scene units.

4.2 Performance

Figure 4.1: Average time in milliseconds per resolution (1280×720, 1920×1080, 2560×1440, 3840×2160) for SAO and MSSAO at the distances 0.5, 1.0, 1.5, 2.0 and 3.0.

Chapter 5

Analysis and Discussion

As seen in figure 4.1, the execution time is on average the same independent of the distance for each of the two techniques. At the lowest resolution, both techniques execute in about the same time on average, except SAO with a distance of 0.5, which is a little faster. The only thing the shorter distance changes, in this case, is that more samples are taken from higher resolution mipmap levels than when longer distances are used. The only plausible explanation is that there are more cache hits with the shorter distance at the lower resolution than with the longer distances, since the number of samples taken is the same for the exact same camera position and viewing direction. As the resolution increases, both techniques take longer. However, the time for MSSAO increases faster than for SAO. This is likely caused by MSSAO taking more samples than SAO: SAO takes 11 samples in total per pixel at the finest resolution, while MSSAO can take up to 36 samples per pixel at any resolution. As the resolution increases, MSSAO will take more samples and its computational time therefore increases. However, MSSAO achieved a better SSIM value, meaning that MSSAO is more similar to the golden reference than SAO. As the resolution increases, the SSIM values of the two techniques become more similar, with SAO getting a better value and MSSAO a slightly worse one, while the execution time for MSSAO increases more than SAO's.

As seen in the SSIM map in figure 5.1b, SAO has problems applying the right AO when lots of geometry is close together, like the plants around the pillars and in the pots that are close to the pillars. While MSSAO managed to apply the AO better around the plants, figure 5.1a, it also applied too much AO when two triangles are only slightly angled toward each other. This error can be seen on the pieces of cloth in figure 5.1c, where SAO has less error, as seen in figure 5.1d.

SAO at the highest resolution and the shortest distance has the highest SSIM value, as table 4.2 shows. This is probably because at a higher resolution there is more information about the scene in the image, and with the same number of mipmap levels, the smallest mipmap will hold more information about the scene than the smallest one at a lower resolution. Samples taken in the smallest mipmaps will therefore be more accurate and give a better result.

Figure 5.1: (a), (b), (c) and (d) show MSSAO and SAO SSIM maps. Darker parts indicate less similarity between the technique and the reference image. MSSAO (a) and SAO (b) show the error around the plants close to the pillars in the scene. MSSAO (c) and SAO (d) show the error on the cloth hanging down the walls. These examples are from the SSIM maps with a distance of 0.5 and a resolution of 1920×1080.

With a shorter distance, the number of samples taken in higher resolution mipmaps is larger than when longer distances are used, which gives a more accurate result.

MSSAO gets a slightly worse SSIM value as the resolution increases. The SSIM value decreasing as the distance goes up is likely caused by AO from lower resolution mipmap levels getting more influence over the total AO, since more samples are taken there. Higher resolutions get a lower SSIM value because the technique is limited to an 11×11 kernel in each mipmap level, meaning that geometry outside of the kernel in the lowest resolution mipmap level will not be included even if it is within the distance from the point being occluded.

Figure 5.2: Example of error caused by the reference technique being computed in object space while the other techniques work in screen space. This is the SSIM map of MSSAO at 1920×1080 with a distance of 0.5 units.

In both techniques there are dark lines under the curtains between the pillars at all resolutions and distances, as seen in figure 5.2. This is because the golden reference image is computed in object space, while these two techniques work in screen space and therefore have no information about how the scene looks behind objects or how thick objects are. Because the curtains are very thin, the occlusion calculated in object space will apply less AO, since rays that pass behind the curtains do not hit anything. MSSAO and SAO do not have information about the thickness and will just sample the position on the curtain and calculate AO based on that. This is not easily solved, since more information about the scene would be needed. However, it is a trade-off that is known in SSAO techniques and accepted for the gain in speed.


Chapter 6

Conclusions and Future Work

The conclusion is that MSSAO is better than SAO at longer distances, while SAO is better at larger resolutions and shorter distances. I would recommend SAO for indoor scenes where there are not too many small meshes close to each other that should occlude each other, and MSSAO for outdoor scenes with a lot of vegetation, which has many small geometry parts close to each other that should occlude. At higher resolutions both techniques take longer computational time, but MSSAO has a larger increase than SAO. At lower resolutions the computational time is similar.

6.1 Future Work

In the current project only one scene is evaluated. It would be of interest to evaluate the two techniques across a series of scenes, that is, in a computer game. SAO uses less computational resources and would thus render scenes faster. Short computational time is often preferred in modern games to increase the sense of realism.

Other things that could be added to the scene are lighting and diffuse, specular and normal textures. Smaller differences in the AO texture might not be noticeable when lighting and other textures are added.

This experiment could also be done with people assessing the difference rather than only using metrics. People would probably say the techniques look different from the reference, but which technique is most similar and has the least error might give a different result compared with the metrics.

It would also be interesting to test the techniques on a more varied set of hardware, like weaker GPUs integrated on the CPU and newer top-of-the-line desktop GPUs. Since the execution time is about the same at the two lower resolutions and grows at the higher resolutions, it would be of interest to compare performance on different hardware. A weaker integrated GPU would probably show a bigger difference at lower resolutions, while a current top-of-the-line discrete GPU might show a smaller difference at higher resolutions, since the computational power of those GPUs is less or greater than that of a GTX 680.


References

[1] Crytek Sponza model. http://www.crytek.com/cryengine/cryengine3/downloads. Accessed: 2017-03-27.

[2] Maya | Computer animation & modeling software | Autodesk. http://www.autodesk.com/products/maya/overview. Accessed: 2017-04-03.

[3] Perceptual image difference utility. http://pdiff.sourceforge.net/. Accessed: 2017-02-11.

[4] Louis Bavoil and Miguel Sainz. Multi-layer dual-resolution screen-space ambient occlusion. In SIGGRAPH 2009: Talks, SIGGRAPH '09, pages 45:1-45:1. ACM.

[5] Louis Bavoil, Miguel Sainz, and Rouslan Dimitrov. Image-space horizon-based ambient occlusion. In ACM SIGGRAPH 2008 Talks, pages 22:1-22:1. ACM.

[6] Randima Fernando. Ambient occlusion. In GPU Gems: Programming Techniques, Tips and Tricks for Real-Time Graphics. Addison-Wesley Professional.

[7] Dominic Filion and Rob McNaughton. Effects & techniques. In ACM SIGGRAPH 2008 Games, pages 133-164. ACM.

[8] Thai-Duong Hoang and Kok-Lim Low. Efficient screen-space approach to high-quality multiscale ambient occlusion. The Visual Computer, 28(3):289-304.

[9] Morgan McGuire, Michael Mara, and David Luebke. Scalable ambient obscurance. In Proceedings of the Fourth ACM SIGGRAPH / Eurographics Conference on High-Performance Graphics, pages 97-103. Eurographics Association.

[10] Morgan McGuire, Brian Osman, Michael Bukowski, and Padraic Hennessy. The alchemy screen-space ambient obscurance algorithm. In Proceedings of the ACM SIGGRAPH Symposium on High Performance Graphics, pages 25-32. ACM.

[11] Martin Mittring. Finding next gen: CryEngine 2. In ACM SIGGRAPH 2007 Courses, pages 97-121. ACM.

[12] Ville Timonen. Line-sweep ambient obscurance. Computer Graphics Forum, 32(4):97-105.

[13] Kostas Vardis, Georgios Papaioannou, and Athanasios Gaitatzes. Multi-view ambient occlusion with importance sampling. In Proceedings of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, I3D '13, pages 111-118. ACM.

[14] Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, and Eero P. Simoncelli. Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600-612.

Appendix A

AO Textures


Figure A.1: AO textures in 1280×720.

Figure A.2: AO textures in 1920×1080.

Figure A.3: AO textures in 2560×1440.

Figure A.4: AO textures in 3840×2160.


Appendix B

Execution Time

Table B.1 shows the average frame execution time for the shaders of each step in the AO calculation process. The times are averaged over 2½ minutes of execution. A step in the process can contain several shaders, or execute a shader several times during a frame; the average time is then the average total time for all the shader executions during a frame, not for a single shader execution.

       Res.       Dist.  Depth to Z  Downsample  AO    Blur  Total
SAO    1280×720   0.5    0.59        0.17        0.57  0.24  1.56
                  1      0.90        0.17        0.57  0.24  1.88
                  1.5    0.91        0.18        0.57  0.25  1.90
                  2      0.83        0.17        0.57  0.24  1.82
                  3      0.79        0.18        0.56  0.24  1.76
       1920×1080  0.5    0.29        0.35        1.12  0.45  2.22
                  1      0.40        0.38        1.22  0.48  2.47
                  1.5    0.48        0.36        1.20  0.48  2.53
                  2      0.47        0.34        1.21  0.48  2.50
                  3      0.54        0.34        1.20  0.48  2.55
       2560×1440  0.5    0.53        0.56        1.97  0.77  3.79
                  1      0.26        0.55        2.12  0.80  3.73
                  1.5    0.30        0.55        2.10  0.81  3.75
                  2      0.26        0.55        2.13  0.82  3.75
                  3      0.37        0.55        1.95  0.76  3.63
       3840×2160  0.5    0.64        1.10        4.34  1.63  7.70
                  1      0.68        1.10        4.33  1.62  7.73
                  1.5    0.64        1.09        4.31  1.63  7.67
                  2      0.71        1.09        4.33  1.63  7.76
                  3      0.71        1.09        4.31  1.63  7.74
MSSAO  1280×720   0.5    -           0.52        1.07  0.20  1.79
                  1      -           0.70        1.03  0.20  1.93
                  1.5    -           0.72        1.08  0.20  2.01
                  2      -           0.72        1.07  0.20  1.99
                  3      -           0.61        1.06  0.20  1.89
       1920×1080  0.5    -           0.56        1.96  0.23  2.76
                  1      -           0.58        2.05  0.28  2.92
                  1.5    -           0.47        2.12  0.28  2.87
                  2      -           0.44        2.08  0.21  2.73
                  3      -           0.44        2.06  0.23  2.73
       2560×1440  0.5    -           0.83        3.33  0.35  4.54
                  1      -           0.95        3.40  0.33  4.68
                  1.5    -           0.87        3.54  0.33  4.74
                  2      -           0.94        3.56  0.36  4.89
                  3      -           0.80        3.34  0.36  4.50
       3840×2160  0.5    -           1.73        7.09  0.51  9.33
                  1      -           1.86        6.83  0.51  9.20
                  1.5    -           1.79        7.08  0.50  9.37
                  2      -           1.83        7.12  0.54  9.48
                  3      -           1.83        7.17  0.55  9.55

Times are in milliseconds.

Appendix C

Source code

C.1 SAO

Listing C.1: linjerZ.hlsl

/**
Open Source under the "BSD" license: http://www.opensource.org/licenses/bsd-license.php

Copyright (c) 2011-2012, NVIDIA
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
Texture2D<float> depthTex : register(t0);

cbuffer constants : register(b0) {
    float c0;
    float c1;
    float c2;
}

struct IN
{
    float4 Pos : SV_POSITION;
    float2 TexCoord : TEXCOORD0;
};

// Convert a hardware depth buffer value to linear view-space depth.
float main(IN input) : SV_TARGET
{
    float depth = depthTex.Load(float3(input.Pos.xy, 0));
    depth = c0 / (depth * c1 + c2);
    return depth;
}

Listing C.2: downsample.hlsl

/**
\file SSAO_minify.pix
\author Morgan McGuire and Michael Mara, NVIDIA Research
DX11 HLSL port by Leonardo Zide, Treyarch

Open Source under the "BSD" license: http://www.opensource.org/licenses/bsd-license.php

Copyright (c) 2011-2012, NVIDIA
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
Texture2D<float> previousMipTexture : register(t0);

struct IN
{
    float4 Pos : SV_POSITION;
    float2 TexCoord : TEXCOORD0;
};

// Generate the next level of the depth mipmap hierarchy by taking one
// rotated sample from the previous level.
float main(in IN input) : SV_TARGET
{
    int2 ssp = input.Pos.xy;
    return previousMipTexture.Load(int3(ssp * 2 + int2((ssp.y & 1), (ssp.x & 1)), 0));
}

Listing C.3: AO.hlsl

/**
\file SAO_AO.pix
\author Morgan McGuire and Michael Mara, NVIDIA Research

Reference implementation of the Scalable Ambient Obscurance (SAO) screen-space ambient obscurance algorithm.

The optimized algorithmic structure of SAO was published in McGuire, Mara, and Luebke, Scalable Ambient Obscurance, <i>HPG</i> 2012, and was developed at NVIDIA with support from Louis Bavoil.

The mathematical ideas of AlchemyAO were first described in McGuire, Osman, Bukowski, and Hennessy, The Alchemy Screen-Space Ambient Obscurance Algorithm, <i>HPG</i> 2011 and were developed at Vicarious Visions.

DX11 HLSL port by Leonardo Zide of Treyarch

<hr>

Open Source under the "BSD" license: http://www.opensource.org/licenses/bsd-license.php

Copyright (c) 2011-2012, NVIDIA
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

#define NUM_SAMPLES 11
#define LOG_MAX_OFFSET 3
#define MAX_MIP_LEVEL 5
#define FAR_PLANE_Z 300.0
#define NUM_SPIRAL_TURNS 7

Texture2D<float> linjerZBuffer : register(t0);

cbuffer vars : register(b1) {
    float4 projConst;
    float radius;
    float bias;
    float intensityDivRadius6; // Intensity is 0.3, so this variable is equal to 0.3/radius^6
    float projScale;
};

struct IN {
    float4 Pos : SV_POSITION;
    float2 TexCoord : TEXCOORD0;
};

float3 reconstructPosition(in float2 screenspacePos, in float z) {
    return float3((screenspacePos * projConst.xy + projConst.zw) * z, z);
}

float3 reconstructNormal(in float3 pos) {
    return normalize(cross(ddx(pos), ddy(pos)));
}

float3 getViewspacePosition(in int2 screenspacePos) {
    float3 Pos;
    Pos.z = linjerZBuffer.Load(int3(screenspacePos, 0));
    Pos = reconstructPosition(float2(screenspacePos) + float2(0.5, 0.5), Pos.z);
    return Pos;
}

void sampleLocation(in int sampleIndex, in float offsetAngle, out float sampleDistance, out float2 sampleLocationDirection) {
    sampleDistance = float(sampleIndex + 0.5) * (1.0 / NUM_SAMPLES);

    float angle = sampleDistance * (NUM_SPIRAL_TURNS * 6.28) + offsetAngle;

    sampleLocationDirection = float2(cos(angle), sin(angle));
}

float3 getSamplePosition(in int2 screenspacePos, in float2 sampleLocationDirection, in float sampleDistance) {
    // Distant samples are read from lower resolution mipmap levels.
    uint mipLevel = clamp(int(firstbithigh(int(sampleDistance))) - LOG_MAX_OFFSET, 0, MAX_MIP_LEVEL);

    int2 screenspaceSamplePos = int2(sampleLocationDirection * sampleDistance) + screenspacePos;

    uint2 size;
    uint dummy;
    linjerZBuffer.GetDimensions(mipLevel, size.x, size.y, dummy);
    int2 mipPos = clamp(screenspaceSamplePos >> mipLevel, uint2(0, 0), size - uint2(1, 1));

    float3 samplePos;
    samplePos.z = linjerZBuffer.Load(int3(mipPos, mipLevel));

    samplePos = reconstructPosition(float2(screenspaceSamplePos) + float2(0.5, 0.5), samplePos.z);

    return samplePos;
}

float sampleAO(in int2 screenspacePos, in float3 centerPos, in float3 viewspaceNormal, in float sampleRadius, in int sampleIndex, in float startOffsetAngle, in float radius2) {
    float sampleDistance;
    float2 sampleLocationDirection;

    sampleLocation(sampleIndex, startOffsetAngle, sampleDistance, sampleLocationDirection);

    sampleDistance *= sampleRadius;

    float3 samplePos = getSamplePosition(screenspacePos, sampleLocationDirection, sampleDistance);

    float3 v = samplePos - centerPos;
    float vv = dot(v, v);
    float vn = dot(v, viewspaceNormal);

    const float epsilon = 0.01;

    float f = max(radius2 - vv, 0.0);
    return f * f * f * max((vn - bias) / (epsilon + vv), 0.0);
}

float2 main(in IN input) : SV_TARGET {
    float2 output = 0;
    float radius2 = radius * radius;

    int2 screenspacePos = input.Pos.xy;

    float3 viewspacePos = getViewspacePosition(screenspacePos);

    float3 viewspaceNormal = reconstructNormal(viewspacePos);

    output.g = clamp(viewspacePos.z * (1.0 / FAR_PLANE_Z), 0.0, 1.0);

    float startOffsetAngle = (3 * screenspacePos.x ^ screenspacePos.y + screenspacePos.x * screenspacePos.y) * 10;

    float sampleRadius = -projScale * radius / viewspacePos.z;

    float sum = 0.0;

    for (int i = 0; i < NUM_SAMPLES; i++) {
        sum += sampleAO(screenspacePos, viewspacePos, viewspaceNormal, sampleRadius, i, startOffsetAngle, radius2);
    }

    output.r = max(0.0, 1.0 - sum * intensityDivRadius6 * (5.0 / NUM_SAMPLES));

    if (abs(ddx(output.g)) < 0.02) {
        output.r -= ddx(output.r) * ((screenspacePos.x & 1) - 0.5);
    }
    if (abs(ddy(output.g)) < 0.02) {
        output.r -= ddy(output.r) * ((screenspacePos.y & 1) - 0.5);
    }

    return output;
}

Listing C.4:

/**
\file SAO_blur.pix
\author Morgan McGuire and Michael Mara, NVIDIA Research

\brief 7-tap 1D cross-bilateral blur using a packed depth key

DX11 HLSL port by Leonardo Zide, Treyarch

Open Source under the "BSD" license: http://www.opensource.org/licenses/bsd-license.php

Copyright (c) 2011-2012, NVIDIA
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

Texture2D<float4> AOTexture : register(t0);

#define EDGE_SHARPNESS 1.0

cbuffer var : register(b0) { int2 axis; };

struct IN {
    float4 Pos : SV_POSITION;
    float2 TexCoord : TEXCOORD0;
};

// One 1D pass of the depth-aware blur; 'axis' selects horizontal or vertical.
float2 main(in IN input) : SV_TARGET {
    float gaussian[5] = { 0.153170, 0.144893, 0.122649, 0.092902, 0.062970 };

    float2 output;
    int2 screenspacePos = input.Pos.xy;

    float4 tmp = AOTexture.Load(int3(screenspacePos, 0));
    float sum = tmp.r;
    float depth = tmp.g;
    output.g = depth;

    float totalWeight = gaussian[0];
    sum *= totalWeight;

    [unroll]
    for (int i = -4; i <= 4; i++) {
        if (i != 0) {
            tmp = AOTexture.Load(int3(screenspacePos + axis * (i * 2), 0));
            float sampleDepth = tmp.g;
            float value = tmp.r;

            float weight = 0.3 + gaussian[abs(i)];

            // Fade the weight as the depth difference to the center pixel grows.
            weight *= max(0.0, 1.0 - (2000.0 * EDGE_SHARPNESS) * abs(sampleDepth - depth));

            sum += value * weight;
            totalWeight += weight;
        }
    }

    const float epsilon = 0.0001;
    output.r = sum / (totalWeight + epsilon);
    return output;
}

Listing C.5:

        return a;
    else
        return b;
}

OUT main(IN input) {
    int2 sceenSpacePos = input.Pos.xy * 2;
    PixelInfo pixelInfo[4];
    pixelInfo[0].Position = VPositionTex.Load(int3(sceenSpacePos + positionOffsets[0], 0));
    pixelInfo[0].Normal = normalTex.Load(int3(sceenSpacePos + positionOffsets[0], 0));

    pixelInfo[1].Position = VPositionTex.Load(int3(sceenSpacePos + positionOffsets[1], 0));
    pixelInfo[1].Normal = normalTex.Load(int3(sceenSpacePos + positionOffsets[1], 0));

    pixelInfo[2].Position = VPositionTex.Load(int3(sceenSpacePos + positionOffsets[2], 0));
    pixelInfo[2].Normal = normalTex.Load(int3(sceenSpacePos + positionOffsets[2], 0));

    pixelInfo[3].Position = VPositionTex.Load(int3(sceenSpacePos + positionOffsets[3], 0));
    pixelInfo[3].Normal = normalTex.Load(int3(sceenSpacePos + positionOffsets[3], 0));

            Min(pixelInfo[0], pixelInfo[1]),
            Max(pixelInfo[2], pixelInfo[3])
        )
    );

    pixelInfoSorted[2] = Min(
        Max(
            Max(pixelInfo[0], pixelInfo[1]),
            Min(pixelInfo[2], pixelInfo[3])
        ),
        Max(
            Min(pixelInfo[0], pixelInfo[1]),
            Max(pixelInfo[2], pixelInfo[3])
        )
    );

    pixelInfoSorted[3] = Max(
        Max(pixelInfo[0], pixelInfo[1]),
        Max(pixelInfo[2], pixelInfo[3])
    );

    OUT output;

    if (pixelInfoSorted[3].Position.z - pixelInfoSorted[0].Position.z <= 1.0) {
        output.VPos = (pixelInfoSorted[1].Position + pixelInfoSorted[2].Position) / 2.0;
        output.Normal = (pixelInfoSorted[1].Normal + pixelInfoSorted[2].Normal) / 2.0;
    }
    else {
        output.VPos = pixelInfoSorted[1].Position;
        output.Normal = pixelInfoSorted[1].Normal;
    }

    return output;
}

Listing C.6: AOLowestRes.hlsl

cbuffer vars : register(b0) {
    int mipmapLevel;
    int maxSampleDistance;
    float radius;
    float projScale;
};

Texture2D<float4> VPositionTex : register(t0);
Texture2D<float4> normalTex : register(t1);

struct IN {
    float4 Pos : SV_POSITION;
    float2 TexCoord : TEXCOORD0;
};

float sampleAO(in int2 sampleLocation, in float3 centerPos, in float3 viewspaceNormal)
{
    float3 sampledPos = VPositionTex.Load(int3(sampleLocation.xy, 0));
    float3 v = normalize(sampledPos - centerPos);
    float t = saturate(dot(viewspaceNormal, v));
    float f = distance(centerPos, sampledPos) / radius;
    return (1.0 - min(1.0, f * f)) * t;
};

float2 nearAO(in int2 centerPos, in float3 viewspacePos, in float3 viewspaceNormal, in int sampleDistans) {
    float sum = 0.0;
    int sampleCount = 0;

    for (int i = -sampleDistans; i <= sampleDistans; i += 1)
    {
        for (int j = -sampleDistans; j <= sampleDistans; j += 1)
        {
            if (i == 0 && j == 0)
                continue;
            sum += sampleAO(int2(centerPos.x + i, centerPos.y + j), viewspacePos, viewspaceNormal);
            sampleCount++;
        }
    }
    float2 AONear;
    AONear[0] = sum;
    AONear[1] = sampleCount;
    return AONear;
}

float4 main(in IN input) : SV_TARGET {

    float3 viewspacePos = VPositionTex.Load(int3(input.Pos.xy, 0));
    float3 viewspaceNormal = normalTex.Load(int3(input.Pos.xy, 0));

    int sampleRadius = int(projScale * radius / viewspacePos.z) >> mipmapLevel;

    int sampleDistans = min(maxSampleDistance, sampleRadius);

    if (sampleDistans < 1) {
        discard;
    }

    float2 AONear = nearAO(int2(input.Pos.xy), viewspacePos, viewspaceNormal, sampleDistans);

    float3 output;

Listing C.7:

cbuffer vars : register(b0) {
    int mipmapLevel;
    int maxSampleDistance;
    float radius;
    float projScale;
};

Texture2D<float4> VPositionTex : register(t0);
Texture2D<float4> normalTex : register(t1);
Texture2D<float4> previousLevelAO : register(t2);
Texture2D<float4> previousLevelPosition : register(t3);
Texture2D<float4> previousLevelNormal : register(t4);

#define Tn 8.0
#define Tz 16.0

struct IN {
    float4 Pos : SV_POSITION;
    float2 TexCoord : TEXCOORD0;
};

float sampleAO(in int2 sampleLocation, in float3 centerPos, in float3 viewspaceNormal)
{
    float3 sampledPos = VPositionTex.Load(int3(sampleLocation.xy, 0));
    float3 v = normalize(sampledPos - centerPos);
    float t = saturate(dot(viewspaceNormal, v));
    float f = distance(centerPos, sampledPos) / radius;
    return (1.0 - min(1.0, f * f)) * t;
};

                direction.x - j, previousLevelLocation.y + direction.y - i);

            float3 samplePos = previousLevelPosition.Load(int3(location, 0));
            float3 sampleNormal = previousLevelNormal.Load(int3(location, 0));

            // Bilateral upsampling weight: bilinear weight scaled by normal
            // and depth similarity to the center pixel.
            float wight;
            if (direction.x == j && direction.y == i) {
                wight = 9.0 / 16.0;
            }
            else if (direction.x != j && direction.y != i)
            {
                wight = 1.0 / 16.0;
            }
            else {
                wight = 3.0 / 16.0;
            }

            float Wn = pow(((dot(centerNormal, sampleNormal) + 1.0) / 2.0), Tn);
            float Wz = pow((1.0 / (1.0 + abs(samplePos.z - centerDepth))), Tz);

            wight = wight * Wn * Wz;

            float3 sampleAO = previousLevelAO.Load(int3(location, 0));
            AOFar[0] += sampleAO[0] * wight;
            AOFar[1] += sampleAO[1] * wight;
            AOFar[2] += sampleAO[2] * wight;
        }
    }

    return AOFar;
}

float2 nearAO(in int2 centerPos, in float3 viewspacePos, in float3 viewspaceNormal, in int sampleDistans) {
    float sum = 0.0;
    int sampleCount = 0;

    for (int i = -sampleDistans; i <= sampleDistans; i += 2)
    {
        for (int j = -sampleDistans; j <= sampleDistans; j += 2)
        {
            sum += sampleAO(int2(centerPos.x + i, centerPos.y + j), viewspacePos, viewspaceNormal);
            sampleCount++;
        }
    }

    float2 AONear;
    AONear[0] = sum;
    AONear[1] = sampleCount;
    return AONear;
}

float3 combainAO(in float2 AONear, in float3 AOFar)
{
    float3 AOCombined;
    AOCombined[0] = max(AONear[0] / AONear[1], AOFar[0]);
    AOCombined[1] = AOFar[1] + AONear[0];
    AOCombined[2] = AOFar[2] + AONear[1];
    return AOCombined;
}

float4 main(in IN input) : SV_TARGET {

    float3 viewspacePos = VPositionTex.Load(int3(input.Pos.xy, 0));
    float3 viewspaceNormal = normalTex.Load(int3(input.Pos.xy, 0));

    int sampleRadius = int(projScale * radius / viewspacePos.z) >> mipmapLevel;

    int sampleDistans = min(maxSampleDistance, sampleRadius);

    if (sampleDistans < 1) {
        discard;
    }

    float2 AONear = nearAO(int2(input.Pos.xy), viewspacePos, viewspaceNormal, sampleDistans);
    float3 AOFar = farAO(int2(input.Pos.xy), viewspacePos.z, viewspaceNormal);
    float3 AOCombained = combainAO(AONear, AOFar);

    return float4(AOCombained, 1.0);
}

Listing C.8: AOhighestRes.hlsl

cbuffer vars : register(b0) {
    int mipmapLevel;
    int maxSampleDistance;
    float radius;
    float projScale;
};

Texture2D<float4> VPositionTex : register(t0);
Texture2D<float4> normalTex : register(t1);
Texture2D<float4> previousLevelAO : register(t2);
Texture2D<float4> previousLevelPosition : register(t3);
Texture2D<float4> previousLevelNormal : register(t4);

#define Tn 8.0
#define Tz 16.0

struct IN {
    float4 Pos : SV_POSITION;
    float2 TexCoord : TEXCOORD0;
};

float sampleAO(in int2 sampleLocation, in float3 centerPos, in float3 viewspaceNormal)
{
    float3 sampledPos = VPositionTex.Load(int3(sampleLocation.xy, 0));
    float3 v = normalize(sampledPos - centerPos);
    float t = saturate(dot(viewspaceNormal, v));
    float f = distance(centerPos, sampledPos) / radius;
    return (1.0 - min(1.0, f * f)) * t;
};

            else {
                wight = 3.0 / 16.0;
            }

            float Wn = pow(((dot(centerNormal, sampleNormal) + 1.0) / 2.0), Tn);
            float Wz = pow((1.0 / (1.0 + abs(samplePos.z - centerDepth))), Tz);

            wight = wight * Wn * Wz;

            float3 sampleAO = previousLevelAO.Load(int3(location, 0));
            AOFar[0] += sampleAO[0] * wight;
            AOFar[1] += sampleAO[1] * wight;
            AOFar[2] += sampleAO[2] * wight;
        }
    }

    return AOFar;
}

        { 0.9099292f, 0.121637f },
        { 0.6441103f, -0.4789202f },
        { 0.4463334f, -0.867959f },
        { 0.1551299f, -0.2643994f },
        { -0.9389378f, -0.075156f }
        };
        sampleCount = 16;
        for (int i = 0; i < 16; i++)
            sum += sampleAO(centerPos + positions[i] * sampleDistans, viewspacePos, viewspaceNormal);
    }
    else {
        for (int i = -sampleDistans; i <= sampleDistans; i += 1)
        {
            for (int j = -sampleDistans; j <= sampleDistans; j += 1)
            {
                sum += sampleAO(int2(centerPos.x + i, centerPos.y + j), viewspacePos, viewspaceNormal);
                sampleCount++;
            }
        }
    }

    float2 AONear;
    AONear[0] = sum;
    AONear[1] = sampleCount;
    return AONear;
}

float3 combainAO(in float2 AONear, in float3 AOFar)
{
    float3 AOCombined;
    AOCombined[0] = max(AONear[0] / AONear[1], AOFar[0]);
    AOCombined[1] = AOFar[1] + AONear[0];
    AOCombined[2] = AOFar[2] + AONear[1];
    return AOCombined;
}

float4 main(in IN input) : SV_TARGET {

    float3 viewspacePos = VPositionTex.Load(int3(input.Pos.xy, 0));
    float3 viewspaceNormal = normalTex.Load(int3(input.Pos.xy, 0));

    int sampleRadius = int(projScale * radius / viewspacePos.z) >> mipmapLevel;

    int sampleDistans = min(maxSampleDistance, sampleRadius);

    if (sampleDistans < 1) {
        discard;
    }

    float2 AONear = nearAO(int2(input.Pos.xy), viewspacePos, viewspaceNormal, sampleDistans);
    float3 AOFar = farAO(int2(input.Pos.xy), viewspacePos.z, viewspaceNormal);
    float3 AOCombained = combainAO(AONear, AOFar);

    // Combine the maximum and the average occlusion into the final AO value.
    float AOmax = AOCombained[0];
    float AOaverage = AOCombained[1] / AOCombained[2];
    float AOfinal = 1.0 - (1.0 - AOmax) * (1.0 - AOaverage);
    AOCombained[0] = 1.0 - AOfinal;

    return float4(AOCombained, 1.0);
}

Listing C.9: blur.hlsl

struct IN {
    float4 Pos : SV_POSITION;
    float2 TexCoord : TEXCOORD0;
};

Texture2D<float4> previousLevelAO : register(t0);

// 3x3 weighted blur of the combined AO values.
float4 main(in IN input) : SV_TARGET {
    float maxAO = 0.0;
    float unnormalizedSumAO = 0.0;
    float totalSamples = 0.0;
    float totalWeight = 0.0;
    [unroll]
    for (int x = -1; x <= 1; x++) {
        [unroll]
        for (int y = -1; y <= 1; y++) {
            float2 samplePos = { input.Pos.x + x, input.Pos.y + y };
            float3 samplePixelAOcombined = previousLevelAO.Load(int3(samplePos, 0));
            float tmp = (abs(x) + 1.0) * (abs(y) + 1.0);
            float weight = 1.0 / tmp;

            maxAO += weight * samplePixelAOcombined[0];
            unnormalizedSumAO += weight * samplePixelAOcombined[1];
            totalSamples += weight * samplePixelAOcombined[2];
            totalWeight += weight;
        }
    }
    return float4(maxAO / totalWeight, unnormalizedSumAO / totalWeight, totalSamples / totalWeight, 0.0);
}
