Degree Project in Computer Science and Engineering, Second Cycle, 30 Credits
Stockholm, Sweden 2019

Real-time Raytracing and Screen-space Ambient Occlusion

POORIA GHAVAMIAN

Supervisor at KTH: Björn Thuresson
Examiner: Tino Weinkauf
Supervisor at Fatshark: Axel Kinner


Abstract 

 

This paper investigates the advances in real-time ambient occlusion (AO). The topics discussed are state-of-the-art screen-space techniques and raytraced ambient occlusion. The methods compared are our screen-space ambient occlusion (SSAO) variant, horizon-based ambient occlusion (HBAO), Unity's scalable AO (AlchemyAO), multi-scale volumetric AO (MSVO), and raytraced AO (RTAO).

The methods were compared based on the errors produced in dynamic scenes, performance, and similarity to reference scenes rendered by an offline raytracer. Important dynamic-scene errors were highlighted, visual results were objectively evaluated using the Structural Similarity Index (SSIM), and the Unity engine was used as a common platform for all the methods in order to obtain performance metrics. RTAO managed to achieve a strikingly high SSIM score, while MSVO traded some accuracy to be the fastest of all the methods. Further analysis of the different implementations and their strengths and weaknesses is provided.

                                                 


Sammanfattning

This study explores advances in real-time ambient occlusion (AO). The topics discussed are the latest kind of screen-space techniques and raytraced ambient occlusion. The methods compared are our own screen-space ambient occlusion (SSAO) variant, horizon-based ambient occlusion (HBAO), Unity's scalable AO (Alchemy AO), multi-scale volumetric AO (MSVO), and raytraced AO (RTAO). The methods were compared based on performance, similarity to reference scenes, and the errors produced in dynamic scenes. Important dynamic-scene errors were highlighted, and the visual results were objectively evaluated using the Structural Similarity Index (SSIM). The Unity engine was used as a common platform for all methods in order to obtain performance measurements. RTAO managed to achieve a high SSIM score, while MSVO was the fastest of all the methods, albeit with lower precision. Further analysis of the different implementations and their strengths and weaknesses is included in the report.

                                                 

Table of Contents

Abstract
Background
Introduction
Motivation
DirectX Raytracing
Problem Statement
Research Question
Goal
Related Work
Rendering Equation
Ambient Occlusion
Monte-Carlo Method
Screen-space Ambient Occlusion Methods
Crytek SSAO
Normal-oriented SSAO
Horizon-based Ambient Occlusion
Volumetric Obscurance
Alchemy and Scalable Ambient Occlusion
Multi-scale Ambient Occlusion
Denoising and Blur
Gaussian Blur
Separable Blur
Bilateral Blur and variations
Spatiotemporal approach
Implementations
SSAO Variant
HBAO
Unity's SAO (AlchemyAO)
MSVO
RTAO
Results
Dynamic Scene Applicability
Banding and Noise
Temporal Lag
Over-occlusion
Edge-of-Screen artefacts
Flickering
Performance
Scalability
Resolution
Sample Count
Radius
Raytraced Ambient Occlusion
Resolution
Radius
Sample Count
Accuracy
Discussion
Criticism and Future Work
Conclusion
Acknowledgment
Bibliography
Appendix A
Appendix B

Background 

 

This section lays down the theoretical foundation for the thesis. Included in this chapter are the problem statement, theory, motivation, and related studies, and finally the research question that governs the main direction for the rest of the paper. A brief discussion of the evaluation methods is also provided at the end.

 

Introduction   

The project was carried out at Fatshark, a video game company in Stockholm. Their products, like most modern video games, rely heavily on nuanced visual cues to leave a convincing perceptual impression on users and to provide a strong sense of immersion. An effective way of accomplishing this goal is to pay close attention to the interplay of lighting and scene elements.

 

The way light interacts with the environment in reality is quite complex and carries information that ranges from the redness of an apple to the age of a distant galaxy. Light consists of photons, and the visible light that is the focus of this study is only a small part of the electromagnetic spectrum. Light has many strange properties, such as being both a particle and a wave at the same time, and its optical behavior underlies familiar phenomena like caustics, afterglow, reflection, and refraction, as well as less common ones like the 22° halo and diffraction.

   

It should come as no surprise that accurately capturing and displaying all the intricacies of light can prove an insurmountable task. Therefore, in real-time computer graphics, we have to approximate the behavior of light and work with a simplified model of it. One way of calculating the light at each point is to take into account only the light sources and the object being lit; this simplified model is called local illumination. A widely used local illumination model is the Phong reflection model, which breaks a reflection off an object down into ambient, diffuse, and specular terms. The diffuse term, which uses Lambertian reflection, represents the direct light hitting a surface; it depends on the incident angle and is independent of the view. The specular term, on the other hand, attempts to capture the specular highlights seen in polished, shiny objects and is view-dependent. It is the ambient term, however, that can prove too simplified and look incorrect, as it is only a constant value spread equally across the scene. Shadows and the inter-reflections between objects are therefore ignored in local illumination models. Models that do take into account the light reflected from other objects within a scene (indirect light) are called global illumination models.

 

Global illumination (GI) is a crucial component of a rendered scene. As mentioned before, surfaces receive light from all directions; if that were not the case, objects not under direct light (e.g. the corners of walls) would appear pitch black. Instead of considering only the direct effect of the light source on objects (direct lighting), GI follows the light as it bounces off the different objects in the environment. This approach drastically improves a scene's quality by replicating lighting behavior that is closer to reality. However, an accurate implementation is computationally expensive, and even a simple scene can take hours to render. Therefore, a mix of a local illumination model and algorithms that handle specific GI effects is used for real-time applications.

 

Motivation   

Ambient occlusion is an effect that helps us approximate global illumination. Simply put, ambient occlusion is a measure of the exposure of each point in the scene to the ambient lighting, expressed as a scalar value between zero and one. The effect is powerful at bringing out details in the scene and, unlike other forms of lighting, does not depend on light direction. Some games precompute ambient occlusion for their indirect lighting calculations through Monte-Carlo methods [1]: rays are cast in a random and uniform fashion over the hemisphere around the normal, and the visibility function is calculated based on the intersections. Baking (a non-real-time solution) the ambient occlusion can increase quality and performance for static scenes; however, when it comes to dynamic scenes, a real-time solution is needed.

 

Baking is a powerful tool in computer graphics when used appropriately, and is essentially a transfer process: a precomputation that can be done for many different attributes. Instead of calculating textures and other attributes in real time, one can bake them before running a scene and simply reload them when needed. Naturally, this can only be applied to static objects, as otherwise, if an object moves within a scene, its baked shadow will remain in the same place.

 

The real-time methods can be divided into object-space and screen-space approaches. Object-space and screen-space refer to the coordinate systems used in computer graphics; based on the calculation taking place, one might switch from one space to another. As the name suggests, screen-space is defined by the screen, with coordinates in 2D, whereas object-space is the coordinate system from the object's point of view.

 


Object-space solutions have a cost proportional to scene complexity, and usually sampling is done to find ray intersections with non-empty areas around each point. Screen-space solutions, on the other hand, make use of the depth, normal, and position buffers (screen-space data). Scene complexity does not affect screen-space solutions: they have an approximately constant cost and are affected only by the resolution used for rendering [1].

 

A well-known screen-space solution is SSAO [14], which uses only the z-buffer (which stores depth information) as its input. The ambient occlusion factor is calculated as a function of samples acquired in a spherical depth test. Although the result is not physically accurate, it still manages to increase the quality of the scene in a robust way. Since SSAO shares many properties with other screen-space solutions, it will be analyzed in detail. There are, of course, certain problems associated with this approach, which we discuss in a later section.

 

DirectX Raytracing   

DirectX Raytracing (DXR) introduces new elements to the DirectX 12 API, enabling GPU-accelerated raytracing. It provides developers with new shaders, acceleration structures, and other useful elements. It is quite low-level: the developer is given a lot of freedom to optimize their implementation, but also has to deal with all the minute details of the pipeline.

 

Raytracing is a powerful tool that is, at its core, quite intuitive. The way raytracing works is reminiscent of how the ancient Greek philosophers thought human vision worked: in what is called the extramission theory, visual perception is made possible through beams that leave our eyes and hit the objects around us. Unlike rasterization, where primitive traversal takes place and the 3D scene is projected onto the 2D screen before coloring, in raytracing effects like reflection, refraction, and shadows emerge naturally from the act of casting rays, without special-case techniques.

 

A basic Whitted raytracer, as demonstrated in "An Improved Illumination Model for Shaded Display" [49], uses a recursive structure to achieve photorealism. There are two types of rays: primary rays and secondary rays. Primary rays are used to compute visibility, in what is also known as raycasting. In a naive implementation, a primary ray is generated and cast through each pixel, and it is checked against the scene's triangles for the closest intersection distance. The introduction of secondary rays managed to solve three of the important challenges that the rendering community faced at the time: reflection, refraction, and shadows. Secondary rays are spawned at the point where a primary ray intersects an object. For example, in the case of a diffuse object, a secondary ray (also known as a shadow ray) is sent towards the light source to determine whether the point at which it spawned lies in the shadow of another object. The same idea applies to reflection and refraction as well.

 

Figure 1. Recursive raytracing showcasing primary and secondary rays. Note how shade() invokes trace() for further assessment of the radiance. Source from [1].

     

The main operations of a naïve raytracer can, therefore, be separated into two parts:  

● Intersection calculation 

● Shading   

For a given scene with N primitives, a basic raytracer will carry out intersection tests against each primitive in O(N) time. The shading that takes place upon intersection is similarly done in O(N) time. Since optimization has moved towards execution profiling (detecting the hotspots in a program), a raytracer can be optimized in mostly two different ways, according to the following breakdown of the total time spent in a program:

T_{total} = \sum_{i} n_i \, t_i

Where n_i is the number of times task i is executed and t_i is the cost of a single execution of task i.

As shown above, one can either carry out low-level optimization of a given task i (reducing t_i), or reduce the number of times n_i the program has to run that task. In this context, the intersection test is defined as task i. DirectX Raytracing (DXR) provides us with the means to optimize this hotspot with its new additions to the DirectX 12 API.

 

Using acceleration structures, one can significantly reduce the number of unnecessary intersection tests. Acceleration structures can be divided into two main groups, object hierarchies and spatial subdivision, with each side having its own advantages and disadvantages. DXR opts for the former class of acceleration structures by deploying the commonly used bounding volume hierarchy (BVH) in its implementation (although it does not mandate its use) [35]. A BVH encloses the primitives in axis-aligned bounding boxes (AABBs) and guarantees bounded memory usage [35]. DXR uses a two-level hierarchy, with the structure divided into a top-level acceleration structure (TLAS) and bottom-level acceleration structures (BLAS). This model helps with optimizing ray traversal and also opens up possibilities for dynamic objects [10].

 

Figure 2. A scene with 5 objects and their BVH. Source from [1].

 

Figure 2 depicts a simple BVH with spheres chosen as the bounding volume. The scene is deconstructed into a hierarchical tree. The topmost node, called the root node, contains the entire scene. The internal nodes are analogous to the TLAS in DXR and have pointers to their children, which can be either other internal nodes or leaf nodes. Leaf nodes are similar to the BLAS and hold the geometry that will be used for rendering. In a naïve raytracer, the intersection test has to be done against every single piece of geometry in the scene, which is extremely inefficient. With a BVH, however, a shadow ray (also used in ambient occlusion) that returns the first hit found only has to carry out intersection tests against the sphere bounding volumes: if the ray misses a BV, it can safely disregard all the content under the missed node; otherwise, it recurses until it reaches a leaf node, where the intersection test is carried out against the geometry. Using a BVH can, therefore, significantly improve performance by helping the ray prune large sections of the scene. It should come as no surprise that DXR gives great importance to acceleration structures in its raytracing setup.

 

DXR also introduces new additions to the High-Level Shading Language (HLSL) in the form of ray-generation, closest-hit, any-hit, and miss shaders [10].
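As a minimal illustration of these shader stages, the sketch below pairs a hypothetical ray-generation shader with a miss shader for an ambient-occlusion-style visibility query. The resource names (g_scene, g_output), the payload layout, and the fixed test ray are assumptions made for this sketch; a complete setup also needs the host-side pipeline state and shader binding table discussed below.

// Minimal DXR ray-generation + miss shader sketch (illustrative only).
// g_scene (the TLAS) and g_output are assumed to be bound by the host application.
RaytracingAccelerationStructure g_scene  : register(t0);
RWTexture2D<float>              g_output : register(u0);

struct AOPayload
{
    float visibility; // 1 if the ray escaped the scene, 0 if it hit geometry
};

[shader("raygeneration")]
void AORayGen()
{
    uint2 pixel = DispatchRaysIndex().xy;

    // For brevity a fixed ray is used; a real implementation would reconstruct
    // the surface position and normal from a G-buffer and distribute rays
    // over the hemisphere.
    RayDesc ray;
    ray.Origin    = float3((float2)pixel, 0.0f);
    ray.Direction = float3(0.0f, 0.0f, 1.0f);
    ray.TMin      = 0.01f; // small offset against self-intersection
    ray.TMax      = 1.0f;  // AO radius: only nearby occluders matter

    AOPayload payload;
    payload.visibility = 0.0f; // assume occluded unless the miss shader runs

    // Any hit occludes, so the closest hit is irrelevant; the same
    // optimization is used for shadow rays.
    TraceRay(g_scene,
             RAY_FLAG_ACCEPT_FIRST_HIT_AND_END_SEARCH | RAY_FLAG_SKIP_CLOSEST_HIT_SHADER,
             0xFF, 0, 1, 0, ray, payload);

    g_output[pixel] = payload.visibility;
}

[shader("miss")]
void AOMiss(inout AOPayload payload)
{
    payload.visibility = 1.0f; // nothing was hit within TMax
}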

 

Figure 3. Overview of the three main components needed for DXR. Source from [35].

 

As can be seen in figure 3, DXR breaks its architecture down into three main components. The acceleration structures form a two-level hierarchy that encompasses the geometry and facilitates a faster search for ray intersections. There are trade-offs in the way the acceleration structures are implemented that should be taken into consideration depending on which qualities are sought: for ray-intersection performance, larger bottom-level structures are desirable, whereas for flexibility, more and smaller top-level structures should be deployed. At a high level, the raytracing pipeline object holds the function declarations of the shaders, the compiled programs, and the payload that the shaders need for communication. The shader binding table (SBT) is one of the most important segments of the setup, since it binds all the programs and the TLAS, recording which resources an invoked shader needs and which shader should be executed for a given geometry.

 


Problem Statement    

The main focus of this thesis is Ambient Occlusion (AO) in a dynamic setting, where fluid frame transitions are a necessity. Given the importance of AO and how it enriches a scene, the new DirectX 12 API and advances in real-time rendering techniques such as DirectX Raytracing (DXR) could enhance the quality and efficiency of this technique. Our eyes are usually not sensitive to low-frequency variations in light, which allows us to simplify the light transport in terms of ambient occlusion. This leaves the onus on ambient occlusion to convincingly create the illusion of Global Illumination and to help us visually discern the details of objects.

 

The desired result of this paper is identifying an ambient occlusion effect that meets the following criteria:

● Applicable to dynamic scenes: the solution has to require no precomputation and has to operate in real time. This means that no artefacts should be produced due to camera movement, changes in geometry, or locomotion.

 

● Performant: latency is highly perceptible to the user, so the frame rate needs to be maintained within 60-90 frames per second for a desirable interactive experience [1]. With rapid advances in technology, one must take into account the increasing demand for higher resolutions and sampling rates. Moreover, scene complexity and geometry are important factors that need consideration, as an average scene in a modern video game will have millions of triangles at any given moment.

 

● Accurate: this criterion falls more under the qualitative aspect of the technique and is the one most obvious to the user. Our solution has to have an overall convincing and accurate appearance. Ambient occlusion disappearing at the edges, haloing, and over- and under-occlusion are examples of violations of this condition. Naturally, a realistic ambient occlusion effect also has to take off-screen geometry into account for an accurate result.

 

The criteria above establish the core upon which the evaluation of the effect is carried out. As can be seen, the requirements mostly harken back to the general goals of real-time graphics effects.

Research Question 

“To what extent can real-time raytracing in DXR produce a viable alternative to screen-space  ambient occlusion solutions?” 


Goal   

The aim of this thesis is to investigate state-of-the-art approaches to ambient occlusion, and specifically to study how DXR affects the experience. The hypothesis is that with DXR we can attain higher quality without hindering performance in a significant way. The main means of evaluation are measuring render time for performance, and computing the Structural Similarity Index (SSIM) [47] between each proposed method and a reference for accuracy.

  SSIM   

The Structural Similarity Index is a relatively new method that has gained traction in the field of visualization. It measures the similarity and quality between two images, under the condition that one of them has ideal quality. SSIM works on the principle that human eyes pay attention to structural differences in images; therefore, unlike other methods, it does not compare individual pixels. Factors such as luminance, contrast, and structural information are computed and averaged using an 8x8 sliding window. The window moves pixel by pixel, and the score is computed as the mean SSIM. The SSIM used in this paper follows the form:

SSIM(x, y) = [l(x, y)]^{\alpha} \, [c(x, y)]^{\beta} \, [s(x, y)]^{\gamma}

Where l(x, y), c(x, y), and s(x, y) are the luminance, contrast, and structural correlation components of the image comparison, respectively. The Matlab implementation used is directly based on the SSIM paper [47]. More traditional methods like mean squared error (MSE) and peak signal-to-noise ratio (PSNR) carry out pixel-by-pixel analysis and do not reflect the Human Visual System (HVS) effectively. Ambient occlusion maps are black and white and can have sharp changes that may cause MSE and PSNR to overestimate or underestimate the quality of an image by relying only on absolute error. SSIM, on the other hand, like the HVS, does not depend only on individual pixels, and takes luminance and correlation into account to find structural similarities. Although more complicated, SSIM is a much more apt metric for our use case.
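For reference, the luminance, contrast, and structure components are defined in [47] as:

l(x, y) = \frac{2\mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}, \qquad c(x, y) = \frac{2\sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}, \qquad s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3}

Where μ and σ² are the mean and variance over the window, σ_xy is the covariance between the two windows, and C_1, C_2, C_3 are small constants that stabilize the divisions. With the common choice α = β = γ = 1 and C_3 = C_2/2, the product collapses to the familiar single expression:

SSIM(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}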

         


Related Work 

 

This section is dedicated to the previous work done in evaluating ambient occlusion. First, the theoretical background required to understand these related works is provided. Then, different methods in object-space and screen-space are presented, along with how they make use of many common techniques.

 

Rendering Equation    

Before exploring ambient occlusion, it is beneficial to have an overview of the rendering equation, as it is the mathematical basis for all the interactions between light and surfaces within a scene. Outgoing radiance is the quantity output at the end of a rendering system and is a measure of the amount of light reflected from a point p in the direction of the viewer.

   

Figure 4. Light and surface interaction. Source from [37].

 

The surface properties are formalized by the bidirectional reflectance distribution function (BRDF).       

Simply put, the BRDF is the ratio of outgoing radiance in direction ω_o to the incident irradiance arriving from direction ω_i at a point p on the surface, and is denoted by f_r(p, ω_i, ω_o). The outgoing radiance can, therefore, be written as [7]:

L_o(p, \omega_o) = \int_{\Omega} f_r(p, \omega_i, \omega_o)\, L_i(p, \omega_i)\, (n \cdot \omega_i)\; d\omega_i \qquad (1)


Since ambient occlusion deals primarily with ambient light, certain useful assumptions can be made. It is important to outline these assumptions, as they will be the basis of how ambient occlusion is formalized for the rest of the paper. First, the rendering model in equation 1 does not have a term for the effect of an emissive surface, and another system is needed for the emissive case. Secondly, the surface is assumed to be isotropic/Lambertian, scattering ambient light equally in all directions [37]; therefore, the BRDF can be taken out of the integrand and treated as a constant. The final assumption, mentioned earlier, concerns the indirect light: the ambient light is isotropic and equally incident from all directions, making the incoming radiance a constant term L_a. A binary visibility term V(p, ω) is then multiplied with L_a to allow contribution only from unoccluded regions. The following section will demonstrate the equation that is derived after these assumptions are made, and will discuss how it is used in different methods.

Ambient Occlusion   

So far, the rationale behind the importance given to ambient occlusion has been outlined. It is beneficial to now delve deeper into the analytical details of the effect. Ambient occlusion itself is a special case of ambient obscurance that operates in object-space and is evaluated in a preprocess [50]. In this paper, any form of indirect light (i.e. light modulated through refraction or reflection) falls under the term ambient light [7]. An object illuminated only by the ambient term in the Phong model will have a "flat" appearance, due to the constant nature of this factor.

 

Figure 5. (left) Rendering with no ambient occlusion. (right) Rendering with ambient occlusion. Source from [33].

 


As can be seen in the figure above, none of the details of the spaceship model are discernible without ambient occlusion, and the object on the left has no depth complexity, whereas the same object rendered with ambient occlusion reveals much more detail to the observer.

 

The concept of obscurance as defined by Zhukov et al. [50] tackles this issue by taking into account the amount of ambient light that is accessible to a given point p on a surface with normal n. Ambient obscurance is formulated as:

W(p) = \frac{1}{\pi} \int_{\Omega} \rho\big(L(p, \omega)\big)\, (n \cdot \omega)\; d\omega \qquad (2)

Where ω ranges over all directions of the unit hemisphere Ω over point p, L(p, ω) is the distance between point p and the closest occluding object in direction ω, and ρ is a monotonically smooth kernel applied to the visibility that attenuates to 0 at a maximum distance. The rationale behind this is to restrict the contribution of occluders the farther they are from the originating point. The ambient light in this approach is modeled as a non-absorbing transparent gas with constant emittance τ per unit volume [50]. Formally, this translates to integration over the unit hemisphere Ω centered on the normal of point p.

 

The ambient occlusion term used in modern literature is the ambient obscurance reversed and redefined to measure the amount of light that is occluded by the surrounding objects around point p. The attenuation function ρ is replaced by a binary visibility term V(p, ω), which acts as the visibility function from the rendering equation for a given direction ω. The equation averages the light occluded at the point of interest. The value is then cosine-weighted by the angle between the normal n of point p and the direction to the occluder. This means that occluders along the normal have more blocking power, whereas occluders on the horizon of p have almost no effect, which is a more intuitive approach to how light is naturally blocked. Therefore, ambient occlusion, the cosine-weighted, normalized fraction of the occluded hemisphere, can be written as equation 3. It was also shown earlier in the report how the equation below can be obtained from the rendering equation.

AO(p) = \frac{1}{\pi} \int_{\Omega} V(p, \omega)\, (n \cdot \omega)\; d\omega \qquad (3)

Although the terms are usually used interchangeably, the difference between obscurance and occlusion is that in the latter, the visibility function takes discrete values of 0 and 1, whereas obscurance includes a continuous kernel that takes the distance to the occluder as its input. Moreover, in ambient occlusion a value of 1 means completely occluded, whereas it means the exact opposite for obscurance.

 

For the AO function, when p is completely occluded, that is, when V(p, ω) = 1 for all ω, the integral reaches its highest value of π. Since for computational purposes we want to clamp the value of AO between 0 and 1, we multiply it by a normalization factor of 1/π. The maximum value of π for this integral can be justified through differential geometry. Since the relevant literature does not delve into the rationale behind this maximum value, a brief derivation will be provided.

 

As we are operating on the unit hemisphere Ω, the term n · ω is equivalent to cos θ, and the two are used interchangeably throughout the report. If we switch to spherical coordinates [43], the integral over the surface area of the hemisphere requires the differential area element, expressed as r² sin θ dθ dφ [16], where the radius r has a value of 1 in this case and the surface area of the hemisphere is 2π. Thus, the integral in spherical coordinates is:

\int_{0}^{2\pi} \int_{0}^{\pi/2} \cos\theta\, \sin\theta\; d\theta\, d\varphi \qquad (4)

A simple integration by simplification of the inner integral results in the following:

\int_{0}^{2\pi} \frac{1}{2}\; d\varphi \qquad (5)

And this integral is simply:

\int_{0}^{2\pi} \frac{1}{2}\; d\varphi = \frac{2\pi}{2} = \pi \qquad (6)

Monte-Carlo Method   

Now that we have formally established ambient occlusion, we need to focus on how to evaluate the integral. In the field of rendering especially, complicated integrals are abundant, and most lack a closed-form solution. Therefore, numerical methods and approximations need to be employed. One such approximation, which makes use of the law of large numbers, is Monte-Carlo integration. In this method, stochastic sampling is done to estimate the value of integrals. The random nature of this method gives it the power to estimate even integrands with discontinuities. The general form of the Monte-Carlo estimator can be expressed as [37]:

\hat{F}_N = \frac{1}{N} \sum_{i=1}^{N} \frac{f(X_i)}{p(X_i)} \qquad (7)

In the equation above, based on the law of large numbers, as N (the number of samples) approaches infinity, the estimator converges to the value of the original integral. Since we cannot afford an infinite number of samples, some noise and variance will be present. In this equation, X_i is a random sample from the integration domain, and p(X_i) is known as the probability density function (PDF). The PDF is defined as the relative probability of X_i being chosen in the integration domain.
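As a quick sanity check of equation 7, consider the constant integrand f(ω) = 1 over the hemisphere with the uniform PDF p(ω) = 1/(2π) that is introduced below:

\hat{F}_N = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{1/(2\pi)} = 2\pi

Every sample contributes the same value, so the estimator returns the exact surface area of the hemisphere with zero variance, regardless of N. This is the limiting case of the variance argument made for importance sampling later in this section.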

 

The earliest mention of ambient occlusion proposes a Monte-Carlo sampling method done through raycasting [23]. For every point on the surface, rays are cast in the unit hemisphere around the normal, and intersection tests are carried out with the surrounding occluders. Finally, the number of intersections is divided by the number of rays cast, as shown in the figure below:

 

Figure 6. Ambient occlusion found through raytracing.

 


The Monte-Carlo approach is quite simple and can solve complex integrands in an unbiased and stable manner. However, it is not without its disadvantages. One main problem with this method, as opposed to other, deterministic ways of solving the integrand (e.g. a Riemann sum), is its slow convergence rate of O(1/√N), where N is the number of samples [37]. When used for rendering, the variance manifests itself as noise, and this relationship effectively means that in order to reduce the noise by half, one has to multiply the number of samples by a factor of 4. Increasing the sample count quadratically to obtain a linear decrease in noise is simply not feasible in a real-time context. Therefore, the choice of PDF becomes a crucial step in variance reduction. The ambient occlusion integral in equation 3 can, therefore, be written as:

\widehat{AO}(p) = \frac{1}{N} \sum_{i=1}^{N} \frac{\frac{1}{\pi} V(p, \omega_i) \cos\theta_i}{p(\omega_i)} \qquad (8)

Since increasing the sample count is not feasible, one has to pick a PDF that behaves closely to the original integrand. One safe type of PDF, in the absence of better knowledge of the function, is the uniform distribution, in which all sample points have an equal chance of being chosen. Since we established that the surface area of the unit hemisphere is 2π, the uniform PDF will be:

p(\omega) = \frac{1}{2\pi} \qquad (9)

And therefore, equation 8 will take the form:

\widehat{AO}(p) = \frac{2}{N} \sum_{i=1}^{N} V(p, \omega_i) \cos\theta_i \qquad (10)

However, this uniform PDF can produce a large amount of noise, as the variance from the expected value for certain integrands is bound to be high in complex rendering functions. Therefore, the samples we choose have to be distributed more around the important parts of the original function. By not choosing a proper PDF, it is possible to completely miss an important part of the function, for example one representing a light source. Since the variance of a Monte-Carlo integration of a constant function is 0, the closer the chosen PDF is to the actual function, the less variance and noise are produced. This principle is the rationale behind importance sampling [35].

 


Figure 7. The effect of the probability density function on variance. (left) The PDF is completely different from f(x) and produces a significant amount of variance. (middle) The PDF is a uniform distribution that misses the important section of the function. (right) The PDF is similar to f(x), providing significant variance reduction. Source from [21].

 

Therefore, by choosing the PDF to be the cosine-weighted hemisphere distribution, we give it attributes similar to the AO integrand:

p(\omega) = \frac{\cos\theta}{\pi} \qquad (11)

This choice will result in the simple estimator:

\widehat{AO}(p) = \frac{1}{N} \sum_{i=1}^{N} V(p, \omega_i) \qquad (12)

Monte-Carlo integration is a powerful method and is quite straightforward for a raytracer. However, both uniform sampling and importance sampling suffer from sample clumping. The problem arises when two or more sample points fall in close proximity and return the same information. Since the sample budget is restricted, clumping will greatly hinder the accuracy and efficiency of the estimation. An improved approach is the Quasi Monte-Carlo method, which strives to find a middle ground between the aliasing caused by deterministic techniques (e.g. a Riemann sum) and the noise caused by stochastic methods (e.g. Monte-Carlo integration). This is done through stratified sampling [8].
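Concretely, a direction with the cosine-weighted density used in equation 12 can be generated with Malley's method: sample the unit disk uniformly, then project the point up onto the hemisphere. The sketch below assumes its two inputs are uniform random numbers in [0, 1), which could come from a stratified or low-discrepancy sequence to mitigate the clumping described above.

// Cosine-weighted hemisphere sampling via Malley's method (illustrative sketch).
// u1 and u2 are assumed to be uniform random numbers in [0, 1).
static const float PI = 3.14159265f;

// Returns a direction in tangent space (z aligned with the surface normal),
// distributed with pdf(w) = cos(theta) / PI.
float3 CosineSampleHemisphere(float u1, float u2)
{
    // Uniformly sample a point on the unit disk...
    float r   = sqrt(u1);
    float phi = 2.0f * PI * u2;

    // ...then project it up onto the hemisphere; the projection introduces
    // exactly the cos(theta) density that the AO integrand contains.
    float z = sqrt(max(0.0f, 1.0f - u1)); // z equals cos(theta)
    return float3(r * cos(phi), r * sin(phi), z);
}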

 

Further optimization of the raytraced solution presented above has been proposed, utilizing spatial subdivision structures and the fact that ambient occlusion is a function of local geometry [48]. In that solution, an octree is used and occluders are traversed for more efficient intersection tests. Furthermore, one can average the rays that do not hit any occluders to derive the "bent normal" and store it in the ambient occlusion maps [23]. Bent normals will be discussed later, as they can be used to look up the environment map and are quite useful in representing accurate reflections and ambient light, therefore increasing the overall quality of a render.

 

Until quite recently, raytraced ambient occlusion was precomputed for static scenes and was used mostly in high-end films and effects. Due to being computationally expensive, the precomputed values, although accurate, failed at handling dynamic scenes. Furthermore, one must note that ambient occlusion is itself an approximation, and an effect whose inaccuracies do not readily alert the human eye. Therefore, it was only a matter of time before a trade-off was made between dynamic handling capability and accuracy.

 

An early attempt to leave the computationally expensive object-space can be credited to the disk-based approximation [5]. This method provides another way of calculating ambient occlusion that is suitable for dynamic scenes, without using raytracing. In this approach, meshes are turned into surfels (surface disks), and instead of using the visibility function, the occlusion of one disk by another is calculated to approximate the ambient occlusion. Without optimizations, this method runs in the order of O(n²), which is quite expensive; the introduction of a two-pass method and a hierarchical structure of disks reduces this to O(n log n) [1]. Since visibility estimation is done on a per-vertex basis, some linear interpolation artefacts are produced. Moreover, highly tessellated objects are required for correct high-frequency shadowing, for instance in the case of contact shadows. A per-fragment approach has been suggested to tackle these problems [18]. This disk-based approach does not quite fall under the screen-space category and is more of a surface discretization technique; however, it paved the way to screen-space methods.

Screen-space Ambient Occlusion Methods   

On top of being expensive, object-space methods naturally have a cost positively correlated with scene complexity. The introduction of deferred rendering and experimentation with screen-space data (e.g. unsharp masking) for enhancing contrast and depth perception opened the door to new approaches to evaluating the AO integrand [40, 29]. Not long after, a method using the ND-buffer (normal and depth buffer) was proposed that computed ambient occlusion as a full-screen pass. This approach splits the scene into near and far ambient occlusion: high-frequency AO is computed in image space for near objects, while far objects are processed through spherical occluders as the low-frequency part [41].

   


Crytek SSAO  

Screen-Space Ambient Occlusion (SSAO), by Crytek, was the first AO technique made available in a fully dynamic real-time application [14]. SSAO spawned many variations that strove to improve on its limitations. The main differences between them lie in the way the screen-space data is handled and in how the ambient occlusion term is interpreted.

 

The SSAO as initially proposed for CryEngine 2 makes use of only the z-buffer. The spatial coherency of the z-buffer allows it to be used as a representation of the scene; in order to check for occlusion, neighboring pixels are sampled and a depth comparison is carried out. After generating a depth map during the render, a full-screen quad is drawn to invoke the pixel shader for each point p. Samples are constructed as uniformly random vectors added to the view-space position within a sphere surrounding the surface point p, as shown below:

 

 

Figure 8. Crytek's SSAO. Random samples are shown as circles and depth-buffer fetches as rectangles. The occlusion value in this figure is ⅓, as 2 samples fall below the surface. Source from [31].

 

For every virtual sample point's depth d, we need to check it against the z-value in the depth map. This is a basic form of containment test, done by carrying out a depth comparison to determine whether the sample falls inside the geometry or outside it. One readily noticeable problem with this implementation is that, due to the spherical sampling, almost half the sample points fall behind the geometry even for flat surfaces. This gives scenes using this variation of SSAO a distinctive grey color caused by self-occlusion.
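A sketch of this containment test is shown below. It illustrates the idea rather than reproducing Crytek's shader: the resource names, the sample count, and the projection conventions are assumptions.

// Sketch of Crytek-style spherical SSAO sampling (illustrative, not CryEngine code).
Texture2D<float> _DepthTex   : register(t0); // linear view-space depth
SamplerState     _PointClamp : register(s0);

float SphericalAO(float3 viewPos, float3 samples[16], float radius, float4x4 proj)
{
    float occlusion = 0.0f;
    for (int i = 0; i < 16; ++i)
    {
        // Offset the view-space position by a random vector inside a sphere.
        float3 samplePos = viewPos + samples[i] * radius;

        // Project the sample to screen space to fetch the stored depth.
        float4 clip = mul(proj, float4(samplePos, 1.0f));
        float2 uv   = (clip.xy / clip.w) * float2(0.5f, -0.5f) + 0.5f;
        float sceneDepth = _DepthTex.SampleLevel(_PointClamp, uv, 0);

        // Containment test: the sample is inside geometry if the depth buffer
        // holds a surface in front of it (+z pointing away from the camera).
        if (sceneDepth < samplePos.z)
            occlusion += 1.0f;
    }
    return occlusion / 16.0f;
}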


Normal-oriented SSAO

An improved SSAO variant used in StarCraft II changes the sphere into a hemisphere, takes the normal of the point into account, and flips the sample points that fall below the surface [14]:

 

 

Figure 9. SSAO used in StarCraft II with a normal-oriented hemisphere. Source from [31].

   

The attenuation factor is also brought back, as the step function shown in figure 10. The occlusion factor coupled with sample flipping effectively halves the samples needed and also handles self-occlusion. Furthermore, the sampling position is captured in world coordinates and projected to screen coordinates, and downsampling is done to save bandwidth. Finally, similar to the original SSAO, a geometry-aware filter is applied to remove the high-frequency noise caused by sampling. Due to the low-frequency nature of ambient occlusion, this error-prone AO approximation is still widely used in the industry.


Figure 10. Occlusion function as depicted in [39].

   

Horizon-based Ambient Occlusion   

Another approach, quite different from SSAO's sampling for assessing visibility, is Horizon-based Ambient Occlusion (HBAO) [4]. This approach makes use of horizon mapping [28] and raymarches the depth buffer under the assumption that it is a continuous heightfield. Equation 2 is therefore rewritten as [4]:

AO(p) = \frac{1}{2\pi} \int_{\theta=-\pi}^{\pi} \int_{\alpha=t(\theta)}^{h(\theta)} W(\omega)\, \cos\alpha\; d\alpha\, d\theta \qquad (13)

Where W is a linear attenuation function, α is the elevation angle and θ is the azimuthal angle; note the elevation interval of the inner integral. In this approach, any point that falls below the horizon is counted as an occluder. This assumption is possible due to the fact that the depth buffer is treated as a heightfield. However, there can be many instances where the evaluation leads to erroneous shadowing due to depth discontinuities below the horizon.

 


 

Figure 11. Different components of HBAO. The inner integral marches the heightfield defined by a 2D slice, and the outer integral sweeps the slice within the hemisphere. Source from [29].

   

In this implementation, the hemisphere is split into 2D slices, where the horizon angle h(θ) is found by raymarching the heightfield for each azimuthal angle θ, and the tangent angle t(θ) is the angular offset of the tangent plane defined by the normal n. Monte-Carlo integration is once again introduced, as equation 13 can be converted to:

AO(p) = \frac{1}{2\pi} \int_{\theta=-\pi}^{\pi} \big(\sin h(\theta) - \sin t(\theta)\big)\, W(\theta)\; d\theta \qquad (14)

And therefore, sampling the azimuthal angles uniformly at random, the estimator can be written as:

\widehat{AO}(p) = \frac{1}{N} \sum_{i=1}^{N} \big(\sin h(\theta_i) - \sin t(\theta_i)\big)\, W(\theta_i) \qquad (15)

Where the PDF is 1/(2π), which cancels out with the 1/(2π) multiplier of the integral, resulting in a simpler term. As noise is preferable to banding artefacts, the step size is jittered per pixel [39]. Overall, this approach requires much more computation than SSAO. However, it can be seen from the equations that HBAO is closer to the original definition of AO and is more physically based due to the raymarching, and thus should produce higher-quality output.
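As an illustration of how the inner integral is evaluated, the sketch below marches one azimuthal slice of the heightfield and returns the per-slice term sin h(θ) - sin t(θ) from equation 15. FetchViewPos is an assumed helper that reconstructs view-space position from the depth buffer, the attenuation W is omitted for brevity, and a view space with +z pointing toward the viewer is assumed.

// Horizon search along one azimuthal slice (illustrative sketch).
float3 FetchViewPos(float2 uv); // assumed depth-buffer reconstruction helper

float SliceAO(float2 uv, float3 p, float sinT, float2 stepUV, int numSteps, float radiusSq)
{
    float sinH = sinT; // the horizon starts at the tangent angle t(theta)

    for (int i = 1; i <= numSteps; ++i)
    {
        // Raymarch the heightfield defined by the depth buffer.
        float3 s  = FetchViewPos(uv + stepUV * i) - p;
        float  d2 = dot(s, s);

        if (d2 < radiusSq) // ignore occluders beyond the AO radius
        {
            // Elevation of this sample; keep the highest horizon found so far.
            sinH = max(sinH, s.z / sqrt(d2));
        }
    }

    // Per-slice contribution of the inner integral: sin(h) - sin(t).
    return sinH - sinT;
}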

   


Volumetric Obscurance 

Point sampling is a common and convenient way of sampling for screen-space solutions. However, due to the discrete nature of this approach, certain artefacts are created during movement as samples become occluded and disoccluded. As can be seen in figure 12 below, with the Crytek SSAO sampling scheme a differential change in geometry makes the occlusion value pop from 4/8 to 3/8. Furthermore, the geometry on the left will have the same AO as a flat surface, which is itself wrong as well.

   

Figure 12. A radical change in AO as a result of a differential change in geometry. Source from [17].

   

A smoother result can be obtained by using line sampling. The idea is to approximate the unoccluded volume around point p through the ratio of visible to occluded length on a given line segment. Line sampling is one of the main ideas behind another screen-space approach that goes by the name of Volumetric Obscurance (VO) [24]. Based on the ambient obscurance in equation 2, a volumetric obscurance quantity is defined as:

VO(p) = \int_{X} \rho\big(d(x)\big)\, O(x)\; dx \qquad (16)

Where X is the 3D volume around p and O is the occupancy function, which takes the value of 0 if there are no occluders at x and 1 otherwise.

 

 


 

Figure 13. Volumetric Obscurance as defined in [24], where f/h is the ratio of the unoccluded segment to the occluded segment of a line sample that enters the hemisphere at U and exits it at V. Source from [31].

 

Alchemy and Scalable Ambient Occlusion   

Alchemy ambient occlusion (Alchemy AO) builds on top of the methods explained above while at the same time harkening back to the original derivation [31]. The main improvement comes from intelligently picking a falloff function that simplifies the equation: a shifted hyperbola in the sample distance t between point p and sample s, controlled by a user-defined falloff constant u, a shape that had earlier produced desirable results in StarCraft II [14]. Replacing the falloff function in the ambient occlusion equation and defining v_i as the vector from point p to the i-th sample (the occluder in direction ω), the numerical estimator for the integral is derived as:

A(p) = \max\!\left(0,\; 1 - \frac{2\sigma}{N} \sum_{i=1}^{N} \frac{\max(0,\; v_i \cdot n + z_p\,\beta)}{v_i \cdot v_i + \epsilon}\right)^{k} \qquad (18)

Where ε is used to prevent division by zero, v_i is the vector from point p to the i-th sample, z_p is the camera-space depth, and the bias distance β is an aesthetic parameter that can be adjusted to deal with self-shadowing and light leaks. σ and k are used to alter the intensity of the occlusion and to modify the contrast, respectively.
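A direct HLSL transcription of this estimator is sketched below, assuming the sample vectors v_i have already been gathered from the depth buffer; the fixed sample count and the parameter names are illustrative.

// Alchemy-style AO estimator (a sketch of equation 18; v[] holds precomputed
// vectors from the shading point to its screen-space samples).
float AlchemyAO(float3 v[12], float3 n, float zp,
                float sigma, float beta, float k, float epsilon)
{
    float sum = 0.0f;
    for (int i = 0; i < 12; ++i)
    {
        // Numerator: occluders in front of the surface, with a depth-scaled
        // bias against self-shadowing. Denominator: falloff with squared
        // distance; epsilon prevents division by zero.
        sum += max(0.0f, dot(v[i], n) + zp * beta) / (dot(v[i], v[i]) + epsilon);
    }
    float A = max(0.0f, 1.0f - (2.0f * sigma / 12.0f) * sum);
    return pow(A, k); // sigma scales the intensity, k modifies the contrast
}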

 

Figure 14. Overview of Alchemy AO. The sampling scheme is similar to both HBAO and VAO [31].

 

The sampling is done similarly to the VAO approach [24]: a disk of a certain radius is defined around point p in screen space, a sample point s is chosen uniformly at random on the disk, and it is projected to camera space, similarly to HBAO [4], to a point q on the surface. The vector v_i in equation 18 is, therefore, calculated from q to p [31]. Scalable Ambient Obscurance (SAO) is an improvement on Alchemy AO [30]. It uses the same estimator, yet focuses on making Alchemy AO scale better at high resolutions and on cutting latency. One significant improvement is creating a mipmap from the depth buffer, which helps handle larger radii and increases cache efficiency. Samples close to p fetch the high-resolution level of the depth buffer, while samples farther away use the lower-precision levels.


Multi-scale Ambient Occlusion   

Figure 15. Multi-resolution approach as used in MSSAO. Source from [17].

 

Almost all the screen-space solutions presented here need to apply a form of blur at the end to remove the high-frequency noise produced during sampling. The blur pass itself is quite expensive and produces further deviation from the reference. Furthermore, the methods mentioned so far struggle with capturing both local high-frequency AO and larger-scale global AO. A multi-scale ambient occlusion (MSSAO) approach has been proposed [17] which computes AO at multiple resolutions and combines them at the end, obtaining a map that holds both the effect of far occluders and the high-frequency AO from local nearby occluders.

 

Figure 16. Interleaved sampling pattern for use with a 3x3 low-pass filter. Source from [17].

 

At all resolutions except the highest, a fairly small sampling kernel is used, with samples taken at every other pixel. For example, the 11x11 sampling kernel used in figure 16 requires 36 texel fetches per fragment instead of 121. This method of sampling is similar to interleaved sampling, but without the randomness. The multi-resolution solution greatly reduces noise and the need for excessive blurring.


Denoising and Blur   

This section will outline certain design choices common between the methods and how they affect  the final AO output. Further individual breakdown of certain methods will be carried out as well.  

 

Almost all the methods presented above require or encourage using a separate blur pass. Depending on whether sampling is done stochastically or non-randomly, the final image will have either noise or banding artefacts. In the industry, banding and aliasing are commonly traded for noise, and techniques such as dithering are widely used, as the result is easier on the human eye.

 

Denoising and filtering are equally crucial to the screen-space ambient occlusion methods and to the raytraced method. Due to the way sampling is done for the numerical estimators, variance is ever-present and manifests itself as noise in rendering. The presented methods produce varying amounts of high-frequency noise based on their sampling schemes, and all require some form of denoising.

 

Gaussian Blur  

The Gaussian blur is one of the most widely used effects in rendering [1] and is the cornerstone of many filtering techniques presented in this paper. The blur in its simplest form is quite expensive; however, with certain adjustments it can be made computationally feasible. The optimizations themselves spawn new blurring techniques used by the methods presented in this paper, such as the bilateral blur and the separable blur.

 

The Gaussian blur uses the Gaussian kernel, an n x n-tap convolutional filter that weights the pixels that fall inside its square according to the 2D Gaussian curve, formalized as:

G(x, y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2 + y^2}{2\sigma^2}} \qquad (21)

Where σ is the standard deviation of the Gaussian distribution. The kernel falls under the rotation-invariant filter kernels and uses only the distance from the central pixel in its computation [1]. σ can be thought of as controlling the window size of the curve in figure 17. A larger standard deviation provides more blur; however, it also increases memory accesses, which is undesirable under the tight budget of real-time applications.

 


The Gaussian blur in the form presented in equation 21, however, is not usable as a real-time filter: a 2560x1440 image blurred using a fragment shader that deploys a 33x33-tap Gaussian filter would require a staggering 4 billion texture fetches. This problem is remedied by noting that equation 21 can be reconstructed by multiplying two 1D Gaussian kernels:

G(x, y) = \left( \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{x^2}{2\sigma^2}} \right) \left( \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{y^2}{2\sigma^2}} \right) \qquad (22)

The Gaussian filter, therefore, is said to be separable. This effectively means that instead of weighting the contribution of each pixel in the square and adding them, we can separate the operation into two passes. For example, for the 5x5 support in figure 18, a horizontal pass takes the two texels to the right and left of the central pixel and weights their contribution; a second vertical pass repeats the same process for the two texels above and below, yielding the same result as the expensive single-pass blur. The number of texture fetches per pixel falls from n² to 2n, where n is the support. For the previous example, applying two separable 33-tap Gaussian filters to the image now requires roughly 243 million fetches, which is a massive optimization.
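A sketch of the horizontal pass of such a separable blur is shown below; the resource names and the precomputed 1D weights are assumptions, and the vertical pass is identical with the offset applied along y instead.

// Horizontal pass of a separable Gaussian blur (illustrative sketch).
#define RADIUS 4
Texture2D<float> _AOTex       : register(t0);
SamplerState     _LinearClamp : register(s0);

float BlurHorizontal(float2 uv, float2 texelSize, float weights[RADIUS + 1])
{
    // weights[] holds precomputed 1D Gaussian weights for the center texel
    // and RADIUS texels on each side.
    float result = _AOTex.SampleLevel(_LinearClamp, uv, 0) * weights[0];
    for (int i = 1; i <= RADIUS; ++i)
    {
        float2 offset = float2(texelSize.x * i, 0.0f);
        // The 1D kernel is symmetric, so each weight is applied on both sides.
        result += _AOTex.SampleLevel(_LinearClamp, uv + offset, 0) * weights[i];
        result += _AOTex.SampleLevel(_LinearClamp, uv - offset, 0) * weights[i];
    }
    return result;
}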

 

 

Figure 17. The Gaussian function. Note how the weight of neighboring pixels is reduced over the radially symmetric curve. Source from [12].

         


Separable Blur 

   

 

Figure 18. A 5x5-tap Gaussian filter. (a) An expensive single-pass Gaussian blur with 25 texel fetches can be separated into two one-dimensional horizontal and vertical passes. (b) A horizontal blur filter is applied, which is fed to a vertical blur filter (c). The filter is separable, as multiplying the weights of (b) and (c) results in the original two-dimensional filter. Source from [1].

 

The separable blur derived above is fast and efficient; however, it lacks geometry-awareness, which is a highly desirable quality when filtering ambient occlusion. As shown in figure 18, the filter in its current state does not differentiate between geometries and blurs the entire scene; this causes shadow bleeding in the ambient occlusion, and edges are not preserved, which is a source of error. To combat this, an extension of the Gaussian blur called the bilateral filter reduces or eliminates the contribution of texels that are deemed unrelated to the central pixel [36].

   


Bilateral Blur and variations 

A bilateral filter in its simplest form takes into account the difference between intensities, and builds upon the Gaussian filter in the following way:

BF[I]_p = \frac{1}{W_p} \sum_{q \in S} G_{\sigma_s}\big(\lVert p - q \rVert\big)\, G_{\sigma_r}\big(\lvert I_p - I_q \rvert\big)\, I_q \qquad (23)

Where q is a neighboring texel and p is the central pixel. W_p is the normalization factor, and the filter now has two weighting components. The first is the standard kernel that averages intensity in the spatial domain, and the second takes into account the intensity difference; pixels with a large intensity difference therefore have almost no influence on one another. Although the bilateral filter is quite effective at edge preservation, the complexity of a brute-force implementation is O(N²), where N is the pixel count. One optimization method is to restrict the kernel to a window of radius σ, which changes the complexity to O(Nσ²). Further optimization can be done by employing the techniques mentioned earlier for the separable blur. Bilateral filters are not separable; however, the artefacts caused by the separation are deemed a reasonable trade-off for the increase in speed. The complexity of a separable bilateral blur is therefore O(Nσ).

 

Figure 19. (left) Original input. (middle) Brute-force bilateral filter. (right) Separable kernel. Notice the streaks that appear as an artefact at the bottom of the right image. Source from [36].

   

The bilateral filter is robust and is not bound to intensity checks only. Cross (or joint) bilateral filters take into account other data, such as depth, velocity, and normals, in deciding where to apply the blur. In ambient occlusion the filter used has to be geometry-aware, and preserving the edges in the blur is integral to the quality of the result. A relatively safe assumption in AO is that surfaces belonging to different geometries will have different depths. Therefore, the intensity term can be changed to a depth term:

BF[I]_p = \frac{1}{W_p} \sum_{q \in S} G_{\sigma_s}\big(\lVert p - q \rVert\big)\, G_{\sigma_r}\big(\lvert z_p - z_q \rvert\big)\, I_q \qquad (24)

Where z_p and z_q are the respective depths of the central pixel p and its neighbor q. It can be seen from equation 24 how, as the depth difference increases, the contribution of neighbor q decreases.
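A sketch of the combined weight from equation 24 is given below; the function and parameter names are illustrative.

// Depth-aware bilateral weight from equation 24 (illustrative sketch).
float BilateralWeight(float spatialDist, float zp, float zq,
                      float sigmaSpatial, float sigmaDepth)
{
    // Standard spatial Gaussian term.
    float ws = exp(-(spatialDist * spatialDist) / (2.0f * sigmaSpatial * sigmaSpatial));

    // Range term driven by the depth difference between p and its neighbor q.
    float dz = zp - zq;
    float wr = exp(-(dz * dz) / (2.0f * sigmaDepth * sigmaDepth));

    // Neighbors across a depth discontinuity receive near-zero weight, which
    // is what preserves AO edges between separate geometries.
    return ws * wr;
}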

The Gaussian filter and its variants are quite effective at denoising; however, their purely spatial implementation leaves a lot of room for improvement in dynamic scenes. Further optimization can be done by taking the temporal coherency of scenes into consideration.

 

Spatiotemporal approach  

Pixel shaders are using an increasingly large share of the computational power of real-time systems. This computational strain can be significantly alleviated by noting that in a standard scene, surface regions, camera movement, and lighting are temporally stable and usually do not undergo rapid changes from one frame to the next. Reprojection methods take advantage of this temporal coherency and reuse samples from previous frames for various applications; they fall into the two categories of reverse and forward reprojection [34].

   

Figure 20. Images with their coherence. Both camera movement (left) and animation (middle and right) demonstrate significant temporal coherency, with green denoting the portions visible from previous frames and red the newly visible points. Source from [34].

 

For a triangle rendered in both the current frame t and the previous frame t-1, the vertex position is calculated, and if the two positions are close enough, the shaded value from the history buffer can be used instead of performing a new shading computation [34]. This approach was proposed independently by two different papers; the buffer is referred to as the history buffer or real-time reprojection cache, and the practice of saving the shaded values of previous frames for reuse is known as reverse reprojection caching [27].

 

It is common in post-processing effects to make use of two off-screen buffers that hold intermediary and final results in a feedback loop. Due to the back-and-forth nature with which these buffers accumulate the AO, reverse reprojection caching can be done by sampling AO values from frame t-1. Temporal filtering can also be done by filtering and blending the AO terms of previous frames with the current frame, which also helps in eliminating the noise pops caused by movement.
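A minimal sketch of such a temporal blend is given below; the history texture, the reprojected coordinate prevUV, and the fixed blend factor are assumptions for illustration.

// Reverse-reprojection temporal blend (illustrative sketch). _HistoryAO holds
// the filtered AO of frame t-1; prevUV is assumed to come from a velocity
// buffer or from reprojecting the pixel with the previous view-projection.
Texture2D<float> _CurrentAO   : register(t0);
Texture2D<float> _HistoryAO   : register(t1);
SamplerState     _LinearClamp : register(s0);

float TemporalBlend(float2 uv, float2 prevUV, bool historyValid)
{
    float current = _CurrentAO.SampleLevel(_LinearClamp, uv, 0);

    // Disoccluded or off-screen pixels have no usable history.
    if (!historyValid || any(prevUV < 0.0f) || any(prevUV > 1.0f))
        return current;

    float history = _HistoryAO.SampleLevel(_LinearClamp, prevUV, 0);

    // Exponential moving average: a higher history weight gives smoother
    // results but more temporal lag.
    return lerp(current, history, 0.9f);
}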

 

There are certain issues that one must take into account when using temporal refinement and filtering. Depending on which variety of temporal method is used, it might incur overhead on the system. For example, temporal filtering through blending is not a logical approach if the scene is mostly static. Moreover, the feedback can produce new artefacts, most notably temporal lag. Another important decision in the design of such filters is how to handle invalid pixels. An invalid AO value can occur if a disocclusion has taken place (newly visible surfaces) or the sample neighborhood of the pixel has changed [27]. There are many approaches to handling these issues that are out of the scope of this paper.

 

As has been demonstrated so far, the evolution of SSAO methods highlights certain qualities and trade-offs. All SSAO variants trade accuracy for speed and are based on many assumptions that help eliminate constraints, yet bring a myriad of erroneous results with them. Different sampling patterns that trade noise for banding have been proposed, each striving to increase sampling quality and efficiency. Multi-scale approaches focus on faster and more efficient texture fetches, and different estimator setups focus on better convergence. Monte-Carlo integrators play a significant role in both object-space and screen-space approaches, which consequently makes denoising a critical step in making the samples usable for human eyes in a real-time context. The following section takes a closer look at some of these design decisions and how they affect the final results of the different solutions.

 

 

 

 

 


Implementations  

 

This section will cover the inner mechanisms of the above-mentioned methods. Also included is a variation on the normal-oriented hemispherical SSAO, implemented predominantly to examine how it scales next to other standard industry-level implementations. The latest version of Unity was chosen as the platform to run the different implementations, as it is widely used in both the indie and AAA scenes and thus reflects the current demands of the industry. Furthermore, Unity can be extended with custom image effects and can be equipped with a post-processing stack that provides two state-of-the-art screen-space ambient occlusion solutions: Multi-scale Volumetric Occlusion (MSVO) and Unity's Scalable Ambient Obscurance (SAO) [46]. At the time of writing, Unity has also provided an experimental build that can handle DXR and plans to integrate DXR into its high-definition render pipeline (HDRP) in the future. The built-in profiler was used for more detailed measurements.  

 

In all implementations the vertex shader passes a full-screen quad, so only snippets deemed important from the fragment shaders of MSVO, SAO, HBAO, RTAO, and our variation on SSAO are presented here; their common traits are mentioned, and further comparison is done in the results section.  

 

SSAO Variant 

Our method is based on StarCraft II's normal-oriented hemispherical ambient occlusion [14]; the main difference lies mostly in the sampling pattern and the inclusion of a normal-based falloff function, which brings the implementation closer to the original AO expression. It can also use a geometry-aware bilateral blur similar to the one used by Unity; however, it was found that the samples converge nicely on their own. The main purpose of this variant is to compare it with industry-standard solutions and also to introduce techniques common among screen-space solutions.  

 

Firstly, we produce samples on the C# side within a unit hemisphere and scale them with a simple interpolation function so that the samples fall closer to the origin rather than being completely randomly distributed. Then, we push these samples to the HLSL (high-level shader language) side using an array. In order to prevent banding, the samples are further reflected along a random vector on a per-fragment basis. The random vector is extracted from a normal map that is scaled and tiled as below: 

 

// Tile the noise texture across the screen and remap its values from [0, 1] to [-1, 1]. 
normalize(tex2D(_NoiseTex, _ScreenSize * uv / _NoiseTex_TexelSize).xy * 2.0f - 1.0f); 
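For reference, the interpolation mentioned above, which pulls samples toward the origin, can be sketched as follows (written in HLSL style for consistency, although our implementation performs it once per sample on the C# side; the 0.1 lower bound is an assumed value): 

// Quadratic bias: sample i of _SampleCount is scaled so that more samples 
// land near the fragment being shaded. 
float t = (float)i / (float)_SampleCount; 
float scale = lerp(0.1, 1.0, t * t); 
samples[i] *= scale; 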


Then, for maximum randomness, we produce two sets of random directions that are used as offsets. One sample set is reflected against the normal-map texture, while the other is rotated by a random angle about the z axis. The sampling radius is scaled to account for projection. The sample vectors are then added as offsets to the fragment position, and the depth test is carried out. The occlusion values are accumulated and divided by the sample count; however, two weighting factors, based on distance and normal, are added:  

 

float occ = max(0.0, dot(normal, normalize(sample_Dir)) - _Bias) * (1.0 / (1.0 + length(sample_Dir)) * _Intensity); 

 

The dot product dot(normal, normalize(sample_Dir)) between the normal and the sample direction allows us to give the most weight to occluders that are right in front of the fragment, while (1.0 / (1.0 + length(sample_Dir)) is the falloff function that reduces the contribution of distant occluders. The implementation has three aesthetic parameters: _Intensity, _Bias, and _Radius. The _Intensity parameter adjusts the strength of the AO, _Bias controls the size of the cone defined by the normal of the fragment and the sample direction, and _Radius sets the sampling radius. This implementation is fast and produces satisfactory results; however, it suffers from some familiar SSAO errors that are discussed in the results section. 
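Putting the pieces together, the core of the fragment shader can be sketched as below. This is a simplified sketch: the helper names ProjectToUV and SampleViewDepth, the depth-comparison sign convention, and the fact that only the reflected sample set is shown are all assumptions for illustration, not the verbatim implementation. 

float occlusion = 0.0; 
for (int i = 0; i < _SampleCount; i++) 
{ 
    // Reflect the precomputed kernel sample along the per-fragment random 
    // vector to break up banding, then scale by the sampling radius. 
    float3 sample_Dir = reflect(_Samples[i], randVec) * _Radius; 
    float3 samplePos = fragViewPos + sample_Dir; 

    // Project the offset point to screen space and fetch the scene depth there. 
    float2 sampleUV = ProjectToUV(samplePos); 
    float sceneDepth = SampleViewDepth(sampleUV); 

    // Depth test: geometry lying in front of the sample point occludes it; 
    // the normal and distance weights from above modulate the contribution. 
    if (sceneDepth < samplePos.z) 
        occlusion += max(0.0, dot(normal, normalize(sample_Dir)) - _Bias) 
                   * (1.0 / (1.0 + length(sample_Dir))) * _Intensity; 
} 
occlusion /= _SampleCount; 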

 

HBAO   

Horizon-based ambient occlusion (HBAO) takes a very different and much more physically based approach than the previous implementation. The HBAO used for testing is [51], which builds on the Nvidia implementation [4]. The implementation has many auxiliary features, such as color bleeding, that will not be used for the purposes of this paper. The sampling parameters are divided into directions and steps: directions are randomly chosen azimuth angles that each define a slice, and steps are the raymarching steps taken within a slice. Mirroring the double integral in the underlying formulation, the fragment shader contains two nested for-loops, as shown in the pseudo-code below: 

 

for (int d = 0; d < DIRECTIONS; ++d) 
{ 
    float angle = theta * float(d); 

    // Randomly rotate the direction and compute its normalized 2D direction. 
    // Jitter the starting sample within the first step. 
    // Calculate the tangent of the ray. 

    for (int s = 0; s < STEPS; ++s) 
    { 
        // Calculate the texture increment for ray marching; snap to the 
        // center of the texel to fetch accurate depth values. 

        // Fetch the sample's view-space position and advance one step. 

        // Find the horizon vector and its length for calculating occlusion. 
        float3 horizon_Vector = sample_ViewPos - p_ViewPos; 
        float horizon_LengthSqr = dot(horizon_Vector, horizon_Vector); 
        float occlusion = dot(normal, horizon_Vector) * rsqrt(horizon_LengthSqr); 

        // Check whether this is the new maximum horizon angle, scale its 
        // contribution by the distance attenuation factor, and add it. 
    } 

    // Accumulate the AO over all directions. 
} 

The algorithm is considerably more complex than the previous SSAO, and many variations exist that try to optimize the angle calculations. As shown in the simplified pseudo-code above, one noticeable assumption is that there are no discontinuities in the heightfield, even though the maximum horizon angle may in fact have unoccluded regions beneath it.  

 

Unity’s SAO (AlchemyAO)   

Scalable ambient obscurance (SAO), as specified by McGuire [30], is an extension of the Alchemy ambient occlusion; its core fragment shader, however, is the same as AlchemyAO's. The pseudo-code below provides an overall look at the algorithm: 

 

half4 frag_ao(v2f i) : SV_Target 
{ 
    // Fetch the view-space normal and depth of the fragment. 
    // Offset the depth value to avoid precision errors. 
    // Reconstruct the view-space position "vpos_o" from the depth. 

    for (int s = 0; s < _SampleCount; s++) 
    { 
        // Sample a point v_s1 in a hemisphere, flip it according to the 
        // normal, and add vpos_o. 
        float3 v_s1 = PickSamplePoint(uv, s); 
        v_s1 = faceforward(v_s1, -norm_o, v_s1); 
        float3 vpos_s1 = vpos_o + v_s1; 

        // Reproject the sample point to screen space. 
        // Find the depth at the reprojected sample point and reconstruct 
        // the view-space position that is actually visible there. 

        // Evaluate the Alchemy estimator for the vector from the fragment 
        // to that visible point and accumulate it. 
    } 

    // Normalize the accumulated term and apply the intensity. 
} 
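For context on the closing steps, the estimator the Alchemy paper specifies, and which Unity's implementation follows, gives each sample a contribution of the form max(0, v · n − β·z) / (v · v + ε), where v is the vector from the fragment to the visible point recovered at the reprojected sample, z is the fragment's depth, β biases against self-occlusion on flat surfaces, and ε guards against division by zero. 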
