

Evaluation of tone mapping operators for use in real time environments

by Jonas Hellsten

LITH-ITN-MT-EX--07/044--SE


Abstract

As real time visualizations become more realistic, it also becomes more important to simulate the perceptual effects of the human visual system. Such effects include the response to varying illumination, glare, and the differences between photopic and scotopic vision. This thesis evaluates several tone mapping methods that allow a greater dynamic range to be used in real time visualisations. Several tone mapping methods have been implemented in the Avalanche Game Engine and evaluated using a small test group. To increase immersion in the visualization, several filters aimed at simulating perceptual effects have also been implemented. The primary goal of these filters is to simulate scotopic vision. The tests showed that two tone mapping methods are suitable for the environment used in the tests: the S-curve method gave the best result, while the Mean Value method gave good results while being the simplest and cheapest to implement. The test subjects agreed that the simulation of scotopic vision enhanced the immersion in a visualization. The primary difficulties in this work have been the lack of dynamic range in the input images and the challenges of coding real time graphics on a graphics processing unit.


Contents

1 Introduction
  1.1 Purpose
  1.2 Goals
  1.3 Avalanche Game Studios
2 Background
  2.1 Dynamic range
  2.2 Tone mapping
    2.2.1 Mean Value Mapping
    2.2.2 Logarithmic mapping
    2.2.3 Power mapping
    2.2.4 S-Curve
    2.2.5 Reinhard's photographic tone reproduction
    2.2.6 Ward's visibility matching operator
  2.3 Perceptual effects
    2.3.1 Adaptation
    2.3.2 Glare
    2.3.3 White balancing
    2.3.4 Scotopic vision
  2.4 HDR graphics
    2.4.1 Benefits of HDR
    2.4.2 Color format
    2.4.3 HDR in games
3 Implementation
  3.1 Post processing framework
    3.1.1 Avalanche game engine
    3.1.2 Shaders
    3.1.3 Programming language
    3.1.4 Gamma control
    3.1.5 Floating point rendering
  3.2 Tone Mapping
    3.2.1 Mean Value Computation
    3.2.2 Adaptation
  3.3 Spatial filtering
    3.3.1 Gaussian blur
    3.3.2 Function Approximation
  3.4 Scotopic Vision
    3.4.1 Acuity reduction
    3.4.2 Bloom
  3.5 White Balancing
    3.5.1 Saturation
4 Result and Evaluation
  4.1 Visual result
    4.1.1 Post processing pipeline
    4.1.2 Tone mapping methods
    4.1.3 Scotopic vision image
  4.2 User tests
    4.2.1 Preferred tone mapping method
    4.2.2 Tweaking
  4.3 Performance
    4.3.1 Theoretical shader performance
    4.3.2 Measured performance
5 Conclusion and Discussion
  5.1 Conclusion
  5.2 Discussion
  5.3 Future Work
Bibliography
A Appendix
  A.1 Demonstration movie
  A.2 Post processing pipeline
  A.3 Tone mapping comparison
  A.4 Scotopic comparison


List of Figures

2.1 Dynamic range
2.2 Dynamic range with tone mapping
2.3 A book page overlaid on a gradient ramp
2.4 Tone mapping graphs
2.5 Overview of the human eye
2.6 Adaptation of cones and rods
2.7 Glare
2.8 CIE luminosity functions
2.9 HDR motion blur comparison
3.1 Post processing pipeline
3.2 Sample positions
3.3 Normal distribution
3.4 Bloom Function
3.5 Filter Approximation pipeline
3.6 Function approximation
3.7 Brightpass
4.1 Tone mapping pipeline
4.2 Tone mapping comparison, daylight
4.3 Tone mapping comparison, high contrast
4.4 Scotopic comparison
4.5 Tone mapping grades
A.1 Tone mapping pipeline during the day
A.2 Tone mapping pipeline during the night
A.3 Night Vision pipeline during the night
A.4 Tone mapping comparison, daylight
A.5 Tone mapping comparison, high contrast


List of Tables

4.1 Parameter values
4.2 Theoretical performance
4.3 Measured performance


Acronyms & Abbreviations

GPU Graphics Processing Unit

PSF Point Spread Function

HDR High Dynamic Range

LDR Low Dynamic Range

BRDF Bidirectional Reflectance Distribution Function

MSAA MultiSample AntiAliasing

FPS Frames Per Second


Chapter 1

Introduction

1.1 Purpose

In computer graphics, light is collected by a camera, virtual or real, and then displayed by some display system. Such a display system could range from regular printed paper to scanning laser displays, but LCD and CRT systems are the most common and the most relevant for this work. The displayed image is not a direct representation of the captured light; rather, it is a representation of how the image would have appeared to an observer at the scene. How an image appears to an observer is a highly complex process that is not fully understood. In photography this has been approached by adjusting the various parameters of the camera system and by adjustments in the development process. In computer visualisations an idealized camera is assumed, and the virtual environment is created to be mapped directly to the display. The reason real light values can't be used is the limited dynamic range of display systems: the luminance produced by common computer monitors is only a fraction of the luminance viewable by the human visual system.

Another approach for visualisations would be to use real lighting information to produce a representation of the scene with real illumination values, and then use a virtual camera to transform this image into something viewable on a regular display. This leads to the problem of determining how to transform the objective information, which has a high dynamic range (HDR), into a low dynamic range (LDR) image that can be displayed on a monitor. The transformation should map HDR values to LDR values, a process called tone mapping in the literature. To improve the appearance even more, various perceptual effects of the visual system can be simulated. Such effects are usually not considered a part of the tone mapping but are very closely related.

In short, the purpose of this work is to increase the apparent quality of real time computer generated graphics by simulating the human visual system. The use of 'apparent' should be stressed, since it is the user's perception of the image that is most important.


1.2 Goals

The first goal of this work is to evaluate tone mapping methods suitable for real time use. Part of this goal was to select a few methods fast enough for real time visualizations, implement the selected methods in a graphics engine, and perform a user test of the implementation. As the purpose is real time use, a time-dependent method should be most suitable.

The second goal was to simulate perceptual effects of the human visual system. Since the visual system is incredibly complex and not well understood, a complete simulation is infeasible. The effects that were chosen were the simulation of glare and of scotopic (night) vision. The target for this work is primarily computer games, but other visualisations could benefit as well.

1.3 Avalanche Game Studios

This work was done for Avalanche Game Studios in Stockholm. Avalanche was founded in 2003 and delivered its first game, Just Cause, in 2006. Just Cause is a technically advanced third person action game with extremely large environments. A test build of the Avalanche Engine was used as the basis for the implementation of this work.


Chapter 2

Background

2.1 Dynamic range

Human visual perception can differentiate between very small differences in intensity while also being able to register very large intensities. The reason for this is partly that the visual system has a logarithmic response: contrast is enhanced for low intensities and compressed for large intensities. Another important function is adaptation; the visual system takes some time to adjust to different intensities. This can be experienced when walking into a dark room on a bright day, or when walking down a dark road and being temporarily blinded by an approaching car's headlights. The difference between the smallest visible difference in intensity and the pain threshold for intensity is the dynamic range, often measured on a logarithmic scale. Humans have approximately 14 log10 units of dynamic range according to Ferwerda [8], while the display used to write this text has around 3 log10 units of dynamic range, see Figure 2.1.

Photography and computer graphics have traditionally been limited to the dynamic range of the printer or display device. Paul Debevec described in 1997 a method to recover the entire visible dynamic range from a series of photographs [4]. Images produced with this method are called high dynamic range (HDR) images, while the traditional representation has been retroactively named low dynamic range (LDR). Erik Reinhard describes HDR graphics in more detail in his book 'High Dynamic Range Imaging' [20]. High dynamic range images are commonly stored in a floating point representation, such as the Radiance (.hdr) and OpenEXR (.exr) file formats. See section 2.4 for more information on HDR.

2.2 Tone mapping

Tone mapping is the process of transforming the HDR representation of an image to an LDR representation, see Figure 2.2. This has been a problem even for photographers


Figure 2.1: Approximate dynamic range of different systems, left: the total dynamic range of the human visual system, middle: the dynamic range of human visual system when adapted to a specific luminance, right: dynamic range of common display systems such as LCD and CRT monitors


Figure 2.2: Left: the eye and visual system transform the light from the world to a perceived image with lower dynamic range. Middle: a regular display limits the dynamic range that can be used at a time. Right: tone mapping is used to simulate the visual system's reduction of dynamic range before the image is displayed.


who have used various development methods to enhance images before printing them [19]. Tone mapping has proved to be a difficult problem, as the large number of different methods testifies. The reason for this is the high complexity and adaptivity of the visual system. The simplest group of tone mapping methods are global methods, which use the same function over the whole image. Such methods may fail to produce good results if the luminance varies across the image, for example when standing in a dark room looking out a window on a bright day. More advanced, local methods also use the surrounding pixels to determine the appropriate intensity. Details in dark and bright areas that might be lost with global methods can be better reproduced with local methods, see Figure 2.3. Local methods have the advantage of being more similar to the visual system, but some suffer from contrast reversals that lead to halo artefacts. Unfortunately, local methods remain computationally expensive and might therefore not be suitable for real time applications for some time. As the goal of this work is not only to achieve real time performance but to do so in a game with minimal overhead, only very simple methods were considered. It should be noted that GPUs evolve rapidly, and methods considered slow today might soon be usable. Below is a brief overview of the methods that were considered; see Figure 2.4 for graphs of the different tone mapping methods. When describing the tone mapping methods, the world luminance $L_w$ refers to the luminance of the original image, the display luminance $L_d$ to the tone mapped luminance, $L_{avg}$ to the average world luminance, and $L_{max}$ to the maximum world luminance. Several methods use a "key" value that determines the overall brightness of an image: a low key image will be dark while a high key image will be bright. The exact mapping between key value and overall brightness depends on the method.

2.2.1 Mean Value Mapping

Mean value mapping is a simple linear mapping of the mean value of the image to a key value in the interval [0,1]; a key value of 0.5 will result in the average luminance being mapped to mean gray.

$$L_d = \alpha \frac{L_w}{L_{avg}}$$

where $L_d$ is the display luminance, $L_w$ the world luminance, $L_{avg}$ the average luminance and $\alpha$ the key value of the image. This mapping is very simple and fast, but values higher than $L_{avg}/\alpha$ will be clamped, which might be undesirable.


Figure 2.3: A book page overlaid on a gradient ramp. It is almost impossible to make all the text readable with global tone mapping methods since improving the contrast in one area will decrease it in another. Such a case is almost trivial for a local method since the edges of the text can be found and enhanced.

2.2.2 Logarithmic mapping

One desirable characteristic of tone mapping operators is the expansion of low intensities and the compression of high intensities. A logarithmic method seems natural, since the visual system is thought to have a logarithmic response. One such function is:

$$L_d = \frac{\log(\omega L_w + 1)}{\log(\omega L_{max} + 1)}$$

where $L_{max}$ is the smallest world luminance that will be mapped to one and $\omega$ is a key value to control the overall display luminance. This method, and a more advanced adaptive variant, are described by Drago [5].
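
A corresponding HLSL fragment, again with illustrative constant names (a sketch, not the thesis code):

```hlsl
float gOmega;  // key value controlling the overall display luminance
float gLmax;   // smallest world luminance mapped to one

float LogarithmicMap(float Lw)
{
    return log(gOmega * Lw + 1.0) / log(gOmega * gLmax + 1.0);
}
```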

2.2.3 Power mapping

The idea behind this method is to apply a linear mapping between 0 and the maximum luminance and raise the result to a power that maps the mean value to some key value. This is inspired by the gamma function used for displays.

$$L_d = \left(\frac{L_w}{L_{max}}\right)^{\frac{\log \alpha}{\log\left(L_{avg}/L_{max}\right)}}$$
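
As an HLSL sketch, with the exponent computed from the assumed per-frame constants so that $L_{avg}$ maps exactly to the key value:

```hlsl
float gLavg;  // average world luminance
float gLmax;  // maximum world luminance
float gKey;   // key value (alpha)

float PowerMap(float Lw)
{
    // Exponent chosen so that Lw = Lavg maps exactly to the key value.
    float e = log(gKey) / log(gLavg / gLmax);
    return pow(Lw / gLmax, e);
}
```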


2.2.4 S-Curve

Pattanaik has proposed a time dependent method for real time applications [18]. The method consists of several parts: an adaptation model, an appearance model, an inverse appearance model and an inverse adaptation model. While the method is too complex to describe completely here, it is based around an S-shaped function:

$$L_d = \frac{L_w^n}{L_w^n + \sigma^n}$$

where $n$ controls the slope of the curve and $\sigma$ is the luminance that is mapped to mean gray, usually the average luminance. This function has some desirable properties, such as always mapping to the [0,1] interval and offering control over the slope of the function. For this work only the function shown above has been used; the appearance and adaptation models have been ignored since they are a bit too complex for a gaming environment.
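
The function itself is nearly a one-liner in HLSL (a sketch with illustrative names):

```hlsl
float gSigma;  // luminance mapped to mean gray, typically the adapted average
float gN;      // slope of the curve

float SCurveMap(float Lw)
{
    float Ln = pow(Lw, gN);
    return Ln / (Ln + pow(gSigma, gN));
}
```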

2.2.5 Reinhard's photographic tone reproduction

Reinhard presents in his paper "Photographic tone reproduction for digital images" [19] a relatively simple mapping, followed by local dodging-and-burning to add detail in dark and bright areas that was destroyed in the initial mapping. Dodging-and-burning are techniques derived from photography where the exposure is varied over the image. While the local dodging-and-burning is computationally intensive, the initial mapping is not. This method was also suggested at the Game Developers Conference in 2007 [11].

$$L_r = \alpha \frac{L_w}{L_{avg}}, \qquad L_d = \frac{L_r}{1 + L_r}$$

Reinhard also proposed a second variant of this mapping to burn out high luminance in a controllable fashion:

$$L_d = \frac{L_r \left(1 + \dfrac{L_r}{L_{max}^2}\right)}{1 + L_r}$$
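
Both variants as an HLSL sketch; the squared $L_{max}$ term follows Reinhard's published formulation, and the constant names are illustrative:

```hlsl
float gKey;   // key value (alpha)
float gLavg;  // average luminance
float gLmax;  // luminance allowed to burn out to white

float ReinhardSimple(float Lw)
{
    float Lr = gKey * Lw / gLavg;
    return Lr / (1.0 + Lr);
}

float ReinhardBurnout(float Lw)
{
    float Lr = gKey * Lw / gLavg;
    return Lr * (1.0 + Lr / (gLmax * gLmax)) / (1.0 + Lr);
}
```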

2.2.6 Ward's visibility matching operator

Ward proposed in his paper "A Visibility Matching Tone Reproduction Operator for High Dynamic Range Scenes" [15] a method that attempts to preserve apparent contrast instead of brightness. This method is a modification of histogram equalization, and as such it requires a histogram to be built. Histogram generation is not a very good match for GPU hardware, since random writes or other kinds of accumulation are not supported.

(a) Mean Value Mapping (b) Logarithmic (c) Power Mapping (d) S-Curve (e) Reinhard's first (f) Reinhard's second

Figure 2.4: Tone mapping graphs; all methods display the [0,2] range. Constants have the following values: $L_{avg} = 0.7$, $\alpha = 0.5$, $L_{max} = 2$, $\omega = 1$, $\sigma = 0$ and $n = 2$

Methods for building histograms in real time are presented by Scheuermann [22], Green [12] and Fluck [10]. Most such methods use multipass approaches or texture reads in the vertex shader to get around the limitations of the pixel shader. Ward's visibility matching operator produces good quality images [1] and should be possible to run in real time. Because of the difficulty of constructing the histogram, and the problem of making the method time-dependent, this method was not implemented.

2.3 Perceptual effects

Most tone mapping methods only cover the transformation from real world luminance to display luminance. This is fine for photographic use, but for video and real time graphics it is often useful to simulate more features of the visual system. Such features are often caused by adaptation or by imperfections in the eye, such as scattering or diffraction. The human eye is similar to a camera, see Figure 2.5 for an overview. The lens and the cornea focus the incoming light onto the retina. The iris functions as an aperture and limits the transmitted light. The vitreous humor is the transparent liquid that fills the eye globe; it has no function other than to transmit light. The retina is a layer of light sensitive cells that functions as a sensor. There are two kinds of light sensitive cells. Cone cells are responsible for photopic (light) vision and are mostly concentrated in the fovea; this area is in the centre of vision and is the only place where vision is "sharp". Rod cells are responsible for scotopic (dark) vision and are more spread out over the retina. Rod cells are more sensitive


Figure 2.5: Overview of the human eye, [21]

than cones, and are often connected to several bipolar cells, providing a sort of filtering to suppress noise and increase sensitivity [13].

2.3.1 Adaptation

One of the most relevant features of the visual system for the purpose of tone mapping is adaptation. While the eye has a dynamic range reaching from starlight to bright sunlight, it can only use a portion of that range at any one time. The process of adjusting this range to different lighting environments is called adaptation. Adaptation methods have been presented in [18], [8], [6]. The method used here was presented by Krawczyk [14] and is an exponential decay function:

$$L_a^{n+1} = L_a^n + (L - L_a^n)\left(1 - e^{-\Delta t / T}\right) \qquad (2.1)$$

where $L_a$ is the adapted luminance, $\Delta t$ the time between adaptation steps and $T$ the adaptation time, see Figure 2.6. The adaptation of the eye is divided into adaptation of the rods (dark adaptation) and adaptation of the cones (light adaptation). While light adaptation takes on the order of a few seconds, dark adaptation can take up to half an hour. Because of this, only light adaptation is simulated, but the adaptation time is increased a bit for dark environments. Users would probably not have the patience to wait for dark adaptation, nor notice the effect if present. An exponential decay function is a reasonable approximation of light adaptation.
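
As a sketch, the adaptation step can be run as a 1x1 pixel shader that ping-pongs between two adapted-luminance textures, as described in section 3.2.2 (all names are illustrative):

```hlsl
texture gOldAdapted;  // 1x1 texture: last frame's adapted luminance
texture gMeanLum;     // 1x1 texture: this frame's mean luminance
sampler sOldAdapted = sampler_state { Texture = <gOldAdapted>; };
sampler sMeanLum    = sampler_state { Texture = <gMeanLum>; };
float gDeltaT;        // time between adaptation steps (frame time)
float gT;             // adaptation time T

float4 AdaptPS(float2 uv : TEXCOORD0) : COLOR0
{
    float La = tex2D(sOldAdapted, uv).r;
    float L  = tex2D(sMeanLum, uv).r;
    // Equation 2.1: exponential decay toward the current mean luminance.
    float LaNew = La + (L - La) * (1.0 - exp(-gDeltaT / gT));
    return float4(LaNew, LaNew, LaNew, 1.0);
}
```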


Figure 2.6: The adaptation response of the rods and cones. The thin blue line represents the adaptation target, the medium red line the cones with an adaptation time of T = 1, and the thick green line the rods with an adaptation time of T = 4

2.3.2 Glare

Glare is the visual effect surrounding light sources, most noticeable when the eye is dark adapted. Glare has been described by Spencer [23] and is composed of three different parts. The lenticular halo is a rainbow colored ring around lights, caused by diffraction at the edge of the lens [23], see Figure 2.7(a). The ciliary corona is a large number of slightly colored, needle shaped lines emanating from the light source, see Figure 2.7(b). It is caused by scattering from small particles in the lens [27]. Veiling luminance, or bloom, is a soft glow around bright objects and is caused by scattering in the cornea, lens and vitreous humour, see Figure 2.7(c). Since glare is caused by defects in the eye and is independent of the distance to the light source, a post process convolution filter would work well. Unfortunately the ciliary corona and lenticular halo are only present for small light sources; larger lights blur the effect. This makes a real time implementation difficult, since downsampling cannot be used and the convolution filters need to be very large even for modest resolutions. Bloom is easier to create in real time, and variations have already been present in several games, sometimes to the point of making everything glow. The most common approach to bloom is to blur the image and add some portion of the blurred image back to the original image. Spencer describes a more accurate filter for simulating bloom that should provide a more realistic bloom effect if accurate image data is present.


Figure 2.7: Components of glare. (a) Lenticular halo, after [16]. (b) Ciliary corona, after [27]. (c) Bloom

2.3.3 White balancing

The visual response to color is highly dependent on the environment. The visual system tries to compensate for colored lighting to make the appearance of colored objects independent of the lighting color. Photographers often use objects of a known color, like a sheet of paper, as a reference for adjustments. A simple way to white balance an image in a tone mapping system is to adapt each color channel separately: instead of computing the luminance for the image, tone mapping the luminance and converting back to a color image, the RGB channels are tone mapped independently. After some time to allow for adaptation, the mean color of the image will then always be mean gray. This is a very simple model and is overly aggressive in compensating for tint. The effect can be reduced by blending the white balanced image with a regularly tone mapped image.

2.3.4 Scotopic vision

In low illumination only the rod cells are active; this is known as scotopic vision. Between the scotopic and photopic ranges lies the mesopic range, where both rods and cones are active. Scotopic vision is quite different from photopic vision due to the differences between rods and cones.

2.3.4.1 Purkinje effect

Rods only come in one variant and are therefore monochromatic. The peak sensitivity of the rods is shifted compared to the cones, see Figure 2.8 for the spectral response of the different cell types. This can cause a contrast reversal between different colors under different levels of illumination. An example of this is a red rose with green leaves: in daylight the red rose appears brighter than the leaves, while in low illumination the leaves will appear


Figure 2.8: CIE luminosity functions. The blue line represents the illuminance response of the rods while the green line represents the illuminance response of the cones

brighter than the rose. This is known as the Purkinje effect [26]. To simulate this effect, the best approach would be to use multi spectral image data or a scotopic channel in addition to the RGB channels. Since this is not feasible, an approximation such as the one presented by Thompson [25] is used. The approximation first converts the RGB values to the XYZ color space using a transformation matrix such as:

$$\begin{pmatrix} X \\ Y \\ Z \end{pmatrix} = \begin{pmatrix} 0.5149 & 0.3244 & 0.1607 \\ 0.3654 & 0.6704 & 0.0642 \\ 0.0248 & 0.1248 & 0.8504 \end{pmatrix} \begin{pmatrix} R \\ G \\ B \end{pmatrix} \qquad (2.2)$$

After the transformation, an empirical formula is used to estimate the scotopic luminance:

$$L_s = Y \left( 1.33 \left( 1 + \frac{Y + Z}{X} \right) - 1.68 \right)$$

The scotopic luminance $L_s$ can then be multiplied with a bluish gray to give the appearance of night. The selection of color is a matter of preference, but most people do seem to perceive the night as slightly bluish.

2.3.4.2 Visual Noise

At low illuminance, noise becomes an issue. Noise can originate from stray photons, receptor noise, neural noise and other sources [25]. This noise is additive, but the neural processing can make it appear multiplicative.


2.3.4.3 Visual Acuity

The perception of details becomes worse with decreasing illumination. The reason for this loss is that there are fewer rod cells and that they are more spread out over the retina. The individual receptor cells are also connected in a way that forms a kind of spatial filter to reduce noise. In computer graphics this reduction of resolvable spatial frequency can be simulated with a Gaussian blur. Thompson presents a modification of this technique that increases the sharpness of the blurred image to produce a better estimation [25]. Krawczyk [14] uses the function:

$$RF(L) = 17.25 \cdot \arctan(1.4 \log_{10} L + 0.35) + 25.72$$

to determine the highest resolvable spatial frequency, where $L$ is the luminance and $RF$ the resolvable frequency in cycles per degree of visual angle. The standard deviation for the Gaussian filter kernel is then computed with:

$$s_{acuity} = \frac{width}{fov} \cdot \frac{1}{1.86 \cdot RF(L)}$$

where $s_{acuity}$ is the standard deviation, $width$ the resolution of the display in pixels and $fov$ the field of view.
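
The two formulas combine into a small HLSL-style helper (a sketch; the resolution and field-of-view constants are assumed to come from the application):

```hlsl
float gWidthPixels;  // horizontal display resolution
float gFov;          // field of view in degrees

float AcuityStdDev(float L)
{
    // Highest resolvable spatial frequency in cycles per degree (Krawczyk).
    float rf = 17.25 * atan(1.4 * log10(L) + 0.35) + 25.72;
    // Standard deviation, in pixels, for the Gaussian blur kernel.
    return (gWidthPixels / gFov) * (1.0 / (1.86 * rf));
}
```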

2.4 HDR graphics

HDR images have traditionally been created by taking several photographs with different exposures and combining them with a program such as HDRShop by Paul Debevec. When talking about HDR in a real time graphics context, the term becomes loosely defined; many games have used the HDR term to describe simple bloom. In order to get the full benefit of HDR rendering, the lighting information needs to be in an HDR format and the rendered image needs to remain in an HDR format until it is tone mapped. Lighting information includes dynamic lights, lightmaps, emissive materials etc. Regular LDR textures are usually sufficient for reflectance information, as it only needs to be in the [0,1] range. For realistic rendering the best approach would be to use measured light data and physical models for light calculations, and realistic BRDFs for materials. This seems to be an uncommon approach, since similar results can be achieved without the extra cost of measuring and validating light data by having artists create environments with the right appearance. Some of the early work in real-time HDR graphics has been done by the game companies Valve [3] and Bungie [24].


(a) Original (b) LDR (c) HDR

Figure 2.9: HDR motion blur comparison. The left picture shows the original image after tone mapping, the middle picture shows the image motion blurred after tone mapping, the right picture shows the image tone mapped after the motion blur. Image courtesy of Paul Debevec, [2]

2.4.1 Benefits of HDR

Using HDR overcomes several limitations of LDR rendering. Refraction and reflection effects can be made more accurate, since objects such as the sun will remain bright even when reflected by an object with low reflectivity. Blur effects benefit in the same way, see Figure 2.9. Other advantages are more dependent on tone mapping and related filters. Environments with high contrast, or a mix of bright and dark environments, will probably show the largest benefit. In these cases adaptive tone mapping can be used to enhance the contrast between environments while maintaining the ability to see clearly in each environment.

2.4.2 Color format

When applying tone mapping as a post process, an HDR render target is needed. The common format for regular rendering is an 8 bit per component integer format. As few displays can show more colors than this, it is sufficient if no post processing is used. If processing is needed, higher precision formats such as 12 or 16 bit per component integers can be used, but they still suffer from a limited dynamic range. A simple way to get a good dynamic range without using an excessive number of bits is to use a floating point format. One possibility is to pack floating point information into the regular 32 bit pixel. A common example is to use the alpha channel to store an exponent that is common to all color channels. This is a simple approach that has a fairly good dynamic range and can be used on most hardware. The downside is that hardware blending and fog calculations can be tricky, and bilinear interpolation of HDR textures does not always work [3]. This is mostly due to hardware limitations.

A second approach is to use multiple render targets, where each target represents a different exposure level. This method has wide hardware support, but the dynamic range is not very good, MultiSample AntiAliasing (MSAA) does not work with multiple render targets, and all buffers will use at least twice the amount of memory and bandwidth to store the brightness data.

The third approach is to use a 16 bit per component floating point format. This is probably the best format for future implementations, as it is the easiest to implement and has the best precision and dynamic range. The main disadvantage is that it is only fully supported on very recent GPUs at the time of writing; older GPUs may lack support for MSAA or blending. Memory and bandwidth consumption will be double that of a regular 8 bit per component buffer. This format was used in this work due to its ease of implementation and high precision.

2.4.3 HDR in games

HDR is something of a hot topic in computer games. Several games have been produced that use some form of HDR rendering, and many more are under development. Different HDR implementations can be very dissimilar, and the dynamic range can vary a lot between them. In some cases a simple bloom effect has been labeled "HDR". Some games use the HDR moniker when using the higher precision formats necessary for some effects, while not really using the extra dynamic range. Most games have HDR as extra graphical candy, since hardware support for HDR textures and render targets is not universally available. The ideal situation, where physical illuminance values are used, might be far off even with improving hardware, since most graphical work has been done in LDR for such a long time. Even popular image editing suites like Adobe Photoshop lack support for HDR images years after the format was introduced.


Chapter 3

Implementation

3.1 Post processing framework

This work was done as a post process: it takes the completed rendered image as input, performs calculations on the image using the GPU, and displays the result. Very little information about the other parts of the rendering pipeline is used. The image to be processed is set as the source image, the shader program to be used is assigned, and the output buffer for the result is set. A full screen polygon is then drawn and the shader program executed for each pixel in the target buffer. A downside with this approach is the overhead associated with reconfiguring the rendering pipeline. Some of this cost can be avoided by combining steps, at the cost of flexibility. See Figure 3.1 for an overview of the post processing pipeline.
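
A minimal full screen pass of the kind each pipeline step uses might look like this in HLSL (a sketch; the engine's actual setup code differs):

```hlsl
texture gSource;
sampler sSource = sampler_state { Texture = <gSource>; };

struct VSOut
{
    float4 pos : POSITION;
    float2 uv  : TEXCOORD0;
};

// The full screen polygon is passed through unchanged; its UVs cover the source.
VSOut FullScreenVS(float4 pos : POSITION, float2 uv : TEXCOORD0)
{
    VSOut o;
    o.pos = pos;
    o.uv  = uv;
    return o;
}

// Identity pass; each real step replaces this body with its image operation.
float4 CopyPS(VSOut i) : COLOR0
{
    return tex2D(sSource, i.uv);
}
```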

3.1.1 Avalanche game engine

The Avalanche engine already had a post processing framework, which facilitated this work. Adaptive tone mapping was performed by using a series of downscale filters to compute the mean value of the frame buffer. An adaptation filter was then applied to the mean value to simulate adaptation. The tone mapping was done by a simple mean value filter. Bloom was added by using a bright pass filter followed by a blur function, and the result was added to the tone mapped image. While the Avalanche game engine supports several different platforms, this work was done on the PC platform.

3.1.2 Shaders

In order to achieve real time speed, the GPU was used to perform all image operations. The reason the GPU is so fast at this kind of work is that it exploits the parallelism inherent in image processing. The GPU contains many small processing cores called shading units or stream processors. Shading units are very simple processors;


Figure 3.1: Overview of the post processing pipeline. Note that some parts are done in the same shader, such as the tone mapping and white balancing, while others use several shaders, such as the mean value computation. See A.2 for images from each step.

they do not support scattered writes and have a very limited ability to do branches and loops. The reason they are so fast is that there are so many of them; at the time of writing, GPUs with over 300 shading units have been introduced. The features of shading units are often described in terms of the shader models of Microsoft's DirectX. The GPU used for this work supports DirectX 9 shader model 3. Shader programs are the programs that run on the shading units.

3.1.3 Programming language

A high level language like Nvidia's Cg, the OpenGL Shading Language (GLSL) or Microsoft's High Level Shading Language (HLSL) is usually used to write shader programs. High level languages have the advantage of being easier to read and debug compared to assembly code. This work was written in a combination of C++ and HLSL/Cg. C++ was used to configure the post processing pipeline and to compute various constants for the shaders, while HLSL was used to write the shaders. All image processing is done inside the shaders.

3.1.4 Gamma control

Textures are commonly stored in the gamma corrected sRGB format. This improves the dynamic range a bit when using 8 bit textures, but has an adverse effect when doing color operations [7]. The reason for this is that sRGB is not a linear format, and blending in non-linear color spaces is tricky. The image is therefore converted to linear space by use of the hardware gamma correction function present in recent GPUs. After all tone


mapping filters have been applied, the inverse transformation is applied when rendering the image to the screen. The hardware gamma correction function is often a piecewise linear approximation.

3.1.5 Floating point rendering

The modification of the game engine to use 16 bit floating point textures was minimal. As the framebuffer does not support floating point formats, rendering was done to an offscreen buffer.

A problem with the floating point format is that unexpected values such as infinities and NaN (not a number) can be represented; such values are clamped to 0 or 1 when using integer formats. These malformed pixels can be caused by, or may already be present in, the image entering the post processing pipeline. Division by zero was avoided in the post processing pipeline by adding a small value, ε, to the denominator of all divisions. Malformed pixels are clamped to [0,1] before they are sent to the screen and thus won't be noticed. The problem arises when using any kind of spatial filter: any operation on a NaN or Inf yields a NaN or Inf. If a blur filter is used, such pixels show up as large black squares. If any malformed pixel is sampled when creating the mean value, the whole screen becomes black, since the value spreads through the tone mapping operator. There are functions that check if a pixel is bad, but a better method would of course be to make sure it never happens.

3.2 Tone Mapping

The first step in the tone mapping is to compute the luminance of each pixel. This is done by taking the dot product of the color vector and a luminance vector. The luminance vector is the Y row of the RGB to XYZ transform from [29]:

[0.2125 0.7154 0.0721]

If the maximum luminance is needed by the tone mapping algorithm, it is approximated by summing the diffuse and ambient terms of the main light source. In this implementation the sun is almost always the dominant light source, since there is only limited dynamic lighting from other sources. This method works quite well when there is one dominant light source. An alternative is to extract the maximum luminance from the image in a similar way to the mean value; the downside is that it may change rapidly between frames, giving the appearance of flicker.

After all the necessary variables are known, they are used to compute the tone mapped luminance. In this work several different methods were implemented, and the implementation


of the tone mapping was straightforward; the equations in section 2.2 were implemented with little difficulty. After the tone mapped luminance is computed, it is transformed back to color information by the following formula:

$$c_t = c \cdot \frac{L_t}{L}$$

where $c_t$ is the tone mapped color vector, $c$ the image color, $L$ the image luminance and $L_t$ the tone mapped luminance. An alternative method from a slide from the Game Developers Conference [11] was tried, where the RGB information was transformed to XYZ and then to the Yxy color space. The Y component was then tone mapped and the inverse transform applied to get RGB data back. This method did not show any advantage compared to the simpler method.

The key value for the tone mapping controls the overall brightness of the image. It was computed with the following function:

$$n = 4, \quad \sigma = 1.2, \quad L_s = L_m^n, \quad \alpha = 1 - \frac{L_s}{L_s + \sigma^n}, \quad K = K_d\,\alpha + K_n\,(1 - \alpha)$$

where $K$ is the final tone map key value, $n$ a slope value, $\sigma$ the half value and $L_m$ the mean luminance. This method uses the same S-shaped function as the S-curve tone mapping method. $K_d$ and $K_n$ are set as constants to control the brightness of the image at day and night respectively. The key value that produces a fixed brightness depends on the tone mapping method. The downside of this method is that it does not use the actual displayed luminance to determine how bright the image should be. The ideal method would be to have real world illuminance values to compute the key value; as these are not available, some cheating is needed. Another possibility would be to store the key value as a function over time, and possibly space, in a texture to allow more direct control.
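
A sketch of this key computation in HLSL, mirroring the symbols above (constant names are illustrative):

```hlsl
float gKeyDay;    // Kd: brightness constant for day
float gKeyNight;  // Kn: brightness constant for night

float ComputeKey(float meanLum)
{
    const float n = 4.0;
    const float sigma = 1.2;
    float Ls = pow(meanLum, n);
    float alpha = 1.0 - Ls / (Ls + pow(sigma, n));
    // Blend the day and night key constants with the S-shaped weight.
    return gKeyDay * alpha + gKeyNight * (1.0 - alpha);
}
```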

3.2.1 Mean Value Computation

Due to the highly parallel hardware of the GPU, a simple summation becomes a bit tricky. The mean value is computed by a series of downsampling filters that reduce the resolution to a single pixel. The first pass samples the initial image and computes the luminance. Each subsequent pass downsamples the image by averaging a 4x4 area until the average


(a) Sample positions (b) Deformed grid (c) Downsampling pyramid

Figure 3.2: Sample positions. In order to weight samples toward the center of the image, the sampling grid is deformed. The left image shows how the sample positions are deformed. The center image shows how a regular grid looks after sampling. The right image shows part of a sampling pyramid with the lowest layer deformed

is a single pixel. This implementation uses a simple average; in the literature a log average is often used instead [19].
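
One 4x4 downsampling pass might look like this in HLSL (a sketch; the first pass would additionally convert color to luminance):

```hlsl
texture gSource;
sampler sSource = sampler_state { Texture = <gSource>; };
float2 gTexelSize;  // 1 / source resolution

float4 Downsample4x4PS(float2 uv : TEXCOORD0) : COLOR0
{
    float sum = 0.0;
    // Average a 4x4 block of source texels into one output texel.
    for (int y = 0; y < 4; ++y)
        for (int x = 0; x < 4; ++x)
            sum += tex2D(sSource, uv + (float2(x, y) - 1.5) * gTexelSize).r;
    return float4(sum / 16.0, 0.0, 0.0, 1.0);
}
```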

The mean value should ideally be weighted so that center pixels have more influence. This is done by offsetting the sampling grid of the first pass. The offset is computed by taking the vector from the sample position to the image center and computing its length. The length is then inverted, so that it is zero at the edge of the image, and multiplied with a constant $\alpha$; in this case $\alpha = 0.3$ was chosen.

$$o = v\,(\sqrt{2} - |v|)\,\alpha$$

where $o$ is the offset vector and $v$ the distance vector to the center, ranging from (1,1) to (-1,-1). The offset vector is then simply added to the texture coordinate that is used to sample the image. The downside of this method is that the sides, top and bottom do not get sampled at all, see Figure 3.2. An alternative method would be to first sample a sparse grid over the whole image and then sample a denser grid only over the center area, but this method was not implemented.

3.2.2 Adaptation

Adaptation is done by substituting the average image luminance with an adapted luminance. Equation 2.1 is used to compute the adapted luminance from the previous frame's adapted luminance, the image average, and the time to render one frame. The implementation switches between the old adapted luminance and the new one by using a texture array where the index alternates between 0 and 1 every frame. The adaptation time


is interpolated between the values for rods and cones, depending on whether it is a night or day scene. See Figure 2.6 for a graph of the adaptation times.

3.3 Spatial filtering

In order to produce effects like acuity reduction and bloom, spatial filtering is needed. Spatial filtering is a bit special in that each resulting pixel depends on several input pixels. Filters are most often described as a point spread function (PSF) with which the image is convolved. Two different kinds of filters were constructed for different purposes. For acuity reduction a low pass, or blur, filter was needed. The simplest blur filter is a box filter, where a box around the pixel is sampled and every sample has the same weight. While box filters are simple and easy to optimize, they do not produce pleasing results. Gaussian blur is a very common filter for high quality blur and was therefore chosen to reduce the acuity of the image. For the bloom PSF estimated by Spencer [23], another filter was developed that can approximate arbitrary filters as long as they are symmetrical. I call this method function approximation, but a more established name may exist. A problem with spatial filtering is that the number of samples per pixel increases with O(n²), where n is the size of the filter. This quickly becomes unfeasible for real time use if even modestly large filter sizes are wanted. The implemented methods use different ways to reduce this complexity.

3.3.1 Gaussian blur

The filter kernel for Gaussian blur follows the normal distribution from which the method has taken its name, see Figure 3.3. The equation for this filter in n dimensions is:

$$G(u, v) = \frac{1}{(2\pi\sigma^2)^{n/2}}\, e^{-r^2/(2\sigma^2)}$$

where $\sigma$ is the standard deviation and $r$ is the blur radius. This function has the very nice property of being separable. This means the image can first be convolved in the horizontal direction with a one dimensional kernel, and then convolved in the vertical direction:

$$i * f = i * f_h * f_v$$

where $i$ is the image, $f$ is a 2D Gaussian filter, and $f_h$ and $f_v$ are 1D horizontal and vertical filters. This reduces the complexity to O(n). Multiple successive small filters can also be used instead of one large filter; since the variances of successive Gaussian filters add, the effective standard deviation of several filters is:

$$\sigma_{eff} = \sqrt{\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_k^2}$$


Since several small filters scale less than linearly, there is no benefit unless there are hardware or performance limitations on using large filters. The weights for each sample were calculated on the CPU and passed to the shader as constants. The amount of blur is controlled by adjusting the standard deviation of the normal distribution. Since the Gaussian function takes some time to calculate, it was precalculated for standard deviations from 0.05 to 8 in 0.05 increments. To avoid sampling larger surrounds than necessary, several shaders were written, each taking a different number of samples. The number of samples needed was set to three times the standard deviation, as values outside this interval are small. As this is not sufficient to ensure that the sum of the weights is unity, the weights were normalized. Sampling positions were sent to the shader via interpolating registers, which saves some instructions in the shader program. If more samples are needed than there are interpolating registers available, an offset vector is set in a constant register and the sampling positions are calculated on the GPU.

To further increase performance, a trick described in GPU Gems 2 is used [17]: linearly interpolated samples can be used instead of point samples. One sample is taken from position i + b/(a + b), where a and b are the weights of two adjacent samples. This sample is then multiplied with (a + b) to get the final result. This is faster since one linearly interpolated sample is cheaper than two point samples.
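
One direction of the separable blur, with the linear sampling trick, might look like this (a sketch assuming the merged weights and offsets are precomputed on the CPU as described):

```hlsl
texture gSource;
sampler sSource = sampler_state
{
    Texture = <gSource>;
    MinFilter = Linear;
    MagFilter = Linear;
};

static const int TAPS = 5;
float2 gOffsets[TAPS];  // merged positions i + b/(a+b), in texture space
float  gWeights[TAPS];  // merged weights (a+b), normalized to sum to one

float4 GaussianBlur1DPS(float2 uv : TEXCOORD0) : COLOR0
{
    float4 sum = 0.0;
    // Each linearly interpolated tap stands in for two point samples.
    for (int i = 0; i < TAPS; ++i)
        sum += gWeights[i] * tex2D(sSource, uv + gOffsets[i]);
    return sum;
}
```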


Figure 3.3: Graph of the normal distribution

3.3.2 Function Approximation

This method ended up not being used, for reasons discussed below. I decided to document it anyway because it might be useful for future applications. The PSF of Spencer's bloom


Figure 3.4: Spencer’s bloom function. The PSF components (a) f0,(b) f1 and (c) f2.

After [23]

estimation is very different from most blur filters. It is constructed from three functions:

$$f_0(\theta) = 2.61 \cdot 10^6\, e^{-(\theta/0.02)^2} \qquad f_1(\theta) = \frac{20.91}{(\theta + 0.02)^3} \qquad f_2(\theta) = \frac{72.37}{(\theta + 0.02)^2}$$

$$P_p(\theta) = 0.384\, f_0(\theta) + 0.478\, f_1(\theta) + 0.138\, f_2(\theta)$$

where $\theta$ is the angle from the center. The function is therefore dependent on the resolution and field of view used. Only $f_1$ and $f_2$ were used, since $f_0$ has very little effect outside the central pixel, see Figure 3.4. Since we want to improve performance by reducing the resolution of the buffer, the center pixel was set to zero and added back from the full size buffer.

Using the same trick as in the Gaussian blur filter is not possible, since the function is not separable. One way to approximate this PSF is to filter the image several times with a small box filter. The results from all filter passes can then be linearly combined to approximate an arbitrary symmetrical filter. As the effective filter size increases by one for each pass with a 3x3 filter, the performance scales with O(n). Since modern GPUs cache texture reads, several passes with a small filter might perform better than one large filter.

To find the coefficients for this linear combination, a small Matlab script was used. The function $f(n)$ was first sampled at the desired size. A Dirac function was then filtered with a 3x3 box filter several times to produce a set of filter kernels $b_1(x, y) \ldots b_n(x, y)$ so


Figure 3.5: Filter approximation pipeline. The original image and each filtered image are multiplied with a weight, $a \ldots e$, and summed to produce the final filtered image

that the largest matches the sampled function. Since Matlab's matrix division only operates on 1D functions, one row from each $b_i(x, y)$ is extracted and put into the matrix $B_i$. The coefficients can then be found by dividing the sampled function by $B_i$, and can afterwards be used to construct the full 2D filter. The filter is also normalized to ensure that it does not brighten or darken the image. The normalized coefficients were $a = 0.935$, $b = 0.5904$, $c = 0.1165$, $d = -0.5167$, $e = 0.9033$, see Figure 3.6(b).

As only n filters are used to create a filter of size n², the result is only an approximation, but it works reasonably well if the filter is symmetrical. The mean square error for Spencer's function is less than 1.5e-4. The result could be improved by using a more symmetrical filter than a box filter. Since convolutions are commutative, it does not matter whether we use this method to construct a filter kernel or apply the filters directly to the image and combine the images instead.

To implement this, a simple 3x3 box filter was written as a shader. To reduce the number of passes needed for a given filter size, the image was scaled down before filtering. To combine the filtered images, a blend shader was made that takes two images and a coefficient for each and outputs the result; see Figure 3.5 for an overview of the pipeline. This is done in several passes until the final image is constructed, see Figure 3.6 for the shape of the functions. Having a separate buffer for each pass would waste a lot of memory; if a buffer is not needed anymore, it can be reused for subsequent passes. At least 3 buffers are needed, but more can be used to decrease the number of blending operations and simplify the program layout. After the image is filtered, it is recombined with the original image using bilinear sampling.
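
The blend shader is essentially a weighted sum of two images (a sketch with illustrative names):

```hlsl
texture gImageA;
texture gImageB;
sampler sA = sampler_state { Texture = <gImageA>; };
sampler sB = sampler_state { Texture = <gImageB>; };
float gCoeffA;  // e.g. one of the coefficients a...e
float gCoeffB;

float4 BlendPS(float2 uv : TEXCOORD0) : COLOR0
{
    // Weighted sum of two filter passes; chained over several passes this
    // builds the linear combination that approximates the target PSF.
    return gCoeffA * tex2D(sA, uv) + gCoeffB * tex2D(sB, uv);
}
```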

The result from this method was not very satisfactory. The blur effect is not noticeable if only a small amount of the filtered image is added. If larger amounts are added, a circular ring appears around objects that looks very dissimilar to real bloom. This is probably due to the lack of real illuminance values for bright objects. Bloom is caused by scattering in the eye and is therefore added before any processing by the eye or brain; the tone mapping method might not be good enough to produce a similar appearance even if the bloom function is

(a) Original Function (b) Approximation passes

Figure 3.6: Function approximation: several passes of a simple filter, multiplied with constants, are combined to approximate a complex filter

accurate. Problems were also encountered with the downsampling; since the filter is not very smooth, some artefacts can be visible when upscaling the result. Because of this, and because the method is quite complex to implement and maintain, the Gaussian blur method was chosen to simulate bloom. While this is not as technically accurate, it looks better.

3.4 Scotopic Vision

To find the scotopic luminance, the transformation described in section 2.3.4.1 is used. Some amount of noise is added to simulate visual noise. For this, a texture containing Gaussian noise was created in Matlab and sampled in the shader. To animate the noise, the texture coordinates are offset with a random value that is changed every n:th frame. The sampled noise is multiplied with a constant, to control the amount of noise, and added to the scotopic luminance. To get a blue-gray color, the scotopic luminance is then multiplied with a color vector of an appropriate color; [0.97, 0.97, 1.0] was suggested by Thompson [25] and [1.05, 0.97, 1.27] by Wolfgang [7]. For this work the color [0.95, 0.95, 1.1] was chosen, for no particular reason. This scotopic color is then blended with the regular color value. The blending value is computed by applying the HLSL smoothstep function to the luminance value of the pixel multiplied with a constant. This


makes bright pixels remain colored while dark pixels become monochromatic.

$$\bar{C}_{xyz} = (X, Y, Z)^T = M\,\bar{C}$$

$$L_s = Y \left( 1.33 \left( 1 + \frac{Y + Z}{X + \epsilon} \right) - 1.68 \right)$$

$$L_n = L_s + N\,\alpha$$

$$\bar{C}_s = \bar{C}_n\, L_n$$

$$b = 1 - L^2\,(3 - 2L)$$

$$\bar{C}_f = (1 - b)\,\bar{C} + b\,\bar{C}_s$$

where $\bar{C}$ is the tone mapped color in RGB format, $M$ the transformation matrix from RGB to XYZ color space, $L_s$ the approximation of the scotopic luminance, $N$ a noise value, $\alpha$ a constant determining the amount of noise, $\bar{C}_n$ a constant color determining the color at night, $L$ the tone mapped luminance and $\bar{C}_f$ the final color.

3.4.1 Acuity reduction

A Gaussian blur filter is applied to the result of the tone mapping to simulate the loss of acuity at night. A side effect of this filter is that it removes the high frequencies from the noise, resulting in a more subtle, blurry noise. The image is not downscaled before applying this filter, since the blur effect is small and needs to transition smoothly from the sharp image. Unfortunately the equations in section 2.3.4.3 could not be used, since the real world illuminance is unknown. The amount of blur that is applied is instead determined by the following function:

$$n = 4, \quad \sigma = 1.2, \quad L_s = L_m^n, \quad \alpha = 1 - \frac{L_s}{L_s + \sigma^n}, \quad std = \alpha\, b$$

where $std$ is the standard deviation for the Gaussian, $n$ a slope value, $\sigma$ the half value and $b$ a constant determining the amount of blur. This is the same function that is used to determine the key value for the tone mapping, but multiplied with a constant to control how much blur is applied. If the acuity reduction should remain independent of resolution, the scaling factor resolution/field of view can be used. This does have a performance impact at higher resolutions, since larger filters will then be used.


Figure 3.7: Graph of the brightpass function

3.4.2 Bloom

Since realistic luminance values are not available, some cheating is needed to simulate bright areas and illuminants. This is done by using a bright pass filter that extracts the bright regions of the image, followed by a blur filter. The result is then added back to the original image. Several different methods to extract the bright areas were tested. The method used in the end was one described by Bungie [24]: if the luminance of a pixel is larger than a fraction of the luminance of the main light source, the pixel is probably an illuminant. In this case the fraction is 0.5, and if the pixel is found to be part of an illuminant, it is multiplied with a scale factor. The effect is that moderately bright objects get a slight bloom during the day, while illuminants such as car headlights get a significant amount of bloom at night; see Figure 3.7 for the function used. The method works reasonably well, but it could definitely be improved upon. The bright pass filter downscales the image four times in order to increase the effect of the blur filter without decreasing performance.
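
A bright pass along these lines might look as follows (a sketch; the 0.5 fraction and the scale factor are the values described above, while the constant names are illustrative):

```hlsl
texture gSource;
sampler sSource = sampler_state { Texture = <gSource>; };
float gMainLightLum;  // approximated luminance of the main light source
float gScale;         // boost applied to pixels classified as illuminants

float4 BrightPassPS(float2 uv : TEXCOORD0) : COLOR0
{
    float4 c = tex2D(sSource, uv);
    float lum = dot(c.rgb, float3(0.2125, 0.7154, 0.0721));
    // Pixels brighter than half the main light luminance count as illuminants.
    float isBright = step(0.5 * gMainLightLum, lum);
    return c * isBright * gScale;
}
```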

3.5 White Balancing

White balancing was done by performing the tone mapping on each color channel separately, instead of on the luminance value. To do this, the mean value computation was changed to save color data instead of just luminance data. Depending on the tone mapping method used, white balancing might be almost free or might increase the time to tone map the image by a factor of four. The increase in time is caused by the inability of most graphics hardware to compute complex functions, such as the power function, on whole vectors at once. To adjust the amount of white balancing, the white balanced


color was blended together with the regular tone mapped color.

3.5.1 Saturation

Since the white balanced image had a tendency to look gray and washed out, a saturation function was added. The code to saturate the image corresponds to:

$$m = \frac{\bar{C}_r + \bar{C}_g + \bar{C}_b}{3}, \qquad \bar{C}_{scaled} = (\bar{C} - m)\,\alpha, \qquad \bar{C}_{saturated} = \bar{C}_{scaled} + m$$

where $\bar{C}_r$, $\bar{C}_g$, $\bar{C}_b$ are the original RGB color values, $\alpha$ controls the amount of saturation and $\bar{C}_{saturated}$ is the saturated RGB color.
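
In HLSL this is only a few lines (a sketch; gSaturation plays the role of α):

```hlsl
float gSaturation;  // alpha: 1 keeps the color, > 1 increases saturation

float3 AdjustSaturation(float3 c)
{
    // Push each channel away from (or toward) the per-pixel channel mean.
    float m = (c.r + c.g + c.b) / 3.0;
    return (c - m) * gSaturation + m;
}
```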


Chapter 4

Result and Evaluation

4.1 Visual result

The best way to see the results of this work is to play the game. As this is not possible, some screenshots are included. It is recommended to view the images on a monitor, as they will not have the same detail and color reproduction in print. The electronic version of this file should have full resolution images, so use the zoom function to get a closer look. Larger versions of the images presented here can be found in appendix A.

4.1.1 Post processing pipeline

Images from each step in the post processing pipeline have been extracted in order to visualize the results of each step. Two sets of images have been taken, one in a daylight environment and one at night, see Figure 4.1.

4.1.2 Tone mapping methods

To compare the different tone mapping methods two series of images were taken. The first is in the normal daylight environment used in the application, see Figure 4.2. The second has the diffuse sunlight multiplied by 5 to produce a higher contrast environment, to show how well the methods handle different lighting environments, see Figure 4.3.

4.1.3 Scotopic vision image

Here are two images that compare the result of the perceptual effects at night. Note the loss of color, the noise in the image and the loss of acuity, see Figure 4.4.

Figure 4.1: Tone mapping pipeline. (a) In daylight the panels show each step: original, blend, brightpass, blurred brightpass, down sampling, mean illuminance, adapted illuminance, bloom, tone mapped, white balanced, and tone mapped and white balanced. (b) At night the pipeline additionally contains: scotopic illuminance, noise, noisy scotopic illuminance, night color, colored image, blend, blurred final image and night blend factor.

(a) Reinhard 1 (b) Reinhard 2 (c) Logarithmic (d) Power (e) S-Curve (f) Mean Value

Figure 4.2: Comparison of different tone mapping methods in the normal daylight environment


(a) Reinhard 1 (b) Reinhard 2 (c) Logarithmic (d) Power (e) S-Curve (f) Mean Value

Figure 4.3: Comparison of different tone mapping methods with the diffuse sunlight multiplied by 5

(a) Night image without effects

(b) Night image with effects

Figure 4.4: Comparison of a night image with and without the perceptual night effects.

4.2 User tests

In order to evaluate the result of this work a small user test was performed. There were five subjects in this test, all male students in computer technology or similar educations, ages 20-25. The tests were performed on an uncalibrated monitor in a standard office lighting environment. Because of this the tests can hardly be considered scientific. However, the test subjects do belong to a group often targeted by games, and cheap uncalibrated screens are commonly used for gaming, so the test setup is not unrealistic.

Figure 4.5: Average grades for the tone mapping methods tested, with one bar per method (Reinhard 1, Reinhard 2, Logarithmic, Power mapping, S-Curve, Mean Value) for contrast, brightness, detail and overall impression. The graph has been adjusted by subtracting three from every grade to make zero the ideal grade.


4.2.1 Preferred tone mapping method

For the first test the subjects were asked to run around in the application and grade each method. Each method was graded in several areas: whether the overall brightness felt real, the contrast between bright and dark areas, the detail reproduction in dark and bright areas, and the overall impression. The subjects were also given the opportunity to note any abnormal behaviour or comments. Instructions were given on how to change the tone mapping method and the time of day in the application. The subjects were also instructed to ignore some artefacts that are unrelated to this work and to disregard the fact that the tone mapping method produces the same brightness regardless of time. This is a limitation in the test due to the difficulty of assigning correct key values for several methods over different lighting conditions. Each area was graded 1-5, where 3 is the optimum, 1 too low and 5 too high. Overall impression was graded 1-3, where 3 is the best. The mean square error of the grades compared to the ideal grades is presented in Figure 4.5. It should be noted that more time was spent on tweaking variables for Reinhard's second method and the S-Curve method than for the others.

The S-Curve method got the best overall grade. The reason for this is probably that it has better contrast than most of the other methods. It is however the slowest of the methods tested, see Section 4.3 for performance figures. If speed and simplicity are of primary concern, mean value mapping produces good results while being the cheapest to implement. Logarithmic mapping scored average, both in quality and speed. Reinhard's methods got a decent score, but were the hardest methods to tweak for different illuminations; they may produce better results if real illumination values are used. Power mapping got the worst score and lacks any redeeming qualities.

4.2.2 Tweaking

For the second test the subjects were instructed on how to change most of the parameters used to control the tone mapping, and asked to adjust them to achieve a certain look. The tone mapping method used in this test is the S-Curve method. Descriptions of the parameters were written on the test protocol, see the appendix for the test protocol used; more detailed descriptions of the parameters were given orally. Parameters for bloom were tested but are not included here since the method has been reworked. The user selected values are shown in Table 4.1.

As the key value determines brightness it shows the largest variation from day to night. Parameters for the night effects were not used in the day environments. The white balance and saturation parameters were not very popular; both were close to the values where they have no effect.


Parameter        Clear Day   Overcast Day   Night
Key              -0.07       -0.04          1.36
Slope            1.57        1.3            1.36
Noise Freq       -           -              9
Noise Amount     -           -              0.56
Night Factor     -           -              0.84
Night Blur       -           -              2.2
White Balance    0.26        0.36           0.18
Saturation       1.16        0.96           0.7

Table 4.1: User selected parameter values for different conditions. Key determines the overall brightness, Slope controls the contrast, Noise Freq determines the temporal frequency of the noise, Noise Amount determines how much noise is added, Night Factor controls the blending between the night filtered image and the unfiltered image, Night Blur controls how much the image is blurred, White Balance controls how much white balancing is applied and Saturation controls how much the image is saturated.

4.3 Performance

Due to the difficulty of finding a profiling tool that can measure the execution time of shaders, the performance evaluation is divided into two parts. Shader performance is estimated with Nvidia's tool NVShaderPerf, which calculates the ideal execution time for a shader. Measurements of the frame rate are also done for some different settings. It should be noted that most shaders have not been optimized much.

4.3.1 Theoretical shader performance

Table 4.2 outlines how many cycles the different tone mapping methods and filters use. Nvidia's NVShaderPerf was used with ps_3_0 as the compile target and the GeForce 6 series as the target hardware. The half datatype was used for all variables since the accuracy should be adequate for the task. White balancing was only tested for the S-Curve method. Since most users used a very low amount of white balancing, and it significantly decreases performance as seen in Table 4.2, it might be removed if performance is an issue. If both the S-Curve tone mapping method and the Night Vision Color filter are used, the pixel output of a GeForce 6800 GT would be about 311 megapixels per second (MP/s). A HD display with 1920x1024 resolution at 30 frames per second (FPS) would use 60 MP/s. So for a HD display with a 6800 GT, the tone mapping and night vision filters would consume about 20% of the shading power when running at 30 FPS. The blur filter should use 1 cycle per sample, since with linear sampling one texture sample and one add is done for each sample.
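As a rough sanity check of these figures, assuming the GeForce 6800 GT's 16 pixel pipelines and 350 MHz core clock (an assumption about the hardware, not a value taken from the measurements):

\[
\frac{16 \cdot 350 \cdot 10^{6}\ \text{cycles/s}}{(9 + 9)\ \text{cycles/pixel}} \approx 311\ \text{MP/s},
\qquad
1920 \cdot 1024 \cdot 30 \approx 59\ \text{MP/s},
\qquad
59 / 311 \approx 19\%.
\]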

Method                      Cycles
Reinhard's first            5
Reinhard's second           8
Logarithmic mapping         5
Power mapping               8
S-Curve                     9
Mean Value mapping          4
White balanced S-Curve      24
Night Vision Color Filter   9
Night Vision Blur Filter    1/sample

Table 4.2: Theoretical performance

Filters                                             Frames per second
No filters                                          29
Tone mapping                                        28
Tone mapping and night vision                       27
Tone mapping, night vision and full screen blur     26

Table 4.3: Measured performance

Performance estimates for the blur filter might be inaccurate since latency for texture reads might become an issue.

4.3.2 Measured performance

Performance measurements were done with Microsoft's PIX tool; the call stack of one frame was grabbed and the runtime read from PIX, see Table 4.3. As the values are based on only one frame they are not very reliable, but they still illustrate that the performance impact of the filters is small but noticeable. The performance tests were done on a Pentium 4 with 1 GB RAM and an Nvidia 7600 GT 256 MB graphics card, running at 1024x768.


Chapter 5

Conclusion and Discussion

5.1 Conclusion

The goal of this work was to increase the realism and immersion in real time visualisations by simulating the perceptual effects of human vision. As this is a quite broad goal, two less ambitious goals were defined: to apply tone mapping to the image and to construct a series of filters to simulate night vision.

Several simple tone mapping methods that are suitable for real time use were implemented. A user test was used to grade the methods. The S-Curve method was determined to give the best overall result. Because of the limited dynamic range of the source image the effect of tone mapping is a bit subtle; the largest benefit is seen when very bright objects are introduced or when the overall illumination is higher or lower than normal.

Bloom presented some difficulty since it is directly caused by the very large illumination difference between an illuminant and the rest of the image. A blur method using function approximation was implemented to approximate the realistic glare function presented by Spencer [23]. As realistic illuminance values are not available this method proved unsuitable. Instead a brightpass filter and a Gaussian blur function are used. While this method was completed after the user tests, the general response has been positive.

Filters were constructed to simulate the loss of acuity and color vision in dark environments. The sensitivity shift towards shorter wavelengths at night was also taken into account, and some Gaussian noise was added. The user tests indicate that the night vision filters do increase the immersion in the game.

5.2 Discussion

The implementation and tests of these methods illustrated the importance of the input data for the choice of tone mapping method. The test scene had limited dynamic range; if different illumination were used, other methods might produce better results. There seems to be a balance between predictability and contrast. Methods that are guaranteed to map the image to [0,1] often have low contrast and a washed-out look; Reinhard's first, logarithmic and power mapping are examples of such methods. Other methods, like Reinhard's second and mean value mapping, have good contrast but sometimes produce unpredictable results if the illumination is higher or lower than normal. S-Curve mapping seems to be a balance between the two types.

While the desaturation effect of the night filter is hard to notice due to the darkening of the image, the Gaussian blur that simulates acuity reduction and the noise could be a hindrance if a user has a task to do, as is common in games. Such filters should therefore be applied carefully so that they are not regarded as an artificial obstacle. My idea of using tone mapping to increase the brightness of dark images and using blur and desaturation to provide the appearance of night might not be appreciated by everyone. A complicating factor might be ignorance of how human vision behaves in low light situations. A larger user test in a real game with a more representative sample might be needed before this is used in a commercial game.

Most of the filters used here are quite simple. The largest difficulty has been determining how strongly the filters should be applied given the available game data. The general approach has been to use the illuminance from the sun in a function that determines how the filters should be applied. This works to an extent, but would not suffice if an image deviates from the regular illumination. A photometric rendering approach would alleviate this problem, as more information could be gathered from the image itself.

5.3 Future Work

Histogram based tone mapping methods such as Ward's [15], and local methods such as Reinhard's [19], have been evaluated to produce very good results [1]. As GPUs increase in performance such methods should soon be possible to implement in a game. Adaptation could be improved by the use of two sided functions, where an increase in adaptation level takes less time than a decrease. Separate functions for cones and rods could also be used to produce different adaptation speeds in bright and dark environments. The luminance history function from [7] could also be used to smooth the adaptation response.

Glare could be improved if the ciliary corona and lenticular halo could be added, perhaps by simply transforming the light positions to screen space and rendering images of the effects. Spatial effects such as depth of field could be simulated by using a blur function that scales the amount of blur per pixel.

References
