
Department of Science and Technology (Institutionen för teknik och naturvetenskap)
Linköpings Universitet

Examensarbete (Degree Thesis)

LITH-ITN-MT-EX--07/019--SE

Building a Pipeline for Gathering and Rendering With Spatially Variant Incident Illumination Using Real Time Video Light Probes

Nils Högberg

Per Larsson


LITH-ITN-MT-EX--07/019--SE

Building a Pipeline for Gathering and Rendering With Spatially Variant Incident Illumination Using Real Time Video Light Probes

Thesis work carried out in Media Technology at Linköpings Tekniska Högskola, Campus Norrköping

Nils Högberg

Per Larsson

Supervisor: Jonas Unger

Supervisor: Stefan Gustavsson

Report category: Examensarbete
Language: English
Date: 2007-03-23
ISRN: LITH-ITN-MT-EX--07/019--SE
Division, Department: Department of Science and Technology (Institutionen för teknik och naturvetenskap)
Title: Building a Pipeline for Gathering and Rendering With Spatially Variant Incident Illumination Using Real Time Video Light Probes
Authors: Nils Högberg, Per Larsson



Copyright

The publishers will keep this document online on the Internet - or its possible

replacement - for a considerable time from the date of publication barring

exceptional circumstances.

The online availability of the document implies a permanent permission for

anyone to read, to download, to print out single copies for your own use and to

use it unchanged for any non-commercial research and educational purpose.

Subsequent transfers of copyright cannot revoke this permission. All other uses

of the document are conditional on the consent of the copyright owner. The

publisher has taken technical and administrative measures to assure authenticity,

security and accessibility.

According to intellectual property law the author has the right to be

mentioned when his/her work is accessed as described above and to be protected

against infringement.

For additional information about the Linköping University Electronic Press

and its procedures for publication and for assurance of document integrity,

please refer to its WWW home page:

http://www.ep.liu.se/


Abstract

Lighting plays an important part in computer graphics. When making photorealistic renderings, the ultimate goal is to generate an image that would be indistinguishable from a real photograph, or to seamlessly integrate a synthetic object into a photo. One of the key elements is to get correct lighting and shading in the rendering, since human vision is very well attuned to subtle variations in lighting.

Recently, image based techniques have been developed that use high dynamic range omnidirectional images of the real world, light probes, as lighting information to illuminate synthetic objects. Such an image captures the incident light at the point where it was taken. By using these images as lighting information, the illuminated objects will integrate seamlessly into a background photograph of the scene. However, these techniques assume that the lighting is spatially constant throughout the scene.

By using more light probes, spatial variations in the scene can be captured. We will here present a pipeline for capturing and rendering with spatially variant light probes, using a device that can capture light probes at a very high dynamic range. Using this pipeline we have captured high frequency variations in a scene, as well as scenes with complex real world lighting, and used this information to render objects representing these variations.


Contents

1 Introduction
  1.1 Short Introduction to Lighting And Visual Effects of Today
  1.2 The Real Time Light Probe
  1.3 Creating a Pipeline for Visual Effects
    1.3.1 Capturing Light Information
    1.3.2 Data Processing
    1.3.3 Rendering
2 Background
  2.1 Radiance and Irradiance
  2.2 The Plenoptic Function
  2.3 High Dynamic Range Imaging
  2.4 HDR file formats
  2.5 Capturing HDR Images
  2.6 Light Probes
  2.7 Image Based Lighting
3 HDR Camera Capture Device
  3.1 Hardware
  3.2 Communications
  3.3 Capture Algorithm
  3.4 HDR Assembly
4 Pilot Study
  4.1 Current Setup
  4.2 Production chain
    4.2.1 Capturing
    4.2.2 Tracking
    4.2.3 Rendering
  4.3 Conclusions of the pilot study and the new pipeline
5 Improving the Pipeline
  5.1 Improving the Hardware
    5.1.1 Construction
  5.2 Improving the Software
    5.2.1 Implementation
  5.3 Improving the rendering
6 Results
  6.1 Conclusions
  6.2 Future Work
Appendices
  A Steradian
  B Environment SL-shader

List of Figures

1.1 The improved Real Time Light Probe.
1.2 The developed pipeline.
1.3 Rendering of a scene with spatially varying lighting. For this image around 600 light probes were captured using the Real Time Light Probe.
2.1 Radiant exitance, flux leaving an infinitesimal area dA in any direction.
2.2 Radiance, light incident at an infinitesimal area dA.
2.3 The plenoptic function, from Adelson and Bergen [7]. Two eyes gathering light rays. Though they cannot see the light coming in from behind, the plenoptic function describes the information available to any observer at any point in space and time.
2.4 Camera curve from a Fujifilm FinePix S1Pro digital SLR camera.
2.5 Weighting function used to weight the pixel values when assembling HDR images. Removes saturated pixels and small values.
2.6 A reflective sphere reflects the entire environment except for a small area obscured by the sphere.
2.7 Left: A ray hits the environment and a sample is taken from the light probe image. Center: A ray hits a specular object and is reflected about the normal direction. Right: A ray hits a diffuse object and an irradiance value is calculated.
3.1 The Ranger C55 architecture.
3.2 The multiplexer choosing which exposure to send.
4.1 Left: The three cameras mounted around the beam splitter. Right: The existing Real Time Light Probe.
4.2 The pipeline developed during the pilot study.
4.3 Rendering of a sphere illuminated by one light probe for each
5.1 The new camera setup.
5.2 Rendered images from the simulations of warping the red and blue color with an angle of six degrees. The renderings are made with three different cameras and a ray traced sphere. Left: All colors seen from the center camera. Center: Red and blue channel from an angle of six degrees. Right: The images after warping.
5.3 Field of view
5.4 Screenshot of the final version of the capturing software.
5.5 A large object rendered with spatially varying data.
5.6 A scene illuminated by data from a real world environment; this is one frame from the animation with varying light.
5.7 To put all rays in a common frame of reference a single-viewpoint reprojection is made. Each ray of incidence is reprojected from the sphere to its intersection with the optical axis (Z).
5.8 Left: Shows the principle for the ray projection. Each ray to be sampled for illumination is projected to the z axis to look up the correct light sample. Right: Shows the principle for sampling the nearest probe; all samples will be looked up in the same probe.
5.9 Upper left: The synthetic scene used to simulate light data with spatially varying light. Upper right: Rendering with the data with the nearest probe method. Center: Rendering with the same data but with the ray projection method.
6.1 Left: A reference photo of the scene. Right: A rendering made with 700 light probes.
6.2 Upper: Rendering made with 700 light probes. Lower: Rendering with only one light probe.
A.1 The solid angle can be seen as the three dimensional version of

Chapter 1

Introduction

Lighting has always been an essential part of computer graphics, especially when making photorealistic renderings¹. Heavy physically based lighting simulations are carried out in almost every rendering software today. When creating visual effects for feature film, a very common problem is the integration of computer generated three dimensional objects with filmed material. The added computer generated object must appear to the audience as if it was already there when the scene was shot. Since human visual perception is very well adjusted to detecting subtle variations in lighting and shading, it is of the utmost importance that the illumination of the object to be integrated with the background material is correct. This creates significant problems for the computer graphics artist, who has to match the illumination of the object to the background.

Recent research has developed techniques, referred to as image based lighting (IBL), that use high dynamic range omnidirectional images of real environments as lighting information to illuminate synthetic objects. Such an omnidirectional image is called a light probe; it contains information about the real world illumination incident at the point where it was captured. Using such techniques to illuminate synthetic objects placed at the same point will make the synthetic objects integrate seamlessly with filmed material. Clearly this is not enough when objects are moving around in the scene, since the light probe only holds information about the lighting conditions where it was captured. There is no spatial information in a light probe image, which is why the process of integrating computer generated objects is still a very labour intensive task. Additional light sources have to be placed manually by the artist to achieve a seamless integration, though the light probe image gives the artist a very good head start.

¹Rendering is the process of producing an image from a mathematical description of a scene.


In this thesis we will describe a pipeline that we have developed for gathering and rendering with spatially varying light probes. During our project we have had access to a prototype camera device developed by Unger et al. in [2], [5] and [8] that can capture images at a very high dynamic range and at 25 frames per second, referred to as the Real Time Light Probe. Starting from this existing prototype for gathering high dynamic range light probes, we have evaluated the existing system and rebuilt it so that it can be used in research and for visual effects production. At the start of our project the device was in a development state; most of the implementations, like the interface and frame processing, were under development. Our task was to build a solid pipeline, i.e. a production chain, for this device. This includes improving the hardware, writing new software for capturing data and writing software for processing the captured data so that it can be used for rendering with image based lighting techniques. We started the project by conducting a pilot study with the device in its current state and, based on that, decided what improvements had to be made and what software had to be written.

1.1 Short Introduction to Lighting And Visual Effects of Today

Today almost every large film production uses computer graphics; many recent feature films have added realistic computer generated imagery (CGI) with extremely convincing results. Constructing the correct lighting and shadows is essential for the realistic look; however, this is extremely hard to achieve.

One of the first feature films to use realistic visual effects was Flight of the Navigator from 1986, where a shiny space ship with reflections from real environments was integrated into the film. A perhaps more well-known early example is the T1000 robot in Terminator 2 from 1991, which uses the same image based reflection technique. More recently there are numerous examples; one of the first films to use HDR light probes and image based lighting was X-Men in 2000. Today there are extremely high expectations and demands on the realism of visual effects, especially when mixing live action and computer graphics. New techniques that can make the process easier and more closely resemble real world lighting are much coveted.


Figure 1.1: The improved Real Time Light Probe.

1.2 The Real Time Light Probe

During this thesis we have used and improved a high dynamic range light probe capture device; we will refer to this device as the Real Time Light Probe or simply the RTLP. It consists of the camera system and a reflective sphere. The Real Time Light Probe is a unique device, built on three reprogrammed industrial high speed cameras. It can capture 25 light probes per second at an extreme dynamic range. A common digital camera has a dynamic range of around 1,000:1, whereas the Real Time Light Probe has a dynamic range of around 10,000,000:1. It outputs an enormous amount of data, but within the limit of what a high end computer can cope with. Using this device it is possible to collect huge amounts of light information in a scene in a short time, which opens up a new world of possibilities in the field of image based lighting and visual effects.

1.3 Creating a Pipeline for Visual Effects

A pipeline is a chain of production; in this context the pipeline includes all the various steps that have to be taken for rendering with spatially varying light probes, e.g. gathering light information, processing the captured data, and then using it to illuminate synthetic objects. As the reader will probably notice, the steps in the pipeline are a natural way of solving these problems. However, the focus of this thesis is essentially not the individual steps taken, but getting all the steps to work and achieving a smooth workflow through the pipeline.

[Figure 1.2 flowchart nodes: Capture (white balance and calibration, backplate photo/film, capture HDR light probe data); Preprocessing (developing raw HDR data, tracking, solve tracking data, light probe frames, alignment and warping); Rendering (3D objects and materials, light probes, Renderman light field plugin, rendering, compositing).]

Figure 1.2: The developed pipeline.

The pipeline has been built with the primary aim of reducing the time spent in the capturing and processing steps, so that more work can be spent on research, developing and testing new ideas in the area of Image Based Lighting. Consideration has also been taken so that our pipeline could fit into an existing production pipeline at a visual effects company.

In this chapter we give a brief, basic overview of the pipeline developed. As mentioned, the technique of using traditional IBL methods has disadvantages when dealing with scenes that have complex lighting, which is often the case in real world scenes. In traditional IBL techniques it is assumed that the lighting is spatially constant. Another limitation is time: today a typical light probe image takes several minutes in total to capture and assemble. This makes it impractical to capture more than one or at most a few probes on one set. The purpose of constructing this pipeline is to reduce these limitations in light probe capturing and image based lighting. The pipeline, as shown in figure 1.2, consists of three basic steps: capturing, processing, and rendering.

1.3.1 Capturing Light Information

In this step all data needed for a rendering is captured. This includes light information, tracking data and background photography. Using the improved Real Time Light Probe device we can capture light information along a path where a computer generated object, either moving or static, is going to be integrated into the background photography.


When collecting incident light information along this path we need to have some knowledge of where on the path and at what time in the sequence the light probe has been, i.e. a three dimensional coordinate describing a position is associated with every light probe captured. This is called tracking. There are many ways to track objects in a scene, among the most common being motion capture. By using two or more cameras and tracking feature points through a sequence, it is possible to calculate the 3D positions of the tracked points by triangulation. In the pipeline proposed here, motion capture is used by filming the Real Time Light Probe with two regular DV video cameras and using a tracking software package. To make the tracking easier we have restricted the movements of the Real Time Light Probe to three degrees of freedom, i.e. x, y and z translations, by attaching it to a translation gantry. However, there should be no problem using a free hand movement. The Real Time Light Probe is operated through a PC using custom software. The raw data is streamed directly to the hard drive during capturing, since a computer of today's standards will not be able to process the data in real time during capturing. The background material is the imagery that the computer created objects are going to be integrated into. This can be filmed material using a moving film camera or a static photograph. It is also captured in this step, either before or after gathering the light probes. Since the rendered frames are going to be integrated into the background material, white balance and color correction images are captured with both the HDR camera and the background camera.
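The triangulation at the heart of the tracking step can be illustrated with a small sketch. Assuming the two tracking cameras have been calibrated so that their 3x4 projection matrices are known, a linear (DLT) triangulation recovers the 3D position of the probe from its tracked 2D positions in both views. This is only an illustration of the principle; in the pipeline the actual solving is done by the tracking software package.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2 : 3x4 projection matrices of the two tracking cameras
    x1, x2 : (u, v) image coordinates of the tracked probe in each view
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The homogeneous solution is the right singular vector with the
    # smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]   # dehomogenize to (x, y, z)
```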

1.3.2 Data Processing

When all material has been captured during the capturing phase, the captured raw HDR data has to be processed and the video from the tracking cameras has to be tracked. Since the data from the HDR camera is written in raw format to the computer, it has to be developed, i.e. converted from raw data to final HDR images. The Real Time Light Probe consists of a three camera setup, one for each color, mounted on a plate pointing at a chromed steel sphere, see figure 1.1. The three cameras are mounted side by side, which means that there is a slight angle between the camera in the middle and the two on the sides. Therefore we have to warp the red and blue channel images, taken from an angle, to get a correct image. Operations like developing and warping the data are done within our software, which also handles various other operations like cropping and white balance. The data can then be written out to disk as frames in different HDR file formats.


Figure 1.3: Rendering of a scene with spatially varying lighting. For this image around 600 light probes were captured using the Real Time Light Probe.

To be able to track the position of the light probe device we use a tracking software package. Within this program we can track the Real Time Light Probe from the two DV cameras and let the program calculate a 3D position for every frame. The tracking data can then be exported to a program we have written, which analyzes the data and rewrites it to a format used by the rendering software.

1.3.3 Rendering

The final step in the pipeline is rendering. To be able to make a rendering, a scene or 3D object is needed. The object can be anything from a sphere to a whole car; 3D objects are created in a modeling program. When the modeling is finished it is time to render the final image. Rendering is very time consuming, especially when there is more than one image to render. To be able to render an animation in a reasonable time, multiple computers have to be used. With multiple computers it is possible to render one frame on each computer, or to divide one image and send the parts to different computers for rendering. Such a system is called a render farm. We constructed and configured a render farm using all the computers we had; the artist machine contacts a server that distributes the rendering job over the available computers.


Rendering with varying light probes is a new concept and it is not yet supported by any commercial renderer. To be able to make the renderings a plug-in had to be used.

The plug-in uses the developed HDR data and the tracking data to connect the corresponding HDR image with a position. It is important that the scene has the same proportions as the captured light; the tracking position from the capturing must have a corresponding position in the scene. Placing the 3D object in the middle of the scene should give the impression that it was placed in the middle of the captured light. All this is taken care of in the plug-in. In figure 1.3 we have rendered an image using the plug-in; around 600 light probes were captured with the Real Time Light Probe for this rendering.
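As an illustration of the kind of lookup the plug-in performs, the simplest variant, sampling the nearest probe (cf. figure 5.8), could look like the sketch below. The names probe_positions and probe_images are illustrative: the positions would come from the solved tracking data, expressed in the same coordinate system and scale as the rendered scene, and the images from the developed HDR frames.

```python
import numpy as np

def nearest_probe(point, probe_positions):
    """Index of the light probe captured closest to a 3D shading point.

    probe_positions : (N, 3) array of tracked probe positions in scene space
    """
    d2 = np.sum((probe_positions - np.asarray(point)) ** 2, axis=1)
    return int(np.argmin(d2))

# probe_images[nearest_probe(shading_point, probe_positions)] would then be
# used as the environment map when shading that point.
```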


Chapter 2

Background

2.1 Radiance and Irradiance

Radiometry is the science of measuring light; taking a light probe is a measurement of the incident light at that point. Here we consider light as photons traveling in a straight line until they hit a surface, where they are either reflected or absorbed. Irradiance can be seen as the incident energy from all possible directions on a small area at a certain time. Radiance is a measure of energy that is incident, exitant or passing through a small area over a certain time in a certain direction. It is measured in watts per square meter per steradian [W/(m²·sr)] (see Appendix A for an explanation of steradian).

This can be compared to what is happening in a camera. The shutter lets in light that is focused and limited in direction by the lens for a short amount of time. The light hits the sensor and a voltage proportional to the irradiance is measured at every pixel. Hence light is measured over a small area, over a certain time, in a certain direction at every pixel in the camera. As long as the pixels are not under or over exposed we get a correct radiance measurement. We will go into further detail about capturing radiance maps using regular digital cameras later. Radiance is fundamental in Image Based Lighting and Global Illumination and is therefore derived here. A good description of radiance can be found in [1], pages 19-24. The energy of a photon with a certain wavelength λ is described by

$e_\lambda = \frac{hc}{\lambda}$

where h is Planck's constant ($h = 6.63 \cdot 10^{-34}$ Js) and c is the speed of light ($c_0 = 299.8 \cdot 10^6$ m/s). With n photons of wavelength λ we get the spectral radiant energy $Q_\lambda = n \, e_\lambda$.


Figure 2.1: Radiant exitance, flux leaving an infinitesimal area dA in any direction.

If we integrate the spectral radiant energy over all wavelengths we get the radiant energy $Q_e$ for any number of photons. Radiant energy is measured in joules [J].

$Q_e = \int_0^{\infty} Q_\lambda \, d\lambda$

Because light is travelling we can measure the flow of the radiant energy over time. This gives us the radiant flux, denoted $P_e$, which is measured in [J/s] or watts [W].

$P_e = \frac{dQ_e}{dt}$

If we then measure the radiant flux over an infinitesimal area we get the radiant flux density. This can be measured in two ways. Either we measure the flux incident from all possible directions at a point on a surface and get the irradiance $E_e$, or we measure the flux leaving a point on a surface in all possible directions and get the radiant exitance $M_e$ (see Figure 2.1), also called radiosity, described by

$M_e = \frac{dP_e}{dA_e}$

If we instead measure the light emitted from an infinitesimal area on a light source in a certain direction, we get the radiant intensity $I_e$, which is measured in watts per steradian [W/sr].

$I_e = \frac{dP_e}{d\omega}$


Figure 2.2: Radiance, light incident at an infinitesimal area dA.

This leads us to the point where we can define the radiance $L_e$ as radiant flux per unit differential area per steradian, which basically means all light arriving at a specific point from a certain direction (see Figure 2.2).

$L_e = \frac{d^2 P_e}{dA \cos(\theta) \, d\omega}$

By integrating the radiance over all directions from a point x we get all light incident at that point. This implies that if we can capture the radiance L for a large enough number of directions around a point x, we can calculate the irradiance at that point. Hence we can recreate the lighting incident at that point and use it as a light source to illuminate computer generated objects.

2.2 The Plenoptic Function

When capturing images it is suitable to see it as a sampling process, where the sampled function represents the radiance as a function of several variables. This function is often referred to as the plenoptic function. The plenoptic function was introduced by Adelson and Bergen [7] as a seven dimensional function that describes how illumination varies in space. It is defined as $P = P(\theta, \phi, \lambda, t, V_x, V_y, V_z)$, where P is the radiance arriving at a point $(V_x, V_y, V_z)$ in direction $(\theta, \phi)$ at time t with wavelength λ. It is a function describing the light at any point, in any direction, at any wavelength, at any time. Thus it is a representation of every possible image at every point in space, taken in any direction at any instant of time for any wavelength. Even though this function is much too general to handle in practice, it describes the information available, see figure 2.3. A camera system can be considered as a sampling function of a subset of the plenoptic function.


Figure 2.3: The plenoptic function, from Adelson and Bergen [7]. Two eyes gathering light rays. Though they can not see the light coming in from behind, the plenoptic function describes the information available to any observer at any point in space and time.

We can reduce this function by fixing time, i.e. integrating over the shutter time, and limiting the wavelength to the spectrum that a digital camera can capture. In a digital camera the dependency on λ is sampled by using three color filters, projecting a continuous spectral distribution onto three discrete RGB intensity values. Hence the plenoptic function is reduced to a function of only five dimensions, $P(\theta, \phi, V_x, V_y, V_z)$, which is a representation of any possible image at every point in any direction or, more pertinent in our case, a representation of every omnidirectional image at any point in space. With the Real Time Light Probe used in this thesis it is possible to capture a 6D version of the plenoptic function, where space and time can be varied. The 6D version of the plenoptic function is described by $P(\theta, \phi, V_x(t), V_y(t), V_z(t), t)$.

2.3 High Dynamic Range Imaging

An image captured with a digital camera or scanned from a photograph is an array of brightness values corresponding to the light that the sensor received. When photographing strong light and deep shadows, e.g. taking a photograph indoors when the sun is shining through a window, the dynamic range can easily reach five orders of magnitude or more. A normal consumer level digital camera often uses a 10-bit sensor, which means that it can only capture a dynamic range of around 1000:1. This is why cameras have a shutter and an aperture, so that the light coming in through the camera can be limited to avoid saturation. After an image has been captured, a non-linear mapping is applied, compressing the dynamic range in the image, when the image is saved in a normal image format like for example JPG.


Since normal image hardcopy and display devices only have a useful range of about 100:1, it has historically been convenient to use a representation of only 256 levels of intensity per color, i.e. only 8 bits per color, to store the image. With the interest and research in image based rendering techniques, the need to represent a higher dynamic range in images evolved; in 1992 Ward [13] proposed an image format that could represent a dynamic range of $10^{76}$:1 using 32 bits per pixel. Since then a number of other image formats have been developed that can represent a higher dynamic range, of which we will discuss the most important. Such formats are called High Dynamic Range (HDR) image formats. When working with image based lighting techniques we want true measurements of the radiance values in real environments, and HDR formats are needed to make this possible.

2.4 HDR file formats

The regular file formats that are used today, like JPG and PNG, can only represent 8 bits per color channel. Therefore a number of file formats have been developed that can store up to 32 bits per color channel instead. The OpenEXR [20] format developed by ILM is probably the most common in the visual effects industry. It is open source, uses the 16-bit half data type and has about 10.7 orders of magnitude. It also has extra channels for storing other information such as alpha and depth values. The Radiance, RGBE, format by Greg Ward is another commonly used format. It uses 32 bits per pixel, has a dynamic range of 76 orders of magnitude and uses run-length encoding. Other formats are the SGI LogLuv and the IEEE floating point TIFF [1]. The LogLuv encoding is a perceptually based encoding with a logarithmic scale and is implemented as a part of the TIFF library. LogLuv can use either 24 or 32 bits per pixel and has a dynamic range of 4.8 or 38 orders of magnitude respectively. The floating point TIFF, also part of the TIFF library, does not use any encoding. It has a dynamic range of 79 orders of magnitude and uses 96 bits per pixel. The formats used in this thesis are OpenEXR (.exr), Radiance (.hdr) and the Portable Float Map (.pfm), which basically is a raw format.
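Since the Portable Float Map is essentially a raw dump of 32-bit floats with a three-line ASCII header, writing one takes only a few lines. The sketch below is a minimal writer; the bottom-to-top row order and the negative scale value signalling little-endian data follow the common PFM convention.

```python
import numpy as np

def write_pfm(path, image):
    """Write a float32 RGB image as a Portable Float Map (.pfm)."""
    data = np.asarray(image, dtype=np.float32)
    assert data.ndim == 3 and data.shape[2] == 3, "expected an RGB image"
    height, width, _ = data.shape
    with open(path, "wb") as f:
        f.write(b"PF\n")                                  # color PFM
        f.write(f"{width} {height}\n".encode("ascii"))
        f.write(b"-1.0\n")                                # negative scale = little-endian
        f.write(np.flipud(data).astype("<f4").tobytes())  # rows stored bottom to top

# Example: a tiny synthetic HDR gradient
probe = np.linspace(0.0, 1.0e4, 64 * 64 * 3, dtype=np.float32).reshape(64, 64, 3)
write_pfm("probe_0001.pfm", probe)
```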

2.5 Capturing HDR Images

There are mainly two methods for capturing HDR images: either direct capture or assembly of a series of low dynamic range images.


Figure 2.4: Camera curve from a Fujifilm FinePix S1Pro digital SLR camera.

There exist a few professional capture devices that can capture a much higher dynamic range than ordinary cameras. The SpheronVR [21] has eight orders of magnitude and up to 13,000x5,300 pixels. It has a line scan CCD sensor that is rotated to capture a panorama, though one image has a capture time of about 15-30 minutes. The Ladybug [22] and the SMaL [23] cameras both have four orders of magnitude. The Ladybug uses six sensors and can capture 75% of the full sphere at 15-30 frames per second (fps) with a resolution of 3,600x1,500 pixels. The SMaL is a low resolution sensor capable of 60 fps at 482x642 pixels. These cameras are extremely expensive, except for the SMaL, and also have limitations in either dynamic range or resolution, or the capture time is very long. We will later describe the newly developed HDR camera that we have used, which has a dynamic range of 10,000,000:1 at 25 frames per second. By far the most common way to capture HDR images is still to use a digital single-lens reflex (SLR) camera. In 1997 Debevec and Malik [3] presented a technique for recovering HDR images from a set of low dynamic range images with different exposure times captured with the same imaging device. We will here only discuss digital cameras since they are nowadays the most used. When photographing a scene, multiple photographs with different exposure times are captured. The range of exposures differs from scene to scene, but should be chosen so that there is no saturation in the shortest exposure and so that the longest exposure can describe the darkest parts of the scene. The images can then be assembled into one HDR image. When a photograph is captured with a charge-coupled device (CCD) digital camera, the pixels get charged according to the amount of incident radiance hitting the sensor.


Figure 2.5: Weighting function used to weight the pixel values when assembling HDR images. Removes saturated pixels and small values.

This is often proportional to the irradiance, but in almost every camera a non-linear mapping is applied before the image is stored. This is done to prolong the dynamic range and make the images more visually pleasing. The mapping is called the camera response function $f(X)$, see figure 2.4. The non-linear mapping actually consists of a number of non-linear mappings and is therefore hard to know beforehand, though there are ways to derive it. Debevec et al. [3] proposed a technique to find the camera response function $f(X)$, which is also implemented in HDRShop [24]. The resulting pixel colors $Y_{ij}$ in the image can be expressed as

$Y_{ij} = f(X_{ij}) = f(E_i \Delta t_j)$

where an exposure $X_{ij}$ is defined as the irradiance $E_i$ at the sensor multiplied by the exposure time $\Delta t_j$, where j denotes the exposure and i is the pixel location:

$X_{ij} = E_i \Delta t_j$

It is assumed that the camera response function is monotonic and can therefore be inverted, which means that $f^{-1}$ exists. If the camera curve is known, the images can be calibrated and the irradiance can be found:

$f^{-1}(Y_{ij}) = E_i \Delta t_j$

Since low pixel values are sensitive to noise and many cameras saturate at a lower value than the maximum, a weighting function is introduced, see figure 2.5. To avoid banding there is a ramp around the reliable values in the weighting function. This gives the following expression for the weighted irradiance at every pixel i:

$E_i^{\mathrm{weighted}} = g(Y_{ij}) \frac{f^{-1}(Y_{ij})}{\Delta t_j}$


Figure 2.6: A reflective sphere reflects the entire environment except for a small area obscured by the sphere.

where $g(Y_{ij})$ is our weighting function. The final HDR image can then be assembled by using the weighted mean of the irradiance values from N exposures:

$Z_i = \frac{\sum_{j=0}^{N} g(Y_{ij}) \frac{f^{-1}(Y_{ij})}{\Delta t_j}}{\sum_{j=0}^{N} g(Y_{ij})}$

where $Z_i$ is the weighted mean output pixel in the assembled HDR image.
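A minimal sketch of this weighted-mean assembly, assuming the input frames are registered, normalized to [0, 1] and come from a linearized camera (so the inverse response defaults to the identity), could look as follows. The trapezoidal weighting function is an illustrative stand-in for the g(Y) discussed above.

```python
import numpy as np

def weight(y, low=0.05, high=0.92):
    """Trapezoidal weighting: down-weights noisy dark pixels and saturated ones."""
    w = np.ones_like(y)
    w = np.where(y < low, y / low, w)
    w = np.where(y > high, np.clip((1.0 - y) / (1.0 - high), 0.0, 1.0), w)
    return w

def assemble_hdr(exposures, exposure_times, inverse_response=lambda y: y):
    """Weighted-mean HDR assembly of N registered low dynamic range frames."""
    num = np.zeros_like(exposures[0])
    den = np.zeros_like(exposures[0])
    for y, dt in zip(exposures, exposure_times):
        g = weight(y)
        num += g * inverse_response(y) / dt   # g(Y) * f^-1(Y) / delta t
        den += g
    return num / np.maximum(den, 1e-8)        # Z_i, relative radiance per pixel
```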

2.6 Light Probes

The first person to use environment images as reflection maps was Blinn [10], who painted the images he used for reflection mapping using computer software. This was then taken further by Miller [11], who used panoramic photographs as reflection and illumination maps. He used a Christmas tree ornament to capture an omnidirectional image of the environment that he could then remap onto the object to show how the object would look in the real world. Omnidirectional images photographed to capture incident illumination are most often referred to as light probes. The most common technique is to use a highly reflective sphere of some sort; chromed steel ball bearings are very good for this purpose. This technique was further developed when Debevec et al. [4] used omnidirectional high dynamic range images of reflective spheres for rendering synthetic objects into real scenes using global illumination techniques. By capturing HDR light probes in the real world and then using them as light sources in a global illumination context, they achieved almost photorealistic results. The approach of using reflective spheres and HDR imaging for capturing light probes is widely used in the computer graphics industry today, in areas like visual effects, gaming and architectural visualization. When photographing a mirrored sphere, the image will have a view of almost 360 degrees; there is only a small area behind the sphere that is not captured.


Figure 2.7: Left: A ray hits the environment and a sample is taken from the light probe image. Center: A ray hits a specular object and is reflected about the normal direction. Right: A ray hits a diffuse object and an irradiance value is calculated.

This is because in a (perfect) mirror the angle $\alpha_i$ between an incident ray i and the normal n is the same as the angle $\alpha_r$ between the reflected ray r and the normal, i.e. the angle of incidence γ satisfies

$\gamma = \alpha_i = \alpha_r$

In a mirrored sphere, as seen from the camera, the normal at the circumference of the sphere is orthogonal to the viewing direction. The angle of incidence will then be 90 degrees, which means that the incident light seen from the camera is coming from behind the sphere, see figure 2.6. The resolution of a light probe is only determined by the resolution of the camera.
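The mapping from a pixel of the mirrored-sphere image to the direction in which the environment is seen can be sketched as below. The sketch assumes an orthographic view of a unit sphere that exactly fills a square image, which is a simplification of the real geometry but captures the behaviour described above: the image centre looks back past the camera, while pixels at the circumference see the environment directly behind the sphere.

```python
import numpy as np

def probe_pixel_to_direction(px, py, size):
    """World direction sampled by pixel (px, py) of a size x size
    mirrored-sphere light probe image, or None outside the silhouette."""
    u = 2.0 * (px + 0.5) / size - 1.0          # normalized sphere coords in [-1, 1]
    v = 1.0 - 2.0 * (py + 0.5) / size
    r2 = u * u + v * v
    if r2 > 1.0:
        return None
    n = np.array([u, v, np.sqrt(1.0 - r2)])    # surface normal on the unit sphere
    d = np.array([0.0, 0.0, -1.0])             # orthographic viewing ray, towards -Z
    return d - 2.0 * np.dot(d, n) * n          # mirror reflection about the normal
```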

2.7 Image Based Lighting

The process of using HDR images as sources of illumination for computer generated objects is called Image Based Lighting (IBL). Since a light probe is an accurate radiance measurement, it contains information about the incident light at that point, i.e. the shape, color and intensity of direct light sources as well as the color and distribution of the indirect light in the scene. The key technology for image based lighting is global illumination. Global illumination algorithms are physically based simulations of how light distributes in a scene, i.e. they simulate interreflections of light between surfaces. In other words, global illumination is used to calculate how light from light sources interacts with synthetic objects and to describe the light that bounces one or multiple times diffusely off surfaces in the scene, i.e. indirect illumination. This makes it possible to simulate effects like color bleeding, e.g. a red paper on a white wall will make the area of the wall close to the paper appear red.


The key technique in global illumination is ray tracing, which essentially is about tracing light rays in the scene. There are three basic cases when calculating the illumination of an object using image based lighting, see figure 2.7. Either the ray R does not hit the object and instead hits the environment. In this case there are two options: if the environment should be rendered as a background to the object, a sample from the light probe is taken; otherwise the pixel value is set to black. If, on the other hand, the ray hits the object, the result depends on the surface properties of the object, where specular and diffuse surfaces are the two basic cases. Should the ray hit a mirror-like specular surface, the ray is reflected as $R'$ about the surface normal $\hat{N}_P$ at the point P until it either hits another object or the environment; the pixel value is then recursively calculated to an incident light value $L'$. The number of recursions is often a user defined parameter. The specular component of the resulting light L reflected back to the camera, i.e. the pixel value, is calculated from the value of the reflected ray $L'$ and the properties of the specular material. E.g. for a metallic material, $L'$ is multiplied by the color of the material.

For a glass-like material the specular component is fainter and the incident light $L'$ is multiplied by a value depending on the index of refraction. A second ray $R''$ is traced through the translucent surface, and the light $L''$ arriving from this ray is added to the total light L reflected to the camera. In the case when a diffuse surface is hit, the diffuse surface reflects the ray R equally in all directions. Thus the total amount of light incident at the point P is the irradiance $E_e$ at that point. The irradiance is a weighted integral of all color values $L'_i$ arriving along the reflected rays $R'_i$ and is calculated by

$E(P, \hat{N}_P) = \int_{\Omega} L'(P, \omega) \cos(\theta) \, d\omega$

where $L'(P, \omega)$ is a function representing the incident light arriving at P from the angular direction ω in the upper hemisphere Ω. Thus the contribution of every reflected ray $R'_i$ is weighted by the cosine of the angle between itself and the surface normal $\hat{N}_P$ at the point P, such that light incident from oblique angles contributes less light. Since E depends on how the light arriving at the point P is occluded and reflected by other surfaces, it cannot be computed analytically. Instead a number of rays $R'_i$ are sent out from P to sample the irradiance, and a weighted average of the resulting values $L'_i$ from the rays is taken to estimate E using

$E(P, \hat{N}_P) = \frac{1}{k} \sum_{i=0}^{k-1} L'_i \cos(\theta_i)$


The final pixel value is computed by multiplying the irradiance value E with the surface’s diffuse color.
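The estimator above can be sketched as a few lines of Monte Carlo sampling. The function env_radiance below is an assumed lookup that returns the light probe radiance in a given direction, and occlusion by other geometry is ignored, so this is the unshadowed irradiance only.

```python
import numpy as np

def estimate_irradiance(normal, env_radiance, k=256, rng=np.random.default_rng(0)):
    """Monte Carlo estimate of E(P, N) ~ (1/k) * sum of L'_i cos(theta_i)
    over the upper hemisphere around the surface normal."""
    normal = normal / np.linalg.norm(normal)
    total = np.zeros(3)
    used = 0
    while used < k:
        d = rng.normal(size=3)                 # uniform direction on the sphere...
        d /= np.linalg.norm(d)
        cos_theta = float(np.dot(d, normal))
        if cos_theta <= 0.0:                   # ...rejected to the upper hemisphere
            continue
        total += env_radiance(d) * cos_theta   # L'_i weighted by cos(theta_i)
        used += 1
    return total / k
```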

However, most materials are neither fully specular nor fully diffuse but consist of both a specular and a diffuse component, e.g. plastic materials. For such materials the specular component is calculated separately and then added to the diffuse component. There also exist a number of other common surface material properties, such as rough specular reflections and bidirectional reflectance distribution functions (BRDFs), e.g. anisotropy. In these cases the distributions must be taken into consideration in the sampling process. For a more thorough description see [1], pages 409-416. For a good introduction to ray tracing see Shirley [16], and for global illumination Dutré [17].


Chapter 3

HDR Camera Capture Device

3.1 Hardware

In the Real Time Light Probe three cameras are used, one for each color. The camera is a Ranger C55 from the company SICK IVP¹, originally constructed for laser range imaging. The C55 camera has a traditional monochrome CMOS sensor with a resolution of 1536 by 512 pixels. The sensor has an on-chip processing unit, which makes it suitable for implementing a broad range of image processing algorithms that can be run in real time. This section will describe the camera in more detail and how the camera program and the capturing algorithm work.

In figure 3.1 a schematic description of the C55's architecture is given. The sensor has 1536 pixel columns and 512 rows; each of the 1536 columns has an A/D converter and a programmable bit-serial processing element (PE) with local memory. All 1536 PEs are fed by a common sequencer, and the sequencer is controlled by a general-purpose CPU, see figure 3.1. The sequencer feeds the PEs with instructions in a strict SIMD (Single Instruction, Multiple Data) way. This means that all the PEs are executing the same instruction, but on different data.

A regular CPU gets a series of instructions that may say: get this pixel value and do this operation, then this value, and so on. With SIMD the instructions are instead: get all these pixel values and do an operation; the instructions are carried out in parallel instead of in sequence. One PE is not that powerful by itself, but connecting all 1536 PEs and letting them work in parallel gives the computational power that is needed for real time processing.

¹http://www.sickivp.se

[Figure 3.1 block diagram: CMOS photodiode array, 1536 x 512 pixels (14.6 mm x 4.9 mm); 1536 A/D converters and processing elements; SIMD instruction sequencer; general-purpose CPU; memory; I/O controller to the PC host.]

Figure 3.1: The Ranger C55 architecture.

The CPU runs a small operating system that controls the communication and all data output. For more information about the sensor architecture see Johansson et al. [14].

3.2 Communications

The C55 camera has two ways to communicate with the host PC: CameraLink or a serial interface. CameraLink is a standard interface with a broad bandwidth for communication between a camera and a PC; theoretically it can send 2.04 Gbit/s of data. There is no common port on a PC that can connect to the CameraLink interface, therefore an extra hardware card is needed, called a frame grabber. A frame grabber is dedicated to handling the data streams produced by the camera and makes it possible to store the data on the hard drive. Communication through the serial interface is slower, though it can be used to upload programs and communicate with the camera OS.

3.3 Capture Algorithm

Traditionally, HDR capturing can take several minutes for one image. At video frame rates, i.e. 25 fps, one frame only has 40 ms to be processed, and therefore the capturing algorithm needs to be highly optimized. The C55 has an operating system that can run different programs in the camera.


Unger et al. [8] have developed an algorithm that reads out the sensor in a way that makes it possible to capture HDR data at 25 fps. In a regular camera the whole sensor is read out at the same time with the same exposure. In Unger [6] a system for capturing spatially varying HDR data is presented, where the camera movement is controlled by a computer and all the exposures are taken at the same place. The drawback is that it is time consuming and cannot handle variations in the scene. A system that can handle spatial and temporal variations in the scene needs to capture the data at a higher frame rate and take all the exposures at the same time.

The algorithm uses a rolling shutter progression technique. Rolling shutter implies that the sensor rows are read out multiple times during a single frame. During the readout a "window" of 38 rows is rolling over the sensor. Within this window a number of rows are read out and processed in parallel; for more information see Unger et al. [8].

The PE can make an 8-bit linear ramp conversion in 9.6 µs. The total time for one frame is 40 ms, which with a resolution of 512 rows implies that each row has 40/512 ms ≈ 78 µs for A/D conversion. Every row has to be A/D converted once for each exposure, and therefore 8 exposures can be taken during one frame. Capturing 8 exposures gives a dynamic range of 10,000,000:1, and this can be done at 25 fps, which is far better than any of the competitors on the market.
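The per-row timing budget can be checked with a small back-of-the-envelope calculation using only the numbers quoted in the text:

```python
FRAME_TIME_S = 1.0 / 25      # 40 ms per frame at 25 fps
ROWS         = 512           # sensor rows per frame
RAMP_S       = 9.6e-6        # one 8-bit linear ramp A/D conversion

row_budget_s = FRAME_TIME_S / ROWS
print(f"per-row budget     : {row_budget_s * 1e6:.1f} us")    # ~78 us
print(f"conversions per row: {int(row_budget_s // RAMP_S)}")  # 8 exposures
```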

One of the key things that makes it possible to capture HDR data at video frame rates with the C55 camera is that the same analog signal can be A/D converted more than once with different gain settings on the A/D amplifier. There are five real exposures and three amplified exposures that are read out from the sensor, spaced 4 f-stops apart.

A multiplexer structure is used in the camera to decide which exposures should be sent. Figure 3.2 shows a schematic image of how the multiplexer works. The first two exposures are real exposures and will always be sent to the host; the third and fourth exposures are read out at the same time from the sensor but with different gain. The exposure that has the highest unsaturated value is sent to the PC. Doing the same comparison on the rest of the pairs reduces the data sent for each pixel from eight bytes to six bytes. The first five bytes are the exposure values and the sixth byte is a bit mask that holds information about which member of each pair was sent and which is the best value.

[Figure 3.2 diagram: eight exposures with gains 1X, 1X, 1X, 4X, 1X, 4X, 1X, 4X; each 1X/4X pair is compared and the selected exposures are sent to the PC host together with an exposure bit mask.]

Figure 3.2: The multiplexer choosing which exposure to send.

3.4 HDR Assembly

The HDR assembly process in the HDR camera differs from a multiple image HDR assembly. When a light probe is captured in the traditional way, the camera's response curve is used to weight the pixel values. Since the HDR image is constructed from multiple images taken over a time span, the scene has to be stationary; any motion or varying light will introduce artifacts in the final HDR image. The HDR camera does not have these limitations on motion and varying light because of the short frame time. In fact the short frame time gives new opportunities, like capturing scenes with motion and varying light. In the C55 there is no need to do any averaging; the best pixel value can be detected by looking for the highest unsaturated pixel value. The final HDR value is computed as

$E = \frac{x_i}{\Delta t_i}$

where $x_i$ is the A/D converted value from the sensor and $\Delta t_i$ is the corresponding exposure time, and i is the index of the exposure where $x_i$ has its maximum valid value below the saturation level. The rest of the pixel values are either saturated or have lower digital values and therefore lower binary precision; using them would only introduce artifacts in the final image.
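A per-pixel sketch of this reconstruction is given below: pick the exposure with the highest digital value below the saturation level and divide by its exposure time. The 8-bit saturation level and the example exposure ladder, spaced 4 f-stops (a factor of 16) apart, are illustrative values, not the camera's actual ones.

```python
def reconstruct_pixel(values, exposure_times, saturation=255):
    """E = x_i / dt_i for the exposure i with the highest unsaturated value.

    values and exposure_times are ordered from the shortest to the longest exposure.
    """
    best = None
    for x, dt in zip(values, exposure_times):
        if x < saturation and (best is None or x > best[0]):
            best = (x, dt)
    if best is None:                          # every exposure saturated:
        return values[0] / exposure_times[0]  # fall back to the shortest one
    return best[0] / best[1]

# Hypothetical ladder of five exposures 4 f-stops apart
times = [1/16_000_000, 1/1_000_000, 1/62_500, 1/3_906, 1/244]
pixel = [0, 3, 52, 240, 255]                  # digital values for one pixel
print(reconstruct_pixel(pixel, times))        # uses the 240 sample
```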


Chapter 4

Pilot Study

To start our work we did a pilot study of the existing Real Time Light Probe (RTLP). The existing system was an early prototype which needed a lot of manual fine tuning to get good results. This was a tedious process which clearly needed improvements. It was decided to do a pilot study to test the system. The study would consist of doing an animation using spatially variant light probes captured with the existing Real Time Light Probe. Hence we would use the RTLP for gathering light information and use this material to illuminate a synthetic object moving in a scene. From this we would learn how the existing system worked, as well as get a feeling for how we thought such a system should work. We would also build the basis of a pipeline and gain better knowledge of what would have to be improved on the Real Time Light Probe, as well as in the pipeline we would develop during the study. Before the pilot study started we discussed the different aspects of what makes a production pipeline good and what parts should be included in the pipeline. The key features of a pipeline are stability and efficiency; a pipeline can be seen as the spine of a production, and if any part of the pipeline fails the whole production will be affected.

To make a rendering with spatially varying light information there are three basic steps that have to be taken. First, the light information in the scene that is going to be used has to be captured. The captured data then has to be processed in different ways, e.g. a remapping from the image of the reflective sphere to a latlong panorama. Finally, the processed data is used to illuminate one or many synthetic objects in the scene. Within these basic steps there are numerous small steps. An example would be the rendering step, where the small steps could be to model the synthetic object, give it materials, illuminate it, render the object and finally do a composite with the background.


In this part we will describe the existing Real Time Light Probe setup and continue with a description of the most important steps taken in the pilot study. A discussion of the problems and of what improvements were needed in the developed pipeline will then be held. In the next chapter we will describe what improvements have been made to the Real Time Light Probe.

4.1 Current Setup

Figure 4.1: Left: The three cameras mounted around the beam splitter. Right: The existing Real Time Light Probe.

The Real Time Light Probe used during the pilot study was assembled from three Ranger C55 cameras, described in 3.1, one camera for each color in RGB. The three cameras were mounted around a dichroic¹ RGB beam splitter forming a single viewpoint setup. The beam splitter separates a single image into three spectral bands suitable for RGB and distributes the wavelengths to the three different cameras. The lens and beam splitter in this setup came from a broken, disassembled video projector, see figure 4.1 left. Since this was a test setup the cameras were loosely mounted on a wooden plate, which caused the cameras to readjust themselves from time to time. The thick cables used for the CameraLink frame grabber cards also had a tendency to move the separate cameras, causing misalignments between the color channels.

The capturing algorithm used in the existing Real Time Light Probe captured 8 exposures per frame at video frame rates, i.e. 25 frames per second, using no encoding. This produced huge amounts of data and the external output was around 1 Gbit/s for each camera. To meet the demands on storage speed and bandwidth, three individual host computers were used, simply one for each camera.

¹A dichroic filter is a color filter used to selectively let light of a certain range of wavelengths pass.

[Figure 4.2 flowchart nodes: Capture (white balance and calibration, backplate photo/film, capture HDR light probe data); Preprocessing (short develop R/G/B, find sync point for RGB, alignment, develop sequence, light probe frames, tracking, matchmove tracking data, camera calibration); Rendering (3D objects and materials, light probes, Renderman environment light, rendering, compositing).]

Figure 4.2: The pipeline developed during the pilot study.

Each of the computers was running a copy of a simple capturing program controlled by a series of shortcut commands. During capturing it is important that the cameras are synchronized, i.e. that the three colors capture their data at the exact same time. A time lag in one of the colors would result in severe artifacts in the resulting image that could not be compensated for in the data processing step. To sync the cameras, a 5V TTL signal can be sent through the camera's serial interface; this triggers the cameras, forcing them to capture an image at the exact same moment. In the RTLP setup this signal is generated using a signal generator.

When capturing spatial lighting variations using light probes, the reflective sphere needs to be moved along a path or over an area where the variations will be recorded. The easiest way to do this is to mount the HDR camera and the reflective sphere on a rig with the sphere at a fixed distance from the camera. This way the HDR camera will always be focused on the reflective sphere. In the existing RTLP setup the camera and sphere were mounted on a steel profile, though the sphere used was a 10 centimeter diameter chrome ball bearing with a weight of around 10 kg. This made the whole construction a bit heavy and ungainly to carry around, and therefore the steel profile was mounted on a carriage holding the camera rig and the computers, see figure 4.1 right.

4.2 Production chain

During the pilot study the basis of a pipeline was formed; the steps taken will be described here. This pipeline became hard to manage, and there were many hacks and scripts that had to be manually tweaked to get a good result.


This resulted in a very unreliable pipeline, and the workflow was, to say the least, awkward. The pipeline created for the pilot study, figure 4.2, consists of the three basic steps mentioned above. Inside these there are numerous small steps that have to be taken in order to get a decent result. As can be seen, there are a lot of improvements that can be made to this pipeline, and during our project we will try to minimize the number of steps required, creating a smooth workflow through the pipeline and minimizing the time from capturing to rendering.

It is advantageous to see the process as two parts where two teams work with as little contact as possible with each other. One team, team A, is the capturing and processing team. They are only involved in these two parts, and when finished with their task they hand the material over to the other team. The other team, team B, is only involved in the rendering step; all they need is the light information with tracking data to illuminate the synthetic objects they create. While team A is out gathering light probe sequences and processing the data, team B is working on creating models and materials for them. Preferably, team A is finished with the lighting data by the time team B wants to illuminate and render their scenes. If some step early in the pipeline does not work and team A has to spend a lot of time fixing the problems, team B will have to sit and wait for the material, and a lot of time is lost.

4.2.1 Capturing

Capturing lighting data using the old Real Time Light Probe required extensive preparations and calibrations. The three computers were connected to one monitor through a KVM2 switch. Each computer was running a copy of the camera software that controlled the communication and showed a live feed from the camera. Within the software it was possible to switch between the different exposures sent from the cameras and to start/stop the recording. The inaccuracy of the camera positions relative to the beam splitter caused misalignments between the cameras, and these misalignments were from time to time large enough to clip parts of the sphere. Tuning the optics could be very time consuming; whenever the optics were changed the different channels needed to be inspected, which meant that each computer had to be checked. Since there was one computer for each camera, each running its own copy of the capturing program with no way of synchronizing when

2 KVM stands for Keyboard, Video and Mouse; a KVM switch is a hardware device that allows several computers to be controlled from a single keyboard, monitor and mouse.


the recording would be started, we used an LED flashlight as a sync signal after the recording had been started. We could then find the corresponding frame in all three channels and sync the three streams in the processing step. The recording was started by pressing a keyboard shortcut on the three keyboards, preferably at the same time. The capturing software was, however, a bit unstable and would not always start recording when the command was given, which caused many tedious restarts.
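Finding the sync point can be automated once a short stretch of frames has been developed from each stream. The following is only a minimal sketch (not the scripts used in the pilot study; the frames are assumed to be available as 2D numpy arrays) that locates the frame where the LED flashlight is switched off by looking for the largest drop in mean intensity between consecutive frames:

    import numpy as np

    def find_led_off_frame(frames):
        # Mean intensity per frame; the LED switching off shows up as the
        # largest drop between two consecutive frames.
        means = np.array([frame.mean() for frame in frames])
        drops = np.diff(means)
        return int(np.argmin(drops)) + 1  # first frame after the drop

    # The per-channel offsets are the differences between the indices found
    # in the R, G and B streams (hypothetical variable names):
    # offset_g = find_led_off_frame(frames_g) - find_led_off_frame(frames_r)
    # offset_b = find_led_off_frame(frames_b) - find_led_off_frame(frames_r)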

Development

The development is a data processing step where the data recorded by the cameras is processed in different ways. Since the output is around 1 Gbit/s for each camera, there is no time to assemble the HDR-images on the fly during recording. This is instead done in this step, together with aligning the color channels and, if needed, remapping the images into different panoramic mappings.
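As an illustration of the assembly step, the sketch below combines a stack of exposures of one frame into a single HDR image using a weighted average. It assumes a linear sensor response and pixel values normalized to [0, 1], and it is only a generic example of the technique, not the actual RTLP development code:

    import numpy as np

    def assemble_hdr(exposures, exposure_times, saturated=0.95, noise_floor=0.02):
        # exposures: list of 2D arrays (one per exposure), values in [0, 1]
        # exposure_times: matching list of exposure times (arbitrary units)
        num = np.zeros_like(exposures[0], dtype=np.float64)
        den = np.zeros_like(exposures[0], dtype=np.float64)
        for img, t in zip(exposures, exposure_times):
            # Only trust samples that are neither saturated nor lost in noise.
            usable = ((img < saturated) & (img > noise_floor)).astype(np.float64)
            num += usable * img / t      # radiance estimate from this exposure
            den += usable
        den[den == 0] = 1.0              # pixels with no usable sample stay zero
        return num / den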

The processing was mainly done using MatLab scripts. There were different scripts for different tasks, e.g. one script for developing the data and one for aligning the data. These were assembled using cut and paste depending on what we wanted to do with the data. All of them were hard coded, meaning that if we needed to rotate one color channel 2 degrees it had to be changed in the code of the script.

A typical processing of a sequence would look as follows. First a couple of hundred frames are developed from each stream to find the point where the LED flashlight is switched off. The developing MatLab script is then customized to start the development at the corresponding frame for each color channel. Because of the loosely mounted cameras the color channels suffered from severe misalignments and rotations, so a few frames are developed in a low dynamic range format such that they can be imported into Photoshop. In Photoshop it is possible, with live feedback, to align and rotate the channels; the resulting values are then coded into the alignment script and the whole sequence can finally be developed.
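Once the translation and rotation for each channel have been found in this way, applying them to a whole sequence is straightforward. A minimal sketch (the parameter values below are interactive placeholders, not values from the pilot study):

    from scipy import ndimage

    def align_channel(channel, dx=0.0, dy=0.0, angle_deg=0.0):
        # Rotate about the image center, then translate; bilinear interpolation.
        rotated = ndimage.rotate(channel, angle_deg, reshape=False, order=1)
        return ndimage.shift(rotated, shift=(dy, dx), order=1)

    # Example: hypothetical calibration applied to every frame of the G channel.
    # g_aligned = align_channel(g_frame, dx=1.5, dy=-0.5, angle_deg=0.3)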

4.2.2 Tracking

Since we are measuring spatial variations of the lighting in a scene by capturing multiple light probes, we need some way to tell where in the scene a certain light probe was taken. In the pilot study we used match moving, by placing a DV-camera on top of the HDR-camera. Match moving is the process of calculating camera movements from features in the filmed material.


This is done using parallax effects, which implies that scenes with much parallax give more accurate results. In our scene, however, there was very little parallax; instead it was created by introducing tracking markers into the scene.

4.2.3 Rendering

Figure 4.3: Rendering of a sphere illuminated by one light probe per frame. The light probe passes over a sharp shadow.

For modeling we used Autodesk Maya [19], and bundled with Maya comes the Mental Images MentalRay [28] renderer. In a traditional rendering the lighting artist constructs virtual, theoretical light sources and places them in the scene to get the desired illumination and shading. This is a complicated and labour intensive task which requires a lot of knowledge and practice. Using IBL simplifies this process: instead of using theoretical, even if physically correct, light sources, a light probe image is used to illuminate the synthetic objects. During the initial tests a ready-to-use MentalRay IBL shader was used; the shader uses the light probe to estimate distant light sources in Maya. The estimation of the light sources is based on the energy in the light probe: areas where the energy is concentrated, and which therefore contribute more to the lighting of the scene, are assigned more virtual lights. Traditionally one light probe is used for the entire scene, which means that even if the scene is animated the light probe stays the same. In our case we want to change the light probe in every frame. This, however, gives a very flickery result since the estimated light sources are recalculated for every frame, a behavior that is obviously not desirable when producing an animation. To use the light probes with traditional IBL techniques, a plug-in to MentalRay would have to be written. Since MentalRay uses a very complicated plug-in structure, the possibilities with Pixar RenderMan [27] were investigated first. The fact that the university had eight licenses of RenderMan also made it possible to render on multiple computers. In RenderMan all shaders are written in the Shading Language (SL); SL is a standard for writing shaders, and this gives a lot of freedom to edit and write new shaders.


To integrate RenderMan with Maya a package called RenderMan Artist Tools (RAT) is used. Custom-written shaders can be imported into RAT and connected to geometries using the RAT user interface.

Using traditional image based lighting techniques, no spatial variations of the lighting can be captured or recreated, since only one light probe is used. In the pilot study we captured around 700 light probes along a path; hence the spatial variations of the lighting in the scene were captured in one dimension. Connecting the tracking information with the corresponding light probe makes it possible to render a small object moving through the captured real world lighting.
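A sketch of how the two data sets can be connected (variable names are hypothetical): in the pilot study the mapping was simply one probe per animation frame, but choosing the probe captured closest to the object's tracked position generalizes to arbitrary paths:

    import numpy as np

    def nearest_probe_index(position, probe_positions):
        # probe_positions: (N, 3) array of tracked capture positions;
        # position: the synthetic object's position in the same coordinate frame.
        distances = np.linalg.norm(probe_positions - np.asarray(position), axis=1)
        return int(np.argmin(distances))

    # per_frame_probe = [nearest_probe_index(p, probe_positions)
    #                    for p in object_path]   # one probe index per frame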

The shaders used for generating the illumination with image based lighting techniques make two calls, one for the specular component and one for the diffuse. In the diffuse shader an occlusion call is made; this contains two calls. The first checks which parts of the upper hemisphere above the pixel are occluded. The second is called a gather; this sends out rays that calculate the irradiance for the pixel. An example of a diffuse RenderMan IBL light source shader can be seen in Appendix B. The specular call is simply a sampling of the light probe in the reflected direction. The renderings in the pilot study were made using the simple method where the light probe is changed in every frame. To make the renderings faster we used two different shaders, one for the specular component and one for the diffuse. In the specular shader we used a high resolution light probe, and in the diffuse shader we used a down sampled version of only 32x32 pixels. Using a down sampled version of the light probe saves rendering time since the number of samples taken in the gather call depends on the size of the light probe. The rendered object is a small diffuse sphere about the same size as the reflective sphere on the Real Time Light Probe; by using the tracking data the synthetic sphere moves through the modelled scene in the same way as the probe did when the data was captured. Some of the rendered frames can be seen in figure 4.3.
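To make the two lookups concrete, the sketch below mimics them outside the renderer. It is written in Python rather than Shading Language and is not the shader in Appendix B; a latitude-longitude probe mapping is assumed (the actual probes may use a different panoramic mapping) and occlusion is ignored. The specular term samples the full-resolution probe in the mirrored view direction, and the diffuse term estimates irradiance by cosine-weighted sampling of the down-sampled probe:

    import numpy as np

    def sample_probe(probe, direction):
        # Look up an H x W x 3 latitude-longitude probe image for a direction
        # given in a right-handed frame with y up and z forward (assumed layout).
        d = np.asarray(direction, dtype=float)
        x, y, z = d / np.linalg.norm(d)
        u = (np.arctan2(x, z) + np.pi) / (2.0 * np.pi)   # azimuth -> [0, 1]
        v = np.arccos(np.clip(y, -1.0, 1.0)) / np.pi     # polar   -> [0, 1]
        h, w, _ = probe.shape
        return probe[min(int(v * h), h - 1), min(int(u * w), w - 1)]

    def specular(probe_full, view_dir, normal):
        # Perfect mirror: reflect the view direction about the normal and sample.
        v = np.asarray(view_dir, dtype=float); v /= np.linalg.norm(v)
        n = np.asarray(normal, dtype=float);   n /= np.linalg.norm(n)
        r = v - 2.0 * np.dot(v, n) * n
        return sample_probe(probe_full, r)

    def diffuse_irradiance(probe_small, normal, n_samples=256, seed=0):
        # Monte Carlo irradiance estimate with cosine-weighted hemisphere samples;
        # with this sampling the estimator reduces to pi times the mean radiance.
        rng = np.random.default_rng(seed)
        n = np.asarray(normal, dtype=float); n /= np.linalg.norm(n)
        helper = np.array([0.0, 1.0, 0.0]) if abs(n[1]) < 0.9 else np.array([1.0, 0.0, 0.0])
        t = np.cross(helper, n); t /= np.linalg.norm(t)
        b = np.cross(n, t)
        total = np.zeros(3)
        for _ in range(n_samples):
            r1, r2 = rng.random(), rng.random()
            phi, sin_theta = 2.0 * np.pi * r1, np.sqrt(r2)
            local = np.array([np.cos(phi) * sin_theta,
                              np.sin(phi) * sin_theta,
                              np.sqrt(1.0 - r2)])
            total += sample_probe(probe_small, local[0] * t + local[1] * b + local[2] * n)
        return np.pi * total / n_samples   # divide by pi (and multiply by the albedo)
                                           # for the outgoing Lambertian radiance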

4.3 Conclusions of the pilot study and the new pipeline

The pilot study took around four weeks to complete and was very successful; the fact that the renderings were used in a scientific paper was good proof that the study had been worth the time. One of the purposes of the pilot study was to discover what improvements had to be made to the Real Time Light Probe, but more importantly how it was going to be improved.


We also wanted to have a list of specifications of what software was needed to create a smooth workflow through the pipeline. During the pilot study we found a lot of weaknesses in the pipeline we had created, and a lot of improvements had to be made to fulfill the aims of our thesis project. The existing setup left a lot to be desired. To start with, the cameras were not fixed; from time to time their positions changed and recalibrations were needed. The profile that held the HDR-camera and the reflective sphere was too weak, making the sphere move around in the captured sequences, and this profile was mounted to the rest of the rig by a joint with a lot of play, causing vibrations. The wheels on the rig were of the type that can rotate so that the rig can go in any direction, and with three computers the whole thing was really heavy. This made the whole system useless for capturing light information. Therefore, the rig was disassembled during the pilot study so that the profile holding the cameras and the reflective sphere could be mounted on a more stable cart with fixed wheels.

The processing of the captured data was a very tedious process; all the MatLab scripts that had to be manually modified were easy to mess up. Aligning the color channels could take half a day, and the rest of the day was spent letting the computer develop the sequences. This was a clear bottleneck in the pipeline.

Using match moving to track the movements of the Real Time Light Probe was clearly not the best solution; the lack of parallax in the scene made the tracking hard and inaccurate even though we used tracking markers. The match moving software was not designed for this kind of task: the movement between frames was too small and there were too few features in the scene, so a lot of manual work was required to get a decent track.

Using RenderMan as a renderer worked well, and since it could be used together with Maya it was easy to create different scenes and set up new viewpoints. Since we had five dual core computers we could render multiple frames of an animation at the same time, though configuring a render farm for RenderMan turned out to be complicated and poorly documented. Instead a Tcl script was created that could batch start a rendering on a remote machine; a sequence could then be divided into parts where each part was rendered on a single core of one computer. However, this is not as efficient as a render farm, and it turned out that random frames would crash for no apparent reason. Since each computer had its own job list it was hard to know which frames had finished and which had failed.
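The ad hoc distribution can be illustrated with a small sketch (host names, paths and the render command are placeholders, and this is not the script that was actually used): the frame range is split into chunks, each chunk is started on a remote machine over ssh, and frames whose output image never appears are flagged for re-rendering.

    import subprocess
    from pathlib import Path

    HOSTS = ["render01", "render02", "render03", "render04", "render05"]  # placeholders
    RIB_DIR = "/project/ribs"          # one RIB file per frame (placeholder path)
    OUT_DIR = Path("/project/out")     # rendered images end up here (placeholder)

    def chunks(frames, n):
        # Split the frame list into at most n contiguous chunks.
        size = max(1, -(-len(frames) // n))
        return [frames[i:i + size] for i in range(0, len(frames), size)]

    def dispatch(frames):
        # One ssh session per host; each host renders its chunk frame by frame.
        procs = []
        for host, chunk in zip(HOSTS, chunks(frames, len(HOSTS))):
            cmd = " && ".join(f"prman {RIB_DIR}/frame.{f:04d}.rib" for f in chunk)
            procs.append(subprocess.Popen(["ssh", host, cmd]))
        for p in procs:
            p.wait()

    def failed(frames):
        # Frames with no output image are candidates for automatic re-rendering.
        return [f for f in frames if not (OUT_DIR / f"frame.{f:04d}.tif").exists()]

    frames = list(range(1, 701))       # roughly the number of probes in the pilot study
    dispatch(frames)
    print("frames to re-render:", failed(frames))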


After four weeks of hard work we had finally produced renderings from the newly created pipeline that proved the concept of rendering with spatially varying light probes. The list of things that needed to be improved had grown during those four weeks, and we had to prioritize what to do first. Constructing a new stable rig that could be hand held, with a smaller reflective sphere, as well as developing software that could be used instead of the scripts, were the main priorities. Improving these two steps would make a significant difference in the capturing part of the pipeline. We also needed a better and more accurate tracking method to track the position of the Real Time Light Probe. Setting up a render farm for RenderMan, where all rendering jobs can be managed and distributed from one computer and frames that fail can be re-rendered automatically, was also something that needed to be done.


Chapter 5

Improving the Pipeline

During the pilot study the base structure of a pipeline had been developed, though a lot of improvements had to be made to get a decent workflow through it. Above all, we had discovered how fragile and hard to manage the current device setup was. A lot of time had to be spent nursing the device before any data could be captured, and once the data was captured there was a long process before any useful images came out of the scripts. Clearly there were a lot of things that could, and had to, be improved. The aim of developing a pipeline for capturing spatially variant light probes was to get a smooth workflow where less time is spent on tampering with the device and running hard-to-manage scripts. With a solid production chain, more time could be spent on research, on developing new methods for image based lighting, or on using the device in production. Thus, the first thing done after the pilot study was to write a list of requirements on what changes and improvements had to be made to the pipeline and to the Real Time Light Probe. This was naturally divided into two parts: one for the hardware, and one for what software needed to be developed and what functions were essential in that software. In these discussions the purpose of the new device had to be kept in mind. Mainly, the new device will be used in research projects, so aspects like user friendliness do not have to be the main focus, though it should be manageable by people with the same background as ourselves.

5.1 Improving the Hardware

The hardware in the current setup was an early proof of concept. It could be used, but it was too unstable and too hard to move around. Therefore a completely new rig for holding the C55 cameras and the reflective sphere had to be
