User perception of varying level of detail for Imposters

DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS

STOCKHOLM, SWEDEN 2019

MALTE DAVIDSSON

NIKLAS BERGDAHL

KTH ROYAL INSTITUTE OF TECHNOLOGY

DD142x EECS/KTH 2019-07-25

Swedish title: Användaruppfattning av Varierad Detaljnivå för Imposters

Supervisor: Christopher Peters

Abstract

The time needed to render a frame increases with the number of 3D models present and with their level of detail, so rendering these objects more efficiently would allow real-time applications to display a larger number of objects. One such technique is imposters, but their level of detail is limited by texture memory, especially for animated objects. A user study was performed to investigate how changing the number of angles or frames for imposters in a crowd simulation affects the perceived realism. The results of this study show that the participants' confidence in their answers fell as the overall quality rose, while the share of answers that aligned with the higher quality side rose together with overall quality.

Summary

Because the time it takes to render frames increases with the number of 3D models depicted and with the models' level of detail, finding a way to render these objects more efficiently would allow real-time applications to display more objects. One such technique is imposters, but their level of detail is limited by texture memory, especially when it comes to animated objects. A user study was carried out to investigate how changes in the level of detail, with respect to the number of angles and frames for imposters in a crowd simulation, affect the perceived realism. The results of the study indicate that the participants' confidence in their answers decreased as the quality increased, whereas the share of answers in which the higher quality side was chosen increased when the overall quality was higher.

Contents

1 Introduction
  1.1 Background
  1.2 3D Graphics
  1.3 Imposters
  1.4 Research Question
2 Methods
  2.1 Implementation
    2.1.1 Generation
    2.1.2 Rendering
  2.2 Pilot Study
  2.3 User Study
  2.4 Hypothesis
    2.4.1 Hypothesis
    2.4.2 Null hypothesis
3 Results
4 Discussion
5 Conclusions
Bibliography

Chapter 1 Introduction

1.1 Background

Crowd simulations animate a large number of models as they move through an environment, with potential applications in evacuation and accessibility scenarios. The number of models in a crowd simulation is relevant for its usability, since a greater number of models results in a simulation that takes longer to run and render. If run and render times can be lowered, more complex simulations can be used. Dobbyn et al. 2005 [1] wrote about the issue of rendering a large number of characters. They mention that this is a limiting factor in games as well, and a reason why few games show a large number of detailed characters at once. Because storing imposters takes up so much space, one of the main questions is how to save space, and it is therefore interesting to find out how far the quality can be lowered [2].

1.2 3D Graphics

3D graphics involves creating 3D models, and real-time applications often use polygonal 3D models. These models contain data describing a number of points, often called vertices, and lines, often called edges, that connect the vertices in 3D space. The surface of the model consists of polygons, each made up of three or more vertices connected to each other by the same number of edges. The minimal polygon is the triangle, consisting of three vertices and three edges.
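As an illustration of the vertex, edge and polygon terminology, the following minimal sketch stores a small polygonal model as a vertex list and a list of triangles. The coordinates and the helper name `unique_edges` are assumptions made for this example and are not taken from the implementation used in the study.

```python
# Hypothetical example mesh: a unit quad built from two triangles
# that share one diagonal edge.
vertices = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (1.0, 1.0, 0.0),
    (0.0, 1.0, 0.0),
]

# Each triangle is a tuple of three indices into `vertices`.
triangles = [(0, 1, 2), (0, 2, 3)]


def unique_edges(triangles):
    """Collect the undirected edges of all triangles, counting shared edges once."""
    edges = set()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            edges.add((min(u, v), max(u, v)))
    return edges


edges = unique_edges(triangles)
print(len(vertices), "vertices,", len(edges), "edges,", len(triangles), "triangles")
```

The two triangles share the diagonal edge, so the quad has five unique edges rather than six.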

To display 3D models on a screen, they need to be converted from 3D space to a 2D representation; this process is called rendering. A 3D model can be rendered alone or together with other objects. When more than one image is rendered for use in real-time applications or videos, a single image is referred to as a frame.

The number of polygons rendered in a frame affects how quickly the frame is generated. Rendering a large number of detailed 3D models, each consisting of a large number of polygons, can negatively affect the performance of real-time applications, severely limiting the scope of such simulations in both time and number of models.

Replacing a model with one that has fewer polygons when it is viewed from a distance is an established way of keeping the number of rendered polygons low. Having such stages of detail is referred to as using different levels of detail (LOD) [3].
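A minimal sketch of distance-based LOD selection is given below. The distance thresholds and the function name `select_lod` are hypothetical values chosen for illustration; the source does not specify any switching distances.

```python
from bisect import bisect_right


def select_lod(distance, thresholds):
    """Return the LOD index for a viewing distance.

    `thresholds` is a sorted list of switching distances (assumed values).
    Index 0 is the most detailed model; the index grows with distance.
    """
    return bisect_right(thresholds, distance)


# Hypothetical switching distances in scene units.
thresholds = [10.0, 30.0]

for d in (5.0, 10.0, 50.0):
    print(d, "-> LOD", select_lod(d, thresholds))
```

With two thresholds there are three levels: the full model up close, a reduced model at medium range, and the coarsest representation (for example an imposter) beyond the last threshold.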

1.3 Imposters

Imposters are a method that aims to reduce the time it takes to render a 3D model by preprocessing the 3D model into multiple 2D representations [1]. The most relevant representation is chosen for rendering depending on, amongst other factors, the angle from which the object is viewed. The challenge in implementing imposters lies in limiting the number of representations generated during preprocessing without affecting the visual quality. Animated objects such as human characters add an extra factor to preprocessing, as the number of representations is multiplied by the number of frames chosen from the animation. Dobbyn et al. 2005 [1] write that animated imposters greatly increase the amount of texture memory needed in comparison to their inanimate counterparts. Because the space required to store an imposter is so high, it is common to use a fixed number of templates with only minor, if any, variation. This produces Appearance Clones; similarly, Motion Clones are produced when the same animation is reused [4].
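The multiplication of viewing angles by animation frames can be made concrete with a rough estimate of texture memory. The sketch below assumes uncompressed images of 128 by 128 pixels at 4 bytes per pixel; these figures, and the function names, are assumptions for illustration and not values from the study.

```python
def imposter_image_count(h_angles, v_angles, frames):
    """Number of 2D representations: one per angle combination and animation frame."""
    return h_angles * v_angles * frames


def atlas_bytes(h_angles, v_angles, frames, width, height, bytes_per_pixel=4):
    """Uncompressed size of all representations (assumed RGBA, no mipmaps)."""
    images = imposter_image_count(h_angles, v_angles, frames)
    return images * width * height * bytes_per_pixel


# Hypothetical configuration: 8 horizontal angles, 2 vertical angles, 8 frames.
static_size = atlas_bytes(8, 2, 1, 128, 128)
animated_size = atlas_bytes(8, 2, 8, 128, 128)
print(static_size // 1024, "KiB static,", animated_size // 1024, "KiB animated")
```

Under these assumptions the animated imposter needs eight times the memory of the static one, which is why the number of stored frames is a natural parameter to reduce.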

1.4 Research Question

As texture memory is limited, one of the challenges when using animated imposters is to find the minimum number of representations, in terms of viewpoints and animation frames, needed to provide a fluid imposter representation. This raises the questions pursued in this study:

What is the minimum number of viewpoints needed to provide a realistic imposter representation?

What is the minimum number of frames needed to provide a realistic imposter representation?

Chapter 2 Methods

To determine at what levels of detail the sense of realism of the simulation is sufficient, a user study was designed and conducted. Before the user study, a pilot version was performed to gather feedback and to help determine which values to use for the number of frames in the animation and the number of angles, so as to best test the research questions.

2.1 Implementation

To compare the perception of imposters with varying numbers of angles and frames, algorithms were developed to generate texture atlases and to use these atlases for rendering. The game engine Unity3D was used together with the generation algorithm to create the texture atlases, as well as to run the simulation with the rendering algorithm.

2.1.1 Generation

Given a number of horizontal angles, a number of vertical angles and a number of frames, the horizontal angles are distributed evenly over 360° around the object, the vertical angles are distributed between 0° and 90°, and the frames are distributed evenly over the animation cycle. With this information, the imposter generation algorithm captures an image for each combination of the given angles and frames. The images are stored in a texture atlas, ready to be used for rendering the imposters.
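The sampling described above can be sketched as follows. The text does not state whether the vertical range includes its endpoints, so the sketch assumes that vertical angles start at 0° and stop short of 90°; the function name `capture_plan` and the use of a normalised animation time in [0, 1) are likewise assumptions.

```python
from itertools import product


def capture_plan(n_horizontal, n_vertical, n_frames):
    """List every (vertical angle, horizontal angle, animation time) to capture.

    Horizontal angles are spread evenly over 360 degrees. Vertical angles are
    spread over [0, 90) degrees (assumed spacing). Animation times are
    fractions of one cycle in [0, 1).
    """
    horizontal = [i * 360.0 / n_horizontal for i in range(n_horizontal)]
    vertical = [j * 90.0 / n_vertical for j in range(n_vertical)]
    times = [k / n_frames for k in range(n_frames)]
    return list(product(vertical, horizontal, times))


# Same parameters as the atlas in Figure 2.1: 4 horizontal angles,
# 2 vertical angles and 2 animation frames.
plan = capture_plan(4, 2, 2)
print(len(plan), "images to capture")
for vertical, horizontal, time in plan[:4]:
    print(vertical, horizontal, time)
```

Each entry in the plan corresponds to one cell in the texture atlas, so the atlas size grows with the product of the three parameters.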

Figure 2.1: Atlas output of the generation with 4 horizontal angles, 2 vertical angles and 2 animation frames.

2.1.2 Rendering

A quadrilateral 3D model consisting of two triangular polygons was used for each imposter. In each frame, the mesh is updated to face the camera, and the image in the atlas that most closely matches the angle between the camera and the imposter and the current animation state is applied as the texture of the mesh [3].
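The choice of the closest atlas image can be sketched as a nearest-sample lookup. This is an illustration only, not the Unity3D implementation used in the study: the function names are hypothetical, the vertical samples are assumed to lie in [0, 90) degrees, and the step that turns the quad towards the camera is left out.

```python
import math


def nearest_wrapped(value, count, period):
    """Index of the nearest of `count` evenly spaced samples on a cyclic range."""
    step = period / count
    return math.floor(value / step + 0.5) % count


def select_cell(yaw_deg, pitch_deg, phase, n_h, n_v, n_f):
    """Pick the (horizontal, vertical, frame) atlas indices closest to the view.

    yaw_deg:   horizontal angle between camera and imposter, in degrees
    pitch_deg: vertical angle, expected within 0..90 degrees
    phase:     animation state as a fraction of one cycle
    """
    h = nearest_wrapped(yaw_deg % 360.0, n_h, 360.0)
    v_step = 90.0 / n_v
    # The vertical range does not wrap around, so clamp instead.
    v = min(n_v - 1, max(0, math.floor(pitch_deg / v_step + 0.5)))
    f = nearest_wrapped(phase % 1.0, n_f, 1.0)
    return h, v, f


print(select_cell(350.0, 10.0, 0.9, 4, 2, 2))
print(select_cell(100.0, 60.0, 0.5, 4, 2, 2))
```

With few horizontal samples, a yaw of 350° snaps back to the 0° image, which is the kind of visible switch between viewing angles that the user study probes.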

Figure 2.2: Left: A group of imposters. Right: The same group shown from a different angle.

2.2 Pilot Study

In the pilot study four sample videos were shown to a few participants. They were then asked questions as to whether or not they perceived any difference between the groups of imposters on the two sides of the scene (left and right), or if either side was perceived as more realistic than the other.

As a result of this pilot study, a few changes to the scene were decided upon. The length of the videos was increased from 10 to 15 seconds so that participants had a bit more time to form an opinion of the scene, while still not having so much time that they could investigate and compare individual models in depth. The overall number of models was increased so that there were more characters moving about, making it harder to pick out and focus on individual imposters. Finally, the camera was moved further forward so that the foreground of the scene would have about the same number of models as the background.

2.3 User Study

For the user study, 16 test cases were prepared. For each of the two parameters, animation frames and number of angles, three possible values were used: 8 for the low (L) quality, 16 for the medium (M) quality and 32 for the high (H) quality. Differences of low to medium and of medium to high were then compared, with only one parameter being changed at a time. With the first letter representing angles and the second representing frames, the low to medium comparison gave the following test cases (with zone A to the left and zone B to the right): LL in zone A and LM in zone B, LL in zone A and ML in zone B, MM in zone A and LM in zone B, and MM in zone A and ML in zone B. These were then mirrored, so that for every video there was one other test case where the sides were swapped. This gave 8 cases. For the medium to high comparison the same process was followed, with low quality replaced by high in the previous example. This gave another 8 cases, for a total of 16. The cases were chosen on the basis that only one value should differ between the two sides of the screen in any test case, and that the difference in perception would be more interesting to compare between, for example, medium and high quality than between low and high quality, where the difference would be quite easily noticeable.
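The construction of the 16 test cases can be reproduced with a short enumeration. The sketch follows the description above; the function name `build_test_cases` and the tuple layout (zone A, zone B) are choices made for this example.

```python
# Value used for each quality level, as stated in the text.
QUALITY = {"L": 8, "M": 16, "H": 32}


def build_test_cases():
    """Return (zone A, zone B) pairs; first letter is angles, second is frames."""
    cases = []
    for lo, hi in (("L", "M"), ("M", "H")):
        for base in (lo + lo, hi + hi):
            for variant in (lo + hi, hi + lo):
                cases.append((base, variant))  # base on the left
                cases.append((variant, base))  # mirrored version
    return cases


cases = build_test_cases()
print(len(cases), "test cases")
for zone_a, zone_b in cases[:8]:
    print("zone A:", zone_a, " zone B:", zone_b)
```

Every pair differs in exactly one letter, so each video isolates either the number of angles or the number of frames.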

Figure 2.3: Sample screenshot of the scene

The user study was performed with 13 participants, none of whom were informed of what the differences between the sides would be. Each participant watched the 16 videos in a different order and with a different starting video, according to a Latin square arrangement. The participants were introduced to the environment of the study through a screenshot from one of the videos, showing what the scene and characters would look like. After each video, the participants were asked to describe their feelings regarding the two sides of the screen. Did either side (left/right) feel different from the other? Was the quality of the character models noticeably different? If the answer was yes, the participants were asked to elaborate on which side looked better and in what way they thought it was better.
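One simple way to obtain such orderings is a cyclic Latin square, sketched below. The source does not say which Latin square was used, so the cyclic construction and the function name `cyclic_latin_square` are assumptions for illustration.

```python
def cyclic_latin_square(n):
    """Build an n x n Latin square where row r is the sequence 0..n-1 shifted by r."""
    return [[(row + col) % n for col in range(n)] for row in range(n)]


N_VIDEOS = 16
N_PARTICIPANTS = 13

# One row per participant; only the first 13 of the 16 rows are needed.
orders = cyclic_latin_square(N_VIDEOS)[:N_PARTICIPANTS]

for participant, order in enumerate(orders[:3], start=1):
    print("participant", participant, "starts with video", order[0], "order:", order)
```

In this construction every participant sees all 16 videos exactly once and no two participants start with the same video. A cyclic square does not balance which video follows which, so a study concerned with carry-over effects might prefer a balanced design.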

2.4 Hypothesis

The research questions ask for the minimum number of frames and angles needed to create an imposter representation that is perceived as realistic. To approach an answer, the user study was designed to investigate the perceived difference in realism between our chosen qualities.

2.4.1 Hypothesis

The number of perceived differences will be higher between the Low and Medium qualities than between the Medium and High qualities. The share of answers where the participant chooses the higher quality side as the more realistic one will be higher when the Low and Medium qualities are compared than when the Medium and High qualities are compared.

2.4.2 Null hypothesis

The number of perceived differences will be the same in the Lower and Higher quality comparisons. The share of answers where the side chosen as more realistic by the participant aligns with the higher quality side will be the same in the Lower and Higher quality comparisons, or higher in the Higher quality comparisons.

Chapter 3 Results

In the diagrams, lower refers to comparisons made between low and medium quality settings, and higher refers to comparisons made between medium and high quality settings. An initial A or F indicates whether the angle or the frame setting differed between the sides. Aligned and unaligned refer to the side chosen as more realistic, with aligned corresponding to the side with the higher quality setting and unaligned to the side with the lower quality setting.

Strength 1 and Strength 2 refer to the confidence the users had in their answers. If the participant expressed that the difference was very noticeable, or motivated their choice in a way that could be interpreted as linked to the difference in available angles or frames, the answer was classified as Strength 2. All other answers were classified as Strength 1.

Figure 3.1: Table of the number of times the listed quality combination was chosen when compared to the quality inside the parentheses. With the first letter in each combination referring to the number of angles and the second referring to the number of frames in the animation.

The participants perceived one side as more realistic than the other in 42.3% of the cases in which they were questioned, as can be seen in Figure 3.1. If filtered to only count answers of Strength 2, this drops to 14.7%, as can be seen in Figure 3.2.

Figure 3.2: Breakdown of all the answers grouped by Difference in Angle or Frame and Alignment to expected answer.

Figure 3.3: Breakdown of all the answers, but filtered to answers of Strength 2, grouped by Difference in Angle or Frame and Alignment to expected answer.

Figure 3.4: Answers grouped by difference in Angle or Frame, Alignment to expected answer and Quality range. AA is Angle Aligned, FA is Frame Aligned, AU is Angle Unaligned and FU is Frame Unaligned

Figure 3.5: Share of answers aligned with the higher quality side being chosen, split up by confidence.

Chapter 4 Discussion

The first thing we want to highlight is the low number of responses: almost 40% of the time, the participant expressed that they perceived no noticeable difference in realism between the two sides. This number might have been reduced if the participants had been given a more in-depth introduction to the study, but this part of the user study had to be limited to keep the duration of the study at a manageable level.

Figure 3.4 shows the total number of answers aligned with the higher quality side going up as the overall quality of the animation increases, and the total number of unaligned answers going down. Interestingly, however, the number of participants who were strongly convinced went down as the quality increased, both when the sides differed in the number of angles of the imposters and when they differed in the frames-per-second of their animations.

Figure 3.5 shows that as the quality of the overall animation increases, so does the share of answers aligned with the side that had either more angles or more frames in the video. It also makes it more obvious that when participants were less confident in their answer as to which side looked better or more realistic, they chose the lower quality side more often than when they felt confident in their answer. This was especially true when the overall quality was lower, with participants picking the side with worse quality as more realistic more than half the time. One possible reason for this is the low number of participants. Perhaps this number would trend towards 50% as more answers came in, since the reasons given for this class of answer did not in fact point out any actual differences between the animations on the two sides.

The number of perceived differences for angles was 27 in the lower quality comparisons and 22 in the higher quality comparisons; the corresponding numbers for frames were 20 and 19. Given this small difference in combination with the low number of samples, we are unable to reject the null hypothesis that the number of perceived differences will be the same. Surprisingly, the share of answers aligning with the higher quality side rose when the overall quality was higher, both when the angles and when the frames differed, and as such we cannot reject the second null hypothesis either. Participants seemed forgiving of the imposters switching between viewing angles, with one person expressing that it looked like someone changed direction, which that participant perceived as more realistic. Some participants also expressed a perceived difference in height between the models on the two sides, perhaps due to a difference in the number of vertical angles, which resulted in some models seeming to change size as they passed the middle.
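The counts above can be related to the total number of answers with a small calculation. The sketch assumes that all 13 participants gave an answer for all 16 videos, and that each group (angle or frame, lower or higher) covers 4 of the 16 videos, as follows from the test case design in Section 2.3.

```python
PARTICIPANTS = 13
VIDEOS = 16
VIDEOS_PER_GROUP = 4  # 2 comparisons plus their mirrored versions

# Number of perceived differences per group, as reported in the text.
perceived = {
    ("angle", "lower"): 27,
    ("angle", "higher"): 22,
    ("frame", "lower"): 20,
    ("frame", "higher"): 19,
}

total_answers = PARTICIPANTS * VIDEOS
total_perceived = sum(perceived.values())
overall_share = total_perceived / total_answers
print(f"{total_perceived} of {total_answers} answers = {overall_share:.1%}")

answers_per_group = PARTICIPANTS * VIDEOS_PER_GROUP
for group, count in perceived.items():
    print(group, f"{count}/{answers_per_group} = {count / answers_per_group:.1%}")
```

Under these assumptions the four counts add up to 88 of 208 answers, which agrees with the 42.3% reported in Chapter 3.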

One important thing to note about our user study is that the viewing angle was static with regard to the environment; the only objects that moved in the scene were the imposters. This could lead to fewer changes between the possible viewing angles for the imposters than if a moving camera had been used. Because of this, we believe the effect of changing the number of angles could differ greatly if a similar study were performed with a moving camera.

Chapter 5 Conclusions

Users seem better able to identify which imposters are animated at higher quality, or have more horizontal and vertical angles captured, when the overall quality of the simulation is high than when it is low. However, the users who do notice a difference are more able to point out what specifically was changed when the quality is lower. To determine whether these two factors are independent of one another, further studies would have to be made comparing differences in both factors at the same time. Further, it would seem that the number of frames in the animation is more important when comparing two different LOD simulations, especially at lower qualities. Further studies into the best values for both angles and frames in the animation are recommended, and perhaps even comparisons between actual 3D models and their imposters side by side in a crowd simulation environment.

Important weaknesses of this study are the low number of participants and the values at which the two factors were compared. We were unable to come to a conclusion as to whether users subconsciously noted a difference in realism between two animations of differing quality, or whether their choices in the cases where they answered with less conviction were closer to random. We chose the values of the factors because of their large differences; our user study indicated that with smaller differences than these, it is possible that very few users, if any, would be able to point to a difference. Similarly, we chose not to compare low with high quality, since the differences between the two sides seemed so large as to be hard to miss. A similar study in the future would benefit from showing the participants some examples of the different quality levels, so that they know what to expect and what to look for before starting the study, especially if it is trying to pinpoint more exact values for the two variables.

Bibliography

[1] Simon Dobbyn et al. "Geopostors: A Real-time Geometry/Impostor Crowd Rendering System". In: ACM Trans. Graph. 24.3 (July 2005), pp. 933–933. issn: 0730-0301. doi: 10.1145/1073204.1073290. url: http://doi.acm.org.focus.lib.kth.se/10.1145/1073204.1073290.

[2] A. Beacco et al. "Efficient rendering of animated characters through optimized per-joint impostors". In: Computer Animation and Virtual Worlds 23.1 (2012), pp. 33–47. issn: 1546-4261.

[3] A. Aubel, R. Boulic, and D. Thalmann. "Real-time display of virtual humans: levels of details and impostors". In: IEEE Transactions on Circuits and Systems for Video Technology 10.2 (2000), pp. 207–217. issn: 1051-8215.

[4] Rachel Mcdonnell et al. "Clone attack! Perception of crowd variety". In: ACM Transactions on Graphics 27.3 (2008). issn: 0730-0301. url: http://search.proquest.com/docview/34554491/.
