
Faculty of Technology and Society
Computer Engineering

Bachelor Thesis

15 university credits

Gaze guidance through head-mounted Augmented Reality display

Blickstyrning genom huvudmonterad bildskärm för förstärkt verklighet

Viktor Kullberg

Emil Lindkvist

Exam: Bachelor of Science in Engineering in Computer Science 180hp

Area: Computer Science

Date for examination: 28-05-2019

Examiner: Dario Salvo
Supervisor: Thomas Pederson
Co-supervisors: Dr. Diederick Niehorster, Dr. Günter Alce


Abstract

Human decision making is an important factor in the design process for systems and items. With the rapid development of Augmented Reality, it is now possible to simulate digital interfaces everywhere. This opens up several application areas, both within industry and for the public, and it can be realized with everything from cellphones to smart glasses. This thesis investigates how subtle gaze guidance can be implemented using wearable Augmented Reality technology, and whether subtle cueing can be used as a digital nudge in a head-worn Augmented Reality environment. To investigate this, a prototype is developed that runs a controlled experiment, allowing visual guidance and subtle cueing to be examined with a Posner cueing task. Using eye trackers, the saccadic reaction times of the participants are measured and analyzed. The results show a significant difference in reaction time with our subtle guidance compared to without it. The conclusion is that subtle cueing can be used as a digital nudge in a head-worn Augmented Reality environment, and this thesis can serve as a basis for further studies of visual guidance in Augmented Reality with a head-mounted display.


Sammanfattning

Human decision making is an important factor in the design process for systems and objects. With the rapid development of Augmented Reality, it is now possible to simulate digital interfaces everywhere. There are several application areas, both in industry and for the public, realized with everything from mobile phones to smart glasses. In this thesis, a system is developed to test how visual guidance can be implemented using head-mounted Augmented Reality. Subtle cueing is also investigated to see whether it can be used as a digital nudge in head-mounted Augmented Reality. To investigate this, a prototype is developed. Through the development of this prototype, a controlled experiment can be carried out and visual guidance as well as subtle cueing can be examined. The experiment used to investigate visual guidance is a Posner cueing task. By using eye tracking equipment, the saccadic reaction times of the users can be measured. The results show a significant difference in reaction time when our subtle guidance is used. The conclusion of this thesis is that subtle cueing can be used as a digital nudge in head-mounted Augmented Reality, and this report can be used for further research on visual guidance in Augmented Reality with head-mounted displays.


Acknowledgements

Thanks to Thomas Pederson for the guidance and all the valuable feedback during our thesis. Thanks to Diederick Niehorster and Günter Alce for the help with the eye tracking and working with the Hololens.


Contents

1 Introduction 1

1.1 Background . . . 1

1.2 Conceptual Vision . . . 2

1.3 Research purpose and problem statement . . . 2

1.3.1 Hypothesis and Research questions . . . 2

1.4 Limitations . . . 3

2 Theoretical Background 4
2.1 Augmented Reality . . . 4

2.2 Microsoft Hololens . . . 4

2.3 Unity 3D Engine . . . 4

2.3.1 Mixed Reality Toolkit . . . 4

2.4 Human-Computer Interaction . . . 5

2.5 The Human Vision . . . 5

2.6 Visual Attention . . . 7

2.7 Posner Cueing Task . . . 7

2.8 Eye tracking . . . 8

2.8.1 Systematic errors . . . 8

2.8.2 Effect of small errors . . . 9

2.8.3 Tobii Software . . . 9

3 Related work 11
3.1 "A Guided User Experience Using Subtle Gaze Direction" . . . 11

3.1.1 Comments . . . 11

3.2 "Subtle Cueing for Visual Search in Head-Tracked Head Worn Displays" . . 11

3.2.1 Comments . . . 12

3.3 "Comparing unobtrusive gaze guiding stimuli in head-mounted displays" . 12 3.3.1 Comments . . . 12

3.4 "Augmented Reality Assistance in the Central Field-of-View Outperforms Peripheral Displays for Order Picking: Results from a Virtual Reality Sim-ulation Study" . . . 13

3.4.1 Comments . . . 14

3.5 "Voluntary Spatial Attention has Different Effects on Voluntary and Reflex-ive Saccades" . . . 14

3.5.1 Comments . . . 14

4 Method 15
4.1 Mixed methodology . . . 15

4.2 Literature study . . . 15

4.3 Nunamaker and Chen’s system development . . . 15

4.3.1 Construct a conceptual framework . . . 15

4.3.2 Develop a system architecture . . . 16

4.3.3 Analyze and design the system . . . 16

4.3.4 Build the system . . . 16

4.3.5 Observe and evaluate the system . . . 16

4.4 Controlled Experiment . . . 16


5 Implementation 17
5.1 Construct a conceptual framework . . . 17
5.1.1 Problem Tree . . . 17
5.1.2 Augmented Reality . . . 17
5.1.3 Eye tracking . . . 18
5.1.4 Visual guiding . . . 18
5.1.5 Controlled Experiment . . . 19

5.2 Develop a system architecture . . . 19

5.2.1 Head mounted Augmented Reality . . . 20

5.2.2 Eye Tracker . . . 20

5.2.3 Computer . . . 21

5.3 Analyze and design the system . . . 21

5.3.1 Choice of visual guiding method . . . 22

5.3.2 Design of the subtle cue . . . 22

5.3.3 Posner Cueing Task . . . 23

5.3.4 Position of the subtle cue . . . 25

5.3.5 System functionality . . . 25

5.3.6 Synchronization of Hololens and Eye tracker . . . 26

5.3.7 System Requirements . . . 27

5.4 Build the system . . . 27

5.4.1 Implementing the system flow . . . 28

5.4.2 Subtle Cue . . . 28

5.4.3 Communication . . . 29

5.4.4 Synchronization . . . 29

5.5 Observe and evaluate the system . . . 30

5.5.1 Testing . . . 30

6 Result 31
6.1 General description . . . 31

6.2 Setup . . . 31

6.3 Independent and Dependent Variables . . . 32

6.4 Participants . . . 32

6.5 Procedure . . . 33

6.6 Processing and gathering the data . . . 34

6.7 Pilot experiment . . . 35

6.8 Final experiment result . . . 36

7 Discussion and Analysis 41
7.1 Analysis Result . . . 41

7.2 Methodology . . . 42

7.3 Related work . . . 42

7.4 Ethics . . . 43

7.5 Threats to validity . . . 43

8 Conclusion and future work 45
8.1 Hypothesis . . . 45

8.2 Research questions . . . 45

8.3 Contribution . . . 45

8.4 Future work . . . 46


A Test cases 50


1 Introduction

This chapter introduces the background of this thesis, by first introducing the larger scope and context of the work and then narrowing it down to a research purpose, hypothesis, and research questions.

1.1 Background

Human decision making has always been an important factor in the design process for systems and items. In 1988, Donald Norman introduced the concept of affordances into the field of Human-Computer Interaction (HCI) [1]. Norman described affordance as the design aspect of an item that indicates how it can be used, meaning that by just seeing or feeling an item, you know what it is for [1]. When it comes to digital interfaces, this is something people encounter every day without reflecting upon it, for example, clickable buttons and sliders [2]. With the digitization of the world, affordance has become a fundamental design concept within the HCI field and plays a big part in the development of interfaces [3]. Since then, other concepts have emerged that address the possibility of influencing human decision making, among them persuasive technologies [4] and digital nudging [5][6]. These are two concepts that aim to influence the users' decision making through design [7].

Nudging is a term that originates from behavioral economics and social psychology [7]. It can be described as influencing an individual by giving them subtle hints and, by doing so, guiding them into making the desired decision [8]. There are different types of nudges; one example is the attraction of visual attention through saliency. If certain items are visually exposed to people, it has been shown that they are more prone to consume them [9]. Furthermore, studies have shown that placing healthier items, such as fruit, next to the cashier has a positive effect on sales [10]. This implies that visual attention plays a large part in our decision making and that it is possible to nudge us into making healthier decisions. With the growing number of decisions being made on digital systems and screens, it is natural that digital nudging is gaining relevance [6]. Despite this, the research in the field is insufficient and often limited to static screens and interfaces [5][6].

While digital nudging focuses on steering users toward targeted behaviors and decisions [8], persuasive technology is more oriented toward changing attitudes and behaviours [4]. B.J. Fogg describes persuasive technology as technology that is specifically designed to persuade people, where the persuasion is defined as non-forceful [4]. Applications that build on this concept are a growing trend. We now have applications that help us exercise, change smoking habits, and guide us into healthier eating [11], and it has been shown that they actually work and positively affect our behaviour [12].

With the rapid development of Augmented Reality and head-worn displays, it is now possible to simulate digital interfaces everywhere you go by adding holograms to the physical world [13]. This blending of the physical and digital world pushes the boundary of persuasive technology and digital nudging, opening up the potential for new applications within the field of human-computer interaction [11]. One such application area is subtle gaze direction, where unobtrusive stimuli are introduced to steer the visual attention of the user [14]. Work has been done trying to subtly guide the gaze of the user in immersive environments, although none has used real Augmented Reality with head-worn displays [15][16]. Instead, they modulate the saliency in video images [17], or use Virtual Reality to simulate an Augmented Reality environment and, with the help of eye trackers, measure where people look. The results show that it is possible to guide the visual attention of the user in a subtle way in simulated Augmented Reality, either by using subtle cues [16] or with unobtrusive gaze guiding stimuli [15].

1.2 Conceptual Vision

Head-worn Augmented Reality displays are now a reality. This opens up new areas for digital nudging and persuasive technology. A digital system that can nudge the user in everyday situations could potentially be used to make healthier, more economical, or more climate-friendly decisions. Imagine a situation where you are grocery shopping wearing a head-worn Augmented Reality display. You want to eat healthier this week, so you configure your personal persuasive system accordingly. When you are standing in front of the cereal section looking at the vast number of different brands, the system can nudge you into picking the healthiest one. This can be done by introducing visual stimuli and thereby directing your visual attention to the target, making it more salient. Likewise, if you want to be more climate-friendly, the system can nudge you into picking the more climate-friendly brand.

1.3 Research purpose and problem statement

The purpose of this thesis is to investigate how subtle gaze guidance can be implemented using wearable Augmented Reality technology. This is done as part of a long term research goal to realize the vision introduced in the previous section - wearable persuasive systems that can influence the decision making of the user.

As mentioned above, subtle cueing and unobtrusive stimuli have been proposed as appropriate mechanisms for visual guidance in a head-mounted Augmented Reality environment. Furthermore, research within the field of nudging shows that visual attention can change the attitude and decision making of the user. Considering this, the following hypothesis and research questions are formulated.

1.3.1 Hypothesis and Research questions

Hypothesis: Subtle cueing can be used as a digital nudge in a head-worn Augmented Reality environment.

RQ 1: Which known method for subtly attracting visual attention is appropriate for head-worn Augmented Reality?

RQ 2: To what extent is it possible to subtly alter the visual attention of a person, with head-worn Augmented Reality?

RQ 3: How can eye trackers be used to track the gaze in a head-worn Augmented Reality system?


1.4 Limitations

• The Microsoft Hololens is used as the wearable Augmented Reality device, further described in section 2.2.

• Specially modified Tobii Pro Glasses 2 are used as the eye tracker. These glasses were custom made by Tobii and provided to us through the Humanities Lab in Lund. They are further described in section 5.2.2.

• Only one visual guiding method is implemented and tested.

• The controlled experiment only consists of a Posner cueing task, further described in section 2.7.


2 Theoretical Background

This chapter describes technical concepts and theories concerning this thesis.

2.1 Augmented Reality

Augmented Reality (AR) enhances how humans perceive reality by combining virtual objects with the real world. To be defined as Augmented Reality, a system must meet three requirements: it must combine the real and the digital world, the digital and the real world must be geometrically aligned, and it must run in real time [13].

This can be done with numerous devices such as mobile phones, tablets, and head-worn devices. These devices can enhance all senses through different types of data-generated objects, realized through video, sound, or GPS data. The data-generated impressions make it easy for the human to interact with and manipulate the environment. AR and Virtual Reality (VR) are often confused with each other, but there is one significant difference: in AR, parts of the environment are digital, unlike VR, where the whole environment is digital [18].

2.2 Microsoft Hololens

Microsoft Hololens is a head-mounted Augmented Reality display that is completely wireless. The display is see-through, meaning that the user can see the physical world while wearing the device. It runs Windows 10 and allows the user to place digital objects in the physical environment, creating an Augmented Reality. The holograms can change and be interacted with using gestures. The field of view in which holograms are visible is 30×17° [19].

2.3 Unity 3D Engine

The Unity 3D engine is a world-leading game development platform that allows the user to create an immersive environment for the Microsoft Hololens. The Unity interface allows the user to create game objects and drag them around in the scene, placing them in the world. A game object has many attributes, among them shape and size, and by attaching scripts to it, it is possible to create complex movements and interactions. The scripting language in Unity is mainly C#.
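
For illustration, a minimal script of this kind could look as follows. The class name and values are only examples and are not taken from our prototype.

```csharp
using UnityEngine;

// Illustrative Unity script (not from the thesis prototype): creates a sphere game
// object from code and moves it every frame - the kind of behaviour that is otherwise
// set up by dragging objects into the scene and attaching scripts in the editor.
public class ExampleBehaviour : MonoBehaviour
{
    private GameObject sphere;

    void Start()
    {
        sphere = GameObject.CreatePrimitive(PrimitiveType.Sphere);
        sphere.transform.localScale = Vector3.one * 0.05f;   // shape and size are attributes of the game object
    }

    void Update()
    {
        // A simple scripted movement: bob the sphere up and down over time.
        sphere.transform.position = new Vector3(0f, Mathf.Sin(Time.time) * 0.1f, 2f);
    }
}
```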

2.3.1 Mixed Reality Toolkit

Mixed Reality Toolkit is an open-source project that strives to lower the entry barrier for Mixed Reality application development. It requires Windows 10, Unity 3D, and Visual Studio 2017. The toolkit enables easy configuration of the basic building blocks of the Hololens, including the camera, the scenery, and inputs to the system [20].


2.4 Human-Computer Interaction

Human-Computer Interaction (HCI) is a field of study that focuses on the interaction between humans and computers. Traditionally, computers and graphical interfaces on screens were the primary research targets. However, with the rapid development of technology, HCI now covers almost all information systems, including mobile phones and immersive environments [21].

Within HCI, some essential concepts have emerged that explain not only how we interact with systems but also how systems can interact with us. This is known as implicit and explicit interaction [22], presented in the table below.

| | Input | Output |
| --- | --- | --- |
| Explicit | Input to the system that is deliberate, e.g. using the clicker. | Output from the system that is an immediate reaction to explicit input, e.g. a new window opens when using the clicker. |
| Implicit | Input to the system that is not deliberate, e.g. the system monitors the surroundings looking for regions of interest. | Output that is dependent on the implicit input, e.g. visual guidance towards a particular region of interest. |

Table 1: Explicit/Implicit input and output.

Another concept within the field of HCI is peripheral interaction. This research field within HCI explores interaction that takes place in the periphery of attention, while the main focus is somewhere else. Interactions in the periphery require little processing, and their primary goal is to fluently embed another level of interaction into the user's everyday life [23].

2.5 The Human Vision

Our vision starts with light, which enters the eye through the cornea, pupil, and lens. The pupil regulates the amount of light that enters the eye and can contract or expand depending on the surrounding light. The cornea and the lens focus the light and project it onto the retina, which translates this signal to the brain. To be able to do so, the retina is filled with photosensitive receptors called rods and cones. Rods can function with very little light, but can only produce blurry and general images. Cones, on the other hand, require more light but can generate colorful and detailed images [24].


Figure 1: Anatomy of the human eye [24].

These receptors are unevenly distributed on the retina, and because of this, the visual field can be divided into two major parts, the focused area of vision and the peripheral view.

The focused area of our vision has high acuity because the light is projected onto the fovea, an area on the retina with a high density of cones, which produces sharp and colorful images. This is the central area in figure 2 below.

The peripheral vision covers the majority of our visual field. In the peripheral field, the light is projected onto rods, and the further out towards the far periphery, the fewer the rods. This means that images in our peripheral view are low in acuity and color [25]. Figure 2 below illustrates how much of our visual field is covered by the peripheral view.

Figure 2: The peripheral vision of the human eye [26].

Another important aspect of our vision is how we move our eyes. The limited high-acuity area of our central field of view forces us to move our eyes and focus on the objects we want to perceive. There are mainly two types of eye movements: fixations and saccades [27].

Saccades

A saccade is a rapid eye movement from one point of focus to another. A saccade takes between 20-40 ms to perform and an average of 200 ms to prepare.


Fixations

Between saccades, the eyes land and fixate on a point, aligning the fovea with the target, which enables the processing of details. When scanning an area for information, the duration of a fixation is normally between 50 and 600 ms.
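
In recorded gaze data, the two movement types can be separated with a simple velocity threshold: samples whose angular velocity exceeds a threshold (commonly around 30°/s) are treated as belonging to a saccade and the rest as fixations. The sketch below only illustrates this idea; the sample format, sampling rate, and threshold are assumptions and not the processing used later in this thesis.

```csharp
// Illustrative velocity-threshold (I-VT style) labelling of gaze samples.
// The sample format, sampling rate and threshold are assumptions.
public static class GazeClassifier
{
    public static bool[] MarkSaccadeSamples(double[] gazeAngleDeg, double sampleRateHz,
                                            double thresholdDegPerSec = 30.0)
    {
        var isSaccade = new bool[gazeAngleDeg.Length];
        for (int i = 1; i < gazeAngleDeg.Length; i++)
        {
            // Angular velocity between two consecutive samples.
            double velocity = System.Math.Abs(gazeAngleDeg[i] - gazeAngleDeg[i - 1]) * sampleRateHz;
            isSaccade[i] = velocity > thresholdDegPerSec;
        }
        return isSaccade;
    }
}
```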

2.6 Visual Attention

Attention in a broader sense is defined as the process by which we filter out information from our surroundings in order to selectively process chosen parts. Visual attention concerns the inputs to our visual system, how we decide what to focus our vision on, and how we process what we perceive. These mechanisms are categorized into bottom-up processes and top-down processes [28].

Bottom-up Processing

Bottom-up processes are normative and fast. They rely on raw visual input that travels in one direction from the retina to the brain and as such, do not use other cognitive processes to analyze the data further. The result is an image with no further context.

Top-down Processing

Top-down processing, on the other hand, utilizes human cognition. These processes are slower and combine contextual information with the visual input to process the information. For example, by knowing the context of a sentence, it is possible to figure out a poorly written word.

2.7 Posner Cueing Task

The Posner cueing task, also known as the Posner paradigm, is a well-known standard for testing human attention in psychology [29]. The task was standardized by Michael Posner in 1980 [30]. It is used to study reaction time to target stimuli under different cue conditions. In the standard setup, the experiment participant is seated in front of a 2D screen. The participant is asked to focus on a fixation point in the middle of the screen, where the fixation point is often a cross or a dot. To the left and right of the fixation point are two square boxes. At fixed intervals, a cue appears in one of the boxes. Right after the cue disappears, a target appears in one of the boxes, and the participant shifts attention to it. The most common way of measuring the reaction time is by letting the participant press a button on a keyboard [29]. Other ways include using eye trackers and measuring saccades. The trials are repeated a number of times determined by the experimenters. Two different types of cues can be used, presented in figure 3 below.


Figure 3: The endogenous cue is presented at the fixation point while the exogenous cue is presented at the target. The first relies on input from the central visual field and the second on input from the peripheral view [29].

Posner also alternated between valid, invalid, and neutral trials, where 80% of the cued trials were valid and 20% invalid. These three conditions were later used to analyze whether the stimuli helped or hindered attention performance [29].
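
For concreteness, the structure of a single trial can be sketched in code. The object names, durations, and validity handling below are placeholders that only illustrate the paradigm, not the parameters of any particular study.

```csharp
using System.Collections;
using UnityEngine;

// Illustrative outline of one Posner-style trial as a Unity coroutine.
// Object references, durations and the validity handling are placeholders.
public class PosnerTrial : MonoBehaviour
{
    public GameObject fixationPoint, leftBox, rightBox, cue, target;

    public IEnumerator RunTrial(bool cueLeft, bool validCue)
    {
        fixationPoint.SetActive(true);
        yield return new WaitForSeconds(1.0f);            // participant fixates the centre

        GameObject cuedBox = cueLeft ? leftBox : rightBox;
        cue.transform.position = cuedBox.transform.position;
        cue.SetActive(true);
        yield return new WaitForSeconds(0.1f);             // brief cue presentation
        cue.SetActive(false);

        // In a valid trial the target appears in the cued box, otherwise in the opposite box.
        GameObject targetBox = validCue ? cuedBox : (cueLeft ? rightBox : leftBox);
        target.transform.position = targetBox.transform.position;
        target.SetActive(true);                            // reaction time is measured from this moment
    }
}
```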

2.8 Eye tracking

Eye tracking is a method of studying the visual attention of a human, also known as our point of gaze [31]. The most used technique for eye tracking is pupil center corneal reflection (PCCR) [32]. The concept of PCCR is that an infrared light source is directed toward the pupil of the eye. This causes reflections that are visible and captured by a camera directed at the eye. From the angle between the reflections and the cornea, a vector can be calculated. By then combining the reflection and the direction of the vector, the point of gaze is calculated. All of this is possible using eye trackers [32].

There are different types of eye trackers, such as screen-based eye trackers and eye tracking glasses. Screen-based eye tracking modules are best if the research is based on the subject sitting still in front of a screen [32]. However, if the research requires mobility, eye tracking glasses should be used [33]. Eye tracking is used in a wide variety of fields ranging from psychology to neuromarketing. In psychology, it is used in tests like the implicit association test (IAT), and in neuromarketing, it is used to see where people look when they walk through a shop [33].

2.8.1 Systematic errors

As mentioned above, eye trackers rely on the pupil center of the eye to accurately measure the gaze vector and, as such, the fixation point. The problem is that the human pupil is not a static artifact; it is very dynamic, and small changes in light affect the size of the pupil. In addition, the pupil is not a perfect circle, and its contractions and expansions are often asymmetrical [34]. This means that changes in lighting conditions will affect the center of the pupil and, therefore, also the accuracy of the measured gaze data. Studies show that errors of up to 2° can be attributed to changes in lighting conditions [34].

This is referred to as a systematic error since the effect is well documented, and measures can therefore be taken to mitigate it. One such measure is to always have fixed lighting conditions when collecting eye tracking data [34].

2.8.2 Effect of small errors

Measuring the binocular gaze fixation point requires data from both eyes. The vergence angle is the angle between the gaze vectors of the eyes at the target fixation point, as seen in figure 4. The distance to the fixation point is inversely related to the vergence angle, which means that a small change in the vergence angle corresponds to a larger change in distance the further away the fixation point is, as illustrated in figure 4.

Figure 4: Left figure shows binocular fixation of a far and a near target. LE - left eye; RE - right eye; alpha - vergence angle; iod - interocular distance. Right figure shows the distance to the binocular fixation point (BFP) as a function of the vergence angle. The dashed line represents a 2° error [34].

As can be seen in the plot, an error of 2° causes large fluctuations in distance at distances as close as 60 cm, which is a common distance in typical eye tracking applications. The implication is that even small measurement errors can have significant effects on the estimated distance [34].
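
To make the relationship concrete, the distance d to the binocular fixation point follows from the vergence angle α and the interocular distance iod as

d = (iod/2) / tan(α/2)

Assuming an interocular distance of 6.5 cm (an assumed typical value, not a figure taken from [34]), a fixation point at 60 cm corresponds to α ≈ 6.2°. If the measured vergence is 2° too small (α ≈ 4.2°), the computed distance becomes roughly 89 cm, so a 2° angular error turns into a distance error of almost 30 cm at this viewing distance.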

2.8.3 Tobii Software

The Tobii software enables the user to import the gaze data recorded by the Tobii Pro Glasses 2 and analyze it [35]. It provides easy-to-use functions to replay recordings or to analyze them in more depth. Recordings of the gaze data can also be exported into larger spreadsheets.


3 Related work

This chapter presents research related to this thesis. The related works presented are associated with gaze guiding in immersive environments and how different visual gaze guiding techniques are used to attract attention. Comments on each paper explain why it is relevant to the thesis.

3.1 "A Guided User Experience Using Subtle Gaze Direction"

"A Guided User Experience Using Subtle Gaze Direction" is written by Eli Ben-Joseph and Eric Greenstein [36]. In the article, they demonstrate how one can guide the gaze of a user without the user being aware of the guidance. This is possible with a technique that takes advantage of the biology of the eye. The authors modulate the frequency of regions in images to direct the attention of a user on a 2D monitor. The frequency they used to modulate the image was 30 Hz, which was the maximum frequency the monitor could handle.

They present a hypothesis stating that "flickering visible only in the periphery of a user's field of view can subtly guide their gaze to an area of interest" and validate it with an experiment on test persons, making the research approach experimental. They created an experiment with six teapots, where three of them are modulated and flicker at a different frequency. For ten seconds, they sample data about the test persons' gaze with eye trackers and then analyze the data. Previous studies by others have used subtle modulation of images to attract viewers' attention. The results show that the users tend to look at the modulated teapots rather than the non-modulated teapots. Furthermore, the users did not notice any flickering during the test.

3.1.1 Comments

The results from the experiment validated the initial hypothesis that flickering visible only in the peripheral field of a user can subtly guide their gaze to an area of interest. This can be useful for our thesis if it is possible to implement in Augmented Reality.

3.2 "Subtle Cueing for Visual Search in Head-Tracked Head Worn Dis-plays"

Weiquan Lu et al. [16] investigate the use of subtle cueing in goal-oriented visual search. The problem they identify is that traditional use of explicit cues can degrade visual search performance by introducing distortion and clutter. Furthermore, they point out the lack of research on subtle cueing techniques for Augmented Reality and Head-Worn Displays, as most of the work has been done in desktop applications and on traditional screens.

The method used is a controlled experiment where the stimuli introduced can be seen in figure 5 below.

Figure 5: The cues used to overlap the cross. The subtle cue is almost invisible and as such does not clutter the environment [16].

To construct the environment, they simulated an Augmented Reality environment in which the participants were asked to find a cross. The environment was a 4096x2048 pixel panoramic image replicating an outdoor scene. The cue was introduced over the cross at different opacity levels, and the participants were asked to find the cross.

The results show that reaction time decreased when the subtle cues were introduced, and their conclusion is that subtle cueing is a possible cueing mechanism for Head-Worn Displays.

3.2.1 Comments

This paper is interesting for our thesis because of its results. It introduces a possible unobtrusive method for subtle cueing that works in simulated Augmented Reality and has been shown to attract visual attention.

3.3 "Comparing unobtrusive gaze guiding stimuli in head-mounted dis-plays"

Steve Grogorick et al. [15] compare different gaze guidance techniques in an immersive environment. A red dot, referred to as ColorDot and based on the work by Dorr et al. [37], is implemented and tested. The color dot is presented for 120 ms on the target and shown in intervals of 2 seconds. Furthermore, Subtle Gaze Direction, proposed by Bailey et al. [14], is another technique tested. This technique modulates the luminance and exploits our peripheral vision by introducing stimuli in the peripheral field of view.

The method for testing the stimuli is a controlled experiment. In a Virtual Reality environment, the participants were asked to observe a scene. Stimuli were then introduced at fixed locations, and the gaze data was collected.

The results show that for both cueing mechanisms there was a significant increase in attention to the targeted cueing spot, shown in figure 6. Furthermore, they conclude that none of the stimuli remained completely unnoticed.

Figure 6: Heat map showing the effect of the visual stimuli. The red ring is the area where the stimulus was introduced [15].

3.3.1 Comments

The paper is interesting for our thesis because it introduces techniques for gaze guiding. The article compares more than the two techniques brought up here, but these two are the ones that can be applied in an Augmented Reality environment where the background image cannot be modulated. Furthermore, the paper brings up the important factor of removing the stimulus when the gaze approaches it, to improve the subtleness. Their results also indicate that none of the stimuli were completely imperceivable, giving us a good basis for discussing our research questions, see 1.3.1.

3.4 "Augmented Reality Assistance in the Central Field-of-View Out-performs Peripheral Displays for Order Picking: Results from a Virtual Reality Simulation Study"

Patrick Renner and Thies Pfeiffer investigate gaze guidance in Augmented Reality for industry [38]. They discuss visual guiding techniques and how to apply them to Augmented Reality. They then simulate a scenario in Virtual Reality, and from this simulation work they implement two different Augmented Reality prototypes, one with Google Glass and the other with Microsoft Hololens.

Previously, others have used techniques such as highlighting and arrow pointers, and related work has also combined highlighting and arrows [38].

The research approach is a controlled experiment where independent variables are varied to influence dependent variables. They first did the experiment without AR guidance to have a baseline value to compare with; this non-AR technique is called pick-by-light. They then performed the same task with the guiding techniques on Google Glass and Microsoft Hololens. The techniques tested on Google Glass and Microsoft Hololens were arrow-based guidance and spherical wave-based guidance. The task in the controlled experiment consisted of placing bricks in different baskets.

Figure 7: In picture a) from [38], the pick-by-light guidance is demonstrated in VR; in b), arrow pointers are demonstrated on Google Glass; and in d), arrow pointers are demonstrated on Microsoft Hololens.

The results show that with Google Glass, the participants perform worse than with the non-AR guidance. However, with Microsoft Hololens and visual guidance, the participants perform the task quicker.


3.4.1 Comments

We think this will contribute to our work by showing that it is possible to mix different techniques for visual guidance. Furthermore, it shows the possibilities of Microsoft Hololens combined with visual guidance, and that tasks can be performed slightly faster with visual guidance.

3.5 "Voluntary Spatial Attention has Different Effects on Voluntary and Reflexive Saccades"

K. Seidlits et al. [39] investigate how different spatial cueing tasks affect saccades. The problem they identify is that there is disagreement within the field on whether spatial attention inhibits, facilitates, or has no effect on saccadic eye movements. Previous studies suggest that endogenous saccadic movement is preceded by a shift of attention. Further support for this is the so-called pre-motor theory, which says that the preparation of an attention shift is automatically followed by a manual response.

However, the authors bring up the fact that other studies show the opposite, where saccades are delayed in tasks with spatial attention activated. The main problem the authors find with the different studies is the lack of consistency when differentiating between different kinds of saccades. By differentiating two types of saccades, reflexive and voluntary, the authors hope to bring clarity to this in their work.

The method for doing so is a Posner cueing task with endogenous cues, meaning that the cues are located in the central focus of vision. In this case, the cues are arrows pointing towards the location of the target.

Four different scenarios were created and tested, with one being of particular interest for this thesis. In this block, called Prosaccade-Mixed, the authors use 50% no cues and 50% cues, which is the same setup as in our experiment, and all of the cues are valid. The result from this experiment is that cued targets have a faster reaction time (cued ∼225 ms, uncued ∼265 ms).

3.5.1 Comments

This paper is interesting for our thesis because the authors conduct a Posner cueing task and measure the reaction time of the saccades, which is very similar to our setup. In contrast to us, they use endogenous cues while we use exogenous ones. Furthermore, they use a computer screen during the experiment while we use a Hololens and Augmented Reality.


4 Method

This chapter presents the methods used in this thesis and how they are adjusted to fit our system and needs.

4.1 Mixed methodology

A mixed methodology [40] is used in this thesis. It consists of a literature study, Nunamaker and Chen's development method for information systems [41], and a controlled experiment [40]. The literature study is used to gain background knowledge of the area and to help build the conceptual framework. Nunamaker and Chen's development process is used to develop the system, and a controlled experiment is conducted to investigate our hypothesis and research questions with the help of the system.

4.2 Literature study

A literature review is used in this thesis to get a better understanding of the research area and to increase our knowledge in the field. The areas studied are Augmented Reality, Gaze guidance and Eye tracking. Two reliable database sources, Institute of Electrical and Electronics Engineers (IEEE) and Association for Computing Machinery (ACM) are used in the literature search. Papers are examined and evaluated in relevance to the thesis and the most prominent ones are presented in the related work section, see section 3.

4.3 Nunamaker and Chen’s system development

Nunamaker and Chen's development process is used to develop the system [41]. It is chosen because of its iterative attributes and its flexibility when it comes to development from a concept to a complete system. The process consists of five development states, visualized in figure 8 below. The iterative nature of the method enables a systematic and organized approach to system development: at any time during development, it is possible to move back to an earlier state to start over or improve a certain aspect of the system.

Figure 8: The Nunamaker and Chen’s development process [41].

4.3.1 Construct a conceptual framework

The first state is to construct a conceptual framework. The main problem is divided into smaller parts, where different problem areas are identified. This problem tree increases the knowledge of the problem and makes it easier to grasp. A literature study, see 4.2, is then conducted to gain knowledge about the areas identified in the problem tree. The problem tree is presented in 5.1.

4.3.2 Develop a system architecture

The second state is to develop a system architecture. This is done using rough diagrams and sketches. The diagrams are used to get an overview of the system's functionality and to identify different components and their relationships. The results are presented in 5.2.

4.3.3 Analyze and design the system

The third state is to analyze and design the system. In this state, the abstraction level of the UML diagrams is reduced, increasing the detail. Furthermore, a state diagram is created for the whole system and sequence diagrams for individual functions to visualize the data flow. From the diagrams, different system requirements are identified for later testing in 5.5. In addition, the visual guiding method and the task are designed by analyzing different criteria. The results are presented in 5.3.

4.3.4 Build the system

The fourth state is to build and implement the system. The system is developed according to the diagrams from the two previous states. The results are presented in 5.4.

4.3.5 Observe and evaluate the system

The last state is to observe and evaluate the system. This state verifies the system: test cases are constructed based on the system requirements identified in 5.3.7 and test the necessary functions of the system built in 5.4. The results are presented in 5.5.

4.4 Controlled Experiment

A controlled experiment is conducted after the system is created and tested by Nunamaker and Chen’s development process. The controlled experiment is chosen because of its ability to measure how varying one variable affects the outcome [40]. The controlled experiment tests our hypothesis presented in 1.3.1.

Subtle cueing can be used as a digital nudge in an Augmented Reality environment.

Independent and dependent variables are identified and an experiment procedure is outlined. Before the final experiment is conducted, a pilot test of the whole experiment is done to evaluate that everything works and that all necessary parameters are stored. After the final experiment, a short survey is presented with questions regarding the users' experience. The results of the controlled experiment are presented in section 6.


5 Implementation

In this chapter, the implementation and testing of the system are presented.

5.1 Construct a conceptual framework

As seen in 4.3, the first step of Nunamaker and Chen's process is to construct a conceptual framework. To do so, we chose to break down our problem into subproblems to get a better idea of what needs to be solved.

5.1.1 Problem Tree

The problem tree created is illustrated in figure 9 below. The four major parts identified are Visual Guiding, Augmented Reality, Eye Tracking, and Controlled Experiment.

Figure 9: Problem breakdown of the whole system

In the coming sections, the sub-problems are further divided to gain an even better understanding of the problem.

5.1.2 Augmented Reality

The first branch is Augmented Reality. This part is further broken down into hardware, implementation, integration, and communication, visualized in figure 10 below. Hardware includes the choice of hardware. Implementation includes designing, coding, and testing the Augmented Reality system. The communication problem concerns the communication between the Augmented Reality system and the eye tracker.


Figure 10: Augmented Reality divided into sub-problems

5.1.3 Eye tracking

The eye tracking part is used to evaluate the visual guiding implemented. The problem is divided into four parts: hardware, implementation, integration, and communication, which can be seen in figure 11. The hardware part includes the choice of hardware, and the implementation includes the designing, coding, and testing of the eye tracker. The communication problem concerns the communication between the eye tracker and the AR system.

Figure 11: Eye tracking divided into sub-problems.

5.1.4 Visual guiding

Implementing the visual guiding is the last problem. We need to find an appropriate method and tune its parameters to fit our software and hardware. This breakdown can be seen in figure 12 below. The implementation includes designing, coding, and testing. The method part is the investigation of what kind of visual guiding is suitable for head-worn Augmented Reality. To do this, we look into the literature to see what others have done.


Figure 12: Visual guiding divided into sub-problems.

5.1.5 Controlled Experiment

The controlled experiment is how we test the visual guiding to see if it has any effect on the user. The system we develop during Nunamaker and Chen's system development is used in the test. Constructing the experiment includes identifying the dependent and independent variables, and a procedure protocol is needed to make sure every participant is treated equally. Furthermore, the collected data needs to be analyzed and put into context. The controlled experiment is further described in section 6.

Figure 13: Controlled Experiment divided into subproblems.

5.2 Develop a system architecture

From breaking down our system into different problems, we create a system architecture. An overview of the system with all its parts is presented in figure 14 below. Our system is divided into three major parts: the Microsoft Hololens, the Tobii eye tracker, and a computer that acts as a controller. The communication between the parts is over WiFi.


Figure 14: System architecture

5.2.1 Head mounted Augmented Reality

The head-mounted Augmented Reality device is a Microsoft Hololens. The purpose of the Hololens is to implement the task needed for the controlled experiment, including visual guiding. The Hololens is wearable and, as such, fits our vision and purpose described in 1.2 and 1.3. The Hololens is an important part of the system. It handles the Augmented Reality, runs the controlled experiment, and has to be able to communicate with both the computer to send necessary data and with the eye tracker to start and end recordings.

5.2.2 Eye Tracker

The eye tracker used is a pair of modified Tobii Pro Glasses 2, a wearable eye tracker from Tobii. It is flexible and mobile, which opens up new opportunities for eye tracking. With the device, it is possible to gather and analyze gaze data in mobile environments, for example when walking in a store [35]. The eye tracking technique used by the system is corneal reflection with dark pupil, meaning that some of the problems introduced in 2.8.1 might occur. Figure 15 below shows the eye tracker in its original, unmodified state.


Figure 15: Features and design of the Tobii Pro Glasses 2 [35].

The design of our eye tracker has been specially modified by Tobii to fit the Microsoft Hololens. In the process, the Full HD camera has been removed, meaning the system no longer has camera feedback. The implication of this is problems regarding calibration of the device, since it has to rely on the Hololens camera. Figure 16 below shows how the modified eye tracker looks.

Figure 16: The modified Tobii Pro Glasses 2

The purpose of the eye tracker is to measure the gaze of the user during the controlled experiment to get accurate data. The data collected from the eye tracker is stored on an external SD-card and is later retrieved. The eye tracker is controlled by the Hololens, connected to it over WiFi. Furthermore, the eye tracker provides a live data feed, over UDP which the computer can read. This is used to make sure that the eye tracker is getting valid eye tracking samples.

5.2.3 Computer

The computer works as the controller of the system. It monitors the eye tracking data provided over UDP, from the eye tracker to make sure it works. Furthermore, it acts as a server which the Hololens can connect to and send necessary data, further described in section 5.4.3.
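
As an illustration of this monitoring role only, a minimal computer-side listener could look like the sketch below. The port number is an assumption, and the actual Tobii live data protocol is not reproduced here.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

// Illustrative computer-side monitor: listens to the live gaze stream over UDP
// and prints every packet, so the experiment leader can see that samples keep
// arriving. The port number is an assumption, not taken from the Tobii documentation.
class GazeMonitor
{
    static void Main()
    {
        using (var client = new UdpClient(49152))            // assumed local port
        {
            var remote = new IPEndPoint(IPAddress.Any, 0);
            while (true)
            {
                byte[] data = client.Receive(ref remote);     // blocking receive
                Console.WriteLine(Encoding.UTF8.GetString(data));
            }
        }
    }
}
```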

5.3 Analyze and design the system

In this chapter, the last preparations before implementing the system are described. This includes the construction of flowcharts, sequence diagrams, and a state machine. Based on these diagrams, we formulate some system requirements that are later tested in 5.5.1. Furthermore, the choice of visual guiding and the task are discussed and presented.

5.3.1 Choice of visual guiding method

Designing the visual guiding is an essential step in our system. Based on our system and its limitations, we came up with the following criteria for our method.

• Cannot be gaze contingent - the specific method cannot rely on instant feedback from the eye tracker. This is because our eye tracker is connected over WiFi, and the latency would be too high for methods that rely on instant feedback.

• Implementable in Augmented Reality - the method needs to be implementable in Augmented Reality. Since we are using see-through Augmented Reality, we have no way of actually modulating or changing the background, since that is the real world.

• Subtle - the method needs to be subtle, meaning that it does not increase clutter in the scene and that it is on the edge of being detectable.

Based on these criteria, we analyzed some different known methods for attracting visual attention as described in the related works, section 3. These included

• Subtle cueing

• Subtle gaze direction

• Saliency Modulation Technique
• Flashing red dot

The result from the analysis and explanations are presented below:

| Method | Not Gaze Contingent | Implementable in Augmented Reality | Subtle |
| --- | --- | --- | --- |
| Subtle Cueing | Yes | Yes | Yes |
| Subtle Gaze Direction | No | Yes | Yes |
| Saliency Modulation Technique | Yes | No | Yes |
| Flashing red dot | No | Yes | Yes |

Table 2: Evaluation of visual guiding methods.

Whether the flashing red dot is gaze contingent or not is up for discussion. Our definition of the method comes from the work by Grogorick et al. [15], where a gaze-contingent deactivation is added as a way to increase subtleness. They also note that previous work does the same, which is why we list the method as gaze contingent.

Based on the analysis, subtle cueing was the only method that passed all the criteria making it our choice of visual guiding.

5.3.2 Design of the subtle cue

To design the subtle cue, we looked into the following three attributes: cue shape, cue opacity, and cue size. These attributes are chosen because they are easy to manipulate in Unity and have been tested before in similar situations [42].

The shape of the cue determines how the cue looks in 3D space. We decided that the subtle cue should have the same shape as the target itself. This is done for simplicity and because previous work suggests that, for subtle cueing, the shape of the cue does not affect the result [42]. Furthermore, these kinds of same-shape cues are used in similar cueing experiments [43]. Just like in that paper, our target and cue will be spheres.

The opacity of the object is how transparent it is. This parameter is varied until the subtle cue is almost invisible but still barely visible and perceivable, hence the subtleness. The size of the cue is how large the subtle cue is. Previous work suggests that an increased cue size has a positive effect in visual search tasks, although it leads to reduced subtleness [42]. With this in mind, it is important to find a good balance between size and subtleness. Typically, in a Posner cueing task, exogenous cues are bound within the target box, having the same dimensions as the target itself. Furthermore, similar experiments utilize cues that are of the same size as the target [43]. Therefore, our subtle cue will have the same size as the target.

Table 3 below summarizes the subtle cue attributes with the corresponding variables to change in Unity.

| Attribute | Variable in Unity | Comment |
| --- | --- | --- |
| Cue Shape | Game objects of different shapes | The cue shape will be the same as the target. |
| Cue Opacity | Alpha channel of the object, values from 0-255 | The alpha value will be changed until the cue is subtle. |
| Cue Size | Scale value of the game object property | The cue will have the same scale values as the target. |

Table 3: Attributes of the subtle cue.

5.3.3 Posner Cueing Task

The task chosen is a variation of the Posner cueing task [30]. The task is chosen because it is a well-known standard for testing human attention. It is well documented and many variations exist, although all examples we found in the literature were done using a standard 2D monitor, making our experiment in head-worn Augmented Reality unique. It also fits our selected visual guiding method, as described above in 5.3.1, where the otherwise easily perceived cue in the Posner cueing task is replaced with a subtle cue. In our test, the saccades to the target are used to calculate the reaction time, which is later analyzed, further described in section 6.6.

Furthermore, we only utilize valid cues. One reason for this is that it simplifies the data gathering: invalid cues might trigger saccades to the wrong side, which would be problematic when processing the data and accurately matching the timestamps. In addition, the goal of the thesis is to guide the visual attention to a region of interest, and as such, we see no reason for invalid cues.

Since it is a well-known paradigm for attention, we find it suitable for testing subtle cueing in an Augmented Reality environment. Since our goal is to explore possible ways of directing visual attention with wearable Augmented Reality systems, it is important that the tests are done in a similar setup. The task is therefore rendered in Augmented Reality by the Hololens, where the objects are attached in space, further described in section 5.4.1.

Moreover, the task facilitates the data gathering because the visual attention shift is either to the right or to the left, which reduces eye tracking errors since we do not need the exact 3D gaze point but can focus on shifts along the x-axis. This also means that the calibration problem discussed in 5.2.2 is diminished. These easily identified shifts are visualized in section 6.6, figure 26. Below is an activity diagram showing the flow of a Posner cueing task, which is later used when building the system, and an image that visualizes the task.


Figure 18: The scenery of a Posner Cueing task. The participants focus on the center fixation point and shift their attention to the target which appears in either the left or right box.

5.3.4 Position of the subtle cue

In a Posner cueing task there are two positions for the cue, see 2.7: exogenous cues, which are located outside the center of focus, or endogenous cues, which are placed in the center of the screen. Because we want to make the cue as subtle as possible, our cue is exogenous and placed in the peripheral view of the participant to make it less noticeable.

5.3.5 System functionality

Before starting to implement the system, we analyzed all the parts we needed. From this, we created a state machine describing our system from start to end, as seen in figure 19 below. The states can be divided into three domains as follows:

Pre task - This domain handles the events that occur before the task is started and contains the following states:

• Tutorial - In this state, a brief tutorial is presented to the participant.
• Calibration - This state verifies that the eye tracker is getting valid samples.
• Sync - This state implements the synchronization algorithm presented in 5.3.6.
• Pregame - This is the last state before the task starts and makes sure the participant's gaze is focused on the center before the task begins.

Task - This domain handles the states during the task itself:

• StareMid - This state handles the timing of when the participant should focus on the center sphere. No subtle cue or target is present.
• SubtleCue - This state is active when the subtle cue is present.
• Stimuli - This state handles the timing of when the target should appear.

Post task - This group includes the states after the task is done:

• EndGame - This state should save and send data to the computer that acts as a server.

The clicker functions refer to the remote click-controller that comes with the Hololens. It can be implemented to control certain events.

Figure 19: Flow of the controlled experiment
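
To illustrate how such a flow can be represented in code, the sketch below uses the state names from figure 19; the transition conditions are placeholders rather than the logic of our implementation.

```csharp
// Illustrative representation of the states in figure 19. The transition
// conditions are placeholders, not the thesis implementation.
public enum ExperimentState { Tutorial, Calibration, Sync, Pregame, StareMid, SubtleCue, Stimuli, EndGame }

public class ExperimentFlow
{
    public ExperimentState State { get; private set; } = ExperimentState.Tutorial;

    // Called on every frame or clicker event to advance the experiment.
    public void Step(bool clickerPressed, bool trialsRemaining)
    {
        switch (State)
        {
            case ExperimentState.Tutorial:    if (clickerPressed) State = ExperimentState.Calibration; break;
            case ExperimentState.Calibration: if (clickerPressed) State = ExperimentState.Sync; break;
            case ExperimentState.Sync:        State = ExperimentState.Pregame; break;
            case ExperimentState.Pregame:     if (clickerPressed) State = ExperimentState.StareMid; break;
            case ExperimentState.StareMid:    State = ExperimentState.SubtleCue; break;
            case ExperimentState.SubtleCue:   State = ExperimentState.Stimuli; break;
            case ExperimentState.Stimuli:     State = trialsRemaining ? ExperimentState.StareMid : ExperimentState.EndGame; break;
            case ExperimentState.EndGame:     break;
        }
    }
}
```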

5.3.6 Synchronization of Hololens and Eye tracker

Because the Hololens and the eye tracker are separate units, we needed a way to synchronize their timestamps in order to measure the reaction time. This is done using a slight variation of Cristian's algorithm [44]. The sequence diagram below illustrates how the algorithm works, and the resulting implementation is described in 5.4.4.


Figure 20: Sequence diagram for the synchronization of the Hololens and the eye tracker.

The Hololens sends a syncEvent containing a timestamp to the eye tracker. This syncEvent is stored on the eye tracker, which assigns a local timestamp to it. To take the latency of the call into consideration, a round trip time is calculated. These values are later used to calculate the offset between the devices as follows:

ets + rtt/2 − ts = offset

where ets is the Hololens timestamp, rtt the round trip time, and ts the eye tracker timestamp. We add half the round trip time because this is the best approximation of when the message actually arrives.
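
Expressed as code, the offset calculation is a direct translation of the formula above (an illustrative sketch; the variable names follow the formula):

```csharp
public static class ClockSync
{
    // Illustrative offset calculation following the formula above.
    // ets: Hololens timestamp sent in the sync event, ts: timestamp assigned by the
    // eye tracker, rtt: measured round trip time (all in the same time unit).
    public static long ComputeOffset(long ets, long ts, long rtt)
    {
        // Half the round trip time approximates when the sync event actually arrived.
        return ets + rtt / 2 - ts;
    }
}
```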

5.3.7 System Requirements

Based on the sequence chart, the state machine, and the system architecture, we identified the following key system requirements. These requirements are tested later in the development process, in section 5.5.

R1. The Hololens should communicate with the eye tracking device.
R2. The Hololens should send necessary data to the computer.
R3. The system should store necessary data locally as a backup.
R4. The computer should get live updates of eye tracking data from the eye tracker.
R5. The position of the cue should be random.
R6. All the objects should fit in the Hololens field of view.

5.4 Build the system


5.4.1 Implementing the system flow

The Augmented Reality system is built using the Unity 3D engine and deployed to the Hololens using Visual Studio 2017. The software is implemented in C#, and by writing scripts and attaching them to game objects, we developed the system according to the state machine presented in figure 19. The main game script is attached to the center fixation point, since this point is present during the whole procedure. The images below show the preview in Unity and the real view from the Hololens underneath.

Figure 21: Image taken from Unity

Figure 22: Image taken from the Hololens

When the program starts, the game objects are rendered in front of the user. They are placed 1 m in front of the participant and are fixed in space, meaning that if the participant moves their head, the objects will not follow.
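
A minimal sketch of this kind of placement is shown below. It is only an illustration, not our actual script; the distance value follows the text above.

```csharp
using UnityEngine;

// Illustrative placement of the task objects: positioned 1 m in front of the user
// at startup and then left alone, so they stay fixed in space when the head moves.
public class PlaceInFrontOfUser : MonoBehaviour
{
    public float distance = 1.0f;

    void Start()
    {
        Transform head = Camera.main.transform;   // on the Hololens, the main camera follows the head
        transform.position = head.position + head.forward * distance;
        transform.LookAt(head);                   // face the user
        // No per-frame update: without code that follows the camera, the object stays world-locked.
    }
}
```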

5.4.2 Subtle Cue

The subtle cue is implemented based on the attributes outlined in section 5.3.2. As seen in the image below, its shape and size are the same as the target's; the only difference is the transparency level. As stated in 5.3.2, this value was varied until the cue was almost invisible. Both of us did this, and the resulting alpha value is 6. This is the value at which we considered the subtle cue to meet the criterion of being barely perceivable.

In the image below, the subtle cue has an alpha value of 6 and is rendered as nearly transparent, while the target has an alpha value of 255 and is rendered as opaque.
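
For illustration, the transparency can be set from a script as sketched below, assuming a material whose rendering mode supports transparency; this is not our code itself.

```csharp
using UnityEngine;

// Illustrative only: sets the alpha channel of this game object's material.
// Assumes the material uses a shader/rendering mode that supports transparency.
public class SubtleCueAppearance : MonoBehaviour
{
    [Range(0, 255)]
    public int alpha = 6;   // the value we settled on after tuning

    void Start()
    {
        Renderer rend = GetComponent<Renderer>();
        Color c = rend.material.color;
        c.a = alpha / 255f;          // Unity colors use the 0-1 range
        rend.material.color = c;
    }
}
```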


5.4.3 Communication

The communication between the Hololens and the eye tracker is implemented using a library called HttpClient and the Tobii API [45]. This library allows simple HTTP REST calls to be made from the Hololens. The POST calls that are made are the following:

• urlbase/api/events/recordings

• urlbase/api/recordings/<rec_id>/start
• urlbase/api/recordings/<rec_id>/stop

The urlbase is the IP address of the eye tracker and is used throughout this section. The first call is used to create a recording, and the eye tracker responds with a recording ID which is stored in the Hololens. This value is then later used in the other calls as <rec_id>, to start and stop the specified recording.
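A minimal sketch of how these calls can be issued with HttpClient is shown below. The class and method names are illustrative, the URLs follow the endpoints listed above, and the parsing of the recording ID from the response is simplified (the real response is a JSON object from which <rec_id> must be extracted).

using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

// Sketch of the recording control calls (assumed names and simplified response handling).
public class EyeTrackerClient
{
    private static readonly HttpClient http = new HttpClient();
    private readonly string urlBase; // IP address of the eye tracker (urlbase in the text)

    public EyeTrackerClient(string urlBase) { this.urlBase = urlBase; }

    // Creates a recording; the response body contains the recording ID (<rec_id>).
    public async Task<string> CreateRecordingAsync()
    {
        HttpResponseMessage resp = await http.PostAsync(
            urlBase + "/api/events/recordings",
            new StringContent("", Encoding.UTF8, "application/json"));
        return await resp.Content.ReadAsStringAsync();
    }

    // Starts and stops the recording identified by recId.
    public Task StartRecordingAsync(string recId) =>
        http.PostAsync(urlBase + "/api/recordings/" + recId + "/start", new StringContent(""));

    public Task StopRecordingAsync(string recId) =>
        http.PostAsync(urlBase + "/api/recordings/" + recId + "/stop", new StringContent(""));
}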

After the task is finished, the data is stored locally by writing it to a file and also sent to a server. The Hololens acts as a client and connects to the server through a TCP socket. The data sent is information about the timestamps, the cue, and the syncing process as a JSON object.

• {"Timestamps":[],"CuedInfo":[],"syncIndex":3,"RTT":66}

For every game iteration, a timestamp is saved before the target appears and information about the subtle cue is stored in CuedInfo.

• 0 - the target was uncued
• 1 - the target was cued

The syncIndex is used in the synchronization process and indicates which of the 10 sync events had the lowest round trip time. This is further described in the next section.
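On the Hololens side, this payload can be represented by a simple serializable class along these lines. The field names mirror the JSON keys shown above; the choice of JSON serializer (for example Unity's JsonUtility) is an assumption.

using System;
using System.Collections.Generic;

// Sketch of the data object sent to the server after the task.
[Serializable]
public class SessionData
{
    public List<long> Timestamps = new List<long>(); // Hololens timestamp saved just before each target appears
    public List<int> CuedInfo = new List<int>();     // 0 = target uncued, 1 = target cued
    public int syncIndex;                            // index of the sync event with the lowest round trip time
    public long RTT;                                 // that lowest round trip time
}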

5.4.4 Synchronization

The synchronization is implemented according to the sequence diagram displayed in section 5.3.6 and by using the Tobii API [45]. To send the sync event to the eye tracker, the following POST command is made from the Hololens.

• urlbase/api/events

The body of the POST call is a JSON object as follows:

• {"ets": 9975889960, "type": "SYNCEVENT", "tag": "syncing"}

where ets is the Hololens timestamp, and type and tag are general information about the event that must be included according to the documentation.

The eye tracker stores this information as follows:

• {"ts":145589660,"s":0,"ets":9975889960,"type":"SYNCEVENT","tag":"syncing"} We also need information about the round trip time, rtt. This is because there is latency over the WiFi. We need to add half of the rtt to the Hololens timestamp because this is

(38)

the best approximation of when the message arrives. The offset between the two devices is then calculated as

following-ets + rtt/2 − ts = offset => 9975889960 + 33 − 145589660 = 9830300333

This resulting offset is later used in the data analysis to transform the Hololens timestamps to the eye tracking domain as follows:

targetHololensTime − offset = targetEyeTrackerTime

With this transformation, it is possible to measure the reaction time because both the target and eye tracking timestamps are in the same domain.

Because the latency affects the accuracy of the synchronization, the sync command is sent 10 times. This number was chosen arbitrarily to introduce redundancy into the system; we felt that 10 was enough to achieve this. The repetition reduces the impact of high, unwanted latency. For each call, the round trip time is calculated, and the call with the lowest value is saved and stored in syncIndex. This syncIndex is used to retrieve the sync event that should be used when calculating the offset.
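A condensed sketch of this procedure is shown below. The stopwatch-based round trip time measurement and the method names are assumptions; the offset itself is computed later in the analysis as offset = ets + rtt/2 − ts, using the timestamp ts that the eye tracker stored for the chosen sync event.

using System;
using System.Diagnostics;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

// Sketch: sends 10 sync events and remembers which one had the lowest round trip time.
public class ClockSync
{
    private static readonly HttpClient http = new HttpClient();

    public static async Task<(int syncIndex, long rtt)> SendSyncEventsAsync(
        string urlBase, Func<long> hololensClock)
    {
        int bestIndex = -1;
        long bestRtt = long.MaxValue;

        for (int i = 0; i < 10; i++)
        {
            long ets = hololensClock(); // Hololens timestamp included in the event body
            string body = "{\"ets\": " + ets + ", \"type\": \"SYNCEVENT\", \"tag\": \"syncing\"}";

            Stopwatch sw = Stopwatch.StartNew();
            await http.PostAsync(urlBase + "/api/events",
                new StringContent(body, Encoding.UTF8, "application/json"));
            sw.Stop();

            if (sw.ElapsedMilliseconds < bestRtt)
            {
                bestRtt = sw.ElapsedMilliseconds; // round trip time of this call
                bestIndex = i;                    // stored as syncIndex in the data sent to the server
            }
        }
        return (bestIndex, bestRtt);
    }
}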

5.5 Observe and evaluate the system

In this section, the system is evaluated thoroughly using the test cases constructed from the requirements presented in 5.3.7.

5.5.1 Testing

To test the requirements presented in 5.3.7, test cases were constructed. Each part of the system was first tested individually, and then the system was tested as a whole. The test cases and test report can be found in Appendix A.

Tobii Pro Lab was used to test and verify that the eye trackers worked and that the participants looked where we wanted them to. Figure 26, presented later, shows an example of how this looked. When testing the whole system, the eye trackers were mounted onto the Hololens as shown in image (a) below.

(a) Eye trackers mounted onto the Hololens (b) System testing by Viktor


6 Result

In this section, the controlled experiment is presented more in detail as well as the results we obtained from the participants.

6.1 General description

As mentioned in section 4.4, the goal of the controlled experiment is to test our hypothesis. The task implemented to do so is the Posner cueing task presented in 5.3.3, which is part of the system built and implemented in section 5. By measuring the time it takes for a saccade to reach the target, we can compare the values for cued and uncued targets. Results from similar Posner cueing tasks show a decreased reaction time for cued targets [46], meaning that if our data indicates a faster reaction time towards the cued targets, there is an effect of the subtle cue. Furthermore, to validate and test the subtleness of the cue, each participant was asked to answer the following questions after the experiment:

• Did you notice anything unusual?
• If so, could you describe it?

• Did you find the system comfortable?

6.2 Setup

The experiment is conducted under a fixed set of conditions as follows:

• Fixed lighting - to counter errors in measurement. All the measurements took place in similar small rooms under similar lighting conditions. This was done to make sure that the holograms rendered equally for all the participants.

• Optimal gaze distance - to counter errors in measurement. To minimize errors, the holograms are placed 1 m in front of the participants. This is the lowest distance recommended by Microsoft, referred to as the comfort zone [47]. Closer than this, the user might feel uneasy, and further away, the eye tracker has a harder time providing correct values, as discussed in 2.8. Even though the game renders in 3D space, a white wall is used as a background to reduce visual contamination.


Figure 24: Example of a participant setup.

6.3 Independent and Dependent Variables

Independent variables - factors in a controlled experiment that are varied.
Dependent variables - variables that depend on the factors varied in the experiment.

In our case, the independent variables are the subtle cue and the user, and the dependent variable is the reaction time. The subtle cue variable indicates whether the subtle cue is present or not. The reaction time is the time from the target appearing until the participant's saccade lands and fixates on it.

6.4 Participants

The number of participants is 14, chosen among students at Malmö University. The participants are all between 20 and 40 years old and had no prior knowledge of the experiment. No major eye deficiencies were reported among the participants. In discussion with our supervisors, a consent form was determined to be sufficient for the experiments to be conducted; it can be viewed in Appendix B. Before each experiment, the participants were presented with this consent form, which they signed and kept. Further discussion about this is presented in 7.4.


6.5 Procedure

Figure 25: Procedure of the experiment

The procedure of the experiment is presented below and illustrated in the flow chart.

Before the start of the experiment, the instructor will start up the program and set up the environment to ensure minimal errors. The Hololens with the eye tracker is then fitted onto the participant.

1. A tutorial introduces the participant to the system and presents the task at hand. The tutorial is a sample of the task but without any presence of the subtle cueing.

2. The participant is asked to look left and right at targets while the eye tracking data is monitored on the computer. If the data is accurate, we proceed to the next state. Otherwise, the Hololens and eye tracker are adjusted on the head.

3. In the synchronization, the sync-events are sent to the eye tracker.

4. The instructors start the task by using the clicker.

5. Data from both the Hololens and the eye tracker is collected during the task.

6. The collected data is processed, analyzed, and plotted onto graphs for easier understanding.


6.6 Processing and gathering the data

To be able to analyze the data, minor pre-processing had to be done. Using Tobii Pro Lab, we could inspect the data and determine whether it could be used for analysis. Figure 26 below shows an example of data that can be analyzed. The eye tracker is able to identify the participant's eyes throughout the test, and the shifts in visual attention are visible when the target appears left or right.

Figure 26: Example of good data from the experiment viewed with Tobii Pro Lab. It shows the saccades and fixation points in both X and Y from the participant.

Figure 27 below shows the result of a participant that cannot be analyzed and plotted. The eye tracking data is not continuous, meaning that much of the time the eyes are not found. This makes it impossible to determine when the participant is reacting to the events. The potential reasons behind this are discussed in 7.1.

Figure 27: These results could not be processed because of the large number of errors.

All of the participants that we could successfully analyze (8/14) had graphs similar to the one in figure 26, but with minor errors where the eye tracker could not find the eyes or where the participant made unwanted saccades. Typically, an unwanted saccade is where the eye tracker registered two saccades to a target instead of one. These unwanted events were identified and removed. This pre-processing left us with data that is consistent and could easily be analyzed.


To analyze the data, we used MATLAB. An Excel file of the eye tracking data was exported from the Tobii Pro Lab software and imported into MATLAB. Furthermore, the data from the Hololens presented in section 5.4.3 is imported. This data was then processed with the following algorithm.

1. Find the beginning of the recording

2. Identify all the saccades and store the timestamps.

3. Remove every other saccade because these are the ones going into the center fixation.

4. Calculate the timestamp offset from the synchronization method.

5. Compare the remaining, correct saccades with the Hololens data to find if they are cued or uncued.

6. Remove the offset from the Hololens timestamp and calculate the reaction time.

7. Save the reaction time in an array.

This algorithm produces two arrays with reaction time values — one with the cued targets and one with the uncued.
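This step was performed in MATLAB; purely as an illustration of the core of steps 5-7 (matching saccades to targets, classifying them and computing reaction times), the logic can be sketched as follows, with assumed variable names.

using System.Collections.Generic;

// Sketch only: the variable names and method signature are assumptions, and the
// real analysis was done in MATLAB on the exported eye tracking data.
public static class ReactionTimeAnalysis
{
    public static (List<double> cued, List<double> uncued) Compute(
        double[] saccadeTs,       // eye tracker timestamps of the saccades towards targets
        long[] targetHololensTs,  // Hololens timestamp saved just before each target appears
        int[] cuedInfo,           // 0 = uncued, 1 = cued, per trial
        long offset)              // offset from the synchronization (ets + rtt/2 - ts)
    {
        var cued = new List<double>();
        var uncued = new List<double>();

        int n = System.Math.Min(saccadeTs.Length, targetHololensTs.Length);
        for (int i = 0; i < n; i++)
        {
            double targetEyeTrackerTime = targetHololensTs[i] - offset; // move to the eye tracker domain
            double reactionTime = saccadeTs[i] - targetEyeTrackerTime;  // target onset to saccade landing
            if (cuedInfo[i] == 1) cued.Add(reactionTime);
            else uncued.Add(reactionTime);
        }
        return (cued, uncued);
    }
}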

6.7 Pilot experiment

As mentioned in section 4.4, a pilot test of the experiment was conducted to test if the experiment flow and software perform as expected. We conducted the pilot experiment ourselves, and both runs passed. All necessary parameters were stored, and the results could be analyzed using the algorithm presented above in section 6.6. The results are presented below in figures 28 and 29, respectively.


Figure 29: Result of the pilot test conducted by Viktor

6.8 Final experiment result

In the final experiment, data from 8 out of the 14 participants could be analyzed. The data of the other 6 participants were not included due to the problems mentioned in 6.6. In figure 30, box plots of all uncued values from the valid participants are presented, and subsequently, in figure 31, box plots of all cued values are presented. In figure 32, the summarized result of all participants is visualized as box plots. On each box, the red line in the center indicates the median, and the bottom and top edges of the blue box indicate the 25th and 75th percentiles, respectively. The whiskers extend to the most extreme data points (9-91%) not considered outliers, and the outliers are plotted individually using the '+' symbol. Outliers refer to extreme values.


Figure 30: Box plot of the uncued results from 8 participants.


Figure 32: Box plot of all the results summarized.

The results from the plots are summarized, and average reaction times are calculated and shown in table 4 below.

Participants | Cued average (ms) | Uncued average (ms) | Difference (ms)
1            | 159               | 291                 | 132
2            | 165               | 306                 | 141
3            | 104               | 239                 | 135
4            | 145               | 284                 | 139
5            | 160               | 279                 | 119
6            | 133               | 246                 | 113
7            | 129               | 273                 | 144
8            | 144               | 280                 | 136
Summary      | 145               | 267                 | 122

Table 4: Average summary of all the participants where the results were valid.

To better understand the variation of the individual target values, a histogram which shows the spread of the reaction times was created, shown in figure 33. Furthermore, a bar plot which shows the overall mean reaction time with the corresponding standard deviation was created, as seen in figure 34.


Figure 33: The histogram shows a clear shift in reaction time between cued and uncued targets.

Figure 34: The bar plot shows the average reaction time for all participants, and the black line shows the standard deviation.

To test the statistical significance, a t-test was conducted with the two data sets, uncued and cued values for all participants. The following line was used in MATLAB:

[h,p] = ttest2(CUEDVALUESFINAL,UNCUEDVALUESFINAL)

and the results are as follows:


• h = 1 - there is a statistical significance

• p = 9.6e-69 - p<0.05 indicates a statistical significance.

To summarize the result, here are the major findings:

• Uncued targets show significantly higher reaction times.

• Cued targets have a higher variation and spread in reaction times.
• All participants had values in similar ranges.

• All participants reported that they noticed something that could be interpreted as the subtle cue.

• There is a statistically significant difference between uncued and cued values.

Further analysis of the result is presented in the next chapter, section 7.1.

