Mixed Reality for Gripen Flight Simulators

N/A
N/A
Protected

Academic year: 2021

Share "Mixed Reality for Gripen Flight Simulators"

Copied!
72
0
0

Loading.... (view fulltext now)

Full text

LiU-ITN-TEK-A--21/002--SE

Mixed Reality for Gripen Flight Simulators

The thesis work carried out in Medieteknik at Tekniska högskolan at Linköpings universitet

Tobias Olsson
Oscar Ullberg

Norrköping 2021-02-05

Department of Science and Technology, Linköping University, SE-601 74 Norrköping, Sweden
Institutionen för teknik och naturvetenskap, Linköpings universitet, 601 74 Norrköping

Abstract

This thesis aims to evaluate how different mixed reality solutions can be built and whether they could be used for flight simulators. A simulator prototype was implemented using Unreal Engine 4 with Varjo's Unreal Engine plugin, giving the foundation for the evaluations done through user studies. Three user studies were performed: to test subjective latency with the Varjo XR-1 in a mixed reality environment, to test hand-eye coordination with the Varjo XR-1 in a video see-through environment, and to test the sense of immersion between an IR depth sensor and a chroma key flight simulator prototype. The evaluation was seen from several perspectives: how a mixed reality solution would compare to an existing dome projector solution from a latency perspective, how well the masking could be done when using either chroma keying or IR depth sensors, and lastly, which of the two evaluated mixed reality techniques is preferred in terms of immersion and usability. The investigation conducted during the thesis showed that using a mixed reality environment had a minimal impact on system latency compared to using a monitor setup. However, the precision in hand-eye coordination while using VST mode was evaluated to give decreased interaction accuracy while conducting tasks. The comparison between the two mixed reality techniques showed in which areas the techniques excel and where they are lacking; a decision therefore needs to be made about what is more important for each individual use case when developing a mixed reality simulator.

Keywords — mixed reality, flight simulator, chroma key, depth sensors, immersion, latency, Varjo.

Acknowledgments

We would like to thank Saab for allowing us to conduct our thesis with them. At Saab, we would like to thank our supervisor Ted Johansson for guidance and encouragement throughout the work, and we would also like to express our gratitude to Stefan Furenbäck and Emma Hansson for giving us tips and tricks, and valuable feedback, when we have been working with XR and Unreal Engine. We would also like to direct a thank you to our supervisor at Linköping University, Mikael Pettersson, for all the support in this thesis, with all the valuable information and feedback regarding the implementation and user testing, and for reviewing our report continuously throughout the thesis. Finally, we would like to thank Ville Kivistö at Varjo for answering our questions regarding the Varjo XR-1 and their Unreal Engine plugin, and for giving advice on potential alternative solutions.

Abbreviations

2D      two-dimensional
3D      three-dimensional
AR      augmented reality
AV      augmented virtuality
CAVE    Cave Automatic Virtual Environment
FR      foveated rendering
FFR     fixed foveated rendering
FoV     field of view
FPS     frames per second
HMD     head-mounted display
HOTAS   hands-on throttle-and-stick
IR      infrared
MR      mixed reality
PPD     pixels per degree
PPI     pixels per inch
RR      real reality
SDK     software development kit
UE4     Unreal Engine 4
VC      virtuality continuum
VE      virtual environment
VST     video see-through
VR      virtual reality
XR      extended reality

Contents

Abstract
Acknowledgments
Abbreviations
Contents
List of Figures
List of Tables

1 Introduction
  1.1 Motivation
  1.2 Purpose
  1.3 Research Questions
  1.4 Delimitations

2 Background
  2.1 Related work
    2.1.1 Paper 1: Measuring Latency Through VST-HMD
    2.1.2 Paper 2: Hand-Eye Coordination Using a VST AR System
    2.1.3 Paper 3: Use of VR-HMD in Flight Training Simulators
    2.1.4 Paper 4: Evaluation of VR Cockpit in Flight Simulators
  2.2 Hardware
    2.2.1 Varjo XR-1
  2.3 Software
    2.3.1 Unreal Engine 4
    2.3.2 Varjo Unreal Engine Plugin
    2.3.3 Varjo Markers
  2.4 XR
    2.4.1 Augmented Reality
    2.4.2 Virtual Reality
    2.4.3 Mixed Reality
  2.5 Traveling Matte
    2.5.1 Back Drop Colours
    2.5.2 Setup for Traveling Matte
    2.5.3 Camera Sensors
  2.6 Latency
    2.6.1 Neuronal Latency
    2.6.2 The Effects of Latency
  2.7 IR depth sensor
  2.8 Simulator Display Setup
    2.8.1 Monitor
    2.8.2 Dome Projection
    2.8.3 Cave Automatic Virtual Environment
    2.8.4 VR-HMD Setup
  2.9 AB-testing

3 Method
  3.1 Pre-Study
  3.2 Implementation
    3.2.1 Mixed Reality
      3.2.1.1 Chroma Keying
      3.2.1.2 IR Depth sensors
    3.2.2 Varjo Markers
      3.2.2.1 Stencil Buffer
      3.2.2.2 Chroma Markers
  3.3 Physical setup
    3.3.1 Cockpit
    3.3.2 Chroma Wall
    3.3.3 Lighting
  3.4 User Studies
    3.4.1 Subjective Latency
    3.4.2 Hand-Eye Coordination
      3.4.2.1 Circles on a tablet touch screen
      3.4.2.2 Trace the line shape
    3.4.3 IR- Vs. Chroma Key Mixed Reality Simulator
      3.4.3.1 Fill out a questionnaire
      3.4.3.2 Familiarise with the simulator
      3.4.3.3 Follow the course
      3.4.3.4 Comparison

4 Results
  4.1 Pre-Study
  4.2 Mixed Reality Flight Simulator
  4.3 Physical setup
  4.4 User Studies - Subjective Latency
  4.5 User Studies - Hand-Eye Coordination
  4.6 User Studies - IR- Vs. Chroma Key Mixed Reality Simulator
    4.6.1 Test Group A: IR
    4.6.2 Test Group B: Chroma Key
    4.6.3 Comparison Questions and Improvements

5 Discussion
  5.1 Method
    5.1.1 Mixed Reality Flight Simulator
      5.1.1.1 Simulator Display Setup Comparison
    5.1.2 Traveling Matte
  5.2 Results
    5.2.1 Subjective Latency
    5.2.2 Hand-Eye Coordination
    5.2.3 IR- Vs. Chroma Key Mixed Reality Simulator

6 Conclusion and Future Work
  6.1 Future Work
    6.1.1 Varjo XR-3

Bibliography

Appendices
A Questionnaire
B Hand-eye Coordination per user results

List of Figures

2.1  Average test result from Article 3, where the users had to score the immersion of a VR setup in the context of a screen and a CAVE setup.
2.2  The Varjo XR-1 HMD.
2.3  Bionic display technology.
2.4  Varjo Markers, Varjo's variant of tracking markers. The markers are of the 25 mm size, with IDs ranging from 100 to 104.
2.5  Simplified spectrum of mixed reality (MR), where real reality (RR) and virtual reality (VR) are at each end of the virtuality continuum (VC), and augmented reality (AR) and augmented virtuality (AV) lie somewhere in between.
2.6  A five-light setup with two screen lights to light up the backdrop, a key light to light the foreground, a fill light to get ambient light, and a back light to emphasise details of the foreground.
2.7  Bayer pattern camera sensor.
2.8  Representation of IR dot projection (Figure 2.8a) and depth map visualisation (Figure 2.8b).
2.9  A dome projection display system setup for a flight simulator, with five projectors and a half dome with 220° dome projection.
2.10 The CAVE system setup, with the projectors off (Figure 2.10a) and on (Figure 2.10b).
3.1  The first iteration of the playable environment, where the boxes represent skyscrapers.
3.2  The first iteration of the controllable object, a simple representation of an aeroplane.
3.3  Problem with markers when chroma key is enabled.
3.4  A stencil buffer, where anything with the value 1 is rendered from the colour buffer to the final render after the stencil test, and everything with the value 0 is not rendered.
3.5  Chroma marker, a marker from Figure 2.4 with a green tint to make it blend with the green screen.
3.6  Input controls in the mockpit: the joystick (Figure 3.6a), the throttle (Figure 3.6b), and the rudder pedals (Figure 3.6c).
3.7  The different rotations: pitch (lateral axis), roll (longitudinal axis), and yaw (perpendicular axis).
3.8  The initial mockpit setup with a chair, the controllers from Figure 3.6 and three blue screens.
3.9  The three shapes used in the Trace the line user study: the triangle (Figure 3.9a), the square (Figure 3.9b) and the star (Figure 3.9c). The triangle and square were used to warm up and the star was used to measure hand-eye coordination and precision.
3.10 The flight path, where each letter is a checkpoint and the instructions for each checkpoint are described in Table 3.2.
4.1  The MR-simulator as seen by the user through the HMD: the chroma key version (Figure 4.1a), the IR depth sensor version (Figure 4.1b), the masking around the hand in the chroma key simulator (Figure 4.1c) and in the IR depth sensor simulator (Figure 4.1d).
4.2  Final iteration of the playable environment.
4.3  Final iteration of the controllable object.
4.4  Chroma marker used in the MR-simulator, seen through the Varjo XR-1 (Figure 4.4a) and in the real world (Figure 4.4b).
4.5  The final physical setup for the MR-simulator, with a mockpit, a green backdrop and two screen lights.
5.1  Overshoot, in purple, misses the corner when tracing the shape, in black, downwards.
5.2  The traced figure, in purple, is very similar in shape to the original shape, in black. The area is therefore very similar, but the precision is not, due to the change of position in space.
5.3  The traced shape, in purple, without VST-HMD (Figure 5.3a) and with VST-HMD (Figure 5.3b). The traced shape is sometimes on opposite sides of the original shape when traced without VST-HMD compared to with VST-HMD, which causes the accuracy calculations in Equation 3.1 to result in inaccurate conclusions.

List of Tables

3.1  Technical specifications of the computers used for the conducted user studies.
3.2  Instructions given during the test.
4.1  Test results with mean, max and min values.
4.2  Subjective latency: latency in VST and in mixed reality.
4.3  Test results from circles on a tablet touch screen.
4.4  Given stats about the star shape.
4.5  Test results from tracing a shape.
4.6  Results from the questionnaire on experience with projector dome simulators, VR, MR, and maneuvering an aircraft in a flight simulator or video game.
6.1  Technical differences between Varjo XR-1 and Varjo XR-3.

1 Introduction

In the following chapter, the reasoning behind why this thesis is conducted is presented, together with the expected learnings from this study of different mixed reality (MR) techniques.

1.1 Motivation

The thesis has been conducted at Saab Aeronautics, whose focus is to develop and produce different military-grade airborne products. One of these products is the different versions of JAS 39 Gripen and the various tools surrounding it. One of the tools they create is the flight simulators used for pilot training.

For flight simulators, it is critical to offer an immersive environment for pilots to conduct training in. The experience in flight simulators should, to the greatest extent, mimic the situation the pilot experiences during a mission while flying. Traditionally, a dome solution is used to project an outside world around a simulator cockpit. These solutions quickly become enormous and complex. It is therefore of interest to see what methods are available to shrink such a facility in both physical size and complexity.

Studies and projects have been carried out to investigate how a head-mounted display (HMD) can be used to offer a solution that puts a user in an immersive environment without a large display system. These have shown great potential for observing an environment but create new problems to solve regarding how a user should interact with the simulator cockpit. To solve this, attempts have been made to replicate users' hands in a virtual simulator cockpit that can be interacted with. [1]

New technologies show the potential to address the problem in a new way with MR. Instead of building a virtual simulator cockpit and recreating a user's hands virtually, a physical simulator cockpit can still be used, and the user can see their real hands. But instead of a dome solution, the following two MR techniques can be used, and these are set to be evaluated in this thesis. The first technique, chroma key, can be used outside of the simulator cockpit, giving the ability to display both the virtual and physical worlds simultaneously. The second technique, infrared (IR) depth sensing, could be used without any surrounding setup around the simulator cockpit, potentially being a solution that requires even less physical space and still achieves an equivalent feeling of immersion for the user.

1.2 Purpose

The purpose of this thesis is to evaluate how the different MR solutions can be built and to create a prototype for an MR flight simulator with the help of the Varjo XR-1 [2] and Unreal Engine 4 (UE4) [3]. With the prototype, the goal is to determine if the solutions could be viable to use, as seen from several perspectives. Firstly, the perspective of latency and whether there is a perceived change in latency while using the Varjo XR-1. Secondly, the perspective of how well the masking with the different techniques can be done. Lastly, the perspective of which of the two techniques is preferred, through user studies, in the sense of immersion and usability in a flight simulator cockpit.

1.3 Research Questions

1. How high latency does the system have, from when a visual cue is displayed to the user until they react and press a button, and is that latency acceptable to users who frequently spend time in a flight simulator?

2. While using the mixed reality cameras of the Varjo XR-1 HMD, will the latency between the cameras and the display have a significant impact on hand-eye coordination, compared to not wearing an HMD, when doing time-sensitive tasks while maintaining accurate actions?

3. Which of the mixed reality techniques, chroma key or depth sensing using IR, gives the best perceived immersion, and what are the advantages and disadvantages of the different mixed reality techniques?

1.4 Delimitations

As per the requirements from the client, we limit the development and research to the Varjo XR-1 HMD and UE4. The prototype does not need to be accurate in the sense of having correct physics or look realistic.

2 Background

In this chapter, the scientific research that has given the foundation for this thesis is presented. The key components that have been part of the requirements for this thesis are also presented, from the perspective of both hardware and software.

2.1 Related work

This section summarises other scientific papers that are relevant to this thesis. They cover latency tests, hand-eye coordination tests and how to create a flight simulator. Whether or not the related work is something that should be replicated is further described in Chapter 3 and discussed in Chapter 5.

2.1.1 Paper 1: Measuring Latency Through VST-HMD

As for the system latency, Greun et al. propose two different ways of measuring latency for a video see-through (VST) HMD [4]. The first method is to do a cognitive latency test, which tests the reaction time of the user with and without the VST-HMD. The task is a rapid response test, similar to the Eriksen flanker task [5], used to measure the perceived latency. In this task, the user pressed a button which allowed the user to see a rendered white circle. After a random number of seconds in the interval [0.5, 3.0] after the button was pressed, the circle changed colour to black, and the user was supposed to release the button. The time between the circle changing colour and the button being released was the measured time. As there is some latency in the VST-HMD to render each frame, the measured time with and without the VST-HMD should differ by the latency of the system. Greun et al. could therefore analyse whether any reaction time is added to the reaction test by the delay, as suggested by Fitts' law [6]. The law states that the time to complete a task in a system is a function of the distance to, and the size of, the target. This means that the longer the distance to the target, and the smaller the target, the longer time it will take to accomplish the task. This is directly affected by any latency in the system. This article is the foundation for the subjective latency test described in Section 3.4.1, where a rapid response test will be performed.
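Read as a formula (our summary of the method above, not notation taken from the paper), the system latency is estimated as the difference between the mean measured reaction times with and without the headset:

    \hat{L}_{\text{system}} \approx \overline{RT}_{\text{with VST-HMD}} - \overline{RT}_{\text{without HMD}}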

2.1.2 Paper 2: Hand-Eye Coordination Using a VST AR System

Hand-eye coordination is an important feature to keep in mind when developing in MR. Park et al. [7] suggest five different methods to assess hand-eye coordination in a VST-HMD for augmented reality (AR) applications. The five tasks that they test are:

1. Tracing Lines on a Touch Screen
2. Placing the Stylus Over a Dot on the Touch Screen
3. Tracing the Edge of a Metal Sheet
4. Screwing Wingnuts on an Assembly Board
5. Tracing a Pre-Defined Path on a Skull-Model

The first task aims to replicate hand activity tasks where the user has to repeat tasks with straight and angular lines with an instrument. This is tested by having six different shapes of straight and angular lines appear at random and be traced by the user on a touch screen. The accuracy can be evaluated by measuring the deviation of the area between the original line and the traced line. The performance time was also measured during the task. The second task aims to find the manual target accuracy, where the user has a stylus and is supposed to press the dots appearing on the screen. In the accuracy calculations, the two main contributors are the distance to the next dot, and the distance between the dot and where the stylus touches the screen. [7]

The third task focuses on tracing a metal sheet with a tracked stylus. The task is similar to the first task, but here the deviation is measured in volume instead of area, as the tracing is done in three-dimensional (3D) space. The fourth task refers to manual assembly by screwing five wingnuts located within an area of 18x18 cm with the dominant hand. Only time is recorded during this task. The fifth and final task simulated milling during maxillofacial surgery, focusing on tracing smooth topography on a skull.

The first and second tasks from this article are related to the hand-eye coordination tests described in Section 3.4.2, where the first task is described in Section 3.4.2.1 and the second task is described in Section 3.4.2.2.

2.1.3 Paper 3: Use of VR-HMD in Flight Training Simulators

Gustafsson wrote a master thesis at Saab in 2018 evaluating the capabilities of the commercial Oculus DK2 virtual reality (VR) HMD as a substitute for a Cave Automatic Virtual Environment (CAVE) flight simulator setup, which is described in Section 2.8.3 [8]. To be able to accept the VR-HMD as a potentially usable setup for a pilot training simulator, a couple of base requirements had to be met or exceeded. The requirements were measured against those of a single monitor setup flight simulator, which had minimum requirements of 1080p resolution, 60 Hz refresh rate and a 61.9° field of view (FoV).

Gustafsson found that the one critical deficiency with a VR-HMD flight simulator was the lack of interaction in the simulator. The only interaction was with the joystick and throttle, which were not visible since the simulator only displays virtual objects with no VST. During the user tests, the usability was confirmed to be high for a potential VR implementation of a flight simulator.

The usability was measured by six factors: Usefulness, Efficiency, Effectiveness, Learnability, Satisfaction, and Accessibility. The user test consisted of six parts:

• Task 0: Pre-test health control
• Task 1: Reading instruments
• Task 2: Gentle flying
• Task 3: Maneuvering
• Task 4: Gentle flying
• Task 5: Simulator Sickness Questionnaire

The pre-test assured that the participant was in good health, without any sickness that could affect the results in task 5. In the first task, the test person had to read all the instruments in the cockpit, to verify that the resolution of the VR-HMD was high enough to read some of the smaller texts. The second task's purpose was to make the participant look around the environment outside of the cockpit, which was done by finding the six runway numbers of the Václav Havel Airport, Czech Republic. This made the participant locate the airport, navigate to it and find the six numbers. The third task had the test person navigate a course around the airfield, including hard turns, rolls and loops, which would give the feeling of actually flying an aircraft. This task's purpose was to inflict possible symptoms of simulator sickness. The fourth task was to either try to land the plane or just fly around gently for a maximum of five minutes. The purpose of this was to subside any temporary simulator sickness symptoms. The user finally had to rate their health in a Simulator Sickness Questionnaire, which complemented the evaluation of the VR-HMD setup for a flight simulator.

However, Gustafsson found that the low resolution of the Oculus DK2 VR-HMD was a potential problem. The users also evaluated how immersive the experience was and had to mark an X on a scale, with a screen setup at one end and a CAVE setup at the other. The average score can be seen in Figure 2.1. The key positive aspects of the prototype were the high immersion and spatial ability. However, there were a couple of negative and critical aspects, such as lack of interaction, low resolution, unsatisfactory tracking, and a fixed focal area.

Figure 2.1: Average test result from Article 3 [8], where the users had to score the immersion of a VR setup in the context of a screen and a CAVE setup.

This article is related to the final user study described in Section 3.4.3. The overall structure and flight path inspired the user study. This article evaluates a VR-HMD, while Section 3.4.3 evaluates an MR-HMD for the same purpose.

2.1.4 Paper 4: Evaluation of VR Cockpit in Flight Simulators

Martinsson wrote a master thesis at Saab in 2019 to create an interactive virtual cockpit with tracked hand movement [1]. Unity was used to create the virtual cockpit, and both the HTC VIVE and the Varjo VR-1 were used as VR-HMDs. Due to the mentioned VR-HMDs being unable to track hands, a Leap Motion controller was used in conjunction with the VR-HMDs, which can track hand movements without tracking gloves.

Without any hard conclusions, Martinsson found that hand tracking was satisfactory in the virtual environment. However, a prototype with more feedback and functionality for the free-hand interactions would give a better evaluation. This article relates to the simulator setups described in Section 2.8 and the hand tracking discussed in Section 5.1.1.1.

2.2 Hardware

In this section, an explanation is given of the hardware components that formed the basis for the development and research in this thesis.

2.2.1 Varjo XR-1

The Varjo XR-1 Developer Edition [2], shown in Figure 2.2, is the fourth HMD that Varjo has created, but it is the first one that can connect the virtual world with the real world. This is done with the help of a camera module integrated into the HMD, consisting of two cameras with a 90 Hz refresh rate. The image sensor in the cameras has a Bayer-style colour filtering array. The importance of having this kind of sensor is further explained in Section 2.5. An HMD with these properties is often called a VST-HMD, meaning that the video signal from the camera module can be passed through to the displays in the HMD, giving the user the ability to see and interact with the physical world while still wearing the HMD. The cameras have an FoV of 94°. The VST functionality also brings downsides, as it impacts the performance of the system negatively due to requiring more resources to compute a single frame [9]. To mitigate the delay, Varjo has worked on minimising the latency between the cameras and the display, where they claim that the latency is below 20 ms. [10]

For the display system, the Varjo XR-1 uses a technique similar to foveated rendering (FR). FR is a technique used to determine which region of an image to render in a higher resolution using eye-tracking. The premise is that the region where the user is focusing its gaze should be rendered in higher quality, while everything else can be rendered in a lower resolution, gaining better performance without perceived loss of visual quality. [11] There is a simplified form of FR called fixed foveated rendering (FFR). Instead of using an eye-tracker to determine where to render in higher or lower quality, the image is sub-divided into predefined regions that determine the quality to render in each region.

The Varjo XR-1 uses a technique called Bionic Display. [12] The premise of the Bionic Display is that for each eye there are two displays: one display to give a peripheral view with an FoV of 87°, and one smaller display for the area in the middle of the larger peripheral display. The main difference between the Bionic Display and FR/FFR is that FR and FFR are done through software, while the Bionic Display optically combines the display outputs using a semi-transparent mirror. The technique is presented in Figure 2.3. The smaller display has a greater pixel density, giving greater visual quality. The peripheral display is an OLED with a resolution of 1920x1080, and the smaller display is an AMOLED with a resolution of 1440x1600, with a pixel density of 60 pixels per degree (PPD) or 3000 pixels per inch (PPI). The peripheral displays have a refresh rate of 90 Hz, while the focus display has a refresh rate of 60 Hz. Varjo's software syncs the higher refresh rate display down to 60 Hz to match the output refresh rate between both displays. A refresh rate of 60 Hz translates to a new frame being displayed approximately every 16.7 ms.
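As a quick check of that last figure, the frame time follows directly from the refresh rate:

    \frac{1}{60\ \mathrm{Hz}} \approx 0.0167\ \mathrm{s} \approx 16.7\ \mathrm{ms\ per\ frame}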

Figure 2.2: The Varjo XR-1 HMD.

Figure 2.3: Bionic display technology. [12]

The XR-1 also has the ability to track visual objects due to being a VST-HMD. These visual objects are typically in the form of black and white markers, as they can easily be read by cameras [13]. Varjo has created their own version of tracking markers called Varjo Markers, as further explained in Section 2.3.3, where the XR-1 supports tracking of 1000 individual markers simultaneously without active controllers or increased latency. [14]

Chroma key composition is another feature that is available due to the VST functionality. Chroma key is usually applied in post-production in filmmaking, but with the XR-1 the chroma keying is applied in real-time during the post-processing of each frame. The masking can be done with pixel-perfect precision, so that the virtual world is shown wherever there is a green screen. In Section 2.5, there is a further description of what chroma key is and how it works. [14]

Finally, the HMD is also equipped with a depth system including two wide-angle cameras with active IR sensors. The depth system allows for robust depth mapping when using the video see-through mode, to get an approximation of where physical objects are placed in an area around the user in the real world. By combining the different techniques available for the Varjo XR-1, one could create an immersive user experience in a mixed reality environment.

2.3 Software

In the following section, the different key software components are presented. Varjo's tracking system is also introduced in the form of tracking markers.

2.3.1 Unreal Engine 4

Unreal Engine 4 (UE4) has been the foundation for the development of the prototype. To develop in UE4, there are two main approaches: code using C++, or use UE4's built-in visual scripting tool, Blueprints [15]. A large component of the UE4 environment is the different plugins that either companies or individuals from the community create and share for others to use, either by purchasing them or offering them for free.

2.3.2 Varjo Unreal Engine Plugin

Varjo has created a UE4 plugin [16] that users can download for free, which can be used to develop applications for all of their HMDs. The plugin gives the user access, through blueprint functions, to all components needed to use all currently supported features of the different HMDs.

2.3.3 Varjo Markers

Varjo Markers are Varjo's version of fiducial markers [17], where each of the markers has its unique identification. These markers are suitable for tracking virtual elements to a real-world location, as it becomes possible to compute the relative transformation for each marker and later map the transformation to virtual objects. Multiple markers can be used to track a single virtual object to get a higher precision when tracking it. A good example is to use four markers in the corners of a piece of paper to accurately track a painting regardless of the rotation of the physical markers. [18]

There are three pre-defined sizes of markers, all with their unique identifications, and printable from Varjo's website. The sizes are 25 mm, 50 mm, and 150 mm, suitable for tracking up to 0.5 meters, 1 meter and 3 meters respectively. Figure 2.4 shows five markers of the 25 mm size. [18]

Figure 2.4: Varjo Markers are Varjo's variant of tracking markers. The markers shown are of the 25 mm size, with IDs ranging from 100 to 104.

2.4 XR

Extended reality (XR) is often used as a collective name for AR, MR and VR [19]. In this section, those concepts are described. An XR object is defined, according to Peña-Ríos et al., as an object with a physical body that can be manipulated in the virtual world. Rules and behaviour are set for the object that declare how the object affects virtual objects and the virtual environment (VE). [20]

2.4.1 Augmented Reality

AR is defined as a 3D real environment displayed on a screen with 3D objects integrated in real-time. The virtual objects should also be interactive in real-time, which allows AR to enhance the real world. Therefore, AR is neither film nor two-dimensional (2D) overlays, since films are not interactive and 2D overlays are not integrated in 3D. If a virtual environment (completely synthetic) is at one end of the spectrum and telepresence (completely real) is at the other end, then AR is the "middle ground" with some of each. AR is commonly found in phones, where the user can use the phone camera to capture the real world and integrate virtual objects to see the objects' size relative to the real world. It can also be used to get a 3D overlay on the camera feed that can help in professions such as medicine, path planning, visualisation, manufacturing or military applications, to name a few. [21] [22]

2.4.2 Virtual Reality

VR is a simulated 3D world, synthetically generated by a computer, that the user can immerse themselves in. The VE created gives the user a feeling of being present in the rendered place, whether it is a replica of a real place or a fictional environment. This is commonly known as "sense of presence". VE is not limited to VR, as it tends to be used in pre-rendered films, where the user is a passive observer. However, in VR the user is an active observer, as the VE is rendered in real-time, allowing the user to interact with objects in the VE. VR allows for immersive real-time user experiences. [23] [24]

2.4.3 Mixed Reality

MR refers to the technology of merging real reality (RR) and VR together. MR is on the spectrum of the virtuality continuum (VC), which goes from RR on one side all the way to VR on the other side, as seen in Figure 2.5. Mixed reality is anything that lies between RR and VR on the spectrum.

AR lies closest to RR on the spectrum, where virtual assets are inserted into the RR. When the conditions are changed to a VR with real assets, augmented virtuality (AV) is created, which lies close to VR on the spectrum. RR can, for example, be observed through a VST-HMD. [25] [26]

Figure 2.5: Simplified spectrum of mixed reality (MR), where real reality (RR) and virtual reality (VR) are at each end of the virtuality continuum (VC), and augmented reality (AR) and augmented virtuality (AV) lie somewhere in between. [25]

2.5 Traveling Matte

Combining two images has been a challenge in the film industry since the beginning of the last century. A mask has to be applied to a frame to be able to extract its foreground and add another background to it. This technique is often referred to as a traveling matte in the film industry. A matte is a flat single colour without any glossiness, and the term traveling matte means a matte that travels from frame to frame in motion picture images. [27]

The first use of a traveling matte was back in 1918 and was developed by Frank Williams. The technique required a matte black background and an evenly lit actor in the foreground. Williams copied the frames to high-contrast films several times back and forth to get a clear black silhouette which acted as the foreground mask. The foreground could then be composited together with the intended background. This technique allowed films to have a background of remote or even impossible places, reducing both travel and material cost. [27]

Today most traveling mattes are done with either a blue or a green background, often referred to as a blue or green screen, and the composition is done in computers. Chroma keying is often wrongly used to describe the whole traveling matte process in filmmaking. The term actually refers only to the technique of removing a solid colour and exchanging it with a transparent background. This allows the image behind to be visible, creating the image composition. The final result is thus heavily dependent on how the traveling matte is done, what equipment is used and under what circumstances. [27]

2.5.1 Back Drop Colours

The first thing to figure out is what screen to use. Blue and green are the most commonly used colours for the screen, but there have been times in history when red and magenta have been used. When figuring out what screen to use, it is important to know what is in the foreground, as when the background colour is removed, the foreground should stay the same and not be removed. Therefore, the screen should not be a colour that is represented in the foreground. For humans, the colours on the opposite side of the colour spectrum from any skin colour are blue and green, which is why those are used frequently in filmmaking. On rare occasions when both blue and green are present in the foreground, a red or magenta screen can be used. [27]

A benefit of a green screen over a blue screen is that it requires less light. However, this also means that the green screen spills more green onto the foreground than blue does. This spill slightly illuminates the foreground, which in turn increases the difficulty of getting a clean mask around the foreground. Solutions to this are explained further down in this section. Blue, however, can take advantage of night or winter shoots, as there is already a great amount of blue light outside. If filming outside and using a blue screen, the spill would be acceptable as it looks natural in that environment. [27]

2.5.2 Setup for Traveling Matte

There are several components to consider when setting up a shoot to decrease the amount of post-production needed when using the traveling matte technique. Firstly, the screen should be flat, as any change in gradient could cause visual artefacts in the form of pixels staying green after the applied chroma key. This is due to the chroma keying choosing one colour, with a small range around the chosen colour, to key out. If the range is too large, the precision of the mask will decrease. This means that using several individual screens is unacceptable, since there will be a large gradient change between the screens. Instead, either painting the walls with green screen colour or using a cyclorama is preferred. A cyclorama is a seamless curved screen that connects the floor and the wall in a smooth transition, as seen in Figure 2.6. [27]

Another thing to consider is the lighting in the scene, which can greatly increase the precision of the traveling matte, as seen in Figure 2.6. The screen should be evenly lit with screen lights to eliminate any shadows, which otherwise will create visual artefacts as previously mentioned. This, however, will create some spill onto the foreground. The key light is the main light, which is focused on the foreground. This light source should represent the same colour and direction as the strongest light source in the new environment that will be composited in post-production. Then there is the fill light, which creates an illusion of ambient light in the scene. Less fill light is required for a dark night under a streetlamp than for a bright day. Finally, there is the back light, which is sometimes used to increase the definition of fine details such as hair. It is applied from the backside of the foreground, but it does, however, decrease the ability to match the colour of the foreground to the background. With all the light sources, it is crucial to match the temperature of the light with the background. A warm red light in the studio will not mix well with the cold blue light of a background with the Swiss Alps during a snowstorm.

Figure 2.6: A five-light setup with two screen lights to light up the backdrop, a key light to light the foreground, a fill light to get ambient light, and a back light to emphasise details of the foreground. [27]

To solve the spill problem of any colour, there are two methods that can be applied. The first method is to have the foreground at a distance of at least two meters from the screen. The closer to the screen the foreground is, the more spill it gets. The second solution is to have the screen light match the colour of the screen. If the screen is green, then the screen light should also be green.

2.5.3 Camera Sensors

Another factor in choosing the colour of the backdrop is the camera. Most camcorders today use a colour filtering array attached to the image sensor, where one of the most common is the Bayer filter [28]. It is arranged in a grid where the rows alternate either between blue and green, or between green and red. A small segment of the Bayer filter is presented in Figure 2.7, where a pixel on the image sensor is placed directly under each square of the filter. As can be seen in the figure, there are more green squares than blue or red. This gives the image sensor more information about the green light, which can be used when processing the information at a later stage. Having more information about the green light from the scene makes a green screen a good candidate for a chroma key backdrop, since there will be more information to process, which potentially leads to better masking.

Figure 2.7: Bayer pattern camera sensor.

2.6 Latency

Latency is commonly used to measure the time between cause and effect in a system and is often measured in milliseconds in digital applications. In this application, latency is defined as the time taken from giving an input to a controller to when the user can visually see the given action rendered inside the HMD [29]. Some factors that have the potential to increase the latency in an HMD are tracking devices [29] and the amount of post-processing [30] applied to the video feed before displaying the information. An important aspect when measuring latency is being precise in the method. [29] [31]

2.6.1 Neuronal Latency

The human eye can react at a neuronal level approximately 200 ms after an event [32]. For the mind to send signals to the rest of the body to react to the event takes another 100 ms, resulting in a response time close to 300 ms during a visual Eriksen flanker rapid response task [5] [33]. For the full reafferent loop to motor actions to be integrated takes a total of about 400 ms [33]. Therefore, depending on the task and situation for the reaction test, a response time between 300 and 400 ms would normally be expected from the user [4]. Latency is perceived differently by different people. Some may not notice a latency of 100 ms, while some may notice it at a latency as low as 3-4 ms [34]. A guideline often used when developing a VR application is that the latency should be less than 20 ms [35]. [34]

2.6.2 The Effects of Latency

The user experience in VR and AR systems can be drastically affected by latency. If the latency is high enough, the rendered image displayed in the HMD will lag behind the movement of the user. This could cause an effect in VR systems called swimming [36], which means that the environment is slightly moving while the user is static. This drift can make the human mind disoriented, and a significant delay can lead to cybersickness. [37]

The latency of the system also affects human interaction. As the user performs tasks including pointing and reaching, the latency of the system delays the video feed to the display, which results in the user having more trouble completing the task. This corresponds to Fitts' law of human movement [6], which states that the time to complete a task is a function of the distance to the goal and the size of the goal.

These tasks are mostly in the form of reaching for, or pointing at, something. The longer the distance to the goal and the smaller the goal, the longer the time needed to complete the task, as the user needs more precision to reach the goal. With added latency, the time to reach the goal could be significantly increased. [6] However, Friston et al. found that at low latencies, reaction time was not better, or perhaps even worse, than without latency [38]. This could be explained by humans having an inherent latency in the human motor system [38]. The subjective experience is also affected by delay in VR environments, where task performance is decreased during physical interactions [4]. Jitter in latency is uncommon, yet probable, and can impact the outcome of any given task for the user. [39]

Depending on the frames per second (FPS) of the system, the latency of one frame can be computed with the simple equation 1/refresh rate, where the quotient should be a maximum of 20 ms, as described earlier in Section 2.6. With a quotient of a maximum of 20 ms, the minimum refresh rate should therefore be 50 Hz. When using a VST-HMD, the video feed might render at a different speed than the virtual graphics. Depending on the relative speeds of image capture and image rendering, the images may be rendered out of sync. A delay can be used to synchronise the two images before sending the image to the display system. This will, however, give the system some additional necessary latency. Depending on which of the real and virtual images renders the slowest, the other will be delayed. [40]

2.7 IR depth sensor

By using a combination of IR light emitters and cameras that can capture IR light, it becomes possible to determine the depth captured by the cameras, creating a depth map. The IR emitters produce a matrix of IR dots that are projected onto objects and surfaces. The cameras are then able to identify these dots and estimate the distance to them. These estimations are the foundation of the constructed depth map. Figure 2.8 presents an example of how this could look.

Figure 2.8: Representation of IR dot projection, Figure 2.8a (IR emitters projecting a dot matrix [41]), and depth map visualisation, Figure 2.8b [42].

With the depth map, it is possible, in combination with the depth buffer of a 3D program, for virtual objects to occlude physical objects and vice versa. This can be done since the depth maps represent relative distances from the HMD, where it becomes possible to compare the distances from the two depth maps and then determine which objects are closer to the user.
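The comparison of the two depth maps can be sketched as a per-pixel test. The following C++ fragment is only a conceptual illustration, under the assumption that both depth maps share resolution and a metric scale; it is not the compositing code actually used by Varjo or UE4.

    // Conceptual per-pixel occlusion test between the depth map from the IR
    // sensors (physical world) and the depth buffer of the virtual scene.
    // Both maps are assumed to give distances in metres from the HMD.
    #include <cstddef>
    #include <vector>

    std::vector<bool> videoVisibleMask(const std::vector<float>& physicalDepth,
                                       const std::vector<float>& virtualDepth)
    {
        std::vector<bool> showVideo(physicalDepth.size());
        for (std::size_t i = 0; i < physicalDepth.size(); ++i) {
            // The surface closest to the user wins: if the physical object is
            // nearer than the virtual one, the video see-through pixel is kept
            // and the virtual pixel is occluded, and vice versa.
            showVideo[i] = physicalDepth[i] < virtualDepth[i];
        }
        return showVideo;
    }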

Depth estimation using IR often suffers from reflection problems, which can occur in environments with transparent, shiny, or light-absorbing surfaces. [43] It is therefore essential to control the environment used while depth estimating with IR, as the result could otherwise become unsatisfactory. Some popular products that use this technology are the Kinect [44], developed by Microsoft, and the FaceID [45] technology used in some of the recent products developed by Apple.

2.8 Simulator Display Setup

There are different types of display systems for simulators, ranging from a single monitor setup to a dome with several projectors projecting the virtual environment on the walls. This section focuses on how the virtual environment is experienced and not on the physical setup, such as the joystick and cockpit.

2.8.1 Monitor

At one end of the spectrum, there is the single monitor setup. Depending on the size of the monitor and the distance to it, the FoV changes, which affects the immersion of the simulator. Due to the relatively low FoV, the immersion will be low, but with more monitors the immersion can be improved. Immersion can be further improved with smaller bezels, larger monitors, and curved displays that surround the user's field of view, all of which increase the cost. Some flight training simulators consist of three monitors where the two edge monitors are rotated towards the pilot. A single computer is connected to the monitors, and the response time is only as low as the display is rated for.

2.8.2 Dome Projection

At the other end of the spectrum, there is the projector setup in a dome, as seen in Figure 2.9. Two or more projectors simultaneously project parts of the full environment on the walls of the dome [46]. Each of the projectors is connected to its own computer to be able to render each section of the environment. The dome walls are 2 meters away from the pilot, which allows for an immersive experience where the user can look around in all directions. However, the projectors have some latency, affecting the latency of the whole system. The dome projection setup is also expensive due to the necessary projectors and computers, and the required physical space.

Figure 2.9: A dome projection display system setup for a flight simulator. This setup has five projectors and a half dome with 220° dome projection. [47]

2.8.3 Cave Automatic Virtual Environment

The CAVE system is similar to the dome projection setup, although there are some key differences. The similarities are that they both use projectors to project onto large empty screens and that the user is in the middle of the screens. One of the benefits of the CAVE system over the dome projection system is that the user can see virtual objects around them, granting the user a higher feeling of immersion than the dome projection setup. To accomplish this, the user is required to wear tracked stereoscopic eyewear. The user can also use equipment such as a wand or data gloves to be able to interact with the 3D objects. Figure 2.10 shows the CAVE system both when it is off and on. [48]

Figure 2.10: The CAVE system setup, where Figure 2.10a shows the setup when the projectors are off [49], and Figure 2.10b shows the setup when the projectors are on [50].

2.8.4 VR-HMD Setup

The VR setup requires two components: a VR-HMD and a compatible computer, which means that the VR setup barely takes up any space. The immersion of VR is high, as the user can look in all directions. A negative aspect of the VR setup is that the user is unable to see their hands in the virtual environment. There are some solutions and equipment which allow the user to see their hands as virtual hands, but due to the limits of current technology, the hand movement can be rather dull and difficult to control. [8] [1]

2.9 AB-testing

The purpose of AB-testing is to compare two different solutions of a component of a product, to see which of the solutions results in better performance and optimisation with respect to a given hypothesis. The hypothesis should be as concise and precise as possible, leading to a distinct winner between the two solutions. The compared versions of a component should also focus on solving just one task, yet differently. Two groups of people, group A and group B, blindly test one of the two versions of the product. The test can either be to accomplish a set number of tasks, or to simply let the test person use their version of the product for a set amount of time and document the behaviour of the user. The former should have simple tasks that are uncomplicated to accomplish, which will lead to an objective answer as to whether the task was completed or not. The latter tests how well the version is doing with regard to the hypothesis when the user is under no restrictions but time. [51] [52]

3 Method

In the following chapter, the preparation for the development of the MR-simulator is presented, followed by the implementation and physical setup of the simulator, concluding with the methods used for the user studies conducted throughout the project.

3.1 Pre-Study

Before creating an MR flight simulator, the functionality and the potential of the Varjo XR-1 had to be evaluated. The three main pre-study topics to research were:

1. What are the possibilities to create an MR environment using the Varjo XR-1?
2. What is required to be able to chroma key a backdrop, regarding masking without artefacts either on the backdrop or its surroundings?
3. Should the simulator be developed with the UE4 plugin created by Varjo, or with the Varjo software development kit (SDK)?

As described in Section 2.2.1, the Varjo XR-1 offers great possibilities to develop an MR-simulator. As part of the research for the implementation of the MR-simulator, a UE4 project created by Varjo, which uses the Varjo plugin for UE4, was examined. The project is meant for developers to get a feel and understanding of what is possible to implement using the plugin in UE4. The goal of examining the example project was divided into two parts:

1. Explore any potential problems with the plugin that would indicate that the plugin would not be the best choice for development.
2. Investigate which aspects are important to consider when chroma keying, in the sense of the colour of the backdrop and the lighting, as previously described in Section 2.5.

3.2 Implementation

The implementation began with creating a foundation for the simulator in the form of an environment and a controllable object. The first iteration of the environment, as seen in Figure 3.1, consisted of cubes and planes that define the playable area.

Figure 3.1: The first iteration of the playable environment, where the boxes represent skyscrapers.

The first controllable object, Figure 3.2, was used as an aircraft, with the camera object in the figure representing the spawn location of the player wearing the HMD.

Figure 3.2: The first iteration of the controllable object is a simple representation of an aeroplane.

The implementation of the MR aspects in the UE4 project is quite straightforward, since the Varjo plugin controls the core functionality of the Varjo XR-1 headset. The considerations that need to be taken into account are the choices made when placing virtual objects and setting up the virtual environment.

3.2.1. Mixed Reality

To be able to utilise the two different MR techniques used in this thesis, some preparation needs to be done to enable the common components that are used by both techniques. This is done largely using the Varjo UE4 plugin.

3.2.1.1. Chroma Keying

Chroma key is enabled using a single blueprint function from the Varjo plugin. The function has three input parameters: the target chroma key colour, the tolerance of the colour, and the tolerance falloff. The target chroma key colour is defined using the HSV (hue, saturation, value) colour space. The tolerance values define a range around the given masking colour within which the system will recognise a pixel as the masking colour. The falloff parameter is used to give a smoother transition at the edges between virtual and physical objects. It is possible to have several colour configurations active simultaneously, meaning that several colours can be used for the chroma key masking, giving greater control over where masking should be applied.

3.2.1.2. IR Depth sensors

For the IR version of the developed MR-simulator, two functions need to be called to enable depth sensing. To get a good representation of where objects are placed, both the distance from the virtual camera to the virtual objects and the physical distance from the camera on the HMD to physical objects need to be compared to determine what should be visible. The distance to the virtual objects is obtained by telling UE4 to send the depth buffer to Varjo's rendering stack. To determine the distance to physical objects, the IR/depth sensors placed on the HMD are enabled to generate a depth map of the physical space. With the information from the two depth maps, it is possible to compare them per pixel to determine which objects are occluded. By using this technique, it becomes possible to occlude real-world objects with virtual ones and vice versa, giving the user the sense of being both inside and outside the virtual cockpit.

3.2.2. Varjo Markers

Varjo markers, as first presented in Section 2.3.3, offer the ability to track virtual objects to physical markers. With these, users of the simulator could be given the ability to change the layout of the displays in the cockpit by physically moving the markers. The other use case that was considered was to track the virtual cockpit to a set of markers. Both of these use cases encountered the same problem: when chroma key was enabled, the physical marker remained visible and the virtual object that should be tracked by the marker was only partially visible, as can be seen in Figure 3.3. This was, as could be imagined, an unsatisfactory result. To counteract this problem, the idea of using a stencil buffer to manipulate what is rendered was explored.
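To make the two per-pixel decisions described above concrete, the following is a minimal, self-contained C++ sketch. It is not the Varjo plugin's actual API (whose function names are not reproduced here); it only illustrates an HSV chroma-key matte with tolerance and falloff, and the depth comparison between the virtual depth buffer and the HMD depth map.

```cpp
// Minimal, self-contained sketch (not the Varjo plugin API) of the two per-pixel
// decisions described above: an HSV chroma-key matte with tolerance and falloff,
// and a depth comparison between the virtual depth buffer and the HMD depth map.
#include <algorithm>
#include <cmath>

struct HSV { float h; float s; float v; };   // h in [0, 360), s and v in [0, 1]

// Returns 1.0 where the pixel fully matches the masking colour (show video pass-through),
// 0.0 where it does not (show virtual content), with a soft edge controlled by falloff.
float ChromaMatte(const HSV& pixel, const HSV& key, float tolerance, float falloff)
{
    float dh = std::fabs(pixel.h - key.h);
    dh = std::min(dh, 360.0f - dh) / 180.0f;        // wrap hue distance, normalise to [0, 1]
    const float ds = std::fabs(pixel.s - key.s);
    const float dv = std::fabs(pixel.v - key.v);
    const float distance = std::sqrt(dh * dh + ds * ds + dv * dv);

    if (distance <= tolerance)            return 1.0f;   // inside tolerance: masking colour
    if (distance >= tolerance + falloff)  return 0.0f;   // clearly not the masking colour
    return 1.0f - (distance - tolerance) / falloff;      // smooth transition at the edge
}

// Depth test: true if the physical object measured by the HMD depth sensors is closer
// than the virtual object at this pixel, i.e. the physical object should occlude it.
bool PhysicalOccludesVirtual(float physicalDepthMeters, float virtualDepthMeters)
{
    return physicalDepthMeters < virtualDepthMeters;
}
```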

Figure 3.3: Problem with markers when chroma key is enabled.

3.2.2.1. Stencil Buffer

A stencil buffer is a mask that allows only the pixels inside the mask to be rendered. The stencil buffer is often shared with the depth buffer, where the developer can define how large the stencil part of the buffer should be. Commonly, the stencil buffer is set to be 8 bits large, but it can be set to any number of bits allowed by the depth buffer. If the stencil buffer is 8 bits large, the mask can contain 256 different values for each pixel. As seen in Figure 3.4, wherever there is a 1 the image being masked is rendered, and wherever there is a 0 it is not. This method could potentially be used to render the scene in multiple passes, making it possible to occlude physical markers with virtual objects while having chroma key enabled.
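As a simple software illustration of the stencil test described above (assumed types, not the UE4 or graphics-API implementation), the following C++ snippet copies colour-buffer pixels to the final render only where the stencil mask is non-zero:

```cpp
// Minimal sketch of a stencil test, assuming a one-value-per-pixel stencil mask:
// a pixel from the colour buffer reaches the final render only where the stencil is 1.
#include <cstdint>
#include <vector>

struct Pixel { std::uint8_t r, g, b, a; };

// Copies colour-buffer pixels into the output only where the stencil mask is non-zero;
// all other output pixels are left untouched (e.g. keeping an earlier render pass).
void ApplyStencil(const std::vector<Pixel>& colourBuffer,
                  const std::vector<std::uint8_t>& stencilBuffer,
                  std::vector<Pixel>& output)
{
    for (std::size_t i = 0; i < colourBuffer.size(); ++i)
    {
        if (stencilBuffer[i] != 0)          // stencil test passes
        {
            output[i] = colourBuffer[i];    // pixel is written to the final render
        }
    }
}
```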

Figure 3.4: A stencil buffer is shown, where anything that has the value 1 is rendered from the colour buffer to the final render after the stencil test. Everything that has the value 0 in the stencil buffer is not rendered from the colour buffer.

3.2.2.2. Chroma Markers

Another potential solution that seemed promising was to use markers coloured similarly to the masking colour used for the chroma key, see Figure 3.5. A marker is recognised as a marker by the contrast between the lighter and darker areas in the marker. Therefore, by changing the colour of the markers to colours similar to the chroma wall, while still maintaining a contrast between the colours, it could be possible to mask the physical marker while still being able to see the virtual object tracked by the marker.
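The idea can be illustrated with a short, purely hypothetical sketch: the dark and light areas of a greyscale marker image are remapped to two shades of green that both fall within the chroma-key tolerance while keeping enough contrast for marker tracking. The specific green values are assumptions for illustration, not values used in the thesis.

```cpp
// Illustrative sketch of generating a "chroma marker": the black and white areas of a
// marker image are remapped to two shades of green that both fall within the chroma-key
// tolerance, while keeping enough contrast for marker tracking.
#include <cstdint>
#include <vector>

struct RGB { std::uint8_t r, g, b; };

std::vector<RGB> TintMarkerGreen(const std::vector<std::uint8_t>& greyscaleMarker)
{
    const RGB darkGreen  = { 0, 110, 0 };   // used where the marker is dark (assumed value)
    const RGB lightGreen = { 0, 200, 0 };   // used where the marker is light (assumed value)

    std::vector<RGB> tinted(greyscaleMarker.size());
    for (std::size_t i = 0; i < greyscaleMarker.size(); ++i)
    {
        tinted[i] = (greyscaleMarker[i] < 128) ? darkGreen : lightGreen;
    }
    return tinted;
}
```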

Figure 3.5: Chroma marker, which is a marker from 2.4 that has a green tint to make it blend with the green screen.

3.3. Physical setup

The following section presents the choices made for the physical simulator setup, where each of the key components is presented separately together with the problems that had to be solved. The physical setup consists of three main parts: the cockpit described in Section 3.3.1, the chroma wall described in Section 3.3.2, and the lighting described in Section 3.3.3.

3.3.1. Cockpit

For development and testing purposes, a simple representation of a cockpit was used, as it was not required to have an exact replica of a real cockpit; part of the aim of the thesis was to reduce the physical space needed for a simulator setup. To be able to test the concept of an MR-simulator, a commercially available simulator setup was used, consisting of a seat attached to a base frame. The frame has several mounting points, giving the versatility to change the position of the input controllers. For the user input, a hands-on throttle-and-stick (HOTAS) and rudder pedals are used.

A HOTAS is commonly used in an aircraft's cockpit, as it gives the pilot greater control of the aircraft without removing their hands from the controls. This is done by placing buttons on the joystick (Figure 3.6a) and the throttle (Figure 3.6b), making vital controls easily accessible to the pilot. Rudder pedals (Figure 3.6c) are also found as part of an aircraft's steering system. With the joystick, the pilot controls the aircraft's pitch and roll, the rudder pedals control the yaw, and the throttle controls the acceleration. The aircraft rotations are presented in Figure 3.7. Figure 3.8 shows the initial setup of the mock-up cockpit, which will be referred to as "mockpit" in this paper.

(a) HOTAS - Joystick.
(b) HOTAS - Throttle.
(c) Rudder pedals.

Figure 3.6: Input controls in the mockpit, where Figure 3.6a displays the joystick, Figure 3.6b displays the throttle, and Figure 3.6c displays the rudder pedals.
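A minimal sketch of this control mapping (plain C++, not tied to the UE4 input system; the rotation and acceleration rates are assumed values for illustration only) could look as follows:

```cpp
// Illustrative sketch of the control mapping described above: the joystick drives pitch
// and roll, the rudder pedals drive yaw, and the throttle drives acceleration.
struct HotasInput
{
    float joystickX = 0.0f;   // -1..1, roll
    float joystickY = 0.0f;   // -1..1, pitch
    float pedals    = 0.0f;   // -1..1, yaw
    float throttle  = 0.0f;   //  0..1, acceleration
};

struct AircraftState
{
    float pitchDeg = 0.0f;
    float rollDeg  = 0.0f;
    float yawDeg   = 0.0f;
    float speed    = 0.0f;    // metres per second
};

void UpdateAircraft(AircraftState& state, const HotasInput& input, float deltaSeconds)
{
    constexpr float rotationRateDegPerSec = 45.0f;   // assumed rotation rate
    constexpr float maxAcceleration       = 20.0f;   // assumed acceleration, m/s^2

    state.pitchDeg += input.joystickY * rotationRateDegPerSec * deltaSeconds;
    state.rollDeg  += input.joystickX * rotationRateDegPerSec * deltaSeconds;
    state.yawDeg   += input.pedals    * rotationRateDegPerSec * deltaSeconds;
    state.speed    += input.throttle  * maxAcceleration       * deltaSeconds;
}
```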

Figure 3.7: Displays the different rotations pitch (lateral axis), roll (longitudinal axis), and yaw (perpendicular axis). [53]

Figure 3.8: The initial mockpit setup with a chair, the controllers from Figure 3.6, and three blue screens.

3.3.2. Chroma Wall

The initial physical setup had no support for MR using chroma keying. There were no good materials or objects available that gave a good representation of how a potential setup could look. Testing of chroma key was done using different coloured boxes, several blue office dividers, a green blanket, ordinary office lights attached to the ceiling, and natural light that came in through windows. This made it hard to control the lighting within the room. While not an optimal testing environment, it revealed key features that had to be considered in future iterations and gave pointers towards what was needed for a satisfying setup.

3.3.3. Lighting

Two Viltrox VL-D640T [54] lamps were used in the physical setup. The lamps consist of 640 LEDs each and have a maximum brightness of 4800 lumen. The colour temperature range lies between 3300 and 5600 K. A diffuser was attached to the lamps to smoothen the light intensity. Both lamps were placed behind the mockpit, tilting slightly downwards from above. The right lamp was turned to the left, illuminating the left side of the backdrop, while the left lamp was turned to the right, so that each lamp illuminated its side of the backdrop from as great a distance as possible. This was done to create as smooth lighting as possible in the small room, as described in Section 2.5.2.

3.4. User Studies

The performed user studies are described in this section. Three user studies were performed: subjective latency, hand-eye coordination, and IR versus chroma key MR flight simulator. Subjective latency, described in Section 3.4.1, tests the change in human reaction time with and without added latency in the visual input. Hand-eye coordination, described in Section 3.4.2, tests how human muscle memory is affected by the distorted view when looking through a VST-HMD with cameras approximately 10 cm in front of the user's eyes. IR versus chroma key MR flight simulator, described in Section 3.4.3, performs an AB-test, described in Section 2.9, to determine which of the IR and chroma key MR flight simulators is preferred and why.

For the user studies, two different computers were used. The technical specifications of the two computers are presented in Table 3.1.

Table 3.1: Technical specifications of the computers used for the conducted user studies.

                      User Study 1 & 3           User Study 2
Operating system      Windows 10 Pro 64-bit      Windows 10 Pro 64-bit
CPU                   Intel Xeon W-2123          Intel i7-8550U
GPU                   Nvidia Quadro RTX 6000     Nvidia GeForce MX150
RAM                   32 GB @ 2666 MHz           16 GB @ 2133 MHz

3.4.1. Subjective Latency

All the subjective latency tests were performed in the Unreal Engine project, similar to how Gruen et al. did their evaluation [4] as described in Section 2.1. The project consists of an aeroplane which can fly and shoot, a simple environment, and the ability to spawn shootable objects in front of the user. The focus of the test was the subjective latency and its effect on the user. Therefore, the user performed the reaction test both with and without the Varjo XR-1, to measure whether the latency of the VST-HMD adds additional time to the user's reaction time.
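As an illustration only (not the thesis's actual UE4 implementation), the timing logic of a single reaction test, using the spawn delay and flawless threshold defined in the following paragraphs, could be sketched as:

```cpp
// Minimal sketch of the reaction-test timing logic: the target spawns after a random
// delay of 1,000-4,000 ms, the reaction time is measured until the trigger is pulled,
// and a test counts as flawless only if the reaction time is below 1,000 ms.
#include <chrono>
#include <random>
#include <thread>

using Clock = std::chrono::steady_clock;

struct ReactionResult { double reactionMs; bool flawless; };

// 'waitForTrigger' stands in for polling the joystick trigger; it is a hypothetical callback.
template <typename WaitForTrigger>
ReactionResult RunReactionTest(WaitForTrigger waitForTrigger)
{
    static std::mt19937 rng{ std::random_device{}() };
    std::uniform_int_distribution<int> spawnDelayMs(1000, 4000);

    std::this_thread::sleep_for(std::chrono::milliseconds(spawnDelayMs(rng)));

    const auto spawnTime = Clock::now();   // the cube becomes visible here
    waitForTrigger();                      // blocks until the test person pulls the trigger

    const double reactionMs =
        std::chrono::duration<double, std::milli>(Clock::now() - spawnTime).count();
    return { reactionMs, reactionMs < 1000.0 };
}
```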

All movement functionality of the aeroplane was disabled during the whole user test; the aeroplane therefore remained completely still. However, two buttons on the joystick remained active: one to activate the reaction test and one to shoot a projectile from the weapon attached to the aeroplane. When the test person activated the reaction test, a cube spawned at a corresponding distance of 100 meters in front of the aeroplane. The cube spawned at a random time between 1,000 and 4,000 ms after the reaction test was activated by the test person. When the cube appeared, the test person was instructed to shoot it as fast as possible. The cube was destroyed when it was hit by the projectile. The time between the cube spawning and the test person pulling the trigger to launch the projectile was measured as the reaction time. A reaction test is defined as the task performed by the test person from the moment the activation button is pressed until the target cube is destroyed. A reaction test is defined as flawless if the reaction time is below 1,000 ms. If the reaction time is above 1,000 ms, that test is counted as a flawed measured value.

• The first part of the test was done without the HMD, with the test person looking directly at the computer screen.
  – Let the test person do five (5) reaction tests, where none was a part of the measured data set. This part of the task is simply done to give the test person familiarity with the task.
  – Let the test person do ten (10) reaction tests, where all reaction tests are flawless, according to the flawless description above in Section 3.4.1.
• The second part of the test was done with the HMD, where the test person looked through the HMD at the screen.
  – Let the test person do five (5) reaction tests, where none was a part of the measured data set. This part of the task is simply done to give the test person familiarity with the task.
  – Let the test person do ten (10) reaction tests, where all reaction tests are flawless, according to the flawless description above in Section 3.4.1.
• The third part of the test was done with the HMD, which took the user into immersive mixed reality.
  – Let the test person do five (5) reaction tests, where none was a part of the measured data set. This part of the task is simply done to give the test person familiarity with the task.
  – Let the test person do ten (10) reaction tests, where all reaction tests are flawless, according to the flawless description above in Section 3.4.1.

3.4.2. Hand-Eye Coordination

In a real aeroplane cockpit, there are a lot of buttons, switches, and sometimes screens, and the pilot sometimes needs to see which button, switch, or screen to interact with. A problem that could arise in a mixed reality simulator is that the cameras on the Varjo XR-1 are approximately 1 dm away from the eyes. This can lead to a parallax error that affects depth perception. [55] Therefore, the purpose of these tasks was to find out how much the Varjo XR-1 affects hand-eye coordination. Below, two hand-eye coordination tasks are described; in both tests, the user's eyes were approximately 30 cm away from the screen.

3.4.2.1. Circles on a tablet touch screen

The first task was a precision test from a website [56] that measures accuracy, where circles appear at random locations; it was performed on a tablet. The objective for the test person was to touch all the circles that appeared, and the more precise the touch, the more points were awarded. Missed circles were recorded to calculate the accuracy of the test person, as well as the number of touches. The circles also disappeared after a set time. Note that for every test, a total of 41 targets appeared. This test was performed by a test person with and without the Varjo XR-1 to see how the VST-mode affects hand-eye coordination. The test with the Varjo XR-1 was started within 10 seconds of equipping it, so that the test person would not get accustomed to the disorientation of using VST-mode. This was similar to how Park et al. tested hand-eye coordination in the second test described in Section 2.1.2. The test person performed the precision test three (3) times:

• The first time, the task was done directly on the tablet touch screen, to give the test person a feel for how the test works. Data was not recorded.
• The second time, the task was done directly on the tablet touch screen, where the score and accuracy were recorded as a reference point.
• The third time, the task was done with the Varjo XR-1 using the VST-mode to use the tablet touch screen, where the score and accuracy were recorded.

3.4.2.2. Trace the line shape

The second task tested the line-tracing skill of the test person with the application SketchAndCalc [57], similar to the first task described in Section 2.1.2. The application can import images, trace lines over said images, and calculate both the length of the traced line and the area enclosed by it. The test person had to trace three different shapes with their dominant index finger on the tablet touch screen. The three different shapes are shown in Figure 3.9, where each side of the triangle and the square was 10 cm and each side of the star was 2 cm. The actual size of the traced shapes was not the same as the sizes measured with SketchAndCalc, as the shapes were enlarged to be more easily visible through the Varjo XR-1. The task was done first without the VST-HMD to get reference data, and then with the VST-HMD. The data recorded were the time to complete the task and the accuracy of the trace. The time was measured from the first contact with the touch screen to the loss of contact. The accuracy D was measured with Equation 3.2, by taking the deviation area between the replicated shape A_T and the original shape A_O, divided by the length of the traced line L, which was how Gruen et al. measured it in their task described in Section 2.1.2. In Equation 3.1 the deviation area A_D is calculated, to be used in Equation 3.2.

\[ A_D = A_T - A_O \tag{3.1} \]

\[ D = \frac{A_D}{L} \tag{3.2} \]
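A direct transcription of Equations 3.1 and 3.2 into code (illustrative only; the area and length values are assumed to come from SketchAndCalc) is shown below:

```cpp
// Illustrative sketch of Equations 3.1 and 3.2: the deviation area is the difference
// between the traced area and the original area, and the accuracy D normalises it by
// the length of the traced line.

// Equation 3.1: A_D = A_T - A_O
double DeviationArea(double tracedArea, double originalArea)
{
    return tracedArea - originalArea;
}

// Equation 3.2: D = A_D / L
double Accuracy(double tracedArea, double originalArea, double tracedLineLength)
{
    return DeviationArea(tracedArea, originalArea) / tracedLineLength;
}
```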
