Entertainment for faster driving takeovers: Designing games for faster and safer takeovers on level 3 self-driving cars



Master’s degree Project in Informatics One year Level 22.5 ECTS

Spring term 2020 Luca Di Luccio

Supervisor: Per Backlund

Examiner: Mikael Johannensson


Abstract

The upcoming level 3 generation of self-driving vehicles will be characterized by the freedom of not having the driver’s hands on the steering wheel. This newly acquired freedom poses new challenges to the traditional passenger comfort paradigm, as drivers will spend more time on non-driving related tasks (NDRT). Certain constraints must be imposed, as level 3 systems will not be able to drive all the time without active feedback from the user. The driver needs to stay active enough to take over whenever a takeover is needed.

What effect will different NDRT have on the behavior of a driver in a self-driving car?

In our low fidelity driving simulator, we tested different simple actions (e.g. playing a simple 2D game). We then evaluated them based on their accident avoidance and situation awareness in the post-transition period. The results show a significant difference between the reaction speeds of the drivers before and after an active task.

Keywords: level 3 self-driving car, onboard entertainment, context switching, low fidelity simulator


Table of Contents

1 Introduction

2 Background

2.1 State of self-driving car

2.1.1 Level of automation

2.2 Status of the Non-driving-related Tasks

2.2.1 Driver Vehicle Interface and Ergonomics

2.3 Simulated Driving

2.4 Hardware

2.5 Unity 3D

2.6 Measuring effects of a takeover

3 Problem

3.1.1 Research Question

3.2 Method

3.2.1 Non-Driving Related Tasks

3.2.2 Experiment Definition

3.3 Ethical Considerations

4 The experiment

4.1 Virtual Environment

4.1.1 Physics Engine

4.2 Participant selection

4.3 Simulation Setup

4.4 Non-driving related tasks during the experiment

4.5 Questionnaire

4.6 Test Pilot Evaluation

5 Test Results

5.1 NDRT Categorizations

5.2 Influence of a NDRT

6 Analysis of the results

6.1 NDRT Categorizations

6.2 Influence of a NDRT

6.2.1 Chess

6.2.2 Flappy Bird

6.2.3 T-Test

7 Conclusions

7.1 Summary

7.2 Discussion

7.2.1 Limitation of a Low-Fidelity Environment

7.2.2 The difficulty of a Brake-Based reaction time test

7.2.3 The limited test subject in a pandemic situation

7.3 Future Work

References

Tables

Appendix A – Introduction speech

Appendix B – Consent Form

Appendix C – Experiment Data


1 Introduction

The automation of driving vehicles is approaching faster than expected, but some problems still need to be worked out. The human factor in such tasks is crucial, and it will remain so until we develop ways to make the interaction easier and more comfortable. Currently, research relies on a taxonomy of motor vehicle driving automation based on 5 levels (SAE, 2018).

In the current state, most self-driving cars are still only level 2 (Heath, 2018). This means that research is focusing on level 3 automobiles in both the self-driving systems and passenger comfort fields. The extended transit time and the limited accountability of the driver move the discussion away from traditional vehicle ergonomics, as the passengers do not need to take an active part in the driving process anymore. Finding ways to make the trip comfortable will be a priority for vehicle manufacturers because, with the increase of free time while driving, entertainment systems will become a real factor in the choice of potential buyers.

In the next level of self-driving vehicles (i.e. level 3), we will see the rise of Conditionally Automated Driving (CAD) (Naujoks, Befelein, Wiedemann et al., 2017). CAD relieves the car driver from monitoring the outside environment but, at the same time, the driver should still be vigilant enough, as manual driving is the backup option. The takeover process is the series of four steps that takes the user back to driving.

This paper investigates which approach to onboard entertainment is preferable on a level 3 self-driving vehicle. As different games and tasks can be categorized along different dimensions (Spiessl & Hussmann, 2011), the first half of the research project was to identify a series of tasks that could have a strong impact on the drivers' reaction time. The second half of the research is a driving experiment designed to find out which Non-Driving Related Task (NDRT) allows a fast takeover in real-life scenarios while still being interesting enough for the driver.

The test was intended to be performed in the University of Skövde driving simulator, which offers a well-defined simulation environment with little-to-no variation between users. The outbreak of COVID-19 closed all the University’s facilities, so we had to use a low-fidelity approach to conduct the experiments; this decreases the fidelity of the experiment but not the validity of the results.

The data obtained shows a difference in reaction time based on the NDRT given to the test subject, with a delayed reaction in case of fast-paced videogames and in case of a NDRT that the user has no familiarity with.


2 Background

To understand the work done here, we first need to explain the elements that are crucial to the thesis work: the current state of self-driving vehicles, the current state of the non-driving related tasks, simulated driving, Unity 3D and the simulator used for the experiments.

2.1 State of self-driving car

After many years of research, the field is reaching the prototype stage. Since 2007, many different vehicle manufacturers have been working on self-driving cars. Current development is mostly directed towards vehicle and driver safety (Aishwarya, 2016): ADAS (“Advanced Driver Assistance Systems”) use a combination of sensors (such as stereo cameras, long-range and short-range RADAR) to enable the car's communication with its surroundings. Safe performance is ensured by monitoring the driver’s behavior and displaying warnings in case of dangerous situations.

There are also sensor-based solutions ready to assist the driver through the three-dimensional representation of the space obtained by stereo cameras. This convergence of sensor-based technologies for communication will enable truly autonomous vehicles that will not require active control from the driver inside the car.

Currently, self-driving cars are on their way and they’re bringing benefits to both the users and the environment: as the need for human intervention will decrease over time there will be better road utilization, with higher passenger rates and lower carbon emissions overall.

2.1.1 Level of automation

To track with more accuracy the steps needed before the complete automation of the vehicles, SAE has created a standard that defines 5 (+1) levels of automation (SAE, 2018).

This taxonomy provides the basic information on the expected behavior from both the driver and the vehicle by ranking the different kinds of interactions starting with level 0.

A level 0 (“No Automation”) vehicle is characterized by “the full-time performance by the human driver of all aspects of the dynamic driving task, even when enhanced by warning or intervention systems”. This means that the human driver takes care of the steering, acceleration/deceleration, and monitoring of the driving environment, and is also the fallback in case of necessity.

A level 1 (“Driver Assistance”) vehicle is characterized by “the driving mode-specific execution by a driver assistance system of either steering or acceleration/deceleration using information about the driving environment and with the expectation that the human driver performs all remaining aspects of the dynamic driving task”. This means that the human driver is assisted by the vehicle system during either the acceleration/deceleration or the steering process. The driver also monitors the environment and is the fallback in case of necessity.

In a level 2 (“Partial Automation”) vehicle the driving is executed by one or more “driver assistance systems” to control both steering and deceleration/acceleration. The monitoring of the surroundings and the fallback in case of necessity must be done by the human driver.


From level 3 to level 5, the “automated driving system monitors the driving environment”. This means that the driver does not have to provide active attention to the surroundings anymore.

In a level 3 (“Conditional Automation”) vehicle, the driving is performed by the system in both monitoring and execution of steering and acceleration/deceleration. The human driver is still the fallback for the system: even if the driver's attention is not always required, a human must be present and ready to take over after a request to intervene.

In a level 4 (“High Automation”) vehicle, the driving is completely performed by the automated driving system in all its aspects, including the fallback, even if the human does not respond appropriately to a request to intervene. The automated driving system is, however, limited to the areas and conditions where the self-driving mechanisms can be used.

In a level 5 (“Full Automation”) vehicle, the steering wheel is not needed anymore, as all the driving aspects of the car will be performed by the automated driving system. There will be no need for a driver at all. At that point, the word “Driver” will be useless, as all the people in the vehicle will be “Passengers”.
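The division of roles described in this section can be summarized in a small lookup structure. The following is only an illustrative sketch in Python: the class and function names (`Agent`, `AutomationLevel`, `requires_takeover_readiness`) are our own and are not part of the SAE standard or of the simulator software.

```python
from dataclasses import dataclass
from enum import Enum


class Agent(Enum):
    HUMAN = "human driver"
    SYSTEM = "automated driving system"


@dataclass(frozen=True)
class AutomationLevel:
    level: int
    name: str
    monitors_environment: Agent  # who watches the driving environment
    fallback: Agent              # who takes over in case of necessity


# Roles as summarized in this section (SAE, 2018).
SAE_LEVELS = [
    AutomationLevel(0, "No Automation", Agent.HUMAN, Agent.HUMAN),
    AutomationLevel(1, "Driver Assistance", Agent.HUMAN, Agent.HUMAN),
    AutomationLevel(2, "Partial Automation", Agent.HUMAN, Agent.HUMAN),
    AutomationLevel(3, "Conditional Automation", Agent.SYSTEM, Agent.HUMAN),
    AutomationLevel(4, "High Automation", Agent.SYSTEM, Agent.SYSTEM),
    AutomationLevel(5, "Full Automation", Agent.SYSTEM, Agent.SYSTEM),
]


def requires_takeover_readiness(level: AutomationLevel) -> bool:
    """True when the system monitors the road but a human is still the fallback."""
    return (level.monitors_environment is Agent.SYSTEM
            and level.fallback is Agent.HUMAN)


takeover_levels = [lvl.level for lvl in SAE_LEVELS
                   if requires_takeover_readiness(lvl)]
print(takeover_levels)
```

Under this reading, level 3 is the only level where the driver may stop monitoring the road and still has to be ready to take over, which is why the takeover process is studied at that level.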

2.2 Status of the Non-driving-related Tasks

The new degree of freedom obtained through the use of level 3 (SAE, 2018) self-driving cars will enable Conditionally Automated Driving (CAD for short) (Naujoks et al., 2017). For partially automated driving, Spiessl & Hussmann (2011) created a categorization of the different tasks, using the following dimensions to describe them:

- primary modality of the task (visual vs. auditory).

- interaction (active: controlled by the driver vs. passive: controlled by the task).

- interruptibility (easy vs. difficult).

- coding of information (verbal vs. spatial).

With the use of those categories, it is possible to create a task that suits the different scenarios and assists the drivers in fulfilling their requirements: maintaining fitness to drive, noticing the takeover requests, and interrupting the NDRT on-the-fly.
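As a minimal sketch of how the four dimensions can be used to describe and compare tasks, each task can be stored as a small record. The Python names below (`NDRTProfile`, `differing_dimensions`) are hypothetical, and the two example tasks and their labels are illustrative guesses, not the categorization obtained in this thesis.

```python
from dataclasses import dataclass, fields
from enum import Enum


class Modality(Enum):
    VISUAL = "visual"
    AUDITORY = "auditory"


class Interaction(Enum):
    ACTIVE = "controlled by the driver"
    PASSIVE = "controlled by the task"


class Interruptibility(Enum):
    EASY = "easy"
    DIFFICULT = "difficult"


class Coding(Enum):
    VERBAL = "verbal"
    SPATIAL = "spatial"


@dataclass(frozen=True)
class NDRTProfile:
    name: str
    modality: Modality
    interaction: Interaction
    interruptibility: Interruptibility
    coding: Coding


def differing_dimensions(a: NDRTProfile, b: NDRTProfile) -> list:
    """Names of the dimensions on which two tasks differ."""
    return [f.name for f in fields(NDRTProfile)
            if f.name != "name" and getattr(a, f.name) != getattr(b, f.name)]


# Illustrative labels only (assumptions, not results of the study).
audiobook = NDRTProfile("audiobook", Modality.AUDITORY, Interaction.PASSIVE,
                        Interruptibility.EASY, Coding.VERBAL)
arcade_game = NDRTProfile("2D arcade game", Modality.VISUAL, Interaction.ACTIVE,
                          Interruptibility.DIFFICULT, Coding.SPATIAL)

diff = differing_dimensions(audiobook, arcade_game)
print(diff)
```

Comparing two profiles in this way makes it explicit which dimensions change between two experimental conditions.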

In a level 3 self-driving vehicle it is crucial to design good takeover processes. The driver will need to reconfigure their sensory, motor, and cognitive states, so the switching process must be as easy to perform as possible. We can describe the transition process through the following four steps (Naujoks et al., 2017):

1) “CAD Engaged”: during the automated driving time the driver might have their attention lowered. Empirical assessments must be performed to identify negative behavior and select the best strategy (e.g. a sleepy driver is less reliable than taking a longer road)

2) “CAD Degraded”: The first notice is given to the driver; the step is strongly indicated by the NDRT that the driver is currently doing

3) “CAD disengagement”: As soon as the driver is ready to take over, the NDRT gets terminated by the driving system.

4) “Manual Drive”: the takeover is completed, the NDRT completely stopped and from
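The four transition steps listed above can be read as a simple linear state machine. The sketch below is one possible Python encoding under our own assumptions: the names `TakeoverStep`, `next_step` and `ndrt_running` are hypothetical, and treating the NDRT as still running during “CAD Degraded” is our interpretation of the step descriptions.

```python
from enum import Enum


class TakeoverStep(Enum):
    CAD_ENGAGED = 1
    CAD_DEGRADED = 2
    CAD_DISENGAGEMENT = 3
    MANUAL_DRIVE = 4


def next_step(step: TakeoverStep) -> TakeoverStep:
    """Advance one step in the transition; manual driving is the final state."""
    if step is TakeoverStep.MANUAL_DRIVE:
        return step
    return TakeoverStep(step.value + 1)


def ndrt_running(step: TakeoverStep) -> bool:
    """Assumption: the NDRT runs until the system terminates it at disengagement."""
    return step in (TakeoverStep.CAD_ENGAGED, TakeoverStep.CAD_DEGRADED)


step = TakeoverStep.CAD_ENGAGED
trace = [step]
while step is not TakeoverStep.MANUAL_DRIVE:
    step = next_step(step)
    trace.append(step)

print([s.name for s in trace])
```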


2.2.1 Driver Vehicle Interface and Ergonomics

Increased transit time will result in a major interest in the ergonomic factors of the driving process, such as sound, noise, comfort, and vibrations of the vehicle. These factors are relevant even with automated vehicles, as the passengers are mainly affected by the mechanical design of the vehicle (Elbanhawi et al., 2015). The behavior of the car plays a large role in disturbances that can decrease the driver’s performance.

With a word as subjective as “Ergonomics”, it is easy to assume that we cannot consistently evaluate such aspects of driving, even though methods have been studied (Da Silva, 2002) and documented in different standards (such as ISO TR-3352 for noise reduction and ISO 2631-1 for vibrations of the vehicle).

The latest technological enhancements have helped in the development of Driver Vehicle Interfaces (DVI) (Elbanhawi et al., 2015). The idea is to improve passenger comfort by limiting driver distraction. Diversions for the driver have been proven to be connected to an increased risk of crashing because the driver's attention is shifted towards other activities. The design of such components has been shown to have a great influence on driver behavior. In the research field (Elbanhawi et al., 2015), different tests have been conducted, such as providing visual feedback on the leading traffic or using Augmented Reality (AR) to provide information to the user without being cumbersome. The DVI plays a crucial role in the ergonomics of a vehicle, as it can increase the productivity of the passengers, who can now do their daily tasks (e.g. checking emails, reading messages, etc.), while still providing comfort and clear information on the path ahead.

2.3 Simulated Driving

With the use of a Simulated Driving Environment, it is possible to reduce the costs and maximize the data collection during tests (Nilsson, 1993) as the consequences are minimized by the possibility to repeat critical and dangerous real-life scenarios with no additional cost or preparation time.

Simulated environments are often used to help train qualified workers in specific fields such as aviation and military training (National Training & Simulation Association, n.d.). According to the Transfer Effectiveness Ratio (TER), three hours in the simulator can replace 54% of the activities of three hours of flight. A similar situation has been found for military training: the Apache Longbow Force Development Test and Experimentation analyzed data from two training processes and found that expenses in the simulator were reduced by up to 6 times ($0.7M in the simulator against $4.05M in real-life training), without counting the expenses of gasoline and artillery.

All of this without putting the life of the trainee at risk.
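As a quick check of the cost figures quoted above, the ratio between the two training budgets can be computed directly. This is plain arithmetic on the two numbers reported in the text; the variable names are our own.

```python
# Costs reported in the text, in millions of US dollars.
simulator_cost_musd = 0.7
live_training_cost_musd = 4.05

# How many times more expensive the real-life training was.
cost_ratio = live_training_cost_musd / simulator_cost_musd

# Share of the real-life budget saved by training in the simulator.
savings_share = 1 - simulator_cost_musd / live_training_cost_musd

print(f"ratio: {cost_ratio:.2f}x, savings: {savings_share:.0%}")
```

The ratio comes out slightly below 6, which is consistent with the "up to 6 times" wording.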

Driving can use simulated environments as well. It has been shown that virtual environments can help in the prevention of accidents behind the wheel (Ivancic & Hesketh, 2010) and can increase awareness regarding important driving concepts (Roenker, 2003) such as braking speed. Virtual driving systems can help us understand the different behaviors of the drivers and how they approach driving problems.

As the cost and the quality of the simulator increase, the sensation of realism increases too, but only up to a point. As seen in the Ford Driving Simulator (Cathey, 2000), the more the simulator tries to emulate reality, the more it can fail to reproduce the feedback of the real world because of the uncanny valley effect. According to the research in the field (Green, 2015), the correct steps to create a simulated environment are to first “Replicate real driver behavior and performance” and then “mitigate the problems such as ‘Speed is too steady’ and ‘The subject drives too fast’”, because the simulator will not take into consideration environmental variables such as wind and temperature that can change not only the velocity of the car but also the perception of the road around the driver.

This mismatch between real life and simulation is due to two defining factors: fidelity and validity. The first concerns the physics of the simulation and how the different actions and reactions affect it; every simulated environment must have fidelity as its first goal. According to Wyatt (2007), there are three types of fidelity: Equipment fidelity (e.g. using a car and not a videogame pedal), Environmental fidelity (e.g. the different aspects of the environments feel natural to the user) and Psychological fidelity (e.g. the feedback given by the simulator is close to reality). Validity, on the other hand, defines how closely the simulated results match the data collected in real life. The latter is the reason why not all simulation games can be considered valid simulators.

The different analyses of how simulation fidelity impacts training effectiveness (de Winter, 2007) have shown that a low-fidelity environment can still be considered a good approximation of real-life scenarios. This kind of simulation experience offers a good level of validity and a high level of credibility in the data obtained. The downside of the low-fidelity approach is that it can increase simulator sickness and distraction in the test subject, who will never feel completely immersed.

2.4 Hardware

Some important distinctions can be made between simulators, as they can be divided based on their fidelity (Miller, 1954): we can have a “Physical fidelity” simulator, where the physical characteristics are simulated as realistically as possible, or a “Psychological fidelity” simulator, where the simulation manages to recreate the sensations and skills needed for the real-life counterpart of the action performed.

Nowadays it is common to divide the simulators based on their level of fidelity (Matthews & Yachmetz, 2008). There are currently 4 described levels of fidelity, ranging from level 1 (state of the art) to level 4 (low fidelity).

With a set of gaming pedals and a steering wheel, we can make a low-fidelity environment that is not as realistic as the high-end ones but is still advanced enough to test the differences in reaction time between the test subjects.

A low fidelity simulator aims to reproduce the simulated scenario without focusing too much on all the factors that the user might experience in a real-life scenario. This kind of simulation is cheaper than the high-fidelity simulations as it does not require advanced simulating software and hardware to be performed. We can find a use for low fidelity simulators in situations where the physical characteristics of the environment are less vital.

A high-fidelity simulator reproduces the simulated scenario in the most realistic way possible, as it aims to reproduce as many elements, and as many of the relationships between them, as possible. This kind of simulation is widely accepted, as we can find it in the world of flight simulation. Airplanes are usually expensive and dangerous to fly for training purposes, so millions of dollars have been spent to recreate the sensation of flight.

2.5 Unity 3D

Unity 3D is a development environment designed to build cross-platform games in a simplified manner. The interface allows the user full control over the game elements, from the camera perspective to the animation that will be displayed. The simplicity of the interface ensures that even a non-experienced developer can easily manage the high degree of complexity behind the development of a videogame.

Unity allows using both the graphical interface and the scripting system, with JavaScript or C#, to assist in the creation of games that suit our needs. The software is made of different subsystems that enable easier management of all the aspects of the videogame: from NPCs to world building, from sound effects to the physics of the simulation.

It is worth noting that the building of these virtual environments is usually a tedious and repetitive process, as every GameObject has to be instantiated, modified to accommodate our needs, and positioned in the simulated world. This approach can create problems even in small projects: while this level of complexity is manageable for a few GameObjects, it quickly becomes a problem as their number grows.

Unity helps this process by letting the user create and share “Prefabs”, an asset type that allows you to store a GameObject complete with components and properties. The prefab acts as a template from which you can create new object instances in the scene. Any edits made to a prefab asset are immediately reflected in all instances produced from it, but you can also override components and settings for each instance individually. Through the “Asset Store”, we can buy pre-made assets and prefabs, making the creation process faster.

We used Unity to modify the simulation software already used by the University of Skövde, making it more suitable for our needs. For the research, we made different prefabs that are needed to trigger events at specific moments of the simulation. The result is two prefabs: one enables self-driving in the vehicle, while the other forces the user to brake. To make the forced brake feel more real, we also added new vehicle NPCs roaming freely around the map.

2.6 Measuring effects of a takeover

As context switching is becoming more common in day-to-day life, the number of research works on it is increasing. Such works aim to explore and explain how the human brain reacts to the interruption of a task and how those actions can affect future behavior in the test subject.

During a takeover process, the interruption of the task can affect the test subject differently. Studies (Latorella, 1998) found that auditory interruptions can cause plane pilots to commit more errors compared to visual interruptions. Stress and anxiety can also be an important factor in this kind of process, as the act of switching can affect our actions even after we resume our original task. Some examples can be found in the strategy and approach that the user applies to the original task itself (Zijlstra et al., 1999).


Interruption during a low task workload can facilitate decision performance, while interruption during high workload conditions usually decreases the decision performance of the user (Speier et al., 1999). Although our work explored decision making in a very limited way (i.e. the only decisions were to press a button and to drive a simulated vehicle), we have explored how the task can influence such decisions.

Most of our work with NDRT is based on the Theory of Multiple Resources (Wickens, 1992). This theory states that the separation of mental processes is based on the separate resources of the brain. The resources are divided into the modality of the process and the coding of the process. The former describes how we interact with the task, the latter how the information is exchanged between the user and the task itself.


3 Problem

Entertainment systems will be an asset in the average self-driving car of the future, due to the need for distraction while the car is in self-driving mode. In the future, with the arrival of level 3 self-driving vehicles, the time spent actively driving will decrease significantly.

The current paper tries to assess the impact of different Non-Driving Related Tasks during a takeover scenario, as the NDRT must be easy to interrupt and must not weigh heavily on the focus of the user, who can be asked to take control of the vehicle on short notice.

The simulator environment has already been implemented, tested, and used in different projects carried out by the University of Skövde (Backlund et al., 2010; Procaccini, 2013), but none of those have addressed research questions related to self-driving cars. This is what drove my research question before the COVID-19 pandemic, as the intention was to use the simulator provided by the university to discover which tasks were less attention-heavy for the driver. Now that we have to run the experiment in our low-fidelity environment, the problem has shifted a little, as we cannot account for the environment anymore.

The purpose of this research is to evaluate how different NDRTs impact the driver’s attention and which NDRT helps the most with the reaction time of the test subject. This will be investigated by changing the NDRT between different subjects and comparing their reaction times. All the subjects will use the low-fidelity simulator for the same amount of time, with the same simulation world and with the same settings: the only difference will be the NDRT shown on the screen of the tablet.

The scope of the work is to:

1) Analyze and categorize different Non-Driving Related Tasks, basing our judgment on the different dimensions of tasks (Spiessl & Hussmann, 2011).

2) Find how the different dimensions of a NDRT can influence the attention of the driver and their reaction time.

3.1.1 Research Question

The research intends to explore different approaches to onboard entertainment trying to find out which one will be less distracting for the user driving in the level 3 self-driving vehicle.

To answer the question, we will test different tasks on different subjects, and we will try to spot differences in the distribution of the results. The tasks will be chosen based on the dimensions along which they engage the user’s attention (Spiessl & Hussmann, 2011). The research question is hence: what is the effect of different types of Non-Driving Related Tasks on the reaction time of the driver after a takeover process has occurred?

3.2 Method

We designed a test route in the driving simulator that the different subjects have to complete, to test their reaction time both before and after the simulated self-driving ride and the forced takeover. All the subjects taking part in the experiment used the same course and experienced the same obstacles in the same manner: the only aspect that differed among the various runs was the Non-Driving Related Task that they could play during the self-driving sections of the path.


3.2.1 Non-Driving Related Tasks

During the ride, the subject will experience the Non-Driving Related Task in the self-driving areas of the map. During the design of the experiment, many ideas on how to display the NDRT arose, each with pros and cons that we had to account for while solving this issue.

- The NDRT can be displayed on the windshield as a completely opaque overlay over the road view. This approach is the most radical one, as the driver cannot see the road in front of them. This can be considered a problem for two reasons:

1) The user does not know the current situation and that could slow down the takeover process as the driver will need a few seconds to assess the situation before being able to drive

2) At present, such an overlay can be considered illegal, as it blocks the view of the driver. It is worth noting that the current laws are made for the current transport system, where the driver should be aware of their surroundings at all times

- The NDRT can be displayed on the windshield as a semi-transparent overlay over the road view. This approach is the best one if done carefully as it leaves the user to choose how much focus is needed for both the task and the road ahead.

- The NDRT can be displayed on an external tablet/phone, situated behind the steering wheel and above the dashboard.

1) This approach is the easiest to implement as it does not require a complete redesign of Unity’s camera view

2) This is the closest approach to the current real-world scenarios: people are already driving while using their phone or by looking at the car's screens. By using this strategy, we can increase the validity of the test without having to explain any new concept to the subjects of the research.

For the experiment, we have chosen an Android Tablet because it is the easiest tablet-like device to control from the computer. With the right tool, we can trigger the lock screen on and off forcing the player to stop the NDRT.
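The thesis does not name the tool used to control the tablet. As one possible approach, and purely as an assumption on our part, the Android Debug Bridge (`adb`) can send key events that put the screen to sleep or wake it up. The Python sketch below only builds the command; the helper names `screen_command` and `set_screen` are hypothetical, and the command is executed only when `dry_run` is set to `False` with a device connected.

```python
import subprocess
from typing import List, Optional

# Android key event names for turning the screen off and on.
KEYCODE_SLEEP = "KEYCODE_SLEEP"
KEYCODE_WAKEUP = "KEYCODE_WAKEUP"


def screen_command(turn_on: bool, serial: Optional[str] = None) -> List[str]:
    """Build the adb command that wakes up or puts to sleep the tablet screen."""
    cmd = ["adb"]
    if serial:
        cmd += ["-s", serial]  # target a specific device when several are attached
    keycode = KEYCODE_WAKEUP if turn_on else KEYCODE_SLEEP
    cmd += ["shell", "input", "keyevent", keycode]
    return cmd


def set_screen(turn_on: bool, serial: Optional[str] = None,
               dry_run: bool = True) -> List[str]:
    """Return the command; run it only when dry_run is False."""
    cmd = screen_command(turn_on, serial)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


# Dry run: stop the NDRT by switching the tablet screen off.
print(" ".join(set_screen(turn_on=False)))
```

Note that these key events switch the display off and on; whether the device also shows its lock screen depends on the tablet's own security settings.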

Another decisive step is the selection of the right NDRT, as not all of them are expected to affect the attention of the user in the same way. To choose the ones best suited for our experiment, we can categorize them by their dimensions (Spiessl & Hussmann, 2011).

3.2.2 Experiment Definition

The experiment will be the same for all the participants, as we need a solid base to work with. The experiment is divided into the following steps:

1) The subject sits in front of the simulation. We explain the basic controls of the simulated vehicle and how to interpret the action signal.

2) The simulation starts and the subject drives for 60 seconds.

3) The simulation shows the action signal to the user. We measure the time between the signal prompt and the action performed; this measurement will be our baseline.

4) The simulator lets the user drive for 30 more seconds before enabling self-driving. From now on the virtual car is driving itself and the tablet that shows the

5) The subject starts using the NDRT.

6) After 60 seconds of self-driving, the car alerts the driver that self-driving will be deactivated in 15 seconds.

7) Now the subject is back into normal driving. The takeover is completed.

8) Less than 30 seconds after the takeover is completed, we show the action signal again and we count the amount of time as we did before. This measurement will be our result.

9) The subject fills in a simple questionnaire about how much they knew about the task.
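The two measurements taken in steps 3 and 8 can be compared per subject as a simple difference. The following Python sketch shows the idea; the names `SignalEvent` and `takeover_delay` are hypothetical and the timestamps are made-up illustrative values, not data from the experiment.

```python
from dataclasses import dataclass


@dataclass
class SignalEvent:
    shown_at: float   # seconds since simulation start when the signal appeared
    action_at: float  # seconds since simulation start when the subject reacted

    @property
    def reaction_time(self) -> float:
        return self.action_at - self.shown_at


def takeover_delay(baseline: SignalEvent, post_takeover: SignalEvent) -> float:
    """Positive values mean the subject reacted more slowly after the takeover."""
    return post_takeover.reaction_time - baseline.reaction_time


# Made-up timestamps for one subject (illustration only).
baseline = SignalEvent(shown_at=60.0, action_at=60.8)
post_takeover = SignalEvent(shown_at=190.0, action_at=191.1)

delay = takeover_delay(baseline, post_takeover)
print(f"delay after takeover: {delay:.2f} s")
```

Using each subject's own baseline as the reference keeps individual differences in reaction speed from being confused with the effect of the NDRT.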

3.3 Ethical Considerations

The study is designed from the ground up to be done in a simulated environment. This is due to the inability to consistently and safely test reaction time in real-life traffic situations.

This field of study (i.e. automated vehicles) is moving faster than research can address; this is a problem because we are not able to understand the unintended side effects of the newest technological advancements.

To evaluate the different Non-Driving Related Tasks, we selected volunteer students aged 19 to 28 and of different nationalities. Every participant was informed of the purpose of the study but not about the methodology. Each tester was informed of how and why the personal data would be used, as well as the resulting data from the tests. They were also advised on the possibility of motion sickness due to the simulator. All participants gave their written consent to take part in the study; a copy of the document can be found in Appendix B of this paper.


4 The experiment

This section will describe the software that we are going to run on our low fidelity simulator and all the information needed to better understand how the experiment was carried out, from the design to the implementation aspects. Unity 3D has been used as a game engine to implement the original simulation so any derived work must be programmed with and for a Unity3D framework.

The simulation environment used by the medium-fidelity simulator of the University of Skövde was originally based on the (now deprecated) asset “Unity Car” even if drastic changes have been made to reach the current situation. We decided to modify the original simulated environment to better fit our purpose.

4.1 Virtual Environment

The scene used in the simulation is shared with the University of Skövde, as our objective was to have a reliable and somewhat realistic environment where the user could drive freely. To add the functionality needed for the experiment, we created two different prefabs: HonkNowGate and SelfDrivingToggleGate.

HonkNowGate is an invisible wall with a trigger component that, if it detects a collision with the simulated vehicle, shows a message asking the test subject to honk the horn (see Fig. 1). We use this prefab throughout the experiment to take measurements of the reaction time of the user while driving. The trigger is designed to save the acquired data in a file automatically, without any intervention from the test operator. This approach made sure that we did not need to stay in the same room as the test subject while the experiment was carried out, reducing distractions to the bare minimum.

Figure 1 HonkNowGate example

SelfDrivingToggleGate is an invisible wall with a trigger component. If the collision with the simulated car object is detected, a countdown will be displayed on the screen. At the end of the countdown, the self-driving mode will be enabled, and the tablet will be turned on. From this moment the driver can use the NDRT available on the tablet until the car reaches another SelfDrivingToggleGate trigger where the same thing will happen but this time the


Figure 2 SelfDrivingToggleGate example

Both prefabs play a characteristic sound to make the user aware of what is happening without forcing them to read the text in the middle of the screen.
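The toggling behavior of SelfDrivingToggleGate can be sketched as a small state change. The sketch below is in Python and is only illustrative: the names are hypothetical, the 3-second countdown is an assumed value, and the behavior at the second gate (self-driving and tablet both switched off) is inferred from the takeover step listed in section 7.1.

```python
class SimulatedCar:
    """Hypothetical container for the two flags the gate changes."""

    def __init__(self):
        self.self_driving = False
        self.tablet_on = False


def countdown_messages(seconds):
    """Texts shown on screen while the countdown runs, e.g. 3, 2, 1."""
    return [str(n) for n in range(seconds, 0, -1)]


def cross_toggle_gate(car, countdown_seconds=3):
    """Show the countdown, then flip the driving mode and the tablet together."""
    shown = countdown_messages(countdown_seconds)
    car.self_driving = not car.self_driving
    car.tablet_on = car.self_driving   # the NDRT is only available while self-driving
    return shown


car = SimulatedCar()
first = cross_toggle_gate(car)    # first gate: self-driving and tablet turn on
state_after_first = (car.self_driving, car.tablet_on)
cross_toggle_gate(car)            # second gate: takeover, tablet is locked again
state_after_second = (car.self_driving, car.tablet_on)
```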

To enhance the experience and to make the driver feel like they are driving in the real world, we added NPC vehicles (i.e. vehicles that are not controlled by the player) that roam around the scene. The NPCs behave in a predetermined way, which makes them predictable and reproducible.

The path of the experiment is illustrated in figure 3: landmarks B and C are the two SelfDrivingToggleGate triggers, while landmarks A and D are the two HonkNowGate triggers.

Figure 3 Experiment path

4.1.1 Physics Engine

The simulated engine of the vehicle had been completely modified to match the type of vehicle used in the University of Skövde driving simulator, a Volvo S80. In our experiment, simulating the real car is not needed, as the test subject does not get to experience that driving environment. We therefore decided to adjust all the speed-related car settings, such as acceleration, horsepower, and torque, to make the simulation feel more familiar, even at the cost of reducing the overall fidelity.


4.2 Participant selection

Given the current pandemic situation, finding participants for the experiment is a complex task. For this reason, we decided to use “Convenience Sampling”: only people close to us were asked to participate in the experiment. With this approach, we were able to reach a good number of people (24 + 1 pilot tester) without putting them at risk.

4.3 Simulation Setup

All the experiments took place in the same setting: same room, approximately the same sunlight, same driving setup, and the same number of people in the room. Because of the health risks related to the number of people participating in the test and the current concern about COVID-19, the driving setup and any other surface the test subject might touch or interact with were cleaned before the start of each test.

As previously described, we used a steering wheel and a set of pedals to simulate the feeling of the vehicle. The hardware used for the experiment is:

• The driving controller is a Logitech G920, composed of a steering wheel and a set of pedals. The shifter is incorporated into the steering wheel.

• The tablet used to show the NDRT to the subject is a Lenovo Smart Tab

• The computer used to run the simulation is a Lenovo Ideapad L340

The steering wheel and the computer screen were positioned at a comfortable height on a writing desk, while the pedals were locked in place below. The test subject sat on a generic office chair. The tablet used to display the NDRT was placed to the left of the steering wheel and was locked (i.e. the task was not available to the user) until the self-driving section of the simulated driving path had been reached.

Figure 4 The setup of the experiment


The experiment was described to the users in advance. They were informed about the low-fidelity driving simulation and its self-driving aspect. They were not aware of the real objective of the experiment, nor did they have any information on the simulation itself. We chose this approach because it is the fastest way to make the users familiar with the tools without letting them develop biases or confidence with the low-fidelity simulation itself.

Before the experiment began, we talked briefly with the test subject to teach them the basics needed to drive the simulated vehicle, as we needed to explain how to honk without drawing too much of the subject’s attention to it. We opted for a quick introduction, which can be found in appendix A.

4.4 Non-driving related tasks during the experiment

The candidate tasks have already been described in the previous chapter. Here we report the tasks that were used in the final experiment and how they were approached, including the control group:

- Control group: the control group did not have access to the tablet while the vehicle was self-driving. This has been done to get a better baseline of the expected reaction time. No questionnaire was given to this group of subjects.

- Semi-Active task “Playing chess”: the group had access to the tablet while the car was self-driving. We decided to use “Lichess” to display the game, as it offers a free, libre, and open-source chess game for Android tablets, released under AGPL-3. The games had no time limit, and the subjects played against bots.

- Active task “Playing an active game”: the group had access to the tablet while the car was self-driving. We decided to use an open-source clone of the popular game Flappy Bird called “OpenFlappyBird”, released under the Unlicense, because of its fast and somewhat addictive gameplay. The subject was free to restart the game and/or play another round at any time.

4.5 Questionnaire

At the end of each run of the experiment, the test subject was given a questionnaire to fill in. The questionnaire contained questions about the NDRT they experienced during the experiment, such as:

- How experienced do you feel about the NDRT? [1/2/3/4/5]

- How much have you enjoyed the NDRT? [1/2/3/4/5]

- Do you want to play the NDRT again? (without the simulator) [1/2/3/4/5]

The questionnaire was given to the participants through the tablet itself and it was made using Google Forms.

No survey was given to the test subjects in the control group as they did not have any interaction with the NDRT.


4.6 Test Pilot Evaluation

Initially, we designed the experiment around the idea of asking the subject to brake when prompted. The pilot test showed that this approach was not ideal, which led us to measure the reaction time with a honk instead. Asking the test subject to brake at a specific point along the simulated path is problematic for two main reasons: braking does not have any audio cue, and it can be triggered by mistake. We discussed the latter problem at length and found that the measurement could have been implemented in two different ways:

1) We consider the next brake signal after the user is prompted to brake. This approach forces users who were already braking to lift their foot from the brake and start braking again.

2) We consider the current brake signal after the user is prompted to brake. This approach registers 0:0:000s as the reaction time if the user was already braking before the prompt was shown.

Both implementations were considered non-ideal, so we shifted our attention to the honk key. Honking is an action that must be performed on purpose while driving, as it is rarely used either in real life or in simulations. It also provides audio feedback to the test subject, making it clearer when the input has been registered by the simulation.
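The difference between the two brake-based implementations can be made concrete with a short sketch. It is written in Python for illustration only; the function names and the sampled-signal format (time in milliseconds, pedal pressed or not) are assumptions, not the format used by the simulator.

```python
def reaction_next_press(brake_samples, prompt_time):
    """Approach 1: time until a brake press that STARTS at or after the prompt."""
    previous = False
    for t, pressed in brake_samples:
        # A press counts only on a released-to-pressed transition after the prompt
        if t >= prompt_time and pressed and not previous:
            return t - prompt_time
        previous = pressed
    return None


def reaction_current_signal(brake_samples, prompt_time):
    """Approach 2: time until the brake is found pressed at or after the prompt."""
    for t, pressed in brake_samples:
        if t >= prompt_time and pressed:
            return t - prompt_time
    return None


# Hypothetical driver who is already braking when the prompt appears at 5000 ms,
# releases the pedal at 5400 ms and presses it again at 6100 ms
samples = [(4000, True), (5000, True), (5400, False), (6100, True)]
next_press = reaction_next_press(samples, 5000)
current_signal = reaction_current_signal(samples, 5000)
```

In this example the first approach measures the release-and-press cycle (1100 ms) rather than the reaction itself, while the second reports a reaction time of zero. A honk avoids both problems because the key is normally idle when the prompt appears.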

The background information provided to the test subject proved successful at explaining the gist of the experiment without clearly stating the aim of the study.


5 Test Results

During the design phase of the experiment, we decided to gather the minimum amount of personal data possible. This approach was agreed upon to streamline the data management for the analysis phase. At the end of the experiments, the data collected had the following shape:

Table 1 Data collected for the experiment

Task experienced | Number of subjects
Control Group | 8
Semi-Active Task | 8
Active Task | 8
Total | 24

As anticipated in the third chapter, the scope of our experiment is wide enough to cover two distinct research areas:

1) To analyze and categorize different Non-driving Related Tasks.

2) To find how the different dimensions of an NDRT can influence the attention of the driver and their reaction times.

5.1 NDRT Categorizations

To correctly analyze the different NDRT we need a formal categorization. Spiessl & Hussmann (2011) propose a 4-dimension categorization (already described in section 2.2) based on the theory of multiple resources (Wickens, 1992) described in section 2.6.

For a driver, the separation of those resources is essential to safety, as distraction is the leading cause of accidents worldwide (Strayer & Johnston, 2001). Given the different dimensions of a task, the Alliance of Automobile Manufacturers Guidelines (AAM Driver Focus-Telematics Working Group, 2006), and the initial work by Spiessl & Hussmann, we decided to create an operational categorization for the different NDRT. The results are summarized in Table 2.

The “Primary Modality” is how the task presents itself to the user; it can be Visual, Auditory, or both. A Visual task requires the driver to look at the source of the task to receive information, while an Auditory task only requires the ability to hear the information given by the task. Some tasks are both Visual and Auditory, depending on what is needed to complete them. It is worth noting that, in our opinion, a visual task such as playing a game is still considered a Visual task even if it has audio effects, as we catalog each task based on its main aspect of interaction.

The “Interaction” describes if the task is interactive or not. It can either be Active if the user is required to interact with the task or Passive if the user is not.


The “Interruptibility” of a task is a measurement created by Spiessl & Hussmann based on the Guidelines of the Alliance of Automobile Manufacturers. It describes the perceived cost of mental resources (Wickens, 1992) needed to interrupt the task itself. The scale used for this measurement goes from Low (i.e. easy to interrupt) to High (i.e. difficult to interrupt). The value assigned to each task is subjective for the most part.

The “Coding of Information” is how the user can interact with the task. It can be either Spatial or Verbal. The former describes tasks that require the user to move, either the hands or the whole body, while the latter describes tasks where the voice is used to interact with the task. When a task requires both behaviors, we catalog it by its most used interaction system.
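The four dimensions can also be written down as a small data structure, which makes the categorization easy to query. The sketch below is in Python; the type and field names are illustrative assumptions, the interruptibility labels follow the Easy/Medium/Hard wording of Table 2, and the two example rows are taken from that table.

```python
from dataclasses import dataclass
from enum import Enum


class Modality(Enum):
    VISUAL = "Visual"
    AUDITORY = "Auditory"
    VISUAL_AUDITORY = "Visual/Auditory"


class Interaction(Enum):
    ACTIVE = "Active"
    PASSIVE = "Passive"


class Coding(Enum):
    SPATIAL = "Spatial"
    VERBAL = "Verbal"


@dataclass(frozen=True)
class TaskProfile:
    """One row of the categorization (hypothetical representation)."""
    name: str
    modality: Modality
    interaction: Interaction
    interruptibility: str   # "Easy", "Medium" or "Hard"
    coding: Coding


def requires_eyes(task):
    """True when the task needs the driver to look at its source."""
    return task.modality in (Modality.VISUAL, Modality.VISUAL_AUDITORY)


flappy = TaskProfile("Playing a game that requires constant attention",
                     Modality.VISUAL, Interaction.ACTIVE, "Hard", Coding.SPATIAL)
audiobook = TaskProfile("Listening to an audiobook",
                        Modality.AUDITORY, Interaction.PASSIVE, "Medium", Coding.VERBAL)
eyes_needed = [task.name for task in (flappy, audiobook) if requires_eyes(task)]
```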

Table 2 Tasks selected and their dimensions

Task | Primary Modality | Interaction | Interruptibility | Coding of Information
Watching a movie | Visual/Auditory | Passive | Easy/Medium | Verbal
Listening to an audiobook | Auditory | Passive | Medium | Verbal
Doing internet research | Visual | Active | Easy | Spatial
Reading the lyrics of a song | Visual/Auditory | Passive | Easy | Spatial
Playing a game that requires constant attention (e.g. flappy bird) | Visual | Active | Hard | Spatial
Playing a game that does not require constant attention (e.g. chess) | Visual | Active (passive during the opponent’s turn) | Easy* | Spatial (verbal if there is the possibility of playing through voice command)
Speech Commands | Auditory | Active | Easy | Verbal

* Chess is a game that can be very distracting because as the game goes on, the number of decisions to remember and to keep in mind grows exponentially. We decided to rank it as “Easy” because the self-driving section of the simulation was short and the player couldn't reach those advanced moments of the game.

5.2 Influence of a NDRT

After collecting the response times from all the test subjects, we analyzed the data to understand the patterns that emerged. Figure 5 shows the response times as box plots.

Figure 5 The distribution of the data obtained by the experiment

The data labeled “Pre-NDRT” is the set of all the measurements taken before the NDRT took place. As described in the method section, we can think of it as the baseline reaction time for the dataset. As the graph shows, the average is 1667 milliseconds over our whole dataset (24 measurements), with outliers at 2576 ms and 959 ms.

Looking at the different NDRT conditions examined throughout the experiment (8 measurements for each category), we can see that the average for the control group matches the baseline defined by the Pre-NDRT data.
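As an illustration of how the quantities read off a box plot are obtained, the following Python sketch computes the mean, the quartiles, and the points beyond the usual 1.5 × IQR fences. The input values are placeholders, not the measurements of this study, and the function name is hypothetical. Plotting tools may use a slightly different quartile convention than the `statistics` module.

```python
import statistics


def box_plot_summary(values_ms):
    """Return the mean, the quartiles, and Tukey-style outliers of a sample."""
    q1, median, q3 = statistics.quantiles(values_ms, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values_ms if v < low_fence or v > high_fence]
    return {
        "mean": statistics.mean(values_ms),
        "q1": q1,
        "median": median,
        "q3": q3,
        "outliers": outliers,
    }


# Placeholder reaction times in milliseconds, NOT the data collected in the experiment
placeholder_ms = [1000, 1200, 1400, 1600, 1800, 2000, 2200, 3600]
summary = box_plot_summary(placeholder_ms)
```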

The situation is different for the actual NDRT we wanted to explore: Chess and Flappy Bird are tasks that can significantly impact the attention of the driver.

Figures 6 and 7 plot the data obtained from the two NDRT tested during the experiment; both graphs have the same format. The horizontal axis shows the data grouped by test subject. The vertical axes show the scales of the two measurements that we want to analyze: the vertical bars are read against the values on the right (1 to 5, the answers to the questionnaire), while the line is read against the values on the left (0 to 2500/4500, the reaction time of the users). The reaction-time values have been flipped because, in our hypothesis, they correlate inversely with the answers to the questionnaire.

Chess is a Non-Driving Related Task that we decided to consider easy to interrupt and with an active interaction only when it is the player’s turn to play. The results of the questionnaire (figure 6) show that most of the 8 subjects that have experienced the tasks do not have much experience with the game of chess and the correlation between response time and familiarity with the game is not always obvious.

Figure 6 Chess group analysis

Flappy bird (fig. 7), on the other hand, is a Non-Driving Related Task that we considered hard to interrupt as the game cannot be easily paused and requires an active interaction for the whole duration of the gameplay. From the distribution of the data, we can see that the average reaction time has increased. From the 1500-1750ms reaction time observed in both the baseline and the control group we now have a 2000ms response time on average with the standard deviation being slightly higher than the other categories.

[Chart for the Chess group: horizontal axis Test Subject (1 to 8); left axis Reaction time (ms), 0 to 4500; right axis Answers to the questionnaire, 0 to 5; series Experienced?, Enjoyed?, Play Again?, Reaction speed]

Figure 7 Flappy Bird group analysis

All the raw data obtained from the experiment are available in appendix C of this document, free of any data transformation.

[Chart for the Flappy Bird group: horizontal axis Test Subject (1 to 8); left axis Reaction time (ms), 0 to 2500; right axis Answers to the questionnaire, 0 to 5; series Experienced?, Enjoyed?, Play Again?, Reaction speed]

6 Analysis of the results

6.1 NDRT Categorizations

The development of Table 2 was the first step of our thesis work, as our research is closely tied to NDRT and we needed to find which ones were feasible for our experiment. After an exhaustive decision process, we opted for the ones explained in section 4.4:

- Control group: No NDRT

- Semi-Active task: Playing chess

- Active task: Playing an active game

The reason for this choice lies in the difference in interruptibility between those tasks as, in our opinion, it is the most influential aspect to account for when designing tasks for a takeover.

6.2 Influence of a NDRT

As section 3.1.1 states, we want to find which approach to onboard entertainment is more distracting for drivers of automated vehicles. Using the data collected throughout the experiment, our goal is to find insights and trends that can answer this question.

6.2.1 Chess

Figure 5 shows the data obtained in the experiment as box plots. The figure makes it easy to see how the data from the different conditions differ and how their interquartile ranges compare to each other. The medians of the measurements lie in the 1300-2000 ms range, with the highest being the reaction time of the users who played Flappy Bird and the lowest that of the users who played chess. This trend does not hold once we look at the whole interquartile range: even if more than half of the data collected is below the median, the chess reaction times are more scattered than those of the other NDRT. This anomaly is probably due to the task being considered atypical by some users.

The results of the questionnaire for the chess game (Fig. 6) show that most of the 8 subjects who have experienced the NDRT do not have much experience with the game of chess, making the correlation between response time and familiarity with the game not always obvious.

Subject number 7 of the Chess NDRT sample is a clear example of such a correlation: they were not familiar with the task at all, and this unfamiliarity can explain the slower response in the second phase of the experiment.

6.2.2 Flappy Bird

The average reaction time measured after Flappy Bird increased overall, from the 1500-1700 ms of the control group to about 2000 ms. The interquartile range of this task in fig. 5 shows a slight increase in response time compared to the pre-NDRT data.

In the analysis of the result (fig. 7), we have not found any explicit correlation between the


the same time, we also discovered that the reaction times are overall slower with only 3 subjects out of 8 that honked the second time in less than 1750ms (i.e. the upper bound on an average response time given by the control group and the baseline).

6.2.3 T-Test

To better understand the impact on the driver’s attention we decided to analyze the data collected more mathematically. The average and the median are good tools that can be used to measure the average behavior of a dataset, but they cannot describe the variation between the sets of data collected. This is the main reason why we decided to also use the T-Test.

A t-test is most commonly applied when the test statistic would follow a normal distribution if the value of its scaling term were known; when the scaling term is unknown and is replaced by an estimate based on the data, the statistic follows Student's t distribution. In our approach, we use the data obtained before the NDRT as the reference set, so that the t-test tells us whether the average of each group differs significantly from it. The value reported by the test is the probability that a difference at least as large as the one we observed arose by chance. It is usually described by a star value, where 1-star means less than 0.05, 2-star means less than 0.01, and 3-star means less than 0.001. A value above 0.05 gives no evidence that the two datasets differ, while a starred value indicates a significant difference, with more stars meaning stronger evidence.

The degrees of freedom for every t-test were 30 (24 + 8 - 2), as the first sample was the whole set of pre-NDRT measurements (24) and the second sample was the 8 post-NDRT measurements of the group under test.
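The degrees of freedom above are consistent with a pooled (equal-variance) two-sample t-test, although the exact variant used is not stated here. The sketch below shows that computation in Python under this assumption; the function name is hypothetical and the two samples are placeholders, not the data collected in the experiment.

```python
import math
import statistics


def pooled_t_test(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for a pooled two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: the scaling term estimated from both samples
    pooled_var = ((n_a - 1) * statistics.variance(sample_a)
                  + (n_b - 1) * statistics.variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_err
    return t, df


# Placeholder samples with the same sizes as in the experiment (24 and 8)
pre_ndrt = [1500 + 10 * i for i in range(24)]
post_ndrt = [2000 + 10 * i for i in range(8)]
t_value, degrees = pooled_t_test(pre_ndrt, post_ndrt)
```

Turning the t statistic into the probability values reported below requires the cumulative distribution of Student's t, which is not in the Python standard library; a statistics package such as SciPy (`scipy.stats.ttest_ind`) provides it.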

The data obtained when the NDRT was Flappy Bird gives a t-test value of 0.002758, which is below 0.01 and therefore a 2-star significant result. Such a result indicates that the post-NDRT measurements differ significantly from the pre-NDRT ones, making the task very distracting compared to the others.

The chess t-test gives a value of 0.387195, well above 0.05. The difference from the pre-NDRT dataset is therefore not significant, which suggests that chess has little impact on the reaction speed of the drivers.

The control group has a t-test value of 0.784542, which makes it even closer to the pre-NDRT data. This is the result we expected, as the control group should show no difference from the pre-NDRT measurements.


7 Conclusions

In this section, we will present the results of the study, together with a discussion. The meaning, reliability, and some future areas of research will be discussed and analyzed further.

7.1 Summary

The purpose of this research experiment was to explore and analyze how the different Non-Driving Related Tasks can affect the reaction time of the user after a takeover process in a level 3 self-driving vehicle.

During the design process of the experiment, we understood that we needed a formal categorization for the different tasks to find the ones that are more relevant to the problem.

The table with the result is available in section 5.1. At the end of this first decision process, we defined the tasks to perform and divided the test subjects into 3 groups based on the NDRTs.

To test how the different tasks would affect the reaction times of the drivers, a sample of 24 people was asked to drive in a low-fidelity simulator that could reliably reproduce a takeover situation. Each run followed these steps:

1) The subject takes control of the vehicle and drives for some minutes

2) We ask the subject to press the Honk button to acquire a pre-NDRT measurement

3) After some more seconds of driving, the vehicle starts to self-drive

4) The subject plays with the NDRT while the vehicle self-drives

5) The takeover process starts, the task is disabled, and the subject is forced to drive again

6) The subject starts driving again

7) We ask the subject to press the Honk button to acquire a post-NDRT measurement

This test was designed to collect the measurements in the most automated way possible, as this was the only way to get data without adding biases to the test.

After the experiment, a 3-question questionnaire was given to the test subjects. In the survey, we asked how knowledgeable they were about the task, how much they liked it, and whether they wanted to keep playing. The questions were chosen because we wanted to understand whether there was a correlation between task appreciation and reaction time.

The subjects were divided into 3 groups based on the task they needed to perform:

1) Chess

2) Flappy Bird

3) Control group

The first group showed scattered results, with a lower than usual median reaction time but many measurements in the third and fourth quartiles above the average. This behavior is reflected in the questionnaire, as the subjects with less familiarity had slower reaction times overall. The t-test showed no significant difference between this data and the pre-NDRT data.