Intuitive Interface for Pattern Recognition in Virtual Reality

Patrik Nilsson, Jack Lewitan Stålnacke

Computer Science, Bachelor's Degree, 15 HP

Academic year: 2021


Abstract

In this thesis we explore natural user interfaces and pattern recognition approaches. We did this by designing and developing an artifact which acts as an interactive spell-casting experience in virtual reality. In the artifact the user draws patterns to cause magical effects, such as summoning balls of fire and gusts of wind. While there are other similar solutions available, they only allow a limited level of detail in the patterns that can be recognised. To make an experience that feels intuitive to the user, an interface capable of detecting advanced patterns is required. Furthermore, a pattern recognition solution able to read these advanced patterns without causing loss of frame rate is an absolute prerequisite when the experience is entirely in virtual reality. We pose the question: how can you create a natural user interface that reads a player's intricate hand motions to execute various actions in a Virtual Reality gameplay environment? The pattern recognition system we developed is based on template matching, supported by a syntactic pattern recognition approach. With our intuitive interface we made it possible for the system to read a user's patterns, drawn in a 3D space, as a 2D drawing. To make it easy to expand the system's library, we created a pattern creation system that let us quickly add new patterns for the system to recognise. We evaluated the artifact by conducting an experiment with eleven participants. The experiment was a user study consisting of a structured observation, where the users drew patterns as we observed them, followed by a semi-structured interview, to generate elaborate qualitative data. The final artifact does not provide a viable solution in and of itself, but it has great potential for future work. The results show that the intuitiveness of the interface in a system like this relies heavily on the other parts connected to the interface, such as the leniency of the pattern recognition system's accuracy.


Contents

Abstract
1. Introduction
2. Related work
2.1 Virtual Reality
2.2 Pattern Recognition
2.3 Intuitive User Interface
3. Method
3.1 Design Science Research
3.1.1 DSRM Process Model
3.1.2 Design-Science Research Guidelines
3.2 User Study
3.2.1 Observation
3.2.2 Interview
3.2.3 Thematic Analysis
4. Artifact Design
4.1 Objectives
4.2 Non-functional Requirements
4.3 User Interface
4.4 Pattern Recognition
4.4.1 Difference to Related Work
4.5 The Full Process
4.6 Adding Patterns
4.7 Redundancy
5. Experiment
5.1 Observation
5.2 Interview
6. Results
6.1 Pattern Analysis
6.2 Intuitiveness
6.3 Virtual Reality
7. Conclusion
7.1 Natural User Interface
7.2 Pattern Recognition
7.3 Summary
7.4 Future Works
References
Annex A: Observation Format
Annex B: Interview Format
Annex D1: Interview and Observation notes


1. Introduction

Virtual Reality (VR) is a technology and creative medium which through head-mounted display goggles lets the user view and interact with virtual 3D environments. The user can move their head to look around, and they can physically walk around to move in the virtual environment they are in. With positional hand controllers the user can interact with the environment and the objects in it [2]. For example, Tilt Brush is a VR painting tool where the user can move around in the virtual space, and use the hand controllers to draw in the air to make virtual 3D paintings [3]. Moreover, Hecomi made a virtual experience where the index finger would constantly draw a line. However, instead of creating a painting, the line conveyed how the user had moved their finger, in order to let the user draw certain shapes. Successfully drawing a shape would sound a chime, and the system would convey that a spell had been cast [7].

Regardless of what the VR experience is, it's important to note that VR takes significantly more time to render each frame compared to non-VR systems. This means that operations which might already take a lot of time to execute in non-VR systems need to be further optimized for use in VR. Pattern recognition is an example of a potentially time-consuming operation which would need to be optimized for real-time use in VR. Pattern recognition is a technology used to find patterns in some sort of input and produce an output based on a set of criteria. For example, the input could be a set of points on a 2D plane, which the pattern recognition algorithm might recognise as a curve, and thus generate a curve as output. Some interactive scenarios utilize pattern recognition to let the player draw shapes or otherwise move their controllers to cause effects. This is done with the use of a gesture-based interface.
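To make this concrete, here is a minimal sketch of our own (not code from the artifact) that classifies a sequence of 2D points as a straight line by checking how far each point strays from the segment between the stroke's endpoints; the class name and tolerance parameter are illustrative assumptions:

```csharp
using System;

// Toy pattern recognizer (illustrative only): a sequence of input
// points in, a line/not-a-line decision out.
static class LineRecognizer
{
    // True if every point lies within `tolerance` of the segment
    // from the first point to the last point.
    public static bool IsLine((float x, float y)[] points, float tolerance)
    {
        if (points.Length < 2) return false;
        var (x0, y0) = points[0];
        var (x1, y1) = points[points.Length - 1];
        float dx = x1 - x0, dy = y1 - y0;
        float length = MathF.Sqrt(dx * dx + dy * dy);
        if (length < 1e-6f) return false; // degenerate stroke

        foreach (var (x, y) in points)
        {
            // Perpendicular distance from the point to the endpoint line.
            float dist = MathF.Abs(dy * (x - x0) - dx * (y - y0)) / length;
            if (dist > tolerance) return false;
        }
        return true;
    }
}
```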

An interface is most commonly known as "a device or a system that unrelated entities use to interact" [15], which makes it a very broad concept. However, in this paper we will focus on natural user interfaces, which build on human abilities like touch, vision, voice, and higher cognitive functions such as expression, perception and recall [20]. We use such a gesture-based interface, which reads the gestures of the user and translates them into input for the pattern recognition system. The pattern recognition algorithms used in interactive scenarios usually work by finding edges and their angles [1], using mathematical calculations to recognise simple symbols [7], or recognising the movement of the controller [8], to understand what was drawn. With these algorithms, it is possible for the player to draw rune-like symbols with hard edges, circles, or simple movements, which the system can recognise. However, there does not seem to be any game or interactive scenario with a system able to detect more advanced and intricate shapes.

In the animated series The Dragon Prince [39], the characters are able to draw intricate shapes in the air to cast spells. The level of detail of the patterns in the series provides beautiful aesthetics, and a natural and artistic flow to the characters drawing them. A system able to detect such patterns would require an interface which feels intuitive to the user. We pose the questions:

RQ1: How can you create a natural user interface that reads a player's intricate hand motions to execute various actions in a Virtual Reality gameplay environment?

subRQ1: How can a user interface be designed to be intuitive to the user?

subRQ2: What method is suitable for the system to recognize the gesture, while keeping the time complexity low so as to avoid discomfort in VR?


Figure 1.1 - Three examples of the patterns used for spellcasting in the animated series The Dragon Prince [39].

Two pattern recognition approaches capable of detecting advanced and intricate shapes are template matching and syntactic pattern recognition. Template matching is a technique where, by using a template image, the system can find occurrences of the template in other images. The syntactic approach consists of a hierarchy of known patterns. By first finding primitives, i.e. the smallest forms of patterns in the hierarchy, the system can build upon a found primitive by following the paths in the hierarchy. As it makes its way through the hierarchy, it can detect both the determined patterns and to what extent each pattern is detected. These techniques are used in computer vision, which concerns fields like surveillance, vehicle tracking, robotics, medical imaging, and manufacturing [4, 21].
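To illustrate the template matching idea (a sketch of our own, assuming binary images stored as bool arrays, not the artifact's code), a naive matcher slides the template over every position of a target image and scores how many pixels agree:

```csharp
using System;

static class TemplateMatcher
{
    // Slides the template over the target image and returns the best
    // match score in [0, 1]: the largest fraction of template pixels
    // agreeing with the target at any offset.
    public static float BestMatch(bool[,] target, bool[,] template)
    {
        int th = template.GetLength(0), tw = template.GetLength(1);
        int h = target.GetLength(0), w = target.GetLength(1);
        float best = 0f;

        for (int oy = 0; oy <= h - th; oy++)
        for (int ox = 0; ox <= w - tw; ox++)
        {
            int agree = 0;
            for (int y = 0; y < th; y++)
            for (int x = 0; x < tw; x++)
                if (target[oy + y, ox + x] == template[y, x]) agree++;

            best = Math.Max(best, agree / (float)(th * tw));
        }
        return best;
    }
}
```

The nested loops make the cost proportional to the number of offsets times the template size, which is the computational burden discussed further in section 2.2.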

Furthermore, by inspecting the code of Arx Libertatis [1], an open-source, community-made, modded version of Arx Fatalis's released source code, we found that it uses a syntactic pattern recognition approach for its spellcasting system. While Arx Fatalis has a simple user interface, where the user draws the symbols two-dimensionally on the screen, Hecomi has developed a similar feature in his VR experience [7]. As Hecomi's user interface lets the user draw symbols in three dimensions, he has created a more advanced, natural user interface. While Hecomi's solution allows more advanced and intuitive patterns to be drawn than Arx Fatalis, such as circles, it still lacks the possibility of truly advanced and aesthetic patterns with more fluid hand movements, like those in The Dragon Prince (see figure 1.1).

By first researching the areas of concern and building a research framework, we have defined a problem and developed a solution. We propose a pattern recognition system which, through a natural user interface, lets the user freely draw patterns in a three-dimensional VR environment. The system compares the drawn patterns to a hierarchy of preconstructed images using a template matching approach which, coupled with a syntactic pattern recognition approach, is quick enough to not cause discomfort in VR, and accurate enough to feel intuitive to the user. In our research, we have used the Design Science Research Methodology (DSRM), an iterative process by Peffers et al. [26], and the guidelines suggested by Hevner et al. [29].

We created a design and architecture for the artifact and developed it as a set of three objectives. Finally, we conducted a user study consisting of a scientific observation as the test-user used the artifact, followed by an interview. The collected data was evaluated to see if the interface was intuitive, if the system was fast enough to not cause discomfort in VR, and how well the system recognized the drawn patterns. The artifact provides a foundation for similar projects, and could eventually be made available as an asset for Unity.

2. Related work

2.1 Virtual Reality

VR is a technology that has existed as far back as the 1960s and has gone through quite some changes over the years. There are many definitions of VR, but they all basically say that VR is an interactive and immersive experience in a simulated (autonomous) world [40]. Mazuryk et al. dive into the history of VR from the beginnings of the technology that helped create it. In 1960-1962 Heilig created a multi-sensory simulator known as the Sensorama, a pre-recorded film in color and stereo that was augmented by binaural sound, scent, wind and vibration experiences. While this had all the features of a VR environment, it was not interactive. Moreover, in 1965 Sutherland presented the idea of VR as an "artificial world construction concept that included interactive graphics, force-feedback, sound, smell and taste" [40]. In 1968 Ivan Sutherland created the first head-mounted display (HMD) with appropriate head tracking, called "The Sword of Damocles". In 1982 Thomas Furness developed the Visually Coupled Airborne Systems Simulator, an advanced flight simulator. It used an HMD that augmented the out-the-window view by adding visual indicators for targeting and optimal flight paths.

Modern VR is a technology and creative medium which, through head-mounted display goggles, lets the user view and interact with virtual 3D environments [2]. In [6], NASA defines virtual reality as "the use of computer technology to create the effect of an interactive three-dimensional world in which the objects have a sense of spatial presence". This has been achieved in multiple ways using different kinds of technologies. The most common one, and the one we are using, is the HMD, which gives the user a view of the computer-generated environment. The user can move their head to look around, and they can physically walk around to move in the virtual environment they are in. With positional hand controllers the user can interact with the environment and the objects in it [2]. In order to simulate the realistic depth perception of real eyes in VR, two virtual cameras are used to project the virtual environment onto two screens in the goggles. Since there are two screens, VR takes significantly more time to render each frame compared to non-VR games. This means that operations which might already take a lot of time to execute in non-VR games need to be further optimized for use in VR games. Keeping the latency between frames low is especially important in VR, as research has shown that prolonged frame-to-frame latency causes intersensory discord and nausea [14].

Oculus is one of the leading modern VR companies, and one of their latest releases in the VR market is the Oculus Rift S. The Oculus Rift S uses Insight tracking, their way of tracking the HMD in real-world space without external sensors, as well as improved lenses that improve the experience by lessening both the screen door effect and the feeling of motion sickness. It also uses two Touch controllers that act as hands in the virtual world and follow the user's hand movements with realistic accuracy [9]. The screen door effect is a great challenge in the development of all VR displays. It is the visual illusion of looking through a net of squares instead of seeing a clear world. The issue is caused by how the HMD places the screens physically close to the user's eyes, making the space between pixels noticeable [10, 11].

Figure 2.1 - A person wearing an Oculus Rift S HMD.

2.2 Pattern Recognition

While there are many approaches to pattern recognition, they all serve the same goal: to discover regularities in data. Pattern recognition can be used to discover regularities in any type of data, including text, sounds, and images [18, 19, 3]. There are three main approaches to pattern recognition: statistical, syntactic, and template matching [21].

The statistical approach works by using specified boundaries, which describe the desired patterns to be found, and then creating features, or measurements, of the data in which patterns should be found. For example, suppose we want to find all squares in a picture. First, we create a boundary consisting of vectors which form the type of square we want to find. Then, all the features in the picture are identified, and vectors are created based on the lines in the picture. The system checks whether any of the features fit the boundary, and thus finds which features form a square [21].

In the syntactic approach a pattern can consist of a hierarchy of ever-shrinking sub-patterns, where the smallest sub-patterns are called primitives. By finding the primitives in some data, the system can test whether a larger known pattern can be built upon the found primitive, by going through the hierarchy [21]. Arx Libertatis [1] and Hecomi's project [7] both utilize syntactic pattern recognition to detect edges and circular patterns. Arx Libertatis is a first-person perspective roleplaying game for the PC, which lets the user draw patterns on the screen with their cursor. The algorithm then detects the lines the player drew, and the edges, with their angles, which the lines construct. By doing this, the player can draw simple runes to cast magical spells in the game. Hecomi's project is a first-person perspective interactive VR experience, in which the user continuously draws in the air, and when they draw certain shapes the algorithm detects the pattern based on rules similar to those in Arx Libertatis. While Arx Libertatis and Hecomi's project are quite similar in their execution, Arx Libertatis requires the player to enter a drawing mode where they are technically drawing on a two-dimensional canvas, while Hecomi's project is constantly running, lets the user draw in any direction in the three-dimensional space, and even runs smoothly in VR.
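As a rough sketch of this edge-and-angle style of recognition (our own illustration, not either project's actual code), a stroke reduced to its corner points can be described by segment lengths and the signed turn angles between consecutive segments, which a hierarchy of rules can then be matched against:

```csharp
using System;
using System.Collections.Generic;

static class StrokeFeatures
{
    // Reduces a stroke, given as its corner points, to segment lengths
    // and the signed angles (in degrees) between consecutive segments.
    public static (List<float> lengths, List<float> angles)
        Extract(List<(float x, float y)> corners)
    {
        var lengths = new List<float>();
        var angles = new List<float>();

        for (int i = 1; i < corners.Count; i++)
        {
            float dx = corners[i].x - corners[i - 1].x;
            float dy = corners[i].y - corners[i - 1].y;
            lengths.Add(MathF.Sqrt(dx * dx + dy * dy));

            if (i >= 2)
            {
                // Angle between the previous and the current segment.
                float pdx = corners[i - 1].x - corners[i - 2].x;
                float pdy = corners[i - 1].y - corners[i - 2].y;
                float dot = pdx * dx + pdy * dy;
                float cross = pdx * dy - pdy * dx;
                angles.Add(MathF.Atan2(cross, dot) * 180f / MathF.PI);
            }
        }
        return (lengths, angles);
    }
}
```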

Template matching is the approach where, by using a template image, the system can discover occurrences of the template in other target images [21]. Computer vision is the field in which template matching is most commonly used, letting a computer recognize objects in images. One major problem in template matching of images is the extensive computational cost of having to check the template against all possible positions in the target image [12]. Banharnsakun et al. tackle this issue and propose a solution they call the Best-so-far Artificial Bee Colony (ABC). The Best-so-far ABC solution works by employing agents, giving them targets based on a fitness function conducted by a single scout agent. The scout moves through the image and, with the collective information of all employed agents, employs new agents with new target positions in the image. Through these positions the system can find the occurrences of the template in the image. The solution is proven to work faster than previously used methods while still being accurate, and is shown to be a good solution for computer vision systems [4].

Convolutional Neural Networks (CNNs) are a popular technology often used for pattern recognition. A CNN is a type of artificial neural network which uses deep learning, utilizing convolutional layers to process the data [38]. Combined with a pattern recognition approach, a CNN can be taught which patterns to recognise, and can detect patterns with great efficiency [36]. As shown in [37], pattern recognition using CNNs for recognising gestures in a VR experience is a viable solution with great pattern recognition accuracy and processing time.

2.3 Intuitive User Interface

An interface is most commonly known as "a device or a system that unrelated entities use to interact" [15], which makes it a very broad concept. It can be a TV remote that acts as an interface between a person and the TV, as well as a language between two people.

Weiyuan Liu describes how user interfaces have evolved over the computer's lifespan. The paper describes four stages in this evolution: "Batch Interface", "Command Line Interface", "Graphical User Interface" (GUI) and the Natural User Interface (NUI) [20]. The batch interface was used only in the early stages of computer development, when human/computer interaction was still at the prototype stage: punch cards were used as input devices, and a line printer as the output device. The command line interface is seen as the ancestor of all computer interfaces. It came to be around the 1950s and established a kind of interactive dialogue understood by users and the computer alike [20]. A command line interface is text-based and is used to view and manage computer files. This was before the computer mouse was invented, so the user had to write commands to run tasks on the computer [22]. With the creation of high-resolution displays and the appearance of the computer mouse, the GUI made its appearance in the 1960s. The main features of the graphical user interface are direct manipulation and "What You See Is What You Get" [20]. The strength of the GUI does not have much to do with its use of graphics, but rather with its ease of remembering actions, both which actions are possible and how to invoke them [17]. The "NUI is a field of interface design where natural human abilities are leveraged to weave in technology" [23]. Basically, the NUI is based on using natural human abilities like touch, vision, voice, motion and higher cognitive functions such as expression, perception and recall. While NUIs are meant to be intuitive, and the field is very new, Donald A. Norman points out that NUIs have already been used in gaming in the form of musical instruments (like Guitar Hero, for example). He also notes that there is not yet anything that can be seen as a truly natural user interface when it comes to gesture-based NUIs: gestures can simply be misunderstood depending on the culture they are used in. "Even the simple headshake is puzzling when cultures intermix" [17]. Though he admits that this might change somewhat in the future as the field develops.

Gesture recognition is an alternative NUI for providing real-time data to a computer [42]. A camera feeds image data to the device connected to the computer, and the method then tries to recognize meaningful gestures from a preset library, where each gesture is a command, by matching the library's patterns against the ones made by the user. Once a pattern has been recognized, the program executes the command correlated with that pattern. Gesture recognition is used in many areas besides gaming, e.g. in operating rooms to help the surgeon virtually grasp and move objects shown on a monitor, or as a sign language interpreter [42].

Zhaochen Liu [43] and his team have worked together with the Microsoft Windows Phone Team to develop a Kinect physiotherapy application. The application enables patients to perform exercises from their homes by checking their movements against the doctor's instructions, and provides real-time instructions to the patients.

Källberg has developed a gesture-based natural user interface with the help of Leap Motion [5]. The "Erghis Sphere" is a VR typing system where the user, with the help of their hands and fingers, can write words and sentences, and every individual finger has its own function. In the application, whether the user is in VR or not, they have a sphere between their hands, and moving their fingers towards the sphere causes certain actions. For example, if the right-hand thumb moves inward towards the sphere, it is read as pressing the spacebar on a keyboard. Similarly, if the same motion is made with the left hand, the system reads that as the enter key, and rotating one hand 90 degrees changes the commands for the fingers. However, the tracking was unreliable at best, so Källberg created a separate application called the "Control Sphere", which had some changes in design. In the end the Control Sphere also had issues with the tracking, but was an improvement compared to the VR typing system.

Shih-Yao et al. have developed a "freehand push-gesture based system" which lets the user give input via push-based motions with their arms [41]. This has been done by creating a "manipulation zone", a 3D virtual plane positioned in front of the body, which adapts to the user's body ratio and provides the recommended maximised manipulation zone. The steps the system required were easy to follow: the user only had to move their hand to the target, make a pushing motion, and then a retracting motion. The recognition was more accurate than most common push-forward recognizers at the time.

3. Method

When conducting our research we followed the six-step, iterative Design Science Research Methodology process model developed by Peffers et al. [26]. DSRM is a methodology for successfully creating and evaluating an artifact. It defines a well-formulated process, from observing and defining a problem to developing a finished artifact. It is a methodology that has long been practiced, and it has been formally defined and developed by many researchers. We chose this methodology because we were going to develop an artifact, and we used the iterative process because of its clear six-step structure.

In addition to the six steps, we also followed the seven guidelines suggested by Hevner et al. [29]. We chose to follow these guidelines because they were useful in helping us develop the artifact, and they helped us develop a thesis which can be processed by many audiences.


In order to gauge the subjective performance of the artifact, we designed a user study experiment in the form of a structured observation and a semi-structured interview. By first observing the user as they tested our artifact, and then interviewing them, we were able to gather rigorous data. We then performed a thematic analysis to evaluate the data, first color-coding the notes from the experiments, then placing each coded element into themes. These themes made it easier to use the data in the discussion in our conclusion.

3.1 Design Science Research

3.1.1 DSRM Process Model

Figure 3.1 - DSRM Process Model. Each square represents a step. The arrows represent how the process can move between steps. Source: [26, p. 54].

Peffers et al. suggest an iterative DSRM process model consisting of six steps (see figure 3.1) [26]. The six steps cover the entire process of developing an artifact, including defining the problem, developing the artifact, and communicating the results. The steps exist to create a clear view of how the process should be iterated, and exactly what should be done before continuing on to the next or previous steps. In the following part we describe these six steps according to Peffers et al., and we explain how we have gone through them during our research.

Step 1 - Problem identification and motivation

In the first step, the problem needed to be identified and a research question was defined. Additionally, a possible solution was presented and justified. We developed an interest in a mechanic which was not available, the ability to draw intricate patterns in VR to cause effects, and defined a research question based on that mechanic. Furthermore, we researched similar existing mechanics, and found that we needed to explore NUIs and pattern recognition approaches. After researching those areas we created a design for how the finished system should work.

Step 2 - Define the objectives for a solution

In the second step, the objectives (goals) required to complete the solution needed to be deduced. The objectives were to be based on the problem specification, and would be either quantitative, consisting of many smaller objectives, or qualitative, consisting of fewer but larger, more defined objectives. To develop the solution to our research question we defined three qualitative objectives which the system would ultimately consist of. The first objective was a natural user interface for the user to be able to draw in VR. The second objective was a pattern recognition method which would be fast enough for the user to not experience any frame rate lag, and accurate enough for the NUI to still feel intuitive. The third objective was a pattern creation system, making it easy for us to add new patterns to a library for the pattern recognition to use.


Step 3 - Design and development

In the third step, the design of the artifact, and its architecture, was defined based on the identified problem, the proposed solution, and the contribution which the artifact should provide. So, we defined the design and architecture of our artifact, and developed the objectives established in step 2. The artifact was created in Unity using the Oculus Rift S as our VR HMD. The system uses a template matching pattern recognition approach, combined with a syntactic pattern recognition approach. The user places a canvas to draw towards, and the canvas lets the system know how to translate the drawn pattern. To make it easier on the user we added visual feedback in the form of points that appear if the drawing is close to a pattern, which the user can follow to complete that pattern.

Step 4 - Demonstration

In the fourth step, data is accumulated by testing the artifact. The data should be appropriate for later evaluating how well the artifact solves the determined problem. A test can be any type of data-collecting activity, for example a manual case study or an automatic simulation. During the experiment we made sure to explain to the user how the controls of the artifact worked, as well as letting them play through the Oculus Rift S tutorial (First Steps) to acquaint them with the Oculus Rift S. We then collected data by conducting a structured observation, followed by a semi-structured interview. We let the test-users conduct a set of tests, while we checked how many attempts it took for the user to succeed, as well as looking out for certain behavioral patterns, as defined in annex A. After the observation we held a semi-structured interview, with open-ended questions which let us adapt the interview to what the test-user said.

Step 5 - Evaluation

In the fifth step, the objectives to the solution proposed in step 2 are compared to the data collected in step 4. If the comparison does not yield adequate results, the artifact should undergo another iteration from step 3. By analysing the data we had gathered from the user study, we found that the artifact was not quite adequate, as it was not very natural for the users to use. We thought that the artifact actually should undergo an iteration from step 3, but we decided to continue on to the next step anyway. This decision was made partly because we did not have enough time to undergo another iteration, but also because we actually had adequate results to answer our research questions. In 7. Conclusion we further discuss why the artifact did not need to undergo another iteration as part of this thesis, and instead propose solutions for future work.

Step 6 - Communication

In the final step the problem, and its importance, is discussed. The artifact is also discussed: its utility, novelty, and the rigor of its design, as well as its effectiveness for its stakeholders. In the conclusion we discuss that the pattern recognition system does not work well with the NUI, and propose several suggestions for future work. For the artifact to be considered a success in and of itself, it needs to be further developed. The results of the discussion in the conclusion are promising, so despite the shortcomings of the artifact, we deem the solution a success.


3.1.2 Design-Science Research Guidelines

Table 3.1 - Design-Science Research Guidelines. Source: Adapted from [29]

Guideline Description

1. Design as an Artifact Design-science research must produce a viable artifact in the form of a construct, a model, a method, or an instantiation.

2. Problem Relevance The objective of design-science research is to develop technology-based solutions to important and relevant business problems.

3. Design Evaluation The utility, quality, and efficacy of a design artifact must be rigorously demonstrated via well-executed evaluation methods.

4. Research Contributions Effective design-science research must provide clear and verifiable contributions in the areas of the design artifact, design foundations, and/or design methodologies.

5. Research Rigor Design-science research relies upon the application of rigorous methods in both the construction and evaluation of the design artifact.

6. Design as a Search Process The search for an effective artifact requires utilizing available means to reach the desired end while satisfying laws in the problem environment.

7. Communication of Research Design-science research must be presented effectively both to technology-oriented as well as management-oriented audiences.

In addition to the process model suggested by Peffers et al., we followed the seven guidelines suggested by Hevner et al. in [29] (see Table 3.1). We chose to follow these guidelines because they were useful in helping us develop the artifact, and they helped us develop a thesis which can be processed by many audiences. In the following part we describe how we followed these guidelines, and how that aided the development.

Guideline 1 - Design as an Artifact

We produced a viable artifact in the form of a system. Our solution handles the problems we had determined and makes it easy for developers to design patterns for use in the pattern recognition system.

Guideline 2 - Problem Relevance

VR is a popular technology with a growing market and relevance in modern society. As our solution would develop an artifact with a fairly unique functionality based on popular media, the solution would potentially be desired for commercial use. Additionally, the specific issues we were aiming to solve could lead to the artifact being a solution to future work.

Guideline 3 - Design Evaluation

The quality of the artifact was evaluated by examining the time complexity, the pattern recognition accuracy, and the intuitiveness of the user interface. The time complexity was evaluated by considering a scenario where the system had been fitted with infinitely many patterns, but still with a "smart" hierarchy in mind. "Smart hierarchy" in this context means that the hierarchy of patterns that the artifact uses would be developed using the "control points" in an optimal manner, as discussed in section 4. The pattern recognition accuracy and the intuitiveness of the user interface were evaluated using the data collected through a user study, which involved a scientific observation followed by an interview (see section 3.2).

Guideline 4 - Research Contributions

To prove the contribution that the artifact provides, the results of the evaluations are presented and rigorously discussed to form a conclusion.


Guideline 5 - Research Rigor

The solution was discovered by researching the areas of concern and by defining what limitations would have to be considered.

Guideline 6 - Design as a Search Process

We largely skipped this guideline, mainly because there were no good prior sources that upheld our requirements. Hecomi [7] makes something similar that works for simple shapes by measuring distances between points, but that approach makes intricate patterns much harder for the program to handle. Furthermore, the VR Infinite Gesture asset for Unity [8] was a good reference, but when we looked at it, its source was not open.

Guideline 7 - Communication of Research

In order for this paper to appeal to anyone who may be interested in reading it, all relevant research conducted on the technologies used is presented in this paper. Additionally, the artifact was planned to be made available as an asset for Unity for anyone to access and use.

3.2 User Study

A user study is a method of collecting data from people [31]. There are many methods for performing a user study, each of which can be gauged by a three-dimensional framework with the following axes: attitudinal vs. behavioral, qualitative vs. quantitative, and context of use. The attitudinal vs. behavioral dimension defines whether the method collects data from what people say (attitudinal) or from what people do (behavioral). The qualitative vs. quantitative dimension defines whether the method collects in-depth data but is slower and less distributable (qualitative), or collects data quickly and easily from a larger pool of users but with less depth (quantitative). Lastly, the context of use dimension defines how the method lets the user perform during the test; for example, whether the method lets the user freely use the product as the researchers take notes, or has the user follow a set of instructions [32].

We chose to conduct a structured observation, followed by a semi-structured interview. The structured observation and the semi-structured interview are both qualitative methods. That means they are methods best suited for gaining in-depth insight, at the cost of being more time consuming, while also running the risk of the Hawthorne effect [31, 27]. The Hawthorne effect is the conscious or subconscious effect of the user doing something differently than they normally would, because they are aware that they are being monitored [25]. For example, a test-user may answer questions during an interview less truthfully than if they were not being formally interviewed, or, if observed while performing some action, they may not perform the actions they normally would. While the observation is a behavioral type of user study, interviews are an attitudinal type. As such, combining a structured observation with an interview complements our data collection rather well, and when used in succession, the two provide a strong setup for practical testing of an artifact.

3.2.1 Observation

A scientific observation is a method which can provide insight that the test-user would not normally convey during an interview, but which may still be invaluable to the research. As it is impossible to observe all behaviour, it is important to decide what details should be observed. When performing a scientific observation, the researcher can either interact with the test-user or completely avoid interaction. These are called observation with intervention and naturalistic observation, respectively. However, a scientific observation performed in a laboratory environment is not considered naturalistic observation. As such, most scientific observations performed when testing an artifact are observations with intervention. Even so, the observation may have intervention of varying degrees. A structured observation is a scientific observation performed when the researcher has some control over the events they are observing. Structured observations easily provide the researchers with the type of information they desire [34]. By conducting a structured observation we were able to set up a scenario for the user and give them a set of instructions to follow. We would then observe them with minimal interference, taking notes based on a set of behavioral patterns we had anticipated in advance.


3.2.2 Interview

In [30] an interview is defined as "a conversation for gathering information". It involves an interviewer, who coordinates the conversation and asks questions to gather data, and an interviewee, who responds to the questions. For this research we have conducted semi-structured interviews. The semi-structured interview type lets us structure the interview around the data we want to gather. Additionally, we can probe for more information depending on how the interviewee answers, and gather important data that we otherwise would have missed. While it is more time consuming than a structured interview, this type of interview allows more qualitative data to be gathered [33]. We generated a sheet of questions using the types of questions suggested by Qu et al. We hoped the answers would help us answer our research questions and write the discussion in our conclusion. The questions were designed to let the interviewees answer freely and develop their answers, to create more qualitative responses.

3.2.3 Thematic Analysis

Caulfield describes how to do a common thematic analysis [28], which consists of six specific steps. In the following we describe how we followed these steps when analysing the notes from the experiments we conducted. The steps, as well as the full results of each step, can also be seen in annex C. Before analysing the data we acknowledged the risk of the Hawthorne effect, but we did not take it into consideration when evaluating the data. As Metzgar notes in [25], there is little proof of the degree to which the Hawthorne effect actually affects the data in research.

Step 1: Familiarization

In the first step we got familiar with all the data collected during our experiments. We read through the notes we had taken during the observations and interviews, and noted interesting topics.

Step 2: Coding

The second step was to code the data. Coding means highlighting sections of the text and giving them labels. In this step it is important to be thorough and highlight everything that can be considered relevant data. What we did here was highlight different types of responses in different colors. The types we decided on were: hard = pink, easy = teal, frustrated = red, content = blue, natural = green, clunky = purple. We decided on these six types by analyzing the patterns we had decided to observe during the observations, and the questions we decided to ask during the interviews. Each part that we highlighted would effectively become a code, i.e. an element which we would use in the next steps.

Step 3: Generating themes

In the third step, we reviewed the codes, identified patterns among them, and started coming up with themes. The themes were to be identifiers more relevant to answering our research questions. With good themes we would have an easier time finding the information we wanted when discussing the relevant data in our conclusion. By analyzing the codes we found eight useful themes: First pattern, Second pattern, Third pattern, Fourth pattern, Training, Accuracy, Intuitiveness, and Virtual Reality. All codes were either linked to one of the themes, or discarded if they were too vague or not relevant enough to fit into any of these themes.

Step 4: Reviewing themes

In the fourth step we made sure that our themes were useful and accurate representations of the data. If we had encountered problems with our themes, we would have split them up, combined them, discarded them or created new ones. However, we did not find that any of our themes were lackluster; rather, they were going to be very useful when writing our conclusion.

Step 5: Defining and naming themes

By the fifth step we had already named our themes, and so we properly defined each of them. We formulated exactly what we mean by each theme and how it would help us understand the data. We also made sure that the meaning behind each name was easy to understand.

Step 6: Writing up

In the last step it is simply suggested to start writing the report using the analysed data. This is when we started writing our conclusion.


4. Artifact Design

4.1 Objectives

We presented the research question: How can you create a natural user interface that reads a player's intricate hand motions to execute various actions in a Virtual Reality gameplay environment? To answer this question we proposed a solution consisting of three objectives.

The first objective is a gesture-based natural user interface. The interface tracks the user's hand movements and creates a pattern which is used as input. The second objective is a pattern recognition algorithm. The pattern recognition uses the input from the interface in a template matching approach, combined with a syntactic pattern recognition approach to improve time complexity. The system uses a library of template patterns, created in advance. The last objective is a pattern creation system that lets us quickly and easily expand the library of template patterns. The pattern creation system only requires a black and white image, utilized by the template matching approach, and a list of control points, utilized by the syntactic approach, to let the developer control how the pattern should be drawn.

The complete solution forms a system which is our artifact. With this solution we had a good idea of the overall design and architecture of the artifact.


Figure 4.1 - A flowchart representing how the system’s two active parts work. The interface takes in the positions of the cursor (the HMD hand controller) until the button for drawing is released. The points are then translated from the 3D space into a 2D space. The pattern recognition starts checking that the pattern is done correctly. If the last point is reached without failure, the spell is successful, and the spell’s effect will occur. If no pattern is recognized, the spell fails.


4.2 Non-functional Requirements

There were two technical decisions that we had to make. The first decision was which VR HMD to use. While there are other viable VR HMDs on the market, we decided to use the Oculus Rift S.

Figure 4.2 - A user, wearing the Oculus Rift S HMD, drawing patterns in the artifact.

We already had an Oculus Rift S readily available to us, and it fulfilled all our requirements for creating the artifact. The Oculus Rift S let us quickly grasp game development in VR by using Oculus's web tutorials and development kit for Unity. The second decision was which game engine to use. We considered using either Unity or Unreal Engine 4, as they are the most popular free-to-use game engines available. We decided to go with Unity, since we had previous experience using it, but none with Unreal Engine 4. Unity was clearly the better choice for us, as we would not need to spend time learning another game engine's framework. Unity also has great support for the Oculus Rift S, which made it easier to start developing the user interface.

4.3 User Interface

The UI we created is a gesture-based NUI that tracks the position of the user's right virtual index finger and lets the user press and hold a button to draw freely in the air (see figure 4.3 A). When the user releases the drawing button, the interface collects the user's drawing and translates it for use by our pattern recognition algorithm. Moreover, the user has to choose a direction in which they want to draw the patterns. They do this by first aiming in a direction with their right hand, and then pressing a button on the right hand controller. This makes a visual indicator appear for the user to draw towards (see figure 4.3 B). We chose this solution because we faced problems when letting the user draw freely while the program automatically detected the drawing direction. Letting the user choose where they intend to draw allows the NUI to correctly understand what the user is drawing, as the user has full control over where they want to draw.
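A minimal sketch of how such a canvas-based interface could translate the 3D controller positions into 2D drawing coordinates, assuming a Unity setup (the class and field names are ours, not necessarily the artifact's): each recorded point is projected onto the canvas plane and expressed along the canvas's local right and up axes.

```csharp
using System.Collections.Generic;
using UnityEngine;

public class CanvasProjector : MonoBehaviour
{
    public Transform canvas; // placed by the user; forward is the drawing normal

    // Projects the recorded 3D stroke onto the canvas plane and
    // returns 2D points in the canvas's right/up coordinate frame.
    public List<Vector2> Project(List<Vector3> strokeWorldPoints)
    {
        var result = new List<Vector2>(strokeWorldPoints.Count);
        var plane = new Plane(canvas.forward, canvas.position);

        foreach (var p in strokeWorldPoints)
        {
            Vector3 onPlane = plane.ClosestPointOnPlane(p);
            Vector3 local = onPlane - canvas.position;
            // 2D coordinates along the canvas's right and up directions.
            result.Add(new Vector2(Vector3.Dot(local, canvas.right),
                                   Vector3.Dot(local, canvas.up)));
        }
        return result;
    }
}
```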



Figure 4.3 - A user draws the swirl pattern, depicted as a green, transparent line, towards the canvas, depicted as a white, transparent, square plane with a white arrow pointing forward and another pointing upward (A). If the pattern recognition fails, the CPs of the spell that the user most likely was trying to draw are made visible, depicted as red, transparent spheres (B). The player can move the canvas, and the CP-spheres will follow it. The user can then easily follow the CP-spheres to succeed with the spell (C). If the user is successful, the spell's effect is executed, in this case spawning a ball of fire, depicted as a red flame (D).

4.4 Pattern Recognition

The pattern recognition system that we developed was inspired by two pattern recognition approaches: template matching and syntactic pattern recognition. We decided not to use a CNN as part of the solution, for two reasons. Firstly, we wanted to make it clear to the developer of the patterns how the system would recognise them. By letting the developer be in full control of how the patterns are recognised, they should be able to create patterns that the user has to draw exactly how the developer wants them to. With our solution the developer controls how much the user can deviate from a pattern while still succeeding. The developer can thus clearly see how the pattern will be recognised, whereas if a CNN were utilized it would not be clear to the developer how the system decides that a pattern is correctly drawn. Secondly, we wanted to implement a system that lets us easily and quickly add new patterns to the library of patterns. With a CNN, the developer would need to train the network on each added pattern. The absence of a CNN means that the user's freedom in drawing patterns is diminished. However, we believe that can be a positive effect, as it means the developer can restrict the user from successfully drawing a pattern the developer might want to be more difficult.

Our solution utilizes images containing white patterns on black backgrounds (see figure 4.4 A & B), combined with a hierarchy of patterns. The pattern recognition system uses the input from the NUI to see how well the user's drawing follows the white pattern in the image. If the input strays too far into the black area of the image, the system deems the attempt a failure, and it makes a check on another image. However, in an infinitely sized library of patterns, this solution would give a time complexity close to O(N·i), where N is the number of patterns in the library, and i is the amount of data from the user's input. To deal with this issue, we added what we call "control points" (CPs). The CPs are points in the image which the user has to pass by in their drawings in order to succeed in drawing a pattern, in addition to following the white pattern of the image (see figure 4.4 D, E, & F). Additionally, the user's drawing needs to pass each CP in a set order, from the first CP to the last, and needs to pass through all CPs, in order for the recognition to succeed. The order is set by the developer of the pattern, as seen in figure 4.5.

Moreover, the addition of CPs brought a lot of valuable functionality, like allowing the developer to make sure that the user draws the patterns as intended, and the possibility of giving feedback to the user in case of errors in their drawing. However, just adding the CPs would have little effect on the time complexity. With the addition of CPs, each pattern can also be assigned succeeding patterns, or child patterns. This establishes a setup inspired by the syntactic pattern recognition approach: a hierarchy of patterns which can build upon each other. Each child pattern is also associated with a number representing the index of the CP in the parent pattern from which the child can be expanded. Patterns without parents are placed into a list of primitive patterns. This list is the main list of patterns to be checked when the pattern recognition is executed.
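As a sketch of this data model (our own reconstruction; the artifact's actual types and field names may differ), each pattern could hold its template image, its ordered CPs, and its child patterns together with the parent CP index each child continues from:

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical data model for a pattern in the hierarchy.
public class SpellPattern
{
    public Texture2D templateImage;      // white pattern on a black background
    public List<Vector2> controlPoints;  // CPs, which must be passed in order

    // Child patterns, each paired with the index of the CP in this
    // pattern from which the child continues.
    public List<(SpellPattern child, int parentCpIndex)> succeeding =
        new List<(SpellPattern, int)>();

    public SpellPattern parent;          // null for primitives (hierarchy roots)
    public bool IsPrimitive => parent == null;
}
```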

4.4.1 Difference to Related Work

Earlier we discussed works similar to the artifact, like Arx Fatalis and Hecomi's project, which both use syntactic pattern recognition approaches. We would here like to define the major differences between our artifact and the two related works. First, let us give a more detailed definition of the two related works' pattern recognition approaches:

● Arx Fatalis is a first-person game that uses a keyboard and mouse as interface, and typically a regular computer monitor for visuals. The user can thus draw patterns on the screen with the mouse, similar to drawing on paper, or rather to drawing with a mouse in computer drawing software. The pattern library in Arx Fatalis consists of lengths of the drawn lines and measurements of the angles between them. All patterns are in a hierarchy, where e.g. a vertical line with length x as the initial piece of the pattern narrows the possible drawn patterns down to only those beginning with a vertical line of that length. The next line has length y, but there is also an angle between the first and this next line that needs to be measured. That angle narrows the possible drawn patterns down even further, and so on. In this system the patterns consist of a set of such rules, and what a pattern should look like is not directly obvious when reading its rules [1].

● Hecomi's project is a first-person interactive scenario that uses Leap Motion [5] to detect the user's hands directly, allowing the person's hands to act as the interface, and a set of Augmented Reality glasses for visuals. In Hecomi's project, the system can recognise four patterns: a perfect square, an equilateral triangle, a star pattern, and a circle. The circle is unique in that it uses a specific mathematical algorithm to be detected. The other three patterns use a system similar to the one in Arx Fatalis. The system constantly tracks the user's right index finger, drawing a transparent trail as it moves. The user is constantly drawing, and the system is constantly checking if one of the patterns has been drawn. When one of the four patterns is drawn, the system immediately recognises where the pattern started and where it ended. The part of the trail that makes up the pattern changes colour and becomes more visible, and a symbol appears in equal size to the drawn pattern, aligned perfectly in its middle. Similar to the solution in Arx Fatalis, it is not directly obvious what the patterns of Hecomi's project are if you check their rules [7].

The most obvious difference between our artifact and the two related works is that the artifact primarily utilizes template matching, only supported by a syntactic recognition approach. Additionally, the artifact does not measure angles between lines, and does not really measure lines at all. While the syntactic approaches of Arx Fatalis and Hecomi's project use hierarchies of rules and measurements, the artifact allows a much freer approach to drawing the patterns. For example, it may be difficult to create a rule for the two related works to recognise an irregular curve, but in our artifact you would only need to draw the curve and set up CPs to control how the user must draw it. Not only does the artifact make it easy to add such a pattern, but it is also done very visually, meaning it is intuitive for a human to understand what a pattern is. That is to say, the rules that the two related works use do not make it apparent what the pattern will look like when drawn, but the template image, and to some degree the CPs, do. As such, it is possible, or at least much easier, to create more intricate and advanced patterns for the artifact, in comparison to the two related works.

4.5 The Full Process

The pattern recognition system starts off by checking whether the start of the user's drawing touches the first CP in any of the primitive patterns. When it finds a pattern where the first CP is touched, it checks if the drawing hits enough white space in the pattern's image on its way to the next CP. If another CP is touched, and the drawing is still adequately accurate in relation to the black and white parts of the image, the check continues with the same pattern. If the player's drawing reaches into the black area of the image by more than 30 percent, the system checks if the pattern has any succeeding patterns. If it does, it checks if any of those patterns' associated numbers match the index of the last touched CP of this pattern. If so, it goes back to that CP and continues the check from there. If no succeeding pattern has the correct follow-up CP, or if the pattern has no succeeding patterns, the system goes through the rest of the primitive patterns, from the beginning of the drawing. Assuming the hierarchy has been smartly designed, if one branch in the hierarchy has already failed, the rest of the primitive patterns should not be appropriate for the drawing. When a pattern fails, the user is shown the CPs of the pattern which they most likely were trying to draw, i.e. the pattern whose CPs the drawing passed through the most (see figure 4.3 B & C). This is done to let the user more easily learn to draw the patterns that they fail. When a pattern is successfully completed, an effect determined by the spell in Unity occurs (see figure 4.3 D).
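The following code sketch is our reading of that process, reusing the hypothetical SpellPattern type sketched in section 4.4; the CP hit radius, the handling of the 30 percent budget, and all names are our own assumptions, and the artifact likely differs in its details:

```csharp
using System.Collections.Generic;
using UnityEngine;

public static class SpellRecognizer
{
    const float CpRadius = 0.08f;    // assumed CP hit radius in canvas UV space
    const float BlackBudget = 0.30f; // fail past 30 percent black samples

    // Returns the recognized pattern, or null if the spell fails.
    public static SpellPattern Recognize(List<Vector2> drawing,
                                         List<SpellPattern> primitives)
    {
        if (drawing.Count == 0) return null;

        foreach (var p in primitives)
            if (Vector2.Distance(drawing[0], p.controlPoints[0]) < CpRadius)
            {
                var hit = TryPattern(drawing, p, lastCp: 0);
                if (hit != null) return hit;
            }
        return null; // no pattern recognized: the spell fails
    }

    static SpellPattern TryPattern(List<Vector2> drawing, SpellPattern p, int lastCp)
    {
        int black = 0, samples = 0;
        foreach (var point in drawing)
        {
            samples++;
            if (!IsWhite(p.templateImage, point)) black++;

            // Advance when the next CP (in order) is reached.
            if (lastCp + 1 < p.controlPoints.Count &&
                Vector2.Distance(point, p.controlPoints[lastCp + 1]) < CpRadius)
                lastCp++;

            if (black > samples * BlackBudget) break; // strayed too far
        }

        if (lastCp == p.controlPoints.Count - 1) return p; // all CPs passed

        // Retry with any child pattern continuing from the last touched CP.
        // For simplicity this sketch rescans the drawing from its start,
        // whereas the artifact continues from the last touched CP.
        foreach (var (child, parentCp) in p.succeeding)
            if (parentCp == lastCp)
            {
                var hit = TryPattern(drawing, child, lastCp);
                if (hit != null) return hit;
            }
        return null;
    }

    static bool IsWhite(Texture2D img, Vector2 uv) =>
        img.GetPixelBilinear(uv.x, uv.y).grayscale > 0.5f;
}
```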


Figure 4.4 - These are 100x100 px images used as template images by the pattern recognition system. The user's drawings have to follow the white patterns, in addition to the images' CPs, which are shown as red spots in D, E, and F. Hitting the white parts of the image continues the pattern recognition, while hitting too much of the black parts will fail the spell, effectively cancelling the pattern recognition of that pattern prematurely. When drawing the arrow-down pattern (B & E), the system will initially assume the player is drawing the line pattern (A & D). After the user has reached the third CP at the bottom, the system notices that the drawing derails into the black areas, which fails the line spell. The system then checks if the line pattern has any patterns in its list of succeeding patterns, where it finds the arrow-down pattern. The system goes back to the last touched CP, the bottom one, and retries the pattern recognition on the arrow-down pattern from there. C and F show the pattern drawn in figure 4.3.


4.6 Adding Patterns

Figure 4.5 - The scriptable object of a spell (pattern) in Unity. A scriptable object is a data container in Unity, which can be used to help serialize objects, making it easy to create new ones from a template. The "Spell Sprite" field contains the black and white image of the pattern. The "Control Areas" field lets the developer choose how many CPs the pattern should have, and the position of each CP in the image. The "Succeeding Spells" field lets the developer choose how many child patterns the pattern has, and which the child patterns are.

In order for us to be able to easily add new patterns to the library, we developed a pattern creation method consisting of two steps. The first step is to create a picture containing only black and white, where the white forms the pattern of the spell/action. The second step is to manually add CPs to the image, which is done in a scriptable object in Unity (see figure 4.5). The CPs are added to a list, where the first CP should be where the pattern starts, and the last CP where the pattern ends. The CPs need to be in the correct order in the list, as the pattern recognition system uses them to check that the pattern is drawn as intended. Without this rule it becomes too easy to fool the pattern recognition by mostly staying within the white areas of the image without actually following the intended pattern. Adding a new pattern takes us one to three minutes, depending on how complex it is and how many CPs it requires.
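As a sketch of what the data container behind figure 4.5 could look like, a hypothetical Unity ScriptableObject is shown below. The field names mirror the fields in the figure, but the class itself is illustrative rather than the artifact's actual implementation.

using System.Collections.Generic;
using UnityEngine;

// Hypothetical data container mirroring the fields shown in figure 4.5.
[CreateAssetMenu(menuName = "Spells/New Spell Pattern")]
public class SpellDefinition : ScriptableObject
{
    public Sprite spellSprite;                     // black-and-white template image
    public List<Vector2> controlAreas;             // ordered CPs: first = start, last = end
    public List<SpellDefinition> succeedingSpells; // child patterns in the hierarchy
}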

4.7 Redundancy

With the current functionality of the artifact, the system runs the risk of not letting the user know how to complete certain patterns, because it mistakes them for another pattern. In the following speculations we assume that the developer of the patterns has made no mistakes when placing the CPs, and has placed them so that no two CPs overlap. Failing to do so runs an unnecessary risk of the system checking the incorrect pattern. Additionally, we assume that the developer has placed the CPs in the right places of the patterns, and has made no errors in the construction of the pattern hierarchy. This means that a child pattern does not continue from a CP that the user's drawing has not passed through, and that the patterns build upon each other, following the same pattern but acting as splitting paths. In the following examples we will be referring to the four patterns seen in figure 4.6.
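The first assumption, that no two CPs overlap, could be verified automatically at edit time. Below is a minimal sketch of such a check, assuming a touchRadius equal to the radius the matcher uses when testing whether a drawing touches a CP; the names are illustrative.

using UnityEngine;

public static class PatternValidation
{
    // Returns true if any two CPs are close enough that the matcher could
    // confuse them, given the radius used to test whether a CP is touched.
    public static bool HasOverlappingCps(Vector2[] cps, float touchRadius)
    {
        for (int i = 0; i < cps.Length; i++)
            for (int j = i + 1; j < cps.Length; j++)
                if (Vector2.Distance(cps[i], cps[j]) < 2f * touchRadius)
                    return true;
        return false;
    }
}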


(A) (B) (C) (D)

Figure 4.6 - These patterns are made-up examples of how the system can have trouble recognising what the user is drawing under certain circumstances. Each image has black and white fields, of which the user's drawing must follow the white field. Additionally, the user's drawing shall follow the red arrow. B is a child pattern of A, which, if the user does not draw far enough to the right at the end, will not even be checked. C and D are examples of sibling patterns, i.e. two child patterns with the same parent. In the case where the system knows the user was trying to draw either C or D, but neither was successful, the system will tell the user how to draw the last pattern of the two that it checked.

In our first example we will be using pattern A as the parent of its child pattern B. In order for the system to recognise a pattern as failed, the user's drawing has to enter the black fields of the pattern's template image, and the system will not check child patterns unless the parent pattern is recognised as failed. Therefore, when the user is trying to draw pattern B, if their drawing does not reach far enough into the black field at the bottom right of template image A, the system will never realise that pattern B was even attempted, because pattern A will succeed. This is a problem, as it makes it difficult for the user to see the helping CPs for pattern B: since pattern A succeeds, no CPs are shown at all.

In our second example we will be using patterns C and D as sibling patterns, that is to say that they both have the same parent pattern. In this case, pattern C is first in the list of succeeding patterns. This means that when the system fails to recognise the parent pattern of C and D, it will check whether the user drew pattern C before checking pattern D. If the user is trying to draw pattern D, their drawing may not go far enough to the left at the bottom for the system to recognise pattern C as failed; instead, the system thinks that the user partially succeeded in drawing pattern C. In this scenario, pattern D's CPs will never be shown to the user, because either the system recognises pattern C as almost successful, or it recognises pattern D as a complete success.

The first example is an issue caused by the next pattern in line back-tracking into the white fields of the previous pattern's template image. We can think of a few simple solutions for this issue, e.g. keeping track of how far from the last CP the user's drawing trails off after the CP has been hit. However, no such solution is implemented in the artifact as of this writing. The second example is an issue caused by sibling patterns' CPs being too close to each other. Because of how the system works, we cannot think of a particular solution to this issue. The best way to deal with it is simply to place the sibling patterns' CPs further away from each other. Unfortunately, this means that the developer of the patterns becomes somewhat limited in which patterns they can make.
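As a sketch of the first solution mentioned above — which, again, is not implemented in the artifact — the parent pattern could be failed once the drawing trails too far from its final CP after touching it, so that the check falls through to the child pattern. The names and threshold here are assumptions.

using System.Collections.Generic;
using UnityEngine;

public static class TrailCheck
{
    // Returns true when the drawing wanders further than 'threshold' from the
    // last CP after that CP has been hit, hinting that the user is continuing
    // into a child pattern such as B in figure 4.6.
    public static bool TrailsOff(List<Vector2> pointsAfterLastCp, Vector2 lastCp, float threshold)
    {
        foreach (Vector2 p in pointsAfterLastCp)
            if (Vector2.Distance(p, lastCp) > threshold)
                return true;
        return false;
    }
}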

5. Experiment

In order to evaluate whether the artifact we had created was a suitable solution to the research questions we had presented, we conducted an experiment with eleven (11) participants. The selection criterion for participants was that they should not be very experienced with VR, so as to make sure that the interface could be natural for almost anyone, and not only for those who are used to the general VR interface. Before the test officially began, we let the user play through a short demo created by Oculus, which introduced them to the basic controls and standard mechanics of VR. The experiment consisted of two parts: a structured observation followed by a semi-structured interview. Before the observation began, we asked the user for consent to record their reactions and answers during the observation and interview. The answers and observations were recorded on separate papers, and the users are anonymous, each referred to only as "Test 1", "Test 2", and so on.


5.1 Observation

During the observation the user wore the HMD and entered the artifact. The user was first verbally introduced to the basics of the artifact, and was shown a set of patterns. The patterns can be seen in figure 6.1, and were visible to the user in the artifact in the form of a book, as seen in figure 5.1. We instructed the user to draw the patterns shown in the book, and then observed the user, looking for behavioral patterns such as: how many attempts were made before completing a pattern, whether the pattern seemed difficult to draw, and whether the user learned to draw the pattern better as they drew. The observation proceeded until the user had succeeded with each of the four patterns. If the user wanted to redraw any of the patterns they had succeeded in, they were allowed to do so until we found that no more useful data could come out of it. When a user continuously failed to draw a pattern without being able to get the CPs to show, we gave them tips on how to draw the pattern correctly. The instructions given on how the artifact works, how the observation process worked, exactly which behavioral patterns we observed, and the tips given when a user failed repeatedly are all detailed in annex A. After the user had successfully completed each pattern, they dismounted the HMD and we followed up with the interview.

5.2 Interview

For the interview we had a set of questions prepared, each categorized as one of the ten types of questions described by Qu [33]. The questions were designed to let the user answer freely and elaborate on a set of keywords that were important for answering our research questions: Natural, Clunky, Dizzy, Confused, Clear, Difficult, Hard, Easy, or synonyms of these keywords. Each interview started with our two introduction questions. If the interviewee stopped talking without having mentioned any of the keywords, we would follow up with the direct questions. If one of the keywords was mentioned, we would use the follow-up and probing questions to let them elaborate on that keyword. The other question types were used when it seemed appropriate during the interview.

Figure 5.1 - A book in the artifact which the user would carry in their left hand. The user was able to “flip” through the pages to see all four patterns while wearing the HMD.


6. Results

The following results provide a comprehensive summary of the thematically analysed [28] data that can be found in annex C.

(A) (B) (C) (D)

Figure 6.1 - Representations of the patterns that the users would draw during the experiment. The patterns are referred to as the first (A), second (B), third (C), and fourth (D) pattern.

6.1 Pattern Analysis

The following subsections present the results from the experiments on how well each of the patterns was completed. Pictures of the tested patterns, and their names, can be seen in figure 6.1.

First Pattern

No user had difficulty drawing the first pattern, which is a straight vertical line. Some users had trouble drawing it while they were still learning the basic controls of the artifact and of VR, but once they got the hang of the controls, this pattern was no trouble at all.

Second Pattern

The second pattern was easy for most users, as it consists of three straight lines. Most users completed the pattern on their second or third attempt. Those who did find this pattern difficult said it was because of the level of proportionality required in all the patterns. Eventually, all users learned to consistently draw this pattern correctly.

Third Pattern

The third pattern was the most difficult for all users to draw, as the intricate shape was difficult to follow while keeping it proportional. It took an average of 10 attempts to succeed with this pattern, ranging from 3 to 19 attempts. This pattern took most users the most tries to complete, and many found it frustrating to draw. Some users said that all patterns felt good to draw, except for the third one.

Fourth Pattern

The fourth pattern was very difficult for most users to complete. Some users believed this was because the pattern looked like the letter S, so they tried to draw it as they would write it, instead of following the pattern as shown. Like the third pattern, this pattern took an average of 10 attempts to succeed with, but some users succeeded on their first attempt, while others needed more than 30 attempts. Many users expressed frustration that the pattern was not recognised despite them thinking they had drawn it correctly. Some users experienced the S as the most difficult pattern, despite completing it faster than the third pattern. Some users had no trouble drawing the S at all.

6.2 Intuitiveness

Most users agreed that drawing in VR felt natural and was easy to do. However, many users thought that drawing the more complicated patterns, like the third and fourth, felt weird: even when they thought they had drawn the pattern correctly, it still did not succeed. Despite this, almost all users felt that it was intuitive to draw the patterns. Most users expressed that with some practice it would become easier to succeed with the patterns. Some users quickly learned the correct technique to complete patterns, while others still seemed uncomfortable at the end of the observation. Because of the precision required to complete a pattern, some users were unsure, when asked, whether they could succeed with the patterns in a stressful situation, e.g. when faced with an opponent. The general consensus was that the first and second patterns could work in such situations, while the third and fourth would need to be less strict on accuracy. However, most users seemed able to learn to consistently draw the first, second, and fourth patterns correctly. After some training, most users were also able to consistently spawn the CPs of the third pattern, and were able to complete it with their help.

6.3 Virtual Reality

The artifact did not make anyone feel nauseous, and no signs of frame skips were seen or mentioned by the users. Some users did not like drawing in VR: they either had problems with depth perception when drawing freely in the air, or simply felt that drawing in VR is much harder than drawing on paper in real life.

7. Conclusion

7.1 Natural User Interface

Most of the users showed frustration when trying to recreate the patterns shown to them. While all of the users succeeded after some trial and error, the interviews made it clear that most of the time the users felt the system was the problem, not their own drawing. One thing that improved the results was how many questions the user asked us before and while drawing: the users who asked many questions got very clear instructions, and drew more accurately than those trying to learn from experience. This does not mean that the NUI is a complete failure, as many users felt that the system was not overly strict, and that it was fun to use.

Moreover, while many of our users expressed that it felt natural to draw the patterns, some said the depth perception in VR made it difficult to draw in the air. Some users seemed to have little problem with this, while others struggled with all patterns because of it. Even with clear instructions, these users were not able to draw the patterns intuitively. While the depth-perception issue could be blamed on the limitations of VR, since only some users seemed affected by it, it seems more likely to be caused by how the individual experiences something as intuitive. Norman goes as far as to say that it is not possible to create a completely intuitive gesture-based system [17]. Yet, even if that is the case, we believe that a NUI can at least come very close to being completely intuitive. Part of the negative way users experienced the NUI was caused by the accuracy of the pattern recognition. A great way to improve our interface would be to make the precision of the pattern recognition more lenient on the users, as the more intricate patterns created a lot of frustration.

Furthermore, many users said that the visual feedback when failing or attempting a pattern was part of the reason that drawing did not feel intuitive. The implemented visual feedback let the user see what they had drawn as they were drawing, and if they had almost succeeded with a pattern, they would also get to see its solution. As some users expressed, it would likely help to increase the amount of visual feedback. When asked, the users agreed that being able to see the pattern in the air while trying to draw it would help a lot in making it feel natural.

However, although there are issues with the NUI, many users still believed that it would be possible to learn to draw the patterns, even in stressful scenarios. Several users thought that the patterns would settle into muscle memory. Being able to learn a pattern to such a level must mean that it is intuitive. Even so, there were also users who said it would not be possible, particularly not for some patterns.

7.2 Pattern Recognition

The pattern recognition method of the artifact uses a template matching approach coupled with a syntactic pattern recognition solution. The system we have created lets the developer easily add new patterns to a library of patterns, with control over how the user needs to draw them. During the experiments many users expressed having trouble following the patterns, despite our efforts to design patterns that would be lenient on the required accuracy. It seems the accuracy of the system is still too strict, as many users said that the required accuracy is too high to complete the patterns, particularly the third and fourth. While the


References
