Handwriting in VR as a Text Input Method

(1)

(2)

Handwriting in VR as a Text Input Method

2017-07-09

Rasmus Elmgren relmgren@kth.se

Title in Swedish: Handskrift i VR som en Textinmatningsmetod Master’s Thesis in Human-Computer Interaction

School of Computer Science and Communication (CSC) Royal Institute of Technology, Stockholm

Examiner: Haibo Li Supervisor: Yang Zhong

Project provider: Accedo Broadband AB

(3)

Abstract

This thesis discusses handwriting as a possible text input method for Virtual Reality (VR) with a goal of comparing handwriting with a virtual keyboard input method. VR applications have different approaches to text input and there is no standard for how the user should enter text. Text input methods are important for the user in many cases, e.g when they document,

communicate or enter their login information. The goal of the study was to understand how a handwriting input would compare to pointing at a virtual keyboard, which is the most common approach to the problem. A prototype was built using Tesseract for character recognition and Unity to create a basic virtual environment. This prototype was then evaluated with a user study, comparing it to the de facto standard virtual keyboard input method.

The user study had a usability and desirability questionnaire approach and also uses Sutcliffe's heuristics for evaluation of virtual environments.

Interviews were performed with each test user. The results suggested that the virtual keyboard performs better except for how engaging the input method was. From the interviews a common comment was that the

handwriting input method was more fun and engaging. Further applications of the handwriting input method are discussed as well as why the users favored the virtual keyboard method.

Sammanfattning

Virtual Reality (VR) applikationer har olika tillvägagångssätt för textinmatning och det finns ingen tydlig standard hur användaren matar in text i VR.

Textinmatning är viktigt när användaren ska dokumentera, kommunicera eller logga in. Målet med studien var att jämföra en inmatningsmetod

baserad på handskrift med det de facto standard virtuella tangentbordet och se vilken inmatningsmetod användarna föredrog. En prototyp som använde handskrift byggdes med hjälp av Tesseract för textinmatning och Unity för att skapa en virtuell miljö. Prototypen jämfördes sedan med det virtuella

tangentbordet i en användarstudie. Användarstudien bestod av uppmätt tid samt antal fel, en enkät och en intervju. Enkäten grundades på

användarbarhet, önskvärdhet och Sutcliffes utvärderingsheuristik av virtuella miljöer. Resultatet visar att det virtuella tangentbordet presterade bättre, handskriftsmetoden presterade endast bättre på att engagera användaren.

Resultatet från intervjuerna styrkte också att handskriftsmetoden var roligare och mer engagerande att använda men inte lika användbar. Framtida studier föreslås i diskussionen samt varför användarna föredrog det virtuella

tangentbordet.

(4)

Rasmus Elmgren KTH Royal Institute of Technology

Stockholm, Sweden relmgren@kth.se ABSTRACT

This thesis discusses handwriting as a possible text input method for Virtual Reality (VR) with a goal of comparing handwriting with a virtual keyboard input method. VR applications have different approaches to text input and there is no standard for how the user should enter text. Text input methods are important for the user in many cases, e.g when they document, communicate or enter their login information. The goal of the study was to understand how a handwriting input would compare to pointing at a virtual keyboard, which is the most common approach to the problem. A prototype was built using Tesseract for character recognition and Unity to create a basic virtual environment.

This prototype was then evaluated with a user study, comparing it to the de facto standard virtual keyboard input method.

The user study had a usability and desirability questionnaire approach and also uses Sutcliffe’s heuristics for evaluation of virtual environments. Interviews were performed with each test user. The results suggested that the virtual keyboard performs better except for how engaging the input method was. From the interviews a common comment was that the handwriting input method was more fun and engaging. Further applications of the handwriting input method are discussed as well as why the users favored the virtual keyboard method.

KEYWORDS

Virtual reality, Handwriting, Text, Input Method, User Inter- face, Character Recognition

1 INTRODUCTION

Virtual Reality

Today’s big increase in Head Mounted Displays (HMD) has led to many applications in Virtual Reality (VR), Augmented Reality (AR) and Mixed Reality (MR). VR simulates 360 degrees of an artificial 3D environment created using various computer technologies, often with six degrees of freedom for the user. The common x, y, z positions and also roll, pitch and yaw represent the six degrees of freedom. When used correctly VR gives the user feelings of presence and immer- siveness. The HMDs let the user look around in the artificial world and interact with it using 3D hand controllers or by using gaze and presence. Milgram gives in [9] a concept of a

Figure 1: The Reality Virtuality Continuum.

"Reality Virtuality Continuum" where on one end there is the real environment and on the other end there are completely virtual environments, see Figure 1. VR would appear on the second end as users get mounted into a simulated world. AR is also in this continuum but it does not put the user in a new environment so it appears closer to the first end in Milgram’s Virtuality Continuum.

Many of today’s VR systems (Oculus, HTC Vive and Google’s Daydream) have 3D hand controllers that track the position of the user’s hands as well as their rotations. These controllers have various uses: e.g pointing, grasping, gesturing, painting and more. The 3D hand controllers are also used for text input. A common solution is to use a virtual keyboard that the user point at with the controllers, see Figure 2.

Text input in VR is so far not well developed, there is no standard for user text input in today’s VR systems as many applications handle the task differently. When a user is searching, communicating, registering, documenting ex- periences or logging in to a service the user needs text input.

Users need a way of entering text in order to communicate with a machine, like entering a password or other sorts of registrations. The most common solutions for text input are virtual representations of computer keyboards that the user point at with the accompanied 3D-hand controllers of the system, see Figure 2. However, precise work (like pointing at the right key on a cluttered keyboard) is a tough problem for the user for many reasons, e.g:

• Friction: 3D interaction performed in the air has no friction. This makes the movement less controlled.

• Shakes: Humans show hand tremors.

• Not Static: 3D hand controllers cannot get dropped like a mouse, it is not possible to fix the position of the pointer since the controllers position, roll, pitch and yaw is constantly updated.

(5)

R. Elmgren

Figure 2: SteamVR’s virtual keyboard where a user points at a keyboard using a 3D hand controller to input text.

Other solutions use the gaze location (the focal point) of the HMD to point at a virtual keyboard or pre-defined common entries or choices. Using gaze limit users to gazing at the same entry for a period of time in order to select it.

Another less common solution is to use radial buttons that extends the 3D hand controller.

Using a mouse is 98% more accurate and 36% faster than using a 3D controller (in this case the Wii Remote) when it comes to selecting and navigating a 3D UI as shown by Zaman et al. [14], though it should be noted that the Wii Remote does not feature as good tracking as today’s 3D hand controllers. Since precise work is hard for the user in a virtual environment, another approach than the virtual keyboard method is presented in this article.

Voice recognition is an interesting topic that could make the user experience of text input much smoother. There are few open source voice recognition software packages available that are compatible with game engines like Unity or Un- real Engine. However, voice recognition is not an alternative for some users for various reasons; the user does not want the microphone constantly on due to privacy preferences, somebody is sleeping in the other room, it would make the user speak their password out loud, speech impediment, etc.

Therefore, there needs to be an alternative to speech recognition when it comes to communication with the computer.

Naturally, there are arguments for using voice recognition but there needs to be a redundancy in input methods so that there are options for all users and use cases.

Design Evaluation

When creating a new input method we need to think about how we evaluate the interface. In [11] Sutcliffe et al. present 12 heuristics that a designer can use to evaluate virtual environments. Sutcliffe’s heuristics are based on Nielsen’s pop- ular heuristic evaluation paper but extends it with virtual

environment principles. Key points from the study are natural engagement, presence and that the user expectations are always met. Compatibility with the user’s task and the do- main is important for the user to make decisions that would resemble the real world example. These heuristics are put to use later in the evaluation.

It is possible to test a design quantitatively by its usability.

ISO standards define usability by its effectiveness, efficiency and satisfaction in a quantified context of use [5]. Effective- ness is how capable the software is to produce the desired result and efficiency measures how much time or energy the user spends performing the task.

A component of user satisfaction is desirability. Barnum and Palmer [2] describe how desirability was measured and evaluated by Microsoft. To measure desirability Microsoft created a "desirability toolkit" with two approaches. The first approach was to use "face cards" that users could assimilate with. The second approach used "reaction cards". Reaction cards helped users express how they felt about a product or a topic. By measuring the user’s reaction on a Likert scale they could find patterns that affect desirability. To measure desirability, a set of 118 reaction cards was created with positive words, these were later extended with negative associations as well. The user study in this article will describe desirability based on these reaction cards.

Sofia Fröjdman has given some User Experience (UX) guidelines in her report [6] regarding graphical user interfaces in VR. These were later on used when designing the presented input method in this thesis. Placements of the Graphical User Interface (GUI), visual feedback, information areas, non-moving information and affordance to minimize cognitive load are guidelines that are general to all VR interfaces. Since her study focused on head oriented input a selection of the guidelines were used.

Character Recognition

One natural way of entering text is penmanship or writing by hand. We learn this skill in school using pen and paper and many use handwriting daily in their work. This makes it a promising alternative for text input. For the computer to recognize letters and digits, it needs character recognition.

Character recognition is a well researched subject with many different applications varying from aiding the visually impaired to quickly entering your credit card credentials in an electronic purchase. The overall goal is to recognize and classify where the characters are in a picture or in a live frame. The field of Optical Character Recognition (OCR) has different approaches to how the character gets recognized.

Each approach has its own conversion accuracy (how often the correct character gets recognized) and computing speed.

(6)

A common approach to find objects in a picture is to find edges. Color changes can be used, or if the picture was taken with a stereo camera depth edges are available, that will indicate were objects separates from each other. For characters, the approach is to find features and match them to a sample set. Approaches using machine learning algorithms can train themselves to get better conversion accuracies.

A study conducted by Trier et al. [13] compared different approaches for recognizing characters. Approaches using machine learning are highly accurate if enough training data is used. Template matching were fast at recognizing characters but had slightly worse accuracy. Feature extraction and classification methods can be used almost everywhere but the use case must be evaluated so the right method can be applied according to Trier.

There is plenty of free, open source character recognition software available. Because of its availability and ease of use, Google’s Tesseract was chosen for this project. Tesseract also has the possibility of training the engine to recognize new samples. Rakshit et al. [10] found Tesseract to be working with handwritten characters, with a conversion rate of almost 90% with the segmentation of characters being a fault source of 2.29%.

The conversion rate of Tesseract on handwritten characters is not that impressive. The MNIST database is a database of handwritten digits that is used for training and is a good indication on how well a character recognition system performs. Ciresan et al. [4] have achieved an error rate as small as 0.23% on the MNIST database using convolutional neural networks. However, that paper only focused on handwritten digits so it is not comparable with the method of Rakshit et al. who worked with letters.

Related Work

Instead of recognizing the writing of the user it is possible to recognize the gestures the user performs when writing.

In 2014 Kim et al. [7] constructed a 3D gyroscope based handwriting recognition system. Their system uses Dynamic Time Warping to be able to compare different inputs from their gyroscope-data. The 3D gyroscope works by having 3 coordinates that measures the orientation based on the principles of angular momentum. However, this thesis will not cover this concept, but for future work this might be of interest.

In [1] Alger presents a design for an OS in VR. During his presentation he discusses input methods for VR. He also talks about how a user can read by introducing a speed reading concept. In speed reading a user see one word at a time in a fast pace. He designed an OS suited for the Oculus development kit 2, and suggested that the user have one 3D

hand controller and use the other hand for control, similar to a keyboard and mouse combine.

By using OCR this thesis will aim to create an input method based on handwriting in 3D. The target group for the input method is VR users as there is no standard for text input in VR.

2 GOAL

As there is no standard for how to enter text in VR, available options have to be evaluated in order to find a suitable solution. Handwriting may become a natural input method used for communication, documentation, search and more in VR.

The result of the study may impact how VR-users deal with text as VR becomes more widespread.

This thesis aims to create a text input method from a design perspective in VR and then compare if it is preferable to the commonly used virtual keyboard, the SteamVR virtual keyboard shown in Figure 2 will be used for the comparison.

This thesis will not focus on character recognition or creating the most accurate character recognition for VR use.

Handwriting is a natural text input method that is learned in school. Using the accompanying 3D-hand controllers as a pen, a character recognition of drawings will be built into a VR scene (using Unity) and used as a text input method using Tesseract. The use of handwriting conforms with Sutcliffe’s heuristics of natural engagement. The report aims to find how the input method performs and compares to the de facto standard virtual keyboard.

The goal of this thesis therefore is to:

Create a handwriting text input method in VR using the 3D hand controllers and character recognition. In order to:

• Compare the commonly used virtual keyboard and the proposed handwriting method to indicate which input method the user prefers and why the user prefers it.

Research Question

Can 3D-controllers be used to write characters for a real- time input in VR using character recognition and how would that input method be comparing with a virtual keyboard according to users?

The problem entails implementing Tesseract into a VR environment, designing a method for text input in VR and conducting a user study to evaluate a common text input option to the proposed one.

3 METHOD

This section has two subsections since the goal has two different tasks. One for the implementation of the character recognition in VR and one for the user study.

(7)

R. Elmgren

Figure 3: The design featured canvases that would hold one character each. This eliminated the segmentation problem that Tesseract had and also kept the users from having to turn their heads when writing longer sentences. The left side is the user’s point of view and the right side shows the user’s hand controller position.

Character recognition in VR

The implementation of the character recognition started with designing how the input method should work. The design of the input method was created through a think aloud brainstorming session with an experienced VR programmer and a UX designer with experience in VR using a selection of Fröjdman’s guidelines [6] as a basis for the discussion.

The brainstorming session resulted in the final design by exploration. The exploration used basic prototypes (hands and paper) to simulate the flow of the controllers.

The design let the user input one character at a time into separate drawings, see Figure 3 for an example. This keeps the user from having to turn their heads when they have written a longer sentence. It also helps the OCR engine seg- menting out characters, which was a fault source of 2.29%

as mentioned earlier, see [10].

Tesseract was later chosen for the character recognition as it works sufficiently with handwriting, especially if we eliminate segmentation from the process. Since the focus is not on presenting a new character recognition algorithm the thesis will only implement a set of the English characters.

The implemented recognition had uppercase letters A to Z and the numbers 0 to 9 working.

The Unity3D game engine will provide a basic environment. Unity (64-bit version 5.5.2f1 was used) provides environments divided into scenes that users can design freely with few limitations. Interactions and behaviors are pro- grammed in C-sharp (C#) or JavaScript. Unreal Engine is an alternative but the author favored Unity from previous experience.

Since Tesseract 3.04 is built in C/C++ a C-sharp wrapper acted as a middleware between Unity and Tesseract. The wrapper is an open-source project [3] that provides an interface with the features that Tesseract has. The Virtual Reality

Toolkit [12] (VRTK) helped with providing an interface to controllers, interactions and object handling. With VRTK it is also possible to write once and deploy to many VR head- sets. However, the main device during this thesis was the HTC Vive.

The input method presents the user with drawable canvases. The user points at the canvas with the 3D hand controller and presses the draw button (default to the trigger button) to draw. When the draw button gets pressed a ray ejects from the tip of the controller. If the ray hits a canvas it will result in a change of color on the hit pixel of the canvas.

When the draw button gets released the system waits for 0.6 seconds for more input since some characters have multiple lines e.g A and E. The duration was iteratively chosen from user trials. If the system is not given more input the canvas applies for character recognition and gets pushed to it’s left.

A new canvas appears at the original place of the first canvas and the system now has a loop that the user can repeat.

Having the new canvas appear on the same position as the original canvas keeps the user’s attention in the same direc- tion. This is a feature Fröjdman wrote about, main content areas are always in front of the user. Also, the users do not have to turn their heads when they write longer sentences.

As the user doesn’t have to point at a specific spot to select a character we can remove one of the problems that 3D interactions have regarding precise work, see Section 1.

The user does not have to fix the pointer at a certain position, so the "not static" problem is resolved.

As long as a HMD has accompanying 3D hand controllers this method should be adjustable for that HMD. For the real environment (the first end of the Virtuality Continuum) the method would just be called handwriting and no environment would have to be simulated.

User study design

As the goal of the thesis was to know which input method the user prefers, a mixed quantitative and qualitative desirability study with a near-natural use of the product was performed.

Qualitative user studies aim to give insight into how an interface or a product performs from the user’s perspective.

A qualitative study answers questions like "why" and "how", instead of "how much" and "how many". A near-natural use of the product minimizes interference from the study to understand behavior but lacks control. Desirability studies are attitudinal and can be both qualitative and quantitative.

These studies often give the user different visual designs and output associations to attributes of the design. The attributes and associations can be both pre-made or the user can answer in a conversational matter.

The user study firstly aimed to answer which of the two input methods is preferable. Secondly, it measured some

(8)

quantitative data such as which input method is the fastest and most accurate, so time and amount of errors was measured by recording the participants. The guidelines that were used to design the input method will also be evaluated. The user study will focus on a near-natural desirability study, with user feedback being the main output of the user study.

Since using HMDs for a longer time period can cause fa- tigue and dizziness the user study featured a within group design and counterbalancing in order to not give a bias to- wards one of the methods.

Invitations were sent out to an office in Stockholm. The invitations asked for people with little knowledge of VR. The user study was planned for two days, were the users could request a time slot of 30 minutes.

The user study had five parts, first an introduction to what the study is all about and that we were going to record the user. Second, the user could try the two different input methods to get acquainted with the gear. Third, the user wrote a sentence with one of the input methods. Fourth, the user filled in an evaluation form. The form was based on the heuristics from the previously mentioned Sutcliffe paper [11], a desirability approach based on Microsoft’s reaction cards described in [2] and the ISO 9241-definition for usability [5].

The third and fourth step were repeated for the second input method. Lastly, an informal interview with the user was held that would address the user’s thoughts that were not brought to light by the evaluation form in step four. The design of the interview used guidelines from McCammon [8].

The questions featured in the questionnaire are provided in the appendix. The users were asked to associate how much a feature relates to the input method. All associations were positive associations and the association was indicated on a scale of 1 to 5. A higher number is more favorable.

Since Microsoft’s reaction cards had negative reactions, the opposite of the negative reactions were chosen for this study.

The questionnaire was filled in directly after the user had completed the task for one input method and then again after the second input method so the user’s memory of the task was fresh. The task was to write’The quick brown fox jumps over the lazy dog’. The sentence was chosen so the user had to write each letter, a to z, of the English alphabet in order to give a more fair comparison of the input methods. Time and amount of errors were then compared by their means.

4 RESULTS

This section will display the results of the study. The user study results will be divided into two parts, one will present the results of the measurable data and one will present the informal interviews. The questions of the questionnaire can be found in the appendix.

Figure 4: A user testing the input method

User study

The study had 15 users with different backgrounds. The users’

ages ranged from less than 25 to 45-55. Six of the 15 users were women. Unfortunately, the total population count was an odd number so there was one more user in the group that tested the handwriting method first. The test took place in an office environment with the HTC Vive chosen as the HMD, see Figure 4 for an example of a user testing the handwriting input method.

The first three questions of the questionnaire regard usability, questions 4-7 regard desirability and questions 9-11 regard Sutcliffe’s VR heuristics. The questions in Figure 5, Figure 6 and Figure 7 are highlighted to show which category they belong to.

Figure 5 and Figure 6 show the responses averaged from the user tests. The counterbalancing did have an effect on the study as those figures show, both in the time and error figures and in the answers to the questionnaire. The average of both groups gave the result in Figure 7.

The standard deviation for the keyboard time was 15.116 seconds and 42.827 seconds for the handwriting. The amount of errors’ standard deviation were 1.792 errors and 6.552 errors respectively.

In the total average of both groups regardless of the counterbalancing, see Figure 7, the keyboard method performed better on all questions except for the method being "engaging". This is also true in Figure 6 when the users had tried the keyboard method before using the handwriting method. Also, in Figure 5 there is a smaller difference and the handwriting method performed better on the question "the method felt suitable for the task". However, the handwriting method was not perceived as suitable for the task. Comparing it to the keyboard method’s representative value when the user had tried the other method first, see Figure 6, shows that the handwriting method was seen as less suitable after the other method had been tested.

(9)

R. Elmgren

Figure 5: Averages from the first test

(a) Questionnaire answers (higher is better). This histogram show the average answer when the user tried their first input method. The questions for each column are available in the appendix.

Input Method Time (seconds) Amount of Errors

Handwriting 135.75 14.13

Keyboard 46.43 1.14

(b) The mean time and amount of errors (lower is better).

When the user tried the handwriting method first the method was also perceived to have more natural engagement than the keyboard had when it was tried first. For all other questions the keyboard method performed better.

The second test, Figure 6, shows that the keyboard methods answers are more accentuated than they were in the first test, see Figure 5.

Patterns in the informal interviews Patterns found from the interviews:

• The users would rather take off the HMD and write on a regular keyboard if they had to write more than a sentence.

• It was more fun and engaging to use the handwriting method but not nearly as efficient and useful as the keyboard method.

• Both methods were equivalent in terms of ergonomics and deemed as no problem to the users.

• The handwriting method could be used for educa- tional purposes.

Also, some users mentioned that the controller should be replaced by a pen-like controller in order to increase user’s

Figure 6: Averages from the second test

(a) Questionnaire answers (higher is better) This histogram show the average answer when the user had tried the other input method before this method.

Input Method Time (seconds) Amount of Errors

Keyboard 38 0.88

Figure 7: Averages from both tests

(a) Questionnaire answers (higher is better).

Input Method Mean time (seconds) Mean amount of errors

Keyboard 42.21 1.08

(10)

performance for the input methods. Writing multiple characters continuously on a blank space instead of the separation in the handwriting method was also requested by a few users.

One user said that the handwriting method confronted the user with their terrible handwriting skills. The handwriting method was also more "user-error-prone" stated by one user.

It was also mentioned that the handwriting method could work better for short "dribbles" as a "note" function. To write a quick note in the air without having to bring up the keyboard might save the user time. One feature that was favored from the handwriting method was that it potentially would be faster when the user wants to write special characters.

On a keyboard this often requires the user to press another key to show the special characters.

5 DISCUSSION

The discussion will be divided into subsections as the topics of discussion range from method discussion to future work The results

The results shows that the keyboard input method had better results in most of the questions. Also, the time and error figures show an advantage for the keyboard method. Arguably, the handwriting method can generate more "user-errors" as one user suggested in the informal interview but this also weighs in on the usability of the input method.

Speed and error is not the only metrics that the users consider when choosing an input method. If that were the case then the QWERTY layout would not be the optimal one for physical input and speech recognition would be more common. The time and errors varied greatly between each user for the handwriting method which could suggest that there is a (re-)learning curve for handwriting as a user commented during the informal interview. It is also worth noting that the design of the handwriting method adds 0.6 seconds of waiting time for each character. The task was to write 35 characters. If we multiply the waiting time with the amount of characters we can say that the method always adds 21 seconds to the total time of each user test. If we remove 21 seconds from the mean time of all users the keyboard method still beats the handwriting method with about a minute.

The user-errors could be one of the factors that made the handwriting method have a larger standard deviation than the other method, see the standard deviations in Section 4.

The time was also significantly higher for the handwriting method. Since the standard deviation was so significant it is hard to draw any conclusions about the performance of the handwriting method regarding time and errors. If a user could practice more they could approach the times measured from the keyboard method, but this is sheer speculation.

The previously mentioned user-errors may also be caused by the choice of OCR engine. Tesseract’s conversion rate might not be enough for live VR input. Users still have the problems of hand tremors and lack of friction which could create small errors that are hard for Tesseract to process.

Using handwriting for "dribbles" as a "note"-function was mentioned in the interviews and could be a possible future application of the method but there is no evidence that sup- ports that it would work better than a virtual keyboard.

The qualitative part of the results do favor the keyboard method, only concluding that the handwriting method is more fun and engaging. The results from the questionnaire also favors the keyboard method.

Research questions

The research questions from Section 2"Can the 3D-controllers be used to write characters for a real-time input in VR using character recognition and how would that input method be comparing with a virtual keyboard according to users?" can be evaluated from the results. The usability of the 3D-controllers to conduct handwriting is from Figure 7 deemed as lower than 3 on a scale of 1-5 which should be interpreted as the usability of the input method is not sufficient as a text input method. The input method has been compared by users to a virtual keyboard and has not been favored, so the second part of the research question has also been answered.

The first part of the goal in Section 2 has been fulfilled by the creation of the input method. The second and main goal to "compare the commonly used virtual keyboard and the proposed handwriting method" has been completed by the user study.

Method discussion

Using qualitative data from the interviews strengthened the results from the quantitative data. It also gave information about the user’s thoughts that was not expressed in the questionnaire. For example, the questionnaire showed that the keyboard method performed better except for the engaging part, see Figure 7, and the users confirmed this in the interviews saying that the handwriting method was more fun and engaging. The interviews also gave insight into potential applications for the different methods, which can be developed and/or evaluated for future work.

The standard deviations were quite different for the input methods’ results so it could be argued that it was not really fair to compare the means of the time and amount of error variables. In order to get data that more resembles the nor- mal distribution a larger test population would have to be acquired.

(11)

R. Elmgren It is possible that the choice of Tesseract affected the re-

sults. Users could have performed better and produced less errors if the OCR engine would have been better.

Social and environmental impacts

As seen in the result, see Section 4, the ergonomics of the methods was not seen as a problem. This would suggest that writing in VR would not have any impact on the bodies of the users but more research has to be done on the subject to confirm this. It is hard to predict how much the input methods will be used. The users clearly said that they would rather take off the HMD and write on a regular keyboard if the text to be written would be longer than a sentence.

From a sociological perspective the group of people who can afford a VR setup will gain no benefit from writing in VR compared to the group who cannot acquire a VR setup, except they might have more fun if their goal is to learn how to write.

Running a computer for VR use definitely requires more power than writing in a regular word processor. This is true because the simulation of the environment constantly has to be updated in order to not give an unpleasant experience to the user. Also, the time it takes to perform the task of writing is longer in VR and thus requires more power for a longer period of time.

Future work

For the analysis of the results we need more data to draw stronger conclusions because of the big standard deviation of the results from the handwriting method. From the qualitative part of the results we can conclude that the handwriting method is not better than the virtual keyboard method. How- ever, handwriting in VR could be used as a learning tool for students when they start to learn how to write, see the informal interviews in Section 4. Having fun while learning can possibly give the users more motivation to learn. The users of this study thought the handwriting method was more engaging than the virtual keyboard method, see Fig- ure 7. Engaging the student by making input method into a game could be a possibility. The goal of the game would be to write with as few errors as possible. It is possible to get a confidence value from Tesseract that can tell how sure it is that the resulting character is right.

Doing the same test but with a better OCR could give different results. A future study may implement the work of Ciresan et al. [4] and see if the handwriting method would perform better. More research is needed to draw stronger conclusions about handwriting as a text input method regarding time and amount of errors. Using gyroscope to analyze the gestures could also be an alternative, as Kim et al did in [7].

The dribble feature that was mentioned in Section 4 may also be a future implementation. The interviewed user wanted to have the feature of writing a quick note when inside a game without having to bring up the keyboard. The user also said that removing the canvases for that particular function would be optimal as less interface to the user would be better.

Using speech as an input method in VR is a possibility and should be evaluated, talking is a natural way of communication that might be well suited for VR. However, it is still important to have a fall back option of text input since there are use cases or users that cannot utilize voice recognition to its full potential.

Dealing with one of the precise pointing problems discussed in Section 1 might not have been sufficient. Taking the problems into account to design another input method would produce a better text input method. Humans showing hand tremors is not a problem when using a real keyboard since the user can rest their hands on the physical keyboard.

A better design would take the hand tremors into account and help the user when writing, maybe by introducing a tactile feedback. Since the 3D hand controllers are used in free motion it is difficult to introduce friction. One way to give physical feedback is to use tactile shakes in the controller, which is supported by the HTC Vive.

Instead of writing one character at a time another design could increase the limit to one word at a time. In this case another OCR engine might be needed. However, it is still important to keep the users’ focus in the right context zones as shown in the background.

The handwriting method could be applied to other areas in the Virtuality Continuum. The result showed that it did not fit well in the second end of the spectra, but it could work better when the user is approaching the other end of Milgram’s Virtuality Continuum, e.g AR. A future study could reuse the same method but implement the handwriting recognition to an AR-environment.

6 CONCLUSIONS

To conclude this paper, a text input method using handwriting was constructed. The input method used the 3D hand controllers that most recent HMDs have (Oculus, HTC Vive, DayDream) to let users draw on canvases in VR. The method was designed through a brainstorming session with a VR programmer and a UX designer with experience in VR.

A qualitative and quantitative desirability user study was conducted to compare the input method with the de facto standard virtual keyboard, in this case the SteamVR keyboard was used. The test users were interviewed and asked to fill in a questionnaire. The questionnaire was based on the ISO- standard for usability, a selection of Microsoft’s desirability toolkit and a selection of Sutcliffe’s heuristics.

(12)

The results conclude that handwriting as a text input performed worse than the virtual keyboard method. The virtual keyboard method was faster and produced less errors. The time and error variables may need more research to draw stronger conclusions but the questionnaire and interview results are clear, the virtual keyboard performed better. How- ever, handwriting in VR can possibly be used as an educa- tional tool for students who want to learn how to write in a fun and engaging manner.

ACKNOWLEDGMENTS

The author would like to thank the supervisors of this thesis Niklas Björkén and Yang Zhong for their help and motivation, as well as the company Accedo for providing equipment and help during the process.

A QUESTIONNAIRE

• The input method felt effective

• The input method felt efficient

• The input method felt as a satisfiable method for the task

• The input method felt engaging

• The input method felt suitable for the task

• The input method felt worthwhile for the task

• The input method felt natural

• The input method had natural engagement

• The input method had a natural expression

• The input method had a close coordination of action and representation

• The input method had much compatibility with the task

REFERENCES

[1] Mike Alger. 2015. Visual Design Methods for VR. (2015). https:

//drive.google.com/file/d/0B19l7cJ7tVJyRkpUM0hVYmxJQ0k/view [2] Laura A. Palmer Carol M. Barnum. 2010. Tapping into Desirability in

User Experience. InUsability of Complex Information Systems: Evalua- tion of User Interaction, Brian Still Michael Albers (Ed.). CRCPress.

[3] charlesw. 2016-03-23. A .Net wrapper for tesseract-ocr. (2016-03-23).

https://github.com/charlesw/tesseract

[4] Dan C. Ciresan, Ueli Meier, and Jürgen Schmidhuber. 2012. Multi- column Deep Neural Networks for Image Classification. CoRR abs/1202.2745 (2012). http://arxiv.org/abs/1202.2745

[5] International Organization for Standardization. 2010. ISO 9241- 210:2010 Ergonomics of human-system interaction. (2010). https:

//www.iso.org/standard/52075.html

[6] Sofia Fröjdman. 2016. User experience guidelines for design of virtual reality graphical user interfaces controlled by head orientation input.

(2016).

[7] Dae-Won Kim, Jaesung Lee, Hyunki Lim, Jeongbong Seo, and Bo- Yeong Kang. 2014. Efficient dynamic time warping for 3D handwriting recognition using gyroscope equipped smartphones.Expert Systems with Applications 41, 11 (2014), 5180 – 5189. DOI:http://dx.doi.org/10.

1016/j.eswa.2014.03.011

[8] Ben McCammon. 2013. Semi Structured Interviews.

(2013). http://designresearchtechniques.com/casestudies/

semi-structured-interviews/

[9] Paul Milgram and Fumio Kishino. 1994. A Taxonomy of Mixed Reality Visual Displays.IEICE Transactions on Information Systems E77-D, 12 (Dec. 1994). http://vered.rose.utoronto.ca/people/paul_dir/IEICE94/

ieice.html

[10] Sandip Rakshit, Subhadip Basu, and Hisashi Ikeda. 2010. Recognition of Handwritten Textual Annotations using Tesseract Open Source OCR Engine for information Just In Time (iJIT).CoRR abs/1003.5893 (2010). http://arxiv.org/abs/1003.5893

[11] Alistair Sutcliffe and Brian Gault. 2004. Heuristic evaluation of virtual reality applications.Interacting with Computers 16, 4 (2004), 831 – 849.

DOI:http://dx.doi.org/10.1016/j.intcom.2004.05.001 Human Computer Interaction in Latin America.

[12] thestonefox. 2017-02-27. A productive VR toolkit for rapidly building VR solutions in Unity3D. (2017-02-27). https://github.com/thestonefox/

VRTK

[13] Øivind Due Trier, Anil K Jain, and Torfinn Taxt. 1995. Feature Extrac- tion Methods For Character Recognition - A Survey. (1995).

[14] L. Zaman, D. Shuralyov, R. J. Teather, and W. Stuerzlinger. 2012. Poster:

Evaluation of a 3D UI with different input technologies. In2012 IEEE Symposium on 3D User Interfaces (3DUI). 173–174. DOI:http://dx.doi.

org/10.1109/3DUI.2012.6184217

(13)