DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS
STOCKHOLM, SWEDEN 2019

E-wallet - designed for usability

BERCIS ARSLAN

BLENDA FRÖJDH

E-wallet - designed for usability

Bercis Arslan and Blenda Fröjdh

2019-06-07

Bachelor's Thesis

Examiner
Gerald Q. Maguire Jr.

Academic adviser
Anders Västberg

KTH Royal Institute of Technology
School of Electrical Engineering and Computer Science (EECS)
Department of Communications


Abstract

As the use of mobile payment applications (apps) and electronic wallets (e-wallets) increases, so does the demand for a improved user experience when interacting with these apps. The field of Human-Computer interaction (HCI) focuses on the design, evaluation, and implementation of interactive computing systems for human use. One aspect of HCI is usability, i.e., the quality of the interactions with a product or system.

This thesis investigates how an e-wallet can be designed to provide a high level of usability by conforming to best HCI practices and by formative evaluation using a set of usability evaluation methods.

The research process consisted of an initial literature study and development of a prototype, which was evaluated iteratively through Thinking-aloud-protocol (TAP) and a combination of performance measurements and questionnaire by a chosen test group.

By each iteration, the results of the performance measurements, as well as the verbal data improved. The most complex or difficult task, for the test subjects to perform, was, according to the results, Pay via manual input. All goals were achieved for all tasks except for the performance goal of a percentage of errors below 5%.

To conclude, it was clear that the test subjects had more trouble understanding the concept of the e-wallet rather than navigating and completing tasks. The difficulties lay in understanding how currencies were stored and how transactions happened. When developing this e-wallet we noticed that the most important issue was to make new functions and concepts familiar to the user through relating it to recognizable ideas.

Keywords

Usability, usability testing, e-wallet, mobile payments, Think-aloud protocol, performance measurements


Abstract (Swedish)

As the use of mobile payment solutions (apps) and electronic wallets (e-wallets) increases, so does the demand for an improved user experience when interacting with these apps. The field of human-computer interaction (HCI) focuses on the design, evaluation, and implementation of interactive computer systems for human use. One aspect of HCI is usability, i.e., the quality of the interactions with a product or system. This bachelor's thesis investigates how an e-wallet can be designed to provide a high level of usability by conforming to HCI best practices and by formative evaluation of the design using a set of usability evaluation methods.

The research process consisted of a literature study and the development of a prototype, which was evaluated iteratively through the Thinking-aloud protocol (TAP) together with a combination of performance measurements and a questionnaire answered by a chosen test group. With each iteration, the results of the performance measurements improved, as did the verbal data. According to the results, the most complex or difficult task for the test subjects to perform was Pay via manual input. All goals were achieved for all tasks except the performance goal of an error percentage below 5%.

In conclusion, it was clear that the test subjects found it harder to understand the concept of the e-wallet than to navigate and complete the tasks. The difficulties lay in understanding how currencies were stored and how transactions took place. While developing this e-wallet, we noticed that the most important task was to make new functions and concepts understandable to the user by relating them to recognizable ideas.

Keywords

Usability, usability testing, e-wallet, mobile payments, think-aloud protocol, performance measurements


Acknowledgements

We would like to thank our examiner Gerald Q. Maguire Jr. at KTH Royal Institute of Technology for his support and much-needed guidance during our project, and for his detailed and useful feedback, given whenever needed.

Secondly, we would like to thank Henrik Gradin at Centiglobe for giving us the opportunity to do this project.

Stockholm, June 2019


Authors

Bercis Arslan, bercis@kth.se
Blenda Fröjdh, blendaf@kth.se
KTH Royal Institute of Technology

Place for Project

Centiglobe

Stockholm, Sweden

Examiner

Gerald Q. Maguire Jr.

KTH Royal Institute of Technology

Supervisor

Anders Västberg


Contents

1 Introduction
  1.1 Background
  1.2 Problem
  1.3 Purpose
  1.4 Goal
  1.5 Benefits, Ethics, and Sustainability
  1.6 Methodology
  1.7 Stakeholders
  1.8 Requirements from Centiglobe
  1.9 Delimitations
  1.10 Outline

2 Background
  2.1 Human computer interaction
    2.1.1 Usability
    2.1.2 Usability evaluation
    2.1.3 Analytical modeling
    2.1.4 Inspection
    2.1.5 Simulation
    2.1.6 Inquiry
    2.1.7 Testing
    2.1.8 Measurement of usability
    2.1.9 Prototypes
    2.1.10 Design principles, guidelines, and theories
    2.1.11 Task flow
  2.2 E-wallet
    2.2.1 E-wallets on the market
    2.2.2 Transaction and payment methods
    2.2.3 Task flow of existing E-wallets
  2.3 Related Work
    2.3.1 User experience of Bitcoin wallets of usability and security
    2.3.2 Designing mobile wallets
    2.3.3 User experience of a mobile app
    2.3.4 Acceptance of mobile wallets in Oman
  2.4 Summary

3 Method
  3.1 Research process
  3.2 Research paradigm
  3.3 Data collection
    3.3.1 Test group
    3.3.2 Consent form
    3.3.3 Sampling
    3.3.4 Sample size
    3.3.5 Target population
  3.4 Experimental design/planned measurements
    3.4.1 Performance measurement
    3.4.2 Thinking-aloud protocol
    3.4.3 Test environment
    3.4.4 Software and Hardware to be used
  3.5 Assessing reliability and validity of the method and data collected
    3.5.1 Reliability
    3.5.2 Validity
  3.6 Planned Data Analysis
  3.7 Evaluation framework
    3.7.1 Collection of data
    3.7.2 Evaluation of data

4 The Work
  4.1 Tasks to be analyzed
  4.2 Test group
  4.3 Testing
  4.4 Data collection and analysis
  4.5 The prototype

  5.1 Results iteration 1
    5.1.1 Changes after iteration 1
  5.2 Results iteration 2
    5.2.1 Changes after iteration 2
  5.3 Results iteration 3
    5.3.1 Changes after iteration 3
  5.4 Results iteration 4
  5.5 Reliability Analysis
  5.6 Validity Analysis
  5.7 Discussion

6 Conclusions and Future work
  6.1 Conclusion
  6.2 Limitations
  6.3 Future Work
  6.4 Reflection

References

A First Appendix
B Second Appendix
C Third Appendix
D Fourth Appendix


List of figures

2.1 Example of the task flow of buying milk
2.2 Task flow of Paypal
3.1 SEQ
5.1 Example of changes made to the prototype between iteration 1 and iteration 2
5.2 Example of changes made on the prototype between iteration 2 and iteration 3
5.3 Example of changes made on the prototype between iteration 3 and iteration 4
A.1 Task flow of Swish
A.2 Task flow of Paypal
C.1 Task instructions for iteration 1
C.2 Task instructions for iteration 2


List of tables

3.1 The performance measurements
3.2 The performance goals
4.1 The test subjects and their perceived experience using mobile payment apps and experience of software development
5.1 Median value of performance measurements iteration 1
5.2 Median value of performance measurements iteration 2
5.3 Median value of performance measurements iteration 3
5.4 Median value of performance measurements iteration 4


List of acronyms and abbreviations

App Application

E-wallet Electronic Wallet

HCI Human-Computer Interaction

High-fi High Fidelity

ICF Informed consent form

ICT Information and Communication Technology

Low-fi Low fidelity

Nav Navigation

NFC Near Field Communication

QR Quick Response

RPA Referring Phrase analysis

SDGs Sustainable Development Goals

SEQ Single Ease Question

TAP Think-aloud protocol

UE Usability evaluation

UEM Usability evaluation method

UI User Interface


1 Introduction

The demand for an improved user experience when interacting with computers has grown and developed into an important multidisciplinary field named Human-Computer Interaction (HCI). This field draws on computer science, psychology, ergonomics, and many other subject areas [6]. Today, electronic payment systems deployed as electronic wallets (e-wallets) are becoming more common [46]. In this thesis, we investigate how an e-wallet can be designed to make the user's experience as pleasing as possible, by applying HCI best practices and evaluation methods.

1.1 Background

The shift from analogue technology to digital technology, better known as "The Digital Revolution", has had an impact on our lives and enabled new possibilities in different activities of society. One of these activities is electronic purchasing and the digitization of money¹. This digitization introduces the need for e-wallets as the next step in the digital revolution, as part of the transformation from a physical wallet or plastic-card payments to an all-electronic payment system [43].

An e-wallet is defined as a digital system that enables a user to perform electronic transactions, including, but not limited to, purchasing from, for example, a store; transferring money; and receiving money. Not only can monetary value be stored, but it is also possible to store ID documents, driver's licences, and other information that would normally be stored as cards in a wallet [26].

There are two important components of an e-wallet: the software and the information. The information is stored in a database containing names, credit cards, and payment methods. The software component handles the personal information of the user and provides security through encryption of data. Furthermore, it is possible to transfer money via several techniques, such as Quick Response (QR) codes, Near Field Communication (NFC), Bluetooth, etc. [39, 14].

¹ Throughout this thesis we will use the term money in its broadest sense, i.e., as a record that has some value that can be used in an exchange.
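To make the two components concrete, the toy sketch below models the information component as a simple data record and the software component as functions that guard and encrypt a transfer. All names, fields, and the stand-in "encryption" are hypothetical illustrations, not a description of any real e-wallet:

```python
# Illustrative sketch only: a toy model of an e-wallet's two components.
# Real e-wallets use vetted cryptography, not this stand-in Caesar shift.
from dataclasses import dataclass, field

@dataclass
class WalletInfo:
    """The information component: what the database stores per user."""
    name: str
    cards: list = field(default_factory=list)     # stored cards / ID documents
    balances: dict = field(default_factory=dict)  # currency -> amount

def encrypt(data: str, key: int = 7) -> str:
    """Stand-in for the software component's data encryption."""
    return "".join(chr((ord(c) + key) % 256) for c in data)

def transfer(sender: WalletInfo, receiver: WalletInfo,
             amount: float, currency: str, channel: str = "QR") -> str:
    """Move value between wallets; channel could be 'QR', 'NFC', 'Bluetooth'."""
    if sender.balances.get(currency, 0) < amount:
        raise ValueError("insufficient funds")
    sender.balances[currency] -= amount
    receiver.balances[currency] = receiver.balances.get(currency, 0) + amount
    # The software component encrypts the transaction record before storing it.
    return encrypt(f"{sender.name}->{receiver.name}:{amount} {currency} via {channel}")

alice = WalletInfo("Alice", balances={"SEK": 100.0})
bob = WalletInfo("Bob")
receipt = transfer(alice, bob, 40.0, "SEK")
print(alice.balances, bob.balances)
```

In this sketch the transfer channel (QR, NFC, Bluetooth) only labels how the payment request was carried; the balance update and encryption are handled by the software component regardless of channel.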


At the time of writing this thesis, there are a number of e-wallets on the market, each of which is used (some more frequently than others). This raises the importance of examining how users use these applications (apps) and whether or not these apps are designed in a way that makes it easy for people to interact with them. This is where the field of HCI comes into play.

One aspect of HCI is the usability of a system. Usability refers to the ease of access or ease of use of a product, and it is the features and context of the use that determine an app’s level of usability [62].

Centiglobe is a trading company [9] that wants to develop a mobile payment app (specifically an e-wallet) which enables payment with cryptocurrency as well as international payments. Centiglobe has tasked us with researching how a mobile payment app can be designed to conform to HCI practices, specifically usability.

1.2 Problem

As e-wallets are a fairly new invention, much research is still needed on the usability (user experience) of these apps. This thesis will investigate how an e-wallet can be designed to provide a high level of usability by conforming to best HCI practices (from a usability perspective), as well as try to answer user experience questions that can arise when designing new and innovative apps that use unconventional technology. The question we want to answer is:

• How should a mobile payment app be designed in order to improve its usability, with regards to a set of pre-defined usability criteria?

1.3 Purpose

The purpose of this bachelor's thesis project is to improve the usability of e-wallets and ultimately contribute to an improved and optimized user experience when using e-wallets. The overall aim of this thesis is to present and discuss the findings from user testing of a series of prototypes of a mobile payment app.


Through iterative user testing, we will improve the prototype from a usability standpoint. The resulting prototype will be the basis for a finished mobile payment app, to be developed by Centiglobe.

1.4 Goal

The end goal of the project is to produce a prototype of a mobile payment app with a design based on HCI principles. This prototype will be evaluated through usability evaluation. This goal has been divided into the following six sub-goals:

• Gather information about previous usability research related to mobile payment apps, existing mobile apps, and usability evaluation practices. This will be done by conducting a literature study.

• Choose an evaluation method for user testing.

• Define measures through which to evaluate the usability of the design.

• Create a series of prototypes of a mobile payment app.

• Conduct iterative user-based usability testing on these prototypes.

• Iteratively improve the prototypes based on the evaluation of the previous prototypes.

1.5 Benefits, Ethics, and Sustainability

Sustainable development means development that meets the needs of the present without compromising the ability of future generations to meet their own needs [8]. It includes three aspects: ecological, economic, and social sustainability. There are a number of Sustainable Development Goals (SDGs), defined by the United Nations (UN), aiming to increase sustainable development globally [38].

The result of this thesis might contribute to the design of mobile payment apps from a usability perspective. Hopefully, this improved design will increase the use of such apps. Increased use of mobile payment apps has some advantages from an ecological standpoint, as it may lessen the need for physical cards and wallets (the production of which has an impact on the environment). On the other hand, mobile payment apps may increase the use of electricity, smartphones, etc. The ecological benefit of this thesis therefore depends on whether or not the reduced use of cards (and paper) makes up for the increased use of electricity and the manufacturing required for e-wallets.

Should the result of this thesis contribute to the increased use of mobile payment apps, companies developing apps in general, and mobile payment apps in particular, could benefit economically, as they could improve the usability of their products based on the results of this thesis. An improvement in an app's usability could lead to greater customer satisfaction and a larger customer base. This, in turn, can lead to a higher rate of return and more loyal customers, providing economic sustainability to these companies.

E-wallets may contribute to both social and economic sustainability for society. Because of high mobile penetration, especially in rural and poorer areas, a large number of people can gain access to mobile financial services [24, pp. 12-14] which they may not have had access to previously because of their geographical location. Access to financial services reduces vulnerability to economic, social, and environmental shocks [51], thus increasing the resilience of the poor and contributing to SDG number 1, "No Poverty".

Furthermore, increased access to financial services through mobile payments contributes to the development of new businesses, targeting SDG number eight, "Decent Work and Economic Growth", nine, "Industry, Innovation and Infrastructure", and ten, "Reduce inequality within and among countries" [51].

However, there are some possible downsides to mobile payment. Firstly, many of the available mobile apps rely on an internet connection to function; only some types of transactions can be made without one. This limits the use of these apps to people with funds for, and access to, an internet connection or mobile data, and disables the use of these apps during internet blackouts. However, some existing and in-development mobile apps are not reliant on internet connectivity, as they offer an offline mode; for example, by using tokens that can be used offline but must be downloaded online [47, 57]. There are also a number of security risks with e-wallets. Since phones are vulnerable to malware and hacking, there is a risk of sensitive financial information ending up in the wrong hands when using mobile payment apps [60].

1.6 Methodology

When conducting a degree project, the choice and use of methods and methodologies are important. Methods and methodologies are tools through which to assure the quality of the research. They also help guide the work and help ensure proper and well-founded results. There are a variety of methods and methodologies from which to choose, and the choice must match the research actually conducted in order to have any effect on the work [23]. There are two categories of methods: quantitative and qualitative. A quantitative method aims to establish a phenomenon through objective measurements and the statistical, mathematical, or numerical analysis of large sets of data [33]. In contrast, a qualitative method is concerned with gathering non-numerical data while studying a phenomenon or artefact in order to create theories or products by examining the environment [23].

Within these categories, there are different research methods from which to choose. These methods determine how the research process is conducted. Two examples are analytical research methods and empirical research methods. Using an analytical research method, pre-planned hypotheses are tested based on existing knowledge and findings. Using an empirical research method, hypotheses are instead tested through experiments, observations, and experiences [23].

The methods used in this degree project will be both quantitative and qualitative. Some of the data will be quantitative (e.g., 5 seconds to do a given task, 4 errors when doing the task, 3 negative comments), while some of the data will be qualitative (e.g., the content of the users' comments). However, as the test group is small, the data is not expected to have statistical significance. To test the usability and gather knowledge regarding the design, tests will be conducted. Since it is through testing and observations that data will be gathered, the research method used is an empirical research method.

To ensure sufficient background knowledge in the area, a literature study will be conducted. The literature study will explore existing usability research about mobile payment apps, the design of current mobile payment apps, and HCI practices. Based on this literature study and grounded theory in usability, we will choose an evaluation method and a set of usability measurements. A prototype of the app's design will be created. The design will be evaluated through iterative testing, where test subjects will comment on the usability of the design and performance measurements will be recorded.

1.7 Stakeholders

The stakeholders for this degree project are first and foremost Centiglobe (the company that will market, sell, and profit from the end result, a potential design of an e-wallet mobile app). They will be directly affected by this thesis since the design of the app can directly or indirectly potentially affect their profits and future development.

Furthermore, the finished app could be sold to other companies as a white-label app, enabling these companies to adopt it and adapt it to realize their own products. The finished app is intended to be used together with other existing products and companies, such as other mobile payment apps and banks. These potential customers will be indirectly affected by the design of the app and therefore by the outcome of this project.

Further stakeholders also include potential users of the app. The app’s design is directly affected by the outcome of this thesis project. The target users for the app are expected to have all sorts of different backgrounds, ages, and nationalities. However, the early adopters will most likely be younger, technically skilled persons.


1.8 Requirements from Centiglobe

The requirements on the app from Centiglobe are that it has to be suited for international use, have the ability to perform transactions to other users (not limited to users of the same app), and support several currencies (including cryptocurrencies).

1.9 Delimitations

We will only use a small test group, as we do not have access to a large number of people. Therefore, our results are not statistically reliable and do not necessarily apply to our demographic of target users outside of our test group. The test group used is not an accurate representation of the target users of the final product, so our results will only be applicable to the test group used. Furthermore, due to limited resources, the same test group will be used for all iterations. Because of time constraints, we will only test some specific functions and will not present a fully developed app. This means that a high-fidelity (high-fi) prototype will not be presented; we will only draw conclusions regarding the tested prototypes.

The usability of the prototypes will only be tested based on a set of pre-defined measurements; hence, there could be some aspects of usability that we could not take into consideration, given our limitations in time and other resources. Furthermore, the focus of the usability testing is discovering major problems with the design of the prototype. Therefore, no major effort will be put into finding minor usability problems with the design.

A widely discussed topic in conjunction with e-wallets is the security level of these apps. Studies show that users experience a lack of privacy and confidentiality in transaction information and are therefore reluctant to perform online transactions [54]. However, this will not be discussed further in our thesis, since it is outside our scope. On the other hand, if the design or functionality of an e-wallet app can affect the user's sense of security, this will be highly relevant to discuss.


1.10 Outline

The following chapter of this thesis contains an in-depth theoretical background explaining what HCI is and what methods can be used when evaluating apps. Different e-wallets on the market, and the technologies and task flows that exist, will also be presented. The method and methodology used in our study are presented in Chapter 3. This chapter covers how the research process and data collection were performed. The validity and reliability of the method are also discussed. This is followed by a description of the implementation and results in Chapters 4 and 5 (respectively). Finally, a discussion of our work will be given and conclusions will be presented in Chapter 6.


2 Background

In this chapter, we present a detailed description of the background areas relevant to answering our thesis questions involving HCI theories and practices as well as describe how e-wallets work and inspect some of the apps currently on the market.

2.1 Human computer interaction

HCI is the interdisciplinary study of the interactions between humans (that is, users) and information technology design [58]. HCI is about designing interfaces in a human-centred way, taking account of human abilities and preferences. It ensures that systems are accessible, usable, and acceptable [6]. It encompasses several areas of research, such as computer science, cognitive science, and human factors engineering [58].

2.1.1 Usability

There are several ways to test whether a design is "good" from an HCI standpoint. One such way is testing the usability of the system design. There is no single set definition of usability. However, the formal definition of usability from the ISO 9241-11 standard is "The extent to which a product can be used by specified users to achieve specified goals with efficiency, effectiveness, and satisfaction in a specified context of use" [28, p. 6]. A shorter definition is that usability is the quality of the interactions with a product or system [6, p. 77]. As the definition of usability can differ, so can the parameters that define it. Some models include parameters such as how safe the system is to operate in the context in which it will be used [6, p. 81]. Others define the parameters as accessibility, clarity, learnability, and feedback [12]. In the ISO model, however, the parameters are those stated in the definition itself, i.e., efficiency, effectiveness, and satisfaction.

In keeping with the definitions of the three parameters as given by ISO 9241-11 [28, pp. 9-12], efficiency is the resources used in relation to the results achieved, effectiveness is the accuracy and completeness with which users achieve specified goals, and satisfaction is the extent to which the user's physical, cognitive, and emotional responses that result from the use of a system, product, or service meet the user's needs and expectations.

2.1.2 Usability evaluation

When evaluating the usability of a system, different usability evaluation methods (UEMs) are used. These can be divided into different categories. Ivory and Hearst (2001) propose the following five classes and definitions [29]:

Analytical modelling

an evaluator employs user and interface models to generate usability predictions.

Inspection

an evaluator uses a set of criteria or heuristics to identify potential usability problems in an interface.

Inquiry

users provide feedback on an interface via interviews, surveys, and the like.

Simulation

an evaluator employs user and interface models to mimic a user interacting with an interface and reports the results of this interaction.

Testing

an evaluator observes users interacting with an interface to determine usability problems.

Depending on whether user participation is required or whether expert evaluators are employed, a UEM is either user-based or expert-based. Expert-based UEMs all involve an expert in the field of usability or the design of interactive systems. In contrast, user-based UEMs employ a group of people, preferably representative of the target users, to evaluate usability [34, p. 256]. UEMs from the analytical modelling, inspection, and simulation classes can all be expert-based. However, testing and inquiry UEMs are dependent on users and are therefore user-based methods [29]. Furthermore, a UEM can be either summative or formative in nature. Each of these two types of testing is appropriate for a different stage in a product's development and purpose of evaluation [34, p. 260]. Formative evaluation is best suited for the early stages of development. In this case, testing is a part of the iterative design process, used to explore which designs are or are not usable. Formative testing is exploratory in nature, and the focus is on qualitative feedback and moderator observation. In contrast, summative evaluation involves measuring the usability of a specific design choice; the focus is on metrics and quantitative measurements [48]. In short, formative testing is for discovering what needs to be improved, while summative testing explores whether the improvements were successful. For example, according to Ivory and Hearst, UEMs in the testing, inspection, and inquiry classes are best used for evaluations that are formative in nature, while analytical modelling and simulation methods are summative [29].

2.1.3 Analytical modeling

Based on some representation or model of the UI and/or the user, analytical modelling methods enable an evaluator to predict the usability of the UI, i.e., the user’s performance when interacting with the UI [30].

One method classified as analytical modelling is GOMS analysis. This method, developed by David Kieras, is based upon evaluating Goals, Operators, Methods, and Selection rules [32]. A GOMS model specifies a set of methods that are used to accomplish specific goals. The methods are composed of operators: steps that a user performs, each of which is assigned an execution time. If several methods can be used to achieve a goal, then selection rules are used to decide on the correct method. Based on this model, predictions can be made of how users will use the modelled system.
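As a sketch of how a GOMS-style prediction works, the following assigns each operator a hypothetical execution time (loosely inspired by Keystroke-Level Model values), sums the operator times per method, and applies a simple selection rule. The tasks, methods, and timings are invented for illustration:

```python
# Toy GOMS-style prediction. Operator times below are hypothetical,
# loosely inspired by Keystroke-Level Model values; they are not from [32].

# Each operator is a primitive user action with an assumed execution time (s).
OPERATOR_TIMES = {
    "point": 1.10,      # move pointer/finger to a target
    "tap": 0.20,        # tap a button
    "keystroke": 0.28,  # type one character
    "mental": 1.35,     # mentally prepare for the next step
}

def method_time(operators):
    """Predicted execution time of a method = sum of its operators' times."""
    return sum(OPERATOR_TIMES[op] for op in operators)

def select_method(methods):
    """A simple selection rule: assume the user picks the fastest known method."""
    return min(methods.items(), key=lambda kv: method_time(kv[1]))

# Goal: pay 50 SEK. Two hypothetical methods in an e-wallet prototype.
methods = {
    "scan_qr": ["mental", "point", "tap", "mental", "tap"],
    "manual_input": ["mental", "point", "tap"] + ["keystroke"] * 8 + ["mental", "tap"],
}

for name, ops in methods.items():
    print(f"{name}: {method_time(ops):.2f} s")

best, _ = select_method(methods)
print("predicted choice:", best)
```

The sketch already hints at why manual input tends to score worse in such models: every extra keystroke adds time and an extra opportunity for error.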

2.1.4 Inspection

Inspection includes evaluation methods where experts examine the usability of an interface based on a set of guidelines, ranging from very detailed to broad descriptions [30]. However, studies show that inspection methods are not always useful, because designers can be biased towards aesthetically pleasing interfaces instead of measuring the efficiency of the design [52]. Examples of inspection methods are cognitive walkthrough, heuristic evaluation, and guideline review.

2.1.5 Simulation

With simulation, it is possible to mimic a user interacting with an interface with the help of computer programs. The simulations are done in a controlled environment, where the values of the parameters can be chosen for each simulation. This provides the designer with quantitative data that can be easy to interpret.
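A minimal sketch of the idea, with invented parameters: a simulated user steps through a task, occasionally making an error and redoing the step, and repeated runs yield quantitative predictions such as mean completion time:

```python
import random

# Toy simulation sketch (all parameters hypothetical): a simulated user works
# through the steps of a task; each step takes a fixed time, and with some
# probability the user makes an error and must redo the step.

def simulate_task(n_steps=5, step_time=2.0, error_rate=0.1, rng=None):
    """Return total completion time (s) and number of errors for one run."""
    rng = rng or random.Random()
    time_spent, errors = 0.0, 0
    for _ in range(n_steps):
        time_spent += step_time
        while rng.random() < error_rate:  # redo the step after each error
            errors += 1
            time_spent += step_time
    return time_spent, errors

rng = random.Random(42)  # fixed seed so runs are reproducible
runs = [simulate_task(rng=rng) for _ in range(1000)]
mean_time = sum(t for t, _ in runs) / len(runs)
mean_errors = sum(e for _, e in runs) / len(runs)
print(f"mean completion time: {mean_time:.2f} s, mean errors: {mean_errors:.2f}")
```

Because the environment is fully controlled, the designer can rerun the simulation with, say, a lower error rate to predict how a design change would affect completion time.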

2.1.6 Inquiry

Methods categorized as inquiry include field observations, user feedback, interviews, and questionnaires. When conducting evaluations using these methods, the goal is not to test performance but rather to collect data on opinions. These methods are especially relevant in the early stages of product development, but also after a product has been released, in order to collect feedback, as can be done, for example, through field observations [30].

2.1.7 Testing

Testing methods are the fundamental way of knowing how humans are going to interact with a certain interface, since the participants are those who will test the product. The participants use a prototype or a system with the goal of completing a task given by the tester, who records the results and acts on them.

Testing methods include the think-aloud protocol (TAP), where the participant talks during testing; performance measurement, where the tester records usage data during the test; and coaching, where the participant can ask the tester questions about the interface.


When using TAP for usability evaluation, the comments can be audiotaped and then transcribed. The transcription can be analyzed in multiple ways. One way is to analyze it through three progressive steps: referring phrase analysis (RPA), assertional analysis, and script analysis. In RPA, each phrase is coded with the names of concepts, based on the category of words contained in the phrase. Concepts are categories based on the meaning of a word. For example, one concept can be "value", referring to all words that express a rating or scaling of usefulness, importance, or worth [21].
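The RPA coding step can be sketched as keyword matching against a concept lexicon. The concepts and keywords below are hypothetical; a real analysis would derive them from the transcripts themselves:

```python
# Illustrative sketch of the RPA coding step (the concept lexicon is made up;
# it is not the coding scheme used in [21] or in this thesis).

CONCEPTS = {
    "value": {"useful", "useless", "important", "worth", "good", "bad"},
    "navigation": {"back", "menu", "button", "screen", "find"},
    "currency": {"money", "currency", "balance", "euro", "krona"},
}

def code_phrase(phrase):
    """Return the set of concept names whose keywords occur in the phrase."""
    words = set(phrase.lower().replace(".", "").replace(",", "").split())
    return {concept for concept, kws in CONCEPTS.items() if words & kws}

transcript = [
    "I can't find the back button.",
    "This balance screen is really useful.",
]
for phrase in transcript:
    print(phrase, "->", sorted(code_phrase(phrase)))
```

Tallying the coded phrases across a session then shows which concepts (e.g., navigation versus currency handling) dominate the verbal data.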

2.1.8 Measurement of usability

When measuring usability through usability evaluation, both performance and subjective measures can be used. Performance measures are quantitative measures that are observed. Examples include the time it takes for a user to learn how to perform a specific function, the rate of errors that occur during use, the speed of task performance, or the number of observed signs of frustration. Subjective measures are instead based on the subjective opinions of the test subjects. They can be both qualitative and quantitative. This data can be collected in the form of user comments or a user's rating on a scale, and it can be gathered in a number of different ways, for example through surveys or by a test moderator observing the test subject as the subject performs certain tasks [17, pp. 184-188].

When evaluating the three parameters for usability, as defined by ISO 9241-11:2018 [28], some metrics of effectiveness are percentage of goals achieved, functions learned, and number of errors. Some metrics for efficiency are the time to complete a task, learning time, and time spent correcting errors. Lastly, some metrics for the measurement of satisfaction include ratings for satisfaction, ease of learning, and error handling [30, p. 7].

Measuring satisfaction using the rating of satisfaction can be done through different types of questionnaires. These questionnaires can be distributed at the end of a test, measuring the satisfaction of the entire system or they can be distributed at the end of each task, measuring the satisfaction of that specific task.


One such post-task questionnaire is the Single Ease Question (SEQ). It consists of a single question asking how easy or difficult the participant found a task to perform. The respondent rates the difficulty on a scale from one (”Very difficult”) to seven (”Very easy”) [50]. A user typically becomes frustrated if a single task takes more than about one minute to perform; if a system requires more than one minute per task, it is likely that the site will be abandoned [44].

According to MeasuringU [35], the average result when testing the difficulty of a task is between 4.8 and 5.1 on a 7 level Likert scale.

2.1.9 Prototypes

A prototype of a product is an early sample or model of that product. Prototypes are created to test a concept or a process [7]. When designing and evaluating interactive products, prototyping is heavily relied on. Producing a prototype of a future product provides the opportunity to experiment with alternative designs, fix any problems that occur, and provide a conceptual idea of the product that can be used during testing. Since a prototype is relatively easy to change, the design can quickly be adjusted according to the results of testing. Depending on the level of detail, a prototype has a different degree of fidelity in relation to the final product [16], ranging from low fidelity (low-fi) prototypes to high fidelity (high-fi) prototypes.

2.1.10 Design principles, guidelines, and theories

There are several different guidelines and principles for the development of “good” interactive products from a human interaction design standpoint. Some of these principles are given in Ben Shneiderman’s eight golden rules of interface design. The parts of a design that align with these principles are its strengths, while those that violate them are its weaknesses. These rules [17] are:

• Strive for consistency,

• Enable frequent users to use shortcuts,

• Offer informative feedback,

• Design dialogues to yield closure,

• Offer simple error handling,

• Permit easy reversal of actions,

• Support internal locus of control, and

• Reduce short term memory load.

Additionally, there exist several other well known rules of thumb, such as Don Norman’s six design principles and Jakob Nielsen’s 10 usability heuristics for user interface design. For example, Nielsen’s 10 heuristics are defined as [40]:

• Visibility of system status

Give the user appropriate feedback, in reasonable time.

• Match between system and the real world

Use language that the user understands instead of system-oriented terms.

• User control and freedom

Support undo and redo; a user should not have to go through several steps when a mistake has occurred.

• Consistency and standards

Follow conventions; thus, do not use ”bye” and ”exit” interchangeably, but rather stick to one of them.

• Error prevention

Try not to put the user in situations prone to error. Present users with a confirmation option before committing an action.

• Recognition rather than recall

Minimize the users’ memory load. Make actions visible so the user does not have to remember them.


• Flexibility and efficiency of use

Customize the interface for both the novice and the experienced user, for example with shortcuts or actions that speed up the interaction for expert users.

• Aesthetic and minimalist design

Do not present irrelevant information since it competes with relevant information.

• Help users recognize, diagnose, and recover from errors

Use error messages that are easy to understand and that suggest solutions to the problem.

• Help and documentation

The system should preferably be usable without documentation but, if documentation exists, it should be easy to find, be concrete, and not be too large.

Guidelines similar to those named above can in broad terms be summarized as: strive for consistency, give the user control, and be aware of a user’s limited memory.

One way to conform to the principles stated above is to create local rules for each design. As there exist many guidelines, which sometimes can be contradictory, it is very important to specify these rules beforehand.

An example of a local rule would be that all ”back” buttons in an interface have to be red and have a width of at least 5 percent of the window width.

2.1.11 Task flow

Before developing an app, it is important to have a clear mind map of where each button press, swipe, etc. will take the user, to ensure that each interaction serves an important purpose for the user. This can be done by creating task flows. A task flow is the sequence of steps required to perform a certain task. It is often represented by a flow chart showing the relations between the steps [5]. See the example in Figure 2.1.


Figure 2.1: Example of the task flow of buying milk
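As an illustration (not part of the thesis's toolchain), a task flow such as the one in Figure 2.1 can be modelled as a directed graph, where each step points to its successors. The step names below are hypothetical examples for a "buy milk" flow.

```python
# Illustrative sketch only: a task flow modelled as a directed graph.
# The step names are hypothetical examples for a "buy milk" flow.
task_flow = {
    "open app": ["select store"],
    "select store": ["add milk to cart"],
    "add milk to cart": ["check out"],
    "check out": ["confirm payment"],
    "confirm payment": [],  # terminal step
}

def steps_to_complete(flow, start):
    """Count the interactions needed from `start` to a terminal step
    (assuming a linear flow with one outgoing edge per step)."""
    count = 0
    current = start
    while flow[current]:
        current = flow[current][0]
        count += 1
    return count

print(steps_to_complete(task_flow, "open app"))  # 4 steps to complete the task
```

Counting the path length like this gives a quick measure of how many interactions a task requires, which is useful when comparing alternative designs of the same task flow.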

2.2 E-wallet

E-wallets are a fairly new invention that in recent years have become increasingly popular as we enter the digitization era, in which a transition from physical money and payments to electronic money and cryptocurrencies is currently taking place. The markets for these electronic payment methods have a promising future, but their success is uncertain due to potential new technological inventions [15]. An e-wallet, or digital wallet, transforms the way people purchase and pay for things by moving the means of payment to apps on mobile phones [26]. All information that is stored in a wallet is encrypted through the use of public and private key-pairs to ensure that payments and other data are handled securely.

There are wallets for conventional currencies as well as for cryptocurrencies such as Bitcoin, which require the same functionality, i.e., the ability to perform transactions, check the balance, etc. As there exist various e-wallets, and the goal of this thesis is to investigate how one can be designed from a usability standpoint, it is desirable to present and compare e-wallets currently on the market. This comparison is done in the following subsections.


2.2.1 E-wallets on the market

It is important to be aware of existing apps, what functions are present, and to understand their design when developing an e-wallet yourself. The agreements between the e-wallet app and banks or other involved entities can limit the user group for the app. Some e-wallets are location specific and cannot be used overseas, because transactions are only enabled under country specific requirements, and some are only available to users who are customers of a defined set of banks. Nonetheless, a desire for global use is not unusual.

Some examples of e-wallets are: the Swedish mobile app Swish [55], multinational Apple Pay [4], EcoCash app for the Zimbabwean market [36], and the Chinese apps WeChat Pay [61] and Alipay [2].

2.2.2 Transaction and payment methods

There are a few communication styles to choose from when developing an e-wallet app. The selected communication style refers to the technical functionality chosen for when a user of an e-wallet wants to transfer money to another account. Different wallets have adopted different technical functionality, some of which is presented in this subsection.

Swish is a smartphone app with a minimal set of functions that enable receiving and transferring money through a phone number connected to a bank account. Both users have to have their phone numbers registered with Swish through their bank; the bank then performs a real-time money transfer between the associated accounts of these users [56]. When the party sending money has confirmed the transaction, the money is transferred. The user also needs an electronic identification app that is only available to Swedish citizens. Swish has also been developed for companies. It has the possibility to provide payment information through a QR code [55] (a two-dimensional black and white code that can be read by machines). A camera app can be used to decode the information encoded in the QR code as cleartext [45].


Apple Pay is a bit different from Swish because its main feature is electronic purchases using any of the Apple products; it does not support general money transfers. It is currently only possible to transfer money in the USA, through a feature named Apple Pay Cash. Apple Pay uses wireless payment via NFC. NFC is an umbrella term for techniques using wireless data transfer over distances shorter than 10 cm. Through peer-to-peer communication, smartphones can communicate with anything that has an NFC interface; thus they can receive information from the payment terminal, and this information enables purchases.

With the EcoCash app, a user can transfer money or make deposits and withdrawals of money [1]. The wallet links a user’s bank account and phone number. To transfer money, the user needs the receiver’s phone number. EcoCash is in partnership with Cassava Remit, which together enable money transfers from the UK to Zimbabwe.

WeChat Pay is a Chinese messaging and social media app with the added functionality of payment services. The app supports multiple payment methods, including QR code scanning for purchases and smoother transactions. In-app payments, i.e., payments made from within the app, are also available [37]. With in-app payments, the user can choose the payment method directly inside the app instead of being redirected to another application or web page.

Alipay is another popular e-wallet app that serves the Chinese market, where transactions can be made internationally by Chinese customers. Similar to WeChat Pay, this app supports QR code scanning for local in-store payments [11] and other services such as bank account management and peer-to-peer transfer (money transfer directly between two users, without the need for a bank).

2.2.3 Task flow of existing E-wallets

Many of the e-wallets on the market have similar functions. However, the task flow within these apps differs. A functional requirement for the app to be designed is, amongst other things, to facilitate international transactions as well as the exchange of currency. For this project, the most relevant mobile payment apps to consider are therefore apps actively used in different countries, apps that facilitate international payments, and apps that facilitate payment with cryptocurrency. Apps with different communication methods, for example NFC or QR codes, are also relevant. As noted earlier, the task flow of an app can be presented in the form of a flow chart. The apps whose task flow and/or overall design is considered in this report are Swish, Paypal, Alipay, WeChat, and Cassava Remit. The flowcharts for Swish and Paypal are presented in Appendix A; the flowchart for Paypal is also shown in Figure 2.2. These five apps represent a small percentage of the mobile payment apps available, namely less than 2 percent: at least 100 e-wallets existed in 2016 according to the list made by [59], and [19] listed the 70 best cryptocurrency wallets of 2019, together at least 200 e-wallets, although it is possible that even more exist.

2.3 Related Work

This section presents previous research done on the subject of e-wallets designed for usability.

2.3.1 User experience of Bitcoin wallets: usability and security

In Abdulla Alshamisi and Peter Andras’s paper ”User perception of Bitcoin usability and security across novice users” [3], they examine how digital payment systems with cryptocurrencies (such as Bitcoin) influence new users in comparison to credit card payments. They used surveys to collect data about users’ perceptions and concluded that the users responded more positively to conventional credit card payments, which influenced their negative perception of e-wallet security. Ultimately, they say that a deeper understanding of and education about digital payment systems is needed, together with improved user-centred designs, in order to elevate each user’s experience and thus increase acceptance. Alshamisi and Andras tested and compared the usability of payment systems with cryptocurrencies and credit card payments using subjective measures. Their test subjects completed a survey, responding to statements regarding usability. The respondents could grade to what level they agreed with each statement on a Likert scale.

2.3.2 Designing mobile wallets

Mia Olsen, Jonas Hedman, and Ravi Vatrapu, in their paper ”Designing digital payment artefacts” [42], describe their scientific inquiry into a mobile wallet. Four different user groups were identified, and mobile wallets were then developed and evaluated from the perspective of their design. They presented sketches and low-fi prototypes to user groups, who were interviewed in order to collect data. They concluded that there are two types of properties that are relevant when designing a mobile wallet: (1) the functional properties and (2) the design properties, and that evaluation criteria needed to be expanded in order to take everyday life contexts into consideration.

2.3.3 User experience of a mobile app

Fanny Chan and Sofia Johansson, in their bachelor’s thesis ”Evaluation of user experience on a mobile application” [10], investigated whether any design improvements could be made to Shownight’s mobile app in order to increase the quality of the user’s experience. The evaluation method they used consisted of a combination of interviews with the user group and performance measurements in order to collect data. They were able to suggest improvements to the design but left the redesign of the app as future work.

2.3.4 Acceptance of mobile wallets in Oman

Sujeet Kumar Sharma, Sachin Kumar Mangla, Sunil Luthra, and Zahran Al-Salti, in their article ”Mobile wallet inhibitors: Developing a comprehensive theory using an integrated model” [53], reflect on what is hindering the acceptance of mobile wallets in Oman, as mobile wallets are increasingly accepted in both developing and developed countries. They developed a hierarchical model and concluded from it that anxiety and a lack of understanding of new technology are some of the key reasons why promoting the use of mobile wallets in Oman is difficult.

2.4 Summary

This chapter has presented several HCI practices, guidelines, and models. Usability is the HCI concept that is most significant in our thesis. The three parameters defining usability and different types of usability testing methods (user-based, expert-based, and automated testing) were introduced. An overview of various e-wallet apps on the market, their task flows, and their transaction methods was also presented.

Today it is common to investigate how the usability and security of mobile wallets affect the acceptance of apps, and how a mobile wallet can be designed to maximize its acceptance by users. The next chapter will present details of the method used in this thesis.


3 Method

The purpose of this chapter is to provide an overview of the research method used in this thesis. Section 3.1 describes the research process. Section 3.2 details the research paradigm. Section 3.3 focuses on the data collection techniques used for this research. Section 3.4 describes the experimental design. Section 3.5 explains the techniques used to evaluate the reliability and validity of the data collected. Section 3.6 describes the method used for data analysis. Finally, Section 3.7 describes the evaluation framework used.

3.1 Research process

This subsection lists the steps conducted in order to carry out this research.

1. Literature study

First, a literature study was conducted in order to gather information about HCI practices, usability, and evaluation methods, and to get an overview of the functionality of e-wallets. This was done in order to accurately determine suitable evaluation metrics and methods. The chosen sources were mainly books on the subject as well as relevant papers in databases such as ScienceDirect and Scopus. The search words included: mobile wallets + usability, e-wallets + usability, HCI + mobile wallets, and HCI + mobile payment. A lot of information was also found on the internet, where HCI communities and interaction design foundations share their knowledge and experiences. The credibility of these sources was checked through comparison with other material, and the sites were inspected for biased or outdated information. The primary results of the literature study are presented in Chapter 2.

2. Determine functionality

Secondly, a decision on the required functionality of the app was made. A list of functions was produced together with Centiglobe. These functions were the foundation of the prototype.


3. Select usability evaluation parameters

In order to evaluate the usability of these functions, evaluation parameters were selected. The selected parameters were the ISO 9241-11:2018 standard parameters for usability: Efficiency, Effectiveness, and Satisfaction.

4. Create a prototype

A prototype was created based on Shneiderman’s eight golden rules [17], Nielsen’s 10 heuristics [40], and the app comparisons in the literature study. The prototypes are presented in Appendix E.

5. Define a test group

Thereafter, a test group was constructed. The test group needed to be as diverse as possible. The selection of this group is described in Section 3.3.1.

6. Choose evaluation methods

After the test group was constructed, the usability of the prototype was evaluated through user-based evaluation. The UEMs that were used are classified as testing and inquiry UEMs (according to the classification by Ivory and Hearst [30]). The methods we chose were TAP (a testing method) and a combination of performance measurement (a testing method) and a questionnaire (inquiry). The questionnaire was used to collect data regarding user satisfaction. We observed the test subjects performing each task, had them comment during testing, and took a set of measurements, for example the time to perform a task. At the end of each test, the subjects rated their satisfaction with the interaction with the prototype: the test subjects were given a question regarding the ease of performing the task and rated the ease of performance on a 7 level Likert scale. Therefore, we gathered both subjective (TAP and questionnaire) and objective (performance measurement) data. Since the purpose of the usability evaluation was to develop a design of a mobile payment app, the evaluation method was formative.


7. Design experiments

Based on the methods and test groups chosen we designed the experiments. These experiments were conducted four times during individual meetings with the test subjects in a quiet setting. Details of the experiment are given in Section 3.4.

8. Capture data

To capture data, participants’ expressions and comments were audiotaped with mobile phones as well as noted on paper. Performance data, such as task times, was measured and noted on paper. The user’s interactions with the prototype, as well as errors made, were recorded using screen recording. The questionnaire was filled out online by the participant using Google Forms.

9. Analyze and interpret data

The data from testing was both quantitative and qualitative. Qualitative data (such as the content of comments and expressions from the TAP and the types of errors made) was more difficult to analyse than the quantitative data. The data from the TAP was analyzed through a simplified version of RPA: the numbers of negative, neutral, and positive comments were counted. The median value of each performance measurement was calculated. The quantitative data was only used to measure the level of usability of each task, while the qualitative data was used to find areas of improvement.

10. Critique UI to suggest improvements

After the data was analyzed, suggested improvements were based on the qualitative data as well as the quantitative data.

11. Iterate process

The process of designing and evaluating the prototype was iterated in order to improve its usability. For each iteration, data was captured, analyzed and new suggested improvements were decided and implemented in the prototype. This iterative process made it possible to find new faults with the design and remove them.


3.2 Research paradigm

We are collecting data to describe the experience of using the prototype for a certain number of people. We are also basing our conclusions on observing what is happening in the world. We are therefore assuming that there exists some experience that can be observed or tested empirically; hence we are taking a positivist stance towards this phenomenon. The research paradigm that this work adheres to is consequently positivism.

3.3 Data collection

Data was collected during the tests by several different means. The TAP data, i.e. the verbal data, was collected through audio recordings that were later transcribed. Performance measurement data was collected using a stopwatch on a mobile phone, to time the activity, and notes on paper. Performance data was also recorded by screen recording. The responses to the SEQ were collected online using Google Forms.

No personal information was recorded or collected during the tests. The only information collected that can be regarded as sensitive is the comments expressed during the TAP. However, since no names were recorded, these comments cannot be tied to any one person. The testing process presented no risks to the participants.

3.3.1 Test group

The test group used during the user-based usability evaluation was limited to 7 people. This number of people was considered adequate since the research performed was mainly qualitative, intended to provide insight into the design, together with the fact that most usability problems can be discovered with a test group of five people [41]. Furthermore, because of limitations in resources, the same test group was used for all iterations. Since the target users are not limited to any specific group, it was important to construct a diverse group. However, we had limited resources when choosing test subjects. Because of this, the differences in the test subjects’ characteristics were limited to:

• Experience using mobile payment apps.
• Experience of software development.

We chose ”experience using mobile payment apps” as one characteristic because it could affect how the user interacts with and experiences the prototype. A user with experience of mobile payment apps is used to performing actions similar to those performed during testing. This could affect their speed of performance as well as the number of errors they make. Furthermore, they may be more critical, as they can compare the prototype to other mobile payment apps. A user’s experience of software development can possibly also affect how they experience and interact with the prototype: a software developer may view the prototype from a different perspective and be more technically skilled. The test subjects rated their experience using mobile payment apps and their experience of software development on a 5 level Likert scale, with level 1 being ”no experience” and level 5 being ”extensive experience”.

The language used for the instructions and the informed consent form (ICF) was English, but any verbal interaction was done in Swedish. The language used in the app prototype is English.

3.3.2 Consent form

The ICF was designed by us, but it was partly based on a sample ICF provided by the World Health Organisation [63]. Our goal was to obtain informed consent from our test subjects. Therefore, we designed a comprehensive consent form, providing the test subjects with information about what the tests entailed. As advised by Joseph S. Dumas and Janice C. Redish [17], in our ICF we explained the procedure that we would follow, the purpose of the test, any risks to the participants, the opportunity to ask questions, and the opportunity to withdraw at any time. Furthermore, we made sure that the participants comprehended the information by using clear language. The test subjects were not offered anything in exchange for participating. All test subjects had reached the age of majority in Sweden, i.e., 18 years old, and were clear of mind. The ICF is presented in Appendix B.

3.3.3 Sampling

As the test group was limited in size, all data collected from the TAP testing as well as data from the performance measurements was needed when iterating and improving the prototype. Therefore, no sampling was done.

3.3.4 Sample size

The size of the test group, the responses during TAP, the responses to the questionnaire, and the performance measurements taken determined the sample size. The number of performance measurements and responses to the questionnaire was static in size. However, as we used TAP as one testing method, the total sample size varied with each iteration of testing and for each participant, because the number of comments and thoughts expressed varied for each iteration and participant.

3.3.5 Target population

Since the main functionality of the app is the transfer of money, a functionality not limited to any specific group, and the app is intended for international use, the target population was not limited. However, the extent to which the app will be adopted is expected to vary across groups; for example, early adopters will most likely be younger, tech-savvy users. Furthermore, the app is limited to owners of smartphones. The app will also require access to a bank account of some sort, which will affect the minimum age of the users, as many banks have a minimum age limit. There is no age limit on owning cryptocurrency; however, there is often an age limit of 18 for selling and buying on trade sites [13].


3.4 Experimental design/planned measurements

The subjects were scheduled for individual meetings in a quiet setting, to facilitate using TAP. The sessions were audiotaped and then transcribed. After the subjects signed an informed consent form (ICF), described in Section 3.3.2, they were randomly given an instruction and scenario card (with limited instructions regarding what task to perform and the goal to achieve). Furthermore, the test subjects were informed that performance measurements were to be taken and that they were encouraged to constantly talk and think aloud as they performed the task. If a test subject was quiet for more than a few seconds, we gently reminded them to ”keep thinking out loud”. The test subjects then tried to perform the specific task on the interactive prototype that was displayed on a smartphone. At the end of each task, each test subject answered the questionnaire concerning the specific task. During the test, performance measurements were also taken and notes were made. Our interaction with the subjects, as observers, was limited during the performance of tasks; we only intervened if a reminder to keep thinking aloud was needed.

We evaluated the usability of the prototype according to the ISO 9241-11:2018 definition of usability. Each parameter (efficiency, effectiveness, and satisfaction) was evaluated using both TAP and performance measurements in combination with the SEQ.

3.4.1 Performance measurement

The performance measurements we chose for each parameter can be found in Table 3.1.

Table 3.1: The performance measurements.

Efficiency: Time to complete a task
Effectiveness: Percentage of errors
Satisfaction: Rating on the Single Ease Question (SEQ)


The efficiency was measured through the time to complete a task, which is defined as:

Task time = End time − Start time (1)

The time was measured with a stopwatch on a mobile phone. When we asked the participant to start, the stopwatch was started, and when the participant had finished the task and stopped interacting with the prototype, the stopwatch was stopped.

The percentage of errors the participant makes when performing a task is the chosen effectiveness measure. Errors are defined as slips, mistakes, or unintended actions. The screen and the user’s interactions with the prototype were recorded. Based on this recording, the number of errors and the total number of interactions the user had with the prototype were noted. An interaction is in this case defined as any interaction with the touch screen, such as a press, swipe, etc. To calculate the percentage of interactions that were errors, for each task, the following formula was used:

Number of errors / Total number of interactions (2)
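As a sketch of how formulas (1) and (2) could be computed, the two measures can be expressed as small helper functions. The sample values below are invented for illustration only.

```python
def task_time(start_time, end_time):
    """Formula (1): time to complete a task (same unit as the inputs)."""
    return end_time - start_time

def error_share(num_errors, total_interactions):
    """Formula (2): the fraction of recorded interactions that were errors."""
    return num_errors / total_interactions

# Hypothetical sample: task started at 3.0 s, ended at 48.0 s,
# with 2 errors out of 25 recorded interactions.
print(task_time(3.0, 48.0))              # 45.0 seconds
print(round(100 * error_share(2, 25)))   # 8 (percent of interactions)
```

Multiplying the ratio from formula (2) by 100 yields the percentage of errors used when comparing against the 5% performance goal.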

The satisfaction with each task was measured through an SEQ. After each task, the subjects were given the questionnaire, in which they rated how difficult the task was to perform on a 7 level Likert scale. The single-question questionnaire, the SEQ, is presented in Figure 3.1 below.


3.4.2 Thinking-aloud protocol

The comments made during the usability evaluation were, as stated above, audiotaped and then transcribed. Sections of the transcript that were identified as not reflecting verbal thoughts, such as when subjects were given or were reading task instructions, and filler words, such as ”um”, ”ah”, etc., were then eliminated from the transcript. The transcript was then analyzed using a simplified version of RPA. A set of concepts (categories of ideas) was defined based on the verbal data. The phrases were concept coded based on the category of the idea each phrase expressed. The concept coded phrases were then categorized as negative, positive, or neutral, based on whether the phrase expressed a positive, negative, or neutral idea.
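The simplified RPA described above (stripping filler words, then tallying concept-coded phrases by valence) could be sketched as follows. The filler list and the example phrases are assumptions made for illustration; in the study, concept coding was done manually.

```python
from collections import Counter

FILLERS = {"um", "ah", "eh"}  # assumed filler words to remove

def strip_fillers(phrase):
    """Remove filler words from a transcribed phrase."""
    return " ".join(w for w in phrase.split() if w.lower() not in FILLERS)

def tally_valence(coded_phrases):
    """Count phrases per valence category (negative/neutral/positive).

    `coded_phrases` holds (phrase, valence) pairs produced by manual
    concept coding of the transcript."""
    return Counter(valence for _, valence in coded_phrases)

# Hypothetical coded phrases from a transcript.
coded = [
    ("this button is um confusing", "negative"),
    ("ah that was easy", "positive"),
    ("I am pressing send now", "neutral"),
]
print(strip_fillers(coded[0][0]))  # "this button is confusing"
print(tally_valence(coded))
```

The resulting counts per category are the quantity that was tracked across iterations to see whether the verbal data improved.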

3.4.3 Test environment

Each participant decided where the testing would be conducted, meaning that the testing environment was not the same for each participant. Our only demand was that the environment had to be quiet, so that it would be easier to focus and not be disturbed by others during the testing. If a participant wanted us to decide where the testing should take place, we chose a group room at KTH Royal Institute of Technology, either in Kista or at the main campus.

The rooms had to have at least one chair for the participant and a table on which a laptop for the questionnaire was placed along with the prototype. The prototype was developed with the software ”Figma” and displayed on a Huawei P20 (further explained in Section 3.4.4). Other materials that were needed were pens, a notebook, a laptop for taking notes, and a mobile phone with a voice recorder and a stopwatch. To record the screen during testing, software available on the Huawei P20 was used.

3.4.4 Software and Hardware to be used

The software used for prototyping was ”Figma”, a freemium, browser-based interface design application. Among other things, Figma can make designs interactive, meaning that the design can mimic the functions of a real application and respond to the user’s interactions. For example, pushing a button in the design can prompt a new view to appear, and one can scroll on a page. In presentation mode, only the interface is shown.

Furthermore, the Figma Mirror app was used to present the prototype on the phone. In presentation mode the prototype can be viewed on a smartphone, taking up the whole screen of the phone and thus mimicking a real application [20].

The smartphone model used to display the prototype was a Huawei P20, which has a screen size of 5.8 inches (measured diagonally) [27]. This makes it a good representation of the average smartphone, as the screen sizes of the most popular smartphones vary between 4.7 and 6.5 inches [31].

The software used for recording the screen and the user’s interactions was the built-in screen recorder on the Huawei P20. The software used for the questionnaire was Google Forms, and the median values of the performance measurements were calculated in Excel.

3.5 Assessing reliability and validity of the method and data collected

The reliability (Section 3.5.1) and validity (Section 3.5.2) of our results are discussed below, as well as the reliability of the chosen method.

3.5.1 Reliability

In order to get reliable data from the TAP, we chose to audiotape the sessions, to minimize the disturbance, and to transcribe them once the testing was finished; transcribing during a session could otherwise have affected the results.

Furthermore, to get reliable results it is important that the analysis of the transcription is done consistently: the same expressions should be coded the same way for each participant and iteration. When analyzing the transcription it is also necessary to be unbiased and not to draw conclusions in order to support a certain hypothesis [22].

The methods we have chosen, such as the TAP, are well known and scientifically established in the field of HCI and in usability testing. According to Hertzum and Jacobsen [25], some usability evaluation methods (UEMs), such as TAP, can suffer from ”the evaluator effect”, which means that novice and expert evaluators who evaluate the same system identify different sets of problems with the design. They concluded in their report that using only one evaluator is not recommended, as a reliable conclusion cannot be made due to this effect. As several test subjects evaluated our prototype, we consider the evaluation reliable.

3.5.2 Validity

In order to get valid results from the TAP, it is important not to disturb or affect the participants when they are talking, and not to behave in a way that makes them say things they do not agree with. The only time the observer should intervene is when a participant stays quiet, and then only by saying ”keep talking”, which does not disrupt the thought process. Disrupting the thought process affects the validity of the results [18, 22].

Furthermore, the only opinions and expressions we can record are the ones the participants verbally express through the TAP, meaning that any views and opinions that are not shared remain unknown to us. The sample size (number of participants) can also affect the validity of the results. We chose 7 participants; Section 3.3.1 argues why this is enough in our case.

3.6 Planned Data Analysis

As mentioned earlier we chose to analyze the data received from the TAP through a simplified version of RPA.

The verbal data collected during TAP was primarily used to find usability problems and areas of improvement of the design for each task. Furthermore, the number of negative, neutral, and positive comments was used to evaluate the prototype’s usability.

Based on the screen recording, the number of errors during the interaction could be counted, as could the total number of interactions. The percentage of errors could then be derived from the number of errors and the total number of interactions. The median was used to get an average value for each performance measurement (time for performing a task, percentage of errors, and rating of satisfaction) for each task. We chose the median because of the small size of the test group: the arithmetic mean tends to be less accurate for small sample sizes, as a few outliers can skew it, and the geometric mean cannot handle negative or zero values [49].
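The calculations described above can be sketched in Python rather than Excel. The per-participant values below are invented for illustration; the real values came from the screen recordings, the stopwatch, and the SEQ answers:

```python
from statistics import median

# Illustrative per-participant measurements for one task:
# errors made, total interactions, time on task (s), SEQ rating (1-7).
measurements = [
    {"errors": 1, "interactions": 12, "time_s": 34.2, "seq": 6},
    {"errors": 0, "interactions": 10, "time_s": 21.5, "seq": 7},
    {"errors": 2, "interactions": 15, "time_s": 48.0, "seq": 5},
]

# Percentage of errors = errors / total interactions, per participant
error_pct = [m["errors"] / m["interactions"] * 100 for m in measurements]

# The median serves as the average value for each performance measurement
median_error_pct = median(error_pct)
median_time = median(m["time_s"] for m in measurements)
median_seq = median(m["seq"] for m in measurements)
```

The medians for one task can then be compared against the corresponding performance goals and against the previous iteration's values.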

The median values of the performance measurements were used to check whether an improvement of the usability had been made between each iteration and whether the set performance goals had been reached; more specifically, whether the design had improved with regard to, and fulfilled, the set performance goals for the efficiency, effectiveness, and satisfaction of the interaction with the prototype.

3.7 Evaluation framework

The following subsections describe how we chose to collect and evaluate the data from our testing.

3.7.1 Collection of data

As presented in Section 3.4, during the tests we timed how long it took each participant to perform each task, and recorded the screen in order to later count the number of errors a user made and the total number of interactions for each task. The percentage of interactions that were errors was then calculated. The participants also rated their satisfaction with the interaction with the prototype. We then calculated the median value of the measurements for each task. When timing the user performing a task, the start time was considered the moment the user was told, by us, to start. The phone, on which the prototype was presented, was lying
