
Bachelor’s thesis, 15 ECTS
Bachelor programme in Cognitive Science, 180 ECTS
Spring term 2021

Supervisor: Lena Palmquist

Investigating How Design of a Mobile Application Affects Cognitive Load

UI Development of a Mobile Application for the Volvo Group IT

Madeleine Englund, Sarah Lorenzo


We would like to thank the people who have guided us through this bachelor’s thesis and helped make this possible. Thank you to Niklas Björkman and Olof Bergman at the Volvo Group for helping us recruit participants and providing us with crucial information and tools for our work. A huge thank you goes to our supervisor Lena Palmquist for the endless support and encouraging words during the process of this thesis. Without her, this would have been a much more stressful and difficult experience. Thank you to the Volvo Group IT in Umeå for this experience and the opportunity to work on such an interesting project. We are grateful to have gotten the chance to work with a company such as the Volvo Group and therefore get a taste of what our education has prepared us for.


Abstract

Nowadays, when most people are used to smartphones, the Volvo Group IT would like to look into the use of mobile devices in production and design an application that can be used in the painting process at the Umeå plant. When learning a new system, whether it is on a computer or a mobile platform, it is crucial for the users’ performance to develop the system in accordance with their needs and wishes. One factor that can affect performance negatively is if the users experience a high amount of cognitive load while using the system. This paper therefore investigates which design choices in a mobile application minimise cognitive load. The study is a single-case study where five different interview sessions and one test to measure cognitive load were conducted. The sample in the cognitive load test consisted of seven participants, two of whom were women and the rest men. The participants were instructed to perform seven different tasks on the prototype while their response times were measured. After each completed task the participants answered a mental effort rating. When all tasks had been performed, the participants were asked to rate the difficulty of each task. Each test was recorded using Microsoft Teams and qualitatively analysed. The results showed that three types of design choices led to a low amount of cognitive load: (1) the use of the right type of icons, (2) the use of colour contrast and edges, and (3) preservation of the users’ familiar and prior work experiences.

Keywords: cognitive load, user experience, user interface, design, mobile application

Sammanfattning

Nowadays, when most people are used to smartphones, the Volvo Group’s IT department wants to investigate the use of mobile devices in production and design an application that can be used in the painting process at the plant in Umeå. When learning a new system, whether it is on a computer or a mobile platform, it is crucial for the users’ performance that the system is developed in accordance with their needs and wishes. One factor that can lead to impaired performance is if the users experience a high cognitive load while using the system. This thesis therefore investigates which design choices in a mobile application can lead to reduced cognitive load. The study is a single-case study in which five different interview sessions and one test measuring cognitive load were conducted. The sample in the cognitive load test consisted of seven participants, of whom two were women and the rest men. The participants were instructed to perform seven different tasks on the prototype while their response times were measured. After each completed task, the participants answered a questionnaire measuring mental effort. When all tasks had been performed, they answered a questionnaire on the perceived difficulty of each task. All tests were recorded using Microsoft Teams and analysed qualitatively. The results showed that three types of design choices led to a low level of cognitive load: (1) the use of the right type of icons, (2) the use of colour contrast and edges, and (3) preservation of the users’ familiar and prior work experiences.

Keywords: cognitive load, user experience, user interface, design, mobile application

Introduction

Today the Volvo Group has several web and client applications which are being used in the factory, but with different purposes. These systems are mostly used, with a few exceptions, on a PC. Nowadays, when most people have smartphones, the Volvo Group has been investigating whether using smartphones in the production process can lead to increased mobility. The Volvo Group IT would like to further look into the use of mobile devices and design an application which can be used in the painting process at the Umeå plant.

Everyone knows the expression “you have never loved your old system as much as when you transition to another”. It can be difficult and frustrating to learn a completely new system, especially when you are under pressure to perform. Oviatt et al. (2006) showed that traditional graphical interfaces which distracted the users’ attention led to the users working more slowly, made them make more errors, and showed a decline in high-level meta-cognition skills. According to Oviatt (2006), it is important to design new interfaces to reduce cognitive load, which facilitates users’ focus and performance.

DeLeeuw and Mayer (2008) showed that different cognitive load measurements, such as response time and self-ratings of effort and difficulty, were not equally sensitive to different types of cognitive load, i.e., intrinsic, extraneous and germane cognitive load. Intrinsic cognitive load describes how much effort it takes for the learner to understand the task (i.e., the complexity of a task). Extraneous cognitive load is associated with how information or tasks are presented to the learner (i.e., information that is poorly presented can cause this type of cognitive load). Germane cognitive load refers to the creation of schemas and how we process and automate them. Thus, a heavy measure of germane cognitive load can cause difficulties in information processing, resulting in poor learning. DeLeeuw and Mayer showed that intrinsic cognitive load could be measured with effort ratings. They also showed that response time was most sensitive to extraneous cognitive load. Self-reports of difficulty ratings indicated that this type of measurement was most sensitive to germane cognitive load. These three types of cognitive load can be affected by poor design choices in varying degrees. This leads to our research question:

which design choices that minimise cognitive load for the users can be identified?

The study is a single-case study design. Five different interview sessions and one test to measure cognitive load while using the prototype have been performed. A prototype has been developed based upon the five interview sessions and cognitive theories. During the cognitive load test, seven participants were instructed to perform different tasks on the prototype while their response time was measured. The participants answered mental effort ratings and difficulty ratings during the test.


Method

Participants

Benchmarking

The Volvo Group in Umeå recruited a representative from the Volvo Group in Tuve who had previously developed a mobile application to be used within the factory located in Tuve. The participant was a man.

User Groups

The first interview was held with the production manager of the painting process at the Volvo Group in Umeå. The participant was contacted by the authors’ supervisor at the Volvo Group. The participant was recruited due to his great knowledge about the employees who work in the painting process. The participant was a man belonging to the older age group.

User Needs

Five participants were recruited by the production manager of the painting process at the Volvo Group in Umeå. The authors had mapped the user groups and compiled a list of the type of employees that were to be recruited (see Appendix). A crucial requirement for the participants recruited was that they worked in the painting process. Two of the participants worked in the same department, Segment 2. One of the participants worked as a segment coordinator and the other two worked on the factory floor within different departments. Participants were not recruited from every department due to communication issues and time pressure. One participant was a woman belonging to the younger age group and four participants were men belonging to the older age group.

Low Fidelity

For the low fidelity test phase, only two participants were interviewed. This was, as mentioned before, due to communication issues and time pressure. The other participants who were contacted did not receive e-mails sent from outside of the company, hence the shortage of participants. Both participants had participated in the user needs interview. They were both men belonging to the older age group.

High Fidelity

During the high fidelity test phase, only two participants could be recruited. These were the same participants as in the low fidelity test phase. Just like in the low fidelity test phase, there were communication issues and time pressure which caused the shortage of participants. In addition, most of the contacted participants were busy at work due to a shortage of staff.

Cognitive Load Test

For the final test, seven participants were recruited to measure cognitive load while using the prototype. The requirements for the participants recruited were that they had no prior knowledge about the developed prototype, but did have great prior knowledge about the current systems being used. The participants needed to have the same presumption when using the prototype in order to avoid bias. The cognitive load test investigated if the use of familiar practices affected cognitive load. Therefore, the participants needed to know about the current systems. All seven participants knew the systems that are currently used in the painting process and were therefore allowed to partake in the test. Two of the participants were women, where one belonged to the younger age group and one to the older age group. The rest of the participants were men, and they all belonged to the older age group.

Instruments and Materials

Microsoft Teams was used for conducting interviews and to record meetings. Two interviews during the first interview session (user groups) were conducted via phone. Figma was used to create a prototype. Google Slides was used to create a slideshow with pictures of the prototype. Google Forms was used to create questionnaires. The computer software Color Blind Simulation was used to simulate colours used in the prototype (Wickline & Human-Computer Interaction Resource Network, 2000).

Procedure

Ethics

Ahead of each interview, participants gave their informed consent. Participants were informed that the interview was going to be recorded. All interviews were recorded using Microsoft Teams. The consent stated that all participation was voluntary and that participants could withdraw their consent at any time without having to disclose why. No risks were associated with participation. If consent was withdrawn no results from that participant would be involved in the study. The participants were furthermore informed that all participants' results would be anonymous and only the analysed and summarised results would be presented in the final report.

Benchmarking

During the benchmarking interview, the authors asked the Volvo Group representative questions about the mobile application which he had previously developed and about the development process. The representative demonstrated how the application worked and explained which information was shown (Figure 1).


Figure 1

Screenshot of Application Used at the Volvo Group Tuve Plant

User Groups

The first interview was held to identify the user group of the future mobile application. Questions to understand the painting process and the different departments within were asked to the production manager. Questions regarding the employees’ age groups, gender and language knowledge within each department were also asked. The interview was transcribed and acted as a base for mapping the user groups. A list of what type of employees and how many were to be recruited for the second interview session was compiled and sent to the production manager (see Appendix).

User Needs

The second interview phase helped in understanding user-experience issues and the Volvo Group factory employees’ needs in a mobile application. Sari Kujala (2003) stated that it is generally agreed upon that achieving usability in system design requires the involvement of potential users. The participants were therefore involved through the entire design process. Firstly, they were asked about the nature of the work that they do in the factory, current systems they use and general questions about their needs in a hypothetical mobile application. Using this data, two personas could be created, as shown in Figure 2.


Figure 2

Personas Created Based on Data Collected During the User Needs Interviews

A B

They represent two different types of workers with distinct personalities. Lastly, a low fidelity prototype of the mobile application was created based on the participants’ needs and the types of employees who were to use the application.


Low Fidelity

When the low fidelity prototype was completed, two participants were interviewed separately. The participants were shown a Google Slides presentation with pictures of the prototype (Figure 3).

Figure 3

Two Pictures Shown During the Low Fidelity Interviews

A B

Note. The current status of truck cabs in production is shown in A and the error report is shown in B.

They were told to take as much time as they needed to investigate each picture and to think out loud. When the participants were done investigating, they were asked questions about the design and the information presented. The interviews were transcribed and used as a base for developing a high fidelity prototype.

High Fidelity

The high fidelity interview followed the same procedure as the low fidelity interview except that the Google Slides presentation included pictures of the high fidelity prototype instead. Some slides showed two different design alternatives where the users were asked to determine which design they liked the best and why (Figure 4).


Figure 4

Two Design Alternatives Which Were Shown During the High Fidelity Interviews

A B

Based upon the transcribed interviews, the final design of the prototype was developed.

Prototype

Since one of the purposes of the prototype was to minimise cognitive load, short-term memory was continuously kept in mind when designing. Sharp et al. (2019, Chapter 4) state that to minimise cognitive load, designers should avoid long and complicated procedures when performing tasks. This design principle has been utilised by not having more than one sub-menu at a time. The design principle has also been utilised by not adding any distracting or unnecessary interface features to the application, such as visual triggers or disturbing sound effects. Oviatt (2006) showed that the lack of unnecessary features was associated with enhanced performance and that distracting features could undermine the users’ attention.

Grill-Spector and Kanwisher (2005) showed that people have an extraordinary ability to both detect objects and recognise them. Their results showed that the participants did not need significantly more time to categorize an object compared to detecting it; as soon as the participants knew the object was there, they also knew what it was. Rogers (1989) stated that iconic-based interfaces may be easier for the user to learn and remember since the need to recall is not present. With iconic-based interfaces, the user instead only needs to depend on their visual memory to recognise icons, and as Grill-Spector and Kanwisher (2005) showed, this is done with ease. Rogers (1989) meant that recognition leads to a lower cognitive load compared to having to use recall. She also states that the use of icons only has a positive effect on cognitive load in those cases where the icons are recognisable for the users. Hence, the prototype includes well-known icons (Figure 5).

Figure 5

Icons Used on the Page Error Report

The icons were continuously evaluated by the users.

Eight per cent of men have some kind of colour blindness (Jenny & Kelso, 2007). Jenny and Kelso (2007) therefore say that designers must ensure they use colours that can be seen by everyone, even those with colour blindness. During the development of the prototype, a colour blindness simulator was used (Wickline & Human-Computer Interaction Resource Network, 2000). The appearance of the application with colour vision disability can be seen in Figure 6.
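To illustrate what such a simulator does, the sketch below applies a set of protanopia coefficients to a single colour. The 3×3 matrix holds illustrative numbers circulated in the colour-vision literature; the Color Blind Simulation tool cited above may use different coefficients, and a rigorous simulation would first convert the colour to linear RGB.

```python
# Minimal sketch of a protanopia simulation for one 8-bit RGB colour.
# The matrix holds illustrative coefficients often quoted for
# protanopia; the cited Color Blind Simulation tool may use different
# numbers, and a faithful simulation would work in linear RGB.

PROTANOPIA = [
    [0.56667, 0.43333, 0.00000],
    [0.55833, 0.44167, 0.00000],
    [0.00000, 0.24167, 0.75833],
]

def simulate_protanopia(rgb):
    """Approximate how a colour looks to an observer with protanopia."""
    r, g, b = rgb
    return tuple(
        min(255, max(0, round(m_r * r + m_g * g + m_b * b)))
        for m_r, m_g, m_b in PROTANOPIA
    )

# Pure red and pure green both map onto similar olive tones, which is
# why a design should not rely on hue alone to convey status.
red_seen = simulate_protanopia((255, 0, 0))
green_seen = simulate_protanopia((0, 255, 0))
```

Running the two example calls shows red and green collapsing towards the same yellow-brown region, which is the confusion the prototype’s contrast and edge choices are meant to compensate for.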


Figure 6

Visualization of Colours Used When Colour Blind

A B

C D

Note. Normal colour vision can be seen in A, Protanopia can be seen in B, Deuteranopia can be seen in C, and Achromatopsia can be seen in D.

Ware (2020, Chapter 3) talks about how visual distinctness is not defined by the amount of light, but rather by the amount of luminance that contrasts with the background. This means the human brain can detect edges faster when contrasts are used. Ware talks about pseudo contrasts and that these can be created by adding an edge between two areas having the same lightness. The edge will make the two colours look like different shades, when in fact they are the same (Figure 7).

Figure 7

Pseudo Edge

Note. Fibonacci (2007) illustrated a Pseudo Edge. Both sides are the same colours but the edge between them makes the brain believe they are different shades.

This happens because of the perceptual interpolation the brain automatically computes. Ware advises the reader to use this method to highlight different areas. In the prototype, this has been done when choosing text colours to contrast well with the background colours. The usage of shadows is motivated by the same reason, i.e., to highlight which buttons are pressed and which parts of the prototype are “clickable”.
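The idea of choosing text colours that contrast well with the background can be made concrete with the WCAG 2.x relative-luminance formula, a standard way to score foreground/background contrast. The thesis does not state whether any numeric threshold was applied to the prototype; 4.5:1 is the WCAG AA level for normal-size text and is used below only as a reference point.

```python
# Sketch: scoring foreground/background contrast with the WCAG 2.x
# relative-luminance formula. The 4.5:1 threshold mentioned in the
# lead-in is the WCAG AA level, not a value taken from the thesis.

def _linear(channel):
    """Convert an 8-bit sRGB channel to linear light."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb):
    r, g, b = (_linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """WCAG contrast ratio between two colours, from 1:1 up to 21:1."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)

# Black text on a white background gives the maximum ratio, 21:1.
black_on_white = contrast_ratio((0, 0, 0), (255, 255, 255))
```

A designer can run candidate text/background pairs through `contrast_ratio` and reject any pair that falls below the chosen threshold.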

One of the conclusions from the interview session focusing on the users’ needs was that the users appreciated the current way of presenting the flow of the truck cabs in the factory and did not have any need to replace or change this function. According to Oviatt (2006), the users’ performance substantially benefited from not having their existing and familiar work practice changed. Because of this, the map function in the prototype is based upon the current system that visualises the flow of the truck cabs in the factory. Small changes, such as the colours and existence of shadows, have been made to facilitate perception, not only for those with normal colour vision (Jenny & Kelso, 2007; Ware, 2020, Chapter 3) (Figure 8).


Figure 8

The Flow of Truck Cabs in the Factory

A B

C

Note. The flow of the truck cabs in the factory while using the prototype in a standing position can be seen in A. In B the flow of the truck cabs can be seen in a horizontal position/full-screen mode. C shows the current way of displaying the flow of the truck cabs in the factory.

In conclusion, the desirable outcome was for the design to take advantage of the users’ former knowledge and experiences, as well as inherent behavioural patterns. Adapting the design to the users’ preferences, behaviour and cognitive processes when using the application was crucial when creating this prototype.

Cognitive Load Test

To measure cognitive load while using the prototype, three different measurements were taken: response time, mental effort ratings and difficulty ratings (DeLeeuw & Mayer, 2008). DeLeeuw and Mayer showed that these three types of measurements were not equally sensitive to different types of cognitive load. In order to derive the possible results from the different design choices, all three measurements were taken. A link to the prototype and the mental effort rating were sent to the participants via Microsoft Teams. The participants were asked to share their screen while interacting with the prototype. Participants were instructed to perform seven tasks on the prototype, such as “send a message to Thor”. Before the first task was announced, the participants were informed that it was possible to interact with the prototype as if it was a smartphone and not only possible to press buttons. All participants performed the same tasks in the same order. If the participant could not figure out how to complete the task, they were given a clue after 45 seconds. Participants only needed clues while performing the last task. They were instructed to remove the group ‘Segment 2’ from the message overview (Figure 9).

Figure 9

Message Overview

The hint they were given was as follows: “the solution is found on the message overview page”.
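The protocol just described, with the clock starting when the instruction ends and a single hint after 45 seconds, can be sketched as follows. The thesis does not describe any timing tooling, so the function names and polling loop here are purely illustrative.

```python
# Illustrative sketch of the per-task timing protocol: response time
# runs from the end of the task instruction, and one hint is given if
# 45 seconds pass without completion. Not the study's actual tooling.
import time

def run_task(wait_for_completion, give_hint, hint_after=45.0):
    """Time one task; call give_hint() once if it exceeds hint_after seconds.

    wait_for_completion(timeout=...) should block up to `timeout` seconds
    and return True once the participant has completed the task.
    """
    start = time.monotonic()
    hinted = False
    while not wait_for_completion(timeout=1.0):
        if not hinted and time.monotonic() - start >= hint_after:
            give_hint()
            hinted = True
    return time.monotonic() - start, hinted
```

The returned pair (response time, whether a hint was needed) corresponds to the two facts reported per task in the results.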

Response times were measured from when the authors had finished instructing the task to when the task was completed. The mental effort rating consisted of seven multiple-choice questions with nine different answers, one question for every task. The alternative answers ranged from very, very low mental effort to very, very high mental effort (Figure 10).

Figure 10

One Rating During the Mental Effort Rating Questionnaire

The participants were instructed to only choose one answer. When all tasks had been performed a link to the difficulty rating was sent to the participants via Microsoft Teams. The difficulty rating consisted of seven 7-point semantic differential scales ranging from very easy to very difficult (Figure 11), with one scale for each task.

Figure 11

One Rating During the Difficulty Rating Questionnaire


The participants no longer needed to share their screen, and the meeting was closed to give the participants time to think through their answers without the supervision of the authors.

Each test was recorded to gather qualitative data. The recordings were analysed by the authors to see where the participants faced the most trouble.

Results

Benchmarking

The benchmarking interview provided inspiration and an idea of what the participants might ask for in a mobile application. The page where status is shown in the authors’ application is to a great extent inspired by the application used at the Volvo Group Tuve plant. The half circles and the large numbers clearly show the status of the production, and whether it is in accordance with their daily goal (Figure 12).

Figure 12

Cropped Part of a Screenshot of the Application Used at the Volvo Group Tuve Plant

User Groups

The interview with the production manager resulted in the employees within the painting process being mapped into different groups (Figure 13).

Figure 13

User Groups Found Within the Painting Process at the Volvo Group Umeå Plant

Two different age groups were identified, the younger age group and the older age group. The younger age group included people who were between 20 and 30 years old and the older age group included people who were between 45 and 50 years old.

User Needs

The participants voiced some ideas and requests during the first phase of interviews. In summary, some of the participants wanted the ability to monitor the flow of the truck cabs in the factory. A couple of participants requested push-notifications that were to be sent when the truck cabs passed through certain checkpoints in the assembly process.

Some participants also requested more detailed information about the production, for instance how many truck cabs are in production, how many cabs are produced each hour and how many cabs have been produced during the day. Previously, the balance of cabs produced was counted automatically by a function. The function was removed in an update of their current system, and some of the participants requested that the function be reinstated.

When quality checks are performed on cabs, the employees sometimes find mistakes that need to be fixed. Participants requested a way to report these mistakes as well as print the tags they put on the faulty cabs more easily as opposed to how they do it today. Today they need to find the nearest computer to report and print tags, which causes quite a lot of running around the factory.

Low Fidelity

The participants were overall very pleased with the low fidelity prototype. They thought it was user friendly and that the icons used spoke for themselves. They voiced that the percentage shown at the status bar was irrelevant and that the important part was the actual number shown (Figure 14, Figure 3A).

Figure 14

Cropped Part of the Current Status of Truck Cabs in Production

They also voiced that the percentage threshold for the status to turn red was a bit low and that the limit should instead be at 75%. Both participants appreciated the message function a lot but also asked for it to include a function to create groups. Neither of the participants had use of the error report function in their work, which led to it not being evaluated.

High Fidelity

Once again the participants were overall very pleased with the prototype. They expressed that the changes made based upon the low fidelity interviews were great. In those cases where two design alternatives were shown, the participants chose differently each time (Figure 4). One of them stated that he thought the information shown was clearer and easier to distinguish, while the other one said his preferred choice was based upon personal style preferences. Since each design alternative was chosen by one of the participants, the authors made the final decision based upon their personal design preferences.

Cognitive Load Test

Response time (RT) for each task was measured. RT for participant 2 was lost because of technical issues. RT differed between the tasks due to the tasks not being equally hard to complete. Each participant’s RT for each task can be seen in Figure 15.
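As an illustration of how per-task summaries behind a figure like Figure 15 could be tabulated while skipping the lost data, consider the sketch below. The participant labels mirror the text, but every number is an invented placeholder, not a measurement from the study.

```python
# Averaging response times per task across participants, skipping
# missing data (P2's times were lost to technical issues). All times
# below are invented placeholders, not the study's measurements.
from statistics import mean

# seconds per task (tasks 1-7); None marks lost data
response_times = {
    "P1": [30, 12, 10, 18, 9, 25, 20],
    "P2": None,  # lost to technical issues
    "P3": [42, 15, 14, 22, 11, 30, 55],
}

def mean_rt_per_task(data, n_tasks=7):
    """Mean RT for each task over the participants with recorded times."""
    recorded = [times for times in data.values() if times is not None]
    return [
        round(mean(times[task] for times in recorded), 1)
        for task in range(n_tasks)
    ]
```

The same per-task averaging applies unchanged to the effort and difficulty ratings, since they are also one value per participant per task.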


Figure 15

Response Times for Tasks Performed

Note. Participants are called P1-P7. P2’s RT disappeared due to technical issues.

Participants were instructed to answer the mental effort rating questionnaire after each task had been completed. When all tasks had been performed, the participants answered the difficulty rating questionnaire. The results from the two ratings can be seen in Figure 16 and Figure 17.


Figure 16

Mental Effort Ratings of Tasks Performed

Note. Participants are called P1-P7. The rating scale ranged between 1 and 9.

Figure 17

Difficulty Ratings of Tasks Performed

Note. Participants are called P1-P7. The rating scale ranged between 1 and 7.


All tests were recorded and qualitatively analysed. The first task read “change department to ‘Förbehandling/ED/Reningsverk/PW’”. The analysis showed that all participants pressed on the map function instead of on the filter button as intended (Figure 18).

Figure 18

Visualisation of the First Page During Tests

Note. Pink circles surround the concerned icons.

Five of the participants realised they were at the wrong place and then went back to the first page. They then pressed the filter icon and found the department they searched for. The other two participants went to the concerned department on the map and tried to press it (Figure 19).


Figure 19

Visualisation of the Department Two Participants Thought They Searched For in the Map Function

Note. A pink circle surrounds the department the participants searched for, but in the map function.

When it did not work, the participants went back to the first page and pressed the filter button. During the fourth task, participants were instructed to “close the menu and remove the error code” they had previously chosen. Three of the participants seemed to have difficulties with understanding the task. Instead of closing down the menu first and then removing the error code, they removed the checkmark from the box and then closed down the menu (Figure 20).


Figure 20

Ways of Unselecting the Chosen Error Code

A B

Note. Participants who misunderstood the task removed the error code as seen in A. The right way of removing the error code during the task can be seen in B.

The sixth task was as follows: “look up when the first message in the group ‘Segment 2’ was sent”. Two of the participants thought the task was completed when they entered the group and saw the message sent at 08:37 (Figure 21).


Figure 21

Messages Sent in the Group ‘Segment 2’

A B

Note. The message participants wrongfully thought was the first message sent can be seen in A. The first message sent can be seen in B.

The remaining participants understood they could scroll upward and found the top message, which was sent on Monday at 13:17 (Figure 21).

In the last task, task 7, the participants were instructed to “remove the group ‘Segment 2’ from their message overview”. This was done by dragging/swiping the group from right to left and then pressing the red button. Participants 1, 4 and 6 completed the task with ease, whereas participants 2, 3, 5 and 7 needed a hint after 45 seconds (Figure 22).


Figure 22

Illustration of How to Delete a Chat From the Message Overview

A B

Note. To delete a chat, the participant had to drag/swipe the chat from the left to the right (A) and then press the red button (B).

The research question read “which design choices that minimise cognitive load for the users can be identified?”. DeLeeuw and Mayer (2008) showed that: (1) response time was most sensitive to extraneous cognitive load, (2) mental effort ratings were most sensitive to intrinsic cognitive load and (3) difficulty ratings were most sensitive to germane cognitive load. Based on the results and the findings of DeLeeuw and Mayer (2008), Rogers (1989), Oviatt (2006) and Ware (2020, Chapter 3), three types of design choices were identified: (1) the use of the right type of icons, (2) the use of colour contrast and edges and (3) preservation of the users’ familiar and prior work experiences.

Discussion

During the qualitative analysis of the cognitive load test, it was clear that some participants did not understand the tasks they were given. The response time during the first task seems quite even between the participants, which can be interpreted as a good result. The analysis showed that all participants went to the wrong page of the application during task 1. Two of the participants tried to press the concerned department, but in the map function instead of in the status overview function. This made the authors notice that the department the participants searched for was in fact present in the map function as well. During task 4 the participants executed the stated task but in the wrong order, leading them to remove the chosen error code in the wrong way. The analysis made it clear that the response times of task 1 and task 4 were affected due to poor formulations of the tasks.

Another point to keep in mind is that the participants completed the tasks on a computer, while the application is supposed to be used on a smartphone. Some participants expressed that the interaction was more difficult to understand since they had to imagine they were interacting with a smartphone application and not a computer. The participants could not interact with the prototype on a smartphone due to Covid-19, which resulted in the tests having to occur online. The analysis showed that this issue did affect the response times and mental effort ratings of task 6 and task 7 for some participants. They had trouble understanding how to scroll in the conversation with the group ‘Segment 2’ since there was a scroll bar present. They tried to scroll by dragging on the scroll bar, but it was not linked in the prototype due to a lack of functions in Figma. Instead, they had to scroll in the actual chat. They also had trouble understanding that they could swipe on chats during task 7 by dragging them from right to left. This could be explained by the potential lack of technical knowledge of the participants.

DeLeeuw and Mayer (2008) showed that response time was the measure most sensitive to extraneous cognitive load. The response times for the different tasks showed that task 1 and task 7 took the longest to execute (Figure 15). Some spikes can be seen during the other tasks, but the majority of the participants executed those tasks with ease. According to DeLeeuw and Mayer, this type of cognitive load can be increased by poorly presented information. The results can therefore be explained by the fact that the remove button in task 7 was not visible to the participants when they first looked at the prototype. The participants also had trouble understanding that it was the filter button they were supposed to press during task 1. Rogers (1989) argued that icons only have a positive influence on cognitive load if the users can recognise them.

Perhaps the filter icon was not the best-suited icon for its function.

Moving on to intrinsic cognitive load, DeLeeuw and Mayer (2008) showed that mental effort ratings were the measure most sensitive to this type of cognitive load. DeLeeuw and Mayer argued that the higher the complexity of a task, the higher the cognitive load. The results of the mental effort rating showed that most participants experienced a “very, very low mental effort” during the tasks; however, some participants rated the mental effort higher. Participant 3 rated the mental effort between “low mental effort” and “rather low mental effort” for every task (Figure 16). Participant 5’s ratings varied considerably between tasks, reaching “rather high mental effort” for task 6 and “high mental effort” for task 7. These results can be linked to the participants’ response times: the participants who rated the mental effort higher than the rest were mostly the same participants who had longer response times (Figure 15, Figure 16). This can explain why these participants experienced the complexity of the tasks as higher than the others did.

Germane cognitive load can, according to DeLeeuw and Mayer (2008), affect information processing and result in poor learning. They argue that germane cognitive load can be affected by the learner’s prior knowledge and motivation. Oviatt (2006) showed that the users’ performance benefited when the work practice was not changed. This can explain why the difficulty ratings were higher for task 6 and task 7 (Figure 17); these two tasks were new to most participants in their work practice. Some participants had used Facebook Messenger or WhatsApp in their work before, but not all of them.

It was also a new experience for all participants to interact with a mobile application on a computer.

Because of time pressure at work, caused in part by a staff shortage resulting from the lack of semiconductors, the participants did not have much time during their work hours to participate in studies. This meant that the tests could not take the participants’ current system into account. Instead, the results only show which design choices led to a low amount of cognitive load when using the developed prototype.

Among the design choices identified as leading to a low amount of cognitive load is the use of icons. Some icons were not evaluated during the cognitive load test, and one of the evaluated icons, the filter icon (Figure 18), did perhaps lead to a higher cognitive load. However, the analysis of the tests showed that the icons used in the rest of the prototype did not impair the users’ performance. The use of contrasts and colour also seemed to be a good design choice.

The analysis once again showed that the participants had no trouble with visual perception (Ware, 2020, Chapter 3). The last identified design choice that resulted in a low cognitive load was the principle of preserving the users’ familiar and prior work experiences. It was when this principle was not followed that the users had the most trouble with the prototype, which led to a higher cognitive load.

A strength of the method used is that three different quantitative measurements were taken. The three measurements were not equally sensitive to the same type of cognitive load, which made it possible to obtain a more accurate result for each type of cognitive load. The qualitative analysis provided support in those cases where the quantitative measurements fell short, e.g., in task 1. A weakness of the method is that the participants interacted with the prototype on a computer instead of on a smartphone as intended; the results for the experienced cognitive load when using the application are therefore not completely accurate. Another weakness is the low number of participants, which makes the study a case study. In spite of the low number of participants, the case study still yielded a large amount of relevant information about the subject under investigation. It is also possible that the results cannot be attributed to the design choices used, since the study did not compare the prototype with a system that lacked those design choices.

Future Research

In the future, it would be interesting to investigate whether there is a difference in the cognitive load experienced by the users when using their current systems compared to using the developed prototype. It can be hard to compare the two systems since some functions in the prototype are quite different from the current system. It would also be interesting to use statistical analyses when comparing the cognitive load, but that would require at least 10 more participants. The results showed that the participants experienced a higher cognitive load for the tasks that were not part of their familiar work practice. Because of this, it can be difficult to attribute the results to the implemented design choices, since the users are familiar with their current system and not the prototype. A more accurate measure of the cognitive load while interacting with the prototype could be achieved in the future if the participants used a smartphone instead of a computer.

References

DeLeeuw, K. E., & Mayer, R. E. (2008). A comparison of three measures of cognitive load: Evidence for separable measures of intrinsic, extraneous, and germane load. Journal of Educational Psychology, 100(1), 223–234. https://doi.org/10.1037/0022-0663.100.1.223

Fibonacci. (2007). Cornsweet illusion [Pseudo Edge]. Wikimedia Commons. https://commons.wikimedia.org/wiki/File:Cornsweet_illusion.svg

Grill-Spector, K., & Kanwisher, N. (2005). Visual recognition. Psychological Science, 16(2), 152–160. https://doi.org/10.1111/j.0956-7976.2005.00796.x

Jenny, B., & Kelso, N. V. (2007). Color design for the color vision impaired. Cartographic Perspectives, 58, 61–67. https://doi.org/10.14714/cp58.270

Kujala, S. (2003). User involvement: A review of the benefits and challenges. Behaviour & Information Technology, 22(1), 1–16. https://doi.org/10.1080/01449290301782

Oviatt, S. (2006). Human-centered design meets cognitive load theory. Proceedings of the 14th Annual ACM International Conference on Multimedia - MULTIMEDIA ’06, 871–880. https://doi.org/10.1145/1180639.1180831

Oviatt, S., Arthur, A., & Cohen, J. (2006). Quiet interfaces that help students think. Proceedings of the 19th Annual ACM Symposium on User Interface Software and Technology - UIST ’06, 191–200. https://doi.org/10.1145/1166253.1166284

Rogers, Y. (1989). Icons at the interface: Their usefulness. Interacting with Computers, 1(1), 105–117. https://doi.org/10.1016/0953-5438(89)90010-6

Sharp, H., Preece, J., & Rogers, Y. (2019). Cognitive aspects. In Interaction Design: Beyond Human-Computer Interaction (5th ed., pp. 101–134). Wiley.

van Gog, T., Kirschner, F., Kester, L., & Paas, F. (2012). Timing and frequency of mental effort measurement: Evidence in favour of repeated measures. Applied Cognitive Psychology, 26(6), 833–839. https://doi.org/10.1002/acp.2883

Ware, C. (2020). Lightness, brightness, contrast, and constancy. In Information Visualization: Perception for Design (Interactive Technologies) (4th ed., pp. 69–94). Morgan Kaufmann.

Wickline, M., & Human-Computer Interaction Resource Network. (2000). Color Blind Simulation [Computer software]. Human-Computer Interaction Resource Network. https://www.color-blindness.com/coblis-color-blindness-simulator/


Appendix

List of People to Recruit for Interview 2

The list of employees who were to be recruited by the production manager consisted of the following points:

● One user from each department within the painting process - a total of 6 participants.

● Half of the participants should be from the younger age group and half of the participants from the older age group.

● Both leaders and people working on the floor. Prioritize people who use the current system the most.

The list has been translated from Swedish to English.
