OpenCourse: Gamification in Computer Science Education
An investigation of a gamified educational application
Bachelor’s thesis in Computer science and engineering
Gustaf Bodin Elias Ekroth Tobias Engblom Filip Linde
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
CHALMERS UNIVERSITY OF TECHNOLOGY / UNIVERSITY OF GOTHENBURG
Gothenburg, Sweden 2022
OpenCourse: Gamification in Computer Science Education
An investigation of a gamified educational application
GUSTAF BODIN ELIAS EKROTH TOBIAS ENGBLOM
FILIP LINDE ANWARR SHIERVANI
Department of Computer Science and Engineering Chalmers University of Technology
University of Gothenburg
Gothenburg, Sweden 2022
OpenCourse: Gamification in Computer Science Education An investigation of a gamified educational application
Supervisor: Yuchong Zhang, Department of Computer Science and Engineering
Examiner: Pawel W. Wozniak, Department of Computer Science and Engineering
Bachelor’s Thesis 2022
Department of Computer Science and Engineering
Chalmers University of Technology and University of Gothenburg SE-412 96 Gothenburg
Telephone +46 31 772 1000
Typeset in LaTeX
Gothenburg, Sweden 2022
Academic degrees in computer science suffer from the highest dropout rates among university programs. As a result, there is a lack of technical professionals, which could threaten the digital infrastructure. For instance, employers might have difficulties hiring staff who possess the right knowledge, which could be detrimental to the security of their companies. Therefore, this project aims to capture and analyze students' motivation and engagement while using an educational tool that employs gamification and can be adopted in a university computer science course. As a proof of concept, a desktop-based application named OpenCourse was developed. A comprehensive evaluation of our proposed methods was then carried out.
We chose frameworks and libraries that allowed us to quickly adopt popular design conventions when designing OpenCourse. This was done to make the target users feel comfortable and become familiar with the interface quickly. The primary color of the desktop application was green, a choice informed by previous research. From this information, mockups for the theory, tasks, and score pages contained in OpenCourse were designed, and these were later used as the basis when developing the application.
To evaluate our methods, including the practical usage of the application, a within-subject experiment was performed. In this experiment, the users' results when using the application were compared to their results when using merely pen and paper. In addition, a heuristic evaluation was conducted, in which the application was assessed against a predefined set of heuristics. We observed that the university computer science students tended to feel more motivated and engaged when using the application than when using pen and paper. Our conclusions were drawn from a moderately sized sample, which means the results can be questioned. Nevertheless, our work points toward bringing gamified digital artifacts into university computer science education.
Keywords: Gamification, Education, Heuristic Evaluation, Desktop Application Development
Akademiska examina inom datavetenskap lider av den största andelen avhopp bland universitet. Som resultat av detta finns en brist på dataingenjörer, vilket kan hota den digitala infrastrukturen. Exempelvis kan företagens arbetsgivare ha svårt att anställa personal som besitter rätt kunskap, vilket kan vara skadligt för deras företags säkerhet. Därför är syftet med detta projekt att fånga och analysera elevernas motivation och engagemang under användning av ett utbildningsverktyg som kan användas i en universitetskurs inom datavetenskap, som använder spelifiering för undervisningen av ämnen. När det gäller “proof-of-concept” utvecklades en skrivbordsbaserad applikation, vid namn “OpenCourse”. Därefter genomfördes en omfattande utvärdering av våra föreslagna metoder.
Vi valde ramverk och bibliotek som enkelt möjliggjorde att använda populära designelement under design av OpenCourse. Detta syftade till att få målanvändarna att känna sig bekväma, och bekanta sig snabbt med designen av gränssnittet. Huvudfärgen för vår skrivbordsapplikation var grön, efter upplysning från tidigare forskning. Från denna information designades skisser för teori-, uppgifts-, och poängsidan i OpenCourse, som senare användes som baser när applikationen utvecklades.
För att utvärdera våra metoder, inklusive den praktiska användningen av applikationen, utfördes ett experiment inom ämnet. I experimentet jämfördes resultaten från användarna när de använde applikationen med resultaten från när de endast använde papper och penna. Dessutom genomfördes en heuristisk utvärdering där applikationen bedömdes utifrån en fördefinierad uppsättning heuristik. Vi observerade att universitetets studenter inom datavetenskap tenderade att känna sig mer motiverade och engagerade i applikationen, än när de använde papper och penna när de deltog i experimentet. Våra slutsatser erhölls från ett måttligt urval av deltagare, vilket innebär att resultaten kan ifrågasättas. Likväl ger vårt arbete en indikation och belyser riktningen för att föra in gamifierade digitala artefakter till datavetenskaplig utbildning på universitet.
Nyckelord: Spelifiering, Utbildning, Heuristisk Utvärdering, Datorapplikationsutveckling
We want to dedicate this section to thanking our supervisor Yuchong Zhang, for sup- porting and guiding our group throughout this bachelor’s project. The feedback received during the supervision meetings provided tremendous value and helped set the foun- dation for our project. Additionally, we would like to thank the people who partook in our experiment, as this enabled us to evaluate the concept behind this project. With their help, we have discovered new aspects to consider in this project.
MVP - Minimum Viable Product: a product with just enough features to be tested in order to gather feedback
GUI - Graphical User Interface: an interface allowing the user to interact with software through graphical icons
CSS - Cascading Style Sheets: a style sheet language for styling web markup documents
HTML - HyperText Markup Language: a markup language for creating documents meant to be displayed in a browser
UX - User Experience: deals with all aspects of the user's interaction with the specified product, in this case an application
Gamification - the concept of making a system, service, or activity more game-like in order to increase motivation and engagement
Front-end - the part of the software actively utilized and seen by the user
Cross-platform - computer software designed to work on several different platforms, such as different operating systems
Toast - a simple feedback message presented in a small popup for a few seconds
1 Introduction
1.1 Background
1.2 Purpose
1.3 Scope and Challenges
1.4 Related Work
1.5 Thesis Outline
2 Theory
2.1 Gamification
2.2 Application Evaluation
2.2.1 Experimental Design
2.2.2 Surveys
2.2.3 Likert Scale
2.2.4 Heuristic Evaluation
3 Tools and Frameworks
3.1 Electron
3.2 React
3.3 React-Bootstrap
3.4 Figma
4 Methodology
4.1 Application Development and Design
4.1.1 Application Requirements
4.1.2 Mockup
4.2 Evaluation
4.2.1 Experiments and Survey
4.2.2 Heuristic Evaluation
5 The OpenCourse Application
5.1 Theory Page
5.2 Tasks Page
5.3 Quiz Page
5.4 Score Page
5.5 Gamification Within the Application
6 Results
6.1 Comparative Evaluation
6.2 Non-Comparative Evaluation
6.3 Heuristic Evaluation
7 Discussion
7.1 Proof-of-Concept and Usability Evaluation
7.2 The OpenCourse Application
7.2.1 Heuristic Evaluation
7.3 Ethical Aspects
7.4 Limitations
7.5 Continuation of the Concept
8 Conclusion
Chapter 1 covers the background and purpose of this thesis, along with the project's scope and challenges, related work, and the thesis outline.
Academic degrees in computer science have the highest dropout rates among university programs. This high dropout rate, combined with the relatively low number of students, threatens the supply of technical professionals. When students were asked why they dropped out of computer science, almost half (49%) said they left because they did not enjoy their studies. Meanwhile, the US Bureau of Labor Statistics projects that employment within the computer and networking fields will grow 13 percent between 2020 and 2030, creating close to 667,000 new jobs in the United States alone.
The lack of professionals could make it hard for agencies and companies to hire staff with the right skills, which could threaten the safety of the digital infrastructure. According to the Swedish Civil Contingencies Agency, Sweden lacks secure IT systems and professionals in cybersecurity, leaving individual organizations and society vulnerable to cyber threats. They explain that Swedish society is heavily dependent on IT systems to function; protecting these systems is therefore a matter of national security.
Gamification for learning is a process that uses game mechanics to enhance learning, and it has been shown to improve learning and make it more engaging for individuals. Another source claims gamification is an effective teaching method because it gives learners control over their education and socializes learning. Learners can reflect on and compare their performance with their peers and start competitions within the class. This makes gamification a promising method for increasing the motivation and engagement of computer science students, and because of this the concept should be investigated.
The purpose of this project is to capture and analyze students' motivation and engagement during their use of an educational tool that employs gamification in the teaching of a university computer science course. The educational tool is to take the form of a desktop application. This application, named OpenCourse, will create an interactive and engaging setting for students, serving as an easy-to-use tool with a clear and straightforward user interface, so as to accurately portray the coursework environment and its problems. Our method is designed to provide a computerized artifact that helps enrolled students absorb scientific knowledge better than traditional teaching does. This conceptualized form of alternative educational course content will be tested in an experiment on students studying a computer science program at Chalmers University of Technology, Sweden.
1.3 Scope and Challenges
A number of limiting factors are needed for the production of the application, partly to keep the scope of the project reasonable within the given time frame. Within this scope, challenging aspects are bound to surface. Firstly, for the concept to be tested, it must be concretized in the form of a desktop application. This application should form a base that is stable enough to test the concept, while allowing for further development that would strengthen its usefulness and versatility. Secondly, an evaluation of the concept has to be performed in order to determine its viability and practicality. Thirdly, the application and its design choices must be adequately evaluated and tested in order to show that they can serve the purpose of such software, as well as to discover as many flaws and weaknesses as possible. The scope is mainly derived from the purpose of the project and its challenging aspects, and is narrated as follows:
1. To evaluate the proof of concept, experiments will be held in which computer science students at Chalmers participate. A more exhaustive investigation would not be possible, since it would take up too significant a part of the project. One could argue that the evaluation of the concept will be less scientifically rigorous as a consequence. However, some correlations may still be observed.
2. The application will be developed as a desktop application for computers running Windows. The application framework (see section 3.1) supports cross-platform deployment. However, the time constraints make it difficult to continuously ensure that the quality and reliability of the application's functions on different operating systems are up to par. Versions for other operating systems, like Android, iOS, Linux, and macOS, will therefore not be developed, but could be implemented with further development.
3. This application will base its theoretical content on the topics covered in the Chalmers course EDA343 - Computer Communication, specifically the course book Computer Networking: A Top-Down Approach, 7th edition by James Kurose and Keith W. Ross. The project's time constraints hinder the ability to include the course's entire educational content. Instead, the developed application will be of a smaller scale and adequately cover two to three chapters' worth of information, which equates to a similar number of study weeks within the course.
1.4 Related Work
Karl Emanuelsson and Elvira Gustafsson investigate the level of engagement within e-learning and explore the notion of engagement and possible ways of increasing it. Their thesis' close connection to engagement and e-learning places it in close proximity to this project, which is focused on measuring engagement and motivation when using a digital tool.
In a paper by Hamari et al., a literature review is conducted to investigate the positive and negative effects of gamification. The review concludes that gamification in general has positive effects on the user by, for example, heightening both the user's motivation and engagement.
Another paper, by Gari et al., investigates the effects of gamification on students. The paper reports that the gamification elements offering the highest increase in motivation and engagement among students are the opportunity to gather points and a leaderboard that shows the students' scores.
A study conducted in 2020 showed that gamification in an educational context improved the engagement of introverted university students. Another study, in a conference paper, displayed differences in the effects of gamification depending on the gamification features and certain personality traits. Both of these studies made use of the Big Five Personality Test (Big 5), which measures an individual's agreeableness, conscientiousness, extroversion, neuroticism, and openness. The usage of this psychological test allowed the authors to propose the aforementioned patterns.
1.5 Thesis Outline
Chapter 2 contextualizes valuable information for the understanding of the thesis. The following chapter, chapter 3, explains the tools and frameworks used in the realization of the application. Chapter 4 covers the development and evaluation methods of the project, and chapter 5 presents the resulting OpenCourse application. In chapter 6, the results of the application design and experiments are detailed, and chapter 7 discusses these results and topics surrounding the concept. Finally, chapter 8 contains a conclusion to the project.
Chapter 2 covers concepts central to the project and similar previous work.
One can be faced with situations where the inherent monotony of a real-world task stands in the way of its completion within a reasonable time. This is where gamification can be utilized: a method in UX that adds game-like elements to cumbersome tasks as a means to increase productivity. There are many contexts in which gamification could be implemented, e.g., in a business or in education.
The facilitation of gamification in education, as a way to improve students’ grades, can be approached in several different ways. A system can be implemented where the student is rewarded with points for their progression e.g., when they reach a certain checkpoint in a task. An implementation of such a system is believed to keep the student engaged in their education, and as a result improve their grades .
2.2 Application Evaluation
In the following sections, experimental design and survey research will be described, along with a common scale used in survey research, the Likert scale. Finally, the theory behind heuristic evaluation and its corresponding heuristics will be presented.
2.2.1 Experimental Design
Experimental design involves taking a hypothesis and testing it in a structured manner. Experimental design is built out of four components: the variables, the hypothesis, the treatment, and the treatment groups.
There are two types of variables: independent and dependent variables . One can look at these two different types of variables as cause and effect. Independent variables define the cause, and their values do not have any form of dependency on any other variable. Independent variables are the ones that the experimenters themselves intend to manipulate. Dependent variables define the effect, and their values depend on the independent variables. Dependent variables are meant to represent the outcome of the experiment.
To more easily describe what kind of correlation the experiment intends to analyze, a hypothesis can be specified. Describing a hypothesis, in the case of experimental design, is done in two parts. First, the null hypothesis H0 has to be specified. The null hypothesis states that there will be no effects, positive or negative, on whatever variables the experiment intends to identify relationships between. Secondly, the alternative hypothesis Ha has to be specified. It is the opposite of the null hypothesis, and it is supposed to represent the anticipated outcome of the experiment based on the relationships between some variables.
As previously mentioned, the independent variables are the ones that the experimenter wishes to manipulate in a certain way. The way that they are manipulated sets the ground for the intended experimental treatment and how it is designed. Two key things to have in mind while designing the treatment, is the internal validity and external validity.
• Internal validity is defined in regards to what extent the relationships between variables in the experiment can be seen as reliable and justified . A factor that could affect internal validity is maturation. Maturation essentially means that previous, prolonged exposure to a certain situation affects the value of a dependent variable.
• External validity has more to do with how the outcome can subsequently be gen- eralized for a larger or a different type of people . The Hawthorne effect is a factor that could affect the external validity. The Hawthorne effect refers to how the members of the treatment group change their behavior if they can tell that their progress is being observed during the experiment.
Choosing the random design to use when assigning subjects to treatment groups can be done in two different ways. Completely randomized design refers to a fully random assignment of the subjects: there are no patterns to the assignment, and it is performed regardless of any common characteristics between the subjects. The alternative, randomized block design, first groups subjects that share a characteristic into blocks and then randomizes the assignment within each block.
There are also two different ways of exposing the subjects to different treatment levels: between-subjects and within-subjects. In a between-subjects design, each subject receives only one treatment level. In a within-subjects design, each subject receives all of the treatment levels in consecutive order.
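A completely randomized assignment such as the one described above can be sketched in a few lines. This is a minimal illustration, not code from the project; the function name and the group labels "A" and "B" are chosen here for the example.

```javascript
// Completely randomized design: each subject is assigned to group "A" or "B"
// with equal probability, regardless of any characteristics of the subject.
function assignGroups(subjects) {
  return subjects.map((subject) => ({
    subject,
    group: Math.random() < 0.5 ? "A" : "B",
  }));
}

// Usage: every subject ends up in exactly one of the two groups.
const assignments = assignGroups(["s1", "s2", "s3", "s4"]);
console.log(assignments.every((a) => a.group === "A" || a.group === "B")); // true
```

In a within-subjects design, this grouping only affects ordering or logistics, since every subject still receives all treatment levels.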
Survey research is a way of collecting answers to questions that one might have regarding a certain group in the population. Three key things in creating a survey are: the sampling method, the type of survey, and the types of questions.
The intended target population of a research project should be strongly related to the theme of the research. From the target population usually comes a sample: a smaller set of individuals who participate in the survey. Sampling is the process through which the sample is created. Non-probability sampling is a way of sampling a population in which not everyone in the population shares the same chance of being included in the sample. The chances of an individual being selected depend on conditions that the researchers set themselves. An example of this is convenience sampling, in which the sampling is done based on the convenience of the researcher. This could be the case if the researcher is conducting their surveys in a specific physical area and accepts whoever happens to pass by at a given time.
There are two commonly occurring types of surveys: questionnaires and interviews. Questionnaires consist of questions that are meant to be filled in by the respondents. These commonly contain quantitative questions, counting the occurrences of a certain aspect, usually with numerical answers. Interviews, on the other hand, are always verbal. These are meant to contain qualitative questions, describing the quality of a certain aspect, usually with non-numerical answers. Further distinctions have to be made when it comes to what types of questions to include in the survey. Closed-ended questions are questions with a limited range of answers. The answer alternatives to such questions are usually represented in a binary form (yes or no) or through the use of a scale, e.g., a Likert scale. Open-ended questions have no fixed range of answers for the respondents to choose between. Instead, the respondents may answer whatever they want.
2.2.3 Likert Scale
The Likert scale is a rating system, commonly used in questionnaires, that is meant to quantify people's perception of and attitude towards the subject in question. It provides a set of answers to the proposed questions and statements, typically the following: 1. Strongly Disagree, 2. Disagree, 3. Neutral, 4. Agree, 5. Strongly Agree. The pollster must consider some issues that follow from using the Likert scale, and how to proceed with statistical analysis of the collected data. Examples include what range of answers to provide and how to interpret the answers and compare them to each other.
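One common way to summarize Likert responses is to tabulate how often each answer occurs and to compute an average, treating the five answers as the numbers 1 through 5. The following sketch is illustrative only (the function name is chosen for the example), and note that treating ordinal Likert answers as interval data for a mean is itself one of the statistical issues mentioned above.

```javascript
// Summarize Likert-scale answers coded 1 (Strongly Disagree) to 5 (Strongly Agree).
function summarizeLikert(responses) {
  const counts = [0, 0, 0, 0, 0]; // counts[i] = number of answers with value i + 1
  responses.forEach((r) => { counts[r - 1] += 1; });
  const mean = responses.reduce((sum, r) => sum + r, 0) / responses.length;
  return { counts, mean };
}

// Usage: five respondents answering one statement.
const summary = summarizeLikert([4, 5, 3, 4, 4]);
// summary.mean is 4, summary.counts is [0, 0, 1, 3, 1]
```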
2.2.4 Heuristic Evaluation
Heuristic evaluation is the process of having a number of people study and investigate an interface in order to judge it according to a predefined set of usability heuristics. These heuristics might be a list of required and overarching behaviors of whatever is being evaluated. For example, a heuristic might be: “The application should always provide a clear way for a user to return to a previous page.” A main idea behind heuristic evaluation is to identify usability problems and inadequate design quickly, without first having to release the product to end users to obtain the same feedback. Jakob Nielsen explains that while the evaluations can be performed by a single evaluator, the results can improve significantly with more than one. These evaluators should judge the interface without being influenced by others, and collect their thoughts in a list. However, each noted issue must be grounded in a violation of one of the heuristics.
As interaction design theory evolved, so did ideas around heuristic evaluation. One instance is when Nielsen and Rolf Molich developed a set of usability heuristics , where the main points are as follows:
1. Provide Feedback: The system should always keep the user informed about what is going on by providing him or her with appropriate feedback within reasonable time.
2. Speak the User’s Language: The dialogue should be expressed clearly in words, phrases, and concepts familiar to the user rather than in system-oriented terms.
3. Provide Clearly Marked Exits: A system should never capture users in situa- tions that have no visible escape. Users often choose system functions by mistake and will need a clearly marked “emergency exit” to leave the unwanted state with- out having to go through an extended dialogue.
4. Be Consistent: Users should not have to wonder whether different words, situa- tions, or actions mean the same thing.
5. Provide Good Error Messages: Defensive error messages blame the problem on system deficiencies and never criticize the user. Precise error messages provide the user with exact information about the cause of the problem. Constructive error messages provide meaningful suggestions to the user about what to do next.
6. Error Prevention: Even better than good error messages is a careful design that prevents a problem from occurring in the first place.
7. Minimize the User's Memory Load: The user's short-term memory is limited. The user should not have to remember information from one part of the dialogue to another. Instructions for use of the system should be visible or easily retrievable whenever appropriate. Complicated instructions should be simplified.
8. Provide Shortcuts: Clever shortcuts, unseen by the novice user, may often be included in a system such that the system caters to both inexperienced and experienced users.
9. Simple and Natural Dialogue: Dialogues should not contain irrelevant or rarely needed information. Every extraneous unit of information in a dialogue competes with the relevant units of information and diminishes their relative visibility.
Tools and Frameworks
Before picking the tools for developing OpenCourse, the group discussed their previous skills and experiences. Multiple group members had worked with web development in previous courses and organizations, so picking a web-based solution seemed like an appealing option. Consequently, it was decided that OpenCourse should be developed using web technologies. One might think that web technologies are used purely to develop web applications, but this is not true: there exist frameworks that make it possible to develop desktop applications with web technologies. Even though a web application would be available to more people and work on multiple devices, it was decided that OpenCourse would be developed as a native desktop application. One reason for this was that a web application needs a server, which costs money, but also that publicly distributing the copyrighted course material on the web would not be legal.
The prototype of OpenCourse will use scientific literature as well as exercises made by teachers from Chalmers University of Technology. The group has no rights to any of this material and therefore cannot publish OpenCourse on the internet.
Using a free and open-source library for designing elements within web pages can save time, while also following popular conventions. This is why React-Bootstrap was chosen.
Figure 1: Example of available React-Bootstrap button stylings. Adapted from .
React-Bootstrap has predefined styling and code for the most commonly used elements in modern web pages. Its ease-of-use and large library of available components speeds up the development process, since less time has to be spent creating these components.
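As an illustration of how little code such a predefined component requires, consider the following JSX sketch. This fragment is not taken from OpenCourse; the component and handler names are hypothetical, and it assumes a React build setup with react-bootstrap installed. The `variant` prop is React-Bootstrap's mechanism for selecting one of the predefined Bootstrap button stylings shown in Figure 1.

```jsx
// Hypothetical example (not from OpenCourse): a styled button using a
// predefined React-Bootstrap component instead of hand-written CSS.
import Button from "react-bootstrap/Button";

function SubmitAnswer({ onSubmit }) {
  // "variant" picks a predefined Bootstrap styling; "success" renders green,
  // matching the application's primary accent color.
  return <Button variant="success" onClick={onSubmit}>Submit answer</Button>;
}
```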
To help bring the application from idea to product, an initial prototype was created using the vector graphics editor Figma . Figma is a collaborative tool that allows users to create vector graphics on a large canvas with modern help features, which enables quick prototyping for OpenCourse.
Chapter 4 covers the methods used for the proof of concept of this project, including the application requirements and the development of OpenCourse. Furthermore, this chapter describes how the evaluation was carried out.
4.1 Application Development and Design
When structuring and picking the elements contained in a desktop application, it is a good idea to follow conventional designs. It can be advantageous to imitate design choices of other popular desktop or web applications. This is beneficial both in terms of time spent prototyping and designing, and in terms of user comfort and familiarity with common design elements they have previously experienced. If a user new to OpenCourse recognizes its visual elements, for example the navigation bar, icon choices, and button placements, they will most likely have enough intuition to navigate and understand the application as soon as they start using it.
The application is supposed to be easy for any student to use, which means that the interface has to be clear and intuitive in order to not hinder the students in their learning. Research within the subject of colors in an educational context indicates that green is a calming color in general, which helps keep students calm and concentrated during longer periods of time . This is why green was chosen as the primary accent color of the application.
This made it possible to test and evaluate the concept.
4.1.1 Application Requirements
Several requirements were imposed on the development process of the application. These were believed to be essential in regards to fulfilling the purpose of this project. The majority of the requirements were picked for their relevance, some being inspired by the mentioned set of heuristics by Nielsen. Other requirements were picked because they were deemed convenient.
The application should:
1. Be evaluated: OpenCourse should be tested and evaluated by students, which entails finding a metric that can be used to compare this teaching method and its practical use to standard educational practices.
2. Engage and motivate the user using gamification techniques.
3. Have a user interface that is easy to use, while providing a practical educational environment. It should also keep consistency between pages: similar actions and elements should be portrayed in similar ways.
4. Contain the book Computer Networking: A Top-Down Approach, 7th edition.
5. Contain tasks related to the course content. They should be fun and challenging, and should benefit the teaching of the subject.
6. Award score. Score is earned when the user completes tasks. This helps in provid- ing an overview of the user’s progress and encourages them to continue using the application.
7. Contain a leaderboard. The leaderboard lets the user compete with other users. It should display name, score, and rank, where the rank is based on the user's score.
Mockups for OpenCourse were iteratively created in Figma. Each page in the application, as well as its components, was roughly sketched. This facilitated an easier construction process in software and made the ideas easier to process. The method saved a lot of development time, since visual components could be designed quickly, and internal visual evaluation of these components could be done without an actual software implementation. Placeholder text and images were used before the content was decided, e.g., the temporary application name “NetView”. Two iterations of an overview page displaying available tasks contained in OpenCourse are shown in Figure 2 and Figure 3, respectively.
Figure 2: First iteration of the task overview page mockup.
Figure 3: Second iteration of the task overview page mockup.
The project had two main test categories: internal tests and user tests. The internal tests consisted of biweekly tests as well as a final heuristic evaluation. The biweekly tests were less formal in nature and were meant to provide quick feedback.
4.2.1 Experiments and Survey
In order to measure the effectiveness of the application and its gamification elements, experiments were carried out on 11 willing subjects. The null hypothesis H0, in the context of this experiment, was: “No change is going to occur in the subjects' motivation and engagement”, and the alternative hypothesis Ha was: “A positive change is going to occur in the subjects' motivation and engagement”. The subjects were divided into two treatment groups (“A” and “B”), with a probability of 50% of entering either one, meaning that a completely randomized design was used when assigning the subjects. Subjects in both treatment groups were exposed to the treatment within-subjects. They started with no treatment, completing course-related tasks using pen and paper, and then transitioned to the highest level of treatment, completing similar tasks through the gamified application.
A survey was created in the form of a questionnaire. It was meant to collect the values of the dependent variables included in the experiment (e.g., engagement and motivation), as well as to evaluate the participants’ overall enjoyment of the methods and their practicality. Parts of the questionnaire were filled in by the subjects during and after the experiment, with a project member sitting next to them during the entire process. The experiment, and therefore the questionnaire, was conducted on Chalmers’ Johanneberg campus. The participants were students of Chalmers’ computer science and engineering programs. They were not selected beforehand; instead, they were asked on the spot whether they would like to participate. It was also made clear that a reward, in the form of a candy bar, would be given at the end of the process as compensation for their efforts. As the participants were chosen as they happened to pass by, the sampling method used for the questionnaire was convenience sampling, a form of non-probability sampling. The four quantitative statements within the questionnaire all had five possible answers, in accordance with the Likert scale mentioned in section 2.2.3. All statements in the questionnaire were created by the project group and are shown in Table 1.
Table 1: Evaluation statements.
Statement 1: I felt engaged in the learning while performing these tasks
Statement 2: I think that this way of practicing course content was fun
Statement 3: It was easy and practical to answer the given tasks
Statement 4: I felt motivated in the learning while performing these tasks
The participants started with the pen and paper method. In this method they were provided with a computer displaying the PDF of the course book, as well as a paper answer sheet. The sheet contained the first two multiple-choice questions out of four in total, each question having five answer alternatives, one of which was correct. It also contained a hint directing them to the subchapter containing the information needed to solve the questions. After they had answered the questions, the participants were given the first part of the questionnaire. This closed-ended questionnaire contained four statements pertaining to the participants’ experience of the method. The participants were asked to evaluate these statements in order to obtain a metric of the method’s motivation, engagement, and enjoyment factors.
Next, the participants were given a short introduction to OpenCourse. They were shown three pages of the application: the theory page containing the course book PDF, the tasks page containing the tasks, and the score page containing the leaderboard.
The second part of the test was conducted using OpenCourse instead of the paper answer sheet and PDF. The participants were given a computer with the application, as well as written instructions of which task they should navigate to within OpenCourse.
Similar to the previous method, a hint provided within the selected task guided the participants to the theory page and the subchapter containing the information needed to solve the questions. The task given in this part of the test contained two multiple-choice questions with five possible choices. The participant’s group (A or B) determined which pair of questions was to be answered using the pen and paper method and which was to be answered in a task using OpenCourse; across the groups, all four questions were answered using both methods. The participants were then asked to register on the leaderboard with an anonymous nickname in order to save their score, as the leaderboard was a chosen gamification element. The aspect of gathering points when completing the tasks was also explained. After having answered the questions using OpenCourse, the participants were asked once again to evaluate the same statements as in Table 1 in the second part of the questionnaire, this time with the application in mind. Asking these questions after each method enabled a comparison between the two.
Finally, they were given the third part of the questionnaire, which contained seven statements created by the project group, with answer alternatives that also followed the Likert scale. These statements were meant to evaluate some details of the method and of the application, and are listed in Table 2. The sixth and seventh statements additionally had optional open-ended text input fields, where the participant could write what made the application hard to use and what features were missing, if applicable. These answers were regarded as qualitative.
Table 2: Final evaluation statements.
Statement 1: I felt like the point system had positive effects on my motivation and engagement
Statement 2: I felt like the leaderboard had positive effects on my motivation and engagement
Statement 3: I believe I would learn efficiently using the pen and paper method
Statement 4: I believe I would learn efficiently using the application method
Statement 5: I like the idea of comparing my scores to my classmates’ scores
Statement 6: I found the application easy to use
Statement 7: I felt the application was lacking features I expected
4.2.2 Heuristic Evaluation
Following the user tests, a heuristic evaluation was performed by internal members of the project. The evaluation was carried out in accordance with a list of actions described by Jakob Nielsen.
1. Define the set of heuristics to be used in the heuristic evaluation and identify the evaluators.
2. Allow the evaluators to understand how the application operates by letting them test it and get answers to questions that may arise regarding the interface from the designers.
3. Allow the evaluators to evaluate the interface’s usability against the chosen set of heuristics.
4. Collect a list of usability problems from the evaluators. The motivation behind each identified usability problem has to be based on one of the heuristics at hand.
5. Find a way to solve the usability problems that were identified. There are several ways of finding a solution to the problems, one of which involves a brainstorming session with the front-end designers and evaluators.
There were two evaluators, each taking roughly 45 minutes to complete their evaluation. It was important to perform these assessments in order to judge the application’s interface for any shortcomings in terms of usability from a structured and recognized point of view, as a complement to the experiments. This is why the evaluations were based on the set of heuristics created by Jakob Nielsen, as these are commonly used when performing a heuristic evaluation.
As the internal group structure resulted in members working on different aspects of the application, it was decided that members who had been distant from the application’s interface design could perform the heuristic evaluation. One of the remaining members was therefore picked to play the role of expert on the application’s interface. The two tests were carried out similarly, starting by allowing the evaluators time to get comfortable with the interface while the expert answered any questions. Next, the evaluators were given enough time to analyze the application’s content in detail, noting any violations of Nielsen’s heuristics. Finally, the evaluators’ results were collected and reflected upon, which will be discussed later.
The OpenCourse Application
Chapter 5 describes the final version of OpenCourse. It attempts to use common design elements, as well as to connect gamification concepts with content from the university course EDA343.
The resulting application contains three main pages: the Theory page, the Tasks page, and the Score page. Since green was chosen as the primary color of OpenCourse, a set of green colors was selected from a green gradient. These were then used as accent colors to complement the user interface, such as for the icons, buttons, navigation bar, and text wherever suitable.
5.1 Theory Page
Firstly, the theory page contains the EDA343 course book in PDF format, together with a sidebar that allows for quick navigation between chapters. A user can utilize this page to gather critical information to solve tasks, or to study in general. The top-right element in the navigation bar serves as a static visual element reminding the user which chapter they have accessed most recently within the theory page. The chapter sidebar also displays the number of points that can potentially be earned if the user decides to read a chapter and solve the corresponding task.
Figure 4: The theory page in OpenCourse
5.2 Tasks Page
On the tasks page, the user can find all available tasks. The tasks are fetched from a database and presented in the form of cards. Each card contains the possible points for the task, the subchapter that the task belongs to, the subchapter name, and an image that visualizes the subchapter. The task cards are displayed in ascending order, following the chapters of the book. In the top-right corner there are two drop-down buttons, which are used to filter the tasks by chapter, by task status (completed or uncompleted), or both. Completed tasks are grey-scaled to indicate to the user that they have been completed.
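The combined filter behavior described above can be sketched as follows. The `Task` fields are assumptions made for illustration and do not reflect the actual database schema.

```typescript
// Hypothetical task shape; field names are assumptions, not the real schema.
interface Task {
  id: number;
  chapter: number;
  subchapterName: string;
  points: number;
  completed: boolean;
}

// Combine the two drop-down filters: either filter may be unset (undefined),
// in which case it matches every task.
function filterTasks(tasks: Task[], chapter?: number, completed?: boolean): Task[] {
  return tasks.filter(
    (t) =>
      (chapter === undefined || t.chapter === chapter) &&
      (completed === undefined || t.completed === completed)
  );
}
```

Leaving a filter `undefined` corresponds to the drop-down being in its default state, so the same function covers filtering by chapter, by status, or by both.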
Figure 5: The tasks page in OpenCourse
5.3 Quiz Page
The quiz page is the page a user reaches when they select a task from the tasks page.
These pages house the exercise questions related to chapters within the book itself.
The application uses the task ID to fetch the questions belonging to the task that was clicked. The number of available questions within each quiz page varies, as does the progress point reward. In the top-left corner of the content view (beneath the navigation bar) is a back button, which is always visible and reminds the user that they can return to the tasks page. The only question types available are those with pre-selected answers for the user to choose from, with one correct answer each. The user chooses one of the five answers by selecting one of the radio buttons in the radio button group available for each question. When the user has answered the questions, or wants to check whether they have answered a question correctly, they press the submit button. When pressed, correct and incorrect answers are marked in green and red respectively, and a popup toast message is displayed congratulating the user if all questions in the current task are answered correctly.
Below the last question of the task, there is a box that tells the user how many of the questions have been answered, and a submit button. When the user presses the button, the application responds with the result in the form of points. For each question, the background color turns green if the answer was correct and red if it was incorrect. If a question was not answered, the background color is left unchanged. If the user managed to get all answers correct, a toast message is displayed at the top of the screen. The submit button becomes a leave button after being pressed, which directs the user back to the tasks page. There is also a button in the top-left corner if the user wants to leave the task and go back to the tasks page.
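The grading behavior on submission can be sketched as below. The types and function name are illustrative assumptions; only the rules (points per correct answer, unchanged background for unanswered questions, toast on a fully correct task) come from the description above.

```typescript
// Hypothetical grading sketch for a submitted quiz; types are assumptions.
interface Question {
  id: number;
  correctChoice: number; // index of the correct radio button
  points: number;
}

interface GradeResult {
  earnedPoints: number;
  allCorrect: boolean; // drives the congratulatory toast
  perQuestion: Map<number, boolean>; // true → green, false → red background
}

function gradeSubmission(
  questions: Question[],
  answers: Map<number, number> // questionId → chosen radio button index
): GradeResult {
  const perQuestion = new Map<number, boolean>();
  let earnedPoints = 0;
  for (const q of questions) {
    const chosen = answers.get(q.id);
    if (chosen === undefined) continue; // unanswered: background unchanged
    const correct = chosen === q.correctChoice;
    perQuestion.set(q.id, correct);
    if (correct) earnedPoints += q.points;
  }
  const allCorrect = questions.every((q) => answers.get(q.id) === q.correctChoice);
  return { earnedPoints, allCorrect, perQuestion };
}
```

Questions absent from `perQuestion` keep their default background, matching the behavior where unanswered questions are not recolored.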
Figure 6: An example quiz page in OpenCourse
5.4 Score Page
The score page presents the user’s score and achievements. Each question in the tasks has an associated number of points that the user can earn for a correct answer, and the sum of all collected points makes up the displayed score value. The user gains points for every correctly answered question, but to mark a task as complete, the user needs correct answers to all of its questions. The points are visualized in a React Bootstrap progress bar. For the text to always be visible in the progress bar, the green part of the bar showing the current progress in percent is set to the maximum of 16 and the current percentage, meaning it never covers less than 16% of the bar. Above the progress bar, the user’s awards are shown. The text shows how many of the total number of tasks the user has completed.
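The 16% lower bound on the displayed progress can be expressed directly; the function name is illustrative, but the clamping rule is the one described above.

```typescript
// Clamp the displayed progress so the label inside the bar stays readable:
// the green segment never shrinks below 16% of the bar's width.
function displayedProgress(score: number, maxScore: number): number {
  const percent = (score / maxScore) * 100;
  return Math.max(16, percent);
}
```

A new user with a score of 0 therefore still sees a 16%-wide green segment, while any real progress above 16% is shown unmodified.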
Beneath the progress bar is the leaderboard. By clicking the green “Register” button, the user is prompted to enter an anonymous nickname, and is then shown how many points they have collected in comparison to other users of the application who have also uploaded their scores.
Apart from the awards, points, and the leaderboard, the user can see which tasks are completed and which are not. Here, the tasks are presented in the opposite way to the tasks page: the uncompleted tasks are fully grey-scaled, and the completed tasks are shown without grey-scale.
Figure 7: The score page in OpenCourse
5.5 Gamification Within the Application
There are four gamification-oriented elements within OpenCourse. The score, awards, and leaderboard elements were chosen according to the observations in , which presented a list of the most commonly occurring gamification elements. The last one was chosen as it was assumed that it would have some form of effect on the students’ motivation.
Score can be earned by completing task questions. The score shows how far the user has progressed within the application. A user’s score is displayed in the top-right corner of the application and on the progress bar on the score page. It is also displayed on the leaderboard, if the user has registered. The application is considered completed once the user has reached the maximum score.
Awards are earned once a task is completed. A task is considered completed if the user answers correctly on all the quiz questions. The user’s awards are displayed on the score page.
The leaderboard lets the user compete with other users in the same group. Registering on the leaderboard is optional. On registration, the user enters their nickname and their group. After registration, the user can view the nickname, score, and rank of all members in the same group. The rank is based on the user’s score.
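The per-group ranking can be sketched as follows; the entry fields and function name are assumptions for illustration, but the logic (restrict to the viewer's group, rank by score) follows the description above.

```typescript
// Hypothetical leaderboard entry; field names are assumptions.
interface Entry {
  nickname: string;
  group: string;
  score: number;
}

// Show only members of the viewer's group, ranked by descending score.
function groupLeaderboard(entries: Entry[], group: string): (Entry & { rank: number })[] {
  return entries
    .filter((e) => e.group === group) // filter() copies, so sort() is safe here
    .sort((a, b) => b.score - a.score)
    .map((e, i) => ({ ...e, rank: i + 1 }));
}
```

Computing rank from the sorted position keeps the stored data minimal: only nickname, group, and score need to be persisted.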
Toasts provide feedback and a motivational quote to the user, on task completion. It is shown for 3 seconds on the quiz page once the user presses the “submit” button and has answered all questions correctly. The quote is randomly selected from a set of quotes.
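The random quote selection can be sketched in a few lines; the quotes below are placeholders, not the actual set used in OpenCourse.

```typescript
// Sketch of the toast's quote selection: one quote is picked uniformly at
// random each time a task is completed. Quotes here are placeholders.
const QUOTES = ["Well done!", "Keep it up!", "Great progress!"];

function pickQuote(random: () => number = Math.random): string {
  return QUOTES[Math.floor(random() * QUOTES.length)];
}
```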
Chapter 6 presents the results of the OpenCourse evaluation, as well as the heuristic evaluation.
6.1 Comparative Evaluation
In this section, the answers to the four statements detailed in Table 1, used to evaluate the two methods in the experiment, are presented. These methods were the pen-and-paper method and the OpenCourse method. The 11 participants responded to these statements after using pen and paper and after using OpenCourse, respectively. The collected results are shown in Tables 3 and 4.
Table 3: Results from statements with pen and paper method.
# | Statement | Strongly Disagree | Disagree | Neutral | Agree | Strongly Agree
1 | I felt engaged in the learning while performing these tasks | 1 | 3 | 3 | 4 | 0
2 | I think that this way of practicing course content was fun | 1 | 3 | 6 | 1 | 0
3 | It was easy and practical to answer the given tasks | 0 | 2 | 3 | 4 | 2
4 | I felt motivated in the learning while performing these tasks | 0 | 6 | 2 | 2 | 1
Table 4: Results from statements with the OpenCourse method.
# | Statement | Strongly Disagree | Disagree | Neutral | Agree | Strongly Agree
1 | I felt engaged in the learning while performing these tasks | 0 | 1 | 2 | 8 | 0
2 | I think that this way of practicing course content was fun | 0 | 1 | 6 | 4 | 0
3 | It was easy and practical to answer the given tasks | 0 | 4 | 2 | 3 | 2
4 | I felt motivated in the learning while performing these tasks | 0 | 2 | 4 | 4 | 1
The comparisons of the results for each individual statement are listed below. The results for the pen-and-paper method are shown on the left, whereas the results for OpenCourse are shown on the right.
Results for Statement 1: The results reveal fewer participants agreeing (answering either “Agree” or “Strongly Agree”) when the pen and paper method was used, while the number of disagreeing answers (“Strongly Disagree” and “Disagree”) was greater. The answers for this method were as follows: one person strongly disagreed, three disagreed, three remained neutral, and four agreed. The corresponding answers for the OpenCourse application were instead: one disagreed, two were neutral, and eight agreed with Statement 1. The results are shown in Figures 8 and 9.
Figure 8: The results of Statement 1 using pen and paper.
Figure 9: The results of Statement 1 using Open- Course.
Results for Statement 2: The number of neutral answers stayed the same between the two methods, totalling six answers both times. However, the OpenCourse method saw an increase in agreeing answers and a decrease in disagreeing answers. The pen and paper method had the following answers: one strongly disagreed, three disagreed, six were neutral, and one agreed. For OpenCourse the answers were instead: one disagreed, six were neutral, and four agreed. These results are visualized in Figure 10 and Figure 11.
Figure 10: The results of Statement 2 using pen and paper.
Figure 11: The results of Statement 2 using OpenCourse.
Results for Statement 3: Regarding the pen and paper method, two participants disagreed, three were neutral, four agreed, and two strongly agreed. For OpenCourse, however, the numbers of agreeing and strongly agreeing answers were three and two, respectively. Two participants remained neutral, and four disagreed with Statement 3, as shown in Figure 12 and Figure 13.
Figure 12: The results of Statement 3 using pen and paper.
Figure 13: The results of Statement 3 using OpenCourse.
Results for Statement 4: The number of participants disagreeing decreased when using OpenCourse, while the number of agreeing answers increased. The answers regarding the pen and paper method were: six disagreed, two were neutral, two agreed, and one strongly agreed. The answers regarding the OpenCourse method were: two disagreed, four were neutral, four agreed, and one strongly agreed. The answers for the two methods are visualized in Figure 14 and Figure 15, respectively.
Figure 14: The results of Statement 4 using pen and paper.
Figure 15: The results of Statement 4 using OpenCourse.
6.2 Non-Comparative Evaluation
In this part, the results of the last seven statements evaluated by the test participants are shown. These statements were meant to give the participants the opportunity to evaluate the impact that the test methods and the specific gamification elements had on their motivation and engagement. They also had the opportunity to provide feedback on the application design. The statements and their corresponding answers are listed together in Table 5.
Table 5: Results from the application evaluation statements
# | Statement | Strongly Disagree | Disagree | Neutral | Agree | Strongly Agree
5 | I feel like the point system had positive effects on my motivation and engagement | 0 | 0 | 2 | 9 | 0
6 | I felt like the leaderboard had positive effects on my motivation and engagement | 0 | 0 | 4 | 6 | 1
7 | I believe I would learn efficiently with the pen and paper method | 1 | 1 | 6 | 2 | 1
8 | I believe I would learn efficiently with the application method | 0 | 1 | 2 | 8 | 0
9 | I like the idea of comparing my scores to my classmates’ scores | 1 | 2 | 1 | 6 | 1
10 | I found the application easy to use | 0 | 0 | 3 | 5 | 3
11 | I felt the application was lacking features I expected | 2 | 1 | 0 | 8 | 0
The compiled results for each individual statement are presented as follows.
Results for Statement 5: A majority consisting of nine participants agreed with this statement, while two were neutral to the statement, as shown in Figure 16. This indicated that the participants felt a positive effect on motivation and engagement when a point system was used.
Figure 16: The results of the Statement 5 eval- uation
Results for Statement 6: For this statement, a majority of six participants agreed, and one strongly agreed. Meanwhile, four participants remained neutral. This distribution of answers suggests that the leaderboard system also had a positive effect on the participants’ motivation and engagement, as shown in Figure 17.
Figure 17: The results of the Statement 6 eval- uation
Results for Statement 7: In the case of this statement, where the focus is on the pen and paper method, the results show a wider distribution of answers. A majority of the participants were neutral, totalling six answers. The rest of the participants answered as follows: one strongly disagreed, one disagreed, two agreed, and one strongly agreed. This wide spread of answers indicates that pen and paper is believed to be neither an efficient nor an inefficient way of learning. These answers are visualized in Figure 18.
Figure 18: The results of the Statement 7 eval- uation
Results for Statement 8: In the case of this statement, the focus is put onto the application method. Compared to the previous statement, a much higher proportion of the participants agreed with the statement, at eight out of eleven answers. Meanwhile, a smaller share of the answers were neutral or disagreed, at two and one answers, respectively. This indicates that the application was believed to provide an efficient method of learning. This distribution is shown in Figure 19.
Figure 19: The results of the Statement 8 eval- uation
Results for Statement 9: Six participants agreed with the statement, making up a majority of the answers, while one strongly agreed. In contrast, two disagreed, one strongly disagreed, and one remained neutral. This indicates that many students would like the idea of anonymously comparing their scores to their classmates’ scores. However, it should be noted that a substantial share of participants did not like this idea. This is illustrated in Figure 20.
Figure 20: The results of the Statement 9 eval- uation
Results for Statement 10: A majority of the participants agreed with this statement, where three strongly agreed and three were neutral. If they disagreed, they were asked to specify what made the application hard to use. There were two answers to this question, even though none of the participants disagreed. The first answer mentioned that the application was missing a search feature (in other words, a “CTRL + F” command), which is common in PDF and document readers. This feature would enable users to search for keywords or phrases in the book available in the application. The second answer expressed that the application was missing a text zooming function, and that this made the application “unnecessary difficult”. The answers to Statement 10 are detailed in Figure 21.
Figure 21: The results of the Statement 10 evaluation
Results for Statement 11: A majority of the participants agreed with this statement, totalling eight answers, while two strongly disagreed and one disagreed. If they agreed, they were asked to specify which features they were missing. Here as well, answers mentioned the missing “CTRL + F” command and zooming function. Some users expressed wanting a search bar, and one mentioned that the user’s selected answers to a task were not saved between page switches in the application. When a user went back to the quiz page, none of the answers were selected, and the user had to re-check the already answered questions. The answers to Statement 11 are shown in Figure 22.
Figure 22: The results of the Statement 11 evaluation
6.3 Heuristic Evaluation
In this section, the results from the heuristic evaluation performed by the internal project members are presented. The evaluation included some usability problems for the application’s three pages, being the theory, tasks, and score pages, as well as mentioning some general issues violating the chosen heuristics.
On the theory page, it was noted that in order to see the active page number within the book, one needed to hover the mouse over the active PDF page to display a button group enclosing the current page number. This group was located close to the bottom of the screen, and the page number was not shown anywhere else in the interface. The user therefore has to scroll down the page to see this number, meaning that the user is not always able to know what page they are on unless they either remember it from earlier or scroll down the page while hovering the mouse over it. This was mentioned by both evaluators, who judged that it goes against the Provide Feedback and Be Consistent heuristics.
The proposed solution was to display the page number on both button groups relating to changing the current page, not only the bottom one, or to place a static label that follows the user’s scrolling so that the page number is always displayed on the theory page.
Another noted inconsistency was the lack of a visual element indicating to the user that the sidebar housing the chapter links was scrollable. As a scrollbar appeared for the PDF content displaying the book, this should also be the case for the left sidebar. This inconsistency goes against the Be Consistent and Provide Feedback heuristics, and was thought to be solved by adding a scrollbar to the chapter sidebar.
It was also recognized that not all elements of the chapters in the chapter sidebar were clickable. This applied to the page number range, the progress point text, and the page icon; only the title text of each chapter and subchapter was clickable. These elements can be viewed in Figure 4. Furthermore, it was noted that no hover highlighting appeared when hovering over each subchapter in the sidebar; only the cursor styling changed to indicate something clickable. Since there already was visual hinting when hovering over the three available pages in the navigation bar, this was considered to be missing from the sidebar. The evaluators judged that these issues went against the Provide Shortcuts and Be Consistent heuristics. The suggested solution was to add a change in color (a highlight) to every element relating to the same subchapter when hovering over it in the chapter sidebar, as well as making the icons and the rest of the text clickable.
The tasks page, like the previous page, contains a button inconsistency, as the “Back” button that appears when entering a task does not share the appearance of the rest of the application’s buttons. The button is also grey, which makes it blend into the background of the page. This breaks the Provide Clearly Marked Exits and Be Consistent heuristics. When inside one of these tasks, the user is able to mark answers before submitting the questions. The user can still navigate outside the page, e.g., to read the theory page, but when the user returns to the task, the marked answers are removed without warning. This means that the user needs to remember to check the answers again before submitting the task, or risk submitting the task without answers and losing points. This violates the Minimize the User’s Memory Load and Provide Feedback heuristics. Another minor flaw is the lack of a “Filter” label next to the filter buttons, which may cause the user to miss the filter function. This also violates the Provide Feedback heuristic.
Next, the score page shows all the completed and uncompleted tasks, with the same images as the tasks page. This may make the user believe that clicking them would take them to the tasks, but it does not. Adding this feature could be a good way of adding more shortcuts to the application, which would follow the Provide Shortcuts heuristic. This page also contains the leaderboard element, the mechanics of which are not clearly explained. The user is able to retry questions, and the score on the leaderboard only updates if they beat their previous score on that question, but neither of these functions is explained, which goes against the Provide Feedback heuristic. Adding a small explanation window, either before or after registering, would work towards solving this problem. Finally, an area marked “Awards” looks like the other buttons in the application, but is not one, and can therefore confuse the user.
This violates the Be Consistent heuristic, and can be fixed by changing the design of this area away from the typical button design.
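The leaderboard update rule the evaluators noted (a retried question only ever raises the stored score, never lowers it) amounts to a single comparison; the function name here is illustrative.

```typescript
// A retried attempt only updates the stored score when it beats the
// previous best; otherwise the old score is kept.
function updateScore(previous: number, attempt: number): number {
  return Math.max(previous, attempt);
}
```

Making this rule visible in the interface, as the evaluators suggested, would address the Provide Feedback violation.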