DEGREE PROJECT IN COMPUTER SCIENCE AND ENGINEERING, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2020
Designing and Evaluating a Visualization System for Log Data
XIAOHAN WANG
KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE
Designing and Evaluating a Visualization System for Log Data
Xiaohan Wang
KTH Royal Institute of Technology Stockholm, Sweden
xiaohanw@kth.se
ABSTRACT
In the engineering field, most companies analyze log data, as it has become a significant step for discovering problems and gaining insight into their systems.
Visualization, which brings better comprehension of data, can serve as an effective and intuitive method for data analysis.
This study applies a participatory design approach to develop a visualization system for log data, employing design activities including interviews, prototyping, usability testing and questionnaires in the research process, along with a comparative study on the impact of using narrative visualization techniques and storytelling on usability and user engagement with exploratory visualizations. The findings suggest that using storytelling and narrative visualization techniques seems to increase user engagement, while it does not seem to increase usability. Definitive conclusions could not be drawn due to the low demographic diversity of participants; however, the results can serve as an initial insight to trigger further research on the impact of storytelling and narrative visualization techniques on user experience. Future research is encouraged to recruit a larger and more diverse group of participants, to pre-process log data, and to conduct a comparative study on selecting the best visualization for log data.
Author Keywords
Usability evaluation, User Engagement, Visualization, Participatory design, Log data
INTRODUCTION
Analysis of log data is conducted by most companies nowadays. Through analyzing log data, engineers can understand the past working status of a system in order to discover and solve problems in a timely manner. Moreover, log data records every interaction between users and the system, which can reflect user behaviors and habits to some extent. Compared with other types of data, log data is usually massive and messy, which urges analysts to find measures to deal with the data and obtain insights from it. In addition, as storage space is limited, old backlogs eventually have to be deleted, so a rapid and efficient method for analyzing the data is highly needed. Visual aids that facilitate better comprehension have become a trend for most companies, and information visualization methods have been adopted for analyzing log data. However, as most visualization tools used in companies either concentrate on displaying data with high accuracy and computational power or on incorporating novel technologies into visualizations,
these tools might not be customized for different user groups and might not have considered factors relating to user experience. Nowadays, a simple visualization presenting data is not sufficient, as users expect an explorative visualization tool with a high level of usability that attracts them to interact with it. Storytelling has become an emerging topic in visualization research and practice, as the study of narrative visualizations (e.g., [12]) and the development of storytelling tools prove. Storytelling and narrative visualization techniques are gradually becoming an element to consider when designing visualizations.
The company collaborating on this project also encounters problems in dealing with log data, and a visualization solution is expected to be designed for displaying it.
The data comes from an intelligent speaker, "Rokid", a smart-home device incorporating AI technology that can act as a main controller for other smart-home equipment from up to 44 brands. It adopts natural language processing technology for the Chinese language to make speech interaction idiomatic.
Thanks to its intelligent features, it can recognize speech commands of only a few words, which, however, leads to the production of substantial amounts of data. Currently, log data from "Rokid" is not being treated properly, and the visualization tools used in the company are confined to traditional tools providing few interactions, such as Excel and Matlab. In addition, data analysts in the company do not specialize in log data, and they report that it is difficult to maintain interest in investigating massive data sets, especially log data, which is usually repetitive and voluminous.
In this project, my interest is to use storytelling and narrative visualization techniques to develop a visualization system for viewing log data, and to assess whether using these techniques can increase usability and user engagement. The iterative cycle of design, implementation and testing is followed throughout this study, and a participatory design method is used, as real users are involved in every step of the process. Scope and delimitations were discussed beforehand, and this thesis focuses on the HCI aspects of visualization and evaluation. Based on the aforementioned background, the study examines the design process and design artifact in the context of log data visualization under the research question:
Research Question
What are the impacts of using storytelling and narrative visualization techniques in designing visualization for log data?
1. Does it help to increase usability, as measured by the System Usability Scale survey?
2. Does it help to engage users in exploring the data, as measured by the User Engagement Scale questionnaire?
BACKGROUND
Motivation
Since ancient times, stories have been used as a vehicle to communicate and convey information. Storytelling is an efficient method of information transmission. The final purpose of both text media, such as newspapers and books, and visual media, such as film and anime, is to tell a story. A well-described, vivid story easily attracts an audience, and thus plentiful and useful information can be readily discovered.
What effects would be generated if the skills of storytelling were incorporated into visualization? The audience might feel as if they were watching a movie, and key information would be easily discovered.
Traditional information visualization and visualization tools focus more on mining and analyzing the data itself. For example, traditional spreadsheet tools (e.g. Microsoft Excel) put emphasis on presenting data in a diagram. Comparatively, several recent visualization applications with a stronger focus on storytelling (e.g. Tableau Public) provide functions allowing users to interact more with the visualization. However, in the context of exploratory visualization, where viewers may not have specific knowledge about the data or about the visualization system, data querying and data comprehension may be problematic. For the past few years, visualization research in companies has traditionally focused on the exploration and analysis of data, with much of the work focusing on novel techniques. Although the design space of visualization has been thoroughly studied and it is now easy to find suitable techniques for most data sets and tasks, research from the users' perspective is still scarce and not well integrated. Designers and researchers [12, 31] have suggested that using storytelling in visualization could trigger user interaction and exploration, as it can attract and motivate users to discover.
However, comparative studies on the impact of using storytelling in visualization on user experience are still in their infancy.
Therefore, there lies an opportunity to assess whether using storytelling and narrative visualization techniques can enhance user experience. Here, due to time constraints, I chose two indicators for study, usability and user engagement, and treat the two as parallel, independent aspects.
Information Visualization
Visualization, which is a broad term, provides a method that helps to acquire new insights not easily perceived by looking at raw data. Human cognition is augmented and conclusions are drawn from data visualization. Spence [27] suggested that the visualized data itself does not directly tell anything; rather, the way it is represented visually helps to gain insights. Mackinlay [6] divided visualizations into two categories: scientific visualization and information visualization. Scientific visualization relates to physical data and is usually used by researchers for academic purposes. In contrast, information visualization can contain any type of data. Information visualization presents data in an intuitive, comprehensible and easy-to-operate way and conveys information to users. Scientists used to be the major users of visualization; however, as the internet and media develop rapidly, more and more people are exposed to visualizations [10].
Data can be categorized into the following formats: captured interactions, profiles, time series, and combinations of the above [30].
Nowadays, as technology advances, it is easy for computers to handle large data sets, which provides users with new insights into data. According to the proposition by Schroeder and Noy [30], information visualization can be divided into four steps: capture data, prepare data, process data and visualize data. Capturing data plays the most important role in visualization; the logged data is then prepared, which involves dealing with missing values and normalizing and weighting the data. Given the prepared data, the next step processes the data using, for example, hierarchical clustering and multi-dimensional scaling. This thesis project focuses on the last step and, to some extent, on the data processing step.
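The "prepare data" step described above (handling missing values, normalizing and weighting) might be sketched as follows. This is a minimal illustration, not Schroeder and Noy's exact procedure; the `prepare` function, the median imputation and the min-max normalization are illustrative choices:

```python
import numpy as np
import pandas as pd

def prepare(df, weights=None):
    """Sketch of the 'prepare data' step: fill missing values,
    min-max normalize numeric columns, and apply optional weights."""
    df = df.copy()
    num = df.select_dtypes(include=np.number).columns
    # Fill missing numeric values with the column median
    df[num] = df[num].fillna(df[num].median())
    # Min-max normalization to [0, 1]; constant columns divide by 1
    rng = (df[num].max() - df[num].min()).replace(0, 1)
    df[num] = (df[num] - df[num].min()) / rng
    # Optional per-column weighting
    if weights:
        for col, w in weights.items():
            df[col] = df[col] * w
    return df
```

Processing steps such as hierarchical clustering or multi-dimensional scaling would then operate on the frame this function returns.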
Developing Visualization
Various factors contribute to the design and development process of visualizations. Four facets of the overall system (people, activities, technology and environments) have to be brought into harmony. Graham et al. described and illustrated a general and effective methodology for developing visualizations [9].
In order to achieve a system with a high level of usability, they suggested adopting the development approach described below:
1. Use an iterative method of development
2. Real users should be involved in the design process (participatory design method)
3. Focus on the goals rather than the functions
4. Interface design should be attended to and documented

Graham et al. used this methodology to build a number of prototypes of a botanical taxonomy visualization in an iterative cycle of design and testing. The authors presented how a research-through-design model may be applied to visualization design, improving the tool while using user research methods, which can serve as a reference for developing visualizations.
Narrative Visualization
Hullman and Diakopoulos define narrative visualization as "a style of visualization that often explores the interplay between aspects of both explorative and communicative visualization.
They typically rely on a combination of persuasive, rhetorical techniques to convey an intended story to users as well as exploratory, dialectic strategies aimed at providing the user with control over the insights gained from interaction" [12]. Their article explains the effects of visualization rhetoric on comprehension, and the mentioned "interplay between aspects of both explorative and communicative visualization" is related to the "balance between author-driven and reader-driven narrative scenarios" proposed by Segel and Heer [31]. An author-driven scenario usually follows a linear sequence that includes a large amount of information with few interactions. In the most extreme cases, no interaction is provided at all and the visualization is entirely guided by the author. Comparatively, a reader-driven scenario allows the audience to receive the information in a random sequence with free interactions, which offers more autonomy. In real practice, interactive narrative visualization rarely falls directly into either of these two categories but rather somewhere in between. For example, a slideshow-style visualization follows a linear sequence while allowing users to navigate content both forward and backward to discover interesting points. In this project, the final visualization design used a combination of author-driven and reader-driven approaches. Moreover, it is investigated whether an author-driven introductory "story" can trigger a reader-driven scenario in a later phase.
Card and Mackinlay [7] analyze the information visualization design space in their paper, and they describe examples relating to specific fields. They analyze several specific visualization methods, including the semiology of graphical data, multi-dimensional plots, multi-dimensional tables, information landscapes and spaces, node-and-link diagrams, and trees, and also discuss how to use these methods in the visualization design space. Javed and Elmqvist [14] analyze the design space of composite visualization. They propose five general design patterns for composite visualization: juxtaposition, integration, overloading, superimposition and nesting. They define the corresponding visualization design space of these patterns, which covers the type of visualizations, spatial relations and data relations. Segel and Heer [31] also propose a design space for narrative design elements and identify three structures of narrative visualizations:
the Martini Glass structure, the Interactive Slideshow and the Drill-Down Story. The Martini Glass structure prioritizes the author-driven approach and is divided into two steps. First, questions, observations or written articles are initially used to convey the author's intended narrative; sometimes an interesting default view of the visualization or annotations are displayed with no text at all. Second, once the author's initial narrative is complete, a reader-driven stage begins in which the user can interact with the visualization freely. The structure resembles a martini glass, with the stem representing the single-path author-driven scenario and the mouth of the glass representing the possible paths decided by reader-driven activity. Therefore, the authoring segment should function as a "jumping off point for the readers' interaction", and this structure is the most common one [31]. The Interactive Slideshow structure uses a standard slideshow format while incorporating mid-narrative interaction within every slide. It allows users to explore particular points and is a more balanced mix of author-driven and reader-driven approaches compared with the Martini Glass structure. The Drill-Down Story structure puts more emphasis on the reader-driven approach. It presents a general theme at the beginning and then allows users to dictate what stories are told and when. In this project, I focus on the first two structures to design the visualization.
These frameworks are highly helpful for designing interactive visualizations, and they also prompt the question of whether the use of narrative visualization techniques in an introductory author-driven scenario can increase the level of user engagement and usability in a later, more reader-driven scenario.
Research through design
The term design has previously been more common in HCI practice than in HCI research, and has often been linked with usability engineering [37]. Recently, HCI scholars have been researching how to combine research with design. Zimmerman et al. have developed a model of interaction design research and a set of criteria for its evaluation [37]. The advantages of interaction designers' working methods were considered beneficial to the HCI field. One of the biggest problems, the conflicting perspectives of stakeholders, also called the "wicked problem", can be addressed by interaction designers [26].
The work of the design researcher can be described as "...the study, research, and investigation of the artificial made by human beings, and the way these activities have been directed either in academic studies or manufacturing organizations" [2]. The aim of this work is to improve the process and generate theories.
The process of designers collaborating with software developers is defined as "creative design" by Lowgren, to distinguish it from the engineering approach [17]. Research-oriented design, proposed by Fallman, was used to describe the research done by engineers and behavioral scientists [8]. Whereas the goal of design-oriented research is to expand knowledge, the aim of research-oriented design is to develop a design artifact.
While the design-oriented method often abandons the prototype, designers usually follow the research-oriented approach in order to deliver work to users.
Usability and User Engagement
Usability is part of the broader term "user experience" and refers to the ease of access and/or use of a product or website.
Usability is defined in ISO 9241-11 as "the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use" [33]. The features of a design, together with who the user is, what the user wants to do with it, and the user's environment, determine its level of usability. Moreover, usability testing is not confined to particular formats: it can be done with paper and pen as well as with high-fidelity prototypes.
User Engagement (UE) is another metric of user experience. It is an abstract term, not directly observable, and comprises various parts. It can be characterized by the depth of an actor's cognitive, temporal, affective and behavioral investment when interacting with a digital system [22]. User engagement cannot be assessed easily; it can only be examined via observable phenomena in which it is manifest, which makes it hard to define [19]. Various disciplines characterize engagement differently. For example, in the education field, the engagement of students is usually discussed in terms of achievement, attention, interpersonal relationships and motivation [18]. For psychologists, engagement usually corresponds to positive psychology and motivation [32]. In the gaming realm, engagement is regarded as a generic indicator of game involvement and has become a necessary evaluation step [3]. In the area of HCI, user engagement has been viewed in the context of flow and fluid interaction, and several frameworks have been proposed.
It is usually considered a metric for generating satisfying and delightful emotions related to curiosity, joy and surprise [34]. The relationship between user experience and user engagement is usually considered positive, and user engagement can increase the motivation to keep using a product. Sometimes user engagement is treated as a user's level of involvement with a product, and it has been mentioned in several related research areas [28].
Examining Usability and User Engagement
A variety of evaluation methods for usability exists, including both qualitative and quantitative methods [5]. Qualitative methods offer a direct assessment of the usability of the system, with researchers directly observing participants completing tasks. Qualitative data can be gathered through usability testing and interviews. Quantitative methods provide an indirect assessment of the usability of a design; they can be based on users' performance on a given task or can reflect participants' perception of usability. Quantitative data can be gathered through questionnaires and surveys. One of the most widely used surveys to assess usability is the System Usability Scale (SUS) survey proposed by John Brooke, which has been proven useful for assessing not only the usability but also the learnability of a system. The results of the survey can be converted into a single number (the SUS score) ranging from 0 to 100. The SUS score is a universally used metric for usability evaluation and remains an accurate evaluation method even with small sample sizes [4].
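The conversion of the ten SUS responses into a 0-100 score follows Brooke's standard rule: odd-numbered (positively worded) items contribute (response minus 1), even-numbered (negatively worded) items contribute (5 minus response), and the summed contributions are multiplied by 2.5. A minimal sketch:

```python
def sus_score(responses):
    """Compute the System Usability Scale score from one participant's
    ten Likert responses (each in the range 1-5).

    Odd-numbered items contribute (response - 1); even-numbered items
    contribute (5 - response). The sum is scaled by 2.5 to give 0-100.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("expected ten responses in the range 1-5")
    total = sum((r - 1) if i % 2 == 0 else (5 - r)
                for i, r in enumerate(responses))
    return total * 2.5
```

For example, a participant answering 5 to every positive item and 1 to every negative item scores 100, while uniform answers of 3 give the midpoint score of 50.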
Several quantitative and qualitative approaches for examining user engagement across various disciplines have been studied by researchers [15]. For example, the Immersive Experience Questionnaire (IEQ) and the Gaming Engagement Questionnaire (GEQ) have been widely used to evaluate immersion and engagement respectively, and they are particularly designed for measuring user engagement in games. Several models and frameworks have also been proposed to evaluate services and web interfaces. Questionnaires and surveys from information science have also been applied [21]. User engagement is difficult to measure directly. Although several physiological indicators can be measured, such as blood pressure, heart rate and nervous system activity, these methods are rather costly and time-consuming, which makes them unsuitable as a scalable solution in most cases.
Moreover, several indicators relating to human behavior, such as page visits, mouse clicks and time spent, could be used as indicators of subjective user experience; however, measuring them is time-consuming and requires prerequisites. Self-reporting, a widely used method, has various advantages such as interpretability, information richness and practicability [24]. Although self-reporting may introduce inaccuracy even when participants try to answer completely accurately, it is considered a beneficial method for evaluating psychological constructs. The most widely used self-reporting questionnaire, the User Engagement Scale (UES) questionnaire designed by O'Brien, is used to assess user engagement with the visualizations in this study.
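Scoring the UES typically means averaging the Likert responses within each subscale and then averaging the subscale means. The 30-item grouping below is a hypothetical illustration only; the actual item-to-subscale mapping follows O'Brien's published questionnaire:

```python
from statistics import mean

# Hypothetical grouping of 30 UES items into subscales (indices 0-29);
# the real mapping is defined in O'Brien's questionnaire.
SUBSCALES = {
    "focused_attention": range(0, 7),
    "perceived_usability": range(7, 15),
    "aesthetic_appeal": range(15, 20),
    "reward": range(20, 30),
}

def ues_scores(responses):
    """Average the 1-5 Likert responses per subscale, plus an overall mean."""
    per_scale = {name: mean(responses[i] for i in idx)
                 for name, idx in SUBSCALES.items()}
    per_scale["overall"] = mean(per_scale.values())
    return per_scale
```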
One of the advantages of quantitative data over qualitative data is statistical significance [5]. Quantitative data can avoid randomness if analyzed in a sound way, and several statistical instruments can be used to calculate how likely it is that the data reflects the truth or whether it is just an effect of
random noise. For example, the Shapiro-Wilk test and Levene's test can be used to examine the normality and the homogeneity of variances of the data, respectively. In order to determine whether there is a significant difference between two independent samples, a parametric t-test can be adopted, or the non-parametric Mann-Whitney U test if the samples are not normally distributed. Even though qualitative studies are more common in industry, quantitative studies are the only ones that clearly state how much a new design improved over the old one.
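The test-selection logic just described can be sketched with SciPy; the `compare_groups` helper and the 0.05 threshold are illustrative choices, not prescriptions:

```python
import numpy as np
from scipy import stats

def compare_groups(a, b, alpha=0.05):
    """Compare two independent samples of questionnaire scores.

    Shapiro-Wilk checks normality and Levene's test checks homogeneity
    of variances; the parametric t-test is used when both samples look
    normal, otherwise the non-parametric Mann-Whitney U test.
    """
    a, b = np.asarray(a, float), np.asarray(b, float)
    normal = (stats.shapiro(a).pvalue > alpha and
              stats.shapiro(b).pvalue > alpha)
    equal_var = stats.levene(a, b).pvalue > alpha
    if normal:
        name = "t-test"
        p = stats.ttest_ind(a, b, equal_var=equal_var).pvalue
    else:
        name = "Mann-Whitney U"
        p = stats.mannwhitneyu(a, b, alternative="two-sided").pvalue
    return name, p
```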
Related Research
Due to the infancy of the field, hardly any research has specifically been done on the visualization of log data from intelligent equipment. An interesting approach to presenting WWW log data in interactive starfield visualizations was explored by Hochheiser and Shneiderman [11]. They introduced a series of interactive starfield visualizations for interpreting and exploring web log data by combining two-dimensional displays of thousands of individual access requests, and these visualizations provide capabilities for examining data that exceed those of traditional web log analysis tools. Although this paper offers an insight into possible methods for presenting massive log data, it does not take users' expectations towards the system into account and failed to achieve high usability. Moreover, as the paper was published approximately 20 years ago, it fails to address various aspects such as adaptability, aesthetics and data processing. When it comes to evaluating user engagement and usability, little well-integrated research has been done in the information visualization field. Hung and Parsons [13] looked into assessing user engagement in interactive visualizations: they developed a questionnaire containing 22 questions and calculated an engagement score from the results. Although it proved beneficial for assessing user engagement, the questionnaire measures 13 engagement characteristics with 2 items each, which is excessive in practice. When it comes to measuring the usability of a visualization framework, Zbick et al. [36] applied the Technology Acceptance Model (TAM) and the System Usability Scale (SUS) to identify the perceived usefulness, the perceived ease of use, and the usability of their framework. The results showed the framework was accepted by users and achieved a high level of usability.
METHODOLOGY
The methodology (figure 1) applied in the visualization design was proposed by Graham et al. [9] and has proven to be an effective method for visualization development. This thesis project follows the idea of the use case presented in their article [9].
In the first phase, the requirements of the project were discussed and defined. Once the requirements were known, low-fidelity prototypes were created using paper prototyping and sketching.
Then, using graphical design software, a graphical design containing an intuitive view of the company's design style was developed and validated against the company's UX guidelines. This graphical design was later converted into an initial prototype with web technologies. Both formative and summative usability evaluations were conducted with potential users in the traditional manner: one user at a time in a controlled test environment, using a think-aloud protocol and predefined tasks. Every testing session contained a User Engagement Scale (UES) questionnaire, a System Usability Scale (SUS) survey [4] and an optional semi-structured interview.
Phase 1 Define User Group and Requirements
The project starts with a pre-study of the existing visualization tools used by the company. The essential technical and non-technical information regarding the background of this project was gathered from several meetings with two developers and two product managers. The use cases of the designed artifact were initially clarified: the final prototype would help domain experts view log data from the company's product "Rokid", an intelligent speaker incorporating AI technology, for further investigation. After discussion with the supervisor at the company, the preliminary functional and non-functional requirements [25] were determined as follows:
Functional Requirements
• The tool should be able to visualize log data of a period of time
• Visualization should clearly indicate data changes

Non-functional Requirements
• The product should be multi-platform, capable of running on different operating systems
• The product should be highly interactive
• Visualization should be aesthetically satisfying
Log data was stored in the company's own database system, where it is possible to filter data using queries and download the data of a specified period in Excel file format. It was agreed that evaluations would be conducted at the company premises and be open only to company employees. Table 1 presents an example entry of the data.
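Filtering such an export by time period can be mirrored with a small pandas sketch. The column names and the `filter_period` helper are hypothetical, shaped after the example entry in Table 1:

```python
import pandas as pd

# Columns mirror the example entry in Table 1; the values are illustrative.
log = pd.DataFrame({
    "recognition_result": ["How's the weather in Hangzhou"],
    "function": ["com.rokid.weather1"],
    "label": ["query"],
    "time": pd.to_datetime(["2016-05-31 23:50:37"]),
    "duration_s": [123],
    "user_id": ["010116000951"],
})

def filter_period(df, start, end):
    """Return log entries whose timestamp falls within [start, end]."""
    mask = df["time"].between(pd.Timestamp(start), pd.Timestamp(end))
    return df[mask]

may_logs = filter_period(log, "2016-05-01", "2016-05-31 23:59:59")
```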
Phase 2 Iterative Prototyping
The first phase offered an excellent reference and foundation for the early sketches and prototypes. The initial sketches were designed using paper and pen. The heuristic evaluation method [16] was used to assess different variants without the need for additional study participants. Having refined the sketches, Adobe XD¹, a vector-based user experience design tool, was used to convert the sketches into a graphical interface. One reason for choosing Adobe XD is its cross-platform adaptability. Another reason is that it offers built-in interaction functionality allowing some basic interactions to be mocked, which proved beneficial when conveying information to the technical expert. In order to examine whether a novice user can easily carry out tasks within the system, cognitive walkthroughs [16] were used before the web implementation to gather rapid insights before spending effort on developing an unusable product. Because of the cross-platform adaptability requirement, web technology was chosen for the final prototype. In the web-based prototypes, the web framework Flask and the visualization library d3.js were used.
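A minimal sketch of how such a Flask back end might serve log entries to a d3.js front end. The route, the entry fields and the sample data are illustrative assumptions, not the actual prototype implementation:

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical in-memory sample; the real prototype read entries
# exported from the company's log database.
LOG_ENTRIES = [
    {"label": "query", "function": "com.rokid.weather1",
     "time": "2016-05-31 23:50:37", "user_id": "010116000951"},
]

@app.route("/api/logs")
def logs():
    # The d3.js front end can fetch this payload, e.g. d3.json("/api/logs")
    return jsonify(LOG_ENTRIES)

if __name__ == "__main__":
    app.run(debug=True)
```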
Phase 3 Evaluation
In total, 70 participants were recruited for the evaluation process. 10 participants took part in the first evaluation
¹ Adobe XD: https://www.adobe.com/products/xd.html
session, which contained usability testing and semi-structured interviews to discover usability problems and gather formative data. The remaining 60 participants were divided evenly to participate in the comparative evaluation sessions, each containing usability testing, a SUS survey, a User Engagement Scale questionnaire and an optional semi-structured interview. The qualitative and quantitative data were later used to analyze the effect of applying the narrative visualization method.
Each participant was invited to an individual usability testing session. Each participant was asked to perform the same tasks from the same scenarios, with the same scripts being used.
Each evaluation lasted approximately 60 minutes and can be separated into four steps:
Introduction briefing
The participants were given a brief introduction to make them understand the aim of the usability testing. They were then asked to answer a few questions to gather demographic data. After reading the disclaimer and signing it, the usability testing started.
Project description and concept explanation
An introduction to the prototype was delivered to the participants. Participants were purposely given only an overview of the prototype in order to prevent bias in the study. As none of them had been involved in usability testing before, the think-aloud concept was explained and they were encouraged to follow it.
Scenarios
Participants were given approximately 10 minutes to become familiar with the tool. No instructions or suggestions were given during this period in order to see how novice users perform. The think-aloud protocol was used, and participants explained their thoughts and operations as well as their confusions.
After the users became accustomed to the interface, they were asked to share their findings from the visualization. This step was designed to see whether the prototype aids further data exploration.
It lasted 5-10 minutes, varying from person to person.
Surveys and Debrief questions
In the second usability evaluation session, the participants were asked to fill in a System Usability Scale survey of 10 questions and a User Engagement Scale questionnaire of 30 questions, both on a 1-5 Likert scale. They were given approximately 20 minutes to complete them. Then, an optional short semi-structured interview was conducted to get more insights into the visualizations and interfaces. Questions included:
"What do you find interesting?" and "What are you not satisfied with?", and users were asked to leave a comment on the prototype. This part lasted 5-10 minutes.
RESULTS AND ANALYSIS
First Prototype
The interface of the first prototype comprised two main views: the index page with a brief project description, and the main page with the visualization (figure 2). On the latter there is an introductory "story" about the project that encourages users to explore the data. The introductory story (figure 3) describing
Figure 1. Process flow of methodology for developing visualization
Recognition Result: How's the weather in Hangzhou
Function: com.rokid.weather1
Label: query
Time: 2016-05-31 23:50:37
Duration (s): 123
User ID: 010116000951
Table 1. Example entry of log data
information about "Rokid" and the log data, appears when users hover over the visualization. The narrative component was designed as a mix of an author-driven and a reader-driven slideshow, including three stimulating default views (or sections). Each section followed the same general layout.
Users saw an animated visualization showing how the percentages of different categories of activities changed over time (figure 2). On the left panel, time flowed, and users could set the flowing speed from slow to fast. A message changing according to time was shown below the time (figure 4). Users could filter the data by setting parameters for the time period and the categories of activities to show. On the right was the transforming visualization: each dot represented a Rokid user, and a dot moved when the corresponding user changed behavior. The percentage of each category of activities kept updating over time. The transition time was accurate to one minute.
After the narrative component, the web page follows the martini glass structure, as it includes only a small amount of messaging and users can freely explore the data. Users were able to click a category of activities to view details, and time was automatically paused. When clicking a dot, detailed information, including time, user ID and system function name, was displayed on the left panel. Users could click a user ID to view a stacked bar chart displaying that user's data over a period of time.
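The filtering behavior described above can be sketched as a simple predicate over log entries. This is an illustrative sketch, not the prototype's actual code; the entry fields and sample values below are assumptions modeled on the log format in Table 1.

```python
from datetime import datetime

def filter_entries(entries, start, end, categories):
    """Keep log entries whose timestamp falls within [start, end]
    and whose activity category is among the selected ones."""
    return [
        e for e in entries
        if start <= e["time"] <= end and e["category"] in categories
    ]

# Hypothetical entries in the spirit of Table 1.
entries = [
    {"user_id": "u1", "category": "Music", "time": datetime(2016, 5, 31, 23, 50)},
    {"user_id": "u2", "category": "Chat", "time": datetime(2016, 6, 1, 0, 10)},
]
selected = filter_entries(
    entries,
    start=datetime(2016, 5, 31, 0, 0),
    end=datetime(2016, 5, 31, 23, 59),
    categories={"Music", "Time"},
)
```

In the prototype, such a filter would be re-applied whenever the user changes the time period or category parameters on the left panel.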
Technology used
The prototype was built using web technologies, including HTML5, CSS3, Python 3 and JavaScript. The web framework applied in the prototype was Flask, and D3.js (https://d3js.org) was used for visualization. The prototype was developed as a static web application, as web development is not the main focus of this thesis project. The original data was exported from the company's database into an Excel file and transformed into JSON format. Every data entry included the user ID, time of use, time period of use, function used by the speaker, label, and speech recognition result. Although there were over 100 names of system functions, the functions were categorized according to actions into: Music, Chat, Life Information, Operation, Wiki, Time, Unknown and Ignore. A detailed description of data capturing and processing is out of the scope of this thesis and will not be discussed further. D3.js was chosen for visualization because it is a powerful framework for crafting data visualizations with rich interactivity; it also allows designers to build customized graphs and diagrams, making advanced multidimensional visualizations possible.
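The grouping of raw function names into broader action categories can be sketched as a prefix lookup. The prefixes and category assignments below are purely illustrative (the actual naming scheme of the company's 100+ functions is not documented in this thesis); only `com.rokid.weather1` appears in the source (Table 1), and its category here is assumed.

```python
# Illustrative mapping from function-name prefixes to activity categories;
# the real log contains over 100 distinct function names.
CATEGORY_BY_PREFIX = {
    "com.rokid.music": "Music",
    "com.rokid.chat": "Chat",
    "com.rokid.weather": "Life Information",
    "com.rokid.alarm": "Time",
}

def categorize(function_name: str) -> str:
    """Map a raw system function name to one of the broader categories,
    falling back to 'Unknown' for unrecognized names."""
    for prefix, category in CATEGORY_BY_PREFIX.items():
        if function_name.startswith(prefix):
            return category
    return "Unknown"
```

A prefix table like this keeps the categorization maintainable as new system functions are added, since only the mapping needs updating.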
Evaluation
The main objective of the first evaluation was to detect usability problems with participants. Comparatively, the main aim of the second evaluation was to investigate the impact of using storytelling and narrative visualization techniques on usability and user engagement. Usability testing, the SUS survey, the User Engagement Scale questionnaire and semi-structured interviews were applied.
Demographic Data
10 users (5 male, 5 female) participated in the first evaluation, and 60 participants (48 male, 12 female), divided evenly into two groups (each with 24 male and 6 female participants), were involved in the second evaluation session. Most participants fell into the 30-40 age range, and no participant was aged over 50. 36 of the 70 participants held a Master's degree, and merely 6 of the 70 held an associate degree. 34.3% of participants had worked in the company for 1-2 years, and another 34.3% had worked there for more than 2 years. The detailed demographic data is shown in Figures 7, 8, 9 and 10.
Evaluation 1
The aim of the first evaluation was to discover usability issues. In total, 10 individual evaluation sessions were conducted. It quickly became apparent that novice users felt confused about
Figure 2. Initial view of visualization design. Filter of data is on the left panel, and the main view presents animated visualization
Figure 3. Introductory story of "Rokid" and log data
Figure 4. Messages changing according to time
this visualization. Users commonly raised questions like "How do I use this?" or "What does it mean?". Another issue was that users seemed to lack interest in interacting with the visualization and did not follow the intended exploration route. Most participants only paid attention to the animated visualization displayed by default, and only two out of the 10 participants noticed that there were other static visualizations that could be viewed to help understand the data. They tended to ignore the interaction and focused only on the animated visualization. This behavior was questioned during the debriefing session, and users explained that it was unclear that there were other visualizations, and that description text was missing. Another problem occurred when users changed parameters: the loading time of the newly rendered visualization was substantially long. The interface used a pop-up dialog at the bottom of the settings window; however, participants clearly did not notice it and became confused about what was happening and whether they had succeeded in filtering the data.
In the semi-structured interviews, participants were more open about the difficulties encountered and shared their feelings and suggestions. Several participants argued that the order of the parameters was not fully considered and should be rearranged. Moreover, description text was missing, which would cause confusion for novice users. Most notably, the static visualizations should be clearly indicated, because they were currently easily neglected. Regarding suggestions, two participants recommended that the background color could switch between black and white according to sunrise and sunset times to increase immersion. Most participants noted that the animated visualization should be pausable and that the transition time should be adjustable more precisely. One user advised adding an export function to the visualization system, as they would like to use the visualizations in MS PowerPoint presentations.
Evaluation 2
The purpose of the second evaluation was to acquire summative data to compare the visualization with storytelling and narrative visualization techniques against the visualization without these features. In total, 60 participants were involved in the evaluation sessions; they were divided evenly into two groups, and each participant took part in usability testing, the SUS survey, the User Engagement Scale questionnaire and an optional semi-structured interview. The most notable changes included a night theme, time interval settings, a pause button, more storytelling description text and a less error-prone interface. Users quickly discovered the new set of features and commented on their usefulness.
Moreover, an explore section, which included only a small amount of messaging and let users freely explore the data set, was designed as a "summary" page of the previous animated and static visualizations (figure 14). When users finished the previous three visualizations, this explore section was "opened up". This section alone was shown to participants assigned to view the visualization without storytelling and narrative visualization techniques applied (referred to as the no-ST version). Concretely, users assigned to view the visualization with storytelling and narrative visualization techniques (referred to as the ST version) tested the visualization system containing the three visualization sections and the explore section, while users assigned to the no-ST version merely viewed the explore section. The two versions were alternately tested with participants; thus, the experimental design was between-subjects.
Final Prototype
Qualitative data, which played an important role in developing the final prototype, was collected through usability testing and semi-structured interviews. The final prototype added description text, a night mode theme based on the times of sunrise and sunset, a function for zooming the visualization in and out, and an explore section. The description text included text encouraging users to interact with the visualization and a message describing the pattern of the data over time (figure 15). The message about the data pattern was generated by applying machine learning methods to the data; the details of its generation are not considered or discussed in this thesis project. Compared to the initial version, the time interval of the animation could be set by the user by inputting numbers or using a scroll bar. Moreover, screen capturing features, more interaction signifiers and more explanatory text were included. Compared with the initial prototype, the loading time of the visualization after a parameter change was slightly reduced, and pre-loading of the new visualization was applied to prevent confusion.
SUS Results
Participants were asked to fill in a survey of 10 questions adopted from the System Usability Scale, on a 5-point Likert scale. The SUS contains 5 positive and 5 negative questions, appearing alternately. Higher scores on the positive questions correspond to a higher level of usability, while lower scores on the negative questions indicate a positive result. Bangor et al. [1] and Sauro et al. [29] developed two widely used
Figure 5. Detail information of an activity of a user
Figure 6. Stacked bar chart presenting numbers of different activities of a user in a period of time
Figure 7. Gender information of participants
Figure 8. Age information of participants
Figure 9. Education level information of participants
Figure 10. Working age information of participants
approaches to rate the SUS score. Both approaches relate the SUS score to the well-known academic grading from A to F for better comprehension.
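The standard SUS scoring procedure, which this alternating positive/negative item design implies, can be sketched as follows: each odd (positive) item contributes its response minus 1, each even (negative) item contributes 5 minus its response, and the total is scaled by 2.5 onto a 0-100 range.

```python
def sus_score(responses):
    """Compute the SUS score (0-100) from ten 1-5 Likert responses.
    Odd-numbered items are positively worded, even-numbered negatively."""
    assert len(responses) == 10
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# Agreeing with every positive item and disagreeing with every
# negative item yields the maximum score.
best = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])
```

The per-participant scores produced this way are what the system averages S1 and S2 reported below are computed from; the learnability subscale used in this thesis is derived from items 4 and 10 only.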
For the ST version of the visualization system, the average SUS score from 30 participants was S1 = 70.08, with a learnability score L1 = 79.17 and a usability score U1 = 67.80. Comparatively, the no-ST version received an average SUS score of S2 = 68.17, with a learnability score L2 = 69.58 and a usability score U2 = 67.81. According to Bangor et al. [1] and Sauro et al. [29], the ST version was graded C by both approaches, while the no-ST version was graded D and C respectively.
The SUS score of the ST version was approximately 2 points higher than that of the no-ST version. Although the usability scores of the ST and no-ST versions were approximately equal, the average learnability score (calculated from questions 4 and 10 of the SUS survey) of the ST version was approximately 10 points higher than that of the no-ST version. It was therefore hypothesized that the ST version would earn higher scores than the no-ST version in both overall SUS and learnability. The assumption of normality, as assessed by the Shapiro-Wilk test (p > .05), was met, and the homogeneity of variances, as assessed by Levene's test (p > .05), was met. Therefore, an independent-samples t-test was conducted to compare the average SUS and learnability scores of the ST and no-ST versions. The results did not show statistically significant differences between the two versions (ST, no-ST): t = 0.63087, p = .530601 and t = 1.93885, p = .057391 respectively.
It was also hypothesized that there would be a difference between the two versions in usability. The assumption of normality, as assessed by the Shapiro-Wilk test (p > .05), was met, and the homogeneity of variances, as assessed by Levene's test (p > .05), was met. An independent-samples t-test was conducted, and the result did not demonstrate a statistically significant difference between the two versions, with t = 0.182, p = .857.
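Since Levene's test indicated equal variances, the appropriate statistic is Student's pooled-variance t. As a sketch, the statistic can be computed from the two samples as below (the p-values reported above are then read from the t distribution; the sample values shown are made up, not the study's data).

```python
from statistics import mean, variance

def pooled_t(a, b):
    """Student's independent-samples t statistic with pooled variance,
    appropriate when Levene's test supports equal variances."""
    na, nb = len(a), len(b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5

# Hypothetical per-participant SUS scores for two groups.
t_stat = pooled_t([68, 72, 75], [61, 70, 66])
```

With 30 participants per group, the statistic is compared against a t distribution with 58 degrees of freedom.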
UES Results
The User Engagement Scale (UES) was proposed by O'Brien and Toms [21] and has been widely used to assess user engagement quantitatively. The UES is associated with other self-report measures; e.g., the focused attention subscale is related to cognitive absorption and flow [23], while the perceived usability subscale has shown correlation with
Figure 11. Final Prototype
Figure 12. Night mode theme according to time of final prototype
System         SUS Score   Bangor (Range, Grade)   Sauro (Range, Grade)
ST version     70.08       70-80, C                65.0-71.0, C
no-ST version  68.17       60-70, D                65.0-71.0, C
Table 2. Grade of system
Figure 13. Detailed information of an activity, button of back to main view and description text of recognition result was added
Figure 14. Explore section, the "summary" visualization
Figure 15. Message of pattern of data according to time
Figure 16. SUS result
Scale         Version   Mean    SD
Overall       ST        70.08   12.84
              no-ST     68.17   10.58
Learnability  ST        79.17   14.80
              no-ST     69.58   22.67
Usability     ST        67.80   14.00
              no-ST     67.81   9.11
Table 3. SUS scores of different subscales
Figure 17. UES scores of every subscale
Figure 18. Overall UES score
other usability questionnaires [20]. In this study, an optimized UES questionnaire containing 4 factors, Focused Attention (FA), Perceived Usability (PU), Aesthetics (AE) and Reward (RW), was used to gather summative data. The questionnaire consists of 30 questions, with 7 on FA, 8 on PU, 6 on AE and 10 on RW. The average score of each subscale is calculated by summing the scores of the items in that subscale and dividing by the number of items. The results of the UES are shown in Table 4.
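The per-subscale averaging just described can be sketched as follows. The item-to-subscale assignment below is illustrative (two tiny subscales for brevity), not the actual UES item ordering.

```python
def subscale_means(responses, subscales):
    """Average each UES subscale: sum the scores of the items
    belonging to the subscale and divide by its number of items."""
    return {
        name: sum(responses[i] for i in items) / len(items)
        for name, items in subscales.items()
    }

# Illustrative item indices; the actual UES assigns 7 items to FA,
# 8 to PU, 6 to AE and 10 to RW.
subscales = {"FA": [0, 1], "PU": [2, 3]}
means = subscale_means([5, 4, 3, 4], subscales)
```

The overall engagement score in Table 4 is then the sum of the four subscale means.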
Compared with the no-ST version, the average scores of the ST version were higher on every subscale. The ST version's score on the FA subscale was more than 1 point higher, and its score on the RW subscale was approximately 0.5 points higher. Meanwhile, the ST version's scores on PU and AE were slightly higher than those of the no-ST version. Based on the table, it was hypothesized that the scores of the ST version would be higher than those of the no-ST version in Focused Attention (FA), Perceived Usability (PU), Aesthetics (AE), Reward (RW) and overall engagement score. To determine the test method, the Shapiro-Wilk test and Levene's test were used to assess the assumption of normality and the homogeneity of variances respectively. According to the results of the Shapiro-Wilk test, the no-ST version's scores in FA, PU, AE and overall, as well as the ST version's score in RW, were assumed not normally distributed, since p < .05.
Subscale                  Definition                                                              Version   Mean    SD
Focused Attention (FA)    Concentration of mental activities                                      ST        4.25    0.42
                                                                                                  no-ST     3.03    0.51
Perceived Usability (PU)  Affective and cognitive response                                        ST        4.42    0.24
                                                                                                  no-ST     4.22    0.49
Aesthetics (AE)           Appearance attraction                                                   ST        4.51    0.36
                                                                                                  no-ST     4.35    0.27
Reward (RW)               Possibility to remember, feelings of involvement, unexpected experiences  ST      4.16    0.36
                                                                                                  no-ST     3.69    0.31
Overall                   Sum of the four subscales                                               ST        17.34   1.05
                                                                                                  no-ST     15.30   0.76
Table 4. UES scores of different subscales
Therefore, a non-parametric Mann-Whitney U test was conducted to compare the average scores of the ST and no-ST versions overall and on all subscales. For the FA and RW subscales, the results showed statistically significant differences between the two versions (ST, no-ST): W = 70, p < .00001 and W = 153, p < .00001 respectively. These results indicate that the ST version's higher scores in FA and RW were significant. For the PU and AE subscales, the results did not demonstrate statistically significant differences between the two versions: W = 324, p = .06288 and W = 327.5, p = .07186 respectively. These results mean that the ST version's higher scores on the PU and AE subscales were not significant.
The result for the overall engagement score between the two versions was W = 62, p < .00001, indicating that the ST version's higher overall engagement score was significant.
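The U statistic behind the test above can be sketched as a count of pairwise "wins" between the two samples, with ties counting as half (p-values are then read from the U distribution, as reported above). This is a sketch of the statistic only, on made-up data, not a reproduction of the study's calculation.

```python
def mann_whitney_u(a, b):
    """Mann-Whitney U statistic for sample `a` versus sample `b`:
    the number of pairs (x, y) with x from `a` and y from `b`
    where x > y, counting ties as 0.5."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u
```

For a two-sided test, the smaller of U(a, b) and U(b, a) is conventionally compared against the critical value; the two always sum to len(a) * len(b).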
DISCUSSION
This paper described the participatory design process of a visualization system to be used by developers for viewing log data from the intelligent speaker "Rokid", and investigated whether using storytelling and narrative visualization techniques could increase usability and user engagement. The discussion is developed around implications for visualization design, a summary of hypotheses, method criticism, limitations and future work.
Implications of visualization design
The strength of applying participatory design is obvious: real users were involved, and continuous feedback was acquired throughout the design process. It is more effective and easier to convey ideas by showing a real object, because verbal explanation of concepts and features is not enough for users to thoroughly comprehend the scope of the vision. The usability evaluation showed that visualizations could help field experts view and understand log data. Study participants paid closer attention to user interface design, aesthetics, user engagement and interactivity, trying to assess whether users would find the system attractive and interesting. Participants focused more on the design and interactivity, while the computational functions and accuracy of the visualization turned out to be secondary concerns. The visualization does not follow Tufte's rule [35] that the data-ink ratio is positively correlated with visualization value, because in the interviews participants argued that a moderate data-ink ratio was more aesthetically pleasing. Furthermore, the testing results demonstrated that participants did not pay particularly much attention to data exploration, being more interested in the interactivity and storytelling features. The reason might be that tools with high computational ability, such as Matlab, are tailored to those needs and have been widely used in the company. The study might suggest that narration in visualization can lead to better user experience and user engagement with the designed visualizations.
Summary of hypotheses
The analysis of the results was driven by two qualitative hypotheses. The first was that storytelling and narrative visualization techniques should contribute to the usability of the visualization system. The second was that storytelling and narrative visualization techniques should effectively immerse users and engage them in data exploration. However, verifying qualitative hypotheses is time-consuming and costly; therefore, I proposed and verified the following quantitative hypotheses:
The results from the SUS survey showed that the hypotheses relating to the SUS score were all rejected, which indicates that the ST version did not increase usability in the current experimental environment with the participants involved in the evaluation process. Although the learnability score of the ST version was obviously higher than that of the no-ST version, the usability score (calculated from all questions except questions 4 and 10) of the ST version was approximately equal to that of the no-ST version. This may relate to the fact that the SUS survey was designed to assess the usability of a whole system, and the narrative features might not be a key factor in the level of usability. The results from the UES questionnaire confirmed that the ST version group received higher scores in overall user engagement, FA and RW. These indicated that narrative features could potentially increase user engagement with the visualization. The level of concentration was higher, as the FA factor showed a larger average score in the ST version.
Moreover, during usability testing it was observed that most participants showed curiosity and expressed being drawn into the experience. The score of the RW factor was also higher in the ST version than in the no-ST version. As the RW factor combines three previously used factors, Endurability (EN), Novelty (NO) and Felt Involvement (FI), and each question of the RW factor corresponds to one of these factors (Table 6), the data of the RW subscale was further assessed.
The scores of the Novelty (NO) and Felt Involvement (FI) subscales in the ST version were higher than those in the no-ST version, while the score of Endurability (EN) in the ST version was approximately equal to that in the no-ST version.
Hypotheses                                                                                      Rejected or Accepted
SUS Overall
  H0: There are no differences in overall SUS score between the ST and the no-ST groups.        Failed to Reject
  H1: There are differences in overall SUS score between the ST and the no-ST groups.
Learnability
  H0: There are no differences in learnability score between the ST and the no-ST groups.       Failed to Reject
  H1: There are differences in learnability score between the ST and the no-ST groups.
Usability
  H0: There are no differences in usability score between the ST and the no-ST groups.          Failed to Reject
  H1: There are differences in usability score between the ST and the no-ST groups.
UES Overall
  H0: There are no differences in overall UES score between the ST and the no-ST groups.        Rejected
  H1: There are differences in overall UES score between the ST and the no-ST groups.           Accepted
Focused Attention (FA)
  H0: There are no differences in FA score between the ST and the no-ST groups.                 Rejected
  H1: There are differences in FA score between the ST and the no-ST groups.                    Accepted
Perceived Usability (PU)
  H0: There are no differences in PU score between the ST and the no-ST groups.                 Failed to Reject
  H1: There are differences in PU score between the ST and the no-ST groups.
Aesthetics (AE)
  H0: There are no differences in AE score between the ST and the no-ST groups.                 Failed to Reject
  H1: There are differences in AE score between the ST and the no-ST groups.
Reward (RW)
  H0: There are no differences in RW score between the ST and the no-ST groups.                 Rejected
  H1: There are differences in RW score between the ST and the no-ST groups.                    Accepted
Table 5. Summary of hypotheses
Subscale  Questions                   Version   Mean   SD
EN        RW1, RW2, RW3, RW4, RW5     ST        3.50   0.41
                                      no-ST     3.40   0.27
NO        RW6, RW7, RW8               ST        4.22   0.47
                                      no-ST     3.48   0.59
FI        RW9, RW10                   ST        4.45   0.42
                                      no-ST     3.42   0.59
Table 6. Scores of subscales EN, NO, FI