A Survey of Methods for Visualizing Spatio-temporal Data


Department of Science and Technology


LiU-ITN-TEK-A--20/019--SE


Mattias Persson


Degree project carried out in Media Technology at the Institute of Technology at Linköping University

Supervisor: Katerina Vrotsou

Examiner: Aida Nordman


Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a considerable time from the date of publication barring exceptional circumstances.

The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for your own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/


Abstract

Different kinds of data are generated continuously every second, and in order to analyze this data it has to be transformed into some kind of visual representation. One common type of data is spatio-temporal data, which is data that exists in both space and time. How to visualize this kind of data has been researched for a long time and is still a very relevant subject to expand on today. A number of approaches have been explored in this work. An extensive literature study has also been performed and is presented in this report. The study is divided according to different classes of spatio-temporal data, and the visual representations are structured by these classes.

Another contribution of this thesis is a climate data application that visualizes spatio-temporal data sets of temperatures collected for several countries in the world. This application implements several of the visual representations presented in the survey included in this thesis. The result is an application with four displays, each showing a different aspect of the chosen climate data sets. The result shows how effective multiple linked views are for understanding different characteristics of the data.


Acknowledgments

I want to thank MindRoad AB for helping me with this thesis by not only providing work space but also involving me in different company activities and conferences that made this time more interesting and fun. A special thank you to my supervisors Åsa Detterfelt (MindRoad AB) and Katerina Vrotsou (Linköping University) as well as my examiner Aida Nordman for all the help and making this thesis possible. Also thank you to Linköpings Studentspex for helping me take breaks from the studies.

I also want to thank my parents and family for the continuous support during my studies as well as my grandparents. A big thank you to my partner, Lennie Jansson, for always being there for me. Last but not least, thanks to all my friends for all the amazing support, Moa Lindqvist and Björn Sintorn in particular.

Mattias Persson Linköping, June 2020


4.5.6 Ring maps

4.5.7 Micromaps

5 Implementation

5.1 Data sets

5.2 Choropleth map

5.3 Heat map

5.4 Line graphs

6 Results

6.1 A climate-data application

6.1.1 Choropleth map

6.1.2 Time handling

6.1.3 Heat map

6.1.4 Line graphs

7 Discussion

7.1 Results

7.1.1 The survey

7.1.1.1 Temporal aspects in spatial visualizations

7.1.2 The climate-data application

7.1.2.1 Choropleth map

7.1.2.2 Heat map

7.1.2.3 Line graphs

7.1.2.4 Raw versus refined values

7.2 Method

7.3 The work in a wider context

8 Conclusion and Future work

8.1 Research questions

8.2 Future work


List of Figures

2.1 Data models representing points in the continuity-abruptness space.

4.1 An example of an event map.

4.2 An example of a space-time event cube.

4.3 Two examples of the GeoTime.

4.4 Visualization of traffic in Milan.

4.5 A trajectory map with a temporal bar chart.

4.6 An example of map matching.

4.7 A flow map of the traffic in Milan.

4.8 A typical example of a flow map with multiple trajectories.

4.9 A proposed pipeline of rendering density trajectory maps.

4.10 A density trajectory map displaying vessel traffic in Rotterdam.

4.11 Two space-time (trajectory) cubes.

4.12 Trajectories represented by ribbons and tubes.

4.13 A stack-based trajectory visualization of CPM-values along the Tokyo-Fukushima highway.

4.14 Two raster grids.

4.15 Two raster maps.

4.16 Snapshots from a 3D raster map.

4.17 An isopleth map.

4.18 A matrix heat map.

4.19 A density heat map.

4.20 An origin-destination matrix.

4.21 An origin-destination map.

4.22 The Vismate application.

4.23 A triangle heat map.

4.24 Two heat maps using a point reference system whose locations differ with time.

4.25 Two basic dot distribution maps.

4.26 Two examples of a value flow map.

4.27 A diagram map of the population distribution in Zurich and Winterthur in 1990.

4.28 Two choropleth maps, one classed and one unclassed.

4.29 An example of a cross-classed choropleth map.

4.30 Small multiples of an unclassed choropleth map.

4.31 A dasymetric map.

4.32 The Great Wall of Space-Time.

4.33 The Great Wall of Space-Time utilizing parallel coordinates.

4.34 Two examples of ring maps.

4.35 An interactive linked micromap.

5.1 The blue, yellow and red color map used for the average temperatures in the choropleth map.

5.2 The red color map used for the average temperature uncertainties in the choropleth map.

5.3 The blue and yellow color map used for the heat map.

6.1 The final implementation.

6.2 The implemented choropleth map.

6.3 The implemented time slider.

6.4 The implemented heat map.

6.5 The implemented heat map with the time period 1750-2015 instead of 1850-2015.

6.6 The line graph displaying the yearly average temperature readings for the entire world.

6.7 The line graph displaying multiple countries.

7.1 The choropleth map displaying the refined average temperature deviations of the world 1950, 2000 and 2010. Here, the blue colors represent colder temperatures and the red colors represent hotter temperatures.

7.2 The choropleth map displaying the average temperature uncertainties of the world 1850, 1950 and 2010. Here, bright red represents a smaller uncertainty and deeper red a higher uncertainty.

7.3 The heat map representing the monthly values of Sweden.

7.4 The heat map representing the monthly values of the world with a different color scheme.

7.5 The line graph of Canada and Australia.

7.6 The heat map displaying the monthly raw values of the entire world.

7.7 The choropleth map displaying the raw average temperatures of the world 1950, 2000 and 2010.

7.8 The choropleth map displaying the refined average temperature deviations of the world 1950, 2000 and 2010.


List of Tables

4.1 An example of movement data: geographical positions in X and Y are recorded at certain times. The data is an illustrative example, not an actual recording of anything.

5.1 A snippet of the data including the average temperature in countries. Here AT stands for Average Temperature and ATU stands for Average Temperature Uncertainty. All temperature readings are given in °C.


1 Introduction

This thesis was carried out at MindRoad AB and within the Master’s programme in Media Technology and Engineering at the Department of Science and Technology, Linköping University. This chapter gives a brief background of MindRoad AB in Section 1.1, the motivation of the problem in Section 1.2, the aim of the thesis in Section 1.3, the research questions answered by the thesis in Section 1.4, a brief overview of the method in Section 1.5, the delimitations of the thesis in Section 1.6 and lastly the structure of the report in Section 1.7.

1.1 MindRoad AB

MindRoad AB is a company based in Mjärdevi, Linköping, that is also active in other Swedish cities such as Jönköping. The company offers consulting services and employs both low- and high-level developers who work either on in-house projects or on projects at the customer’s site. MindRoad offers cutting-edge competence combined with engaged employees, and its goal is to always have the right competence and experience for the task. Furthermore, it offers courses in relevant frameworks and tools for both internal and external use.

1.2 Motivation

Data analysis and visualization is a fast-growing field. Both companies and researchers are constantly looking for ways to analyze and visualize different types of data. One of the most common types of data is spatio-temporal data: data that has both a spatial and a temporal dimension. This means that data points have a spatial position and a time stamp associated with them. The area is relevant for MindRoad AB since the company gets a lot of questions about data visualization from customers, and from this thesis it will gain more knowledge about the available methods for visualizing spatio-temporal data. The thesis will also lay some of the groundwork for further work and research on the subject within the company.

1.3 Aim

The expected result of the first part of this thesis is a thorough literature study of relevant and interesting visual representations of spatio-temporal data. The aim is to survey different methods proposed for the visualization of spatio-temporal data and to classify these depending on their appropriateness for different data types. Different types of interactivity that can be utilized for these visualizations will also be described and how to correlate the spatial and temporal components of the data will be discussed. The survey results presented in this report are structured with regards to the identified classification.

The second part of the thesis results in several implementations based on some of the methods explored in the survey. However, not all the representations will be implemented, only a selected number of fitting visualizations. The aim is to, with the help of the methods researched in the survey, find different patterns and conclusions in the data and to see if different representations will lead to different conclusions.

1.4 Research questions

1. What are the visual representations most commonly used for different types of spatio-temporal data?

2. Which types of interaction are present in these representations and how do they depend on different data characteristics?

3. Can a meaningful classification of appropriate methods be created depending on the data characteristics?

1.5 Method

The first part of the thesis consisted of a thorough survey of the methods commonly used for visualizing different types of spatio-temporal data. Previous research was considered, and the survey was done by reading relevant books and articles obtained from, e.g., the online library of Linköping University, IEEE Xplore [33] and Google Scholar [34], found by following references or searching for relevant keywords from other articles. After the survey was done, the collection of retrieved methods was classified depending on the type of data. The classification was then tested by implementing some of the representations using the JavaScript library d3.js. A spatio-temporal climate data set was used for these implementations. It is important to note, however, that climate data in itself is not the subject of the thesis; it only serves as an example. What was investigated is how different characteristics of the data can be revealed and explored using different visualization approaches.

At the end of each week a meeting with the external supervisor at MindRoad AB was held. In this meeting the work of the week was discussed and the work for the next week was planned. Roughly every two weeks a meeting with the internal supervisor at Linköping University was held. Similar to the meeting with the external supervisor, the work done since the last meeting was discussed and the work for the coming weeks was planned.

1.6 Delimitations

This thesis only considers spatio-temporal data and methods for visualizing this type of data. Given the limited time span of this thesis work, a selection of data characteristics of interest was chosen together with the supervisors, and only visualizations relevant to these were implemented in order to explore their appropriateness more closely.

1.7 Report structure

The remainder of this report is structured as follows.

Chapter 2 presents background theory and relevant work performed before the thesis.

Chapter 3 describes the method of this thesis and is divided into the pre-study, the survey, and the implementation.

Chapter 4 is the survey of relevant visualization methods for spatio-temporal data.

Chapter 5 describes the implementations done in reference to the survey.

Chapter 6 factually presents the results of the thesis.

Chapter 7 discusses and criticizes the results, as well as the method, and describes the work in a wider context.

Chapter 8 concludes the thesis, answers the research questions posed, and proposes ideas for future work.

2 Background theory and Related work

2.1 Big data

The term big data was coined in 1998 by John R. Mashey [51] and today the field is growing faster than ever. The term denotes a data set that is larger than hardware can process in a reasonable amount of time, or one with very complex relations, and most of the time both. The big data field is concerned with analyzing and processing this kind of data and extracting information from it. The need for efficient data handling methods is larger than ever, since today’s data is continuously collected in vast amounts in diverse areas such as consumer habits, politics and traffic, to mention a few. According to an International Data Corporation (IDC) report, the global data volume will grow to about 44 zettabytes by the year 2020 [16].

Big data is often described by the concept "the five Vs" [41], [42], [53], which are the following characteristics:

• Volume - Indicates the quantity of the data and the amount of data created in a unit of time.

• Velocity - Indicates the speed in which data is generated.

• Variety - Indicates the type of the data like images, audio, numerical and so on.

• Veracity - Indicates the quality and truthfulness of the data.

• Value - Indicates whether it is possible to extract value from the data.

These concepts help us define big data and let us target, focus on, and handle different parts of the data. Interestingly, it has been shown that the sheer volume of the data is not always the biggest problem; more often the difficulty is how to successfully extract value from the data by combining different tools [65].

2.2 Data analysis

Data analysis is the science of extracting correlations and conclusions by analyzing raw data [50]. As databases become enormous, it is impossible to derive conclusions or make any sense of the data just by observing it in its raw form. Therefore, powerful data mining techniques are used in order to derive conclusions, find unexpected relationships, and identify correlations. The goal is to find common factors and patterns between different variables in the data. There is a wide variety of techniques and models for this problem, and a few are considered in this thesis. It is beneficial to divide data mining into tasks with different objectives for analyzing the data [18]:

• Exploratory Data Analysis - The data is explored, without any clear idea of what to look for, through interactive and visual methods.

• Descriptive Modeling - All of the data is summarized and its important features are described in a convenient form.

• Predictive Modeling: Classification and Regression - The goal is to be able to predict a value by analyzing the previous known values. Classification indicates that the variable is categorical, while regression indicates that it is quantitative.

• Discovering Patterns and Rules - The task is, as the name suggests, to find patterns and relationships within the data.

• Retrieval by Content - The goal is to find patterns of interest using already defined similar patterns.

Data is described as qualitative or quantitative. Quantitative data is numerical data expressed in numbers, while qualitative data, also known as categorical data, represents variables with a limited number of discrete values. Examples of quantitative data are sales, height, weight, population and so on. Examples of qualitative data include the color of a car and true or false responses to a question [58].

2.3 Information visualization

The field of information visualization can be described as the study of representing data in a graphical form. This includes visual data representations such as charts, graphs and so on [58]. Visualization in itself is, however, defined as a human activity: the construction of an internal image in the mind. It is thus a cognitive activity, supported by visual representations from which an observer builds an internal mental image. Therefore, a visualization is not something that can be printed on paper or displayed on a computer screen [63].

The goal is to have a visual representation that gives a clear overview and lets the human brain understand and visualize large chunks of data in a single glance.

There are many visual representation techniques, and the choice depends on what should be presented from the available data. More advanced techniques include visualizations such as the sunburst diagram [36] and parallel coordinates [30], to mention two. According to [61], the representation should first give an overview of the entire data and then allow the user to interact with the tool by zooming, searching, filtering or similar functions. This allows the user to get a deeper understanding of the data and gives interesting details on demand. Thus, interactivity is an important feature in data visualization.

2.3.1 Interaction techniques

Several interaction techniques are mentioned in the survey and this section explains them beforehand since they are not always intuitive. Interactions such as zooming and panning are, however, intuitive and are not explained here.

Brushing is a technique quite common in information visualization. It is the process of interactively selecting data items from a visual representation. The purpose of brushing is to highlight specific data items in different visualization displays.
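As a minimal sketch of brushing along a single axis, the function below returns the identifiers of the items whose value falls inside the brushed range. The item fields (`id`, `year`), the sample values and the function name `brush` are hypothetical and chosen only for illustration; they are not taken from the application described in this thesis.

```javascript
// Hypothetical data items; field names and values are assumptions.
const items = [
  { id: "a", year: 1950 },
  { id: "b", year: 1975 },
  { id: "c", year: 2000 },
  { id: "d", year: 2010 },
];

// Return the ids of all items whose value lies inside the brushed
// interval [lo, hi], both ends inclusive.
function brush(data, accessor, lo, hi) {
  // Allow the observer to drag the brush in either direction.
  const [min, max] = lo <= hi ? [lo, hi] : [hi, lo];
  return data
    .filter((d) => accessor(d) >= min && accessor(d) <= max)
    .map((d) => d.id);
}

// The ids returned here are the ones a display would highlight.
const brushed = brush(items, (d) => d.year, 1970, 2005);
```

The returned identifiers, rather than the items themselves, are what other displays need in order to highlight the same data items.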


Linking, which is commonly used with brushing, is described as the process of combining different visualization methods to overcome the shortcomings of a single method. Changes made in one of the displays affect the other displays automatically.
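One possible way to realize linking is a shared selection that every display subscribes to. The sketch below assumes this design; `createSelection`, `mapView` and `lineView` are hypothetical names, and the two "views" only record what they would highlight instead of drawing anything.

```javascript
// A shared selection object that several views can subscribe to.
function createSelection() {
  const listeners = [];
  let selected = new Set();
  return {
    // Register a callback that is invoked whenever the selection changes.
    subscribe(listener) {
      listeners.push(listener);
    },
    // Replace the selection and notify every subscribed view.
    set(ids) {
      selected = new Set(ids);
      listeners.forEach((listener) => listener(selected));
    },
  };
}

// Two stand-in views that only store the ids they would highlight.
const mapView = { highlighted: [] };
const lineView = { highlighted: [] };

const selection = createSelection();
selection.subscribe((ids) => { mapView.highlighted = [...ids]; });
selection.subscribe((ids) => { lineView.highlighted = [...ids]; });

// A brush in any one view updates the shared selection,
// and with it every linked view.
selection.set(["Sweden", "Canada"]);
```

Keeping the selection outside the individual views means that a new display can be linked by adding one more subscriber, without changing the existing ones.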

Marking, also referred to as selecting, is another common technique whose purpose is to mark items in the visualization in order to display more details about that data point, or to distinguish an item in order to, for instance, tag, copy or delete it. Marking usually results in the data item being highlighted.

2.4 Spatio-temporal data

Spatio-temporal data is a type of data that is defined in both space and time. This means that the data contains information about a geographical position at a specific time. Spatio-temporal data has become increasingly common with the development of mobile phones and GPS devices [17]. Typical examples are movement data, where an object traveling between two or more points is recorded as a trajectory; climate data, where, e.g., the change of temperature in a location is recorded; and crime data, where the time and location of a crime are recorded, to mention a few. There are numerous types of spatio-temporal data, and it is important to determine which type is going to be visualized, since different types lead to different problem formulations. The four most common types are [8]:

• Event data - Data that is defined by an event occurring at a specific location at a specific time. Crime data and voting data are examples of event data. Every event has a variable that denotes the location and one that denotes the time. It is also possible to have further variables that do not describe a spatio-temporal relationship; these are known as marked variables [8]. Marked variables provide further information about the event, such as the type of crime committed in a crime data example or which party a person has voted for in voting data.

• Trajectory data - Trajectories are the paths traveled by objects in space over time. Flight data and taxi data are common types of trajectory data. Such data is usually collected by mounting a sensor on the moving object, which records the GPS position at different time stamps. The smaller the time difference between these time stamps, the greater the accuracy of the trajectory.

• Point reference data - Point reference data is data collected from a group of moving reference points. Examples are data collected by weather balloons floating in space or by sensors recording the surface temperature of a water body.

• Raster data - Raster data is defined as data measured in continuous or discrete fields with fixed points in space and time. It is similar to point reference data, but the reference points are static instead of moving. The fixed locations are distributed either regularly in space, such as pixels in an image, or irregularly in space, such as in a ground-based sensor system. Observations are recorded at a fixed set of regularly or irregularly spaced time stamps.
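The four data types above can be illustrated as plain records. The sketch below is only one possible encoding; all field names and values are made up for illustration and do not come from any data set used in this thesis. The helper `maxSamplingGap` is likewise hypothetical and relates to the remark that a smaller time difference between samples gives a more accurate trajectory.

```javascript
// Event data: one location, one time, plus optional marked variables.
const crimeEvent = { lon: 15.62, lat: 58.41, time: 1200, marks: { type: "burglary" } };

// Trajectory data: time-stamped positions of one moving object.
const taxiTrajectory = {
  objectId: "taxi-1",
  samples: [
    { lon: 15.60, lat: 58.40, time: 0 },
    { lon: 15.61, lat: 58.41, time: 30 },
    { lon: 15.63, lat: 58.41, time: 90 },
  ],
};

// Point reference data: a measurement from a reference point that moves,
// so every reading carries its own position.
const balloonReading = { sensorId: "balloon-7", lon: 15.9, lat: 58.6, time: 60, value: -12.5 };

// Raster data: values at fixed grid locations, one grid per time stamp.
// values[t][row][col] is the measurement at time times[t].
const raster = {
  times: [0, 60],
  values: [
    [[1, 2], [3, 4]],
    [[2, 3], [4, 5]],
  ],
};

// Largest time difference between consecutive samples of a trajectory.
function maxSamplingGap(trajectory) {
  let gap = 0;
  for (let i = 1; i < trajectory.samples.length; i++) {
    gap = Math.max(gap, trajectory.samples[i].time - trajectory.samples[i - 1].time);
  }
  return gap;
}

const largestGap = maxSamplingGap(taxiTrajectory);
```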

Even though spatio-temporal data can often be placed in one of these categories, the boundaries are sometimes diffuse. If, for instance, crime data were collected in a raster-type grid such as regions in a city, the event data could be converted into raster data, since it is collected in a static grid system. The opposite is possible as well, since algorithms exist that extract events of interest from different kinds of data. It is therefore entirely possible to extract events from raster data and thus convert it into event data. These types of conversions are possible for the different types of data and depend on the purpose and the questions asked about the data. The questions one wants to ask of the given data determine how it is used and which visual representation designs are relevant.
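The conversion from event data to raster data described above can be sketched as counting events per grid cell and time step. The regular square grid, the parameter names and the sample events are assumptions made for this example; regions in a city would normally be irregular polygons and would need a point-in-polygon test instead.

```javascript
// Count events per grid cell and time step.
// The result grid[t][row][col] holds the number of events that fall
// into that cell during time step t.
function eventsToRaster(events, { x0, y0, cellSize, cols, rows, t0, timeStep, steps }) {
  const grid = Array.from({ length: steps }, () =>
    Array.from({ length: rows }, () => new Array(cols).fill(0))
  );
  for (const e of events) {
    const col = Math.floor((e.x - x0) / cellSize);
    const row = Math.floor((e.y - y0) / cellSize);
    const t = Math.floor((e.time - t0) / timeStep);
    const inside =
      col >= 0 && col < cols && row >= 0 && row < rows && t >= 0 && t < steps;
    if (inside) grid[t][row][col] += 1; // events outside the grid are ignored
  }
  return grid;
}

// Made-up events in an abstract coordinate system.
const events = [
  { x: 0.5, y: 0.5, time: 0 },
  { x: 0.7, y: 0.2, time: 5 },
  { x: 1.5, y: 0.5, time: 12 },
  { x: 9.0, y: 9.0, time: 3 }, // outside the 2x2 grid
];

const grid = eventsToRaster(events, {
  x0: 0, y0: 0, cellSize: 1, cols: 2, rows: 2,
  t0: 0, timeStep: 10, steps: 2,
});
```

The individual events are lost in this conversion; only the counts per cell remain, which is the trade-off between the two data types.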

Apart from the questions one wants to ask of the data, the choice of visual representation is also tightly coupled to the way the geographic phenomena described by the data (data models) occur through space [49]. As suggested by MacEachren [49], and shown in Figure 2.1, different representations are appropriate depending on the type of data, ranging from discrete to continuous, and the degree of spatial dependence of the data items, i.e. data values changing abruptly or smoothly across space.

Figure 2.1: Data models representing points in the continuity-abruptness space. The left image represents the two dimensional representation while the right image represents the three dimensional representation. Left image source: Fig. 9 in [49]. Right image source: Fig. 8 in [49].


3 Method

This chapter describes the method used for this thesis. It is divided into three sections, each describing one part of the project. During the work, two supervisors oversaw the project, one at MindRoad AB and one at Linköping University. Meetings were held once a week with the supervisor from MindRoad and roughly once every second week with the supervisor at Linköping University. Their purpose was to ensure that the project stayed within the aim and that the work proceeded as planned. The work was performed at MindRoad’s offices in Linköping.

3.1 Pre-study

The pre-study consisted of four weeks of research on the topics of data analysis, information visualization and big data. During this period the outline of the project was formed and the problem formulation developed from a generalized problem into something specific. This was important in order to keep the project within the scope of 20 weeks and at the level of a master’s degree. A planning report took form during this phase, but it was altered during the later stages, as were the research questions.

3.2 Survey

The survey started from a base of books and articles with relevant research conducted in the area. Through these, further relevant material was found by searching through the references and finding relevant keywords to use in Google Scholar. All material regarding each identified visual representation of interest was read carefully and summarized in this report. This was done for all the different representations, although more material was available for some sections than for others. To find more recent articles, the Visual Analytics Science and Technology (VAST) conference was explored via the Institute of Electrical and Electronics Engineers (IEEE). VAST is a conference where a large number of new visualization techniques, often regarding spatio-temporal data, are presented.

The visual representations included in this report were surveyed and classified based on the types of spatio-temporal data (as described in Section 2.4) that they were designed for. First, only information about trajectory data was researched and the information found in the books and articles was documented in the report. Thus, the work with the report was started early in the project and has been built upon during the course of the project. The same method was applied to event data, point reference data and raster data, in that order.

3.3 Implementations

The second part of this thesis was to create an implementation of some of the visual representations presented in the literature study for climate data. The implementation was developed as a web-based application, written in JavaScript, with the library d3.js [28]. D3.js is a library used to manipulate documents based on data. It visualizes data using HTML, SVG and CSS. Thus, HTML and CSS were also used in the application. In terms of hardware, the application was implemented on a desktop computer with 8 gigabytes of RAM and a 3.30 GHz CPU.
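The idea of deriving document elements from data can be illustrated without the library itself. The sketch below is plain JavaScript and deliberately does not use d3.js; it only mimics the principle of mapping each data item to one SVG element through a scale. The `readings` values and the helper `linearScale` are hypothetical and not part of the application.

```javascript
// Hypothetical yearly average temperatures (made-up values).
const readings = [
  { year: 2000, temp: 8 },
  { year: 2005, temp: 8.5 },
  { year: 2010, temp: 9 },
];

// Linear mapping from a data interval to a pixel interval.
function linearScale([d0, d1], [r0, r1]) {
  return (v) => r0 + ((v - d0) / (d1 - d0)) * (r1 - r0);
}

const x = linearScale([2000, 2010], [0, 200]);
const y = linearScale([8, 9], [100, 0]); // SVG y coordinates grow downwards

// One <circle> element per data item.
const circles = readings
  .map((d) => `<circle cx="${x(d.year)}" cy="${y(d.temp)}" r="3"/>`)
  .join("");

const svg =
  `<svg xmlns="http://www.w3.org/2000/svg" width="200" height="100">${circles}</svg>`;
```

A library such as d3.js additionally keeps the elements in sync when the data changes, which this string-based sketch does not attempt.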

The work with the implementation was started when most of the survey was completed. This was because the implementation was supposed to build on the survey and the methods had to be researched first. The survey explores several visualization methods but only two, a choropleth map and a heat map, were implemented in the application. The application also includes two line graphs that were not explored in the survey since they are trivial and not directly related to spatio-temporal data.


4 A survey of visual representations of spatio-temporal data

This chapter presents some of the most interesting types of visual representations within the different sub-categories of spatio-temporal data as well as ways of relating spatial and temporal information. It also describes different ways of letting the observer interact with different visualizations. The most common way of representing spatio-temporal data (mostly spatial) is on a cartographic map since it gives a clear indication of the locations and the relations between them [5]. Therefore, most of the visual representations presented here are based on maps. Spatio-temporal data is classified into four classes (event data, movement data, raster data and point reference data), each corresponding to a section of this chapter. Most of the visualizations explored here are geovisualizations grouped by the visualization capabilities they offer for these four classes of data.

4.1 Event data

Spatial events are described as interesting occurrences in a specific location at a specific time. Andrienko et al. [6] describe spatial events as "Physical or abstract entities, such as lightning strikes or mobile phone calls, which occur at some time moments at particular locations and have limited existence times". Other events could be disease outbreaks, crimes or elections, and several types of visualizations are utilized to analyze this data. Spatio-temporal event data always includes information about where the event occurred and at what time, but can also include additional thematic attributes such as, for example, what type of sub-event has occurred, what type of disease is spreading or what type of airplane crashed.

4.1.1 Event maps

Spatial events can be represented as points in space and time and can be visualized as icons on a map. Event maps have icons spread out on the map, each representing an event. In an event map, these icons are directly related to the spatial position of the event. The event map, while often being the main view, can be combined with other displays to introduce the temporal components of the data. Examples of such displays are interactive time lines, temporal distribution displays and temporal brushes, to mention a few. These allow the observer to advance the time by some value (a day for example), to see the distribution of events over a time period and to define the time interval for visualization.
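The temporal controls described above can be sketched as two small functions: one that keeps the events inside a chosen time interval, and one that counts events per day for a temporal distribution display. The event records, the function names and the choice of a half-open interval are assumptions made for this example.

```javascript
const DAY = 24 * 60 * 60 * 1000; // milliseconds per day

// Made-up events with time stamps in milliseconds since the Unix epoch.
const events = [
  { id: 1, time: Date.UTC(2020, 0, 1, 10) },
  { id: 2, time: Date.UTC(2020, 0, 1, 23) },
  { id: 3, time: Date.UTC(2020, 0, 2, 8) },
  { id: 4, time: Date.UTC(2020, 0, 4, 12) },
];

// Temporal brush: keep events inside the half-open interval [start, end).
function inInterval(data, start, end) {
  return data.filter((e) => e.time >= start && e.time < end);
}

// Temporal distribution: number of events per day, counted from `start`.
function countsPerDay(data, start, days) {
  const counts = new Array(days).fill(0);
  for (const e of data) {
    const i = Math.floor((e.time - start) / DAY);
    if (i >= 0 && i < days) counts[i] += 1;
  }
  return counts;
}

const start = Date.UTC(2020, 0, 1);
// Events shown on the map for the first day.
const firstDay = inInterval(events, start, start + DAY);
// Advancing the time by one day shifts the interval.
const secondDay = inInterval(events, start + DAY, start + 2 * DAY);
// Bar heights for a four-day temporal distribution display.
const distribution = countsPerDay(events, start, 4);
```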


Figure 4.1: An example of an event map using icons to represent events, in this case crimes, on a map. Different icons and colors represent different crimes. If a number is used then multiple crimes have occurred at that location. The left bar allows the observer to select which time period to visualize, which location to show, to filter which types of crime the application takes into consideration and to see further reports and graphs of the data. Image source: https://www.sanfranciscopolice.org/stay-safe/crime-data-and-maps/crime-maps.

The symbols on the event map are rendered at the location where the event occurred, and additional visual cues such as color, size, shape and opacity can be used to visualize further attributes of the data. For example, if a series of earthquakes that occurred within a given time interval is visualized on a map of the world, the magnitudes of these earthquakes could be represented by the color saturation of the dots.

Icons are commonly used to visualize the events themselves or to visualize the thematic attributes of the data. In Figure 4.1 for example, crime data is visualized by using icons instead of dots. Color still represents different types of crime but the icons bring another aspect since, if used correctly, the observer can achieve a faster understanding of the data. This also depends on the icons since different icons can mean different things in different cultures. In the image the icons are replaced by numbers if multiple occurrences of crimes have taken place at that location. This leads to a less cluttered map but the observer would have to go one extra step and click the icon in order to get more details on what has happened there.

A common type of interaction in these event maps is to mark a symbol in order to find out more information about the event. Other interactive elements are, as already mentioned, time-related activities such as advancing time and defining the time interval (brushing). Filtering the data by attribute, area or time is also common, as are zooming and panning the map.
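The filtering and brushing interactions described above can be sketched in a few lines. This is only a minimal illustration: the `Event` record, the `brush` function and the sample events are hypothetical and are not taken from any of the surveyed applications.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    lon: float
    lat: float
    time: datetime
    kind: str  # thematic attribute, e.g. the type of crime


def brush(events, start, end, kinds=None):
    """Return the events inside [start, end], optionally restricted to some kinds."""
    return [
        e for e in events
        if start <= e.time <= end and (kinds is None or e.kind in kinds)
    ]


# Hypothetical sample data, for illustration only.
events = [
    Event(-122.42, 37.77, datetime(2020, 1, 1, 9, 0), "theft"),
    Event(-122.41, 37.78, datetime(2020, 1, 1, 23, 30), "assault"),
    Event(-122.40, 37.79, datetime(2020, 1, 2, 8, 15), "theft"),
]
first_day_thefts = brush(
    events, datetime(2020, 1, 1), datetime(2020, 1, 1, 23, 59), {"theft"}
)
```

Only the events returned by `brush` would then be drawn as icons on the map, so moving the time brush or toggling an attribute filter amounts to calling the function again with new arguments.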

4.1.2 Spatio-temporal event visualization in three dimensions

Three-dimensional perspectives can be used in order to represent the temporal aspects of the data and can be relevant for the visualization of the movement of objects in three dimensions. In three-dimensional visualizations the temporal component can be represented by an axis, as seen in Section 4.1.2.1. This allows the visualization to represent both the temporal and spatial components of the data in a single overview.

4.1.2.1 Space-time (event) cube

A space-time cube is a technique that shows the spatio-temporal data inside a cube, where the X- and Y-axes denote the spatial components and the Z-axis (height) denotes the temporal component [15]. Its strength is that it gives the user the full spatio-temporal information at a single glance, in contrast to the 2D representations shown previously, where the temporal aspects often have to be shown using time sliders, animations or similar [46]. A space-time cube can be used for many types of spatio-temporal data and it is common to use it for trajectory data (it is brought up again in the section about trajectory data in three dimensions).

Event data can be visualized in a space-time cube by displaying the events as dots whose vertical positions correspond to the times at which the events occurred. As in the two-dimensional representations, the position of the dots in reference to the map represents the geographical position of the event. Color and size can be used to represent thematic attributes of the data here as well [14]. In Figure 4.2 a space-time event cube can be seen.
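As a minimal sketch of this mapping, the following code places events in a cube by keeping their geographical coordinates and scaling their time stamps linearly to the height of the cube. The function name `to_cube`, the linear scaling and the sample events are assumptions made for illustration; they are not taken from [14] or [15].

```python
from datetime import datetime


def to_cube(events, height=1.0):
    """Map (lon, lat, time) events to (x, y, z) with z scaled to [0, height]."""
    t0 = min(t for _, _, t in events)
    seconds = [(t - t0).total_seconds() for _, _, t in events]
    span = max(seconds) or 1.0  # avoid division by zero if all times are equal
    return [
        (lon, lat, height * s / span)
        for (lon, lat, _), s in zip(events, seconds)
    ]


# Hypothetical events, for illustration only.
events = [
    (15.62, 58.41, datetime(2020, 3, 1, 8, 0)),
    (15.60, 58.40, datetime(2020, 3, 1, 14, 0)),
    (15.65, 58.42, datetime(2020, 3, 1, 20, 0)),
]
cube_points = to_cube(events)
```

The earliest event ends up on the base map (z = 0) and the latest at the top of the cube, so restricting the displayed time span only changes `t0` and `span`.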

Figure 4.2: An example of a space-time event cube. The X- and Y-axes represent the geographical position of the event and the Z-axis (vertical axis) represents the time at which the event occurred. Image source: Figure 2 in [14].

The observer should be able to interactively explore the data further by, for example, marking the dots in order to get more information about the event. The dot could be highlighted and information could pop up in a legend. The cube should support zooming, rotation and panning in order to let the observer achieve the optimal view. Since this is a 3D representation, there is a high risk of data points blocking each other, making it difficult to see some of the data. The ability to select the time span of the events to show reduces clutter and is a relevant feature for data with long time spans or for large data sets.

The cube can be dynamically linked to a 2D representation of the same map in order to enhance the understanding of the dataset’s spatial attributes. If the observer marks a dot in the cube the corresponding dot should be highlighted on the 2D map. The cube can be linked to other types of displays which is an important feature that allows further exploration and understanding of the data [14].

4.1.2.2 GeoTime

An interesting implementation of the space-time cube is GeoTime [32]. GeoTime is an event-driven software framework that uses a variant of the space-time cube, often in combination with other displays. The purpose of GeoTime is to convey the spatio-temporal information at a single glance, instead of having to use multiple displays [13]. Similar to the space-time cube, the events are represented in an X, Y and Z (sometimes called T) coordinate space. The X- and Y-axes represent the geographical space and the Z-axis represents time into the future and the past. The points above the X-Y plane represent the possible positions of future events and the points beneath the plane represent the positions of past events. The current position of the plane is called the instant of focus [44]. An event is animated relative to a time slider. GeoTime can be further built upon by adding icons and images to describe the events and activities.

GeoTime can also be used to track an object in its geographical space over time. This is visualized as a trajectory, known as an entity trail, and along this trail the event dots are located according to their geographical location and time, as described above. Figure 4.3 illustrates this concept. If GeoTime is used to track an object, and thus to visualize a trajectory, the data can be argued to be a type of movement data that belongs in that section. However, since the events occurring in the data are still the aspect of most interest, GeoTime is kept in this section about event data.

Figure 4.3: Two examples of GeoTime. Both examples have the Space-time cube as main view and a time slider at the bottom. The left image source: Figure 3 in [13]. The right image source: Figure 7 in [44].

In addition to more common types of interaction such as filtering, selection and grouping, GeoTime uses some customized ways of interaction. A linear scale representing time is visible beneath the space-time cube in the right part of Figure 4.3 and, by moving the slider, the time is shifted by some value into the future or the past. Colors are used in GeoTime to show whether the displayed events are in the past or the future. In Figure 4.3, while not easy to identify here, the red range represents the past while the blue range represents the future. If the time slider is altered, the events are animated: the event(s) move continuously along the map and the timelines animate up and down [44].

4.2 Movement data

Movement data, as the name implies, is a broad data category concerned with the movement of objects. The positions of moving objects can be collected at different temporal resolutions and by different means. GPS sensors are most commonly used for collecting this type of data by continuously registering the position of an object. But movement data can also be extracted, for example, from geo-referenced photographs or Twitter posts.

4.2.1 Trajectories

Trajectories are usually represented as lines on a map and, depending on the application, the lines give different amounts of information. The most basic form is a line between two points, which gives information about the movement of an object. The temporal component is a time line with a start and an end point, usually with points in between, while the spatial information is the locations recorded at these time stamps. In other words, the spatio-temporal components of trajectory data are several position readings in a sampled time interval. In Table 4.1 an example of this type of trajectory data can be seen.

Index  x      y     Time
0      59.65  5.51  19-10-24 14:40:05
1      60.01  5.55  19-10-24 14:40:10
2      60.03  5.77  19-10-24 14:40:15
3      60.10  5.86  19-10-24 14:40:20
4      60.17  5.87  19-10-24 14:40:25

Table 4.1: An example of movement data, where geographical positions in x and y are recorded at certain times. The data is illustrative and not an actual recording.

Trajectory data can help the understanding of human behavior during an unusual event such as a concert or sports game or some sort of crisis or disaster such as a hurricane or terrorist attack. For example, patterns can be found and compared in detail in transportation flows before and after an event. It’s also very common to use trajectories when analyzing traffic flow which can help with city planning.

Figure 4.4: Visualization of traffic in Milan on Monday, 2 April 2007. The left image has trajectories drawn fully opaque and the right has trajectories drawn with 5% opacity. Image source: Fig. 1.21 in [5].

A single trajectory does, however, not give much information, and this visual representation is mostly used in GPS devices when a user wants to find the path to some location. Therefore it is more common to use a map where multiple trajectories have been mapped, as seen in Figure 4.4. These could be several trajectories of a single object, such as a person’s movement data over multiple days, or the trajectories of multiple objects, like the traffic data in a city.

Overplotting is a big problem in trajectory maps and one way to solve that problem is to plot all the trajectories but with a lower opacity. This leads to less clutter but also results in data loss since some of the data is hidden in order to better show the areas with a high frequency of trajectories. In Figure 4.4 two maps with traffic data during a day can be seen. One has drawn all lines fully opaque and one drawn with 5% opacity.
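Why a low opacity brings out the frequently travelled areas can be illustrated with a small calculation. The sketch below assumes that overlapping lines are blended with standard "over" alpha compositing, which the source does not state explicitly; the function name `stacked_opacity` is hypothetical.

```python
def stacked_opacity(alpha, n):
    """Resulting opacity after drawing n overlapping strokes of opacity alpha."""
    return 1.0 - (1.0 - alpha) ** n


# Opacity reached where 1, 10, 50 and 100 trajectories overlap at 5% opacity.
for n in (1, 10, 50, 100):
    print(n, round(stacked_opacity(0.05, n), 3))
```

Under this assumption a single trajectory is barely visible, about ten overlapping trajectories reach roughly 40% opacity, and only heavily used roads approach full opacity. This is also where the data loss comes from: isolated trajectories fade almost completely.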

Trajectory data can also have thematic attributes associated with them. These could be data describing the direction of movement, the speed of the object or the slope along which the object is ascending (or descending), just to mention a few. The thematic attributes of the data can be visualized by, for example, using color for speed. One problem with the maps shown in Figure 4.4 is that they do not give much information about the data’s thematic attributes, such as direction, so the flow of the traffic cannot be analyzed.

Figure 4.5: In the left image a temporal bar chart is shown. It shows temporal variation in positional attributes within the trajectories. In the right image a trajectory map with a highlighted trajectory is shown. Image source: Figure 4.6 in [5].

In order to get more information about the data’s thematic attributes and to be able to distinguish some trajectories from the cluster, this view can be dynamically linked to another visual representation of the data. In Figure 4.5 a map with multiple trajectories is combined with a temporal bar chart. In this way the user can find interesting points in time and mark them with the cursor. When the cursor hovers over a point in the chart, the corresponding trajectory is highlighted. The temporal bar chart can show more information about the thematic attributes of the data, for example the start and end time of the travel, its duration and the average speed. This is one of many ways of combining different visual representations in order to get more information from the data.

4.2.1.1 Interpolation and map matching

Trajectory data consists of several coordinate readings at several time stamps, and sometimes the interesting aspect to analyze is the path between these coordinates. In order to extract that path, the data has to be interpolated. The interpolation can be done using some geometric interpolation or a Bézier curve [5].
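The simplest geometric interpolation is linear interpolation between the two readings that surround the requested time. The sketch below uses the illustrative positions from Table 4.1, with time expressed as seconds since the first reading. The function name `interpolate` and the tuple layout are assumptions, and the samples are assumed to be sorted by strictly increasing time and to contain at least two readings.

```python
from bisect import bisect_right


def interpolate(samples, t):
    """Linearly interpolate (x, y) at time t from time-sorted (t, x, y) samples."""
    times = [s[0] for s in samples]
    if not times[0] <= t <= times[-1]:
        raise ValueError("t lies outside the sampled interval")
    # Index of the first sample after t, clamped so that t == times[-1] works.
    i = min(bisect_right(times, t), len(samples) - 1)
    (t0, x0, y0), (t1, x1, y1) = samples[i - 1], samples[i]
    w = (t - t0) / (t1 - t0)
    return x0 + w * (x1 - x0), y0 + w * (y1 - y0)


# Seconds since 14:40:05, with the positions from Table 4.1.
track = [(0, 59.65, 5.51), (5, 60.01, 5.55), (10, 60.03, 5.77),
         (15, 60.10, 5.86), (20, 60.17, 5.87)]
x, y = interpolate(track, 7.5)  # halfway between readings 1 and 2
```

A Bézier curve would replace the straight segment between readings with a smooth curve, at the cost of having to choose control points.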

Figure 4.6: An example of map matching. Here we see the raw recorded data as red dots with the lines representing the interpolation. The black line is the track snapped onto the road. Image provided by ©OpenStreetMaps.

The problem with interpolation, however, is that a low sampling rate increases the risk of an inaccurate trajectory, since the coordinate points are further apart and the interpolation thus becomes less reliable. In order to overcome this problem, it is sometimes fitting to match the raw recorded data to a logical model of the real world, such as matching the recorded GPS coordinates to a road, as seen in Figure 4.6.

This is done by taking the raw data and snapping it to the edges of an existing street graph. Many applications use this pre-processing step. For example, it is widely used in areas such as traffic flow analysis and moving objects management [48]. There are numerous algorithms, both online and offline, that achieve this and the area is still under research. One set of algorithms is called geometric algorithms; these often use the road segment closest to the recorded coordinate in order to construct the trajectory [55]. Note that interpolation and map matching are not mutually exclusive; sometimes the data is interpolated and then matched to a map.
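A minimal sketch of the geometric idea is given below: each recorded point is projected onto every road segment and snapped to the closest projection. It assumes planar coordinates and a street graph given as a plain list of segments; the function names and the sample roads are hypothetical, and the algorithms surveyed in [55] additionally consider aspects such as road connectivity and heading.

```python
import math


def project_onto_segment(p, a, b):
    """Closest point to p on the segment a-b (all points are (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:  # degenerate segment: both end points coincide
        return a
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp the projection to the segment
    return (ax + t * dx, ay + t * dy)


def snap_to_roads(p, segments):
    """Snap p to the nearest point on any road segment."""
    candidates = (project_onto_segment(p, a, b) for a, b in segments)
    return min(candidates, key=lambda q: math.dist(p, q))


# Hypothetical street graph with two connected segments.
roads = [((0.0, 0.0), (10.0, 0.0)), ((10.0, 0.0), (10.0, 10.0))]
recorded = [(2.0, 1.0), (9.0, 4.0), (12.0, 11.0)]
snapped = [snap_to_roads(p, roads) for p in recorded]
```

Snapping each point independently is what makes this approach simple, but it can jump between parallel roads when the readings are noisy.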

4.2.2 Moves

A move is defined as a flow between two locations. In Figure 4.4, the flow of the objects, and thus the moves, cannot be determined easily. Therefore, it is sometimes more interesting to visualize the aggregated moves between neighboring locations. Consider Figure 4.7: it has a lot of similarities with Figure 4.4 but is a lot less overplotted, and the moves of the objects are visualized together with their directions. These types of maps are often used in traffic visualization.

This type of map uses spatial aggregation in order to reduce clutter, and the width of the arrows represents the number of times cars have driven along the way. The trajectories are divided into segments, so that not only the trajectories’ start and end positions are taken into account but the intermediate points as well. One method for this is called time-based division [4], whose result is the aggregated moves corresponding to different time intervals. This can be achieved by, for instance, including an additional display dimension.

Figure 4.7: A flow map of the traffic in Milan. The trajectories are aggregated and only flows representing 50 or more trajectory segments are shown. Image source: Fig. 1 (d) in [4] and the data was provided by Comune di Milano (Municipality of Milan).

Another method is called place-based division [4] and here the sequences of visited places of interest by the object are found and then each trajectory is divided into segments between these places. Then, trajectory-segments with the same start and end positions are put together as summarized moves between these places. Both of these methods do however require pre-defined information about relevant places.

In the case of Figure 4.7, the method consisted of extracting specific points from the trajectories, grouping them by spatial proximity and using the centers of the groups as generating points for a Voronoi tessellation of the territory. The Voronoi tessellation results in cells that are then used as the places for aggregating movement data and building flow maps. The degree of generalization depends on the sizes of these cells, which in turn depend on the spatial extents of the point groups. This method allows spatial aggregation without any pre-existing knowledge about places of interest [4].

In order to further reduce clutter, the visualization could show the flows only when the number of trajectory segments is above a fitting threshold.
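The aggregation step can be sketched as follows. A point belongs to the Voronoi cell of its nearest generating point, so no explicit tessellation is needed to count moves between cells. The sketch assumes that the generating points are already known (the extraction and grouping of characteristic points described in [4] is not reproduced here), and all names and sample values are hypothetical.

```python
import math
from collections import Counter


def nearest_cell(p, generators):
    """Index of the Voronoi cell containing p, i.e. its nearest generating point."""
    return min(range(len(generators)), key=lambda i: math.dist(p, generators[i]))


def aggregate_moves(trajectories, generators, threshold=1):
    """Count moves between cells and keep those seen at least `threshold` times."""
    counts = Counter()
    for trajectory in trajectories:
        cells = [nearest_cell(p, generators) for p in trajectory]
        for src, dst in zip(cells, cells[1:]):
            if src != dst:  # consecutive points in the same cell are not a move
                counts[(src, dst)] += 1
    return {move: n for move, n in counts.items() if n >= threshold}


# Hypothetical generating points and trajectories, for illustration only.
generators = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
trajectories = [
    [(1.0, 0.5), (4.0, 0.0), (9.0, 1.0), (10.0, 9.0)],
    [(0.0, 1.0), (8.0, 0.0), (9.5, 8.0)],
    [(10.0, 8.0), (9.0, 0.5)],
]
flows = aggregate_moves(trajectories, generators, threshold=2)
```

Each remaining entry of `flows` would be drawn as one arrow between two cells, with the count mapped to the arrow width. The threshold plays the same role as the limit of 50 trajectory segments in Figure 4.7.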

4.2.3 Flow map

Flow maps are most often used to show large quantities of objects traveling between nodes. It is a common representation in areas such as migration and trade, and for data with a from-to relationship. The typical flow map uses a flow tree where the source of the flow is the root while the distribution target nodes are the leaves [68]. The widths of the flows are scaled relative to the quantities of items that are being transported [68]. If there are several flows going from the root to the same node they merge and the trajectory grows in width. This process is called edge merging [56] and is a type of spatial aggregation [4]. Edge merging is one of the strengths of flow maps and results in reduced visual clutter.
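The effect of edge merging on the widths can be sketched with a small flow tree: every edge carries the sum of the quantities of all leaves reached through it. The tree layout itself (routing and distortion) is outside the scope of this sketch, and the dictionary representation, the function name `edge_widths` and the sample quantities are assumptions made for illustration.

```python
def edge_widths(parent, quantity):
    """Width of each tree edge (keyed by its child node) as the summed leaf flows."""
    widths = {}
    for leaf, amount in quantity.items():
        node = leaf
        while node in parent:  # walk up to the root, adding this leaf's flow
            widths[node] = widths.get(node, 0) + amount
            node = parent[node]
    return widths


# Hypothetical flow tree: root -> hub -> {a, b} and root -> c.
parent = {"hub": "root", "a": "hub", "b": "hub", "c": "root"}
quantity = {"a": 30, "b": 20, "c": 5}
widths = edge_widths(parent, quantity)
```

Here the shared edge from the root to the hub gets width 50, which is then split into edges of width 30 and 20. Drawing one wide edge instead of two overlapping ones is what reduces the clutter.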

The trajectories are often represented as arrows, which allows the user to analyze the flow of the objects. According to [56], the foundations of a well-drawn flow map can be achieved by taking three aspects into consideration: merging of the edges that share the same goal, smart distortion of positions and good routing of the edges. Good routing is important since, even if the edges merge, trajectories that are drawn straight risk being drawn over each other if there is a lot of data with destinations close to each other. In Figure 4.8 a typical example of a flow map can be seen.

Figure 4.8: A typical example of a flow map with multiple trajectories. Here we see the export of softwood lumber from British Columbia in 2014. The width of the arrows represents the quantity of wood exported.

Even though flow maps use techniques such as merging, they usually suffer greatly from overplotting when the data sets become very large. Therefore, they might only be fitting when the data set is reasonably small or when the data is filtered, e.g. by only looking at a selected number of nodes.

Interactivity can also reduce clutter considerably. For example, say that an application shows the exports of various goods from many countries. This application would probably be very cluttered, but if the user could define which types of goods or which countries to show, a lot of the clutter would disappear, provided the application still uses some aggregation technique such as edge merging. This way the analyst can work with segments of the data and not be overwhelmed by large quantities of data and overplotting.

Interactivity is also important since being able to mark and select different parts of the visualization can give more information. For example, a user could select one of the trajectories and a legend could pop up giving further information about the quantity of items.

4.2.4 Density trajectory maps

A density map is a visual aggregation that uses kernel density estimation to generalize the behavior of multiple trajectories [59]. The visualization aggregates the trajectory data into multiple density fields [60]. Density maps are mostly used for visualizing single attributes but can be used for the visualization of trajectories as well. In the method proposed in [52], the trajectories are smoothed using a small and a large kernel, resulting in two density fields that are then aggregated. The large kernel density field is also used for the color mapping and the aggregated field is used for the illumination. In the final image the gray-scale illumination image and the color image are multiplied.
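The first steps of such a pipeline can be sketched as below. This is not the implementation from [52]: it assumes unnormalized Gaussian kernels, trajectories that are already sampled into points, and a plain sum as the aggregation of the two fields, and it leaves out the color mapping and the illumination. All names, grid sizes and kernel widths are hypothetical.

```python
import math


def density_field(points, width, height, sigma):
    """Sum of Gaussian kernels with standard deviation sigma, sampled on a grid."""
    field = [[0.0] * width for _ in range(height)]
    for px, py in points:
        for row in range(height):
            for col in range(width):
                d2 = (col - px) ** 2 + (row - py) ** 2
                field[row][col] += math.exp(-d2 / (2.0 * sigma ** 2))
    return field


# Points sampled along two hypothetical trajectories that share a corridor.
points = [(x, 5.0) for x in range(2, 18)] + [(x, 6.0) for x in range(8, 18)]
small = density_field(points, 20, 12, sigma=0.7)  # keeps individual lanes visible
large = density_field(points, 20, 12, sigma=3.0)  # shows the overall traffic volume
combined = [[s + l for s, l in zip(rs, rl)] for rs, rl in zip(small, large)]
```

The small kernel preserves detail close to the trajectories while the large kernel spreads the density over a wider area, which is why the two fields complement each other.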

Figure 4.9: The pipeline proposed by [52] of the rendering of vessel trajectory density maps. The trajectory data is smoothed using two different kernels of different size resulting in two different density fields. These are then aggregated and the large sized kernel field is then used for the color mapping and the aggregated fields are used for the illumination. These are then multiplied and result in the final density map visualization.

In Figure 4.9 a visualization of this proposed method can be seen. Color can be used to represent different time periods during which the objects have traveled. Consider a map with multiple density field trajectories with data recorded during a day. Then, colors could be used for representing morning, afternoon, evening and night for example. Furthermore the saturation can be linked to the density field contribution and the hue given by the period of the day with the highest density [60]. The result of this method can be seen in Figure 4.10, where a density map is used to represent vessel traffic in front of Rotterdam during a single day.

Figure 4.10: The density trajectory map displaying vessel traffic in Rotterdam. Here, each color represents a quarter of a day. Bright yellow represents morning, dark yellow represents afternoon, bright blue represents evening, and dark blue represents night. Image source: Figure 1. in [60].

Density maps are often used for anomaly detection and risk analysis but are quite limited and not suitable for many types of data [59]. The method is thus useful for analyzing vessel movements in harbors or similar areas, but can be adapted and used for other tasks as well. The downside of density trajectory maps is that they are mainly intended for expert users [59], [60], since they are difficult to analyze and to set up and require a lot of data preprocessing. The sizes of the kernels can be varied depending on what should be analyzed.

4.2.5 Trajectory visualization in three dimensions

Just as with event data, trajectory data can make use of three-dimensional visualizations. One example is flight data: since planes move in a three-dimensional space, it is sometimes fitting to look at the three-dimensional trajectory.

4.2.5.1 Space-time (trajectory) cube

The space-time cube has already been brought up in this thesis, but then for event data. The space-time cube is however more commonly used for trajectory data [45].

Figure 4.11: Two space-time cubes. The one to the left represents the travel of one person on an average Thursday. Image source: Figure 1. in [45]. The right figure represents Napoleon’s Russian campaign in 1812. Image source: Figure 1 in [14].

As seen in Figure 4.11, the trajectory goes along a map just like in the two-dimensional representation. However, the trajectory also extends along the vertical axis, which represents time. Looking, for example, at the part of the trajectory that represents "Work" in the left image of Figure 4.11, the trajectory does not move along the x- or y-axis. This is because the person stayed at work from 08:10 to 17:30. The problem with space-time cubes is that the more data one wants to analyze simultaneously, the more likely the visualization is to become too cluttered and hard to understand. This problem becomes even greater in three dimensions, since it is easier for data to block other data. Therefore, it might be important to let the user filter out some of the data when a multitude of trajectories are to be analyzed. Ways of filtering are, for example, to only show data collected during a certain time period, to only show flows with a magnitude greater than a set threshold or to change the opacity of the trajectories. Being able to interactively manipulate the view by zooming, rotating or shifting is an important feature, since it makes the effects of trajectories blocking each other less extreme and allows the user to correct the perception of the information [5]. The temporal component is always present in the space-time cube and automatically introduces dynamics [45].

4.2.5.2 Tubes and ribbons

Movement in three dimensions can be represented as tubes or ribbons. This way the user gets a clearer indication of movement in height over time, which is why this approach is common for e.g. flight data. Just as in previous examples, the color of a trajectory can denote other thematic attributes of the data such as speed or slope. Tubes are better at visualizing very curved paths, like the trajectory of a paraglider seen in the left image of Figure 4.12. Ribbons, however, are better at showing paths with softer curves, such as the trajectory of an airplane. Another advantage of ribbons is that different thematic attributes can be shown by having icons go along the ribbon, as seen in the right image of Figure 4.12 [69]. These representations also suffer from cluttering. However, if many trajectories go in the same direction, the visualization can be quite useful for analyzing a larger number of trajectories [5].

Figure 4.12: The left figure represents the path of a paraglider with a tube as the trajectory. The intensity of the slopes is represented by colors: shades of blue for positive slopes and shades of red for negative slopes. The right image is the trajectory of an airplane coming in for landing. The arrow icons represent the direction of the plane. The coloring of the ribbon represents the speed. Image source: [69], courtesy of Katerina Vrotsou and Carlo Navarra.

4.2.5.3 Stack based trajectories

Color-coded bands are used to build a set of trajectories by stacking these bands along a path. Like many other three-dimensional geovisualizations a two-dimensional map is the base of the visualization and the trajectory data is then mapped along the Z-axis. This method helps with overplotting since if, for instance, the data consists of multiple trajectories going along the same path, in a two-dimensional environment these would be plotting over each other while in three-dimensions they would stack up allowing the trajectories to be built as a wall. This way they can still be seen by the observer and less information is lost due to overplotting. If the trajectories are stacked in a chronological order temporal aspects are included in the main overview. In Figure 4.13 a stack based solution can be seen.

Figure 4.13: A stack based trajectory visualization of CPM-values along the Tokyo-Fukushima highway. Image source: Fig. 1. in [67].

The temporal component of the visualization can be examined through a time lens, which is a circular display that the user can dynamically query. Further, the time can also be shown via a time graph that shows the trajectories as horizontal bands whose colors indicate the time dependency [67]. These displays are dynamically linked, allowing the observer to find interesting patterns in the spatio-temporal dimension. It is, however, important that the color-coding of the data values is appropriate. Further attribute information, such as flow, can be visualized via icons on the bands. Flows can be visualized by having arrows on these bands, pointing in the direction of the trajectory, as seen in Figure 4.13. Color changes in the horizontal dimension can signify spatial changes, gradual changes along the vertical dimension can signify temporal changes and gradual changes in the diagonal dimension can signify spatio-temporal changes [67].

As mentioned, the temporal aspects can be analyzed via a time lens. The time lens is a circular display that consists of two components. The first component is an interior that shows the spatial aspects while the second is the outer ring that visualizes the temporal aspects. The interior shows the trajectory data for a circular queried area from the trajectory wall. The observer can move the query circle in order to determine which trajectory points are shown in the time lens. The outer ring is segmented in relation to the time model of the data, for example it can be segmented into seven parts that each represents a day of the week. The segments are then filled to visualize the temporally aggregated information about the trajectories [67].

The time lens for this application can be seen in Figure 4.13 in the lower right corner.

4.3 Raster data

Raster data is data collected from a grid system of static observers. A common type of raster data is climate data recorded by a set of weather stations, or data from sensors recording quantities such as humidity, radiation or temperature. Raster data is sometimes constructed as a spatial grid with some resolution, similar to a digital image. These two types are called irregular spaced and regular spaced raster data, respectively [8]. In Figure 4.14, regular and irregular raster grids are shown.

Figure 4.14: In (a) a regular spaced raster grid. In (b) an irregular spaced raster grid. Image source: Fig.3 (a) and (b) in [8].

4.3.1 Raster map

Raster maps are represented by a grid-system displaying the pattern of the spatio-temporal data. The data can be data with a spatial resolution of some degree in longitude and latitude with some time scale. For example in the right image in Figure 4.15, the data is monthly temperatures in the world (though only Europe is shown in the image) with a spatial resolution of 0.5 degrees in both longitude and latitude. The grid is based on spatial interpolations constructed by using data from weather stations.

The application shown in Figure 4.15 is called the Global Climate Monitor [9], [37] and its purpose is to make climate change data easier to understand for non-scientists. The color of the raster squares represents the temperature. The observer can change the time using the drop-down menu to the left in the application. In this case, it is also possible to change what kind of data is shown: monthly values, annual values, normals or trends. As seen in the left image in Figure 4.15, the mean temperatures between 1901-2012 are shown instead of only the temperature for a month. Furthermore, the application also allows the user to zoom and pan the map and to alter the transparency of the grid.

Figure 4.15: Two raster maps. The left image shows the mean temperatures in Europe between 1901-2012 while the right image shows the monthly temperature in September 2019. Red shades stand for warmer temperatures while blue shades stand for colder temperatures. Images generated with https://www.globalclimatemonitor.org/.

4.3.1.1 Raster maps in three dimensions

Taking the concept of raster maps one dimension further allows some attribute of the data to be represented along the Z-axis. Such an attribute could, for example, be a numeric amount or temporal information. In Figure 4.16 a raster map with a spatial resolution of 250 m x 250 m and a temporal resolution of 15-minute intervals is shown. The images show the overall network activity during an average work day in Udine, Italy. The height and color of the raster squares stand for the network activity in that area. Colors could be used to display other thematic attributes of the data as well.

Figure 4.16: Snapshots from a 3D raster map that shows the overall telecom network activity during a working day in the city of Udine, Italy. Image source: Fig. 1 in [57].

4.3.2 Isarithmic maps

Isarithmic maps are also known as contour maps and can represent smooth, continuous data. There are many ways of representing these types of maps but two popular representations are the isopleth and the isometric map. They use contour lines for interpolated discrete raster data from different areas in space [11]. Several regions on a map are bounded by the contour lines and all the regions within the same area represent similar values [12]. They utilize colors, especially value and hue, in order to convey the information.

These types of maps use so-called control points that consist of one of two types of data: either true point data, which is the actual data value recorded at that location, or conceptual data, which is data recorded over an area [62]. When the data is true point data, the contour lines represent the specific values and the map is called an isometric map. In contrast, for conceptual data the contour lines only approximate the data values, and this type of map is called an isopleth map. In Figure 4.17 an isopleth map can be seen.

Figure 4.17: An isopleth map showing the precipitation on the 10th of June of an unspecified year in a Spanish region. Image source: [31].
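To illustrate how values between control points can be estimated before contour lines are drawn, the sketch below uses inverse distance weighting. The sources cited above do not prescribe a particular interpolation method, so this choice, the function names `idw` and `interpolate_grid`, and the three sample stations are assumptions made purely for illustration.

```python
import math


def idw(control_points, x, y, power=2.0):
    """Estimate the value at (x, y) from (px, py, value) control points."""
    weighted_sum = 0.0
    weight_total = 0.0
    for px, py, value in control_points:
        dist = math.hypot(x - px, y - py)
        if dist == 0.0:
            return value  # exactly on a control point: use the recorded value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def interpolate_grid(control_points, n_rows, n_cols):
    """Fill an n_rows x n_cols raster, using column as x and row as y."""
    return [[idw(control_points, col, row) for col in range(n_cols)]
            for row in range(n_rows)]


# Hypothetical control points: (x, y, temperature).
stations = [(0, 0, 10.0), (4, 0, 20.0), (0, 4, 14.0)]
surface = interpolate_grid(stations, 5, 5)
```

Every interpolated value is a weighted average of the recorded values, so the resulting surface changes gradually between the control points, which is the kind of data an isarithmic map is suited for.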

Both of these maps are a good way of representing, for instance, surface temperature in an area, since temperature is continuous and does not change abruptly. Isarithmic maps are also ideal for showing continuous and gradual change over space [31]. Other common types of data that can be represented by an isopleth map are elevation and rainfall.

The data values can be classified into groups, where each group is represented by a color. If the goal is to visualize temperature data, for instance, the values could be classified into groups that consist of temperature intervals.
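Such a classification can be sketched in a few lines. The class boundaries and the color names below are hypothetical and only serve to show the principle of mapping each value to the color of its interval.

```python
from bisect import bisect_right

# Hypothetical class boundaries in degrees Celsius and one color per class:
# below 0, 0 to below 10, 10 to below 20, and 20 or above.
BREAKS = [0.0, 10.0, 20.0]
COLORS = ["blue", "green", "yellow", "red"]


def classify(value):
    """Return the color of the temperature interval that contains value."""
    return COLORS[bisect_right(BREAKS, value)]


colors = [classify(t) for t in (-5.0, 0.0, 15.0, 25.0)]
```

With three boundaries there are four classes, so `COLORS` must always contain one more entry than `BREAKS`.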

The temporal aspects of an isarithmic map could be represented by a time slider that, when interacted with, shifts the displayed time by some value. When the time is shifted, the colors of the map are updated. Dividing the map into small multiples, each showing the map at a different time stamp, is also an interesting way of representing the temporal aspects.
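Both ideas can be sketched under the assumption that the map for each time stamp is already available as one frame in a time-ordered list. The function names `frame_at` and `small_multiples` and the placeholder frames are hypothetical, and a real application would redraw the map with the selected frame.

```python
def frame_at(frames, slider_pos):
    """Return the frame for a slider position in [0, 1].

    frames is a list of (timestamp, grid) tuples sorted by time.
    """
    index = round(slider_pos * (len(frames) - 1))
    return frames[index]


def small_multiples(frames, count):
    """Pick count evenly spaced frames (count >= 1) for a small-multiples view."""
    if count >= len(frames):
        return list(frames)
    if count == 1:
        return [frames[0]]
    step = (len(frames) - 1) / (count - 1)
    return [frames[round(i * step)] for i in range(count)]


# Hypothetical frames: one placeholder map per day of a week.
frames = [(day, "map for day %d" % day) for day in range(7)]
current = frame_at(frames, 0.5)          # slider in the middle of the week
overview = small_multiples(frames, 3)    # first, middle and last day
```

The time slider shows one frame at a time, while the small multiples show several frames side by side, which makes it easier to compare time stamps at the cost of screen space per map.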

4.3.3 Heat maps

A heat map, or a shading matrix, is a graphical representation of data where the values are represented by colors.

Figure 4.18: A matrix heat map. This heat map is generated from DNA microarray data representing gene expression values. The values are represented by the colors of the squares.

There are many types of heat maps. A heat map is often represented as a matrix where each individual value in the matrix is represented by a color. This can be drawn as a 2D square mosaic raster plot, either by itself or over a geographical map in order to represent spatial information, similar to the raster map. Figure 4.18 shows a mosaic matrix heat map. Figure 6.4 shows another matrix heat map, representing monthly temperature values.

Figure 4.19: A density heat map displaying the probability of the location of the crashed airplane MH370. Blue represents a lower likelihood, yellow an increased likelihood and red the highest likelihood. Image source: Figure 9 in http://www.atsb.gov.au/media/5733650/AE-2014-054_MH370-Definition%20of%20Underwater%20Search%20Areas_3Dec2015.pdf.

One type of heat map is the tree map, which is a 2D hierarchical partitioning of data, and another is the geographical density visualization. Figure 4.19 shows a geographical density heat map. It represents the probability of the location of the crashed airplane MH370, with blue being less likely, yellow more likely and red most likely.
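The mapping from a value to a color in such a heat map can be sketched as a linear interpolation along a color ramp. The blue, yellow and red ordering follows the description of Figure 4.19, but the exact RGB stops, their positions and the function name `to_color` are assumptions made for this example.

```python
# Hypothetical color ramp: (position in [0, 1], (red, green, blue)).
RAMP = [
    (0.0, (0, 0, 255)),      # blue: low values
    (0.5, (255, 255, 0)),    # yellow: intermediate values
    (1.0, (255, 0, 0)),      # red: high values
]


def to_color(value, vmin, vmax):
    """Map value to an RGB tuple by interpolating linearly along RAMP."""
    if vmax == vmin:
        t = 0.0
    else:
        # Normalize to [0, 1] and clamp values outside the range.
        t = min(1.0, max(0.0, (value - vmin) / (vmax - vmin)))
    for (t0, c0), (t1, c1) in zip(RAMP, RAMP[1:]):
        if t <= t1:
            f = (t - t0) / (t1 - t0)
            return tuple(round(a + f * (b - a)) for a, b in zip(c0, c1))
    return RAMP[-1][1]


# Coloring a small matrix of hypothetical values between 0 and 10.
matrix = [[0.0, 5.0], [10.0, 20.0]]
colored = [[to_color(v, 0.0, 10.0) for v in row] for row in matrix]
```

Applying `to_color` to every cell of a matrix gives a matrix heat map, and applying it to a density grid laid over a map gives a geographical density heat map.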

4.3.4 Origin-destination matrix

The origin-destination matrix represents flows between points and uses the same type of information as a flow map. It is a technique used to overcome the difficulties of overplotting, since overplotting cannot occur in an origin-destination matrix.

Figure 4.20: An origin-destination matrix showing the summarized movement of male and female laboratory mice. The left image represents the movement of the male mice, while the right image represents the movement of the female mice. Image source: Fig. 4.8 in [5].

However, the majority of the spatial information, if not all of it, is lost. This is not a problem when the places are few, but the matrix is less efficient when it shows flows between a multitude of places. It is also important that the places have descriptive labels so that the information can be understood. The origin-destination matrix is used to find clusters of interlinked places and places linked to several other places [5]. Figure 4.20 shows an origin-destination matrix. Here the colors represent the magnitude of the flows between sensors. The grey squares represent no movement, while the yellow, orange and red colors represent an increasing magnitude of movement.

It is important to sort the columns and rows in an intuitive way in order to find and explore interesting patterns in the data. In the case of Figure 4.20, the rows and columns were ordered according to the spatial positions of the sensors that recorded the movement. Manual ordering of the rows and columns can be hard, or even impossible, when a large number of places must be taken into consideration [5].
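How such a matrix is built and how the ordering affects it can be sketched as follows. The trips, the place names and the one-dimensional positions are hypothetical, and sorting by a single coordinate is only one simple way of ordering places by their spatial position.

```python
from collections import Counter


def od_matrix(trips, order):
    """Count the trips for every (origin, destination) pair.

    The rows are origins and the columns are destinations, both in the
    sequence given by order.
    """
    counts = Counter(trips)
    return [[counts[(origin, dest)] for dest in order] for origin in order]


# Hypothetical movements between three places.
trips = [("A", "B"), ("A", "B"), ("B", "C"), ("C", "A"), ("A", "B")]

alphabetical = od_matrix(trips, ["A", "B", "C"])

# Ordering the rows and columns by a hypothetical spatial position instead.
positions = {"A": 2.0, "B": 0.5, "C": 1.0}
spatial_order = sorted(positions, key=positions.get)
by_position = od_matrix(trips, spatial_order)
```

Both matrices contain the same counts, and only the arrangement of the rows and columns differs. Each pair of places has exactly one cell, regardless of how many trips are recorded between them.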

4.3.5 Origin-destination maps

Origin-destination maps (OD-maps) are based on origin-destination matrices but are brought into geographical space, so that the spatial layout is preserved. Instead of representing a node link with a trajectory, each two-dimensional vector from an origin to a destination is represented by a cell in a two-dimensional matrix. However, unlike in the origin-destination matrix, the cells are ordered with the original two-dimensional geographic location in mind. Figure 4.21 shows an origin-destination map.
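One way to obtain such an ordering is a nested grid, where each origin cell of a coarse n x n grid contains a small copy of the same grid whose cells stand for the destinations. The sketch below assumes this nested layout. The function names `od_map_cell` and `build_od_map` and the sample flows are hypothetical.

```python
def od_map_cell(origin, destination, n):
    """Return the (row, col) of an origin-destination pair in the OD map.

    origin and destination are (row, col) cells of an n x n geographic grid.
    The origin selects a large block and the destination selects a cell
    inside that block.
    """
    (o_row, o_col), (d_row, d_col) = origin, destination
    return (o_row * n + d_row, o_col * n + d_col)


def build_od_map(flows, n):
    """Aggregate (origin, destination, count) flows into an n*n by n*n grid."""
    size = n * n
    image = [[0] * size for _ in range(size)]
    for origin, destination, count in flows:
        row, col = od_map_cell(origin, destination, n)
        image[row][col] += count
    return image


# Hypothetical flows on a 3 x 3 geographic grid.
flows = [
    ((0, 0), (2, 2), 4),   # from the top-left cell to the bottom-right cell
    ((1, 2), (0, 1), 7),
    ((0, 0), (2, 2), 1),   # aggregated with the first flow
]
od_map = build_od_map(flows, 3)
```

Since both the blocks and the cells within a block follow the geographic grid, the position of a cell reveals both where a flow starts and where it ends.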

Origin-destination maps offer advantages over traditional flow maps. OD-maps are, for example, scalable to a large number of trajectories, since the OD-trajectories are aggregated into a regular grid. As with any aggregation, however, this could result in a loss of information. In particular, the geographic grid resolution is limited to a 20x20 grid, which could result in a loss of detail in the origin-destination results. This can be overcome by allowing the user to zoom interactively to retrieve details on demand. Further, the aggregation

Acknowledgments

Contents

List of Figures

List of Tables

1 Introduction

1.1 MindRoad AB

1.2 Motivation

1.3 Aim

1.4 Research questions

1.5 Method

1.6 Delimitations

1.7 Report structure

2 Background theory and Related work

2.1 Big data