Transport mode inference by multimodal map matching and sequence classification

DEGREE PROJECT IN THE BUILT ENVIRONMENT, SECOND CYCLE, 30 CREDITS

STOCKHOLM, SWEDEN 2020


BRUNO SALERNO

KTH ROYAL INSTITUTE OF TECHNOLOGY

SCHOOL OF ARCHITECTURE AND THE BUILT ENVIRONMENT


Master in Transport and Geoinformation Technology

Date: July 8, 2020

Supervisor: Győző Gidófalvi

Examiner: Yifang Ban

School of Architecture and Built Environment, KTH

Swedish title: Inferens i transportläge genom multimodal kartmatchning och sekvensklassificering


Abstract

Automation of travel diary collection, an essential input for transport planning, has been a fruitful line of research in recent years, particularly concerning the problem of automatic inference of transport modes. Taking advantage of technological advances, several solutions based on the collection of mobile device data, such as GPS locations and variables related to movement (such as speed) and motion (e.g. accelerometer measurements), have been investigated. The literature shows that many of them rely on an explicit initial segmentation of GPS trajectories into trip legs, followed by a segment-based classification problem. In some cases, GIS-related features are included in the classification instance, but usually in terms of distance to transport networks or to specific points of interest (POIs).

The aim of this MSc thesis is to investigate a novel transport mode inference procedure based on the generation of topological features from a multimodal map matching instance. We define topological features as the topological context of each point of a GPS trajectory. Using these features in a sequence classification problem leads to mode prediction and to the implicit definition of the trip legs. In addition to not depending on an explicit segmentation step, the proposed routine also has fewer requirements in terms of the complexity of the required GIS features: there is no need to consider distance features, and the proposed map matching implementation does not require a single unified multimodal network, as other multimodal map matching approaches do.

The procedure was tested with a travel diary data set collected in Stockholm, containing 4246 trips from 368 different commuters. The transport modes considered were walk, subway, commuter train, bus and tram. In order to assess the impact of the topological context, different feature set compositions were investigated, including topological and conventional movement and motion features. Three different classifiers (decision tree, support vector machine and conditional random field) were evaluated as well.

The results show that the proposed procedure reached high accuracy, with a performance similar to that of current approaches, and that the best-performing feature set composition was the one that included both topological and movement and motion features. The best evaluation measures were obtained with the decision tree and conditional random field classifiers, but with some differences: while both of them presented similar recall, the former yielded better precision and the latter achieved a higher segmentation quality.


Summary

The automation of travel diary collection, an important input for transport planning, has been a fruitful line of research in recent years, particularly regarding the problem of automatic inference of transport modes. Taking advantage of technological progress, several solutions based on the collection of mobile device data, such as GPS locations and variables related to movement (such as speed) and motion (for example, accelerometer measurements), have been investigated. The literature shows that many of them rely on an explicit initial segmentation of GPS trajectories into trip legs, followed by a segment-based classification problem. In some cases GIS-related features are included in the classification instance, but these are usually distances to transport networks or to specific points of interest (POIs).

The aim of this MSc thesis is to investigate a new mode inference procedure based on the generation of topological features from a multimodal map matching instance. We define topological features as the topological context of each point in a GPS trajectory. Further use of these features as part of a sequence classification problem leads to the prediction of transport modes and to the implicit definition of trip legs. Besides not depending on an explicit segmentation step, the proposed routine also has lower requirements regarding the complexity of the necessary GIS features: there is no need to consider distance features, and the proposed map matching implementation does not require a unified multimodal network, as other multimodal map matching methods do.

The procedure was tested with a travel diary data set collected in Stockholm, containing 4246 trips from 368 different commuters. The modes considered were walk, subway, commuter train, bus and tram. To assess the impact of the topological context, different feature set compositions were investigated, including topological features and conventional movement and motion features.

Three different classifiers (decision tree, support vector machine and conditional random field) were also evaluated. The results show that the proposed procedure reached high accuracy, with a performance similar to that offered by current approaches, and that the feature set giving the best results was the one that included both topological and motion features. The best evaluation measures were obtained with the decision tree and conditional random field classifiers, but with some differences: while both showed similar recall, the former gave better precision and the latter a higher segmentation quality.


Acknowledgements

First of all, I want to thank Prof. Győző Gidófalvi, my supervisor, for his invaluable guidance and support; and Can Yang, for his decisive advice. I would also like to thank Yifang Ban, the thesis examiner, and Tianshu Shang, my opponent, for their comments and suggestions; they have improved the quality of this thesis to a great extent. Secondly, I would like to thank my classmates for their companionship and friendship. Among them, special thanks go to Michael, for his time and helpful suggestions; and to Chico, Dominik, George, Memo, and Nick. Finally, I wish to express my deepest gratitude to Blanca, who shared most of this journey with me; my family, Ana, Luciano, Paula and Raúl, for their unconditional love; and my grandmother Emma, who always supported me.


List of Figures

3.1 Two-stage mode inference approach

3.2 Illustration of the mode probability graph and emission and transition probabilities

3.3 Emission and transition probabilities calculated for commuter train, subway, bus and walk modes of a section of a selected trajectory

3.4 Patterns by mode in the feature space

3.5 Example of the graphical model of a linear chain CRF

4.1 Transport networks in Stockholm, Sweden

4.2 Distribution of segments by mode after balancing

4.3 Evaluation measures at the segment level

5.1 Comparison of accuracy by point and distance, precision, recall, and F1 score, by model

5.2 Comparison of trips over and undersegmentation in the test data sets

6.1 Proportion of misclassified segments by mode

6.2 Misclassified points by ground truth mode

6.3 Misclassified walk and bus trip legs

6.4 Example of misclassified walk and bus trip legs


List of Tables

3.1 Movement and motion features

4.1 Statistics of the transport networks and travel diary data set

4.2 FMM configuration

4.3 Configuration of the classifiers

4.4 Evaluation measures by level

5.1 Movement and motion features. Point, segment, distance, and trip accuracy.

5.2 Movement and motion features. Precision and recall by distance, and F1 score, computed in the test data set.

5.3 Movement and motion features. Proportion of trips over and undersegmented.

5.4 Training time in seconds.

5.5 Topological features. Point, segment, distance, and trip accuracy.

5.6 Topological features. Precision and recall by distance, and F1 score, computed in the test data set.

5.7 Topological features. Proportion of trips over and undersegmented.

5.8 Movement, motion and topological features. Point, segment, distance, and trip accuracy.

5.9 Movement, motion and topological features. Precision and recall by distance, and F1 score, computed in the test data set.

5.10 Movement, motion and topological features. Proportion of trips over and undersegmented.

5.11 Mean precision and recall by distance, and F1 score, computed in the test data set.

6.1 Accuracy, precision and recall by distance. Comparison with other works

6.2 Measures at the segment level. Comparison with other works using GIS-related features


Glossary

Accuracy by distance: Sum of the length of the correctly predicted segments, over the total length of the segments.

Accuracy by segment: Number of segments or trip legs correctly predicted over the total number of segments.

Accuracy by point: Proportion of correctly predicted points over the total number of points.

Correctly predicted segment: Full match between the labels of the points in the ground truth segment and the labels of the corresponding points in the predicted aligned sequence.

Emission probability: Related to the Hidden Markov Model (HMM). It measures the closeness of a GPS observation to the matched network.

Map matching: Process of detecting the physical path of a trajectory, usually along a road network.

Movement and motion features (MMF): Conventional features used in mode inference. Movement features imply displacement, such as speed. Motion features do not imply displacement; they are usually collected from sensors such as the accelerometer.

Multimodal map matching: Process of detecting the physical path of a trajectory along networks with multiple modes.

Oversegmentation: Proportion of trips where the number of predicted segments is greater than the number of ground truth segments.

Precision by distance: Length of the correctly predicted ground truth segments of a mode, over the total length of all predicted segments of that mode.

Recall by distance: Length of the correctly predicted ground truth segments of a mode, over the total length of all ground truth segments of that mode.

Segment or trip leg: Continuous portion of a trip defined by a unique transport mode.

Topological features (TOPO): Features built from the emission and transition probabilities collected in the multimodal map matching instance.

Transition probability: Related to the Hidden Markov Model (HMM). It measures the closeness between the straight line connecting two GPS observations and the matched path on the transport network.

Travel diary: Verbose descriptions of how individuals travelled during a time frame, including origin and destination, transport modes, and in many cases, the activity performed.

Trip: From the segment-based mode inference perspective, a trip is a set of segments or trip legs, where each segment is defined by a unique transport mode.

Trip accuracy: Proportion of correctly predicted full trips (full sequences of points) over the total number of trips.

Undersegmentation: Proportion of trips where the number of predicted segments is less than the number of ground truth segments.


Chapter 1 Introduction

How to automate the collection of travel diaries, a key input for transport planning, has been a research problem in Transport Science for many years. Travel diaries are verbose descriptions of how individuals travelled during a time frame, including origin and destination, transport modes, and in many cases, the activity performed. This information is usually used to build behaviour and travel models that describe how trips are generated, and their distribution and mode share, among other things. Traditionally, travel diaries were generated with pen and paper, and required expensive and time-consuming surveys; in recent decades they have also been collected by phone and through web forms, which has reduced the costs. This conventional implementation is memory-based, which implies that the respondent is referring to a past time frame; and it has problems, such as trip underreporting and response rates that have steadily decreased in recent years (Prelipcean et al., 2018b).

Among the solutions proposed, many procedures have been developed to automate travel diaries by using mobile device data, especially from the GPS sensor, but also from other sensors such as the accelerometer or the gyroscope. Semi-automated approaches, which are a compromise between automated and memory-based diaries, have also been developed. In this kind of solution, users validate the collected information, which, according to Prelipcean et al. (2018b), is an advantage, as it allows the generation of ground truth data. The MEILI collection system, used to build the data set used in this thesis, is an example of a semi-automated travel diary generation system (see Section 4.2).

Automatic approaches, nevertheless, still have problems that have not been fully solved, such as the quality of the trip segmentation and the correctness of the mode inference. A literature review has shown that the most common approaches involve the reconstruction of raw GPS trajectories, their explicit segmentation into legs, and the inference of the transport mode of these legs by using a set of features. We distinguish here between movement features, which involve displacement, and motion features, which do not require displacement. In the first group we include speed; in the second we include, for example, data generated with the accelerometer sensor.

A limitation of the explicit segmentation approach is the reliance of the mode inference instance on the correctness of the initial segmentation. Another limitation of current approaches is that when GIS features are used, it is mostly in terms of proximity to transport networks or points of interest (POIs). We believe that there is room in the field to explore an implicit segmentation approach that also makes use of the topology of the networks to infer the transport modes of the trajectories.

The aim of this MSc thesis is to contribute to the improvement of automated solutions for collecting travel diaries, by investigating a novel transport mode inference procedure based on two steps. The first one involves the generation of a topological context for each point in each trajectory with a new multimodal map matching approach, where multimodal map matching refers to detecting the physical path of a trajectory along networks of different modes. The second step uses the topological context as features in a sequence classification problem. Here, the segments and their modes are inferred at the same time.

The case study is based on the travel diaries generated with the MEILI system in Stockholm in 2015, for which a group of people reported their commuting information using special software installed on their mobile phones (Prelipcean et al., 2018a). This data set also includes data annotated by the users, which is used as ground truth.

To validate the approach, it is compared with a baseline method based only on conventional movement and motion features, and with a combination of such features and the topology-derived ones. Also, three different supervised classification approaches are evaluated.

The thesis is structured as follows. Chapter 2 discusses the state of the art of the mode inference field, focusing on the difference between explicit and implicit segmentation approaches, on the features used, and on work related to some of the solutions proposed. Chapter 3 presents the map matching problem, discusses topological and motion features, and ends with a discussion of classifiers. Chapter 4 introduces the data used and the experiments carried out; important aspects of this chapter are the chosen data set, the collection of the transport network data, and the definition of the evaluation measures. Chapter 5 presents the results, mostly regarding the output of each model according to the selected feature sets. Chapter 6 contains the discussion, which analyzes the distribution of errors, problems in mode inference and across classifiers, and the related work. The final chapter contains the conclusions and future work.


Chapter 2 Related work

2.1 Transport mode detection

There are two main approaches in transport mode detection: point-based and segment-based (Yang et al., 2018; Prelipcean, 2016). Point-based approaches are usually related to real-time services that try to infer the transport mode of the current location based on a set of dimensions. Segment-based approaches, following Prelipcean et al. (2016), try to answer the question "given a set of locations grouped into segments, which segments are accurately detected and what transportation mode is associated with the accurately detected segments?".

The departure point of most segment-based transport mode detection procedures is location data, which is collected from the GPS sensor. This data is used to reconstruct the GPS trajectories or traces of the user. The mobile devices used in transport mode inference usually also collect measurements from other sensors, such as the accelerometer. As noted in the Introduction, data derived from this sensor can be classed as motion data, while data that refers to displacement, such as the speed calculated from the GPS records, can be called movement data.

After the initial reconstruction of the raw trajectories, the following steps are very often carried out, in what Yang et al. (2018) call a direct or explicit segmentation approach:

1. Segmentation of GPS trajectories into trips, using generally heuristic rules.

2. Segmentation of trips into trip legs. Usually this is done by detecting transfer or "mode change points", such as the ones defined by short walking periods. Each trip leg has only one mode.

3. Feature extraction.

4. Inference of the mode of each trip leg.

5. In some cases, a post processing step, focused on mode transition, is carried out (for example, in Lin et al. (2013) or Reddy et al. (2010)).
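As an illustration of the kind of heuristic rule mentioned in step 1, the following minimal Python sketch splits a time-ordered sequence of observations into trips whenever the gap between consecutive timestamps exceeds a threshold. The function name `split_into_trips` and the 300-second threshold are assumptions made for this example only; the cited works use their own rules and parameters.

```python
def split_into_trips(points, max_gap_s=300):
    """Split time-ordered (x, y, t) observations into trips at long time gaps.

    max_gap_s is a hypothetical threshold in seconds, chosen for illustration.
    """
    trips = []
    for p in points:
        # Extend the current trip while the gap to its last point is small.
        if trips and p[2] - trips[-1][-1][2] <= max_gap_s:
            trips[-1].append(p)
        else:
            # First point, or a gap longer than the threshold: start a new trip.
            trips.append([p])
    return trips


# Toy example: a gap of 880 s separates the third and fourth observations.
points = [(0.0, 0.0, 0), (1.0, 0.0, 60), (2.0, 0.0, 120),
          (2.0, 0.0, 1000), (3.0, 0.0, 1060)]
trips = split_into_trips(points)
```

Any error made by such a rule is carried into the following steps, which is the dependency that implicit segmentation approaches try to avoid.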

Usually, the features extracted for each leg are both movement- and motion-related, such as maximum speed and acceleration (Prelipcean et al., 2017; Chin et al., 2019; Lin and Hsu, 2014), as in Feng and Timmermans (2013). Nevertheless, more sophisticated features such as direction or heading change rate, velocity change rate, acceleration change rate or timeslice type have also been attempted (as in Zheng et al. (2010), Zhu et al. (2018) or Xiao et al. (2019)).

Some studies include GIS features. In many of these cases, like Biljecki et al. (2013) and Stenneth et al. (2011), the usual movement and motion features are combined with data such as average distance to transport networks. In Gong et al. (2012), the same kind of features is used in a rule-based procedure. Semanjski et al. (2017) extract features by calculating distances from the GPS observation to both transport networks and specific points of interest (POIs), and train a support vector machine (SVM) classifier to detect the transport mode of unimodal trips; this work does not deal with the segmentation problem. Rasmussen et al. (2015) propose a three-stage approach. After segmentation and preliminary mode detection using speed and acceleration features, a set of fuzzy logic rules based on distance to transport networks is used to improve the classification. Finally, a map matching instance is used to exclude non-trips (trajectories that could not be map matched to a transport network) and to make minor corrections to the detected legs.

Once the features are extracted, transport mode inference from trip legs is treated as a classification problem (Bantis and Haworth, 2017), and solved with rule-based heuristics (RBH), fuzzy logic solutions or machine learning approaches (usually supervised).

Explicit approaches reach a high accuracy, between 73% and 95%, and in many cases they can deal with five or more transport modes. Performance depends on the chosen technique, the data used, the level of measurement (point, segment or length) and the number and type of modes involved (Prelipcean et al., 2017). However, as Chen and Bierlaire (2015) point out, trip segmentation into trip legs as an initial step may be problematic, because "potentially wrong segmentations in the first step are not recoverable". Also, the quality of trip segmentation is still sensitive to several factors such as variable movement characteristics and initial estimation.

There is a second, smaller group of segment-based mode inference approaches that does not attempt an initial explicit segmentation. Yang et al. (2018) call this kind of approach an indirect partitioning mode and Prelipcean et al. (2016) refer to it as "implicit point-based transportation mode segmentation". In this approach, the segments of the trajectories are defined after mode inference, and not before it, in order to avoid depending on the correctness of the initial segmentation. Prelipcean et al. (2016) investigate two explicit segmentation approaches and also report the use of an implicit approach with speed and acceleration features and a random forest classifier, for the same data set used in this thesis, but with more modes. In that work, the best explicit model reports roughly three times higher precision and five times higher recall than the implicit one (0.75 against 0.26, and 0.73 against 0.13). There are also unsupervised examples, such as the procedure developed by Lin et al. (2013), which reports precision and recall figures close to 75%.

In summary, the segment-based mode inference field is well established and rich in solutions, and current approaches reach a high performance. Nevertheless, they still suffer from certain drawbacks. The most usual approach requires trip segmentation as a pre-processing step. Also, when transport networks are considered, they are included in terms of proximity features: the topology of the transport networks is not taken into account. The approach proposed in this thesis addresses these limitations by introducing a two-stage mode inference approach with implicit segmentation. In the first stage, location data is used not only to reconstruct the trajectories to be analyzed, but also as input to generate topological features with a multimodal map matching instance. The second stage involves a sequence classification problem that uses the features collected in the first stage.

2.2 Multimodal map matching

Map matching refers to the problem of detecting the physical path of a trajectory, usually along a road network. Even though this is a fully developed field of study, as shown in Quddus et al. (2007) or Yang and Gidófalvi (2018), the reviewed literature provides few examples of multimodal map matching, that is, where the path may take place in networks corresponding to different transport modes. Among them, many approaches use cellular antenna data (Asgari et al., 2016; Bonnetain et al., 2019). An example of multimodal map matching with GPS data can be found in Chen and Bierlaire (2015).


In all cases, map matching is performed over a unified multimodal transport network, using probabilistic or hidden Markov model (HMM) + Viterbi routines, which is the state-of-the-art approach to conventional map matching (Bonnetain et al., 2019). The unified multimodal network is built as a set of networks joined together by virtual arcs that represent connections between different modes. The connection with virtual arcs is a complex task as it involves an extra set of decisions and further computation. For example, a public transport network may be connected through each stop directly to the road network, as in Asgari et al. (2016); to the walking network; or, as in Bonnetain et al. (2019), to the parking locations, which are in turn connected to the road network and which require the parking locations data set as well. Also, a given station may be simplified as a single node, or each individual exit may be considered independently.

As will be explained later, in the proposed routine the map matching is not done against a unified network, but over different networks. This is a new way of attempting multimodal map matching which allows us to a) collect the map matching errors by mode, and b) avoid the complex process of building a unified multimodal network.

2.3 Classification and sequence classification

Classification is the problem of predicting a discrete or qualitative response given a set of features (James et al., 2013). In supervised classification, which is the most common in transport mode inference, for each predictor or input x_i ∈ X there is a response measurement, the response class y_i ∈ Y. This kind of model can be trained in a discriminative or a generative way (Koller and Friedman, 2009). Discriminative approaches try to model the conditional probability of Y given X, which can be written as P(Y | X). In this group we can find models such as SVM, decision trees/random forests, and neural networks. According to Bantis and Haworth (2017), these are the best-performing routines, and more suitable for big data sets and real-time solutions.

Generative models, on the other hand, try to describe the underlying probability distribution, that is, the joint probability of Y and X, P(Y, X). They include different versions of Bayesian networks (naïve Bayes, Bayes nets, HMM), and, according to Bantis and Haworth (2017), they are useful when the researcher wants to reason about the observed mobility pattern and to quantify the uncertainties of the estimates. According to Nikolic and Bierlaire (2017), generative solutions are also more suited for non-real-time implementations.
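The relationship between the two modelling families can be made explicit with Bayes' rule. The identity below is a standard result written in the notation of this section; it is not taken from the cited works, and the summation index y' over the classes in Y is introduced here only for illustration.

```latex
P(Y \mid X) \;=\; \frac{P(Y, X)}{P(X)}
            \;=\; \frac{P(X \mid Y)\, P(Y)}{\sum_{y'} P(X \mid Y = y')\, P(Y = y')}
```

A generative model estimates the terms on the right-hand side and obtains the conditional probability through this identity, whereas a discriminative model estimates the left-hand side directly.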

Unsupervised models in transport mode inference, where no response variable is available to train the models, are less common. Some examples are the already mentioned Lin et al. (2013); Bantis and Haworth (2017), who use dynamic Bayesian networks; and the classic work by Patterson et al. (2003), based on Bayes filters and the expectation-maximization (EM) algorithm.

With regard to sequence classification, the problem is the prediction of interdependent variables (Sutton, 2012), in particular variables that form a sequence. According to Xing et al. (2010), we have to distinguish between the problem of assigning one class to a sequence (conventional sequence classification), and the problem of inferring a sequence of classes (strong sequence classification). Conventional approaches can be classified into three groups: feature-based classification, sequence distance-based classification, and model-based classification. The first two approaches are discriminative. In feature-based solutions, the sequence is transformed into a vector of features, and the classifier can take the form of a decision tree or an SVM. In sequence distance-based classification, the sequences are compared by a distance function that measures similarity. Model-based classification approaches, finally, are generative, and refer mainly to the Bayesian networks family, including HMM. The strong sequence classification task, on the other hand, has usually been solved by using conditional random fields (Xing et al., 2010), which is a discriminative approach.


Chapter 3 Methodology

3.1 Preliminaries and problem formulation

Let τ denote a trajectory that stores a sequence of N GPS observations in chronological order as τ = ⟨p_1, p_2, …, p_N⟩. Each observation p_i records the location of the tracked object (x_i, y_i) at timestamp t_i, written as p_i = (x_i, y_i, t_i).

A transport network is represented by a graph G = (V, E) consisting of a set of vertices V and edges E. Each edge e ∈ E is a directed connection from a vertex u ∈ V to v ∈ V indicated by e = (u, v). Edge e is associated with a non-negative cost of c(u, v) indicating its length. A multimodal set of transport networks consists of several graphs where each graph G^M stores the transport network of a specific mode M ∈ {W, B, S, T, R} indicating walk, bus, subway, commuter train and tram.

The trajectory τ can be decomposed into a set of segments S. Each segment s contains consecutive observations p_i ∈ τ with the same mode m_i, and represents a trip leg.

Given a trajectory τ and a multimodal set of transport networks G^W, G^B, G^S, G^T, G^R, the problem of mode inference is formulated in two levels. At the point level, it involves inferring the travel mode m_i of each observation p_i ∈ τ. At the segment level, it involves detecting the set of segments S that compose τ.
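The link between the two levels can be sketched in a few lines of Python: once a mode m_i is available for every observation p_i, the set of segments S follows by grouping consecutive observations with the same mode. This is a minimal sketch of the formulation above, not the implementation used in this thesis; the names `Observation` and `segments_from_modes` are assumptions introduced for the example.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Observation:
    """One GPS observation p_i = (x_i, y_i, t_i)."""
    x: float
    y: float
    t: float


def segments_from_modes(trajectory, modes):
    """Group consecutive observations with the same mode into segments.

    modes[i] is the (predicted or annotated) mode m_i of trajectory[i].
    Returns a list of (mode, [observations]) pairs, one per trip leg.
    """
    if len(trajectory) != len(modes):
        raise ValueError("one mode label is required per observation")
    labelled = zip(modes, trajectory)
    return [
        (mode, [obs for _, obs in run])
        for mode, run in groupby(labelled, key=lambda pair: pair[0])
    ]


# Toy trajectory: walk, then bus, then walk again (W and B as in the text).
tau = [Observation(float(i), 0.0, 10.0 * i) for i in range(6)]
m = ["W", "W", "B", "B", "B", "W"]
S = segments_from_modes(tau, m)
```

In this toy case S contains three trip legs (walk, bus, walk), so a single misclassified point in the middle of the bus leg would split it and change the segmentation.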

This thesis proposes a novel travel mode inference approach consisting of two stages: multimodal map matching and sequence classification. The whole process is illustrated in Figure 3.1. The diagram shows that multiple networks, one per transport mode, are used as input to the map matching process, which is based on a hidden Markov model. This highlights the fact that in this new approach a unified, more complex multimodal network is not needed. The chart also illustrates how emission and transition probabilities are collected from the map matching process as the topology-related features. The sequence classification part of the scheme shows how the selected features are then used to create a training and a test data set, in order to train the mode inference model.

[Figure 3.1 flow chart. Inputs: Network 1, Network 2, ..., Network n; mobile device data (GPS location; movement and motion data). Stage 1, multimodal map matching: HMM-based map matching per network, yielding emission and transition probabilities per network. Stage 2, sequence classification: feature selection, cleaning and balancing; train data and test data; model training (DT, SVM, CRF); transport mode inference and trajectory segmentation.]

Figure 3.1: Two-stage mode inference approach. The first box refers to the multimodal map matching instance, which generates the topological context of each GPS point. The second box shows the process of using this context as features in a sequence classification problem. The dashed line illustrates the possibility of including conventional movement and motion features, besides the topological ones. DT=decision tree, SVM=support vector machine, CRF=conditional random field, HMM=hidden Markov model.

The flow chart also displays a dashed line that illustrates the possibility of including conventional movement and motion features, which is in fact investigated in this thesis. As mentioned in the Introduction, the contribution of the topological features is assessed by comparison with a model based solely on conventional features, and with a model based on both kinds of features together.

3.2 Topological features: multimodal map matching

The way multimodal map matching is attempted in this work is novel, as a single-mode map matching routine is run on multiple networks at the same time in order to handle multimodality. As explained before, this contrasts with approaches that use unified multimodal networks.

The core of the single-mode map matching procedure used in this thesis is illustrated in Figure 3.2 (a). Two sequences of GPS points that represent different trajectories are matched to a transport network, and candidate edges for each point are collected within a certain radius $r$. The chosen implementation is the Fast Map Matching solution (FMM) developed by Yang and Gidófalvi (2018), which is based on a hidden Markov model (HMM) (Newson and Krumm, 2009). Two different probabilities are computed by this procedure.

• Emission probability $p_e$, which measures the closeness of a GPS observation to the matched network. The distance $d$ from $p_n$ to a matched point $p'_n$ on a candidate edge $c_n$ is assumed to follow a normal distribution $N(0, \sigma)$, where $\sigma$ is the GPS sensor error, namely

$$p_e(c_n) = \frac{e^{-d(p_n,\, c_n.p'_n)^2 / \sigma^2}}{\sqrt{2\pi}} \tag{3.1}$$

• Transition probability $p_t$, which measures the closeness between the straight line connecting two GPS observations and the matched path on the transport network:

$$p_t(c_n, c_{n+1}) = \frac{d(p_n, p_{n+1})}{L(c_n, c_{n+1})} \tag{3.2}$$

where $c_n$ and $c_{n+1}$ are the candidate edges for $p_n$ and $p_{n+1}$.
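The two probabilities can be illustrated with a minimal Python sketch that simply transcribes Equations 3.1 and 3.2 as they are written in this section. It is not the FMM code, whose internals may differ. The function names, the reading of $L(c_n, c_{n+1})$ as the length of the matched path on the network, the handling of a zero-length path and the numeric values are all assumptions made for illustration.

```python
import math


def emission_probability(d, sigma):
    """Equation 3.1 as written above: exp(-d^2 / sigma^2) / sqrt(2 * pi).

    d     -- distance from the GPS point to its matched point on the edge
    sigma -- GPS sensor error, in the same unit as d
    """
    return math.exp(-(d ** 2) / (sigma ** 2)) / math.sqrt(2.0 * math.pi)


def transition_probability(straight_line_dist, network_path_length):
    """Equation 3.2: straight-line distance between two consecutive GPS
    points divided by L, read here as the length of the matched path."""
    if network_path_length <= 0.0:
        # Assumption for this sketch: a degenerate path yields no transition.
        return 0.0
    return straight_line_dist / network_path_length


# Illustrative values only: a point 50 m from the edge, 100 m sensor error,
# 80 m between observations and a 100 m path on the network.
p_e = emission_probability(50.0, 100.0)
p_t = transition_probability(80.0, 100.0)
```

Under this transcription, $p_e$ decreases as the point moves away from the edge, and $p_t$ approaches 1 when the network path is about as long as the straight line between the two observations.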

FMM generates candidate paths, defined as sequences of matched candidate edges for each point, and then uses emission and transition probabilities to find the optimal path among them, by maximizing a score defined as

$$\text{score} = \sum_{n=1}^{N-1} p_e(c_{n+1}) \times p_t(c_n, c_{n+1}) \tag{3.3}$$

In this thesis, however, we focus only on the probabilities, which are used to build the topological context of each point; we call this context the topological features. Hence, given a GPS trajectory $\tau$, a transport network $G^m$ with mode $m \in M$ and $M = \{W, B, S, T, R\}$, and the optimal candidates $\hat{c}_i$, $\hat{c}_{i+1}$¹, the features are extracted for each point $p_i \in \tau$ as

$$f^m(p_i) = \left(p^m_e(\hat{c}_i),\; p^m_t(\hat{c}_i, \hat{c}_{i+1})\right) \tag{3.4}$$

¹ For brevity, optimal candidates are omitted in the remaining content.

In practice, it is common that a trajectory can only be partially matched to a specific transport network, as happens with Trajectory 2 of Figure 3.2. An example of this can be found in Figure 3.3 as well, where points 20 to 30 are outside the search radius for the subway mode. If no edge is found in $G^m$ within distance $r^m$ of a point $p_i$, then both $p^m_e$ and $p^m_t$ are set to 0. Figure 3.2 (b) illustrates the emission and transition probabilities for the mode probability graph on the left side. The first trajectory has points located close to the transport network, so the emission probabilities are high in all cases. Given this fact and the shape of the network, the transition probability is high for all points as well. As the initial point does not have a preceding element, its $p_t$ is zero. For the second trajectory, only the first two points are close to the network, which explains their high $p_e$. The second one obtains a high transition probability as the network may contain the path. Nevertheless, points 3 and 4 are far away from the network, so the probabilities decrease.

[Figure 3.2 panels: (a) Mode probability graph, showing the network, Trajectory 1, Trajectory 2, the candidate edges for trajectories 1 and 2, and the candidate edge for trajectory 2; (b) Emission and transition probabilities, plotted between 0 and 1 for points P1 to P4.]

Figure 3.2: Illustration of the mode probability graph and emission and transition probabilities. The left panel displays a network and two trajectories. Candidate paths are found for all the points of Trajectory 1. The second trajectory is only partially matched. The right panel shows how this is translated into emission and transition probabilities.

Finally, for each point $p_i$, the emission and transition probabilities corresponding to each network are stored in the form of

$$f(p_i) = \left(p^W_e, p^W_t, p^B_e, p^B_t, p^S_e, p^S_t, p^T_e, p^T_t, p^R_e, p^R_t\right) \tag{3.5}$$

Figure 3.3 shows an example of emission and transition probabilities computed by matching a section of a real-world trip to networks in 4 modes (train, subway, bus and walk). As can be observed, the GPS trajectory (represented with the green dashed line) has two parts close to the subway and train networks respectively. This is clearly shown in the probability plot displayed in Figure 3.3 (c): emission and transition probabilities are higher for the subway within points 5 to 10, and for the commuter train in the range 10 to 30. Points with indexes 0 to 4 have a different shape and, as the high transition probabilities in Figure 3.3 (d) indicate, they may be related to a walking trip leg.

In all cases, the emission probabilities are normalized across modes using the SoftMax function

$$p^m_e = \frac{\exp(p^m_e)}{\sum_{j \in M} \exp(p^j_e)} \tag{3.6}$$

and the transition probabilities are normalized by the emission probability of the previous point:

$$p_{t_0} = p_{t_0} \times p_{e_{-1}} \tag{3.7}$$

A real-world normalized version of the set defined in Equation 3.5 can be seen in Figure 3.4 a), where each line represents a point, and which is intended to showcase how specific patterns can be recognized in the feature space. The patterns make it possible to distinguish different modes, which supports the use of these features in the classification instance. In b), the feature set is bigger, as it also contains the normalized probabilities of the neighboring points. In this second plot, probabilities are sorted by mode; −1 refers to the preceding neighbor element and +1 to the following one. Even though clear patterns can be recognized in both cases, the second approach yields better accuracy, so it is the one used in this study. To ensure homogeneous information for each point, we remove the leading and trailing points of each trip.
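The normalization and neighbor stacking described in this section can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the thesis implementation: the input layout (one dictionary per point, mapping each mode to a raw pair of emission and transition probabilities, with zeros for unmatched modes), the function names, the use of the normalized emission probability of the previous point in the transition step, and the zero transition value for the first point are all assumptions.

```python
import math

MODES = ["W", "B", "S", "T", "R"]  # the mode set M used in the text


def softmax(values):
    """Softmax over the per-mode values of one point."""
    peak = max(values)  # subtracting the maximum does not change the result
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def normalize(points):
    """points: list of dicts, mode -> (p_e, p_t); unmatched modes hold (0.0, 0.0).

    Returns one 10-value row per point, ordered as in Equation 3.5.
    """
    rows, prev_pe = [], None
    for point in points:
        pe = softmax([point[m][0] for m in MODES])
        if prev_pe is None:
            # Assumption: the first point has no predecessor, so p_t stays 0.
            pt = [0.0] * len(MODES)
        else:
            # Transition probability scaled by the previous point's emission.
            pt = [point[m][1] * prev_pe[j] for j, m in enumerate(MODES)]
        rows.append([v for pair in zip(pe, pt) for v in pair])
        prev_pe = pe
    return rows


def add_neighbors(rows):
    """Concatenate previous, current and following rows.

    The first and last points lack one neighbor, so they are dropped.
    """
    return [rows[i - 1] + rows[i] + rows[i + 1] for i in range(1, len(rows) - 1)]
```

With three input points, `add_neighbors(normalize(points))` yields a single 30-value feature row for the middle point, which mirrors the removal of the leading and trailing points of each trip.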

3.3 Movement and motion features

The movement and motion features considered in this thesis are based on the ones used by Prelipcean et al. (2016) in their implicit segmentation approach. As stated before, movement features refer to data that implies displacement, while motion features do not necessarily imply displacement. The model presented by these authors was trained with the same travel diary data set used in this thesis, collected with the MEILI system (Prelipcean et al., 2018a). Its details are introduced in Section 4.2.

The model classified each GPS point among 8 transport modes by using a random forest classification method and a set of features. From this set, we selected the ones displayed in Table 3.1: speed was the only movement feature considered, while three accelerometer-related motion features were included.

[Figure 3.3 panels: (a) Commuter train and subway network; (b) Bus and walk network; (c) $p_e$ and $p_t$ for commuter train and subway; (d) $p_e$ and $p_t$ for bus and walk. Map legend: trajectory, GPS location, train, subway, bus, walk. Plots (c) and (d): GPS point index (0 to 30) on the horizontal axis, values from 0.0 to 1.0 on the vertical axis.]

Figure 3.3: Emission and transition probabilities calculated for the commuter train, subway, bus and walk modes on a section of a selected trajectory. The label next to a GPS point in (a) and (b) marks the point index inside the trajectory. To ensure compliance with GDPR, the chosen section does not have any private location as terminal point, and it is one of many trips with a similar origin-destination pair.

[Figure 3.4 panels: (a) Only current point; (b) Current, previous and following point.]

Figure 3.4: Patterns by mode in the feature space. Each line represents a point. The patterns in the plot make it possible to distinguish between different transport modes, which supports the use of these features in the classification instance. In a), only probabilities for the given point are considered. In b), normalized probabilities of the neighboring points are included as point features as well.

Besides the selected ones, the authors also considered variables that contained a statistical summary of the preceding 4 points, and a userId column. The statistical summary variables were not considered in this study because it was not clear how to make them compatible with the inclusion of the neighboring elements in the feature set described above. The userId, as noted by the authors, would have removed generality from the model.

Table 3.1: Movement and motion features taken from Prelipcean et al. (2016) for the baseline approach. Movement features refer to data that implies displacement, while motion features do not necessarily imply displacement.

Feature | Kind | Derived from | Definition
speed | Movement | GPS | Point speed of the user
accMax | Motion | Accelerometer | Maximum accelerometer value
accMean | Motion | Accelerometer | Mean accelerometer value
accStdDev | Motion | Accelerometer | Standard deviation of accelerometer values

In this case, the features are normalized by the maximum value in the data set, and, as it is done with emission and transition probabilities, the neighboring values are included in each point feature set.

3.4 Sequence classification

Two supervised approaches are considered to deal with the sequence classification problem. In the first place, the problem is addressed as a strong classification problem, as defined earlier. This backs up the usage of a conditional random field (CRF) classifier. The second approach is to treat this stage as a conventional point classification exercise. In order to explore this second approach, a decision tree (DT) and a multi-class support vector machine (SVM) are evaluated.

Decision tree (DT) is a procedure that divides the predictor space into a set of distinct, non-overlapping regions. For each observation lying in a certain region, the same prediction is made. In classification problems, the prediction for each region is the most commonly occurring class of the training observations in that region. The tree grows by recursive binary splitting following a certain criterion. In the implementation used in this thesis, the criterion is based on the Gini index, which takes a small value when the variance across classes is low. Equation 3.8 shows the definition of the Gini index, following James et al. (2013), where $\hat{p}_{mk}$ represents the proportion of training observations in region $m$ that belong to class $k$, and $K$ is the number of classes.

$$G = \sum_{k=1}^{K} \hat{p}_{mk}\,(1 - \hat{p}_{mk}) \tag{3.8}$$

The growth stops when a condition among the stop criteria is met, such as the minimum size of the leaf nodes (terminal nodes), or maximum depth. Another option to control the size of the tree is pruning, where branches of the tree are removed in a final step, balancing complexity of the tree and purity of the nodes.
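As a small illustration of Equation 3.8, the Gini index of one region can be computed directly from the class labels of the training observations that fall in it. The function name and the example labels are hypothetical; the sketch is independent of the library implementation used in the thesis.

```python
from collections import Counter


def gini_index(labels):
    """Gini index of one region: sum over classes of p * (1 - p),
    where p is the proportion of observations of that class."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * (1.0 - c / n) for c in Counter(labels).values())


pure_region = ["walk", "walk", "walk", "walk"]
mixed_region = ["walk", "walk", "bus", "bus"]
```

Here `gini_index(pure_region)` is 0 because a single class fills the region, while `gini_index(mixed_region)` is 0.5, the largest possible value for two classes. A split is attractive when it produces regions with low values.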

Support vector machine (SVM) is a non-probabilistic binary classifier that tries to separate two classes of observations by finding an optimal hyperplane (James et al., 2013). This is the hyperplane that has the largest distance to the closest observation of any class, defining a margin on each side of it. The regularization parameter $C$ defines to what extent the training observations may violate these margins. To handle non-linear boundaries, SVM enlarges the feature space by using kernels, which are functions that measure the similarity of two observations. Classification in the test data set is done by assessing on which side of the hyperplane (found with the training data set) the observation lies.

To train the model, the implementation used in this thesis² solves the following minimization problem:

$$\min_{w, b, \xi} \; \frac{1}{2} w^T w + C \sum_{i=1}^{n} \xi_i \tag{3.9}$$

subject to the constraints $y_i(w^T \phi(x_i) + b) \geq 1 - \xi_i$ and $\xi_i \geq 0$. $C$ is the already mentioned regularization parameter, $\xi_i$ is the distance from each point to the correct margin boundary, $w$ is the vector of coefficients of the hyperplane, and $\phi$ is the feature mapping associated with the kernel function.

In this case, multi-class support is achieved with a one-vs-one approach. This implies building $\binom{K}{2}$ SVMs, one per pair of classes, where $K > 2$ is the number of classes. The test observation is classified by each of the models, and the number of times that the observation is assigned to each class is collected. The most frequently assigned class is chosen.
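The one-vs-one voting scheme can be sketched without any SVM machinery. In the following toy example, each pairwise model is replaced by a hypothetical rule that picks the mode whose assumed typical speed is closer to the observed speed. The speeds, the function names and the single-feature input are illustrative assumptions; only the pairing and voting logic reflects the description above.

```python
from collections import Counter
from itertools import combinations

MODES = ["walk", "bus", "subway", "commuter train", "tram"]

# Hypothetical typical speeds in m/s, used only to build toy pairwise rules.
TYPICAL_SPEED = {"walk": 1.4, "bus": 8.0, "subway": 14.0,
                 "commuter train": 20.0, "tram": 6.0}


def make_pairwise_classifier(mode_a, mode_b):
    """Stand-in for one binary SVM: choose the closer typical speed."""
    def classify(speed):
        dist_a = abs(speed - TYPICAL_SPEED[mode_a])
        dist_b = abs(speed - TYPICAL_SPEED[mode_b])
        return mode_a if dist_a <= dist_b else mode_b
    return classify


# One binary model per unordered pair of classes: K * (K - 1) / 2 in total.
PAIRS = list(combinations(MODES, 2))
CLASSIFIERS = [make_pairwise_classifier(a, b) for a, b in PAIRS]


def one_vs_one_predict(speed):
    """Collect one vote per pairwise model and return the most voted class."""
    votes = Counter(clf(speed) for clf in CLASSIFIERS)
    return votes.most_common(1)[0][0]
```

With the five modes considered here, `PAIRS` holds 10 pairwise models. Ties between classes are broken arbitrarily in this sketch.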

² https://scikit-learn.org/stable/modules/svm.html#svm-classification, last accessed 20 May 2020.

With respect to conditional random fields (CRF), this is a discriminative framework developed by Lafferty et al. (2001) to model data where variables depend on each other. It describes the conditional distribution $p(y|x)$ that factorizes according to a factor graph $G$ over $X$ and $Y$ for any value $x \in X$. Following Sutton (2012), the general conditional distribution can be written as

$$p(y|x) = \frac{1}{Z(x)} \prod_{a=1}^{A} \psi_a(y_a, x_a) \tag{3.10}$$

where $\psi_a$ is a factor or local function of $G$, and $Z(x)$ is the normalization constant. The local function usually has a log-linear form:

$$\psi_a(y_a, x_a) = \exp\left\{ \sum_{k=1}^{K(A)} \theta_{ak} f_{ak}(y_a, x_a) \right\} \tag{3.11}$$

where $f_{ak}$ is the feature function for feature $k \in K$, and $\theta$ is the parameter vector that has to be learned.

The specific way in which $y_a$ is considered depends on the type of graph and on the feature function. $x_a$ may be any subset of the input variables, as CRF does not represent dependencies among them. Because of the nature of the problem analyzed here, in this thesis the graph $G$ takes the form of a linear chain, which leads to the so-called linear chain CRF (see Figure 3.5). In this case, the feature function takes the form $f_k(y_t, y_{t-1}, x_t)$.

[Figure 3.5 graphic: linear chain of output nodes y and input nodes x, joined by factor boxes.]

Figure 3.5: Example of the graphical model of a linear chain CRF, inspired by Sutton (2012). The black boxes represent the factors or local functions. In this case, the transition factors depend on current and previous input variables.

Our implementation estimates the parameter vector θ with the gradient descent algorithm, using the L-BFGS method.

In DT and SVM, the unit to be classified is the point. In the training stage, the point is labeled with the annotated ground truth transport mode. In CRF, the unit to classify is a sequence of points that corresponds to a trip. Each point of this sequence is described with the same set of features considered with the other classifiers. In this case, nevertheless, the labels are further refined with an annotation of the position in the sequence, inspired by the work of Sutton (2012) on the entity recognition problem. This way, instead of labelling the points only by their transport mode, we consider three different states for the labels: I-mode, which means that the point lies inside a leg; B-mode, which means that the point is the first item of a new leg; and E-mode, which means that the point is the last member of a leg. The leading and trailing points of the trip are labeled as I-mode because, as stated before, the original leading and trailing points were removed for having incomplete information. For example, a sequence of labels for a trip with walk and bus modes may look like this: [I-walk, I-walk, E-walk, B-bus, I-bus, I-bus]. In order to calculate the accuracy measures, the state prefix is removed from the predicted labels.
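The labelling scheme can be sketched as two small Python functions, one that adds the state prefix and one that removes it before computing accuracy. The function names are hypothetical, and the treatment of a one-point leg in the middle of a trip (labeled B here) is an assumption, since the text does not cover that case.

```python
def encode_labels(modes):
    """Turn per-point modes into I-/B-/E- prefixed labels for one trip."""
    last = len(modes) - 1
    labels = []
    for i, mode in enumerate(modes):
        starts_leg = i > 0 and modes[i - 1] != mode
        ends_leg = i < last and modes[i + 1] != mode
        if starts_leg:
            prefix = "B"  # first point of a new leg
        elif ends_leg:
            prefix = "E"  # last point of a leg
        else:
            prefix = "I"  # inside a leg, or first/last point of the trip
        labels.append(f"{prefix}-{mode}")
    return labels


def strip_prefix(labels):
    """Remove the state prefix, keeping only the transport mode."""
    return [label.split("-", 1)[1] for label in labels]


trip = ["walk", "walk", "walk", "bus", "bus", "bus"]
encoded = encode_labels(trip)
```

For this trip, `encoded` reproduces the sequence given in the text, and `strip_prefix(encoded)` returns the original modes.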


Chapter 4 Experimental setup

4.1 Transport networks

Two kinds of transport networks were considered for the map matching instance: public transport and pedestrian networks, for the whole Stockholm county. The public transport networks (bus, metro, commuter train and tram) were collected in Shapefile format from an official website, and correspond to the 2017 version of the network. The date of the network is important, as it has to be close to the year in which the MEILI collection was carried out (2015).

After retrieving the raw network data, several steps were carried out. First, a manual check was performed to fix broken links and remove duplicated ones. The edges data set contains 2 arcs per pair of stops (one in each direction) for all the modes, so each arc touches another arc in the same direction only at stops. Nevertheless, in certain cases there were problems with the geometries that had to be fixed manually. Next, the topology information was built with the help of the "build graph" tool of the QGIS3 Networks plugin¹. Finally, connections between stops with the same name, in each network, were built, in order to allow all possible combinations of trips in each direction. To keep the network simple enough, straight lines were used.

Regarding the walking network, it was collected with the help of the OSMnx library² for Python. This tool can fetch the desired network type from OpenStreetMap (OSM) for a given bounding box, polygon or region name, including the topology information (Boeing, 2017). Once the network was retrieved, the following steps were carried out: first, "residential" highways that had sidewalks were removed. Then, the topology was generated again with the already mentioned Networks plugin. Finally, reverse links were added, in order to allow movement in both directions.

¹ https://plugins.qgis.org/plugins/networks/, last accessed 5 January 2020.

² https://github.com/gboeing/osmnx, last accessed 31 May 2020.

The multimodal network is displayed in Figure 4.1. Statistics about it are reported in Table 4.1.

[Figure 4.1 panels: (a) Walking; (b) Public transport.]

Figure 4.1: Transport networks in Stockholm, Sweden. The left panel shows the pedestrian network, obtained from OpenStreetMap. The right panel displays the subway, commuter train, tram and bus networks.

Table 4.1: Statistics of the transport networks and travel diary data set

Mode | Transport network: nodes | Transport network: edges | Travel diary data set: legs | Travel diary data set: points
Walk | 226,109 | 561,470 | 1,448 | 29,380
Bus | 12,827 | 43,120 | 281 | 25,120
Subway | 239 | 226 | 255 | 7,545
Commuter train | 106 | 228 | 157 | 11,192
Tram | 217 | 435 | 182 | 4,631
Total | 239,498 | 605,479 | 2,323 | 77,836

4.2 Travel diary data set

The empirical evaluation of the algorithm was performed with a data set collected with the MEILI travel diary collection, annotation and automation system, which is a semi-automated travel diary generation solution developed by Prelipcean et al. (2018a). The data set was generated in November 2015 in the Stockholm area, and contains 1 million GPS locations and 4246 identified trips from 368 voluntary commuters. Each commuter had dedicated software installed on their mobile device, which was connected to a backend server that contained an Application Programming Interface (API) and a database where the information was stored. The devices automatically collected the sensor information, such as the GPS locations, and a dedicated AI module in the server segmented the trips and inferred the modes. Finally, the software asked the users to validate or modify the information.

The locations were collected with equidistant sampling at 50 m, instead of a fixed time interval (equitime sampling). The reasons behind this choice, according to Prelipcean (2016), are that equidistant sampling needs fewer points, uses the battery more efficiently, and has higher utility for spatial applications. However, in certain transport modes where the GPS signal can be disrupted, such as the subway, the effective distance between collected points may be much longer. This can be recognized in the trajectories as long straight lines.

For each point, the following fields were included: latitude and longitude, timestamp, point speed (derived from GPS), and accelerometer-related measures, among others. While the geographical coordinates were used to build the GPS trajectories, and as an input for the map matching procedure, speed and some of the accelerometer-related measures (see Table 3.1) were used as conventional features for the baseline approach.

Transformation of GPS locations into trips is not part of the problem considered in this thesis. Hence, the set of GPS locations was segmented into trajectories by using the trip information annotated in MEILI. The table triplegs_gt contains the annotated mode for each leg, as a time range defined by the from_time and to_time fields, together with a given trip_id and user_id. The annotated data was loaded through a built-in mechanism that allowed users to describe the legs of their trips and the corresponding modes. Using this ground truth information, each trip could be reconstructed as a sequence of labeled points.
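The reconstruction of a trip as a sequence of labeled points can be sketched as a lookup of each point timestamp in the annotated leg time ranges. The from_time and to_time field names come from the text; the `mode` key, the tuple layout of the points, the integer timestamps and the decision to drop points outside every leg are assumptions made for this illustration.

```python
def label_points(points, legs):
    """Attach the annotated mode to each GPS point of one trip.

    points -- list of (timestamp, latitude, longitude) tuples
    legs   -- list of dicts with from_time, to_time and mode keys
    """
    labeled = []
    for timestamp, lat, lon in sorted(points):
        for leg in legs:
            if leg["from_time"] <= timestamp <= leg["to_time"]:
                labeled.append((timestamp, lat, lon, leg["mode"]))
                break  # each point is assigned to the first matching leg
        # Assumption: points outside every annotated leg are dropped.
    return labeled


# Hypothetical trip with integer timestamps for readability.
legs = [
    {"from_time": 0, "to_time": 10, "mode": "walk"},
    {"from_time": 11, "to_time": 30, "mode": "bus"},
]
points = [(25, 59.35, 18.08), (5, 59.33, 18.06), (15, 59.34, 18.07), (99, 59.40, 18.10)]
labeled = label_points(points, legs)
```

The result is ordered by time and carries the modes walk, bus, bus; the point at timestamp 99 falls outside both legs and is left out.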

The modes considered for the analysis were walk, subway, commuter train, bus and tram, corresponding to the selected networks; from the trips in that group, only the 1890 confined to Stockholm county were used. Table 4.1 shows the composition of that data set. As the number of items in each class was different, the data set had to be balanced. The strategy adopted consisted in trying to obtain a similar number of points per class without corrupting the trips. This was achieved by duplicating trips randomly until the number of points per class was similar to that of the class with the greatest number of points. Trips were removed in classes where this procedure generated even bigger samples. Besides this, further cleaning had to be performed, as it was found that single-mode tram trips, and single-mode walk and bus trips longer than 20 km, had important label problems. Nevertheless, not all noise could be removed from the data set. The final distribution of segments (trip legs) is shown in Figure 4.2.

Figure 4.2: Distribution of segments by mode after balancing. Walk, subway and tram have a higher participation.
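The duplication part of the balancing strategy can be sketched as follows. This is a rough illustration under several assumptions rather than the procedure actually used: trips are represented as lists of per-point mode labels, "similar" is taken to mean reaching a hypothetical `tolerance` fraction of the largest class, the random choice is seeded for repeatability, and the later step of removing trips from oversized classes is left out.

```python
import random
from collections import Counter


def points_per_mode(trips):
    """Count points by mode over a list of trips (lists of mode labels)."""
    counts = Counter()
    for trip in trips:
        counts.update(trip)
    return counts


def balance_by_duplication(trips, tolerance=0.9, max_copies=10000, seed=0):
    """Duplicate whole trips until every mode has at least `tolerance`
    times as many points as the largest mode, or max_copies is reached."""
    if not trips:
        return []
    rng = random.Random(seed)
    balanced = list(trips)
    for _ in range(max_copies):
        counts = points_per_mode(balanced)
        rarest = min(counts, key=counts.get)
        if counts[rarest] >= tolerance * max(counts.values()):
            break
        # Trips are copied whole, so their internal sequence stays intact.
        candidates = [trip for trip in trips if rarest in trip]
        balanced.append(rng.choice(candidates))
    return balanced


toy_trips = [["walk"] * 10, ["bus"] * 2]
balanced_trips = balance_by_duplication(toy_trips)
```

In the toy case, the two-point bus trip is copied until both modes have 10 points. Because multimodal trips add points to several classes at once, the loop is capped by `max_copies` instead of assuming that it always converges.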

4.3 Settings and implementation

The multimodal map matching was implemented in C++ using the Fast Map Matching (FMM) framework³ developed by Yang and Gidófalvi (2018). FMM was run for each trajectory against each transport network using the configuration shown in Table 4.2. For the sake of simplicity, the same parameters were used for all networks, with the exception of the pedestrian network, where a smaller search radius was set because of its ubiquity.

The DT and SVM classification models were implemented in Python using the sklearn library⁴. The CRF implementation was taken from sklearn-crfsuite⁵, which is based on Okazaki (2007). The classifiers configuration is shown in Table 4.3.

³ The source code is available at https://github.com/cyang-kth/fmm.

⁴ https://scikit-learn.org/stable/, last accessed 15 May 2020.

⁵ https://sklearn-crfsuite.readthedocs.io/en/latest/, last accessed 15 May 2020.

Table 4.2: FMM configuration

Transport network | k | r (m) | pf | gps_error (m)
Walk | 3 | 250 | 0 | 100
Bus, Subway, Commuter train, Tram | 3 | 1000 | 0 | 100

Parameters: k = number of candidates, r = search radius, pf = penalty factor for reverse movement, gps_error = GPS error.

In all cases, the hyperparameters were optimized with the randomized search procedure, which is also available in the sklearn package. With regard to DT, the lower bound of the minimum leaf size was set to a higher number than the default (which is 1), and the resulting parameter was then modified manually, to limit the amount of noise captured by the model. This parameter and min_samples_split, which refers to the minimum size of an internal node, were the main mechanisms used to control the growth of the tree.

In SVM, γ defines the radius of influence of the RBF kernel, and, as it was mentioned, C is a regularization parameter related to the margin violations.

Regarding CRF, c1 and c2 are the coefficients for L1 and L2 regularization, specific to the L-BFGS method.

The three classifiers were trained for three feature sets: the one containing only movement and motion features, the one containing the topological ones —derived from the map matching instance— and the one containing both groups of features together. Training was performed with 75% of each data set, and the complement was assigned to the test set. A set of evaluation measures were collected for each model.

All experiments were performed on a computer with a 2 GHz Intel Core i5 processor with two cores, and 8 GB of RAM (1867 MHz LPDDR3).

Table 4.3: Configuration of the classifiers. The table does not include some hyperparameters with default values. MMF = movement and motion features, TOPO = topological features.

Features | Classifier | Parameters
MMF | DT | sample_leaf_size = 7, min_samples_split = 3, criterion = gini
MMF | SVM | kernel = rbf, C = 10, γ = 10
MMF | CRF | algorithm = lbfgs, c1 = 1.1110297614818931, c2 = 0.04789142150865519, max_iterations = 150
TOPO | DT |
TOPO | SVM | kernel = rbf, C = 1, γ = 10
TOPO | CRF | algorithm = lbfgs, c1 = 0.017820401525518698, c2 = 0.1403641087445735, max_iterations = 100
TOPO + MMF | DT |
TOPO + MMF | SVM | kernel = rbf, C = 1, γ = 10
TOPO + MMF | CRF | algorithm = lbfgs, c1 = 0.21470299809040455, c2 = 0.03502731385456263, max_iterations = 100

4.4 Evaluation measures

Three different levels of measures were considered: point-level accuracy, which measures the proportion of correctly predicted points over the total number of points, segment (or trip leg) level, and trip level. A summary is presented in Table 4.4.

Table 4.4: Evaluation measures by level

Level | Evaluation measures
Point | Accuracy by point
Segment | Accuracy by segment, Accuracy by distance, Precision, Recall, F1 score
Trip | Trip accuracy, Oversegmentation, Undersegmentation

Regarding segment-based measures, the objective is to evaluate how well trips are segmented into legs, and how well the mode of each leg is inferred. At this level, we consider two different kinds of measures, illustrated in Figure 4.3. First, accuracy by segment and accuracy by distance, following Zheng et al. (2010). Accuracy by segment is the number of segments (trip legs) correctly predicted over the total number of segments, and accuracy by distance refers to the sum of the lengths of the correctly detected segments, over the total length of the segments. Following the definition from Section 3.1, a predicted segment is defined as a sequence of points with the same (inferred) mode. A correct prediction, on the other hand, is defined as a full match between the labels of the points in the ground truth segment and the labels of the corresponding points in the predicted sequence. Length is computed, as in Zheng et al. (2010), as the actual geographical length in kilometers.

Secondly, we also consider precision and recall by distance, in the same fashion as we defined accuracy by distance. These measures are used by Zheng et al. (2010) and then replicated by Lin et al. (2013) and Zhu et al. (2018). Recall is defined as the ratio between true positives and the sum of true positives and false negatives. In this case, this is equal to the length of the correctly predicted ground truth segments of a mode, over the total length of all ground truth segments of that mode. Precision is the ratio between true positives and the sum of true positives and false positives. This is equal to the length of the correctly predicted ground truth segments of a mode, over the total length of all predicted segments of that mode. At this level we also report the F1 score, which takes into account both precision and recall. It is computed as the harmonic mean of these two metrics:

$$F1 = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \tag{4.1}$$

With respect to the trip level, we define trip accuracy as the proportion of correctly predicted full trips (full sequences of points) over the total number of trips. Also, we measure how accurately trips were segmented, as the proportion of trips that were over or undersegmented.

While accuracy by point may illustrate how well the classifier is able to detect the mode of each individual location, we consider this measure less descriptive of the mode inference quality than the segment-based measures. In particular, the distance-based ones are the most interesting, as they take into account the fact that segments may have different length. The combination of these indicators with the over and undersegmentation measures may give a good idea of how well trips are being described.

It can be expected that point-based classification based on DT and SVM will perform better at the point level, and that CRF will have an advantage at the segment and trip levels, both in terms of distance and of over and undersegmentation.

[Figure 4.3 graphic: a ground truth sequence with Segment 1 (green mode, 4 distance units) and Segment 2 (red mode, 3 distance units), the predicted sequence, and the resulting match. Accuracy by segment: 1/2 = 0.5. Accuracy by distance: 4/7 = 0.57. Green mode: precision 4/5 = 0.8, recall 4/4 = 1, F1 score = 0.89. Red mode: precision 0/2 = 0, recall 0/3 = 0.]

Figure 4.3: Evaluation measures at the segment level. The thick lines represent the limits of the segments, and the thin ones the units of distance. In this example, only the first ground truth segment is correctly predicted. A correct prediction is defined as a full match between the labels of the points in the ground truth segment and the labels of the corresponding points in the predicted sequence. Accuracy by segment shows that one of the two segments is matched. Accuracy by distance refers to the fact that only 4 of the 7 units of distance are matched. Precision for the green mode is the ratio between the length of the correctly predicted ground truth segment (= 4) and the total length of all the predicted segments for this mode (= 5). Recall for the green mode refers to the fact that the total length of the ground truth segments (= 4) is correctly predicted. In the red mode, both precision and recall are zero because the numerator (the length of correctly predicted ground truth segments) is zero in both cases.
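The segment-level measures can be checked against the worked example of Figure 4.3 with a short Python sketch. The function names and the representation (one label and one length per unit of distance) are assumptions made for illustration; the definitions of a correct segment, precision and recall follow the text.

```python
from itertools import groupby


def segments(labels):
    """Split a label sequence into (mode, start, end) runs; end is exclusive."""
    runs, start = [], 0
    for mode, group in groupby(labels):
        length = len(list(group))
        runs.append((mode, start, start + length))
        start += length
    return runs


def segment_measures(truth, predicted, lengths):
    """Accuracy by segment, accuracy by distance, and per-mode
    (precision, recall, F1) by distance."""
    gt_segments = segments(truth)
    # A ground truth segment is correct when every label inside it matches.
    correct = [(m, s, e) for m, s, e in gt_segments if predicted[s:e] == truth[s:e]]
    acc_segment = len(correct) / len(gt_segments)
    acc_distance = sum(sum(lengths[s:e]) for _, s, e in correct) / sum(lengths)
    per_mode = {}
    for mode in sorted(set(truth) | set(predicted)):
        matched = sum(sum(lengths[s:e]) for m, s, e in correct if m == mode)
        truth_len = sum(l for l, t in zip(lengths, truth) if t == mode)
        pred_len = sum(l for l, p in zip(lengths, predicted) if p == mode)
        precision = matched / pred_len if pred_len else 0.0
        recall = matched / truth_len if truth_len else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        per_mode[mode] = (precision, recall, f1)
    return acc_segment, acc_distance, per_mode


# The example of Figure 4.3: seven units of distance, two ground truth segments.
truth = ["green"] * 4 + ["red"] * 3
predicted = ["green"] * 5 + ["red"] * 2
acc_segment, acc_distance, per_mode = segment_measures(truth, predicted, [1] * 7)
```

This gives an accuracy by segment of 0.5, an accuracy by distance of 4/7, precision 0.8 and recall 1 for the green mode, and zero precision and recall for the red mode, in line with the figure.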


Chapter 5 Results

5.1 Movement and motion features

Tables 5.1, 5.2, and 5.3 show the performance of the baseline approach, which uses conventional movement and motion features.

Table 5.1: Movement and motion features. Point, segment, distance, and trip accuracy.

Accuracy level | Data set | DT | SVM | CRF
Point | Train | 0.94 | 0.82 | 0.59
Point | Test | 0.90 | 0.81 | 0.59
Segment | Train | 0.81 | 0.68 | 0.43
Segment | Test | 0.76 | 0.67 | 0.42
Distance | Train | 0.75 | 0.53 | 0.52
Distance | Test | 0.70 | 0.47 | 0.52
Trip | Train | 0.63 | 0.47 | 0.30
Trip | Test | 0.55 | 0.45 | 0.29

It can be stated that DT has the highest performance at all levels, and outperforms the other classifiers by a wide margin. At the point level, SVM presents values higher than 80% but CRF does not reach 60%. Nevertheless, taking distance into account, these two classifiers have a similar performance: accuracy and mean precision and recall are all close to 50%. The F1 indicator, on the other hand, shows that CRF is a step behind the other classifiers. It should be noted that the mean precision and recall values are similar to or higher than the results reported by Prelipcean et al. (2016) for their implicit approach.
