Route flow estimation based on time-dependent route choice sets and historical travel times


Department of Science and Technology

Linköping University

SE-601 74 Norrköping, Sweden


Valerie Dahl

Mikael Davidsson


Master's thesis carried out in Transport Systems at the Institute of Technology, Linköping University


Supervisor: David Gundlegård

Examiner: Clas Rydergren


Copyright

The publishers will keep this document online on the Internet - or its possible

replacement - for a considerable time from the date of publication barring

exceptional circumstances.

The online availability of the document implies a permanent permission for

anyone to read, to download, to print out single copies for your own use and to

use it unchanged for any non-commercial research and educational purpose.

Subsequent transfers of copyright cannot revoke this permission. All other uses

of the document are conditional on the consent of the copyright owner. The

publisher has taken technical and administrative measures to assure authenticity,

security and accessibility.

According to intellectual property law the author has the right to be

mentioned when his/her work is accessed as described above and to be protected

against infringement.

For additional information about the Linköping University Electronic Press

and its procedures for publication and for assurance of document integrity,

please refer to its WWW home page:

http://www.ep.liu.se/


Master Thesis carried out at Division of Communication and Transport systems


28 June, 2016

Institute of Technology, Department of Science and Technology, SE-581 83 Linköping, Sweden

Abstract

Since congestion leads to variations in travel time, which in turn give variations in the traffic flow, it is interesting to estimate the traffic flow in larger cities where cars drive in a limited space. In order to estimate the traffic flow, different traffic models are usually used. These models often use volume-delay functions, which calculate the travel time for each link depending on the current traffic flow. However, in these models the process of reaching equilibrium can be time consuming, and it is hard to calibrate the volume-delay functions for a road network with a large set of links. Instead, we assume that it is relatively simple to measure or collect historical time-dependent travel times on a large set of links. With this assumption, a method is developed that uses time-dependent route choice sets and time-dependent travel times in order to estimate time-dependent route flows.

In this thesis, the method was applied to Stockholm County, where the route choice is interesting to study since congestion occurs in the area and generates variations in travel time. In order to estimate time-dependent route flows, a time-sliced OD-matrix was created by dividing the matrix for the peak hour using two different time-slicing distributions.

The time-dependent route choice set with time-dependent travel times was created by using an existing route planning tool. These routes were mapped to the links in a road network in order to estimate link flows. The mapping was done by using map matching and a shortest path algorithm. Route shares were determined in two ways: with a method that splits the demand equally among the routes in the route choice set for an OD-pair, and with a logit model that takes travel time into account, under the assumption that travel time can affect a traveler’s route choice.

The resulting link flows were evaluated by comparing them with observed link flows for different time-slicing distributions and route share models. Furthermore, the method’s resulting link flows were evaluated against the link flows from a scenario where all travelers are assumed to choose the shortest path, in terms of free flow travel time, between each OD-pair.

The developed method can estimate link flows so that 27.9 % of the links have a GEH value less than 5, which can be compared to the commonly used acceptance criterion of 85 %. This shows that the method needs to be developed further in order to achieve link flow estimations that fulfill the acceptance criterion. Even though the overall results show that the developed method does not fulfill the acceptance criterion, the method works well on some individual links. Furthermore, the resulting link flows from the developed method match the observed link flows better than the resulting link flows from the scenario where all travelers are assumed to choose the shortest path.

Acknowledgements

First of all, we want to thank our examiner Clas Rydergren and supervisor David Gundlegård at Linköping University for feedback, support and interesting discussions. Furthermore, we want to thank Andreas Allström and Siamak Baradaran at Sweco Society Stockholm for interesting discussions and input to the thesis. We would also like to thank Rasmus Ringdahl at Linköping University for helping us develop the map matching algorithm and for help with the database. Finally, we want to thank the classmates with whom we shared the office on Campus Norrköping during our thesis work.

Valerie Dahl and Mikael Davidsson
Norrköping, June 16, 2016

Abbreviations

AADT: Annual Average Daily Traffic
API: Application Program Interface
FRC: Functional Road Class
GEH statistic: Geoffrey E. Havers’ statistic
MCS: Motorway Control System
NVDB: National Road Database
SAMS: Small Areas for Market Statistics
SQL: Structured Query Language

Contents

1 Introduction
  1.1 Aim
  1.2 Methodology
  1.3 Delimitations
  1.4 Outline
2 Literature review
  2.1 The four step model
    2.1.1 Trip generation
    2.1.2 Trip distribution
    2.1.3 Mode choice
    2.1.4 Traffic assignment
  2.2 Route choice set generation
    2.2.1 Deterministic approach
    2.2.2 Stochastic approach
  2.3 Discrete route choice modelling
    2.3.1 C-Logit
  2.4 Comparison of link flows
3 Route flow estimation method
  3.1 Area
  3.2 Road network
  3.3 Time period
  3.4 Link observations
  3.5 Overview of the route flow estimation method
  3.6 Implementation
4 OD-matrix generation
  4.1 OD-matrix for the peak hour
    4.1.1 OD-pair reduction
  4.2 Time-slicing
    4.2.1 Time-slicing distribution based on the regional average traffic flow
    4.2.2 Time-slicing distribution based on traffic counts around Brunnsviken
5 Route choice set generation
  5.1 Description of interesting routes
  5.2 Route choice set generation method
  5.3 Route extraction
    5.3.1 Start- and end point of routes
  5.4 Route mapping
    5.4.1 Route mapping reduction
    5.4.2 Route improvement
    5.4.3 Dynamics in the route choice set
6 Traffic assignment
  6.1 Calculating route shares
    6.1.1 Equal split
    6.1.2 Logit model
  6.2 Network loading
  6.3 Link flow comparison
    6.3.1 Link flows using Brunnsviken cut
    6.3.2 Link flows using Regional cut
    6.3.3 Link usage
  6.4 Network loading analysis
    6.4.1 Comparing results from using different time-slicing distributions
    6.4.2 Comparing different route share models
    6.4.3 Comparing different route choice sets
7 Discussion and further work
  7.1 Limitations in route choice set generation
  7.2 Problems with road network route mapping technique
  7.3 Effect of delimitations and simplifications
    7.3.1 No consideration of the traffic to or from Stockholm County
    7.3.2 No calibration of the OD-matrix
    7.3.3 Drawbacks with the validation of the route choice sets
    7.3.4 Drawbacks with the zoning of the area
    7.3.5 Improvements of network loading technique
  7.4 Travel times accuracy
  7.5 Excluded factors that can affect the route choice
8 Conclusions

List of Figures

1 The general idea of the developed method
2 The four step model
3 The multinomial logit model and the nested logit model for the same choice set
4 Overview of the investigated area, Stockholm County
5 An OD-pair where the route choice is not too obvious
6 The VM-zones
7 The road network of links with FRC between 0 and 7
8 The location of the radar detectors (with their individual ID number) that the traffic counts are collected from
9 Overview of the route flow estimation method
10 The border of the regional cut
11 The regional average traffic flow for weekdays for the period 30-09-2007 to 11-11-2007
12 The location of the radar detectors (with their individual ID number)
13 The Brunnsviken average traffic flow for Tuesday-Thursday for the period 01-09-2015 to 01-10-2015
14 Example of what parts of different routes that are of interest in this thesis and an example of a unique route
15 Example of how Google snaps the given start point to the closest realistic start point, in this case this is a parking lot
16 Example of how the angle is created between point 1 and point 2
17 Example of how Google describes a step as a series of points (in the figure to the left) and the points at the start and end of the step (in the figure to the right)
18 Example of how the raw Google Directions data is mapped to the road network
19 The route before running the algorithm is presented to the left. To the right, the result of running the algorithm is presented. Last link of step 9 is removed and the first and only link of step 10 is removed
20 Example of a problem that the developed algorithm can not correct
21 Example of how routes in a route choice set can vary over time
22 Illustration of how a route choice set can have the same number of suggested routes but with different routes
23 The share of all calibration links that have a GEH value less than 5 for different β values
24 The total GEH value for all calibration links for different β values
25 The resulting link flows for scenario 1 (Brunnsviken cut with equal split) for the time period 07:30-07:45
26 Resulting link flows on E4 north when using the Brunnsviken cut compared with link observations
27 Resulting link flows on E4 south when using the Brunnsviken cut compared with link observations
28 Resulting link flows on E18 north when using the Brunnsviken cut compared with link observations
29 Resulting link flows on E18 south when using the Brunnsviken cut compared with link observations
30 Resulting link flows on E4 north when using the Regional cut compared with link observations
31 Resulting link flows on E4 south when using the Regional cut compared with link observations
32 Resulting link flows on E18 north when using the Regional cut compared with link observations
33 Resulting link flows on E18 south when using the Regional cut compared with link observations
34 The total number of times a link is used in all 12 time periods for the dynamic route choice set
35 The total number of times a link is used in all 12 time periods for the shortest path route choice set

List of Tables

1 An example of an OD-matrix
2 Rules of thumb comparing traffic flows using the GEH value
3 Acceptance criteria for modeled hourly flows compared with observed flow for two different road types
4 An example of the traffic counts used
5 A part of the OD-matrix for the peak hour between 07:00 and 08:00
6 The share of each time period in the time-slicing distribution based on the regional average traffic flow
7 A part of the time-sliced OD-matrix based on the first time-slicing distribution (Regional cut)
8 The share of each time period in the time-slicing distribution based on traffic counts around Brunnsviken
9 A part of the time-sliced OD-matrix based on the second time-slicing distribution (Brunnsviken cut)
10 Example of extracted data that is stored
11 Example of data that is added to each row
12 Example of output when using pgRouting for one step
13 Number of OD-pairs with different number of unique routes
14 Number of OD-pairs with different number of unique routes for OD-pairs with and without routes passing through the Brunnsviken area
15 The β values that are best for each time-slicing distribution and evaluation criteria
16 The best β values for each time period and evaluation criteria when using the time-slicing distribution Brunnsviken cut
17 The best β values for each time period and evaluation criteria when using the time-slicing distribution Regional cut
18 The β values that are best for each time-slicing distribution and evaluation criteria
19 The different scenarios that are going to be evaluated
20 The results from the network loading for all evaluated scenarios
21 The share of time periods the calibration links have a GEH value less than 5
22 The share of time periods the calibration links have a GEH value less than 5

1 Introduction

In both long- and short-term traffic planning, it is important to consider possible changes in traffic flow. In order to estimate the traffic flow, different traffic models are usually used. Traffic models for long-term planning consider, for instance, changes in traffic flow caused by a new residential area being built. Traffic models for short-term planning instead consider changes in traffic flow caused by, for instance, congestion. In larger cities, cars drive in a limited road network, which can cause congestion. Since congestion leads to variations in travel time, which in turn give variations in the traffic flow, it is valuable to estimate the traffic flow in larger cities.

Traditionally, traffic models are used under the assumption that equilibrium is reached in all time periods, and this determines the distribution between routes. These models often use volume-delay functions, which calculate the travel time for each link depending on the current traffic flow. However, there are drawbacks with these types of traffic models. One drawback is that reaching equilibrium can be time consuming since it is an iterative process. Another drawback is that it is hard to calibrate the volume-delay functions for a road network with a large set of links.

Today, a large number of devices collect data that can be used for travel time estimation. Some examples are GPS devices in cars, mobile phones and Bluetooth devices. In this thesis, a route choice model is developed: a method that uses this kind of travel time data to distribute demand between routes. The method requires neither reaching equilibrium nor volume-delay functions. Instead, we assume that it is relatively simple to measure or collect historical time-dependent travel times on all links in an area. With this assumption, it is possible to distribute the demand between routes based on these time-dependent travel times and obtain time-dependent route flows.


1.1 Aim

The aim of this thesis is to develop a method that estimates time-dependent route flows using historical travel times. This includes generating, for each OD-pair, both a time-dependent route choice set and time-dependent route shares for that choice set. In order to reach the aim, the following research questions are investigated:

• How can a relevant time-dependent route choice set between OD-pairs be generated?

• How can relevant time-dependent route travel times be generated?

• How can a model be developed that distributes a given demand between different routes in a route choice set with known travel times?

• How well do the link flows from the developed method match link observations?

1.2 Methodology

In this thesis, a route flow estimation method was developed. The general idea of the method is shown in Figure 1: a traveler’s choice of route from a route choice set depends on the travel times of the routes. The route choice is combined with an OD-matrix in order to estimate route flows.

Figure 1: The general idea of the developed method

The developed method uses time-dependent route choice sets and time-dependent travel times in order to create time-dependent route and link flows. In order to generate time-dependent route flows for routes between each OD-pair, a time-sliced OD-matrix was used. For each OD-pair, time-dependent route choice sets and travel times were generated using an existing route planning tool. The routes in the route choice sets were mapped to links in a road network. To determine the route shares, different methods for calculating route shares were used.
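As a rough illustration of this last step, the sketch below splits the demand of one OD-pair over its route choice set, first equally and then with a logit function of travel time. The function names, the sensitivity parameter `beta` and all numbers are assumptions made for this example; they are not the implementation or the values used in the thesis.

```python
import math


def equal_split(travel_times):
    """Give every route in the choice set the same share."""
    n = len(travel_times)
    return [1.0 / n] * n


def logit_shares(travel_times, beta):
    """Logit shares: a shorter travel time gives a larger share.

    beta > 0 is a hypothetical sensitivity parameter (per minute).
    """
    t_min = min(travel_times)
    # Subtracting the minimum travel time keeps the exponents small
    # without changing the resulting shares.
    weights = [math.exp(-beta * (t - t_min)) for t in travel_times]
    total = sum(weights)
    return [w / total for w in weights]


def route_flows(demand, shares):
    """Distribute the OD demand over the routes according to the shares."""
    return [demand * s for s in shares]


# Illustrative values only: three routes (minutes) and 120 trips.
travel_times = [20.0, 22.0, 30.0]
demand = 120.0
equal = route_flows(demand, equal_split(travel_times))
logit = route_flows(demand, logit_shares(travel_times, beta=0.2))
```

With the equal split every route receives the same flow, while the logit shares move demand towards the faster routes. In both cases the route flows sum to the OD demand.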

The developed method was evaluated for a specific area. In this area, link observations were extracted and used in order to evaluate the method by comparing these observations with the method’s resulting link flows.

1.3 Delimitations

In this section, the delimitations that initially form the basis of this thesis are presented. Additional delimitations are presented later in the thesis, in the sections where they apply.

• This thesis does not involve calibration of the OD-matrix.

• The method is only applied and evaluated for one travel mode, car.

• The method does not take the capacity of the links into account.

1.4 Outline

This thesis is structured as follows. Chapter 2 consists of a literature review that covers the traditional four step model, different approaches for route choice set generation, discrete route choice modelling with different types of logit models, and how to compare link flows. Chapter 3 covers the prerequisites for the developed route flow estimation method, an overview of the method and a description of the implementation tools used. Chapters 4-6 deal with the different modules of the developed method: OD-matrix generation, route choice set generation and traffic assignment, respectively. Chapter 7 discusses the potential and weaknesses of the developed method together with possible improvements. Finally, the conclusions of the thesis are presented in Chapter 8.

2 Literature review

In this chapter, a review of literature that is relevant for this thesis is presented. The first part covers the traditional four step model with a description of each step. The second part is about different approaches for route choice set generation, followed by a section about discrete route choice modelling where different types of logit models are presented. Lastly, this chapter contains a section about a metric used to compare link flows.

2.1 The four step model

In this section, the traditional traffic demand model, the four step model, is presented. The section is mainly based on Hydén (2008), Immers and Stada (1998), Lundgren (1989) and Ortúzar and Willumsen (2001).

The four step model consists of four sub-models, which can have different names depending on the literature. In this thesis, the four sub-models are named as follows: trip generation, trip distribution, mode choice and traffic assignment. As the name of the model implies, the four step model is based on calculations in four different steps. The steps can be performed in the described sequence. However, there are also other ways of combining the different steps. Nowadays, it is common that the first three steps are performed more or less simultaneously. Thereafter, the fourth step is performed separately in a dedicated simulation software such as Emme or Visum.

The four step model has to be applied to a specific study area, which has to be divided into smaller areas called zones. According to Immers and Stada (1998), the number of zones and their size are important parameters to consider when dividing an area into zones. The trips modelled are the trips between the zones, and it is often assumed that the trips start and end in a specific geographical point in the zone. This point is called the centroid and is often located in the center of gravity based on the population. Trips that have both their start and end point in the same zone are typically not modelled.

If the zones are too large, the trips within a zone become a large part of the total number of trips, which means that a large part of the trips is not considered in the model. However, the zones should not be too small either, since smaller zones require more input data to describe them. The zones should follow the borders of existing administrative units since this gives access to the input data needed. In Sweden, a nationwide zoning is SAMS (Small Areas for Market Statistics), which is created by the administrative agency SCB (Statistics Sweden). The SAMS areas are based on the municipalities’ own zoning in the larger municipalities and on the election districts in the smaller municipalities (Statistics Sweden, 2016).

The area needs an associated network system, which consists of the transport systems of the modeled travel modes. Depending on the purpose of the desired model, both the zoning and network system can have different levels of detail. An example of a detail level of the network system is which road types to include in the road network.

One type of a four step model with the output from each step can be seen in Figure 2.

Figure 2: The four step model

In Figure 2, the output from the first step, trip generation, is the trips from zone i ($O_i$) and the trips to zone j ($D_j$). $O_i$ and $D_j$ are the input to the second step, trip distribution, where the trips are connected from zone i to zone j ($T_{ij}$). In the third step, mode choice, the trips $T_{ij}$ are distributed among travel modes, which gives the trips between zone i and j using travel mode m ($T_{ij}^m$). In the fourth step, traffic assignment, these trips are assigned to the transportation network, which gives the trips on route r ($h_r$) and the trips on link a ($x_a$) as the output.

2.1.1 Trip generation

The first step of the four step model is about predicting the number of trips that start and end in each zone in the studied area. The trips that start in a zone are called produced or generated trips, and the trips that end in a zone are called attracted trips. The trips can be classified with regard to, for instance, the purpose of the trip, the departure time of the trip or the travel mode. Typically, trips which have home as an origin and work as a destination are modeled.

In order to predict the produced and attracted trips, production and attraction models are used. The input to these models is land use and socio-economic data for each zone. Another input that influences the production and attraction of a zone is the accessibility, which is a zone’s extent of travel options and the quality of those travel options. However, the accessibility is often not used as an input to either the production or the attraction model since it is hard to quantify.

The zones that produce trips are usually zones where people live, and therefore the locations of the households are the most important land use input in the production model. The socio-economic data for the production model is often car ownership, level of income and different characteristics connected to the household, such as the number of people working within the household. The zones that attract trips are usually zones where workplaces are located. In attraction models, examples of land use are industries, hospitals, schools, sport centers and shopping centers, and the socio-economic data is often the number of employees. The produced and attracted trips are, according to Immers and Stada (1998), most frequently calculated by regression analysis.

The result of the first step of the four step model is the number of produced trips per zone i, regardless of destination ($O_i$), and the number of attracted trips per zone j, regardless of origin ($D_j$).

2.1.2 Trip distribution

The second step of the four step model is to connect the trips which originate in zone i to their destinations in zone j. Each i belongs to the entire set of origins I and each j belongs to the entire set of destinations J. The predicted trips (the travel demand) for all OD-pairs can be presented in an OD-matrix. In this matrix, the origin zones are represented by the rows and the destination zones are represented by the columns. The cell in row i and column j contains the predicted trips from zone i to zone j. In Table 1, an example of the general form of an OD-matrix is presented.

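Table 1: An example of an OD-matrix

$$
\begin{array}{c|cccccc|c}
\text{Origins/Destinations} & 1 & 2 & \cdots & j & \cdots & m & \sum_j T_{ij} \\
\hline
1 & T_{11} & T_{12} & \cdots & T_{1j} & \cdots & T_{1m} & O_1 \\
2 & T_{21} & T_{22} & \cdots & T_{2j} & \cdots & T_{2m} & O_2 \\
\vdots & \vdots & \vdots & & \vdots & & \vdots & \vdots \\
i & T_{i1} & T_{i2} & \cdots & T_{ij} & \cdots & T_{im} & O_i \\
\vdots & \vdots & \vdots & & \vdots & & \vdots & \vdots \\
n & T_{n1} & T_{n2} & \cdots & T_{nj} & \cdots & T_{nm} & O_n \\
\hline
\sum_i T_{ij} & D_1 & D_2 & \cdots & D_j & \cdots & D_m & \sum_{ij} T_{ij} = T
\end{array}
$$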

In order to predict the trips between each OD-pair ($T_{ij}$), a distribution model is used. The aim of such a model is to distribute the trips that originate in each zone i ($O_i$) over all destination zones j according to Equation 1

$$\sum_{j=1}^{J} T_{ij} = O_i \quad \text{for each } i \in I. \tag{1}$$

The model also distributes the trips that have their destination in each zone j ($D_j$) over all origin zones i according to Equation 2

$$\sum_{i=1}^{I} T_{ij} = D_j \quad \text{for each } j \in J. \tag{2}$$

There are two main types of distribution models: double constrained and single constrained. A double constrained model includes both of the constraints presented in Equations 1 and 2. In order to use the double constrained model, both the departures ($O_i$) and the arrivals ($D_j$) need to be known. A single constrained model includes only one of the constraints and is used when only one of either the departures or the arrivals is known. The distribution of trips is often performed with some kind of gravity model. A double constrained gravity model can be written as Equation 3

$$T_{ij} = A_i O_i B_j D_j f(c_{ij}). \tag{3}$$

In Equation 3, $O_i$ and $D_j$ are the number of departures and the number of arrivals, respectively, and $A_i$ and $B_j$ are constant factors. Furthermore, $f(c_{ij})$ is a deterrence function, which is usually represented by a negative exponential function of the travel impedance ($c_{ij}$) between zone i and j. The travel impedance can include several cost and time elements, such as petrol costs, toll costs and driving time.
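The factors $A_i$ and $B_j$ are commonly found by iterative balancing. The sketch below is a minimal illustration under stated assumptions: it uses the widely used form $T_{ij} = A_i O_i B_j D_j f(c_{ij})$ with $f(c_{ij}) = e^{-\gamma c_{ij}}$, a hypothetical parameter `gamma`, made-up zone data, and total departures equal to total arrivals. It is not taken from the thesis.

```python
import math


def gravity_model(origins, destinations, cost, gamma, iterations=100):
    """Doubly constrained gravity model with f(c) = exp(-gamma * c).

    Alternately updates the balancing factors A_i and B_j so that the
    row sums approach O_i and the column sums approach D_j.
    """
    n, m = len(origins), len(destinations)
    f = [[math.exp(-gamma * cost[i][j]) for j in range(m)] for i in range(n)]
    a = [1.0] * n
    b = [1.0] * m
    for _ in range(iterations):
        for i in range(n):
            a[i] = 1.0 / sum(b[j] * destinations[j] * f[i][j] for j in range(m))
        for j in range(m):
            b[j] = 1.0 / sum(a[i] * origins[i] * f[i][j] for i in range(n))
    return [
        [a[i] * origins[i] * b[j] * destinations[j] * f[i][j] for j in range(m)]
        for i in range(n)
    ]


# Illustrative data: two zones, total departures equal total arrivals.
origins = [100.0, 200.0]
destinations = [150.0, 150.0]
cost = [[5.0, 10.0], [10.0, 5.0]]
trips = gravity_model(origins, destinations, cost, gamma=0.1)
```

After the balancing has converged, the row sums of `trips` reproduce the departures and the column sums reproduce the arrivals, which corresponds to the two constraints of the double constrained model.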

The double constrained gravity model in Equation 3 with an exponential deterrence function is the unique optimal solution when minimizing the cost $c = \sum_{i=1}^{I} \sum_{j=1}^{J} c_{ij} T_{ij}$ under the constraints according to Equations 1, 2, 4 and 5

$$-\sum_{i=1}^{I} \sum_{j=1}^{J} T_{ij} \ln(T_{ij}) \geq \bar{H} \tag{4}$$

$$T_{ij} \geq 0. \tag{5}$$

The constraint in Equation 4 is an entropy constraint with a lower bound $\bar{H}$. This constraint makes the resulting trips more dispersed, as they are in reality, compared to a result obtained without it. Equation 5 is a constraint that only allows the trips in an OD-pair to be non-negative.
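As a brief sketch of why the entropy constraint leads to an exponential deterrence function, consider the Lagrangian of the minimization problem. The multipliers $\alpha_i$, $\beta_j$ and $\lambda$ are notation introduced here for illustration, and the entropy constraint is assumed to be active, so that $\lambda > 0$:

$$
\begin{aligned}
L &= \sum_{i}\sum_{j} c_{ij} T_{ij} + \lambda \Big( \bar{H} + \sum_{i}\sum_{j} T_{ij} \ln T_{ij} \Big) - \sum_{i} \alpha_i \Big( \sum_{j} T_{ij} - O_i \Big) - \sum_{j} \beta_j \Big( \sum_{i} T_{ij} - D_j \Big), \\
\frac{\partial L}{\partial T_{ij}} &= c_{ij} + \lambda \left( \ln T_{ij} + 1 \right) - \alpha_i - \beta_j = 0, \\
T_{ij} &= e^{\alpha_i/\lambda - 1} \, e^{\beta_j/\lambda} \, e^{-c_{ij}/\lambda}.
\end{aligned}
$$

Identifying $A_i O_i = e^{\alpha_i/\lambda - 1}$, $B_j D_j = e^{\beta_j/\lambda}$ and $f(c_{ij}) = e^{-c_{ij}/\lambda}$ gives a product of an origin factor, a destination factor and a negative exponential function of the travel impedance, which is the form of the double constrained gravity model. Since this solution is strictly positive, the non-negativity constraint is satisfied automatically.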

The result of the second step in the four step model is the number of trips between zone i and j (Tij) which can be presented in an OD-matrix.

2.1.3 Mode choice

The third step in the four step model is to decide, from the OD-matrix determined in the previous step, how many trips in each OD-pair are allocated to each available travel mode. There are several factors that influence which travel mode a trip maker will choose. Ortúzar and Willumsen (2001) divide these factors into three groups. The first group is factors connected to the trip maker, such as car availability, driving license and income. The second group is factors connected to the journey itself, such as trip purpose and the time of day when the trip is made. Lastly, the third group is factors connected to the transport facility. The factors in the third group can be divided into quantitative and qualitative factors. Quantitative factors are, for instance, the travel time, which includes the time for the journey door-to-door, and costs like fuel and parking costs. Qualitative factors, for example comfort and security, can be difficult to measure.

The mode choice model is applied to each available travel mode. The travel modes can be divided into groups based on some common factor, for instance “car” and “public transport”, or they can be more specific, such as “driving alone in a car” and “bus”. Usually, mode choice models are based on discrete choice models where the trip maker is assumed to maximize the utility when choosing among a set of travel modes. The utility of each travel mode k in the choice set K can be expressed as Equation 6

$$U_k = V_k + \varepsilon. \tag{6}$$

In Equation 6, $V_k$ is the observable utility and the error term $\varepsilon$ captures the unobservable utility and the variety of taste among the trip makers. The observable utility consists of a weighted sum of factors that influence which travel mode a trip maker will choose, for instance car availability and travel time.

The utility function $V_k$ is used to calculate the probability that a trip maker will choose travel mode k. The probability can be calculated with a logit model if the error term in Equation 6 is assumed to have a Gumbel or Weibull distribution. The multinomial logit model, which is the simplest form of the logit model, is presented in Equation 7

$$p_k = \frac{e^{\mu V_k}}{\sum_{k \in K} e^{\mu V_k}}. \tag{7}$$

In Equation 7, $\mu$ is a variance parameter larger than 0. In the multinomial logit model, the error terms for all alternatives in the choice set are assumed to be independent and to have the same variance. For instance, if the choice set consists of car, bus and subway and each travel mode has the same observed utility, the probability of choosing each of the modes will be 1/3. In the example, two of the modes are public transport and can be assumed to have similar characteristics compared to the car alternative. These similar characteristics can be modeled by assuming that the error terms for the two public transport modes are dependent. The nested logit model can be used to model such a multi-level choice situation. In our case, the choice at the first level is between public transport and car, and at the second level the choice is between bus and subway. The probability of choosing bus would then be the product of the probability of choosing public transport and the probability of choosing bus given that public transport is chosen. However, this only applies if there is independence between the nests. The differences between the multinomial and the nested logit model can be seen in Figure 3.

Figure 3: The multinomial logit model and the nested logit model for the same choice set
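As a minimal sketch of Equation 7, the multinomial logit probabilities can be computed as below. The function name, the utility values and the default µ = 1 are assumptions made for illustration only; they are not taken from the thesis.

```python
import math


def multinomial_logit(utilities, mu=1.0):
    """Return p_k = exp(mu * V_k) / sum_l exp(mu * V_l) for every mode k.

    utilities: dict mapping a mode name to its observable utility V_k.
    mu: variance parameter, must be larger than 0.
    """
    if mu <= 0:
        raise ValueError("mu must be larger than 0")
    scaled = {mode: mu * v for mode, v in utilities.items()}
    # Subtracting the largest scaled utility avoids overflow in exp()
    # and leaves the ratio in Equation 7 unchanged.
    top = max(scaled.values())
    weights = {mode: math.exp(s - top) for mode, s in scaled.items()}
    total = sum(weights.values())
    return {mode: w / total for mode, w in weights.items()}


# Hypothetical utilities: all three modes have the same observed utility.
probabilities = multinomial_logit({"car": -1.0, "bus": -1.0, "subway": -1.0})
```

With equal observed utilities, each of the three modes receives the probability 1/3, independent of the value of µ, which matches the example in the text.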

Given the probabilities for each travel mode, the proportion of trip makers that will choose each travel mode is used to divide the OD-matrix, received from the trip distribution step, into OD-matrices for each travel mode. The result of the third step of the four step model is the number of trips between zones i and j using travel mode m ($T_{ij}^m$).
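The split of the OD-matrix into one matrix per travel mode can be sketched as follows. For brevity, the sketch assumes that the same mode shares apply to every OD-pair, which is a simplification; the zone numbers, trip counts and shares are made up.

```python
def split_od_by_mode(od_matrix, mode_shares):
    """Return T^m_ij = p_m * T_ij for every OD-pair (i, j) and travel mode m.

    od_matrix: dict mapping (i, j) to the number of trips T_ij.
    mode_shares: dict mapping a mode name to its share p_m (shares sum to 1).
    """
    return {
        mode: {pair: share * trips for pair, trips in od_matrix.items()}
        for mode, share in mode_shares.items()
    }


# Hypothetical input: two OD-pairs and two travel modes.
od = {(1, 2): 300.0, (2, 1): 120.0}
shares = {"car": 0.5, "public transport": 0.5}
od_by_mode = split_od_by_mode(od, shares)
```

In a full model the shares would normally differ between OD-pairs, since the utilities depend on, for instance, the travel time between the zones.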

2.1.4 Traffic assignment

The fourth and last step in the four step model is to assign the forecasted trips between zones i and j with travel mode m to routes and their corresponding links. This assignment is done for each travel mode separately since there are large differences between the transportation systems of the different travel modes. Since this thesis only considers cars, this section will only examine the traffic assignment for cars. The road network that the trips will be assigned to consists of links, with a travel impedance $c_a$ associated with each link a.

There are different types of traffic assignment models. Ortúzar and Willumsen (2001) classify them as all-or-nothing, Wardrop’s equilibrium, stochastic user equilibrium and pure stochastic assignments. The simplest traffic assignment method is the so-called all-or-nothing assignment model. The model assigns all travel demand in one OD-pair to one specific route, usually the shortest route in the sense of travel time or length. The shortest route is the one that has the least travel impedance of all routes. The travel impedance of a route is calculated as the sum of the travel impedances $c_a$ of all links in the route. The all-or-nothing assignment method does not take congestion effects into account and assumes that all trip makers have the same perception of what the “best” route is. However, according to Ortúzar and Willumsen (2001), trip makers have individual perceptions of what the best route is. Some trip makers may prefer rural roads instead of highways while others prefer routes with no left-turns. This is one reason to assume that trip makers often choose different routes between the same OD-pair and not only the shortest (fastest) path.
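The all-or-nothing principle can be sketched as below: for each OD-pair, the route with the least total impedance is found and the entire demand is loaded onto its links. The small network, its impedances and the demand are made-up values, and the shortest route search is a plain uniform-cost (Dijkstra-type) search chosen for brevity.

```python
import heapq


def shortest_route(links, origin, destination):
    """Return (route, impedance) for the least-impedance route.

    links: dict mapping (from_node, to_node) to a non-negative impedance c_a.
    """
    adjacency = {}
    for (u, v), cost in links.items():
        adjacency.setdefault(u, []).append((v, cost))
    queue = [(0.0, [origin])]
    visited = set()
    while queue:
        cost, route = heapq.heappop(queue)
        node = route[-1]
        if node == destination:
            return route, cost
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link_cost in adjacency.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (cost + link_cost, route + [neighbour]))
    return None, float("inf")


def all_or_nothing(links, demand):
    """Assign the whole demand of every OD-pair to its shortest route."""
    flows = {link: 0.0 for link in links}
    for (origin, destination), trips in demand.items():
        route, _ = shortest_route(links, origin, destination)
        if route is None:
            continue  # no route connects this OD-pair
        for link in zip(route, route[1:]):
            flows[link] += trips
    return flows


# Hypothetical network: A-B-D has impedance 8, A-C-D has impedance 9.
links = {("A", "B"): 4.0, ("B", "D"): 4.0, ("A", "C"): 3.0, ("C", "D"): 6.0}
flows = all_or_nothing(links, {("A", "D"): 100.0})
```

In this example all 100 trips end up on the links A-B and B-D, while the only slightly longer route via C receives no flow at all, which illustrates why the method ignores both congestion and differences in perception.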

Traffic assignment models that take congestion into account often use volume-delay functions. These functions describe how the travel impedance (travel time) on a link is affected by the flow on the link, $c_a(v) = c_a(v_a)$. A commonly used method that takes congestion into account is Wardrop’s equilibrium, or user equilibrium, which is the result of assigning travel demand according to Wardrop’s first criterion, stated below:

”The journey times on all the routes actually used are equal, and less than those which would be experienced by a single vehicle on any unused route.” (Wardrop, 1952, p. 345)

Another method that takes congestion into account is stochastic user equilibrium (S-U-E). S-U-E is a modified version of Wardrop’s user equilibrium, which Daganzo and Sheffi (1977) state as follows:

“In a S-U-E network no user believes he can improve his travel time by unilaterally changing routes.” (Daganzo and Sheffi, 1977, p. 255)

Compared to both all-or-nothing assignment and user equilibrium, S-U-E takes into account the trip makers’ differences in perception of what the best route is. This is modelled by adding an error term with some distribution to the travel impedance. For instance, if the travel impedance is represented by the travel time, the trip maker has a perceived travel time for each route in the choice set. The perceived travel time is the sum of the measured travel time, which is equal for all trip makers, and a random error term which varies between trip makers. In the S-U-E, each trip maker chooses the route with the smallest perceived travel time.
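The idea of perceived travel times can be illustrated with a small simulation. The sketch only covers the perception part, with fixed measured travel times, so it does not compute an equilibrium with congestion feedback. The Gaussian error term, the standard deviation and the travel times are assumptions made for illustration.

```python
import random


def simulate_route_shares(measured_times, sigma, draws=10_000, seed=0):
    """Estimate route shares when each trip maker picks the route with the
    smallest perceived travel time (measured time + random error term)."""
    rng = random.Random(seed)
    counts = {route: 0 for route in measured_times}
    for _ in range(draws):
        # One draw corresponds to one trip maker with an individual perception.
        perceived = {
            route: time + rng.gauss(0.0, sigma)
            for route, time in measured_times.items()
        }
        chosen = min(perceived, key=perceived.get)
        counts[chosen] += 1
    return {route: count / draws for route, count in counts.items()}


# Hypothetical measured travel times in minutes for two routes.
shares = simulate_route_shares({"route 1": 20.0, "route 2": 22.0}, sigma=3.0)
```

Unlike all-or-nothing assignment, the slower route still receives a share of the trip makers, and this share shrinks as the error term becomes smaller relative to the difference in measured travel time.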


The last type of traffic assignment model is the pure stochastic assignment method, which, in contrast to S-U-E, does not take congestion into account. One example of this kind of method can be found in Dial (1971).

The result of the fourth and last step of the four step model is an assignment of the forecasted trips to each route r ($h_r$), which in turn leads to a flow on each link a in the network ($x_a$).

2.2 Route choice set generation

When modelling the route choice of travelers in a network, there can exist, if looping is not prohibited, an infinite number of alternative routes that connect an OD-pair. The set of all routes that connect an OD-pair is called the universal choice set, and it is not possible to determine in full. Instead of using the universal choice set, a smaller subset of possible routes is needed in order to make it possible to model the route choice. Only a finite number of routes are actually considered by travelers, since overly circuitous routes will never be a reasonable choice for any traveler. This leaves the problem of deciding which of the possible routes between an OD-pair are in fact considered by any traveler. According to Bekhor et al. (2006), this problem can be specified as identifying algorithm rules for generating plausible routes that represent actual, observed route choices. This means that the algorithm rules should be able to reflect the travelers’ knowledge of the network and their perception of the value of travel time and other network specific parameters.

There are two types of algorithms that can be used for choice set generation, deterministic and stochastic. These two types of algorithms are described below.

2.2.1 Deterministic approach

If the route choice set is generated using a deterministic approach, the chosen algorithm will always generate the same subset of route choices. Many of the deterministic algorithms used are, according to Frejinger (2008), based on some form of repeated shortest path search. Some algorithms allow repeated links, and some do not. In most cases there is nothing to gain from allowing cycles in a route when it comes to transportation, and therefore not allowing cycles is to be preferred when constructing a choice subset. There are two methods for eliminating cycles in a subset. The first is to construct the choice set, containing both cyclic and noncyclic routes, and then eliminate the cyclic routes. The second is to not allow cyclic routes to be constructed at all.


In this section three different types of deterministic algorithms will be presented. The first two algorithms are based on repeated shortest-path and the third one uses a branch-and-bound approach.

Link elimination: Azevedo et al. (1993) describe a deterministic approach where links from the shortest path are eliminated successively. First, the shortest path between an OD-pair is calculated with regard to a generalized cost and added to the route choice set. After this, one of the links that is part of the shortest path is removed from the set of available links. Thereafter, a new shortest path is calculated without the removed link, and this shortest path is then added to the route choice set. This process of removing links and calculating new shortest paths is repeated until the requested number of routes has been added to the route choice set.
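The link elimination procedure described above can be sketched as follows. The text does not specify which link of the shortest path is removed, so the sketch assumes that the link with the highest cost is removed; this rule, the small network and the function names are illustrative assumptions rather than the rule used by Azevedo et al. (1993).

```python
import heapq


def shortest_route(links, origin, destination):
    """Return (route, cost) of the cheapest route, or (None, inf) if none exists."""
    adjacency = {}
    for (u, v), cost in links.items():
        adjacency.setdefault(u, []).append((v, cost))
    queue = [(0.0, [origin])]
    visited = set()
    while queue:
        cost, route = heapq.heappop(queue)
        node = route[-1]
        if node == destination:
            return route, cost
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link_cost in adjacency.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (cost + link_cost, route + [neighbour]))
    return None, float("inf")


def link_elimination(links, origin, destination, max_routes):
    """Build a route choice set by repeatedly removing one link from the
    latest shortest route and searching again."""
    available = dict(links)
    choice_set = []
    while len(choice_set) < max_routes:
        route, cost = shortest_route(available, origin, destination)
        if route is None:
            break  # the OD-pair is no longer connected
        choice_set.append((route, cost))
        route_links = list(zip(route, route[1:]))
        if not route_links:
            break  # origin equals destination
        # Assumption: remove the costliest link of the latest shortest route.
        removed = max(route_links, key=lambda link: available[link])
        del available[removed]
    return choice_set


# Hypothetical network with three alternative routes from A to D.
links = {
    ("A", "B"): 4.0, ("B", "D"): 4.0,
    ("A", "C"): 3.0, ("C", "D"): 6.0,
    ("A", "D"): 10.0,
}
choice_set = link_elimination(links, "A", "D", max_routes=3)
```

Because removed links are never put back, every new route differs from all earlier routes, and the loop ends early if no connecting route remains.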

Constrained k-shortest paths: K-shortest path algorithms can be divided into two subgroups, one that allows repeated links and one that does not. One example of a K-shortest path algorithm that uses constraints to construct the choice set without allowing repeated links is presented in van der Zijpp and Catalano (2005). In addition to the constraints that prohibit repeated links, the algorithm also contains constraints that prohibit certain detours and overlapping alternative routes.

Branch-and-bound: Instead of using a repeated shortest path method, a branch-and-bound method can be used. These types of algorithms enumerate all routes that connect an OD-pair by generating a tree of routes that satisfy some constraints. When using this method it is crucial that the constraints are formulated so that the number of routes is limited. Prato and Bekhor (2009) present such an algorithm that uses constraints for detours, similarity in routes and number of left-turns, as well as a constraint that travel times may not be unreasonably high. The results presented in Prato and Bekhor (2009) indicate that a branch-and-bound method is more likely to generate a heterogeneous choice set than a shortest path algorithm.

2.2.2 Stochastic approach

In this section, two stochastic approaches for creating a choice set will be presented.

Simulation: Ramming (2001) presents a simulation method that produces alternative feasible routes by drawing generalized link costs from different probability distributions. The presented method draws the generalized costs using a Gaussian distribution for travel time perception. With the randomly drawn travel times, the shortest path is calculated and placed in the choice set. By repeating this process, it is possible to create a large choice set containing only feasible route choices.


In many cases it is possible to use simulation to create a choice set by using random costs in a deterministic shortest path method. This means that most of the deterministic approaches can be altered in order to create a stochastic method for constructing choice sets.
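A minimal sketch of the simulation approach is given below: link costs are drawn from a Gaussian distribution centred on the measured cost, a shortest route is computed for each draw, and every new route is added to the choice set. The relative standard deviation, the number of draws, the lower bound on the drawn costs and the network are assumptions made for illustration and are not taken from Ramming (2001).

```python
import heapq
import random


def shortest_route(links, origin, destination):
    """Return (route, cost) of the cheapest route, or (None, inf) if none exists."""
    adjacency = {}
    for (u, v), cost in links.items():
        adjacency.setdefault(u, []).append((v, cost))
    queue = [(0.0, [origin])]
    visited = set()
    while queue:
        cost, route = heapq.heappop(queue)
        node = route[-1]
        if node == destination:
            return route, cost
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link_cost in adjacency.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (cost + link_cost, route + [neighbour]))
    return None, float("inf")


def simulated_choice_set(links, origin, destination, draws=50,
                         rel_sigma=0.2, seed=0):
    """Collect the distinct shortest routes found under randomly drawn costs."""
    rng = random.Random(seed)
    routes = []
    for _ in range(draws):
        # Draw a perceived cost per link; the lower bound keeps costs positive,
        # which the shortest route search requires.
        perceived = {
            link: max(0.01, rng.gauss(cost, rel_sigma * cost))
            for link, cost in links.items()
        }
        route, _ = shortest_route(perceived, origin, destination)
        if route is not None and route not in routes:
            routes.append(route)
    return routes


# Hypothetical network with three alternative routes from A to D.
links = {
    ("A", "B"): 4.0, ("B", "D"): 4.0,
    ("A", "C"): 3.0, ("C", "D"): 6.0,
    ("A", "D"): 10.0,
}
routes = simulated_choice_set(links, "A", "D")
```

The number of distinct routes found depends on the variance of the drawn costs and on the number of draws, so the size of the choice set is not known in advance.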

Doubly stochastic choice set: Bovy and Fiorenzo-Catalano (2007) describe a method that is similar to the simulation approach. This method suggests that the generalized costs for the links should be replaced with utility functions and that the parameters and attributes of these functions are stochastic. Bovy and Fiorenzo-Catalano (2007) argue that by using utility functions rather than generalized cost functions, the preferences of many travelers can be represented. This should then make it possible to generate a route choice set that contains a sufficient variety in its composition and that reflects the biased perception of individual travelers. The method that Bovy and Fiorenzo-Catalano (2007) propose also includes a filtering step that excludes routes from the route choice set that do not fulfill some constraints.

2.3 Discrete route choice modelling

When trying to determine the route choice between two points in a road network, it is in general interesting to know the behaviour of travelers on an aggregated level. This aggregated behaviour, however, is the result of the decisions of different individuals. Ben-Akiva and Lerman (1985) argue that the modelling of individual behaviour is therefore, either implicitly or explicitly, at the core of all predictive models of aggregated behaviour.

Since the discrete route choice models try to describe the behaviour of individuals on an aggregated level, there will always be a margin of error. It is impossible to include all parameters that can affect each individual’s route choice. This also means that there is no universally accepted choice model that can be used to model route choice in a road network.

As described in Section 2.2, the first step of discrete route choice modelling is to create a relevant route choice set. These sets can be rather large and both relevant and irrelevant routes can be included. Prato and Bekhor (2007) explain that in case studies of European and American urban networks, up to 70 different routes are considered and that including 100 alternatives in the route choice set is regarded as common practice when performing traffic assignment. There have been studies of how the route choice set affects the model estimation of route choice. In Prato and Bekhor (2007), the importance of the size and composition of the route choice set with regard to model performance is illustrated. Furthermore, Frejinger (2007) shows that the universal route choice set is needed in order to obtain optimal estimation results.


Even though studies show that the size of the route choice set is important, there is no guarantee that the results of the route choice model will be improved with a larger route choice set. In a dense urban network with 70-100 alternative routes, there will be a high degree of similarity between routes, and this has to be taken into account in the route choice model. Prato and Bekhor (2007) explain that route choice models need to consider the correlation between the alternative routes and that this will alter the choice probabilities of overlapping routes. Further, Prato and Bekhor (2007) explain that the two models described in Section 2.1.3, multinomial logit and nested logit, are not suitable for this type of modelling, even though they are the most commonly used travel behaviour models. The reason is that the multinomial logit model does not take the similarity between alternatives into account. The problem with the nested logit model is that an alternative can only belong to one nest even though a set of links can be part of several different routes.

In the following section, a discrete choice model will be presented that can be used for route choice modelling.

2.3.1 C-Logit

In order to take the similarities of alternative routes into account, Cascetta et al. (1996) suggest a modified multinomial logit model. This model uses a commonality factor that represents a route’s degree of similarity with other routes in the generated route choice set. This model is called the C-Logit model and the probability P (k | C) of choosing route k from the given route choice set C is expressed as in Equation 8

$$P(k \mid C) = \frac{e^{V_k + \beta_{CF} CF_k}}{\sum_{l \in C} e^{V_l + \beta_{CF} CF_l}}. \tag{8}$$

In Equation 8, $V_k$ and $V_l$ are the utility functions and $CF_k$ and $CF_l$ are the commonality factors of routes k and l. $\beta_{CF}$ is a parameter that needs to be estimated.

The commonality factors can be defined in different ways and can describe different concepts of commonality between alternatives. A simple way of describing the commonality is to compare the shared length between routes. An example of this is presented in Equation 9

$$CF_k = \ln \sum_{l \in C} \left( \frac{L_{kl}}{\sqrt{L_k L_l}} \right)^{\gamma}, \tag{9}$$

where $L_k$ and $L_l$ are the lengths of routes k and l, $L_{kl}$ is the shared length of the two routes and $\gamma$ is a parameter. Other definitions are possible. For example, a weight can be introduced for each link that describes how large a part of the route the specific link is. Rather than just taking the shared length between routes into account, the weighted shared length is then compared.

According to Prato and Bekhor (2007), the major disadvantage of the C-logit is that only parts of the similarities between alternatives are compared and that there is no selection rule for choosing how to express the commonality factor.
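Equations 8 and 9 can be combined into a short sketch. The routes, link lengths, utilities and the values of γ and β_CF below are made up for illustration; in particular, the negative value of β_CF is an assumption chosen so that overlapping routes are penalized.

```python
import math


def commonality_factors(route_links, link_lengths, gamma=1.0):
    """CF_k = ln( sum_l (L_kl / sqrt(L_k * L_l)) ** gamma ), as in Equation 9.

    route_links: dict mapping a route id to the list of links it uses.
    link_lengths: dict mapping a link id to its length.
    """
    lengths = {
        route: sum(link_lengths[a] for a in links)
        for route, links in route_links.items()
    }
    factors = {}
    for k, links_k in route_links.items():
        total = 0.0
        for l, links_l in route_links.items():
            # Shared length L_kl; for l == k this equals L_k, so the term is 1.
            shared = sum(link_lengths[a] for a in set(links_k) & set(links_l))
            total += (shared / math.sqrt(lengths[k] * lengths[l])) ** gamma
        factors[k] = math.log(total)
    return factors


def c_logit(utilities, factors, beta_cf):
    """P(k | C) = exp(V_k + beta_CF * CF_k) / sum_l exp(V_l + beta_CF * CF_l)."""
    weights = {
        k: math.exp(utilities[k] + beta_cf * factors[k]) for k in utilities
    }
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Hypothetical choice set: r1 and r2 share link "a", r3 is fully separate.
route_links = {"r1": ["a", "b"], "r2": ["a", "c"], "r3": ["d"]}
link_lengths = {"a": 2.0, "b": 2.0, "c": 2.0, "d": 4.0}
factors = commonality_factors(route_links, link_lengths)
probabilities = c_logit({"r1": -4.0, "r2": -4.0, "r3": -4.0}, factors, beta_cf=-1.0)
```

Although all three routes have the same utility, the two overlapping routes each receive a lower probability than the independent route, which is the correction that the multinomial logit model lacks.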

2.4 Comparison of link flows

When modelling traffic, there will be a need to calibrate and validate the developed model to ensure that the model is not producing unrealistic or misleading results. Since a model by definition is a simplification of reality, no model can reproduce reality perfectly. Instead, it is important to calibrate the model and validate that the output is a good enough representation of the reality.

Since it is common to measure traffic in the form of number of cars passing a specific point, it is also common to calibrate and validate traffic models using these traffic flow measurements. The modelled link flows are compared with the observed link flows and the aim in the calibration process is to make the difference between these values as small as possible. According to Balakrishna et al. (2007), using link flows when calibrating a model can be ineffective if the used road network consists of several functional road classes (FRC). In Wisconsin Department of Transportation (2014), an example of the problem with this is presented as follows:

For example, the mainline of a freeway might carry 5000 vehicles per hour, while one of the freeway’s on-ramps carries only 50 vehicles per hour. In that situation, it would not be possible to select a single percentage that can be used as a model acceptance criterion for both volumes. For example, setting a volume tolerance of 5 % would permit a modeled mainline flow of 5000 ± 250 vehicles, which would be very lenient compared to the ramp tolerance of 50 ± 3 vehicles. (Wisconsin Department of Transportation, 2014)

In order to overcome this potential problem, a comparison method that takes the percent error relative to the mean of the observed and modeled values into account can be used. According to Balakrishna et al. (2007), the GEH statistic (Geoffrey E. Havers’ statistic, named after its inventor) deals with this potential problem and can be used when comparing link flows. The GEH value is calculated as in Equation 10

$$\mathrm{GEH} = \sqrt{\frac{2\,(Y_n^s - Y_n^o)^2}{Y_n^s + Y_n^o}}, \tag{10}$$

where $Y_n^s$ is the modeled traffic volume and $Y_n^o$ is the observed traffic count.

When using the GEH value to calibrate a model, there are certain rules of thumb that can be applied to measure how well the model is performing. These guidelines are presented in Table 2.

Table 2: Rules of thumb for comparing traffic flows using the GEH value

| Criteria | Meaning |
|---|---|
| GEH < 5 | Acceptable fit |
| 5 ≤ GEH ≤ 10 | Possible model error or bad data |
| GEH > 10 | High probability of modelling error or bad data |

There are also rules of thumb that can be used as acceptance criteria for a model as a whole. In Wisconsin Department of Transportation (2014), there are two acceptance criteria that can be used as guidelines when calibrating a model using GEH values. These criteria are mainly connected to modelling flows on highways and are presented in Table 3.

Table 3: Acceptance criteria for modeled hourly flows compared with observed flows for two different road types

| Criteria | Acceptance targets |
|---|---|
| GEH < 5 | At least 85 % of freeway and arterial mainline links |
| GEH < 5 | At least 85 % of entrance and exit ramps |
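Equation 10 and the acceptance targets can be sketched in a few lines. The function names and the flow pairs are illustrative; the first two pairs reuse the mainline and ramp volumes from the quoted example, with the modeled flow placed at the upper end of the tolerance.

```python
import math


def geh(modeled, observed):
    """GEH statistic for one link, with flows in vehicles per hour."""
    if modeled + observed == 0:
        return 0.0  # both flows are zero, so there is no deviation
    return math.sqrt(2.0 * (modeled - observed) ** 2 / (modeled + observed))


def share_below(pairs, threshold=5.0):
    """Share of (modeled, observed) pairs with a GEH value below the threshold."""
    values = [geh(modeled, observed) for modeled, observed in pairs]
    return sum(value < threshold for value in values) / len(values)


mainline = geh(5250.0, 5000.0)  # modeled 5 % above the observed mainline flow
ramp = geh(53.0, 50.0)          # modeled 3 vehicles above the observed ramp flow
share = share_below([(5250.0, 5000.0), (53.0, 50.0), (900.0, 500.0)])
```

For these made-up flows the mainline deviation gives a GEH value of roughly 3.5 and the ramp deviation roughly 0.4, so both count as an acceptable fit, while the third pair gives a value above 10. A model would meet the targets in Table 3 if the share returned by share_below is at least 0.85 for each road type.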


Route flow estimation method

The aim of this thesis is to develop a route flow estimation method which uses the same input and output as the fourth step in the four step model (presented in Section 2.1). However, there are differences between the developed method and the fourth step in the four step model. The biggest difference is that the developed method cannot be used to make traffic forecasts, which the four step model can. Another difference is that the developed method uses measured historical travel times in order to estimate the route flows, while the fourth step predicts the travel times based on link flows. The first three steps of the four step model (trip generation, trip distribution and mode choice) are not part of the developed method. Instead, the result of these steps, which is an OD-matrix usually divided by travel mode, is assumed to be given. According to the delimitations set in Section 1.3, the OD-matrix only describes the trips made by cars and the OD-matrix will not be calibrated. In order to model another travel mode, an OD-matrix for that specific travel mode is needed.

Although the first three steps are not included in the method, the output from these steps can be used as input to the developed method. To accomplish this, the same prerequisites used in the four step model are needed in the developed method. First, the method has to be applied to a specific area which has to be divided into zones. The area also needs an associated transportation system, which in our case is a road network for cars. The method also has to be developed for a specific time period. In this chapter, these prerequisites are described, followed by a description of the link observations that are going to be used in the validation of the developed method. The chapter concludes with an overview of the developed route flow estimation method and a section about the implementation tools used.


3.1 Area

In order to choose a specific area, it is first necessary to specify what would make an area interesting to investigate. Since this thesis deals with how travelers make their choice of route, it is more interesting to investigate an area where the route choice is not too obvious. When there is congestion in a road network, the travel times can vary over time. This means that the best choice of route, in terms of travel time, can also vary over time. Congestion is more common in urban areas and therefore it is preferred to study such an area in this thesis. Furthermore, it is interesting to investigate an area where a lot of traffic data is available. It is also preferable if there are multiple data sources, so that differences between the sources can be determined.

With these rules for selection of area, Stockholm County was chosen as the area to be investigated. In this area, there are parts of the road network where link observations exist and for some parts of the road network, historical travel times are available. A map of the chosen area is presented in Figure 4.


Figure 4: Overview of the investigated area, Stockholm County

In this thesis, a smaller area is chosen to be examined more thoroughly. In this area, link flows will be compared on major roads, and it is therefore important that traffic flow data is available in the chosen area. In a Stockholm case study presented in Ding et al. (2014), an area of interest is presented that fits the requirements of this thesis. This area is called Brunnsviken and is located north of Stockholm’s inner city and contains the junction between the European routes E4, E18 and E20. In the case study it is mentioned that the route choice between these motorways is interesting, either when entering or leaving the inner city or when going through the city. Furthermore, the route choice is also of interest since there is congestion on these roads and since it is not obvious which route has the shortest travel time. The location of Brunnsviken and a closer look at the area are presented in Figure 5, which displays an OD-pair where the route choice is not obvious.


Figure 5: An OD-pair where the route choice is not too obvious

Although the area around Brunnsviken is chosen to be examined closer, the whole area of Stockholm County is included in the model. This is because there are travelers that pass through this area when they are going to, from or through the city. Furthermore, travelers that pass through the Stockholm County area also have to be considered. In order to model the traffic demand between zones, the area has to be divided into smaller areas, called zones. When modelling an area in Sweden, the zoning is often based on SAMS (see Section 2.1). Stockholm County is divided into 890 SAMS-zones, which can be seen in Appendix A.

The number of OD-pairs between zones is the square of the number of zones. With 890 SAMS-zones, there are thus 792 100 OD-pairs between the SAMS-zones. In a route choice model, every OD-pair needs to be considered, both for generating the route choice sets and for generating link flows. In order to reduce the computational time, the number of zones was decreased compared to the number of SAMS-zones.

The first step in decreasing the number of zones was to aggregate the SAMS-zones based on the municipalities of Stockholm County. The zoning of the 26 municipalities in Stockholm County can be seen in Appendix A. Since the area around Brunnsviken is going to be examined closer, the areas in the outskirts of Stockholm County can be merged together if we assume that this can be done without affecting the resulting flows in Brunnsviken. Therefore, some of the municipalities in the outskirts were aggregated. The municipalities closer to Brunnsviken, such as Stockholm Municipality, Solna Municipality and Sundbyberg Municipality, were divided into smaller areas. Furthermore, the area around Brunnsviken was divided into zones corresponding to the SAMS-zones, since the trips to and from zones in this area affect the link flows that are going to be investigated. The outcome of this aggregation of the 890 SAMS-zones was 50 zones, which we call VM-zones (ValerieMikael-zones, named after the authors of this thesis). Consequently, every VM-zone corresponds to one or more SAMS-zones. The VM-zones can be seen in Figure 6.

Figure 6: The VM-zones
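The effect of the aggregation on the number of OD-pairs can be checked with a short calculation. The sketch follows the convention in the text that the number of OD-pairs is the square of the number of zones, so pairs with the same origin and destination zone are counted as well.

```python
def od_pair_count(number_of_zones):
    """Number of OD-pairs when every zone can be both origin and destination."""
    return number_of_zones ** 2


sams_pairs = od_pair_count(890)  # SAMS-zones in Stockholm County
vm_pairs = od_pair_count(50)     # VM-zones after the aggregation
reduction_factor = sams_pairs / vm_pairs
```

Going from 890 SAMS-zones to 50 VM-zones reduces the number of OD-pairs from 792 100 to 2 500, that is, by a factor of roughly 317.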

3.2 Road network

In order to perform a network loading and calculate link flows, a road network for the selected travel mode with known links and nodes is needed. This thesis mainly aims to calculate link flows on the roads in the Brunnsviken area. These roads are highways and are rather large compared to small residential roads. With this in mind, the road network used only needs to contain links that describe rather large roads.

The road network used was based on a road network with data from NVDB. NVDB is administered by Trafikverket (The Swedish Transport Administration) and the data associated with the road network is highly detailed, which means that each link has many attributes. The road network with data from NVDB was modified in the following steps:

1. Link removal. All the small residential links were removed since these links are not of interest in this thesis. The links kept were the links with FRC between 0 and 7.

2. Unidirectional links. In order to be able to calculate link flows in both directions of a link, only unidirectional links are needed. Therefore, each bidirectional link was replaced by two unidirectional links, one in each direction, where the second link has the start and end nodes as well as the geometry reversed.

3. Insert of speed limit. For the following types of links, the speed limit attribute was missing and was inserted:

• Car ferry lines: The speed limit was set to the length of the link divided by the travel time of the ferry. The travel time of the ferry was extracted from Trafikverket Färjerederiet (2016).

• Small links (about 1 centimeter long): The speed limit was set to the speed limit of the neighboring links (all neighboring links had the same speed limit).

4. Insert of cost. The cost of each link was set to the free flow travel time (length of the link / speed limit). The only links still lacking a speed limit were the boundary links that lead in to or out from the road network covering Stockholm County. These links are not supposed to be used when modelling the routes in the Stockholm County area and therefore the cost of these links was set to a large value (1 000 000).
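The link removal, the split into unidirectional links and the cost calculation in the steps above can be sketched as follows. The dictionary keys, the units (metres and metres per second) and the example links are hypothetical and do not reflect the actual NVDB attribute names.

```python
BOUNDARY_COST = 1_000_000  # large cost for links without a speed limit


def keep_major_links(links, max_frc=7):
    """Step 1: keep only links with a functional road class from 0 to max_frc."""
    return [link for link in links if 0 <= link["frc"] <= max_frc]


def to_unidirectional(links):
    """Step 2: replace each bidirectional link by two unidirectional links."""
    result = []
    for link in links:
        result.append(dict(link, bidirectional=False))
        if link["bidirectional"]:
            # The second link has start node, end node and geometry reversed.
            result.append(dict(
                link,
                start=link["end"],
                end=link["start"],
                geometry=list(reversed(link["geometry"])),
                bidirectional=False,
            ))
    return result


def free_flow_cost(link):
    """Step 4: free flow travel time, or a large cost if no speed limit exists."""
    if link["speed_limit"] is None:
        return BOUNDARY_COST
    return link["length"] / link["speed_limit"]


# Hypothetical links: a major two-way road, a residential road, a boundary link.
raw_links = [
    {"id": 1, "start": "A", "end": "B", "length": 1000.0, "speed_limit": 20.0,
     "frc": 3, "bidirectional": True, "geometry": [(0, 0), (1, 0)]},
    {"id": 2, "start": "B", "end": "C", "length": 200.0, "speed_limit": 8.0,
     "frc": 8, "bidirectional": True, "geometry": [(1, 0), (1, 1)]},
    {"id": 3, "start": "B", "end": "X", "length": 500.0, "speed_limit": None,
     "frc": 0, "bidirectional": False, "geometry": [(1, 0), (2, 0)]},
]
network = to_unidirectional(keep_major_links(raw_links))
costs = [free_flow_cost(link) for link in network]
```

In this example the residential link is dropped, the two-way road becomes two links with a free flow travel time of 50 seconds each, and the boundary link receives the large cost so that it is effectively never used in a shortest route.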


Figure 7: The road network of links with FRC between 0 and 7

3.3 Time period

Since it is interesting to study how the traffic changes over time, it would be preferable to develop a method that generates time-dependent route choice sets and route shares for each day and each time of the day. However, in order to reduce the computational time, the method was developed for an average traffic day and for a limited time period of the day. In Section 3.1, it was described that one reason for the choice of the investigated area was that the travel times can vary over time in this area. Thus, it is important to choose a time period of the day where the travel time varies. For instance, the travel times vary when going from free flow traffic to congestion. This kind of situation occurs just before and after the morning peak hour, in what is called the extended morning peak period. Kristoffersson and Engelson (2008) used the time period 06:30-09:30 to model the extended morning peak period, which was also chosen as the investigated time period in this thesis.


3.4 Link observations

In order to evaluate the developed method, the resulting link flows will be validated against link observations. The link observations used are collected from the examined area around Brunnsviken since the aim is to make the method produce the ”best” results in this area. The link observations, also called traffic counts, are collected from the radar detectors that belong to the Motorway Control System (MCS). The locations of the radar detectors that the traffic counts are collected from can be seen in Figure 8.

Figure 8: The location of the radar detectors (with their individual ID number) that the traffic counts are collected from

Since the model was developed for an average traffic day, traffic counts were extracted from the detectors for each minute during the time period 06:30-09:30 for the weekdays Tuesday to Thursday in the period 01-09-2015 to 01-10-2015. An average of the traffic counts was then calculated for each minute and each detector. These average traffic counts were finally aggregated into 15-minute intervals.
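The averaging and aggregation of the traffic counts for one detector can be sketched as below. The text does not state how the 15-minute values are expressed, so the conversion to vehicles per hour is an assumption, as are the data layout (minutes counted from 06:30) and the made-up counts.

```python
from collections import defaultdict


def average_per_minute(counts_by_day):
    """Average the counts for each minute over all days.

    counts_by_day: list with one dict per day, mapping the minute index
    (minutes since 06:30) to the number of vehicles counted in that minute.
    """
    totals = defaultdict(float)
    days = defaultdict(int)
    for day in counts_by_day:
        for minute, count in day.items():
            totals[minute] += count
            days[minute] += 1
    return {minute: totals[minute] / days[minute] for minute in totals}


def aggregate_to_intervals(minute_averages, interval=15):
    """Sum the minute averages per interval and express the result in veh/h."""
    sums = defaultdict(float)
    for minute, value in minute_averages.items():
        sums[minute // interval] += value
    # Assumption: flows are reported as hourly rates, as in veh/h.
    return {index: total * 60 / interval for index, total in sorted(sums.items())}


# Hypothetical counts for two days and the first 30 minutes after 06:30.
day_one = {minute: 50 for minute in range(30)}
day_two = {minute: 60 for minute in range(30)}
flows = aggregate_to_intervals(average_per_minute([day_one, day_two]))
```

Interval 0 then corresponds to 06:30-06:45 and interval 1 to 06:45-07:00. With an average of 55 vehicles per minute, both intervals get a flow of 3 300 veh/h.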


The traffic counts are based on all types of vehicles that pass the detectors. Since the resulting link flows from the method are only for cars, we want to calibrate and validate these flows against traffic counts for cars only. According to Trafikverket (2016a), 86 % of all vehicles on the European routes are private cars. This share is a template value calculated from the Swedish Transport Administration’s traffic count system. Based on this, the traffic counts from the detectors are set to traffic count · 0.86. Furthermore, since the resulting link flows from the method only consider vehicles that travel within Stockholm County, the vehicles that pass through Stockholm County must be taken into consideration when comparing these link flows with traffic counts. To do this, the vehicles that only pass through the area have to be removed from the traffic counts. The Swedish Institute for Transport and Communications Analysis (SIKA) predicts that in 2020, 163 private cars out of approximately 400 000 vehicles in the Annual Average Weekday Traffic (AAWT) pass through Stockholm County (SIKA, 2007). Therefore we can assume that 163/400 000 ≈ 0.041 % of the traffic only passes through Stockholm County and the Brunnsviken area. Based on this, the traffic counts from the detectors located on the E4 are set to traffic count · 0.86 · 0.99959.
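The two correction factors can be reproduced with a short calculation. The constant and function names are hypothetical, and the raw count in the example is made up.

```python
CAR_SHARE = 0.86                # share of private cars (Trafikverket, 2016a)
THROUGH_SHARE = 163 / 400_000   # share of through traffic (SIKA, 2007)


def adjusted_count(raw_count, on_e4):
    """Scale a raw traffic count to private cars travelling within the county.

    The through traffic correction is only applied to detectors on the E4.
    """
    factor = CAR_SHARE
    if on_e4:
        factor *= 1.0 - THROUGH_SHARE
    return raw_count * factor


e4_count = adjusted_count(4000.0, on_e4=True)
other_count = adjusted_count(4000.0, on_e4=False)
```

The through traffic share is about 0.041 %, so the corresponding factor is about 0.99959 and the correction for through traffic changes the count by less than two vehicles in this example, whereas the car share removes 14 % of the raw count.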

An example of the resulting traffic counts (in veh/h) that were used is presented in Table 4.

Table 4: An example of the traffic counts used

| Sensor id | Flow (veh/h) | Time period |
|---|---|---|
| 3 | 3 359 | 06:30-06:45 |
| 3 | 3 281 | 06:45-07:00 |
| 3 | 2 712 | 07:00-07:15 |

In addition to the vehicles that pass through Stockholm County, the vehicles that have only their origin or only their destination within Stockholm County also affect the link observations and should be removed. However, as a delimitation in this thesis, these vehicles are not removed from the link observations.

3.5 Overview of the route flow estimation method

An overview of the different modules in the developed route flow estimation method can be seen in Figure 9. The input to the method is the VM-zones which were presented in Section 3.1.


Figure 9: Overview of the route flow estimation method

A short description of how the different modules in the developed method work is presented below:

OD-matrix generation. In this module, a matrix for the peak hour is aggregated based on the VM-zones. Some of the OD-pairs that are not relevant for the studied area are removed from the matrix in order to reduce the computational time. Lastly, the peak hour matrix for the VM-zones is time-sliced into 12 time periods between 06:30 and 09:30. The result of this module is a time-sliced OD-matrix for the VM-zones.

Route choice set generation. In this module, up to three routes between VM-zones are extracted using a set of coordinates that describe the start and end point of each OD-pair. Routes are extracted for each of the 12 time periods. The result of this module is a set of time-dependent route choice sets with travel times included.

Traffic assignment. In this module, the demand from the time-sliced OD-matrix is assigned to the road network. The inputs to this module are the time-sliced OD-matrix and the generated route choice sets. For each time period and OD-pair, the distribution (route shares) of the demand between the routes is decided. The demand for each OD-pair is then loaded onto the routes in the route choice set according to the route shares. With all demand loaded onto the road network, the total link flows can be calculated from the route flows. The result of this module is a set of time-dependent link flows and route flows.
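The loading step of the traffic assignment module can be sketched as follows. This is a minimal illustration and not the implementation used in the thesis. The data layout (dictionaries keyed by time period and OD id, routes as lists of link ids) and all names are assumptions, the route shares are taken as given, and each route flow is added to its links in the departure period without tracking travel time along the route.

```python
from collections import defaultdict


def load_network(demand, route_sets, route_shares):
    """Distribute OD demand over routes and sum the route flows per link.

    demand:       {(period, od_id): demand in veh/h}
    route_sets:   {(period, od_id): [route, ...]}, each route a list of link ids
    route_shares: {(period, od_id): [share, ...]}, aligned with route_sets
    """
    route_flows = {}
    link_flows = defaultdict(float)
    for key, od_demand in demand.items():
        period = key[0]
        routes_and_shares = zip(route_sets[key], route_shares[key])
        for idx, (route, share) in enumerate(routes_and_shares):
            flow = od_demand * share
            route_flows[key + (idx,)] = flow
            # Every link on the route carries the full route flow.
            for link in route:
                link_flows[(period, link)] += flow
    return route_flows, dict(link_flows)


# Toy data: one period, one OD-pair, two routes sharing link "a".
demand = {(0, 1): 100.0}
route_sets = {(0, 1): [["a", "b"], ["a", "c"]]}
route_shares = {(0, 1): [0.75, 0.25]}
route_flows, link_flows = load_network(demand, route_sets, route_shares)
```

In the toy data, link "a" lies on both routes and therefore carries the whole OD demand, while links "b" and "c" carry only the flow of their own route.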

3.6 Implementation

To implement the method, two main tools were used. Since a large amount of data had to be handled, a way of storing this data was needed. For this purpose, a Structured Query Language (SQL) database in the form of a PostgreSQL database was used. PostgreSQL was chosen because it supports the PostGIS extension, which provides geospatial operations and methods. This makes it easy to both illustrate and process data associated with the VM-zones and the road network. With a PostgreSQL database, it is also possible to use other extensions that calculate the shortest path between points in the road network.

In order to retrieve, process and store the data, the Python programming language was used. Python has a wide variety of libraries, both for communicating with the PostgreSQL database and for communicating with external databases using HTTP (Hypertext Transfer Protocol) requests.


4 OD-matrix generation

In order to perform a network loading in the road network, an OD-matrix which includes the trips between each VM-zone is needed. In this chapter, the OD-matrix generation module of the route flow estimation method is described in detail.

4.1 OD-matrix for the peak hour

Since the estimation of an OD-matrix was not a part of this thesis, an existing OD-matrix for the peak hour between 07:00 and 08:00 was used. The OD-matrix was generated using Sampers, which according to Trafikverket (2016b) is the national model system for passenger transport. The OD-matrix for the peak hour covers the demand for all Sampers-zones in Stockholm County and Mälardalen. The demand describes the annual average daily traffic (AADT).

The OD-matrix was reduced to cover only the Sampers-zones in Stockholm County. This was done by removing all OD-pairs that had the origin, the destination or both outside Stockholm County. Each Sampers-zone is connected to a SAMS Sampers-zone, and each SAMS Sampers-zone can consist of several Sampers-zones. Since the geometry of the Sampers-zones was known, a translation table between Sampers-zones and VM-zones could be made. By using this translation table, the demand was aggregated for the VM-zones.
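The reduction and aggregation can be sketched in a few lines of Python. The sketch assumes that the translation table maps every Sampers-zone in Stockholm County to exactly one VM-zone and that zones outside the county are simply absent from the table. The function name, the zone ids and the demand values are hypothetical.

```python
from collections import defaultdict


def aggregate_to_vm_zones(sampers_demand, sampers_to_vm):
    """Sum Sampers-zone demand into VM-zone demand.

    sampers_demand: {(origin_zone, destination_zone): demand}
    sampers_to_vm:  {sampers_zone: vm_zone}; zones outside the county are absent
    """
    vm_demand = defaultdict(float)
    for (origin, destination), trips in sampers_demand.items():
        # Skip OD-pairs with the origin, the destination or both outside the county.
        if origin not in sampers_to_vm or destination not in sampers_to_vm:
            continue
        vm_pair = (sampers_to_vm[origin], sampers_to_vm[destination])
        vm_demand[vm_pair] += trips
    return dict(vm_demand)


# Hypothetical translation table and demand; zone 999 lies outside the county.
sampers_to_vm = {101: 34, 102: 34, 201: 43}
sampers_demand = {(101, 201): 3.0, (102, 201): 2.0, (101, 102): 4.0, (101, 999): 7.0}
vm_demand = aggregate_to_vm_zones(sampers_demand, sampers_to_vm)
```

Note that two Sampers-zones belonging to the same VM-zone produce an intrazonal VM OD-pair, here (34, 34).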

4.1.1 OD-pair reduction

OD-pairs that do not have any routes passing Brunnsviken cannot contribute to the flow on the links in the area and can therefore be excluded from the OD-matrix. This was done by adding a White list column to the OD-matrix, where the OD-pairs included in the matrix were set to True and the excluded ones were set to False. Since only the interzonal traffic (traffic between zones) was taken into consideration in the model, the intrazonal traffic (traffic within a zone) was excluded by setting the OD-pairs with the same zone as origin and destination to False. In Table 5, a part of the OD-matrix for the peak hour is presented.

Table 5: A part of the OD-matrix for the peak hour between 07:00 and 08:00

OD id    VM oid    VM did    Demand (veh/h)    White list
1        34        34        367               FALSE
2        34        43        1                 TRUE
3        34        32        165               TRUE
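The white-list logic can be sketched as below, using the three OD-pairs from Table 5. How the set of OD-pairs with routes passing Brunnsviken is determined is not described at this point, so the sketch takes it as a given input. The function and variable names are hypothetical.

```python
def build_white_list(od_pairs, pairs_passing_area):
    """Return {od_id: bool}; True means the OD-pair is kept in the matrix."""
    white_list = {}
    for od_id, (origin, destination) in od_pairs.items():
        # Intrazonal pairs (same origin and destination zone) are always excluded.
        is_interzonal = origin != destination
        passes_area = (origin, destination) in pairs_passing_area
        white_list[od_id] = is_interzonal and passes_area
    return white_list


# OD ids and zone ids from Table 5; the set of passing pairs is an assumption.
od_pairs = {1: (34, 34), 2: (34, 43), 3: (34, 32)}
pairs_passing_area = {(34, 43), (34, 32)}
white_list = build_white_list(od_pairs, pairs_passing_area)
```

With this input, OD-pair 1 is excluded because it is intrazonal, while OD-pairs 2 and 3 are kept, which matches the White list column in Table 5.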

4.2 Time-slicing

Since the aim of this thesis is to generate time-dependent route flows for the routes between each OD-pair, a time-sliced OD-matrix is used. The time-sliced OD-matrix consists of several OD-matrices, one for each time period. Bång et al. (2014) state that time-slicing for trips that start in the investigated area can advantageously be based on surveys where the respondents specify the start time of their trip. However, for trips that start further away from the investigated area, it is more relevant to use traffic counts in the investigated area in order to divide the demand into different time periods. According to Bång et al. (2014), usually one or two different time-slicing distributions are used for the investigated area, but more distributions can be used for different parts of the modelled area. Since it is a complex task to time-slice the demand in accordance with reality, Bång et al. (2014) suggest that a sensitivity analysis can be made where different time-slicing distributions are tested.

Trafikverket (2012b) is an instruction guide for micro and meso traffic models that gives some recommendations regarding time-slicing of an OD-matrix. One recommendation is to use time periods of 15 minutes as a rule of thumb. Another recommendation is that the distribution of demand between the time periods should be based on the variation of demand, which can be described by data from traffic counts in the investigated area. Trafikverket (2012b) points out that it is important to have traffic counts that actually reflect the demand and not only the capacity of the road. This is achieved by placing the measurement points upstream of where congestion occurs and by considering traffic counts for time periods before and after the congested period. An example application of the recommendations in Trafikverket (2012b) is presented in Trafikverket (2012a). In the example, the time period 06:30-08:30 is divided into eight 15-minute intervals and the demand for each interval is expressed as a share of the peak hour. For instance, the time period 07:30-07:45 has a share of 110 % of the peak hour, which means that the demand for this interval is 1.1 · the peak hour's demand (in veh/h).
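The share-based time-slicing in this example can be sketched as follows. Only the 110 % share for 07:30-07:45 is taken from the text. The share for 07:15-07:30 is a hypothetical placeholder, the peak hour demand of 165 veh/h is borrowed from OD-pair 3 in Table 5, and the function names are assumptions. Since the shares scale a demand rate in veh/h, a second helper converts the rate to the number of vehicles in a 15-minute period.

```python
def time_slice(peak_hour_demand, shares):
    """Return the demand rate (veh/h) per time period for one OD-pair.

    shares: {period_label: share of the peak hour's demand}, where 1.1 means 110 %
    """
    return {period: share * peak_hour_demand for period, share in shares.items()}


def vehicles_in_period(rate_veh_per_hour, period_minutes=15):
    """Convert a demand rate in veh/h to the number of vehicles in one period."""
    return rate_veh_per_hour * period_minutes / 60.0


# Only the 1.1 share is from the text; the 1.0 share is a hypothetical placeholder.
shares = {"07:15-07:30": 1.0, "07:30-07:45": 1.1}
rates = time_slice(165.0, shares)
vehicles = vehicles_in_period(rates["07:30-07:45"])
```

Keeping the sliced demand as a rate in veh/h makes it directly comparable with the peak hour matrix and with traffic counts reported in veh/h.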