
Linköpings universitet SE–581 83 Linköping

Linköping University | Department of Computer and Information Science

Master’s thesis, 30 ECTS | Datateknik

2021 | LIU-IDA/LITH-EX-A--2021/001--SE

System Architecture for Positioned Data Collection

An investigation and implementation of a system architecture for positioned data collection with focus on indoor environments and Android smartphones


Adrian Royo

Supervisor: Chih-Yuan Lin

Examiner: Kristian Sandahl


Upphovsrätt

Detta dokument hålls tillgängligt på Internet - eller dess framtida ersättare - under 25 år från publiceringsdatum under förutsättning att inga extraordinära omständigheter uppstår.

Tillgång till dokumentet innebär tillstånd för var och en att läsa, ladda ner, skriva ut enstaka kopior för enskilt bruk och att använda det oförändrat för ickekommersiell forskning och för undervisning. Överföring av upphovsrätten vid en senare tidpunkt kan inte upphäva detta tillstånd. All annan användning av dokumentet kräver upphovsmannens medgivande. För att garantera äktheten, säkerheten och tillgängligheten finns lösningar av teknisk och administrativ art.

Upphovsmannens ideella rätt innefattar rätt att bli nämnd som upphovsman i den omfattning som god sed kräver vid användning av dokumentet på ovan beskrivna sätt samt skydd mot att dokumentet ändras eller presenteras i sådan form eller i sådant sammanhang som är kränkande för upphovsmannens litterära eller konstnärliga anseende eller egenart.

För ytterligare information om Linköping University Electronic Press se förlagets hemsida http://www.ep.liu.se/.

Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a period of 25 years starting from the date of publication barring exceptional circumstances.

The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/.


Abstract

With the location based service market being estimated to drastically increase in value to over 77 billion dollars in 2021, novel approaches to amass and combine data are being explored. One such novel approach is that of collecting positioned data (PD), which in turn consists of data gathered from radio signals associated to ground truth positions (GTP). This type of PD can be used to benefit such things as spatial network analysis or supportive data for positioning algorithms. In this thesis we investigate how such PD can be collected, managed and stored in an effective manner regardless of environment. As a means to investigate this, we have proposed a positioned data collection (PDC) system architecture. The proposed PDC system architecture has been designed based on documentation related to six different PDC related systems, the ADD method, the ATAM method, the three-tier architecture pattern and a proposed PDC system definition. Parts of the proposed architecture have been chosen for implementation and testing. The chosen parts were those which were designed to collect PD within indoor environments, as it is more scientifically interesting compared to outdoor environments. The results gathered from the tests proved that the implemented PDC system parts worked as intended, successfully associating radio signal data values to both local- and geographical GTP. Ways of altering the association between radio signal data and GTP were also explored and tested, with the most prominent alteration approach being that of spatial filtration. Both the proposed architecture and the results gathered from testing the implemented parts were assessed by stakeholders. The thesis work was generally well accepted by the stakeholders, meeting little criticism and providing valuable insights.


Acknowledgments

I would like to extend a thank you to Kristian Sandahl for his valuable insights throughout the thesis work. Also, a big thank you to Chih-Yuan Lin and everyone at Ericsson who participated in both interviews and focus groups that benefited my work. I would further like to thank my opponents, Jonathan Bjurenfalk and August Johnson, for providing constructive feedback during the opposition seminar. Finally, I want to extend a special thank you to Fredrik Gunnarsson, for his mentorship and continued support throughout not only this thesis work - but my whole time at Ericsson.


Contents

Abstract iii

Acknowledgments iv

Contents v

List of Figures viii

List of Tables x

1 Introduction 1

1.1 Motivation . . . 1

1.2 Aim . . . 2

1.3 Research questions . . . 2

1.4 Delimitations . . . 3

2 Positioned Data Collection System 4

2.1 Positioned Data Collection Background . . . 4

2.2 Positioned Data . . . 4

2.2.1 Ground Truth Positions . . . 4

2.2.2 Signal Data . . . 5

2.3 Positioned Data Collection System . . . 6

2.4 PDC Gathering & Association . . . 7

2.5 PDC Storage . . . 8

2.6 PDC Filtration . . . 9

2.7 PDC systems and applications . . . 11

2.7.1 Long Term Evolution (LTE) Signals . . . 11

2.7.2 Fingerprinting and Adaptive Enhanced Cell ID (AECID) . . . 12

2.7.3 Fingerprinting method . . . 12

3 Theory 15

3.1 Related Work Disclaimer . . . 16

3.2 Source selection . . . 16

3.3 Software Architecture . . . 16

3.3.1 Software Architecture Design . . . 17

3.3.2 Architectural Pattern . . . 18

3.3.3 Software Architecture Assessment . . . 18

3.4 Attribute Driven Design (ADD) . . . 20

3.5 Client-Server architectural pattern . . . 21

3.6 Three-tier architectural pattern . . . 21

3.7 Architecture Trade-off Analysis Method (ATAM) . . . 22


4 Method 25

4.1 Method Overview . . . 25

4.2 Case Studies . . . 26

4.3 Quality Factors and Functional Requirements . . . 26

4.4 Proposing new PDC System Architecture . . . 26

4.5 Implementation and Architecture Assessment . . . 26

5 Case Studies 28

5.1 Case Studies Background . . . 28

5.2 Network Watchdog . . . 28

5.2.1 Ericsson Industry Connect (EIC) . . . 29

5.2.2 EIC Network Management Portal (NMP) . . . 29

5.2.3 EIC Network Controller (NC) . . . 29

5.2.4 EIC Radio System . . . 29

5.2.5 EIC Architecture . . . 30

5.2.6 NW Android Application and Measurements . . . 30

5.2.7 NW Data handling and Storage . . . 30

5.3 FindIT . . . 31

5.3.1 Architecture . . . 31

5.3.2 FindIT Tag . . . 32

5.3.3 FindIT Smartphone Application . . . 32

5.3.4 FindIT Measurement . . . 33

5.3.5 FindIT Data handling and Storage . . . 34

5.4 Radio Dot System Uplink RSS (RDS UL RSS) Log Processing . . . 34

5.4.1 RDS UL RSS Fingerprint Measurement gathering . . . 34

5.4.2 NLS Data handling and Storage . . . 35

5.5 E-Phenology Collector . . . 35

5.5.1 E-Phenology Quality properties . . . 35

5.5.2 E-Phenology Data Collection Requirements . . . 36

5.5.3 E-Phenology Proposed Scenarios . . . 36

5.5.4 E-Phenology Fault Tolerance . . . 37

5.6 Ericsson Device Analytics . . . 37

5.7 Walk The Line . . . 38

5.7.1 WTL Architecture . . . 38

5.7.2 WTL Web portal . . . 39

5.7.3 WTL Smartphone Application . . . 39

5.7.4 WTL Measurements . . . 41

5.7.5 WTL Data Handling and Storage . . . 43

6 Positioned Data Collection Architecture 44

6.1 Architecture Design Approach . . . 44

6.1.1 Usage of ADD . . . 44

6.1.2 Usage of ATAM . . . 45

6.1.3 Usage of 3 Tier Architectural Pattern . . . 45

6.2 Case Study Analysis . . . 45

6.2.1 Category 1: PD modules . . . 46

6.2.2 Category 2: Main system modules . . . 48

6.2.3 Category 3: Situational modules . . . 51

6.3 Quality Factors and Functional Requirements . . . 51

6.3.1 Core Quality Factors and Functional Requirements . . . 52

6.3.2 Enhanced Quality Factors . . . 52

6.3.3 External Quality Factors . . . 53


6.4.1 Design Decisions . . . 54

6.4.2 UE Application . . . 56

6.4.3 Web portal . . . 57

6.4.4 Server and Database . . . 58

6.4.5 PDC System Architecture . . . 59

7 Implementation and Testing 61

7.1 Implemented Components . . . 61

7.2 Used tools and frameworks . . . 61

7.3 WTL and RDS UL RSS combination . . . 62

7.4 PDC Filtration . . . 65

7.4.1 Spatial filtration algorithm . . . 65

7.4.2 Time based filtration algorithm . . . 66

7.5 Testing . . . 67

7.5.1 Test protocol . . . 67

8 Results and Assessment 68

8.1 Test Results . . . 68

8.1.1 Example segment results . . . 69

8.1.2 Pixel GTP to Geographical GTP . . . 71

8.1.3 Filtration results . . . 71

8.2 Stakeholder Architecture Assessment . . . 74

8.2.1 RDS UL RSS Focus Group . . . 75

8.2.2 EDA Focus Group . . . 75

9 Discussion 77

9.1 Results . . . 77

9.2 Assessment . . . 78

9.3 Method . . . 78

9.3.1 Positioned Data Collection Architecture . . . 78

9.3.2 Source Assessment . . . 79

9.4 Replicability . . . 79

9.5 The work in a wider context . . . 80

10 Conclusion 81

10.1 Future Work . . . 83


List of Figures

2.1 General PDC system . . . 7

2.2 PDC before and after spatial based filtration . . . 9

2.3 PDC system in practice . . . 12

3.1 Life cycle of software [29] . . . 18

3.2 Relation between software architecture assessment and actual system behavior [29] . . . 20

3.3 The ADD Plan, Do and Check cycle [ADD1] . . . . 21

3.4 Three-tier architecture model . . . 22

3.5 Sequential steps of ATAM [ATAM1] . . . . 23

5.1 General EIC architecture . . . 30

5.2 Example of NW application manually gathering cellular signals and then displaying their historical round trip time . . . 31

5.3 General architecture of FindIT . . . 32

5.4 How the FindIT application conducts its search of a FindIT Tag . . . 33

5.5 Example of a FindIT measurement . . . 34

5.6 General architecture of WTL . . . 39

5.7 WTL Map Segment Group (MSG) creation, PD measurement gathering from MSG and overview of gathered PD . . . 40

5.8 Example of stored fingerprint measurement information . . . 41

5.9 Example of cellular information belonging to a fingerprint measurement . . . 42

5.10 Example of Wi-Fi information belonging to a fingerprint measurement . . . 42

5.11 Example of Bluetooth information belonging to a fingerprint measurement . . . . 42

6.1 General PDC system architectural design pattern based on case systems . . . 46

6.2 Proposed UE application architecture . . . 57

6.3 Proposed Web portal architecture . . . 58

6.4 Proposed Server architecture . . . 59

6.5 Proposed Database architecture . . . 59

6.6 General architecture of proposed PDC system . . . 60

7.1 Radio Dot System Ground Truth Position class found in Android application . . . 63

7.2 Positioned Data class found in Android application . . . 63

7.3 Example of Radio Dot System Positioned Data in MongoDB . . . 63

7.4 Example of Radio Dot System Data in MongoDB . . . 64

7.5 Map of floor 2 in Magyar Tudósok Körútja 11, Budapest . . . 67

8.1 The chosen segment (highlighted within red square) . . . 69

8.2 Plot of BLE to pixel GTP . . . 70

8.3 Plot of WiFi to pixel GTP . . . 70

8.4 Plot of RDS to pixel GTP . . . 70


8.6 Plot of RDS signal values to pixel GTP . . . 71

8.7 Plot of RDS signal values to geographical GTP . . . 71

8.8 Plot of RDS to geographical GTP with 1x1 m bin filter . . . 72

8.9 Zoomed plot of RDS to geographical GTP with 1x1 m bin filter . . . 72

8.10 Plot of RDS to geographical GTP with 5x5 m bin filter . . . 73


List of Tables


1

Introduction

This chapter contains information related to the introduction of this thesis project. First, a motivation is provided in Section 1.1; then the aim of the thesis is described in Section 1.2; after that, the research questions are presented in Section 1.3. Finally, in Section 1.4, the scope of the thesis is set through the presented delimitations.

1.1

Motivation

Since the boom of the technological revolution, the increasingly digitalized world has come to recognize the value of amassing, storing and trading vast amounts of data. One such type of data is that of location data, with commercial data providers such as the Here (We Go) service [13], which sells location data to a wide array of companies spanning a variety of industries such as telecommunication and transportation. According to a study made by researchers at Oxford University [19], data can be viewed as information goods which are produced and combined in order to create relations between them that in turn provide new valuable information. In other words, the value of one type of data seemingly increases in proportion to how it can be combined with other types of data. One such combination is that of pairing data values with the positions at which they were collected, referred to as positioned data (PD). This PD can be used to gain intelligence that would otherwise not have been obtainable had we not established a relation between data and positions, resulting in PD being more valuable compared to if its position values and data values were used separately. A PD pair, as proposed by this paper, is in essence made up of two parts: the ground truth position (GTP) part and the data part. When GTP is used in this context, we are referring to position information that is accurate to such a degree that it is considered true and is not questioned. The data part of a PD pair can be seen as a variable encapsulating one or several values. These data values could theoretically be any type of value that one would want to associate with a position, but in practice it is common that the data values are based on radio signals. PD could be used as input values for use cases such as spatial network analysis (e.g. signal strength measuring of a specified area) or data based position prediction (e.g. the RF fingerprinting positioning method [14]). This is especially true for location based services (LBS), where positioning methods utilizing databases containing positions in relation to data are commonly used [28]. According to [28] the LBS market is estimated to grow from 15 billion dollars in 2016 to over 77 billion dollars in 2021. Taking this into consideration and acknowledging that LBS is to a large extent dependent on PD as its input, one can assume that the demand for PD will only increase with time.

In today’s day and age, practically every commercial smartphone has some variant of Global Positioning System (GPS) functionality built into its technological service kit. This, in combination with smartphones containing radio sensors and being mobile devices, makes them the ideal user equipment (UE) for gathering PD. Still, gathering PD, specifically when smartphones are used as UE, has its challenges. One challenge is that of how smartphones estimate their positions. Smartphones’ GPS functionality relies on signals from a Global Navigation Satellite System (GNSS), which has a major weakness in the sense that it requires the UE to be within line of sight (LoS) of the GNSS signals [22]. This results in UEs getting poor GPS functionality in environments where the UE is not within clear LoS of the GNSS signals, such as in dense urban environments or in a robust building. This in turn leaves smartphone users with the problem of not being able to estimate GTP values in such environments. Another challenge is that of maintaining and enriching a database consisting of PD. Assuming that PD can be collected at a rate that is deemed satisfactory, said PD needs to be stored in a scalable and accessible database. Considering that the value that is created from collecting GTP paired with data is that of the PD relation, which in turn has to be stored somewhere, it can be argued that a PDC system is only as good as its PD collection (PDC) in combination with the database that it is stored in. In other words, the value of a PDC does not only scale in proportion to its quality (e.g. PD needs to have accurate GTP values, which has its challenges as described above), but also in proportion to its size.

In this thesis, we propose an architecture for a PDC system that aims to provide a solution to the mentioned challenges as well as give the reader insight into how PDC systems can be used to gather and store PD. This is done through analysing existing data collection systems in order to investigate how different procedures and architectures can be used to collect PD. A particular focus in this work is PD for indoor environments, where GNSS is not available as GTP, compared to outdoor environments. The type of PD will be that of radio signals, where the prioritized signal type will be that of telecommunication signals.

1.2

Aim

The aim of the thesis is to establish a proposed view of how the collection of PD can be performed. This is done through investigating system architectures and analysing how they could be used or altered in order to enrich a PD database. The type of PD which this thesis focuses on consists of GTP associated data from radio signals. Based on the insight gathered from said investigation, a new PDC system is proposed that aims to enrich a PDC. The proposed PDC system is designed to collect PD regardless of environment as well as give insight into GTP to radio signal data association. Included in this research and proposed system, quality factors are assessed and characteristics pertaining to PDC systems are described.

1.3

Research questions

When system architecture is mentioned in this context, we refer to the infrastructure and functionality pertaining to a system: infrastructure such as smartphone applications, servers and databases, and functionality in the sense of how the specified infrastructure components communicate with each other as well as how every infrastructure node performs its specified computations in order to gather, process and store PD.

The research questions that will be answered as a result of this thesis work are as follows:

1. PDC system architecture design

How can a PDC system architecture be designed based on requirements, inputs and architectures from relevant case studies and stakeholders?

2. PD association

How can GTP and data be associated with each other in a PDC system? Which aspects impact the quality of the PD?

3. PD handling and storage

How can PD be supported by a PDC system? How should PD be managed, stored, filtered and exported?

4. PDC Use cases

In which environments and situations can PDC systems create, manage and provide positioned data? How can a PDC system be generic in order to support a large set of different use cases?

1.4

Delimitations

Investigating and implementing a system architecture for positioned data collection spans a wide scope. Time is a limited factor, and in order to concretize the thesis work and finish it on schedule, delimitations have to be made. The scope is based on empirical projects provided by the major telecommunications company Ericsson. When the term architecture is mentioned in relation to a system, it refers to the underlying interfaces and functionalities of said system in conjunction with how these interfaces and functionalities operate together. The system architecture that this thesis will present will consist of a front-end part for data collection and a back-end part for handling and storing the collected data. To narrow down the scope of the thesis work even further, the front-end will consist of an Android smartphone application for data collection and a web portal for data handling.


2

Positioned Data Collection System

In this chapter we define what a PDC system is and what functionality it contains. First, PD is defined and discussed in detail in section 2.2. We then describe PDC systems from a general perspective in section 2.3, where we explain how PDC systems gather PD and identify different parts pertaining to PDC systems. Finally, in section 2.4 and forward, we take a closer look into the identified parts of PDC systems.

2.1

Positioned Data Collection Background

In this section we present the circumstances under which the knowledge pertaining to this chapter was gained. Prior to conducting this master's thesis, I, the author of this thesis, was involved in developing and maintaining a system for the telecommunications company Ericsson. The system that I developed is described in detail in section 5.7 and was used in order to collect radio signal data which has been associated to ground truth positions. I have worked on the system for a time span of two years leading up to this thesis work. Together with developing the system, I have also had the opportunity to consult, and discuss, with several industry professionals concerning the topic of PDC systems. These past experiences have in turn led to me gaining a formidable understanding of PDC systems, and it is those experiences that will be used in my attempt to define and describe what a PDC system is in this chapter.

2.2

Positioned Data

Positioned data (PD) is a term that describes the types of values that a PDC system collects. While the name positioned data hints at the fact that these values in essence describe the association between positions and data, it raises the question: what do we mean by positions and data in this context? This section aims to answer that question and present a detailed view regarding the composition of PD.

2.2.1

Ground Truth Positions

When referring to positions in the context of PD, we refer to coordinates describing the location of where specified data has been collected. Since gathered PD-positions are considered to describe the true locations of gathered data, PD-positions are referred to as ground truth positions (GTP).

These GTP coordinates can be expressed differently depending on the requirements placed on the PDC system. If the PDC system is required to collect PD within a confined area, where the area's dimensions are known to the PDC system, it might be considered enough to just associate relative GTP values to the gathered data. These relative GTP values could be expressed by defining the height and width of a pixel-map describing the specified area. Depending on what unit is chosen to represent the height and width of the map, be it pixels or meters, the relative GTP could at that point be measured by estimating the relative map coordinate that represents the user's current position. If on the other hand the PDC system is required to collect PD within a wide open area, or if the PDC system is simply required to collect GTP values that can be expressed geographically, then relative GTP values are not enough - geographical GTP values are needed. Geographical GTP values are commonly estimated through the use of GNSS and expressed in longitude and latitude coordinates. In situations where GNSS signals do not reach, such as dense urban areas or indoor environments, an alternative way of geographical GTP estimation is needed. One such alternative way is that of establishing a ”relative-to-geographical” relation between a map describing the dense or indoor area and its geographical equivalent. If such a relation can be established, then geographical GTP values could be read from the relative map. One way of creating such a relation is through generating a worldfile [10] which converts map pixel coordinates to geographical world coordinates.
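To make the worldfile idea concrete, the sketch below applies the six affine parameters of a standard world file to convert a relative pixel GTP on a floor-plan map to a geographical GTP. This is only a minimal Kotlin illustration; the class, the function name and the sample parameter values are hypothetical and not taken from the thesis implementation.

```kotlin
// Minimal sketch of worldfile-based conversion from relative (pixel) GTP
// to geographical GTP. The six parameters follow the standard world file
// order (A, D, B, E, C, F); the values used in main() are hypothetical.
data class WorldFile(
    val a: Double, // x-scale: geographical x-units per pixel column
    val d: Double, // rotation term (column -> y), usually 0
    val b: Double, // rotation term (row -> x), usually 0
    val e: Double, // y-scale per pixel row, typically negative
    val c: Double, // geographical x (e.g. longitude) of the upper-left pixel
    val f: Double  // geographical y (e.g. latitude) of the upper-left pixel
)

// Applies the affine transform: x' = A*col + B*row + C, y' = D*col + E*row + F.
fun pixelToGeo(wf: WorldFile, col: Double, row: Double): Pair<Double, Double> =
    Pair(wf.a * col + wf.b * row + wf.c, wf.d * col + wf.e * row + wf.f)

fun main() {
    // Hypothetical worldfile for an indoor floor-plan image.
    val wf = WorldFile(a = 1.0e-6, d = 0.0, b = 0.0, e = -1.0e-6, c = 19.0569, f = 47.4735)
    println(pixelToGeo(wf, col = 250.0, row = 400.0)) // -> (longitude, latitude)
}
```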

Two ways of creating GTP values have been identified: passive GTP creation and active GTP creation. Passive GTP creation is done through continuously estimating GTP based on alterations in signal traffic. One common example is that of GNSS based positioning, where the received signals from a satellite are measured in order to estimate GTP values. It is passive in the sense that the GTP values are continuously estimated without any manual input required. This is not the case for active GTP creation. Here we instead rely on static GTP values which, for instance, could be placed on a map. It would then be the task of the users to make sure that they stand in accordance with the previously placed positions. This type of GTP creation is of great use in situations where GNSS signals do not reach, as it primarily relies on gathering relative GTP values. In practice it can be argued that passive GTP creation is preferable due to it not requiring continuous manual input, which in turn would make it easier to gather GTP values at a large scale. While this is true, the only type of passive GTP creation process that we have been able to find is that of GNSS based GTP. In other words, when gathering GTP values in areas where GNSS signals do not reach, we are still bound to active GTP creation.

When creating the GTP values, they have to be matched with their corresponding data values. Commonly, this can be done through matching the timestamps of when the GTP values are created with when the data values are created. Regardless of whether the PDC system gathers relative or geographical GTP, the timestamp pertaining to the gathered GTP is required for the GTP to be paired with the correct data values. The GTP and the data are usually collected separately, but in parallel with each other, from their respective sources. In order to form the PD, timestamps are then used as a medium to match the GTP values with the data values that carry matching timestamps.

2.2.2

Signal Data

While associating a GTP value with any type of data would technically make it a PD, a specific type of data proves more useful than others in a positioning context. The specific type is that of data collected from radio signals, due to radio signals being widely available and their signal strength altering in proportion to the distance from the UE receiving the signals to the radio signal source. The types of radio signals span from Wi-Fi and Bluetooth covering indoor environments to cellular networks covering large urban areas.

Cellular signal sources, called radio base stations (RBS), are stationary, more common in urban areas and have a wider area of effect compared to Wi-Fi and Bluetooth. The cellular signals that smartphones receive from their respective RBS can be trusted to be useful signal sources. This is because RBS are publicly available and stationary in the sense that they are non-mobile telecommunication towers. This means that if we use cellular signals for the creation of PD, we can trust that those measurements will be useful for a long period. Had we relied on alternative signal sources which are mobile, such as Wi-Fi routers or a pedestrian's pair of Bluetooth earphones, the collected PD would be rendered worthless for positioning purposes as soon as the signal source changes location. Regardless of what signal type is being used, it is necessary for the signal source to be stationary. In addition to this, signals with a wide area of effect are desirable. Here, based on our prior experience when collecting PD, cellular RBS outperform practically any other wide area signal source in urban environments. For these reasons, cellular signals will be the prioritized radio signal type moving forward.

Given that the cellular RBS is stationary and its signals can be read from the UE sensors, the following parameters are of interest for PD in the cellular network (a sketch of how they could be modelled follows after the list).

• User Equipment International Mobile Subscriber Identity (IMSI): The IMSI pertaining to a UE is a unique number that is used to identify a UE on a cellular network. This number can be used to differentiate from which UEs the signals were collected.

• Received Signal Strength (RSS): The signal's RSS value, also referred to as Reference Signal Received Power (RSRP), describes the UE's received signal strength from the RBS. It is used under the assumption that the RSS value is dependent on the distance between the UE and the RBS.

• Cell ID: The cell-ID is used in order to differentiate which RBS the signal was collected from.

• Timestamp: The timestamp is mainly used to establish a relation between the gathered signals and their corresponding GTP, but also for simply documenting when the signal was collected.

• Timing Advance (TA): TA is not a necessary signal parameter for PD, but it can be a useful addition. The parameter can be used to estimate the round trip time (RTT) between the UE and the RBS. With this in mind one could assume that the RTT, in a similar manner to RSS, would vary in proportion to the distance between the UE and the signal source.
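As an illustration of how these parameters could be modelled on the UE side before being associated with a GTP, the sketch below defines simple Kotlin data classes. The class and field names are hypothetical and only mirror the list above; they are not taken from the thesis implementation.

```kotlin
// Hypothetical data model for the cellular parameters listed above,
// plus a GTP type and the resulting positioned data (PD) pair.
data class CellularSample(
    val imsi: String,          // identifies the collecting UE
    val cellId: Int,           // identifies which RBS the signal came from
    val rsrpDbm: Double,       // received signal strength (RSS/RSRP) in dBm
    val timingAdvance: Int?,   // optional TA value, when exposed by the UE
    val timestampMs: Long      // used to match the sample against a GTP
)

data class GroundTruthPosition(
    val x: Double,             // relative (pixel) or geographical coordinate
    val y: Double,
    val timestampMs: Long
)

// One GTP associated with the signal data gathered at that position.
data class PositionedData(
    val gtp: GroundTruthPosition,
    val samples: List<CellularSample>
)
```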

2.3

Positioned Data Collection System

PDC system is a term that is proposed by this thesis and it is used to describe a specific type of system behavior. The type of behavior we are referring to is that of collecting data gained from radio signals pertaining to a radio access network (RAN) and associating it with ground truth positions (GTP). This GTP to radio signal data association is what in turn defines a PD. A collection of PD is referred to as a positioned data collection (PDC) and is commonly stored in a database belonging to a PDC system. The ultimate goal of any PDC system is to provide its user with the tools necessary to enrich a PDC with enough PD so that the PDC can be used for such things as input for positioning algorithms or in spatial network analysis. What is deemed as a satisfactory amount of PD in a PDC varies relative to the PDC system's requirements and user expectations.

In the case of one wanting to conduct spatial network analysis of a specific area, a PDC system could be used in order to gather network information and associate it with the GTP of where said network information was collected. This information could in turn be used to create a PDC that would describe network alterations throughout the specified area. In other words, the PDC could then be used as supportive information to visualize the network's values relative to their GTP. Another, arguably more prominent, case would be that of positioning. Here, positioning methods such as fingerprinting rely on a database consisting of historical positioned signals [14]. When a signal is gathered it can be compared to the historical positioned signals in the database in order to estimate the corresponding GTP value of the gathered signal. A challenge that exists when using the fingerprinting positioning method is that of acquiring the positioned data necessary for creating the fingerprinting database [28]. Through collecting PD, which in this case would refer to fingerprinting measurements, with the use of a PDC system, such a fingerprinting database could be generated.

This study proposes that the functionality of a PDC system be divided into three sequential stages as shown in figure 2.1, namely gathering & association, storage and filtration. These stages are described in detail starting with section 2.4.

Figure 2.1: General PDC system

2.4

PDC Gathering & Association

In the gathering stage we aim to gather the information necessary to create PD. For the PDC system to do this, it is necessary that it is able to detect and read wireless network information. Software pertaining to the PDC system can be installed on a piece of user equipment (UE) containing antennas, commonly smartphones. These antennas can in turn be used in order to detect the radio waves pertaining to one or several wireless networks. Smartphones are used as UEs since they are mobile devices which contain the necessary hardware and support for the collection of radio signals. Smartphones can on top of this also have applications installed on them, which let the user specify which signals should be collected and how they should be handled. Furthermore, especially in the case of cellular signals, smartphones grant the user access to a wide variety of information pertaining to the gathered signals. From a positioning perspective, we are interested in the signal values that scale in relation to the distance between the UE and the signal source. An example could be the RSS from an RBS to the UE that it is serving. Placing the UE far away from the RBS would result in a weak signal strength, while it would grow proportionally stronger as the UE gets closer to the RBS. The reason why we are interested in such signal values is mainly due to those values shifting in relation to the GTP of the UE. Since we assume that the RSS scales in relation to distance, we could then in turn estimate the GTP of the UE based on its gathered RSS values. The signal information that scales in relation to distance, in combination with the GTP values of the UE, contains all the information required to create a unique PD.

Once the information required for the creation of PD has been gathered, we move on to the association stage of the PDC system. Here we aim to create the PD through associating the gathered GTP with the correct data values and prepare them for storage. In order to associate the signal information with the correct GTP value, a relation between the two parameters is needed. One way of establishing such a relation is through the usage of timestamps. If we assume that the PDC system creates timestamps of when the signal information and the GTP values are collected, that PDC system would be able to match the signal information with their corresponding GTP value by matching the two parameters' timestamps. Given that the PDC system is able to create such a relation between the signal information and GTP values, they can be associated with each other and are from that point onward considered PD. At this point the PD would contain the information necessary for it to be used for positioning purposes, but in theory any type of information could be included in addition to this. One example of such information could be what type of UE was used for collection or the specified name of the collector, should that be of any relevance to the user of the PDC system. Regardless of what additional information is added to the newly created PD, it should at this point be ready for storage in the database containing the PDC.
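A minimal sketch of this timestamp-based association is shown below. It assumes that GTP values and signal samples each carry a creation timestamp and pairs every GTP with all samples whose timestamps fall within a configurable tolerance; the generic Timed wrapper, the function name and the default tolerance are illustrative assumptions, not part of the thesis implementation.

```kotlin
import kotlin.math.abs

// Wraps any GTP or signal-data value together with its creation timestamp.
data class Timed<T>(val value: T, val timestampMs: Long)

// Associates each GTP with every sample whose timestamp lies within
// maxGapMs of the GTP timestamp; GTPs with no matching samples are dropped.
fun <G, S> associateByTimestamp(
    gtps: List<Timed<G>>,
    samples: List<Timed<S>>,
    maxGapMs: Long = 1_000
): List<Pair<Timed<G>, List<Timed<S>>>> =
    gtps.map { gtp ->
        gtp to samples.filter { abs(it.timestampMs - gtp.timestampMs) <= maxGapMs }
    }.filter { (_, matched) -> matched.isNotEmpty() }
```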

2.5

PDC Storage

Newly generated PD, that are ready to be inserted into the PDC database, are considered to be in the storage stage. In this stage the PDC system aims to properly store the PD among the existing PDC for future use. This would mean that the PDC system would be able to compare incoming signal information with the PDC. Through the use of a positioning algorithm, such as the fingerprinting algorithm, an estimation of the corresponding GTP value could be computed and then returned to the user. The PDC should uphold the traits of scalability, accessibility and redundancy. These traits are especially important considering that the PDC is the part of the PDC system that contains the valuable information, the very reason why the PDC system was created in the first place. The more scalable the database containing the PDC is, the larger amount of PD it will be able to contain and with that increase its potential value. Scalability of the PDC system's database dictates how thorough the PDC system affords to be when collecting signals. If the PDC system's database is shown not to be scalable, then the PDC system might be obligated to decrease both the amount of potentially gathered PD and the amount of information each PD contains. The accessibility of the PDC system's database describes how easy it is for the PDC system user to access its PDC. In the event of wanting to integrate the PDC with a third party program, such as in the case of wanting to use the PDC as input for a positioning application, it should be possible to give the third party program access to the PDC database. A way of doing this could be to include the feature of exporting the stored PDC. Exportation could be done through parsing the PDC to fit the input requirements of the third party program and then creating a CSV file containing the parsed information. This would enable third party software to use the PDC as input for positioning functionality. Data redundancy of the PDC database should ideally also be included. This could be done through periodically copying the PDC from the PDC system's database to a separately hosted database. By doing this, we provide a form of PDC insurance in the sense that if the PDC system would crash, or any other situation would occur that results in the PDC system's database losing its training data, the PDC would be recoverable from the separately hosted ”backup” database. This is especially important if the PDC system is to be used for a longer period of time, such as in a PDC session lasting several hours.
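The export feature described above could, under the assumption that each stored PD can be flattened to one row per signal value, look roughly like the following sketch. The column layout, type and field names are hypothetical and not taken from the thesis implementation.

```kotlin
import java.io.File

// Hypothetical flattened representation of one exported PD row.
data class PdRow(
    val x: Double, val y: Double,           // GTP (relative or geographical)
    val cellId: Int, val rsrpDbm: Double,   // signal data values
    val timestampMs: Long
)

// Writes the PDC to a CSV file that a third-party positioning tool could read.
fun exportToCsv(pdc: List<PdRow>, target: File) {
    target.printWriter().use { out ->
        out.println("x,y,cell_id,rsrp_dbm,timestamp_ms")
        pdc.forEach { out.println("${it.x},${it.y},${it.cellId},${it.rsrpDbm},${it.timestampMs}") }
    }
}
```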

2.6

PDC Filtration

The filtration stage is the only optional stage of the three. When exporting the PD, to for instance be used for such things as a positioned application or for spatial network analysis, it may be of interest to alter how the GTP values are associated with their data values. The main purpose of filtering a PDC revolves around the notion of associating PD values to each other according to different specifications. The filter specifications range from associating PD that were gathered within close proximity of each other to associating PD gathered from uplink (UL) and downlink (DL) radio signal data. When UL radio signal data is mentioned in this thesis we refer to radio signal data gathered from network logs or external radio hardware, where the UE gathering GTP is the emitting signal source towards the external radio hardware. When DL radio signal data is mentioned, we refer to radio signal data gathered from signal emitting sources, such as the previously mentioned RBS, towards the UE. When collecting DL radio signal data, the UE is tasked with gathering the signals emitted from other signal sources and associating them with GTP. We will now provide three examples of PDC filtration, namely spatial based filtration, time based filtration and programmatic filtration. Note that only filtration in the sense of decreasing the GTP accuracy in exchange for the PD containing a larger set of signal data values is possible, due to PD being stored in its most accurate form.

Figure 2.2: PDC before and after spatial based filtration

• Spatial based filtration can be applied to a PDC when one wants to associate its PD relative to a spatial bin. A reason to filter the space, or in other words area, of a PDC could be to make it fit the input specification of a positioned application. For instance, if a positioned application requires that the PD input describes data collected for each cubic meter of an area, but the PDC only contains data collected for each cubic centimeter of an area, the existing PD can be grouped to fit the positioned application's cubic meter requirement. This would result in new, filtered, PD with a larger amount of data values associated with the filtered GTP values. A general example of a spatial filtering process is illustrated in figure 2.2, and a code sketch follows after these lists. Here we have a PDC within the GTP span of x-length and y-length. Once the spatial filter bin, filterX length and filterY length are specified, a filter based on the spatial filter bin can be projected on top of the PDC. Each filter cell represents a filtered PD in the sense that all the data values pertaining to the filtered PDC, which fall under the dimensional span of each filter cell, are grouped.

• Contrary to spatial filtration, when performing time based filtration, the dimension that the specified PDC is being filtered on is that of time in the sense that the PDC is being filtered based on when its PD were gathered. Here the filter is specified through a filter time span which then is projected on the specified PDC. The PD values that fall within each filter time span are grouped up into filtered PD.

• Programmatic filtration is when PD are associated to each other based on specified programmatic contexts. An example of this could be to associate all PD that were gathered within the same session to each other.

The filtered PD can be grouped based on two different aspects, being that of grouping based on GTP or radio signal data:

• GTP grouping is when you group PD based on their GTP. In the case of spatial based filtration this could be a GTP value describing the center of the filtered area (such as the red dot describing the filtered GTP in figure 2.2). For time based filtration this could be a GTP value describing the GTP average of the time filtered PD. For linear averaging filtration this could simply be a GTP value describing the GTP average of the filtered PD. Regardless of filtration type, when the new GTP is computed the radio signal data pertaining to the filtered PD are associated to the new GTP.

• Data grouping is when you group PD based on their data. This can be done in varying ways. One way would be to let a data value, or a set of data values, represent a grouped set of PD. Another way would be to compute an average data value based on a grouped set of PD. The GTP pertaining to the grouped set of PD are then associated to the newly computed data value(s).
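The sketch below illustrates spatial based filtration combined with GTP grouping: PD are placed into filterX by filterY bins, each bin is represented by its centre coordinate, and all signal data values falling into the bin are associated with that centre. It is a simplified two-dimensional illustration with hypothetical types, not the filtration algorithm implemented in chapter 7.

```kotlin
import kotlin.math.floor

// Hypothetical minimal PD type: one GTP coordinate and one signal data value.
data class Pd(val x: Double, val y: Double, val rsrpDbm: Double)

// A filtered PD: the centre of a spatial bin and the grouped data values.
data class FilteredPd(val centreX: Double, val centreY: Double, val values: List<Double>)

fun spatialFilter(pdc: List<Pd>, filterX: Double, filterY: Double): List<FilteredPd> =
    pdc.groupBy { pd ->
        // Index of the bin the PD falls into along each axis.
        Pair(floor(pd.x / filterX), floor(pd.y / filterY))
    }.map { (bin, members) ->
        FilteredPd(
            centreX = (bin.first + 0.5) * filterX,   // bin centre (GTP grouping)
            centreY = (bin.second + 0.5) * filterY,
            values = members.map { it.rsrpDbm }      // grouped signal data values
        )
    }
```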

2.7

PDC systems and applications

The PDC that a PDC system generates is commonly meant to be used for some type of positioning related application, such as for positioning algorithms or for spatial network analysis. In this section we aim to give the reader an insight into the practical uses of a PDC system. We do this by first discussing how the chosen cellular signal, being that of LTE signals, benefits a PDC system and then providing an example of how a PDC system's gathered LTE dependent PD could benefit an application. The application we have chosen to describe how the gathered PDC of a PDC system could be used is the prominent AECID fingerprinting positioning method. We further illustrate how a PDC system could be used in combination with the AECID fingerprinting positioning method in figure 2.3.

2.7.1

Long Term Evolution (LTE) Signals

The increasing popularity of smartphones, which were dependent on 3G technology, created a growing consumer base with increasing demands on the perceived functionality of their smartphones. A new need for increased data rates and reduced latency spawned the fourth generation of cellular technologies (4G) and with it the Long Term Evolution (LTE) telecommunication standard in late 2009 [7]. In today's day and age 4G LTE technology is still widely used and many technical upgrades have been added to it since its release, such as improved antennas and multi-site coordination. One major field that 4G in combination with LTE technology has popularized, and which is seeing an increase in importance going into 5G, is that of positioning. More specifically, the usage of smartphones that contain radio frequency sensors and functionality, and which in turn communicate through cellular networks in order to compute their perceived GTP. Smartphones in today's society commonly include GPS functionality which is computed using GNSS satellite signals.

LTE compared to GNSS

The increased usage of smartphones, in combination with LTE network coverage being high in dense and highly populated urban areas, makes LTE signals a viable alternative to GNSS signals, due to LTE network coverage being relatively strong in environments where GNSS coverage is poor. In other words, since GNSS signals are bound to only serve UEs that are within LoS of the GNSS satellites, UEs that rely on GNSS signals for GTP can in theory use signals from LTE networks to compensate for GNSS signal loss in environments lacking such LoS. In addition, network operators have configured LTE networks to be as wide as possible in order to further enhance the positioning capabilities of LTE [9]. The driving force behind LTE technology is attributed to the 3rd Generation Partnership Project (3GPP) [23]. 3GPP can be described as a global collaboration of experts and researchers that combine their knowledge in order to produce the telecommunication standards which technologies, such as LTE, are designed after [1].

LTE Positioning

When using a smartphone as a UE, several different values pertaining to cellular LTE signals can be harvested. Worth mentioning is that other, older and in a sense more outdated, cellular signal types exist and are still to some extent used in telecommunication contexts. These cellular signal types are WCDMA, CDMA and GSM. The values that are interesting from a positioning perspective are those that change relative to the distance between the UE and the LTE signal source [14]. Those values are mainly the RSS and the TA of the LTE signals. The LTE signal sources are cellular radio towers that broadcast said LTE signals in an area, commonly dense urban areas, and are referred to as radio base stations (RBS). As with practically any radio signal, the closer the UE is positioned in relation to the radio signal source, the higher the RSS and the lower the TA.

In order to differentiate from which cellular signal source the LTE signal was received, the physical cell ID (PCI) is also of interest. Each PCI is unique to its cellular signal source and can therefore be referred to as an LTE signal's cellular identification (cell-ID). Cellular UEs primarily gather LTE signals from the RBS that is positioned closest to the UE, due to the RSS being stronger from that RBS in relation to other RBS. This RBS is referred to as the serving cell. It is common, especially in urban areas, that UEs pick up cellular LTE signals from several RBS. In this case the RBS which the UE has the strongest connection to is considered the serving cell, while the other RBS are referred to as neighbor cells. RSS and PCI values can also be gathered from the neighbor cells, resulting in unique values being collected from each RBS in relation to the UE's distance to that RBS [20]. This leads to a situation where the UE has multiple reference points on which it can gather cellular signal information. This situation can be exploited in order to gather unique cellular radio frequency measurements consisting of distance related signal information in relation to several RBS. These measurements play a key role in the fingerprint positioning method, which utilizes the combined gathered cellular signal information and pairs it with GTP in order to create a relation between cellular LTE signals and GTP.

Figure 2.3: PDC system in practice

2.7.2

Fingerprinting and Adaptive Enhanced Cell ID (AECID)

Fingerprinting, also known as the database correlation method, is a positioning method that utilizes historical radio signal information in combination with GTP in order to estimate a UE's position [15]. In this section we describe the fingerprinting method and compare it with alternative positioning methods. After that, a description of how fingerprinting works in the AECID positioning method is presented.

2.7.3

Fingerprinting method

The positioning method is divided into two phases, the offline phase and the online phase [14].

Offline phase

In the offline phase, fingerprint measurements, consisting of radio signal information and the GTP of where said radio signal information was collected, are gathered and stored in a fingerprinting database. The sought after traits of the collected radio signals are the received signal strength (RSS) from the radio base station (RBS) to the UE and the identity (e.g. cell-ID) of the RBS the signal was collected from. Preferably, the UE should be gathering signal information from several RBS, primarily the serving cellular RBS and its neighbor cellular RBS. Each RBS acts as a GTP reference point relative to the UE, since the RSS to each RBS can be measured. RSS values from RBS are collected due to them varying in proportion to the distance between the UE and each RBS. In other words, as the UE moves closer to an RBS, its perceived RSS value from that RBS increases. RSS values from the RBS in combination with their cell-IDs create a unique combination of values that can be associated with a GTP, called a fingerprint measurement. These fingerprint measurements are commonly collected using a UE with radio sensors, such as a smartphone, and are then combined with the current GTP value of the UE and stored in a fingerprinting database. The continuous collection of such radio frequency (RF) fingerprint measurements should ideally, when the offline phase is over, have enriched a “historical“ fingerprinting database with unique fingerprint measurements.

Online phase

Moving on to the online phase, the constructed fingerprint database is put into play. New radio signal information collected by the UE can be compared to the fingerprint database through the use of fingerprinting algorithms, which in turn estimate the UE's GTP. The larger the fingerprint database is, the more accurate the UE GTP estimation becomes. Note that the fingerprint database is limited to the RBS from which the signal information was collected. Should the UE move to a location and measure its radio signal information towards new RBS, which possibly could give it the same RSS values, that information cannot be used in combination with the prior fingerprint database.
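As a simplified illustration of the online phase, the sketch below matches a newly observed set of per-cell RSS values against a fingerprint database and returns the GTP of the closest stored fingerprint in signal space. This plain nearest-neighbour comparison and its type and function names are assumptions for illustration only; it is not the fingerprinting algorithm used by AECID or by the thesis implementation.

```kotlin
import kotlin.math.sqrt

// A stored fingerprint: RSS per cell-ID together with the GTP where it was gathered.
data class Fingerprint(val rssByCell: Map<Int, Double>, val x: Double, val y: Double)

// Returns the GTP of the stored fingerprint closest to the observed measurement,
// using Euclidean distance over the cell-IDs present in both measurements.
fun estimatePosition(database: List<Fingerprint>, observed: Map<Int, Double>): Pair<Double, Double>? =
    database
        .filter { fp -> fp.rssByCell.keys.any(observed::containsKey) }
        .minByOrNull { fp ->
            val common = fp.rssByCell.keys intersect observed.keys
            sqrt(common.sumOf { id ->
                val d = fp.rssByCell.getValue(id) - observed.getValue(id)
                d * d
            })
        }
        ?.let { it.x to it.y }
```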

Characteristics of the Fingerprinting method

The use of a fingerprinting database for GTP estimation does not require knowledge of a network, since we solely rely on an empirical database. This makes the fingerprinting method a good alternative to GNSS, since GNSS performs poorly in environments where UEs are not within LoS of the GNSS satellite signals [12]. Compared to other positioning methods that instead rely on network knowledge, such as the time of arrival (TOA) positioning method or the angle of arrival (AOA) positioning method, fingerprinting is relatively simple. It is also relatively cheap to implement compared to TOA and AOA, and does not require any additional resources besides a UE and RBS [2]. In addition to this, the positioning accuracy of the fingerprinting method has been shown to increase substantially by including an increased set of parameters that can be read from the retrieved radio signals. Such a parameter could be the timing advance (TA) of the radio signal, as shown in [14]. Here the TA value of a radio signal can be used to further estimate the distance between the RBS and the UE. TA works well in combination with RSS values due to it being less sensitive to alterations in the UE environment. The fingerprinting technology can further be integrated as a complement to alternative positioning methods. A positioning method that builds on top of this fingerprinting methodology, in combination with cellular network technology, is called Adaptive Enhanced Cell ID (AECID).

Fingerprinting in AECID

AECID [32, 25] is a positioning method that clusters fingerprinting measurements in order to generate polygons that in turn describe the estimated GTP of a UE.

The generated polygons are composed of a parameter called tags, which in turn are composed of the cell-ID of the serving RBS, the cell-IDs of the neighbor RBS, the measured TA and the RSS from each RBS. Each tag is assigned to a cluster which describes the GTP of the UE when the values pertaining to that tag were collected. The generation process of these clusters can be considered automatic. When a satisfactory number of clusters have been generated, a polygon describing the boundary of the “tagged“ clusters is iteratively computed using specified polygon algorithms. These polygons can be accessed at a later stage in order to accurately estimate the position of a UE. The clusters can be considered to be fingerprint measurements in the sense that they contain radio signal information that varies over distance in combination with GTP. The AECID fingerprinting positioning method is dependent on its database in order to gain access to historical fingerprint measurements. This in turn brings with it the challenge of creating the historical fingerprinting database consisting of empirical fingerprint measurements.
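To make the tag and cluster concepts concrete, the sketch below models a tag and groups fingerprint measurements into clusters keyed by their tag; computing the bounding polygon of each cluster is left out. The structure is a rough, heavily simplified illustration of the description above (RSS quantization, which a real AECID tag would include, is omitted), and the names are hypothetical rather than taken from the cited AECID papers.

```kotlin
// Hypothetical AECID tag: serving cell, neighbor cells and measured TA.
// A real tag would also include quantized RSS values per RBS.
data class AecidTag(
    val servingCellId: Int,
    val neighborCellIds: Set<Int>,
    val timingAdvance: Int
)

// A fingerprint measurement: its tag and the GTP at which it was collected.
data class TaggedMeasurement(val tag: AecidTag, val x: Double, val y: Double)

// Groups measurements into clusters, one cluster of GTPs per distinct tag.
// A polygon describing each cluster's boundary would be computed in a later step.
fun clusterByTag(measurements: List<TaggedMeasurement>): Map<AecidTag, List<Pair<Double, Double>>> =
    measurements.groupBy({ it.tag }, { it.x to it.y })
```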


3

Theory

In this chapter the theory pertaining to this thesis is presented. The concept of software architecture is described in section 3.3. Software architecture design methods and patterns are described in sections 3.3 to 3.6. Finally, in section 3.7, we describe the method used for assessment of the architecture.

3.1

Related Work Disclaimer

As PDC systems are a novel concept coined by the contributors of this thesis, it is a question of debate what one would consider related work towards designing a PDC system architecture. Due to the shortage of related work, we instead provide theory revolving around software architecture design in a general sense, which we use as a basis when making decisions towards designing a PDC system. To further compensate for the lack of related work, we present case systems in chapter 5, where we discuss and describe systems that have similar functionalities to PDC systems.

3.2

Source selection

This section describes how the sources pertaining to this theory chapter were gathered. During the process of finding suitable sources, peer reviewed sources were prioritized. The search engines used were Google Scholar, ResearchGate and Linköping University's e-library. When selecting the peer reviewed sources, sources that were published by IEEE conferences were further prioritized. Preferably, the sources should be related to some aspect of PDC systems according to the definition presented in chapter 2. The following were the most prominent search strings and were used to find peer reviewed sources from the search engines mentioned above:

• Positioned data collection

• Software architecture design

• Large scale data collection

• Architecture assessment methods

• Focus Group method

• Architectural design patterns

• Fingerprinting positioning method

• Adaptive Enhanced Cell ID

3.3

Software Architecture

In the writings of van Vliet [29, p. 281] discussing software architecture, a definition by Bass et al. describing software architecture in its essence is given.

The software architecture of a program or computing system is the structure or structures of the system, which comprise software elements, the externally visible properties of those elements, and the relationships among them.

The definition encapsulates the notion commonly found in software architecture design about decomposing a system into parts that each bring value to the system as a whole. The decomposition process is iterative and can be carried on until the architect finds that the desired level of detail is acquired. Parallel to decomposing the system, as defined above, the relationship between the newly created system parts is described, resulting in an increasingly detailed description of how the system is intended to work as a whole. Based on the writings of van Vliet, Sommerville [26] and Hasselbring [11], three reasons for using software architecture can be highlighted. First and foremost, software architecture can be used as a means for communicating among the relevant people who have a stake in the creation and maintenance of the potential system; these people are referred to as stakeholders, of which the architect himself is a part.

Taking into consideration the very probable possibility that the stakeholders in play may have different backgrounds and interests relative to the proposed system, different views of the software architecture should be defined. Each view of the software architecture pertaining to a system should, as described by van Vliet, represent the same system but from a perspective which is appropriate for a specific sub-group of stakeholders. Each view has viewpoints associated with it, which in turn serve as a description of said view. Secondly, a software architecture gives the stakeholders a chance to discuss the system design from an early stage. Remnants of early design decisions of a system are felt throughout the system's whole lifetime, arguably making the design decisions more critical in proportion to how early they were made. Lastly, a software architecture can be seen as ”transferable abstractions of a system”. As mentioned earlier, a software architecture can be decomposed in an iterative manner in order to reduce its layers of abstraction. It can in situations prove beneficial to leave the architecture abstracted enough so that a clear overview of the system is presented. A more abstract software architecture can potentially be decomposed in separate ways, resulting in large-scale reuse, through spawning multiple separate systems with different but related functionality based on the prior more abstract software architecture.

3.3.1 Software Architecture Design

Van Vliet [29] describes the process of designing a system architecture as the process of determining how the system is to be decomposed. He further explains that although proposed methods do exist, such as Attribute Driven Design described in section 3.4, there is no de facto universal method for software architecture design. Van Vliet, Sommerville [26] and Hasselbring [11] agree that software architecture design is a creative process where the architect is tasked with organizing a system in such a way that it satisfies the expected requirements put on said system. More specifically, Sommerville argues that the design process depends on three things, namely the system type, the architect's knowledge and experience, and the requirements put on the system. As software design is an iterative, creative process, trial and error is common, since new realizations pertaining to design decisions are usually made throughout the whole design process. Since the design of a system architecture depends on the requirements put on the system, and the requirements are in turn decided by stakeholders, the software architecture should be presented to stakeholders several times throughout the design process. As shown in figure 3.1, more is gained from presenting and discussing the software architecture itself with stakeholders instead of solely the requirements. This is because the software architecture is a result of both functional requirements and non-functional requirements, the latter also referred to as quality factors. Functional requirements describe the behavior of the system in specified use cases, while quality factors describe how the system enforces system characteristics such as availability and reliability. The architecture is presented to stakeholders until an agreement is made, at which point the implementation of said architecture can begin.

Figure 3.1: Life cycle of software [29]

3.3.2 Architectural Pattern

The concept of architectural patterns can be seen as reusing knowledge gained from designing previous successful software architectures. Here we observe the software architectures of existing systems in order to come up with an abstract description of good practice when designing a software architecture for a similar system. Sommerville states that the architect must choose an architectural pattern or style to base the system's software architecture on. He further states that the pattern choice should depend on the quality factors that are placed on the system. Based on the notion of reusing existing knowledge to design software architectures in a more effective manner, approaches such as large-scale reuse [26, 4] have surfaced. Large-scale reuse exploits the fact that systems designed within similar application domains commonly have software architectures that overlap in structure. In other words, if a new system's software architecture were to be designed, the large-scale approach would enable the architect to save both time and resources by studying architectures deemed to have similar functionality and reusing, or drawing inspiration from, those types of systems. The degree to which these related systems can be reused is proportional to the level of access that the architect is given to said systems.

3.3.3 Software Architecture Assessment

Software architecture assessment is in its essence described by van Vliet [29], Hasselbring [11] and Bosch [5] as an iterative process dedicated to evaluating how well the software architecture in question fulfills the expectations, mainly the quality attributes, that it is expected to meet and uphold. They recommend that software architecture assessment should start early in the design process in order to detect poor early design decisions, which would otherwise have proven costly to change had they been found at a later stage. Assessing a software architecture is, in a sense, assessing how the system is perceived to work once implemented. When assessing a software architecture, what we really assess is its quality attributes, which in turn are based on the properties of the software architecture. Assessing the software architecture's properties is done in hopes of predicting system behaviour, as shown in figure 3.2.

Van Vliet [29] proposes two types of techniques for software architecture evaluation: measuring techniques and questioning techniques. The measuring techniques use quantitative information gathered from metrics and simulation results pertaining to the software architecture. The questioning techniques, which are the techniques that van Vliet focuses on explaining, instead investigate the software architecture when put in different scenarios. Van Vliet further describes four types of scenarios for software architecture assessment. Use case scenarios and Far-into-the-future scenarios are derived from the, commonly already existing, use cases pertaining to a software architecture, where the latter scenario type lies relatively farther away in time. Change case scenarios are based on potential future situations that the software architecture could find itself in. Stress situation scenarios describe possible, highly demanding, situations for the software architecture (e.g. memory overflow). A method that utilizes this type of scenario-based software architecture assessment is the Architecture Trade-off Analysis Method (ATAM), described in section 3.7. Bosch [5] continues on the path of scenario-based software assessment by proposing profiles. Each profile is connected to a quality attribute, such as Maintainability or Safety, and contains a set of scenarios that describe potential use cases for the system. In order to define a profile, one must follow these four sequential steps (a small code sketch of such a profile follows the list):

• Define Scenario Categories: The scenarios for the quality attribute in question are divided into categories, commonly between five and six.

• Define Scenarios: For each category, the architect chooses the set of scenarios, commonly five to ten, that best describe said category.

• Assign Weights: Quantifiable weights are assigned to each scenario in order to indicate how likely they are to occur.

• Normalize the Weights: The weights of the scenarios pertaining to the profile are normalized in order to set their total sum to one.
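
As a small illustration of the four steps above, the following sketch defines such a profile in code. The quality attribute, categories, scenario texts and weights are invented for this example and are not taken from Bosch [5] or from the thesis work.

# Hypothetical profile for one quality attribute (e.g. maintainability).
# Categories, scenarios and raw weights are invented for illustration only.
profile = {
    "New data source": [("Add support for a new radio signal type", 4.0),
                        ("Add a new ground truth position source", 2.0)],
    "Changed storage": [("Replace the database engine", 1.0),
                        ("Change the stored data format", 3.0)],
}

def normalized_scenarios(profile: dict) -> list[tuple[str, float]]:
    """Flatten the categories and normalize the weights so they sum to one."""
    pairs = [(scenario, weight)
             for scenarios in profile.values()
             for scenario, weight in scenarios]
    total = sum(weight for _, weight in pairs)
    return [(scenario, weight / total) for scenario, weight in pairs]

for scenario, weight in normalized_scenarios(profile):
    print(f"{weight:.2f}  {scenario}")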

Once defined, the profiles can be used to support three approaches for software architecture assessment, as proposed by Bosch: scenario-based assessment, simulation-based assessment and mathematical model-based assessment.

• Scenario-based assessment:

Scenario-based assessment consists in its essence of two sequential steps, namely impact analysis and quality attribute prediction, and is dependent on the profile defined for the quality attribute that the architect wants to assess. In the impact analysis step, the software architecture and the profile are used. Here, the impact of the different scenarios that the profile consists of is measured on the software architecture. Information describing how the software architecture was affected by the different scenarios is summarized and saved. This saved information is then used in the quality attribute prediction step in order to predict a value for the quality attribute in question (a small numerical sketch of this prediction step is given after this list).

• Simulation-based assessment:

In simulation-based assessment, a high-level version of the software architecture is implemented. This implementation is then combined with a simulation of the system's context. By combining the two, one can assess a chosen quality attribute by executing its corresponding profile.

• Mathematical model-based assessment:

Mathematical models can be used for assessing quality attributes. These models tend to be of a more complex nature than required at the architectural level, resulting in abstractions being made to the model to reduce the required assessment time. This in turn results in a less accurate assessment of the architecture, but at the architecture design level this is acceptable to a certain degree. Once the abstraction of the model has been made, the architecture in question should be expressed in accordance with said model (e.g. in terms of components and connections). It is common that the model requires certain input data that has not previously been defined for the architecture. That additional input data is approximated and gathered based on the requirement specification. Given that the model has been defined and abstracted, and that the architecture is expressed in accordance with the model with all the required input data, a prediction of the assessed quality attribute can be calculated.
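
As a rough numerical illustration of the quality attribute prediction referred to in the scenario-based approach above, the sketch below assumes a simple linear model in which the predicted value is the weighted sum of per-scenario impact values; all numbers are invented and do not stem from the thesis work.

# Hedged sketch: predict a quality attribute value as the weighted sum of the
# impact measured for each profile scenario. All numbers are invented.
scenario_weights = [0.4, 0.2, 0.1, 0.3]   # normalized profile weights (sum to one)
scenario_impacts = [2.0, 8.0, 5.0, 1.0]   # e.g. estimated change effort per scenario

predicted_value = sum(w * i for w, i in zip(scenario_weights, scenario_impacts))
print(predicted_value)  # 0.4*2.0 + 0.2*8.0 + 0.1*5.0 + 0.3*1.0 = 3.2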

Figure 3.2: Relation between software architecture assessment and actual system behavior [29]

3.4 Attribute Driven Design (ADD)

ADD is a method developed by the Carnegie Mellon Software Engineering Institute to serve as an approach for software architecture design [29, 33, 16]. In ADD, a set of prioritized software quality attribute requirements is used as a base on which to design the software architecture. The software architecture is decomposed in an iterative manner, increasing the architecture's level of detail by identifying one or more components in each iteration (also referred to as refinement steps). It is up to the architect to decide on the order and kind of refinement steps in the decomposition process. Each iteration is dedicated to supporting a desired quality attribute for the system. For instance, if usability is selected as the quality attribute for a decomposition iteration, a pattern appropriate for usability is chosen. When the architect has finished defining an iteration's new components based on the selected quality attribute, the set of quality attribute scenarios is verified and updated for the following iteration. This iterative approach continues until the architect is satisfied with the complexity of the software architecture.

Rob Wojcik et al. [33] propose that the ADD method be divided into three categories that together make up the ADD cycle, the plan-do-check cycle shown in figure 3.3. They describe the three categories as sequential steps in the following way:

• Plan: Quality attributes and design constraints are considered to select which types of elements will be used in the architecture.

• Do: Elements are instantiated to satisfy quality attribute requirements as well as functional requirements.

• Check: The resulting design is analyzed to determine whether the requirements have been met, and the outcome informs the following iteration of the cycle.

Figure 3.3: The ADD Plan, Do and Check cycle [33]

3.5 Client-Server architectural pattern

Weir [30] describes the Client-Server architectural pattern as involving two roles, the Client and the Server. An architectural component, be it a process, an object or even a hardware device, can take on the role of client, server or both. The components taking on the roles communicate through an established network connection in which the client sends requests and the server sends responses back. Such a connection only lasts for as long as it takes for the request-to-response exchange to complete. Sulyman [27] extends this description by adding that the client application runs both the application code and the business logic. There are existing architectural design patterns that build on top of the client-server pattern, such as the three-tier architectural pattern, where the server side is split into an application server and a database server. Here, the client only contains presentation logic, resulting in fewer resources being needed on the client side.
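
A minimal sketch of the request-to-response exchange described above is given below; the address, port and message contents are made up, and the example is not part of the thesis implementation.

# Hypothetical client-server exchange over a TCP socket. The connection lasts
# only for one request-to-response exchange, as described above.
import socket
import threading

HOST, PORT = "127.0.0.1", 5050   # assumed local test address
ready = threading.Event()

def server() -> None:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((HOST, PORT))
        srv.listen()
        ready.set()                              # signal that the server is listening
        conn, _ = srv.accept()
        with conn:
            request = conn.recv(1024)            # receive the client's request
            conn.sendall(b"response to: " + request)
        # the connection is closed here, once the exchange is complete

def client() -> None:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
        cli.connect((HOST, PORT))
        cli.sendall(b"request")                  # the client sends a request...
        print(cli.recv(1024).decode())           # ...and waits for the response

if __name__ == "__main__":
    thread = threading.Thread(target=server, daemon=True)
    thread.start()
    ready.wait()
    client()
    thread.join(timeout=1)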

3.6 Three-tier architectural pattern

In accordance with the description presented by Marinescu [21] and Chen et al. [6], the three-tier software architecture design pattern consists of three tiers which can all be changed independently, i.e. if the structure or contents of one tier is changed, this will not affect the other tiers. The relation between the three tiers is illustrated in figure 3.4; a minimal code sketch of the separation is given after the figure.

• User Interface Tier: The User Interface Tier can in other words be seen as the client-side tier. Here, information is displayed to the user, enabling him/her to access the underlying services of the three-tier system. In order to achieve this, the tier enables the user to specify inputs for the system while presenting appropriate outputs. Commonly, the user interface tier consists of websites or smartphone applications, but it could in theory be any type of user-interactive software.

• Application Logic Tier: The Application Logic Tier connects the client in the User Interface Tier with the system's underlying functionality in the Database Tier. The Application Logic Tier can consist of one or several independent modules that together run on a designated server. The main task of this tier is to handle the application functionality, but it is commonly also used to filter what information is accessible to the users on the client side.

• Database Tier: The Database Tier is responsible for the system's data-related logic, such as the storing and accessing of data as well as the general optimization of the two. Data is inserted and presented to the User Interface Tier from the Database Tier, through the Application Logic Tier. The main responsibility of the Database Tier is independent data handling, in order to improve both the scalability and performance of the system.

Figure 3.4: Three-tier architecture model
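
The sketch below illustrates the tier separation described above; the class names, methods and the measurement example are hypothetical and do not correspond to the thesis implementation.

# Hypothetical three-tier separation: the user interface only talks to the
# application logic, which in turn is the only tier that touches the database.

class DatabaseTier:
    """Stores and retrieves data; knows nothing about the other tiers."""
    def __init__(self) -> None:
        self._rows: list[dict] = []

    def insert(self, row: dict) -> None:
        self._rows.append(row)

    def query_all(self) -> list[dict]:
        return list(self._rows)


class ApplicationLogicTier:
    """Connects clients with the database and filters what they may see."""
    def __init__(self, database: DatabaseTier) -> None:
        self._database = database

    def store_measurement(self, rssi_dbm: float, position: tuple[float, float]) -> None:
        self._database.insert({"rssi_dbm": rssi_dbm, "position": position})

    def visible_measurements(self) -> list[dict]:
        # example of application-level filtering of what reaches the client
        return [row for row in self._database.query_all() if row["rssi_dbm"] > -100]


class UserInterfaceTier:
    """Presentation only: forwards user input and displays the output."""
    def __init__(self, logic: ApplicationLogicTier) -> None:
        self._logic = logic

    def submit(self, rssi_dbm: float, x: float, y: float) -> None:
        self._logic.store_measurement(rssi_dbm, (x, y))

    def show(self) -> None:
        for row in self._logic.visible_measurements():
            print(row)


if __name__ == "__main__":
    ui = UserInterfaceTier(ApplicationLogicTier(DatabaseTier()))
    ui.submit(-70.0, 58.4, 15.6)   # e.g. a signal strength value at a local position
    ui.show()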

3.7 Architecture Trade-off Analysis Method (ATAM)

The Architecture Trade-off Analysis Method (ATAM) is an established approach to software architecture assessment that is commonly paired with ADD [16], as the two methods complement each other well. It is designed to help stakeholders detect potential conflicts in the software architecture design by analysing its behaviour towards a set of quality attributes [8]. ATAM is an iterative process that can be illustrated through a spiral representation of the sequential steps pertaining to each iteration, shown in the figure presented below. This figure, along with the following description of ATAM, is based on the description given by Kazman et al. in [17]. Kazman et al. note that the best architects use their experience and ”know-how” to guide them in their decision making when assessing an architecture; established assessment methods such as ATAM mainly exist to aid them in that decision making. It is assumed that several architectures are being compared with each other. They describe ATAM as containing six steps which in turn have been divided into four sequential phases, namely Scenario & Requirements Gathering, Architectural Views and Scenario Realization, Model Building & Analyses, and Trade-offs.
