CollMule: An Opportunistic Data Collection System for IoT-based Indoor Air Quality Monitoring


Aigerim Zhalgasbekova

Computer Science and Engineering, master's level 2017

Luleå University of Technology


PERCCOM Master Program

Master’s Thesis in

Pervasive Computing & COMmunications

for sustainable development

Aigerim Zhalgasbekova

COLLMULE: AN OPPORTUNISTIC DATA COLLECTION SYSTEM

FOR IOT-BASED INDOOR AIR QUALITY MONITORING

2017

Supervisors: Professor Arkady Zaslavsky (CSIRO)

Assistant Professor Saguna Saguna (Luleå University of Technology)

Assistant Professor Karan Mitra (Luleå University of Technology)

Dr. Prem P Jayaraman (Swinburne University of Technology)

Examiners: Professor Eric Rondeau (University of Lorraine)

Professor Jari Porras (Lappeenranta University of Technology)

Associate Professor Karl Andersson (Luleå University of Technology)


This thesis has been accepted by partner institutions of the consortium (cf. UDL-DAJ, no_1524, 2012 PERCCOM agreement).

Successful defense of this thesis is obligatory for graduation with the following national diplomas:

• Master in Complex Systems Engineering (University of Lorraine)

• Master of Science in Technology (Lappeenranta University of Technology)

• Degree of Master of Science (120 credits) – Major: Computer Science and Engineering, Specialisation: Pervasive Computing and Communications for Sustainable Development (Luleå University of Technology)


Luleå University of Technology

Department of Computer Science, Electrical and Space Engineering PERCCOM Master Program

Aigerim Zhalgasbekova

CollMule: An Opportunistic Data Collection System for IoT-based Indoor Air Quality Monitoring

Master’s Thesis

2017

86 pages, 35 figures, 17 tables.

Examiners: Professor Eric Rondeau (University of Lorraine)

Professor Jari Porras (Lappeenranta University of Technology)

Associate Professor Karl Andersson (Luleå University of Technology)

Keywords: Internet of Things (IoT), Opportunistic Sensing (OppS), Analytic Hierarchy Process (AHP), Indoor Air Quality Monitoring

Opportunistic sensing advances methods of IoT data collection by using the mobility of data mules, the proximity of transmitting sensor devices and cost efficiency to decide when, where, how and at what cost to collect IoT data and deliver it to a sink. This thesis proposes, develops, implements and evaluates a system and algorithm called CollMule, which builds on and extends the 3D kNN approach to discover, negotiate, collect and deliver sensed data in an energy- and cost-efficient manner. The developed CollMule software prototype uses the Android platform to handle indoor air quality data from heterogeneous IoT devices. The CollMule evaluation is based on the performing rate, power consumption and CPU usage of a single algorithm cycle. The outcomes of these experiments demonstrate the feasibility of using CollMule on mobile smart devices.


The last two years have certainly been the brightest of my life so far. I have acquired a huge amount of new knowledge and experience, and made many new friends. At the same time, these two years were tough, as I had to study harder than ever before. Fortunately, I have been surrounded by great people who have always been there when I needed their support.

I would like to thank the PERCCOM consortium for giving me the opportunity to be part of this master program. All the universities, professors, lessons, students, travels and experience gained during this program have significantly influenced my personal development.

I want to express my gratitude to Professor Arkady Zaslavsky, Doctor Saguna, Doctor Karan Mitra and Doctor Prem Prakash Jayaraman for their guidance, responses, help and patience during this work.

I am very grateful for the support of my research I got from LTU and CSIRO.

Many thanks to my parents, younger brother, and relatives for their priceless support in all my endeavours!

Special thanks to Tamara and Victor, who spent the last semester with me in Skellefteå working on our thesis projects. I really appreciate your support and the time we spent together during this semester.

PERCCOM happens once in a lifetime and gives such valuable gifts as incredible new friends from all over the world. Thank you, Olga Rybnytska, Nhi, Giang, Chandara, Atefe, Henrique, Manish, Emil, Valentin, Joseph, Felipe, Carlos, Mustaqim, Rafiul, Victor and Tamara for sharing wonderful moments with me during these two years!

Skellefteå, May 29, 2017


List of Figures

1 Smart city full of smart and sensing IoT devices.

2 Motivating Scenario.

3 DSRM Process Model [25].

4 The Internet of Things emergence between 2008 and 2009 [31].

5 The IoT evolution [32].

6 The elements of the IoT [37].

7 The MULEs three tier architecture [52].

8 a) energy consumption b) query latency comparisons of 3DkNN with KBT [66].

9 Sensor selection in a three dimensional area.

10 Illustration of periods involved in the calculation of the power consumption of our model.

11 Illustration of advertising period.

12 Illustration of connection period.

13 AHP model for CollMule.

14 System Topology.

15 CollMule System Architecture.

16 Mule Architecture.

17 Sensing IoT device layer Architecture.

18 The sensing set with Raspberry Pi a) 3; and b) 2.

19 Flow of the app’s process.

20 The meter used for measuring current during the advertisement and connection modes of a sensing IoT device.

21 BLE Advertising Packet Structure.

22 Local Name Structure.

23 Communication between mules and sensors.

24 Communication between mules and sensors during simultaneous connection.

25 Experimental setup on the second floor plan.

26 Computational Latency vs k size.

27 Collection Latency vs k size.

28 Power Consumption of the prototype with different k size of sensor set.


List of Tables

1 Related Works.

2 Random Index (RI) [86].

3 Metrics scaling.

4 Example values of sensor metrics.

5 Paired comparison matrix on the intermediary layer (among the factors).

6 Paired comparison matrix on the bottom layer (among alternatives with respect to the Distance).

7 Paired comparison matrix on the bottom layer (among alternatives with respect to the RSSI).

8 Paired comparison matrix on the bottom layer (among alternatives with respect to the power level).

9 Aggregated matrix composed of priority vectors of the bottom layer and the priority vector of intermediary layer.

10 List of Notations.

11 Information about used hardware.

12 Current consumed by sensing sets with Raspberry Pi 2 and 3.


ABBREVIATIONS AND SYMBOLS

LTU Luleå University of Technology

IoT Internet of Things

OppS Opportunistic Sensing

ICT Information and Communications Technologies

US EPA United States Environmental Protection Agency

WHO World Health Organization

EC European Commission

MEP Ministry of Environmental Protection

EPD Environmental Protection Department

IAQ Indoor Air Quality

RFID Radio-Frequency Identification

IBSG Internet Business Solutions Group

IERC European Research Cluster on the Internet of Things

WSN Wireless Sensor Network

OppN Opportunistic Networking

MCS Mobile Crowd Sensing

OSDC Opportunistic Sensing Data Collection

MULE Mobile Ubiquitous LAN Extensions

AP Access Point

kNN k Nearest Neighbours

MANET Mobile Ad-Hoc Network

MCDM Multiple Criteria Decision Making

MAUT Multiple Attribute Utility Theory

AHP Analytic Hierarchy Process


1 Introduction

This chapter introduces the main areas of this thesis, which are linked to the Internet of Things and opportunistic sensing paradigms. It presents the motivation, defines the aims of this research work and highlights the thesis contributions. It also includes a section on the sustainability aspects of this thesis.

1.1 Introduction

Modern technology develops rapidly and brings considerable changes to human life. Gradually, everything that surrounds people in their everyday life becomes smart, making life easier and more comfortable. Smart things that already exist include smart devices (e.g. smartphones, tablets, watches, TVs), smart transportation (e.g. cars, buses, trains) and smart buildings (e.g. houses, business centres, shopping malls). They serve a human user in an automatic and collaborative manner, and the user does not need to make much effort to use them. They communicate with each other, thereby creating a network that can also interact with the world. This is enabled by the Internet of Things (IoT), which interconnects different objects [1] such as smartphones, watches, cars and sensors. According to Gartner [2], the number of connected things in use worldwide will reach 8.4 billion this year (2017) (see Figure 1). This is an increase of more than 31% over 2016, and the number is expected to grow to 20.4 billion by 2020. This raises concern about the amount of energy consumed by these devices. However, the IoT can also become one of the crucial drivers of green Information and Communication Technologies (ICT) if there is an increased focus on energy efficiency [3]. According to GeSI’s SMARTer2030 report [4], ICT can help reduce global CO2 emissions by 20% by 2030. Therefore, academia, industry and government are interested in developing IoT technologies and systems that contribute to sustainable development.

In [3], the authors highlight green ICT principles that improve the energy efficiency of the IoT. One of these principles is minimizing the length of the wireless data path. This is especially important for IoT systems that gather and transfer enormous amounts of data over wireless connections. The principle can be realized using opportunistic sensing (OppS) [5, 6], which takes the opportunity to collect data about an environment from IoT devices (sensors) using mobile smart gadgets (e.g. smartphones, tablets) carried by people, without their direct involvement. Thus, there is no need for predefined infrastructure or additional hardware. Moreover, because the devices and sensors communicate directly, without intermediate hops, bandwidth usage and, consequently, energy consumption are reduced.


Figure 1. Smart city full of smart and sensing IoT devices.

The world’s rapid development has also brought negative consequences. One of them is air pollution. Knowledge of air quality is therefore vital, because it affects our well-being. Many people suffer from diseases caused by poor air quality, such as respiratory and cardiovascular diseases. Moreover, recent research states that the number of premature deaths caused by polluted air has reached 5.5 million a year [7].

Notably, people today spend, on average, about 90% of their time indoors, where pollutant levels can be from a few times to hundreds of times higher than outdoors [8, 9]. Hundreds of hazardous pollutants have been identified in the air of our living environment. Only six of them are considered the most common in our daily lives and are therefore well studied. These are carbon monoxide (CO), nitrogen dioxide (NO2), ground-level ozone, sulfur dioxide (SO2), particulate matter (PM) and lead (Pb) [10]. Research studies [11, 12, 13, 14, 15, 16, 17] provide comprehensive information about their effects on human health and the environment. Regulatory limits for these pollutants vary between governments and organizations in different countries. Thus, their standards are declared differently by the United States Environmental Protection Agency (US EPA), the World Health Organization (WHO), the European Commission (EC), the Chinese Ministry of Environmental Protection (MEP) and the Environmental Protection Department (EPD) of Hong Kong [18].

It is important to monitor indoor air quality (IAQ) in order to assess the danger to human health against the above-mentioned standards. IAQ monitoring is therefore widely studied in recent research. Moreover, it is broadly used as an application for deploying IoT systems such as [19, 20, 21, 22, 23]. These systems use sensors that detect the concentrations of the pollutants.

1.2 Research Motivation

Nowadays, smart buildings are being equipped with an increasing number of IoT devices. Most of these are powered by batteries with a strictly limited life, so they need to be recharged periodically. The energy consumption of this large number of IoT devices is a challenging issue. Extending the battery life of IoT devices requires reducing the energy they consume. For example, in IAQ monitoring the IoT devices have two primary functions that consume a large amount of power: (1) measuring air pollutant levels with sensors and (2) transferring data between the IoT device and a gateway. Both have been the subject of recent research. A solution for the former is to schedule a duty cycle for the devices, which may eliminate unnecessary process execution. A solution for the latter is to use energy-efficient data collection approaches such as OppS. OppS enables techniques whereby sensors negotiate with smart mobile devices passing by and upload data to them, so that the data is delivered in an energy-efficient manner. However, OppS approaches need to be extended further to enhance efficiency and prolong device life. Consider the following scenario.

Motivating Scenario. This scenario shows how an OppS approach can be applied to IAQ monitoring (see Figure 2). Let us assume that the monitoring system is deployed in the building of Luleå University of Technology (LTU) on Campus Skellefteå. Numerous IoT devices sensing different air characteristics are assumed to be situated in the building. Its users are the staff and students, who usually carry smart devices such as smartphones while moving within the building. Assume some of them suffer from a respiratory disease such as asthma. It is therefore vital for them to know the air quality in the rooms where they study or work.

For example, let us follow the possible path of Kyle, a student suffering from asthma, who changes location during a usual academic day. He has classes in different auditoriums, eats lunch in a student kitchen and rests somewhere during breaks between classes. During this time, his smartphone gathers data directly from the sensors around him. Kyle therefore always knows the air quality at his current location. Moreover, he can share the data gathered by his smartphone with his peers and professors, who can in turn help Kyle find out the air quality in other parts of the building. However, they may worry about their gadgets’ power level, because data gathering consumes a considerable amount of energy. This may discourage them from continuing to use the system. Therefore, consumption needs to be reduced in order to attract people to participate in data gathering.

Figure 2. Motivating Scenario.

1.3 Research Questions and Objectives

This section presents the questions and hypotheses that are addressed and sets objectives that should be achieved in this research.

1. What are the main challenges, research gaps and problems in the area of IoT and OppS that this research can address? Why is effective data collection important?

The first research challenge is to investigate current challenges in IoT and the approaches and technologies recently used for developing OppS systems consisting of smart mobile devices.

Investigate the state of the art in IoT and OppS.

2. How can data be collected from IoT devices in an efficient manner? What should be done to enhance the efficiency of data collection from IoT devices?

The next challenge is to propose and develop a new approach for gathering real-time data from IoT devices in an energy-efficient manner.

Develop an algorithm to collect data efficiently from heterogeneous sensing IoT devices situated in the environment, with a specific focus on built environments.

3. How effective is an approach proposed by this research?


IAQ monitoring system. Its performance is evaluated through conducting experiments on real life scenarios.

Evaluate the system implementing the algorithm in application to IAQ monitoring.

1.4 Research Contributions

Considering the aims defined in the previous section, the research contributions can be described as follows:

1. Existing challenges in IoT were identified to establish the relevance of this research. Recent solutions in OppS systems were analysed as a basis for this research.

2. An algorithm called CollMule [24] was proposed and developed for efficient sensor data collection using a mobile smart device. It discovers nearby sensors, evaluates them against different metrics, selects a set of the most cost-efficient ones and collects data from them. It thereby prevents unnecessary attempts to connect to and gather data from sensors that are inappropriate in terms of requirements such as distance, signal strength, power level and accuracy (the colours of the sensors in Figure 1 indicate different efficiency). This reduces the number of connections, which decreases overall energy consumption.

3. The CollMule system [24] implementing the proposed algorithm was deployed and tested in a real-life scenario. The obtained results were analysed to evaluate its performance.
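The discover, evaluate, select and collect idea in the second contribution can be illustrated with a minimal Python sketch. It is not the CollMule algorithm itself: the `Sensor` fields, the fixed `WEIGHTS` and the min-max normalisation are assumptions made only for illustration, and the sketch assumes that a shorter distance, a stronger signal and a higher remaining power level are preferable.

```python
from dataclasses import dataclass


@dataclass
class Sensor:
    name: str
    distance_m: float    # estimated distance from the mule, in metres
    rssi_dbm: float      # received signal strength in dBm (closer to 0 is stronger)
    power_level: float   # remaining battery, from 0.0 to 1.0


# Hypothetical weights chosen for illustration; they sum to 1.
WEIGHTS = {"distance": 0.5, "rssi": 0.3, "power": 0.2}


def normalise(values, higher_is_better):
    """Scale values to [0, 1] so that 1 is always the preferred end."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def select_sensors(sensors, k):
    """Return the k sensors with the highest weighted score."""
    if not sensors:
        return []
    dist = normalise([s.distance_m for s in sensors], higher_is_better=False)
    rssi = normalise([s.rssi_dbm for s in sensors], higher_is_better=True)
    power = normalise([s.power_level for s in sensors], higher_is_better=True)
    scored = []
    for s, d, r, p in zip(sensors, dist, rssi, power):
        score = (WEIGHTS["distance"] * d
                 + WEIGHTS["rssi"] * r
                 + WEIGHTS["power"] * p)
        scored.append((score, s))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:k]]


# Made-up discovery results for three nearby sensors.
discovered = [
    Sensor("A", distance_m=2.0, rssi_dbm=-50.0, power_level=0.9),
    Sensor("B", distance_m=10.0, rssi_dbm=-85.0, power_level=0.2),
    Sensor("C", distance_m=5.0, rssi_dbm=-60.0, power_level=0.6),
]
selected = select_sensors(discovered, k=2)
print([s.name for s in selected])
```

Only the selected sensors would then be contacted, so the far, weak and nearly depleted sensor B is never connected to.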

The outcome of this research makes it possible to reduce the energy consumption and, potentially, the data redundancy of sensor data collection systems. Thus, one of the main benefits that the energy- and cost-efficient system we developed brings to its users is a lower bill for consumed energy. In addition, our paper [24] was accepted to the 17th NEW2AN/ruSMART conference.

1.5 Research Methodology

This thesis follows the Design Science Research Methodology (DSRM) [25], which represents the research process as an iteration of six steps, illustrated in Figure 3: problem identification and motivation; definition of the objectives for a solution; design and development; demonstration of the product; evaluation; and communication through publishing the results.

Figure 3. DSRM Process Model [25].

The steps are defined in this research as follows:

1. Identifying and highlighting the main challenges in the IoT and OppS areas, and studying existing approaches that can be incorporated into the CollMule system.

2. Defining objectives for the development of the CollMule system.

3. Designing and developing the CollMule algorithm and system to address the identified research problems.

4. Implementing the CollMule system for IAQ monitoring and demonstrating it.

5. Evaluating the prototype by using it to collect sensed data from real IoT devices, and identifying possible shortcomings. The process then returns to the third step for the next iteration.

To evaluate the prototype, the performing rate, power consumption and CPU usage of a single algorithm cycle are measured. In addition, the power and CPU utilization values of a commercial BLE connection application are provided for comparison. The CollMule system is also compared to a simple system in which sensors broadcast messages that are caught opportunistically by a mule.

6. Publishing the results of the research.

Our paper [24] will appear in the proceedings of the 17th NEW2AN/ruSMART, St. Petersburg, Russia, August 28-30, 2017.

1.6 Sustainability

Since the PERCCOM program [26, 27] is concerned with ICT for sustainable development, it is important to highlight the sustainability aspects of our research work. In this section, we define the term sustainability and then explain how our work contributes to it.

The modern concept of "sustainable development" is rooted in the Brundtland Report [28], which states that development is sustainable if "it meets the needs of the present without compromising the ability of future generations to meet their own needs" [29]. Today, however, the concept is oriented more towards the economic, social and environmental aspects of modern development. Accordingly, [30] has suggested that "the term ’sustainability’ should be viewed as humanity’s target goal of human-ecosystem equilibrium (homeostasis), while ’sustainable development’ refers to the holistic approach and temporal processes that lead us to the end point of sustainability." Consequently, a three-pillar, or three-dimensional, approach has emerged to describe it. This approach is also used to analyse the role of new technologies in sustainable development, and we describe the contribution of this thesis in terms of these three pillars.

Our project brings benefits to all three dimensions of sustainability:

• Ecological: it directly relates to reducing CO2 emissions, as it provides an efficient approach to sensor data collection that can potentially decrease the energy consumption of IoT systems. In addition, this reduction extends battery life, thereby decreasing waste and the number of batteries that need to be recycled.

• Social: because power usage is reduced, users need to recharge their devices less often. Also, users do not need to be concerned about the device memory taken up by data collection, because it is reduced by gathering data only from sensors that produce appropriate data.

• Economic: our work decreases energy consumption and therefore reduces expenditure on natural resources, which are economic assets. Furthermore, it enhances quality of life through IAQ monitoring.


1.7 Thesis Outline

This section briefly introduces the following chapters of this thesis report.

• Chapter 2 provides a literature review of works in the field of the Internet of Things and Opportunistic Sensing in order to reveal current challenges.

• Chapter 3 presents the new approach to sensor data collection proposed by this thesis.

• Chapter 4 describes the architecture of the proposed system and its components.

• Chapter 5 gives a detailed description of the system implementation and of the conducted experiments, whose results are also analysed there. In addition, it discusses the complications and limitations of the implementation.

• Chapter 6 concludes the thesis and discusses future work that can be carried out to improve the proposed system.


2 Background and Related Work

The previous chapter introduced the research area of this thesis, which includes the Internet of Things (IoT) and Opportunistic Sensing (OppS) paradigms. This chapter presents them in more detail. The IoT section defines the paradigm and reviews its elements and applications. The OppS section introduces the emergence of this paradigm, its definition and its challenges. Moreover, recently developed systems that address OppS issues are reviewed in order to identify a problem that has not yet been addressed. Finally, the section discusses techniques and approaches that can be used to tackle the identified problem.

2.1 Internet of Things (IoT)

Internet of Things (IoT) plays a huge role in the modern world. It is a major enabling technology for future smart cities. In this section, we review this paradigm and its components.

2.1.1 IoT definition

The Internet of Things (IoT), or Internet of Objects [31], was proposed as a concept by Kevin Ashton in 1999. He used the term to refer to objects interconnected using radio-frequency identification (RFID) technology [32]. However, there is still no exact definition of the IoT paradigm today. Many visions have been presented over the years; some of them are discussed below. For instance, in 2011 the Cisco (www.cisco.com) Internet Business Solutions Group (IBSG) presented it as the point in time when the number of "things or objects" connected to the Internet exceeded the world population. IBSG estimates that this happened between 2008 and 2009 [31] (see Figure 4). The European Research Cluster on the Internet of Things (IERC) generally defines the IoT as a "dynamic global network infrastructure with self-configuring capabilities based on standard and interoperable communication protocols where physical and virtual "things" have identities, physical attributes, and virtual personalities and use intelligent interfaces, and are seamlessly integrated into the information network" [33]. The definition of "things" changes with the evolution of the technology, as depicted in Figure 5.

The authors of [1] highlight three perspectives of IoT visions:


Figure 4. The Internet of Things emergence between 2008 and 2009 [31].

Figure 5. The IoT evolution [32].

• Things-oriented - focused on the "objects" and on approaches for their identification and integration;

• Internet-oriented - adapting IP to make anything addressable and reachable from anywhere;

• Semantics-oriented - utilizing semantic technologies for managing, storing and representing data in the continuously growing network of things.

They state that the IoT is a combination of these visions [1, 34].

To summarise, the IoT implies an infrastructure of smart devices (e.g. smartphones, tablets, laptops) connected to a world-wide network, where they communicate and exchange data with other "objects". Moreover, it makes it possible to sense and remotely control "objects" such as sensors [1, 35], and to process received information without human intervention [36].

2.1.2 IoT elements

IoT systems do not have a standardized list of components. However, [37] defines six elements that deliver the IoT functionality (see Figure 6):

Figure 6. The elements of the IoT [37].

• Identification within the IoT paradigm implies naming (setting an ID and address) and matching services with their demand;

• Sensing means collecting data from network objects (smart sensors, actuators, wearable devices) and sending it to a central sink such as a database or cloud;

• Communication includes low-power technologies that link all objects together, such as WiFi, Bluetooth, Z-Wave and LTE-Advanced;

• Computation represents processing units (e.g. microprocessors and microcontrollers such as Arduino, Raspberry Pi, Intel Galileo) and software platforms (e.g. Contiki and TinyOS) that provide the computational ability of "things";

• Services perform some work for different applications;

• Semantics refers to the ability to extract knowledge (recognition and analysis of data) intelligently using different machines [37].

The first three elements are also highlighted in [1] as a class of enabling technologies. [36] also emphasizes five other IoT components from a more high-level perspective:

• Wireless Sensor Networks (WSNs),

• addressing schemes,

• data storage and analytics,

• visualization.

Indeed, the first four technologies provide features of the communication, sensing, identification and semantics elements described above. Notably, WSNs play a crucial role in the IoT paradigm, as they enable a wide range of applications [1, 38] through their capability to sense different environmental phenomena. However, WSNs within the IoT paradigm differ from conventional ones in that they use smarter devices instead of simple sensors limited to sensing. These devices have computational ability and consist of several different sensors. Therefore, they can make certain decisions autonomously and provide different services [39]. In the rest of this thesis, they are called sensing IoT devices.

Thus, the elements defined by [37] can be considered as basic components of the IoT systems. Further, this thesis takes into account these components during development of the proposed system.

2.1.3 IoT applications

The wide capabilities of the IoT make it possible to apply it in all spheres of human life in order to enhance quality of life. The major areas where the IoT is currently being integrated are smart homes/buildings, smart cities, environmental monitoring, healthcare, smart business/inventory and product management, and security and surveillance [40]. The general application scenario involves a system in which smart devices and sensors taking different measurements communicate with each other and with a central station via RF technologies such as WiFi, Bluetooth Low Energy (BLE) and ZigBee. The station collects and analyses data from them. Based on the analysis, it can make decisions and notify the system to perform some action. The system therefore needs minimal human involvement in data processing. There are still many fields where the IoT can be introduced to contribute to their improvement and automation. We have chosen to apply our system to indoor air quality monitoring.
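As a rough illustration of the general scenario above, the following Python sketch shows a central station that analyses collected readings and derives an action without human involvement. The room names, the readings, the `"ventilate"` action and `PLACEHOLDER_LIMIT` are all hypothetical; the limit is not a regulatory value and the concentrations are in arbitrary units.

```python
from statistics import mean

# Placeholder threshold for illustration only; not a regulatory limit.
PLACEHOLDER_LIMIT = 10.0


def analyse(readings, limit=PLACEHOLDER_LIMIT):
    """Average the readings of each room and decide on an action per room."""
    decisions = {}
    for room, values in readings.items():
        average = mean(values)
        # The station notifies the system only when the average exceeds the limit.
        decisions[room] = "ventilate" if average > limit else "ok"
    return decisions


# Made-up pollutant concentrations gathered from sensors in two rooms.
collected = {
    "room_a": [4.0, 5.0, 6.0],
    "room_b": [11.0, 13.0],
}
decisions = analyse(collected)
print(decisions)
```

A real deployment would replace the single placeholder threshold with the per-pollutant limits of the chosen standard.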


2.2 Opportunistic Sensing

Networking within the IoT can be divided into infrastructure-based and opportunistic types. The former uses a fixed network topology with centralized data management. In contrast, the latter represents infrastructure-free, decentralized ad hoc networks [41]. The term opportunistic networking (OppN) first emerged in ad hoc networks. It allows communication among nodes that do not have a pre-established link between them and that might even be connected to different networks. In OppN there is therefore no need to know the network topology. However, the price of such networks is an extra delay in packet delivery, incurred while a path towards the destination is constructed. Nevertheless, OppN can be exploited in a wide range of delay-tolerant applications. For example, it can provide a connection to the global network in rural and developing areas without existing networks [42]. Moreover, ad hoc connections can easily be established between IoT devices using short-range radio technologies such as Bluetooth, WiFi and NFC to share information among them, as in the example described by [41].

The idea of OppN later evolved into the opportunistic sensing (OppS) paradigm in the era of WSNs and the IoT. The main reason for this evolution is the possibility of collecting data about an environment without any pre-defined infrastructure. For instance, OppS can connect IoT devices to the global network in rural and developing areas. It is also suitable for smart cities in cases where real-time communication is not needed. Even though there is a delay in data transmission, the OppS paradigm suits IoT systems for data exchange between IoT devices and platforms where immediate delivery is not necessary. For example, a full trash can is not an urgent situation, as there are other cans available nearby, so the data can be reported to the sanitation department with a reasonably long delay [43].

OppS is also popularly known as crowdsensing, community sensing or Mobile Crowd Sens-ing (MCS) [5, 44, 6]. The former term refers to the use of advantage brought by the crowd to gather necessary sensor data from available mobile and wearable devices [6]. In fact, the next two terms have the same meaning as crowdsensing. Community sensing is subdivided to par-ticipatory and opportunistic sensing. The first one requires direct involvement of users into the data collection process. On the other hand, OppS is an almost autonomous process where deci-sions about participation in sensing are made by mobile devices with minimal user involvement [5, 6]. Such decisions are based on the situation defined by different conditions like sensing task requirements and user context [6]. The MCS is just a broad name for community sensing paradigms coined in [5]. An increase in a number of smart devices carried and wearable by people brings an opportunity to use the power of the crowd in order to explore an environment.


This became possible because these devices contain a diversity of sensing, computing and communication features. Thereby, they can themselves produce data about the environment that can be collected for further exploitation. Moreover, they can act as a bridge linking other objects to the global network [5]. Thus, crowdsensing can cover a wide monitoring area in densely populated areas, which can reduce the amount of hardware and the energy consumption.

Currently, there is no standard definition of OppS, although many research papers such as [44, 6, 45, 46] have discussed it in recent years. A short and simple definition is provided in [47]: OppS "is seen as a way to gather information about the physical world in the absence of a stable and permanent networking infrastructure." That means data collection within a network without any pre-defined topology, which moreover might change dynamically. Further, OppS has been broadly defined in [48], revealing its functions, as "a paradigm for signal and information processing in which a network of sensing systems can automatically discover and select sensor platforms based on an operational scenario, determine the appropriate set of features and optimal means for data collection based on these features, obtain missing information by querying resources available, and use appropriate methods to fuse the data, resulting in an adaptive network that automatically finds scenario-dependent, objective-driven opportunities with optimised performance." The fundamental steps involved in OppS procedures are data collection, storage, and upload [49]. Our study is mainly focused on the first of these, data collection. The paper [50] highlights the main features of OppS: sensing, transmission, analysis of big sensory data and decision making. OppS systems face challenges such as coverage [51], sensing quality [50, 49], resource cost, privacy, security and data integrity [49].

In the following subsections, we describe existing data collection strategies and projects that try to tackle the problems discussed above.

2.2.1 Data Collection Approaches

Most of the Opportunistic Sensing Data Collection (OSDC) frameworks follow one common architecture, the three-tier MULE (Mobile Ubiquitous LAN Extensions) architecture [52] (see Figure 7). It is an abstraction of three layers that can be adapted to different scenarios.

• The top tier represents the devices providing access to the global network. This role can be played by access points (APs) deployed in convenient locations with network connectivity and a power supply. They allow the data collected by the middle tier to be synchronised with the central data station.

• The middle tier is composed of moving agents called mule nodes: mobile devices that have a larger storage capacity than the sensor nodes and a short-range wireless radio to communicate with the sensors and the networked APs.

• The bottom tier consists of static wireless sensor nodes which are randomly distributed within an area.

Figure 7. The MULEs three tier architecture [52].
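The flow of data through the three tiers can be illustrated with a minimal sketch. The class and method names below (SensorNode, Mule, AccessPoint, collect, sync) are hypothetical and chosen only for illustration; they are not taken from [52].

```python
from dataclasses import dataclass, field


@dataclass
class SensorNode:
    """Bottom tier: a static sensor that buffers its own readings."""
    node_id: str
    readings: list = field(default_factory=list)

    def sense(self, value):
        self.readings.append(value)

    def drain(self):
        """Hand over all buffered readings and clear the local buffer."""
        data, self.readings = self.readings, []
        return data


@dataclass
class AccessPoint:
    """Top tier: a networked device that forwards data to the central store."""
    store: dict = field(default_factory=dict)

    def upload(self, buffered):
        for node_id, values in buffered.items():
            self.store.setdefault(node_id, []).extend(values)


@dataclass
class Mule:
    """Middle tier: a mobile device with a larger buffer than the sensors."""
    buffer: dict = field(default_factory=dict)

    def collect(self, sensor):
        # Called when the mule comes within radio range of a sensor.
        values = sensor.drain()
        if values:
            self.buffer.setdefault(sensor.node_id, []).extend(values)

    def sync(self, access_point):
        # Called when the mule comes within range of an access point.
        access_point.upload(self.buffer)
        self.buffer = {}


sensor = SensorNode("s1")
sensor.sense(21.5)
sensor.sense(21.7)
mule, ap = Mule(), AccessPoint()
mule.collect(sensor)  # opportunistic contact with the sensor
mule.sync(ap)         # later contact with an access point
```

In this sketch the sensor never talks to the access point directly; the data reaches the central store only because the mule happens to meet both.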

Several tiers of this abstract architecture can be merged into one device. For instance, mobile smart devices with LTE can act as both the top and the intermediary tiers. Moreover, for modern network topologies, we can interpret the top tier as central stations such as servers and clouds. This architecture corresponds to the opportunistic sensing idea of collecting data without a defined network infrastructure. Notably, data mule systems are considered an energy-conserving mobility-based scheme in WSNs [53]. The system architectures of the OSDC projects presented in Table 1 relate to the MULEs three-tier architecture.

Many research works have addressed the OppS challenges mentioned previously; we review some of the recent ones. These projects use mobile devices to collect data opportunistically. For instance, the authors of [54] evaluate two BLE approaches using smartphones for OSDC: advertisement-based and connection-based. In the former approach, a sensor node broadcasts packets that include information about itself and fragments of the sensed data, while in the latter, these packets contain only sensor information. Such messages are called advertisements in both approaches. In the connection-based method, the data is transferred after a connection has been established. Moreover, this paper proposes analytical models of current consumption, sensor node lifetime and the maximum amount of collected data. They are based on the real behaviour of BLE devices. S. Aguilar et al. identified that the sensor node lifetime for the advertisement-based approach is four times longer than for the connection-based one. However, the amount of data that can be transmitted during a contact in the former method is limited to less than 42 kB. Moreover, in terms of energy consumption per collected bit, the connection-based approach is more efficient than the advertisement-based one. The project does not implement an OSDC system in a real-world scenario; the approaches are evaluated using the proposed theoretical models.

Table 1. Related Works.

| Project names | Location | Communication Technologies Used | Data Gathering Device | Problems Addressed | Algorithm for Data Collection | System Performance Evaluation |
|---|---|---|---|---|---|---|
| Opportunistic Sensor Data Collection with Bluetooth Low Energy [54] | outdoor | BLE | smartphones | power consumption estimation | no | via theoretical calculation of the current utilization |
| Data Collection Algorithm Based on the Sampling Frequency (DC-BSF) [55] | outdoor | not specified | mobile sensing devices | coverage effectiveness, data redundancy | region division algorithm, estimates how frequently to gather data | via estimation of coverage rate and data redundancy |
| WiFi-Amber [46] | outdoor/indoor | WiFi Direct/Hotspot | smartphones | privacy, security, user resource utilization and sensing effectiveness | no | N/A |
| SCmules [43] | outdoor | not specified | mobile smart devices | data value, memory utilization | no, however it weighs gathered data to detect less valuable data to discard | via measuring collection rate with respect to storage limit, time and number of data centres |
| SenseMyCity [45, 56] | outdoor | Bluetooth, WiFi and GPS | smartphones | data upload, urban movement detection, privacy and security, participant engagement | no | via measuring number of days using the SenseMyCity application, number of sessions gathered per participant, day time, week day |
| Gateway [57, 58] | outdoor | WiFi, Bluetooth, ZigBee, 3G/LTE | smartphones | integration, harmonization and interoperability of different communication protocols and standards | no | via measuring CPU, memory, battery usage |
| CollMule | indoor/outdoor | short range radio | mobile smart devices | resource cost | algorithm selecting a set of most cost-efficient neighbors | via measuring latency, energy consumption and CPU usage |

Y. Ma et al. [55] aim to increase network coverage and decrease data redundancy in an application where mobile sensing devices are embedded into taxis to collect data about the environment. In such networks the nodes are not uniformly distributed, which causes poor coverage in sparse regions and data redundancy in regions where taxis drive frequently (e.g. shopping malls, airports and train stations). The approach is also intended to improve resource consumption by reducing the amount of data to collect. The authors propose a data collection method that uses two algorithms they developed: the Region Division Algorithm (R-DA) and the Data Collection Algorithm Based on the Sampling Frequency (DC-BSF). The former divides the entire area of interest into three regions, grading them by the density of vehicle trajectories. The density is calculated by counting the number of trajectories in each grid cell (the smallest unit of the area) using pre-sampled data. To identify the rate of similarity, they use the traditional K-means clustering algorithm. For the second algorithm, the authors of [55] assume that each sensor in their network stores a dynamically changing information table with its current location, sampling frequency and the data to collect and transmit. Thereby, a sensor checks the region and the sampling interval grade of its location. Then it exchanges data with other sensors in the region and checks the content of the current grid data. The paper [55] simulates the network to verify the performance of the latter algorithm. It measures the coverage rate and data redundancy as a function of the sampling frequency coefficient. The results show that DC-BSF significantly outperforms the alternative algorithm in data redundancy, while the difference in coverage rate is negligible.
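As a rough illustration of the region grading step, the sketch below counts trajectory points per grid cell and clusters the counts into three density grades with a one-dimensional K-means. The function names, the use of individual points instead of whole trajectories and the choice of initial centres are assumptions made here for brevity; [55] may apply K-means differently.

```python
from collections import Counter


def count_per_cell(points, cell_size):
    """Count trajectory points falling into each square grid cell."""
    return Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)


def grade_cells(counts, iterations=20):
    """Cluster per-cell counts into three grades (0 = sparse, 2 = dense)."""
    if not counts:
        return {}
    values = sorted(counts.values())
    # Assumed initialisation: smallest, median and largest count.
    centres = [values[0], values[len(values) // 2], values[-1]]
    for _ in range(iterations):
        groups = [[], [], []]
        for v in values:
            nearest = min(range(3), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        # Keep the old centre when a cluster receives no values.
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    order = sorted(range(3), key=lambda i: centres[i])
    grade_of = {cluster: grade for grade, cluster in enumerate(order)}
    return {
        cell: grade_of[min(range(3), key=lambda i: abs(n - centres[i]))]
        for cell, n in counts.items()
    }


cell_counts = count_per_cell([(0.5, 0.5), (1.5, 0.2), (1.7, 0.9)], cell_size=1.0)
grades = grade_cells({(0, 0): 1, (0, 1): 2, (1, 0): 10, (1, 1): 11, (2, 2): 50})
```

A dense cell would then be sampled less often than a sparse one, which is the intuition behind reducing redundancy without losing coverage.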

Another framework is WiFi Amber [46], a missing object tracking system involving

• a device with WiFi communication, such as a WiFi beacon, that plays the role of a tag attached to an object;

• smartphone apps for reporting missing objects and tracking them (on the devices of users who request a missing object and of users who volunteer to track it);

• and a cloud server for managing the system.


receives a signal from the cloud server. WiFi Amber uses WiFi Direct to achieve efficient sensing coverage. To enhance energy efficiency, it applies the WiFi "probe request" and "probe response" mechanisms based on a particular Service Set Identifier (SSID). Thereby, when the volunteer app receives a request from the server, it broadcasts a probe request with the specified SSID of a missing object and waits for a response from a tag with a matching identity. In addition, WiFi Amber proposes a cryptographic key system to preserve privacy and make the tracking secure.

The SWDCP-SCmule scheme [43] (Social Welfare Data Collection Paradigm based on Storage-Constrained Oblivious Data Mules) proposes a system where mobile smart devices called mules collect data from nearby sensing devices embedded in the Smart City infrastructure and carry it until they have access to a data centre. The data is then uploaded to the data centre, which notifies the responsible department to take appropriate action. The project addresses the memory limitation of the mules and data redundancy with respect to the data centres. To deal with the former, the authors propose weighting the collected data: they assign unique priorities to the smart devices, forming a priority table. Using it, SCmules identify which data needs to be stored or discarded based on a greedy principle. Moreover, the system also assigns priorities to the mules in order to minimize the amount of similar data uploaded to the data centres. The priority assignment is made using the Simulated Annealing for Priority Assignment algorithm (SA-PA) proposed in [43]. It sets the priorities automatically, taking into account the optimization targets and the SCmule's location, which is learned from its past trajectory using a metaheuristic method. The performance evaluation in [43] shows, through experiments on datasets with three different optimization targets, that the proposed algorithm assigns priorities properly; the degree of enhancement it provides is compared with respect to the storage limit, time and the number of data centres in the Smart City. Moreover, a comparison of SCmules with mules without a storage constraint shows that the good priority assignment obtained by SA-PA diminishes the negative influence of the storage constraint. Consequently, the use of this algorithm can reduce the redundancy rate, thereby increasing energy efficiency and network lifetime.
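The greedy store-or-discard decision on a storage-constrained mule can be sketched as follows. This is only an illustration of the general idea under stated assumptions: the class name, the priority table contents and the rule "evict the lowest-priority item when a higher-priority one arrives" are hypothetical and are not taken from [43].

```python
import heapq


class StorageConstrainedMule:
    """A mule that keeps at most `capacity` items, preferring high priorities."""

    def __init__(self, capacity, priority_table):
        self.capacity = capacity
        self.priority = priority_table  # {device_id: priority}, higher is more valuable
        self._heap = []                 # min-heap of (priority, seq, device_id, payload)
        self._seq = 0                   # tie-breaker so payloads are never compared

    def offer(self, device_id, payload):
        """Try to store one item; return True if it was kept."""
        item = (self.priority[device_id], self._seq, device_id, payload)
        self._seq += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, item)
            return True
        if item[0] > self._heap[0][0]:
            # Greedy step: replace the least valuable stored item.
            heapq.heapreplace(self._heap, item)
            return True
        return False

    def stored(self):
        return sorted((dev, data) for _, _, dev, data in self._heap)


scmule = StorageConstrainedMule(2, {"bin": 1, "air": 3, "fire": 5})
kept = [
    scmule.offer("bin", "b1"),
    scmule.offer("air", "a1"),
    scmule.offer("fire", "f1"),  # storage is full, so the "bin" item is evicted
    scmule.offer("bin", "b2"),   # lower priority than everything stored
]
```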

SenseMyCity is an opportunistic MCS tool for exploring urban life processes [45, 56] that has been running since 2011. The system consists of devices that sense the surrounding environment, devices that gather the information from them via Bluetooth, and a server to which all data is forwarded [56] opportunistically via WiFi. The project intends to provide an energy-efficient, private and secure data collection framework with minimal user interaction. However, the performance of this system with respect to energy efficiency seems to be lower than that of the previous system, because it only uses an opportunistic technique to gather information. Here, the problem of collecting data from cheap and inaccurate sensors, which most OppS platforms face, is solved at a data processing stage on the server side, while the problem of data similarity is not considered at all. That leads to inefficient use of bandwidth and, consequently, battery power. Nevertheless, in the recent paper [59] the creators of SenseMyCity confirm that data collection and storage need to be optimized.

The authors of [57, 58] propose a smartphone-centric system called Gateway. The role of a gateway between the sensing technologies and the Internet and Cloud belongs to a smartphone application that is able to interact with surrounding technologies using multiple communication standards, opportunistically gather and distribute data, and dynamically provide services. This framework aims to enable interoperability between devices that have different communication interfaces such as Bluetooth, WiFi, 3G/LTE and ZigBee. It does not follow any algorithm to collect data; it simply connects to the nearby devices. The performance of the software application is evaluated by measuring CPU, memory and battery utilization on different smartphones in low- and high-load scenarios. The evaluation reveals energy consumption limitations. The authors suppose that the development of more efficient radio interface hardware and batteries will be a solution. However, they have not considered many factors that affect performance in terms of power consumption, such as bandwidth usage and data redundancy. Like the previous project, it does not consider using efficient methods for gathering data.

The research projects described above are data collection frameworks that use the OppS principle, which means connecting to and gathering data from nearby devices in networks without any pre-defined topology. All of them aim to cope with different challenges that OppS systems face, such as storage and energy constraints, coverage, privacy, and security. We have noticed that, in addressing the first two problems, they mostly try to reduce the amount of redundant data in order to decrease memory usage on the gathering devices and bandwidth usage while transferring this data to the server or cloud. However, they do not consider that communication between the sensing and gathering devices also consumes a significant amount of energy. We believe that the effectiveness of data collection can be enhanced at this stage. For this purpose, the mobile smart device should gather data selectively, only from cost-efficient sensing devices. Thereby, the number of needless connections and the amount of redundant data can be diminished. To the best of our knowledge, there are no projects that offer something similar for OppS in IoT.

In the next subsection, we consider different techniques that can be used to solve the identified challenge.


2.2.2 Techniques for cost-efficient sensors identification

Our intention is to develop an efficient approach for data collection from sensors using data mules. We suppose that the sensors whose data should be collected must be cost-efficient with respect to the data collector. This means that the collector does not waste its energy on trying to gather data from sensors that are too far away to connect to or are not reliable. Thereby, considering that the mule moves in space and time, it needs to determine a set of sensors that meets a certain level of reliability at that moment. Thus, the task can be related to a similarity problem, as the sensors must meet the same requirements. There is an existing approach for searching for the objects most similar (closest) to a Point of Interest (POI), called the k nearest neighbour (kNN) query. It is widely used in WSNs for performance optimization aimed at reducing the power consumption of sensors, their vulnerability to failure and the variation of their availability [60]. A brief description of the kNN query and its processing techniques is given here.

kNN queries. Initially, the kNN query was used in database applications for efficient similarity search [61, 62]. In WSNs it has also been found useful for energy-efficient aggregation of sensor data. A kNN query discovers the k sensors closest to a POI called the query point [60]. In general, kNN query processing approaches can be classified as 1) infrastructure-based and 2) infrastructure-free. The former rely on a network infrastructure for query propagation and processing. The infrastructure-free class does not depend on any pre-established infrastructure; instead, it uses well-designed routes to collect data [63].
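In its simplest centralised form, a kNN query reduces to picking the k sensors with the smallest distance to the query point. The following minimal sketch assumes that all sensor positions are known in one place, which is a simplification compared to the in-network processing techniques discussed in this section; the function name and data layout are illustrative only.

```python
import heapq
import math


def knn_query(query_point, sensors, k):
    """Return the ids of the k sensors closest to query_point.

    sensors maps a sensor id to its coordinates (same dimension as query_point).
    """
    nearest = heapq.nsmallest(
        k, sensors.items(), key=lambda item: math.dist(query_point, item[1])
    )
    return [sensor_id for sensor_id, _ in nearest]


positions = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (1.0, 1.0), "d": (10.0, 10.0)}
closest_two = knn_query((0.0, 0.0), positions, k=2)
```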

1. Routing phase;

2. kNN boundary estimation phase;

3. Query dissemination and data collection phase [64]

Below, we discuss some of the methods whose main phases are summarised above.

Since the previous decade, kNN has been actively proposed in many research studies as an approach for determining the cost-efficient sensors in a WSN. Here, we discuss some of these works [65, 66, 67]. For instance, Winter et al. [65] propose two partially infrastructure-based, two-dimensional kNN algorithms, namely the Geo Routing Tree (GRT) and the kNN Boundary Tree (KBT), to identify energy-efficient sensors. The former offers a good trade-off between energy and query accuracy; however, it is not suitable for dynamic networks and thereby not for modern WSNs. At the same time, KBT achieves far better query latency and energy consumption in dynamic networks. Moreover, it consists of three phases which include different approaches. Two energy-efficient techniques are proposed in [65]: the single root (KBT SR) and the perimeter tree (KBT PT), which use "TreeHeight". Unfortunately, the drawback of KBT SR is that poorly set timers can either reduce the accuracy of the kNN results or unnecessarily increase the query latency. Although KBT PT attempts to balance the tree to improve query accuracy, since the timers are set based on a fixed estimate of the height of the tree, it still does not solve the issues of short lifetime and low accuracy.

The 3DkNN algorithm [66, 68, 69] is based on a data mule and operates over a three-dimensional (3D) sensor space, unlike most kNN algorithms, which deal with two-dimensional (2D) spaces. Further, the main difference of the 3DkNN algorithm is that the data mule is both the point of query origination and the point of interest, in contrast to the algorithms described above, which use the base station (sink) as the central point for query origination, from which the query reaches a sensor close to the point of interest. The mobile data mule employs kNN to select a subset of sensor nodes around it [66]. The cost of collecting data from this subset is minimal compared to any other subset of nodes around the data mule. The novelty of 3DkNN lies in the dynamic computation of a k-size set of cost-efficient nearest sensors around the mobile data mule using KNN-METRIC. It allows considering metrics that indicate a sensor's efficiency, such as Signal to Noise Ratio (SNR) and distance.

\[
\text{KNN-METRIC} = c \, \frac{\alpha \times \mathit{SNR}}{\beta \times \mathit{Distance}(D)} \tag{1}
\]

where c is constant, α and β are pre-assigned weights obtained experimentally.
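Equation (1) can be turned into a small ranking routine. The sketch below is not the implementation from [66]; the function names, the default values of c, α and β, and the assumption that a larger KNN-METRIC indicates a more cost-efficient sensor (higher SNR, shorter distance) are ours.

```python
def knn_metric(snr, distance, c=1.0, alpha=1.0, beta=1.0):
    """KNN-METRIC = c * (alpha * SNR) / (beta * Distance), as in Equation (1)."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return c * (alpha * snr) / (beta * distance)


def select_k(sensors, k, **weights):
    """Return the ids of the k sensors with the highest KNN-METRIC.

    sensors maps a sensor id to a (snr, distance) pair.
    """
    ranked = sorted(
        sensors,
        key=lambda sensor_id: knn_metric(*sensors[sensor_id], **weights),
        reverse=True,
    )
    return ranked[:k]


observed = {
    "s1": (20.0, 2.0),   # metric 10.0
    "s2": (9.0, 1.0),    # metric 9.0
    "s3": (30.0, 10.0),  # metric 3.0
    "s4": (25.0, 2.0),   # metric 12.5
}
best_two = select_k(observed, k=2)
```

Note that sensor s3 has the highest SNR but is ranked last, because its distance to the mule dominates the ratio.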

The algorithm was simulated using Cooja and Contiki and evaluated in terms of query latency and energy consumption. Moreover, its performance was compared with the KBT algorithm discussed above. According to the results in [66], 3DkNN outperforms KBT in terms of energy consumption and query latency (Figure 8).

Figure 8. a) energy consumption b) query latency comparisons of 3DkNN with KBT [66].


Y. Komai et al. [67] propose two beaconless methods (that is, methods without broadcast messages containing information about the device forwarding them) for processing kNN queries in Mobile Ad-Hoc Networks (MANETs). They use a three-way handshake transmission to identify the query. First, the query-issuing node broadcasts a kNN query to identify the node closest to it, which becomes its global coordinator. Second, this node gathers information about other nodes in the network following one of two approaches: Explosion (EXP) or Spiral (SPI). In the former, the coordinator broadcasts a query request within a particular area defined by the density of nodes in the entire network. The receiving nodes reply with information about themselves, which is then transmitted to the query-issuing node. The second method does not require a pre-defined region in which to send a query message. However, it assumes that the entire network area is divided into a set of cells. The global coordinator forwards the query in a spiral manner to the nodes located close to the centre of the nearest cell. These nodes are called local coordinators. They forward the query, together with the collected information about the nodes within their cell, to the next cell in the spiral. The node that eventually obtains the kNN result transfers it to the query-issuing node. The simulation experiments conducted in [67] show that the beaconless methods decrease bandwidth usage but take more time for query identification than the methods using beacons. The authors claim that their approaches also retain the accuracy of the query result when nodes move quickly.

kNN query in the IoT context. We see potential in using the kNN query for efficient data collection from sensing IoT devices (sensors) in order to address the OppS challenges related to resource cost. kNN query processing algorithms allow data to be gathered only from cost-efficient sensors. The "cost" here might depend on different indicators such as distance, reliability, accuracy, sustainability, scalability, usability and security. Thus, it reduces the energy consumed on attempts to connect to sensor nodes that are too far away for establishing a connection, do not have a sufficient battery level to transfer data or have inappropriate data. According to the OppS projects discussed in the previous subsection, the mobile smart devices gather data from nearby sensors. The word "nearby" there means the sensor nodes that are visible within the radio communication range. However, "visible" does not mean that they are connectible (e.g. in BLE, a smartphone can discover sensors that have a low signal level but cannot be connected to), so attempts to connect to such nodes are just a waste of energy and time. The same applies to sensors with a very low battery level, which can lead to data loss. Also, the sensor nodes usually used in modern networks are cheap and unstable, which means that they can sometimes produce unrealistic data. Therefore, by considering these metrics we can save power and memory. The kNN algorithms discussed above determine the physically nearest neighbour sensors using one or two indicators. Although 3DkNN allows multiple indicators to be used, this has remained purely theoretical. Our idea is to use the kNN query concept with different metrics in order to retrieve a set of the most cost-efficient sensors. Collecting data from this set can tackle resource cost in OppS systems.

2.2.3 Multiple Criteria Decision Making

The algorithm proposed in this thesis considers several indicators for determining a k-size set of the most efficient sensor nodes. This is a complex process that relates to the Multiple Criteria Decision Making (MCDM) problem. In MCDM, decisions about choosing or ranking alternatives are made through the evaluation of different criteria, which form a hierarchical structure of attributes. In this case, it is inappropriate to directly apply conventional methods based on a weighted sum. Moreover, some of the attributes might not be numerical (e.g. security). Making decisions involves finding a trade-off among several criteria that conflict with each other. Many techniques have been developed to solve MCDM problems, such as Multiple Attribute Utility Theory (MAUT) [70] and outranking methods [71, 72].

MAUT-based approaches represent different preferences as multiple attribute utility functions. They combine utility functions with weighting functions for each criterion. Once the utility function has been assessed, the problem is presented as a single objective function, which is the main benefit of using MAUT. Thereby, it ensures that the best compromise decision is achieved.

The principle of outranking approaches is to scale the dominance of one alternative over another, instead of identifying a single best choice. Thus, outranking identifies the preference for one solution over another by comparing the performance of the alternatives on each criterion without a user-prescribed scale. Generally, such approaches are applied in cases of complicated aggregation of criteria measured in incomparable units. Their implementation is quite complex compared to other MCDM methods. Moreover, they often do not reach a decision [72].

Considering our MCDM problem, we give our preference to a MAUT-based approach. The Analytic Hierarchy Process (AHP) [73] is one of the most widely used MAUT-based approaches for solving MCDM problems. It was developed by T.L. Saaty in the 1970s and has since been broadly investigated and improved [74]. This tool can deal with both qualitative and quantitative aspects of MCDM problems. It simplifies and organises complicated problems by quantifying their elements and relating them to general goals while considering alternative solutions. Thus, AHP transforms a complex problem into a hierarchically arranged structure [74, 72]. It uses pairwise comparisons of the factors, which allow the compromise among criteria to be determined. Its major advantage over other methods for solving MCDM problems, and one that appeals intuitively to decision makers, is the verification of inconsistencies in the evaluation of criteria and alternatives, which decreases unfairness in decision making. This method has a variety of application fields in MCDM situations, such as the ranking of cloud computing services [72], the analysis of environmental issues [75], the selection of websites, evaluation tools, e-business, drug selection and others [74]. Thus, we use AHP in this thesis for ranking sensors in order to identify a set of the most cost-efficient ones. We define AHP and describe the processes involved in modelling the ranking problem in the next chapter.
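To make the pairwise comparison and the consistency check more concrete, the following sketch derives criteria weights from a comparison matrix and computes Saaty's consistency ratio. It uses the row geometric mean as a common approximation of the priority vector; the criteria (RSSI, distance, battery level) and the judgment values in the matrix are hypothetical, and the AHP model actually used in this thesis may differ.

```python
import math

# Saaty's random consistency index for matrix sizes 1 to 5.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix):
    """Approximate the priority vector with the row geometric mean method."""
    n = len(matrix)
    means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(means)
    return [m / total for m in means]


def consistency_ratio(matrix, weights):
    """CR = CI / RI with CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    weighted = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(v / w for v, w in zip(weighted, weights)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical judgments: RSSI vs distance = 3, RSSI vs battery = 5,
# distance vs battery = 3; the lower triangle holds the reciprocals.
comparisons = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 3.0],
    [1.0 / 5.0, 1.0 / 3.0, 1.0],
]
weights = ahp_weights(comparisons)
cr = consistency_ratio(comparisons, weights)
```

By Saaty's rule of thumb, judgments with a consistency ratio below 0.1 are usually accepted; otherwise the pairwise comparisons should be revised.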

2.3 Summary

This chapter has discussed the IoT and OppS paradigms in detail. It has reviewed several works developed in these domains. Through this review, we identified a gap in these systems: they do not utilize any specific approach to collecting the raw data from sensing devices that would enhance the efficiency of the system. In their implementations, they collect data with mobile smart devices from nearby sensors, thereby applying the OppS paradigm. They do not consider that these sensors might produce low-accuracy data or might not even be connectible because their signal strength is too low. However, the introduction of an approach that identifies efficient sensors can improve the effectiveness of this stage in IoT systems. This idea came to us while reviewing the kNN query, a technique widely used in WSNs that selects the k sensors from which the data is collected. We were mostly inspired by 3DkNN, which uses a mobile smart device to collect the data. Moreover, we have chosen a method that enables our idea of ranking the sensors that are available in radio range and then picking the top k sensors to collect data from.


3 CollMule Algorithm

The previous chapter provided a comprehensive literature review that investigated the current challenges in the Internet of Things (IoT) and opportunistic sensing (OppS) paradigms. Thus, it defined the research problem of this thesis, which is the reduction of energy consumption. Moreover, Chapter 2 presented the existing approaches that can be utilized in the development of our solution. This chapter describes our method, which addresses the problem at the data collection stage, where mobile devices collect data only from the most cost-efficient sensors with regard to the system requirements.

3.1 Introduction

In this section, we propose an algorithm named CollMule for collecting data from sensing IoT devices situated in the surrounding environment using mobile smart devices. The former are also called sensors here, while the latter are called mules or collectors. The CollMule algorithm is inspired by the 3DkNN algorithm [66]. It considers three-dimensional space (see Figure 9) and different metrics to select the most cost-efficient k-size set of sensors. Following the kNN concept, our algorithm consists of three phases:

2. Ranking phase - ordering the devices considering different metrics;

3. Collecting phase - obtaining data from the k most cost-efficient devices.

Figure 9. Sensor selection in a three dimensional area.

During the first phase, a mule discovers surrounding IoT devices by listening on the radio channels where they broadcast messages announcing their presence. Such packets usually contain data about the device such as its name, signal strength, radio power, manufacturer data and the services it provides. In our algorithm, it is assumed that they also include the metrics for the ranking phase, which are discussed in the next section. Then, CollMule ranks the devices from the least to the most cost-efficient using the metrics retrieved from these messages. The last phase determines a set of k devices to collect data from. The mule then connects to them and reads the sensed data. The optimal k is found by experimental tests in Chapter 5.
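The three phases described above can be outlined as a small pipeline. This is a schematic sketch only, not the CollMule implementation: the function names, the dictionary layout of an advertisement and the use of RSSI alone as a stand-in score are assumptions made for illustration, whereas the actual ranking relies on the metrics and the method introduced in the following sections.

```python
def discover(advertisements):
    """Phase 1: keep the most recent advertisement heard from each device."""
    latest = {}
    for adv in advertisements:
        latest[adv["name"]] = adv
    return list(latest.values())


def rank(devices, score):
    """Phase 2: order devices from the least to the most cost-efficient."""
    return sorted(devices, key=score)


def collect(ranked, k, read):
    """Phase 3: read data from the k most cost-efficient devices."""
    top_k = ranked[-k:] if k > 0 else []
    return {dev["name"]: read(dev) for dev in top_k}


heard = [
    {"name": "s1", "rssi": -70},
    {"name": "s2", "rssi": -55},
    {"name": "s1", "rssi": -65},  # a newer advertisement from the same device
    {"name": "s3", "rssi": -85},
]
devices = discover(heard)
ranked = rank(devices, score=lambda d: d["rssi"])  # placeholder score
data = collect(ranked, k=2, read=lambda d: f"reading-from-{d['name']}")
```

In a real deployment the `read` callable would connect to the device over the short-range radio; here it only returns a placeholder string.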

3.2 Metrics

There are many different metrics that may play an important role in estimating the performance of an IoT device as the mule's neighbour. For example, they can be location, distance, battery level, sensor accuracy, received signal strength indication (RSSI), link quality indication, packet delivery rate, packet error rate [76], data size, etc. Here, we use only a few of them, which are described below.

• RSSI describes the wireless signal strength by measuring the power of the received signal. The usual unit is decibel-milliwatts (dBm):

\[
\mathrm{dBm} = 10 \times \log_{10}\left(\frac{P}{1\,\mathrm{mW}}\right) \tag{2}
\]

The higher the RSSI value, the stronger the signal. RSSI is determined automatically by wireless receivers. As it is measured differently on various devices, it is useful for assessing the environmental effect on a specific radio chip.

• Distance estimates how far the IoT devices are from the mule. To calculate it, we consider coordinates in three-dimensional space and use the geometrical formula:

\[
\mathit{Distance}(D) = \sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2 + (Z_2 - Z_1)^2} \tag{3}
\]

Distance affects latency (the time taken to collect data from a device), which in turn affects energy consumption. Hence, the shorter the distance, the less power is consumed to get the data.

• Sensor Accuracy. Modern sensors embedded into IoT devices are still unstable: they sometimes do not work properly and return errors or zero values. In data collection, connecting to and trying to obtain data from a sensor that currently does not provide proper measurements would be a waste of energy and time. Therefore, sensor accuracy is an important metric in determining the efficient devices. Generally, accuracy is defined as "the amount of uncertainty in a measurement with respect to an absolute standard" [77]. Sensor accuracy is the maximum difference between the value measured by a primary or good secondary standard and the output value of the sensor. It can be expressed as a percentage or an absolute value [78].

• Battery Level. This is another significant factor to consider when evaluating device performance. Its impact can be interpreted in different ways. We can assume that devices with a higher level should be graded as more efficient, because devices with low power should remain alive until recharging. On the other hand, low power can be a signal to gather data from such a device first; however, the remaining power must be sufficient for this action. We adopt the second assumption, considering the minimum amount of energy with which data collection is possible. As there is no standard battery level estimation mechanism on the IoT devices we use for our testbed, we have developed a model of it, which is described next.
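Equations (2) and (3) translate directly into code. The helper names below are illustrative, and the sample values (a received power of 0.0000001 mW and two arbitrary points) are assumptions chosen only to give round results.

```python
import math


def mw_to_dbm(power_mw):
    """Equation (2): power in milliwatts expressed in dBm."""
    if power_mw <= 0:
        raise ValueError("power must be positive")
    return 10.0 * math.log10(power_mw / 1.0)  # the reference power is 1 mW


def distance_3d(p1, p2):
    """Equation (3): Euclidean distance between two (x, y, z) points."""
    (x1, y1, z1), (x2, y2, z2) = p1, p2
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2 + (z2 - z1) ** 2)


rssi_dbm = mw_to_dbm(1e-7)                               # about -70 dBm
separation = distance_3d((0.0, 0.0, 1.0), (2.0, 3.0, 7.0))  # sqrt(4 + 9 + 36)
```

A power of exactly 1 mW corresponds to 0 dBm, so typical received signal strengths, which are far below 1 mW, are negative values.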

3.2.1 Battery Level Estimation Model

Today we have to be aware of the ecological and environmental situation on our planet. For many years, people threw away used hardware, polluting the land with toxic substances such as lead, sulphuric acid, and cadmium [79]. This applies especially to batteries, which hold a limited amount of energy: they become unusable after a short time and are discarded in large quantities. Consequently, rechargeable batteries are replacing disposable ones. Although they must also be recycled, their use reduces the amount of waste. Therefore, we consider rechargeable batteries when modelling battery consumption in this project.

Battery capacity represents the period of time during which a battery can provide a given current. It is measured in ampere-hours (Ah). Different battery manufacturers measure it in their own way. It is important to note that the capacity of rechargeable batteries tends to diminish every time they are recharged. Moreover, factors such as temperature and the current drawn have a considerable impact on how long a battery lasts [80]. In this thesis, we do not consider the effect of temperature, as the temperature inside buildings usually stays within the storage and charging temperature range of most battery types. Moreover, modelling it requires a good knowledge of chemistry and complex computations. We also assume a 5000 mAh power bank [81] (BatteryCapacity) with a lithium polymer battery whose residual capacity (ResidualCapacity) might lose 0.1% of full capacity per month when it is discharged to 100% and recharged every 24 hours. The recharge life is approximately 300 - 400 cycles at 100% discharge [82]. We estimate the battery level using the following formula:

$$\text{BatteryLevel} = \frac{\text{ResidualCapacity}}{\text{BatteryCapacity} \times 0.8} \times 100\% \qquad (4)$$

where ResidualCapacity is the battery capacity remaining after some consumption. We also consider that 20% of the capacity is used by the charging/discharging process. Further, battery consumption is modelled on the assumption that energy is spent on CPU, sensing and communication processes.
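The battery level estimate of Equation (4) can be sketched in a few lines. The sketch reads the formula with 0.8 × BatteryCapacity in the denominator, which matches the statement that 20% of the capacity is used by the charging/discharging process. The function name `battery_level`, its parameter names and the use of mAh as the unit are assumptions introduced for illustration.

```python
def battery_level(residual_capacity_mah,
                  battery_capacity_mah=5000.0,
                  usable_fraction=0.8):
    """Battery level in percent, following Equation (4).

    residual_capacity_mah: capacity remaining after some consumption.
    battery_capacity_mah:  nominal capacity (5000 mAh power bank assumed).
    usable_fraction:       share of the nominal capacity that is usable
                           (0.8, since 20% is taken by charging/discharging).
    """
    usable_capacity = battery_capacity_mah * usable_fraction
    if usable_capacity <= 0:
        raise ValueError("usable capacity must be positive")
    return residual_capacity_mah / usable_capacity * 100.0


if __name__ == "__main__":
    # 2000 mAh left out of 4000 mAh usable capacity.
    print(battery_level(2000.0))  # 50.0
```

The result is deliberately not clamped to the 0 to 100% range, because the formula itself does not specify such a limit.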

Communication modelling Before modelling, we have to choose the type of communication to be used in this thesis. Among the available short-range radio technologies, Bluetooth Low Energy (BLE) is preferable. For example, using WiFi Direct on a smart mobile device prevents the device from using WiFi for Internet access. ZigBee and NFC are other candidates; however, additional hardware needs to be deployed in most cases, as modern smart mobile devices do not support ZigBee and sensors typically do not support NFC. Therefore, in our model, a mule uses BLE to interact with IoT devices. BLE, also known as Bluetooth SMART, is an evolution of Bluetooth Classic focused on low-powered devices. The full architecture of this protocol is described in [83]. Here, we review the two lowest layers, which enable low-level communication between devices. The Physical Layer comprises 40 Radio Frequency (RF) channels in the 2.4 GHz band: 3 are used for broadcasting and the other 37 for bidirectional data exchange between connected devices. The Link Layer is where the interaction occurs. There are two patterns of communication in the BLE protocol:

1. advertisement based - data is broadcast using advertising packets;

2. connection based - bidirectional data exchange takes place after a connection is established.

In the context of the CollMule algorithm, we use the second pattern. A mule scans to discover nearby sensors, which advertise their availability on three RF channels in sequence to avoid interference. Each scan period (scanInterval) has a fixed duration T, during which the mule listens for advertising messages for a fixed period Ts (scanWindow). After that, the mule stops scanning for a while. Sensors (advertisers) transmit a sequence of broadcast packets within a time span called an advertising event (advEvent). The time between consecutive advertising events equals the sum of the advertising interval (advInterval) and the advertising delay (advDelay): the former is configurable within the standard range of 20 ms to 10.24 s, while the latter is uniformly distributed between 0 and 10 ms and is intended to avoid synchronisation with other advertisers.
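The advertising and scanning timing described above can be illustrated with a small simulation. This is a simplified sketch under stated assumptions: an advertising event is treated as an instant (its duration and the three-channel sequence are ignored), and the scan window is assumed to open at the start of every scan interval. The function names and the constants' names are hypothetical; only the numeric ranges come from the text.

```python
import random

ADV_INTERVAL_MIN_S = 0.020   # 20 ms, lower bound of advInterval
ADV_INTERVAL_MAX_S = 10.24   # 10.24 s, upper bound of advInterval
ADV_DELAY_MAX_S = 0.010      # advDelay is uniform in [0, 10 ms]


def advertising_event_times(adv_interval_s, n_events, rng=None):
    """Start times (seconds) of n_events consecutive advertising events.

    Consecutive events are separated by advInterval + advDelay, where
    advDelay is drawn uniformly from [0, 10 ms] for every gap.
    """
    if not ADV_INTERVAL_MIN_S <= adv_interval_s <= ADV_INTERVAL_MAX_S:
        raise ValueError("advInterval must lie between 20 ms and 10.24 s")
    rng = rng or random.Random()
    times = []
    t = 0.0
    for _ in range(n_events):
        times.append(t)
        t += adv_interval_s + rng.uniform(0.0, ADV_DELAY_MAX_S)
    return times


def heard_by_scanner(event_time_s, scan_interval_s, scan_window_s):
    """True if an event starts while the mule is listening.

    Assumes (for illustration) that the scan window Ts opens at the
    beginning of each scan interval T.
    """
    return (event_time_s % scan_interval_s) < scan_window_s


if __name__ == "__main__":
    events = advertising_event_times(0.1, 20, random.Random(42))
    heard = [t for t in events if heard_by_scanner(t, 1.0, 0.3)]
    print(f"{len(heard)} of {len(events)} advertising events fall in a scan window")
```

With a scanWindow that is short relative to scanInterval, many advertising events are missed, which is one reason why the random advDelay and the choice of advInterval influence how quickly a mule discovers a sensor.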

After the mule discovers the accessible sensors, it connects to the sensor selected by the CollMule algorithm described previously. The mule initiates the connection by sending a connection request (Connection Request) to the sensor, which is expected to be listening for incoming messages. The sensor establishes the connection after receiving the request message that defines it. Here, the mule and the sensor play the roles of master and slave, respectively. A wait time of 1.25 ms is considered, followed by a delay before the master sends its data packet (TransmitWindowSize).

In our case, the mule sends a data request to the sensor, which replies with a message containing the measured data. The request is sent at a time called the connection event. If there is more data to be exchanged, the connection may continue. According to the standard, an Inter Frame Space (IFS) of at least 150 µs must be guaranteed between data message transmissions. It is assumed that the sensor's radio is turned off until the next event to save energy. Another important parameter is the period between connection events (connInterval), which ranges from 7.5 ms to 4 s. Finally, the connection fails if the time since the last received packet exceeds connSupervisionTimeout. This parameter takes values between 100 ms and 32 s and must be greater than or equal to connInterval. In our scenario, the connection is terminated as soon as the mule receives the sensor data.
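The constraints on the connection parameters can be captured in a short validation sketch. The class `ConnectionParams` and the helper `link_lost` are hypothetical names introduced for illustration; the numeric ranges are the ones quoted in the text, and the rule that the timeout must be at least connInterval follows the description above.

```python
from dataclasses import dataclass

CONN_INTERVAL_RANGE_S = (0.0075, 4.0)      # connInterval: 7.5 ms to 4 s
SUPERVISION_TIMEOUT_RANGE_S = (0.1, 32.0)  # connSupervisionTimeout: 100 ms to 32 s


@dataclass(frozen=True)
class ConnectionParams:
    conn_interval_s: float
    supervision_timeout_s: float

    def validate(self):
        """Raise ValueError if the parameters violate the ranges in the text."""
        lo, hi = CONN_INTERVAL_RANGE_S
        if not lo <= self.conn_interval_s <= hi:
            raise ValueError("connInterval must lie between 7.5 ms and 4 s")
        lo, hi = SUPERVISION_TIMEOUT_RANGE_S
        if not lo <= self.supervision_timeout_s <= hi:
            raise ValueError("connSupervisionTimeout must lie between 100 ms and 32 s")
        if self.supervision_timeout_s < self.conn_interval_s:
            raise ValueError("connSupervisionTimeout must be >= connInterval")


def link_lost(now_s, last_packet_s, params):
    """True if the time since the last received packet exceeds the timeout."""
    return (now_s - last_packet_s) > params.supervision_timeout_s


if __name__ == "__main__":
    params = ConnectionParams(conn_interval_s=0.03, supervision_timeout_s=1.0)
    params.validate()
    print(link_lost(now_s=2.5, last_packet_s=1.0, params=params))  # True
```

In the CollMule scenario the supervision timeout is rarely reached, because the mule closes the connection as soon as the sensor data has been received.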

Battery Residual Capacity Estimation An IoT device sends advertising packets with frequency $f_{adv}$ while it stays in receive mode, waiting for a possible incoming connection request. When a mule requests a connection to the device, advertising is interrupted.
