Towards reducing bandwidth consumption in publish/subscribe systems


DEGREE PROJECT IN INFORMATION AND COMMUNICATION TECHNOLOGY,

SECOND CYCLE, 30 CREDITS

STOCKHOLM, SWEDEN 2020


YIFAN YE

KTH ROYAL INSTITUTE OF TECHNOLOGY


Abstract

Efficient data collection is one of the key research areas for 5G and beyond, since it can reduce the network burden of transferring massive data for various data analytics and machine learning applications. Specifically, 5G offers great support for massive deployment of IoT devices, and the number of IoT devices is exploding.

There are mainly two complementary ways of achieving efficient data collection: one is to integrate data processing into the collection process via, e.g., data filtering and aggregation; the other is to reduce the amount of data that needs to be transferred via, e.g., data compression or approximation.

In this thesis, efficient data collection is studied from these two perspectives. In particular, we introduce enhanced syntax and functionalities, such as data filtering and data aggregation, to the Message Queueing Telemetry Transport (MQTT) protocol. Furthermore, we enhance the flexibility of MQTT by allowing customized or user-defined functions to be executed in the MQTT broker, so that data processing in the broker is not constrained to predefined processing functions. Lastly, dual prediction is studied as a way to reduce data transmissions by maintaining the same learning model on both the sender and receiver sides. In particular, we study and prototype least mean square (LMS) as the dual prediction algorithm. Our implementations are based on MQTT, and the benefits are shown and evaluated via experiments using real IoT data.

Keywords: pub/sub, 5G, MEC, data reduction, IoT.


Sammanfattning

Effektiv datainsamling är ett av de viktigaste forskningsområdena för 5G och därefter, eftersom det kan minska nätbördan för att överföra massiva data för olika dataanalyser och maskininlärningsapplikationer. Specifikt erbjuder 5G bra stöd för massiv distribution av IoT-enheter, och antalet IoT-enheter exploderar.

Det finns huvudsakligen två komplementära sätt att uppnå effektiv datainsamling: ett är att integrera databehandling i insamlingsprocessen via t.ex. datafiltrering, aggregering; den andra minskar mängden data som behöver överföras via t.ex. datakomprimering / tillnärmning.

I denna avhandling studeras effektiv datainsamling ur nämnda två perspektiv. I synnerhet introducerar vi förbättrad syntax och funktionalitet till meddelandekö telemetri-transportprotokollet (MQTT), till exempel datafiltrering och dataggregation. Dessutom förbättrar vi MQTT-flexibiliteten genom att stödja anpassade eller användardefinierade funktioner som ska köras i MQTT-mäklaren, och därför kommer databehandling i mäklaren inte att begränsas till de fördefinierade behandlingsfunktionerna. Slutligen studeras dubbla förutsägelser för att minska dataöverföringarna genom att bibehålla samma inlärningsmodell på båda sidorna av avsändaren och mottagaren. I synnerhet studerar och prototypar vi minst genomsnitt kvadrat (LMS) som den dubbla förutsägelsealgoritmen. Våra implementeringar är baserade på MQTT och fördelarna visas och utvärderas via experiment med verkliga IoT-data.

Nyckelord: pub / sub, 5G, MEC, datareduktion, IoT.


2.1.1 Evolution of Mobile Wireless Communication Networks

2.1.2 5G Multi-access Edge Computing and The Internet of Things

2.2 Publish/Subscribe Architecture

2.2.1 Topic-based filtering

2.2.2 Content-based filtering

2.3 MQTT

2.3.1 Enhanced MQTT

2.4 Data reduction in IoT

3.3 UDF-based subscription

3.3.1 Architecture

4 Result and Analysis

4.1 Experiment Setup

4.2 Dataset

4.3 MQTT vs. MQTT+

4.4 Prediction-based data reduction algorithms

4.4.1 proof of concept

4.4.2 On testbed

4.5 UDF-based subscription

4.5.1 Discussion

5 Conclusions and Future Work

5.1 Conclusions

5.2 Future work


List of Figures

1.1 Research methodology

2.1 IoT supported by MEC

2.2 Comparison between client-server and Pub/sub

2.3 Topic-based filtering

2.4 MQTT has no support for enhanced subscriptions

3.1 Architecture of MQTT+ broker

3.2 LMS filter for time series forecasting

3.3 Dual prediction

3.4 Inflexible subscription of MQTT+

3.5 UDF-based subscription

4.1 Testbed

4.2 Reduction ratio for topic T

4.3 Reduction ratio for topic p

4.4 Topic P (error bound = 0.5)

4.5 Topic P (error bound = 2)

4.6 Reduction ratio of all features (Part 1)

4.7 Reduction ratio of all features (Part 2)


List of Tables

4.1 14 features of the dataset

4.2 Downlink traffic comparison

4.3 Uplink traffic comparison

4.4 Uplink traffic with data reduction

4.5 Downlink traffic with UDF-based subscription


List of Acronyms and Abbreviations

IoT Internet of Things

5G The fifth generation of mobile communication

MEC Multi-access edge computing

MQTT Message Queueing Telemetry Transport

pub/sub Publish/subscribe

WSN Wireless sensor network

QoS Quality of Service

UDF User-defined function

mMTC Massive Machine Type Communication

LMS Least Mean Square


Acknowledgements

It has been a great honor to do my master's thesis at Ericsson Research, Kista, in conjunction with the KTH Royal Institute of Technology.

First of all, I would like to express my gratitude to Dr. Zhang Fu, my industrial supervisor at Ericsson Research. It was Zhang who led me into this research area and gave me plenty of assistance with implementation and evaluation. I would also like to thank my academic supervisor Dr. Ki Won Sung, examiner Dr. Slimane Ben Slimane, and opponent Chen Tang from KTH, from whom I received many comments and much feedback on my thesis report. Lastly, many thanks to my parents in China. Without their financial and emotional support, I cannot imagine having finished my master's studies in Sweden.

Yifan Ye July 2020


Chapter 1 Introduction

This chapter presents the basic introduction and background upon which the project is developed.

1.1 Background

Efficient data collection is one of the key research areas for 5G and beyond, since it can reduce the network burden of transferring massive data for various data analytics and machine learning applications. For example, we are entering the era of the Internet of Things (IoT), and the number of IoT devices is growing exponentially. As stated in the Ericsson report [1], the number of IoT connections is estimated to exceed 50 billion in 2025. Ever greater volumes of data will be generated and exchanged between devices, which places a heavy burden on the network. How to reduce latency and save resources becomes the key issue in IoT. The 5th generation (5G) of mobile cellular networks is designed to give better support to IoT. Multi-access edge computing (MEC), which is one of the key technologies in 5G, can boost the massive deployment of IoT devices. MEC, unlike traditional cloud computing, distributes the computation to the edge of the network [2]. In the cloud computing paradigm, data generated at the edge is always transferred to cloud servers, which results in high latency due to network transmission delay. This solution also does not scale when the number of devices and the volume of generated data grow exponentially, because transferring so much data through the network consumes too many resources and can even overload and crash the network. In MEC, however, the computations are distributed to the edge, close to the corresponding end users. Latency can thus be reduced, and so can the traffic flowing from the edge to the cloud.


In MEC, a message exchange protocol is needed. Message Queueing Telemetry Transport (MQTT) is widely used in IoT communication because it is lightweight [2]. MQTT has a small protocol overhead, which suits constrained IoT devices that normally have limited CPU or memory. MQTT has a topic-based publish/subscribe architecture, which consists of subscribers, a broker, and publishers. Publishers, also called producers, are responsible for collecting messages and publishing them to the broker. Each message has a topic. In IoT, a publisher could be a sensor; for example, a temperature sensor can measure the temperature in a room and publish the measurements. The MQTT broker receives messages from publishers and forwards them to subscribers according to the messages' topics. Subscribers, also called consumers, subscribe to the topics they are interested in and then receive the corresponding messages from the broker. How the data is used depends on the end users. For example, one user can subscribe to many topics, including temperature, humidity, CO2 level, and O2 level in a room, because he wants to use all of these data to predict whether there is a person in the room.

However, the MQTT protocol still has some shortcomings. The MQTT broker cannot understand the meaning of the received packets or apply any operations to them. It simply receives messages and forwards them based on their topics [3]. The use of MQTT therefore normally wastes network bandwidth, since subscribers can receive many redundant messages that did not need to be transmitted and are eventually discarded by the user. For example, consider a user who is only interested in the average temperature over the previous 30 minutes, while the temperature sensor publishes a message to the temperature topic every minute. The user has no choice but to subscribe to the temperature topic, receive all the temperature measurements, and calculate the average himself. Also, MQTT does not support merging messages. A user who is interested in all the topics has to subscribe to every existing topic and receive the messages separately. It would be more efficient to merge all the messages into one big message and give it a new topic name "all", so that the user can subscribe to the topic "all" if he wants messages from all the topics.


1.2 Previous Work and Research Gap

As mentioned in the background, the use of MQTT can lead to a waste of network bandwidth due to several shortcomings. To overcome these shortcomings, researchers have proposed MQTT+ [3], which adds enhanced syntax and functionalities such as spatial/temporal data aggregation and rule-based subscription. The experimental results show that MQTT+ can reduce bandwidth consumption while CPU and memory costs in the broker do not increase significantly, which makes MQTT+ suitable for use in 5G MEC [2].

However, MQTT+ still has some limitations. Firstly, every time we want a new functionality, we need to change the source code, recompile it, and deploy it in production, which is not adaptive to the constantly changing requirements of clients. Secondly, MQTT+ can only reduce the traffic between the broker and subscribers, since it does not affect the behavior of publishers. The traffic between the broker and publishers also needs to be reduced. There are also several studies on data reduction in wireless sensor networks (WSN) [4,5,6,7,8]. They mainly focus on a proof of concept of prediction-based data reduction algorithms in WSN, and none of them has actually deployed the algorithms in a pub/sub system. Zehnder et al. introduce data stream reduction into their Kafka-based pub/sub system [9]. However, they use a rather simple reduction algorithm without investigating the possibilities of prediction-based data reduction algorithms.

To this end, this thesis proposes a novel solution that overcomes the aforementioned limitations of MQTT+ and fills the research gap. A prototype is implemented, and several experiments are conducted to compare performance.

1.3 Problem statement

Based on the aforementioned research gap, this thesis aims to answer the following research question:

Is it possible to overcome the two shortcomings of MQTT+, thereby further improving publish/subscribe systems?


- How to reproduce MQTT+ [3], an enhanced MQTT broker?

MQTT+ introduces several enhanced functionalities to the MQTT broker, and our research builds on MQTT+. However, the paper [3] does not describe the details of the implementation of MQTT+. We need to reproduce MQTT+ in order to achieve our goal.

- How to make the subscriptions of MQTT+ more flexible?

As mentioned in Section 1.2, one of the shortcomings of MQTT+ is its inflexible subscription. A client can only subscribe to aggregate results that are predefined in the broker. We want to investigate how to let clients subscribe to any kind of content by passing user-defined functions to the broker.

- How to reduce the traffic between publishers and the broker?

As also mentioned in Section 1.2, the other shortcoming of MQTT+ is that only the traffic from the broker to subscribers can be reduced, while the traffic from publishers to the broker stays unchanged and could become the bottleneck of the system. We need to find a solution that achieves data reduction in the whole pub/sub system.

- How to benchmark?

Even though our research is intended for IoT networks supported by 5G MEC, we cannot conduct experiments in a real 5G MEC environment. We therefore need to emulate this environment and find a suitable way to benchmark the systems.

1.4 Goal and Contribution


be reduced, and it does not support subscription to user-defined content.

In the project, we propose several solutions to solve the existing problems of MQTT+, thereby further reducing bandwidth consumption in pub/sub systems. The work can benefit the massive deployment of IoT devices in the upcoming 5G era.

1.5 Research Methodology

Figure 1.1: Research methodology

To answer the research questions in Section 1.3, the research follows the methodology shown in Figure 1.1.

In the first step, we did a literature study to learn the theoretical background of IoT, 5G, MEC, etc. During this period, we noticed that MQTT can be a suitable message exchange protocol for supporting the massive deployment of IoT in the 5G MEC scenario. However, the original MQTT broker has several shortcomings, and there is much research on enhanced MQTT protocols. Some of this research, such as MQTT+, partly solves the problems, but there is still a research gap to close before our ultimate goal is achieved.


implementation and evaluation.

The third step can be divided into three sub-steps. First of all, we reproduced MQTT+, which provides several enhanced functionalities as listed in the paper [3]. Then we improved MQTT+ by introducing flexible subscription and a data reduction algorithm.

In the last step, we conducted several experiments and evaluated the results. A detailed description of the research methodology can be found in Chapter 3.

1.6 Delimitation

This research has three limitations:

• Most work is just proof of concept, and we did not deploy our pub/sub system in a real 5G MEC scenario.

• We mainly focus on reducing bandwidth consumption; other metrics such as latency and computational cost are not evaluated.

• Only one dataset is used when we evaluate the performance of data reduction algorithms.

1.7 Benefit, Ethics, and Sustainability

The upcoming 5G era aims to connect everything with its key technologies such as MEC. The goal of this research is to reduce bandwidth consumption in pub/sub systems in this era, which brings several benefits.

On the one hand, with the exploding number of IoT devices and the amount of traffic, reducing traffic can alleviate the burden of networks, thereby saving the total energy consumption of the networks and improving the quality of services (QoS) for all network users.


transmissions, which can help the sensor nodes to save energy and increase their lifetime.

1.8 Structure of this thesis

This thesis is composed of four main chapters. The rest of this thesis is organized as follows:


Chapter 2 Theoretical Background

In this chapter, a detailed description of the background of the degree project is presented together with related work.

The chapter starts in Section 2.1 with a brief introduction to 5G networks and to how 5G enabling technologies, such as MEC, support the massive deployment of IoT. Then, in Section 2.2, we discuss the pub/sub architecture, which is used for message delivery in IoT networks. MQTT is discussed in detail in Section 2.3 since our research is based on the MQTT broker. Lastly, Sections 2.3.1 and 2.4 present the two classes of related work that are most relevant to our research: research on enhanced MQTT brokers, and data reduction in IoT networks.

2.1 5G Enabling Technologies

We have witnessed several evolutions in mobile technologies, and each evolution has given us better-performing networks with higher speed, lower latency, etc. Nowadays, fifth-generation (5G) networks are being deployed across the world, and the world is being changed by the numerous novel technologies that 5G enables. Section 2.1.1 briefly introduces the history from 1G to 5G, and Section 2.1.2 describes MEC, which is a key technology in 5G.

2.1.1 Evolution of Mobile Wireless Communication Networks

Wireless network technology has undergone five generations, starting with the first generation of mobile networks (1G). To support wireless voice calls, 1G was first deployed in Tokyo by Nippon Telegraph and Telephone in 1979. It uses analog signals and offers non-encrypted voice calls with poor quality [13]. Then, in 1991, 2G was launched in Finland. 2G networks use digital signals and offer encrypted voice calls with much better quality than 1G. Apart from voice calls, 2G also brought many new services such as text messages [14].

After that, we witnessed a giant leap when 3G was launched in 2001. 3G increased data rates and made it possible to surf the Internet, make video calls, etc. As the Internet grew in popularity and the volume of streaming data exploded, 4G was launched in 2009 to offer higher speed and better QoS [14,15]. 4G fully abandons circuit-switching technology and uses the Internet Protocol (IP) for all data communication [16].

2.1.2 5G Multi-access Edge Computing and The Internet of Things

The number of IoT connections can reach 50 billion in 2025 [1], which implies a greater need for high speed and low latency in the networks. However, 4G is not capable of offering good service quality for IoT applications [17]. 5G arrives with the goal of connecting everything. At least two of its key technologies support the deployment of IoT devices: massive Machine Type Communication (mMTC) and MEC [2]. mMTC enables a high density of connected devices, and MEC distributes computation to the edge of the network [2].

MEC, unlike cloud computing, brings all kinds of services, such as computation and storage, into the radio access network (RAN) itself. Since the services are very close to the end users, latency and network bandwidth consumption can be greatly reduced [18].


Figure 2.1: IoT supported by MEC.

2.2 Publish/Subscribe Architecture

Pub/sub is a message delivery model that decouples publishers and subscribers [19]. In contrast to the client-server architecture, there is a broker between publishers and subscribers. Figure 2.2 compares these two architectures.

Figure 2.2: Comparison between client-server and pub/sub. (a) Client-server architecture. (b) Pub/sub architecture.


and forward the message to subscribers according to their interests. The main advantage of the pub/sub architecture is good scalability [20]. In the traditional client-server model, a subscriber needs to set up communication with every publisher that holds messages it is interested in, which does not scale when the number of clients in a network increases. In the pub/sub model, however, subscribers and publishers are decoupled by the broker, and clients only need to communicate with the broker. Furthermore, if the number of clients grows exponentially and the amount of traffic exceeds the capability of a single broker, it is also possible to use clustered broker nodes and distribute the traffic using load balancers [20].

The following sections describe several different types of filters in the broker.

2.2.1 Topic-based filtering

Figure 2.3: Topic-based filtering.

Topic-based filtering, also called subject-based filtering, filters messages based on their topic, as shown in Figure 2.3. Publishers publish messages to the broker, and each message must contain a topic. Subscribers announce to the broker the topics they are interested in. When the broker receives a message, it checks the topic of the message and forwards the message to all the subscribers that are interested in this topic.
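To make the forwarding rule concrete, the following is a minimal in-memory sketch of topic-based filtering. It is an illustration only and not part of the thesis prototype: the class name `TopicBroker` and its methods are hypothetical, subscribers are plain Python callbacks, and topics are matched by exact string equality (no network transport and no MQTT wildcards).

```python
from collections import defaultdict


class TopicBroker:
    """Toy in-memory broker that forwards messages by exact topic match."""

    def __init__(self):
        # Maps a topic name to the list of subscriber callbacks (hypothetical layout).
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register interest of one subscriber in one topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        """Forward the message to every subscriber of its topic.

        Returns the number of subscribers that received the message.
        """
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, payload)
        return len(callbacks)


broker = TopicBroker()
inbox_a, inbox_b = [], []
broker.subscribe("temperature", lambda topic, payload: inbox_a.append((topic, payload)))
broker.subscribe("humidity", lambda topic, payload: inbox_b.append((topic, payload)))

broker.publish("temperature", 21.5)  # delivered to subscriber A only
broker.publish("humidity", 40)       # delivered to subscriber B only
broker.publish("co2", 400)           # no subscriber, so nothing is forwarded
```

Note that the broker never inspects the payload; it only compares topic names, which is the property the later sections try to relax.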

Redis pub/sub [25], etc.

MQTT broker is widely used in IoT communication due to its lightweight property [2]. MQTT has small protocol overhead, which suits well for constrained IoT devices that normally have limited CPU or memory.

However, the MQTT broker, like other topic-based brokers, suffers from several shortcomings, which result in a waste of network resources. There is ongoing research on improving the MQTT broker, and more details can be found in Section 2.3.1.

2.2.2 Content-based filtering

In content-based filtering, subscribers register the content they are interested in, and the broker evaluates the content of each message sent by publishers. The shortcoming of content-based filtering is that the content of the messages must be known clearly beforehand, which is infeasible in most real-world cases [20]. Content-based filtering is a relatively new filtering type, and there are not many existing implementations of it. IBM MQ now supports content-based filtering using ESQL [26].

2.3 MQTT

The MQTT protocol was invented by Andy Stanford-Clark and Arlen Nipper in 1999, when they wanted to connect oil pipelines via satellite link [27]. Since then, many versions have been released, and version 3.1.1 is the newest version that follows the OASIS Standard.

The MQTT protocol follows topic-based message filtering, and an MQTT-based pub/sub system consists of an MQTT broker and MQTT clients.


MQTT clients are either publishers or subscribers, and each client talks only to the broker. The implementation is very simple, since the clients only need to exchange MQTT messages with the broker. There are many implementations of MQTT client libraries in different programming languages. This research implements MQTT clients using the Python library paho-mqtt 1.5.0.

2.3.1 Enhanced MQTT

As mentioned earlier, the original MQTT protocol may not suit many scenarios, and there is much research on enhancing MQTT.

MQTT+ [3], which is the work most closely related to ours, focuses on reducing traffic between the broker and subscribers.

The authors noticed that the use of the original MQTT broker can lead to a waste of network bandwidth, since the MQTT broker acts like a message queue and no operations can be applied to the incoming packets. For example, some messages will be discarded by subscribers that are interested in messages from a topic only when a certain condition is met. Likewise, some subscribers want to receive only aggregated messages, but the broker forwards all the raw messages to them.

Figure 2.4: MQTT has no support for enhanced subscriptions.

or filtered.

MQTT+ aims to solve these problems by introducing three enhanced functionalities to the MQTT broker.

Rule-based subscriptions: A subscriber can choose to receive a message from a topic only if a specified condition is met, which is implemented by adding a filter field to the topic when subscribing. The authors predefined the filters shown in Table 2.1. For example, if a sensor publishes the real-time temperature to the topic T, a subscriber can subscribe to the topic $GT;10.0/T to receive the temperature only if it is greater than 10.0 degrees.

Rule-based subscription | Value type | Description
$EQ;value/topic | numeric, string | Forwards data to subscriber if data published on topic is equal to value
$NEQ;value/topic | numeric, string | Forwards data to subscriber if data published on topic is different from value
$GT;value/topic | numeric | Forwards data to subscriber if data published on topic is greater than value
$GTE;value/topic | numeric | Forwards data to subscriber if data published on topic is greater than or equal to value
$LT;value/topic | numeric | Forwards data to subscriber if data published on topic is less than value
$LTE;value/topic | numeric | Forwards data to subscriber if data published on topic is less than or equal to value
$CONTAINS;text/topic | string | Forwards data to subscriber if data published on topic contains text

Table 2.1: MQTT+ rule-based operators, taken from [3]
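As an illustration of how a broker could interpret the operators in Table 2.1, the sketch below parses a subscription string of the form `$OP;value/topic` and evaluates the rule against a published payload. This is an assumption-laden sketch, not the MQTT+ implementation: the function names are hypothetical, payloads are assumed to arrive as strings, the value is assumed to contain no `/`, and `$EQ`/`$NEQ` fall back to string comparison when either side is not numeric.

```python
import operator

# Operator names follow Table 2.1; the parsing details below are assumptions.
_NUMERIC_OPS = {
    "GT": operator.gt,
    "GTE": operator.ge,
    "LT": operator.lt,
    "LTE": operator.le,
}


def parse_rule_subscription(subscription):
    """Split '$OP;value/topic' into the tuple (op, value, topic)."""
    if not subscription.startswith("$") or ";" not in subscription:
        raise ValueError("not a rule-based subscription: %r" % subscription)
    op, rest = subscription[1:].split(";", 1)
    # Everything after the first '/' is the topic, so hierarchical topics survive.
    value, separator, topic = rest.partition("/")
    if not separator or not topic:
        raise ValueError("missing topic in subscription: %r" % subscription)
    return op, value, topic


def _as_number(text):
    """Return text as a float, or None when it is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def rule_matches(op, value, payload):
    """Return True if the published payload (a string) satisfies the rule."""
    if op == "CONTAINS":
        return value in payload
    published, threshold = _as_number(payload), _as_number(value)
    numeric = published is not None and threshold is not None
    if op in ("EQ", "NEQ"):
        equal = (published == threshold) if numeric else (payload == value)
        return equal if op == "EQ" else not equal
    if op in _NUMERIC_OPS:
        # Ordering operators are only defined for numeric data in Table 2.1.
        return numeric and _NUMERIC_OPS[op](published, threshold)
    raise ValueError("unknown operator: %r" % op)


op, value, topic = parse_rule_subscription("$GT;10.0/T")
forward = rule_matches(op, value, "12.5")  # a reading of 12.5 on topic T
```

With a check like this in the publish path, a reading that fails the rule is simply not forwarded to that subscriber, which is where the downlink savings come from.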

Temporal/spatial data aggregation: Some subscribers may only want to receive aggregated messages rather than all the raw messages. In the MQTT+ paper, the syntax of a temporal aggregation topic is $<Time><OP>/topic, and the authors predefined some operations and time ranges: Time = {DAILY, HOURLY, QUARTERHOURLY}; OP = {COUNT, SUM, AVG, MIN, MAX}. For example, a sensor publishes temperature per minute to the topic T, while a

maintaining an internal memory which caches some intermediate value.
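The idea of caching intermediate values for temporal aggregation can be sketched as follows. This is a hedged illustration rather than the MQTT+ code: the names `parse_temporal_subscription` and `WindowAggregate` are hypothetical, the topic is assumed to be written as the time range and operation concatenated (for example `$HOURLYAVG/T`), and the window lengths in seconds are an assumption added for illustration.

```python
# Time ranges and operations as listed in the text; the lengths in seconds
# are an assumption for illustration.
TIME_WINDOWS = {"DAILY": 86400, "HOURLY": 3600, "QUARTERHOURLY": 900}
OPERATIONS = ("COUNT", "SUM", "AVG", "MIN", "MAX")


def parse_temporal_subscription(subscription):
    """Split '$<Time><OP>/topic' into the tuple (time, op, topic)."""
    head, separator, topic = subscription.partition("/")
    if not head.startswith("$") or not separator or not topic:
        raise ValueError("not a temporal aggregation topic: %r" % subscription)
    spec = head[1:]
    for time in TIME_WINDOWS:
        for op in OPERATIONS:
            if spec == time + op:
                return time, op, topic
    raise ValueError("unknown time range or operation: %r" % spec)


class WindowAggregate:
    """Running aggregate for one topic within one time window.

    Only a few intermediate values are cached, so the broker does not
    need to keep every raw message of the window.
    """

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.minimum = None
        self.maximum = None

    def add(self, value):
        """Fold one published value into the cached intermediate values."""
        self.count += 1
        self.total += value
        self.minimum = value if self.minimum is None else min(self.minimum, value)
        self.maximum = value if self.maximum is None else max(self.maximum, value)

    def result(self, op):
        """Return the aggregate to publish when the window closes."""
        if self.count == 0:
            return None
        results = {
            "COUNT": self.count,
            "SUM": self.total,
            "AVG": self.total / self.count,
            "MIN": self.minimum,
            "MAX": self.maximum,
        }
        return results[op]


time_range, operation, topic = parse_temporal_subscription("$HOURLYAVG/T")
window = WindowAggregate()
for reading in (20.0, 22.0, 24.0):
    window.add(reading)
hourly_average = window.result(operation)
```

At the end of each window the broker would publish `result(op)` once and start a fresh `WindowAggregate`, so the subscriber receives one message per window instead of one per raw reading.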

Data processing: The broker can even apply advanced algorithms to process the data. For example, a broker receives images from several cameras and runs an object detection algorithm to count the number of people in each image. A subscriber can then subscribe to the number of people rather than the raw images.

2.4 Data reduction in IoT

In a typical wireless sensor network (WSN), many sensors continuously report their measurements to the sink nodes. To prolong the lifetime of sensor nodes, there is a large body of research on energy-saving techniques in WSN.

Data reduction is an efficient way of saving energy for sensor nodes, since data transmission is the major energy cost [31]. The reduction can consist of either reducing the number of transmissions or reducing the message size. With data reduction, the sensors send reduced messages to the sink node, and the sink node is capable of reconstructing the original messages.

Jain et al. proposed dual Kalman filter prediction [4], which is used to reduce the number of transmissions. In their approach, each sensor node runs a Kalman filter for time series forecasting and sends a measurement to the sink node only if the difference between the measured value and the predicted value exceeds a predefined error bound. The sink node is able to reconstruct all the data streams, since it runs as many Kalman filters as there are sensor nodes. The sink node receives incomplete data streams from the sensor nodes and uses the Kalman filters to fill in the missing measurements. However, to guarantee that the sensor node and the sink node work coherently, they must first be given the same model, and prior knowledge is needed [10,6].

In the paper [6], Santini et al. use least mean square (LMS) filters instead of Kalman filters to perform dual prediction in WSNs. Since the LMS filter is an adaptive filter, no prior knowledge is needed to guarantee the coherency of the sensor nodes and the sink node. This means that the filter on the sensor node side always yields the same prediction as the filter on the sink node side, even though they work independently. They conducted an experiment on the Intel Berkeley dataset [32] and eliminated 92% of the data transmissions given an error bound of 0.5.
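The dual prediction scheme with an LMS filter can be sketched as below. This is a minimal illustration under stated assumptions, not the algorithm exactly as specified in [6] or as implemented in this thesis: the names `LMSPredictor`, `sensor_step`, `sink_step`, and `simulate` are hypothetical, the filter length (`taps=4`) and step size (`mu=1e-4`) are arbitrary values chosen for readings around 20, and the first `taps` readings are always transmitted to fill the filter history. When a reading is suppressed, both sides feed their own (identical) prediction back into the filter, which keeps the two filters in the same state.

```python
from collections import deque


class LMSPredictor:
    """One-step-ahead LMS predictor over the last `taps` values."""

    def __init__(self, taps=4, mu=1e-4):
        self.mu = mu                       # step size (assumed value)
        self.weights = [0.0] * taps
        self.history = deque(maxlen=taps)  # oldest value first

    def ready(self):
        return len(self.history) == self.history.maxlen

    def predict(self):
        return sum(w * x for w, x in zip(self.weights, self.history))

    def update(self, value):
        """Adapt the weights towards `value`, then shift it into the history."""
        if self.ready():
            error = value - self.predict()
            self.weights = [w + self.mu * error * x
                            for w, x in zip(self.weights, self.history)]
        self.history.append(value)


def sensor_step(predictor, measurement, error_bound):
    """Return the value to transmit, or None if the prediction is good enough."""
    if predictor.ready() and abs(predictor.predict() - measurement) <= error_bound:
        predictor.update(predictor.predict())  # error is zero, weights unchanged
        return None
    predictor.update(measurement)
    return measurement


def sink_step(predictor, received):
    """Reconstruct one value at the sink; `received` is None if nothing was sent."""
    value = predictor.predict() if received is None else received
    predictor.update(value)
    return value


def simulate(series, error_bound, taps=4, mu=1e-4):
    """Run sensor and sink side by side; return (reconstruction, transmissions)."""
    sensor, sink = LMSPredictor(taps, mu), LMSPredictor(taps, mu)
    reconstructed, transmissions = [], 0
    for measurement in series:
        message = sensor_step(sensor, measurement, error_bound)
        transmissions += message is not None
        reconstructed.append(sink_step(sink, message))
    return reconstructed, transmissions


readings = [20.0 + 0.01 * i for i in range(200)]  # synthetic, slowly rising signal
reconstructed, transmissions = simulate(readings, error_bound=0.5)
```

Because the sensor only stays silent when its prediction is within the error bound, every reconstructed value at the sink differs from the true reading by at most that bound. The step size matters: for signals with a much larger magnitude, a plain LMS update with this `mu` could become unstable, so a smaller or normalized step would be needed.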


Chapter 3 Design and Implementation

The implementation of our proposed system can be divided into three steps. First of all, we reproduce MQTT+ [3], which serves as the foundation of our pub/sub system. Secondly, we introduce dual prediction into the pub/sub system. Lastly, a user-defined function is introduced into the broker to enable flexible subscriptions.

Section 3.1 presents a reference implementation of MQTT+, and Sections 3.2 and 3.3 explain how data reduction and user-defined functions can be introduced into the pub/sub system.

3.1 Reproduction of MQTT+

MQTT+ [3] serves as a foundation of this thesis, so firstly MQTT+ should be reproduced. This section gives detailed descriptions of how MQTT+ can be reproduced.

3.1.1 Architecture

MQTT+ is reproduced as a baseline. The paper [3] implemented MQTT+ on top of the HiveMQ 3.4 broker [30], which offers a free and open-source plugin SDK with service provider interfaces. However, the source code and implementation details are not given. In this thesis, we give a reference implementation of MQTT+.

Figure 3.1 depicts the architecture of our reference implementation of the MQTT+ broker.


Figure 3.1: Architecture of MQTT+ broker.

To be more specific, the original HiveMQ broker is extended with an SQLite database [33] and several callback functions. SQLite is a lightweight database and is used to cache intermediate messages. The HiveMQ plugin SDK offers several callback functions, which are called when certain events occur.

Four callbacks [34] are used:

• OnBrokerStart callback is called when the broker starts up. This callback is implemented to set up default parameters when the broker starts up;

• OnPublishReceivedCallback is called when the broker receives a PUBLISH message from publishers. This callback is used to intercept the message published by producers, process the message, and save some intermediate data in the SQLite database;

• ScheduledCallback is executed periodically on a time basis. This callback is used to periodically check status.

Three HiveMQ services [35] are used:

• Plugin Executor Service: Avoids blocking the broker when running blocking operations.

• Publish Service: Publishes a message to a topic.

• Log Service: Print log information in the terminal.

3.1.2 Database

Our reference implementation of MQTT+ includes a database module, which is implemented with SQLite 3. Unlike most other relational database management systems (RDBMS), which follow a client-server pattern, SQLite is serverless and can be embedded into our program. To use a client-server RDBMS such as MySQL, we first need to start a standalone server, with which our program later communicates. This type of RDBMS is also called a client-server database engine, and most RDBMS belong to this type. In contrast, SQLite is embedded in the program, and the program can talk to the database directly without starting a separate database server process. This feature makes SQLite a good fit for our research, since some brokers may be hosted on machines with limited computation resources.

The SQLite database receives messages from HiveMQ's OnPublishReceivedCallback and maintains a table.
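The embedded-database idea can be illustrated with Python's standard `sqlite3` module. This is only a sketch under assumptions: the thesis does not give the table layout, so the `messages` table, its columns, and the helper names below are hypothetical, and the example uses an in-memory database rather than the broker plugin's actual storage.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open an embedded SQLite database; no separate server process is needed."""
    conn = sqlite3.connect(path)
    # Hypothetical schema: one row per intercepted PUBLISH message.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " topic TEXT NOT NULL,"
        " received_at REAL NOT NULL,"
        " value REAL NOT NULL)"
    )
    return conn


def cache_message(conn, topic, received_at, value):
    """Store one published value, as a publish-received callback might do."""
    with conn:  # commits the insert on success
        conn.execute(
            "INSERT INTO messages (topic, received_at, value) VALUES (?, ?, ?)",
            (topic, received_at, value),
        )


def average_since(conn, topic, since):
    """Average of the cached values for a topic from time `since` onwards."""
    row = conn.execute(
        "SELECT AVG(value) FROM messages WHERE topic = ? AND received_at >= ?",
        (topic, since),
    ).fetchone()
    return row[0]  # None when no rows match


conn = open_cache()
cache_message(conn, "T", 0.0, 20.0)
cache_message(conn, "T", 60.0, 22.0)
cache_message(conn, "T", 120.0, 24.0)
cache_message(conn, "humidity", 60.0, 40.0)
recent_average = average_since(conn, "T", 60.0)
```

Letting the database compute the aggregate keeps the callback code small, at the cost of storing raw values until the window closes.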

The comparison between the MQTT broker and the MQTT+ broker is discussed in Section 4.3.

3.2 Data reduction

In addition to data aggregation, data reduction is another approach to reducing bandwidth consumption. In IoT communications, data reduction generally means reducing the number of data transmissions []. The reasoning is that data which the receiver can reconstruct does not need to be transmitted by the sender.


side and receiver's side. As the time series prediction model, the LMS filter is used in this thesis.

3.2.1 LMS filter

Time series forecasting uses past observations of a time series to predict its future values [36, 37].

Many studies have examined which time series forecasting models can be used in dual prediction [38]. There is always a trade-off between complexity and accuracy. Simple models generally have a low computation cost, but their accuracy is also low. One extreme example is [9], which uses only the single most recent data point for forecasting, so the forecast value always equals the last value. As stated in that paper, the biggest advantage is that the model introduces no extra computation on the IoT clients, which are mostly constrained devices with limited computation resources. However, the model fails to forecast accurately whenever the time series fluctuates strongly.

Complicated models normally give high accuracy, but the capability of the devices must also be taken into consideration. LMS filtering has been shown to be a good choice, since it can achieve high accuracy with low computational overhead [6].

The LMS filter was invented by Bernard Widrow and Ted Hoff in 1960 [39, 40], and it is still one of the most widely used adaptive filters in signal processing. Adaptive filters, unlike fixed filters, adapt to changes in the signal and update their parameters continuously, whereas fixed filters have fixed parameters and are only useful when the characteristics of the signal are well understood [41]. LMS filters are adaptive filters that use a stochastic gradient descent algorithm to minimize the mean square error between the filter output and the desired output. Let x(t) be the input signal, y(t) the output signal, and d(t) the desired output. The LMS filter can then be described by formula 3.1 [42, 43].

\[ y(t) = x_1(t)\,W_1(t) + \dots + x_N(t)\,W_N(t) = \sum_{i=1}^{N} x_i(t)\,W_i(t) \tag{3.1} \]

The weight vector W(t) = [W_1(t), W_2(t), ..., W_N(t)] is updated using the error between the output y(t) and the desired output d(t).


To use the filter for time series forecasting, we let x_i(t) equal x(t − i) and make the desired output d(t) equal to x(t). Formula 3.1 can then be written as formula 3.2:

\[ y(t) = \sum_{i=1}^{N} x(t-i)\,w_i(t) \tag{3.2} \]

We update the weights using the formula

\[ W(t+1) = W(t) + \mu\,X(t)\,e(t) \tag{3.3} \]

where µ is the learning rate, e(t) is the prediction error, that is, the desired output x(t) minus the prediction y(t), and W(t) and X(t) are vectors of length N:

\[ e(t) = x(t) - y(t) \tag{3.4} \]

\[ X(t) = [x(t-1), x(t-2), \dots, x(t-N)] \tag{3.5} \]

\[ W(t) = [w_1(t), w_2(t), \dots, w_N(t)] \tag{3.6} \]
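The prediction and update steps of equations 3.2 and 3.3 can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the experiments; the class name LMSFilter and its method names are hypothetical. The error is taken as the measured value minus the prediction, which is the sign convention under which the additive update of equation 3.3 reduces the squared error.

```python
class LMSFilter:
    """Linear predictor with N weights (hypothetical helper class)."""

    def __init__(self, weights, mu):
        self.w = list(weights)  # weight vector W(t)
        self.mu = mu            # learning rate

    def predict(self, window):
        """Equation 3.2; window = [x(t-1), ..., x(t-N)]."""
        return sum(w * x for w, x in zip(self.w, window))

    def update(self, window, actual):
        """Equation 3.3 with error = actual value minus prediction."""
        error = actual - self.predict(window)
        self.w = [w + self.mu * x * error for w, x in zip(self.w, window)]
        return error


# Toy run on a constant signal: the prediction approaches the signal value.
lms = LMSFilter(weights=[0.0, 0.0], mu=0.1)
window = [1.0, 1.0]
for _ in range(50):
    lms.update(window, actual=1.0)
prediction = lms.predict(window)
```

In this toy run the error shrinks by a constant factor at every step, because the input window does not change; on real sensor data the convergence speed depends on the learning rate and on the signal power.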

The following Figure 3.2 depicts the architecture of LMS filter for time series forecasting.

Learning rate and stability

The learning rate decides the speed of learning. A learning rate that is too small makes learning slow, so it takes too long to switch to the stand-alone mode. A learning rate that is too high may not guarantee convergence. As stated in the book [44], the learning rate µ should be no greater than 1/E_x, where E_x is computed by formula 3.7:

\[ E_x = \frac{1}{M} \sum_{t=1}^{M} |x(t)|^2 \tag{3.7} \]

Figure 3.2: LMS filter for time series forecasting.

To guarantee convergence of the LMS filter, we initialize the learning rate according to equation 3.8:

\[ \mu = \frac{1}{100} \cdot \frac{1}{E_x} \tag{3.8} \]
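The learning rate initialization can be sketched as follows, assuming that the rate is set to one hundredth of the upper bound 1/E_x and that E_x is estimated from an initial batch of M samples. The function names and the safety_factor parameter are hypothetical.

```python
def mean_input_power(samples):
    """E_x from equation 3.7: mean of |x(t)|^2 over the M given samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(abs(x) ** 2 for x in samples) / len(samples)


def initial_learning_rate(samples, safety_factor=100.0):
    """Learning rate chosen `safety_factor` times below the bound 1 / E_x."""
    return 1.0 / (safety_factor * mean_input_power(samples))


samples = [1.0, 2.0, 3.0]            # hypothetical first measurements
e_x = mean_input_power(samples)      # (1 + 4 + 9) / 3
mu = initial_learning_rate(samples)  # two orders of magnitude below 1 / e_x
```

Signals with large absolute values, such as air pressure, lead to a large E_x and therefore to a very small learning rate.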

Weight initialization strategies

Weight initialization can have a great impact on the efficiency of the model. In [3], all the weights are initialized to zero. This strategy guarantees that all the filters start with the same weights. However, this static initialization strategy is not efficient, especially when all the values of X(t) are much greater than zero. This is because, according to equation 3.2, the first predicted value y(t) is always 0, which differs too much from x(t).

We hereby propose another weight initialization strategy, namely fast start. The weights are initialized as in equation 3.9.


This initialization strategy can make the first predicted value very close to the desired output, thereby speeding up learning.

We have also conducted experiments to compare these two weight initialization strategies.
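The effect of the starting weights on the first prediction can be illustrated with a small sketch. It assumes, purely for illustration, that a fast start uses uniform weights 1/N, which makes the first prediction equal to the mean of the last N samples; this choice, the function names, and the sample readings are assumptions and may differ from the strategy used in the experiments.

```python
def zero_weights(n):
    """Static initialization: every weight starts at zero."""
    return [0.0] * n


def uniform_weights(n):
    """Assumed fast start: every weight equals 1/N (illustrative choice)."""
    return [1.0 / n] * n


def predict(weights, window):
    """Weighted sum of the last N samples, as in equation 3.2."""
    return sum(w * x for w, x in zip(weights, window))


window = [20.1, 20.0, 19.9, 20.0]  # hypothetical temperature readings
first_zero = predict(zero_weights(4), window)     # always 0, far from the data
first_fast = predict(uniform_weights(4), window)  # mean of the window
```

With zero weights the filter first has to learn the overall level of the signal, whereas a start close to the signal level leaves only the finer dynamics to be learned.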

3.2.2 Dual prediction

Dual prediction is a prediction-based data reduction technique that has been used in many papers [6, 7, 10]. Consider a sensor that produces a data stream X(0), X(1), X(2), ... We can use a time series prediction model, such as the LMS filter, to forecast the next value X_p(k) from several previous values, while the actually measured value is X(k). Let e_max be the pre-defined error threshold. If |X_p(k) − X(k)| < e_max, X(k) is not sent, since it can be predicted by any receiver that runs the same time series prediction model. If |X_p(k) − X(k)| > e_max, X(k) is transmitted, since the error exceeds the threshold, and the prediction model is updated on both sides.

The dual prediction algorithm for data reduction is presented in Algorithm 1, and data reconstruction is presented in Algorithm 2. In practice, data reduction can happen at both the publishers and the broker, and data reconstruction can happen at both the broker and the subscribers.

Figure 3.3 shows an example of dual prediction, where the publisher performs data reduction and the subscriber performs data reconstruction.

The publisher publishes temperature measurements T(t) to the broker with the topic T, and one subscriber subscribes to the topic T. The subscriber has a maximum tolerable error e_max and notifies the publisher of this parameter. The publisher and the subscriber run the same time series forecasting model for the topic T. Every time the publisher is about to publish a message, it compares the measured temperature with the value predicted by the model, and sends the data only if the difference exceeds e_max.
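The interplay between the publisher and the subscriber can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not the thesis implementation: the class DualPredictor and the functions publisher_step and subscriber_step are hypothetical, the starting weights 1/N are an assumed choice, the first N values are always sent to fill the window, and a skipped value is replaced by the shared prediction on both sides so that the two models stay identical.

```python
import math
from collections import deque


class DualPredictor:
    """One end of a dual prediction pair; both ends run identical code."""

    def __init__(self, n, mu):
        self.n, self.mu = n, mu
        self.w = [1.0 / n] * n         # assumed starting weights
        self.window = deque(maxlen=n)  # most recent value first
        self.count = 0                 # consecutive successful predictions

    def predict(self):
        return sum(w * x for w, x in zip(self.w, self.window))

    def learn(self, value):
        """LMS update with a transmitted value, then shift it into the window."""
        if len(self.window) == self.n:
            error = value - self.predict()
            self.w = [w + self.mu * x * error for w, x in zip(self.w, self.window)]
        self.window.appendleft(value)

    def skip(self):
        """Nothing transmitted: both ends shift in the shared prediction."""
        predicted = self.predict()
        self.window.appendleft(predicted)
        return predicted


def publisher_step(model, value, e_max):
    """Return True if `value` has to be transmitted."""
    warming_up = len(model.window) < model.n
    if warming_up or abs(model.predict() - value) > e_max:
        model.count = 0            # failed prediction (or window not full yet)
        model.learn(value)
        return True
    if model.count < model.n:      # waiting mode: keep sending and learning
        model.count += 1
        model.learn(value)
        return True
    model.skip()                   # idle mode: nothing is sent
    return False


def subscriber_step(model, received):
    """Return the reconstructed value; `received` is None if nothing arrived."""
    if received is not None:
        model.learn(received)
        return received
    return model.skip()


# Hypothetical slowly varying temperature signal around 20 degrees.
series = [20.0 + 2.0 * math.sin(t / 20.0) for t in range(300)]
mu = 2.5e-5  # roughly 1 / (100 * E_x) for values near 20
pub, sub = DualPredictor(4, mu), DualPredictor(4, mu)
sent, max_error = 0, 0.0
for x in series:
    transmit = publisher_step(pub, x, e_max=0.5)
    y = subscriber_step(sub, x if transmit else None)
    sent += transmit
    max_error = max(max_error, abs(y - x))
```

The simulation passes values directly between the two objects; in a real deployment the transmitted values would travel as MQTT PUBLISH messages through the broker.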

3.3 UDF-based subscription


Algorithm 1: Data reduction algorithm

threshold ← error bound;
N ← length of the filter;
Initialize the LMS filter based on equations 3.8 and 3.9;
Initialize the count of successful predictions: count = 0;
while True do
    if a new message is to be sent then
        extract the value y of the message;
        calculate the predicted value ypred based on equation 3.2;
        calculate the prediction error error according to equation 3.4;
        if |error| > threshold then
            // failed prediction: update filter, reset count, and send message
            update the filter based on equation 3.3;
            count = 0;
            send this message;
        else
            if count < N then
                // waiting mode: update filter, increase count, and send message
                count = count + 1;
                update the filter based on equation 3.3;
                send this message;
            else
                // idle mode: do nothing
            end
        end
    end
end

Figure 3.4 depicts the inflexible subscription of MQTT+. Several publishers continuously publish raw messages to the MQTT+ broker, which is implemented by leveraging HiveMQ and SQLite. The MQTT+ broker pre-defines some enhanced subscription syntax, including temporal average, spatial average, etc. Subscribers can use this enhanced subscription syntax to receive only the aggregated or filtered messages they are interested in, rather than receiving all the raw messages and processing them themselves.


Algorithm 2: Data reconstruction algorithm

N ← length of the filter;
Initialize the LMS filter based on equations 3.8 and 3.9;
while True do
    if a new message is received then
        // update the filter
        extract the value y of the message;
        update the filter based on equation 3.3;
        count = 0;
    else
        // not received: use predicted value
        calculate the predicted value ypred based on equation 3.2;
        y = ypred;
    end
end

Figure 3.3: Dual prediction.

of the message with topic T1, and the integral operation is not pre-defined in the broker. There are two choices for the subscriber:

• The subscriber subscribes to the topic T1 and receives all the raw messages. This choice wastes network bandwidth, which contradicts the goal of MQTT+.


scalable when the number of subscribers that are calling for adding new functionalities is way too large.

Figure 3.4: Inflexible subscription of MQTT+.

Neither of these two solutions is suitable. In our research, we enable clients to subscribe to theoretically any content they are interested in by passing a UDF to the broker. We call this method UDF-based subscription.

3.3.1 Architecture

We implement the UDF-based subscription by leveraging Docker containers [45].

Docker is a popular operating system (OS) level container virtualization technology. It differs from hypervisor virtualization, where several virtual machines (VMs) share the hardware and each VM has its own OS. Docker is more lightweight, since it virtualizes several instances at the OS level. Each instance, called a container, shares both the hardware and the OS resources of the host machine. Docker has several important components:

• Docker engine: The core component of the Docker system, and it runs on the OS and provides the foundation for Docker containers.

• Docker clients: Any client that can interact with the Docker engine through command line or REST API.

• Docker image: The template for creating Docker container.

• Docker container: Created from Docker image and runs on the Docker engine.


Figure 3.5 shows the architecture of the UDF-based subscription. When a subscriber would like to subscribe to something that is not currently supported by the MQTT+ broker, it can write a UDF that implements the logic to convert raw messages into the messages it wants. It then packages the implemented UDF as a Docker image and registers this image in the Docker registry, which is hosted by the broker. After these two steps, the subscriber sends a SUBSCRIBE message with a topic in the format $UDF/(address). The broker then recognizes this as a UDF-based subscription and downloads the corresponding image from the Docker registry using the address in the SUBSCRIBE message. Lastly, the raw messages are sent into a container started from this image, and the output of the container is forwarded to the subscriber.
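The subscription flow described above can be sketched in Python as follows. This is a simplified model under stated assumptions: the Broker class, the parse_subscription helper, and the registry address are hypothetical, and a plain Python callable stands in for the containerized UDF, so no Docker calls are made. The example UDF computes a running sum of the values on topic T1 as a simple stand-in for an integral operation.

```python
UDF_PREFIX = "$UDF/"


def parse_subscription(topic_filter):
    """Classify a SUBSCRIBE topic as ('udf', address) or ('plain', topic)."""
    if topic_filter.startswith(UDF_PREFIX) and len(topic_filter) > len(UDF_PREFIX):
        return "udf", topic_filter[len(UDF_PREFIX):]
    return "plain", topic_filter


class Broker:
    """Toy broker; `registry` maps an address to a callable UDF stand-in."""

    def __init__(self, registry):
        self.registry = registry
        self.subscriptions = []  # entries: (kind, target, deliver)

    def subscribe(self, topic_filter, deliver):
        kind, target = parse_subscription(topic_filter)
        if kind == "udf":
            # Stands in for pulling the image and starting a container.
            target = self.registry[target]
        self.subscriptions.append((kind, target, deliver))

    def publish(self, topic, value):
        for kind, target, deliver in self.subscriptions:
            if kind == "udf":
                output = target(topic, value)  # raw message goes into the UDF
                if output is not None:
                    deliver(output)            # UDF output goes to the subscriber
            elif target == topic:
                deliver(value)


def make_running_sum(topic):
    """Example UDF: running sum of the values published on `topic`."""
    total = 0.0

    def udf(msg_topic, value):
        nonlocal total
        if msg_topic != topic:
            return None
        total += value
        return total

    return udf


received = []
broker = Broker({"registry.example/t1-sum": make_running_sum("T1")})
broker.subscribe("$UDF/registry.example/t1-sum", received.append)
for v in (1.0, 2.0, 3.0):
    broker.publish("T1", v)
broker.publish("T2", 9.0)  # ignored by the UDF
```

The subscriber receives only the UDF output, so the raw messages never cross the downlink.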

Figure 3.5: UDF-based subscription.


Chapter 4 Result and Analysis

In this chapter, we present the results and discuss them.

Section 4.1 gives the details of how the development and experiment environment is set up, and section 4.2 describes the dataset used in our experiments. In section 4.3, we benchmark the reproduced MQTT+ broker by comparing the MQTT broker with the MQTT+ broker. Section 4.4 presents the experimental results when data reduction algorithms are employed. The functionality of UDF-based subscription is shown in section 4.5. Last but not least, we discuss all the results in section 4.5.1.

4.1 Experiment Setup

The testbed is set up in the Ericsson OpenStack cloud environment, as shown in Figure 4.1.

Virtual Machine 1 (VM1) hosts a simple MQTT broker, and VM2 has an enhanced MQTT broker which enables data reduction. VM3 contains a number of producers, and each producer is a separate docker container. VM4 contains several consumers, and each consumer is a docker container.

Two traffic measurers are installed at downlink and uplink to monitor the traffic. Downlink is the link between consumers and brokers, while uplink is the link between producers and brokers.


Figure 4.1: Testbed.

4.2 Dataset

This thesis uses a weather station dataset [46], which was collected by the Max-Planck-Institute for Biogeochemistry. Fourteen features of the dataset are used in the experiments, and Table 4.1 shows the meaning of each feature [47].

p        Air pressure
T        Air temperature
Tpot     Potential temperature
Tdew     Dew point temperature
rh       Relative humidity
VPmax    Saturation water vapor pressure
VPact    Actual water vapor pressure
VPdef    Water vapor pressure deficit
sh       Specific humidity
H2OC     Water vapor concentration
rho      Air density
wv       Wind velocity
max.wv   Maximum wind velocity
wd       Wind direction


4.3 MQTT vs. MQTT+

Results in this section aim to show that MQTT+ can apply aggregation operations, thereby reducing downlink traffic. We expect to get results similar to those in [3].

The experiment compares the downlink and uplink traffic of the original MQTT broker and the MQTT+ broker. Specifically, 14 publishers first publish measurements to the MQTT broker every 1.5 seconds, and the traffic within 2000 seconds is measured. There are 6 measured values, since we chose 6 different numbers of subscribers (1, 2, 3, 4, 5, 6). All the subscribers are interested in the average value of all the topics over the previous 15 seconds. Since the original MQTT broker does not support aggregation operations, all the subscribers have to subscribe to all the topics and compute the average values themselves. We then made the same measurements for the MQTT+ broker. The results are shown in Tables 4.2 and 4.3.

Number of subscribers 1 2 3 4 5 6

MQTT (kb) 157.88 279.07 417.65 549.15 676.76 814.68

MQTT+ (kb) 13.55 26.48 39.15 51.32 64.28 76.24

Table 4.2: Downlink traffic comparison.

Number of subscribers 1 2 3 4 5 6

MQTT (kb) 143.90 151.25 146.69 144.88 146.61 145.49

MQTT+ (kb) 146.61 143.84 144.78 147.13 144.61 145.49

Table 4.3: Uplink traffic comparison.

From the tables, we can see that the downlink volume increases as the number of subscribers increases, and that the MQTT+ broker outperforms the plain MQTT broker in terms of downlink bandwidth usage. The reason is that MQTT+ applies aggregation operations to the messages received from publishers and publishes only the messages that subscribers are interested in.
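The downlink figures of Table 4.2 can be turned into relative savings with a few lines of Python. The numbers are copied from the table; the variable names are only illustrative.

```python
# Downlink traffic in kb for 1 to 6 subscribers, copied from Table 4.2.
mqtt_kb = [157.88, 279.07, 417.65, 549.15, 676.76, 814.68]
mqtt_plus_kb = [13.55, 26.48, 39.15, 51.32, 64.28, 76.24]

# Relative saving of MQTT+ over plain MQTT for each number of subscribers.
savings = [1.0 - plus / plain for plain, plus in zip(mqtt_kb, mqtt_plus_kb)]

for subscribers, saving in enumerate(savings, start=1):
    print(f"{subscribers} subscriber(s): {saving:.1%} less downlink traffic")
```

For every number of subscribers, the saving is slightly above 90%.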

However, the uplink traffic is not reduced by MQTT+, because the behavior of publishers is not affected when introducing MQTT+. This is one of the aforementioned shortcomings of MQTT+.


4.4 Prediction-based data reduction algorithms

This section presents the experimental results when deploying LMS filters on the pub/sub system.

4.4.1 Proof of concept

Before deploying LMS filters on the pub/sub system, we conduct a proof of concept to see how much reduction can be gained theoretically. We implemented the LMS filter in pure Python and tested the model on the weather dataset described in section 4.2.

The reduction ratio is used as the metric. It denotes the fraction of messages that are not sent. The higher the reduction ratio, the more bandwidth is saved. The reduction ratio is calculated by formula 4.1.

\[ \text{Reduction ratio} = \frac{\text{number of unsent messages}}{\text{total number of messages}} \tag{4.1} \]

We implemented the LMS filter algorithm in pure Python 3.5 and used two weight initialization strategies, namely with fast start and without fast start. Figures 4.2 and 4.3 show the results for the features T and p. Experimental results for the rest of the features are presented in the appendix in Figures 4.6 and 4.7.
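As a small worked example of formula 4.1, the reduction ratio can be computed from a record of which messages were actually transmitted. The function name reduction_ratio and the list-of-flags representation are assumptions made for illustration.

```python
def reduction_ratio(sent_flags):
    """Formula 4.1: fraction of messages that were not transmitted.

    `sent_flags` holds one boolean per produced message, True if it was sent.
    """
    total = len(sent_flags)
    if total == 0:
        raise ValueError("no messages observed")
    unsent = sum(1 for sent in sent_flags if not sent)
    return unsent / total


# Hypothetical run: 8 messages produced, 3 of them transmitted.
flags = [True, True, False, False, True, False, False, False]
ratio = reduction_ratio(flags)
```

A ratio of 0 means that every message was transmitted, and a ratio close to 1 means that almost all messages could be predicted by the receiver.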

From the two figures, we can see that a larger error bound yields a higher reduction ratio. Also, fast start is a better weight initialization strategy than the other strategy. This is because the algorithm finds a global optimal point faster when the weights are initialized with the fast start strategy.

Figures 4.4 and 4.5 show how the error bound affects the forecasting. The blue curve is the real measurement of the topic P, and the green curve is the forecast value. If the error bound is set to a large value such as 2, as in Figure 4.5, the green curve is a straight line most of the time, which means that no data is sent by the publishers, and the subscribers can use the algorithm to predict the missing messages with a maximum error of 2. However, if the error bound is set to 0.5, the trend of the curve changes frequently, which means that the publishers have to publish messages more frequently than with an error bound of 2.


Figure 4.2: Reduction ratio for topic T.


Figure 4.4: Topic P(error bound = 0.5).

Figure 4.5: Topic P(error bound = 2).

such as the feature wd. This means that most messages do not need to be sent, since they can be predicted by the LMS filter, provided that minor errors are allowed.


Figure 4.6: Reduction ratio of all features (Part 1). Panels: (a) Tpot, (b) Tdew, (c) rh, (d) VPmax, (e) VPact, (f) VPdef. Axes: error bound, reduction ratio.

Figure 4.7: Reduction ratio of all features (Part 2). Panels: (a) sh, (b) H2OC, (c) rho, (d) wv, (e) max.wv, (f) wd.

If the LMS filter forecasting algorithm is deployed in our pub/sub system, publishers can theoretically reduce the number of transmissions. For example, if the error bound for topic T is 0.5, Figure 4.2 shows that more than 50% of the messages do not need to be sent.


4.4.2 On testbed

The results of the proof of concept show that some messages do not need to be sent if minor errors are allowed.

Without data reduction (kb) 148.94 146.67 150.21 149.95 143.32 150.16

With data reduction (kb) 89.50 87.99 88.23 89.12 89.31 90.23

Table 4.4: Uplink traffic with data reduction

We deploy the algorithm on our pub/sub system and run an experiment on the testbed similar to that in section 4.3. The only difference is that all subscribers now subscribe to the raw value of each feature. The publishers run the data reduction algorithm to reduce transmissions, and all the subscribers are capable of reconstructing messages that are not received. We set the error bound to 0.5 for each feature. The result is shown in Table 4.4.

As the table shows, the total amount of uplink traffic is reduced by roughly 38% to 41%, about 40% on average.
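The uplink savings can be recomputed from the six measurement pairs of Table 4.4. The values are copied from the table; the variable names are only illustrative.

```python
# Uplink traffic in kb for the six measurements of Table 4.4.
without_kb = [148.94, 146.67, 150.21, 149.95, 143.32, 150.16]
with_kb = [89.50, 87.99, 88.23, 89.12, 89.31, 90.23]

# Relative saving per measurement, plus the smallest, largest, and mean saving.
savings = [1.0 - reduced / raw for raw, reduced in zip(without_kb, with_kb)]
lowest, highest = min(savings), max(savings)
mean_saving = sum(savings) / len(savings)

print(f"saving between {lowest:.1%} and {highest:.1%}, mean {mean_saving:.1%}")
```

Unlike the downlink savings obtained by aggregation in the broker, these savings come at the price of a bounded reconstruction error at the subscribers.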

4.5 UDF-based subscription

Without UDF (kb) 157.67 278.94 417.63 549.10 677.61 814.27

With UDF (kb) 9.01 18.02 27.32 34.74 44.29 52.18

Table 4.5: Downlink traffic with UDF-based subscription


the broker can use the container to process the received messages and forward the output of the container to the subscribers. The comparison result is shown in Table 4.5.

The broker cannot pre-define every possible operation, which means that in most cases the downlink traffic cannot be reduced even with the MQTT+ broker. UDF-based subscription, however, allows subscribers to subscribe to potentially any content without giving up data reduction in the downlink traffic, regardless of whether the operation is pre-defined in the broker.

4.5.1 Discussion


Chapter 5 Conclusions and Future Work

5.1 Conclusions

In our research, we design a pub/sub system with data reduction and flexible subscription, which can be used to support efficient data collection and massive deployment of IoT in the upcoming 5G era. More specifically, we aim to reduce bandwidth consumption in the MQTT-based pub/sub system, thereby reducing the network burden of transferring massive data. Firstly, we reproduced MQTT+, an enhanced version of MQTT that offers functionalities such as data filtering and data aggregation. Secondly, we introduced an LMS-based dual prediction algorithm into the pub/sub system to reduce the number of data transmissions by publishers. Thirdly, we enhanced the flexibility of our pub/sub system by supporting UDF-based subscription, leveraging containerization technology. In the evaluation, experiments on the IoT dataset show that, compared with the MQTT broker, MQTT+ reduces bandwidth consumption between the broker and subscribers, while the traffic between publishers and the broker remains almost the same. Furthermore, dual prediction helps to reduce the traffic between the broker and publishers, provided that some error is allowed. Lastly, UDF-based subscription allows subscribers to subscribe to any content without giving up data reduction in the downlink traffic, regardless of whether the operation is pre-defined in the broker. Another advantage is that the UDF is programming language independent, since it is containerized.

5.2 Future work

Our system can be improved in many aspects. First of all, there could be serious security issues in the UDF-based subscription. The UDF-based subscription works in a way that subscribers pass a container to the broker, and the


Bibliography

[1] IoT connections outlook | mobility report. [Online]. Available: https://www.ericsson.com/en/mobility-report/reports/november-2019/iot-connections-outlook

[2] A. E. C. Redondi, A. Arcia-Moret, and P. Manzoni, “Towards a scaled IoT pub/sub architecture for 5g networks: the case of multiaccess edge computing,” in 2019 IEEE 5th World Forum on Internet of Things (WF-IoT). IEEE. doi: 10.1109/WF-IoT.2019.8767268. ISBN 978-1-5386-4980-0 pp. 436–441. [Online]. Available: https://ieeexplore.ieee.org/document/8767268/

[3] R. Giambona, A. E. C. Redondi, and M. Cesana, “MQTT+: Enhanced syntax and broker functionalities for data filtering, processing and aggregation.” [Online]. Available: http://arxiv.org/abs/1810.00773

[4] A. Jain, E. Y. Chang, and Y.-F. Wang, “Adaptive stream resource management using kalman filters,” in Proceedings of the 2004 ACM SIGMOD international conference on Management of data - SIGMOD ’04. ACM Press. doi: 10.1145/1007568.1007573. ISBN 978-1-58113-859-7 p. 11. [Online]. Available: http://portal.acm.org/citation.cfm?doid=1007568.1007573

[5] Y. Fathy, P. Barnaghi, and R. Tafazolli, “An adaptive method for data reduction in the internet of things,” in 2018 IEEE 4th World Forum on Internet of Things (WF-IoT). IEEE. doi: 10.1109/WF-IoT.2018.8355187. ISBN 978-1-4673-9944-9 pp. 729–735. [Online]. Available: https://ieeexplore.ieee.org/document/8355187/

[6] S. Santini and K. Romer, “An adaptive strategy for quality-based data reduction in wireless sensor networks,” p. 8.

[7] A. Jarwan, A. Sabbah, and M. Ibnkahla, “Data transmission reduction schemes in WSNs for efficient IoT systems,” vol. 37, no. 6, pp. 1307–1324. doi: 10.1109/JSAC.2019.2904357. [Online]. Available: https://ieeexplore.ieee.org/document/8664582/

[8] G. M. Dias, B. Bellalta, and S. Oechsner, “Using data prediction techniques to reduce data transmissions in the IoT,” in 2016 IEEE 3rd World Forum on Internet of Things (WF-IoT). IEEE. doi: 10.1109/WF-IoT.2016.7845518. ISBN 978-1-5090-4130-5 pp. 331–335. [Online]. Available: http://ieeexplore.ieee.org/document/7845518/

[9] P. Zehnder, P. Wiener, and D. Riemer, “Using virtual events for edge-based data stream reduction in distributed publish/subscribe systems,” in 2019 IEEE 3rd International Conference on Fog and Edge Computing (ICFEC). IEEE. doi: 10.1109/CFEC.2019.8733146. ISBN 978-1-72812-365-3 pp. 1–10. [Online]. Available: https://ieeexplore.ieee.org/document/8733146/

[10] A. Shastri, “Master of science in electronics and communication engineering by research,” p. 67.

[11] G. Anastasi, M. Conti, M. Di Francesco, and A. Passarella, “Energy conservation in wireless sensor networks: A survey,” Ad Hoc Netw., vol. 7, no. 3, pp. 537–568, May 2009. doi: 10.1016/j.adhoc.2008.06.003. [Online]. Available: https://doi.org/10.1016/j.adhoc.2008.06.003

[12] V. Raghunathan, C. Schurgers, Sung Park, and M. Srivastava, “Energy-aware wireless microsensor networks,” vol. 19, no. 2, pp. 40–50. doi: 10.1109/79.985679. [Online]. Available: http://ieeexplore.ieee.org/document/985679/

[13] S. Yadav and S. Singh, “Review paper on development of mobile wireless technologies (1g to 5g),” Int. J. Comput. Sci. Mob. Comput, vol. 7, pp. 94–100, 2018.

[14] L. J. Vora, “Evolution of mobile generation technology: 1g to 5g and review of upcoming wireless technology 5g,” International journal of modern trends in engineering and research, vol. 2, no. 10, pp. 281–290, 2015.

[15] P. Gupta, “Evolvement of mobile generations: 1g to 5g,” International Journal for Technological Research in Engineering, vol. 1, pp. 152–157, 2013.


[17] S. Li, L. D. Xu, and S. Zhao, “5g internet of things: A survey,” p. 28.

[18] D. Sabella, A. Vaillant, P. Kuure, U. Rauschenbach, and F. Giust, “Mobile-edge computing architecture: The role of mec in the internet of things,” IEEE Consumer Electronics Magazine, vol. 5, no. 4, pp. 84–91, 2016.

[19] P. T. Eugster, P. A. Felber, R. Guerraoui, and A.-M. Kermarrec, “The many faces of publish/subscribe,” vol. 35, no. 2, pp. 114–131. doi: 10.1145/857076.857078. [Online]. Available: http://portal.acm.org/citation.cfm?doid=857076.857078

[20] Publish & subscribe - MQTT essentials: Part 2. [Online]. Available: https://www.hivemq.com/blog/mqtt-essentials-part2-publish-subscribe/

[21] D. Happ, N. Karowski, T. Menzel, V. Handziski, and A. Wolisz, “Meeting IoT platform requirements with open pub/sub solutions,” vol. 72, no. 1, pp. 41–52. doi: 10.1007/s12243-016-0537-4. [Online]. Available: http://link.springer.com/10.1007/s12243-016-0537-4

[22] MQTT. [Online]. Available: http://mqtt.org/

[23] Apache kafka. [Online]. Available: https://kafka.apache.org/

[24] CoAP — constrained application protocol | overview. [Online]. Available: https://coap.technology/

[25] Pub/sub – redis. [Online]. Available: https://redis.io/topics/pubsub

[26] Content-based filtering using ESQL. [Online]. Available: https://www.ibm.com/support/knowledgecenter/en/SSMKHH_10.0.0/com.ibm.etools.mft.doc/bq13450_.htm

[27] S. A. Shinde, P. A. Nimkar, S. P. Singh, V. D. Salpe, and Y. R. Jadhav, “Mqtt-message queuing telemetry transport protocol,” International Journal of Research, vol. 3, no. 3, pp. 240–244, 2016.

[28] Eclipse mosquitto. [Online]. Available: https://mosquitto.org/

[29] MQTT broker for IoT in 5g era | EMQ. [Online]. Available: https://www.emqx.io/


[31] G. Anastasi, M. Conti, M. Di Francesco, and A. Passarella, “Energy conservation in wireless sensor networks: A survey,” Ad hoc networks, vol. 7, no. 3, pp. 537–568, 2009.

[32] Intel research lab berkeley: Intel lab data. [Online]. Available: http://db.csail.mit.edu/labdata/labdata.html

[33] SQLite home page. [Online]. Available: https://www.sqlite.org/index.html

[34] Callbacks :: HiveMQ documentation. [Online]. Available: https://www.hivemq.com/docs/hivemq/3.4/plugins/callbacks.html

[35] Services :: HiveMQ documentation. [Online]. Available: https://www.hivemq.com/docs/hivemq/3.4/plugins/services.html

[36] C. Chatfield, Time-series forecasting. Chapman & Hall/CRC. ISBN 978-1-58488-063-9

[37] G. Zhang, “Time series forecasting using a hybrid ARIMA and neural network model,” vol. 50, pp. 159–175. doi: 10.1016/S0925-2312(01)00702-0. [Online]. Available: https://linkinghub.elsevier.com/retrieve/pii/S0925231201007020

[38] G. M. Dias, B. Bellalta, and S. Oechsner, “A survey about prediction-based data reduction in wireless sensor networks.” [Online]. Available: http://arxiv.org/abs/1607.03443

[39] B. Widrow, “Thinking about thinking: the discovery of the LMS algorithm,” vol. 22, no. 1, pp. 100–106. doi: 10.1109/MSP.2005.1407720. [Online]. Available: http://ieeexplore.ieee.org/document/1407720/

[40] “ADAPTIVE SWITCHING CIRCUITS,” p. 48.

[41] S. R. Prasad and B. B. Godbole, “Optimization of LMS algorithm for system identification,” p. 13.

[42] E. Ferrara, “Fast implementations of lms adaptive filters,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 28, no. 4, pp. 474–475, 1980.


[45] Empowering app development for developers | docker. [Online]. Available: https://www.docker.com/?utm_source=google&utm_medium=cpc&utm_campaign=dockerhomepage&utm_content=nemea&utm_term=dockerhomepage&utm_budget=growth&gclid=EAIaIQobChMIwfyy8NCk6gIVxcqyCh0v-gG7EAAYAiAAEgL9vfD_BwE

[46] Max-planck-institut fuer biogeochemie - wetterdaten. [Online]. Available: https://www.bgc-jena.mpg.de/wetter/


TRITA-EECS-EX-2020:782