An evaluation of how edge computing is enabling the opportunities for Industry 4.0


Wictor Svensson

Thesis - Institution of Information Systems and Technology
Main field of study: Computer Engineering

Credits: 300

Semester, year: Spring, 2020

Supervisor: Stefan Forsström, stefan.forsstrom@miun.se
Examiner: Tingting Zhang, tingting.zhang@miun.se
Course code/registration number: DT005A


Abstract

Connecting factories to the internet and enabling them to autonomously talk to each other is called the Industrial Internet of Things (IIoT), referred to as Industry 4.0 in terms of the industrial revolutions. The machines collect data through a large number of different sensors and need to share these values with each other and with the cloud. This places a heavy load on the cloud and the internet, and the latency becomes large. To evaluate how the workload and the latency can be reduced while still producing the same result as a cloud solution, two different systems are implemented: one which uses the cloud and one which uses edge computing. Edge computing means that the processing of the data is decentralized to the edge of the network. This thesis aims to find out when it is more favorable to use an edge solution and when a cloud solution is preferable. The first system is implemented with an edge platform, Crosser; the second system is implemented with a cloud platform, Azure. Both implementations give the same outputs, but they differ in where the data is processed. The systems are measured in latency, bandwidth, and CPU usage. The measurements show that the Crosser system has lower latency and uses less bandwidth, but needs more computational power on the device at the edge of the network. The conclusion is that the preferred solution depends on the demands of the system. If the demands are low latency and low bandwidth usage, Crosser is preferable. But if a very heavy machine learning algorithm is going to be executed in the system, and latency and bandwidth are not a problem, then the Cloud Reference System is preferable.


Acknowledgments

I am grateful for the help I have gotten from Knowit, specifically from Kristin and Tim, who have acted as my supervisors, been engaged, had regular meetings with me every week, and supported me throughout the whole thesis.

I would also like to thank Crosser, who have let me use their platform and given me support when I have had questions.


Table of Contents

Abstract
Acknowledgments
Table of Contents
Terminology
1 Introduction
1.1 Background and problem motivation
1.2 Overall aim
1.3 Concrete and verifiable goals
1.4 Scope
1.5 Outline
2 Theory
2.1 Internet of things
2.2 Industrial internet of things
2.2.1 Automation control protocols
2.2.2 Architecture of connected factories
2.2.3 Industry 4.0
2.3 Edge computing
2.3.1 Fog
2.3.2 Mist
2.3.3 MEC (Multi-Access Edge Computing)
2.4 Machine learning
2.4.1 Decision tree
2.5 Related work
2.5.1 A fog computing industrial cyber-physical system for embedded low-latency machine learning Industry 4.0 applications
2.5.2 Ultra-low latency cloud-fog computing for industrial internet of things
2.5.3 Hierarchical fog-cloud computing for IoT systems: a computation offloading game
3 Methodology
4 Choice of edge computing solution
4.1 Microsoft Azure IoT Edge
4.2 Amazon Web Services FreeRTOS & Greengrass
4.3 Cisco Fog Computing
4.4 Nebbiolo Technologies
4.5 FogHorn
4.6 Crosser
4.7 Motivation of choice
5 Implementation
5.1 General approach
5.1.1 Sensor data
5.1.2 Pre-processing
5.1.3 Machine learning
5.1.4 Visualization
5.1.5 Storage
5.2 Crosser system
5.2.1 Edge node
5.2.2 Azure IoT Hub and Stream Analytics
5.3 Cloud reference system
5.3.1 Gateway
5.3.2 Azure IoT Hub and Azure Stream Analytics
5.3.3 Azure Machine Learning Services
5.4 Measurement arrangement
5.4.1 Packet size
5.4.2 Round trip time
5.4.3 Computational power
6 Results
6.1 Packet size
6.1.1 Crosser system
6.1.2 Cloud reference system
6.2 Round trip time (RTT)
6.2.1 Crosser system
6.2.2 Cloud reference system
6.3 Computational power (CPU usage)
6.3.1 Crosser system
6.3.2 Cloud reference system
6.4 Front-end result
6.5 Analysis of the measurement results
6.5.1 Packet size
6.5.2 Round trip time (RTT)
6.5.3 Computational usage (CPU usage)
6.5.4 Comparison to related work
7 Conclusions
7.1 Future work
7.2 Ethical considerations
References
Appendix A Time plan

Terminology

Acronyms

AI Artificial Intelligence

ANN Artificial Neural Networks

AR Application Relationship

CF-IIoT Cloud-fog integrated IIoT

CIP Common Industrial Protocol

CoAP Constrained Application Protocol

CPS Cyber Physical Systems

CPU Central Processing Unit

CR Communication Relationship

ETSI European Telecommunications Standards Institute

ERP Enterprise Resource Planning

Ethernet/IP Ethernet/Industrial Protocol

HTTP Hypertext Transfer Protocol

IIC Industrial Internet Consortium

IIoT Industrial Internet of Things


IP Internet Protocol

M2M Machine-to-Machine

MEC Multi-Access Edge Computing (earlier: Mobile Edge Computing)

ML Machine Learning

MQTT Message Queuing Telemetry Transport Protocol

NFC Near Field Communication

ODVA Open DeviceNet Vendors Association

OPC-UA OPC-Unified Architecture

OT Operational Technology

PCA Principal Component Analysis

PI Profibus International

PLC Programmable Logic Controller

PLS Partial Least Squares

QoE Quality of Experience

QoS Quality of Service

RCGA-CO Real-Coded Genetic Algorithm for Constrained Optimization

RFID Radio-Frequency Identification

RTT Round Trip Time

SDK Software Development Kit

SSL Secure Sockets Layer

SQL Structured Query Language

TCP Transmission Control Protocol

TLS Transport Layer Security

UDF User Defined Function

UDP User Datagram Protocol

VM Virtual Machine

WISP Wireless Identification and Sensing Platforms

1. Introduction

The Internet of Things (IoT) is getting more and more common in our society, where almost everything around us is predicted to be connected to the internet by the year 2025. This could be such things as food packages, furniture, paper documents, et cetera.[1]

The Industrial Internet of Things (IIoT) is when IoT is implemented in the industries. This is said to be revolutionary, since machines would be able to talk with each other without any human interaction. IIoT will be useful in many different industries, and there are examples of when IIoT has minimized losses in the factory and thereby increased the economic gain.[2] Using IIoT would also enhance the working conditions for workers. For example, it would be possible to have unmanned vehicles in industries where it could be dangerous for workers to be. This could be in mines where unstable shafts can collapse; by using IIoT these machines could be autonomous, and if machine learning is also implemented it is possible to predict when a collapse is about to occur.[3]

1.1 Background and problem motivation

Using machine learning in the industries can help to identify when a fault has occurred, or even predict a fault before it occurs. There are many different machine learning methods; the two main categories are supervised and unsupervised learning. Both can be favorable to use in industries, depending on the use case.[6]

Combining IIoT and machine learning will bring Industry 4.0 closer, but some challenges need to be solved before this is possible. Connecting the industries to the internet will involve connecting millions of sensors to the internet. These sensors send out multiple values every second, which is problematic, since sending all of these values to the cloud would take a great deal of bandwidth and create latency problems. The data which is generated is also often sensitive, and it is therefore a risk to have it analyzed in the cloud.

A solution to this could be fog or edge computing. Fog and edge computing are almost the same, and can be described as a platform introduced at the edge of the network, for example at a gateway. The edge platform can then process, analyze, or even act at the edge of the network. The benefits of edge computing are that it improves performance and data privacy, increases data security, and reduces economic costs.[7]

1.2 Overall aim

The overall aim of this project is to investigate how an edge solution performs compared to a cloud solution. The aim is also to examine how the edge solution can minimize latency and bandwidth while still not needing heavy computational power. The contribution of this thesis is to present when each system is to prefer. The impact is that the thesis makes the differences between edge computing and cloud computing clearer, which helps in the choice between different solutions when implementing IoT in the industries. The impact is thereby also a step closer to Industry 4.0.

1.3 Concrete and verifiable goals

The concrete and verifiable goals are divided into four goals where the second goal has three sub-goals. The goals are listed below.

1. Examine different edge solutions based on functions, usability, and capacity needed to run them, and choose one of them.

2. Implement the two systems:

2.1. Implement the chosen edge platform on a Raspberry Pi.

2.2. Implement a cloud solution on one of the bigger cloud service providers' platforms.

2.3. Implement a machine learning algorithm for classification depending on the sensor values, for both systems.

3. Measure the differences in latency, bandwidth needed, and computational usage.

4. Evaluate, in terms of the measurements, when each system is to be recommended.

1.4 Scope

The scope of the project is to set up an edge node which executes a machine learning algorithm to analyze the data it receives and then sends the data for storage in a cloud solution. The scope also includes setting up a system that does not use an edge node but only a cloud solution with the same functionality as the edge solution, and then evaluating the differences between these two solutions. The solutions are measured with respect to latency, bandwidth, and computational usage.

1.5 Outline


2. Theory

In the sub-chapters below the theory is presented. The following theory parts are explained: the Internet of Things, the Industrial Internet of Things, edge computing, and machine learning. A sub-chapter with related work is also covered in the theory part.

2.1 Internet of things

The Internet of Things (IoT) is defined as ”connect the unconnected” by Hanes et al.[8], where all objects around humans which are not already connected to the internet will be connected and communicate together. In other words, the physical world becomes smarter when connected to an intelligent network.

The age of IoT is said to have started in the years 2008 to 2009, when the number of devices connected to the internet surpassed the world population. The person who coined the term ”Internet of Things” was Kevin Ashton, when he was working at the company Procter & Gamble in 1999. He introduced the term when trying to explain an idea of linking their supply chain to the internet.[8]

In the beginning of IoT, the ”things” considered were Radio-Frequency Identification (RFID) tags. Later on, techniques such as Near Field Communication (NFC) and Wireless Sensor and Actuator Networks (WSAN) were used. In the beginning of IoT it was also said that Wireless Identification and Sensing Platforms (WISP) would be vital to connect ”the real world with the digital world”.[1]

The IoT technical stack, shown in figure 2.1, describes the layers needed to collect data from the devices, transport it, and analyze it, in order to optimize and automate the devices.[10]

Figure 2.1: IoT Technical Stack[10]

Many different protocols are used in the connectivity layer. Three of the best-known and most commonly used protocols are MQTT, CoAP, and HTTP.

MQTT (Message Queuing Telemetry Transport) is a lightweight publish/subscribe protocol. The transport protocol used together with MQTT is TCP, where TLS/SSL is used for security. MQTT also uses three levels of Quality of Service (QoS), which decide the reliability level of the packet delivery.[11]
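As an illustration, publishing a sensor value with an explicit QoS level could look as follows with the paho-mqtt Python client (a minimal sketch using the paho-mqtt 1.x API; the broker address and topic are placeholders):

```python
import paho.mqtt.client as mqtt

client = mqtt.Client()
client.tls_set()        # TLS/SSL on top of the TCP transport
client.connect("broker.example.com", 8883)
client.loop_start()

# QoS 0: at most once, QoS 1: at least once, QoS 2: exactly once
client.publish("factory/line1/temperature", payload="21.5", qos=1)

client.loop_stop()
client.disconnect()
```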

HTTP (Hypertext Transfer Protocol) is available in two different versions, HTTP/1 and HTTP/2. The protocol is a request/response protocol, which means that either a request or a response is sent. HTTP uses TCP as a transport protocol, where TLS/SSL is used for security. The difference between the two versions is that HTTP/2 supports all features of HTTP/1 but is more efficient in many ways. The HTTP protocol is often verbose and repetitive and therefore causes unnecessary network traffic, which can result in latency. The most common requests in HTTP are GET, PUT, POST, and DELETE.[12]
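For comparison, the four common request methods map directly onto, for example, the Python requests library (a sketch; the URL is a placeholder):

```python
import requests

url = "https://api.example.com/sensors/42"

resp = requests.get(url)                          # read a resource
requests.post(url, json={"temperature": 21.5})    # create a resource
requests.put(url, json={"temperature": 22.0})     # replace/update it
requests.delete(url)                              # remove it

print(resp.status_code)
```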

CoAP (Constrained Application Protocol) is, as the name tells, a protocol used in constrained nodes and networks, for example small microcontrollers with little ROM and RAM. The protocol is designed for M2M communication and is also well suited for low power consumption. CoAP is built on REST and is somewhat reminiscent of HTTP. However, CoAP uses UDP for transport, which is a datagram-oriented transport protocol. CoAP has four different types of messages: Confirmable, Non-confirmable, Acknowledgement, and Reset. The difference between these four messages is the reliability; since UDP is used, no acknowledgments are automatically included, and there is therefore no guarantee that a message will be delivered.[13]

2.2 Industrial internet of things

2.2.1 Automation control protocols

In the industries today there are many already established protocols used for motion control, synchronization, and security. These protocols were established a long time ago, before even Ethernet and IP were introduced, and have since been adapted to take advantage of modern protocols. There are a lot of different industrial protocols, but the most common, with the largest market adoption, are PROFINET, Ethernet/IP, and Modbus/TCP.[8]

PROFINET is managed by Profibus International (PI). The protocol uses a provider/consumer communication model, which means that a centralized device, the provider, communicates with decentralized devices, the consumers; the provider makes data available to the consumers. A device can have three different classifications: IO-Controller, IO-Device, and IO-Supervisor. An Application Relationship (AR) between an IO-Device and an IO-Controller establishes the communication. A Communication Relationship (CR) is established after the AR, at which point network data traffic and alarm management are configured. The devices can thereafter exchange data in full-duplex mode.[15]

Ethernet/IP (Industrial Protocol) was developed by Rockwell Automation and is an open standard for industrial automation. Here IP stands for ”Industrial Protocol” and not ”Internet Protocol”, which is a common misunderstanding. The Common Industrial Protocol (CIP) is adapted to the standard Ethernet protocol.[8] CIP is a communication protocol designed by the Open DeviceNet Vendors Association (ODVA). CIP is applied on Ethernet for on-demand usage, and CIP services enable communication without the need to establish a connection each time.[16]

2.2.2 Architecture of connected factories

To be able to connect many different sensors, some architecture framework is needed. The design of an IIoT architecture should be scalable, extensible, modular, and interoperable. There have been many suggestions for different IoT architectures; these solutions have five, four, or three layers. The International Telecommunication Union advocates a five-layer architecture with the layers sensing, accessing, networking, middleware, and application.[17] Atzori et al.[1] suggest a three-layer architecture with a sensing layer, a network layer, and a service layer. Liu et al.[18] describe a four-layer architecture with the layers physical, transport, middleware, and application.

The Industrial Internet Consortium (IIC) has released a paper, ”Reference Architecture”, which focuses on different viewpoints and provides models for these. The viewpoints are the Business Viewpoint, the Usage Viewpoint, the Functional Viewpoint, and the Implementation Viewpoint. The business viewpoint covers how the IIoT system can achieve stated objectives through its mapping to fundamental system capabilities. The usage viewpoint covers the requirements of the users involved in the IIoT system, for example interactive activities. The functional viewpoint covers the functional components of the IIoT system. The implementation viewpoint deals with the technologies needed to implement the functional components, the communication schemes, and the procedures of their life cycles. The four viewpoints are used to identify which concerns the IIoT system and its stakeholders have, and they make it easy to analyze and address the related concerns.[19]

2.2.3 Industry 4.0

The second and third industrial revolutions introduced production based on electricity and the internet, respectively. The fourth industrial revolution is named Industry 4.0 and is ongoing right now. The revolution builds on cyber-physical system (CPS) production. The role of Industry 4.0 is to improve effectiveness and efficiency in the factories by using heterogeneous data and knowledge integration. Some techniques associated with Industry 4.0 are RFID, IoT, and Enterprise Resource Planning (ERP).[20]

2.3 Edge computing

Edge computing is defined by [7] as a ”cloud computing system that perform data processing at the edge of the network, near the source of the data”, which is a very simple description of what edge computing is. Edge computing is when the computing infrastructure is decentralized and the computing resources and application services can be distributed between the cloud and the data source. The main benefits of edge computing are that it improves performance, that data privacy and data security are higher, and that it reduces costs. The edge improves performance since the data can be processed, analyzed, and acted on at the edge of the network before it is sent to the cloud; this is much closer to real-time than it ever was when the computing was done in the cloud. Edge computing also makes it easier to ensure good privacy and security for the collected data, since the data stays in a local network while processing and analyzing take place. The cost is automatically reduced, since connectivity, data migration, bandwidth, and latency features are very expensive in the cloud.[7]

In the paper [21], W. Yu et al. describe three areas in the architecture of edge computing: front-end, near-end, and far-end. The front-end consists of the devices and sensors, the near-end is the IoT gateways or local networks that connect the sensors, and the far-end is the network or cloud used for heavy computation and storage. Figure 2.2 shows what these three areas can look like.

Figure 2.2: Front-end, near-end and far-end in IoT[21]

The different forms of edge computing will be described briefly in the sub-chapters below to give a better picture.

2.3.1 Fog

Fog computing is described as an extension of cloud computing by S. Yi et al.[22], where the fog offers computation, storage, and networking services between the end devices and the cloud. The fog network should connect all devices which are in the fog and provide services to them.

F. Bonomi et al.[23] describe fog computing as a ”highly virtualized platform that provides compute, storage and networking services between end devices and traditional Cloud Computing Data Centers”. This platform is often, but not necessarily, placed at the edge of the network. The edge should have low latency and be location-aware. The fog should also be able to act as a large-scale sensor network, where it is possible to monitor the environment. The applications which are interesting to run in the fog are real-time applications. The fog platform should also support connectivity to the cloud service providers.

2.3.2 Mist

Mist computing is a term mentioned in connection with IoT and its future. Mist computing can be described as computing even closer to the devices than fog computing, where the guiding principles are: information is provided by the network; information is only delivered if a request has been made; dynamic creation in the network, where a subscriber/provider model should be used; and the devices must find and discover the providers dynamically and execute the applications. The difference between fog/cloud computing and mist computing is that fog/cloud computing has awareness of the users' needs and the global situation, while mist only has awareness of the physical environment and the local situation. This means that the mist only knows that the device exists and where it is, but not what the application is; in other words, the mist connects the devices.[24]

2.3.3 MEC (Multi-Access Edge Computing)

When MEC was first introduced by the European Telecommunications Standards Institute (ETSI), it stood for Mobile Edge Computing. MEC, which was renamed Multi-Access Edge Computing, means that the intelligence is extended to the edge of the network, with higher processing and storage capabilities.[25] Many IoT applications need ultra-low latency, such as 1 ms, and reliability as high as 99.99%; this is where MEC is useful.[26] MEC can be described as moving the computational power to the edge, at the mobile base stations, which brings the resources closer to the users compared to when the resources are in a centralized data center (cloud).[27]

2.4 Machine learning

Unsupervised learning is when the data does not have any given labels; the main goal is then to find some hidden structure in the data. Methods used for unsupervised learning in industries are principal component analysis, independent component analysis, k-means clustering, kernel density estimation, Gaussian mixture models, support vector data description, manifold learning, and self-organizing maps. Z. Ge et al. have compared which unsupervised models are most common in industries, where PCA (principal component analysis) is used in 51% of the cases.[6]

Supervised learning differs from unsupervised learning in that the methods deal with data that has labels. These labels can be either discrete or continuous: if the labels are discrete they can be used as classes, and if the values are continuous, regression models can be used, or pre-processing can be applied to form classes. Some of the best-known supervised methods in the industries are principal component regression, partial least squares, neural networks, artificial neural networks, support vector machines, decision trees, random forests, and nearest neighbor. In the report by Z. Ge et al. the most common supervised methods are PLS (partial least squares) and ANN (artificial neural networks). Decision trees are commonly used in operations where decisions are made, and are therefore also common in industries.[6]

2.4.1 Decision tree

A decision tree is described by S. R. Safavian and D. Landgrebe as follows: ”The basic idea involved in any multistage approach is to break up a complex decision into a union of several simpler decisions”. This means that a bigger problem is divided into many smaller steps which build a tree and will hopefully solve the bigger problem. A decision tree has the following components: a root, nodes, edges, and leaves. There is always exactly one root, which no edges enter. A graph consists of a set of nodes and a set of edges, and a leaf is a node with no proper descendants. Figure 2.3 shows what a decision tree can look like, where the top node is the root and the squares are leaves.[29]


Figure 2.3: Example of what a decision tree can look like[29]

The classification of an input starts at the root and continues until it reaches a leaf, where each leaf is the class the input belongs to. The leaf is thus what the decision tree predicts for the input value. A decision tree is easy for the human eye to read, since the conditions of each rule are Boolean.[28]
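To make the traversal concrete, a small decision tree is essentially nested Boolean conditions; a toy classifier could look like this (entirely hypothetical thresholds and classes):

```python
def classify(length, twigs):
    """Walk a toy decision tree: each if-condition is a node,
    each return value a leaf holding the predicted class."""
    if length > 400:           # root node
        if twigs <= 60:        # internal node
            return "class 1"   # leaf
        return "class 2"       # leaf
    return "wreck"             # leaf

print(classify(length=455, twigs=30))  # -> "class 1"
```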

2.5 Related work

In the sub-chapters below three different related works are presented.

2.5.1 A fog computing industrial cyber-physical system for embedded low-latency machine learning Industry 4.0 applications

P. O’donovan et al. are in their article ”A fog computing industrial cyber-physical system for embedded low-latency machine learning Industry 4.0 applications”[30] testing how fog (edge) computing performs compared to cloud computing. The fog nodes execute machine learning models, which are collected from the cloud where they have been trained and created.

Tests with 50, 100, 250, and 500 connections were made, and the maximum round trip time for each number of connections was measured for each system. This showed that the difference in maximum time between the systems was between 67.7% and 99.4%. The report also measures how many failures occur in both systems: the fog did not have any failures, while the cloud had 0% failures for 50 connections, 0.11% for 100 connections, 1.42% for 250 connections, and 6.6% for 500 connections.

P. O’donovan et al. conclude that for adopting Industry 4.0, fog computing should be better, since it makes decisions much faster than the cloud and minimizes the risk of failure compared to the cloud. It is also concluded that the fog has security benefits, since the machine learning is executed within the physical boundaries of the factory.

2.5.2 Ultra-low latency cloud-fog computing for industrial internet of things

C. Shi et al.[31] have integrated cloud and fog computing into something called ”cloud-fog integrated IIoT” (CF-IIoT) in their article ”Ultra-Low Latency Cloud-Fog Computing for Industrial Internet of Things”. The CF-IIoT architecture is built to satisfy low latency and consists of three layers: a cloud service layer, a fog computing layer, and an infrastructure layer. The infrastructure layer contains the sensors, manufacturing equipment, et cetera; its role is to collect sensor, manufacturing, and logistics data. The fog layer contains multiple edge devices which communicate with each other, and the fog nodes also communicate with the cloud. Most importantly, the data should be able to be processed in the fog nodes, which reduces the latency. The cloud service layer should store IIoT data, share global information, and be able to perform data mining. The paper presents an algorithm for optimizing the latency, called ”real-coded genetic algorithm for constrained optimization problem” (RCGA-CO). The algorithm is a modification of the real-coded GA algorithm, focusing on constrained optimization.

The measured latency is shown in figure 2.4. The CF-IIoT architecture achieved 92.9% and 35.7% better latency performance than the cloud-based architecture and the fog-based architecture, respectively.

Figure 2.4: Result of [31]

The conclusion of the paper is that, with the help of CF-IIoT and the RCGA-CO algorithm, it is possible to provide ultra-low latency, but it can be a bit unreliable.

2.5.3 Hierarchical fog-cloud computing for IoT systems: a computation offloading game

The paper formulates a computation offloading game which decides how much of the computation should be executed in the fog nodes and how much needs to be executed in the cloud service.

The results of the tests show that adding more fog nodes increases the QoE and decreases the delay, while the delay increases if the number of IoT users increases. The results also show that the more fog nodes that are added, the less computational power of the cloud service is needed. Finally, when the system runs on many fog nodes, most of the time in the system is spent on computation and communication, whereas when the system runs only on the cloud service, most of the time is spent on computation and round trip time.


3. Methodology

This thesis project will be performed by a fifth-year student in computer science, together with the companies Knowit and Crosser. The project will contain literature studies, two system developments, measurements of the systems, calculations, and an evaluation. The project will be planned in sprints, where each sprint is two weeks and ends with a sprint meeting. At the sprint meeting the accomplishments will be presented, and the meeting will also cover what is to be done until the next meeting. A time plan will be planned, written, and followed during the thesis work. The sprints will contain: startup, pre-study of platforms, implementing the platform, setting up the cloud solution, implementing machine learning, implementing machine learning on edge and cloud, making measurements of the systems, and putting together the report. The time plan can be seen in Appendix A.

At the beginning of the project a literature study will be performed to get a deeper understanding of the project. The topics to be studied are IoT, IIoT, edge computing, machine learning, and how these topics can be combined; the literature study will be performed by reading research papers, books, and web pages. A study of different edge solution platforms will also be made, containing both larger and smaller platforms on the market. The study will result in a written survey where each of the investigated platforms is presented and one of them is chosen. The goal Examine different Edge solutions based on functions, usability, and capacity needed to run it. Choose one of them will thereby be accomplished.

The second goal Implement the two systems is divided into three sub-goals. The first sub-goal Implement the chosen edge platform on a Raspberry Pi will be accomplished by setting up the chosen platform on a Raspberry Pi, and the second sub-goal by setting up a cloud solution. The third and last sub-goal Implement a machine learning algorithm for classification depending on the sensor values, for both systems will be implemented on both the edge solution and the cloud solution. The training of the algorithm will be executed in Python with the help of the library scikit-learn. The trained model will be executed on the edge node in the edge solution; in the cloud solution, the model will be both trained and executed in the cloud.

The third goal Measure the differences in latency, bandwidth, and computational usage will be accomplished by measuring and calculating on the two different solutions implemented in the second goal. The measurements take into consideration how the latency, bandwidth, and computational usage are affected in the two solutions, and will be collected with the network analyzing program Wireshark, where in- and out-traffic can be seen. The latency is measured as the time (in milliseconds) from the moment a sensor value is collected until it has gone through the machine learning algorithm and a decision has been made (round trip time). Such a decision can, for example, be an action which warns that some error is about to occur. The bandwidth will be measured as speed (bits/second). The computational usage will be measured in percent (%) of the CPU used.

The fourth goal Evaluate in terms of the measurements when which system is recommended is accomplished by analyzing and making calculations on the measurements. The focus of the evaluation will be on which system has the lowest latency (is fastest), which needs the least bandwidth, and how much computational power is needed in the processing of the data. The evaluation will be presented with different graphs that display how the two systems perform compared to each other. The values will be carefully studied and a conclusion will be drawn about which system is to prefer when.


4. Choice of edge computing solution

There are several different edge computing service providers to choose between. There are larger companies such as Microsoft[33], Amazon[34], and Cisco[35] who have launched edge solutions, but there are also smaller companies such as Nebbiolo Technologies[36], FogHorn[37], and Crosser[38] which have focused their business on just edge/fog computing. The following sub-chapters give a brief description of each solution. The solutions are investigated on: how the platform is structured, which hardware is needed, whether it is possible to implement machine learning, and in which way the platform is programmed and developed. The last sub-chapter in this section presents which solution is chosen.

4.1 Microsoft Azure IoT Edge

Microsoft has built an edge solution, Azure IoT Edge, which works together with their Azure IoT Hub. IoT Edge is a solution that moves some of the workload to the edge of the network. Microsoft provides an edge device called Project Brainwave, which should be able to make real-time AI calculations. However, it is not mandatory to use Microsoft's own hardware; third-party hardware requires at least 128 MB of random access memory. This is just for running IoT Edge; if other functions such as machine learning and artificial intelligence are implemented, more capacity may be needed. If machine learning is implemented, the training of the model should be executed in the Azure cloud, and the model is then executed locally on the device. Azure IoT Edge is also able to work without an internet connection: it can store the information locally and upload the data to the cloud when the connection is back. Microsoft IoT Edge can be developed in various programming languages, such as C, C#, Java, Node.js, and Python.[33]

4.2 Amazon Web Services FreeRTOS & Greengrass

Amazon FreeRTOS is an operating system which is open source and suited for microcontrollers. With FreeRTOS it is easy to program, deploy, secure, manage, and connect the edge device. Since FreeRTOS is an open-source operating system it is free to use. The operating system is provided with everything needed to program the microcontrollers, and it also comes with different libraries for connectivity, which make it possible to connect to other edge devices, to AWS IoT Core (Amazon's IoT solution), and, through Bluetooth, to mobile devices.[39]

Amazon Greengrass is an edge solution that extends the AWS services to an edge device. The function of the edge device can be to aggregate or filter data; the solution can also execute predictions based on machine learning models. The edge device can even talk with connected devices if it is missing the connection to the cloud service, and when the connection is back it sends the data to the cloud.[40] Amazon Greengrass requires hardware with at least 1 GHz of compute, either an Arm or x86 CPU, and 128 MB of random access memory. This is the minimum requirement; if more advanced functions are executed it may require better hardware. It is possible to run machine learning on Greengrass: the model is optimized with Amazon SageMaker Neo and then stored in an S3 bucket, which Greengrass can use to deploy the optimized model. Amazon Greengrass supports lambda functions in the following languages: Python 2.7 and 3.7, Node v8.10 and v12.x, Java 8, C, C++, and any language which supports C libraries.[41]

4.3 Cisco Fog Computing

Cisco's Fog Computing solutions include connecting IoT devices, securing the transport of data from the edge of the network to the cloud, easy development and deployment of fog applications, and simplified management of a large number of fog nodes.


A fog application can be executed on Cisco routers, switches, cameras, wireless access points, and Cisco Unified Computing System servers. The fog applications are hosted by the Cisco operating system Cisco IOx, which makes it possible to create the fog infrastructure. The applications are developed in the cloud and then deployed to the fog nodes, which makes it possible to run the same application on many different edge nodes.[35] Cisco does not write anything about how their system can be combined with machine learning, and the conclusion is therefore that this is not possible.

4.4 Nebbiolo Technologies

Nebbiolo Technologies' platform consists of three different components: fogNode, fogOS, and fogSM. The platform is a complete solution which minimizes the gap between the cloud and the devices.[36]

FogNode is the hardware that the Nebbiolo software platform can be deployed on; the requirement is that it has either an Intel x86 CPU or an ARM CPU.[42] FogOS is the operating system that runs on the fogNodes and can be hosted in either a Docker container environment or a Windows VM.[43] FogSM is an interface and system manager, where the nodes are programmed before the program is deployed to them.

The nodes are programmed with the help of different modules which build a data processing pipeline, or more complex pipelines such as CEP pipelines. The modules are dragged and dropped onto a work area where they are configured and connected to each other, plug-and-play style. There are many different modules, including input streams, output streams, and analysis modules, some of which apply machine learning. The different output modules are HTTP Post, Alert, and InfluxDB.[44]

4.5 FogHorn

The FogHorn platform simplifies interoperability with existing OT systems. The platform detects events in real time without any need to connect to a cloud. FogHorn reduces machine learning model sizes, and it is also possible to import existing models. It can also execute neural nets with the same result as if they were executed in the cloud, which is favorable in, for example, video analytics.[37]

FogHorn has a small footprint of 256 MB and can therefore be implemented on third-party devices with at least this much memory. FogHorn also ”edgifies” machine learning models, which means that it is possible to use machine learning at the edge; FogHorn supports models such as Spark ML, Python, and R Studio. FogHorn has its own programming language, called VEL, with which the platform is set up. FogHorn describes the language as a ”Pythonic, SQL-ish, English reading-like language”.[45]

4.6 Crosser

Crosser consists of two components, Crosser Cloud and Crosser Node. Crosser Cloud is where the design of the edge solution is developed; it can be hosted either by Crosser or in a private cloud. The Crosser Node is the real-time engine which runs on the edge of the network.[38]

The Crosser Node is deployed in a Docker container and runs on any Windows, Linux, or Unix server or gateway. The device needs only 100 MB, and the node fits most CPUs. The node can scale up to 100,000 messages per second with in-memory processing.[46]


Right now there is no pre-programmed machine learning module, but there is a Python module that makes it possible to bring, manage, and deploy home-made machine learning models in Python. This makes it possible to use several different ML frameworks, such as TensorFlow, Keras, scikit-learn, and PyTorch. It is possible to upload pre-trained ML models to the node and then easily refer to these when the flow is built.[48]

4.7 Motivation of choice

Almost all of the products provide similar functionalities. The product which diverges most from the others is Cisco Fog Computing, which needs Cisco products to be able to run, and which does not state how machine learning can be implemented in the system. Of the other products, four out of five can run on at least 128 MB with an Intel x86 or ARM CPU; FogHorn, however, requires at least 256 MB. The four remaining products are almost equal; there are some differences in how the customer implements and develops on the platform. Azure IoT Edge and Amazon Greengrass can both be programmed in standard programming languages such as Java, C, Node.js, or Python, while Nebbiolo Technologies and Crosser are implemented with flow programming and pre-programmed modules. Table 4.1 shows the specifications of the different products.


Product | Hardware requirement | Machine learning | Programming language
Microsoft Azure IoT Edge | 128 MB RAM, x86/ARM | Yes | C, C#, Java, Node.js, Python
Amazon Web Services Greengrass | 128 MB RAM, x86/ARM | Yes | C, C++, Java, Node.js, Python
Cisco Fog Computing | Cisco products | - | Cisco IOx
Nebbiolo Technologies | x86, ARM | Yes | Flow programming modules
FogHorn | 256 MB RAM | Yes | Pythonic, SQL-ish, English reading-like language
Crosser | 100 MB RAM | Yes | Flow programming modules

Table 4.1: Specifications of the different edge computing products


5. Implementation

The sub-chapters below describe the general approach of the implementation of the two systems and give a detailed description of the elements in both systems. They also describe how the measurements were arranged and performed.

5.1 General approach

During the project two different systems have been implemented. Both systems fulfill the same functionality and give the same result for the end-user. The systems get information from simulated sensors, filter and pre-process the values, and execute machine learning on the values. The result is then visualized and stored in a database for future reports. The two systems will be referred to as the Crosser System and the Cloud Reference System in this chapter. Both systems do the same filtering and calculations; the difference between them is where the filtering and calculations are executed. The concept of both systems can be seen in figure 5.1, where the data is first collected at a factory by different sensors. These values are then pre-processed and machine learning is executed. The result of the machine learning is visualized on a dashboard, and all raw values are stored in a database (storage).

Figure 5.1: Overall picture of the concept

5.1.1 Sensor data

The data set used contains the parameters which are recorded when the logs are measured, for example length, ovality, crookedness, and diameter. The data consists of 43,858 rows with 45 columns. A bigger part of the data is used for training the machine learning algorithm, and a smaller part is replayed as if it were real-time streamed data, where each row is pushed at a given time interval just as it would be at the factory. The sensor data used in this project is thus not real-time data, but data which has already been collected and is then simulated as real-time data. The sensor data was also missing some important parameters for quality assessment; these values were therefore generated with the frequencies the parameters usually occur with.[49][50] The generated values were forest rot, year rings, and the number of twigs. The data is generated and normalized with the criteria and occurrences shown in table 5.1.

Field | 1 | 2 | Wreck
forest rot | not allowed | maximum 5% | >5%
year rings | at least 12 | at least 8 | <8
twigs | max 60 mm | max 120 mm | >120 mm
occurrence | 85.7% | 12.2% | 2.1%

Table 5.1: Table of the quality estimation [49][50]
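How the missing parameters might be generated is sketched below, drawing each log's quality class with the occurrence frequencies from table 5.1 and then values consistent with that class (a hypothetical reconstruction; the thesis does not publish its generator):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Occurrence frequencies of the classes in table 5.1
CLASS_PROBS = [0.857, 0.122, 0.021]   # class 1, class 2, wreck

def generate_quality_parameters(n_logs):
    """Generate (class, forest rot %, year rings, twigs mm) per log so
    that each log fulfills the criteria of its drawn quality class."""
    classes = rng.choice([1, 2, 3], size=n_logs, p=CLASS_PROBS)
    rows = []
    for c in classes:
        if c == 1:    # no rot, at least 12 rings, twigs <= 60 mm
            rows.append((c, 0.0, int(rng.integers(12, 30)), rng.uniform(0, 60)))
        elif c == 2:  # rot <= 5%, at least 8 rings, twigs <= 120 mm
            rows.append((c, rng.uniform(0, 5), int(rng.integers(8, 12)), rng.uniform(0, 120)))
        else:         # wreck: rot > 5%, fewer than 8 rings, twigs > 120 mm
            rows.append((c, rng.uniform(5, 20), int(rng.integers(1, 8)), rng.uniform(120, 200)))
    return rows

print(generate_quality_parameters(3))
```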

5.1.2 Pre-processing

When the data is collected from the sensors it is normalized. The normalization is done by mapping the data into different intervals. The parameters which are interesting for the machine learning step are length, dimension, forest rot, year rings, and twigs. The forest rot, year rings, and twigs are normalized when they are generated and therefore do not need any pre-processing. The length is normalized into eight different intervals, which can be seen in table 5.2, and the dimension is normalized into three different intervals, which can be seen in table 5.3.

Interval | normalized value
x ≤ 319 | 7
319 < x ≤ 406 | 1
406 < x ≤ 416 | 2
416 < x ≤ 466 | 3
466 < x ≤ 496 | 4
496 < x ≤ 556 | 5
556 < x < 633 | 6
x ≥ 633 | 8

Table 5.2: Table of the normalization of the lengths

Interval | normalized value
x < 137 | 2
137 ≤ x < 701 | 1
x ≥ 701 | 3

Table 5.3: Table of the normalization of the dimensions
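Expressed in code, the interval normalization of tables 5.2 and 5.3 amounts to a pair of small mapping functions (a sketch; the boundary handling follows the reconstruction of the tables above):

```python
def normalize_length(x):
    """Map a length to the normalized interval code of table 5.2."""
    if x <= 319:
        return 7
    if x <= 406:
        return 1
    if x <= 416:
        return 2
    if x <= 466:
        return 3
    if x <= 496:
        return 4
    if x <= 556:
        return 5
    if x < 633:
        return 6
    return 8  # x >= 633

def normalize_dimension(x):
    """Map a dimension to the normalized interval code of table 5.3."""
    if x < 137:
        return 2
    if x < 701:
        return 1
    return 3  # x >= 701

print(normalize_length(455), normalize_dimension(180))  # -> 3 1
```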

5.1.3 Machine learning

The machine learning method used in both systems is a decision tree, which is trained on the bigger part of the data set described in chapter 5.1.1.

The tree is trained using gini impurity, which decides when a split from a node is optimal[51]. The decision tree is configured to use a max depth of five. The trained decision tree can be found in Appendix B.
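With scikit-learn, which the thesis uses for training, this configuration corresponds roughly to the following sketch (the data here is a random stand-in for the five normalized parameters and the sorting classes):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in for the normalized data: five features per log
# (length, dimension, forest rot, year rings, twigs) and a class.
rng = np.random.default_rng(0)
X = rng.integers(1, 9, size=(1000, 5))
y = rng.integers(1, 4, size=1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Gini impurity and a maximum depth of five, as described above.
model = DecisionTreeClassifier(criterion="gini", max_depth=5)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```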

5.1.4 Visualization

After all the previous steps the result is visualized for the end-user. The visualization is implemented in Microsoft Power BI. Each system has its own dashboard where the in-parameters and the predicted sorting compartment are visualized. For a further description of the visualization result, see chapter 6.4.

5.1.5 Storage

All raw values are stored in a database so that they can be used for future reports of the different systems.

5.2 Crosser system

The Crosser system contains all of the components mentioned in chapter 5.1 and is implemented as follows. First, the sensor data is simulated with the data set, and the values are sent at a given time-frequency to the Crosser platform. The data set is streamed from a Raspberry Pi 3+ to the Crosser platform, which also runs on the Raspberry Pi. When Crosser receives a message with data, it is first filtered and normalized; it is then ready to go through the decision tree. The decision tree model is deployed to the Crosser platform and therefore predicts the sorting compartment on the edge. All raw data received into Crosser is also stored and compressed at the edge and sent to Azure once a day. The data generated from the decision tree is sent to Azure, where it is stored and visualized for the end-user. Figure 5.2 shows the overall picture of the Crosser system. Each part of the system is described more carefully in the sub-chapters below. The visualization and storage are the same as in the general approach and will therefore not be described again in this chapter.

Figure 5.2: Overall picture of the Crosser system

5.2.1 Edge node

The edge node runs the Crosser platform on the Raspberry Pi and is set up with flow programming. The flow of the implemented node is shown in figure 5.3.

Figure 5.3: The structure of the flow

The flow is built of two different streams, which can be seen in figure 5.3, where the upper one is the ”hot-stream” and the lower one is the ”cold-stream”.

The hot-stream filters and normalizes each incoming message, runs it through the machine learning model, stores the raw data to file, and sends the result to Azure IoT Hub.

The cold-stream is executed at a given time interval, for example once a day. The stream then checks for the files (Open File) which have been stored by the hot-stream and reads each file separately. Each file gets a time stamp and is sent to Azure Blob Storage.

5.2.2 Azure IoT Hub and Stream Analytics

Azure IoT Hub receives messages from the Crosser edge node, which is registered as a device. When a message is received it is sent through to Stream Analytics, which just forwards the message using a SQL-like language. The IoT Hub and Stream Analytics are only used to pass the messages forward to Power BI for visualization to the end-user.

5.3 Cloud reference system

The Cloud Reference System is built using the cloud platform Microsoft Azure and a Raspberry Pi. The Raspberry Pi sends simulated values to Azure IoT Hub, which in turn sends them to Azure Blob Storage and Stream Analytics. Azure Stream Analytics is connected to Azure Machine Learning. The result of the machine learning is then returned to Stream Analytics and sent to Power BI. Figure 5.4 shows how the system is implemented. The sub-chapters describe in more detail how each part has been implemented.

Figure 5.4: Overall picture of the Cloud Reference System

5.3.1 Gateway

The data is sent from a Raspberry Pi which acts as a gateway and forwards messages to Azure IoT Hub. The Raspberry Pi is programmed with a Python script using the Azure SDK; the script pushes the sensor data to Azure IoT Hub.
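A minimal sketch of such a gateway script with the azure-iot-device Python SDK (the connection string, payload, and send rate are placeholders; the thesis does not publish the actual script):

```python
import json
import time
from azure.iot.device import IoTHubDeviceClient, Message

CONN_STR = "HostName=...;DeviceId=gateway;SharedAccessKey=..."  # placeholder

client = IoTHubDeviceClient.create_from_connection_string(CONN_STR)
client.connect()

simulated_rows = [{"length": 455, "dimension": 180}]  # stand-in sensor data
for row in simulated_rows:
    client.send_message(Message(json.dumps(row)))  # push to Azure IoT Hub
    time.sleep(1.0)                                # 1 packet/second

client.disconnect()
```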

5.3.2 Azure IoT Hub and Azure Stream Analytics

At Azure IoT Hub the gateway is configured as a device, which makes it possible for the IoT Hub to receive the messages sent from the Raspberry Pi. When the IoT Hub has received the data, there are multiple options for how the data can be handled. In this implementation Azure Stream Analytics is used for pre-processing of the data, where the data is divided into the different groups shown in chapter 5.1.2.

Azure Stream Analytics builds on a SQL-like language together with JavaScript functions. A JavaScript function is implemented for the filtering and pre-processing part; this function returns an array of the parameters which are going to be used in the machine learning execution. The Cloud Reference System also has one hot stream and one cold stream, where the cold stream just stores into an Azure Blob Storage. Figure 5.5 shows the hot stream in Azure.

Figure 5.5: Hot stream Azure

5.3.3 Azure Machine Learning Services

5.4 Measurement arrangement

The systems are measured in three different areas: packet size (bandwidth), round trip time (latency), and computational power used (CPU usage). In the sub-chapters below each measurement arrangement is described.

5.4.1 Packet size

The packet size is measured using tcpdump and the network protocol analyzer Wireshark. Tcpdump is used to capture packets sent from the Raspberry Pi. Tcpdump runs until 160 packets have been sent; the packets are saved into a pcap file which is later used in Wireshark to filter on the receiver's IP address. The mean packet size for each system is calculated with formula 5.1 and the standard deviation with formula 5.2. The results are presented in tables to show the difference between the two systems.

$$PS_{mean} = \frac{1}{160}\sum_{i=1}^{160} x_i \qquad (5.1)$$

$$PS_{std} = \sqrt{\frac{1}{159}\sum_{i=1}^{160}\left(x_i - PS_{mean}\right)^2} \qquad (5.2)$$
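Formulas 5.1 and 5.2 are the ordinary sample mean and the sample standard deviation with Bessel's correction (division by n − 1 = 159). Given the 160 TCP segment lengths exported from the Wireshark capture, they can be computed as in this sketch (file name assumed):

```python
import statistics

# One TCP segment length in bytes per line, exported from Wireshark
with open("packet_sizes.txt") as f:
    sizes = [int(line) for line in f if line.strip()]

ps_mean = statistics.mean(sizes)    # formula 5.1
ps_std = statistics.stdev(sizes)    # formula 5.2: divides by n - 1
print(f"mean: {ps_mean:.3f} bytes, std: {ps_std:.3f} bytes")
```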

5.4.2 Round trip time

The round trip time is measured by taking a timestamp at the beginning of the Python script which sends the sensor data to Crosser and Azure IoT Hub, respectively. Both systems return answers from the machine learning, and when this happens a second timestamp is taken. By subtracting the start timestamp from the finish timestamp the round trip time is calculated. The round trip time can be seen as the time it takes for the machine to make a decision, from the moment it collects a value from the sensors until a decision has been made.
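The timestamping amounts to something like the following sketch, where send_and_wait_for_prediction is a hypothetical helper standing in for the round trip through either system:

```python
import time

def measure_rtt_ms(row):
    """Round trip time in milliseconds: from reading a sensor value
    until the machine learning decision has come back (cf. 5.3/5.4)."""
    t_start = time.perf_counter()
    send_and_wait_for_prediction(row)  # hypothetical helper
    t_stop = time.perf_counter()
    return (t_stop - t_start) * 1000.0
```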

The Cloud Reference System's round trip time is calculated with formula 5.3, where $D_S$ is the time it takes to read the sensor value, $D_G$ is the time it takes for the gateway to forward the message, and $D_A$ is the time it takes for Azure to route the message to the right place. $D_{ML}$ is the time it takes to execute machine learning at Azure, and $D_R$ is the time it takes for the end-user to get the result from Azure. All of these together are the same as the stop time minus the start time.

$$RTT_{CRS} = D_S + D_G + D_A + D_{ML} + D_R = t_{stop} - t_{start} \qquad (5.3)$$

The Crosser system is calculated with formula 5.4, where $D_S$ is the time for reading the sensor, $D_C$ is the time it takes for Crosser to receive and pre-process the data, and $D_{ML}$ is the time it takes to execute machine learning at Crosser. $D_R$ is the time it takes for the end-user to get the result from Crosser. This is the same as the stop time minus the start time.

$$RTT_{Crosser} = D_S + D_C + D_{ML} + D_R = t_{stop} - t_{start} \qquad (5.4)$$

The measurements are arranged with three different time intervals for sending the messages; the frequencies are 1 packet/second, 0.5 packets/second, and 0.2 packets/second.

5.4.3 Computational power

The computational power needed to run both systems is determined by analyzing how much CPU power is used on the Raspberry Pi. The analysis is performed with the software iostat, which shows how much of the CPU is used during a given time period. It returns the parameters user, nice, system, iowait, steal, and idle. To get the total activity, either user, nice, system, iowait, and steal can be added together, or idle can be subtracted from 100%. The formula for calculating the CPU usage can be seen in formula 5.5. The CPU usage is measured for 100 seconds per time-frequency.

$$CPU_{tot} = CPU_{user} + CPU_{nice} + CPU_{system} + CPU_{iowait} + CPU_{steal} \qquad (5.5)$$
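The same total can also be computed programmatically; on Linux, psutil exposes the same CPU fields as iostat (a sketch using psutil, which is an assumption — the thesis itself used iostat):

```python
import psutil

# Average CPU usage over a 100-second window, as iostat was used
c = psutil.cpu_times_percent(interval=100)
cpu_total = 100.0 - c.idle  # equivalent to formula 5.5
print(f"user={c.user} system={c.system} iowait={c.iowait} "
      f"idle={c.idle} total={cpu_total:.2f}%")
```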


6. Results

In the following sub-chapters each measurement and the result for each system are presented. The measurements which have been taken are packet size (bandwidth), round trip time (latency), and CPU usage. The sub-chapter ”Front-end result” presents the result from an end-user perspective, and the chapter ”Analysis of the measurement results” analyzes the results of each measurement and system.

6.1 Packet size

The packet size was measured by calculating the TCP segment length of each packet; the result therefore does not include the headers. For both systems there is a process for setting up a connection from the client to the server, which has a different packet size. In the sub-chapters the packet size result of each system is presented.

6.1.1 Crosser system

The mean packet size of the Crosser System is 182.206 bytes and the standard deviation is 23.797 bytes. The result is presented in table 6.1.

mean packet size | standard deviation
182.206 bytes | 23.797 bytes

Table 6.1: Table of the mean size of the packets in the Crosser System

6.1.2 Cloud reference system

The mean packet size of the Cloud Reference System is 1303.306 bytes and the standard deviation is 144.995 bytes. The result of the packet size is shown in table 6.2.

mean packet size | standard deviation
1303.306 bytes | 144.995 bytes

Table 6.2: Table of the mean size of the packets in the Cloud Reference System

6.2 Round trip time (RTT)

The round trip time was measured with three different frequencies of packets sent per second: one packet per second, 0.5 packets per second, and 0.2 packets per second. Both systems were measured and plotted with box plots to see if the time frequency of the packets affects the round trip time. The box plots were plotted both with and without outliers. The result for each system can be seen in the sub-chapters.

6.2.1 Crosser system

The round trip time of the Crosser System was measured to a mean value of around 70 ms. The measurements had some outliers, which were up to 7000 ms. Figure 6.1 shows the round trip time with outliers and figure 6.2 shows the round trip time without outliers.

Figure 6.1: Boxplot of round trip time at Crosser system with outliers

6.2.2 Cloud reference system

Figure 6.2: Boxplot of round trip time at Crosser system without outliers

Figure 6.3: Boxplot of round trip time at the Cloud Reference System with outliers

6.3 Computational power (CPU usage)

Figure 6.4: Boxplot of round trip time at the Cloud Reference System without outliers

The CPU usage is presented with the parameters user, system, iowait, idle, and total use. The total use is calculated with formula 5.5, which is described in chapter 5.4.3. The total CPU usage depending on messages sent per second is plotted in a scatter plot to see how it changes. In the sub-chapters the CPU usage result of each system is presented.

6.3.1 Crosser system

The Crosser system needed 8.69 percent of the CPU to send 10 packets/second, 2.14 percent to send 1 packet/second, 0.92 percent to send 0.5 packets/second, and 0.9 percent to send 0.2 packets/second. The distribution over the different CPU categories and the total for each time-frequency can be seen in table 6.3.

packets/second | user | system | iowait | idle | total use
10 | 6.63 | 2.04 | 0.03 | 91.31 | 8.69
1 | 1.78 | 0.35 | 0.01 | 97.86 | 2.14
0.5 | 0.64 | 0.28 | 0.01 | 99.08 | 0.92
0.2 | 0.58 | 0.30 | 0.01 | 99.10 | 0.90

Table 6.3: Table of CPU percent usage depending on packets sent per second (Crosser System)

The total CPU usage depending on how many messages per second have been sent is shown in figure 6.5.

Figure 6.5: CPU usage Crosser system

6.3.2 Cloud reference system

The Cloud Reference System needed 4.59 percent of the CPU to send 10 packets/second, 1.08 percent to send 1 packet/second, 0.72 percent to send 0.5 packets/second, and 0.49 percent to send 0.2 packets/second. The distribution over the different CPU categories and the total for each time-frequency can be seen in table 6.4.

packets/second | user | system | iowait | idle | total use
10 | 4.17 | 0.39 | 0.04 | 95.41 | 4.59
1 | 0.92 | 0.16 | 0.01 | 98.92 | 1.08
0.5 | 0.56 | 0.16 | 0.01 | 99.28 | 0.72
0.2 | 0.36 | 0.13 | 0.01 | 99.51 | 0.49

Table 6.4: Table of CPU percent usage depending on packets sent per second (Cloud Reference System)


Figure 6.6: CPU usage at Raspberry Pi Cloud Reference System

6.4 Front-end result

Both systems were implemented with Power BI to visualize the results of the data and the calculations. This was accomplished by creating two dashboards which have the same layout but take their inputs from each of the systems. The front-end result can be seen in figure 6.7.


Figure 6.7: Front-end result of the system

6.5 Analysis of the measurement results

The differences between the two systems for each measurement are described in the sub-chapters, where the systems are compared against each other and analyzed in terms of the measurements.

6.5.1 Packet size

The packet size differs between the two systems. This is not surprising, since the Cloud Reference System sends all data to the cloud while the Crosser System sends the data only after it has been computed and compressed. The Cloud Reference System's packet size is approximately seven times larger than the Crosser System's. Both systems still provide the information which is important for the end result, while the Crosser System needs seven times less bandwidth.
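As a quick check of the ratio, using the mean packet sizes from tables 6.1 and 6.2:

$$\frac{1303.306\ \text{bytes}}{182.206\ \text{bytes}} \approx 7.15$$

At one packet per second this corresponds to roughly 1303 bytes/s for the Cloud Reference System against roughly 182 bytes/s for the Crosser System.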

6.5.2 Round trip time (RTT)

Neither system's round trip time changes noticeably with how many packets are sent per second. This can be determined since all the boxes lie in the same intervals.


The Crosser System has a mean round trip time of approximately 70 milliseconds and the Cloud Reference System has a mean round trip time of approximately 6000 milliseconds. This means that the Cloud Reference System is approximately 86 times slower than the Crosser System. The Crosser System's latency is still not real-time, but it is much closer to real-time than the Cloud Reference System's.
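The factor follows directly from the measured means:

$$\frac{6000\ \text{ms}}{70\ \text{ms}} \approx 86$$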

Both systems have some outliers, which can have many different causes. The Crosser System gets more outliers the more packets per second are sent; a possible reason is that the probability of hitting a busy time window is larger when more packets are sent per second. The Crosser System also runs more processes than the Cloud Reference System, which can be another reason for the larger number of outliers.

6.5.3 Computational usage (CPU usage)

The computational power did differ a bit between the systems. This was not surprising, since the Crosser System runs all the processes on the edge while the Cloud Reference System only collects the values and forwards them to Azure. In table 6.5 the differences between the systems are shown. The column "Difference" shows how much CPU power the Crosser System uses compared to the Cloud Reference System.

packets/second    Crosser    Cloud Reference    Difference
10                8.69       4.59               189.32%
1                 2.14       1.08               198.15%
0.5               0.92       0.72               127.78%
0.2               0.90       0.49               183.67%

Table 6.5: Table of CPU usage in percent for both systems depending on packets sent per second
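The Difference column is the ratio between the systems' total CPU use, expressed in percent:

$$\text{Difference} = \frac{\text{total use}_{\text{Crosser}}}{\text{total use}_{\text{Cloud}}} \times 100\%, \qquad \text{e.g.}\quad \frac{8.69}{4.59} \times 100\% \approx 189.32\%$$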


It is expected that the Crosser System needs more computational power, since it is doing much more than the Cloud Reference System. As table 6.5 above shows, the Crosser System never needs more than 200% of the Cloud Reference System's CPU usage. What needs to be taken into consideration in the CPU measurements is that the chosen machine learning algorithm is relatively light; if a heavier algorithm is chosen, the CPU usage may be different.

6.5.4 Comparison to related work

In A fog computing industrial cyber-physical system for embedded low-latency machine learning Industry 4.0 applications the result is that the difference in latency is 67.7%-99.4%, and that the fog does not have any failures while the cloud has more failures depending on how many connections were made. The results from their research match the results from this thesis, at least regarding how the latency differs between the edge system and the cloud system, where both results showed a difference of around 100%. Failures were not measured in this thesis, so those measurements could not be compared.

In Ultra-Low latency cloud-fog computing for industrial internet of things three different systems are built and measured: cloud, fog, and CF-IIoT. CF-IIoT is a system implemented by the authors themselves, which uses an algorithm for constrained optimization. The result is that the CF-IIoT system has the lowest latency; the fog has slightly higher latency, around 36% more, while the cloud has much higher latency than the CF-IIoT, around 93% more. The results of their report match the results of this report, showing that the latency is much higher when using the cloud.


It is also interesting that they have tested adding more fog nodes and concluded that the delay will decrease if more fog nodes are added.


7 Conclusions

The first goal, Examine different Edge solutions based on functions, usability, and capacity needed to run it. Choose one of them, was accomplished through literature studies of the different edge solutions available on the market. The literature study contained both a study of IoT, IIoT, edge computing, and machine learning, and a study of six different companies and their edge solutions. In the study of the topics it was found that related work has been performed before, where the latency has been measured to see how well edge computing performs compared to cloud computing. These articles made it easier to get a perception of what could be expected of the measurements. The choice of which companies to study more closely was made by choosing three larger, well-known companies that deliver many different products and three smaller companies which focus on edge solutions. The main parameters which were investigated were which functions each platform had, how usable it was, how the platform is programmed, and how much computing power the platform needs. The parameters of the platforms were laid out in a table, and then a motivation of which of the six platforms to choose was made. The platform which was found to fit this thesis best was Crosser.


The second goal was satisfied by choosing a machine learning algorithm suitable for the use case. The algorithm which was chosen was a decision tree. The decision tree was made using Python and scikit-learn. The data used to train the tree was collected at the SCA sawmill in Sundsvall. When the model was trained, it could be deployed to both the Crosser System and the Cloud Reference System.
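The exact features and training code are not reproduced in this chapter; the following is a minimal sketch of how a decision tree can be trained and persisted with scikit-learn, where the file names and the feature layout are hypothetical:

    # Minimal sketch of training and exporting a decision tree with
    # scikit-learn. File names and feature layout are hypothetical; the
    # thesis's actual sawmill data and preprocessing are described in
    # earlier chapters.
    import joblib
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    data = pd.read_csv("sawmill_sensor_data.csv")   # hypothetical file
    X = data.drop(columns=["label"])                # sensor readings
    y = data["label"]                               # class to predict

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)

    model = DecisionTreeClassifier(max_depth=5, random_state=42)
    model.fit(X_train, y_train)
    print("accuracy:", accuracy_score(y_test, model.predict(X_test)))

    # Persist the trained model so it can be deployed to the edge or cloud.
    joblib.dump(model, "decision_tree.joblib")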

The third goal, Measure the differences in latency, bandwidth, and computational usage, was satisfied by arranging three different measurements of the two systems. The measurements were the basis of the result of this thesis and showed very interesting results. The packet size differed as expected. The Crosser System had much smaller packets than the Cloud Reference System, which largely depends on how much filtering and compression is done in the Crosser System. In this thesis scenario the packets were approximately seven times smaller than in the Cloud Reference System, and the Crosser System thus needs seven times less bandwidth. The Crosser System was the fastest, which was no surprise, but it was surprising that it was approximately 86 times faster than the Cloud Reference System. The Cloud Reference System was really slow and cannot be recommended for critical situations, since much can go wrong in the 6000 ms it takes for the system to make a decision. The computational power needed for running the systems differed a bit. But taking into consideration that the Crosser System runs almost everything on the edge (Raspberry Pi) while the Cloud Reference System only passes the data forward to the cloud, the difference in computational power needed is not that big. The Crosser System needs between 28-98% more computational power than the Cloud Reference System, but even at 10 packets per second it only needs 8.69% of the total CPU capacity.


My conclusion of the result is that which system is best depends on the use case. Crosser has both the lowest latency and requires the least bandwidth; the system does, on the other hand, need more computational power than the Cloud Reference System. Crosser also seems to be a bit more unreliable than the Cloud Reference System, but it still does not have larger latency. If latency and bandwidth are not the important parameters of the system, and a heavier machine learning algorithm is expected to be executed, the Cloud Reference System is to be recommended.

The methodology chosen for the project has been successful, since it has been easy to follow what should be accomplished and when. The sprints have made it easy to keep good contact with the supervisors, both at the university and at the company. If something could have been changed in the methodology in hindsight, it would have been the implementation of the machine learning part in Azure, which took much longer than expected because the documentation was deficient.

The result provided in this thesis depends on the implementations which were chosen. If another edge platform or cloud reference platform had been chosen, the result may have differed, since platforms can be faster or slower than the ones which were chosen. But since the differences have been so large in both latency and bandwidth, it would be very odd if this changed much with other platforms. The computational power on the edge may differ more between edge solutions, since it depends on how much CPU the platform by itself requires. The results of the project are realistic since they match results from related work, as analyzed in section 6.5.4.

7.1 Future work


The cost of the different systems was out of scope; therefore it would be interesting to investigate how the cost is affected by the different systems through some sort of cost analysis.

During this project only one edge solution was chosen. It would be interesting to set up, test, and compare several different edge solutions with each other to see how they perform, since the platform by itself probably differs much in how much CPU power it requires, the time it takes, and certainly other parameters.

In this project it was out of scope to investigate different machine learning algorithms and how these algorithms perform on the edge: are some algorithms too heavy to run on an edge node, and does the algorithm affect how fast the system can perform? In future work it would be interesting to test different machine learning algorithms on the edge and measure how this affects the latency and computational power.

As the related work Hierarchical fog-cloud computing for IoT systems: a computation offloading game investigates, it would be interesting to set up a larger network of edge devices as a cluster and see how the performance of the network is affected when very much data is circulating in the network and the devices rely on each other. Does the latency decrease even more, or does the communication between the devices require more time when there is more than one?

Lastly, a future work would be to try the system on live streamed data in a factory, to see how it behaves and performs compared with, for example, a human who is making the decisions. The live test would then be examined to see how much the results differ between a human and a computer.

7.2 Ethical considerations


Mostly it will affect the people who today work in the industries; as in the previous industrial revolutions, employees have needed to change their working tasks. The largest uncertainty is that we do not know how far the automation will go.[53]

References
