
CLOUD-CENTRIC SOFTWARE ARCHITECTURE FOR INDUSTRIAL PRODUCT-SERVICE SYSTEMS


Academic year: 2021


(1)

Mälardalen University, Västerås, Sweden

(DVA501) Thesis for the Degree of Master of Science in Computer Science with Specialization in Software Engineering, 30.0 credits

CLOUD-CENTRIC SOFTWARE ARCHITECTURE FOR INDUSTRIAL PRODUCT-SERVICE SYSTEMS

Fizza Shams
fss15001@student.mdh.se

Supervisor: Jan Carlson
Mälardalen University, Västerås, Sweden

Company supervisor: Markus Lindgren
ABB, Västerås

Examiner: Radu Dobrin
Mälardalen University, Västerås, Sweden

(2)

Abstract

A Product-Service System (PSS) is an integrated combination of products and services. Industrial systems falling under the Pay-for-Performance (PfP) business model are a type of PSS in which an organization sells services to its customers. Offering a PfP system requires analysis of huge amounts of data; based on the information gained, an organization can charge its customers based on performance. In this thesis, we identified several services that can be provided to PfP systems and identified several reasons for having a cloud platform together with edge computing for successful deployment of PfP systems. The main contribution of this thesis has been identifying different architectural solutions for PfP systems. We carried out this task by performing a cost analysis of a remote monitoring solution provided by Microsoft Azure for a single Stressometer system. Due to the high cost and limited provisioning of Azure services in different Azure regions, we mention a few alternative ways of designing the solution for a PfP system that best suit the needs of organizations that have systems distributed globally, such that all industrial systems can be provided as PfP systems.

(3)

Acknowledgements

This thesis has taught me many things that were unknown to me and I will carry the lessons learned in my years to come. I am grateful for the opportunity provided by ABB. Performing research while having real-world industrial input has made this thesis more interesting.

I would like to express my gratitude to Jan Carlson, my supervisor at Mälardalen Högskola, and Markus Lindgren, my supervisor at ABB. Their guidance and support throughout my work have made it possible to complete this thesis. I would also like to thank Fredrik Norlund at ABB Force Measurement for giving me this opportunity and Hongyu Pei-Breivold for having spontaneous conversations during the course of this thesis.

Finally, I would like to thank my mother for supporting me and always having encouraging words to say. Last but not least, I would like to say a big thanks to my husband, Wasif Afzal, who always encouraged and believed in me.

(4)

Table of Contents

1 Introduction 6

1.1 Problem Formulation . . . 7

1.2 Research Site . . . 8

1.3 Research Method . . . 8

1.4 Actual Research Outcomes . . . 9

1.5 Structure of the Thesis . . . 10

2 Background 11

2.1 Classification of Product–Service systems . . . 11

2.2 Applications of Servitization in Manufacturing . . . 12

2.3 Requirements for Product–Service System . . . 13

2.4 Cloud Computing . . . 14

2.5 Edge Computing . . . 15

3 Related Work 18

4 Architecture for Pay-for-Performance System 19

4.1 Functions Performed on the Edge . . . 20

4.2 Functions Performed on the Cloud . . . 21

4.3 Role of Data in Pay-for-Performance System . . . 22

5 Stressometer Control System 24

5.1 Quality Index Service . . . 25

6 Remote Monitoring Solution 27

6.1 Cost Analysis . . . 29

6.1.1 Total cost . . . 29

6.2 Implications . . . 30

7 Discussion and Limitations 35

8 Conclusions and Future Work 37

References 40

Appendix A Cost Analysis 41

A.1 Stream Analytics . . . 41

A.2 Azure DocumentDB . . . 41

A.3 Azure Event Hub . . . 41

A.4 Azure Storage . . . 41

A.5 Azure App Services . . . 44

(5)

List of Figures

1 Traditional way of using a photocopy machine [1] . . . 6

2 Pay-per-copy service by a service provider [1] . . . 12

3 Classification scheme of service area for product-service systems . . . 13

4 Analytics performed at different levels with different latency requirements [2] . . . 16

5 An instance of a general three-tier architecture for pay-for-performance system with more details . . . 19

6 Architecture for pay-for-performance system . . . 20

7 Basic functionalities performed at the edge layer . . . 21

8 Basic functionalities performed on the cloud . . . 21

9 Diagram of a Four-High mill. The workpiece passes through the gap between the work rolls. The blue circles are rolls that put force on the work rolls . . . 24

10 Figure depicting the components of remote monitoring solution [3] . . . 27

11 A logical architecture of PfP system when provisioned using two different Azure regions . . . 31

12 A logical architecture of PfP system when provisioned using two different Azure regions . . . 32

13 Two devices sending data to the IoT suite in the West Europe region. The blue dots represent the systems whereas the black dot represents the location of the West Europe Azure region . . . 33

(6)

List of Tables

1 Table displaying different speed configuration for each piece of strip . . . 25

2 Provisioning of remote monitoring solution in different Azure regions . . . 30

3 Different pricing tier available in IoT hub by Azure . . . 32

4 Average monthly cost of running three units of Stream Analytics service . . . 41

5 Average monthly cost of running a single unit of S2 tier of DocumentDB service . . . 42

6 Average monthly cost of having a single throughput unit in an event hub . . . 42

7 Monthly ingress cost per event hub . . . 42

8 Average monthly cost of running two event hubs . . . 43

9 Values assumed for this solution . . . 43

10 Average monthly cost of storing data to the blob storage from a single Stressometer when using LRS standard . . . 43

11 Values assumed for geo-redundant storage . . . 43

12 Monthly cost of storing data to the blob storage from a single Stressometer when using GRS standard . . . 43

13 Amounts assumed for read-access geo-redundant storage . . . 44

14 Monthly cost of storing data to the blob storage from a single Stressometer when using RA-GRS standard . . . 44

15 Average monthly cost of running the web application on type S1 instances using App Services . . . 44

16 Average monthly cost of running the web application on type P1 instances using App Services . . . 45

(7)

1

Introduction

Selling products is the traditional way of doing business for manufacturing organizations. After purchasing the product, the customer owns that product and is responsible for making product-related decisions, such as when to perform inspection of the product, how to perform diagnostics in case of a failure, etc. In such a business model, the manufacturer usually provides after-sale services depending upon the contract with the customer. At the time of product selling, the incentive for providing such services is more about establishing the manufacturer's reputation as a competent after-sales service provider than providing a product that best suits the customer's needs on a long-term basis. Under this model, there is no guarantee for the manufacturer that the customer will buy the product once again when its life has ended. As a result, manufacturing organizations are competing with other organizations, especially with those who sell products with similar functionality but at a lower price. An example of a product-selling business model is that of a photocopy machine, as shown in Figure 1. The manufacturer provides the photocopy machine and servicing. In return, they get payment for their product and services. Although the customer is just interested in using the product, in order to use it, she has to purchase the equipment and manage it (i.e. provide consumables, monitor the performance and arrange servicing in case of some problem).

Figure 1: Traditional way of using a photocopy machine [1]

To secure their place in the market and to remain in business, manufacturers are forced to look into new business models that can create and establish long-term revenue streams, which can be more reliable than product sales. A newer approach of doing business is to reduce the emphasis on ‘sale of product’ and move towards ‘sale-of-use’ business models. When we move towards the service domain, organizations can offer integrated products and services to their customers. This phenomenon is called servitization of manufacturing [4] and such systems are called Product–Service Systems (PSS). PSS is defined as “a system of products, services, supporting networks and infrastructures that are designed to be competitive, satisfy customer needs and have a lower environmental impact than traditional business models” [5].

In PSS, the services are provided by catering to the continuously changing demands of the customers [6]. McDonald [7] defines a service as “an activity which has some element of intangibility associated with it. It involves some interaction with customer or property in their possession, and does not result in a transfer of ownership. A change of condition may occur and provision of the service may or may not be closely associated with a physical product”. Some key characteristics of a service are [7]:

1. Services are intangible, while products are concrete.

(8)

Some key benefits of PSS include the development of longer-term relationships between vendor and customer and the creation of new revenue for businesses. Several companies have successfully used servitization, such as Rolls-Royce [8] and Rockwell Automation [9]. More and more customers are requesting products to be offered as services, as it saves the customer from making a huge up-front investment [10]. This offers new opportunities to manufacturing organizations, such as the availability of a different set of business models, e.g. usage-based pricing. PSS can enable an environmentally sustainable business. This is possible as the owner will try to reduce the risks associated with owning the product by putting more effort into enhancing the utilization, design and reliability of the product. As the product will be used more efficiently, the consumption of resources can be reduced.

A Pay-for-Performance (PfP) system is a type of PSS in which the goal is to charge the customer on the basis of the performance of the system. If we take the example of the flatness measurement and control system used in rolling mills, this could mean the customer pays based on flatness quality, number of metal strips produced, etc. In order to charge the customer based on usage, the organization needs information about how the system has been used by the customer. For example, it will need to know how many strips the customer has actually produced. One simple way to do this would be to ask the customer for this information. However, a better approach would be not to involve the customer but rather acquire this information directly from the control system. This information can be acquired by collecting and aggregating data from different systems and performing analysis. Based on the type of data procured from the systems, different services can be provided to the customers.
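As an illustration of this idea, the sketch below computes a usage-based invoice from per-strip production records. The record fields, the price per strip, the quality threshold and the discount are all hypothetical assumptions for the example, not terms from any actual contract:

```python
# Hypothetical sketch of usage-based invoicing for a flatness-control
# system under a Pay-for-Performance model. All prices and thresholds
# are invented for illustration.
from dataclasses import dataclass

@dataclass
class StripRecord:
    strip_id: int
    flatness_quality: float  # assumed quality index in [0, 1]

def monthly_invoice(strips, price_per_strip=2.0, quality_threshold=0.9,
                    quality_discount=0.5):
    """Charge per produced strip; discount strips below the agreed quality."""
    total = 0.0
    for s in strips:
        if s.flatness_quality >= quality_threshold:
            total += price_per_strip
        else:
            total += price_per_strip * quality_discount
    return total

strips = [StripRecord(1, 0.95), StripRecord(2, 0.80), StripRecord(3, 0.92)]
print(monthly_invoice(strips))  # 2.0 + 1.0 + 2.0 = 5.0
```

In practice, the strip counts and quality values would come directly from the control system, as argued above, rather than from the customer.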

Such systems are always generating data, and data collected over time can help in understanding the behavior and use of the system. A number of benefits can come from collecting and analyzing the data, such as understanding the performance of the system and supporting troubleshooting activities. Because a lot of data is generated by the systems, large computation power is needed to process it. Therefore, organizations need a proper infrastructure that is capable of handling such compute-intensive processing.
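One common way to tame this data volume, sketched below under assumed sample rates and field names, is to aggregate high-frequency sensor readings close to the source and upload only the summaries for further analysis:

```python
# Illustrative sketch of reducing data volume before cloud upload:
# high-frequency readings are collapsed into per-window summaries.
# The sampling rate and field names are assumptions for the example.
from statistics import mean

def aggregate_window(samples):
    """Summarize a window of raw flatness readings into one record."""
    return {
        "count": len(samples),
        "mean": mean(samples),
        "min": min(samples),
        "max": max(samples),
    }

# One minute of readings sampled at 4 Hz (240 values) collapses to one record:
raw = [0.90 + 0.0001 * i for i in range(240)]
summary = aggregate_window(raw)
print(summary["count"])  # 240
```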

1.1

Problem Formulation

Offering an industrial system as a PfP system introduces several unique challenges, and these challenges need to be addressed when designing a PfP system. As mentioned previously, an industrial control system generates a lot of data of different structures and types. Additionally, this data may need to be collected from different customer sites that may be at different geographical locations. In addition, some functionality may have a latency requirement that makes it infeasible to do the computation on the cloud. However, with a non-cloud infrastructure, organizations will need to invest in storage, CPU, security, etc. They will also need to hire personnel for support and maintenance. The organizations will need to take measures to support peak load, and since the demand for such an amount of resources may last only a short duration, the servers will be under-utilized the rest of the time. Because organizations cannot really predict such demand for resources in advance, a non-cloud solution may not be cost-effective. That being said, the cost of moving all the data in and out of the cloud might be too high, so preferring a non-cloud solution (e.g. local servers) for some of the data might be a better option.
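The cloud-versus-local trade-off above can be made concrete with a back-of-the-envelope comparison. Every price in this sketch is an invented placeholder; real figures would come from a provider's pricing calculator and local hardware quotes:

```python
# Illustrative break-even comparison between a cloud and an on-premise
# deployment. All unit prices are placeholders, not real rates.

def cloud_monthly_cost(gb_transferred, gb_stored, compute_hours,
                       egress_per_gb=0.08, storage_per_gb=0.02,
                       compute_per_hour=0.10):
    """Monthly cloud bill: data transfer + storage + compute (assumed rates)."""
    return (gb_transferred * egress_per_gb
            + gb_stored * storage_per_gb
            + compute_hours * compute_per_hour)

def on_prem_monthly_cost(server_capex=12000, amortize_months=36,
                         ops_per_month=400):
    """Hardware amortized over its lifetime plus staffing/maintenance."""
    return server_capex / amortize_months + ops_per_month

cloud = cloud_monthly_cost(gb_transferred=500, gb_stored=2000, compute_hours=720)
on_prem = on_prem_monthly_cost()
print(cloud, on_prem)
```

Under these invented numbers the cloud option is cheaper, but the comparison flips if data volumes grow or if peak compute is rare, which is exactly the design decision discussed above.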

Mentioned above are just a few challenges, but they are enough to show that different decisions are required when designing a PfP system. Thus, the research questions (RQs) that guide our thesis study are stated as follows:

• RQ1: What is a suitable architecture, i.e. what are the essential components and how should they be connected, when moving an existing industrial system to a pay-for-performance model?

(9)

• RQ2: What are the implications or challenges for adopting an existing IoT architecture for pay-for-performance system?

1.2

Research Site

ABB is one of the largest engineering companies, operating mainly in robotics, industrial automation, and power. ABB products are used in many different industries, including oil and gas, railway and power automation. In this thesis, the product used as a case study is the Stressometer control system, which is used in the rolling mill industry. The Stressometer control system is an industrial system of ABB which is “recognized as the world standard in flatness measurement and control in flat rolling mill” [11]. This system takes in different data, such as strip thickness, speed of the roll, etc., to achieve optimal flatness performance.

1.3

Research Method

In order to answer RQ1, we conducted a literature study to identify requirements for PfP systems. This in turn helped us to design a high-level architecture. The next step was to investigate an existing IoT architecture that can be used to offer PfP systems, by performing a cost analysis and using a Stressometer system as a case study. We then identified several implications and explored each by proposing some new architectural design that can counter the shortcomings. We also discussed the effects on cost for each different scenario. Following is a list of activities that were performed to answer the RQs stated in Section 1.1:

Literature study

A literature study was conducted to find relevant material on the following topics:

• Requirements for pay-for-performance systems.

• Services provided by pay-for-performance systems.

We began our search by first understanding what would be required by a PfP system. When we constructed search strings, such as “pay-for-performance system” or “pay-for-performance products”, we did not find any relevant papers. As a result, we had to take a different approach and thus created search strings that explained the concept of PfP. This resulted in search strings such as “offering products as services”, “offering industrial systems as services” or “services and products”.

The search produced a lot of relevant articles. All of the articles provided good information on what “product as a service” means in general. Many articles were also published under the title “Industrial Product-Service Systems (IPSS)” that particularly focused on offering industrial products as services to the customers. These articles allowed us to derive requirements for PfP industrial systems.

Study a system at ABB

This activity was conducted to understand the type of data that is generated by industrial systems by looking into the Stressometer. In addition, emphasis was also given to the quantity of data produced as well as the frequency with which it is produced. This task was achieved in the following way:

• We tried to develop a good understanding of the system by having numerous discussions with people having knowledge of the system and by reading documents. The purpose of this activity was to construct a general scenario for all industrial systems and not be specific to one system.

(10)

• We also had a simulator that helped us understand the behavior of the Stressometer.

Again, the purpose of this activity was to understand the requirements of an actual industrial control system. The data received from the Stressometer helped us in carrying out the cost analysis and in determining the different ways cost can influence architectural decisions. This in turn helped us determine several implications that cost has on the architecture.

Design an architecture

By combining the literature and the knowledge gained from the Stressometer system, we proposed a high-level architecture. We also identified different functionalities that a pay-for-performance system requires.

The purpose of all the previously stated activities was to understand pay-for-performance systems and thus answer the first research question (RQ1). The next set of activities will answer the second research question.

Evaluating cloud technologies

The cloud platform used during this thesis was Microsoft Azure. The purpose of investigating different Azure services was to find how well they can support pay-for-performance systems. We looked into services that would allow us to connect our system to the cloud. Microsoft provides an IoT suite that integrates devices and systems with the IoT solution. Under this IoT suite, Microsoft provides two preconfigured solutions (as of March 2017):

• Remote monitoring solution

• Predictive maintenance solution

Both of these are end-to-end solutions that combine different Azure services. Remote monitoring solution allows a customer to remotely monitor their systems whereas predictive maintenance solution allows a customer to predict the point at which a failure is likely to occur. Both of these solutions can be configured or customized to fit the specific needs of an organization. Remote monitoring solution was investigated during this thesis.

Cost analysis

A simple cost analysis was conducted to estimate the cost of setting up the preconfigured solution for a single device. During this thesis, we did not connect an actual Stressometer, nor did we deploy the solution. Rather, we used the Azure pricing calculator for the cost analysis. The purpose of doing a cost analysis was to:

• Determine the cost for a single Stressometer.

• Determine several scenarios that would have an implication on the architecture.

• Determine the effect on cost for each of these implications.
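The kind of aggregation behind such an analysis can be sketched as a simple sum over per-service unit prices. The service names mirror those in Appendix A, but every unit price and unit count below is an illustrative placeholder, not an actual Azure rate:

```python
# Sketch of aggregating assumed monthly unit prices for the services in
# a remote monitoring solution. All figures are placeholders, not real
# Azure prices; a real analysis would use the Azure pricing calculator.
assumed_monthly_price = {      # USD per unit per month (illustrative)
    "Stream Analytics": 80.0,
    "DocumentDB (S2)": 50.0,
    "Event Hub": 22.0,
    "Blob Storage (LRS)": 5.0,
    "App Service (S1)": 73.0,
}
units = {                      # unit counts assumed for the example
    "Stream Analytics": 3,
    "DocumentDB (S2)": 1,
    "Event Hub": 2,
    "Blob Storage (LRS)": 1,
    "App Service (S1)": 1,
}

total = sum(assumed_monthly_price[s] * units[s] for s in units)
print(f"Estimated monthly cost for one Stressometer: ${total:.2f}")
```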

1.4

Actual Research Outcomes

In an effort to answer the RQs, we had the following outcomes at the end of the thesis:

1. A high-level architecture for pay-for-performance systems.

2. Identification of services that may need to be provided to offer a system as a pay-for-performance system.

3. A cost analysis of an existing IoT architecture in Microsoft Azure using the Stressometer as a case.

4. Identification of scenarios that have implications on the architecture.

(11)

1.5

Structure of the Thesis

The report is organized into the following sections: Section 2 begins with a discussion on the classification of Product-Service Systems and elaborates under which classification a PfP system falls. It then discusses the application of result-oriented PSS in different industries. There is also a discussion on the services that a service provider will need to provide in order to offer a system under a pay-for-performance business model. Furthermore, there is a brief discussion on cloud computing and edge computing and why we would need both in a PfP system architecture. Section 3 discusses several architectural solutions present in the literature that use just a cloud platform and those that use a cloud platform in combination with edge computing. Section 4 discusses a high-level architecture of a pay-for-performance system with a mention of general services that can be provided to a customer. Furthermore, we also emphasize and briefly discuss the role that data plays in PfP systems.

In Section 5, we briefly discuss the purpose of the Stressometer system. For clarity, we discuss a specific service that can be provided to the users of the Stressometer system if it is offered as a PfP system. We then perform a cost analysis in Section 6 to evaluate the cost of providing the service for a single Stressometer. We then discuss several implications that can be drawn from the logical architecture of the remote monitoring solution, keeping in mind the impact on cost. The results and limitations of our work are discussed in Section 7 and the report concludes with a discussion of future work in Section 8.

(12)

2

Background

The following chapter intends to familiarize the reader with the industrial applications of result-oriented services. Furthermore, the requirements and functionalities needed for the provision of such a system are also discussed.

2.1

Classification of Product–Service systems

Tukker [12] has divided PSS into three types depending upon the services they offer to the customers: product-oriented services, use-oriented services and result-oriented services.

1. Product-oriented PSS: In such a business model, the company sells the product to the cus-tomer and provides after-sale services such as maintenance, repair, product training, etc. As the customer owns the product, the responsibility of keeping the product in check is on the customer. The company provides product-related services that are paid by the customer including product transportation, on-site installation, repairs and supplying spare parts. Ad-vice, training and consulting services (e.g. documentation, help desk and training on how to use the product) are available for the customer to use the product efficiently.

2. Use-oriented PSS: The product-service (PS) provider sells the use of the product to the customer. In such a scenario, the PS provider is the owner of the product whereas the customer simply uses the product. In such a business model, the company is motivated to maximize the use of the product and adopt methods that can extend the life of the product. Products under this category can be offered on lease, rent, or share and pooling. The PS provider is responsible for providing maintenance, repair and control. Customers benefit from a reliable product that is well maintained. The product is better utilized, which leads to a higher level of productivity.

3. Result-oriented PSS: The PS provider remains the owner of the product and sells the result of the product. As a consequence, the customer pays according to the level of use. The services offered are customized and the customer pays for the result. This type of service occurs in three different forms: outsourcing, pay-per-use and functional result services.

• Outsourcing: Also known as activity management [12], where the PS provider manages one or more activities on behalf of the customer. An example of this type of service is Full Service Maintenance Performance Management by ABB, which takes over the responsibility of the management process in maintenance, thus allowing the customer to focus only on its core business values.

• Pay-per-use: The provider takes payment from the customer in accordance with use. A well-known example of this type of service is pay-per-print, as shown in Figure 2. Service providers, such as Ricoh and Canon, take over all activities needed to keep a copying function available, such as availability of consumables (i.e. paper and toner supply), monitoring the health of the machine by providing maintenance, repair, and, if required, replacement with a newer machine. The customer is free from these responsibilities and pays for the number of copies. A PfP system can be classified as a pay-per-use system as the customer pays for use.

• Functional result: The provider sells the result or capability instead of a product (e.g. selling laundered clothes instead of a washing machine). The customer pays only for the provision of agreed results. Another example of functional result-oriented services is the “Total Fluids Management” program offered by PPG/Chemfil and Ford Motor Company [4]. PPG/Chemfil is responsible for most of the chemicals in the plant and the fee is fixed per unit of production.

(13)

Figure 2: Pay-per-copy service by a service provider [1]

2.2

Applications of Servitization in Manufacturing

Following are examples of result-oriented PSS used in the industries:

1. Rolls-Royce’s aircraft engines are used by over 600 airlines worldwide [8]. An aircraft engine is an expensive and complex product containing thousands of parts. Performing routine maintenance can be an expensive process, and having to do corrective maintenance due to some failure in the engine is even more costly. To improve customer relations, Rolls-Royce Plc offers a Total-Care Package to the airlines [1]. Instead of transferring ownership of the gas turbine engine to its customers, Rolls-Royce delivers it as ‘power-by-the-hour’. The users of its engines pay by the number of hours the engine is in flight [8].

Since Rolls-Royce retains the ownership of the engine, it has direct access to the product and can collect and analyze data on product performance and use. Such data can then be used to improve engine efficiency and asset utilization by predicting potential faults, optimizing the maintenance schedule and using the knowledge gained through the analysis to improve designs of future engines. The improvement of performance parameters, such as the maintenance schedule, reduces total costs. An example of saving cost by avoiding an unscheduled maintenance can be seen when a flight from Singapore to New York was struck by lightning. Rolls-Royce’s service team was able to assess the condition of the plane’s engines and advise the pilot that it was safe to continue the flight. The airline was able to save millions of dollars because of the information that Rolls-Royce had on the airplane’s engines through monitoring and analysis of the engine data.

2. Rockwell Automation is another manufacturer that supplies equipment to the petroleum industry. The lack of insight into the petroleum supply chain makes it difficult to help customers see potential issues and thus address these issues proactively. By collecting sensor data from remote equipment, Rockwell was able to capture real-time information on equipment performance and health factors [9]. By monitoring these assets, Rockwell Automation could provide predictive maintenance and even preventive maintenance. Rockwell Automation has also designed and developed an asset performance management (APM) solution using the Microsoft Windows Azure cloud platform for LACT (lease automatic custody transfer) units. A LACT unit consists of a series of pumps, pipes and valves designed to measure oil quality, purity and other key parameters. Data from the LACT are displayed on a local HMI (such as a truck) or industrial PC screen and transmitted into the Windows Azure cloud. Once in the cloud, Rockwell applications combine real-time and historical data to provide information on transfers, overall oil quality, and well productivity over time. This can be used by companies to improve billing accuracy and timing [13].

(14)

Figure 3: Classification scheme of service area for product-service systems

2.3

Requirements for Product–Service System

Providing a product as a service means that the PS provider would be interested in minimizing the total cost of ownership. This implies the provider would strive to reduce operational and maintenance cost. Traditionally, information regarding the product in use by the customer came either through the customer or the service personnel. When a piece of equipment stopped working properly at the customer site, the on-site engineer had to look up the diagnostic solutions in the documents provided by the supplier to fix the problem. If that did not work, the supplier had to send field-service experts to the customer site to perform on-site diagnosis, testing, and repair. Such a long downtime of the product could cause a significant production loss. A manufacturer who is interested in offering a system as a pay-per-use system cannot afford to perform untimely maintenance, as this would mean lower production for the customer. And since the customer is paying for use, this would directly affect the revenue. The manufacturer needs to take measures to ensure that the product operates efficiently. Such a system puts extra responsibilities on the supplier, as it needs to cater for consumables, repair, maintenance, and disposal. In result-oriented services, the customer can be certain that it will be provided with a highly available product that operates efficiently.

Result-oriented systems give a platform to collect information from every product that is being used by the customers. Such products are always generating data. One big opportunity that industries have is to collect this data from products that are in use by the customers. This provides opportunities for manufacturers to observe and measure product behaviour and use. A number of benefits can come from collecting this data. The data can be used to understand the performance of the system, diagnostic data can support troubleshooting activities and feature usage data can be used to understand the usage pattern of the user. The data can also be collected to support continuous improvement of existing functionality provided by the product and as a basis for new innovation [14].

Different configurations can have an impact on the performance of the system. The impact can be on reliability and productivity, product quality, capacity scalability, and costs [15]. Adapting the system to the customer's needs, designing the system to reduce idle and operation costs, and providing service when it is actually required all require knowledge about the system behavior and prognostics about the system conditions [15]. The prognostics of the system condition is possible through the continuous supervision of products and processes as well as processing of data by algorithms. Services oriented towards the supplier's product for PSS are divided into two categories by Haeberle et al. [16], which are shown in Figure 3. Following is an explanation of the services:

(15)

1. Product life-cycle services (PLC): These refer to the range of services that are offered to the customer along with the product. These include services that the manufacturer must provide to the customer to ensure proper functioning of the product during all stages of its life cycle. By providing these services, the manufacturer establishes itself as a competent service provider. These can be before-sale services or after product-sale services, including installation or inspection of the product and recycling of some components.

2. Asset efficiency services (AES): These services not only guarantee the functioning of the product but include a range of other activities that improve the productivity of an asset. These services allow organizations to differentiate themselves from competitors (e.g. “we guarantee availability of 98.5% of video screens up and running in an aircraft” [17]) and give an opportunity to charge the customer based on value-based pricing. Services included in this category include system configuration services such as performing updates and upgrades of the installed software, managing the risk of product failure by employing preemptive maintenance, on-site condition monitoring of systems, etc., and know-how information that can assist the customer in producing the desired result.

To offer AES, the PS provider needs to ensure the following:

1. Availability of the product: Downtime means loss for both the customer and the PS provider. As the customer pays according to use, any downtime means that the product was not used during that time frame and therefore no payment for the supplier. Highly available products can be provided by continuously monitoring the health of the system. By detecting faults in time and doing timely maintenance, asset utilization can also be improved.

2. Reliability of the product: The product should perform according to the expectations of the customer. It should provide accurate results that conform to the specifications. This quality attribute can be attained by monitoring the performance of the system and updating the system accordingly for optimal performance.

3. Security: Since the product is no longer owned by the customer, assurance needs to be provided by the supplier that measures are taken to keep the data of each customer secure.

It is common for industrial systems to produce huge amounts of data; the engines provided by Rolls-Royce, for example, produce one terabyte of data on every flight. Due to this quantity, a lot of computation power is needed to process the data. One technology that is receiving a lot of attention and that can provide a solution for these data-driven industries is cloud computing. The cloud computing model is a cost-efficient alternative to owning and managing private data centers for industries requiring batch processing of huge amounts of data.

2.4

Cloud Computing

Cloud computing is a very active research area and is quickly gaining popularity because of its ability to provide reliable services to users and businesses at low cost. Some of the leading companies providing cloud computing services are Amazon (EC2), Google (App Engine) and Salesforce (Force.com). Facebook and YouTube are examples of cloud-deployed web applications [18]. The US National Institute of Standards and Technology defines cloud computing as “a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction” [19]. This computing model facilitates provisioning of resources such as CPU and storage on demand. The data centers are distributed around the globe and the user can easily access these cloud services from anywhere through the Internet. The technology offers a pay-as-you-go pricing model in which the user only pays for actual usage. The three fundamental service models offered by cloud computing providers are 1) Infrastructure as a Service (IaaS), 2) Platform as a Service (PaaS), and 3) Software as a Service (SaaS). The user of the cloud can take advantage of these services to


achieve cost effective IT solutions. The responsibilities and expenses of managing the underlying hardware-software ecosystem will remain with the cloud provider.

Benefits of using cloud computing include lower operation and maintenance costs and availability of resources for batch-processing and analytics jobs. This implies that the cloud is well suited for heavy analyses (i.e. those analyses that require many CPU cycles). Centralized data center infrastructures work well for analyses that rely on static or historical data, where it can take days or even weeks to gain insight from the data; centralized analytics can, for example, be used for fleet management and optimization. There are three main deployment models of cloud computing: private cloud, public cloud and hybrid cloud. When the cloud services are made available in a pay-as-you-go manner to the public, it is known as a public cloud. A private cloud refers to a data center dedicated to a single organization, or a part of a public data center that is dedicated to a single organization [20]. A hybrid cloud is a combination of both models, where an organization utilizes services from both public and private clouds.

2.5

Edge Computing

Cloud computing provides many benefits to organizations. However, for some specific cases, such as low-latency applications, using the cloud for performing computations may not be the best approach due to issues relating to bandwidth (the rate of data transfer, measured in bits per second), security and latency. Such applications may need computation nodes in their vicinity to meet their timing requirements, and the cloud may not be able to fulfill this requirement since bandwidth congestion can cause delay. As the Internet of Things (IoT) continues to expand, with more and more physical objects transmitting and receiving data, the size of the data generated will only increase, increasing the strain on the network infrastructure. Performing analytics on large volumes of data produced by distributed assets can make centralized computation of analytics problematic. The information derived from the analytics could lose its value even if it takes only a few minutes to collect and send data from these hundreds of devices to the cloud for analytics. This is true for applications that require a real-time response, such as industrial control systems. Another use case for edge computing is when it is not economical to send a large amount of raw data to centralized data stores to be analyzed, even without a real-time requirement. Edge computing keeps the data closer to the devices instead of sending everything to the data centers in the cloud. It consists of putting servers or data analytics machines in remote locations, closer to the physical devices, in order to gain real-time insights from the collected data and also reduce the amount of data that needs to be sent to the cloud data stores. Edge computing addresses the inadequacies of cloud-only models, which have serious challenges with latency, network bandwidth, geographic focus, reliability and security.
Many applications require both edge and cloud computing for analytics and Big Data. Applications or services with the following requirements are better suited for a decentralized computing infrastructure such as edge computing [21]:

• Applications requiring low and predictable latency.

• Geo-distributed applications, where the data is spread out over more than one computer in a network and all the computers on the same network work simultaneously on a single task.

• Mobile applications, such as smart connected vehicles.

• Distributed control systems such as smart grid.

Edge analytics, sometimes known as distributed analytics, is an approach in which some data is stored, processed and analyzed near the source instead of sending all the data to the cloud [22]. This functionality is critical for industrial IoT systems, where systems are expected to generate huge amounts of data and sending all of it is not possible due to cost or other non-functional requirements such as low latency. Edge analytics usually involves performing the analysis either on the device


itself or on a network edge device such as a gateway. IoT edge analytics is typically applicable to on-land and offshore oil rigs, mines and manufacturing plants, which operate in low-bandwidth, low-latency environments. Edge-based analytics does not replace the centralized analytics data-center model; rather, it supplements it by providing opportunities to gain insight from the data quickly so that an immediate/real-time response can be delivered to the system sending the data. Hence the advantages of performing analytics close to the edge are:

1. Reduced network bottleneck: Some data is forwarded for further analysis while the rest is discarded at the edge.

2. Fast response time: Since the raw data is processed close to the device, this implies that the information can be delivered to the device quickly.

3. Filtering data: Only the necessary data is sent to the cloud for further analysis.

Figure 4: Analytics performed at different levels with different latency requirements [2]

Figure 4, which shows an example taken from a smart grid application, depicts the need for a hierarchical organization of networking, storage and compute resources depending on the data latency and the application of the analytics. The data is analyzed at different levels. The machine-to-machine (M2M) interactions, which have the lowest-latency data, are at the bottom level with real-time technical analytics. The highest level (days to months) includes the highest-latency data, where operational analytics are used for business intelligence management dashboards and reporting. It can be seen that the scope of the data widens as we move up the levels. For instance, the analytics done for business intelligence may be performed on data collected over months from geographically distributed areas.

Edge devices can be of two types: gateways and servers.

• Gateways: Gateways were previously used only to aggregate and route data traffic. Their functionality has since evolved to include storing the data and performing some computation before sending it to the cloud. Edge analytics allows us to do some pre-processing or filtering of the data closer to where it is created.

• Servers: Many server vendors, such as Dell, are positioning their servers as edge devices by adding additional storage, computing power and analytics capabilities.


3

Related Work

Jin et al. [23] proposed a generic framework for creating IoT implementations. According to the article, there are three main viewpoints that guide the building of an IoT implementation: network-centric IoT, cloud-centric IoT and data-centric IoT. The viewpoints are strongly related to each other. In this thesis, we have focused on cloud-centric and data-centric IoT. Lake et al. [24] have described a data-centric IoT viewpoint where the data (produced by the sensors) flows from the gateways to the hubs and eventually to the cloud-based data stores. The gateways and hubs have caching and processing capabilities that enable data traffic to be filtered and processed efficiently without having to send all data to the cloud. Similarly, PfP systems would also generate huge amounts of data, and this data would need to be processed closer to the actual system before being sent to the cloud. However, Lake et al. [24] have proposed an architectural framework for addressing security and privacy challenges, whereas in this thesis we are interested in cost and its effects when designing a PfP system.

Khoi et al. [25] have proposed a layered architecture for a system they call IoT-based Remote Health Monitoring (IReHMo). The system allows healthcare providers to constantly monitor the behavior of elderly people. They have compared different network communication protocols for IoT devices, such as HTTP, MQTT and CoAP. From their evaluation, the CoAP-based IReHMo implementation had the lowest bandwidth consumption and also reduced the amount of data generated per sensor. The Inspur SSM application monitoring system [26] uses a layered architecture for its service monitoring and management platform, which monitors IT resources including various mainframe devices, network devices, databases and web services. Similar to the work done by Khoi et al. [25] and [26], the architecture of PfP systems would also be a layered architecture.

Ye et al. [27] have also proposed a three-layer logical architecture that uses edge analytics in the network edge layer, which sits between the application and the data layer. The data layer consists of physical devices, such as mobile phones, that are capable of collecting and sending data to the edge layer for analytics. The analytics carried out at the edge layer process the raw data, which reduces its volume and raises its level of semantics. The application layer resides in the data center. It is responsible for receiving and aggregating data from the edge nodes. Again, the architecture makes use of both edge computing and the cloud, and analytics are performed at both layers.

Satyanarayanan et al. [28] have proposed a hybrid cloud architecture called GigaSight. It is a three-tier architecture comprising a mobile device, a cloudlet, and a cloud layer. Cloudlets are virtual-machine-based nodes where computer vision analytics take place on the video streams coming from different cameras. This decentralized cloud computing infrastructure helps in meeting end-to-end latency requirements and sends only selective data to the cloud. According to [28], having numerous cloudlets close to the network edge is better, from a performance viewpoint, than having a few larger cloudlets deeper in the network. The focus of the article is on video streams; however, any industrial IoT application that generates a lot of data can benefit from the proposed solution.

While all these studies suggest a layered architecture and the use of edge computing together with the cloud for PfP systems, none of the studies discusses the impact of cost on the architecture and how cost can drive different architectural decisions. In addition, none of these studies mentions the components required by PfP systems.


4

Architecture for Pay-for-Performance System

Taking inspiration from the previous works described in Section 3, this section presents a three-tier architecture for PfP systems, as shown in Figure 5. Each layer can only use the layer directly below it. As described in Section 2.4, we may need to perform computation in the cloud, as it is much easier to aggregate data from different parts of the world there, and by performing analytics on this data an organization can gain knowledge about the performance of a fleet of systems. As pointed out in Section 2.5, we may also need to perform computation close to the physical systems for reasons such as low latency and cost. Hence a PfP system will leverage both edge computing and the cloud in its architecture.

Figure 5: An instance of a general three-tier architecture for pay-for-performance system with more details

Edge tier

This layer consists of all the physical systems, which send data to the layer above it. A local analysis, also known as edge analysis, is performed at this level. Data produced by hardware and software is collected and processed. Processing can include removal of noisy data, removal of redundant data, encryption, etc.

Platform tier

This layer resides in the cloud. It performs the main functionalities required to maximize asset productivity and optimize maintenance, as mentioned in Section 2.3. Some basic functionalities include monitoring of the products, performing analytics on the data for diagnostics and prognostics, storage of data, and system configuration activities such as updates and upgrades. More services can also be provided depending upon the use case.

Application tier

This layer interacts directly with the end users, such as business users, operational technology users, etc. It provides a user interface which is responsible for displaying the received information to the user. The information can be displayed as simple reports, dashboards, etc.

Figure 6 is an instance of the general three-tier architecture shown above in Figure 5. As can be seen, the edge gateway collects and aggregates data from different on-field systems and, after performing some local analysis, sends the data to the cloud computing platform. After some computation in the cloud, the results of the analysis are shown to the user. The end users can view the results and send control commands back to the customer site.


Figure 6: Architecture for pay-for-performance system

4.1

Functions Performed on the Edge

From Section 2.5, we can derive some basic functionalities that would be provided at the edge. We already know that we may opt to have computation at the edge for several reasons, including stringent timing requirements posed by the physical system or when it is not economical to send all the raw data to the cloud. Hence we can safely say that we would need functionalities such as data collection and aggregation, and data analysis, at the edge. In addition, we know that this computation can be performed on gateways or servers. Figure 7 depicts these functions and we will describe each in turn at a high level.

Data management

It consists of ingesting sensor and operation data from multiple systems at the customer site and aggregating them. In Figure 6, we have two systems at the customer site, namely the Stressometer control system and the Thickness Measurement system. Data from such systems can be of different types, including time series data, events and alarms.

Local analysis

By performing the analysis close to the actual systems, we eliminate the need to send a large amount of raw sensor data to a remote system. We can use simple analytics or more advanced analytics such as data mining and machine learning. One purpose of having the analysis closer would be to reduce the amount of data that needs to be sent to the cloud. For simplicity, let us assume we have rules stored in an edge device. The data points sent from the systems might be compared to these rules, and when a value exceeds a threshold, only that value is sent to the cloud. As mentioned in Section 2.5, we might have other forms of analytics at the edge that deliver information quickly to the systems.
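As a concrete illustration of the rule-based filtering just described, the following Python sketch forwards a data point to the cloud only when it exceeds a stored threshold. The rule names and threshold values are hypothetical and chosen only for illustration:

```python
# Hypothetical edge rules: parameter name -> threshold above which the
# value is considered interesting enough to forward to the cloud.
RULES = {"flatness_error": 5.0, "motor_temperature": 80.0}

def filter_at_edge(readings):
    """Keep only the (parameter, value) pairs that exceed their threshold.

    Everything else is discarded at the edge, reducing the data volume
    that has to be sent to the cloud.
    """
    return [(name, value) for name, value in readings
            if name in RULES and value > RULES[name]]

batch = [("flatness_error", 2.1),
         ("motor_temperature", 93.5),
         ("flatness_error", 6.4)]
to_cloud = filter_at_edge(batch)  # only the two out-of-range values remain
```

In a real deployment the rule table would be configurable from the cloud, so that thresholds can be adjusted without redeploying the edge software.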

API

The edge will also need to provide an interface to the user. At the highest level, the user should be able to interact with the edge device through the interface.


Figure 7: Basic functionalities performed at the edge layer

4.2

Functions Performed on the Cloud

The functionalities that may need to be provided in the cloud for PfP systems are derived from Section 2.3. We know that we need to provide a highly available system that operates efficiently. To reduce the risk of system failure, we would need to monitor the system and provide diagnostics in case of a problem. To reduce untimely downtime, we may also provide predictive maintenance. To increase productivity, we may provide system configuration services. Since PfP systems can be geographically dispersed, we may also need a service that collects and aggregates data from different customer sites. Last but not least, we will need to provide an interface to the users. Figure 8 depicts the basic functionalities that would be performed in the cloud. Following is a brief description of each:

Figure 8: Basic functionalities performed on the cloud

Data services

It consists of functions for data collection and processing, storing data in the storage and forwarding data to other services such as monitoring.

Analytics

It includes functions similar to those performed in edge analytics. However, at this higher level, these functions improve the system's model over a longer period of time (i.e. the data being analyzed may be days or months old, as shown in Figure 4). The following functionalities are included in the Analytics component:


1. Monitoring

This component will allow the users to remotely monitor the health and performance of the systems. Since this is done at the cloud layer, the actual system may be situated far away from the user.

2. Diagnostic

Diagnostic consists of functions that help in determining the cause of a problem. The user should be able to remotely diagnose issues with any system sending its data to the cloud.

3. Prognostics

As mentioned in Section 2.3, to improve a system's utilization, an organization providing PfP systems may want to perform timely maintenance. Prognostics consists of functions that identify potential issues before they occur. With the help of these services, an organization can perform predictive maintenance. Predictive analytics combines techniques from data mining, statistics, modeling and machine learning and requires a huge volume of historical data.

4. Optimization

Optimization consists of functions that improve the system's performance and availability. As mentioned in Section 2.3, different configurations can have an impact on the performance of the system. Through this service, a user can send commands for software updates/upgrades to the control systems. Since the commands are initiated by the user (i.e. from the application layer), the arrows in Figure 6 point in the opposite direction to the data flow.

API

The results of all the activities discussed above will be displayed to the user. The user can browse through reports and alarm logs and look at dashboards for information about the systems under control.

4.3

Role of Data in Pay-for-Performance System

By collecting sensor data from across the control systems and applying analytics to this data, we can gain insights into a business's operation. With these insights, we can improve decision-making and optimize system operations globally. The pay-as-you-go business model greatly reduces the cost of IT infrastructure, as mentioned in Section ??. This only holds true if the software is designed in a way that uses the cloud resources optimally. The characteristics of the data greatly influence the decision-making process that occurs while designing any system on a cloud platform, and an architect should have a clear understanding of the type and amount of data that flows in and out of the system. Once the data has been understood, different design decisions can be made, specifically relating to storage, such as whether to use SQL, NoSQL and so on.

In the previous sections, we first discussed the high-level architecture for PfP systems and then went into more detail and discussed some basic functionalities that may be provided when offering PfP systems. We know that PfP systems will generate a lot of data and that the data will be coming from systems situated in different parts of the world. Kavis [29] has mentioned some characteristics of data that may also influence how PfP systems are designed:

1. Physical Characteristics

In a PfP system, similar to designing any other system, it is important to know where the source of data is located. It might happen that the industrial system is situated in a place where there is no Internet. We might then need to consider different ways of sending data, such as using cellular or satellite links. This puts a limitation on the amount of data that can be sent for analysis.


2. Volume

One of the reasons for using edge computing in PfP systems was to reduce the amount of data sent to the cloud. Thus we have acknowledged that PfP systems would generate huge amounts of data and that this data would come from systems located globally. It is therefore important to consider the volume of data so that an organization can allocate resources accordingly.

3. Regulatory Requirements

It is important to understand regulatory constraints set by the customer. The industrial system might be located in a country that does not allow data to be sent outside its borders. Then we would need to ascertain that the data sent to and stored in the cloud remains within that country.

4. Retention period

Retention period refers to how long data must be kept in storage. In Section 4.2, we identified predictive maintenance as one service that can be provided for PfP systems. Identifying faults and predicting when to perform maintenance so that these faults do not manifest into failures requires a huge amount of historic data. When designing a PfP system, we would need to consider where we store this data so that it does not become expensive to maintain such a system.


5

Stressometer Control System

As mentioned in Section 1.3, we used the Stressometer as a case study to obtain further implications for the architecture of a PfP system. To attain these implications, we first had to understand the data produced by this system. As mentioned in Section 4.3, data plays a vital role when designing a PfP system. We therefore put emphasis on understanding the quantity of data produced, as well as the frequency with which it is produced, by a Stressometer system.

The Stressometer is a flatness measurement and control system. It comprises two main functionalities: one part of the system measures the flatness of the workpiece (i.e. the steel strip) and the other part sends commands back to the actuators that control the gap between the work rolls, as shown in Figure 9. In the steel industry, the shape of the steel strip determines the quality of the strip. The shape is largely characterized by flatness, a feature used to describe how closely the surface of a flat-rolled product approaches a plane. Flatness is of the utmost importance in steel making as it determines acceptance or rejection by the customer. Usually there is a flatness tolerance zone, determined by the intended use of the steel strip and the customer requirements. In rolling mills, a flatness measurement and control system improves the quality and productivity of the rolling mill by minimizing rejects, processing time and breaks during manufacturing.

To achieve the desired flatness, a flatness measurement and control system performs two main steps:

• Receive feedback from the flatness sensors measuring the flatness of the workpiece.

• Adjust the mechanical and thermal actuators on the rolling mill stand to minimize flatness error by controlling the gap between the work rolls.

Figure 9: Diagram of a Four-High mill. The workpiece passes through the gap between the work rolls. The blue circles are rolls that put force on the work rolls

The Stressometer system receives data from the sensors in the Stressometer roll as well as other types of data, including strip thickness. The workpiece moves at variable speed, beginning slowly and moving much faster after a few hundred meters have passed over the Stressometer roll. The frequency with which data is produced by the system depends on the speed of the Stressometer roll. The data points are generated at a lower frequency when the roll is moving slowly, and the frequency increases as the roll moves faster. The speed of the Stressometer roll is directly proportional to the speed of the strip. The roll is divided into zones and each zone has four sensors. Table 1 shows the speed with which the roll rotates and, depending on the speed, how quickly a sensor sends its value to the Stressometer system.

To demonstrate how we determined the values in Table 1, let us assume that the roll is rotating at 1000 RPM. The frequency with which sensors send values to the system can then be calculated as follows: 1000 rotations take 1 min, so each rotation takes 60000/1000 = 60 ms. We also know that each zone has 4 sensors, so each sensor in a zone produces a value every 60/4 = 15 ms. Similarly, we can calculate the sensor values for the other speed configurations.

At an actual customer site, when the flat rolling process begins, the roll moves slowly and every sensor value is sent. When the roll moves faster, the average value of each zone is sent after a certain time interval instead of every value from the sensors in that zone. For simplicity, in this report we have assumed that the strip moves at a constant speed and that every value generated by the sensors on the roll is sent to the system.
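The high-speed averaging behavior described above can be sketched in a few lines of Python. This is an illustrative fragment, not the actual Stressometer implementation, and the zone numbering is made up:

```python
from collections import defaultdict

def average_per_zone(samples):
    """Collapse one time window of (zone, value) samples into a single
    averaged value per zone, mimicking the high-speed behavior where the
    average of each zone is sent instead of every individual reading."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for zone, value in samples:
        sums[zone] += value
        counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}

window = [(1, 10.0), (1, 14.0), (2, 7.0), (2, 9.0)]
averaged = average_per_zone(window)  # 4 raw readings reduced to 2 values
```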

Rotation Speed (RPM)    Each Sensor Value (ms)
200                     75
500                     30
1000                    15
2000                    8
4000                    4

Table 1: Different speed configurations of the roll and the resulting sensor value intervals
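The values in Table 1 follow directly from the roll speed. A small helper (an illustrative sketch, not part of the actual system) reproduces them:

```python
def sensor_interval_ms(rpm, sensors_per_zone=4):
    """Interval between successive sensor values from one zone, in ms.

    One rotation takes 60000/rpm ms, and with four sensors per zone a
    value arrives four times per rotation."""
    return 60000 / rpm / sensors_per_zone

# Reproduces Table 1 (the table rounds 7.5 ms -> 8 and 3.75 ms -> 4):
for rpm in (200, 500, 1000, 2000, 4000):
    print(rpm, sensor_interval_ms(rpm))
```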

5.1

Quality Index Service

Now that we have determined the frequency with which the data is generated (Section 5), the next step is to determine the amount of data generated by a single Stressometer system. As mentioned in Section 1.3, this step will help us carry out the cost analysis of a particular IoT architecture and determine the different ways cost can influence architectural decisions. The control system actually generates more than 1000 data points. Instead of focusing on all of this data, we made the assumption that we are interested in determining the quality of the strip produced at the customer site. Thus we provide a quality index service to the customer; a high-level description of this service is that the customer can know the quality of each strip and determine from this information whether the quality fluctuates from one strip to another.

Let us suppose we can determine the quality of the strip from 10 parameters and that we need the values of each of these parameters for each actuator. Each parameter is a double-precision (64-bit) floating point number that occupies eight Bytes in memory. Thus 8 × 10 = 80 Bytes of data per actuator is produced. We will assume that at a typical customer site, 50 strips are produced each day, where the length of each strip is 500 m.

To determine the amount of data produced by a single strip, we need the speed with which a strip moves through the rolling mill. We determine this using the angular velocity of the roll. Given the time of one revolution and a radius r, the linear velocity is:

v = (2π / time of 1 rev) × r    (1)

Assuming the diameter of the roll is 400 mm (r = 200 mm = 0.2 m) and the roll has an angular velocity of 1000 RPM, using Equation 1, the linear velocity of the strip is:

v = (1000 rotations / 60 s) × 2π × 0.2 m ≈ 21 m/s    (2)

Now that we have determined the speed of the strip, shown in Equation 2, we next determine the time taken for the strip to pass over the Stressometer roll. If the length of the strip is 500 m and the strip moves with the speed given in Equation 2, then the time taken for the complete strip to pass over the roll is 500/21 ≈ 23.8 seconds. From Table 1 we know that a sensor generates a value every 15 ms. If we assume that the mill has 10 actuators, then the amount of data produced every 15 ms is 10 × 80 = 800 Bytes. Thus the amount of data produced by a single strip in 23.8 s is:

(800/15) × 1000 × 23.8 Bytes ≈ 1240 KB    (3)

So far we have considered the actual amount of memory that the data points occupy (i.e. a floating point number occupies eight Bytes in memory). When we send data over the network to a cloud platform, the size of the message increases because additional information is sent with each data point, such as the name of the parameter, the time stamp, etc. We created a message in JSON format that simply contained the name of each parameter and its value. The size of such a message was 7500 Bytes (approximately 7.3 KB). In Microsoft Azure, the maximum device-to-cloud message size is 256 KB and a message is sent in 4 KB blocks. This implies that 2 messages will be sent every 15 ms. Thus in 1 second we would have sent:

1000/15 ≈ 66 times =⇒ 66 × 2 = 132 messages    (4)

Since we already know that a single strip passes over the roll in 23.8 seconds, this means a single strip would send 132 × 23.8 ≈ 3142 messages.

Hence we now know the amount of data produced per strip. Using this information, we can determine how many resources we would require when offering the Stressometer system as a PfP system. Knowing this, we are able to perform a cost analysis.
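The calculation chain in Equations 1 to 4 can be checked with a few lines of Python. All constants are the assumptions stated in this section (roll radius 0.2 m, 1000 RPM, 10 actuators with 10 eight-Byte parameters each, 500 m strips, and a 7.5-Byte-thousand JSON message metered in 4 KB blocks):

```python
import math

# Sampling interval (Table 1): 60000 ms / 1000 RPM / 4 sensors per zone
SENSOR_INTERVAL_MS = 60000 / 1000 / 4            # = 15 ms

# Equation 2: linear strip speed from the roll's angular velocity
strip_speed = (1000 / 60) * 2 * math.pi * 0.2    # ~21 m/s
pass_time_s = 500 / strip_speed                  # ~23.8 s per 500 m strip

# Equation 3: 10 actuators x 10 parameters x 8 Bytes = 800 B per 15 ms
bytes_per_sample = 10 * 10 * 8
strip_kb = bytes_per_sample / SENSOR_INTERVAL_MS * 1000 * pass_time_s / 1024

# Equation 4: a ~7.3 KB JSON message counts as 2 metered 4 KB messages
blocks = math.ceil(7500 / 4096)                   # = 2
sends_per_second = int(1000 // SENSOR_INTERVAL_MS)  # = 66
messages_per_strip = sends_per_second * blocks * pass_time_s  # ~3142
```

Small differences from the values in the text (1240 KB, 3142 messages) arise because the text rounds the speed to 21 m/s and the pass time to 23.8 s, while the script carries full precision.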


6

Remote Monitoring Solution

From Section 4, we know that a PfP system requires computation both at the edge and in the cloud. We also identified, in Section 4.2, some general services that can be offered in the cloud for PfP systems. In order to answer the second research question (RQ2) mentioned in Section 1.1, we will look into an existing solution by Microsoft Azure and determine how well it suits our requirements for PfP systems. We do this by first performing a cost analysis, from which we determine the limitations of this solution, propose some alternative architectural designs, and discuss the effect of cost on each.

The solution investigated for the cost analysis is the remote monitoring solution provided by Microsoft. It is an end-to-end solution for monitoring systems located globally, covering sending data from a device, monitoring it, and controlling it. It packages multiple Azure services, including IoT Hub, Event Hub, Stream Analytics, and storage services, as shown in Figure 10.

Figure 10: Figure depicting the components of remote monitoring solution [3]

To perform a cost analysis, we use the configuration mentioned in [30], shown below. It should be noted that this is not the only way to configure this solution; the configuration can be changed to better suit the needs of the application.

• Azure IoT Hub (1 unit)

• Azure Stream Analytics (3 streaming units)

• Azure DocumentDB (1 instance)

• Azure Storage

• Azure App Services (4 instances)

• Azure Event Hub (1 unit)


The following is a brief explanation of the role of each service in this solution provided by Microsoft Azure. More information on the solution can be found in [3].

Simulated Device

When the solution is deployed, it automatically sets up a simulated device that sends data to the cloud. This makes it possible to explore the behavior of the solution without connecting an actual device.
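A simulated device of this kind essentially generates telemetry payloads at a fixed interval. The sketch below illustrates the idea; the parameter names and value ranges are hypothetical and are not the ones used by Azure's own simulated device.

```python
import json
import random
import time

def make_telemetry(device_id: str) -> str:
    """Build one JSON telemetry payload with random (hypothetical) readings."""
    return json.dumps({
        "deviceId": device_id,
        "timestamp": time.time(),
        "temperature": round(random.uniform(20.0, 80.0), 2),
        "pressure": round(random.uniform(1.0, 5.0), 2),
    })

# One simulated reading, as the device would send it to the IoT hub.
message = make_telemetry("simulated-device-01")
```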

IoT Hub

This service ingests data from the device and makes it available to the Stream Analytics jobs. As can be seen in Figure 10, the IoT hub is also responsible for:

• Identity registry: This contains the IDs of all the devices connected to that IoT hub. The user of the application can also enable and disable devices through the identity registry.

• Device twins: A device twin is a JSON document that stores device-related information. It can be used by the device and back ends (such as a web application) to synchronize device conditions and configuration [31].

• Job scheduler: This can be used to schedule jobs that set properties on multiple devices or invoke methods on multiple devices. As an example, a user can change the telemetry frequency from 15 seconds to five seconds for multiple devices from the solution portal by creating a job.
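To make the device-twin concept concrete, the sketch below models a twin document with desired and reported properties, following the structure described in [31]. The device name, tag, and property names are hypothetical; the telemetry-interval property mirrors the job-scheduler example above.

```python
# A device twin as a plain dictionary: the back end writes "desired"
# properties, while the device writes back what it actually applied
# under "reported".
device_twin = {
    "deviceId": "stressometer-01",      # hypothetical device name
    "tags": {"plant": "Vasteras"},      # back-end-only metadata
    "properties": {
        "desired": {"telemetryIntervalSeconds": 5},    # set by the back end
        "reported": {"telemetryIntervalSeconds": 15},  # last value the device applied
    },
}

# A back end can detect that the device has not yet applied the change
# by comparing desired and reported values.
pending = (device_twin["properties"]["desired"]["telemetryIntervalSeconds"]
           != device_twin["properties"]["reported"]["telemetryIntervalSeconds"])
```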

Azure Stream Analytics

The Stream Analytics (ASA) jobs receive messages from the IoT hub and forward them to other services. As can be seen from Figure 10, there are three ASA jobs, each responsible for specific tasks based on the content of the messages.

• Job 1: Device Info: This job filters device information messages from the message stream coming from the IoT hub and sends them to an Event Hub endpoint. Thus only a subset of all the messages from the IoT hub (i.e. messages that relate specifically to device data) is sent to the event hub.

• Job 2: Threshold Rules: The query for this job takes any device data that breaks any of the rules defined in the blob. The job compares each value against the threshold set for the device and, if the value exceeds the threshold, outputs an alarm event. There are two outputs:

– The details of each alert are saved to the blob storage for historical analysis.

– The other output is sent to the rules event hub for clients to listen to and act upon.

• Job 3: Telemetry: This job takes the message stream from the IoT hub as input and stores all the telemetry messages as-is in the blob storage.
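The logic of the Threshold Rules job (Job 2) can be sketched as below. In the real solution this is a Stream Analytics query running in the cloud; the rule format and field names here are a simplification introduced for illustration.

```python
# Hypothetical per-device threshold rules, standing in for the rule blob.
rules = {"device-01": {"temperature": 75.0}}

def check_message(msg: dict) -> list:
    """Return one alarm event for each field that exceeds its threshold."""
    alarms = []
    device_rules = rules.get(msg["deviceId"], {})
    for field, threshold in device_rules.items():
        value = msg.get(field)
        if value is not None and value > threshold:
            alarms.append({"deviceId": msg["deviceId"],
                           "field": field,
                           "value": value,
                           "threshold": threshold})
    return alarms

# A reading above the threshold produces an alarm event; one below does not.
alarms = check_message({"deviceId": "device-01", "temperature": 80.2})
```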

Event Hub

This solution uses two event hubs, which serve as outputs for two of the ASA jobs (the Device Info job and the Threshold Rules job). The event hubs reliably forward data to the Event Processor Host running in the WebJob service.


Blob Storage

As discussed so far, the blob storage is used to store data of different types. It persists all the raw data coming from the device. The solution portal can read this data and display it graphically on the interface.

WebJobs

There is no additional cost for using WebJobs [32]; they simplify coding by providing common tasks. As can be seen from Figure 10, the Event Processor WebJob takes input from both event hubs and writes the result to the DocumentDB database.

Azure Web App

The solution portal is a Web App, a feature of Azure App Service. As shown in Figure 10, the key pages in the web application are the Dashboard and the Device List.

• Dashboard: This page visualizes the telemetry data sent from the devices by reading it from the blob storage.

• Device List: This page allows a user to perform different tasks, including setting the threshold for each device, sending commands from the cloud to a device, and provisioning a new device. A new device is provisioned by creating a unique device ID and generating an authentication key. This information is then written both to the IoT Hub identity registry and to the device registry in the DocumentDB database.
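The provisioning step above can be sketched as follows: create a unique device ID and a random authentication key. IoT Hub works with base64-encoded symmetric keys; the device-ID prefix and key length here are assumptions, and the actual registration calls to IoT Hub and DocumentDB are omitted.

```python
import base64
import secrets
import uuid

def provision_device(prefix: str = "stressometer") -> dict:
    """Create a unique device ID and a random base64-encoded auth key."""
    return {
        "deviceId": f"{prefix}-{uuid.uuid4().hex[:8]}",
        "authKey": base64.b64encode(secrets.token_bytes(32)).decode(),
    }

# This record would then be written both to the IoT Hub identity registry
# and to the device registry in the DocumentDB database.
record = provision_device()
```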

Azure DocumentDB

This holds the list of registered devices and their properties. Each device is a separate record in the database, and all information relating to the connected device, such as metadata and configuration, is stored here.

6.1 Cost Analysis

We performed a cost analysis to determine the cost of sending data from a single Stressometer to the cloud. The pricing information and the calculations are based on the Azure Pricing Calculator [33]. We made the following assumptions:

1. The data required for the quality index service mentioned in Section 5.1 is used during the cost analysis.

2. The Azure region used for the analysis is “North Europe”, which is located in Ireland.

3. The services run 24 hours per day.

4. An average month consists of 732 hours (the average of 720 hours and 744 hours).

5. All costs mentioned in this document are in Swedish Krona (SEK).

6. There is no edge computing; data is sent directly from the Stressometer to the cloud platform.

6.1.1 Total cost

Details of the cost of each individual service included in the solution can be found in Appendix A. Knowing how much each service costs, we can simply sum the costs to obtain the monthly cost of running the solution for a single Stressometer, as shown below:


No  Azure region        Location
1   East US             Virginia
2   West US             California
3   North Europe        Ireland
4   West Europe         Holland
5   Southeast Asia      Singapore
6   East Asia           Hong Kong
7   Japan East          Tokyo, Saitama
8   Japan West          Osaka
9   Australia East      New South Wales
10  Australia Southeast Victoria
11  Germany Central     Frankfurt
12  Germany Northeast   Magdeburg

Table 2: Provisioning of the remote monitoring solution in different Azure regions

• For 30 days, the cost is 7,526.38 SEK

• For 31 days, the cost is 7,763.88 SEK

Thus the average cost for a single Stressometer system is 7,645.13 SEK per month. If we consider that there are hundreds of Stressometer systems, offering them as PfP systems can become too costly. We therefore need to determine whether resources can be shared among different Stressometer systems, which has implications for the architecture.
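The monthly figures above, and the reason hundreds of systems become costly, can be illustrated with a short calculation. The linear scaling to N systems assumes no resource sharing between systems, which is exactly the assumption the architectural alternatives aim to relax.

```python
COST_30_DAYS_SEK = 7526.38   # monthly total for a 30-day month
COST_31_DAYS_SEK = 7763.88   # monthly total for a 31-day month

# Average monthly cost for a single Stressometer system.
average_monthly_cost = (COST_30_DAYS_SEK + COST_31_DAYS_SEK) / 2  # 7645.13

def fleet_cost(n_systems: int) -> float:
    """Monthly cost of n systems, assuming one full solution per system."""
    return n_systems * average_monthly_cost
```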

6.2 Implications

The following conclusions can be drawn from the remote monitoring architecture.

Limited provisioning of IoT suite

Azure services are currently available in 34 regions around the world [34]. However, these regions do not all provide every Azure service. Of particular interest to us is the remote monitoring solution, which Microsoft offers in far fewer regions, as can be seen in Table 2. This has the following implications for the architecture:

1. As mentioned previously, the Stressometer is used by customers in many different countries, including China, Japan, the Netherlands, Brazil, Spain and India. The solution would need to be customized so that all customers can remotely monitor their Stressometer irrespective of geographical location. Let us suppose we have a customer in India. Currently there are three Azure regions in India and none of them provides the IoT suite service. This leaves two different scenarios:

• Scenario 1: Connect to the IoT hub service in the West India Azure region and use the IoT suite service in the region closest to the customer, which would be either the Southeast Asia region (Singapore) or the East Asia region (Hong Kong). In this scenario, since the two services are located in different Azure regions, an outbound data transfer cost would apply (that is, the cost of transferring data from one region to another).

• Scenario 2: Send data directly to the Southeast Asia or East Asia region. In this case, we would have to evaluate whether real-time monitoring is still possible for that product, since a closer data center provides lower latency and higher throughput.

