LiU-ITN-TEK-A--20/058-SE

Integrating Data Distribution Service in an Existing Software Architecture: Evaluation of the performance with different Quality of Service configurations

Thesis work carried out in Datavetenskap at Tekniska högskolan at Linköpings universitet

Kyriakos Domanos
Norrköping, 2020-10-23

Department of Science and Technology, Linköping University, SE-601 74 Norrköping, Sweden
Institutionen för teknik och naturvetenskap, Linköpings universitet, 601 74 Norrköping

Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a considerable time from the date of publication, barring exceptional circumstances. The online availability of the document implies permanent permission for anyone to read, to download, to print out single copies for your own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility. According to intellectual property law, the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its home page: http://www.ep.liu.se/

© Kyriakos Domanos

Linköping University | Department of Computer and Information Science
Master's thesis, 30 ECTS | Datateknik
2020 | LIU-IDA/LITH-EX-A--20/001--SE

Integrating Data Distribution Service in an Existing Software Architecture: Evaluation of the performance with different Quality of Service configurations

Kyriakos Domanos
Supervisor: Qin-Zhong Ye
Examiner: Anna Lombardi

Linköpings universitet, SE-581 83 Linköping, +46 13 28 10 00, www.liu.se


Abstract

The Data Distribution Service (DDS) is a flexible, decentralized, peer-to-peer communication middleware. This thesis presents a performance analysis of using DDS in the Toyota Smartness platform, which is used in Toyota's Automated Guided Vehicles (AGVs). The purpose is to determine whether DDS is suitable for internal communication between modules that reside within the Smartness platform and for external communication between AGVs that are connected to the same network. An introduction to the main concepts of DDS and to the Toyota Smartness platform architecture is given, together with a presentation of earlier research on DDS. A number of different approaches for integrating DDS into the Smartness platform are explored, and a set of different configurations that DDS provides is evaluated. The tests that were performed in order to evaluate the usage of DDS are described in detail, and the results that were collected are presented, compared and discussed. The advantages and disadvantages of using DDS are listed, and some ideas for future work are proposed.

Acknowledgments

I would like to thank my examiner, Anna Lombardi, and my supervisor, Qin-Zhong Ye, at Linköping University for their valuable feedback and consistent help in improving this thesis. I would also like to thank my supervisor at Toyota Material Handling, Håkan Therén, for all the expertise, support and continuous feedback that he provided. Finally, I would like to thank Andreas Larsson, with whom I worked very closely and exchanged many ideas during the span of this thesis. I am very grateful to Toyota Material Handling and Patrick Blomqvist for giving me the opportunity to work on this very interesting project, which helped me develop both academically and professionally.

Contents

Abstract
Acknowledgments
List of Figures
List of Tables
Acronyms

1 Introduction
   1.1 Motivation
   1.2 Aim
   1.3 Research questions
   1.4 Delimitations
   1.5 Overview

2 Background
   2.1 Earlier Research
   2.2 AGV Platform
   2.3 Smartness
   2.4 Data Distribution Service
   2.5 Data-Centric Publisher Subscriber
   2.6 DDS Versions
   2.7 Transport Types
   2.8 Quality of Service
   2.9 Platooning

3 Design Decisions and Limitations
   3.1 Choice of Topics
   3.2 Dummy Interface
   3.3 Limitations

4 Implementations
   4.1 Existing Implementation
   4.2 Barebones
   4.3 DDS for External Communication
   4.4 DDS as Back Layer
   4.5 DDS as Back Layer using Ownership Policy
   4.6 DDS as Middle Layer

5 Test Methodology
   5.1 Execution Time
   5.2 Communication Performance
   5.3 Test Plan

6 Results and Discussion
   6.1 Execution Time
   6.2 Communication
   6.3 Discussion

7 Conclusion

Bibliography

List of Figures

2.1 Simple representation of how the fleet operates
2.2 Illustration of how the state machine operates
2.3 Single DCPS domain layer [9]
4.1 How the existing implementation works; black lines represent direct access and red lines represent external communication using TCP or CAN
4.2 How using DDS for external communication would look; blue lines represent DDS communication, black lines direct access and red lines external communication using TCP or CAN
4.3 Representation of how using DDS as a back layer would work; blue lines represent DDS communication and red lines external communication using TCP or CAN
4.4 How the state machine operates with DDS as back layer
4.5 Representation of how using DDS as a back layer using ownership would work; blue lines represent DDS communication and red lines external communication using TCP or CAN
4.6 How the state machine operates for DDS as back layer using ownership
4.7 How using DDS as a middle layer would work; blue lines represent DDS communication and black lines direct access
4.8 How the state machine operates with DDS as middle layer
6.1 Mean execution time of parse per number of interfaces
6.2 Median execution time of parse per number of interfaces
6.3 Top 99th percentile execution time of parse per number of interfaces
6.4 Mean execution time of decision maker per number of interfaces
6.5 Median execution time of decision maker per number of interfaces
6.6 Top 99th percentile execution time of decision maker per number of interfaces
6.7 Raw execution time measurements of the Decision Maker using the back layer with RTPS UDP and 0 dummy interfaces
6.8 Raw execution time measurements of the Decision Maker using the back layer with RTPS UDP and 2 dummy interfaces
6.9 Mean execution time of send per number of interfaces
6.10 Median execution time of send per number of interfaces
6.11 Top 99th percentile execution time of send per number of interfaces
6.12 Mean execution time of all states combined per number of interfaces
6.13 Median execution time of all states combined per number of interfaces
6.14 Top 99th percentile execution time of all states combined per number of interfaces
6.15 Mean Latency per number of interfaces
6.16 Max Latency per number of interfaces
6.17 Mean Latency per number of interfaces
6.18 Max Latency per number of interfaces
6.19 Mean Latency per number of interfaces
6.20 Max Latency per number of interfaces
6.21 Mean Latency per number of interfaces
6.22 Max Latency per number of interfaces
6.23 Mean Latency per number of interfaces in two different machines
6.24 Max Latency per number of interfaces in two different machines
6.25 Standard Deviation of Latency of implementations running on the same host with different transport types
6.26 Standard Deviation of Latency of implementations running on the same host with different QoS policies
6.27 Standard Deviation of Latency of implementations running on different hosts with different QoS policies

List of Tables

6.1 Total Loss over 1600000 samples using different Transport configurations
6.2 Total Loss over 1600000 samples using different QoS policies
6.3 Total Loss over 1600000 samples using different QoS policies on two different host machines
6.4 Standard Deviation of Latency

Acronyms

AGV Automated Guided Vehicle
AMR Autonomous Mobile Robot
DCPS Data-Centric Publish-Subscribe
DDS Data Distribution Service
FMS Fleet Management System
IP Internet Protocol
LIDAR Light Detection And Ranging
OMG Object Management Group
QoS Quality of Service
RTPS Real-Time Publish-Subscribe
SFC Sensor Fusion Computer
SGV Smartness Guided Vehicle
SLAM Simultaneous Localization And Mapping
TCP Transmission Control Protocol
TMHMS Toyota Material Handling Mjölby Sweden
UDP User Datagram Protocol

1 Introduction

1.1 Motivation

Toyota Material Handling has been developing autonomous products for almost a decade. As the autonomous trucks have evolved over time, the complexity and performance of the systems have also increased. Today's autonomous systems are packed with a multitude of sensors, such as lidars and cameras. One of the major problems today is how to share the huge amount of gathered data, both internally (between modules that run in the same system) and externally between different vehicles, interfaces, control systems and governing systems in the environment. There is often a real-time constraint, which puts considerable stress on the system. One increasingly popular networking middleware that has caught the attention of Toyota is the Data Distribution Service (DDS). Toyota Material Handling is therefore very interested in investigating the opportunities that DDS holds for its products.

1.2 Aim

The aim of this thesis project is to investigate the possibility of integrating DDS into the "Toyota Smartness" software platform and to utilize some of the functional attributes that DDS exposes, such as QoS and real-time data exchange. As part of this investigation, we analyze the architecture of the "Toyota Smartness" software platform and try to detect potential architectural changes that would allow the best utilization of DDS. The final goal is to run various performance tests on TMHMS' Automated Guided Vehicles (AGVs) and perform a comparative analysis in order to evaluate which architectural changes are the most suitable and what the advantages and disadvantages of using DDS in this environment are.

1.3 Research questions

1. Is it possible to integrate DDS in the Smartness software platform?
2. What architectural changes can be used for utilizing DDS?
3. What are the main advantages and disadvantages of using DDS in TMH's AGVs?

1.4 Delimitations

The following delimitations have been set due to various constraints, such as time and hardware availability:

• Only one of the available DDS implementations (OpenDDS) will be used for the integration and testing.
• No AGV will be used for testing. The integration and testing will be done on a bench machine with the same computer as the AGV.
• Not all interfaces will be converted to the implementations that will be tested.

1.5 Overview

The current chapter is an introduction to this project, intended to help the reader understand the aim, motivation and limitations behind this thesis. Chapter 2 gives a general description of the technologies involved in this project, with the intention of familiarizing the reader with the main concepts and terminology that are used. Chapter 3 describes some of the design decisions that were taken and the justification behind them. Chapter 4 illustrates the main architectures that were implemented during this work and how DDS is used in each of them. Chapter 5 gives a thorough description of the methods used for testing and evaluating the results of this project. Chapter 6 presents and elaborates on the results that were collected and evaluates the advantages and disadvantages of using DDS and of the method that was used. Chapter 7 concludes this project and provides a summary of the thesis, together with suggestions for future work.

2 Background

2.1 Earlier Research

Enabling Data Sharing with DDS on Real-time Constrained Industrial Robots

Earlier research [1] on the performance of DDS was done on a similar platform by Sergio Erick Vieyra Enríquez at Mälardalen University. This project looked into using DDS for communication over Ethernet between ABB robots and a centralized computer that computes optimal movement of the robots' arms in real time. The main objective of this research was testing the latency and bit rate of the communication. The project found that while DDS was a good platform, a lot of latency appeared as a result of integrating DDS into the existing software architecture. Another finding was that the latency of each package did not increase significantly when the package size was increased.

An Optimized, Data Distribution Service-Based Solution for Reliable Data Exchange Among Autonomous Underwater Vehicles

In this paper [2], the authors discussed and evaluated the possibilities of implementing DDS in Autonomous Underwater Vehicles (AUVs) for communication both between AUVs and from AUVs to Command and Control Stations (CCS). The paper mostly focused on the implementation, as well as on potential problems with communication underwater, such as reliability and bandwidth constraints. The results show that two systems, one on a boat and one inside an AUV, could communicate with each other with a latency between 2 and 86 ms and a median of 7 ms. The paper concludes that "it [middleware solution such as DDS] shows promise regarding the integration of the final set of autonomous maritime vehicles that are going to be included in SWARMs" [2].

A New Location-Based Services Framework for Connected Vehicles Based on the Publish-Subscribe Communication Paradigm

In this article [3], the authors evaluate the communication performance (latency and loss) of two different protocols, the Data Distribution Service (DDS) and Message Queue Telemetry Transport (MQTT), for Location-Based Services (LBS) communication between devices. The tests are performed using OpenDDS and Mosquitto MQTT with various setup configurations and in a mobile network environment. The results indicate that MQTT slightly outperformed DDS, but overall the findings were promising for both. In the case of DDS, the loss varied a lot based on the different settings that were tested (size of published data, frequency of publishing, number of publishers/subscribers). The authors concluded that this loss is mainly due to the poor reception of 4G and not the protocols themselves.

Integrating Data Distribution Service in an existing software architecture: Exploring the effects of performance in scaling

Contemporary research on this subject was done by Andreas Larsson, on behalf of Toyota Material Handling [4]. Although that research shares a lot in common with this thesis, its focus was placed primarily on the hardware performance and on the overhead that the DDS integration puts on the CPU and RAM. The results indicate that although the memory footprint of DDS is rather trivial, the CPU overhead is significant compared with the current version of Toyota Smartness.

2.2 AGV Platform

The TMHMS AGV platform that will be used during this project is known as AMR and operates as part of a fleet. Each AGV in the fleet consists of multiple components, including motors, sensors and multiple computers of different types and uses. The two main components of interest for this thesis are the SFC and the AirWorks AWK-1137C network bridge, commonly referred to as the Moxa. A network diagram of how a theoretical fleet operates can be seen in Figure 2.1.

Figure 2.1: Simple representation of how the fleet operates; each AGV's SFC is connected through a Moxa bridge to a central router.

The Sensor Fusion Computer (SFC) is the brain of the AGV. It runs Linux with an RTOS patch and does most of the calculations on the AGV, including SLAM and pathing, as well as running the Smartness application, which will be explained in Section 2.3. Since the SFC itself lacks any wireless capabilities, a network bridge (the Moxa) is used to connect the SFC to a central router.

2.3 Smartness

Smartness is a software platform created by TMHMS and is used as a middle layer between sensors, navigation software and motor controllers. Smartness also does some processing in between to make sure the AGVs operate safely and follow the rules and regulations that exist. Smartness was created with the goal of being a generic platform that can be used by different types of AGVs, each with their own set of sensors and motor controllers, with minimal changes to the core code. This is done by keeping the communication with sensors, motor controllers and load handling devices separate from the central platform, meaning that in theory only those interfaces, together with some configuration files, would need to be changed when using Smartness on a different AGV platform.

Interface

An interface is a module inside Smartness that is used to communicate with other applications, sensors, etc. The interfaces can communicate both internally in the machine (e.g. with SLAM software, sensors and motor controllers) and externally with software that runs, for example, on a centralized server that communicates with multiple AGVs at the same time. Each interface contains one parse function and one send function. These functions handle the data that are sent from or received by Smartness. They are generally very simple and are only meant to either create and send outgoing messages from data stored in Smartness or to parse received messages into Smartness. Little to no pre- or post-processing should occur inside them.

SGV Dataset

The SGV dataset is a group of datasets containing all information about the AGV, including, but not limited to, position, speed, mechanical specifications and planned path. The datasets are created at startup, and currently most modules have direct access to them, as this is the most prominent way of transporting information inside Smartness.

Source Selector

The Source Selector is used to decide which interfaces are allowed to write to the SGV dataset, as more than one interface can receive data of the same type. For instance, drive commands can be sent either from manual input or from a fleet management system. The Source Selector chooses which interfaces are allowed to send their data to the SGV dataset by comparing the active interfaces to a priority list that can be changed dynamically.

Decision Maker

The Decision Maker performs a sanity check on the SGV dataset to make sure that it follows the current safety requirements, and it also checks for any warnings and errors and acts accordingly.

Application Supervisor

The Application Supervisor handles and acts upon all vehicle behaviours, such as drive, steer and lift.

State Machine

A state machine, or finite state machine, is a computational model that can be used to create sequential logic; this is done by having a finite number of states with rules for transitioning between them. State machines are used in many different applications, including software, hardware and mathematics [5]. The state machine used in Smartness is a simplified state machine with five states, where the transitions between the states are sequential and based on elapsed time rather than events. As illustrated in Figure 2.2, the state machine works by having all modules wait for their respective time slots during the state machine's cycle.

Figure 2.2: Illustration of how the state machine operates.

During runtime, the state machine consists of 5 separate states: Parse, Source Selector, Decision Maker, Application Supervisor and Send. The transition between states is done on a set timer, and each state is assigned an equal amount of time, i.e. one fifth of the total cycle time. If the content of a state has not finished by the time a state transition occurs, it will continue until it is finished, but the functions of the new state will run simultaneously. This can lead to a lot of problems; for example, if the Decision Maker is not finished by the time the Send state starts, the interfaces will receive old data. Similar problems can occur in the Parse to Source Selector transition and the Source Selector to Decision Maker transition. It is therefore extremely important that the state machine is followed flawlessly, i.e. the execution time of the functions in each state should not exceed the time assigned to that state.

2.4 Data Distribution Service

DDS is a communication middleware used for peer-to-peer communication over various transport types [6], which are described in more detail in Section 2.7. It can be used to link multiple processes together in a publisher/subscriber relationship. Unlike a server/client relationship, as in basic TCP, DDS can operate with multiple publishers and subscribers together, allowing for a much more seamless integration between multiple modules. Another advantage of DDS is its ability to use RTPS discovery, which allows the system to automatically handle any new connections to an existing network, as well as re-connections.

2.5 Data-Centric Publisher Subscriber

The DCPS (Data-Centric Publish-Subscribe) layer [7] is responsible for sharing the data between publishers and subscribers, and it consists of one or more domains, each defined by a unique domain ID [8]. Figure 2.3 illustrates how the communication between publishers and subscribers is achieved in a single-domain DCPS layer.

Figure 2.3: Single DCPS domain layer [9]

Domain

As previously mentioned, each domain is identified by a unique domain ID, and it is the main partitioning unit within the DCPS layer. Each of the other DDS entities that we will see below (publishers, subscribers, topics) belongs to a domain and can only interact with other entities that exist in the same domain [9].

Topic

Topics are simple structs or unions that can contain one or more variables, and they are what is sent within the domain from publishers to subscribers. Each data reader and data writer is linked to a single topic, and all data readers and data writers within a domain that share the same topic are linked together. Each topic can contain zero or more keys; each unique key value creates a separate instance of the topic and can be used to differentiate instances [10]. For example, taking the stock market as a reference, if a topic is a corporate stock, then a key could contain the name of the company, and that would create a separate instance for each stock.
If a topic has no keys, all data samples are treated as belonging to a single instance of that topic. When using the default quality of service (QoS), whenever a new message is sent on an existing instance, the older data of that instance is discarded.

Publisher

Each publisher belongs to a specific domain and can publish data on a topic that exists within this domain.

Subscriber

Each subscriber belongs to a specific domain and can subscribe to a topic that exists within this domain.

Data Reader/Writer

Data writers are used to write a topic to the publisher, and data readers read a topic from a subscriber; each data reader/writer is linked to a single topic. Multiple data readers/writers can exist for the same publisher/subscriber, even for the same topic, but each data reader/writer can only be linked to one publisher/subscriber.

2.6 DDS Versions

DDS is an open standard, and there are many implementations of DDS created by different companies and organizations [11]. All the different DDS implementations follow the specification developed by the OMG (Object Management Group); however, they all have some additional features and implementations unique to them. Choosing which version to use therefore comes down to what functionality is needed and how much one is willing to pay. For this thesis, the OpenDDS implementation was selected, because it is free, open source, has a well-documented and easy-to-use API, and has an active development community that has been very helpful in answering questions, giving explanations and participating in discussions around DDS.

2.7 Transport Types

Each data writer and data reader uses a transport type (tcp, udp, multicast, shared memory or rtps_udp) and can customize the configuration parameters defined for that transport type via configuration files or through programming APIs. This section describes the transport types used in this project.

Shared Memory

The shared memory transport can only provide communication between instances that reside on the same host, and it is supported on Unix-like platforms with Portable Operating System Interface (POSIX) shared memory, as well as on Windows platforms [9]. It does not support reliable communication, and it allows the DDS entities to communicate using a shared region of memory.

Transmission Control Protocol

The Transmission Control Protocol (TCP) is an end-to-end protocol implemented in the transport layer. It is commonly used in client-server architectures, and it is cross-platform, ordered and reliable.
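As an illustration of the configuration-file approach mentioned at the start of this section, the following is a minimal sketch of how a transport could be selected in an OpenDDS configuration file. The configuration and instance names (my_config, my_tcp) are hypothetical, not taken from the thesis setup.

    [common]
    DCPSGlobalTransportConfig=my_config

    [config/my_config]
    transports=my_tcp

    [transport/my_tcp]
    transport_type=tcp

A file like this would typically be passed to the application on the command line with the -DCPSConfigFile option, so a different transport_type (e.g. shmem or rtps_udp) can be selected without recompiling.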

RTPS_UDP

RTPS over UDP is necessary for interoperable communication between the different DDS implementations, as specified in the Real-time Publish-Subscribe Wire Protocol DDS Interoperability Wire Protocol Specification (DDSI-RTPS) [12]. It can support reliable communication if this is specified in the Quality of Service policies described in Section 2.8, and by default it uses multicast in the transport layer.

2.8 Quality of Service

The Quality of Service (QoS) is a set of policies described in the DDS specification. These policies are used for configuring the data transmission and the behavior of the different DDS entities that were shown in Section 2.5 [9]. In this project we examine some of these QoS policies and try to understand and evaluate them. Most notably, we use the following QoS policies:

Reliability

A QoS policy that can be applied to topic, data writer and data reader entities and essentially controls how data samples are treated [9]. The two options available are Best Effort and Reliable. As implied by their names, if Reliable is selected then a data sample is resent if it is not delivered, whereas when Best Effort is selected there is no guarantee that a data sample will be delivered.

History

A QoS policy that can be applied to topic, data reader and data writer entities and determines how the data samples in a particular instance are held. It is possible to specify whether an instance should keep all the data samples or only the last x samples. When "keep all" is specified, the data are kept in the writer until the publisher retrieves them, and similarly the data are kept in the reader until they are "taken" by the reader. If "keep last x" is specified, then when a new (x+1)th data sample is sent, the oldest data sample is discarded [9].

Durability

A QoS policy that can be applied to topic, data reader and data writer entities and determines whether a data writer should maintain data samples that have already been sent to the subscribers, and for how long these data samples should be kept. There are four available options:

• Volatile: The data samples are discarded after being sent.
• Transient local: The data samples are kept as long as the writer is alive.
• Transient: The data samples are kept (in memory) as long as the process is alive.
• Persistent: The data samples are kept (in permanent storage) even after the process is stopped.

When transient or persistent durability is used, the durability_service QoS policy can be used in combination with it. The durability_service QoS policy can specify for how long and how many samples should be kept [9].
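To make the policies above concrete, the following is a minimal sketch of how Reliability, History and Durability might be set on a data writer through the standard DDS C++ API used by OpenDDS. The publisher and topic variables are assumed to have been created beforehand, and the chosen values are illustrative rather than the configuration used in this thesis.

    // Sketch: configuring writer-side QoS (assumes the usual OpenDDS
    // headers are included and 'publisher'/'topic' already exist).
    DDS::DataWriterQos qos;
    publisher->get_default_datawriter_qos(qos);

    // Reliability: resend samples that were not delivered.
    qos.reliability.kind = DDS::RELIABLE_RELIABILITY_QOS;

    // History: keep only the last 5 samples per instance.
    qos.history.kind = DDS::KEEP_LAST_HISTORY_QOS;
    qos.history.depth = 5;

    // Durability: keep already-written samples for late-joining
    // readers for as long as this writer is alive.
    qos.durability.kind = DDS::TRANSIENT_LOCAL_DURABILITY_QOS;

    DDS::DataWriter_var writer = publisher->create_datawriter(
        topic, qos, 0, OpenDDS::DCPS::DEFAULT_STATUS_MASK);

Note that a reader only matches a writer if its own requested QoS is compatible with what the writer offers, so the corresponding DataReaderQos must be set accordingly.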

Ownership

A QoS policy that controls whether a data-object is exclusive to one data writer or shared between various data writers. It can be used in conjunction with the Ownership Strength QoS policy, which determines which writer has priority to get exclusive access to a given data-object. Other factors may also influence the ownership of a data-object, such as Liveliness and Deadline; in both cases, the owner is the "alive" writer with the highest OwnershipStrengthQosPolicy value [9].

2.9 Platooning

One of the potential uses of DDS is to improve the platooning capabilities of the AGV fleet that Toyota is developing. Currently, the limiting factor for the throughput of AGVs is the safety distance that needs to be kept between AGVs. This problem exists because the AGVs' sensors have no way of differentiating another moving AGV from a static object. If the AGVs had the ability to know the location, speed and status of other vehicles in real time, they would not have to rely exclusively on sensors in order to keep a distance from each other.
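To tie the keyed-topic idea from Section 2.5 to this use case, a hypothetical topic type for sharing vehicle state between AGVs could look as follows. In OpenDDS the type would be defined in IDL, with the key declared via a DDS_DATA_KEY pragma; it is shown here as the equivalent C++ struct, and all names are illustrative rather than taken from the thesis code.

    // Hypothetical topic type for platooning-style data sharing.
    // In OpenDDS IDL the key would be declared with:
    //   #pragma DDS_DATA_KEY "Fleet::VehicleState vehicle_id"
    // so that each AGV's samples form a separate instance of the topic.
    struct VehicleState {
      long vehicle_id;  // key: one instance per AGV in the fleet
      double x;         // position in the shared map frame (m)
      double y;
      double heading;   // orientation (rad)
      double speed;     // current speed (m/s)
      long status;      // e.g. idle, driving, error
    };

With such a topic, every AGV would publish under its own vehicle_id and subscribe to the instances of the others, giving each vehicle a real-time view of the fleet without any point-to-point connections.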

3 Design Decisions and Limitations

3.1 Choice of Topics

During the first stages of this project, it was important to set up a plan for how the DDS topics that were going to be used in Smartness should be built. The groups of topics that were created for this project can be split into the following three distinct categories:

• Interface -> Decision Maker: Topics used for communication from the interfaces to the Decision Maker. These topics match what is already sent in the original implementation, in order to best mimic the existing functionality.
• Decision Maker -> Interface: As the interfaces do not need all the AGV data, these topics are created with the goal of limiting the information that each interface has access to, and they only include the data that are relevant to each interface.
• External: An external interface is an interface used for communication between AGVs. A use case for such interfaces is the sharing of important data, which could also be utilized in platooning formation, as mentioned in Section 2.9. The focus for these topics is to contain as much relevant information as possible, but since these topics will be sent over WiFi to other vehicles, the network infrastructure needs to be taken into account.

3.2 Dummy Interface

As it would be extremely time consuming to convert all existing interfaces to use DDS, dummy interfaces were created for all the different implementations. These dummy interfaces were kept as simple as possible, in order to isolate the effects of using DDS in the different implementations. How a dummy interface operates will be explained in more detail in Chapter 4. The topics created for the dummy interfaces followed the principles mentioned in Section 3.1 and were made to mimic the needs of the existing interfaces. The maximum number of dummy interfaces that was tested was 40. This number was deemed sufficient by TMHMS, as it is considerably higher than the actual number of interfaces that Smartness is currently using, and at the same time small enough to keep the test execution time reasonable.
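As a rough sketch of the pattern described above, a dummy interface reduces to the parse and send functions from Section 2.3 wrapped around one DDS writer and one DDS reader. All type and member names below are hypothetical (the topic type is assumed to be generated from IDL), and the DDS setup code and includes are omitted.

    // Hypothetical dummy interface: publishes dummy data towards the
    // Decision Maker during Parse and takes the data coming back during
    // Send. Real interfaces would additionally talk to hardware.
    class DummyInterface {
    public:
      void parse() {
        InterfaceData sample;                    // IDL-generated topic type
        sample.interface_name = name_.c_str();   // identifies this interface
        writer_->write(sample, DDS::HANDLE_NIL);
      }

      void send() {
        InterfaceData sample;
        DDS::SampleInfo info;
        DDS::ReturnCode_t rc = reader_->take_next_sample(sample, info);
        (void)rc;  // a real interface would check for DDS::RETCODE_OK
      }

    private:
      std::string name_;
      InterfaceDataDataWriter_var writer_;   // created during DDS setup
      InterfaceDataDataReader_var reader_;
    };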

3.3 Limitations

During this project, a set of limitations was imposed by the way the Smartness platform currently operates and by the limited scope of this project. The most important of them are presented in this section.

Priority Through Messages

To make sure that the priority of each interface is correct and cannot suffer from desynchronization between threads, a limitation was set that the priority of an interface cannot be sent through DDS messages; neither can interfaces be completely disabled. This is because if the interface with the highest priority is disabled or stops working, the interface with the second highest priority needs to take over immediately.

Follow the State Machine

To make sure that all communication still happens in the correct order, all modules must follow the state machine that was described in Section 2.3. The state machine cycle used in the current Smartness version is 20 ms, which means that each state is allocated 4 ms in every cycle. In the first steps of this project it became obvious that this cycle time would not be enough for some of the states to operate using DDS, so the cycle time was increased to 100 ms, i.e. 20 ms per state.

One-Way Latency Measurement between SFCs

When evaluating the communication between SFCs, the latency was measured by publishing a data sample on one SFC and receiving it on a second SFC. A more accurate way to measure the latency would be to measure the Round Trip Time (RTT), as it is not possible to synchronize the two computer clocks at a nanosecond level. However, given that the main goal of this project was to evaluate the impact of the different QoS policies, the absolute accuracy of the latency was not a priority.

AirWorks AWK-1137C

Toyota Material Handling is using the AWK-1137C (MOXA) industrial Ethernet-to-WiFi bridge as a way to connect its AGVs/SFCs to a WiFi network. A limitation that became obvious during the first steps of our integration testing is that the way TMH is using these devices (in Client-Router mode) is not suitable for DDS communication. The problem is that in Client-Router mode the AWK-1137C separates the WiFi and Ethernet interfaces into two different subnets, without support for forwarding multicast packets. A workaround for this limitation was to switch the operation mode from Client-Router to Client.

4 Implementations

The main goal of this project is to investigate how DDS can best be incorporated into the existing Smartness platform. A number of different approaches and configurations were tested in order to determine which would be most suitable for the Toyota Smartness platform. In this chapter, the most important ones are depicted and analyzed. It is important to clarify that, for simplicity, the figures below do not contain all the interfaces that exist in the Toyota Smartness platform and depict only the modules and states that are relevant to this project.

4.1 Existing Implementation

A visualization of the existing implementation can be seen in Figure 4.1. In this implementation, all interfaces have direct access to the SGV dataset. The Source Selector needs to look through all the interfaces in each cycle in order to check which one has the highest priority, and then it needs to save the data stored in that interface into the SGV dataset. This ensures that the data from the correct interface is retrieved. To do this, the Source Selector needs direct access to all interfaces, and as such, all interfaces must be created inside Smartness.

Figure 4.1: How the existing implementation works; black lines represent direct access and red lines represent external communication using TCP or CAN.

Dummy Interface

The dummy interface created for the existing implementation operates as follows:
1. Read from the dataset during the Send state.
2. Write to a local instance of the dataset during the Parse state.

4.2 Barebones

To get a better understanding of the effects of the different implementations, a barebones implementation was created. It operates similarly to the current implementation, except that its dummy interfaces do not access the SGV dataset. As such, the measured execution time of the Parse and Send states consists only of the time it takes to run the execution time logger. This means that the difference in execution time between any implementation and this one can be considered the actual time it takes to run the code inside the interfaces of that implementation.

4.3 DDS for External Communication

This implementation is used primarily for evaluating the external communication performance of DDS. The main addition to Smartness is that the interfaces now communicate externally between AGVs using DDS. An example of this implementation can be seen in Figure 4.2. This implementation would allow data sharing between AGVs and could potentially be used to enhance the platooning capabilities of the SGVs, as described in Section 2.9.

Figure 4.2: How using DDS for external communication would look. The blue lines represent the DDS communication, the black lines represent direct access and the red lines represent external communication using TCP or CAN.

4.4 DDS as Back Layer

This implementation completely removes all interface access to the SGV dataset and allows the interfaces to be completely separate from the rest of the Smartness platform. The following four steps describe the main changes that take place:

1. The Source Selector logic becomes part of the Decision Maker, so it can parse the Smartness configuration files and pass the interface with the highest priority to the Decision Maker.
2. The Decision Maker then reads the DDS data and uses the data from the correct interface. The selection is done based on the interface name.

3. The Decision Maker, once done, writes to topics that are then read by the interfaces. This allows the system to limit the amount of information each interface has access to.
4. DDS readers are added to the interfaces to receive topics from the Decision Maker.

Representations of how this works can be seen in Figures 4.3 and 4.4.

Figure 4.3: Representation of how using DDS as a back layer would work; blue lines represent DDS communication and red lines represent external communication using TCP or CAN.

Figure 4.4: How the state machine operates with DDS as back layer.

4.5 DDS as Back Layer using Ownership Policy

This implementation works similarly to the back layer version, the main difference being the usage of the Ownership QoS policy. As mentioned in Section 2.8, the Ownership QoS policy can be used to allow only one writer to write to a given topic. That writer is determined by the highest ownership strength defined in the writers' QoS policies, which can be dynamically modified. The strength of each writer is based on the interface priority configuration in the Smartness platform. When a writer becomes inactive, the writer with the next highest strength takes over the ownership of the topic. The main advantage of this implementation is that the Source Selector logic can be removed from the Decision Maker, and a simple DDS layer can parse any received topics directly. Another major advantage is that, similar to the "DDS as back layer" implementation, the interfaces are completely detached from the Smartness core and can even run on a separate machine.

Figure 4.5: Representation of how using DDS as a back layer using ownership would work; blue lines represent DDS communication and red lines represent external communication using TCP or CAN.

Figure 4.6: How the state machine operates for DDS as back layer using ownership.

Dummy Interface

The dummy interface created for this implementation operates similarly to the back layer's dummy interface, the only difference being the addition of the Ownership QoS to the writer of each dummy interface.

4.6 DDS as Middle Layer

This implementation will be briefly described, but will not be evaluated further in the next chapters; a thorough evaluation of it has been done by Andreas Larsson [4]. A DDS communication layer is added between the interfaces and the Source Selector. The Source Selector serves a similar role in this implementation as it did in the current version of Smartness, but it now connects to the interfaces through DDS, as shown in Figures 4.7 and 4.8. This adds an additional layer of abstraction, as the Source Selector no longer needs direct access to each interface. As the Source Selector still needs to check the priority of each interface, the topics sent from the interfaces need to contain some form of identification, which can then be cross-referenced with the priority list.

Figure 4.7: How using DDS as a middle layer would work. The blue lines represent the DDS communication and the black lines represent direct access.

Figure 4.8: How the state machine operates with DDS as middle layer.

Dummy Interface

The dummy interface created for this implementation operates as follows:
1. Read from the dataset during the Send state.
2. Write to a DDS topic that is sent to the Source Selector during the Parse state.
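To illustrate how the Ownership-based variant of Section 4.5 might be configured, the following is a minimal OpenDDS C++ sketch that sets exclusive ownership and an ownership strength on a writer. The interface_priority value (read from the Smartness configuration) and the publisher/topic variables are assumed to exist; the snippet is illustrative, not the thesis code.

    // Sketch: exclusive ownership means readers only see samples from
    // the strongest currently-alive writer of each topic instance.
    DDS::DataWriterQos qos;
    publisher->get_default_datawriter_qos(qos);

    qos.ownership.kind = DDS::EXCLUSIVE_OWNERSHIP_QOS;
    qos.ownership_strength.value = interface_priority;  // from config

    DDS::DataWriter_var writer = publisher->create_datawriter(
        topic, qos, 0, OpenDDS::DCPS::DEFAULT_STATUS_MASK);

The readers must request the same EXCLUSIVE ownership kind for the arbitration to take place. If the owning writer stops publishing, the middleware automatically fails over to the writer with the next highest strength, which is exactly the Source Selector behavior that this implementation replaces.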

5 Test Methodology

In order to evaluate the reliability and speed of the different implementations described in Chapter 4, a set of tests was performed. These tests, listed below, were performed with an emphasis on cases similar to how the final product will be used. A strong focus was placed on testing whether the Toyota Smartness platform can cope with the hard real-time requirements imposed by the state machine while using DDS for communication between the different modules, as seen in Chapter 4.

5.1 Execution Time

Measurements were performed to evaluate the execution time of the different modules of Smartness that were affected by implementing DDS. These included:

• Interface Parse
• Decision Maker
• Interface Send

All of the above states have a run function that waits for the correct state transition, performs a set of actions and then waits again. It was therefore easy to measure the execution time of these states by taking the internal time difference between the state transition and the end of the execution of the run function. These measurements were saved locally on the machine, and a Python script was then used to visualize the data. The goal of these tests was to detect potential improvements and problems created by the different implementations when it came to following the state machine in Figure 2.2.

5.2 Communication Performance

These tests were focused on measuring the performance of the communication between readers and writers, internally and externally. The focus was put on:

• Latency

• Loss

For the external tests, two SFCs connected to two MOXA bridges were used. The MOXA devices communicated via a Netgear CG3700EMR router.

5.3 Test Plan

Execution Time

The main approach to conducting the execution time tests for each implementation was as follows:
1. Start Smartness with N dummy interfaces active.
2. Run for x minutes and log information.
3. Stop the execution of Smartness.
4. Increase N by 1 and return to step 1 until N > 40.

Internal Communication Performance

The main approach to conducting the communication performance tests for each implementation (using one machine) was as follows:
1. Start Smartness with N dummy interfaces active.
2. Let each active interface publish 2000 times.
3. Log the statistics information that the reader collects.
4. Stop the execution of Smartness.
5. Increase N by 1 and return to step 1 until N > 40.

External Communication Performance

The main approach to conducting the communication performance tests for each implementation (using two machines) was as follows:
1. Start Smartness with N dummy interfaces active on machine 1.
2. Start Smartness with a reader on machine 2.
3. Let each active interface publish 2000 times.
4. Log the statistics information that the reader collects.
5. Stop the execution of Smartness on both machines.
6. Increase N by 1 and return to step 1 until N > 40.

This gave a good idea of how the different implementations operated with an increasing number of interfaces.
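As a sketch of the timing approach from Section 5.1, the measurement amounts to timestamping the state transition and the end of the state's run function and logging the difference for later visualization. The function and variable names are hypothetical stand-ins for the Smartness code.

    #include <chrono>
    #include <fstream>

    // Hypothetical per-state execution time logging: measures from the
    // state transition to the end of the state's work and appends one
    // raw measurement per cycle to a local file.
    void run_state_once(void (*state_work)(), std::ofstream& log) {
      const auto start = std::chrono::steady_clock::now();  // transition
      state_work();                                          // e.g. Parse
      const auto end = std::chrono::steady_clock::now();

      const auto us =
          std::chrono::duration_cast<std::chrono::microseconds>(end - start);
      log << us.count() << '\n';
    }

The logged columns of raw values are what the Python script then aggregates into the mean, median and 99th percentile figures presented in Chapter 6.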

6 Results and Discussion

This chapter illustrates the results that were collected during the testing of the different implementations. Section 6.1 presents how the execution time of the main Smartness states (Parse, Decision Maker and Send) is affected by each implementation, and Section 6.2 presents how the communication (loss and latency) is affected by different transport types and QoS policies.

6.1 Execution Time

In total, 7 different implementations were tested. In all of them (except the current Smartness version) the communication is done directly between the interfaces and the Decision Maker, thus omitting the Source Selector state:

1. Current Smartness Version: The current implementation of Smartness.

2. Back Layer with RTPS UDP: An implementation using DDS to receive and send data between the interfaces and the Decision Maker, using RTPS UDP as the transport protocol. The logic of the Source Selector has been ported to the Decision Maker.

3. Back Layer with Shared Memory: As implementation 2, but using Shared Memory as the transport protocol.

4. Back Layer with TCP: As implementation 2, but using TCP as the transport protocol.

5. Back Layer with RTPS UDP, using Ownership QoS: An implementation using DDS to receive and send data between the interfaces and the Decision Maker, using RTPS UDP as the transport protocol. The logic of the Source Selector has been replaced with the Ownership QoS policy, which allows the interface with the highest priority/strength to get the ownership of a given topic. The priority/strength of a given interface is determined similarly to how the Source Selector does it, i.e. by reading the Toyota Smartness configuration.

6. Back Layer with Shared Memory, using Ownership QoS: As implementation 5, but using Shared Memory as the transport protocol.

7. Back Layer with TCP, using Ownership QoS: As implementation 5, but using TCP as the transport protocol.

Each implementation was executed with 0-40 dummy interfaces for 60 seconds per configuration, and the execution time of each state was continuously measured and saved locally on the SFC, as described in Section 5.1. The collected results were visualized using Python and the Matplotlib library. Four different types of measurement were collected from each run with a set number of dummy interfaces:

• A list of all the measured execution times during the 60-second run. This was very useful for showing how the execution time changed with the number of interfaces and for detecting patterns in the measurements.

• The median value of the list. This gave a very stable result, as it removed any spikes in the data.

• The mean value of the list, calculated for the same reason as the median. Both the median and the mean have their own advantages and disadvantages, but in general the two should show fairly similar results unless the data are skewed in some way.

• The 99th percentile value, calculated to show the expected upper bound of the execution time for each implementation and state.

It is important to note that, according to the Barebones implementation (described in Section 4.2), there is an overhead for the execution time logging, estimated at between 0.1 ms (when 1 writer/reader is used) and 0.5 ms (when 40 writers/readers are used), in the Parse and Send states. This overhead exists in all the implementations evaluated below.

Parse State

In this part of the thesis, the results of the Parse state are presented and discussed. Figures 6.1-6.3 show the mean, median and 99th percentile execution time of the dummy interfaces.
Parse State

In this part of the thesis, the results of the Parse state are presented and discussed. Figures 6.1 - 6.3 show the mean, median and 99th percentile execution time of the Parse state per number of dummy interfaces.
[Figure 6.1: Mean execution time of parse per number of interfaces]

[Figure 6.2: Median execution time of parse per number of interfaces]

[Figure 6.3: Top 99th percentile execution time of parse per number of interfaces]

In the case of the Parse state, the results are quite clear when looking at Figures 6.1 and 6.2: the execution time scales up in an almost linear trend as the number of dummy interfaces increases. The current Smartness version performs significantly better than the implementations that use DDS, and the version that uses the Ownership QoS with RTPS UDP performs significantly worse. In general, Shared Memory performs best among the different transport types, which is expected given that the memory is used directly for writing to topics. Additionally, it can be observed that the Ownership QoS policy adds a significant overhead to the execution time. This could potentially be explained by the additional operations the DDS middleware performs to determine which writer/interface has the highest strength and to block the other writers/interfaces. When looking at Figure 6.3, which depicts the top 99th percentile, it is clear that all the DDS implementations, when scaling up the number of interfaces, tend to exceed the 4ms time slot that is allocated to the Parse state in the Smartness platform.

Decision Maker State

In this part of the thesis, the results of the Decision Maker state are presented and discussed. Figures 6.4 - 6.6 show the mean, median and 99th percentile execution time of the Decision Maker.

[Figure 6.4: Mean execution time of decision maker per number of interfaces]

[Figure 6.5: Median execution time of decision maker per number of interfaces]

[Figure 6.6: Top 99th percentile execution time of decision maker per number of interfaces]

In the case of the Decision Maker state, as seen in Figures 6.4 and 6.5, the current Smartness version outperforms the DDS implementations, albeit with a much smaller margin compared to the Parse state. The DDS implementations that use the Ownership QoS perform better, which is reasonable, since there is only one dummy interface to read from, given that the writer selection is already done based on the Ownership QoS policy. Among the different transport types there are no significant differences; however, when the Ownership QoS policy is used, Shared Memory and TCP seem to perform a little better than RTPS UDP.

When looking at the mean and the median of the back layer with RTPS UDP in Figures 6.4 and 6.5, there seems to be a significant difference between the two, with the mean execution time being noticeably higher. This indicates a rather high variance in the execution time measurements of the back layer with RTPS UDP. This is further supported by the top 99th percentile execution time in Figure 6.6, where the back layer with RTPS UDP shows considerably higher measurements than the Shared Memory and TCP versions. Although the execution time is not a problem for the Decision Maker in this particular case, low variance is generally a very important characteristic for safety-critical real-time systems, where there are hard time constraints and a need for deterministic behavior.

When looking at Figure 6.6, which depicts the top 99th percentile, none of the DDS implementations exceed the 4ms time slot that is allocated to the Decision Maker state in the Smartness platform. This is an interesting result, as it indicates that reading from multiple writers and writing to multiple readers is not a very expensive operation in DDS.

An interesting observation is that there are some abnormally high execution times when running the tests with 0-2 interfaces. In order to get a better insight into this behavior, the raw execution time measurements were plotted over time, as seen in Figures 6.7 and 6.8.

[Figure 6.7: Plot of raw execution time measurements of the Decision Maker using the back layer with RTPS UDP and 0 dummy interfaces]

[Figure 6.8: Plot of raw execution time measurements of the Decision Maker using the back layer with RTPS UDP and 2 dummy interfaces]

In Figure 6.7, where 0 dummy interfaces are used, a very high execution time can be observed at the beginning of the test execution and for the first 5 seconds. In Figure 6.8, where 2 dummy interfaces are used, the execution time is more stable and the abnormally high execution time previously seen in Figure 6.7 is not observed. More investigation is required as to why this behavior occurs; however, given that the issue was only observed in the first seconds of the execution, and only when 0-2 interfaces were used, it was considered a minor problem for the Smartness platform.

Send State

In this part of the thesis, the results of the Send state are presented and discussed. Figures 6.9 - 6.11 show the mean, median and 99th percentile execution time of the Send state per number of dummy interfaces.

[Figure 6.9: Mean execution time of send per number of interfaces]

[Figure 6.10: Median execution time of send per number of interfaces]

[Figure 6.11: Top 99th percentile execution time of send per number of interfaces]

In the case of the Send state, Figures 6.9 and 6.10 show that the current Smartness version outperforms the DDS implementations, similarly to the Parse and Decision Maker states. All the DDS implementations perform very similarly, and the Ownership QoS policy has no effect here, since there is only one writer and multiple readers. When looking at Figure 6.11, which depicts the top 99th percentile, none of the DDS implementations exceeds the 4ms time slot that is allocated to the Send state in the Smartness platform.

In contrast to the Parse state, where all the interfaces write to a topic, in the Send state all the interfaces read from a topic. This result indicates that the read operation is much less expensive than the write operation in DDS; a sketch of the two operations is shown below.
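The sketch below shows a minimal write and take pair in classic DDS C++ style, to make the two operations concrete. InterfaceData is a hypothetical IDL-defined type introduced purely for illustration (together with its generated header and typed writer/reader classes); the write, take and return_loan calls are the standard DDS API that OpenDDS implements.

    #include <dds/DdsDcpsSubscriptionC.h>
    // Hypothetical IDL type; OpenDDS's IDL compiler would generate the
    // typed classes used below from e.g. "struct InterfaceData {...};".
    #include "InterfaceDataTypeSupportC.h"

    // Parse side: each dummy interface writes one sample per cycle.
    void publish_one(InterfaceDataDataWriter_var writer,
                     const InterfaceData& sample)
    {
      writer->write(sample, DDS::HANDLE_NIL);
    }

    // Send side: drain every sample that arrived during the cycle.
    void drain(InterfaceDataDataReader_var reader)
    {
      InterfaceDataSeq samples;
      DDS::SampleInfoSeq infos;
      reader->take(samples, infos, DDS::LENGTH_UNLIMITED,
                   DDS::ANY_SAMPLE_STATE, DDS::ANY_VIEW_STATE,
                   DDS::ANY_INSTANCE_STATE);
      // ... process samples[i] where infos[i].valid_data is true ...
      reader->return_loan(samples, infos);
    }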
Total

In this part of the thesis, the combined results of all the states are presented and discussed. The total execution time is calculated by summing up the mean, median and 99th percentile values of each state.
[Figure 6.12: Mean execution time of all states combined per number of interfaces]

[Figure 6.13: Median execution time of all states combined per number of interfaces]

[Figure 6.14: Top 99th percentile execution time of all states combined per number of interfaces]

Figures 6.12 - 6.14 give a more holistic view of the execution time performance of the different implementations. The current version, not surprisingly, outperforms the DDS implementations by a significant margin. Among the different transport types, Shared Memory performs best, RTPS UDP performs worst and TCP sits in between. An interesting observation is that the overhead the Ownership QoS policy adds to the execution time of the Parse state seems to be higher than the gain this policy provides in the Decision Maker state. As a result, all the implementations that use the Ownership QoS perform slightly worse than their counterparts that do not.

Another interesting observation is that the mean and median total execution times depicted in Figures 6.12 and 6.13 are significantly lower than the Smartness cycle time of 20 ms. Even when looking at the top 99th percentile in Figure 6.14 with 40 dummy interfaces, there is still a good margin between the total execution time and the Smartness cycle time. This is very encouraging, as it indicates that by slightly re-configuring the time slots allocated to each state, it is possible to maintain the current Smartness cycle time.

6.2 Communication

One of the key things examined during this research was the performance of DDS with regard to loss and latency. In all the tests depicted below, as mentioned in Section 5.3, 1-40 interfaces were used and every interface published 2000 times. The back layer and the back layer using Ownership QoS implementations were used as a base for comparison, and the configurations listed below were compared; a configuration sketch follows each list.

QoS Policies

• Reliability: A policy that determines whether the communication is going to be Best Effort (i.e. no guaranteed delivery) or Reliable (i.e. if a sample is not delivered, it will be sent again). A more thorough description can be found in Section 2.8.

• History: A policy that determines how many samples of each instance will be kept. Keep One means that only the most recent sample is kept; the previous sample is discarded when a new one arrives. Keep All means that all the samples will be kept (within boundaries defined by the hardware and the available memory). A more thorough description can be found in Section 2.8.

• Durability: A policy that determines whether a data writer should maintain data samples that have already been sent to the subscribers, and for how long these samples should be kept. A more thorough description can be found in Section 2.8.

• Ownership: A policy that, when used, determines which writer has the ownership of a given topic. It can be configured dynamically, and the writer with the highest strength value gets the ownership. A more thorough description can be found in Section 2.8.
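As a rough sketch of how these four policies translate into code, the helper below configures a DataWriter in classic DDS C++ style with the Reliable, Keep All, Durable and exclusive-Ownership settings evaluated in this section. The function, its arguments and the strength value are assumptions; the QoS kinds themselves are the standard DDS API that OpenDDS implements.

    #include <dds/DdsDcpsPublicationC.h>
    #include <dds/DCPS/Service_Participant.h>

    // Illustrative helper: builds a DataWriter whose QoS matches the
    // combinations evaluated in this section.
    DDS::DataWriter_var create_writer(DDS::Publisher_var publisher,
                                      DDS::Topic_var topic,
                                      CORBA::Long strength)
    {
      DDS::DataWriterQos qos;
      publisher->get_default_datawriter_qos(qos);

      // Reliability: RELIABLE (re-send lost samples) vs. BEST_EFFORT.
      qos.reliability.kind = DDS::RELIABLE_RELIABILITY_QOS;
      // History: KEEP_ALL vs. KEEP_LAST with depth 1 ("Keep One").
      qos.history.kind = DDS::KEEP_ALL_HISTORY_QOS;
      // Durability: keep already-written samples for late-joining readers.
      qos.durability.kind = DDS::TRANSIENT_LOCAL_DURABILITY_QOS;
      // Ownership: exclusive, so the writer with the highest strength owns
      // the topic. Matching readers must also request exclusive ownership.
      qos.ownership.kind = DDS::EXCLUSIVE_OWNERSHIP_QOS;
      qos.ownership_strength.value = strength;

      return publisher->create_datawriter(topic, qos, 0,
                                          OpenDDS::DCPS::DEFAULT_STATUS_MASK);
    }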

Transport Types

• RTPS UDP
• TCP
• Shared Memory
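Transport selection in OpenDDS is done through its transport registry rather than through QoS. The sketch below follows OpenDDS's documented TransportRegistry pattern; the configuration and instance names are arbitrary, and this is an illustrative sketch rather than the thesis's actual setup code.

    #include <dds/DCPS/transport/framework/TransportRegistry.h>
    #include <dds/DCPS/transport/framework/TransportConfig.h>
    #include <dds/DCPS/transport/framework/TransportInst.h>

    // Select one transport globally. The type string can be
    // "rtps_udp", "tcp" or "shmem".
    void select_transport(const char* type)
    {
      OpenDDS::DCPS::TransportConfig_rch config =
          TheTransportRegistry->create_config("smartness_config");
      OpenDDS::DCPS::TransportInst_rch inst =
          TheTransportRegistry->create_inst("smartness_inst", type);
      config->instances_.push_back(inst);
      TheTransportRegistry->global_config(config);
    }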
As mentioned in Section 5.2, the OpenDDS latency statistics module was used in order to collect the following information for every instance on the reader:

• Number of data samples received
• Mean latency
• Max latency
• Variance

The results were saved on the SFC after the end of the test execution and were visualized using Python and the Matplotlib library. The standard deviation depicted at the end of this section was calculated using Formula 6.1, where x_i is an individual latency measurement, µ is the mean value and N is the total number of measurements:

\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}    (6.1)
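For reference, the sketch below is a direct C++ transcription of Formula 6.1 (the population standard deviation); the function name is an assumption and a non-empty sample is assumed.

    #include <cmath>
    #include <numeric>
    #include <vector>

    // Population standard deviation as in Formula 6.1.
    double standard_deviation(const std::vector<double>& x)
    {
      const double n = static_cast<double>(x.size());
      const double mu = std::accumulate(x.begin(), x.end(), 0.0) / n;
      double sum_sq = 0.0;
      for (const double xi : x) {
        sum_sq += (xi - mu) * (xi - mu);
      }
      return std::sqrt(sum_sq / n);
    }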
Loss

In this part of the thesis, the loss of the DDS communication is presented. Tables 6.1 - 6.3 summarize the total loss of all the tests that were executed.

Table 6.1: Total Loss over 1,600,000 samples using different Transport configurations

    Transport Type     Total Lost Messages    Loss (%)
    RTPS UDP           270                    0.016875
    Shared Memory      68                     0.00425
    TCP                0                      0

When comparing the different transport types using the default QoS policies in Table 6.1, we can see that the overall loss is quite small as a percentage, but there is still a significant difference between them. TCP performs much better than the other two, with 0 lost messages, which is to some extent expected given the inherent reliability that TCP provides at the transport layer. Shared Memory seems to perform around 4 times better than RTPS UDP.

Table 6.2: Total Loss over 1,600,000 samples using different QoS policies

    QoS Policy                            Total Lost Messages    Loss (%)
    Ownership                             0                      0
    Best Effort                           303                    0.0189375
    Best Effort and Keep All              268                    0.01675
    Reliable                              66                     0.004125
    Reliable and Keep All                 39                     0.0024375
    Reliable and Durable                  44                     0.00275
    Reliable and Durable and Keep All     0                      0

In Table 6.2, we see a comparison between the different QoS policies when using RTPS UDP as the transport type. The version that uses the Ownership QoS policy shows zero loss, but that is not unexpected: only one writer has the ownership of a given topic, so scaling up the interfaces does not affect the loss. The Reliability QoS policy has the most significant impact, reducing the loss by a factor of 5-6. The History QoS policy has some impact as well, reducing the loss by 20%-30%. However, the only way to bring the loss down to 0% (without using the Ownership QoS policy) was a combination of the Reliability, History and Durability QoS policies.

Table 6.3: Total Loss over 1,600,000 samples using different QoS policies on two different host machines

    QoS Policy                   Total Lost Messages    Loss (%)
    Best Effort and Keep One     4548                   0.28425
    Reliable and Keep All        86                     0.005375

When testing the communication between two different machines, the results in Table 6.3 reveal the same pattern as in Table 6.2. However, the loss is 15 times higher when using Best Effort and Keep One. Although this is higher than expected, it can be justified to some extent, given that the communication between the two SFCs is done over WiFi, with two MOXA bridges and a Netgear router in between. It is worth noting that the use of the Durability QoS policy in addition to Reliability and History was not enough to bring the loss down to 0%, as it was in the internal communication tests.

Latency

One of the key things examined during this research was the performance of DDS with regard to latency.

In all the tests depicted below, as mentioned in Chapter 5, 1-40 interfaces were used and every interface published 2000 times. In Figures 6.15 - 6.24, when no transport type is explicitly mentioned, RTPS UDP should be assumed, and when no QoS policy is explicitly mentioned, Best Effort, Keep Last One and no Ownership are assumed.

[Figure 6.15: Mean Latency per number of interfaces]

[Figure 6.16: Max Latency per number of interfaces]

When comparing the latency of the different transport types in Figures 6.15 and 6.16, RTPS UDP has a higher mean latency, but the fewest outliers when looking at the max values. Shared Memory and TCP have quite similar mean latencies; however, TCP has much higher max values, reaching above 17ms, which would be a problem for the Smartness inter-state communication, since some of the data would miss a cycle.

An interesting observation in Figure 6.15 is that the mean latency goes down when scaling up the number of interfaces. This is especially noticeable in the RTPS UDP version, and more research is needed in order to understand this behavior. A potential explanation could be that there is a certain performance cost when setting up the DDS middleware, and the more samples are sent (as the number of dummy interfaces increases), the more this cost is averaged out. This hypothesis is also supported by the fact that the latency goes down in a somewhat linear trend.

In Figures 6.17 and 6.18, the comparison is done between the default QoS and the Ownership policy.

[Figure 6.17: Mean Latency per number of interfaces]

[Figure 6.18: Max Latency per number of interfaces]

The Ownership QoS policy seems to add a significant overhead to the latency, as seen in Figures 6.17 and 6.18. This could potentially be explained by the additional operations that the DDS middleware performs in order to select the writer with the highest ownership strength. When scaling up the dummy interfaces, the max values of the Ownership QoS version reach up to 12ms, which is a problem for the current Smartness inter-state communication. The default QoS latency is significantly lower, but Figure 6.18 shows that when the number of interfaces goes above 24, there are some outliers that reach or surpass the 4ms threshold.

In Figure 6.17, the strange downward trend in mean latency that was observed in Figure 6.15 can be seen as well. However, it is somewhat less obvious in the graph, because the mean values of the default QoS and the Ownership QoS are on different scales.

In Figures 6.19 and 6.20 below, the Reliability and History policies are compared.

[Figure 6.19: Mean Latency per number of interfaces]

[Figure 6.20: Max Latency per number of interfaces]

When comparing the different combinations of the Reliability and History QoS policies in Figures 6.19 and 6.20, it is clear that when Reliable is used, the latency is much higher. This can be explained by the fact that Reliable causes a lot of re-transmissions, which result in higher latency. Additionally, when Best Effort is used, many of the outlier samples are lost and therefore never measured in the latency statistics. The History QoS policy does not seem to have a strong impact on the latency.

In Figure 6.19, the strange downward trend in mean latency that was observed in Figures 6.15 and 6.17 is still evident when Best Effort is used; however, it does not appear when Reliable is used (or at least not beyond 4 interfaces). The assumption is that the DDS setup cost still exists when Reliable is used, but it is masked by the higher latency introduced by the increased number of re-transmissions. This is also supported by Figure 6.20, where the max latency measurements are far higher when Reliable is used, even exceeding 20ms, which is a problem for the Smartness inter-state communication.

In Figures 6.21 and 6.22, the mean and max latency of the Reliability, History and Durability QoS policies are compared.

[Figure 6.21: Mean Latency per number of interfaces]

[Figure 6.22: Max Latency per number of interfaces]

When looking at Figure 6.21, the differences between the versions seem rather trivial. However, Figure 6.22 shows that when the Durability QoS policy is not used, there are a lot more outliers. This observation was surprising, as common intuition would suggest the contrary, given that more re-transmissions are expected when the Durability QoS is used. A deeper investigation into the OpenDDS source code could perhaps provide an explanation for this in the future. It is worth mentioning that, as seen in Figure 6.22, the max latency is very high, and therefore a problem for the inter-state communication of Smartness.

The last comparison, illustrated in Figures 6.23 and 6.24 below, depicts the difference in latency when running the writer(s) and the reader on two different machines. The QoS policies being evaluated are Best Effort with Keep Last One and Reliable with Keep All.

[Figure 6.23: Mean Latency per number of interfaces in two different machines]

[Figure 6.24: Max Latency per number of interfaces in two different machines]

In Figures 6.23 and 6.24, the pattern observed in the previous figures repeats. The implementation that uses Best Effort has a clearly lower latency as the dummy interfaces scale up. The latency, as expected, is on a completely different scale, since the communication is done via the MOXA network bridge and the WiFi network. In order to get an idea of the variance in the latency measurements, the tests with 20 interfaces were selected as a case study, since they are in the middle of the test suite. Table 6.4
