Remote Diagnostics of Heavy Trucks through Telematics


TRULS SHANWELL, HÅKAN SVENSSON

Master’s Degree Project

Stockholm, Sweden August 2013


Master of Science Thesis MMK 2013:69 MDA 458 KTH Industrial Engineering and Management

Machine Design

SE-100 44 STOCKHOLM


Approved: 2013-12-18
Examiner: Martin Törngren
Supervisor: Lei Feng
Commissioner: Scania CV AB
Contact person: Jonas Biteus

Abstract

Vehicle diagnostics is becoming more and more sophisticated as the number and complexity of on-board computers grow. Computer-aided diagnostics has become an integral part of repair and maintenance, but it is still almost exclusively used with the PC physically connected to the vehicle, or at most a few meters away.

Web-based services in “in-vehicle infotainment”¹ (IVI) have grown rapidly over the last couple of years, and as vehicle diagnostics belongs to IVI, it is natural for it to move towards the web. This thesis aims to investigate and demonstrate the possibility of remote diagnostics, meaning vehicle diagnostics over the internet, from the perspective of real-time user interaction². The report describes the diagnostic system as it is today, proposes the changes needed to adapt it to the internet, discusses performance over mobile networks and guides the reader through the development of a remote diagnostics application run over 3G.

This thesis shows that it is possible to run a diagnostic application over the internet without sacrificing functionality, while still retaining a good user experience. The difficulties of remote diagnostics have been shown to lie not in performance, but in safety, security and the management of a large fleet, which are left for future work.

¹ Solutions and applications for automobiles ranging from entertainment to navigation and maintenance, as defined by Accenture [1].

² In this sense, real-time refers to the user’s perception of instant response.


Summary (Sammanfattning)

Vehicle diagnostics is becoming more and more sophisticated as the number and complexity of embedded vehicle computers grow. Computer-aided diagnostics, using a PC that runs a diagnostic application, is a key factor in repair and maintenance.

As web-based services have grown in recent years, vehicle manufacturers have also begun to investigate which web-based services can be built into their vehicles. One area of interest is remote diagnostics, i.e. diagnostic communication over the internet. Scania has recently launched a web-based service for vehicle diagnostics, in which the user can order a readout of the trouble codes of a newer-model vehicle and then have the result presented in a web browser. However, the current service offers only limited functionality compared with what is possible with the diagnostic application used when a vehicle is taken into a workshop.

This report presents an architecture that makes it possible to perform the same types of operations over a wireless WAN that today are only performed with the diagnostic tool connected to the vehicle via USB cable or local WiFi. A large part of this work has consisted of developing a prototype platform to demonstrate the system.


Acknowledgments

We would like to take the opportunity to thank our supervisor Jonas Biteus and the department of YSPX at Scania CV AB for their continuous support during this master’s thesis. We want to thank our supervisor Lei Feng at KTH, who has been of great help to us, always making himself reachable even at short notice. We would also like to extend our thanks to our examiner Martin Törngren for his help in making this thesis possible.


List of Notations

3G     Third Generation
4G     Fourth Generation
ABS    Anti-lock Braking System
AEBS   Advanced Emergency Braking System
AWD    All Wheel Drive System
CC     Cruise Control
CSS    Automatic Climate Control
CTL    Computational Tree Logic
DTC    Diagnostic Trouble Code
ECU    Electronic Control Unit
EDGE   Enhanced Data Rates for GSM Evolution
EMS    Engine Management System
ESC    Electronic Stability Control
GMS    Gearbox Management System
GPS    Global Positioning System
ICL    Instrument Cluster
IVI    In-Vehicle Infotainment
LTE    Long Term Evolution
MIL    Malfunction Indicator Light
OBD    On-Board Diagnostics
OSI    Open Systems Interconnection
QoS    Quality of Service
RRC    Radio Resource Controller
RTT    Round Trip Time
TCTL   Timed Computational Tree Logic
VCI    Vehicle Communication Interface
WWAN   Wireless Wide Area Network


List of Figures

1.1 Basic diagnostics setup, with wired connection and wireless respectively
2.1 Generic CAN bus
2.2 Scania CAN bus topology
2.3 Mapping of the diagnostics over CAN standard into the OSI model (Permission for publication obtained from SIS Förlag AB, www.sis.se, 08-555 523 10)
2.4 Future system overview
3.1 Property checking queries
4.1 BeagleBone platform
4.2 BeagleBone CAN bus cape
4.3 VCI software architecture
4.4 The architecture of SocketCAN
5.1 Flow chart of the diagnostic API written in C
5.2 CAN application input/output message format
5.3 VCI+ top level initialization
5.4 VCI+ data flow
5.5 VCI+ diagnostic application architecture
5.6 Proposed test script written in XML
5.7 Back-office server architecture
5.8 Sequence diagram of the VCI+ to back-office communication
5.9 Sequence diagram of the VCI+ to back-office communication
A.1 Graphical user interface of the VCI+ diagnostic application

Contents

Acknowledgments
List of Notations
List of Figures

1 Introduction
1.1 Background
1.2 Purpose and Definitions
1.2.1 Sustainability
1.2.2 Safety
1.2.3 Purpose
1.2.4 Goals
1.2.5 Scope and Delimitation
1.2.6 Method

2 Pre-study
2.1 Current System Architecture
2.1.1 Controller Area Network (CAN)
2.1.2 On-Board Diagnostics
2.1.3 Vehicle Communication Interface
2.1.4 Diagnostics Over CAN
2.1.5 Physical Layer
2.1.6 Data Link Layer
2.1.7 Network Layer
2.1.8 Application Layer
2.1.9 Diagnostic Application Layer
2.2 Vehicle diagnostics
2.2.1 Readouts of diagnostic trouble codes
2.2.2 Unit calibration
2.2.3 Unit configuration
2.2.4 Firmware update
2.2.5 Workshop tests
2.2.6 Streaming sensor data
2.3 SCOMM
2.4 SDP3 XCOM C-Dev
2.5 Summary
2.6 Related Work
2.7 The New Diagnostic Application
2.7.1 The Internet
2.7.2 The Need of a New VCI
2.7.3 VCI+ Functionality
2.8 Response Time
2.9 Internet Connection in Depth
2.9.1 Mobile Internet Service Provider
2.9.2 Latency
2.9.3 Countermeasures
2.10 Transport Layer Protocols
2.10.1 Transmission Control Protocol (TCP)
2.10.2 User Datagram Protocol (UDP)
2.10.3 On Transport Protocols
2.11 Envisioning the Future System

3 Further Areas of Research
3.1 Modeling the UDS Protocol
3.1.1 The UPPAAL Software
3.1.2 Computational Tree Logic
3.1.3 Property Checking with UPPAAL
3.1.4 UDS Models
3.1.5 Results and Discussion
3.2 Static Analysis of C-Application
3.2.1 Clang Scan-Build
3.2.2 Splint

4 Design
4.1 Hardware
4.2 Software
4.2.1 Software Architecture
4.2.2 SocketCAN
4.2.3 Programming languages

5 Implementation
5.1 CAN Application
5.1.1 Operation
5.1.2 Error Handling
5.1.3 Unix Socket Interface (input/output)
5.1.4 Vehicle Input / Output
5.2 VCI+ Application
5.2.1 Application Initialization
5.2.2 Data Flow
5.2.3 Diagnostics Application
5.2.4 Server Architecture
5.2.5 Communication Protocol
5.2.6 Testing

6 Retrospect
6.1 Results
6.2 Conclusion
6.3 Discussion
6.4 Future Work

Bibliography

A Appendix A
B Appendix B
C Appendix C


Chapter 1

Introduction

Over the last two decades the vehicle industry has increased its focus on developing more intelligent systems. A growing number of embedded computers and sensors are integrated in vehicles, providing new features in safety, entertainment, luxury and more. One of the more recent trends within the automotive industry is vehicle internet connectivity and In-Vehicle Infotainment (IVI). While the area of vehicle mechanics is very mature, the IVI area is much less developed. It has a large unexplored market and is expanding very rapidly [1]. An important part of IVI is vehicle service and maintenance, which is the area treated in this report.

1.1 Background

As new technologies develop, the complexity of vehicles grows. New software and hardware are introduced frequently and often differ between models, which is why workshop mechanics face increasing difficulties in troubleshooting and repairing vehicles.

For this and other reasons, repair and maintenance of heavy vehicles have become dependent on computer-aided diagnostics. Put simply, diagnostics is the ability to retrieve the faults that the truck has registered during operation, which come in the form of Diagnostic Trouble Codes (DTCs) together with freeze frames of the system state at the moment each fault was registered. It also includes the ability to manipulate different properties of the truck, e.g. changing the state of actuators and reading sensor values to see that they react accordingly. An important part of diagnostics is workshop tests, where the vehicle goes through a predefined set of operations on which its performance is evaluated. This is not the whole story of computer-aided diagnostics, but it will suffice for now.

These tests are carried out by connecting a PC to a Vehicle Communication Interface (VCI), which is a gateway or translator between the computer and the vehicle’s communication protocols. The VCI is plugged into an On-Board Diagnostics (OBD) port on the vehicle, which enables communication with its internal system. Through this connection it is possible to, for example, run diagnostic tests, calibrate components, and read and interpret Diagnostic Trouble Codes.

In the traditional setup used today, the computer is connected to the vehicle either by wire or through a wireless access point, see Figure 1.1. This means that the computer running the diagnostic application must be in the proximity of the vehicle under diagnosis, which imposes significant restrictions on the system. Diagnostics of a faulty vehicle is only possible if a repairman comes out to the site, or if the vehicle is in a workshop.

Figure 1.1. Basic diagnostics setup, with wired connection and wireless respectively

It would be a considerable benefit to enable this type of diagnostics over a longer, remote distance, making it possible to connect to a vehicle and run diagnostic tests and other diagnostic applications at any time and, more importantly, regardless of the vehicle’s physical location. Picture a truck running on a highway when the driver experiences a malfunction. If remote diagnostics were possible, the truck could be diagnosed at its current location, without waiting for a repairman to come to the site or needing to be driven or towed to a workshop.

Scania wants to improve its vehicle diagnostics system and provide a whole range of new diagnostics features. The ideal system would analyze all vehicles and collect their run-time data. Based on this information it would predict and recommend adequate maintenance and service intervals. It would find sources of failure and correct them with calibration and software updates without requiring the driver’s attention, in the best case before the driver even noticed that something was wrong. During roadside failures it would be possible to run diagnostic tests remotely, and based on the result the system would be able to determine the source of failure and recommend proper repair actions. It would further compile a list of spare parts and repair instructions for the nearest workshop.


Apart from road vehicles, Scania also manufactures GENset units (electric generators powered by a combustion engine) that often operate in very remote environments with limited network infrastructure and connectivity. Because these GENsets are stationed in such remote locations, maintenance and repair become very expensive. Repairmen need to travel long distances, multiple times, both to investigate and to bring proper tools and spare parts. This application would benefit greatly from remote diagnostics, which would enable live monitoring of state of health and performance and greatly ease repair and maintenance.

In the logistics business sector, failing vehicles mean large costs for owners. It is therefore essential for a company like Scania to provide efficient repair and service methods. Although reducing the cost of repair and spare parts is important, costs related to downtime and penalty fees for late goods can often be even more important [2].

Legislation also drives manufacturers towards increased awareness and effectiveness, through requirements on both the environment and traffic safety. A standstill due to a failing truck is a good example of a dangerous situation that should always be avoided [3].

1.2 Purpose and Definitions

1.2.1 Sustainability

Scania, like any other vehicle manufacturer, puts significant effort into reducing the environmental footprint of its products, and vehicle maintenance techniques are an integral component of this improvement process [4]. Optimal fuel efficiency is an important part of reducing costs for Scania’s customers, and in order to ensure maximum fuel efficiency, all major systems in the vehicle must function as intended. Stricter emission legislation requires more advanced and sophisticated systems for pollution treatment. For heavy trucks, advanced systems have been developed to reduce emissions in order to meet the Euro 6 standard, and in the distributed systems that control the power-train and its accessories, correct functionality usually depends not on one subsystem but on several.

Being able to perform remote diagnostics not only opens the possibility of troubleshooting an already confirmed malfunction, but also of remote data mining and back-office analysis of vehicle run-time data. With an efficient analysis application, a vehicle could be repaired before suffering a major breakdown. Also, the data of each unit can be collected and used in fleet management for diagnostic purposes, making it possible to single out individual vehicles with abnormal run-time data.

Being able to detect anomalies at an early stage, and thereby prevent a fault that affects fuel consumption, will not only save fuel but also reduce the number of extensive repairs carried out at workshops. This in turn reduces the number of replaced spare parts and the use of consumables usually associated with a workshop repair, such as oils and other fluids.

1.2.2 Safety

Reducing emissions is only one factor that drives the evolution towards larger and more complex systems. The safety features of modern vehicles also depend more and more on distributed systems and software to function properly. Today, cars as well as trucks and buses rely on a number of distributed safety systems such as airbag systems, Electronic Stability Control (ESC), Anti-lock Braking System (ABS), and newer features such as Advanced Emergency Braking Systems (AEBS). A number of other features, such as Cruise Control (CC) and the electronic parking brake, are also connected with safety, since a malfunction in this type of system can be directly hazardous.

Faults caused by electrical or software failures are quite often intermittent, meaning that the user of the vehicle experiences a fault that occurs only on particular occasions, usually under certain conditions that can be hard to identify or reproduce, or that seems to appear at random. This can make troubleshooting difficult, especially in cases where the symptom cannot be reproduced while the vehicle is at the workshop. Under these circumstances, where little or no useful information can be collected from DTCs and other error logs, the workshop might have to guess which part is the cause of the fault and replace it without knowing whether the vehicle has actually been fixed, or simply send the vehicle back to the customer in an attempt to collect more information about the error conditions. With remote diagnostics, the workshop would be able to connect to the vehicle as soon as the customer experiences problems, and run a diagnostic session while the symptoms are present.

1.2.3 Purpose

The purpose of this thesis is to investigate and demonstrate ways of running the same diagnostic applications that are run locally in workshops today, but over a remote connection. The remote connectivity will be obtained through mobile services, with 3G as reference. This thesis will also define the limitations of this service, propose a system architecture and create a prototype to demonstrate this functionality.

1.2.4 Goals

The following goals have been set for this project:

• A remote diagnostic test should provide the same result as the same test run over a wired connection, as in today’s setup.


• The results should bring clarity to the possibilities and limitations of remote diagnostics.

• The results should bring understanding to critical factors and bottlenecks in remote diagnostics.

• The solution should be fully compatible with all Scania vehicles.

• User interaction with the vehicle should give the user a perception of real-time i.e. fast response time.

• Minimize application specific data on the vehicle diagnostic device i.e. aim for a thin client application.

1.2.5 Scope and Delimitation

The following conditions were agreed on during the initial weeks of the project.

• No Scania-specific hardware was to be used, which means that new hardware was to be selected and used for the development of the new diagnostics application. The reason for this decision was that it would give more independence and flexibility from current Scania applications. There were also reasons to believe that it would speed up the development process, because the project would not depend on adequate collaboration with other departments.

• It was decided that the diagnostic application shall communicate with the vehicle through the OBD port and not by any other means.

• None of the available code for the current diagnostic application was found suitable for use on the new platform. It was agreed that the code was too complex and that it would take too much work to decide which parts were actually useful to the implementation.

• The research will only treat the connection with one vehicle at a time.

• Some diagnostic operations require a higher level of authentication, given by a USB key. These functions will be disregarded in the application, and the solution will not address the integration of this authentication.

• Demonstrating new functionality is of highest priority. All other aspects will have lower priority in the prototype and will be treated if the time allows for it.

• The remote connectivity will be displayed with a 3G connection, hence remote will be regarded as all geographical positions with an available 3G connection.

• With internet communication, security is a highly relevant factor that needs to be thoroughly investigated by itself. However, security issues will not be addressed in this thesis.

• The matter of safety will not be regarded, i.e. whether it is appropriate to run these diagnostic tests when the truck is outside the test operator’s line of sight and possibly situated in a dangerous environment.


The prototype that is one of the aims of this thesis should consist of an embedded platform that can be connected to the OBD port of the vehicle. It should preferably be small and lightweight, and should be able to run a client application communicating over an internet WWAN connection.

1.2.6 Method

Very few requirements were known at the start of the project; its basis is better described as a set of ideas. It therefore took some time to clarify what was really wanted and to what extent it was feasible to achieve. The project started with extensive research on solutions to this problem and a study of the current system used at Scania. After the initial research, the development process took the shape of an iterative methodology, although no textbook procedure was applied. Weekly meetings with the supervisor were used as checkpoints and the half-time presentation as an important milestone. Swiftly changing requirements hampered long-term planning, and it was found best to stick to an iterative method with short development cycles. Another reason why many formal procedures were omitted was the small group size and the fact that both authors were deeply involved in each other’s work. The question of what was feasible to achieve took a long time to answer, which is why design and implementation were often done concurrently.

A first prototype was created to test key functionality of the application, but it was completely remade during the following iteration. Every new feature was verified in three steps. First, the hardware was connected to another computer playing the role of a truck. Next, it was tested on a rig, a collection of Electronic Control Units (ECUs) combined to simulate certain properties of a vehicle. The final verification was done on full-scale Scania trucks. Features were tested in steps one and two very frequently, but were only subjected to step three once they had been assessed as working properly.

The work effort has been divided between the two authors during the course of the project. As for the implementation part, Håkan has implemented a low level CAN API, while Truls has been developing the VCI+ application and the client and server communication.

The report has been written in a common effort, except for chapter 3, where the authors describe their individual research topics separately: static analysis of the source code for the low level CAN API by Håkan, and modeling of the diagnostic standard by Truls.


Chapter 2

Pre-study

2.1 Current System Architecture

The diagnostic system contains several parts, from the diagnostic user application to the internal workings of the truck itself. This section aims to explain these parts and how they fit together to form the current system. Every Scania truck uses a Controller Area Network (CAN) for the majority of its internal communication, and it is through that medium that diagnostic requests are sent into the truck. Let us therefore start by briefly explaining CAN in general.

2.1.1 Controller Area Network (CAN)

Controller Area Network (CAN) was developed by Bosch and is a serial communications protocol over a bus architecture. It uses message broadcast and is multi-master. CAN was created for the broadcast of short messages that can be read by all connected nodes simultaneously, which replaces point-to-point communication [5].

CAN efficiently supports distributed real-time control with a very high level of security [6]. A good introduction to CAN is provided by Texas Instruments [5]. Figure 2.1 shows a simple example of a CAN topology. The CAN bus consists of two wires, CAN-High and CAN-Low, and is terminated by 120 Ω resistors.

The CAN bus layout in Scania vehicles is shown in Figure 2.2. Scania vehicles have three main CAN buses, red, yellow and green, which are interconnected by a coordinator gateway.

The red bus holds ECUs that manage time-critical applications with hard real-time constraints and is the highest-priority bus. For example, the ECUs on this bus control the power-train, e.g. the Engine Management System (EMS) and the Gearbox Management System (GMS).

Applications with slightly lower real-time requirements are connected to the yellow bus. The Instrument Cluster (ICL) and the All Wheel Drive System (AWD) are found on this bus.

Figure 2.1. Generic CAN bus

Figure 2.2. Scania CAN bus topology

ECUs that manage applications with only soft real-time requirements reside on the green CAN bus, to which Automatic Climate Control (CSS) and the Auxiliary Heater System belong.

Messages that travel in-between buses must go through the coordinator. The message priority will then be decided by their origin i.e. messages from the red bus will be handled before messages from the yellow and green bus.

It is important to note that the diagnostic bus is connected to the green bus, which gives diagnostic requests the lowest priority. This bus has an external port called the On-Board Diagnostics (OBD) port for connecting external diagnostics tools.
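The forwarding rule described above can be sketched as a small priority queue. This is only an illustration under stated assumptions: the `Coordinator` class, the numeric ranks in `BUS_PRIORITY` and the frame strings are hypothetical and are not taken from Scania's coordinator implementation.

```python
import heapq
from itertools import count

# Assumed ranks for illustration: a lower number is forwarded first.
BUS_PRIORITY = {"red": 0, "yellow": 1, "green": 2}


class Coordinator:
    """Toy model of a gateway that forwards frames by their origin bus."""

    def __init__(self):
        self._queue = []
        self._arrival = count()  # tie-breaker: keeps arrival order within a bus

    def receive(self, origin_bus, frame):
        """Queue a frame that has to cross from origin_bus to another bus."""
        rank = BUS_PRIORITY[origin_bus]
        heapq.heappush(self._queue, (rank, next(self._arrival), origin_bus, frame))

    def forward_next(self):
        """Return (origin_bus, frame) for the most urgent frame, or None."""
        if not self._queue:
            return None
        _, _, origin_bus, frame = heapq.heappop(self._queue)
        return origin_bus, frame


gateway = Coordinator()
gateway.receive("green", "diagnostic request")
gateway.receive("yellow", "instrument cluster update")
gateway.receive("red", "engine torque message")
forwarding_order = [gateway.forward_next()[0] for _ in range(3)]
```

Although the diagnostic request arrives first, it is forwarded last, which is the practical consequence of the diagnostic bus sharing the priority of the green bus.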

2.1.2 On-Board Diagnostics

The OBD port is a way for external hardware to communicate with the truck’s internal system, in this case the CAN bus. Generally, OBD refers to a vehicle’s ability to perform self-diagnosis and present the errors found to the user of a diagnostic tool, often a workshop technician [7].

The first OBD system appeared with the advent of Electronic Control Units (ECUs) in vehicles, and early systems usually featured only a simple Malfunction Indicator Light (MIL). During the past twenty years, OBD has evolved to a much higher level of functionality, allowing more advanced diagnostics in much greater detail.

OBD functionality is driven by legislation. The regulations undergo constant change and depend on geographical location, e.g. the European Union names its standard EOBD, while the corresponding standard in America is OBD-II [8]. OBD is built upon several standards from SAE and ISO, which cover everything from physical layer up to application layer requirements [7].

It is common for modern vehicles to display OBD warning messages to the driver on a display if a malfunction occurs.

2.1.3 Vehicle Communication Interface

A VCI is a tool with the purpose of connecting an external computer to the vehicle through its OBD port, thus acting as gateway towards the internal communication bus(es). In today’s diagnostics application, Scania uses two different versions of VCI.

The older version, VCI2, connects a computer to an OBD port through a wired connection over USB. One objective of VCI2 is to translate messages from a format the computer can use to the vehicle’s communication format, which it does by emulating CAN over USB. It also acts as a CAN node, which means that it manages the data link layer and physical layer towards the CAN bus, as specified in ISO 11898.

A newer generation of VCI, VCI3, has exactly the same functionality as VCI2, but instead of providing a wired interface over USB it is equipped with a WiFi access point for short-distance wireless communication, as shown in Figure 1.1.

2.1.4 Diagnostics Over CAN

So far, the communication system of the truck and the means of connecting a computer to that system have been discussed. Before moving on to describing how the current diagnostics application has been implemented, let us first go through the standardized way of running diagnostic commands over CAN.

The Unified Diagnostic Services (UDS) protocol is a common set of services for vehicles using CAN, specified in the ISO 15765 standard. The standard is divided into four parts, each of which specifies the protocol according to the different layers defined in the Open Systems Interconnection (OSI) model, as can be seen in Figure 2.3. The diagnostic protocol specifies, among other things, the timing windows within which different diagnostic messages are expected to arrive, the format of the diagnostic messages, and message sequences. These are defined in the applicable OSI layers according to the figure.

Figure 2.3. Mapping of the diagnostics over CAN standard into the OSI model (Permission for publication obtained from SIS Förlag AB, www.sis.se, 08-555 523 10)

2.1.5 Physical Layer

Just as the title implies, the physical layer defines the physical properties including physical shape of connectors, electrical properties of the bus and the nodes connected to it, topology and much more. This is manufacturer dependent.


2.1.6 Data Link Layer

The second lowest layer is defined by the ISO 11898 standard for high-speed CAN communication (up to 1 Mbit/s). It states the physical requirements on the CAN bus, such as resistance, capacitance and voltage levels, to ensure that the performance of the bus is good enough to handle bit-timing-related properties.

2.1.7 Network Layer

Most timing requirements are specified in the network layer. A timing parameter often features both a performance requirement and a timeout value. The performance requirement is the general requirement, while the higher timeout value allows the diagnostic functions to operate even under high bus load. Some timing parameters specify not only an upper bound, i.e. the maximum time within which the message is expected to arrive, but also a lower bound, i.e. the minimum time before a message can be sent. If the message does not arrive within the specified time window, the communication fails.
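The relation between the lower bound, the performance requirement and the timeout value can be sketched as a simple classification. The function name and the millisecond values below are hypothetical and chosen only for illustration; they are not values from ISO 15765.

```python
def classify_arrival(elapsed_ms, min_ms, performance_ms, timeout_ms):
    """Classify how long a message took relative to its timing window."""
    if elapsed_ms < min_ms:
        return "too early"                       # violates the lower bound
    if elapsed_ms <= performance_ms:
        return "within performance requirement"  # the normal case
    if elapsed_ms <= timeout_ms:
        return "within timeout"                  # tolerated, e.g. at high bus load
    return "timed out"                           # communication fails


# Hypothetical window: not before 10 ms, normally within 50 ms, at most 150 ms.
WINDOW = {"min_ms": 10, "performance_ms": 50, "timeout_ms": 150}
results = [classify_arrival(t, **WINDOW) for t in (5, 30, 120, 200)]
```

Only the first and last cases would count as failures in this sketch; the third case is the situation the timeout value exists for.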

One example is the “minimum separation time” (STmin) parameter, which is sent in flow control frames. These frames are used when sending messages that have to be segmented into several CAN frames. The STmin parameter states the time that has to pass between each of these individual frames, and allows the receiver to execute other tasks with higher priority before receiving consecutive frames. The flow control frame also contains other information, such as when the next flow control can be expected, and whether it is clear to start sending consecutive frames or the sender has to wait for a new flow control.
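As an illustration, a flow control frame could be decoded as in the sketch below. It assumes the commonly used ISO 15765-2 layout with normal addressing (first byte `0x3` plus flow status, then block size, then STmin); the names `FlowControl`, `decode_st_min` and `parse_flow_control` are hypothetical and not part of the thesis implementation.

```python
from dataclasses import dataclass

# Flow status values carried in the low nibble of the first byte.
FLOW_STATUS = {0: "continue to send", 1: "wait", 2: "overflow"}


@dataclass
class FlowControl:
    status: str
    block_size: int   # frames allowed before the next flow control; 0 = no limit
    st_min_us: int    # minimum separation time in microseconds


def decode_st_min(raw):
    """Translate the raw STmin byte into microseconds."""
    if raw <= 0x7F:
        return raw * 1000            # 0..127 ms
    if 0xF1 <= raw <= 0xF9:
        return (raw - 0xF0) * 100    # 100..900 microseconds
    return 127_000                   # reserved value: assume the longest time


def parse_flow_control(data):
    """Decode the first three bytes of a flow control frame."""
    if len(data) < 3 or data[0] >> 4 != 0x3:
        raise ValueError("not a flow control frame")
    status = data[0] & 0x0F
    if status not in FLOW_STATUS:
        raise ValueError("reserved flow status")
    return FlowControl(FLOW_STATUS[status], data[1], decode_st_min(data[2]))


clear_to_send = parse_flow_control(bytes([0x30, 0x08, 0x14]))
must_wait = parse_flow_control(bytes([0x31, 0x00, 0xF5]))
```

In the first frame the receiver accepts eight consecutive frames spaced at least 20 ms apart before it sends the next flow control; the second frame tells the sender to wait for a new flow control.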

Basic operations such as reading DTCs and freeze frames require the diagnostic interface to be able to receive segmented messages. In a similar manner, the receiving ECU should be able to receive segmented messages.

The STmin example illustrates the features typically specified on the network layer: the composition or layout of individual frames and the timing requirements.
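The frame layout used for segmented messages can be sketched as follows. The sketch assumes classic 8-byte CAN frames with normal addressing as commonly described for ISO 15765-2 (single frame, first frame with a 12-bit length, consecutive frames with a wrapping 4-bit sequence number). The `segment` and `reassemble` helpers are hypothetical, and flow control and padding are deliberately left out.

```python
def segment(payload):
    """Split a diagnostic message into CAN frame payloads."""
    length = len(payload)
    if not 0 < length <= 4095:
        raise ValueError("payload must be 1..4095 bytes")
    if length <= 7:
        return [bytes([length]) + payload]               # single frame
    # First frame: 0x1 plus the 12-bit message length, then 6 data bytes.
    frames = [bytes([0x10 | (length >> 8), length & 0xFF]) + payload[:6]]
    sequence = 1
    for offset in range(6, length, 7):
        # Consecutive frame: 0x2 plus sequence number, then up to 7 bytes.
        frames.append(bytes([0x20 | sequence]) + payload[offset:offset + 7])
        sequence = (sequence + 1) & 0x0F                 # wraps from 15 to 0
    return frames


def reassemble(frames):
    """Rebuild the message, checking the sequence numbers on the way."""
    first = frames[0]
    if first[0] >> 4 == 0x0:
        return first[1:1 + (first[0] & 0x0F)]
    if first[0] >> 4 != 0x1:
        raise ValueError("expected a single frame or a first frame")
    total = ((first[0] & 0x0F) << 8) | first[1]
    data = bytearray(first[2:])
    expected = 1
    for frame in frames[1:]:
        if frame[0] != (0x20 | expected):
            raise ValueError("unexpected sequence number")
        data += frame[1:]
        expected = (expected + 1) & 0x0F
    return bytes(data[:total])


message = bytes(range(20))          # a 20-byte message needs three frames
frames = segment(message)
restored = reassemble(frames)
```

A real sender would additionally wait for a flow control frame after the first frame and respect STmin between the consecutive frames.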

2.1.8 Application Layer

The application layer mainly defines sequences of messages and maps them onto the diagnostic services available in different diagnostic sessions, by combining sequences and application timing parameters. Every diagnostic operation has to be initiated by opening a diagnostic session. There are different types of sessions, each with a different level of privileges to carry out certain types of operations. The session with the lowest level of privileges is called the standard session. This session is very restricted in terms of what actions the user is allowed to perform. For instance, the user cannot manipulate any configuration in any ECU, or perform operations to control any actuator on the vehicle. The number of sessions is manufacturer specific, and can be changed at any point. After a diagnostic session has been opened, a timer is started, and if no other message is sent within a specific time frame, the session times out and has to be opened again before work can continue. The diagnostic services available in the different sessions can be specified either by the application layer or by the diagnostic application.
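The session timeout described above can be sketched as a small bookkeeping class. This is only an illustration of the principle: the class name, the explicit `now` arguments and the timeout value chosen by the caller are assumptions, and the actual timer values are given by the standard and the manufacturer.

```python
class DiagnosticSession:
    """Tracks whether an opened diagnostic session is still alive (sketch)."""

    def __init__(self, timeout_s: float, opened_at: float) -> None:
        self.timeout_s = timeout_s          # hypothetical session timeout
        self.last_message_at = opened_at    # opening the session starts the timer

    def is_open(self, now: float) -> bool:
        """The session stays open while the quiet period is shorter than the timeout."""
        return (now - self.last_message_at) < self.timeout_s

    def on_message(self, now: float) -> bool:
        """Register a message; return False if the session had already timed out."""
        if not self.is_open(now):
            return False
        self.last_message_at = now          # any message restarts the timer
        return True
```

With a timeout of 5 s, messages at 4.0 s and 8.5 s keep the session open, whereas a message at 14.0 s arrives too late and the session would have to be opened again.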

2.1.9 Diagnostic Application Layer

The diagnostic application layer describes features implemented by the manufacturer. This layer is characterized by workshop tests and other services related to manufacturer specific components or procedures on the vehicle such as settings, calibration, etc.

2.2 Vehicle diagnostics

The following section describes the different types of operations that can be undertaken with modern vehicle diagnostics.

2.2.1 Readouts of diagnostic trouble codes

This is the most fundamental operation that can be done with a diagnostic tool. DTCs concerning emission systems are also regulated by legislation, which obliges manufacturers to incorporate this functionality. DTCs are set in an ECU when certain measured sensor values exceed their upper, or fall below their lower, reference thresholds[9]. Along with a DTC follows a so-called freeze frame, which is a set of associated sensor values that were measured at the time when the DTC was triggered. The freeze frames can hold important information for pinpointing the cause of failure.

2.2.2 Unit calibration

Some electronic units require calibration after they have been replaced. One example of such a unit is the steering wheel module that has to calibrate the steering wheel sensor after replacement.

2.2.3 Unit configuration

Some electronic control units that are common to different types of models need to be configured according to the specific features of the model to which they are fitted. The same type of ECU can be fitted on vehicle models with slightly different hardware specifications, and for the ECU to make correct assumptions and calculations, different configuration parameters need to be set.


2.2.4 Firmware update

Even though rigorous tests are a part of the development process for all vehicle manufacturers, bugs and unforeseen conditions still cause faulty behavior to appear after a product has entered the market. Firmware updating makes it possible to correct such errors without replacing the electronic unit.

2.2.5 Workshop tests

Workshop tests are manufacturer specific tests designed to check the operation of certain components in the vehicle. The duration of a workshop test can span from a single second up to several minutes, and its scope from the activation of a single component to the control of several of the vehicle’s main components, such as the engine and gearbox. During this sequence, the vehicle receives instructions from the diagnostic tool. Typically, the tool monitors sensor data during the test, and reports the result of the test if it completes, or aborts the test if something fails. Advanced workshop tests are carefully crafted and engineered by a dedicated department at Scania, followed by rigorous testing. The ECUs in the vehicle should have built-in protection mechanisms so that the tool cannot manipulate the vehicle in a way that could be hazardous (like putting the vehicle in gear when the engine is running), but safety still needs to be given the highest regard during the design of these tests.

2.2.6 Streaming sensor data

Another feature in diagnostic applications allows the user to monitor sensor data in real time. Different types of sensor data can be seen simultaneously on the same screen, something that experienced diagnostic mechanics can make use of to discover anomalies.

2.3 SCOMM

From this point on, the parts of the current system that are described reside on a computer running the diagnostic application, which is connected to a VCI. The first of these is the Scania Communication application (SCOMM), an API used by all higher level applications. SCOMM contains all functionality for the diagnostic communication: it handles ECU specific methods and communication according to the diagnostic UDS protocol described in the preceding section. It decodes and presents data and DTCs in readable form, handles authorization for different diagnostic sessions, logging and more. SCOMM is the underlying layer in the current system architecture that receives requests from the diagnostic user application, forwards them in the correct format according to the standard via USB to the VCI, and subsequently receives replies from the vehicle and decodes them into human readable information before forwarding it back to the user application[10].


2.4 SDP3, XCOM and C-Dev

SDP3, XCOM and C-Dev are diagnostic user applications developed by Scania. The main application used in Scania workshops is SDP3, which includes an extensive GUI and predefined operations towards the vehicle. XCOM and C-Dev are created for developers and provide greater detail and freedom to manipulate the vehicle, but are less user friendly. All three applications utilize SCOMM to get model specific information about the connected vehicle.

2.5 Summary

Let us summarize the current diagnostic system with two examples.

A repairman is performing computer aided diagnostics on a malfunctioning truck in a workshop. He uses SDP3 to specify a diagnostic command. The command goes through SCOMM, where it is translated to a set of diagnostic instructions equivalent to the SDP3 command, but written according to the ISO15765 specification. SCOMM includes several layers of functionality and performs authentication and retrieves vehicle specific information from databases. Finally the set of diagnostic commands is sent to the VCI, which translates them and sends them on the vehicle’s CAN bus. The message travels to its destination ECU(s) and the reply travels back along the same path in the opposite direction.

The same repairman executes a workshop test from SDP3. This test requires several steps of manipulation and readouts from the vehicle, and after each send/receive loop the program must decide the next action based on the response. The communication path looks the same as in the previous example, but the intent of this example is to clarify that each decision on the next instruction is made in SDP3, meaning that for every cycle in the workshop test the data must travel through all layers of the system.

2.6 Related Work

A system similar to the one discussed in this thesis was presented in [11]. The authors propose a communication protocol for relaying CAN frames over the internet. The method encapsulates CAN messages into datagrams with a three byte header and relays them onto the internet. Based on tests they have made, the authors suggest that diagnostic services such as diagnostic read-out and CAN monitoring applications can actually be performed over the internet. However, these tests do not address the problems that arise when the connection is poor and latency is high. If the diagnostic application depended on access to low latency 3G/4G, time and place would decide whether or not it is possible to run the diagnosis. In particular, commands that require sending and receiving of consecutive frames would most likely fail due to the short timeouts between each frame, as would manufacturer designed test sequences where activation and testing of physical actuators and sensors depend on immediate response.

This thesis aims to propose a system with full functionality that at the same time does not depend on the speed and latency of the communication between the database and user application at one end, and the vehicle at the other. Consequently, another approach needs to be taken.

An implementation of a similar system was carried out at Scania in 2008, using an ECU called C200 that today is fitted on most Scania vehicles[10]. It was based on a legacy system for remote diagnostics called Step. The purpose of that thesis was to suggest and implement a system with functionality similar to Step, but with databases residing on the back-office server instead of on the vehicle. The systems were not aimed at implementing full diagnostic functionality, such as workshop testing. The round trip time from when a user makes a request until the answer arrives (which is an important aspect in this work) is not stated. Currently, Scania is using a system that resembles the suggested one, called Scania Remote Diagnostics, a web based application that communicates with the vehicle through the C200. The diagnostic interaction is limited to readouts of DTCs and associated freeze frames, and the response time for a diagnostic request in this system is typically above 5 minutes.

Other studies such as [12] and [13] address software downloading to vehicle ECUs from a safety and a performance perspective respectively. There are many papers that describe vehicles connected to WWAN, but we have not been able to find other papers that discuss real-time remote diagnostics in the same context as this thesis.

2.7 The New Diagnostic Application

The main goal of this thesis is to study and realize how to keep full SDP3 functionality when running a diagnostics application remotely. This section treats some of the difficulties that arise and aspects to consider when creating the new system.

2.7.1 The Internet

The ability of connecting to a remote vehicle is today limited to the use of a third party Internet Service Provider (ISP). This implies giving up the control of the vehicle-to-operator communication to a third party, which in turn removes the possibility to guarantee the real-time properties that are needed in the diagnostic application. Grigorik [14] writes:


“The architecture of our IP networks is based on a best effort delivery model, which makes no guarantees about end-to-end performance”

This makes it impossible to run the application as it is designed today merely by routing the communication over the internet. Instead there is a need for significant changes in the design.

2.7.2 The Need of a New VCI

First of all the VCI needs to be connected to the internet through a mobile network provider. Preferably it should be able to switch between several network services, much like a cellular phone, in order to provide the best connectivity at the current location. These features cannot be provided by the VCI2 and -3 designs used today, since they have a limited, predefined set of functions and connectors. It is therefore necessary to create a new VCI+ with extended functionality. But what functionality does the VCI+ need?

The problem could be solved by creating a very extensive VCI+ and putting the entire diagnostic application, including databases and logic, on it. SCOMM is, however, dependent on the .NET framework and has inherited its portability issues. The main reason why this is not appropriate, though, is the continuous update of diagnostic functions, which creates a need for fast and easy updates of the diagnostic databases. Putting these databases on the VCI+ would demand ongoing updates of each VCI+ on every vehicle whenever a diagnostic definition is revised. There are several reasons why this should be avoided, including large unnecessary data transmissions, data duplication and more.

Instead, a new design requires that the old system’s functionality is split up into two parts, one put on the VCI+ and the other on the server, with the internet joining them together.

2.7.3 VCI+ Functionality

A couple of features were identified to be necessary on the VCI+. Similar to VCI2 and -3, the basic CAN communication physical layer ISO11898 is required. The standard for diagnostic communication ISO15765 with its real-time requirements must be implemented.

Another needed feature concerns the ability to run functional tests as in SDP3. The old method of running diagnostic tests requires direct communication between the vehicle and SDP3 in every request-response cycle. The timing requirements of the diagnostic tests must still be fulfilled, but timing accuracy deteriorates over the internet, where QoS cannot be guaranteed, and the method therefore becomes unsuitable.

The consequence is that the entire test instruction must lie on the VCI+, logic and decision tree included.
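The idea of placing the test logic and decision tree on the VCI+ can be sketched as follows. This is a hypothetical illustration and not the format used in this thesis or in SDP3: a test is represented as named steps, each holding a request and a table that maps the vehicle response to the next step, and `send` stands for whatever function performs one local request-response cycle on the CAN bus.

```python
from typing import Callable, Dict, List, Optional, Tuple

# One step: the request to send, and which step follows each expected response.
# A next step of None means that the test is finished.
Step = Tuple[bytes, Dict[bytes, Optional[str]]]


def run_test(steps: Dict[str, Step], start: str,
             send: Callable[[bytes], bytes],
             max_steps: int = 100) -> Tuple[str, List[tuple]]:
    """Run a whole test locally and return (outcome, log of executed steps)."""
    log: List[tuple] = []
    current: Optional[str] = start
    executed = 0
    while current is not None:
        if executed >= max_steps:
            return "aborted", log           # guard against endless loops in a test
        request, branches = steps[current]
        response = send(request)            # local cycle, no internet round trip
        log.append((current, request, response))
        executed += 1
        if response not in branches:
            return "aborted", log           # unexpected response: stop the test
        current = branches[response]
    return "completed", log
```

With this arrangement only the test description and the final outcome with its log would have to cross the internet, so the number of request-response cycles in a test no longer multiplies the network latency.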


Let us list the functions that are needed so far on the VCI+ before moving on:

• Internet connectivity through mobile services

• CAN data link layer as specified in ISO11898

• Diagnostics on Controller Area Networks as specified in ISO15765

• A new solution enabling diagnostic tests to be run despite timing uncertainties of the setup

In addition:

• Logging of diagnostic traffic

• Live monitoring of a parameter. This means a subscription function where a parameter readout is specified and the VCI+ creates a high speed periodic request. The result must then reach the operator as fast as possible

• Auxiliary functions for facilitating the above functionality

2.8 Response Time

Keeping a feeling of user real-time in the application is a main target of this thesis. But what do people really perceive as an instant response?

According to [15] there are three important upper limits on user perception when it comes to application performance:

• 0.1 second: The user feels that the system is reacting instantaneously

• 1.0 second: The user will notice the delay but her flow of thought stays uninterrupted. She begins to lose the feeling of operating directly on the data

• 10 seconds: About the limit for keeping the user’s attention focused on the dialogue. Longer delays will make the user want to perform other tasks while waiting for the computer to finish.

Note:

• For delays between 2-10 seconds it is good to provide some feedback indicating when the process is expected to be done, at least by using a “busy” indication or something that suggests that the computer is still operating on the data

• When the delay exceeds 10 seconds, it is good practice to show the progress as percentage done in a progress bar [16].

The application performance will be evaluated by investigating the round-trip time (RTT) of a single, simple UDS command. This means measuring the elapsed time from the point where the operator sends the diagnostic request until the result is displayed on the screen. A reasonable goal according to the above limits for human perception is an RTT of 1 second.
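As an illustration of how such a measurement could be related to the perception limits above, the following sketch times an arbitrary operation and classifies the result. The function names and category labels are chosen here for illustration only, and `operation` stands for any callable that sends a request and waits for the reply.

```python
import time
from typing import Callable


def classify_delay(seconds: float) -> str:
    """Map a response time onto the three perception limits listed above."""
    if seconds <= 0.1:
        return "instantaneous"
    if seconds <= 1.0:
        return "noticeable, flow of thought kept"
    if seconds <= 10.0:
        return "attention kept, feedback advisable"
    return "attention lost, progress indication needed"


def measure_rtt(operation: Callable[[], object]) -> float:
    """Return the wall-clock time in seconds that one call to `operation` takes."""
    start = time.perf_counter()
    operation()
    return time.perf_counter() - start
```

A measured RTT of 0.8 s would, for example, fall in the second category and thereby meet the 1 second goal.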


2.9 Internet Connection in Depth

In the following we will look into the network connection while keeping the perspective of user real-time. Grigorik [14] uses two parameters when studying performance in this aspect:

• Latency: The time from the source sending a packet to the destination receiving it.

• Bandwidth: Maximum throughput of a logical or physical communication path.

This section will omit deeper reasoning about bandwidth. The diagnostic application uses very small messages and this significantly reduces its importance. It might be of importance during transfer of larger diagnostic tests or logs, but such review is outside the scope of this thesis. Instead it will focus on latency.

2.9.1 Mobile Internet Service Provider

A vehicle stranded in a remote location will have access to neither Ethernet nor a WiFi connection. It is solely dependent on a WWAN internet connection, e.g. EDGE, 3G or 4G. There is a significant difference in network performance among these different stages in the evolution of mobile networks. The available connection will of course affect the real-time user experience.

A 3G connection will be used as a reference in further discussions of mobile networks. The general building blocks of the different networks are very similar, and a good reason to look closer at 3G is its vast geographical coverage and reasonable performance. Ericsson predicts that 85% of the world’s population will have 3G access by 2017[17].

3G is a very broad term. According to the original definition, a 3G network need only provide bandwidth peak rates of 200 Kbit/s. But the recent LTE can provide down-link data rates as high as 300 Mbit/s while still residing under the term 3G (usually referred to as 3.9G, since it is the last step before 4G) [14]. The evolution has had a high impact on latency as well, reducing it significantly during every past stage of mobile technology [18].

It is not possible to affect the inner workings of the 3G network provider; one must accept the prevailing conditions or simply find a better service provider. However, it is still important to understand how it works, to be able to create applications that maximize the performance of the services available.

2.9.2 Latency

The following section includes several extracts from, and is mainly based on information from [14], which is recommended for the reader interested in the topic.


Ethernet connected devices have a direct physical connection to the internet, which is why they are always connected or “always-on”¹. WiFi uses radio and lacks this immediate connection; instead it emulates “always-on” with the aid of different methods that make the performance impairment unnoticeable.

Radio communication requires a lot of power, and the power in mobile devices is very limited. In fact, on an average smart mobile phone, the power consumption of the radio is only exceeded by the display when it is active. For a mobile device to have “always-on” properties, it must always have an active radio. This would dramatically decrease the battery lifetime, which is why it is unwanted. Consequently, there is a tradeoff between network efficiency and power consumption (battery life).

Mobile radio towers simultaneously serve significantly more clients than a WiFi router. The shared resources are limited and connected devices are expensive, which is another reason against an “always-on” policy.

Mobile networks use something called a Radio Resource Controller (RRC), which schedules all the traffic on the network. If a mobile device wants to send data, it must first ask the RRC for resources and wait until the RRC has assigned a channel for it. If there is incoming data on the network, the RRC will tell the device to start listening for an incoming message; hence no data is received or transmitted before this handshake.

A device has different radio states, each corresponding to different power consumptions and transmission capabilities. The radio state defines how the device is connected to the network and is decided by the RRC. Generally there are three types of states and even though details differ between network versions, this simple description will suffice:

• Idle: Lowest power state - the device listens only to control traffic and has no network resources.

• Connected: High power state - the network resources are assigned to the device and receive and transmit operations are possible.

• Dormant: Intermediate state - does not exist for all versions. Network capabilities vary, but the transition from Dormant to Connected is always faster than from Idle to Connected.

A device that has not been engaging in network activity for some time will be in the idle state. It must always ask the RRC to enter the connected state before it may send or receive any data. This negotiation may include several messages between the device and radio tower and represents a major part of the network latency, especially for the older 3G versions and below.

¹ The device in use is permanently connected to the internet and is ready to instantaneously react to user input or an incoming data packet [14]

A timer then decides when to go to the dormant state and further back into idle. The timer is reset by network activity and the device may return to connected from dormant if new network activity occurs.
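The interplay between the radio states and the inactivity timer can be illustrated with a small model. All timer lengths and promotion delays below are hypothetical placeholder values, since the real ones depend on the network generation and the operator, and the class name `RadioModel` is chosen for this sketch only.

```python
from enum import Enum


class RadioState(Enum):
    IDLE = "idle"
    DORMANT = "dormant"
    CONNECTED = "connected"


class RadioModel:
    """Simplified three-state radio with inactivity timers (illustrative values)."""

    def __init__(self, dormant_after_s=5.0, idle_after_s=15.0,
                 idle_promotion_s=2.0, dormant_promotion_s=0.5):
        self.dormant_after_s = dormant_after_s    # quiet time before Connected -> Dormant
        self.idle_after_s = idle_after_s          # quiet time before falling back to Idle
        self.promotion_s = {
            RadioState.IDLE: idle_promotion_s,
            RadioState.DORMANT: dormant_promotion_s,
            RadioState.CONNECTED: 0.0,
        }
        self.last_activity = None                 # no traffic yet, so the radio is idle

    def state_at(self, now):
        """Return the radio state at time `now`, derived from the quiet period."""
        if self.last_activity is None:
            return RadioState.IDLE
        quiet = now - self.last_activity
        if quiet < self.dormant_after_s:
            return RadioState.CONNECTED
        if quiet < self.idle_after_s:
            return RadioState.DORMANT
        return RadioState.IDLE

    def transfer(self, now):
        """Send or receive at time `now`; return the negotiation delay paid first."""
        delay = self.promotion_s[self.state_at(now)]
        self.last_activity = now + delay          # activity resets the inactivity timer
        return delay
```

In this model the first transfer pays the full idle-to-connected delay, a transfer shortly afterwards pays nothing, and a transfer after a medium pause pays only the shorter dormant-to-connected delay.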

There are also other latencies in the mobile communications network. Without going into detail, the main sources of network latency can be summarized as:

• Control-plane latency: as discussed earlier, the RRC negotiation contributes to the overall network latency by up to 100 ms for LTE but may extend to several seconds for early versions of 3G.

• User-plane one way latency: the time starting when a device begins to transmit a message and ending when the radio tower has received the full message. Relatively short addition of a few milliseconds.

• Backbone and Core network latency: the total latency that accumulates through the mobile service provider’s logic and infrastructure e.g. client account management and protection. This is entirely dependent on the network provider.

• The message reaches the Internet

There are also other delays of more random nature that may or may not add to the overall latency depending on circumstance. Examples of these are: radio conditions, packet loss, user density, mobility pattern (changing radio tower) and the offered traffic [19].

2.9.3 Countermeasures

What can be done to reduce latency? It is important that the VCI+ has the possibility to connect to a range of different generations of networks. One reason is geographical coverage. But the real-time experience is also highly dependent on the mobile network performance, so it is always of interest to strive towards the best and newest connection available. This implies switching hardware when new radio devices are available.

Control-plane latency is a one time cost during data transfer because it is only needed during communication initialization. Once the device is in the connected state, it will remain there until timeout. The VCI+ will be powered by the vehicle’s battery, so it is not subject to the same power limitations as regular mobile devices. For the shortest response time, the device should be held in the connected state during the entire diagnostic session. This can be done simply by performing periodic radio network activity that prevents a timeout.


2.10 Transport Layer Protocols

The network performance of the internet service provider is a key factor when it comes to latency. But there is still the choice of transport protocol for the application; this choice may also affect latency and may work better or worse in combination with the available network.

There are two main branches of transport layer protocols, namely connection oriented communication and connectionless communication. These are represented by the most important and commonly used protocols, TCP and UDP respectively. The reader unfamiliar with these protocols may find a detailed description in [20].

2.10.1 Transmission Control Protocol (TCP)

A good introduction to TCP can be found in [14], from which this short description is taken:

“TCP provides an effective abstraction of a reliable network running over an unreliable channel, hiding most of the complexity of network communication from our applications: retransmission of lost data, in-order delivery, congestion control and avoidance, data integrity, and more. When you work with a TCP stream, you are guaranteed that all bytes sent will be identical with bytes received, and that they will arrive in the same order to the client. As such, TCP is optimized for accurate delivery, rather than a timely one.”

It is a very rich protocol full of functionality to simplify application level design, which is why many of the applications running over the internet use TCP.

2.10.2 User Datagram Protocol (UDP)

UDP is often referred to as the “null-protocol” for its very limited feature set. It is known for omitting functionality rather than adding to it. [20] describes it as:

“UDP is basically an application interface to IP. It adds no reliability, flow-control, or error recovery to IP. It simply serves as a multiplexer/demultiplexer for sending and receiving datagrams”

This protocol is especially suited for applications with small messages where data loss can be tolerated. Examples of such applications include online gaming and VoIP.
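The connectionless nature of UDP can be shown with a minimal sketch that sends one datagram between two sockets on the local loopback interface. No connection is set up beforehand and nothing would retransmit the datagram if it were lost; the one second timeout and the function name are assumptions made for this example.

```python
import socket


def udp_send_once(payload: bytes) -> bytes:
    """Send one datagram over loopback and return what the receiver got."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    receiver.settimeout(1.0)             # a lost datagram shows up as a timeout
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No handshake: the datagram is simply addressed and sent.
        sender.sendto(payload, receiver.getsockname())
        data, _ = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()
```

A periodic sensor value sent in this way arrives with little protocol overhead, and if one datagram is lost the next update simply replaces it.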

2.10.3 On Transport Protocols

The reason for including short descriptions of these protocols is that the application in the VCI+ will need both for its functionality. Instructions need reliability, and fast updates are important for the user experience.

The TCP functions that most often make the protocol very useful may have a negative impact on performance. This concerns areas such as flow control, congestion avoidance, congestion control, and reliable and in-order packet delivery.

The application as it is envisioned will make use of very short messages that fit in a single TCP packet, so it is difficult to assess to what extent TCP would add to the overall latency. Investigating the best suited transport protocol is outside the scope of this thesis, but it is very important to keep in mind in the continued work.

However, TCP sessions require a three-way handshake before a communication is established. For this reason, it is essential to keep the TCP connection alive during the diagnostic session to avoid unnecessary latency.
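A minimal sketch of this reuse, run against a local echo server on the loopback interface, is given below. The handshake is paid once in `create_connection`, after which several small messages are exchanged on the same connection. Setting `TCP_NODELAY` and `SO_KEEPALIVE` is a design choice assumed here rather than taken from the thesis, and the operating system's keepalive probes are typically too infrequent to replace an application level heartbeat.

```python
import socket
import threading


def _echo_until_closed(listener: socket.socket) -> None:
    """Accept one client and echo everything back until the client closes."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:
                return
            conn.sendall(data)


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """TCP is a byte stream, so read until exactly n bytes have arrived."""
    received = b""
    while len(received) < n:
        part = sock.recv(n - len(received))
        if not part:
            raise ConnectionError("peer closed the connection")
        received += part
    return received


def exchange_on_one_connection(messages):
    """Send all messages over a single TCP connection and collect the echoes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    server = threading.Thread(target=_echo_until_closed, args=(listener,), daemon=True)
    server.start()

    replies = []
    client = socket.create_connection(listener.getsockname(), timeout=2.0)
    try:
        # Send small messages immediately instead of waiting to coalesce them.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        # Ask the OS to probe a silent connection (intervals are OS dependent).
        client.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
        for message in messages:
            client.sendall(message)
            replies.append(_recv_exact(client, len(message)))
    finally:
        client.close()
        server.join(timeout=2.0)
        listener.close()
    return replies
```

Only the first message pays for connection establishment; every following request needs a single round trip.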

TCP has a relatively large header which forms a large contribution to the overall packet size when sending small messages. This would normally not generate a noticeable addition to latency, but with the increased loss rates of radio communication it may prove to have an impact although probably very small.
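The share of header bytes for small messages can be estimated with simple arithmetic. The sketch below assumes minimal headers without options (20 bytes for IPv4, 20 bytes for TCP and 8 bytes for UDP) and ignores the link layer; the 8 byte payload is just an example size.

```python
IPV4_HEADER = 20   # bytes, without options
TCP_HEADER = 20    # bytes, without options
UDP_HEADER = 8     # bytes


def overhead_fraction(payload_len: int, transport_header: int,
                      ip_header: int = IPV4_HEADER) -> float:
    """Fraction of the IP packet that consists of header bytes."""
    headers = transport_header + ip_header
    return headers / (headers + payload_len)


# An 8 byte payload: 40 of 48 bytes are headers with TCP, 28 of 36 with UDP.
tcp_share = overhead_fraction(8, TCP_HEADER)
udp_share = overhead_fraction(8, UDP_HEADER)
```

For such a payload roughly 83% of a TCP packet and 78% of a UDP packet consists of headers, so the difference between the two protocols is small compared with the share that is overhead in either case.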

2.11 Envisioning the Future System

An architectural overview of the targeted future system is shown in Figure 2.4. It shows the different parts that constitute the new system.


Figure 2.4. Future system overview


Chapter 3

Further Areas of Research

3.1 Modeling the UDS Protocol

For the purpose of getting a deeper understanding of the diagnostics standard it has been modeled in the model checking tool UPPAAL®.

3.1.1 The UPPAAL Software

From the creators’ own definition:

Uppaal is an integrated tool environment for modeling, simulation and verification of real-time systems, developed jointly by Basic Research in Computer Science at Aalborg University in Denmark and the Department of Information Technology at Uppsala University in Sweden. Typical application areas include real-time controllers and communication protocols in particular, those where timing aspects are critical[21].

For the reader not familiar with this software, there is a very good introduction in [22].

The reasons for working with the UPPAAL software were the time critical properties of the standard, the visual representation UPPAAL simulation provides and its verification possibilities.

Before showing the actual modeling let us first go through the fundamentals of formal verification.

3.1.2 Computation Tree Logic

Temporal logic concerns the study of time, i.e. describing a system and its properties in terms of time. Let us say there is a system A. A has a set of properties that may vary in time: sometimes a property is true and sometimes it is not. Studying the state of A over time can be done with temporal logic.


It might be of interest to ask if some property of A is always true, or if it is even possible that it will ever be true. Answers to such questions about the future can be obtained with Computation Tree Logic (CTL), which belongs to the class of temporal logics.

CTL is a branching-time logic which considers paths of the future. The different possible futures can be visualized as branches of a tree and any one of them may be the one realized.

3.1.3 Property Checking with UPPAAL

Further, Timed CTL (TCTL) is CTL with clock constraints [23]. UPPAAL uses a subset of TCTL and just as TCTL it uses path formulae and state formulae.

A state formula describes individual states without looking at the behavior of a model. Path formulae look at paths and traces of the model and can be divided into three kinds of properties: reachability, safety and liveness. The temporal operators used in UPPAAL are:

• A - for all paths

• E - there exists a path

• G - for all states in a path, written “[ ]” in UPPAAL

• F - there exists a state in a path, written “<>” in UPPAAL

Let us define p and q as local properties. There are five legal queries in UPPAAL [24], see Figure 3.1 for a graphical representation:

• A[ ]p - for all paths, p is true for all reachable states i.e. p is invariably true, this is a safety property

• E<>p - there exists a path where at least one reachable state holds p true, this is a reachability property

• A<>p - for all paths there exists a reachable state where p holds true i.e. p is inevitably true, this is a liveness property

• E[ ]p - there exists a path where p is true for all states i.e. p is potentially always true, this is a safety property

• p → q - if p becomes true, q will inevitably become true, this is a liveness property

The local properties p and q may assume values of: automata location, data guard, clock guard, p or q, not p and p imply q. For example p → q is the same as A[ ](p imply A<>q) i.e. for all paths, if p is true, q is inevitably true. There will be more examples in the model verification section.
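To give a feeling for what the first two queries mean, the following sketch evaluates E<>p and A[ ]p on a small, untimed transition graph by exhaustive search. It is not how UPPAAL works internally and it ignores clocks entirely; the toy session model and all names in it are invented for this illustration.

```python
def reachable_states(transitions, start):
    """Return every state that can be reached from `start`."""
    seen = {start}
    stack = [start]
    while stack:
        state = stack.pop()
        for nxt in transitions.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def possibly(transitions, start, p):
    """E<> p: some path reaches a state where p holds."""
    return any(p(s) for s in reachable_states(transitions, start))


def invariantly(transitions, start, p):
    """A[] p: p holds in all reachable states, i.e. not E<> not p."""
    return not possibly(transitions, start, lambda s: not p(s))


# Toy model (hypothetical): a session that can be opened and can time out again.
toy = {"default": ["extended"], "extended": ["extended", "default"]}
```

In this toy model `possibly(toy, "default", lambda s: s == "extended")` holds, and so does `invariantly(toy, "default", lambda s: s != "programming")`, since no transition leads to a state named "programming".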


Figure 3.1. Property checking queries


3.1.4 UDS Models

The model consists of the ISO15765-2 standard, which describes the network layer in the communication. This layer is similar for both client and server. It also includes models of the ISO15765-3 session layer, one describing the client and one the server, due to differences in session handling. In this case the VCI+ is the client and the vehicle ECU is the server. This section does not explain the standard, since there is no room for a sufficient explanation; the models should therefore be interpreted with the help of the ISO15765 documents.

Due to copyright and license claims the model with UPPAAL had to be omitted from this report.

3.1.5 Results and Discussion

The original intent when creating this model was to extend it beyond the standard, to include a larger part of the entire VCI+ application. However, this part was found to be unsuitable for modeling. The reason is that the development of this product is in an early exploratory phase and no formal requirements have yet been defined, which would have complicated the verification. Thus the model became a good way of visualizing the standard for a deeper understanding, but unfortunately served no further use.

However, all the queries that were formulated passed their tests, suggesting that the model was accurate with respect to the properties tested, although there is always the risk that important properties were left out.

There was also the intent of using the model for performance evaluation. This would have required thorough measurements to be of real use and was left out due to time limitations.


3.2 Static Analysis of C-Application

The VCI remote platform should be able to operate without any interaction from the user except through the back-office application. This requires a flexible and reliable system that is able to handle various sources of errors, such as wrong input, absence or delay of input either from the back-office side or the vehicle side, corrupted settings and data, failed system calls etc.

One way of avoiding bugs is to ensure that best practices are being used, and this can be checked by means of static analysis tools (SAT). Static code analysis refers to an analysis of source code without executing it, in contrast to dynamic analysis, which checks the code during runtime[25]. Dynamic checks are often more extensive and time consuming, and static code analysis is therefore considered to be a good method to use from an early stage in the development of a new project. There are a number of tools available for static analysis of C source code, both commercial and open source[26]. Although static analysis helps prevent many common errors, such as uninitialized parameters, dereferencing of uninitialized pointers, problems with dynamic allocation of memory and more, there are still many types of errors that usually go unnoticed by a SAT, such as concurrency errors, both in single threaded and multithreaded applications[25].

3.2.1 Clang Scan-Build

The first analyzer that was tried out was Clang scan-build. Clang is a compiler that provides an alternative to GCC, and comes with a number of accessories including the static code analyzer called Clang scan-build.

Scan-build is run at compile time and rendered two warnings for the project. The first was due to a variable being assigned zero at the beginning of the main loop, a value that was never read before the variable was assigned another value. The second warning addressed a more serious flaw, namely an attempt to make a bit-wise OR assignment to a member of a data structure that had not yet been assigned a value. The convention within the socketCAN API is generally to clear the data in structures after declaration using the memset() library call. Since this did not clear the warning, the only option was to assign the member the value of zero before the bit-wise OR assignment.
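The two kinds of findings described above can be illustrated with a minimal sketch. It is not taken from the project source: the structure demo_frame, the flag DEMO_EXT_FLAG and both function names are hypothetical stand-ins, and no claim is made about which exact member or flag triggered the original warning.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a CAN frame; NOT the real SocketCAN structure. */
struct demo_frame {
    uint32_t id;
    uint8_t  len;
    uint8_t  data[8];
};

#define DEMO_EXT_FLAG 0x80000000U /* hypothetical flag bit, for illustration only */

/* Pattern 1: dead store. Writing "int sq = 0;" at the top of the loop body
 * would store a value that is overwritten before it is ever read.
 * The fix is simply to drop the redundant initial assignment. */
int sum_of_squares(const int *v, int n)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++) {
        int sq;              /* no initial "= 0": it would never be read */
        sq = v[i] * v[i];
        total += sq;
    }
    return total;
}

/* Pattern 2: bit-wise OR assignment to a structure member. The structure is
 * cleared with memset(), and the member is additionally assigned zero
 * explicitly before "|=" is applied, as described in the text. */
uint32_t make_extended_id(uint32_t raw_id)
{
    struct demo_frame frame;

    memset(&frame, 0, sizeof frame); /* clear the whole structure */
    frame.id = 0;                    /* explicit assignment before the OR */
    frame.id |= raw_id | DEMO_EXT_FLAG;
    return frame.id;
}
```

With these definitions, make_extended_id(0x123U) yields 0x80000123U. The explicit assignment is redundant at run time after memset(), but it makes the initialization visible to an analyzer that does not track the effect of memset() on individual members.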

3.2.2 Splint

With Clang scan-build only generating two warnings, it seemed necessary to try other static analyzers, at least for benchmarking purposes. The second candidate was Splint, a successor to LCLint, a tool for statically checking C programs for security vulnerabilities and programming mistakes[27].

Method

First, the project was developed and compiled so that no warnings were generated with the -Wall warning level. The application was also tested, so that it worked as intended in all aspects. After this, Splint was applied. The first attempts to run Splint on the source code were not successful. After some research, the answer was found in a discussion mailing list for Splint [28]:

”Splint fails to compile code for the Linux kernel largely because it does not have some gcc specific features that the kernel code relies on.

Defining a list of FLAGS for splint that gcc defines by default helps with this.”

The author of the input on the referred web page also suggested a layout for a makefile to be able to run Splint on Linux, which, with a few modifications, made it possible to run Splint on the source code for this project. The first successful execution with Splint was done without using any flags for the purpose of suppressing warnings. This yielded 153 warnings for the entire application. The Splint manual states that using Splint without any flags “will produce a large number of warnings”.

Commonly, the -weak flag is given as an argument for a less strict warning level[27].

Classification of Warnings

The main bulk of warnings were generated from assignments or comparisons where the right hand value and left hand value had different types. There were mainly three different data types that were involved, __u8, int and unsigned int. The __u8 data type is an 8 bit unsigned integer defined by the Linux Kernel libraries (corresponding to unsigned character). The warnings arise because Splint considers the underlying data type (unsigned char). They occur even when doing comparisons where the right side value is a number, not a variable. For instance, the expression CAN_frame->data[0] == 0x03 will yield a warning, because Splint expects an unsigned character to appear as a right side value. As with the previous category of warnings, one solution is to cast the expressions that generate a warning to the same type, e.g. the example above would be CAN_frame->data[0] == (__u8)0x03. In this project, the warning is suppressed with a flag instead.
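The cast-based alternative can be sketched as follows. This is only an illustration under stated assumptions: demo_u8 is a hypothetical stand-in for __u8 so that the snippet compiles without the Linux kernel headers, and demo_can_frame and first_byte_is_0x03() are likewise invented names.

```c
/* Hypothetical stand-in for the kernel's __u8 type (an unsigned 8 bit value). */
typedef unsigned char demo_u8;

/* Hypothetical, simplified frame holding only the payload bytes. */
struct demo_can_frame {
    demo_u8 data[8];
};

/* Returns 1 if the first payload byte equals 0x03, otherwise 0.
 * The cast gives both operands the same declared type, which is what a
 * strict type checker expects. In C both sides are promoted to int before
 * the comparison anyway, so the cast does not change the result. */
int first_byte_is_0x03(const struct demo_can_frame *CAN_frame)
{
    return CAN_frame->data[0] == (demo_u8)0x03;
}
```

Since the cast only documents intent and leaves behavior unchanged, choosing between casting and suppressing the warning with a flag is mainly a question of readability.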

The second most frequent type of warnings referred to ignored return values from functions. Warnings for omitted return values can serve as a reminder to check functions that are important for the application to function properly, such as some system calls and calls to various APIs. However, there are many functions that utilize the return value, such as user-defined routines, as well as library calls that return an int whose result is not crucial for the application, such as the printf() call. In order to avoid these warnings without suppressing them with a flag, a cast with (void) can be made to every such function call. The trade-off with such an approach is that the readability of the code decreases when applying too much ambient code for error checking or casting around every function call. On the other