DEGREE PROJECT IN VEHICLE ENGINEERING, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2019

Chassis predictive maintenance and service solutions

A diagnostic solution approach with onboard spectrum analysis

PRAKHAR TYAGI


Acknowledgement

I would like to express my gratitude to all the people who helped me in the successful completion of my studies and supported me during this memorable time of my life.

Firstly, my sincere gratitude to Volvo Cars for giving me the opportunity to carry out my Master's thesis at the company. This was possible thanks to Robin Westlund, my supervisor, and the Vehicle Motion and Climate department, who helped whenever I was stuck or lost. Special thanks also to my colleagues Sadegh Rahrovani and Erik Johansson, who helped me during the initial pilot study.

Moreover, thanks to KTH for an amazing international experience during these last two years. I am grateful to Mikael Nybacka, my academic supervisor and examiner, for his valuable advice and help during my Master's thesis and for coordinating while I was away. Furthermore, thanks to the School of Electrical Engineering and Computer Science, my master's programme at KTH, and to Petter Ögren, head of the programme, for his support throughout my master's.


Abstract


Sammanfattning


Table of contents

1. Introduction ... 10

1.1. Problem statement ... 11

1.2. Research significance and motivation ... 11

1.3. Research aim and objectives ... 12

1.4. Thesis outline ... 12
2. Data pre-processing ... 14
2.1. Data acquisition ... 14
2.2. Tracks ... 16
2.3. Vehicle ... 17
2.4. Test plan ... 18
2.5. Hardware interaction ... 19
2.5.1. FlexRay ... 20
2.5.2. FIBEX ... 20
2.5.3. VN7640 ... 20
2.5.4. CANoe ... 21

2.5.5. Logging & Format conversions ... 21

3. Ideal dataset ... 22

3.1. Reducing dimensionality ... 22

3.2. Signal selection and generating classes ... 23

3.3. Analysed data ... 24

3.3.1. Matlab Fourier transform sampling frequency ... 27

3.3.2. Different speed and different training classifiers ... 28

3.4. Train model ... 29

4. Theoretical framework ... 31

4.1. Introduction ... 31

4.2. Overview of Tire Pressure Monitoring Systems (TPMS) ... 31

4.2.1 Indirect TPMS (iTPMS) ... 31

4.2.2. Wheel Radius Analysis (WRA) ... 33

4.2.3. Tire model ... 34

4.2.4. Voting scheme for iTPMS ... 36

4.3. Statistical learning ... 37

4.3.1. Supervised learning ... 38

4.3.2. Unsupervised learning ... 38

4.3.3. Linear separability classification ... 39


4.3.5. Perceptron algorithm ... 40

4.4. Support vector machine algorithm ... 41

4.4.1. Confusion matrix in machine learning ... 43

4.4.2. Decision trees ... 45

5. Method and implementation ... 47

5.1. Feature extraction ... 47

5.2. Classifiers ... 48

5.3. Detection features ... 49

5.4. Integration of user interface ... 50

5.5. Real-time sampled implementation logic and test on training data ... 50

5.6. Improved strategy in proposed system and canoe integration with MATLAB ... 51

6. Results ... 52

6.1. Training data ... 52

6.2. Test data and real-time results ... 56

6.3. Testing on autonomous drive vehicle ... 59

6.4. Improved strategy confusion matrix ... 59

7. Discussion ... 62

8. Conclusion and future work ... 63

8.1. Conclusion ... 63


List of figures

Figure 1. Hällered proving ground tracks (Media, Volvo cars, 2010). ... 17

Figure 2. Test vehicle (Autoblog Volvo V90, 2018). ... 17

Figure 3. Hardware software interaction for data logging. ... 19

Figure 4. CANoe simulation FIBEX database and measurement setup for logging trial 1.1.1. ... 21

Figure 5. Data visualization of error isolation evaluation of different class of errors (US patent Number 16/410,850). ... 24

Figure 6. Wheel speed signals for 4 tyres. ... 25

Figure 7. Analysed data sample as shown in Table 2. ... 25

Figure 8. Analysed data sample as shown in Table 3. ... 26

Figure 9. Analysed data sample in Table 3. ... 26

Figure 10. Analysed data sample in Table 4. ... 27

Figure 11. Trained classifier at speed 30 km/h. ... 29

Figure 12. Trained classifier at speed 60 km/h. ... 30

Figure 13. Trained classifier at speed 80 km/h. ... 30

Figure 14. Friction model slip curves for asphalt, gravel and snow. ... 33

Figure 15. Tire modelled as spring-damper system (Persson, Gustafsson & Drevö, 2002). ... 34

Figure 16. Wheel bearing modelled as spring-damper system. ... 35

Figure 17. Wheel and tyre dynamic model (Quarter car model, Chegg 2012). ... 35

Figure 18. Resonance frequency (Persson, Gustafsson & Drevö, 2002). ... 36

Figure 19. Time for detection (Persson, Gustafsson & Drevö, 2002). ... 37

Figure 20. Hyperplane. ... 40

Figure 21. Cluster. ... 42

Figure 22. Space. ... 42

Figure 23. Healthy and faulty operations. ... 49

Figure 24. Visualization of SVM in Python, data generated with Gaussian kernel. ... 50

Figure 25. Frequency variation from raw data on the left to pre-processed data to the right. ... 52

Figure 26. Raw data presentation. ... 52

Figure 27. City road traffic measures. ... 53

Figure 28. Region of interest. ... 53

Figure 29. Test signal. ... 54

Figure 30. SVM algorithm accuracy. ... 55

Figure 31. Class clusters. ... 55

Figure 32. Result of trained classifier. ... 56

Figure 33. Confusion matrix. ... 57

Figure 34. Test trial 1 setup. ... 57

Figure 35. Test trial 2 setup. ... 58

Figure 36. W17d5-Reference vehicle. ... 58

Figure 37. Chassis health monitoring (True result). ... 58

Figure 38. Chassis health monitoring (false result no FFT generated). ... 59


List of tables

Table 1 Tire Pressure level for test conditions in front left tyre of test vehicle ... 17

Table 2 Sample of Dataset nomenclature with different maneuvers of handling track 2. ... 17

Table 3 Sample of Dataset nomenclature with different maneuvers of Country Road track. ... 17

Table 4 Sample of Dataset nomenclature with different maneuvers of Country Road track. ... 18

Table 5 Classes ... 44

Table 6 Decision Tree ... 44

Table 7 Feature list ... 49


Chapter 1

1. Introduction

The modern transportation sector is fragile and easily disrupted, causing delays, economic losses and accidents. Vehicle uptime is becoming increasingly significant as transport solutions grow more complex and the transport industry looks for novel ways of staying competitive. Reliable and effective transportation systems are therefore required (Carnero, 2005).

Recent advancements in the automotive industry have made the analysis of sensor data much easier, assisted by machine learning methods that predict failures in advance. Such a system implies that vehicles must be available when required and that unplanned breakdowns and halts must be avoided (Ridolfo, 2004). It is therefore important to replace vehicle spare parts at appropriate intervals, to promote sustainable development, to reduce repair time when a failure occurs, and to identify the reason behind the error.

As described by Civerchia et al. (2017), the evolution of data analytics, artificial intelligence, and new sensors is advancing this revolution in many ways. The enormous amount of machine-readable data available today is creating multiple opportunities to offer new features to customers. Product recalls have played an important role in illustrating the importance of quality management. Recall costs have a huge impact on the repair costs incurred by the automaker and also affect the sales and reputation of a brand (Civerchia et al. 2017).

A modern-day passenger vehicle has a complicated software architecture and is equipped with an on-board diagnostics (OBD) system which monitors and reports on the vehicle's subsystems. The OBD generates important data along with diagnostic trouble codes (DTC) that help repair technicians look for vehicle malfunctions. The main drawback of the OBD system is that it depends on sensor data and therefore reveals only the software and electrical malfunctions of the vehicle. Mechanical parts, however, deteriorate over time, which can result in misdiagnosis from the diagnostic trouble codes. This can lead to expenses incurred in locating the actual problem and create confusion when making corrections (Lu et al. 2009). Thus, the need of the hour is to analyse and identify failures in mechanical parts such as a punctured tire, a broken wheel hub and a broken damper.

A combination of data acquired from the vehicles and statistical models helps identify failures in the different mechanical parts, which plays an important role in alerting the driver well in advance and creates a safer journey for the passengers. In such a case, the vehicle is sent to the workshop even before the failure takes place, thus avoiding breakdowns in most cases. Predictive maintenance is one of the modern approaches that aims at preventing unscheduled repairs and problems in vehicles (Sipos et al., 2004).


A vehicle consists of many components and systems, but each component has a limited life period. Predictive maintenance helps in estimating when a failure will occur. Thus, one can easily plan maintenance, inventory management and maximization of component life in advance (Xu, 2016).

The main focus of this thesis is the proposal of a machine-learning-based system designed for predicting the failure of mechanical parts that require replacement. The investigation explores the possibilities of applying machine learning algorithms to predict which parts require replacement, based on the electronic errors that the vehicle exhibits. A strong association between the parts that cause faults and the electronic error codes yields a powerful diagnostics tool. Accumulating electronic error codes from operating vehicles results in a tool that predicts problems at quite an early stage. These vehicles can then go to the workshop even before the failure occurs, which helps avoid unexpected halts and breakdowns.

1.1. Problem statement

The majority of cars on the roads today have low pressure in their tyres, consume excess fuel and put their occupants' safety at risk. Poor tyre pressure is considered one of the factors responsible for road accidents. Some 15% of modern cars are equipped with systems that monitor tyre pressure, and the trend is constantly rising (Lee et al., 2017). However, most drivers are not aware of such systems. Two different types of systems have been identified for monitoring tyre pressure: indirect and direct. Both technologies have different strengths and weaknesses. Direct systems (dTPMS) have pressure sensors fitted in the wheels, mounted on the rim. They continuously measure pressure and temperature and send radio signals to a receiver in the car.

iTPMS works on two basic principles: resonance-frequency analysis via the Fast Fourier Transform (FFT) and wheel radius analysis at varying pressure values. The key question here is whether a broken component that triggers alarms can be detected when iTPMS-style analysis is used; the inspiration is the wheel-resonance FFT considered by iTPMS. The breakdown of different components affects the model in different ways, generating resonance peaks at different frequencies, analogous to how iTPMS is used today (US patent Number 16/410,850). The working of iTPMS is covered in detail in section 4.2.1.

1.2. Research significance and motivation


1.3. Research aim and objectives

The thesis aims at developing an indirect condition-monitoring system that can identify chassis mechanical failures with the help of a classification and detection algorithm based on statistical learning, and at presenting a proposal for online monitoring and detection in a test vehicle.

This thesis’s scope is restricted to three key mechanical faults:

• Broken damper: a damper within a passive suspension system with 50% oil leakage, mounted at the front left of the vehicle.

• Noisy wheel hub: an old, rusted wheel hub within the wheel assembly that does not exhibit any mechanical impact.

• Reference: a vehicle confirmed by the workshop technician to have no mechanical faults.

The thesis lacked a ready-made dataset, so it was important to produce appropriate data suitable for testing the system and training the learning model.

An analysis is conducted on the Fourier transform of the wheel-speed signals to identify the key spectral features that indicate the class of failure. A Support Vector Machine (SVM) approach is deployed to detect failures from noisy vibration signals. The failure mode is detected after every trial lap with the help of a user interface that packages all the trained classifiers. This interface is linked to a sampling package that uses a voting-mechanism approach to enhance the confidence in the predicted class.
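
The pipeline sketched above, FFT features, an SVM classifier, and a voting mechanism over sampled windows, can be illustrated as follows. This is a minimal sketch on synthetic signals; the class names, sampling rate, frequency band and resonance frequencies are all assumptions for the example, not the thesis implementation.

```python
# Illustrative sketch: classify per-window FFT feature vectors with an SVM,
# then raise class confidence with a majority vote over several windows.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def spectral_features(signal, fs=100.0, band=(10.0, 20.0)):
    """Magnitude spectrum of a wheel-speed signal, restricted to the band of interest."""
    spec = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return spec[mask]

# Synthetic training data: each (invented) class gets a resonance peak at a
# different frequency inside the 10-20 Hz band of interest.
peak_hz = {"reference": 12.0, "broken_damper": 15.0, "noisy_wheel_hub": 18.0}
t = np.arange(0, 4.0, 0.01)                       # 4 s at 100 Hz
X, y = [], []
for label, f0 in peak_hz.items():
    for _ in range(20):
        sig = np.sin(2 * np.pi * f0 * t) + 0.3 * rng.standard_normal(t.size)
        X.append(spectral_features(sig))
        y.append(label)

clf = SVC(kernel="rbf", gamma="scale").fit(np.array(X), y)

def vote(signals):
    """Majority vote over several sampled windows to boost class confidence."""
    preds = [clf.predict(spectral_features(s)[None, :])[0] for s in signals]
    return max(set(preds), key=preds.count)

test = [np.sin(2 * np.pi * 15.0 * t) + 0.3 * rng.standard_normal(t.size)
        for _ in range(5)]
print(vote(test))
```

The vote simply takes the most frequent per-window prediction; a real deployment could instead weight votes by classifier confidence.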

1.4. Thesis outline

The current thesis is organized into the following chapters:

Chapter 1 comprises the introduction, in which the problem, thesis outline, objectives and motivation are highlighted.

Chapter 2 discusses the data processing: the source of the data, the acquisition of the data, and the concerns associated with data acquisition.


Chapter 5 highlights the evaluation measures, along with a naive classifier used as a baseline for comparing the accuracy of the predictions made. The chapter also covers the deployment of the learning algorithms and how their parameters were obtained.

Chapter 6 presents the final outcomes along with a discussion, highlighting the validation results of the machine-learning performance evaluation.


Chapter 2

2. Data pre-processing

Data pre-processing is usually essential for converting data into a form from which condition indicators can easily be extracted. It ranges from straightforward methods, such as removing missing values and identifying and removing outliers, to advanced signal-processing methods, such as short-time Fourier transforms and transformations to the order domain. Deciding which pre-processing techniques to use requires understanding the machine and the type of data available (Lee et al., 2014). For example, when filtering noisy vibration data, determining which frequency range is most likely to show valuable features can indicate which pre-processing techniques to use. In the same way, converting gearbox vibration data to the order domain, used for rotating machines whose rotational speed changes over time, might be handy. But that same pre-processing would not be useful for vibration data from rigid bodies such as an automobile chassis (Yam et al., 2001).
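
The straightforward steps mentioned above, removing missing values and outliers, can be sketched as follows. The wheel-speed sample values are invented (55.0 plays the outlier), and the robust median-absolute-deviation (MAD) rule is one illustrative choice of outlier criterion.

```python
# A small sketch of basic pre-processing: drop missing values, then drop
# outliers using a robust MAD rule (robust because, unlike a plain standard
# deviation, the MAD is not inflated by the outlier itself).
import numpy as np

raw = np.array([16.6, 16.7, np.nan, 16.8, 55.0, 16.7])   # invented samples

clean = raw[~np.isnan(raw)]                        # remove missing values
med = np.median(clean)
mad = np.median(np.abs(clean - med))               # robust spread estimate
clean = clean[np.abs(clean - med) <= 3 * 1.4826 * mad]   # remove outliers

print(clean)  # → [16.6 16.7 16.8 16.7]
```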

Every machine-learning-based system relies heavily on "good data," and the meaning of good data differs from problem to problem. This chapter describes what good data means for this thesis. The thesis works on a captured dataset linking observed classes, data acquired from the front-left wheel, to mechanical faults.

This task came with several challenges, which are discussed in this chapter. Such a dataset had to be produced in order to conduct the thesis project (Hashemian, 2010). Throughout the progression of the thesis, the dataset was constantly updated and improved as purposeful discussions and insights were collected. The chapter begins by discussing what an ideal dataset would look like, then explains how the dataset used was made and the ways in which it differs from the ideal set.

2.1. Data acquisition

In developing a predictive maintenance solution, a well-made strategy is needed to evaluate the operating state of the equipment and notice early faults in a timely fashion. Doing so calls for effective use of both the accessible sensor measurements and one's familiarity with the system. Predictive maintenance is used to evaluate the operating condition of a vehicle, detect faults, and estimate when the next failure is likely to take place. When failures can be detected or forecast, one can increase operating effectiveness, plan maintenance beforehand, cut back idle time, and better manage inventory (Wang, 2016).

As part of data acquisition, the study has considered several factors mentioned below.


dependability and redundancies all have an effect on both algorithm development and cost, as does how different causes of faults translate into observed symptoms. Such cause-effect analysis requires in-depth processing of data from the existing sensors (Garcia et al., 2006).

Physical knowledge concerning the system dynamics. This information might come from mathematical modelling of the system and its faults and from the insights of domain specialists. Understanding system dynamics includes understanding of relations among varied signals from the equipment (such as input-output relations among the sensors and actuators), the operating range of the machine, and the nature of the measurements (for example, periodic, stochastic or constant) (Garcia et al., 2006).

The definitive maintenance objective, such as developing a maintenance timetable or recovering from faults. Designing predictive maintenance algorithms starts with a body of data. The study has managed and processed huge sets of data, including data from numerous machines running at completely different times and under different working environments, and data from various sensors. The study had access to different forms of data: real data from standard system operation, real data from the system working in a faulty condition, and real data from system failures (run-to-failure data) (Swanson, 2001).

For instance, this study had sensor data from system operation such as pressure, vibration, and temperature. In many cases, failure data from equipment are not accessible, or, as a result of consistent maintenance and the relative rarity of failures, only a restricted number of failure datasets exist. In such cases, failure data may be generated from a Simulink® model representing the system's operation under different fault settings. Predictive-maintenance tooling also offers functionality for classifying, organizing, and retrieving such data stored on disk, and additionally offers tools to assist the generation of data from Simulink models for predictive maintenance algorithm development.


The first step was gathering a large set of sensor data demonstrating healthy and faulty operation. It was necessary to gather this data under varied operating circumstances.

The data was collected at HPG using a V90 test vehicle with three fault states:

• Broken damper (mechanical impact was not visible).
• Noisy wheel hub (mechanical impact was not visible).
• Reference (no error).

To acquire variance in the data, diverse tracks and different speeds were used. To get dependable data, each of the cases below was driven twice.

• Handling track 2
• Country road (straight track)
• City traffic road

Logging and analysis:

• Input signals were logged by means of CANoe.
• MATLAB was used for analysis of the test results.

Logging is explained in detail in section 2.5. The CANoe software setup comprises a FIBEX database for the above-mentioned vehicle, through which CANoe logging is written in Binary Logging Format (blf). MATLAB's fft function computes the discrete Fourier transform of a signal using a fast Fourier transform algorithm.

2.2. Tracks

For capturing training data, the following tracks were chosen:

• Track I

Track I is employed to test a vehicle's control and dependability. It is represented in Figure 1. The track has defects, for instance uneven surfaces, badly cambered bends and wheel ruts. One area of the track is divided into truck and car circuits. An earth test track is also connected to the truck circuit.

• Track II

This track imitates the streets around a square city block and is used by vehicles. It is represented in Figure 1. The testing at this track simulates unpredictable traffic and turns at road corners, which places distinctive strain on the driveline and suspension.

• Track III


Asphalt surfaces and variations are also offered by the country road track, for instance American-style concrete expressways, manhole covers and expansion joints, which enable noise tests inside and outside the cars.

Figure 1. Hällered proving ground tracks (Media, Volvo cars, 2010).

2.3. Vehicle

A Volvo V90 was used as the test vehicle for capturing the dataset. The test vehicle was taken from an existing Volvo project; no special modifications were made except for fitting a broken damper for trial 1 and fitting a broken, noisy wheel hub for trial 2. Trial 2 was done on the day after trial 1. Likewise, trial 3 is a reference car with no mechanical errors. The modifications were made at the front left corner of the vehicle.


2.4. Test plan

To get variance in our data, we used a combination of various tracks, different speeds, and different levels of tyre pressure. Every use case was executed twice to get dependable data. Each test trial run of a vehicle is given a name in the dataset. For example, 1.1.1 represents a broken damper with speed fixed at 60 km/h in trial 1. Table 2 displays the dataset nomenclature for handling track 2; likewise, codes are given for the country road and city traffic road. Knowing these codes is imperative, as later in the thesis each trial will be referred to by these codes.

Table 1. Tyre pressure level for test conditions in front left tyre of test vehicle.

                        Day 1       Day 2       Day 3
Tire pressure, warm     265 kPa     255 kPa     275 kPa
Tire pressure, reduced  198.8 kPa   191.25 kPa  206 kPa

Table 2. Sample of dataset nomenclature with different manoeuvres of Track I.

Handling track codes

Code  Dataset code description
1.1   Broken damper, speed fixed at 60 km/h (1, 2)
1.2   Broken damper + low tyre pressure, speed fixed at 60 km/h (1, 2)
1.3   Reference, speed fixed at 60 km/h (1, 2)
1.4   Reference + low tyre pressure, speed fixed at 60 km/h (1, 2)
1.11  Noisy wheel hub + low tyre pressure, speed fixed at 60 km/h (1, 2)
1.12  Noisy wheel hub, speed fixed at 60 km/h (1, 2)

Table 3. Sample of dataset nomenclature with different manoeuvres of Track II.

Country road track codes: 2.1, 2.30, 2.13, 2.5, 2.7, 2.18, 2.17

Dataset code descriptions:
Broken damper, speed fixed at 30 km/h (1, 2)
Reference, speed fixed at 30 km/h (1, 2)
Noisy wheel hub, speed fixed at 30 km/h (1, 2)
Broken damper, speed fixed at 60 km/h (1, 2)
Reference, speed fixed at 60 km/h (1, 2)


Table 4. Sample of dataset nomenclature with different manoeuvres of Track III.

City traffic track codes

Code  Dataset code description
3.1   Broken damper, speed fixed at 30 km/h (1, 2)
3.2   Broken damper + low tyre pressure, speed fixed at 30 km/h (1, 2)
3.3   Reference, speed fixed at 30 km/h (1, 2)
3.4   Reference + low tyre pressure, speed fixed at 30 km/h (1, 2)
3.5   Noisy wheel hub, speed fixed at 30 km/h (1, 2)
3.6   Noisy wheel hub + low tyre pressure, speed fixed at 30 km/h (1, 2)

2.5. Hardware interaction

The wheel-speed signals reside in the chassis electronic control unit (ECU), positioned underneath the driver's seat. The ECU is linked to Vector's licensed CANoe software through a FlexRay breakout cable, with the assistance of the Vector VN7640 hardware interface.


2.5.1. FlexRay

FlexRay is a well-known high-rate bus communications protocol supporting speeds of 10 Mbit/s (Pop et al., 2008). At any given time, only a single ECU can access the bus for reading and writing bits (in this case, the wheel-speed signals). All communication on the bus is made up of frames, each frame bundling messages of many bytes. A message frame is a concatenation of 8-bit bytes {x0, x1, ..., xm-1}.

Critical ECUs must be designed for a fault-tolerant network architecture to ensure fast transmission of critical data. Controller Area Network (CAN) buses are limited in bandwidth and event-driven: only a single message can travel on the CAN bus at a given time, and a priority code has to be assigned to each message (Pop et al., 2008). The disadvantage is that as the number of ECUs increases, the implementation needs very high data-transfer rates to carry the growing number of control and status signals, which must be transmitted quickly and are often time-critical, as with the now-popular steer-by-wire and brake-by-wire systems.
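
As a concrete illustration of the byte-packed frames described above, the sketch below decodes a single 16-bit signal from a hypothetical frame payload. The offset, byte order and scaling factor are invented for the example; in practice they come from the FIBEX database described in the next section.

```python
# Hypothetical sketch of decoding one signal from a frame payload.
# Real offsets, lengths and scalings are defined in the FIBEX database; the
# values here (offset 2, 16-bit big-endian, 0.01 km/h per bit) are assumptions.
def decode_wheel_speed(payload: bytes, offset: int = 2) -> float:
    raw = int.from_bytes(payload[offset:offset + 2], byteorder="big")
    return raw * 0.01  # scale raw counts to km/h

frame = bytes([0x01, 0x00, 0x17, 0x70, 0x00, 0x00, 0x00, 0x00])  # 8-byte payload
print(decode_wheel_speed(frame))  # 0x1770 = 6000 counts → 60.0 km/h
```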

2.5.2. FIBEX

FIBEX (FIeld Bus EXchange) files are XML files describing the FlexRay network. They are used for interaction between the ECU and the PC and contain the signal definitions and configurations that would otherwise have to be set up manually. The FIBEX format is an XML-based standard file format defined by the ASAM consortium to describe automotive networks. FIBEX database formats are compatible with various automotive protocols, making FIBEX a flexible standard (Song et al., 2013).

The FIBEX database is typically generated by the vehicle-network designer and distributed among the engineers who work on particular aspects of a vehicle. Using FIBEX files and a PC interface, or an ECU which supports it, interaction with the vehicle network can easily be established without manually configuring the signal definitions and interfaces (Song et al., 2013). FIBEX covers several aspects of a particular network, including: transmission and receiving schedules, frame and signal definitions, bit-level encoding of signals, network topology, ECU information, and network configuration, including timing and baud rates.

2.5.3. VN7640


It supports Ethernet applications, such as DoIP, and measurement and calibration through XCP on Ethernet. Analogue and digital I/O functionality and a connection for external time synchronization enable accurate time analysis of the communication data. Thanks to its robust housing, the hardware supports interaction between ECU FlexRay frames and a PC, which helps in detecting FlexRay frames and symbols for bus-supported transmission and reception of data frames (Navet & Simonot-Lion, 2008). The device provides two FlexRay channels, A and B, with D-SUB 9 connectors that attach via the FlexRay breakout cable to the ECU, and a USB connection to the PC.

2.5.4. CANoe

CANoe is a software tool for developing, testing and analysing individual ECUs and entire ECU networks. The application areas of CANoe relevant to the present thesis are analysis and logging (Park et al., 2013). Data acquired from the front-left wheel signals in the ECU frames are analysed using the software's Trace window and logged in the desired Binary Logging Format (blf).

Figure 4. CANoe simulation FIBEX database and measurement setup for logging trial 1.1.1.

2.5.5. Logging & Format conversions

Logging files can be exported from the analysis blocks and converted to the desired file formats. CANoe supports two kinds of log formats: message-based log formats, which store all bus communication and events, and signal-based log formats, which store the exact signal values extracted from the message-based signals, values that are not directly available on the communication bus.


Chapter 3

3. Ideal dataset

The dataset's main use is implementing supervised classification algorithms. This is achieved using the database of converted signal structures obtained from the captured message-based files. The number of signals within each event frame is very large, and every vehicle trial run contains a high number of events. This makes the signal databases enormously large and demands more computational time and power during classification. However, only a single signal, the front-left-corner speed signal, is needed.

Therefore, from every signal dataset the front-left-corner speed signal has to be extracted and structured into a data table or matrix, so that one column corresponds to a single trial code, such as 1.1.1, and the other column holds the extracted signal-value array. This was a challenging task because some signal names are 16-character strings, some 8, and others 32, so a script searches for the signal names only and stores them in the structures mentioned above. This reduces the dimensionality of the dataset and avoids any loss of information during analysis of the wheel-speed signals.
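
The extraction step described above can be sketched as follows. The signal names, trial codes and values are invented for illustration; the point is matching the front-left wheel-speed signal by name despite differing naming schemes and string lengths, and collecting the matches into one trial-code-to-array table.

```python
# Minimal sketch of extracting the front-left wheel-speed signal from each
# logged trial into a single lookup table. All names and values are invented.
logs = {
    "1.1.1": {"WhlSpdFrntLe": [16.6, 16.7, 16.7], "WhlSpdFrntRi": [16.5, 16.6, 16.6]},
    "1.3.1": {"WheelSpeedFrontLeft": [16.8, 16.8, 16.9], "YawRate": [0.01, 0.02, 0.01]},
}

def is_front_left(name: str) -> bool:
    """Match front-left naming schemes of differing lengths."""
    key = name.lower().replace("_", "")
    return "frontleft" in key or ("frnt" in key and key.endswith("le"))

table = {}
for trial_code, signals in logs.items():
    for name, values in signals.items():
        if is_front_left(name):
            table[trial_code] = values  # one column per trial code

print(table["1.1.1"])
```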

3.1. Reducing dimensionality

In supervised machine learning (ML), there are input variables (X) and an output variable (Y), and an algorithm learns the mapping function from input to output, 𝑌 = 𝑓(𝑋). The aim is to approximate the mapping function so accurately that, given new input data (X), the output variable (Y) can easily be predicted. This is called supervised learning (SL) because the process of an algorithm learning from a training dataset can be compared with a teacher supervising the learning process (Guest & Smith Genut, 2010). The correct answers are known, and the algorithm makes iterative predictions on the training data and is corrected continuously until it achieves an acceptable performance level. In classification problems there are numerous factors on which the final classification is made. These factors are simply variables or features, and more features cause greater visualization difficulty when working with training sets.
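
A toy instance of approximating the mapping Y = f(X) described above, using a one-nearest-neighbour rule on invented two-feature data. This is purely illustrative, not the thesis's classifier.

```python
# Learn f: X -> Y from labelled examples by predicting the label of the
# closest training point. The feature values and labels are invented.
train = [([1.0, 1.0], "healthy"), ([1.2, 0.9], "healthy"),
         ([3.0, 4.0], "faulty"),  ([3.2, 4.1], "faulty")]

def f_hat(x):
    """Approximate f(x): return the label of the nearest training example."""
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(train, key=lambda pair: dist(pair[0], x))[1]

print(f_hat([1.1, 1.0]))  # → healthy
```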

Intuitive examples of dimensionality reduction are seen in object classification, where deciding whether an object belongs to a class involves considering numerous features. 3-D classification problems are hard to visualize, whereas 2-D classification is easily mapped in a simple 2-D space, and 1-D problems can be visualized as simple lines (Hinton & Salakhutdinov, 2006).


original input sizes. This is useful for modelling temporal data, so that models do not violate the temporal order of the dimensions.

3.2. Signal selection and generating classes

The increasing complexity of traffic situations on roads places high demands on the car driver. Driver-assistance systems can relieve the burden on drivers and optimize road safety. Advanced driving-assistance systems, now standard in new cars, pose new challenges for carmakers. Nowadays, vehicle electronics play a major role in providing safety and comfort to drivers. Optimal interaction between complex electronic systems and drivers ensures that vehicles operate without problems, which increases road safety. Intelligent communication between electronic vehicle systems is sustained by the use of sensors. These are used by control units in driving systems such as ABS, ESP, TCS or ACC to detect wheel speeds. Information about wheel speed is also provided to various other systems through data lines from the ABS control unit. The importance of selecting the wheel-speed sensor signal value is justified by previous work (US patent Number 16/410,850) related to this thesis. Signals such as the lateral acceleration, longitudinal acceleration, engine speed, yaw rate, engine torque, longitudinal speed, lateral speed, wheel speed for the fault corners, and relative wheel speed have been labelled using statistical data analytics corresponding to the error states. A decision-tree classification algorithm is used.
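
A sketch of such a decision-tree labelling step is shown below, using scikit-learn rather than the thesis's tooling. The two features per trial (peak frequency in Hz and peak magnitude) and all class values are invented for illustration.

```python
# Decision-tree classification of labelled trials on invented
# (peak frequency, peak magnitude) features.
from sklearn.tree import DecisionTreeClassifier

X = [[12.0, 4.1], [12.2, 4.0],      # reference
     [15.1, 7.9], [14.9, 8.3],      # broken damper
     [18.0, 6.2], [17.8, 6.0]]      # noisy wheel hub
y = ["reference", "reference", "damper", "damper", "hub", "hub"]

# Depth 2 suffices: one split separates reference, a second separates
# damper from hub along the peak-frequency axis.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(tree.predict([[15.0, 8.0]])[0])  # → damper
```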


Figure 5. Data visualization of error isolation evaluation of different class of errors (US patent Number 16/410,850).

3.3. Analysed data

A key step in the development of predictive maintenance algorithms is identifying condition indicators (CIs): features of the system data whose behaviour changes in a predictable way as the system degrades. A CI can be a useful feature for distinguishing normal from faulty operation or for predicting the remaining useful life. Useful CIs tend to cluster by system status, which helps to tell different statuses apart. Condition indicators can be quantities derived through simple analysis, such as the mean value of the data over time; more complex signal analyses, such as the frequency of the peak magnitude in a signal spectrum or statistical moments describing changes in the spectrum over time; or model-based data analyses, such as the maximum eigenvalue of a state-space model estimated from the data. Combinations of several features can also be fused into a single effective condition indicator.
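As a sketch of CI extraction (not the thesis code), the following pure-Python example derives two of the indicators mentioned above, the signal mean and the frequency of the peak spectral magnitude, via a plain DFT; the sampling rate and test signal are assumed for illustration:

```python
# Illustrative sketch of extracting two condition indicators from a sampled
# signal: its mean value and the frequency of the peak magnitude in its
# spectrum, computed with a plain (slow) DFT from the standard library.
import cmath
import math

def condition_indicators(signal, fs):
    n = len(signal)
    mean = sum(signal) / n
    mags = []
    for k in range(n // 2 + 1):               # bins up to the Nyquist frequency
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        mags.append(abs(coeff))
    mags[0] = 0.0                             # ignore the DC component
    peak_bin = max(range(len(mags)), key=mags.__getitem__)
    return mean, peak_bin * fs / n            # (mean, peak frequency in Hz)

fs = 100                                      # assumed sampling rate [Hz]
sig = [1.0 + math.sin(2 * math.pi * 15 * t / fs) for t in range(fs)]
mean, peak = condition_indicators(sig, fs)    # mean ~ 1.0, peak = 15.0 Hz
```

In practice an FFT replaces the O(n²) loop, but the extracted indicators are the same.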


Figure 6 shows the different FFT profiles analysed. As can be seen, different faults have different profile characteristics, especially between 5–20 Hz; these key features of the curves will be used for detection of faults in real time (US patent number 16/410,850).

Figure 6. Wheel speed signals for 4 tyres.


Figure 8. Analysed data sample as shown in Table 3.


Figure 10. Analysed data sample in Table 4.

3.3.1. Matlab Fourier transform sampling frequency

In our study, the sampling makes use of Matlab Fourier transforms in the frequency region of interest, ranging from 10 Hz to 20 Hz, for making comparisons with other signals of different types and classes. The Fourier transform is used to perform power and frequency spectrum analysis of signals in the time domain. It reveals the frequency components of a signal based in time or space and represents them in frequency space, and it can identify the frequency components of signals corrupted by random noise (Qi et al., 2011).

By plotting the power spectrum as a function of frequency, components that noise disguises in the time domain are revealed in the form of power spikes. In applications where the power spectrum is centred on the 0 frequency, because this represents the periodicity of the signal better, it is convenient to use the fftshift function to perform a circular shift on Y and plot the 0-centred power.
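The circular shift that Matlab's fftshift performs can be sketched in a few lines of Python; the bin ordering below assumes the standard DFT output layout (non-negative frequencies first, then negative ones):

```python
# Sketch of the circular shift performed by fftshift: move the negative-
# frequency half of the DFT output in front of the positive half, so the
# 0-frequency bin ends up in the middle of the plotted spectrum.
def fftshift(x):
    half = (len(x) + 1) // 2          # split point, handles even and odd n
    return x[half:] + x[:half]

bins = [0, 1, 2, 3, -4, -3, -2, -1]   # DFT bin frequencies for n = 8
assert fftshift(bins) == [-4, -3, -2, -1, 0, 1, 2, 3]
```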


3.3.2. Different speed and different training classifiers

Classification is used for predicting the class of a given data point; the classes are also called targets, categories or labels. Predictive modelling approximates the mapping function f from the input variable X to the discrete output variable Y.

The training process takes content whose classes are known and creates classifiers on the basis of that known content. The classification process then takes the classifiers built from the training content and runs them on unknown content, determining the class membership of that content. Training is an iterative process in which the best possible classifiers are built, whereas classification is a one-time process designed to run on unknown content (Lewis et al., 1996).

MarkLogic Server classifiers implement support vector machines, or SVMs, which use well-known algorithms for determining membership of given classes based on the training data. The main idea is to build classifiers by taking sets of training content representing the known classes, performing statistical analysis of that content, and using the knowledge obtained to decide the class of unknown content; the classifiers can thus characterise content on the basis of the statistical analysis carried out during training (Lewis et al., 1996).

Initially, classes are defined for the set of training content, and the classifiers use these classes for analysing other content to determine its classification. After the classifiers have analysed the content, there can be conflicting measurements that help determine whether the information in new content belongs in or out of a class. Precision is the probability that content classified as being in a class actually is in that class (Swanson, 2001). High precision can come at the cost of missing some results that resemble results from other classes.


3.4. Train model

At the core of predictive maintenance algorithms are the prediction models, which analyse the extracted CIs to determine the current system condition (diagnosis and fault detection) or to predict future conditions, such as the remaining useful life. Diagnosis and fault detection depend on using CI values to distinguish faulty from healthy operation and to define different fault types (Civerchia et al., 2017). The simplest fault-detection model is a threshold on a CI that indicates a faulty condition whenever the limit is exceeded. A different model may compare a CI with a statistical distribution of indicator values to determine the likelihood of a particular fault state. More complex fault-diagnosis approaches train a classifier to compare the current values of several CIs with the values associated with fault states and report the likelihood that a fault state is present. While designing a predictive maintenance algorithm, different fault-detection and diagnosis models can be tested with different CIs; this step of the design process is therefore likely to be iterative, revisiting CI extraction as different CIs, different combinations of CIs and a variety of decision models are tried. The Statistics & Machine Learning Toolbox and various other toolboxes with functionality for training decision models such as regression models and classifiers are useful in this respect (Civerchia et al., 2017).

In our study, the aim of the support vector machine algorithm is to find the hyperplane in an N-dimensional space, with N the number of features, that distinctly classifies the data points. Support vectors are the data points closest to the hyperplane, and they influence its position and orientation. Using the support vectors, the classifier margin can be maximised; deleting a support vector changes the position of the hyperplane, which is why these points are said to build the SVM.

In the present SVM algorithm it is necessary to maximise the margin between the hyperplane and the data points, and the loss function that helps maximise the margin is the hinge loss.
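A hedged sketch of the hinge loss for a labelled point, assuming labels in {-1, +1} and a linear decision function w·x + b (the specific weights and points are chosen here for illustration):

```python
# Hinge loss for one labelled point: max(0, 1 - y * (w.x + b)).
# Zero outside the margin, linear inside it, and growing for misclassification.
def hinge_loss(w, b, x, y):
    decision = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * decision)

w, b = [1.0, 0.0], 0.0
assert hinge_loss(w, b, [2.0, 0.0], +1) == 0.0   # correct, outside margin
assert hinge_loss(w, b, [0.5, 0.0], +1) == 0.5   # correct, inside margin
assert hinge_loss(w, b, [-1.0, 0.0], +1) == 2.0  # misclassified point
```

Minimising the sum of these losses (plus a regulariser on w) is what pushes the SVM toward a maximum-margin hyperplane.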

As shown in Figures 11–13, the proposed system trains classifiers for 3 speed limits, each described by 14 features.


Figure 12. Trained classifier at speed 60 km/h.

Figure 13. Trained classifier at speed 80 km/h.


Chapter 4

4. Theoretical framework

4.1. Introduction

This chapter gives an overview of the scientific basis and methods significant for the thesis. The key objective is to introduce key technical concepts and to highlight some important features of the methods. The first section sheds light on iTPMS and statistical learning, and then on a few technical methods, including concepts such as hyperplanes, classification methods and separability. Overall, the chapter presents the theoretical findings associated with the key concepts of this research.

4.2. Overview of Tire Pressure Monitoring Systems (TPMS)

The air pressure inside the pneumatic tires of a vehicle is monitored using a tire-pressure monitoring system (TPMS). The system reports the tire pressure to the driver through a pictogram display, a gauge or sometimes a low-pressure warning light (Ishtiaq Roufa et al., 2010). TPMS can be either indirect (iTPMS) or direct (dTPMS). The key objective of TPMS is to avoid traffic accidents and poor fuel economy caused by under-inflated tires by identifying hazardous tires well beforehand. TPMS is provided both aftermarket and at the OEM/factory level. The technology first became common in Europe's luxury vehicles in the 1980s and was adopted universally in the USA after the 2000 TREAD Act was passed following the Firestone and Ford tire controversy (Velupillai & Guvenc, 2007). TPMS technology mandates have since been passed in Russia, South Korea, Japan, the EU and various other Asian countries.

4.2.1 Indirect TPMS (iTPMS)

The current study is greatly inspired by the functionality of iTPMS. An iTPMS does not use physical pressure sensors; instead, air pressure is monitored by supervising the rotational speeds of the wheels and various other signals available outside the tire itself. Describing the principles of TPMS, Schofield & Lynam (2004) note that first-generation iTPMS follow the principle that an under-inflated tire has a slightly smaller rolling diameter than an adequately inflated one. Second-generation iTPMS can also detect simultaneous under-inflation of all four tires by analysing the vibration spectrum of each wheel.


According to Gotschlich (2004), factory installation of TPMS became compulsory for all new passenger vehicles in the EU in November 2014, following which multiple iTPMS have received type approval as per UN Regulation R64. Implementations are found in models from the VW group as well as Renault, FIAT, PSA, Mazda, Ford, Opel, Volvo and Honda (Gotschlich, 2013). iTPMS are known to be sensitive to the influence of different tires and to external factors such as driving style, speed and road surface. iTPMS involves no additional hardware, spare parts or electronic/toxic waste, and it is customer friendly and easy to use. With time, iTPMS is gaining market share within the EU and may soon be the leading TPMS technology (Löhndorf et al., 2007).

Löhndorf et al. (2007) consider iTPMS theoretically less accurate, but variations in temperature alone can change the pressure by the same magnitude as valid detection thresholds. Manufacturers therefore value its user-friendliness and robustness to wheel/tire changes more than the theoretical accuracy of direct TPMS. TPMS must be tested from the customer's perspective by constantly monitoring pressure losses. A real pressure loss is rarely sudden: the majority of punctures result in steady, gradual losses over minutes or hours, and pressure losses from microscopic leaks in valves or from diffusion may take months to reach a detectable magnitude (Löhndorf et al., 2007). A sudden step of around 20% in tire pressure is thus the exception, and TPMS should be sensitive mainly to gradual pressure loss. The step response is of little practical significance even though the legal requirements are formulated as a pressure step test; on the other hand, an approval test cannot take as long as 6 months, and a pressure step test is easy to define and easy to repeat. A TPMS approved on a pressure step test must therefore also detect steady, real pressure drops.

An iTPMS does not report the actual pressure in each tyre and only functions while the vehicle is moving. It also cannot warn when two tyres on the same axle or the same side are equally under-inflated, or when the pressure in all four tyres is equally low. iTPMS is based on two fundamental principles: wheel radius analysis and Fast Fourier Transform (FFT) analysis at different pressure values. The current study proposes a solution for detecting different component failures aligned with the functioning of iTPMS.

In the current study, failures in mechanical parts resulting from fatigue are classified and predicted with a supervised machine learning algorithm by leveraging the existing wheel-speed-related data. The breakdown of different components will affect the model differently, producing resonance peaks at varying frequencies, analogous to what iTPMS uses today. Since the wheel speed sensor FFT is sensitive to the tire model, it should in theory also work for the wheel bearing and the damper.


The collected data is used to train a machine learning model which can identify the anomalies and map them to the various faults; the algorithm is then deployed and integrated within the system.

4.2.2. Wheel Radius Analysis (WRA)

According to Hall et al. (2007), TPMS based on analysis of the wheel radius relies on the fact that a wheel's rolling radius decreases when tire pressure is lost. Supervision of the wheel speeds is by far the most practical approach for detecting under-inflation, but it has its own advantages and disadvantages (Hall et al., 2007). The majority of such TPMS approaches are based on static non-linear relations for detecting under-inflation. One example is:

r = ω₁/ω₂ − ω₃/ω₄    (1)

The wheel speeds are denoted ωᵢ and numbered front left (1), front right (2), rear left (3) and rear right (4). A pressure loss is indicated when the test statistic r is non-zero. The key drawback of this static consistency test is that it fails to identify equal pressure losses on the same side or the same axle (Persson, Gustafsson & Drevö, 2002). If all wheels are assumed to travel at the same velocity, so that each ωᵢ is inversely proportional to the rolling radius rᵢ, the statistic can be written as:

r = r₂/r₁ − r₄/r₃ = (r₂r₃ − r₁r₄)/(r₁r₃)    (2)

An equal pressure loss in both rear tires does not affect the test statistic r. For this reason, lateral and longitudinal vehicle dynamics are used in addition to the static non-linearity. The longitudinal dynamics compare driven and non-driven wheel speeds through a friction model, whereas the lateral dynamics compare left and right wheel speeds through a yaw-rate model (Gerdin et al., 2014).
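Equation (1) and its same-axle blind spot can be checked numerically. The sketch below assumes a common vehicle speed, so that each wheel speed is v divided by its rolling radius; the radii are illustrative values:

```python
# Numerical check of Eq. (1), r = w1/w2 - w3/w4, assuming all wheels travel
# at a common vehicle speed v so that each wheel speed is v / rolling radius.
def wra_statistic(radii, v=20.0):
    w = [v / r for r in radii]                 # wheel angular speeds
    return w[0] / w[1] - w[2] / w[3]

nominal = [0.30, 0.30, 0.30, 0.30]             # illustrative radii [m]
one_low = [0.30, 0.30, 0.297, 0.30]            # rear-left radius down 1 %
axle_low = [0.30, 0.30, 0.297, 0.297]          # both rear radii down 1 %

assert abs(wra_statistic(nominal)) < 1e-12     # healthy: r = 0
assert abs(wra_statistic(one_low)) > 1e-3      # single deflated tire: r != 0
assert abs(wra_statistic(axle_low)) < 1e-12    # same-axle loss goes unseen
```

The last assertion is exactly the drawback noted above: an equal loss on one axle cancels out of the statistic.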

According to Jiapeng & Xiaofeng (2009), a friction model can detect under-inflation. A model of this kind uses the linear region of a classical longitudinal slip model, where the slip is computed by comparing the wheel speeds of the rear and front wheels (Jiapeng & Xiaofeng, 2009).


Standard sensors are used for computing the engine torque and wheel speed. A wheel slip can be defined as the difference computed between the longitudinal and circumferential velocity.

s = (ωr − vₓ)/vₓ    (3)

For small values of the normalised traction force μ = Fₓ/N, the wheel slip is a linear function of μ plus the relative difference in tire radius between the front and rear wheels:

s = μ/k + δ    (4)

Here k is the longitudinal stiffness; the inverse stiffness 1/k and the offset δ form the state vector of a discrete-time state-space model for the left side of the vehicle (Jiapeng & Xiaofeng, 2009).

xₜ₊₁ = xₜ + vₜ    (5)

An independent, analogous model is applied for the right side of the vehicle. Inequality in the tire radii results in an offset δ, which is used by the TPMS to indicate tire inflation pressure. The state-space model is directly amenable to a Kalman filter, and a suitable trade-off can be achieved between tracking speed and noise reduction (Höpping & Augsburg, 2014).
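How a Kalman filter with the random-walk state model of Eq. (5) can track a slowly varying offset δ from noisy measurements can be sketched as follows; the deterministic noise stand-in, the tuning values q and r, and the 2% offset are illustrative assumptions, not thesis data:

```python
# Scalar Kalman filter for the random-walk state model x_{t+1} = x_t + v_t,
# tracking a constant radius offset delta hidden in noisy measurements.
import math

def kalman_track(measurements, q=1e-6, r=1e-2):
    x, p = 0.0, 1.0                  # state estimate and its variance
    for z in measurements:
        p += q                       # predict step (random-walk model)
        k = p / (p + r)              # Kalman gain
        x += k * (z - x)             # correct with the new measurement
        p *= 1.0 - k
    return x

true_delta = 0.02                    # assumed 2 % relative radius offset
# Deterministic sinusoidal stand-in for measurement noise (illustrative)
zs = [true_delta + 0.05 * math.sin(1.7 * t) for t in range(500)]
estimate = kalman_track(zs)          # converges close to true_delta
```

A small process noise q makes the filter average over many samples (strong noise reduction), while a larger q would track a changing δ faster: exactly the trade-off mentioned above.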

4.2.3. Tire model

The figure below depicts a tire model in the form of a spring-damper system in both the torsional and the vertical direction.


Figure 16. Wheel bearing modelled as spring-damper system.

The spring-dampers are distributed in the axial, tangential and radial directions to approximate the tyre sidewalls. The stiffness of the sidewall consists of a pressurised-membrane contribution and a structural contribution: the latter results from the tensile, shear and bending stiffness of the sidewall, and the former from the sidewall deflection.

Figure 17. Wheel and tyre dynamic model (quarter-car model, Chegg 2012).

The quarter-car model comprises the wheel and its attachments, a quarter of the chassis mass, the suspension elements and the tire. The tire is modelled as a linear spring without damping, the wheel has no rotational motion, the body, damper and spring behave linearly, and the tire stays in contact with the road surface. Friction effects are ignored, so residual structural damping is not included in the vehicle model (Businesswire, 2012).


A study by Persson et al. (2002) discusses the vertical spring-damper system, which makes use of two different phenomena. Firstly, the inflation pressure determines the rolling radius, and the radius decreases when the pressure decreases. This increases the rotational speed of the tire. δr denotes the change in radius as a result of deflation:

ω = v/(r − δr)    (6)

Secondly, the roughness of the road makes the tire vibrate vertically, which causes the rolling radius to fluctuate at a particular resonance frequency. This in turn causes an indirect fluctuation of the wheel speed at the same resonance frequency.

Figure 18. Resonance frequency (Persson, Gustafsson & Drevö, 2002).

A reduction in the inflation pressure of the tire decreases the spring constant, causing a reduced resonance frequency in the range of 10–20 Hz; Δk denotes this reduction of the spring constant, and this vertical mode is the most relevant vibration mode. The figure above shows a smoothed FFT of a wheel speed signal for three test runs with 100%, 85% and 70% of the nominal inflation pressure, respectively.

ω = √((k − Δk)/N)    (7)

Here, N denotes the normal load applied on the wheel. The roughness of the road also excites the torsional spring-damper system: the tire vibrates in the torsional direction, which directly affects the wheel speed. A reduced inflation pressure likewise reduces this vibration frequency. The most common mode for such torsional vibration lies at 40–50 Hz (Persson, Gustafsson & Drevö, 2002).
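A worked example of Eq. (7), with illustrative (assumed) stiffness and load values, shows how a stiffness reduction moves the resonance within the lower part of the 10–20 Hz band:

```python
# Worked example of Eq. (7): omega = sqrt((k - delta_k) / N), converted to Hz.
# Stiffness and load values are illustrative assumptions, not measured data.
import math

def resonance_hz(k, delta_k, n):
    return math.sqrt((k - delta_k) / n) / (2.0 * math.pi)

n_load = 40.0                        # assumed effective load on the wheel [kg]
k_nominal = 4.0e5                    # assumed nominal stiffness [N/m]
f_full = resonance_hz(k_nominal, 0.0, n_load)      # fully inflated tire
f_low = resonance_hz(k_nominal, 1.0e5, n_load)     # stiffness reduced by 25 %
```

With these values the resonance drops from about 15.9 Hz to about 13.8 Hz, the kind of downward shift the spectrum analysis looks for.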

4.2.4. Voting scheme for iTPMS


The wheel radius analysis and the vibration analysis differ in sensitivity and in robustness for detecting pressure loss under various driving conditions. A combination of the two approaches helps detect loss of pressure in multiple tires within about a minute (Persson, Gustafsson & Drevö, 2002).

Both the wheel radius analysis and the vibration analysis contain information about the tire inflation pressure. They are usually deployed independently but have to be combined for best performance. The wheel radius analysis estimates the relative radii between the left/right and front/rear pairs of wheels, and the logic determines whether any of the tires is under-inflated. Both analyses require calibration after the pressure has been altered in one or more tires; the calibration establishes the nominal values of tire radius and resonance frequency. Neither method yields an absolute tire pressure, which is why a voting scheme is applied. The working of the voting scheme is depicted below:

Figure 19. Time for detection (Persson, Gustafsson & Drevö, 2002).

When both flags indicate the same under-inflated tire, t₁ = t₂, a fixed threshold is applied to the combined confidence, (c₁ + c₂)/2 > h₀, and a positive result triggers a warning. When only one method flags under-inflation, its confidence is tested against its own fixed threshold, c₁ > h₁ or c₂ > h₂; a positive test result triggers a warning, and no warning is issued otherwise. Note that the combined threshold is comparatively low, h₁, h₂ > h₀, when both the wheel radius and vibration analyses indicate under-inflation of the same tire (Persson, Gustafsson & Drevö, 2002). This reflects the higher confidence gained from agreement between the methods, so small pressure changes can be detected quickly and accurately.
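One plausible reading of this voting logic, sketched in Python with hypothetical threshold values:

```python
# Hypothetical sketch of the voting scheme: if both analyses flag the same
# tire, test the averaged confidence against the lower threshold h0; if only
# one flags a tire, test its confidence against the stricter h1 or h2.
def warn(flag1, flag2, c1, c2, h0=0.5, h1=0.8, h2=0.8):
    if flag1 == flag2:                    # both methods point at the same tire
        return (c1 + c2) / 2 > h0
    return c1 > h1 or c2 > h2             # disagreement: require more evidence

assert warn(3, 3, 0.6, 0.6)               # agreement, moderate confidence
assert not warn(3, 1, 0.6, 0.6)           # disagreement, neither confident
assert warn(3, 1, 0.9, 0.1)               # disagreement, one very confident
```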

4.3. Statistical learning


Reinforcement learning, semi-supervised, unsupervised and supervised learning are the main types of machine learning. They take different approaches but share the same underlying theory and process (Friedman et al., 2001). The current study applies machine learning concepts to predict results and to characterise the error states in the developed solution.

4.3.1. Supervised learning

Supervised and unsupervised learning are the two most important branches of machine learning. In supervised learning an AI agent has access to labels, which it can use to improve its performance. For example, in the e-mail spam-filter problem there is an array of e-mails comprising text, and labels identify which of them are spam; these labels assist a supervised learning AI in distinguishing spam from other e-mail (Hastie et al., 2005). There are no labels in unsupervised learning, so the AI agent's task is not fully defined and its performance is harder to measure. An unsupervised learning problem is less well-defined than a supervised one and more difficult for the AI agent to solve, but the solution is powerful if approached in a structured manner: an unsupervised system can be superior to a supervised one in discovering novel patterns in future data, which makes unsupervised solutions quick to adapt (Vapnik, 2013).

Highlighting concepts of deep learning, Schmidhuber (2015) asserts that supervised learning is suitable for maximising performance on tasks with many labels. Consider, for instance, a huge dataset of labelled object images. When the dataset is big enough and apt machine learning algorithms and powerful computers are used for training, an excellent image classification system can be built based on supervised learning (Schmidhuber, 2015). An AI agent training on supervised data measures its performance by comparing the true image label with the predicted one; the objective is to minimise the cost function so that the error on unseen images is low. Labels are powerful because they allow the AI agent to measure its errors, and this error measure is then used to improve performance over time (Simeone, 2018). Without labels, the AI cannot learn to classify the images correctly. Nevertheless, manual labelling of an image dataset is an expensive affair, and even the best image datasets contain only a few thousand distinct labels. This is a problem because supervised learning systems are excellent at classifying images of objects for which they have labels but poor at classifying images of objects for which they do not. Supervised learning systems are powerful but restricted in generalising knowledge to items that are not labelled.

It has been seen that most of the global data is not labelled and thus supervised learning restricts the ability of AI in extending its performance to unseen instances (Kumar & Tiwari, 2017).

4.3.2. Unsupervised learning


Unsupervised learning is not guided by any labels; it works by identifying the structure of the data (Qiu et al., 2016). It does so by representing the training data with a number of parameters that is small compared with the number of examples in the dataset. Such representation learning helps in identifying the various patterns existing within the dataset.

Unsupervised learning makes it possible to attack complex problems and is quick at finding hidden patterns in historical and future data; it offers an AI approach for the huge amount of unlabelled data that exists all over the world (Jordan & Mitchell, 2015). Unsupervised learning is less capable on well-defined problems but superior for open-ended ones, and it can address several common problems scientists face while developing machine learning solutions. The examples in unsupervised learning comprise only input data, with hardly any labelled examples, yet complex and interesting patterns hidden in the unlabelled data can still be found.

A common practical analogy for unsupervised learning is sorting coloured coins into distinct piles: nobody teaches the sorting, but the different colours are enough to group the coins correctly. As an algorithmic example, t-SNE clusters hand-written digits into groups on the basis of their distinct characteristics (Doersch et al., 2015).

Sometimes unsupervised learning is more complicated than supervised learning, because removing the supervision leaves the problem inadequately defined and the algorithm may not find the right patterns. Learning a skill offers a good analogy: an individual taught guitar by a teacher learns quickly, re-using the supervised knowledge of rhythms, chords and notes, whereas someone learning alone might struggle to find a starting point. On the other hand, an unsupervised style begins with an empty, unbiased slate and can find new and better ways to solve a problem. This is also why unsupervised learning is referred to as knowledge discovery: it plays an important role in exploratory data analysis via clustering (Wang, 2016).

4.3.3. Linear separability classification

Classification groups similar data points into separate sections in order to assign them classes. Machine learning seeks the rules needed to distinguish the different data points. There are a number of ways to discover such rules; the key focus here is on using the data and their answers (labels) to discover rules that separate the data points linearly (An et al., 2015).


A boundary of this kind is referred to as a decision surface: a data point falling on a given side of the boundary is assigned to the corresponding class.

4.3.4. Separating hyperplane

A hyperplane divides a D-dimensional space into two halves. Its outward-pointing normal vector w ∈ ℝᴰ is orthogonal to every vector lying on the hyperplane (Dimca, 2017).

Assumption: the hyperplane passes through the origin. If it does not, a bias b is introduced, and both w and b are then required to define it; b > 0 corresponds to a parallel shift of the hyperplane along w.

Figure 20. Hyperplane.

Consider a p-dimensional feature space X with p > 1: a hyperplane in X is a flat affine subspace of dimension p − 1 (Dimca, 2017).
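A minimal illustration of a hyperplane as a decision surface: the sign of w·x + b tells which half-space a point falls in (the specific w, b and points below are arbitrary choices):

```python
# A hyperplane {x : w.x + b = 0} splits the space into two halves;
# the sign of w.x + b identifies the half a given point lies in.
def side(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

w, b = [1.0, 1.0], -1.0              # the line x1 + x2 = 1 in 2-D
assert side(w, b, [1.0, 1.0]) == 1   # above the line
assert side(w, b, [0.0, 0.0]) == -1  # below the line
```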

4.3.5. Perceptron algorithm

In machine learning, the perceptron is an algorithm for supervised learning of binary classifiers: a function that decides whether an input, represented by a vector of numbers, belongs to a particular class or not. It is a linear classifier that makes its predictions with a linear predictor function combining a set of weights with a feature vector (Peña & Soheili, 2016). Because the perceptron is a linear classifier, it can never classify all input vectors correctly when the training set D is not linearly separable.


Training on a non-separable set does not break the algorithm but will result in a failure to learn (Schuld et al., 2015). When the linear separability of a training set is unknown, a variant of the training scheme should therefore be used.

A study by Schuld et al. (2015) describes how the perceptron algorithm converges to some solution on a linearly separable training set, but may pick solutions of widely differing quality. The perceptron of optimal stability, known as the linear support vector machine, was designed to resolve this problem (Schuld et al., 2015). Stochastic gradient descent is used to minimise the distance of the misclassified points from the decision boundary: the parameters θ are updated after every single observation, rather than by summing the gradient contributions of all observations and then taking one step in the negative gradient direction. For a misclassified observation xᵢ, i ∈ M, the parameters are updated by a scheme in which η > 0 is the learning-rate parameter (Kapoor et al., 2016).

Thus, the hyperplane changes direction for each misclassified observation, or is shifted parallel to itself. When the classes are linearly separable, the perceptron algorithm finds a separating hyperplane that depends on the starting values, and it converges after a finite number of steps. The perceptron algorithm is not flawless, and its main problems can be summarised as follows: the data must be linearly separable for the algorithm to converge (Kapoor et al., 2016); linearly separable data admits infinitely many separating hyperplanes, and which one the algorithm finds depends on the initial conditions; and although the number of steps is finite, it can at the same time be very large.
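The perceptron update rule described above can be sketched on a small linearly separable toy set (the data and learning rate are illustrative choices):

```python
# Perceptron training: for each misclassified point (x_i, y_i) update
# w <- w + eta * y_i * x_i and b <- b + eta * y_i, and repeat over the data
# until an entire pass produces no errors (possible only if separable).
def train_perceptron(points, labels, eta=1.0, max_epochs=100):
    w, b = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, y in zip(points, labels):
            if y * (w[0] * x[0] + w[1] * x[1] + b) <= 0:   # misclassified
                w = [w[0] + eta * y * x[0], w[1] + eta * y * x[1]]
                b += eta * y
                errors += 1
        if errors == 0:                                    # converged
            break
    return w, b

points = [[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]]
labels = [1, 1, -1, -1]
w, b = train_perceptron(points, labels)
```

On this set the loop terminates after a couple of epochs; on non-separable data it would simply exhaust max_epochs without ever reaching zero errors, which is the failure mode discussed above.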

4.4. Support vector machine algorithm

A Support Vector Machine (SVM) is a discriminative classifier defined by a separating hyperplane. In simple terms, given labelled training data (supervised learning), the algorithm outputs an optimal hyperplane which categorises new examples. In a 2-D space this hyperplane is a line dividing the plane into two parts, with one class on each side (Xu et al., 2018).

The current study implements the SVM algorithm to find a hyperplane in an N-dimensional space (N — the number of features) that distinctly classifies the data points. Support vectors are data points that are closer to the hyperplane and influence the position and orientation of the hyperplane. Using these support vectors, we maximize the margin of the classifier.


The SVM algorithm is applicable in many scientific and biological areas; for example, it can classify proteins with up to 90% precision. SVM weights are used in many fields for interpreting SVM models. SVMs follow the notion of decision planes, which define decision boundaries: a decision plane separates objects with different class memberships (Meyer & Wien, 2015). The illustration below depicts a schematic example in which the objects belong to either a RED or a GREEN class. The separating line defines a boundary with the GREEN objects on its right side and the RED objects on its left; any new object falling on the right side is labelled GREEN, and vice versa.

Figure 21. Cluster.

The diagram above is an example of a linear classifier, which separates a set of objects into their groups with a line. Most classification tasks, however, are not this simple, and more complicated structures are often required to make an optimal separation and to classify new objects correctly on the basis of existing examples. The illustration below depicts such a situation (Al-Yaseen et al., 2017): a full separation of the RED and GREEN objects requires a curve rather than a line. Classification tasks that require drawing separating curves to distinguish objects of different class memberships are handled by hyperplane classifiers, and SVMs are particularly suited to such tasks.

The illustration below depicts the key idea behind SVM. The original objects are mapped, i.e. rearranged, using mathematical functions known as kernels; this rearrangement is referred to as mapping or transformation. Note that the mapped objects in the new setting are linearly separable, so instead of constructing a complex curve, the task reduces to finding an optimal line that separates the RED and GREEN objects (Meyer & Wien, 2015).


SVM is a classifier technique that carries out classification by constructing hyperplanes in a multidimensional space that separate the different class labels. It supports both classification and regression tasks and can handle both categorical and continuous variables. For categorical variables, dummy variables with case values 0 or 1 are created; a categorical dependent variable with three levels (A, B, C) is therefore depicted via a set of dummy variables:

A: {1 0 0}, B: {0 1 0}, C: {0 0 1}
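This dummy-variable (one-hot) encoding can be sketched in a few lines; the helper name `one_hot` is chosen for illustration:

```python
# Encode a categorical value as a list of 0/1 dummy variables,
# matching the A/B/C scheme shown above.
levels = ["A", "B", "C"]

def one_hot(value, levels):
    """Return one dummy variable per level: 1 where it matches, else 0."""
    return [1 if value == level else 0 for level in levels]

encodings = {level: one_hot(level, levels) for level in levels}
# A -> [1, 0, 0], B -> [0, 1, 0], C -> [0, 0, 1]
```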

SVM designs an optimal hyperplane using an iterative training algorithm that minimizes an error function (Al-Yaseen et al., 2017).

4.4.1. Confusion matrix in machine learning

A confusion matrix, commonly referred to as an error matrix, is used regularly in statistical classification and machine learning. It is a table that describes how a classification model performs on a set of test data with known true values (Hegde et al., 2018), and it clears up the confusion that arises when one class is mislabelled as another. A confusion matrix summarizes the prediction outcomes of a classification problem: both correct and incorrect predictions are recorded as count values, broken down by class. It thereby sheds light on the errors a classifier makes and on the types of errors incurred, and most common performance measures can be computed from it (Agarap, 2018).

Table 5. Classes.


Definition of the Terms:

• Positive (P): Indicates a positive observation (for instance: is an orange).

• Negative (N): Indicates a negative observation (for instance: is not an orange).

• True Positive (TP): Indicates that the observation is a positive one and is also predicted to be positive.

• False Negative (FN): Indicates that the observation is positive, but the prediction says it to be negative.

• True Negative (TN): Indicates that the observation is negative and is also predicted to be negative.

• False Positive (FP): Indicates that the observation is negative, but the prediction says it to be positive.
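The four counts above can be tallied directly from paired lists of true and predicted labels. The following is a plain-Python sketch following the "is an orange" example; the function name and sample data are invented for illustration (libraries such as scikit-learn provide an equivalent `confusion_matrix` function):

```python
# Count TP, FN, FP and TN for a binary problem, treating "orange"
# as the positive class.
def confusion_counts(y_true, y_pred, positive="orange"):
    tp = fn = fp = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1  # positive observation, predicted positive
        elif t == positive:
            fn += 1  # positive observation, predicted negative
        elif p == positive:
            fp += 1  # negative observation, predicted positive
        else:
            tn += 1  # negative observation, predicted negative
    return tp, fn, fp, tn

y_true = ["orange", "orange", "apple", "apple", "orange", "apple"]
y_pred = ["orange", "apple", "apple", "orange", "orange", "apple"]
tp, fn, fp, tn = confusion_counts(y_true, y_pred)
# tp=2, fn=1, fp=1, tn=2
```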

Classification accuracy (rate): Classification accuracy is depicted by the relation given below (Agarap, 2018):

𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 = (𝑇𝑃 + 𝑇𝑁)/(𝑇𝑃 + 𝑇𝑁 + 𝐹𝑃 + 𝐹𝑁) (8)

Nevertheless, accuracy alone can be misleading, since it implicitly assumes equal costs for both error types. The same accuracy value can be terrible, poor, mediocre, good or excellent depending on the problem, for instance when the classes are heavily imbalanced.

Recall is the ratio of correctly classified positive examples to all actual positive examples. A higher recall means that more of the class has been recognized correctly (Trajdos and Kurzynski, 2018).

Recall is depicted by the relation given below;

𝑅𝑒𝑐𝑎𝑙𝑙 = 𝑇𝑃/(𝑇𝑃 + 𝐹𝑁) (9)

Precision: Precision is obtained by dividing the number of correctly classified positive examples by the total number of examples predicted as positive. Higher precision means that an example labelled as positive is more likely to actually be positive (Agarap, 2018).

Precision can be explained with the relation given below:

𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 = 𝑇𝑃/(𝑇𝑃 + 𝐹𝑃) (10)

High recall, low precision: indicates that the majority of the positive examples are recognized correctly, but many false positives exist.
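Equations (8)–(10) can be computed directly from the four confusion-matrix counts. The counts below are hypothetical, chosen to illustrate the high-recall/low-precision case just described:

```python
# Accuracy, recall and precision from confusion-matrix counts.
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def recall(tp, fn):
    return tp / (tp + fn)

def precision(tp, fp):
    return tp / (tp + fp)

# Hypothetical counts: almost every positive is found (high recall),
# but many negatives are flagged as positive too (low precision).
tp, fn, fp, tn = 9, 1, 15, 5
r = recall(tp, fn)            # 0.9  -> 9 of 10 positives recognized
p = precision(tp, fp)         # 0.375 -> most positive predictions are wrong
a = accuracy(tp, tn, fp, fn)  # 14/30, roughly 0.467
```

This shows why accuracy on its own is not enough: here the classifier finds 90% of the positives, yet fewer than half of its positive predictions are correct.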

References
