A Comparison of Isolation Algorithms on
a Benchmark System
MARTIN TORELM
Abstract
Acknowledgements
I would like to take this opportunity to thank my supervisor, Anna Pernestål, who has been of great help and support throughout this master's degree project. I would also like to thank Scania - NED, and the department of Automatic Control at the Royal Institute of Technology, Sweden. I would also like to thank Vicenç Puig for providing information about the previous work on the benchmark problem used in this thesis.
Table of Contents

1 Introduction
  1.1 Background
  1.2 Objective
  1.3 Goal
2 The benchmark problem
  2.1 The model
  2.2 Faults
    2.2.1 Additive faults
    2.2.2 Mixed faults
3 Test quantities
  3.1 Residual generation
    3.1.1 Discretisation
  3.2 Filtering and thresholds
    3.2.1 Cusum test
    3.2.2 Thresholds
    3.2.3 The Failure Signature Matrix
4 Isolation algorithms
  4.1 Column reasoning
  4.2 Row reasoning
  4.3 A Bayesian approach to isolation
    4.3.1 Independence
    4.3.2 Partial independence
    4.3.3 Full dependence
  4.4 Diagnostic Model Processor
5 Performance measures
  5.1 Memory usage
  5.2 Diagnostic resolution
  5.3 Normalised diagnostic accuracy
  5.4 Error rate
6 Simulation and evaluation
  6.1 Simulations
    6.1.1 Column reasoning
    6.1.2 Row reasoning
    6.1.3 Bayesian isolation methods
    6.1.4 Diagnostic Model Processor
  6.2 Evaluation of the algorithms
    6.2.1 Memory usage
7 Results
  7.1 Performance measures
    7.1.1 Additive faults
    7.1.2 Mixed faults without filtering
    7.1.3 Mixed faults with filtering
    7.1.4 Memory usage
8 Conclusion
  8.1 Discussion
  8.2 Recommendation
  8.3 Summary
  8.4 Future work

A Notation and parameters
B Isolation
Chapter 1
Introduction
Diagnostics is an important task in the field of industrial systems. A malfunctioning component could, for example, result in decreased efficiency, damage to the system, or even personal injury.
When a fault is present, it is important to detect and isolate it in order to make the right decisions. Isolation algorithms are used to determine which component is failing, and how. Isolation can be based on consistency tests for a process: based on knowledge of how the tests react to different faults, the faulty component can be pointed out. A good isolation system can be of valuable help to the shop technician when locating and repairing the fault.
1.1 Background
[Figure 1.1: block diagram: Process data -> Residual generation -> Filtering (test quantities) -> Isolation -> Diagnoses]
Figure 1.1: The structure used in this work for making diagnoses. Residuals are filtered and thresholded, and the resulting tests are used as inputs to the isolation. The output of the isolation is a set of diagnoses: the possible faults explaining the behaviour of the process.
The residuals are filtered, and tests are computed by applying thresholds to the filtered residuals. The tests are then used as inputs to the isolation.
1.2 Objective
The objective of this thesis is to compare different isolation algorithms. Based on the comparison of the different approaches, a recommendation is made regarding the types of problems to which each isolation algorithm is applicable.
1.3 Goal
The goal of this project is to:

1. Implement four types of isolation methods on a benchmark problem
2. Develop performance measures to compare the isolation methods
3. Compare and evaluate the methods with the developed performance measures
Chapter 2
The benchmark problem
In this chapter the benchmark problem is presented and the model equations are derived. The model equations will be used for simulation and for forming the residuals. This benchmark problem has previously been used in [Pulido] and [O. Bouamama].
2.1 The model
The benchmark system that is used for comparing the algorithms is taken from [O. Bouamama] and consists of two tanks with various components connecting the tanks. The system is shown in Figure 2.1.
The purpose of the two-tank system is to provide a constant flow to the consumer. A PI-controlled pump provides water to tank T1, with an inlet flow Qp, to a nominal level of h1c = 0.5 m. Tank T1 is connected to T2 by a pipe. An "ON-OFF" controller regulates the water level, h2, in tank T2 by acting on the valve Vb, causing a water flow, Q12, from tank T1. The valves Vf1 and Vf2 are used to simulate water leakage from the respective tanks.
The inputs to the system are the pump flow, Qp, the control output from the "ON-OFF" controller, Ub, the input voltage, Uo, to the valve for the outlet to the consumer, and the pump voltage, Up. The input vector u is a measured quantity. Measured quantities are denoted with a superscript "m":

$$u = \begin{bmatrix} Q_p^m & U_b^m & U_o^m & U_p^m \end{bmatrix}^T$$
Figure 2.1: System used for the benchmark.

The measured outputs are the water levels in the two tanks:

$$y^m = \begin{bmatrix} y_1^m \\ y_2^m \end{bmatrix} = \begin{bmatrix} h_1 + \varepsilon_1 \\ h_2 + \varepsilon_2 \end{bmatrix}, \quad (2.1)$$
where ε1 and ε2 are measurement noises.
The change of the volume in the tanks can be described as the difference between the sum of all in-flows and the sum of all out-flows. This is written as:

$$\dot V_1 = A_1 \dot h_1 = Q_{in,1} - Q_{out,1} = Q_p - Q_{12} - Q_{f1}$$
$$\dot V_2 = A_2 \dot h_2 = Q_{in,2} - Q_{out,2} = Q_{12} - Q_o - Q_{f2} \quad (2.2)$$
The inlet flow Qp is assumed to be proportional to the pump voltage, Up. Taking into account the limitation of the pump, the inlet flow is described as:
$$Q_p(t) = \begin{cases} U_p & 0 < U_p < Q_{p,max} \\ 0 & U_p \le 0 \\ Q_{p,max} & U_p \ge Q_{p,max} \end{cases} \quad (2.3)$$
The PI controller acting on the pump is modelled as:

$$U_p = K_p (h_{1c} - h_1(t)) + K_i \int (h_{1c} - h_1(t)) \, dt, \quad (2.4)$$
where $K_p$ and $K_i$ are constants and $h_{1c}$ is the set point for the PI controller. Using Bernoulli's law, the water flow $Q_{12}$ between the two tanks is

$$Q_{12} = C_{vb} \, \mathrm{sgn}(h_1 - h_2) \sqrt{|h_1 - h_2|} \, U_b^m. \quad (2.5)$$
The ON-OFF controller controls the inlet flow to tank T2 through a valve, Vb. The valve opens if the water level in tank T2 is less than or equal to 0.09 m and closes if the water level is greater than 0.09 m. The water level cannot be less than 0 m and cannot be greater than 0.11 m. The control signal acting on the valve is

$$U_b^m = \begin{cases} 0 & \text{if } 0.09\,\mathrm{m} \le h_2 < 0.11\,\mathrm{m} \\ 1 & \text{if } 0.00\,\mathrm{m} \le h_2 < 0.09\,\mathrm{m} \end{cases} \quad (2.6)$$
Using Bernoulli's law a second time gives the outflow to the consumer:

$$Q_o = C_{vo} \sqrt{h_2} \, U_o^m, \quad (2.7)$$

where $U_o^m$ is the control signal to the valve $V_o$:

$$U_o^m = \begin{cases} 1 & \text{if } V_o \text{ is open} \\ 0 & \text{if } V_o \text{ is closed} \end{cases} \quad (2.8)$$
Using Equations 2.2 - 2.8, the final model of the system is written as:

$$\dot h_1 = \frac{Q_p - C_{vb}\,\mathrm{sgn}(h_1 - h_2)\sqrt{|h_1 - h_2|}\,U_b^m - Q_{f1}}{A_1}$$
$$\dot h_2 = \frac{C_{vb}\,\mathrm{sgn}(h_1 - h_2)\sqrt{|h_1 - h_2|}\,U_b^m - C_{vo}\sqrt{h_2}\,U_o^m - Q_{f2}}{A_2} \quad (2.9)$$
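To make the model concrete, the dynamics in Eq. 2.9 can be stepped forward with a simple fixed-step Euler scheme. The thesis simulations were done in SIMULINK; the sketch below is an illustrative Python version, and the parameter values (tank areas, valve coefficients, PI gains, pump limit) are placeholders, not the benchmark's actual values, which are listed in Appendix A.

```python
import math

# Illustrative parameters only: the benchmark's actual values are listed in
# Appendix A, so the numbers below are placeholders chosen to give a stable run.
A1, A2 = 0.0154, 0.0154      # tank cross-section areas [m^2]
Cvb, Cvo = 1e-4, 1e-4        # valve flow coefficients
Kp, Ki = 10.0, 2.0           # PI controller gains
Qp_max = 1e-4                # pump saturation limit [m^3/s]
h1c = 0.5                    # set point for the level in tank T1 [m]
T = 0.1                      # Euler step [s]

def simulate(t_end, Qf1=0.0, Qf2=0.0):
    """Euler simulation of Eq. 2.2-2.9; Qf1, Qf2 inject leak faults."""
    h1 = h2 = integral = 0.0
    for _ in range(int(t_end / T)):
        e = h1c - h1
        integral += e * T
        # anti-windup clamp (an addition for stability, not part of the thesis model)
        integral = max(min(integral, Qp_max / Ki), -Qp_max / Ki)
        Up = Kp * e + Ki * integral                      # PI law, Eq. 2.4
        Qp = min(max(Up, 0.0), Qp_max)                   # pump saturation, Eq. 2.3
        Ub = 1.0 if h2 < 0.09 else 0.0                   # ON-OFF controller, Eq. 2.6
        Uo = 1.0                                         # outlet valve open, Eq. 2.8
        Q12 = Cvb * math.copysign(1.0, h1 - h2) * math.sqrt(abs(h1 - h2)) * Ub
        Qo = Cvo * math.sqrt(h2) * Uo                    # outflow, Eq. 2.7
        h1 = max(h1 + T * (Qp - Q12 - Qf1) / A1, 0.0)    # tank dynamics, Eq. 2.9
        h2 = min(max(h2 + T * (Q12 - Qo - Qf2) / A2, 0.0), 0.11)  # level limits
    return h1, h2
```

With these placeholder values the level h1 settles close to its 0.5 m set point while h2 is kept between 0 and 0.11 m by the ON-OFF controller.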
2.2 Faults
Different types of faults are considered in the benchmark problem. In [Pulido] a total of six additive faults are simulated. In [O. Bouamama] a total of eight faults are simulated, both additive and multiplicative. Both works cover only the single fault scenario, but in this work multiple faults are also considered. In this thesis both the set of faults in [Pulido] and the set in [O. Bouamama] are studied. The faults will be denoted F, with different subscripts for the different faults. The superscript "P" will be used for faults taken from [Pulido], and "B" for faults taken from [O. Bouamama].
2.2.1 Additive faults
In [Pulido] only additive, single faults are considered. To be able to compare the results, the following faults from that work are used:

F_ff^P: Fault-free mode: the process runs without faults
F_pump^P: Pump fault: additive fault in pump P1
F_y1^P: Additive fault in level sensor y1
F_y2^P: Additive fault in level sensor y2
F_Qf1^P: Constant leak in tank T1
F_Qf2^P: Constant leak in tank T2
F_Up^P: Additive fault in the controller output Up in tank T1
This gives a total of six different single faults that can be simulated. All combinations of faults will be simulated in the benchmark system, which means that there are 2^6 = 64 possible fault scenarios.
2.2.2 Mixed faults
The faults considered in [O. Bouamama] are of two types, additive and multiplicative:

F_ff^B: Fault-free mode: the process runs without faults
F_pump^B: Pump fault: the pump is simulated off from t = 40 s to t = 120 s
F_y1^B: Level sensor y1m is stuck at zero from t = 40 s to t = 120 s
F_y2^B: Level sensor y2m is stuck at zero from t = 40 s to t = 120 s
F_Qf1^B: Water leak in tank T1 from t = 40 s to t = 120 s, Qf1 = 10^-4 m^3/s
F_Qf2^B: Water leak in tank T2 from t = 40 s to t = 120 s, Qf2 = 10^-4 m^3/s
F_Up^B: Controller output Up^m is short-circuited to ground from t = 40 s to t = 120 s
F_Vb^B: Valve Vb is blocked from t = 40 s to t = 150 s
F_Ub^B: Controller output Ub^m is short-circuited to ground from t = 40 s to t = 120 s
Chapter 3
Test quantities
In this chapter, residuals are computed and test quantities are formed. The test quantities are obtained by simple relations from the model and will be used as input for the isolation.
3.1 Residual generation
The detection part of the diagnostic system is based on residuals. Residuals are obtained from Analytical Redundancy Relations, ARRs [Nyberg, Frisk]. The ARRs, shown in Equations 3.1 - 3.4, are obtained from the model equations (see Equations 2.2 - 2.9 in Chapter 2) by identifying the relations between the measured outputs, $x_j$, and the modelled outputs, $\hat x_j$, with $j = 1, 2, \dots, 4$.
$$\underbrace{A_1\frac{dy_1^m}{dt} + \frac{d\varepsilon_1}{dt}}_{x_1} \approx \underbrace{Q_{12} + Q_p^m + \varepsilon_3 - Q_{f1}}_{\hat x_1} \quad (3.1)$$

$$\underbrace{A_2\frac{dy_2^m}{dt} + \frac{d\varepsilon_2}{dt}}_{x_2} \approx \underbrace{-Q_{12} - C_{vo}\sqrt{y_2^m + \varepsilon_2}\,U_o^m - Q_{f2}}_{\hat x_2} \quad (3.2)$$

$$\underbrace{U_p^m + \varepsilon_4}_{x_3} \approx \underbrace{K_p\, e_{h1} + K_i \int e_{h1}\, dt}_{\hat x_3} \quad (3.3)$$

$$\underbrace{Q_p^m + \varepsilon_3}_{x_4} \approx \underbrace{\begin{cases} U_p^m + \varepsilon_4 & \text{if } 0 < U_p^m + \varepsilon_4 < Q_{p,max} \\ 0 & \text{if } U_p^m + \varepsilon_4 \le 0 \\ Q_{p,max} & \text{if } U_p^m + \varepsilon_4 \ge Q_{p,max} \end{cases}}_{\hat x_4} \quad (3.4)$$

where $\varepsilon_i$, $i = 1, 2, \dots, 4$, is the sensor noise, $e_{h1}$ is the level control error for tank T1, and

$$Q_{12} = -C_{vb}\,\mathrm{sign}(y_1^m + \varepsilon_1 - y_2^m - \varepsilon_2)\sqrt{|y_1^m + \varepsilon_1 - y_2^m - \varepsilon_2|}\cdot U_b^m$$
The residuals can then be computed as the difference between the modelled and the measured outputs:

$$r_j(t) = \hat x_j(t) - x_j(t), \quad j = 1, 2, \dots, 4.$$
3.1.1 Discretisation
The model equations in 2.9 must be discretised, because the implementation of the residuals in the simulation needs to be done in discrete form. The model was discretised using Euler's method, which, for instance, transforms $dy^m/dt$ into

$$\frac{dy^m}{dt} \approx \frac{y^m(k) - y^m(k-1)}{T}. \quad (3.5)$$
The residual generation used for the simulations can then be written as:

$$r_1(k) = -C_{vb}\,\mathrm{sgn}(y_1^m(k) - y_2^m(k))\sqrt{|y_1^m(k) - y_2^m(k)|}\,U_b^m(k) + Q_p^m(k) - Q_{f1}(k) - A_1\frac{y_1^m(k) - y_1^m(k-1)}{T}$$

$$r_2(k) = C_{vb}\,\mathrm{sgn}(y_1^m(k) - y_2^m(k))\sqrt{|y_1^m(k) - y_2^m(k)|}\,U_b^m(k) - C_{vo}\sqrt{y_2^m(k)}\,U_o^m(k) - Q_{f2}(k) - A_2\frac{y_2^m(k) - y_2^m(k-1)}{T}$$

$$r_3(k) = U_p^m(k) - U_p(k-1) - K_p(e^m(k) - e^m(k-1)) - K_i T e^m(k)$$

$$r_4(k) = Q_p^m(k) - \begin{cases} U_p^m(k) & 0 < U_p^m(k) < Q_{p,max} \\ 0 & U_p^m(k) \le 0 \\ Q_{p,max} & U_p^m(k) \ge Q_{p,max} \end{cases} \quad (3.6)$$
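The discrete residual generators of Eq. 3.6 are straightforward to implement. The sketch below is a Python illustration (the thesis implementation was in MATLAB); the signal arguments are hypothetical sequence names, and the nominal model sets the unknown leak flows Qf1 = Qf2 = 0.

```python
import math

def residuals(k, y1m, y2m, Qpm, Ubm, Uom, Upm, em,
              A1, A2, Cvb, Cvo, Kp, Ki, T, Qp_max):
    """Discrete-time residuals of Eq. 3.6 at sample k (k >= 1).
    All signal arguments are sequences indexed by sample number; the
    nominal model assumes the leak flows Qf1 = Qf2 = 0."""
    q12 = Cvb * math.copysign(1.0, y1m[k] - y2m[k]) \
          * math.sqrt(abs(y1m[k] - y2m[k])) * Ubm[k]
    r1 = Qpm[k] - q12 - A1 * (y1m[k] - y1m[k - 1]) / T
    r2 = q12 - Cvo * math.sqrt(max(y2m[k], 0.0)) * Uom[k] \
         - A2 * (y2m[k] - y2m[k - 1]) / T
    # incremental form of the PI law: Up(k) = Up(k-1) + Kp*(e(k)-e(k-1)) + Ki*T*e(k)
    r3 = Upm[k] - Upm[k - 1] - Kp * (em[k] - em[k - 1]) - Ki * T * em[k]
    # pump saturation, Eq. 2.3
    r4 = Qpm[k] - min(max(Upm[k], 0.0), Qp_max)
    return r1, r2, r3, r4
```

For consistent fault-free data, all four residuals evaluate to zero up to noise.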
3.2 Filtering and thresholds
Figure 3.1: Residuals r1, r2, r3 and r4 for an additive fault in y1. The dashed lines show the thresholds used for the residuals.
3.2.1 Cusum test
The Cusum test [Gustafsson] is used to detect small changes in the bias of the residuals. The Cusum test is defined as

$$S_{t+1} = \begin{cases} S_t + y_{t+1} & y_{t+1} > h \\ 0 & y_{t+1} \le h \end{cases}$$
A rule of thumb is that the value of h should be 2.5 times the fault size.
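The filter defined above is easy to transcribe directly. A minimal Python illustration (the thesis implementation was in MATLAB):

```python
def cusum(signal, h):
    """Cusum filter as defined above: the sum grows while samples exceed
    the level h and resets to zero otherwise."""
    out, S = [], 0.0
    for y in signal:
        S = S + y if y > h else 0.0
        out.append(S)
    return out

print(cusum([0, 1, 2, 0, 3], h=0.5))  # [0.0, 1.0, 3.0, 0.0, 3.0]
```

A small residual bias that persistently exceeds h therefore accumulates into a large test quantity, which is what makes the filter useful for detecting small faults.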
3.2.2 Thresholds
Thresholds are used to determine if a test has reacted to a fault. The thresholds are chosen such that no test reacts when the system is simulated in the fault-free case: each threshold is set just above the highest value of the residual, in order to avoid false alarms. A simulation of an additive fault in the sensor for y1 is shown in Figure 3.1. The residuals r1 and r2 are sensitive to the fault and have exceeded their thresholds. Figure 3.2 shows the test results for the same simulation.
3.2.3 The Failure Signature Matrix
Figure 3.2: The test results for the additive fault F_y1^P.
r_i    F_pump^P  F_y1^P  F_y2^P  F_Qf1^P  F_Qf2^P  F_Up^P
r_1       0        x       x        x        0       0
r_2       0        x       x        0        x       0
r_3       0        0       0        0        0       x
r_4       x        0       0        0        0       x

Table 3.1: The failure signature matrix.
[Figure 3.3: residual amplitude plots with thresholds]
Chapter 4
Isolation algorithms
There are many different algorithms that have been developed for isolation. Some of them are based on models of the process, while others are based on experience. In this project four different model-based approaches to fault isolation are considered, as well as variations of the algorithms obtained by changing different assumptions.
First, some notation. Let $c_i$ be a variable which describes the behavioural mode of component $i$, such that

$$c_i = \begin{cases} 0 & \text{for no fault in component } i \\ 1 & \text{for fault in component } i \end{cases} \quad (4.1)$$

Further, let $d_j$ be the test result from test $j$:

$$d_j = \begin{cases} 0 & \text{for no alarm} \\ 1 & \text{for alarm} \end{cases} \quad (4.2)$$
The current system behavioural mode, C, and the test results, D, can then be written as
C = [c1, c2, c3, . . . cn] (4.3)
D = [d1, d2, d3, . . . dm]. (4.4)
Let Δ be a diagnosis, i.e. a system behavioural mode that is consistent with the measurements. Note that there can be several system behavioural modes that are consistent with the measurements. The output from the isolation system is a set of diagnoses, D.
4.1 Column reasoning
Column reasoning uses the columns of the failure signature matrix: when a test reacts, the candidate faults are those marked with an x in the corresponding row. Two variants are considered for handling test results that have not reacted:

1. No conclusion can be drawn from a test result that has not been activated.

2. An inactivated test result excludes the faults marked with an x in its row (this is the traditional way of doing isolation in the FDI field, but it is generally not recommended, since it could exclude a correct diagnosis when a test misses a detection; see Figure 3.3).
Example 1 Consider the FSM:
d_i    c_1  c_2  c_3
d_1     x    0    x
d_2     0    x    x     (4.5)
When test d_1 reacts and test d_2 does not react, the first variant of the Column reasoning method gives the diagnoses

D = {{c_1}, {c_3}}, (4.6)

while the second variant of Column reasoning gives

D = {{c_1}}. (4.7)
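The two variants can be sketched as set operations over the FSM rows. A minimal Python illustration for the single-fault case (function and variable names are mine, not from the thesis):

```python
def column_reasoning(fsm, tests, variant=2):
    """Single-fault column reasoning over a failure signature matrix.
    fsm[j][i] is True when test j is sensitive to fault i;
    tests[j] is True when test j has alarmed.
    Variant 1 draws no conclusion from inactive tests; variant 2 also
    excludes every fault marked 'x' in an inactive test's row."""
    n = len(fsm[0])
    candidates = set(range(n))
    for j, alarmed in enumerate(tests):
        if alarmed:
            # an alarming test keeps only the faults it is sensitive to
            candidates &= {i for i in range(n) if fsm[j][i]}
        elif variant == 2:
            # a silent test excludes the faults it is sensitive to
            candidates -= {i for i in range(n) if fsm[j][i]}
    return candidates

# a two-test, three-component FSM with rows d1 = (x 0 x), d2 = (0 x x)
fsm = [[True, False, True],
       [False, True, True]]
print(column_reasoning(fsm, [True, False], variant=1))  # {0, 2}: c1 or c3
print(column_reasoning(fsm, [True, False], variant=2))  # {0}: only c1
```

Variant 2 gives sharper answers but, as noted above, a missed detection can make it exclude the true fault.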
4.2 Row reasoning
Row reasoning, a variant of Reiter's algorithm (DX), is a common way to handle the isolation problem. The main idea behind row reasoning is that each reacting test results in a conflict. A conflict means, in this case, that not all of the components included in the conflict can be non-faulty at once. The conflicts are generated from the reacting tests together with the rows of the Failure Signature Matrix. For every new conflict, the intersection with the old ones produces the new diagnosis statement.
Example 2
d_i    c_1  c_2  c_3
d_1     x    0    x
d_2     0    x    x
Consider the FSM above. If test number one reacts, i.e. d_1 = 1, then it produces the conflict {c_1, c_3} and the diagnoses

D = {{c_1}, {c_3}}. (4.8)

Later on, if test number two also reacts (d_2 = 1), then the conflicts are {c_1, c_3} ∧ {c_2, c_3}, and the Row reasoning isolation method produces the diagnoses

{c_1, c_3} ∧ {c_2, c_3} ⇒ D = {{c_1, c_2}, {c_1, c_3}, {c_2, c_3}, {c_3}}. (4.9)
This means that there are four possible explanations of the system's behaviour; three of them contain double faults, and one contains a single fault. The most common assumption is that the current behaviour of the process has its most probable explanation in the fault that includes the fewest components, which in the example above is {c_3}, even though all of the above sets are possible.

In this thesis, Reiter's algorithm is used for finding the minimal diagnoses. It produces a minimal set of possible faults, and a common interpretation is that the current system behavioural mode exists in this set, even if all supersets of the minimal diagnoses are possible. In the example above, the result would be D = {{c_1, c_2}, {c_3}}.
The structure of Reiter’s algorithm is shown below:
1. Initialise the set of minimal diagnoses to hold only the empty set, i.e. D = {∅}
2. Given a (new) conflict, find out if any minimal diagnosis is invalidated, i.e. has an empty intersection with the conflict
3. Extend any invalidated diagnosis to a set of new diagnoses, each consisting of the invalidated diagnosis and an element from the new conflict
4. Remove any new diagnoses that are not minimal, i.e. are supersets of any other minimal diagnosis
5. Iterate from Item 2 for all new conflicts
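The steps above amount to incrementally maintaining the minimal hitting sets of the conflicts. A compact Python sketch (names are mine; this is the hitting-set update, not the full HS-tree of Reiter's original algorithm):

```python
def minimal_diagnoses(conflicts):
    """Minimal hitting sets of a list of conflicts: start from the empty
    diagnosis, extend every diagnosis invalidated by a new conflict with one
    element of that conflict, then prune non-minimal sets."""
    diagnoses = [frozenset()]
    for conflict in conflicts:
        valid = [d for d in diagnoses if d & conflict]
        invalid = [d for d in diagnoses if not (d & conflict)]
        extended = [d | {c} for d in invalid for c in conflict]
        candidates = valid + extended
        # keep only minimal sets (strict subset test prunes supersets)
        diagnoses = list({d for d in candidates
                          if not any(e < d for e in candidates)})
    return diagnoses

# the conflicts from the example above: {c1, c3} and {c2, c3}
result = minimal_diagnoses([frozenset({'c1', 'c3'}), frozenset({'c2', 'c3'})])
print(sorted(map(sorted, result)))  # [['c1', 'c2'], ['c3']]
```

The result matches the minimal diagnoses stated in the text: {c1, c2} and {c3}.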
4.3 A Bayesian approach to isolation
A drawback with Column reasoning and Row reasoning is that the algorithms often produce many diagnoses. Therefore, a Bayesian approach to isolation [Pernestål] has been considered. The main idea behind this approach to fault isolation is to compute the probability that a fault is present. This probability can then be used for ranking, or for decision making on fault accommodation.
Let C be the current system behavioural mode and D the test results. Then Bayes' rule is applicable as follows:

$$P(C \mid D) = \frac{P(D \mid C)\,P(C)}{P(D)} \quad (4.10)$$
In order to obtain a good estimate of the functions in Equation 4.10, the system has to be simulated with all combinations of faults, single as well as multiple. The simulations are done with the current system behavioural mode as input and the test results as outputs. P(C) is called the prior probability. If all faults are assumed to be independent and the probability for a single fault to occur is $P(c_i) = p_c$ for all $i = 1, \dots, n$, then

$$P(C) = \prod_{i=1}^{n} p_c^{c_i} (1 - p_c)^{1 - c_i}.$$

The function P(D) is a normalisation factor and is calculated as follows:

$$P(D) = \sum_C P(D \mid C)\,P(C) \quad (4.11)$$
The Bayesian approach is varied using different assumptions about independence; the only difference is how P(D | C) is computed. The next three subsections describe the different assumptions.
4.3.1 Independence
In this variant of the Bayesian algorithm, all test results, $d_j$, are assumed to be independent. The probability $P(D \mid C)$ can then be computed as

$$P(D \mid C) = \prod_j P(d_j \mid C). \quad (4.12)$$

For every fault simulation, simulation data is gathered at the sample times $t_k = kT$, where $k = 0, \dots, N-1$ and $T$ is the sampling interval. The distribution $P(d_j \mid C)$ is then estimated as follows:

$$P(d_j = 1 \mid C) = \frac{\sum_{k=1}^{N} d_j(t_k \mid C)}{N}, \qquad P(d_j = 0 \mid C) = 1 - P(d_j = 1 \mid C), \quad (4.13)$$
where dj(tk | C) is the observed test result from time tk given the system behavioural mode C.
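Estimating the per-test probabilities and turning them into a posterior over behavioural modes can be sketched as below. This is an illustrative Python version (the thesis used MATLAB); the training-data layout is an assumption of mine.

```python
def estimate_test_probabilities(training):
    """Estimate P(d_j = 1 | C) per Eq. 4.13 as the fraction of samples in
    which test j alarmed. `training` maps a behavioural mode C to a list of
    observed test-result vectors."""
    p = {}
    for mode, samples in training.items():
        m = len(samples[0])
        p[mode] = [sum(s[j] for s in samples) / len(samples) for j in range(m)]
    return p

def posterior(p, prior, observed):
    """P(C | D) under the independence assumption (Eq. 4.10-4.12)."""
    joint = {}
    for mode, pj in p.items():
        lik = 1.0
        for j, d in enumerate(observed):
            lik *= pj[j] if d else 1.0 - pj[j]
        joint[mode] = lik * prior[mode]
    z = sum(joint.values())                    # P(D), Eq. 4.11
    return {mode: v / z for mode, v in joint.items()}

# two modes, two tests; four training samples per mode (hypothetical data)
training = {'NF': [(0, 0), (0, 0), (0, 1), (0, 0)],
            'F1': [(1, 1), (1, 0), (1, 1), (1, 1)]}
p = estimate_test_probabilities(training)
print(posterior(p, {'NF': 0.9, 'F1': 0.1}, (1, 1)))
```

With both tests alarming, the posterior concentrates on F1 even though its prior probability is low.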
4.3.2 Partial independence
The assumption of independence among test results is generally not valid. Two tests could, for example, be dependent when they share the same underlying relations: if one of the tests reacts, it can cause the other to react. In this variant of the Bayesian algorithm, partial independence of the test results is assumed.
The method used in this work for finding dependence among tests is taken from [Pernestål]. If tests are dependent and a test has reacted, knowledge about the system behavioural mode will not provide any further information about the other tests. To decide if tests are dependent, training data is collected from different system behavioural modes. The training data is then evaluated, and the likelihoods of different dependencies are computed.
4.3.3 Full dependence
In reality, there is always a possibility of dependence among tests. To make sure that all possibilities are covered, we can assume that there are dependencies between all tests; the probabilities should then be calculated as:

$$P(C \mid D) = \frac{P(d_1 d_2 \dots d_m \mid C)\,P(C)}{P(D)} \quad (4.15)$$
The assumption about full dependence is the best that can be done when computing the probabilities for system behavioural modes given the test results, and can also be used as reference when evaluating how well the other assumptions work.
4.4 Diagnostic Model Processor
The Diagnostic Model Processor [Petti], DMP, is a model-based algorithm for diagnostics. This method is also based on residuals; the residuals are weighted to decide the degree of violation of the model equations. The thresholds are obtained as before, and the residual $r_j$ exceeding the threshold $\tau_j$ corresponds to $v_j$ exceeding the value 0.5. The residuals are calculated from measurements, u and y, from the process. Let C be a vector of assumptions about the system behavioural mode. Then the residuals can be written as:

$$r_j = g_j(C) \quad (4.16)$$
The residuals are used to calculate a satisfaction vector, $v^{sf}$, which contains the information on how well the model equations are satisfied: 0 for perfect satisfaction and ±1 when the model equations are severely violated high or low, respectively:

$$v_j^{sf} = \frac{(r_j/\tau_j)^n}{1 + (r_j/\tau_j)^n} \quad (4.17)$$
This can be seen as another way of thresholding.

The sensitivity matrix, S, is determined through the partial derivatives of the model equations, $g_j$, with respect to the faults, $c_i$:

$$S_{ij} = \frac{\partial g_j / \partial c_i}{|\tau_j|} \quad (4.18)$$

The sensitivity matrix corresponds to the FSM in the FDI approach and describes how easily a behavioural mode, $c_i$, violates the residual $r_j$.
The failure likelihood, $F_i$, of assumption $c_i$ is determined from:

$$F_i = \frac{\sum_{j=1}^{n} S_{ij}\,v_j^{sf}}{\sum_{j=1}^{n} |S_{ij}|} \quad (4.19)$$
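One DMP evaluation step can be sketched as below. This is an illustration, not the thesis implementation; to keep the satisfaction value bounded and sign-preserving (matching the ±1 description above), the denominator is written with $|r_j/\tau_j|^n$, which is a choice of mine.

```python
import math

def dmp_step(r, tau, S, n=4):
    """One DMP evaluation: satisfaction values (Eq. 4.17) and failure
    likelihoods (Eq. 4.19). S[i][j] is the sensitivity of residual j to
    fault i. The denominator uses |r/tau|^n so that v stays bounded and
    sign-preserving (an assumption of this sketch)."""
    v = []
    for rj, tj in zip(r, tau):
        x = rj / tj
        v.append(math.copysign(abs(x) ** n / (1.0 + abs(x) ** n), x))
    F = [sum(sij * vj for sij, vj in zip(Si, v)) / sum(abs(sij) for sij in Si)
         for Si in S]
    return v, F

# one residual at twice its threshold; two faults with opposite sensitivities
v, F = dmp_step([0.2], [0.1], [[1.0], [-1.0]])
print(v, F)
```

A residual at twice its threshold gives v close to +1, so the positively sensitive fault gets a failure likelihood near +1 and the negatively sensitive one near -1, illustrating the signed likelihoods discussed in Section 6.1.4.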
Chapter 5
Performance measures
To be able to compare the performance and the resources required by the respective algorithms, performance measures are needed. Explanations and definitions of the performance measures follow in this chapter. Some of the measures are not applicable to certain types of problems; where this is the case, a description can be found in the respective section.
5.1 Memory usage
The memory usage of an algorithm is defined as the amount of memory required for carrying through the isolation. This measure depends on the size as well as the desired accuracy of the data structures:

• In Column reasoning and Row reasoning, an FSM needs to be stored
• In the Bayesian approaches, the functions P(D | C) and P(C) need to be stored as tables
• In DMP, the sensitivity matrix S needs to be stored
5.2 Diagnostic resolution
The diagnostic resolution ([Pulido]) measures the average of the beliefs, $p_C^k$, assigned to the system behavioural modes, C, evaluated for performance test number k. The diagnostic resolution is defined as:

$$\gamma = \frac{1}{L} \sum_{k=1}^{L} \sum_C p_C^k. \quad (5.1)$$

The optimal value of the diagnostic resolution is one; to achieve it, the diagnoses need to point out one system behavioural mode on average.
Example 3 Let B1, B2, B3 and B4 be the possible behavioural modes. If the Row reasoning method produces the diagnosis statement {{B1}, {B2}} in performance test k, then

$$\sum_B p_B^k = 2. \quad (5.2)$$

If the Bayesian method states the diagnoses {P(B1) = 0.5, P(B2) = 0.4, P(B3) = 0.1, P(B4) = 0} for performance test k, then

$$\sum_B p_B^k = 1. \quad (5.3)$$

Since the Bayesian method states the diagnoses in the form of probabilities, the diagnostic resolution will always be one in this case.
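Eq. 5.1 is a simple average of summed beliefs. A Python sketch reproducing the numbers of Example 3 (data structures are mine):

```python
def diagnostic_resolution(beliefs_per_test):
    """Diagnostic resolution (Eq. 5.1): the average, over L performance
    tests, of the summed belief assigned to behavioural modes.
    beliefs_per_test[k] maps each candidate mode to its belief p_C^k."""
    L = len(beliefs_per_test)
    return sum(sum(b.values()) for b in beliefs_per_test) / L

# Example 3: Row reasoning assigns belief 1 to each of two diagnoses,
# while the Bayesian method spreads probability mass summing to one.
row = [{'B1': 1, 'B2': 1}]
bayes = [{'B1': 0.5, 'B2': 0.4, 'B3': 0.1, 'B4': 0.0}]
print(diagnostic_resolution(row))    # 2.0
print(diagnostic_resolution(bayes))  # 1.0
```

This makes the difference concrete: logic-based methods can score above one because every consistent diagnosis gets full belief, while probabilistic outputs always sum to one.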
5.3 Normalised diagnostic accuracy
Normalised diagnostic accuracy, NDA, was developed in order to handle the multiple fault scenario. The idea is to place different weights depending on how important a component is, i.e. to let single faults be more important than behavioural modes containing multiple faults. Let the function f(C | D) denote the confidence of a diagnosis, with

$$\sum_C f(C \mid D) = 1. \quad (5.4)$$

The NDA is then defined as:

$$\alpha = \frac{\frac{1}{N}\sum_C f(C \mid D_C)\,k_C}{\frac{1}{N}\sum_C k_C} = \frac{\sum_C f(C \mid D_C)\,k_C}{\sum_C k_C}, \quad (5.5)$$

where $D_C$ is the observation of the test results when C is the true system behavioural mode, and $k_C$ is a weight for system behavioural mode C. The weights are chosen such that important behavioural modes get large values and less important ones get smaller values, depending on how important it is that the behavioural mode C is included in the diagnosis when it is active. For example, it is more important to have a correct diagnosis statement for single faults or NF than for the case when more or all components are faulty. The parameter $k_C$ is a design parameter; a good choice is to let single faults have the weight $0.1^1$, double faults the weight $0.1^2$, and so on.
Example 4 Assume that the true system behavioural mode is C2, that the test result is D_{C2}, and that the isolation reaches the conclusion that C2 or C3 is present. The confidences of the diagnoses are then:

f(C1 | D_{C2}) = 0, f(C2 | D_{C2}) = 1/2, f(C3 | D_{C2}) = 1/2.

The only confidence which contributes to the sum in the numerator of Equation 5.5 is f(C2 | D_{C2}) = 1/2.
The optimal value for the NDA is 1. This value is not achievable in practice, because it would require the confidence of the diagnosis to be one at all times.
5.4 Error rate
The error rate is defined as the average fraction of faulty diagnoses over the current system behavioural modes. A faulty diagnosis means that the true system behavioural mode is not present in the diagnosis:

$$\beta = \frac{1}{L} \sum_C \mathbf{1}\{f(C \mid D_C) = 0\} \quad (5.6)$$
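The two measures defined in Sections 5.3 and 5.4 can be computed directly from the confidences. A Python sketch with hypothetical confidences and weights:

```python
def nda(confidences, weights):
    """Normalised diagnostic accuracy (Eq. 5.5): confidence-weighted sum over
    true modes C, normalised by the total weight."""
    return (sum(confidences[c] * weights[c] for c in weights)
            / sum(weights.values()))

def error_rate(confidences):
    """Error rate (Eq. 5.6): the fraction of true modes whose diagnosis got
    zero confidence, i.e. the true mode was missed entirely."""
    return sum(1 for f in confidences.values() if f == 0) / len(confidences)

# hypothetical confidences f(C | D_C) and weights k_C for three true modes,
# weighting the single faults C1, C2 ten times more than the double fault C3
conf = {'C1': 0.5, 'C2': 0.0, 'C3': 1.0}
k = {'C1': 0.1, 'C2': 0.1, 'C3': 0.01}
print(nda(conf, k))       # 0.06 / 0.21
print(error_rate(conf))   # 1/3
```

Note how the weighting makes the missed single fault C2 hurt the NDA far more than a missed double fault would.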
Chapter 6
Simulation and evaluation
This chapter describes how the simulation and evaluation of the isolation methods were carried out.
6.1 Simulations
The benchmark system was simulated in SIMULINK and the isolation algorithms were implemented in MATLAB. The residual generation is the same for all algorithms in order to make an objective comparison. All thresholds were also kept the same for all algorithms, with the exception of DMP, where thresholds are defined in a different way. All simulations were done both for the case where only additive faults are used, with a total of six faults, and for the case where multiplicative faults are also used. Two ways of handling the test results have been considered:
[Figure 6.1: a test result computed at every sample (1) compared with a test result held once activated (2)]
1. Test results are computed at every sample time
2. Test results are computed at every sample time until the result equals one; it is then held at one during the rest of the simulation. In this way fluctuation in the diagnosis is avoided (see Figure 6.1)
The residuals are shown in Figures 6.2 - 6.3 while simulating the faults F_y1^P and F_Qf1^P. Figures 6.4 - 6.5 show the test results. The simulation output for each isolation method is shown in Figures 6.6 - 6.13, using the faults F_y1^P and F_Qf1^P, when only single faults are considered and when test results are held.
6.1.1 Column reasoning
Column reasoning was implemented both for single faults and multiple faults. The multiple fault case needed an extended FSM, obtained by merging the single fault FSM into a multiple fault FSM. A simulation of an additive fault in the level sensor for tank T1 is shown in Figure 6.6 (only considering single faults). Note that the first variant of Column reasoning is more careful about excluding faults, which leads to many diagnoses; but if there are small active faults, causing the tests not to react, then Column reasoning 1 seems reasonable.
6.1.2 Row reasoning
All combinations of faults were used during the simulations. The properties of row reasoning are such that the regular FSM can be used in both the single-fault case and the multiple-fault case. Row reasoning always produces a diagnosis for multiple faults; therefore, when evaluating the algorithm in the single fault case, diagnoses with more than one component are ignored. The output from the Row reasoning method is shown in Figure 6.8 while simulating a fault in the level sensor for tank T1. A comparison with the output from the Column reasoning method (see Figure 6.6, to the right) shows that the methods are very similar. The difference appears around 40 s, when just one test has reacted: the Row reasoning method shows three possibly faulty components, while the Column reasoning method points out just one, incorrect, component. Shortly after that, the next test reacts and the two methods show the same output.
6.1.3 Bayesian isolation methods
Figure 6.2: Residuals for the single fault scenario with additive faults considered, simulating the fault F_y1^P from t = 40 s to t = 120 s.
Figure 6.3: Residuals for the single fault scenario with additive faults considered, simulating the fault F_Qf1^P.
Figure 6.4: Test results for the single fault scenario with additive faults considered, simulating the fault F_y1^P from t = 40 s to t = 120 s.
Figure 6.5: Test results for the single fault scenario with additive faults considered, simulating the fault F_Qf1^P.
Figure 6.6: Output from the Column reasoning isolation methods for the single fault scenario with additive faults considered, simulating the fault F_y1^P from t = 40 s to t = 120 s. The figure to the left shows that no conclusion can be drawn about the system behavioural mode for the first 40 samples.
Figure 6.7: Output from the Column reasoning isolation methods (variants 1 and 2) for the single fault scenario with additive faults considered, simulating the fault F_Qf1^P.
Figure 6.8: Output from the Row reasoning isolation method for the single fault scenario with additive faults considered, simulating the fault F_y1^P from t = 40 s to t = 120 s.
Figure 6.9: Output from the Row reasoning isolation method for the single fault scenario with additive faults considered, simulating the fault F_Qf1^P.
Example 5 Consider a system with three components and three tests, i. e. i = 3 and j = 3. If we simulate all behavioural modes, the probabilities can be estimated through
$$P(d_1 = x, d_2 = y, d_3 = z \mid c_1 = \xi, c_2 = \zeta, c_3 = \vartheta) = \frac{n_{xyz}^{\xi\zeta\vartheta}}{N} \quad (6.1)$$

where $n_{xyz}^{\xi\zeta\vartheta}$ is the number of samples with that outcome, N is the total number of samples, and $x, y, z, \xi, \zeta, \vartheta$ can take the values 0 or 1.
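The counting in Eq. 6.1 can be sketched as below for a single behavioural mode, using hypothetical sample data:

```python
from collections import Counter

def estimate_joint(samples):
    """Estimate P(d1,...,dm | C) for one behavioural mode C by counting
    complete test-result vectors (Eq. 6.1): n_xyz / N."""
    counts = Counter(tuple(s) for s in samples)
    N = len(samples)
    return {outcome: n / N for outcome, n in counts.items()}

# e.g. four samples observed while simulating one mode
p = estimate_joint([(1, 1, 0), (1, 1, 0), (1, 0, 0), (0, 0, 0)])
print(p[(1, 1, 0)])  # 0.5
```

Because complete outcome vectors are counted, the table grows as 2^m entries per mode, which is the memory price of the full dependence assumption discussed in Section 6.2.1.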
A simulation of an additive fault in the level sensor in tank T1, where only single faults are considered, is shown in Figure 6.10.
6.1.4 Diagnostic Model Processor
DMP was implemented in MATLAB, and the thresholds and the sensitivity matrix from [Pulido] were used for the case where only additive faults are considered. For the multiplicative faults, the sensitivity matrix was chosen such that the elements corresponding to multiplicative faults got the values +1 and -1 for residuals which react with a positive and a negative derivative, respectively. The thresholds were chosen such that no residual reacts in the fault-free mode.
A simulation of a fault in the level sensor for tank T1 is shown in Figure 6.12, where only single faults are assumed to be possible. The figure shows the likelihoods for faults in the different components. Note that this method produces a different kind of result than the previous methods: the likelihoods can be both positive and negative. A negative likelihood can be interpreted either as a negative fault being present in the corresponding component, or as it being highly unlikely that the fault is present.
6.2 Evaluation of the algorithms
The evaluation of the isolation algorithms was done with MATLAB. Scripts were used to simulate all combinations of faults, and the data was processed afterwards. All of the performance measures are evaluated both for snapshots of data and for snapshots where the test results are held active once activated. For the case where mixed faults are simulated, the performance is also measured with the test results filtered using the Cusum test. In the additive fault case, filtering is not necessary, because simulations showed that it was always possible to separate a violated residual from a non-violated one.
Figure 6.10: Output from the Bayesian method, where the test results are assumed to be independent, and the fault F_y1^P is present from t = 40 s to t = 120 s. The figure shows the probabilities for the single fault scenario with additive faults considered.
Figure 6.11: Output from the Bayesian method, where the test results are assumed to be independent, and the fault FQPf1 is present from t = 40 s to t = 120 s.
Figure 6.12: Output from the Diagnostic Model Processor. The figure shows the likelihoods for the single fault scenario with additive faults considered, simulating the fault FyP1 from t = 40 s to t = 120 s.
Figure 6.13: Output from the Diagnostic Model Processor. The figure shows the likelihoods for the single fault scenario with additive faults considered, simulating the fault FQPf1 from t = 40 s to t = 120 s.
• Error rate
The error rate is measured for the single-fault case only; it is not evaluated for the multiple fault case.
• Diagnostic resolution
The diagnostic resolution is likewise measured for the single-fault case only.
• Normalised Diagnostic Accuracy
The NDA uses different weights depending on the importance of isolating the current system behavioural mode, which makes it suitable for measuring performance on multiple faults. Isolation performance is therefore measured for all faults, single as well as multiple.
• Memory usage
The memory usage is computed by analysing the memory structures needed for storing the information about the isolation.
The parameters for the benchmark system and the isolation can be found in Appendix A.
6.2.1
Memory usage
The memory usage is denoted δ. For the case where non-boolean structures are used, the number of bits used in the elements of the structures is η.

Single faults

Isolation algorithm                    Need to store                                             δ [bit]
Column reasoning                       A regular FSM with m·n booleans                           m·n
Row reasoning                          A regular FSM with m·n booleans                           m·n
Bayesian method, independence          P(C): n·η;  P(D|C) = ∏_j P(d_j|C): m·n·η                  (1+m)·n·η
Bayesian method, partial independence  P(C): n·η;  P(D|C) = P(d_r d_s|C)·∏_{j≠r,s} P(d_j|C):
                                       ((m−2)·n + 2^2·n)·η                                       (3+m)·n·η
Bayesian method, full dependence       P(C): n·η;  P(D|C): 2^m·n·η                               (1+2^m)·n·η
Diagnostic Model Processor             The sensitivity function with m·n real numbers            m·n·η

Multiple faults

Isolation algorithm                    Need to store                                             δ [bit]
Column reasoning                       An extended FSM with m·2^n booleans                       m·2^n
Row reasoning                          A regular FSM with m·n booleans                           m·n
Bayesian method, independence          P(C): 2^n·η;  P(D|C) = ∏_j P(d_j|C): m·2^n·η              (1+m)·2^n·η
Bayesian method, partial independence  P(C): 2^n·η;  P(D|C): ((m−2)·2^n + 2^2·2^n)·η             (3+m)·2^n·η
Bayesian method, full dependence       P(C): 2^n·η;  P(D|C): 2^m·2^n·η                           (1+2^m)·2^n·η
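The δ formulas can be checked numerically. Assuming m = 4 tests, n = 6 components (the additive-fault case) and η = 64 bits per stored number, they reproduce the memory figures reported in Chapter 7:

```python
# Memory usage formulas evaluated for the additive-fault case:
# m = 4 tests, n = 6 components, eta = 64 bits per stored number.
m, n, eta = 4, 6, 64

single = {
    "Column reasoning":            m * n,                    # regular FSM
    "Row reasoning":               m * n,                    # regular FSM
    "Bayes, independence":         (1 + m) * n * eta,
    "Bayes, partial independence": (3 + m) * n * eta,
    "Bayes, full dependence":      (1 + 2**m) * n * eta,
    "Diagnostic Model Processor":  m * n * eta,              # sensitivity matrix
}

multiple = {
    "Column reasoning":            m * 2**n,                 # extended FSM
    "Row reasoning":               m * n,
    "Bayes, independence":         (1 + m) * 2**n * eta,
    "Bayes, partial independence": (3 + m) * 2**n * eta,
    "Bayes, full dependence":      (1 + 2**m) * 2**n * eta,
}

for name, bits in single.items():
    print(f"{name:30s} single: {bits:7d} bit")
# Column reasoning gives 24 bit, Bayes independence 1920 bit, etc.,
# matching the additive-fault memory table in Section 7.1.4.
```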
Chapter 7
Results
The sections in this chapter present the results from the simulations and the evaluation of the performance.
7.1
Performance measures
A comparison of the performance measures for the two sets of faults shows that isolation is easier for the set of additive faults. The main reason is that the same residuals are used in both cases while there are more faults in the mixed fault case, which makes the faults harder to isolate. When only considering single faults, the diagnostic resolution has its largest value for the Column reasoning and Bayesian methods.
The optimal value for the error rate is zero, and the Bayesian methods always make the error rate equal to zero, because the true system behavioural mode always gets a probability greater than zero. The error rate is high for the Row reasoning method and for Column reasoning under assumption 2. This is because the benchmark system's reactions to some faults are delayed, due to the thresholds of the residuals and time delays in the system. Column reasoning has a particularly high error rate, since it requires that the failure signatures exactly match the test results. Row reasoning has a high error rate in the case where the test results are held. Column reasoning under assumption 1 has higher diagnostic resolution than under assumption 2, which means that assumption 1 is more cautious about excluding faults. The error rate for assumption 2, on the other hand, is higher than for assumption 1.
Row reasoning has a lower diagnostic resolution than Column reasoning. This is due to the minimal diagnoses that Row reasoning produces. In the multiple fault case, the use of minimal diagnoses leads to decreased normalised diagnostic accuracy.
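The core difference between the two FSM-based schemes can be sketched on the additive-fault FSM from Appendix A. This is only the basic idea: details such as how the fault-free mode NF is treated under assumptions 1 and 2 are left out, and the encoding below is an illustrative choice.

```python
# FSM encoded as: test -> set of components the test reacts to ("x" entries).
FSM = {
    "d1": {"c2", "c3", "c4"},
    "d2": {"c2", "c3", "c5"},
    "d3": {"c6"},
    "d4": {"c1", "c6"},
}
COMPONENTS = ["c1", "c2", "c3", "c4", "c5", "c6"]

def column_reasoning(result):
    """Keep the single faults whose FSM column exactly matches the
    observed test-result vector (exact signature matching)."""
    return [c for c in COMPONENTS
            if all((c in comps) == result[t] for t, comps in FSM.items())]

def row_reasoning(result):
    """Each reacted test implicates the components in its row; the
    minimal diagnoses are the minimal hitting sets of these conflict
    sets (only the conflict sets are collected here)."""
    return [sorted(FSM[t]) for t, fired in result.items() if fired]

result = {"d1": True, "d2": True, "d3": False, "d4": False}
print(column_reasoning(result))   # prints ['c2', 'c3']
print(row_reasoning(result))      # the conflict sets of the fired tests
```

Column reasoning discards any candidate whose signature does not match exactly, which is why a delayed test reaction inflates its error rate, while row reasoning never excludes a component merely because some of its tests have not (yet) reacted.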
The Bayesian methods have a high NDA for all simulations.
The effect of holding the test results active, once they are activated, is that fluctuations in the diagnosis statement are avoided. This leads to an increased diagnostic accuracy. The evaluation shows that the effect of holding the test results is largest for the mixed fault case, which can be explained by multiplicative faults generating test results that fluctuate more.
For the additive fault scenario, the diagnoses for the different test results are shown in Appendix B. A comparison of the diagnoses from Column reasoning, assumption 1, and Row reasoning shows that the only difference between them is how the behavioural mode NF is treated. The difference between the Bayesian approaches is that when dependence among tests is assumed, the test result [d1 d2 d3 d4] = [0010] says that neither a single fault nor NF is probable. This means that this particular test result was never present during the simulations; if the algorithm sees such a test result, it cannot say anything about the present fault.
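The independence-assumption Bayesian update discussed here amounts to P(C|D) ∝ P(C)·∏_j P(d_j|C). The following is a minimal sketch of that computation; the modes and probability tables are illustrative placeholders, not the ones estimated in the thesis.

```python
# Naive-Bayes style isolation: P(C|D) proportional to P(C) * prod_j P(d_j|C).
modes = ["NF", "c2-fault", "c6-fault"]
prior = {"NF": 0.98, "c2-fault": 0.01, "c6-fault": 0.01}

# P(d_j = 1 | C): probability that test j reacts in each mode (assumed values).
p_react = {
    "NF":       {"d1": 0.01, "d2": 0.01, "d3": 0.01, "d4": 0.01},
    "c2-fault": {"d1": 0.90, "d2": 0.90, "d3": 0.01, "d4": 0.01},
    "c6-fault": {"d1": 0.01, "d2": 0.01, "d3": 0.90, "d4": 0.90},
}

def posterior(result):
    """Multiply the per-test likelihoods under independence, then normalise."""
    unnorm = {}
    for c in modes:
        p = prior[c]
        for test, fired in result.items():
            q = p_react[c][test]
            p *= q if fired else (1.0 - q)
        unnorm[c] = p
    z = sum(unnorm.values())
    return {c: p / z for c, p in unnorm.items()}

post = posterior({"d1": True, "d2": True, "d3": False, "d4": False})
```

Because every mode keeps a nonzero posterior, the ranked statement never excludes the true mode outright, which is exactly why the Bayesian methods achieve a zero error rate above.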
The differences between the other assumptions are not very large; for example, the test results [1000] and [0100] differ only marginally. The Diagnostic Model Processor's diagnoses look a little different from those of the other methods: it shows the likelihoods for the different faults. The likelihood takes a value from -1 to +1, and one interpretation of this, in the single fault scenario, is that it shows whether the current fault is positive or negative.
7.1.4
Memory usage
The results of the calculations are given in the tables below. Decreasing the number of bits used for storing the tables can reduce the memory usage for the Bayesian methods and the Diagnostic Model Processor. The Row reasoning algorithm uses less memory than any other method considered. In the single fault case, the Column reasoning and Row reasoning methods use an equal amount of memory. The Diagnostic Model Processor uses less memory than the Bayesian methods, but more than Column and Row reasoning.
Additive faults

                                         Memory usage, δ [bit]
Isolation algorithm                      Single faults   Multiple faults
Column reasoning, assumption 1           24              256
Column reasoning, assumption 2           24              256
Row reasoning                            24              24
Bayesian method, independence            1 920           20 480
Bayesian method, partial independence    2 688           28 672
Bayesian method, full dependence         6 528           69 632
Diagnostic Model Processor               1 536           -

Mixed faults

                                         Memory usage, δ [bit]
Isolation algorithm                      Single faults   Multiple faults
Column reasoning, assumption 1           32              1 024
Column reasoning, assumption 2           32              1 024
Row reasoning                            32              32
Bayesian method, independence            2 560           81 920
Bayesian method, partial independence    3 584           114 688
Bayesian method, full dependence         8 704           278 528
Diagnostic Model Processor               2 048           -
Chapter 8
Conclusion
In this chapter, the comparison of the isolation algorithms is discussed and conclusions are drawn from the results. Recommendations about when to use each of the algorithms are also presented.
8.1
Discussion
All isolation algorithms presented here are good at isolating single faults in the benchmark problem, but when it comes to multiple faults it is almost impossible to find an isolation algorithm capable of isolating all of the 64 and 256 faults, respectively. This is because there are too few test quantities: the number of unique diagnoses that can be stated from the four test quantities is 2^4 = 16. In the DMP case, the number of unique diagnoses is much higher, because the thresholds are treated in another way and the sign is also taken into account. To be able to increase the isolation performance, more test quantities are needed.
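The counting argument above can be made concrete: four Boolean tests give only 2^4 = 16 distinguishable outcomes, which cannot separate the 2^6 = 64 (additive) or 2^8 = 256 (mixed) behavioural modes of the multiple-fault case.

```python
from itertools import product

# All possible Boolean test-result vectors for four test quantities.
test_results = list(product([0, 1], repeat=4))

distinguishable = len(test_results)       # 16 distinct outcomes at most
modes_additive = 2 ** 6                   # behavioural modes, additive case
modes_mixed = 2 ** 8                      # behavioural modes, mixed case

print(distinguishable, modes_additive, modes_mixed)
```

By the pigeonhole principle, at least 64 − 16 = 48 additive-fault modes must share a diagnosis statement with some other mode, regardless of how clever the isolation algorithm is.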
The isolation performance in general also depends on how good the tests are.
The performance measures in this thesis show that simple algorithms like Row reasoning and Column reasoning are efficient considering memory usage. They are also good at isolating single faults. This could be a good starting point when developing an isolation system.
If it turns out that the isolation performance is not good enough, or if the diagnoses need to be ranked, then DMP or a Bayesian approach could be interesting. For systems with memory restrictions, DMP would be preferred. If the NDA is not high enough, the Bayesian algorithms should be used. It is important to know that there is a trade-off between memory usage and NDA.
The Bayesian method where the test results are assumed to be independent gives a good estimate of P(C|D), considering both isolation performance and error rate. This could be a good alternative if there are memory restrictions. The other independence assumptions in the Bayesian algorithm are only necessary if better precision is required or if the test results have strong dependence. The differences between the assumptions might also grow with decreased fault sizes.
8.2
Recommendation
From the conclusions above, the following can be recommended:

Isolation method             Suits
Column reasoning             - Small to large-scale systems
                             - When memory usage is restricted
Row reasoning                - When diagnosing large-scale systems
                             - When multiple faults need to be diagnosed
                             - Systems with narrow memory restrictions
Bayesian methods             - When the diagnosis statement needs to be ranked,
                               for example when other isolation algorithms
                               produce too many diagnoses
                             - For medium to large-scale systems
                             - If memory usage is not an issue
Diagnostic Model Processor   - For small to medium-sized systems
                             - When the diagnosis statement needs to be ranked
                             - When there are memory restrictions

Table 8.1: The isolation methods together with the types of problems for which the respective method is recommended.
8.3
Summary
In this thesis the following goals have been reached:
• Four isolation methods have been implemented on a benchmark problem.
• Performance measures have been gathered and developed.
8.4
Future work
The following future work is recommended:
• Extend the benchmark by adding extra tanks and tests, to be able to see how the isolation algorithms handle additional faults, and to measure complexity etc.
• The performance of the isolation depends on how the tests are formed and filtered. More work is needed to find methods for optimising the tests for isolation.
Bibliography
[O. Bouamama] B. Ould Bouamama, R. Mrani Alaoui, P. Taillibert and M. Staroswiecki, Diagnosis of a two-tank system, 2001.
[Nyberg, Frisk] M. Nyberg and E. Frisk, Model Based Diagnosis of Technical Processes, 2005.
[Jensen] M. Jensen, Distributed Fault Diagnosis for Networked Embedded Systems, 2003.
[Gertler] J. Gertler and D. Singer, A New Structural Framework for Parity Equation-based Failure Detection and Isolation, Automatica, 381-388, 1990.
[Pulido] B. Pulido, V. Puig, T. Escobet and J. Quevedo, A new fault localization algorithm that improves..., 2005.
[Wotawa] F. Wotawa, A variant of Reiter's hitting-set algorithm, 1999.
[Pernestål] A. Pernestål, A Bayesian Approach to Fault Isolation - Structure Estimation and Inference, 2005.
[Petti] T. Petti et al., Diagnostic Model Processor: Using Deep Knowledge for Process Fault Diagnosis, AIChE Journal, 36(4):565-575, 1990.
[NiÅrPe] A. Nilsson, K.-E. Årzén and T. F. Petti, Model-based diagnosis - State transition events and constraint equations, 1992.
[M. Kościelny] J. M. Kościelny, Fault isolation in industrial processes by the dynamic table of states method, Automatica, Vol. 31, No. 5, 747-753, 1995.
[Schmid] F. Schmid, Model-Based Fault Detection and Isolation: A New Approach for Fault Isolation in Dynamic Networks with Time Delays, diploma thesis, 2004.
Appendix A
Notation and parameters
Quantity   Description                                        Value      Unit
ε1         Measurement noise                                  -          V
ε2         Measurement noise                                  -          V
ε3         Measurement noise                                  -          V
ε4         Measurement noise                                  -          V
my1        Rate noise                                         5.00e-4    -
my2        Rate noise                                         3.00e-4    -
mUp        Rate noise                                         1.00e-7    -
mQp        Rate noise                                         1.00e-7    -
T          Sample time                                        1          s
A1         Area, tank 1                                       0.0154     m^2
A2         Area, tank 2                                       0.0154     m^2
Cvb        Global hydraulic flow coefficient of valve Vb      1.59e-4    -
Cvo        Hydraulic flow coefficient of valve Vo             1.60e-4    -
h1c        Reference value for PI-controller                  0.5        m
h1         Water level in tank 1                              -          m
h2         Water level in tank 2                              -          m
h1max      Maximal height of tank T1                          0.6        m
h2max      Maximal height of tank T2                          0.6        m
Kp         Proportional control constant                      1.00e-3    -
Ki         Integration control constant                       5.00e-6    -
P1         Pump                                               -          -
Qint1      Inlet flow, tank T1                                -          m^3/s
Qint2      Inlet flow, tank T2                                -          m^3/s
Qout1      Outflow from T1                                    -          m^3/s
Qout2      Outflow from T2                                    -          m^3/s
Q12        Flow from T1 to T2                                 -          m^3/s
Qf1        Leak flow from T1                                  -          m^3/s
Qo         Outflow to consumer                                -          m^3/s
Qpmax      Max flow from P1                                   0.01       m^3/s
Qp         Water flow from P1                                 -          m^3/s
Qmp        Measured water flow from P1                        -          m^3/s
T1         Tank 1                                             -          -
T2         Tank 2                                             -          -
Ubm        Control signal for valve Vb                        -          V
Uom        Control signal for valve Vo                        -          V
Up         Control signal for the pump P1                     -          V
V1         Valve 1                                            -          -
V2         Valve 2                                            -          -
Vb         Valve Vb                                           -          -
ym         Measured level in tanks                            -          m
ym1        Measured level in tank T1                          -          m
ym2        Measured level in tank T2                          -          m
C          System behavioural mode                            -          -
ci         Status for component no. i                         -          -
D          Test result                                        -          -
dj         Status for test no. j                              -          -
i          Component index                                    -          -
j          Test/residual/satisfaction index                   -          -
k          Sample                                             -          -
l          -                                                  -          -
m          Number of test results/residuals/satisfaction      -          -
           tests
n          Number of components                               -          -
rj         Residual number j                                  -          -
a          Assumption                                         -          -
Fi         Failure likelihood                                 -          -
Sij        Sensitivity function                               -          -
vsfi       Satisfaction vector                                -          -
P(C)       Prior probability                                  -          -
P(C|D)     Probability of the system behavioural mode C,      -          -
           given the test results D
P(D|C)     Probability of the test results D, given the       -          -
           system behavioural mode C
L          Total number of samples used when evaluating       -          -
           the performance measures
f(C|DC)    Diagnostic confidence of the system behavioural    -          -
           mode C, given the test result DC
DC         The test result from system behavioural mode C     -          -
α          Normalised diagnostic accuracy                     -          -
β          Error rate                                         -          -
δ          Memory usage                                       -          -
γ          Diagnostic resolution                              -          -
x, y, z    -                                                  -          -
ξ, ζ, ϑ    -                                                  -          -
τ1         Threshold for residual 1     6.30e-4 / -7.63e-5 / -7.63e-5 *  -
τ2         Threshold for residual 2     3.74e-4 / -1.23e-4 / -1.23e-4 *  -
τ3         Threshold for residual 3     1.20e-7 / -1.14e-7 / -3.34e-7 *  -
τ4         Threshold for residual 4     2.24e-7 / -1.03e-7 / -1.03e-7 *  -
*) Thresholds for the cases: additive faults / mixed faults / mixed faults with filtering
FSM used for additive faults

     c1  c2  c3  c4  c5  c6
d1   0   x   x   x   0   0
d2   0   x   x   0   x   0
d3   0   0   0   0   0   x
d4   x   0   0   0   0   x

FSM for mixed faults

     c1  c2  c3  c4  c5  c6  c7  c8
d1   x   x   x   x   0   0   x   x
d2   0   x   x   0   x   0   x   x
d3   0   x   0   x   0   x   0   0
d4   x   0   0   0   0   x   x   0

Sensitivity matrix, S, for additive faults

     c1     c2      c3     c4     c5     c6
d1   0      -0.87   0.87   56.46  0      0
d2   0      0.2815  -0.85  0      54.92  0
d3   0      0       0      0      0      -0.93
d4   -0.87  0       0      0      0      0.87

Sensitivity matrix, S, for mixed faults

     c1  c2  c3  c4  c5  c6  c7  c8
d1   1   1   1   1   0   0   1   1
d2   0   1   1   0   1   0   1   1
d3   0   1   0   1   0   1   0   0