Sensor Validation Using Linear Parametric Models, Artificial Neural Networks and CUSUM

Degree project (Examensarbete) in Automatic Control (Reglerteknik) at Tekniska högskolan i Linköping

by

Gustaf Norman

LiTH-ISY-EX--15/4859--SE

Linköping 2015

Supervisors: Karl Granström, ISY, Linköpings universitet, and Tobias Ek, Siemens

Examiner: Gustaf Hendeby, ISY, Linköpings universitet

Department of Electrical Engineering (Institutionen för systemteknik), Linköpings tekniska högskola, Linköpings universitet

Division of Automatic Control Department of Electrical Engineering Linköpings universitet

SE-581 83 Linköping, Sweden

Date: 2015-05-24

Language: English

Report category: Examensarbete (degree project)

URL for electronic version: http://www.control.isy.liu.se, http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-119004

ISBN: -

ISRN: LiTH-ISY-EX--15/4859--SE

Title of series, numbering: -

ISSN: -

Title (Swedish): Sensorvalidering medelst linjära konfektionsmodeller, artificiella neurala nätverk och CUSUM

Title: Sensor Validation Using Linear Parametric Models, Artificial Neural Networks and CUSUM

Author: Gustaf Norman

Abstract

Siemens gas turbines are monitored and controlled by a large number of sensors and actuators. Process information is stored in a database and used for offline calculations and analyses. Before storing the sensor readings, a compression algorithm checks the signal and skips the values that explain no significant change. Compression of 90 % is not unusual. Since data from the database is used for analyses, and decisions are based on the results of these analyses, it is important to have a system for validating the data in the database. Decisions made on false information can result in large economic losses. When this project was initiated no sensor validation system was available. In this thesis the uncertainties in measurement chains are revealed. Methods for fault detection are investigated and finally the most promising methods are put to the test. Linear relationships between redundant sensors are derived and the residuals form an influence structure allowing the faulty sensor to be isolated. Where redundant sensors are not available, a gas turbine model is utilized to state the input-output relationships so that estimates of the sensor outputs can be formed. Linear parametric models and an ANN (Artificial Neural Network) are developed to produce the estimates. Two techniques for the linear parametric models are evaluated: prediction and simulation. The residuals are also evaluated in two ways: direct evaluation against a threshold and evaluation with the CUSUM (CUmulative SUM) algorithm. The results show that sensor validation using compressed data is feasible. Faults as small as 1% of the measuring range can be detected in many cases.

Keywords: Sensor Validation, Linear Parametric Models, Artificial Neural Networks, ANN, Fault Detection, CUSUM


Acknowledgments

I would like to thank my supervisors, Karl and Tobias and my examiner, Gustaf, for support and enthusiasm, Bengt Svensson for the opportunity and Christer Karlsson for valuable input and support.

I would like to thank my parents and siblings for encouragement, my parents-in-law for baby-sitting and my son, Axel, for sleeping from time to time.

Last but not least, I would like to thank my lovely wife, Jeanette, for invaluable support and encouragement and for tirelessly caring for Axel and the household.


Chapter 1

Introduction

Siemens in Finspång, Sweden, designs and builds gas turbines for mechanical drive and electrical power generation. The SGT-800 gas turbine, solely used for electrical power generation, has with its 50 MW power output the highest capacity in the portfolio. The process is monitored and controlled by a large number of sensors and actuators.

As long as processes are monitored with sensors there will be a desire to assure the validity of the information they provide. Hence, fault detection and process monitoring form a vast research area. In [11] an artificial neural network (ANN) is used to estimate the temperature of a fuel rod in a nuclear reactor core. The estimate is compared with the sensor reading and the difference is then compared with upper and lower limits for fault detection. After pre-processing the data with a discrete wavelet transform technique, reasonable accuracy is achieved.

In [6] fault detection involves a NARX (Nonlinear Auto-Regressive with eXogenous input) model of a chemical process, utilizing an ANN for model identification. The difference between estimate and process measurement is processed with the CUSUM (CUmulative SUM) algorithm, which provides adequate fault detection.

In this thesis an ANN is developed and the CUSUM algorithm is applied. Moreover, linear parametric models are developed and the performance of CUSUM is compared with direct evaluation of the estimation error.

1.1 Motivation

Sensor measurements are used for control and performance calculations. They are also part of the foundation for upgrades and maintenance, which makes it important to understand how accurate the measurements are and to be certain that decisions are based on correct information. Siemens was looking for a way to take data collected from gas turbines and feed it to a system that detects sensor faults. The system should work with compressed data, i.e. data that does not have a value for every sample. The system should not require deep understanding of physical relationships. Preferably, the system should also be easy to implement in code and not require heavy processing power. This thesis first deals with sensor accuracy, and its second main part deals with methods for achieving sensor validation.

1.2 Problem Formulation

The purpose of this thesis is to assure the quality of data collected from gas turbines around the world. Fault tracing, maintenance and operational actions must not be based on false information. The aim is to develop a method for sensor fault detection.

Even if a sensor is not faulty, there are still uncertainties to consider, and not just the measurement uncertainty of the sensor itself. All components on the way to the final destination where the values are stored contribute to the total measurement uncertainty. When designing a fault detection system it is important to be aware of these uncertainties. Therefore, investigating the measurement chains and calculating the resulting uncertainties is part of the task. This thesis aims to answer the following questions:

• What methods for modeling or detection are available?

• What methods are suitable for sensor fault detection?

• What are the measurement uncertainties of measurement chains on the gas turbine?

• Is it possible to use compressed data to achieve sensor validation?

1.3 Limitations

• No data for real faults is available. Faults have to be generated by manipulating data from normal operation.

• Only offset errors are considered and faults are constant for the complete sequences. Offset error is a reasonable assumption as calibration drift may occur, which cannot be modeled as a step change.

• Available data is compressed and data points are linearly interpolated to achieve values for each sample.

• Data is treated as uncompressed data, i.e. not assuming the absence of noise.
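The linear interpolation mentioned in the limitations can be illustrated with a minimal Python sketch. It assumes that the stored samples are available as time stamps and values and that a uniform sample period of 1 second is wanted; the function name `resample_linear` and the example numbers are hypothetical and not taken from the thesis.

```python
import numpy as np

def resample_linear(t_stored, x_stored, period=1.0):
    """Linearly interpolate stored (compressed) samples onto a uniform time grid."""
    t_stored = np.asarray(t_stored, dtype=float)
    x_stored = np.asarray(x_stored, dtype=float)
    # Uniform grid from the first to the last stored time stamp (inclusive).
    t_uniform = np.arange(t_stored[0], t_stored[-1] + period / 2, period)
    return t_uniform, np.interp(t_uniform, t_stored, x_stored)

# Hypothetical compressed signal: only three values were stored.
t, x = resample_linear([0, 4, 6], [10.0, 18.0, 14.0])
```

The values between two stored points lie on a straight line, so any noise that the compression removed is not recovered by the interpolation.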

1.4 Thesis Outline

Chapter 1 describes the problem formulation and the prerequisites together with this thesis outline.

Chapter 2 describes the measuring chain and its components. A procedure for calculating uncertainties is stated together with calculated uncertainties for common sensors. The compression algorithm is also described.

Chapter 3 covers the theory of modeling techniques together with detailed descriptions of the methods that are used in this thesis.

Chapter 4 presents a method for fault detection with redundant sensors and the approach for non-redundant sensors, and describes the CUSUM algorithm.

Chapter 5 states the resulting models together with plots and tables.

Chapter 6 discusses the results and methods.

Chapter 7 contains the conclusion and future work.


Chapter 2

Measurement Uncertainty

It is important to understand that all measurements are subject to uncertainties. An estimate of a measured quantity cannot be more accurate than the available information. In this chapter the analogue sensors and transmitters that are commonly used on gas turbines are investigated and their uncertainties are highlighted. Section 2.1 describes the general configuration of a measuring chain. How to calculate uncertainties in a measurement chain is explained in Section 2.2 and the subsections that follow give detailed information about common sensors and transmitters. Section 2.3 describes the historical data system and its compression algorithm.

2.1 Measuring Chain

Analogue input boards (AI-boards) generally require voltages and currents in specific linear ranges, e.g. 0 − 10 V or 4 − 20 mA. It is common that sensors do not generate physical quantities suitable for AI-boards. Instead they generate charges or very small voltages or currents, and the quantities are not always linear. Thus, sensors are often connected to transducers that handle the linearization and conversion between physical quantities. In some cases the sensor and transducer are combined into one unit called a transmitter. There is no upper limit for the number of additional devices such as safety barriers and converters, but they are preferably kept to a minimum. Figure 2.1 illustrates the general configuration of a measuring chain.

Sensor -> Transducer -> Barrier/converter etc. -> AI-board
(sensor and transducer may be combined into a transmitter)

Figure 2.1. General configuration of a measuring chain

On the core engine mostly temperatures and pressures are measured. Process variables such as compressor pressure and exhaust gas temperature but also bearing temperatures and vibrations are measured. For temperature measurements two types of sensors are used: the thermocouple and the Pt100. The thermocouple is based on the principle that a temperature gradient generates an electromotive force in a circuit of two metals or alloys. The Pt100 is based on platinum and has different resistance for different temperatures. At 0 °C the resistance is 100 Ω and it then increases with increasing temperature. Pressures, such as gauge, absolute and differential pressure, on new models of gas turbines are measured with the Siemens DSIII series. These instruments are available for different measuring ranges and can be further configured to span a more specific range.

2.2 Uncertainties in a Measuring Chain

All steps in a measuring chain contribute to the total uncertainty. The procedure in [4] describes how to handle multiple uncertainties and suits the industry application. When information on sensors and measuring equipment is taken from data sheets, as in this chapter, uncertainties are considered as ‘Type B’ uncertainties with a uniform distribution limited by ±a. When working with different kinds of uncertainties, i.e. uncertainties with different origins and types, they must be transformed to the same unit and then be converted to ‘standard uncertainties’, u, in order to have the uncertainties expressed at the same confidence level. A Type B uncertainty is converted to a standard uncertainty using $u = a/\sqrt{3}$. If the uncertainties are independent, the ‘combined uncertainty’, $u_c$, can be formed as $u_c = \sqrt{u_1^2 + u_2^2 + \ldots + u_n^2}$, where $u_i$, $i = 1, 2, \ldots, n$, are standard uncertainties. The combined uncertainty can further be expressed as the ‘expanded uncertainty’, U, by multiplying $u_c$ with a coverage factor, k, that gives the desired confidence level of U. For a confidence level of 95% the coverage factor k = 2 is used, which gives the resulting expression [4]

$$U = 2\sqrt{\left(\frac{a_1}{\sqrt{3}}\right)^2 + \left(\frac{a_2}{\sqrt{3}}\right)^2 + \ldots + \left(\frac{a_n}{\sqrt{3}}\right)^2}, \qquad (2.1)$$

where $a_i$, $i = 1, 2, \ldots, n$, are uncertainties in the measuring chain. An example of how to express a measuring result can be seen in Section 2.2.1.
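As an illustration of (2.1), the calculation can be written as a small Python function. This is only a sketch under the assumptions stated in the text (independent Type B limits with uniform distribution, all expressed in the same unit); the function name `expanded_uncertainty` and the example limits 0.3 and 0.4 are hypothetical.

```python
import math

def expanded_uncertainty(limits, k=2.0):
    """Expanded uncertainty per (2.1) for independent Type B limits a_i."""
    # Convert each limit +/- a_i to a standard uncertainty u_i = a_i / sqrt(3).
    standard = [a / math.sqrt(3.0) for a in limits]
    # Combined uncertainty: root of the sum of squared standard uncertainties.
    combined = math.sqrt(sum(u * u for u in standard))
    # Expanded uncertainty: combined uncertainty times the coverage factor k.
    return k * combined

# Hypothetical chain with two contributions given in the same unit.
U = expanded_uncertainty([0.3, 0.4])
```

Because the contributions are added in quadrature, the largest limit dominates the result, which is why a more accurate AI-board only helps when the other contributions are of similar or smaller size.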

2.2.1 DSIII Gauge Pressure

The signal from the transmitter can be connected to two different AI-boards. They are:

• 6ES7336-1HE00-0AB0: Uncertainty ±0.4%

• 6ES7331-7NF10-0AB0: Uncertainty ±0.05%

According to the data sheet for the transmitter, the expanded uncertainty also depends on the span ratio r = max. span/set span, i.e. the ratio between the transmitter’s maximum allowed range and the range configured by the user. Many combinations are present and therefore a representative mean has been selected with

$$r = \frac{1600\,\mathrm{kPa} - 16\,\mathrm{kPa}}{500\,\mathrm{kPa} - (-70)\,\mathrm{kPa}} \approx 2.78.$$

The contributing uncertainties other than the AI-board are:

• DSIII:

– Linear characteristics: (0.0029r + 0.071)%

– Influence of ambient temperature: (0.08r + 0.1)%

The expanded uncertainties calculated according to (2.1) are stated in Table 2.1.

AI-board              U [%]
6ES7336-1HE00-0AB0    0.6
6ES7331-7NF10-0AB0    0.39

Table 2.1. Expanded uncertainties for gauge pressure

This means that when using the values stated above in combination with the AI-board 6ES7336-1HE00-0AB0 and the reading from the transmitter is x, the measurement can be expressed as x ± 0.006x kPa, where the expanded uncertainty is based on a standard uncertainty multiplied by a coverage factor k = 2, providing a level of confidence of approximately 95%.
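The gauge pressure figures can be checked with a short Python sketch that applies (2.1) to the contributions listed in this section. The helper name `expanded_uncertainty` and the dictionary `results` are hypothetical; the numerical inputs are the ones stated above.

```python
import math

def expanded_uncertainty(limits, k=2.0):
    """Expanded uncertainty per (2.1) for independent Type B limits."""
    return k * math.sqrt(sum((a / math.sqrt(3.0)) ** 2 for a in limits))

# Span ratio used as the representative mean, about 2.78.
r = (1600.0 - 16.0) / (500.0 - (-70.0))
linearity = 0.0029 * r + 0.071      # linear characteristics [%]
ambient = 0.08 * r + 0.1            # influence of ambient temperature [%]

boards = {"6ES7336-1HE00-0AB0": 0.4, "6ES7331-7NF10-0AB0": 0.05}
results = {}
for board, a_board in boards.items():
    results[board] = expanded_uncertainty([a_board, linearity, ambient])
    print(f"{board}: U = {results[board]:.2f} %")
```

Rounded to two decimals, the two results are 0.60 % and 0.39 %, in agreement with Table 2.1.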

2.2.2 DSIII Absolute Pressure

According to the data sheet the expanded uncertainty also depends on the span ratio r = max. span/set span. Many combinations are present and therefore a representative mean has been selected with

$$r = \frac{130\,\mathrm{kPa} - 4.3\,\mathrm{kPa}}{120\,\mathrm{kPa} - 60\,\mathrm{kPa}} = 2.095.$$

The contributing uncertainties other than the AI-board are:

• DSIII:

– Linear characteristics: 0.1%

– Influence of ambient temperature: (0.1r + 0.2)%

The expanded uncertainties calculated according to (2.1) are stated in Table 2.2.

AI-board              U [%]
6ES7336-1HE00-0AB0    0.67
6ES7331-7NF10-0AB0    0.49

Table 2.2. Expanded uncertainties for absolute pressure

2.2.3 DSIII Differential Pressure

According to the data sheet the expanded uncertainty also depends on the span ratio r = max. span/set span. Many combinations are present and therefore a representative mean has been selected with

$$r = \frac{6\,\mathrm{kPa} - 0.1\,\mathrm{kPa}}{6\,\mathrm{kPa} - 0\,\mathrm{kPa}} \approx 0.98.$$

The contributing uncertainties other than the AI-board are:

• DSIII:

– Linear characteristics: (0.0029r + 0.071)%

– Influence of ambient temperature: (0.08r + 0.1)%

The expanded uncertainties calculated according to (2.1) are stated in Table 2.3.

AI-board              U [%]
6ES7336-1HE00-0AB0    0.51
6ES7331-7NF10-0AB0    0.23

Table 2.3. Expanded uncertainties for differential pressure

2.2.4 Thermocouple Type N Class 1

The signal from this type of sensor can either be connected to a safety barrier which also serves as a signal converter and then to an AI-board or it can be directly connected to a special thermocouple AI-board. The AI-boards are:

• 6ES7336-1HE00-0AB0: Uncertainty ±0.4%

• 6ES7331-7PF11-0AB0: Uncertainty ±2.2 °C

The contributing uncertainties other than the AI-board are:

• D1072D Signal Converter:

– Calibration and linearity: ±40 µV

– Ref. junction compensation influence: ±1 °C

– Calibration accuracy: ±0.1%

– Linearity error: ±0.05%

• Thermocouple:

– Tolerance: ±1.5 °C for −40 < T ≤ 375; ±0.004|T| °C for 375 < T ≤ 1000

The voltage contribution affects the output differently since the voltage generated from the thermocouple is not proportional to the temperature. Neither is the tolerance for the thermocouple. The expanded uncertainties calculated according to (2.1) are stated in Table 2.4.

              6ES7336-1HE00-0AB0    6ES7331-7PF11-0AB0
Temp [°C]     U [°C]                U [°C]
0             2.74                  3.07
100           2.65                  3.07
200           2.69                  3.07
300           2.84                  3.07
400           3.15                  3.14
500           3.72                  3.43
600           4.31                  3.76
700           4.93                  4.11
800           5.56                  4.48
900           6.2                   4.87
1000          6.84                  5.27

Table 2.4. Expanded uncertainties for thermocouples at different temperatures

2.2.5 Pt100 Class A

This sensor can either be connected to a safety barrier which also serves as a signal converter and then to an AI-board, or it can be directly connected to a special Pt100 AI-board. The AI-boards are:

• 6ES7336-1HE00-0AB0: Uncertainty ±0.4%

• 6ES7331-7NF10-0AB0: Uncertainty ±0.05%

• 6ES7331-7PF01-0AB0: Uncertainty ±0.5 °C

The sensor can be connected using 2, 3 or 4 wires. Using fewer than four wires introduces an error. The 2-wire connection will not be treated here since it is not used. The contributing uncertainties other than the AI-board are:

• D1072D Signal Converter:

– Calibration and linearity: ±200 mΩ

– Calibration accuracy: ±0.1%

– Linearity error: ±0.05%

• Pt100:

– Tolerance: ±(0.15 + 0.002|T|) °C

– 3-wire connection: ±0.5 °C

Not all contributions are proportional and therefore this section shows the uncertainties for temperatures between 0 and 150 °C. The expanded uncertainties calculated according to (2.1) are stated in Table 2.5.

             6ES7336-1HE00-0AB0    6ES7331-7NF10-0AB0    6ES7331-7PF01-0AB0
             3 wire    4 wire      3 wire    4 wire      3 wire    4 wire
Temp [°C]    U [°C]    U [°C]      U [°C]    U [°C]      U [°C]    U [°C]
0            0.84      0.61        0.84      0.61        0.60      0.17
10           0.85      0.62        0.85      0.62        0.61      0.20
20           0.87      0.65        0.86      0.64        0.63      0.25
30           0.88      0.66        0.87      0.65        0.65      0.30
40           0.90      0.68        0.88      0.66        0.68      0.35
50           0.91      0.71        0.88      0.67        0.71      0.41
60           0.93      0.74        0.89      0.68        0.74      0.47
70           0.96      0.77        0.90      0.69        0.78      0.52
80           0.98      0.80        0.91      0.71        0.82      0.58
90           1.02      0.84        0.93      0.73        0.87      0.64
100          1.05      0.88        0.94      0.75        0.91      0.70
110          1.08      0.91        0.96      0.76        0.96      0.77
120          1.11      0.95        0.97      0.78        1.01      0.83
130          1.15      0.99        0.98      0.80        1.06      0.89
140          1.19      1.04        1.00      0.81        1.11      0.95
150          1.23      1.08        1.02      0.84        1.16      1.01

Table 2.5. Expanded uncertainties for Pt100 at different temperatures

2.2.6 Vaisala Temperature and Relative Humidity

From this transmitter both temperature and relative humidity (RH) can be extracted. The AI-board used is 6ES7331-7NF10-0AB0 and the measuring range for temperature measurement is [−40, +80] °C. The contributing uncertainties are:

• AI-board:

– Basic error: ±0.05%

• Transmitter:

– Temperature tolerance: ±(0.2 + |T − 20|/400) °C

– RH accuracy: ≤ 1% for 0 ≤ RH ≤ 90%; ≤ 1.7% for 90 < RH ≤ 100%

The expanded uncertainties calculated according to (2.1) when measuring temperature are stated in Table 2.6 and the expanded uncertainties calculated according to (2.1) when measuring the relative humidity are stated in Table 2.7.

Temp [°C]    U [°C]
-40          0.40
-30          0.38
-20          0.35
-10          0.32
0            0.29
10           0.26
20           0.23
30           0.26
40           0.29
50           0.32
60           0.35
70           0.38
80           0.40

Table 2.6. Expanded uncertainties when measuring temperature

RH [%]    U [%]
0-90      1.16
90-100    1.96

Table 2.7. Expanded uncertainties when measuring relative humidity

2.2.7 Coriolis Mass Flow

This transmitter is connected to the 6ES7331-7NF10-0AB0 AI-board and the contributing uncertainties are:

• AI-board:

– Basic error: ±0.05%

• Transmitter:

– Accuracy: ±0.35%

The expanded uncertainty calculated according to (2.1) is U ≈ 0.41%.
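As a worked instance of (2.1), the stated value can be traced step by step from the two contributions above, assuming both are limits of uniform distributions expressed in the same unit:

```latex
U = 2\sqrt{\left(\frac{0.05}{\sqrt{3}}\right)^2 + \left(\frac{0.35}{\sqrt{3}}\right)^2}
  = 2\sqrt{\frac{0.0025 + 0.1225}{3}}
  = 2\sqrt{\frac{0.125}{3}}
  \approx 0.41\,\%
```

The transmitter accuracy dominates; the AI-board contributes very little to the result.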

2.3 Overview of Data Acquisition

Siemens gas turbines are delivered with a historical data system that stores measurements in a database. The sensor readings are passed through a number of steps before finally reaching the Data Collector (DC), which is an application consisting of an OPC (Object Linking and Embedding for Process Control) client that fetches data from the control system, a compression algorithm and functions for storing data in a database. Figure 2.2 illustrates the path for the sensor readings.

Sensor Reading -> Control System -> Operator Station -> Data Collector

Figure 2.2. Path for sensor readings

The sample period is approximately 1 second and values are time stamped. The calculation of how much a signal’s value is allowed to deviate from its current trend without being stored in the database is based on an FFT (Fast Fourier Transform) technique and is determined permanently when the DC is installed and the gas turbine runs at full load. A signal’s value is assumed to be represented by frequencies in the lower band while the noise is assumed to be concentrated at higher frequencies. Under this assumption, the signal is filtered out and the amplitude of the noise is calculated. From this, a parameter for a signal’s maximum allowed deviation, d, is set. This is done for each analogue signal.

2.3.1 The Compression Algorithm

To reduce the amount of data being stored in the database, a compression algorithm processes the data so that only relevant data remains. The algorithm is explained in the following steps:

1. The DC receives a value x(t) at time t.

2. A line is interpolated between x(t) and x(t − n), where x(t − n) is the most recent permanently stored value in the database.

3. A check is performed to see if the temporarily stored values x(t − 1), . . . , x(t − n + 1) are close to the interpolated line, i.e. if

$$\left| x(t-i) - \frac{x(t-n) - x(t)}{n}\, i - x(t) \right| \le d \qquad (2.2)$$

where i = 1, . . . , n − 1 and d is the deviation parameter for the signal.

4. If no deviation is greater than d, no value is stored in the database and the algorithm returns to step 1.

5. If one or more deviations are greater than d, the second most recent value, x(t − 1), will be stored in the database and the algorithm returns to step 1.
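The steps above can be sketched in Python as follows. This is not the Data Collector implementation, only an offline illustration under two assumptions that the text does not state: the first sample is always stored, and the signal is given as a complete list with one value per sample. The function name `compress` and the test signal are hypothetical.

```python
def compress(samples, d):
    """Return the indices of the samples that would be stored permanently."""
    if len(samples) == 0:
        return []
    stored = [0]  # assumption: the first sample is always stored
    for t in range(1, len(samples)):
        last = stored[-1]          # index of the most recent stored value x(t - n)
        n = t - last
        # Deviation of each temporarily held value x(t - i) from the line
        # between x(t) and x(t - n), as in (2.2).
        exceeded = any(
            abs(samples[t - i] - (samples[last] - samples[t]) / n * i - samples[t]) > d
            for i in range(1, n)
        )
        if exceeded:
            stored.append(t - 1)   # store the second most recent value
    return stored

# Hypothetical signal: slope 1 up to index 4, slope 6 afterwards.
kept = compress([0, 1, 2, 3, 4, 10, 16, 22], d=0.5)
```

For this signal only the first sample and the sample at the change of slope are kept, since all other samples lie on the lines between stored values.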

[Figure with two panels. Upper panel labels: "Most recent stored value", "Most recent value". Lower panel labels: "Most recent stored value", "Outside limits", "This value will be stored", "Most recent value".]

Figure 2.3. Principle of the compression algorithm

In Figure 2.3 the upper graph shows a case where no values between t and t − n deviate more than d from the interpolated line, so a new value can be sent to the DC. The lower graph in Figure 2.3 shows the case where the new value, x(t), makes the value x(t − 2) deviate more than d from the interpolated line; hence x(t − 1) will be permanently stored in the database.

2.3.2 CMSView

For data collecting, the tool CMSView (Figure 2.4) has been used. This tool is developed at Siemens and is an application to view and extract data stored in a database. The data in the database is imported to the application and the values are plotted and joined with lines in order to view trends of the selected signals (tags). Data can be exported to Excel.


Chapter 3

Analysis Methods

There are many different kinds of methods available for modeling and fault detection. Depending on the application and on circumstances such as noise and sample time, different methods can be more or less suitable for sensor validation. This chapter briefly describes some methods that are less promising for this application and then describes the more promising methods in more detail. The promising methods are further investigated in this thesis.

3.1 Dismissed Methods

3.1.1 First Principle Models

This method is based on physical modeling, which uses knowledge of the characteristics of a system’s components, e.g. mass and inertia. The resulting models are often differential equations that have to be solved [10].

Because of the excessive work required to use this method for sensor validation, it is ruled out and will not be considered as an alternative.

3.1.2 Wavelets

The wavelet transform resembles the Fourier transform in its mathematical definition. The Fourier transform is able to extract the frequency components of a time domain signal. However, the Fourier transform has a drawback: high resolution in both time and frequency cannot be achieved simultaneously. For non-stationary signals the wavelet transform can be more practical, since it uses a variable window width [13].

The aim is to have a system validating the data in a database and since the data in the database has been compressed and thus lost significant frequency information, a method based on frequency analysis would probably not be very successful.


3.1.3 Self-Organizing Maps

A self-organizing map (SOM) is a type of artificial neural network which is categorized as a competitive learning network. The SOM relies on unsupervised learning, which means that there are no targets for the given inputs [5]. A SOM generally has input nodes in one layer and output nodes in a one- or two-dimensional layer. The output nodes are arranged in a lattice pattern and since the weights and positions of the output nodes are adjusted during the learning process, the pattern is often described as a topographic map over input data features. The output nodes have the function of classifying the input by activating only one output node for each input data [13].

Since the SOM-technique relies on unsupervised learning, this method is not appropriate for a sensor validation purpose. One possible field of application would however be to let a SOM classify input patterns as ‘normal’ or ‘not normal’, but there are more interesting alternatives to investigate.

3.1.4 Bayesian Networks

Bayesian networks are often used for reasoning when information is scarce and knowledge is less abundant. They are very useful for tasks in the fields of classification and pattern recognition. Bayesian networks are built up by nodes and arcs connecting the nodes. The nodes represent variables and an arc between two nodes describes a direct influence, while an indirect influence is described by the path from one node to another. A node is called a parent node or a child node and the arcs always have the direction from parent node to child node. The state of a child node depends on the connected parent nodes. A conditional probability table is used to explain the relationships between discrete nodes. Data can be used to train the table but it is also possible to manually edit the table. This however requires expert knowledge and the complexity increases with large tables [9].

The primary field of Bayesian networks does not really match the purpose of sensor validation as a first step. Training is a lively research area since there is no standard method for this [9]. A discussion with Finn Jensen, Dr. Tech., Aalborg University, reveals that only inconsistency in a cluster of sensors can be discovered. Only with further investigation, requiring more human interaction, is it possible to tell which sensor is faulty.

3.2 Promising Methods

The remainder of this chapter addresses the methods that are considered promising; they are therefore presented in more detail than the methods mentioned above.

3.3 Linear Parametric Models

This section refers to a certain family of black-box models that are popular due to their flexibility and applicability. These models are either based on linear regression or dynamic transfer functions [10]. Linear regression is convenient because the output y can be written as $y(t) = \theta^T \varphi(t)$, where $\theta$ is a column vector of parameters and $\varphi(t)$ is a column vector with previous values of inputs and outputs. When the models are based on dynamic transfer functions, the output from a model is calculated by convolution of the input signal with the impulse response of the system model. If the output is denoted y(t), the system’s impulse response is denoted h(t) and the input is denoted u(t), then y(t) is calculated using convolution as

$$y(t) = (u * h)(t) = \int_{-\infty}^{\infty} u(\tau)\, h(t - \tau)\, d\tau. \qquad (3.1)$$
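For sampled data the integral in (3.1) becomes a sum over past samples. The following Python sketch shows this discrete-time counterpart using `numpy.convolve`; it assumes a causal impulse response of finite length, and the function name `model_output` and the numbers in `h` are hypothetical.

```python
import numpy as np

def model_output(u, h):
    """Discrete-time convolution y[t] = sum_k h[k] * u[t - k], truncated to len(u)."""
    return np.convolve(u, h)[: len(u)]

# A unit impulse as input returns the impulse response itself.
u = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
h = np.array([0.0, 2.0, 1.0, 0.5])   # hypothetical impulse response with one sample delay
y = model_output(u, h)
```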

When developing models for a system it is sometimes difficult or too time consuming to establish the inherent physical relationships. A popular way to resolve this problem is to lean on this family of linear parametric models. A model of this type can be written as

$$y(t) = G(q, \theta)\, u(t) + H(q, \theta)\, e(t) \qquad (3.2)$$

where

$$G(q, \theta) = \frac{B(q)}{F(q)} = \frac{b_1 q^{-n_k} + b_2 q^{-n_k-1} + \ldots + b_{n_b} q^{-n_k-n_b+1}}{1 + f_1 q^{-1} + \ldots + f_{n_f} q^{-n_f}} \qquad (3.3)$$

and

$$H(q, \theta) = \frac{C(q)}{D(q)} = \frac{1 + c_1 q^{-1} + \ldots + c_{n_c} q^{-n_c}}{1 + d_1 q^{-1} + \ldots + d_{n_d} q^{-n_d}}. \qquad (3.4)$$

Here, $\theta$ is the parameter vector for the coefficients $b_i$, $c_i$, $d_i$ and $f_i$. The output from the system is y(t), while u(t) and e(t) are input and noise respectively. Also, q is the shift operator, meaning that $q\,u(kT) = u((k+1)T)$ and $q^{-1}u(kT) = u((k-1)T)$. With no simplifications this model is known as the Box-Jenkins (BJ) model. If C(q) = 1 and F(q) = D(q) = A(q), the simpler model structure called ARX (AutoRegressive with eXogenous input) is obtained. Another simplification is to omit the noise model, i.e. C(q)/D(q) = 1, and the Output Error (OE) model is obtained. The models mentioned above can be seen in Figure 3.1.

The ARX-model uses the same dynamics for the input signal and the noise while the BJ-model separates the dynamics for input and noise. The OE-model concentrates on the input and has no model for the noise. To estimate a model, all that is required apart from data is to specify the model order; the parameters are then adjusted to fit the data.

[Figure: block diagrams of the ARX, OE and BJ model structures, each with input u, noise e and output y]

Figure 3.1. ARX, OE and BJ as block diagrams

3.3.1 Finding the Parameters

The configuration of the parameter vector $\theta$ is described in (3.5) for an ARX model, in (3.6) for an OE model and in (3.7) for a BJ model.

$$\theta_{ARX} = (a_1, \ldots, a_{n_a}, b_1, \ldots, b_{n_b})^T \qquad (3.5)$$

$$\theta_{OE} = (b_1, \ldots, b_{n_b}, f_1, \ldots, f_{n_f})^T \qquad (3.6)$$

$$\theta_{BJ} = (b_1, \ldots, b_{n_b}, c_1, \ldots, c_{n_c}, d_1, \ldots, d_{n_d}, f_1, \ldots, f_{n_f})^T \qquad (3.7)$$

The parameter vector $\theta$ is established by finding the values that minimize the prediction error. The prediction $\hat{y}(t|\theta)$ is an estimate of y(t) made at time t − 1 and the prediction error is thus $\varepsilon(t, \theta) = y(t) - \hat{y}(t|\theta)$. For a data set containing data for times t = 1, . . . , N a quality number $V_N(\theta)$ can be calculated as

$$V_N(\theta) = \frac{1}{N} \sum_{t=1}^{N} \varepsilon^2(t, \theta) \qquad (3.8)$$

and $\theta$ is chosen as the minimizer, $\hat{\theta}_N = \arg\min_\theta V_N(\theta)$. $V_N(\theta)$ is often a complicated function and $\hat{\theta}_N$ has to be calculated using an iterative search method for the minimum, such as the Newton-Raphson method [10].
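For the ARX structure the prediction is linear in the parameters, so minimizing (3.8) reduces to an ordinary least-squares problem and no iterative search is needed in that special case. The Python sketch below illustrates this with `numpy.linalg.lstsq`; it is not the System Identification Toolbox procedure used in the thesis, and the function name `fit_arx` and the first-order test system are hypothetical.

```python
import numpy as np

def fit_arx(y, u, na, nb, nk):
    """Least-squares estimate of (a_1..a_na) and (b_1..b_nb) for an ARX model."""
    y = np.asarray(y, dtype=float)
    u = np.asarray(u, dtype=float)
    start = max(na, nk + nb - 1)   # first index with a full regression vector
    rows = []
    for t in range(start, len(y)):
        past_outputs = [-y[t - i] for i in range(1, na + 1)]
        past_inputs = [u[t - nk - j] for j in range(nb)]
        rows.append(past_outputs + past_inputs)
    phi = np.array(rows)
    # Solve min ||phi @ theta - y||^2, which minimizes V_N for the ARX predictor.
    theta, *_ = np.linalg.lstsq(phi, y[start:], rcond=None)
    return theta[:na], theta[na:]

# Hypothetical noise-free system y(t) = 0.5 y(t-1) + 2 u(t-1), i.e. a_1 = -0.5, b_1 = 2.
rng = np.random.default_rng(0)
u = rng.standard_normal(200)
y = np.zeros(200)
for t in range(1, 200):
    y[t] = 0.5 * y[t - 1] + 2.0 * u[t - 1]

a_hat, b_hat = fit_arx(y, u, na=1, nb=1, nk=1)
```

For OE and BJ models the prediction depends nonlinearly on the parameters, which is where an iterative search such as the one mentioned in the text is required.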

3.3.2 Prediction

One way to use a model for sensor validation is to let it predict the value of the output $y$, i.e. the sensor reading. The predicted value $\hat{y}$ is calculated as follows. First divide (3.2) by $H(q, \theta)$:

$$H^{-1}(q, \theta)y(t) = H^{-1}(q, \theta)G(q, \theta)u(t) + e(t). \tag{3.9}$$

Adding and subtracting $y(t)$ yields

$$y(t) = [1 - H^{-1}(q, \theta)]y(t) + H^{-1}(q, \theta)G(q, \theta)u(t) + e(t) \tag{3.10}$$

and the predictor is obtained by omitting the noise $e(t)$,

$$\hat{y}(t|\theta) = [1 - H^{-1}(q, \theta)]y(t) + H^{-1}(q, \theta)G(q, \theta)u(t). \tag{3.11}$$

Equation (3.11) shows the predictor for the general case, i.e. for a BJ-model. It depends on old values of both y and u. If the model is an OE-model the predictor is

$$\hat{y}(t|\theta) = G(q, \theta)u(t) \tag{3.12}$$

and no longer depends on old values of $y$. In the case of an ARX-model the predictor is

$$\begin{aligned} \hat{y}(t|\theta) &= [1 - A(q, \theta)]y(t) + B(q, \theta)u(t) \\ &= (-a_1 q^{-1} - \ldots - a_{n_a} q^{-n_a})y(t) + (b_1 q^{-n_k} + b_2 q^{-n_k-1} + \ldots + b_{n_b} q^{-n_k-n_b+1})u(t) \end{aligned} \tag{3.13}$$

which depends on old values of both $y$ and $u$.
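Written out sample by sample, the ARX predictor (3.13) is a weighted sum of old measured outputs and inputs. A minimal Python sketch follows, assuming the coefficients are already known. The function name `arx_predict` and the numbers in the example are illustrative assumptions, not values from the thesis.

```python
def arx_predict(y, u, a, b, nk=0):
    """One-step-ahead ARX prediction (3.13) from measured old values of y.

    a = [a1, ..., ana], b = [b1, ..., bnb]. Entries for which not enough
    old samples exist are returned as None.
    """
    na, nb = len(a), len(b)
    start = max(na, nk + nb - 1)
    y_hat = [None] * len(y)
    for t in range(start, len(y)):
        ar_part = -sum(a[i] * y[t - 1 - i] for i in range(na))   # -a1 y(t-1) - ...
        in_part = sum(b[j] * u[t - nk - j] for j in range(nb))   # b1 u(t-nk) + ...
        y_hat[t] = ar_part + in_part
    return y_hat

# Example with a1 = -0.5 and b1 = 1, i.e. y_hat(t) = 0.5 y(t-1) + u(t).
measured_y = [2.0, 4.0, 6.0]
measured_u = [1.0, 1.0, 1.0]
predicted = arx_predict(measured_y, measured_u, a=[-0.5], b=[1.0])
```

Because each prediction is built from the latest sensor readings, the residual $y(t) - \hat{y}(t|\theta)$ is formed against an estimate that already contains the measured signal one step back.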

3.3.3 Simulation

Another way to use these models is to let them simulate the output when input data is presented to them. Wherever old values of $y$ are required, old simulated values of $y$ are used instead. The simulated output of a BJ-model is calculated according to

$$y_{sim}(t) = [1 - H^{-1}(q, \theta)]y_{sim}(t) + H^{-1}(q, \theta)G(q, \theta)u(t), \tag{3.14}$$

i.e. the same as (3.11) but with old simulated values of $y$ instead of old measurements of $y$. For an OE-model the simulated output is calculated according to

$$y_{sim}(t) = G(q, \theta)u(t). \tag{3.15}$$

An ARX-model can be written as the difference equation

$$y(t) = -a_1 y(t-1) - \ldots - a_{n_a} y(t-n_a) + b_1 u(t-n_k) + \ldots + b_{n_b} u(t-n_k-n_b+1) + e(t). \tag{3.16}$$

When simulating an ARX-model, the output is calculated according to

$$y_{sim}(t) = -a_1 y_{sim}(t-1) - \ldots - a_{n_a} y_{sim}(t-n_a) + b_1 u(t-n_k) + \ldots + b_{n_b} u(t-n_k-n_b+1). \tag{3.17}$$

As with the OE- and BJ-models, the noise is omitted when simulating the output. Initially there are no old simulated values of the output, so these have to be specified in advance.
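The recursion in (3.17) can be sketched in a few lines of Python. This is an illustration under assumed coefficients rather than the thesis implementation; `arx_simulate` and `y_init` are hypothetical names, and unspecified initial values are taken as zero.

```python
def arx_simulate(u, a, b, nk=0, y_init=None):
    """Simulate an ARX model as in (3.17): old outputs are old simulated values."""
    na, nb = len(a), len(b)
    start = max(na, nk + nb - 1)
    y_sim = [0.0] * len(u)
    if y_init is not None:
        # Initial simulated values must be specified in advance.
        for t, value in enumerate(y_init[:start]):
            y_sim[t] = value
    for t in range(start, len(u)):
        ar_part = -sum(a[i] * y_sim[t - 1 - i] for i in range(na))
        in_part = sum(b[j] * u[t - nk - j] for j in range(nb))
        y_sim[t] = ar_part + in_part
    return y_sim

# Example with a1 = -0.5 and b1 = 1, i.e. y_sim(t) = 0.5 y_sim(t-1) + u(t).
simulated = arx_simulate([1.0, 1.0, 1.0, 1.0], a=[-0.5], b=[1.0], y_init=[0.0])
```

Unlike the predictor, the simulated output never uses the sensor reading itself, so a fault in the sensor does not leak into the estimate.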

3.3.4 System Identification Toolbox

A widely used tool for model estimation is the System Identification Toolbox [10]. It is used with Matlab, and its GUI, Ident (Figure 3.2), makes it easy to use. Typing ‘ident’ at the Matlab command line opens the GUI. In this thesis, Ident was used to identify the most successful model types. To save time, the procedures to import data and estimate the models were automated outside of Ident.

Figure 3.2. The System Identification Toolbox GUI, Ident

3.3.5 Model Estimation with Ident

To estimate models with Ident, data must be available in the Matlab workspace. To import data to Ident, select ‘Time domain data...’ in the ‘Import data’ dropdown menu. Specify the input and output to the model and the data will be available in Ident. Also import data to validate the model, i.e. data that has not been used to estimate the model. It is often a good idea to remove the means from the data. This is done by dragging the icon for the data to the ‘Working data’-field and selecting ‘Remove means’ in the ‘Preprocess’ dropdown menu. With preprocessed data in the ‘Working data’-field, a model is estimated by selecting ‘Linear parametric models...’ in the ‘Estimate’ dropdown menu. In the window that appears, the model structure, such as ARX, OE or BJ, must be selected. Depending on the selected model structure, a number of parameters must be specified, such as the number of poles, the number of zeros and the delay. The estimated model appears, and by dragging the preprocessed data to the ‘Validation data’-field and selecting ‘Model output’, the output of the model compared with real data can be viewed along with its model fit. This value is calculated as

$$\text{Fit} = 100\left(1 - \frac{\|y - \hat{y}\|_2}{\|y - \bar{y}\|_2}\right) \tag{3.18}$$

where $y$ are measured values, $\hat{y}$ are outputs from the model and $\bar{y}$ is the mean of $y$.
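As a small worked illustration of (3.18), the sketch below computes the fit value in plain Python. The function name `model_fit` and the sample numbers are assumptions for the example only.

```python
import math

def model_fit(y, y_hat):
    """Fit value (3.18): 100 * (1 - ||y - y_hat|| / ||y - mean(y)||)."""
    y_bar = sum(y) / len(y)
    err = math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)))
    spread = math.sqrt(sum((a - y_bar) ** 2 for a in y))
    return 100.0 * (1.0 - err / spread)

perfect = model_fit([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])    # model equals the data
mean_only = model_fit([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])  # model outputs the mean of y
```

A model that reproduces the data exactly gets a fit of 100, a model that only outputs the mean of $y$ gets 0, and a model that does worse than the mean gets a negative value.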

3.4 Artificial Neural Networks

Artificial neural networks (ANNs) were originally used to explain the information processing in the human brain [2]. An ANN is a set of small processing units which communicate through weighted connections in a network. The weights assigned to the connections are initialized randomly and adjusted during a learning procedure called training in order to make the network generate the correct output for a given input. The training procedure is ‘supervised’, which means that both input data and output data must be available during training. When the ANN generates the correct outputs for the incoming inputs, the training is finished and the weights are frozen.

An ANN is built up by layers with input nodes in an input layer, hidden nodes, called neurons, in one or more hidden layers and output neurons in an output layer. No computations are made in the input layer and therefore this layer is not taken into account when determining the number of layers in a network. The computations in the neurons in the hidden layer(s) or output layer consist of a summation followed by a transfer function. Common transfer functions are sigmoid functions like tanh and the logistic sigmoid function which is defined as

$$\mathrm{logsig}(x) = \frac{1}{1 + e^{-x}}. \tag{3.19}$$

ANNs can identify and learn the relationships between the inputs and the outputs of a non-linear multi-dimensional system and have therefore become very popular for solving problems that are difficult to solve by traditional means [1]. One layer of hidden neurons in a network is sufficient to approximate any continuous function provided that the transfer functions of the hidden units are non-linear [7]. One specific type of ANN, the auto-associative ANN, where the inputs and outputs are the same, has proved to be successful for sensor validation purposes [8].

There are several different architectures for ANNs, but here only the feed-forward architecture will be discussed. This architecture only allows connections in one direction, i.e. from input to output. All connections are assigned weights and there is also a bias for each neuron in the hidden layer(s) and the output layer. Figure 3.3 illustrates a feed-forward ANN with two layers. Figure 3.4 shows one neuron with a logistic sigmoid transfer function and one neuron with a linear transfer function.

Figure 3.3. An example of a two-layered feed-forward ANN

Figure 3.4. Artificial neurons with logistic sigmoid transfer function to the left and linear transfer function to the right

The output from an arbitrary output neuron, $k$, in a two-layer feed-forward ANN is

$$y_k = F_o\left(\sum_{j=1}^{H} w^{(2)}_{kj} \cdot F_h\left(\sum_{i=1}^{M} w^{(1)}_{ji} \cdot x_i + w^{(1)}_{j0}\right) + w^{(2)}_{k0}\right) \tag{3.20}$$

where $F_h$ is the transfer function in the hidden layer and $F_o$ is the transfer function in the output layer.

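If the ANN in Figure 3.3 has the logistic sigmoid transfer function in the hidden layer and the linear transfer function in the output layer, its output is calculated as

$$\begin{aligned} y_1 ={} & w^{(2)}_{11}\,\mathrm{logsig}(w^{(1)}_{11}x_1 + w^{(1)}_{12}x_2 + w^{(1)}_{10}) \\ & + w^{(2)}_{12}\,\mathrm{logsig}(w^{(1)}_{21}x_1 + w^{(1)}_{22}x_2 + w^{(1)}_{20}) \\ & + w^{(2)}_{13}\,\mathrm{logsig}(w^{(1)}_{31}x_1 + w^{(1)}_{32}x_2 + w^{(1)}_{30}) + w^{(2)}_{10}. \end{aligned} \tag{3.21}$$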
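The forward computation in (3.20) and (3.21) can be sketched as follows for a network with a logistic sigmoid hidden layer and a linear output layer. The weight values are arbitrary numbers chosen for illustration, and the function and variable names are assumptions.

```python
import math

def logsig(x):
    """Logistic sigmoid transfer function (3.19)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w1, b1, w2, b2):
    """Two-layer feed-forward pass (3.20) with F_h = logsig and linear F_o.

    w1[j][i] is the weight from input i to hidden neuron j, b1[j] its bias;
    w2[k][j] is the weight from hidden neuron j to output k, b2[k] its bias.
    """
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + bias)
              for row, bias in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + bias
            for row, bias in zip(w2, b2)]

# Two inputs, three hidden neurons and one output, as in Figure 3.3.
w1 = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
b1 = [0.0, 0.1, -0.1]
w2 = [[1.0, -1.0, 0.5]]
b2 = [0.2]
y_out = forward([1.0, 2.0], w1, b1, w2, b2)
```

Each output is the bias plus a weighted sum of the hidden-layer activations, which is exactly the term-by-term expansion written out in (3.21).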

3.4.1 ANN Training

The weights are adjusted during a learning process called training. If the inputs used during training have corresponding outputs, the learning process is said to be supervised and an error can be calculated. This error is used to update the weights. The first learning rule that made ANNs with more than one layer useful was the backpropagation algorithm. This algorithm is described in detail in [1], but the simple variations of this method are often too slow for practical problems. Advanced methods have been developed which can converge much faster.

There are three different modes of training:

• Batch training: All available input/output-relations are presented to the network one by one and an average error is then calculated and used to update the weights. When all available input/output-relations have been presented to the network, an epoch is finished. The input/output-relations are presented to the network again and the weight updating is done epoch-by-epoch.

• Sequential training: The weights are updated after each input/output-relation, i.e. pattern-by-pattern.

• Block training: The input/output-relations can be divided into blocks and an error is calculated after each block. The weights are updated block-by-block.

The batch-training mode is considered to be the most stable but it is also associated with a slow convergence rate [7].
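The three modes differ only in how many input/output-relations are averaged before each weight update. The sketch below illustrates this on a deliberately simplified case: a single linear neuron trained by gradient descent on the squared error, not a multi-layer network with backpropagation. The function name, the learning rate and the data are assumptions made for the example.

```python
def train_linear_neuron(xs, ds, eta, block_size, epochs):
    """Train y = w*x + b by gradient descent on the squared error.

    block_size = len(xs) gives batch training, block_size = 1 gives
    sequential training and anything in between gives block training.
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for start in range(0, len(xs), block_size):
            x_blk = xs[start:start + block_size]
            d_blk = ds[start:start + block_size]
            errors = [d - (w * x + b) for x, d in zip(x_blk, d_blk)]
            # One weight update per block, using the average error gradient.
            w += eta * sum(e * x for e, x in zip(errors, x_blk)) / len(x_blk)
            b += eta * sum(errors) / len(x_blk)
    return w, b

# Noise-free targets generated by d = 2x + 1.
xs = [-1.0 + 2.0 * i / 19 for i in range(20)]
ds = [2.0 * x + 1.0 for x in xs]

batch = train_linear_neuron(xs, ds, eta=0.5, block_size=len(xs), epochs=500)
sequential = train_linear_neuron(xs, ds, eta=0.5, block_size=1, epochs=500)
block = train_linear_neuron(xs, ds, eta=0.5, block_size=5, epochs=500)
```

On this small noise-free problem all three schedules reach the same weights. The differences in stability and convergence rate mentioned above show up on noisier and larger problems.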

3.4.2 Adjustable Parameters

The number of nodes in the input and output layers is given by the stated problem, but the problem itself says nothing about the optimal size of the hidden layer. The method for finding this is trial-and-error. A large number of neurons makes the accuracy high but the training process gets slower. There is also a greater risk of overfitting, i.e. that the ANN learns the input-output relations of the training data too well and does not generalize well. Few hidden neurons make the training process fast but can result in no convergence at all. A rule of thumb is to have twice as many neurons in the hidden layer as the sum of the number of input and output nodes [9].

The error produced during training is multiplied by a tunable learning rate ($\eta$) before being propagated back through the net to tune the weights. This parameter can be set in the interval [0, 1]. A value close to 1 makes the convergence fast but unstable, while a value close to 0 makes the convergence very slow but stable.

3.4.3 Training and Simulation

Before training, the data is normalized to fit the range of the transfer function in the hidden layer. As upper and lower values for a signal, the range of that signal can be used. If the upper and lower limits for a signal, $x_i$, are denoted $x_{i,max}$ and $x_{i,min}$ respectively and the upper and lower limits for the transfer function, $F_h$, in the hidden layer are denoted $F_{h,max}$ and $F_{h,min}$ respectively, the normalized signal, $x_{i,n}$, is calculated as

$$x_{i,n} = \frac{F_{h,max} - F_{h,min}}{x_{i,max} - x_{i,min}}(x_i - x_{i,max}) + F_{h,max}. \tag{3.22}$$

Another approach for normalization is to determine the maximum and minimum values for a signal in the training data set and normalize with some margins in order to have some extrapolating capabilities. This might be preferred if the detection system will only be in operation when the plant is running and the signal levels are most likely to fall inside a specific range.

Also when simulating the trained net, the data must be normalized within the same range as when training. The resulting outputs from the net must also be denormalized according to the inverse of the normalization. This procedure is described in Figure 3.5.

[Figure 3.5: the inputs x1, ..., xM pass through the blocks Normalize, ANN and Denormalize to give the outputs y1, ..., yN]
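The normalization (3.22) and its inverse can be sketched as a pair of Python functions. The function names are assumptions, the default limits [0, 1] correspond to the range of the logistic sigmoid, and the example signal range of 0 to 700 is used purely for illustration.

```python
def normalize(x, x_min, x_max, f_min=0.0, f_max=1.0):
    """Map x from [x_min, x_max] to [f_min, f_max] as in (3.22)."""
    return (f_max - f_min) / (x_max - x_min) * (x - x_max) + f_max

def denormalize(x_n, x_min, x_max, f_min=0.0, f_max=1.0):
    """Inverse of normalize: map x_n from [f_min, f_max] back to [x_min, x_max]."""
    return (x_max - x_min) / (f_max - f_min) * (x_n - f_max) + x_max

# A signal with range [0, 700] mapped to the logistic sigmoid range [0, 1].
scaled = normalize(350.0, 0.0, 700.0)
restored = denormalize(scaled, 0.0, 700.0)
```

The same limits must be used for training and for later simulation. Otherwise the network sees inputs on a different scale than the one it was trained on.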

Chapter 4

Fault Analysis and Detection

This chapter describes the approach to sensor validation used in this thesis. A method for the case where two or more sensors measure the same property is presented first, followed by the handling of the non-redundancy case.

The general approach to sensor validation is to generate an estimate of a sensor’s value. Which method is best depends on conditions like redundancy, system behavior and system complexity. The estimate is then compared with the measured value from the sensor and the difference is called the residual. The residual can either be evaluated directly or after some post processing. In order to have fault detection, a threshold must be established so that when the residual, post processed or not, is larger than the threshold, an alarm is raised. The threshold cannot be set too low, or there will be many false alarms. If the threshold is set too high, there will be missed alarms when small errors are present.

4.1 Redundancy

On the core engine, hardware redundancy is not unusual, i.e. more than one sensor is used to measure the same property. When hardware redundancy is present, no advanced method is required to validate the sensors. The easiest way is to compare the sensors’ outputs and raise an alert when the difference is larger than a specified threshold. However, it is physically impossible to place all sensors in exactly the same place and therefore they are not really measuring the exact same property. This justifies some difference between the outputs of sensors used to measure the same property. The simplest approach is to assume a linear relationship between the outputs of these sensors.

4.1.1 Least Squares

To derive a linear relationship between two sensors, the relationship can be expressed as $y_1 = ky_2 + m$. In order to estimate the constants $k$ and $m$, it is necessary to collect fault-free data. The constants can then be derived by applying the least squares method. By expressing the relation in the form $Ax = b$, where $x$ is a column vector of the desired parameters, these parameters can be derived as follows:

$$\begin{aligned} Ax &= b \\ A^TAx &= A^Tb \\ (A^TA)^{-1}A^TAx &= (A^TA)^{-1}A^Tb \\ x &= (A^TA)^{-1}A^Tb. \end{aligned} \tag{4.1}$$

To estimate the constants in the equation $y_1 = ky_2 + m$, the least squares problem should be stated as:

$$\underbrace{\begin{pmatrix} y_{2,1} & 1 \\ y_{2,2} & 1 \\ \vdots & \vdots \\ y_{2,N} & 1 \end{pmatrix}}_{A} \underbrace{\begin{pmatrix} k \\ m \end{pmatrix}}_{x} = \underbrace{\begin{pmatrix} y_{1,1} \\ y_{1,2} \\ \vdots \\ y_{1,N} \end{pmatrix}}_{b} \tag{4.2}$$

The result is the line that best fits the data set in the least squares sense.
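For the two-parameter problem (4.2), the solution $x = (A^TA)^{-1}A^Tb$ from (4.1) has a closed form, because $A^TA$ is only a 2×2 matrix. The sketch below writes this out in plain Python. The function name `fit_line` and the sample readings are assumptions for illustration.

```python
def fit_line(y2, y1):
    """Least squares estimate of k and m in y1 = k*y2 + m via (4.1) and (4.2)."""
    n = len(y2)
    s_x = sum(y2)
    s_y = sum(y1)
    s_xx = sum(v * v for v in y2)
    s_xy = sum(a * b for a, b in zip(y2, y1))
    # A^T A = [[s_xx, s_x], [s_x, n]] and A^T b = [s_xy, s_y].
    det = n * s_xx - s_x ** 2
    k = (n * s_xy - s_x * s_y) / det
    m = (s_xx * s_y - s_x * s_xy) / det
    return k, m

# Fault-free example readings that follow y1 = 0.998*y2 + 2.2 exactly.
y2_data = [400.0, 410.0, 420.0, 430.0]
y1_data = [0.998 * v + 2.2 for v in y2_data]
k_hat, m_hat = fit_line(y2_data, y1_data)
```

With real sensor data the points scatter around the line, and the same formulas return the slope and offset that minimize the sum of squared deviations.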

4.1.2 Two Sensors

With two sensors, the method of least squares only makes it possible to state that one of the two sensors is faulty, not which one. The reason is that only one linearly independent equation can be stated. By modeling $y_1$ with $y_2$ and then $y_2$ with $y_1$, the first equation would be the inverse of the second equation and thus not linearly independent.

4.1.3 Three Sensors

With three sensors it is also possible to isolate the faulty sensor. If three relations are expressed (Figure 4.1), these can be used to form an influence structure. Let the first sensor estimate the value of the second sensor and also the third sensor. Further, let the second sensor estimate the value of the third sensor. With these three equations the influence structure in Table 4.1 can be formed, where $e_i$ represents a fault on sensor $i$ and $T_j$ is a threshold for the relationship given by $f_j$. A sensor can be considered faulty when the residual becomes larger than the threshold, e.g. when $|y_2 - f_1(y_1)| > T_1$ is satisfied. However, the residual can also depend on an error in the sensor used to estimate the second sensor. If, for example, the thresholds $T_1$ and $T_3$ are exceeded, then the row for $T_1$ shows that this is not due to a fault in sensor 3 and the row for $T_3$ shows that it is not due to a fault in sensor 1. Assuming a single fault, this means that sensor 2 has failed.
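The isolation logic can be sketched as a lookup of the alarm pattern among the columns of Table 4.1. In the Python sketch below, the coefficients of the linear relationships and the 1 % threshold on a [0, 700] °C range are the values reported in Section 5.1, while the function names, the operating point of 420 °C and the injected fault are assumptions chosen for illustration.

```python
# Columns of Table 4.1: which thresholds (T1, T2, T3) react to a fault on each sensor.
SIGNATURES = {1: (1, 1, 0), 2: (1, 0, 1), 3: (0, 1, 1)}

def f1(y1):
    return 0.9981 * y1 + 2.2461    # estimate of y2 from y1

def f2(y1):
    return 0.9999 * y1 + 0.1083    # estimate of y3 from y1

def f3(y2):
    return 1.0016 * y2 - 2.1015    # estimate of y3 from y2

def residual_alarms(y1, y2, y3, thresholds):
    """Return a 0/1 alarm for each of the three residuals."""
    residuals = (abs(y2 - f1(y1)), abs(y3 - f2(y1)), abs(y3 - f3(y2)))
    return tuple(int(r > t) for r, t in zip(residuals, thresholds))

def isolate_fault(alarms):
    """Return the sensor whose column matches the alarm pattern, else None."""
    for sensor, signature in SIGNATURES.items():
        if signature == tuple(alarms):
            return sensor
    return None

limit = 0.01 * 700.0               # 1 % of the measuring range
y1 = 420.0
y2 = f1(y1) + 8.4                  # sensor 2 with an added offset fault
y3 = f2(y1)
alarms = residual_alarms(y1, y2, y3, (limit, limit, limit))
faulty = isolate_fault(alarms)
```

An alarm pattern that matches no column, for example when only one threshold is exceeded, returns None. This corresponds to a fault that is detected but cannot be isolated.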

y1 → f1 → ŷ2,f1
y1 → f2 → ŷ3,f2
y2 → f3 → ŷ3,f3

Figure 4.1. Linear relationships in the case of three sensors

      e1   e2   e3
T1    X    X    0
T2    X    0    X
T3    0    X    X

Table 4.1. Influence structure for three sensors

4.1.4 Four Sensors

When four sensors measure the same property it is, analogously to the case of three sensors, also possible to isolate the faulty sensor. Linear relationships can be derived and by arranging inputs and outputs as described in Figure 4.2, the influence structure in Table 4.2 is obtained.

y1 → f1 → ŷ2,f1
y2 → f2 → ŷ3,f2
y3 → f3 → ŷ4,f3
y1 → f4 → ŷ4,f4

Figure 4.2. Linear relationships in the case of four sensors

      e1   e2   e3   e4
T1    X    X    0    0
T2    0    X    X    0
T3    0    0    X    X
T4    X    0    0    X

Table 4.2. Influence structure for four sensors

4.2 No Hardware Redundancy

Where hardware redundancy is not present a direct comparison is not possible. To validate these sensors, estimates of the sensors’ outputs must be made from information other than from their corresponding redundant sensors. In this thesis a few methods have been tested to generate these estimates and the performances of these methods have been compared. The foundation of all coming approaches to the non-redundant case in this thesis is a model of the gas turbine [3], see Figure 4.3, where ambient conditions (temperature, pressure and relative humidity) and torque are used as inputs and a number of outputs are estimated. The model used in this thesis uses inputs and outputs related or equivalent to the model in [3] but with a number of additional outputs. Table 4.3 states the relationships between the model in Figure 4.3 and the Siemens SGT-800 gas turbine that will be investigated in this thesis.

Figure 4.3. A graphical model of a gas turbine

Model in Fig. 4.3   SGT-800 signal description          Signal tag    Input/Output
Torque              Active load                         CFA10CE001    Input
Patm                Pressure compressor inlet           MBA10CP010    Input
RH                  Relative humidity ambient air       MBL30CM005    Input
Tatm                Temperature ambient air             MBL30CT005    Input
ṁair                Mass flow compressor inlet          MBA10CF900    Output
-                   Diff. pressure compressor inlet     MBA10CP005    Output
-                   Pressure disc 1                     MBA10CP035    Output
-                   Diff. pressure turbine exhaust      MBA10CP040    Output
P7                  Pressure turbine exhaust            MBA10CP045    Output
-                   Differential pressure air intake    MBA10CP075    Output
-                   Temperature turbine casing          MBA10CT065    Output
ṁNG                 Total gas fuel flow                 MBP05CF005    Output

Table 4.3. Inputs and outputs for sensor validation

4.3 Fault Detection with CUSUM

When a residual has been produced, the simplest way of fault detection is to compare it with a fixed threshold value and raise an alarm if the absolute value of the residual exceeds the threshold value. Such a simple method can result in sporadic alarms if the fault is close to the threshold or if the signals come with noise.

This section will describe the CUSUM (CUmulative SUM) algorithm, an often useful algorithm that can be used on any signal that is generated for detection [12]. Consider a signal $s(t)$ that has a negative value in the fault-free case and a positive value in the faulty case. The cumulative sum $g(t)$ can be generated as in (4.3),

$$g(t) = \sum_{i=1}^{t} s(i). \tag{4.3}$$

The cumulative sum $g(t)$ will decrease in the fault-free case and increase when a fault occurs. A test quantity $T(t)$ can be formed as

$$T(t) = g(t) - \min_{0 \le i < t} g(i) \tag{4.4}$$

and then compared to a positive threshold $J$, which means that an alarm will be generated when $g(t)$ exceeds its minimum by more than $J$. An alternative form of this description is

$$T_s(t) = \max(0, T_s(t-1) + s(t)), \quad T_s(0) = 0. \tag{4.5}$$

However, residuals are ideally zero in the fault-free case and either positive or negative in the faulty case. A parameter $\nu$ is introduced and the CUSUM algorithm for residuals is

$$T(t) = \max(0, T(t-1) + |r(t)| - \nu), \quad T(0) = 0 \tag{4.6}$$

where $\nu$ is a design parameter that specifies what general size the residual may have before $T(t)$ starts to increase. An alarm is generated when $T(t)$ is greater than a threshold $J$.
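The recursion (4.6) translates directly into a short loop. The Python sketch below is an illustration only: the function name `cusum_alarms`, the residual sequence and the values chosen for $\nu$ and $J$ are assumptions, not settings from the thesis.

```python
def cusum_alarms(residuals, nu, threshold):
    """CUSUM for residuals (4.6): T(t) = max(0, T(t-1) + |r(t)| - nu), T(0) = 0.

    Returns the test quantity T(t) and the alarm flag T(t) > threshold
    for every sample.
    """
    t_stat = 0.0
    stats, alarms = [], []
    for r in residuals:
        t_stat = max(0.0, t_stat + abs(r) - nu)
        stats.append(t_stat)
        alarms.append(t_stat > threshold)
    return stats, alarms

# Small residuals first, then a persistent offset of size 1.0.
residuals = [0.1, -0.2, 0.1, 1.0, 1.0, 1.0]
stats, alarms = cusum_alarms(residuals, nu=0.5, threshold=1.0)
```

Residuals smaller than $\nu$ keep the test quantity at zero, while a persistent larger residual accumulates until the threshold $J$ is passed. This is why CUSUM suppresses sporadic alarms at the cost of a short detection delay.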

Chapter 5

Results

The results are presented in this chapter. The redundancy case is presented first, followed by the linear parametric models and the ANN. Their model parameters are presented together with their performance in the fault-free case. Fault detection is investigated by adding offset errors to the signals. For the linear parametric models and the ANN both direct residual analysis and CUSUM are used.

5.1 Redundancy with Three Sensors

This section illustrates the performance of the method described in Section 4.1 when applied to three sensors measuring the temperature in the compressor outlet. Data has been collected so that temperatures in the complete operating range are included. The linear equations $f_1$, $f_2$ and $f_3$ are calculated and Figure 5.1 shows the good fit of the equations. The equations are $f_1(y_1) = 0.9981y_1 + 2.2461$, $f_2(y_1) = 0.9999y_1 + 0.1083$ and $f_3(y_2) = 1.0016y_2 - 2.1015$. The measured values and the outputs from the models can be seen in Figure 5.2. Figure 5.3 shows the differences between the model outputs and the sensor outputs with respect to the measuring range, which in this case is [0, 700] °C. No post processing of the residuals has been used and the threshold is set to 1 % error. The false alarms can be seen in Figure 5.4. Only Figure 5.1 plots data used for model estimation, while all other figures use validation data that has not been used for model estimation.

Figure 5.1. Data for estimating linear equations and the resulting linear equations (three panels titled "Data points and approximated linear functions": y2 against y1, y3 against y1 and y3 against y2)

To illustrate the fault detection for this method, an error of 1.2 % of the measuring range, i.e. 8.4 °C, has been added to $y_2$. The detection limit is 1 %, as above. The error added to sensor 2 affects the comparison between $y_2$ and $\hat{y}_{2,f_1}$. The error on sensor 2 also affects the estimate of $y_3$ from model $f_3$. The estimate of $y_3$ from model $f_2$ is still good since $y_2$ is not involved. The outputs from the models and the measured values can be seen in Figure 5.5. Figure 5.6 shows the difference between the sensor readings and the outputs from the models with respect to the measuring range. Finally, Figure 5.7 shows the alarms.

The threshold in this case is set to 1 %, which means that differences larger than 1 % will be considered too large. In order to point out sensor 2 as faulty, both the difference between $y_2$ and $\hat{y}_{2,f_1}$ and the difference between $y_3$ and $\hat{y}_{3,f_3}$ must be larger than the threshold according to Table 4.1. The bias of 1.2 % added to the values from sensor 2 is rather close to the threshold of 1 %, which results in partial detection as can be seen in Figure 5.7. A larger bias or a lower threshold would have made the error detection more consistent. A low threshold increases the risk of false error detections, i.e. an error is detected although no actual error is present. In this case, a threshold set to 0.5 % would have generated almost exactly the same number of false alarms as in Figure 5.4.

An error of 8.4 °C might seem large and not very precise from a sensor validation point of view. A closer look at the estimation data, however, reveals the inherent uncertainty in the measurements. This can be seen in Figure 5.8, where values from sensor 1 are on the x-axis and values from sensor 2 are on the y-axis. For a given value on sensor 1, the values on sensor 2 can vary by approx. 5 °C.

Figure 5.2. y2 & ŷ2,f1, y3 & ŷ3,f2 and y3 & ŷ3,f3 (plot title: "Measured (gray) and estimate (black)")

[Figure 5.3: three panels with Error [%] on the y-axis (plot title: "Difference between measured and estimate with respect to signal range.")]

Figure 5.4. Alarms with threshold set to 1 %. There are two false alarms. (first panel title: "Alarm for error on sensor 1")

[Figure 5.5: three panels, y2 & ŷ2,f1, y3 & ŷ3,f2 and y3 & ŷ3,f3 (plot title: "Measured (gray) and estimate (black)")]

Figure 5.6. Difference between measured and estimates with respect to measuring range

[Figure 5.7: three alarm panels]

[Figure 5.8: scatter plot with x-axis from 410 to 424 and y-axis from 410 to 425]

5.2 Data Properties with non-redundancy

For the non-redundancy case, data has been collected from an SGT-800 gas turbine in Germany. The sample period is 15 minutes and data sets normally span 8 days, which results in 769 values for each signal and set. The sample period was selected after first trying to determine the bandwidth with Fourier analysis. The compression of the signals made this analysis very difficult and the sample period was set to 15 minutes. All tags have a compression rate between 85 % and 99 %, i.e. at most 15 out of 100 values have been archived in the database. In order to cover as large a span of ambient conditions as possible, data sets have been collected evenly spread over a year, as can be seen in Table 5.1. The data set ‘Data3’ was found to be incomplete and had to be split into ‘Data3’ and ‘Data9’ with a total of 768 values for each signal.

Data set    Start time             End time
Data1       Jan-01 01:00:00.000    Jan-09 01:00:00.000
Data2       Feb-15 01:00:00.000    Feb-23 01:00:00.000
Data3       Apr-01 01:00:00.000    Apr-09 01:00:00.000
Data4       May-15 01:00:00.000    May-23 01:00:00.000
Data5       Jul-01 01:00:00.000    Jul-09 01:00:00.000
Data6       Jul-24 01:00:00.000    Aug-01 01:00:00.000
Data7       Oct-01 01:00:00.000    Oct-09 01:00:00.000
Data8       Nov-15 01:00:00.000    Nov-23 01:00:00.000
Testdata1   Mar-15 01:00:00.000    Mar-23 01:00:00.000
Testdata2   Oct-15 01:00:00.000    Oct-23 01:00:00.000

Table 5.1. Estimation/training data and test data

5.3 Model Parameters

This section describes what model parameters were used and how the final models were selected. All following figures with eight subplots are arranged according to the pattern explained in Table 5.2.

1: Mass flow compressor inlet 2: Diff. press. compr. inlet 3: Pressure disc 1 4: Diff. press. turb. exhaust 5: Pressure turbine exhaust 6: Diff. pressure air intake 7: Temperature turbine casing 8: Total gas fuel flow

5.3.1 Linear Parametric Models

For the linear parametric models it is interesting to study both prediction and simulation. System Identification Toolbox lets the user specify if focus should be on prediction or simulation when calculating the model parameters [10]. Therefore eight models have been established for studying the prediction approach and eight models for the simulation approach.

Table 5.3 summarizes the parameters that were used for evaluation of the models. All combinations of the parameters were tested, resulting in six model types each for ARX and OE and twelve model types for BJ.

       na         nb      nc    nd      nf         nk
ARX    1, 2, 3    1, 2    -     -       -          0
OE     -          1, 2    -     -       1, 2, 3    0
BJ     -          1, 2    1     2, 3    1, 2, 3    0

Table 5.3. The parameters used for model evaluation
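The candidate model types follow from taking every combination of the orders in Table 5.3. The Python sketch below enumerates them and builds names in the style used in this chapter (for example ARX320 written as `arx320`). The dictionary layout and the function name `candidate_names` are assumptions made for the example.

```python
from itertools import product

# Orders from Table 5.3, listed in the order used in the model names.
ORDERS = {
    "arx": {"na": [1, 2, 3], "nb": [1, 2], "nk": [0]},
    "oe": {"nb": [1, 2], "nf": [1, 2, 3], "nk": [0]},
    "bj": {"nb": [1, 2], "nc": [1], "nd": [2, 3], "nf": [1, 2, 3], "nk": [0]},
}

def candidate_names(structure):
    """Return one name per combination of orders, e.g. 'arx320' for na=3, nb=2, nk=0."""
    grid = ORDERS[structure]
    return [structure + "".join(str(v) for v in combo)
            for combo in product(*grid.values())]

counts = {name: len(candidate_names(name)) for name in ORDERS}
```

The counts come out as six ARX, six OE and twelve BJ candidates, matching the numbers stated above.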

The models with the lowest value of $V_N(\theta)$, calculated according to (3.8) when presented with unseen validation data, ‘Testdata1’, were considered best and they are presented in Table 5.4. The numbers in the model names refer to the parameters. For an ARX-model it is na, nb and nk in this order. For an OE-model it is nb, nf and nk in this order. For a BJ-model it is nb, nc, nd, nf and nk in this order.

Signal description      Best pred.-model    Best sim.-model
Mass flow comp inlet    ARX320              BJ11310
DP compr inlet          ARX320              BJ11310
Press disc 1            ARX120              BJ11210
Diffpr turb.exhaust     BJ21210             OE230
Press turb exhaust      BJ11210             ARX320
DP air intake           ARX110              ARX320
Temp turbine casing     ARX120              ARX120
Flow tot gas fuel       BJ11230             BJ11210

5.3.2 ANN

The final ANN was created using the Neural network toolbox in Matlab. The rule of thumb concerning the number of neurons in the hidden layer [9] proved not to be very successful in this application. Instead, 12 nodes, i.e. only half of what is recommended, generated the best results. For the neurons in the hidden layer, the logistic sigmoid transfer function was selected and the linear function was selected for the neurons in the output layer. Regarding the normalization, the training data was normalized to fit the extremes of the logistic sigmoid function, i.e. [0, 1], and the signal ranges were used to define the min and max of the signals. Batch-training mode with a learning rate of $\eta = 0.05$ was used for training. One ANN produces estimations of all eight sensors so, in order to compare one ANN with another, the sum of the fit values according to (3.18) was used to find the best ANN. The configuration of the resulting ANN is illustrated in Figure 5.9.

Figure 5.9. The final ANN consists of four input nodes, one hidden layer with twelve neurons and one output layer with eight neurons. The neurons in the hidden layer use the logistic sigmoid transfer function and the neurons in the output layer are linear.

5.4 Model Performance

This section deals with the precision of the models. The models are presented with fault-free data that has not been used in the process of estimating the models. All models used the same data for model estimation and in this section also the validation data is the same for all models. This has the advantage that the comparison of the models is very straightforward. In Figure 5.10, Figure 5.11 and Figure 5.12 the performance of the models can be seen when fault-free validation data is presented to the models. Both the outputs from the models and the values from the sensors are plotted in the same graph. In order to compare the ANN with the linear parametric models, equation (5.1), which corresponds to (3.8), is used in Figure 5.12.

$$V_N = \frac{1}{N} \sum_{t=1}^{N} \varepsilon^2(t) \tag{5.1}$$

By examining Figure 5.10, Figure 5.11 and Figure 5.12 it can be seen that some models perform very well while some other models perform significantly worse. The prediction method gives the best estimates in all cases. The simulation method is second best in all cases except one, signal 4, where the ANN performs better. The accuracy of the prediction method in many cases even makes it difficult to distinguish between the real signals and the estimates.

Figure 5.10. The final linear parametric models for prediction are presented with validation data. Predicted signals in black and measured signals in grey. The value of $V_N(\theta)$ is given below each plot. Plots arranged according to pattern in Table 5.2.

Values given with the plots in Figure 5.10: 1: arx320, 0.018502; 2: arx320, 0.00026184; 3: arx120, 1.7586e-006; 4: bj21210, 0.0013973; 5: bj11210, 0.0024723; 6: arx110, 3.0725e-005; 7: arx120, 0.098685; 8: bj11230, 0.00017561.
