Applying Fuzzy Techniques
Peter Lindskog and Lennart Ljung
Department of Electrical Engineering, Linköping University, S-581 83 Linköping, Sweden
www: http://www.control.isy.liu.se
email: lindskog@isy.liu.se, ljung@isy.liu.se
1997-04-10
Conference: SYSID '97
REGLERTEKNIK
AUTOMATIC CONTROL LINKÖPING
Technical reports from the Automatic Control group in Linköping are available by anonymous ftp at the address 130.236.20.24 (ftp.control.isy.liu.se). This report is contained in the compressed postscript file LiTH-ISY-R-1942.ps.Z.
BLACK BOX MODELS BY APPLYING FUZZY TECHNIQUES
P. Lindskog and L. Ljung
Dept. of Electrical Engineering, Linköping University, Sweden. E-mail: lindskog@isy.liu.se, ljung@isy.liu.se
Abstract:
We consider the situation where a nonlinear physical system is identified from input-output data. In case no specific physical structural knowledge about the system is available, parameterized grey box models cannot be used. Identification in black-box-type model structures is then the only alternative, and general approaches like neural nets, neuro-fuzzy models, etc., have to be applied.
However, certain non-structural knowledge about the system is sometimes available.
It could be known, e.g., that the step response is monotonic, or that the steady-state gain curve is monotonic. The question is then how to utilize and maintain such knowledge in a black box framework.
In this paper we show how to incorporate this type of prior information in an otherwise black box environment, by applying a specific fuzzy model structure, with strict parametric constraints. The usefulness of the approach is illustrated by experiments on real-world data.
Keywords:
Fuzzy modeling and identification, nonlinear systems, monotonicity.
1. INTRODUCTION
Don't estimate what you already know!
This is a basic principle in estimation and identification and is also a pragmatic variant of the principle of parsimony: to be parsimonious with parameters to estimate. In an identification context, the concept of grey boxes has been introduced to denote model structures that use some kind of prior information about the system. The term tailor-made model structure has also been used. This is of course in contrast to black boxes or ready-made model structures, which just use "size" as the basic structure option.
Now, there are several shades of grey. Often grey boxes employ rather specific knowledge of the system; as an extreme, it may correspond to a complete physical parameterization having some unknown parameters. These parameters are typically estimated by maximum likelihood/prediction error techniques.
The other end of this scale, the black box model structure, uses in the general nonlinear case a function series expansion like
$$\hat{y}(t|\theta) = g(\varphi(t), \theta) = \sum_{j=1}^{n} \alpha_j g_j(\varphi(t)). \qquad (1)$$
Here $\hat{y}(t|\theta) \in \mathbb{R}$ is the model's predicted value of the output $y(t)$ at time $t$, and $\varphi(t)$ are the regressors (past inputs and outputs) that are used to make the prediction. $g(\cdot)$ is a general mapping, parameterized by $\theta$, and we may think of $g_j(\cdot)$ as basis functions, building up the mapping. In turn, these $g_j(\cdot)$ could also be parameterized by $\theta$. There are many choices of expansions of this particular form. Neural nets, wavelet models, fuzzy models, nearest neighbor models, etc., all fit into this framework, see, e.g., (Sjöberg et al., 1995).
An important challenge is here to combine the richness and flexibility of (1) with prior physical knowledge that is not of precise, analytical character. We would thus like to work with boxes that are just a shade lighter than black.
Prior system knowledge of this kind is quite often
available. It could be, e.g., that the step response
is monotonic, or that the steady-state gain curve is monotonic in certain input variables, or some other qualitative property.
It is in general not easy to incorporate such information in conventional grey box parameterized model structures. In this contribution we will show how such properties can be cast into a fuzzy modeling framework. While a linguistic description could be well suited to pin down a qualitative behavior in a fuzzy rule base, it is also true that this knowledge can be annihilated by too flexible a parameterization. It is thus necessary to introduce constraints in the parameterization so that the qualitative behavior is guaranteed.
In Sec. 2 we describe one particular fuzzy model structure as one way to obtain the parameterization (1). The monotonicity of the static gain curve is then investigated in Sec. 3, where it is proved how to ensure such a property within an otherwise flexible parameterization. The ideas are finally tested on some real-world data from a heating process in Sec. 4.
2. FUZZY MODEL STRUCTURE
The basic structure of a fuzzy model (or controller) is shown in Fig. 1. At the heart of the matter is the fuzzy rule base $\mathcal{R}$, which consists of a set of linguistic production rules
$$\mathcal{R} = \{R_1, R_2, \ldots, R_n\}, \qquad (2)$$
here assumed to be of the form
$$R_j: \ \text{If } (U_1 \text{ is } A_{j1}) \text{ and } \ldots \text{ and } (U_r \text{ is } A_{jr}) \text{ then } (Y \text{ is } B_j) \qquad (3)$$
with $A_{11}, \ldots, A_{nr}$ being the linguistic values that can be assigned to the linguistic variables $U_1, \ldots, U_r$ (regressors), while $B_1, \ldots, B_n$ denote the linguistic values that can be assigned to $Y$. Each linguistic value is characterized by a fuzzy set $A \in U$, defined as the set of ordered pairs
$$A = \{(u, \mu_A(u, \beta_A, \gamma_A)) : u \in U\}, \qquad (4)$$
where $\mu_A(u, \beta_A, \gamma_A)$ is called a membership function (MF), carrying an element from $U$ into a membership value between 0 (no degree of membership) and 1 (full degree of membership). The MF can be any function returning a value in $[0, 1]$, e.g., a sigmoid, a Gaussian, a triangular, and so on. The scale and position of such an MF are specified by the parameters $\beta_A$ and $\gamma_A$, respectively.
There are many ways to mathematically interpret the remaining fuzzy constructs. However, several identification aspects (see, e.g., (Lindskog, 1996)) motivate the use of a singleton fuzzifier, a sup-star based inference mechanism with Mamdani implication and algebraic product for "and", along with a center-of-sums defuzzifier; consult (Driankov et al., 1993) for the definitions. The fuzzy rule base (2) can then be translated into the fuzzy model structure
$$\hat{y}(t|\theta) = g(\varphi(t), \theta) = \frac{\sum_{j_1=1}^{n_1} \cdots \sum_{j_r=1}^{n_r} \alpha_j w_{A_j}}{\sum_{j_1=1}^{n_1} \cdots \sum_{j_r=1}^{n_r} w_{A_j}}, \qquad (5)$$
$$w_{A_j} = \prod_{k=1}^{r} \mu_{A_{j_k,k}}(\varphi_k(t), \beta_{j_k,k}, \gamma_{j_k,k}). \qquad (6)$$
[Figure 1: block diagram with the blocks data $z(t)$, regressor generator (crisp $\varphi(t) \in \mathbb{R}^r$), fuzzifier (fuzzy sets in $u \in U^r$), fuzzy rule base (linguistic variables, linguistic connectives), fuzzy inference engine (fuzzy sets in $y \in Y$) and defuzzifier (crisp $\hat{y}(t|\theta) \in \mathbb{R}$); arrows mark computational flow and information flow.]
Fig. 1. Basic structure of a fuzzy model.
This notation needs some further explanation.
First of all, the overall parameter vector is
$$\theta^T = [\alpha^T \ \beta^T \ \gamma^T], \qquad (7)$$
which means that the MFs at the output side (associated with the fuzzy sets $B_1, \ldots, B_n$) are chosen to be fuzzy singletons or centers $\alpha$, whereas the MFs at the input side (associated with the fuzzy sets $A_{11}, \ldots, A_{nr}$) involve both scale $\beta$ and position $\gamma$ parameters. This is the same terminology as is suggested in (Sjöberg et al., 1995), where the series expansion (5) is referred to as a tensor product construction. However, contrary to many other similar approaches, it should here be noted that the parameters can be linguistically interpreted. As we shall see in the following section, this is an important property to preserve in the parameter estimation step.
Secondly, a complete fuzzy rule base consists of $n = \prod_{k=1}^{r} n_k$ different rules, where $n_k$ is the number of MFs describing the $k$th regressor. Effectively, the labeling in (5) and (6) is just a convenient relabeling of the rules and the corresponding MFs of (2), where $j = (j_1, \ldots, j_r)$ is a grid-oriented multi-index, with $j_k$ specifying one particular MF associated with the $k$th regressor. An example of such an enumeration is shown in Fig. 2.
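As a rough illustration of how (5) and (6) are evaluated over the grid-oriented multi-index, consider the following minimal Python sketch. It is not the authors' implementation: the Gaussian MF shape, the helper names `gauss_mf` and `fuzzy_predict`, and all numerical values are assumptions made only for this example.

```python
import math
from itertools import product


def gauss_mf(u, beta, gamma):
    """Gaussian MF with scale beta and position gamma (an assumed MF shape)."""
    return math.exp(-((u - gamma) / beta) ** 2)


def fuzzy_predict(phi, alpha, beta, gamma):
    """Evaluate (5): a weighted average of the centers alpha[j].

    phi         : list with the r regressor values
    alpha       : dict mapping a multi-index (j_1, ..., j_r) to a center
    beta, gamma : beta[k][j_k] and gamma[k][j_k] are the scale and position
                  of MF number j_k of regressor k
    """
    num = 0.0
    den = 0.0
    index_ranges = [range(len(g)) for g in gamma]
    for j in product(*index_ranges):   # grid-oriented multi-index j
        w = 1.0
        for k, jk in enumerate(j):     # rule weight (6): product of r MFs
            w *= gauss_mf(phi[k], beta[k][jk], gamma[k][jk])
        num += alpha[j] * w
        den += w
    return num / den


# Hypothetical example: r = 2 regressors with 2 and 3 MFs, i.e. n = 2 * 3 = 6 rules.
gamma = [[0.0, 1.0], [0.0, 0.5, 1.0]]
beta = [[0.5, 0.5], [0.3, 0.3, 0.3]]
alpha = {(j1, j2): float(j1 + j2) for j1 in range(2) for j2 in range(3)}
y_hat = fuzzy_predict([0.4, 0.7], alpha, beta, gamma)
```

Because (5) is a normalized weighted sum, the prediction always lies between the smallest and the largest center, whatever the regressor values are.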
We are now prepared to deal with the problem of how to achieve a monotone steady-state gain curve when applying model structure (5).
3. ENSURING MONOTONICITY
Many dynamic processes are known (from physics) to have a steady-state gain curve that is monotonic. Consider, e.g., a simple tank system where the inflow is the input and the liquid level the output. Here it is known that a certain constant inflow eventually leads to a "constant" liquid level. Starting from such a steady-state condition we also know that an increase in the inflow causes the liquid level to increase (in a non-oscillatory manner) and settle at a higher level. As will be illustrated in the following section, there are certain applications for which it is crucial that the used models show this kind of monotone behavior.
If we now apply flexible nonlinear model structures (neural nets, etc.), then it can be quite hard to achieve the requested monotonicity, especially if there are regression regions with few and noisy data. To remedy this, we suggest a restricted variant of the fuzzy model structure (5) that guarantees an increasing (decreasing) function mapping from the regression space $\mathbb{R}^r$ to the output space $\mathbb{R}$. This structure, together with a proper choice of regressor $\varphi(t)$, results in dynamical models having the desired monotone behavior.
A conceptually rather simple way to ensure this property is to first restrict the MFs at the input side to correspond to fuzzy partitions.
Definition 1 (fuzzy partition). Suppose that the $k$th linguistic variable can be assigned $n_k$ different values, each described by a membership function $\mu_{A_{j_k,k}}(\varphi_k(t), \beta_{j_k,k}, \gamma_{j_k,k})$ defined on $U_k$. These MFs form a fuzzy partition if it holds on the entire domain $U_k$ that
$$\sum_{j_k=1}^{n_k} \mu_{A_{j_k,k}}(\varphi_k(t), \beta_{j_k,k}, \gamma_{j_k,k}) = 1.$$
By imposing this restriction on all of the $r$ linguistic variables and additionally assuming that the rule base is complete in the sense that it covers the whole input domain $U^r$, the denominator of (5) factorizes into a product of $r$ sums that each equal 1, and it immediately follows that the model structure (5) simplifies to
$$\hat{y}(t|\theta) = \sum_{j_1=1}^{n_1} \cdots \sum_{j_r=1}^{n_r} \alpha_j w_{A_j}. \qquad (8)$$
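At this point, notice that a fuzzy partition puts certain demands on the MFs and their parameters. For example, we cannot in general use sigmoidal or Gaussian MFs because of their spreading and curvature. Piecewise linear MFs on the other hand can easily be parameterized so that a fuzzy partition is obtained. Within this category we will here restrict the discussion to the open left $\mathrm{mf}_l(u, \gamma_1, \gamma_2)$, the open right $\mathrm{mf}_r(u, \gamma_1, \gamma_2)$ and the triangular $\mathrm{mf}_{tri}(u, \gamma_1, \gamma_2, \gamma_3)$ MFs, which in order are given by
$$\max\left(\min\left(\frac{\gamma_2 - u}{\gamma_2 - \gamma_1}, 1\right), 0\right), \qquad (9)$$
$$\max\left(\min\left(\frac{u - \gamma_1}{\gamma_2 - \gamma_1}, 1\right), 0\right), \qquad (10)$$
$$\max\left(\min\left(\frac{u - \gamma_1}{\gamma_2 - \gamma_1}, \frac{\gamma_3 - u}{\gamma_3 - \gamma_2}\right), 0\right). \qquad (11)$$
Consider now the case with a single input linguistic variable ($r = 1$), so that (8) simplifies to
$$\hat{y}(t|\theta) = \sum_{j=1}^{n} \alpha_j \mu_{A_j}(\varphi(t), \beta_j, \gamma_j). \qquad (12)$$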
We are here searching for parameter values and MFs guaranteeing that the predictor is monotonically increasing in $\varphi(t)$. This turns out to be a simple task when the input MFs form a fuzzy partition.
To see this, assume that all input MFs are ordered on the universe $U$ in such a way that $\mu_{A_j}(\cdot)$ reaches a full degree of membership for a value of $\varphi(t)$ that is lower than what is the case for $\mu_{A_{j+1}}(\cdot)$. If the ordered MFs at the input side form a fuzzy partition and the corresponding centers $\alpha_j$ reflecting the output MFs are such that
$$\alpha_1 < \alpha_2 < \ldots < \alpha_n, \qquad (13)$$
then $\hat{y}(t|\theta)$ will show a monotonically increasing behavior. In verifying this, we first notice that at intervals where the $j$th input MF is fully active the corresponding output becomes $\alpha_j$. With fuzzy partitions constructed by piecewise linear MFs we also have that
$$\hat{y}(t|\theta) = \alpha_j \mu_{A_j}(\cdot) + \alpha_{j+1} \mu_{A_{j+1}}(\cdot) = (\alpha_{j+1} - \alpha_j) \mu_{A_{j+1}}(\cdot) + \alpha_j \qquad (14)$$
for all intervals $[\gamma_j, \gamma_{j+1}] \subseteq U$ such that $\mu_{A_j}(\cdot)$ and $\mu_{A_{j+1}}(\cdot)$ are not always zero; the second equality uses the partition property $\mu_{A_j}(\cdot) = 1 - \mu_{A_{j+1}}(\cdot)$ on such an interval. Since $\alpha_{j+1} > \alpha_j$ (equality gives a constant output on the current interval) and $\mu_{A_{j+1}}(\cdot)$ is an increasing function on $[\gamma_j, \gamma_{j+1}]$ it follows that also $\hat{y}(t|\theta)$ is an increasing function on that interval, with values ranging from $\alpha_j$ to $\alpha_{j+1}$. These facts give that the overall predictor is a non-decreasing function. To get a strictly increasing mapping it additionally must be required that the input MFs lack intervals with a full degree of membership.
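The one-dimensional argument can be checked numerically. The Python sketch below implements the piecewise linear MFs (9)-(11), assembles them into a fuzzy partition, and evaluates the predictor (12) with centers ordered as in (13). The helper names `partition` and `predict_1d` and all numerical values are hypothetical and serve only as an illustration.

```python
def mf_l(u, g1, g2):
    """Open left MF (9): 1 for u <= g1, falling linearly to 0 at g2."""
    return max(min((g2 - u) / (g2 - g1), 1.0), 0.0)


def mf_r(u, g1, g2):
    """Open right MF (10): 0 for u <= g1, rising linearly to 1 at g2."""
    return max(min((u - g1) / (g2 - g1), 1.0), 0.0)


def mf_tri(u, g1, g2, g3):
    """Triangular MF (11): peak at g2, zero outside [g1, g3]."""
    return max(min((u - g1) / (g2 - g1), (g3 - u) / (g3 - g2)), 0.0)


def partition(u, gam):
    """Membership values of n >= 2 ordered MFs with positions gam[0] < ... < gam[n-1]."""
    n = len(gam)
    mu = [mf_l(u, gam[0], gam[1])]
    mu += [mf_tri(u, gam[j - 1], gam[j], gam[j + 1]) for j in range(1, n - 1)]
    mu.append(mf_r(u, gam[n - 2], gam[n - 1]))
    return mu


def predict_1d(u, alpha, gam):
    """Predictor (12) when the input MFs form a fuzzy partition."""
    return sum(a * m for a, m in zip(alpha, partition(u, gam)))


# Hypothetical positions and centers; the centers satisfy the ordering (13).
gam = [1.0, 2.0, 4.0, 5.0]
alpha = [0.0, 1.0, 1.5, 4.0]
grid = [0.05 * i for i in range(121)]                  # test points in [0, 6]
sums = [sum(partition(u, gam)) for u in grid]          # should all equal 1
y = [predict_1d(u, alpha, gam) for u in grid]          # should be non-decreasing
```

On the open left and open right ends the output stays constant at the first and last center, which is why the mapping is non-decreasing rather than strictly increasing.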
The next step is to generalize this result to predictors having $r$ regressors. To do this we start by formally defining what is meant by a monotonically increasing predictor.
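Definition 2 (regressor ordering). Let $\varphi(t), \bar{\varphi}(t) \in \mathbb{R}^r$. We say that $\varphi(t) \le \bar{\varphi}(t)$ if $\varphi_k(t) \le \bar{\varphi}_k(t)$ for $k = 1, \ldots, r$.
Definition 3 (monotonically increasing predictor). Let $\varphi(t), \bar{\varphi}(t) \in \mathbb{R}^r$. We say that a predictor $g(\varphi(t), \theta)$ is monotonically increasing in the regressors if whenever $\varphi(t) \le \bar{\varphi}(t)$ it holds that $g(\varphi(t), \theta) \le g(\bar{\varphi}(t), \theta)$.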
We now have the following main theorem.
Theorem 1. Let the model structure be complete and given by (8). If, for all $k = 1, \ldots, r$, it holds that
$$\sum_{j_k=1}^{n_k} \alpha_j \mu_{A_{j_k,k}}(\varphi_k(t), \beta_{j_k,k}, \gamma_{j_k,k})$$
are increasing functions in $\varphi_k(t)$ on $U_k$ for all possible combinations of fixed values of $j_1, \ldots, j_{k-1}, j_{k+1}, \ldots, j_r$, then the predictor (8) is monotonically increasing in the regressors $\varphi(t)$.
Proof. Let $\varphi_l(t)$ denote any regressor and fix all other regressors $\varphi_k(t) = \varphi_k \in U_k$, $k = 1, \ldots, l-1, l+1, \ldots, r$. Rearranging the terms of the predictor (8) gives
$$\hat{y}(t|\theta) = \sum_{j_1=1}^{n_1} \cdots \sum_{j_{l-1}=1}^{n_{l-1}} \sum_{j_{l+1}=1}^{n_{l+1}} \cdots \sum_{j_r=1}^{n_r} \left( \prod_{k=1,\, k \ne l}^{r} \mu_{A_{j_k,k}}(\varphi_k) \right) \sum_{j_l=1}^{n_l} \alpha_j \mu_{A_{j_l,l}}(\varphi_l(t)),$$
where for simplicity the $\beta_{j_k,k}$ and the $\gamma_{j_k,k}$ parameters have been dropped. The first part of the expression (including the product) returns weights formed by taking the product of $r-1$ MFs, i.e., all the weights lie in $[0, 1]$. By the assumption, the last sum returns functions that are increasing in $\varphi_l(t)$ on $U_l$, which means that the predictor is a weighted (nonnegative weights) sum of increasing functions. This gives that the overall predictor is monotonically increasing in the regressors $\varphi(t)$. $\Box$
[Figure 2: left, fuzzy rule base with 16 rules, with input MFs $A_{11}, \ldots, A_{41}$ on $\varphi_1(t) \in U_1$, input MFs $A_{12}, \ldots, A_{42}$ on $\varphi_2(t) \in U_2$ and centers $\alpha_{11}, \ldots, \alpha_{44}$; right, plot of $\hat{y}(t|\theta)$ over $\varphi_1(t) \in U_1$ and $\varphi_2(t) \in U_2$.]
Fig. 2. Graphical representation of a complete fuzzy rule base containing 16 rules (left). Both linguistic variables at the input side have MFs forming fuzzy partitions. Ordering the centers as $\alpha_{j_1,j_2} \le \alpha_{j_1,j_2+1}$ for $j_1 = 1, \ldots, 4$, $j_2 = 1, \ldots, 3$ and as $\alpha_{j_1,j_2} \le \alpha_{j_1+1,j_2}$ for $j_1 = 1, \ldots, 3$, $j_2 = 1, \ldots, 4$ gives an increasing mapping as is shown in the right plot.
The main point with Theorem 1 is that it is sufficient to work with one-dimensional functions.
A simple way to ensure increasing functions in all $\varphi_k(t)$ is now to restrict the input MFs to fuzzy partitions and order the corresponding centers as was done in the one-dimensional case.
Lemma 1. Let the model structure be (8) and let $\varphi_k(t)$ denote one of its regressors. Assume that the ordered (on $U_k$) MFs associated with $\varphi_k(t)$ are piecewise linear and such that they form a fuzzy partition. If, for all combinations of $j_1, \ldots, j_{k-1}, j_{k+1}, \ldots, j_r$, it holds that
$$\alpha_{j_1, \ldots, j_k, \ldots, j_r} \le \alpha_{j_1, \ldots, j_k+1, \ldots, j_r}, \quad \forall j_k = 1, \ldots, n_k - 1,$$
then every
$$\sum_{j_k=1}^{n_k} \alpha_j \mu_{A_{j_k,k}}(\varphi_k(t), \gamma_{j_k,k})$$
is a monotonically increasing function in $\varphi_k(t)$.
This lemma follows directly from the one-dimensional case discussed above. The requirements for Theorem 1 to hold are fulfilled if all MFs are chosen according to Lemma 1. This is the case for the rule base in Fig. 2, from which it is clear that the resulting predictor returns a larger (or unchanged) output if one or more of the regressors become larger. Moreover, if the orders among the parameters $\alpha$ and $\gamma$ are maintained in the estimation step, then this is a fact that cannot be altered by the estimation procedure. Using a squared prediction error optimization criterion, the resulting constrained minimization problem can be solved, e.g., by a barrier function method; see (Fletcher, 1987) for algorithmic details.
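To make the combination of Theorem 1 and Lemma 1 concrete, the following Python sketch builds a two-regressor predictor of the form (8) on piecewise linear fuzzy partitions and checks numerically that ordered centers give a non-decreasing mapping, while a single misplaced center destroys the property. The 4 x 4 grid of centers, the MF positions and the helper names are hypothetical; they are not the values behind Fig. 2, and the estimation step itself is not shown.

```python
def partition(u, gam):
    """Piecewise linear fuzzy partition with MF positions gam[0] < ... < gam[-1].

    Equivalent to one open left MF, triangular MFs in between and one
    open right MF; the returned membership values always sum to 1.
    """
    mu = [0.0] * len(gam)
    if u <= gam[0]:
        mu[0] = 1.0
    elif u >= gam[-1]:
        mu[-1] = 1.0
    else:
        for j in range(len(gam) - 1):
            if gam[j] <= u <= gam[j + 1]:
                s = (u - gam[j]) / (gam[j + 1] - gam[j])
                mu[j], mu[j + 1] = 1.0 - s, s
                break
    return mu


def predict(phi, alpha, gams):
    """Predictor (8) for r = 2 regressors; alpha[j1][j2] are the centers."""
    mu1 = partition(phi[0], gams[0])
    mu2 = partition(phi[1], gams[1])
    return sum(alpha[j1][j2] * mu1[j1] * mu2[j2]
               for j1 in range(len(mu1)) for j2 in range(len(mu2)))


def centers_ordered(alpha):
    """Ordering condition of Lemma 1 along both index directions."""
    n1, n2 = len(alpha), len(alpha[0])
    along_1 = all(alpha[i][j] <= alpha[i + 1][j]
                  for i in range(n1 - 1) for j in range(n2))
    along_2 = all(alpha[i][j] <= alpha[i][j + 1]
                  for i in range(n1) for j in range(n2 - 1))
    return along_1 and along_2


def monotone_on_grid(alpha, gams, pts1, pts2, tol=1e-12):
    """Check that the predictor never decreases along either regressor."""
    v = [[predict((a, b), alpha, gams) for b in pts2] for a in pts1]
    inc1 = all(v[i][j] <= v[i + 1][j] + tol
               for i in range(len(pts1) - 1) for j in range(len(pts2)))
    inc2 = all(v[i][j] <= v[i][j + 1] + tol
               for i in range(len(pts1)) for j in range(len(pts2) - 1))
    return inc1 and inc2


# Hypothetical 16-rule example.
gams = [[0.0, 1.0, 2.0, 3.0], [0.0, 10.0, 20.0, 30.0]]
alpha = [[0.0, 1.0, 3.0, 4.0],
         [1.0, 2.0, 5.0, 6.0],
         [2.0, 4.0, 6.0, 8.0],
         [5.0, 6.0, 7.0, 9.0]]
pts1 = [0.1 * i for i in range(31)]
pts2 = [1.0 * i for i in range(31)]
ok = centers_ordered(alpha) and monotone_on_grid(alpha, gams, pts1, pts2)

# Violating the ordering at a single center breaks monotonicity.
bad = [row[:] for row in alpha]
bad[1][1] = 9.0
broken = (not centers_ordered(bad)) and (not monotone_on_grid(bad, gams, pts1, pts2))
```

In an estimation setting, `centers_ordered` plays the role of the feasibility condition that a constrained optimizer would have to maintain.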
At this point, assume that the regressors include dynamics
$$\varphi(t) = [y(t-1) \ y(t-2) \ \ldots \ u(t) \ u(t-1) \ \ldots]^T, \qquad (15)$$
where, without loss of generality, only one input signal is present. A globally asymptotically stable predictor in $\varphi(t)$ implies that a constant input $\bar{u} = u(t) = u(t-1) = \ldots$ leads to a constant output $\bar{y}$ as $t \to \infty$. Plotting $\bar{y}$ for each value of $\bar{u}$ gives the steady-state gain curve.
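Lemma 2. Let $\{u^*, y^*\}$ and $\{\bar{u}, \bar{y}\}$ be two steady-state solutions to a globally asymptotically stable predictor $g(\varphi(t), \theta)$, i.e.,
$$y^* = g([y^* \ y^* \ \ldots \ u^* \ u^* \ \ldots]^T, \theta), \qquad \bar{y} = g([\bar{y} \ \bar{y} \ \ldots \ \bar{u} \ \bar{u} \ \ldots]^T, \theta).$$
If $g(\varphi(t), \theta)$ is monotonically increasing in $\varphi(t)$ and $u^* \le \bar{u}$, then $y^* \le \bar{y}$.
Proof. Suppose that $\bar{y} < y^*$. Let $u(t) = u^*$ for $t \le 0$, with the predictor at rest in $y^*$, whereupon $u(t) = \bar{u}$ for $t > 0$. This input sequence results in an output sequence $\{y(t)\}$ and $r$ regressor sequences $\{\varphi_k(t)\}$. If $\bar{y} < y^*$, then there exists a $t'$ such that $y(t') < y^*$ occurs for the first time:
$$y(t') = g([y(t'-1) \ y(t'-2) \ \ldots \ u(t') \ u(t'-1) \ \ldots]^T, \theta) = g(\varphi(t'), \theta).$$
Since this happens for the first time we have that $\varphi(t') \ge \varphi^*$, where $\varphi^* = [y^* \ y^* \ \ldots \ u^* \ u^* \ \ldots]^T$, which by the monotonicity assumption implies that $y(t') \ge y^*$. Contradiction! $\Box$
Thus, if the requirements of Lemma 2 are fulfilled, then we get a predictor with a monotonically increasing steady-state gain curve in the input.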
Also, starting from a steady-state solution and increasing the input in a stepwise fashion, it follows by simple induction that $\hat{y}(t|\theta)$ increases monotonically with $t$. This in particular means that the predictor shows a non-oscillatory step response behavior, which is a restriction but also a property that is valid for many industrial processes (e.g., thermal systems), as will be illustrated next.
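The steady-state gain curve and the step response property can be illustrated by simulation. The Python sketch below uses a hypothetical first-order predictor $y(t) = g(y(t-1), u(t-1))$ built on fuzzy partitions with ordered centers. The centers are an assumption chosen so that the map is a contraction in $y$ (slope at most 0.8), which makes fixed-point iteration converge; none of the numbers come from the paper.

```python
def partition(u, gam):
    """Piecewise linear fuzzy partition with MF positions gam[0] < ... < gam[-1]."""
    mu = [0.0] * len(gam)
    if u <= gam[0]:
        mu[0] = 1.0
    elif u >= gam[-1]:
        mu[-1] = 1.0
    else:
        for j in range(len(gam) - 1):
            if gam[j] <= u <= gam[j + 1]:
                s = (u - gam[j]) / (gam[j + 1] - gam[j])
                mu[j], mu[j + 1] = 1.0 - s, s
                break
    return mu


# Hypothetical model: output in [0, 10], input in [0, 1].
GAM_Y = [0.0, 5.0, 10.0]
GAM_U = [0.0, 0.5, 1.0]
# ALPHA[i][j]: center for output MF i and input MF j, ordered in both indices.
ALPHA = [[0.0, 2.0, 3.0],
         [2.0, 4.0, 6.0],
         [4.0, 7.0, 10.0]]


def g(y_prev, u_prev):
    """One-step predictor of the form (8) with regressors y(t-1) and u(t-1)."""
    mu_y = partition(y_prev, GAM_Y)
    mu_u = partition(u_prev, GAM_U)
    return sum(ALPHA[i][j] * mu_y[i] * mu_u[j]
               for i in range(3) for j in range(3))


def steady_state(u_bar, y0=0.0, n_iter=500):
    """Iterate the predictor under a constant input until it has settled."""
    y = y0
    for _ in range(n_iter):
        y = g(y, u_bar)
    return y


# Steady-state gain curve: one settled output per constant input level.
u_grid = [0.05 * i for i in range(21)]
gain_curve = [steady_state(u) for u in u_grid]

# Step response: start at rest for u = 0.2, then apply u = 0.8.
y = steady_state(0.2)
step = [y]
for _ in range(50):
    y = g(y, 0.8)
    step.append(y)
```

Both `gain_curve` and `step` come out non-decreasing, in line with Lemma 2 and the induction argument above.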
4. EXAMPLE: WATER HEATING SYSTEM
This section considers identification of a water heating system, depicted in Fig. 3. The process has earlier been investigated by (Koivisto, 1995).
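[Figure 3: schematic of the tank with inflow $Q_{in}(t)$, outflow $Q_{out}(t)$, thyristor voltage $u(t)$, and outlet temperature $T(t)$ measured by a Pt-100 transducer.]
Fig. 3. The water heating process.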
Water flows across an uninsulated 0.4 liter tank. On its way, the water is heated by a resistor element, which is controlled by the voltage $u(t)$ applied to a thyristor. At the outlet, the water temperature $T(t)$ is measured with a Pt-100 transducer. As in (Koivisto, 1995), we will here restrict the discussion to a situation where $Q_{in}(t)$ as well as the inlet water temperature is constant. The modeling problem is then to describe the outlet water temperature $T(t)$ given the voltage $u(t)$. The data to be used originate from a real time identification run (performed by Koivisto), where the process was driven by a pseudo-random type of input signal $u(t)$ (given in percent of the maximum allowed voltage). The experiment lasted 9000 secs. and measurements were recorded every 3rd sec. The obtained data set was then divided into an estimation set of 2000 samples and a validation set of 1000 samples.
Before performing any identification experiments we next list some important properties of the heating system.
1. Step response tests show that the time delay from a change in the input until its effect can be seen in the output is 12 to 15 secs. Since the sampling interval is 3 secs., useful regressors stemming from the input are $u(t-4)$, $u(t-5)$, and so on.
2. The thyristor has a saturating characteristic.
3. The temperature $T(t)$ will increase if more power (a larger $u(t)$) is applied to the heater element. The steady-state gain curve of a physically sound model should thus be monotonically increasing in $u(t) = \bar{u}$.
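The sample counts and input lags quoted for this experiment follow directly from the 3 sec. sampling interval. A short Python check of that arithmetic (variable names are only illustrative):

```python
duration_s = 9000          # length of the experiment in seconds
sampling_interval_s = 3    # one measurement every 3rd second

n_samples = duration_s // sampling_interval_s      # total number of samples
n_estimation, n_validation = 2000, 1000            # split used in the text

# A time delay of 12 to 15 seconds expressed in sampling intervals.
delay_lags = [12 // sampling_interval_s, 15 // sampling_interval_s]
```

This gives 3000 samples in total, matching the 2000 + 1000 split, and lags of 4 and 5 samples, matching the regressors $u(t-4)$ and $u(t-5)$.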
The last item is extremely important as the model is going to be used in a model predictive control (MPC) arrangement, see (Koivisto, 1995), where the aim of the control is to drive the temperature $T(t)$ to a desired set-point value. To explain this, suppose that the current steady-state point lies in an area where the steady-state gain curve is decreasing in $u(t)$. Locally this curve indicates that to decrease the temperature $T(t)$, one should actually increase the voltage $u(t)$. Such a decision is of course fatal as it is known from physics that an increase in the voltage leads to an increase in the outlet temperature. A controller based on this model will thus react in a qualitatively opposite manner to what is reasonable, and in the end cause severe stability problems.
In (Koivisto, 1995) it is stressed that the important monotonicity property can easily be violated if neural network structures are fit to the data without precaution. This problem occurs for large input signals and is due to the lack of identification data at higher temperatures.
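This difficulty gives us a reason to try the fuzzy model structure (8). Desiring a model of low complexity, correlation tests indicate that $T(t-1) = \varphi_1(t)$ (temp$(t-1) \in U_1 = [10, 50]$) and $u(t-5) = \varphi_2(t)$ (voltage$(t-5) \in U_2 = [0, 100]$) are reasonable regressor candidates. The listed properties are achieved if the MFs (vl, l, rl, m, h and vh are abbreviations for very low, low, rather low, medium, high and very high)
$$\mu_{vl}(\hat{T}(t|\theta)) = \alpha_{11}, \quad \mu_{l}(\hat{T}(t|\theta)) = \alpha_{12},$$
$$\mu_{rl}(\hat{T}(t|\theta)) = \alpha_{21}, \quad \mu_{m}(\hat{T}(t|\theta)) = \alpha_{22},$$
$$\mu_{h}(\hat{T}(t|\theta)) = \alpha_{31}, \quad \mu_{vh}(\hat{T}(t|\theta)) = \alpha_{32}$$
and
$$\mu_{A_{11}}(\varphi_1(t)) = \mathrm{mf}_l(\varphi_1(t), \gamma_{11}, \gamma_{21}),$$
$$\mu_{A_{21}}(\varphi_1(t)) = \mathrm{mf}_{tri}(\varphi_1(t), \gamma_{11}, \gamma_{21}, \gamma_{31}),$$
$$\mu_{A_{31}}(\varphi_1(t)) = \mathrm{mf}_r(\varphi_1(t), \gamma_{21}, \gamma_{31}),$$
$$\mu_{A_{12}}(\varphi_2(t)) = \mathrm{mf}_l(\varphi_2(t), \gamma_{12}, \gamma_{22}),$$
$$\mu_{A_{22}}(\varphi_2(t)) = \mathrm{mf}_r(\varphi_2(t), \gamma_{12}, \gamma_{22})$$
are used in the predictor
$$\hat{T}(t|\theta) = \sum_{j_1=1}^{3} \sum_{j_2=1}^{2} \alpha_{j_1 j_2} \prod_{k=1}^{2} \mu_{A_{j_k,k}}(\cdot), \qquad (16)$$
which contains 11 parameters
$$\theta = [\alpha_{11} \ \alpha_{12} \ \alpha_{21} \ \alpha_{22} \ \alpha_{31} \ \alpha_{32} \ \gamma_{11} \ \gamma_{21} \ \gamma_{31} \ \gamma_{12} \ \gamma_{22}]^T \qquad (17)$$
chosen so that
$$10 < \alpha_{11} < \alpha_{12} < \alpha_{21} < \alpha_{22} < \alpha_{31} < \alpha_{32} < 50,$$
$$10 < \gamma_{11} < \gamma_{21} < \gamma_{31} < 50,$$
$$0 < \gamma_{12} < \gamma_{22} < 100. \qquad (18)$$
A graphical representation of the corresponding fuzzy rule base is shown in Fig. 4, where dotted lines represent the initial positions of the MFs.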
Constrained estimation of $\theta$ subject to the constraints (18) results in a model (with MFs according to Fig. 4), whose simulation behavior is reproduced in Fig. 5.
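The structure of the predictor (16) and the ordering constraints (18) can be sketched in Python as follows. The parameter values below are placeholders chosen only to satisfy (18); they are not the estimates obtained in the paper, and the names `theta`, `satisfies_constraints` and `predict_temp` are hypothetical.

```python
def mf_l(u, g1, g2):
    """Open left MF (9)."""
    return max(min((g2 - u) / (g2 - g1), 1.0), 0.0)


def mf_r(u, g1, g2):
    """Open right MF (10)."""
    return max(min((u - g1) / (g2 - g1), 1.0), 0.0)


def mf_tri(u, g1, g2, g3):
    """Triangular MF (11)."""
    return max(min((u - g1) / (g2 - g1), (g3 - u) / (g3 - g2)), 0.0)


# Placeholder values for the 11 parameters in (17), NOT the estimated ones.
theta = {
    "alpha": [[12.0, 18.0], [24.0, 32.0], [40.0, 48.0]],  # alpha[j1][j2]
    "gamma1": [15.0, 30.0, 45.0],   # MF positions for temp(t-1)
    "gamma2": [20.0, 80.0],         # MF positions for voltage(t-5)
}


def strictly_increasing(values):
    return all(a < b for a, b in zip(values, values[1:]))


def satisfies_constraints(theta):
    """Check the three ordering chains in (18)."""
    flat_alpha = [a for row in theta["alpha"] for a in row]  # a11, a12, a21, ...
    return (strictly_increasing([10.0] + flat_alpha + [50.0])
            and strictly_increasing([10.0] + theta["gamma1"] + [50.0])
            and strictly_increasing([0.0] + theta["gamma2"] + [100.0]))


def predict_temp(temp_prev, volt_delayed, theta):
    """Predictor (16) with regressors T(t-1) and u(t-5)."""
    g11, g21, g31 = theta["gamma1"]
    g12, g22 = theta["gamma2"]
    mu1 = [mf_l(temp_prev, g11, g21),
           mf_tri(temp_prev, g11, g21, g31),
           mf_r(temp_prev, g21, g31)]
    mu2 = [mf_l(volt_delayed, g12, g22),
           mf_r(volt_delayed, g12, g22)]
    return sum(theta["alpha"][j1][j2] * mu1[j1] * mu2[j2]
               for j1 in range(3) for j2 in range(2))


feasible = satisfies_constraints(theta)
t_hat = predict_temp(30.0, 50.0, theta)
```

A constrained optimizer would need to keep `satisfies_constraints` true throughout the search, for instance by adding barrier terms that grow as two neighboring parameters approach each other.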
[Figure 4: left, input MFs for temp$(t-1)$ on $U_1$ [°C] (positions $\gamma_{11}, \gamma_{21}, \gamma_{31}$, axis 10 to 50) and for voltage$(t-5)$ on $U_2$ [%] (positions $\gamma_{12}, \gamma_{22}$, axis 0 to 100); right, output singletons vl, l, rl, m, h, vh (centers $\alpha_{11}, \ldots, \alpha_{32}$) for $\hat{T}(t|\theta) \in Y$ [°C], plotted over $T(t-1) \in U_1$ [°C] and $u(t-5) \in U_2$ [%].]
Fig. 4. Input (left) and output (right) MFs used to describe the temperature of the heating system. Dotted and solid curves show the situation before and after estimation, respectively.
[Figure 5: measured outputs $T(t)$ and simulated outputs $\hat{T}(t|\hat{\theta})$, temperature [°C] (10 to 50) versus time [s] (6000 to 9000), together with the norm of the prediction errors. RMS error: 1.02. Max error: 4.23.]
Fig. 5. Simulation of the fuzzy heating model.
Compared to the best linear model found (an ARX model with four parameters), the root mean square (RMS) error decreases from 2.08 to 1.02, i.e., it is roughly halved. The improvement is significant at low and high temperatures, mainly because the linear model cannot capture the saturating characteristic of the thyristor. The built-in increasing nature of the steady-state gain curve of this model is shown in Fig. 6.
From these experiments we conclude that the derived fuzzy model is able to accurately describe the water heating system, at the same time as the important monotonicity property is ensured. The obtained RMS error 1.02 should be compared to 0.92, which is obtained in (Koivisto, 1995) using a neural net having many more parameters (31 compared to 11 in the fuzzy case). This, along with the ensured monotone behavior, suggests that the fuzzy approach might be a good alternative to neural nets when applied to predictive control.
5. CONCLUSION
We have in this paper addressed the problem of "dark grey boxes": How to include, ensure and maintain a known qualitative behavior of a process in a model structure that is otherwise quite flexible. This is difficult to achieve in traditional grey box structures, unless a rather precise physical knowledge in analytic form is at hand.
To deal with the problem, we have turned to black box structures of the kind (1), using basis func-