Asymptotic Variance Expressions for Identified Black-box Models
Urban Forssell
Department of Electrical Engineering, Linköping University, S-581 83 Linköping, Sweden
WWW: http://www.control.isy.liu.se
E-mail: ufo@isy.liu.se
28 December 1998
Report no.: LiTH-ISY-R-2089 Submitted to Systems & Control Letters
Technical reports from the Automatic Control group in Linköping are available by anonymous ftp at the address ftp.control.isy.liu.se. This report is contained in the compressed postscript file 2089.ps.Z.
Asymptotic Variance Expressions for Identified Black-box Models

Urban Forssell *

Division of Automatic Control, Department of Electrical Engineering, Linköping University, S-581 83 Linköping, Sweden. E-mail: ufo@isy.liu.se.
Abstract
The asymptotic probability distribution of identified black-box transfer function models is studied. The main contribution is that we derive variance expressions for the real and imaginary parts of the identified models that are asymptotic in both the number of measurements and the model order. These expressions are considerably simpler than the corresponding ones that hold for fixed model orders, and yet they frequently approximate the true covariance well already with quite modest model orders. We illustrate the relevance of the asymptotic expressions by using them to compute uncertainty regions for the frequency response of an identified model.
Key words: Identification; Prediction error methods; Covariance; Uncertainty
1 Introduction
A general linear, discrete-time model of a time-invariant system can be written
y(t) = \sum_{k=1}^{\infty} g(k)\, u(t-k) + v(t)    (1)

Here {y(t)} is the output, {u(t)} the input, and {v(t)} an additive disturbance, whose character we will discuss later. With (1) we associate the transfer function

G(e^{i\omega}) = \sum_{k=1}^{\infty} g(k)\, e^{-ik\omega}    (2)
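As a minimal numerical sketch of (2) (my own illustration, not from the paper), the infinite series can be truncated and checked against a closed-form frequency response; the first-order system and truncation length below are illustrative assumptions:

```python
import numpy as np

def freq_response(g, omega):
    """Evaluate a truncation of (2): G(e^{iw}) = sum_{k=1}^M g(k) e^{-ikw},
    with g = [g(1), ..., g(M)] a finite impulse response."""
    k = np.arange(1, len(g) + 1)
    return np.sum(np.asarray(g) * np.exp(-1j * k * omega))

# Assumed example: g(k) = 0.5^k, so G(z) = 0.5 z^-1 / (1 - 0.5 z^-1).
g = [0.5 ** k for k in range(1, 60)]
w = 0.3
G_sum = freq_response(g, w)
G_closed = 0.5 * np.exp(-1j * w) / (1 - 0.5 * np.exp(-1j * w))
assert abs(G_sum - G_closed) < 1e-12
```

The truncation error is bounded by the tail of the impulse response, here below machine precision.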
* This paper was not presented at any IFAC meeting. Corresponding author U. Forssell. Tel. +46 13 282226. Fax +46 13 282622. E-mail: ufo@isy.liu.se.
In this paper we will discuss the statistical properties of transfer function models of the form (2) when the impulse response coefficients g(k) are determined or estimated using measured data Z^N = {y(1), u(1), \ldots, y(N), u(N)}. The identification method we will consider is the classical prediction error method, e.g., [4].
The main goal of the paper is to derive explicit expressions for the covariance matrix
P(\omega) = E \begin{bmatrix} \mathrm{Re}[\hat G_N(e^{i\omega}) - E\hat G_N(e^{i\omega})] \\ \mathrm{Im}[\hat G_N(e^{i\omega}) - E\hat G_N(e^{i\omega})] \end{bmatrix} \begin{bmatrix} \mathrm{Re}[\hat G_N(e^{i\omega}) - E\hat G_N(e^{i\omega})] \\ \mathrm{Im}[\hat G_N(e^{i\omega}) - E\hat G_N(e^{i\omega})] \end{bmatrix}^T    (3)

where \hat G_N(e^{i\omega}) denotes the identified model obtained using N measurements, that are asymptotic both in the number of observed data and in the model order. Similar results have previously been given in [3]. There, however, the focus was on expressions for
P(\omega) = E |\hat G_N(e^{i\omega}) - E\hat G_N(e^{i\omega})|^2 \; (= \mathrm{tr}\, P(\omega))    (4)

which is real-valued and hence does not bring any information about the phase of \hat G_N(e^{i\omega}) - E\hat G_N(e^{i\omega}). With an explicit expression for the covariance matrix P(\omega) defined in (3) we can construct confidence ellipsoids for \hat G_N(e^{i\omega}) in the complex plane, which is useful, e.g., for robust control design and analysis.
2 Preliminaries
Notation and Definitions

The delay operator is denoted by q^{-1}:

q^{-1} u(t) = u(t-1)    (5)

and the set of integers by Z. The Kronecker delta function \delta(\cdot) is defined as

\delta_k = \begin{cases} 1, & k = 0 \\ 0, & k \neq 0 \end{cases}    (6)
As we shall work with signals that may contain deterministic components, we will consider generalized covariances and spectra of the form
R_{yu}(k) = \bar E\, y(t) u(t-k)    (7)

\Phi_{yu}(\omega) = \sum_{k=-\infty}^{\infty} R_{yu}(k)\, e^{-ik\omega}    (8)

In (7) the symbol \bar E is defined as

\bar E f(t) = \lim_{N\to\infty} \frac{1}{N} \sum_{t=1}^{N} E f(t)    (9)
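The operator \bar E in (9) combines ensemble averaging with time averaging, so that signals with deterministic components get well-defined "covariances". A small numerical illustration (assumed example, not from the paper): for the purely deterministic signal f(t) = cos^2(\omega_0 t), the time average converges to 1/2:

```python
import numpy as np

# E-bar of f(t) = cos(w0 t)^2 should be 1/2: cos^2 = 1/2 + cos(2 w0 t)/2,
# and the oscillating term averages out over whole periods.
w0 = 2 * np.pi / 50            # period of 50 samples (illustrative)
N = 50 * 200                   # average over an integer number of periods
t = np.arange(1, N + 1)
f = np.cos(w0 * t) ** 2
Ebar = f.mean()                # time average approximating (9)
assert abs(Ebar - 0.5) < 1e-6
```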
A filter F(q),

F(q) = \sum_{k=0}^{\infty} f(k)\, q^{-k}    (10)

is said to be stable if

\sum_{k=0}^{\infty} |f(k)| < \infty    (11)

A set of filters F(q, \theta), \theta \in D_M,

F(q, \theta) = \sum_{k=0}^{\infty} f_\theta(k)\, q^{-k}    (12)

is said to be uniformly stable if

\sum_{k=0}^{\infty} \sup_{\theta \in D_M} |f_\theta(k)| < \infty    (13)

Basic Prediction Error Theory
Consider the set of models
y(t) = G(q, \theta) u(t) + H(q, \theta) e(t), \quad \theta \in D_M \subset R^d    (14)

where D_M is a compact and connected subset of R^d (d = \dim \theta), and where

G(q, \theta) = \sum_{k=1}^{\infty} g_\theta(k)\, q^{-k}    (15)

H(q, \theta) = 1 + \sum_{k=1}^{\infty} h_\theta(k)\, q^{-k}    (16)
(In (14), {e(t)} is supposed to be a sequence of independent, identically distributed random variables with zero means, variances \lambda_0, and bounded fourth order moments.) The corresponding one-step-ahead predictor is

\hat y(t|\theta) = H^{-1}(q, \theta) G(q, \theta) u(t) + (1 - H^{-1}(q, \theta)) y(t)    (17)

The prediction error is

\varepsilon(t, \theta) = y(t) - \hat y(t|\theta) = H^{-1}(q, \theta) (y(t) - G(q, \theta) u(t))    (18)

The standard least-squares prediction error estimate is found as

\hat\theta_N = \arg\min_{\theta \in D_M} V_N(\theta)    (19)

V_N(\theta) = \frac{1}{N} \sum_{t=1}^{N} \frac{1}{2} \varepsilon^2(t, \theta)    (20)

We will denote the corresponding transfer function estimates \hat G_N(q) and \hat H_N(q), respectively: \hat G_N(q) = G(q, \hat\theta_N) and \hat H_N(q) = H(q, \hat\theta_N).
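For model structures in which the predictor (17) is linear in \theta, the criterion (19)-(20) reduces to ordinary linear least squares. A minimal sketch for a first-order ARX model (all numerical values below are illustrative assumptions, not from the paper):

```python
import numpy as np

# ARX model (1 + a q^-1) y(t) = b q^-1 u(t) + e(t): the predictor is
# y_hat(t|theta) = -a y(t-1) + b u(t-1), linear in theta = [a, b],
# so (19)-(20) is solved by linear least squares.
rng = np.random.default_rng(0)
N = 5000
a0, b0 = -0.7, 1.0                       # assumed "true" parameters
u = rng.standard_normal(N)
e = 0.1 * rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = -a0 * y[t - 1] + b0 * u[t - 1] + e[t]

# Regression form: y(t) = [-y(t-1), u(t-1)] theta + e(t)
Phi = np.column_stack([-y[:-1], u[:-1]])
theta_hat, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
assert abs(theta_hat[0] - a0) < 0.05
assert abs(theta_hat[1] - b0) < 0.05
```

For structures where H depends on \theta (ARMAX, Box-Jenkins), the minimization in (19) is instead carried out numerically.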
Under weak regularity conditions we have (see, e.g., [4])

\hat\theta_N \to \theta^* \text{ with probability 1 as } N \to \infty    (21)

\theta^* = \arg\min_{\theta \in D_M} \bar V(\theta)    (22)

\bar V(\theta) = \bar E \frac{1}{2} \varepsilon^2(t, \theta)    (23)

Further,

\sqrt{N} (\hat\theta_N - \theta^*) \in \mathrm{AsN}(0, P_\theta)    (24)

P_\theta = R^{-1} Q R^{-1}    (25)

R = \bar V''(\theta^*)    (26)

Q = \lim_{N\to\infty} E\, N V_N'(\theta^*) (V_N'(\theta^*))^T    (27)

Here prime and double prime denote differentiation once and twice, respectively, with respect to \theta.

Asymptotic Variances for Identified Models

The result (24)-(27) states that asymptotically, as N tends to infinity, the estimate \sqrt{N} \hat\theta_N will have a normal distribution with mean \sqrt{N} \theta^* and
covariance matrix
P_\theta. Using the Taylor expansion

\begin{bmatrix} \mathrm{Re}[\hat G_N(e^{i\omega}) - G(e^{i\omega}, \theta^*)] \\ \mathrm{Im}[\hat G_N(e^{i\omega}) - G(e^{i\omega}, \theta^*)] \end{bmatrix} = \begin{bmatrix} \mathrm{Re}[G'(e^{i\omega}, \theta^*)] \\ \mathrm{Im}[G'(e^{i\omega}, \theta^*)] \end{bmatrix} (\hat\theta_N - \theta^*) + o(|\hat\theta_N - \theta^*|)    (28)

where G'(e^{i\omega}, \theta^*) is a 1 \times d dimensional matrix, being the derivative of G(e^{i\omega}, \theta) with respect to \theta evaluated at \theta = \theta^*, we thus have

\sqrt{N} \begin{bmatrix} \mathrm{Re}[\hat G_N(e^{i\omega}) - G(e^{i\omega}, \theta^*)] \\ \mathrm{Im}[\hat G_N(e^{i\omega}) - G(e^{i\omega}, \theta^*)] \end{bmatrix} \in \mathrm{AsN}(0, P(\omega))    (29)

with

P(\omega) = \begin{bmatrix} \mathrm{Re}[G'(e^{i\omega}, \theta^*)] \\ \mathrm{Im}[G'(e^{i\omega}, \theta^*)] \end{bmatrix} P_\theta \begin{bmatrix} \mathrm{Re}[G'(e^{i\omega}, \theta^*)] \\ \mathrm{Im}[G'(e^{i\omega}, \theta^*)] \end{bmatrix}^T    (30)

The matrix P(\omega) in (30) gives an expression for the sought covariance matrix (3). However, evaluation of (30) is complicated and leads to intractable expressions, which limits the usefulness of the result. From, e.g., [3] we know that it is possible to compute variance expressions that are asymptotic in the model order, i.e., in the dimensionality of \theta. This type of expression tends to be simpler and, hence, easier to work with and to interpret than those obtainable using the above technique. Furthermore, despite the fact that these results are asymptotic in both the number of measurements and the model order, they frequently give reasonable approximations of the variance even for fixed model orders. In the next section we will derive the corresponding expressions for (3).
3 Main Result
Introduce

T(q, \theta) = \begin{bmatrix} G(q, \theta) \\ H(q, \theta) \end{bmatrix}    (31)

and define

\chi_0(t) = \begin{bmatrix} u(t) \\ e(t) \end{bmatrix}    (32)
The spectrum of {\chi_0(t)} is

\Phi_0(\omega) = \begin{bmatrix} \Phi_u(\omega) & \Phi_{ue}(\omega) \\ \Phi_{ue}(-\omega) & \lambda_0 \end{bmatrix}    (33)

Using (31) and (32) we can rewrite the model (14) as

y(t) = T^T(q, \theta)\, \chi_0(t)    (34)

Suppose that the parameter vector \theta can be decomposed so that

\theta = \begin{bmatrix} \theta_1^T & \theta_2^T & \cdots & \theta_n^T \end{bmatrix}^T, \quad \dim \theta_k = s, \quad \dim \theta = ns    (35)

We shall call n the order of the model (34). Suppose also that

\frac{\partial}{\partial \theta_k} T(q, \theta) = q^{-k+1} \frac{\partial}{\partial \theta_1} T(q, \theta) =: q^{-k+1} Z(q, \theta)    (36)

where the matrix Z(q, \theta) in (36) is of size 2 \times s.
It should be noted that most polynomial-type model structures, like ARMAX, Box-Jenkins, etc., satisfy this shift structure. Thus (36) is a rather weak assumption. Consider, e.g., an ARX model

y(t) = \frac{B(q)}{A(q)} u(t) + \frac{1}{A(q)} e(t)    (37)

where

A(q) = 1 + a_1 q^{-1} + \cdots + a_n q^{-n}    (38)

B(q) = b_1 q^{-1} + \cdots + b_n q^{-n}    (39)

In this case we have \theta_k = [a_k \; b_k]^T and

Z(q) = q^{-1} \begin{bmatrix} -\dfrac{B(q)}{A^2(q)} & \dfrac{1}{A(q)} \\ -\dfrac{1}{A^2(q)} & 0 \end{bmatrix}    (40)
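The shift structure (36) can be checked numerically for this ARX example: on the unit circle, \partial G / \partial a_k = -e^{-ik\omega} B / A^2, which equals e^{-i(k-1)\omega} times \partial G / \partial a_1. A small sketch (polynomial coefficients and frequency are illustrative assumptions):

```python
import numpy as np

# Second-order ARX example (assumed coefficients):
# A(q) = 1 + 0.5 q^-1 - 0.2 q^-2,  B(q) = 1.0 q^-1 + 0.3 q^-2
a = np.array([0.5, -0.2])
b = np.array([1.0, 0.3])
w = 0.7
z = np.exp(-1j * w)                 # z stands for q^-1 evaluated on the unit circle
A = 1 + a[0] * z + a[1] * z ** 2
B = b[0] * z + b[1] * z ** 2

# dG/da_k = -q^-k B / A^2 for k = 1, 2
dG_da = [-z ** k * B / A ** 2 for k in (1, 2)]

# Shift structure (36): dG/da_2 = e^{-iw} * dG/da_1
assert abs(dG_da[1] - z * dG_da[0]) < 1e-12
```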
From (36) it follows that the 2 \times d dimensional matrix T'(e^{i\omega}, \theta), being the derivative of T(e^{i\omega}, \theta) with respect to \theta, can be written

T'(e^{i\omega}, \theta) = e^{i\omega} Z(e^{i\omega}, \theta) W_n(e^{i\omega})    (41)

where

W_n(e^{i\omega}) = \begin{bmatrix} e^{-i\omega} I & e^{-i2\omega} I & \cdots & e^{-in\omega} I \end{bmatrix}    (42)
with I being an s \times s identity matrix. The following lemma, which also is of independent interest, will be used in the proof of the main result, Theorem 2, below.
Lemma 1 Let W_n(e^{i\omega}) be defined by (42) and let {w(t)} be an s-dimensional process with invertible spectrum \Phi_w(\omega). Then

\lim_{n\to\infty} \frac{1}{n} W_n(e^{i\omega_1}) \left[ \frac{1}{2\pi} \int_{-\pi}^{\pi} W_n^T(e^{-i\xi})\, \Phi_w(-\xi)\, W_n(e^{i\xi})\, d\xi \right]^{-1} W_n^T(e^{-i\omega_2}) = [\Phi_w(-\omega_1)]^{-1}\, \delta_{(\omega_1 - \omega_2) \bmod 2\pi}    (43)

PROOF. From Lemma 4.3 in [7] (see also [5], Lemma 4.2) we have that

\lim_{n\to\infty} \frac{1}{n} W_n(e^{i\omega_1}) \left[ \frac{1}{2\pi} \int_{-\pi}^{\pi} W_n^T(e^{i\xi})\, \Phi_w(\xi)\, W_n(e^{-i\xi})\, d\xi \right]^{-T} W_n^T(e^{-i\omega_2}) = [\Phi_w(\omega_1)]^{-T}\, \delta_{\omega_1 - \omega_2}    (44)

(Note especially the transposes, which are due to different definitions of correlation functions and spectra than the ones used here.) The result (43) follows since \Phi_w^T(\omega) = \Phi_w(-\omega) and W_n(e^{i(\omega + 2\pi k)}) = W_n(e^{i\omega}) \; \forall k \in Z.
For the proof of Theorem 2 below we additionally need a number of technical assumptions, which we now list. Let

\theta^*(n) = \arg\min_{\theta \in D_M} \bar V(\theta)    (45)

(If the minimum is not unique, let \theta^*(n) denote any, arbitrarily chosen minimizing element.) The argument n is added to emphasize that the minimization is carried out over nth order models. Define

\hat\theta_N(n) = \arg\min_{\theta \in D_M} V_N(\theta, n)    (46)

V_N(\theta, n) = \frac{1}{2} \left[ \frac{1}{N} \sum_{t=1}^{N} \varepsilon^2(t, \theta) + \delta |\theta - \theta^*(n)|^2 \right]    (47)

where \delta is a regularization parameter, helping us to select a unique minimizing element in (46) in case \delta = 0 leads to nonunique minima.
Assume that the true system can be described by

y(t) = G_0(q) u(t) + v(t), \quad v(t) = H_0(q) e(t)    (48)

where {e(t)} is a sequence of independent, identically distributed random variables with zero means, variances \lambda_0, and bounded fourth order moments, and
where G_0(q) is stable and strictly causal and H_0(q) is monic and inversely stable. From (48) it follows that the spectrum of the additive disturbance {v(t)} is

\Phi_v(\omega) = \lambda_0 |H_0(e^{i\omega})|^2    (49)

It will be assumed that there exists a \theta_0 \in D_M such that

G(q, \theta_0) = G_0(q) \text{ and } H(q, \theta_0) = H_0(q)    (50)

Further, suppose that the predictor filters

H^{-1}(q, \theta) G(q, \theta) \text{ and } H^{-1}(q, \theta)    (51)

are uniformly stable for \theta \in D_M along with their first, second, and third order derivatives.
Let

T_n(e^{i\omega}) = T(e^{i\omega}, \theta^*(n))    (52)

\hat T_N(e^{i\omega}, n) = T(e^{i\omega}, \hat\theta_N(n))    (53)

T_0(e^{i\omega}) = T(e^{i\omega}, \theta_0)    (54)

Assume that

\lim_{n\to\infty} n^2\, \bar E [\varepsilon(t, \theta^*(n)) - e(t)]^2 = 0    (55)

which implies that T_n(e^{i\omega}) tends to T_0(e^{i\omega}) as n tends to infinity. Let Z(e^{i\omega}, \theta) defined in (36) be denoted by Z_0(e^{i\omega}) when evaluated for \theta = \theta_0. Assume that

Z_0(e^{i\omega}) Z_0^T(e^{-i\omega})    (56)

is invertible. Further assume that R_u(k) and R_{ue}(k) exist and that R_{ue}(k) = 0, k < 0. Finally, assume that

\lim_{N\to\infty} \frac{1}{\sqrt{N}} \sum_{t=1}^{N} E \left[ \frac{d}{d\theta} \varepsilon^2(t, \theta^*(n)) \right] = 0 \quad (n \text{ fixed})    (57)
We can now state the main result of the paper.
Theorem 2 Consider the estimate \hat T_N(e^{i\omega}, n) under the assumptions (36) and (45)-(57). Then

\sqrt{N} \begin{bmatrix} \mathrm{Re}[\hat T_N(e^{i\omega}, n) - T_n(e^{i\omega})] \\ \mathrm{Im}[\hat T_N(e^{i\omega}, n) - T_n(e^{i\omega})] \end{bmatrix} \in \mathrm{AsN}(0, P(\omega, n))    (58)

as N \to \infty for fixed n and \delta, where

\lim_{\delta\to 0} \lim_{n\to\infty} \frac{1}{n} P(\omega, n) = \begin{cases} \dfrac{1}{2} \Phi_v(\omega) \begin{bmatrix} \mathrm{Re}[\Phi_0^{-1}(\omega)] & \mathrm{Im}[\Phi_0^{-1}(\omega)] \\ -\mathrm{Im}[\Phi_0^{-1}(\omega)] & \mathrm{Re}[\Phi_0^{-1}(\omega)] \end{bmatrix} & \text{if } \omega \bmod \pi \neq 0 \\[2ex] \Phi_v(\omega) \begin{bmatrix} \mathrm{Re}[\Phi_0^{-1}(\omega)] & 0 \\ 0 & 0 \end{bmatrix} & \text{if } \omega \bmod \pi = 0 \end{cases}    (59)

The proof is given in Appendix A.
From (58)-(59) we conclude that, for \omega \bmod \pi \neq 0, the random variable

\sqrt{N} (\hat T_N(e^{i\omega}, n) - T_n(e^{i\omega}))    (60)

asymptotically has a complex normal distribution (e.g., [1]) with covariance matrix P(\omega, n) satisfying

\lim_{\delta\to 0} \lim_{n\to\infty} \frac{1}{n} P(\omega, n) = \Phi_v(\omega)\, \Phi_0^{-1}(-\omega)    (61)

Compare also with the results in [3]. (In connection to this the author would like to point out that the matrix on the right-hand side of equation (3.22) in [3] should be transposed.) For \omega \bmod \pi = 0 this is no longer true, since then the diagonal blocks of

\lim_{\delta\to 0} \lim_{n\to\infty} \frac{1}{n} P(\omega, n)    (62)

are not equal. That the covariance for the imaginary part is zero in this case is very natural, since the transfer function is real-valued for \omega \bmod \pi = 0.

Further, with basically only notational differences the result in Theorem 2 also holds for multivariable models. The extension of the results in [3] to the multivariable case was given in [8]. We can also prove the result for polynomial models where the different polynomials are of unequal orders, or if other criteria than the quadratic one are used. In open loop the situation simplifies and, e.g., to prove the corresponding results for the model \hat G_N(e^{i\omega}, n) in this case the conditions in Theorem 2 can be relaxed so that a fixed, arbitrary noise model H(q, \theta) = H_*(q) can be used. See [3] for all this.

Let us return to the result in Theorem 2. A more explicit expression for (59)
can be obtained if we note that

\Phi_0^{-1}(\omega) = \begin{bmatrix} 1/r_u(\omega) & -\Phi_{ue}(\omega)/(\lambda_0 r_u(\omega)) \\ -\Phi_{eu}(\omega)/(\lambda_0 r_u(\omega)) & 1/r_e(\omega) \end{bmatrix}    (63)

r_u(\omega) = \Phi_u(\omega) - |\Phi_{ue}(\omega)|^2 / \lambda_0    (64)

r_e(\omega) = \lambda_0 - |\Phi_{ue}(\omega)|^2 / \Phi_u(\omega)    (65)
Here r_u(\omega) can be interpreted as the "noise-free" part of the input spectrum, i.e., that part of the total input spectrum \Phi_u(\omega) that does not originate from {e(t)} but from some external reference or set-point signal. In open loop we have \Phi_{ue}(\omega) = 0, which, e.g., implies that r_u(\omega) = \Phi_u(\omega), and the expression for \Phi_0^{-1}(\omega) simplifies to

\Phi_0^{-1}(\omega) = \begin{bmatrix} 1/\Phi_u(\omega) & 0 \\ 0 & 1/\lambda_0 \end{bmatrix}    (66)

If we return to the general, closed-loop situation we see from (63) that

\mathrm{Re}[\Phi_0^{-1}(\omega)] = \begin{bmatrix} 1/r_u(\omega) & -\mathrm{Re}[\Phi_{ue}(\omega)/(\lambda_0 r_u(\omega))] \\ -\mathrm{Re}[\Phi_{eu}(\omega)/(\lambda_0 r_u(\omega))] & 1/r_e(\omega) \end{bmatrix}    (67)

\mathrm{Im}[\Phi_0^{-1}(\omega)] = \begin{bmatrix} 0 & -\mathrm{Im}[\Phi_{ue}(\omega)/(\lambda_0 r_u(\omega))] \\ -\mathrm{Im}[\Phi_{eu}(\omega)/(\lambda_0 r_u(\omega))] & 0 \end{bmatrix}    (68)

Using (67) and (68) we can thus easily prove the following consequence of Theorem 2, dealing with the asymptotic distribution of the estimate \hat G_N(e^{i\omega}, n).
Corollary 3 Consider the situation in Theorem 2. Then

\sqrt{N} \begin{bmatrix} \mathrm{Re}[\hat G_N(e^{i\omega}, n) - G_n(e^{i\omega})] \\ \mathrm{Im}[\hat G_N(e^{i\omega}, n) - G_n(e^{i\omega})] \end{bmatrix} \in \mathrm{AsN}(0, P(\omega, n))    (69)

as N \to \infty for fixed n and \delta, where

\lim_{\delta\to 0} \lim_{n\to\infty} \frac{1}{n} P(\omega, n) = \begin{cases} \dfrac{1}{2} \Phi_v(\omega) \begin{bmatrix} 1/r_u(\omega) & 0 \\ 0 & 1/r_u(\omega) \end{bmatrix} & \text{if } \omega \bmod \pi \neq 0 \\[2ex] \Phi_v(\omega) \begin{bmatrix} 1/r_u(\omega) & 0 \\ 0 & 0 \end{bmatrix} & \text{if } \omega \bmod \pi = 0 \end{cases}    (70)
From (69)-(70) we see that, for \omega \bmod \pi \neq 0, the random variable

\sqrt{N} (\hat G_N(e^{i\omega}, n) - G_n(e^{i\omega}))    (71)

asymptotically has a complex normal distribution with covariance matrix P(\omega, n) satisfying

\lim_{\delta\to 0} \lim_{n\to\infty} \frac{1}{n} P(\omega, n) = \frac{\Phi_v(\omega)}{r_u(\omega)}    (72)

This does not hold for \omega \bmod \pi = 0, but for all \omega we nevertheless have that

\mathrm{tr} \left[ \lim_{\delta\to 0} \lim_{n\to\infty} \frac{1}{n} P(\omega, n) \right] = \frac{\Phi_v(\omega)}{r_u(\omega)}    (73)

which ties in nicely with the results in [3], cf. (3) and (4).
An intuitive, but not formally correct, interpretation of the result (69)-(70) is that it gives the following convenient expression for the covariance matrix (3) (for the case \omega \bmod \pi \neq 0):

P(\omega) \approx \frac{n}{N} \cdot \frac{1}{2} \Phi_v(\omega) \begin{bmatrix} 1/r_u(\omega) & 0 \\ 0 & 1/r_u(\omega) \end{bmatrix} \quad \text{as } N, n \to \infty    (74)

From (74) we see that asymptotically, as both n and N tend to infinity, the real and imaginary parts of the G-estimate are uncorrelated with equal variance, proportional to the number-of-parameters-to-number-of-measurements ratio (n/N) and to the noise-to-signal ratio (\Phi_v(\omega)/r_u(\omega)).
4 Example
If we assume that the random variable

\begin{bmatrix} \mathrm{Re}[\hat G_N(e^{i\omega})] \\ \mathrm{Im}[\hat G_N(e^{i\omega})] \end{bmatrix}    (75)

has a normal distribution with covariance matrix P(\omega), we may compute an \alpha% confidence ellipsoid for \hat G_N(e^{i\omega}) at a particular frequency \omega_k, \omega_k \bmod \pi \neq 0, as the set of all G(e^{i\omega_k}) such that

\begin{bmatrix} \mathrm{Re}[G(e^{i\omega_k}) - \hat G_N(e^{i\omega_k})] \\ \mathrm{Im}[G(e^{i\omega_k}) - \hat G_N(e^{i\omega_k})] \end{bmatrix}^T P^{-1}(\omega_k) \begin{bmatrix} \mathrm{Re}[G(e^{i\omega_k}) - \hat G_N(e^{i\omega_k})] \\ \mathrm{Im}[G(e^{i\omega_k}) - \hat G_N(e^{i\omega_k})] \end{bmatrix} \leq C_\alpha
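Such an uncertainty region can be sketched numerically. The following illustration (my own construction; \hat G_N, P, and the confidence level are assumed values, with P, e.g., of the form (74)) parameterizes the boundary of the ellipsoid, using the exact chi-square quantile with 2 degrees of freedom, C_\alpha = -2 \ln(1 - \alpha):

```python
import numpy as np

alpha = 0.95
C = -2 * np.log(1 - alpha)                 # chi-square(2) quantile: ~5.99
G_hat = 1.0 - 0.4j                         # assumed estimate G_N(e^{i w_k})
P = np.array([[0.01, 0.0],                 # assumed covariance of
              [0.0, 0.01]])                # [Re G_N, Im G_N], cf. (74)

# A Cholesky factor of C*P maps the unit circle onto the ellipse boundary.
L = np.linalg.cholesky(C * P)
phi = np.linspace(0, 2 * np.pi, 200)
pts = L @ np.vstack([np.cos(phi), np.sin(phi)])
boundary = G_hat + pts[0] + 1j * pts[1]    # points G on the ellipse boundary

# Every boundary point satisfies x^T P^{-1} x = C_alpha:
x = np.vstack([boundary.real - G_hat.real, boundary.imag - G_hat.imag])
q = np.einsum('ij,jk,ik->i', x.T, np.linalg.inv(P), x.T)
assert np.allclose(q, C)
```

Plotting such ellipses at a grid of frequencies gives the uncertainty regions mentioned in the abstract.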