Subspace Identification from Closed Loop Data
Lennart Ljung and Tomas McKelvey
Department of Electrical Engineering, Linköping University, Linköping, Sweden
June 27, 1995
Technical Report LiTH-ISY-R-1752
Abstract
So called subspace methods for direct identification of linear models in state space form have drawn considerable interest recently. They have been found to work well in many cases, but have one drawback: they do not yield consistent estimates for data collected under output feedback. This contribution points to the reasons for this and also shows how to modify the basic algorithm to handle closed loop data. We stress that the basic idea is to focus on the estimation of the state-variable candidates, the k-step ahead output predictors. By recomputing these from a "non-parametric" (or, rather, high order ARX) one-step ahead predictor model, closed loop data can be handled.
1 Introduction
We consider state space models of the type

x(t+1) = Ax(t) + Bu(t) + w(t)
y(t) = Cx(t) + Du(t) + e(t)    (1)

or in innovation form

x(t+1) = Ax(t) + Bu(t) + Kε(t)
y(t) = Cx(t) + Du(t) + ε(t)    (2)
The problem is to estimate the matrices A, B, C, D and possibly K from input-output data {u(t), y(t), t = 1, 2, ..., N} without any prior knowledge
of the structure. Several techniques for this have been developed. The "classical" maximum likelihood (ML) approach would imply that the matrices are parameterized in some canonical way, and that these parameters are then estimated, e.g., [3]. This has the advantage of being statistically efficient, but the disadvantages that an iterative search (which might end up in a local minimum) has to be employed, and that the canonical parameterization might be numerically ill-conditioned.
So called subspace methods, [6, 7], [8], [9], form an interesting alternative to the ML approach. The idea behind these methods can be explained as first estimating the state vector x(t), and then finding the state space matrices by a linear least squares procedure. The state vector in innovations type parameterizations like (2) is always found as linear combinations of k-step ahead output predictors. See [1], [4] or [3], Appendix 4.A.
In the subspace methods these k-step ahead predictors are found by so called oblique projections, [7]. While this yields algorithms that are numerically very efficient, it has the drawback that closed loop data cannot be handled in this way.
The purpose of this contribution is to show how to circumvent this drawback. We do that by stressing the outlined perspective on subspace methods, which gives a direct clue how to move on to closed loop data.
2 The Basic Steps of Subspace Identification
As clearly pointed out by [5], the subspace approach focuses on the state vector: first find the state vector x(t) from data; then the state space matrices A, B, C, D and K can be determined from linear regressions. There are two possibilities:
1a Treat, for given {x(t)}, the equation

y(t) = Cx(t) + Du(t) + e(t)

as a linear regression, giving C and D by the least squares method, and leaving e(t) as the residuals.
1b Then, for given x(t), u(t) and e(t) (computed as above), estimate A, B and K from the linear regression

x(t+1) = Ax(t) + Bu(t) + Ke(t) + \hat{e}(t)

with \hat{e} as the residuals.
An alternative is
2a Treat

\begin{bmatrix} x(t+1) \\ y(t) \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} x(t) \\ u(t) \end{bmatrix} + \begin{bmatrix} w(t) \\ e(t) \end{bmatrix}

for given x, y and u as a linear regression, with residuals v(t) = [w^T(t) e^T(t)]^T.
2b The Kalman gain K can then, if desired, be computed from A, C and the covariance matrix of v(t). (Both procedures are sketched in code below.)
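As a concrete illustration, here is a minimal MATLAB sketch of both procedures, assuming the state sequence has already been found and is stored row-wise in matrices X (N-by-n), U (N-by-m) and Y (N-by-p); these variable names, and the use of dare from the Control System Toolbox in step 2b, are our own illustrative choices, not part of the algorithm text:

n = size(X,2); m = size(U,2); p = size(Y,2);
% Procedure 1a/1b: two successive regressions
CD  = [X U] \ Y;                          % 1a: y(t) = Cx(t) + Du(t) + e(t)
C   = CD(1:n,:)';  D = CD(n+1:end,:)';
E   = Y - [X U]*CD;                       % residuals = innovations estimate
ABK = [X(1:end-1,:) U(1:end-1,:) E(1:end-1,:)] \ X(2:end,:);
A   = ABK(1:n,:)'; B = ABK(n+1:n+m,:)'; K = ABK(n+m+1:end,:)';

% Procedure 2a/2b: one joint regression, then K from a Riccati equation
Th  = [X(1:end-1,:) U(1:end-1,:)] \ [X(2:end,:) Y(1:end-1,:)];
A2  = Th(1:n,1:n)';      B2 = Th(n+1:end,1:n)';
C2  = Th(1:n,n+1:end)';  D2 = Th(n+1:end,n+1:end)';
V   = [X(2:end,:) Y(1:end-1,:)] - [X(1:end-1,:) U(1:end-1,:)]*Th;
Lam = (V'*V)/size(V,1);                   % covariance of v(t) = [w(t); e(t)]
Q = Lam(1:n,1:n); S = Lam(1:n,n+1:end); R = Lam(n+1:end,n+1:end);
[P,cl,G] = dare(A2',C2',Q,R,S);           % assumes dare (Control System Toolbox)
K2 = G';                                  % stationary predictor-form Kalman gain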
So, once the state sequence x(t) has been determined, we may easily find the corresponding state space model. Therefore, let us discuss how to find these states. The literature on state space identification has very elegantly shown how the state can be estimated directly from the data by certain projections. We shall not use that language here, but describe the process (equivalently) along the classical realization path, as developed by [2], [1] and [4]. (See Appendix 4.A in [3] for an account.)
The essence is as follows: Let {e(t)} be the innovations, and let the impulse response matrix from

χ(t) = \begin{bmatrix} u(t) \\ e(t) \end{bmatrix}    (3)

to y(t) be H_k, so that

y(t) = \sum_{k=0}^{\infty} H_k χ(t-k).    (4)

H_k is thus a p × (p+m) matrix, where p is the number of outputs and m the number of inputs. Define
\hat{y}(t+j|t) = \sum_{k=j}^{\infty} H_k χ(t+j-k)    (5)
If {χ(t)} were white noise, this would be the correct j-step ahead predictor of the output, hence the notation. Denote by
Y_m(t) = \begin{bmatrix} \hat{y}(t+1|t) \\ \vdots \\ \hat{y}(t+m|t) \end{bmatrix}    (6)
the vector of stacked predictions. The basic points of the realization theory are the following:
- The system (4) has a minimal state space realization of order n if and only if

  rank{Y_m(t)} = n  for m ≥ n    (7)

  (The rank refers to the vector sequence Y_m(t), t = 1, ....)
- All "Kalman states", i.e., state vectors that can be reconstructed from past inputs and outputs, are obtained by picking a basis from {Y_m(t)}:

  x(t) = L Y_m(t)    (8)

  for some matrix L.
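To see why these points hold, here is a brief sketch in terms of the innovations form (2); this is our reconstruction of a standard argument. With future u and e set to zero, as in (5),

\hat{y}(t+j|t) = C A^{j-1} \hat{x}(t+1|t), \qquad j = 1, \ldots, m,

so that

Y_m(t) = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{m-1} \end{bmatrix} \hat{x}(t+1|t),

i.e., Y_m(t) is the extended observability matrix acting on the Kalman predictor state \hat{x}(t+1|t). For an observable n-th order realization this matrix has rank n whenever m ≥ n, which gives (7), and any n linearly independent components of Y_m(t) can then serve as a state vector, which gives (8).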
Now the identification strategy is clear:
1. Find \hat{y}(t+j|t) as defined by (5).

2. Form Y_m(t), t = 1, ..., N as in (6) and estimate its rank n.

3. Select L in (8) to obtain a well conditioned basis, and form x(t) = L Y_m(t), t = 1, ..., N. (A sketch of steps 2-3 follows after this list.)

4. Find the matrices A, B, C, D and K in (2) by either of the procedures 1 or 2 described in the beginning of this section.
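Steps 2-3 are typically carried out with a singular value decomposition, as in the code of Section 4. A minimal sketch, assuming the predictions have been collected row-wise in a matrix Y1, with row t containing Y_m(t)^T (the names Y1, sv and the threshold are our own):

[Usv,Sv,V] = svd(Y1,0);       % economy-size SVD of the prediction matrix
sv = diag(Sv)                 % inspect the singular values for a gap ...
n  = sum(sv > sv(1)*1e-8);    % ... or threshold them to estimate the rank n
X  = Y1*V(:,1:n);             % state sequence: x(t)' = Y_m(t)'*V(:,1:n),
                              % i.e., L = V(:,1:n)' in (8)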
This can be seen as the basic recipe for direct state space identification. Depending on details in the different steps, we arrive at various implementations of the recipe. The most crucial step is number 1. We shall discuss how to achieve it in Section 3.
3 Estimating the j-step Ahead Predictor
Let us rewrite (4) as

y(t+j) = \sum_{k=0}^{\infty} h_k^e e(t+j-k) + \sum_{k=0}^{\infty} h_k^u u(t+j-k)

By the innovations definition, there is a one-to-one correspondence between e(s) and y(s), so past e(s) can be replaced by y(s) at no loss of information. We do so for s ≤ t:

y(t+j) = \sum_{k=0}^{j-1} H_k^e e(t+j-k) + \sum_{k=0}^{\infty} \tilde{H}_k^y y(t-k) + \sum_{k=0}^{j-1} H_k^u u(t+j-k) + \sum_{k=0}^{\infty} H_{k+j}^u u(t-k)
\stackrel{\Delta}{=} H_e + H_y + H_{Fu} + H_{pu}    (9)
The predictor is then

\hat{y}(t+j|t) = H_y + H_{pu}    (10)

How do we find H_y and H_{pu} from (9)? If there is no feedback in the input, the term H_e will be orthogonal to the others. We can thus estimate \tilde{H}_k^y and H_k^u by the linear least squares method. In practice we have to replace the infinite upper limits by finite values:
y(t+j) = \sum_{k=0}^{n_a} \tilde{H}_k^y y(t-k) + \sum_{k=0}^{j-1} H_k^u u(t+j-k) + \sum_{k=0}^{n_b} H_{k+j}^u u(t-k) + ε(t+j)    (11)

Here ε denotes the residuals. It is now clear how H_y, H_{Fu} and H_{pu} are estimated by the least squares method, and how the predictor (10) is then found by throwing away the "future input" term H_{Fu}. This is what is called the oblique projection in [7]. (This step is necessary due to the typical correlation between H_{Fu} and H_{pu}; ignoring H_{Fu} in the regression would give a bias in H_{pu}.) A minimal code sketch of (11) follows below.
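To make this concrete, here is a minimal SISO sketch of the regression (11) for a fixed horizon j, assuming no feedback; y and u are column vectors of data and na, nb, j are given (all names are our own, not from the paper's code):

t0 = max(na, nb) + 1;              % first t with a complete regressor
T  = (t0 : length(y)-j)';          % regression rows, indexed by t
Py = []; Fu = []; Pu = [];
for k = 0:na,  Py = [Py y(T-k)];   end   % past outputs  y(t-k)
for k = 0:j-1, Fu = [Fu u(T+j-k)]; end   % future inputs u(t+j-k)
for k = 0:nb,  Pu = [Pu u(T-k)];   end   % past inputs   u(t-k)
th   = [Py Fu Pu] \ y(T+j);        % estimate all three blocks jointly
yhat = [Py zeros(size(Fu)) Pu]*th; % predictor (10): discard the H_Fu part

Note that Fu is dropped only from the prediction, not from the regression itself; dropping it from the regression would bias the H_pu estimate, which is the point made above.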
The procedure just described is contained, in one form or another, in all the subspace identification variants.
It is now also clear why closed loop data will cause problems. If the input is generated by output feedback, H_{Fu} and H_e will be correlated, and the estimate of H_{Fu} in (11) will then be biased.
Are there other alternatives for estimating \hat{y}(t+j|t) that do not have this problem with closed loop data? Well, if there is a delay in the system, H_0^u = 0 (as is typically the case for systems under feedback control), then (11) will work without problems for j = 1. The one step ahead predictor is then easily formed.
Now, there is a simple recursive way to compute further step ahead predictors such as (5) from the one step ahead predictor. The recursive step is as follows:

Given that we have determined \hat{y}(t+i|t), i = 1, ..., j-1, replace in the expression for \hat{y}(t+j|t+j-1) all occurrences of y(t+k), k = 1, ..., j-1, by \hat{y}(t+k|t), and all occurrences of u(t+k), k = 1, ..., j-1, by zero.

This process gives a way to determine all the predictors that does not suffer from problems with feedback. It is functionally equivalent to estimating an (na, nb)-order ARX model from the data, computing this model's j-step ahead prediction, and replacing "future" u(t) in this prediction by zero. (A sketch of this recursion is given below; it is also the core of the code in Section 4.)
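A minimal SISO sketch of this recursion, assuming a one-step ARX predictor with polynomials a = [1 a_1 ... a_na] and b = [0 b_1 ... b_nb] (unit delay) has already been estimated; the function name and variables are our own:

function Yhat = kstep(y, u, a, b, maxo)
% j-step ahead predictions, j = 1:maxo, with future inputs set to zero
y = y(:); u = u(:);
na = length(a)-1; nb = length(b)-1;
T  = (max(na,nb) : length(y)-1)';     % prediction origins t
Yhat = zeros(length(T), maxo);        % column j holds yhat(t+j|t)
for j = 1:maxo
  yp = zeros(length(T),1);
  for k = 1:na
    if k < j
      yp = yp - a(k+1)*Yhat(:,j-k);   % replace y(t+j-k) by its prediction
    else
      yp = yp - a(k+1)*y(T-(k-j));    % measured outputs, t+j-k <= t
    end
  end
  for k = j:nb                        % only u(t+j-k) with t+j-k <= t survive;
    yp = yp + b(k+1)*u(T-(k-j));      % future inputs are zero
  end
  Yhat(:,j) = yp;
end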
4 A Test Subspace Algorithm for Feedback Data
We now follow the second way of estimating the predictor, described in the previous section, and procedure 1 (steps 1a-1b) to determine the state space matrices. This gives an algorithm that can be summarized in MATLAB as follows:
function th=ssnew(z,maxo,na,nb,ny)
% z = [y u] with y being the outputs and u the inputs
% maxo: the maximal order
% na: number of delayed outputs as in text
% nb: number of delayed inputs as in text
% ny: number of outputs
if nargin < 5, ny=1; end
if nargin < 4, nb=maxo+3; end
if nargin < 3, na=maxo+3; end
[N,nz]=size(z);
y=z(:,1:ny); u=z(:,ny+1:nz); nu=nz-ny;
nk=1;
m=arx(z,[na*ones(ny,ny) nb*ones(ny,nu) nk*ones(ny,nu)]);
[a,b]=th2arx(m); [dum,nca]=size(a); [dum,ncb]=size(b);
% arx and th2arx are from the System Identification Toolbox
thy=-a(:,ny+1:nca)';
thu=b(:,nu+1:ncb)'; % nk=1
start=max([maxo,na,nb])+1;  % first output sample used as regressand
dalen=N-start+1;            % number of regression rows
FIY=zeros(dalen,na*ny); FIU=zeros(dalen,nb*nu);
for kk=1:na
  FIY(:,(kk-1)*ny+1:kk*ny)=y(start-kk:N-kk,:);
end
for kk=1:nb
  FIU(:,(kk-1)*nu+1:kk*nu)=u(start-kk:N-kk,:);
end
Y1=zeros(dalen,maxo*ny);
% Y1 collects the k-step ahead predictors, k=1:maxo
for kk=1:maxo
  Ydum=FIY*thy+FIU*thu;           % one recursion step: predict one step further
  Y1(:,(kk-1)*ny+1:kk*ny)=Ydum;
  FIY=[Ydum FIY(:,1:(na-1)*ny)];            % shift in the new prediction
  FIU=[zeros(dalen,nu) FIU(:,1:(nb-1)*nu)]; % future inputs replaced by zero
end
[U,S,V]=svd(Y1); diag(S)          % display singular values to pick the order
n=input('Enter Order '); X=Y1*V(:,1:n);
yr=y(start:N,:);
C=X\yr;                           % least squares estimate of C (D=0 since nk=1)
e1=yr-X*C;                        % residuals = innovations estimate
C=C';
Xnew=X(2:dalen,:); Xr=X(1:dalen-1,:);
abk=[Xr u(start:N-1,:) e1(1:dalen-1,:)]\Xnew;
abk=abk';                         % least squares estimates of A, B and K
A=abk(1:n,1:n); B=abk(1:n,n+1:n+nu);
K=abk(1:n,n+nu+1:n+nu+ny); D=zeros(ny,nu);
th=ms2th(modstruc(A,B,C,D,K),'d');
% This is just to put the model in the theta format of
% the System Identification Toolbox
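A hypothetical usage, assuming column vectors y and u of closed loop data and the (old-style) System Identification Toolbox on the path; the program displays the singular values and then prompts for the model order:

th = ssnew([y u], 10);   % maximal order 10; na and nb default to 13
present(th)              % inspect the resulting state space model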
[Figure 1 about here: amplitude plot, input # 1 to output # 1; amplitude versus frequency (rad/sec); curves: true system, algorithm in this paper, N4SID.]
Figure 1: Bode plots of models obtained for the test data set IDDATA5. From above: (1) the true system, (2) a fifth order model obtained by the algorithm given in this paper, (3) a fifth order model obtained by N4SID, a standard subspace method.
Some comments should be added. In the above code there is no attempt to optimize efficiency or numerics; we just want to demonstrate feasibility.
It should also be remarked that this approach lacks a considerable advantage of the N4SID algorithm [7] (and its associates). By forming the vector Y_m explicitly, we work in a provisional basis that could be ill-conditioned, while N4SID forms x(t) directly from the data.
We have performed a number of tests with the algorithm on simple data sets, and its performance is comparable to that of N4SID, except for closed loop data, where N4SID may fail. We illustrate this by applying both algorithms to the data set IDDATA5 from the System Identification Toolbox. The results are shown in Figure 1.
5 Some Additional Insights
The subspace algorithms involve a number of choices. We have pointed to the numbers na, nb and m in (11) and (6). These correspond to the split into past and future in the subspace algorithms. Our discussion has pointed to some insights about these numbers, which can be summarized as follows:
1. na, nb and m need not, and probably should not, be the same.

2. na and nb reflect the length of the tails of the true one-step ahead predictor of the system. They should be chosen to minimize the mean-square error, by balancing bias and variance in an ARX model.