Subspace Identification from Closed Loop Data
Lennart Ljung and Tomas McKelvey
Department of Electrical Engineering, Linköping University, Linköping, Sweden
June 27, 1995
Technical Report LiTH-ISY-R-1752
Abstract
So called subspace methods for direct identification of linear models in state space form have drawn considerable interest recently. They have been found to work well in many cases, but have one drawback: they do not yield consistent estimates for data collected under output feedback. This contribution points to the reasons for this and also shows how to modify the basic algorithm to handle closed loop data. We stress that the basic idea is to focus on the estimation of the state-variable candidates, the k-step ahead output predictors. By recomputing these from a "non-parametric" (or, rather, high order ARX) one-step ahead predictor model, closed loop data can be handled.
1 Introduction
We consider state space models of the type

x(t+1) = Ax(t) + Bu(t) + w(t)
y(t) = Cx(t) + Du(t) + e(t)    (1)

or in innovation form

x(t+1) = Ax(t) + Bu(t) + Kε(t)
y(t) = Cx(t) + Du(t) + ε(t)    (2)
The problem is to estimate the matrices A, B, C, D and possibly K from input-output data {u(t), y(t), t = 1, 2, ..., N} without any prior knowledge
of the structure. Several techniques for this have been developed. The "classical" maximum likelihood (ML) approach would imply that the matrices are parameterized in some canonical way, and that these parameters are then estimated, e.g., [3]. This has the advantage of being statistically efficient, but the disadvantages that an iterative search (which might end up in a local minimum) has to be employed, and that the canonical parameterization might be numerically ill-conditioned.
So called subspace methods, [6, 7], [8], [9], form an interesting alternative to the ML approach. The idea behind these methods can be explained as first estimating the state vector x(t), and then finding the state space matrices by a linear least squares procedure. The state vector in innovations type parameterizations like (2) is always found as linear combinations of k-step ahead output predictors. See [1], [4] or [3], Appendix 4.A.
In the subspace methods these k-step ahead predictors are found by so called oblique projections, [7]. While this yields algorithms that are numerically very efficient, it has the drawback that closed loop data cannot be handled in this way.
The purpose of this contribution is to show how to circumvent this drawback. We do that by stressing the outlined perspective on subspace methods, which gives a direct clue how to move on to closed loop data.
2 The Basic Steps of Subspace Identification
As clearly pointed out by [5], the subspace approach focuses on the state vector: first find the state vector x(t) from data; then the state space matrices A, B, C, D and K can be determined from linear regressions. There are two possibilities:
1a Treat, for given {x(t)}, the equation

y(t) = Cx(t) + Du(t) + e(t)

as a linear regression, giving C and D by the least squares method, and leaving e(t) as the residuals.
1b Then, for given x(t), u(t) and e(t) (computed as above), estimate A, B and K from the linear regression

x(t+1) = Ax(t) + Bu(t) + Ke(t) + \hat{e}(t)

with \hat{e} as the residuals.
An alternative is
2a Treat

\begin{bmatrix} x(t+1) \\ y(t) \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} x(t) \\ u(t) \end{bmatrix} + \begin{bmatrix} w(t) \\ e(t) \end{bmatrix}

for given x, y and u as a linear regression, with residuals v(t) = [w^T(t) e^T(t)]^T.
2b The Kalman gain K can then, if desired, be computed from A, C and the covariance matrix of v(t). (Both procedures are sketched in code below.)
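As a concrete illustration, here is a minimal MATLAB sketch of both procedures, assuming the state sequence has already been found and is stored row-wise in matrices X (N-by-n), U (N-by-m) and Y (N-by-p); these variable names, and the use of dare from the Control System Toolbox in step 2b, are our own illustrative choices, not part of the algorithm text:

n = size(X,2); m = size(U,2); p = size(Y,2);
% Procedure 1a/1b: two successive regressions
CD  = [X U] \ Y;                          % 1a: y(t) = Cx(t) + Du(t) + e(t)
C   = CD(1:n,:)';  D = CD(n+1:end,:)';
E   = Y - [X U]*CD;                       % residuals = innovations estimate
ABK = [X(1:end-1,:) U(1:end-1,:) E(1:end-1,:)] \ X(2:end,:);
A   = ABK(1:n,:)'; B = ABK(n+1:n+m,:)'; K = ABK(n+m+1:end,:)';

% Procedure 2a/2b: one joint regression, then K from a Riccati equation
Th  = [X(1:end-1,:) U(1:end-1,:)] \ [X(2:end,:) Y(1:end-1,:)];
A2  = Th(1:n,1:n)';      B2 = Th(n+1:end,1:n)';
C2  = Th(1:n,n+1:end)';  D2 = Th(n+1:end,n+1:end)';
V   = [X(2:end,:) Y(1:end-1,:)] - [X(1:end-1,:) U(1:end-1,:)]*Th;
Lam = (V'*V)/size(V,1);                   % covariance of v(t) = [w(t); e(t)]
Q = Lam(1:n,1:n); S = Lam(1:n,n+1:end); R = Lam(n+1:end,n+1:end);
[P,cl,G] = dare(A2',C2',Q,R,S);           % assumes dare (Control System Toolbox)
K2 = G';                                  % stationary predictor-form Kalman gain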
So, once the state sequence x(t) has been determined, we may easily find the corresponding state space model. Therefore, let us discuss how to find these states. The literature on state space identification has very elegantly shown how the state can be estimated directly from the data by certain projections. We shall not use that language here, but describe the process (equivalently) along the classical realization path, as developed by [2], [1] and [4]. (See Appendix 4.A in [3] for an account.)
The essence is as follows: Let {e(t)} be the innovations, and let the impulse response matrix from

χ(t) = \begin{bmatrix} u(t) \\ e(t) \end{bmatrix}    (3)

to y(t) be H_k, so that

y(t) = \sum_{k=0}^{\infty} H_k χ(t-k).    (4)

H_k is thus a p × (p+m) matrix, where p is the number of outputs and m the number of inputs. Define
\hat{y}(t+j|t) = \sum_{k=j}^{\infty} H_k χ(t+j-k)    (5)
If {χ(t)} were white noise, this would be the correct j-step ahead predictor of the output, hence the notation. Denote by
Y_m(t) = \begin{bmatrix} \hat{y}(t+1|t) \\ \vdots \\ \hat{y}(t+m|t) \end{bmatrix}    (6)
the vector of stacked predictions. The basic points of the realization theory are the following:
- The system (4) has a minimal state space realization of order n if and only if

  rank{Y_m(t)} = n  for m ≥ n    (7)

  (The rank refers to the vector sequence Y_m(t), t = 1, ....)
- All "Kalman states", i.e., state vectors that can be reconstructed from past inputs and outputs, are obtained by picking a basis from {Y_m(t)}:

  x(t) = L Y_m(t)    (8)

  for some matrix L.
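To see why these points hold, here is a brief sketch in terms of the innovations form (2); this is our reconstruction of a standard argument. With future u and e set to zero, as in (5),

\hat{y}(t+j|t) = C A^{j-1} \hat{x}(t+1|t), \qquad j = 1, \ldots, m,

so that

Y_m(t) = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{m-1} \end{bmatrix} \hat{x}(t+1|t),

i.e., Y_m(t) is the extended observability matrix acting on the Kalman predictor state \hat{x}(t+1|t). For an observable n-th order realization this matrix has rank n whenever m ≥ n, which gives (7), and any n linearly independent components of Y_m(t) can then serve as a state vector, which gives (8).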
Now the identification strategy is clear:
1. Find \hat{y}(t+j|t) as defined by (5).

2. Form Y_m(t), t = 1, ..., N as in (6) and estimate its rank n.

3. Select L in (8) to obtain a well conditioned basis, and form x(t) = L Y_m(t), t = 1, ..., N. (A sketch of steps 2-3 follows after this list.)

4. Find the matrices A, B, C, D and K in (2) by either of the procedures 1 or 2 described in the beginning of this section.
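Steps 2-3 are typically carried out with a singular value decomposition, as in the code of Section 4. A minimal sketch, assuming the predictions have been collected row-wise in a matrix Y1, with row t containing Y_m(t)^T (the names Y1, sv and the threshold are our own):

[Usv,Sv,V] = svd(Y1,0);       % economy-size SVD of the prediction matrix
sv = diag(Sv)                 % inspect the singular values for a gap ...
n  = sum(sv > sv(1)*1e-8);    % ... or threshold them to estimate the rank n
X  = Y1*V(:,1:n);             % state sequence: x(t)' = Y_m(t)'*V(:,1:n),
                              % i.e., L = V(:,1:n)' in (8)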
This can be seen as the basic recipe for direct state space identification. Depending on details in the different steps, we arrive at various implementations of the recipe. The most crucial step is number 1. We shall discuss how to achieve it in Section 3.
3 Estimating the j-step Ahead Predictor
Let us rewrite (4) as

y(t+j) = \sum_{k=0}^{\infty} h_k^e e(t+j-k) + \sum_{k=0}^{\infty} h_k^u u(t+j-k)

By the innovations definition, there is a one-to-one correspondence between e(s) and y(s), so past e(s) can be replaced by y(s) at no loss of information. We do so for s ≤ t:

y(t+j) = \sum_{k=0}^{j-1} H_k^e e(t+j-k) + \sum_{k=0}^{\infty} \tilde{H}_k^y y(t-k) + \sum_{k=0}^{j-1} H_k^u u(t+j-k) + \sum_{k=0}^{\infty} H_{k+j}^u u(t-k)
\stackrel{\Delta}{=} H_e + H_y + H_{Fu} + H_{pu}    (9)
The predictor is then

\hat{y}(t+j|t) = H_y + H_{pu}    (10)

How do we find H_y and H_{pu} from (9)? If there is no feedback in the input, the term H_e will be orthogonal to the others. We can thus estimate \tilde{H}_k^y and H_k^u by the linear least squares method. In practice we have to replace the infinite upper limits by finite values:
y(t+j) = \sum_{k=0}^{n_a} \tilde{H}_k^y y(t-k) + \sum_{k=0}^{j-1} H_k^u u(t+j-k) + \sum_{k=0}^{n_b} H_{k+j}^u u(t-k) + ε(t+j)    (11)

Here ε denotes the residuals. It is now clear how H_y, H_{Fu} and H_{pu} are estimated by the least squares method, and how the predictor (10) is then found by throwing away the "future input" term H_{Fu}. This is what is called the oblique projection in [7]. (This step is necessary due to the typical correlation between H_{Fu} and H_{pu}; ignoring H_{Fu} in the regression would give a bias in H_{pu}.) A minimal code sketch of (11) follows below.
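To make this concrete, here is a minimal SISO sketch of the regression (11) for a fixed horizon j, assuming no feedback; y and u are column vectors of data and na, nb, j are given (all names are our own, not from the paper's code):

t0 = max(na, nb) + 1;              % first t with a complete regressor
T  = (t0 : length(y)-j)';          % regression rows, indexed by t
Py = []; Fu = []; Pu = [];
for k = 0:na,  Py = [Py y(T-k)];   end   % past outputs  y(t-k)
for k = 0:j-1, Fu = [Fu u(T+j-k)]; end   % future inputs u(t+j-k)
for k = 0:nb,  Pu = [Pu u(T-k)];   end   % past inputs   u(t-k)
th   = [Py Fu Pu] \ y(T+j);        % estimate all three blocks jointly
yhat = [Py zeros(size(Fu)) Pu]*th; % predictor (10): discard the H_Fu part

Note that Fu is dropped only from the prediction, not from the regression itself; dropping it from the regression would bias the H_pu estimate, which is the point made above.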
The procedure just described is contained, in one form or another, in all the subspace identification variants.
It is now also clear why closed loop data will cause problems. If the input is generated by output feedback, H_{Fu} and H_e will be correlated, and the estimate of H_{Fu} in (11) will then be biased.
Are there other alternatives for estimating \hat{y}(t+j|t) that do not have this problem with closed loop data? Well, if there is a delay in the system, H_0^u = 0 (as is typically the case for systems under feedback control), then (11) will work without problems for j = 1. The one step ahead predictor is then easily formed.
Now, there is a simple recursive way to compute further step ahead predictors such as (5) from the one step ahead predictor. The recursive step is as follows:

Given that we have determined \hat{y}(t+i|t), i = 1, ..., j-1, replace in the expression for \hat{y}(t+j|t+j-1) all occurrences of y(t+k), k = 1, ..., j-1, by \hat{y}(t+k|t), and all occurrences of u(t+k), k = 1, ..., j-1, by zero.

This process gives a way to determine all the predictors that does not suffer from problems with feedback. It is functionally equivalent to estimating an (na, nb)-order ARX model from the data, computing this model's j-step ahead prediction, and replacing "future" u(t) in this prediction by zero. (A sketch of this recursion is given below; it is also the core of the code in Section 4.)
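A minimal SISO sketch of this recursion, assuming a one-step ARX predictor with polynomials a = [1 a_1 ... a_na] and b = [0 b_1 ... b_nb] (unit delay) has already been estimated; the function name and variables are our own:

function Yhat = kstep(y, u, a, b, maxo)
% j-step ahead predictions, j = 1:maxo, with future inputs set to zero
y = y(:); u = u(:);
na = length(a)-1; nb = length(b)-1;
T  = (max(na,nb) : length(y)-1)';     % prediction origins t
Yhat = zeros(length(T), maxo);        % column j holds yhat(t+j|t)
for j = 1:maxo
  yp = zeros(length(T),1);
  for k = 1:na
    if k < j
      yp = yp - a(k+1)*Yhat(:,j-k);   % replace y(t+j-k) by its prediction
    else
      yp = yp - a(k+1)*y(T-(k-j));    % measured outputs, t+j-k <= t
    end
  end
  for k = j:nb                        % only u(t+j-k) with t+j-k <= t survive;
    yp = yp + b(k+1)*u(T-(k-j));      % future inputs are zero
  end
  Yhat(:,j) = yp;
end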
4 A Test Subspace Algorithm for Feedback Data
We now follow the second way of estimating the predictor, described in the previous section, and procedure 1 (steps 1a-1b) to determine the state space matrices. This gives an algorithm that can be summarized in MATLAB as follows:
function th=ssnew(z,maxo,na,nb,ny)
% z = [y u] with y being the outputs and u the inputs
% maxo: the maximal order
% na: number of delayed outputs as in text
% nb: number of delayed inputs as in text
% ny: number of outputs
if nargin < 5, ny=1; end
if nargin < 4, nb=maxo+3; end
if nargin < 3, na=maxo+3; end
[N,nz]=size(z);
y=z(:,1:ny); u=z(:,ny+1:nz); nu=nz-ny;
nk=1;
m=arx(z,[na*ones(ny,ny) nb*ones(ny,nu) nk*ones(ny,nu)]);
[a,b]=th2arx(m); [dum,nca]=size(a); [dum,ncb]=size(b);
% arx and th2arx are from the System Identification Toolbox
thy=-a(:,ny+1:nca)';
thu=b(:,nu+1:ncb)'; % nk=1
start=max([maxo,na,nb])+1;  % first output sample used as regressand
dalen=N-start+1;            % number of regression rows
FIY=zeros(dalen,na*ny); FIU=zeros(dalen,nb*nu);
for kk=1:na
  FIY(:,(kk-1)*ny+1:kk*ny)=y(start-kk:N-kk,:);
end
for kk=1:nb
  FIU(:,(kk-1)*nu+1:kk*nu)=u(start-kk:N-kk,:);
end
Y1=zeros(dalen,maxo*ny);
% Y1 collects the k-step ahead predictors, k=1:maxo
for kk=1:maxo
  Ydum=FIY*thy+FIU*thu;           % one recursion step: predict one step further
  Y1(:,(kk-1)*ny+1:kk*ny)=Ydum;
  FIY=[Ydum FIY(:,1:(na-1)*ny)];            % shift in the new prediction
  FIU=[zeros(dalen,nu) FIU(:,1:(nb-1)*nu)]; % future inputs replaced by zero
end
[U,S,V]=svd(Y1); diag(S)          % display singular values to pick the order
n=input('Enter Order '); X=Y1*V(:,1:n);
yr=y(start:N,:);
C=X\yr;                           % least squares estimate of C (D=0 since nk=1)
e1=yr-X*C;                        % residuals = innovations estimate
C=C';
Xnew=X(2:dalen,:); Xr=X(1:dalen-1,:);
abk=[Xr u(start:N-1,:) e1(1:dalen-1,:)]\Xnew;
abk=abk';                         % least squares estimates of A, B and K
A=abk(1:n,1:n); B=abk(1:n,n+1:n+nu);
K=abk(1:n,n+nu+1:n+nu+ny); D=zeros(ny,nu);
th=ms2th(modstruc(A,B,C,D,K),'d');
% This is just to put the model in the theta format of
% the System Identification Toolbox
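A hypothetical usage, assuming column vectors y and u of closed loop data and the (old-style) System Identification Toolbox on the path; the program displays the singular values and then prompts for the model order:

th = ssnew([y u], 10);   % maximal order 10; na and nb default to 13
present(th)              % inspect the resulting state space model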
[Figure 1 about here: amplitude plot, input # 1 to output # 1; amplitude versus frequency (rad/sec); curves: true system, algorithm in this paper, N4SID.]
Figure 1: Bode plots of models obtained for the test data set IDDATA5. From above: (1) the true system, (2) a fifth order model obtained by the algorithm given in this paper, (3) a fifth order model obtained by N4SID, a standard subspace method.
Some comments should be added. In the above code there is no attempt to optimize efficiency or numerics; we just want to demonstrate feasibility.
It should also be remarked that this approach lacks a considerable advantage of the N4SID algorithm [7] (and its associates). By forming the vector Y_m explicitly, we work in a provisional basis that could be ill-conditioned, while N4SID forms x(t) directly from the data.
We have performed a number of tests with the algorithm on simple data sets, and its performance is comparable to that of N4SID, except for closed loop data, where N4SID may fail. We illustrate this by applying both algorithms to the data set IDDATA5 from the System Identification Toolbox. The results are shown in Figure 1.
5 Some Additional Insights
The subspace algorithms involve a number of choices. We have pointed to the numbers na, nb and m in (11) and (6). These correspond to the split into past and future in the subspace algorithms. Our discussion has pointed to some insights about these numbers, which can be summarized as follows:
1. na, nb and m need not, and probably should not, be the same.

2. na and nb reflect the length of the tails of the true one-step ahead predictor of the system. They should be chosen to minimize the mean-square error, by balancing bias and variance in an ARX model.