(1) System Identification Using Overparametrized State-Space Models. T. McKelvey, Department of Electrical Engineering, Linköping University, S-581 83 Linköping, Sweden. E-mail: tomas@isy.liu.se.

Abstract. In this report we consider identification of linear time-invariant finite-dimensional systems using state-space models. We introduce a new model structure which is fully parametrized, i.e. all matrices are filled with parameters. All multivariable systems of a given order can be described within this model structure, which relieves us from the internal structural issues otherwise inherent in the multivariable state-space identification problem. The models are obtained with an identification algorithm by minimizing a regularized prediction error criterion. Some analysis is pursued which shows that the proposed model structure retains the statistical properties of the standard identifiable model structures. We prove, under some mild assumptions, that the proposed identification algorithm converges locally to the set of true systems. Inclusion of an additional step in the algorithm gives convergence to a balanced realization of the true system. Some results on the sensitivity of the transfer function with respect to the parameters for a given realization are reviewed, which show that balanced realizations have low sensitivity. We show that for one particular choice of regularization the obtained model is a norm minimal realization. Examples are given showing the properties of the proposed model structure using both real and simulated data.

(2) 1 Introduction

In this report we consider identification of multivariable linear time-invariant systems. We also assume that we have no structural information and thus have to resort to so-called black-box models. These can be either of input-output type or of state-space type. There are several reasons that favor the state-space form: they admit economical implementation algorithms; they have simple relationships between simulation and prediction applications; and many control design methods and algorithms, including the recent Doyle-Glover algorithm solving the H∞ optimal control problem [2], use state-space descriptions of the plants to be controlled. From a computer implementation point of view it is important to use models which have low sensitivity with respect to round-off errors and finite word length effects. Input-output descriptions of some systems are known to be very sensitive to these effects [29]. We will mainly concentrate on the parametrization problem of multivariable state-space systems. We propose a fully parametrized state-space model structure which immediately solves the

(3) difficult parametrization problem and also opens possibilities to obtain models with good numerical properties. The only drawback is a higher computational burden, since more parameters are estimated compared to standard techniques using identifiable model structures. The report is outlined as follows: In Section 2 we give a brief review of state-space models and their associated predictors. The parametrization of state-space model structures is then discussed, and the fully parametrized model structure is introduced in Section 3. Section 4 is devoted to the prediction error minimization problem, and a regularized prediction error criterion is introduced. In Section 5 a statistical analysis is performed showing that the quality of the prediction for the fully parametrized model is the same as for a standard identifiable model. A convergence analysis shows that the identification algorithm is locally convergent. In Section 6 some results on the sensitivity of a transfer function are reviewed which show that a balanced

(4) realization has low sensitivity. The main identification algorithm is presented in Section 7; it yields a balanced realization of the true system. In Section 8 we discuss a particular form of regularization and show that the estimated system will be a norm minimal realization. Some identification examples using both real and simulated data are given in Section 9. In Section 10 the report is summed up.

2 State-Space Models of Linear Systems

A general time-invariant linear system S can be described as

S: y(t) = G₀(q)u(t) + H₀(q)e₀(t)   (1)

where y(t) is an output vector of dimension p and u(t) is an input vector of dimension m which is assumed to be known. The unknown disturbances acting on the output are assumed to be generated by the second term, where e₀(t) is a noise vector of dimension p assumed to be a sequence of independent stochastic variables with E e₀(t) = 0 and E e₀(t)e₀(t)ᵀ = Λ₀. The transfer functions can be characterized by their impulse responses as

G₀(q) = Σ_{k=0}^{∞} g(k) q⁻ᵏ,   H₀(q) = I + Σ_{k=1}^{∞} h(k) q⁻ᵏ   (2)

where H₀(q)⁻¹ is assumed to be stable. We will also assume that g(0) = 0, i.e. the direct term from u(t) to y(t) is missing. Inclusion of the direct term is straightforward and does not change any of the results to be presented. Furthermore, we will assume that the system is finite-dimensional, i.e. the transfer function can be written as a matrix where each entry is a rational function. A natural predictor ŷ(t) for the output y(t), given inputs u and outputs y up to time t − 1, is

ŷ(t) = H₀(q)⁻¹G₀(q)u(t) + [I − H₀(q)⁻¹]y(t).   (3)

Since the system (1) is of finite order, it can also be described in a state-space formulation by introducing an auxiliary state vector of dimension n, the order of the system. In order to estimate the dynamics of the system

(5) we introduce a model structure M, parametrized by some parameters θ. In this report we will focus on state-space model structures in innovations form

M:  x̂(t+1) = A(θ)x̂(t) + B(θ)u(t) + K(θ)e(t)
    y(t) = C(θ)x̂(t) + e(t)   (4)

where the matrices A, B, C, K are constructed from the parameter vector θ according to the model structure M. The model structure M is thus a mapping from the parameter space to the model (4). Let

d_M = dim θ   (5)

be the dimension of the parameter vector θ, and let M(θ) denote the model (4). The model will thus have the transfer functions

G(q,θ) = C(θ)[qI − A(θ)]⁻¹B(θ)   (6)

and

H(q,θ) = I + C(θ)[qI − A(θ)]⁻¹K(θ).   (7)

Using (4) the predictor is given by

x̂(t+1) = [A(θ) − K(θ)C(θ)]x̂(t) + B(θ)u(t) + K(θ)y(t)
ŷ(t|θ) = C(θ)x̂(t)   (8)

which can also be written as

ŷ(t|θ) = W_u(q,θ)u(t) + W_y(q,θ)y(t)   (9)

where

W_u(q,θ) = C(θ)[qI − A(θ) + K(θ)C(θ)]⁻¹B(θ)   (10)
W_y(q,θ) = C(θ)[qI − A(θ) + K(θ)C(θ)]⁻¹K(θ).   (11)

In order to use the predictor (8) we have to ensure stability of the two transfer functions W_u and W_y, and we make the following definition.

Definition 2.1 Let the set of parameters D_M ⊂ ℝ^{d_M} be

D_M = {θ ∈ ℝ^{d_M} | W_u(q,θ) and W_y(q,θ) are stable}.

(6) □

From the structure of the state-space predictor (8) it follows trivially that we here have

D_M = {θ ∈ ℝ^{d_M} | the matrix A(θ) − K(θ)C(θ) has all eigenvalues inside the open unit disc}

for the general state-space model structure (4).

Definition 2.2 Two models M₁(θ₁) and M₂(θ₂) are equal if and only if

G₁(e^{iω}, θ₁) = G₂(e^{iω}, θ₂) and H₁(e^{iω}, θ₁) = H₂(e^{iω}, θ₂) for almost all ω.   (12)
□

To simplify the reasoning about different models we will use the term model set, denoted by ℳ, see [13]. A model structure M together with a parameter space gives us a model set:

ℳ = {M(θ) | θ ∈ D_M}.   (13)

3 Parametrization of State-Space Models

The parametrization of models to be used in system identification is often closely tied to the concept of identifiability.

Definition 3.1 [13] A model structure M is globally identifiable at θ* if

M(θ) = M(θ*), θ ∈ D_M  ⇒  θ = θ*.   (14)
□

An identifiable model structure thus has a one-to-one correspondence between the models, i.e. the transfer functions, and the value of the parameter vector θ.
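To make the predictor (8) concrete, here is a minimal sketch of how the one-step-ahead predictions ŷ(t|θ) can be computed for given matrices A, B, C, K. The report itself contains no code for this step; the Python/NumPy setting and the function name are illustrative assumptions only.

```python
import numpy as np

def predict_one_step(A, B, C, K, u, y, x0=None):
    """One-step-ahead predictor (8):
         xhat(t+1) = (A - K C) xhat(t) + B u(t) + K y(t)
         yhat(t|theta) = C xhat(t)
    u has shape (N, m), y has shape (N, p)."""
    N, p = y.shape
    n = A.shape[0]
    xhat = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
    Abar = A - K @ C                 # predictor dynamics; must be stable (Definition 2.1)
    yhat = np.zeros((N, p))
    for t in range(N):
        yhat[t] = C @ xhat           # yhat(t | theta)
        xhat = Abar @ xhat + B @ u[t] + K @ y[t]
    return yhat
```

The stability requirement of Definition 2.1, all eigenvalues of A − KC inside the open unit disc, is exactly what keeps this recursion from diverging.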

(7) Traditionally, see [13, 27], model structures which are identifiable have been favored for the purpose of system identification. The use of an identifiable model structure has some clear advantages: the criterion V_N(θ) has a unique minimum if the data is informative, numerical algorithms perform well, etc. However, the parametrization of multivariable state-space models is a well-known and notoriously

(8) difficult problem in system identification, see e.g. [9], [17] or [31]. The root of the problem is that there is no single, smooth, identifiable parametrization of all multi-output systems (4). Typically one has to work with a large number of different possible parametrizations depending on the internal structure of the system. Also, for some systems it is

(9) difficult to find an identifiable parametrization which is well conditioned [14]. As an example of an identifiable model structure we have the following:

Let A(θ) initially be a matrix filled with zeros and with ones along the superdiagonal. Then let the rows with numbers r₁, r₂, ..., r_p, where r_p = n, be filled with parameters. Take r₀ = 0 and let C(θ) be filled with zeros, and then let row i have a one in column r_{i−1} + 1. Let B(θ) and K(θ) be filled with parameters.   (15)

Denote this model structure M_I. This parametrization is uniquely characterized by the p numbers r_i, which are to be chosen by the user. Instead of r_i we will use the numbers ν_i = r_i − r_{i−1} and call

ν_n = {ν₁, ..., ν_p}   (16)

the multi-index associated with the model structure (15).

Theorem 3.1 The state-space model structure M_I defined by (15) is globally identifiable at θ* if and only if {A(θ*), [B(θ*) K(θ*)]} is controllable.

Proof. See [13], Theorem 4A.2. □

From the definition of the identifiable model structure (15) it follows that the dimension of the parameter vector is

d_{M_I} = nm + 2np.   (17)
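As an illustration of the structure (15), the sketch below builds (A, B, C, K) from a multi-index ν_n = {ν₁, ..., ν_p} and a parameter vector θ. This is not code from the report; the function name and the ordering of the parameters inside θ are arbitrary conventions chosen here.

```python
import numpy as np

def build_identifiable_model(theta, nu, m):
    """Construct (A, B, C, K) for the identifiable structure (15) from a
    parameter vector theta and a multi-index nu = [nu_1, ..., nu_p] with
    sum(nu) = n.  m is the number of inputs."""
    nu = list(nu)
    p = len(nu)
    n = sum(nu)
    r = np.cumsum(nu)                    # r_1, ..., r_p  (r_p = n)
    theta = np.asarray(theta, dtype=float)
    k = 0                                # running index into theta

    A = np.diag(np.ones(n - 1), k=1)     # ones along the superdiagonal
    for ri in r:                         # rows r_1, ..., r_p are filled with parameters
        A[ri - 1, :] = theta[k:k + n]
        k += n

    C = np.zeros((p, n))                 # row i has a one in column r_{i-1} + 1
    prev = 0
    for i in range(p):
        C[i, prev] = 1.0
        prev = r[i]

    B = theta[k:k + n * m].reshape(n, m); k += n * m
    K = theta[k:k + n * p].reshape(n, p); k += n * p
    assert k == n * m + 2 * n * p        # exactly d_{M_I} parameters, as in (17)
    return A, B, C, K
```

With n = ν₁ + ... + ν_p states, the assertion at the end checks that exactly d_{M_I} = nm + 2np parameters are consumed, in agreement with (17).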

(10) For a given system order n and number of outputs p there exist (n−1 choose p−1) different multi-indices. We also have, from [13]:

Theorem 3.2 Any linear system that can be represented in state-space form of order n can also be represented in the particular form (15) for some multi-index ν_n. □

Thus, if we consider the union of all model sets ℳ_{ν_n} generated by the model structures given by all different multi-indices, we have

S ∈ ∪_{ν_n} ℳ_{ν_n}

for all possible linear systems S of order n. When identifying a multivariable system of a given order n using an identifiable parametrization we can, in principle, not use one single parametrization and thus have to search for the best model among all model structures defined by the multi-indices. However, as pointed out in [13], the different model structures overlap considerably, and one particular multi-index is capable of describing almost all n-dimensional linear systems. The price paid is that the identification might result in a numerically ill-conditioned model. The concept of identifiable model structures is important if we are interested in the values of the parameters themselves, e.g. in change detection techniques where a change in the parameters indicates a change in the underlying system. If we are only interested in the input/output relation G(q,θ), the value of each parameter is of no interest to us. The parameters can then be seen as vehicles to describe the interesting characteristics of the system. It is also known that some systems described in their canonical forms have transfer functions with a very high sensitivity with respect to the parameters θ and are thus sensitive to finite word length effects occurring in a computer [29]. Based on these arguments we will introduce a new model structure which circumvents these problems.

3.1 The Fully Parametrized State-Space Model

If we consider the state-space model (4) and choose to fill all the matrices A, B, C, K with parameters, we will clearly over-parametrize the model and thus lose identifiability (14). To formalize we have:

(11) Let all the matrices A(θ), B(θ), C(θ) and K(θ) in the model (4) be filled with parameters from the parameter vector θ ∈ D_M, and let the corresponding model structure be called fully parametrized.   (18)

Denote this model structure M_F. The number of parameters needed for this model structure (18) is

d_{M_F} = n² + 2np + nm.   (19)

The fully parametrized model structure thus has an additional n² parameters compared to an identifiable model structure (15), (17). For completeness we have:

Property 3.1 Any linear system S that can be represented in state-space form of order n can also be represented by a model from the fully parametrized model structure (18). □

Using this model structure for the purpose of identification has two interesting implications. First, this state-space model structure relieves us from the search through the many different forms defined by the multi-indices when dealing with multivariable systems, since the proposed model structure trivially includes all possible systems of a given order n. Secondly, the quality of the estimated model might increase if we use a flexible model structure which not only can describe the underlying system but also allows a numerically well-conditioned description. This is a major difference compared with the identifiable model structures (15), which by definition have only one realization for each system S. Since we are confined to computers with limited accuracy for all calculations, it is important to be able to use models which are numerically well conditioned. It might be important to point out that even if the proposed model has more parameters than the corresponding identifiable model, the two models have exactly the same flexibility with respect to the transfer functions, which we formally can state as

ℳ_F = ∪_{ν_n} ℳ_{ν_n}

where ℳ_F denotes the model set generated by the fully parametrized model structure (18).
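A corresponding sketch for the fully parametrized structure (18) simply reshapes a parameter vector into the four matrices. The column-stacking (vec) order used below matches the convention introduced later in (28); the function names are again illustrative assumptions only.

```python
import numpy as np

def unpack_full(theta, n, m, p):
    """Split theta into (A, B, K, C) for the fully parametrized structure (18);
    the stacking order [vec(A); vec(B); vec(K); vec(C)] follows (28)."""
    theta = np.asarray(theta, dtype=float)
    assert theta.size == n * n + 2 * n * p + n * m   # d_{M_F} from (19)
    k = 0
    A = theta[k:k + n * n].reshape(n, n, order='F'); k += n * n   # vec() stacks columns
    B = theta[k:k + n * m].reshape(n, m, order='F'); k += n * m
    K = theta[k:k + n * p].reshape(n, p, order='F'); k += n * p
    C = theta[k:k + p * n].reshape(p, n, order='F')
    return A, B, K, C

def pack_full(A, B, K, C):
    """Inverse of unpack_full: theta = [vec(A); vec(B); vec(K); vec(C)]."""
    return np.concatenate([M.flatten(order='F') for M in (A, B, K, C)])
```

The length check reproduces d_{M_F} = n² + 2np + nm from (19).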

(12) 4 Prediction Error Minimization

In order to investigate the properties of the proposed model structure we will focus on system identification techniques based on the minimization of the prediction errors (PEM). The standard setting can be described as follows: Given an input sequence {u(t)} and an output sequence {y(t)}, t = 1, ..., N, denoted by Z^N, and a model structure M which defines a mapping between the parameter space D_M and a predictor ŷ(t|θ), find the value θ̂ which minimizes a criterion V_N(θ). Let us define the criterion to be

V_N(θ) = (1/N) Σ_{t=1}^{N} |ε(t,θ)|²   (20)

where |·| is the Euclidean l₂ norm. The prediction error is the vector

ε(t,θ) = y(t) − ŷ(t|θ)   (21)

with the predictor ŷ(t|θ) given by (8). This problem formulation is well known, and there exists a vast literature on how to minimize (20) and on the properties of the estimate

θ̂_N = arg min_{θ ∈ D_M} V_N(θ)   (22)

under varying assumptions about the model structure M and the data Z^N. See [13] or [27] for a general discussion of the topic.

4.1 PEM for the overparametrized model

If we use (20) together with the proposed model structure M_F, the minimum of V_N(θ) will be a hyperplane in the parameter space and thus not unique. This follows from the fact that there exist infinitely many parameter values θ such that S = M(θ). In Section 5 we will further discuss this property. This non-uniqueness usually leads to convergence problems if standard optimization algorithms are used in the minimization of (20), since most algorithms use the inverse of the Hessian (d²/dθ²) V_N(θ)

(13) and in this case the Hessian will be singular and non-invertible. One possibility to overcome this is to further constrain the solution. We will here focus on regularization, which is a standard technique for ill-conditioned estimation problems, see [3]. Regularization means that the criterion (20) is augmented with a term:

W_N(θ) = V_N(θ) + (δ/2)|θ − θ#|²,   δ > 0   (23)

where |·| denotes the Euclidean norm, or more generally with a function r(θ, θ#) with a positive definite Hessian. The regularization term (δ/2)|θ − θ#|² has the effect that the resulting estimate

θ̂_N = arg min_{θ ∈ D_M} W_N(θ)   (24)

is a compromise between minimizing V_N(θ) and being close to the value θ#. Since our objective is to find the θ which minimizes V_N(θ), it is clear that the choice of θ# will influence the result. In the next section we address this question together with the presentation of the identification algorithm.

5 Analysis

5.1 Statistical analysis of the prediction quality

In the system identification literature the concept of parsimony [27] is considered very important: use the simplest model with as few parameters as possible. This concept can easily be justified by analyzing the statistical properties of the prediction error. The key result is that the variance of each estimated parameter in most cases increases if we estimate more parameters from a fixed amount of data, and thus degrades the quality of the estimated model, i.e. the variance of the prediction error increases. An identifiable state-space model of a given order has, by definition, a minimal number of parameters and thus satisfies this concept. Even though the fully parametrized model has more parameters than the corresponding identifiable model, we will in this section show that regularization restores the statistical properties of an identifiable model. In [26] some statistical properties are developed considering neural networks as nonlinear predictor models.
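To spell out how (20) and (23) interact, the following sketch evaluates the prediction error criterion with the predictor (8) and then adds the regularization term. It reuses an unpacking function like the unpack_full sketch in Section 3.1; everything here is an illustration under those assumptions, not the report's own implementation (which uses a damped Gauss-Newton search, see Section 7.1).

```python
import numpy as np

def V_N(theta, u, y, unpack):
    """Prediction error criterion (20): V_N(theta) = (1/N) sum_t |eps(t,theta)|^2,
    with eps(t,theta) = y(t) - yhat(t|theta) from (21) and the predictor (8).
    `unpack` maps theta to the matrices (A, B, K, C)."""
    A, B, K, C = unpack(theta)
    N = y.shape[0]
    xhat = np.zeros(A.shape[0])
    Abar = A - K @ C
    V = 0.0
    for t in range(N):
        eps = y[t] - C @ xhat            # prediction error (21)
        V += eps @ eps
        xhat = Abar @ xhat + B @ u[t] + K @ y[t]
    return V / N

def W_N(theta, u, y, unpack, delta, theta_sharp):
    """Regularized criterion (23): W_N = V_N + (delta/2) |theta - theta#|^2."""
    d = theta - theta_sharp
    return V_N(theta, u, y, unpack) + 0.5 * delta * (d @ d)
```

Minimizing W_N instead of V_N is what adds δI to the approximate Hessian, compare (73), so that standard descent methods remain well posed despite the overparametrization.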

(14) These results also apply to the fully parametrized model structure which is pointed out in 16]. In the sequel we will perform the analysis focusing on fully parametrized state-space models.. Denition 5.1 Let the true system S be described by (1) and consider a. model structure M of minimal order n such that S 2 M. Then let the set of the true parameters be dened as DT = f 2 DM j G0(z) = G(z ) H0 (z) = H (z ) 8zg (25). 2. For an identiable model structure (15) the set DT contains only one point which directly follows from (14) and Theorem 3.1. If the model structure is a fully parametrized state space model structure (18), the elements in DT will be related as follows. Lemma 5.1 Consider the fully parametrized model structure (18) with n states. Then 8i 2 DT  i = 1 2 9 a uniqe T 2 R nn  T invertible : (26) A(1 ) = T ;1A(2 )T B (1 ) = T ;1B (2 ) (27) C (1 ) = C (2 )T K (1 ) = T ;1K (2 ) Proof See 9] Theorem 6.2.4 2 If we now let the parameter vector  be composed as  =  vec(A)T vec(B )T vec(K )T vec(C )T ]T (28) where vec() is the operator which forms a vector from a matrix by stacking it's columns. The following relation for compatible matrices A B C vec(ABC ) = C T A vec(B ) (29) holds, where denotes the Kronecker product 6], dened as: 2 a B a B ::: a B 3 66 a1121 B a1222 B : : : a12nn B 77 A B = 66 .. ... ... 775 4 . am1 B am2 B : : : amnB 11.

(15) where A is of dimension m n. 1 and 2 in Lemma 5.1 will then be related as 1 = T(T )2 (30) where 2 T T T ;1 0 0 0 3 6 0 Im T ;1 0 0 777 T(T ) = 664 (31) 0 0 Im T ;1 0 5 0 0 0 T T Ip which proves the following lemma.. Lemma 5.2 Consider the fully parametrized model structure MF in (18) with n states, m inputs and p outputs. Then 8i 2 DT . i = 1 2. where T(T ) is given by (31).. 9. a unique T 2 R nn T invertible : 1 = T(T )2. 2. If we now consider two models of the same order n, one from the fully parametrized model structure (18), MF (F ), and one from the family of identiable model structures (15), MI (I ), with a multi-index n. Now assume that the two models are equal MI (I ) = MF (F ):. Let. I =  vecA(I )T vecB (I )T vecK (I )T vecC (I )T ]T : This vector consists of the xed ones and zeros as given by the model structure denition (15) and the parameters from the vector I . Since the two models are equal we have from Lemma 5.2 9 a uniqe T : I = T (T )F Thus for all F such that MF (F ) 2 MI , we can uniquely nd a T since the model structure MI is identiable. T is easily constructed from the 12.

(16) observability matrix given by the matrices in MF (F ) together with the multi-index n, see 9]. Hence there exists a di erentiable function. g() : D~ M. F. such that. ! DMI. ~ MF MF (F ) = MI (g (I )) 8F 2 D. (32) where D~ MF  DMF is the set of parameters which yield all models in the model set MI . The gradient of the predictor y^(tj) is dened as. 2 d y^(tj) = 66 (t ) = d 4. d d1 y^1 (tj). :::. ... d ddM y^1 (tj) : : :. d d1 y^p(tj). ... d ddM y^p(tj). 3 77 5. (33). a matrix of dimension dM p where the subscript k in yk (tj) denotes the position in the vector.. Lemma 5.3 Let the predictor y^F (tjF ) be given by a fully parametrized model structure MF (18). Then N 1X F (t F ) F (t F )T rank. N t=1 where is given by (33) and dM by (17)..

(17) dMI. I. Proof. Pick an identiable model structure (15) and a I such that MI (I ) = MF (F ):. Then using (32) we have F (t F ) = dd y^F (tjF ) = dd y^I (tjg(F )) = @@ g(F )  I (t I ) F F F which gives us N 1X F F T N t=1 (t F ) (t F ) 13.

(18) " X # N @ 1 I (t  ) I (t  )T  ( @ g ( ))T : = g (F ) . I I @F N t=1 @F F The proof is concluded by observing that I (t I ) I (t I )T has dimension dM dM and applying Sylvester's inequality. 2 I. I. We will continue by introducing some assumptions which we need in order to perform the convergence analysis. The set Z N , i.e. our measured input and output data, is our source of information about the true system. A model structure M together with the parameter space DM gives us the model set M. A natural question to pose is if the data set contain enough information to distinguish between di erent models in the model set.. Denition 5.2 13] A data set Z 1 is informative with respect to a model structure M if. N 1X 2 E Nlim !1 N jy^(tj1 ) ; y^(tj2 )j = 0 t=1 ) M(1 ) = M(2 ). (34). 2. Assumption A1 Consider a data set Z 1 which has been generated by the. system (4) where the input u(t) is chosen so the data set is informative and that the data set satises the technical condition D1 in 13, p. 210]. 2 We then have the following convergence result from 13]: Theorem 5.1 Let ^N be dened by (22) and let the data Z 1 satisfy A1. Then inf j^N ; j ! 0 w.p. 1 as N ! 1: (35)  2DT. 2. If we now consider the regularized version of the criterion (23) and let # = 0 2 DT we will have ^N ! 0 as N ! 1 since both terms in the criterion will simultaneously obtain their minimum at  = 0 which proves the following theorem. 14.

(19) Theorem 5.2 Let ^N be dened by (23) and (24) with # = 0 2 DT and let the data Z 1 satisfy A1. Then ^N ! 0  w.p. 1 as N ! 1. (36). 2. This choice is however articial since we in reality do not know 0 a priori. However the theoretical implications in the analysis to follow are still interesting since we will let # approach a 0 2 DT in the algorithm. In the analysis we will also need the following result. Lemma 5.4 Let the predictor be dened by (8),the gradient by (33) and let the data set Z 1 satisfy A1. Then 8 2 DM lim E N !1. N 1X T N k=1 (t ) (t ) = R() (0) < 1. Proof. Follows from Lemma 4.2 and Theorem 2.3 both in 13].. 2. We will now derive a statistical model quality measure for the fully parametrized models obtained via minimization of the regularized prediction error criterion (23) and (24) with # = 0 2 DT . For a nite data set Z N the obtained estimate ^N will deviate a little from 0 . As a measure of the quality of the estimated model using nite data we can consider the scalar ^ V (^N ) = Nlim (37) !1 E VN (N ): For all estimates we have V (^N )  V (0 ) = tr(0) 0 2 DT and the increase of V (^N ) due to the parameter deviation should be as low as possible. This assessment criterion thus considers the variance of the one-step prediction errors, when the estimated model is applied to future data. Finally we have VN = E V (^N ) (38) with the expectation taken over the estimated parameters, as a performance measure \on the average" for the estimated model. 15.

(20) At the minimum of the criterion (23) we have. d 0 4 WN (^N ) = d WN () = 0: =^. (39).  N. A Taylor expansion around the limiting estimate 0 will then give1 0 = WN 0(0 ) + WN 00(N )(^N ; 0 ) with N between ^N and 0 and. WN 00(N ) =4. (40). d2 W  () : d2 N = N. From 13, p. 270] we have. WN 00(N ) ! W  00 (0 ) w.p. 1 as N ! 1. where. N 1X 00  00   W (0 ) = Nlim !1 E N i=1 WN (0): Thus, for su

(21) ciently large N we can write ^N ; 0 = ;W  00(0 )];1WN 0(0 ) w.p. 1. We also have with and2. (41) (42) (43). WN 0 () = VN0 () + ( ; 0 ). (44). N X VN0 () = ; N2 (t )"(t ). (45). N X 2 00 VN () = N ( (t ) (t )T ; 0 (t )"(t )) t=1. (46). t=1. 1 The application of the Taylor expansion (40) may give dierent N in dierent rows of this vector expression. 2 Note that the equation is written in an informal way since  (t  ) is a tensor. 0. 16.

(22) which gives us where. W  00 (0 ) = V 00 (0 ) + I = Q + I. (47). N 2X T Q = V 00 (0 ) = Nlim E (48) !1 N t=1 (t 0 ) (t 0) since "(t 0) = e0 (t) and 0 (t 0) are independent. Now by Taylor expansion of VN (38) around 0 we have VN = E fV (0 ) + (^N ; 0 )T V 0(0 ) + 12 (^N ; 0)T V 00(0 )(^N ; 0)g (49) neglecting higher order terms. The second term in (49) is zero since V 0(0 ) = 0. If we now insert equation (43) and use the properties of the trace operator we obtain VN = tr(0) + 21 trfE WN 0(0 )(WN 0(0 ))T Q + I ];1QQ + I ];1g (50) If we assume that 0 = 0I we have  0 ( )(W  0 ( ))T = 2 Q: lim N E W (51) 0 0 N N 0 N !1 Equation (50) will then be.

(23) (52) VN = 0 p + N1 trfQQ + I ];1QQ + I ];1g Since Q is a symmetric positive semidenite matrix we can simultaneously make Q and Q + I diagonal which gives us 0 1 dX M 2 1  i A VN = 0 @p + N (53) 2 (  + i=1 i  ) where i are the eigenvalues of the matrix Q and dM is given by (19). From Lemma 5.3 we know that the matrix Q will at most have dM nonzero eigenvalues. If we now choose  to satisfy 0 <   i 8i 6= 0 (54) we arrive at  ! VN = 0 p + rank Q : (55) N F. F. I. Thus we have proved the following theorem: 17.

(24) Theorem 5.3 Consider an overparametrized model structure (18) which denes a predictor (8) and let the data set Z 1 satisfy A1 and be generated by a true system with 0 = 0 I . Furthermore let ^N be given by (23-24) where # = 0 2 DT . Then  ! rank Q  lim V = 0 p + : (56) !0 N. N. where VN is given by (37) and (38).. 2. The expression (56) is similar to the known expression. VN. . = 0 p + dM N. !. 13, 27] for identiable model structures, where the latter clearly shows the relation to model accuracy versus number of estimated parameters. However Theorem 5.3 shows us that the fully parametrized model retains the same properties since the rank of Q, by Lemma 5.3, can at most be dMI which is the number of parameters in a corresponding identiable model. Hence the fully parametrized predictor model in fact obtains the same statistical properties, i.e. the same prediction error variance, as an identiable model.. 5.2 Convergence analysis. In reality we cannot choose # = 0 since it is unknown to us so we have to use some a priori good guess. One possibility is to perform a sequence of minimizations of (23) and use the obtained parameter estimate ^N as # in the next minimization. This scheme is in fact locally convergent which we now will prove. Since we have to perform the analysis locally we can simplify the calculations by considering a linear regression set up: Let the data Z 1 be generated by the system (57) S : y (t) = (t)T 0 + e(t) where the regressors (t) are made up from data and the noise terms e(t) are assumed to be a sequence of independent stochastic variables with zero mean 18.

(25) E e(t) = 0, covariance E e(t)T e(t) =  and independent of the regressors. Let the prediction model be y^(tj) = (t)T :. (58). V () = Nlim !1 E VN ():. (59). M:. Consider the criterion This gives us with. V () = tr() + (0 ; )T Q(0 ; ) N 1X T Q = Nlim E !1 N (t) (t) : k=1. Now assume that the matrix Q is only positive semidenite which implies that V () obtains its minimum value tr() for all  = 0 + z z 2 N (Q) the null space of the matrix Q. If we let the data set Z 1 be informative, the set DT for this system is then given by DT = fj = 0 + z z 2 N (Q)g: (60) To nd the minimum of (59) consider the iterative scheme. h i ^k = arg min V (  ) +  j ; ^k;1 j2 : (61) 2DM Without loss of generality we can assume that 0 = 0. The right hand side of (61) will then be tr() + T Q + ( ; ^k;1)T ( ; ^k;1) which is a quadratic expression in . Completing the squares gives us ( ; (Q + I );1^k;1)T (Q + I )( ; (Q + I );1^k;1) + C where C is all the terms independent of  lumped together. The minimum is thus given by ^k = (Q + I );1^k;1: 19.

(26) Let T be a square nonsingular matrix such that TQT ;1 = " is diagonal and let ~k = T ^k . Furthermore let the diagonal elements of " be given in an descending order. This gives us ~k = (" + I );1~k;1: Using the diagonal form of " gives us. 2 66 66 6 ~k = 66 66 64.  +1. .... 0  +r. 1. .... 3 77 77 77 ~k;1 ~k;1 77  = R 77 5. 0 1 where i i = 1 : : : r are the nonzero eigenvalues of Q. If we now study the limiting estimate as k tends to innity we have " lim ~k = " lim Rk ~0 = 0: k!1. k!1. The limiting estimate limk!1 ~k thus belongs to the null space of " which shows us that lim ^k 2 N (Q): k!1 This proves the following theorem.. Theorem 5.4 Let the data set Z 1 be generated by the system (57) and be. informative. Let the predictor be given by (58). Let the sequence of estimates be given by (61) with  > 0, and let DT be given by (60). Then lim inf j^k ; j = 0: k!1  2DT. 2. If we now return to the state-space model structures and consider a Taylor expansion of V () around a 0 2 DT we get V ()  V (0 ) + ( ; 0 )T V 0(0 ) + 12 ( ; 0 )T Q( ; 0 ) 20.

(27) Q = V 00(0 ) for  in the neighborhood of 0. The second term is zero since 0 2 DT is a minimum. If all higher order terms are small the approximation is valid locally and we have the same problem as for the linear case given in Theorem 5.4. This implies that the iterative minimization scheme (61) will also be locally convergent for the fully parametrized state-space model structures which exhibit the same non-uniqueness properties as the linear regression setup in the theorem. From a regularization point of view the iterative scheme (61) will decrease the actual regularization e ect compared to minimization of (23) where # is a xed value. But in the light of Theorem 5.3 this is ne since  can become arbitrarily small and in the case (23)  is a measure of the degree of regularization. Theorem 5.4 proves that the iterative scheme (61) restores the convergence properties of the standard prediction error techniques, i.e. the limiting estimate belongs to the set DT , the true parameters. Our main goal is thus achieved we have one model structure which include all possible multivariable systems, the introduced regularization gives the estimated predictors the same statistical quality as identiable models as stated by Theorem 5.3. However we can not directly apply Theorem 5.3 on the iterative scheme (61) since # = k is a stochastic variable. Our model structure still give us more freedom: The set DT is a hyper plane in the parameter space and all the results in this section applies to all points 0 2 DT . However it is known in the literature that some realizations i.e points in DT are much more favorable to use on a computer from a numerical point of view. This important practical question will be addressed in the next section.. 6 Numerical Sensitivity In the area of digital lter synthesis, numerical sensitivity of di erent lter structures have been addressed by several authors, among others the work by Mullis and Roberts 24, 23]. They modeled the xed point calculations roundo error as noise and gave conditions which minimized the output noise. From our system identication point of view it is interesting to mention that their resulting lter structures are far from the standard canonical forms. 21.

(28) Other more recent results have focused on the sensitivity of the transfer function with respect to the parameters 19, 8, 10]. The results indicate that balanced realizations have low sensitivity and will thus yield numerically well conditioned models. The question of sensitivity combined with system identication is however not at all as developed in the literature. Some of the work has been focused on changing the delay operator q to some others in the ARX models 5, 11] which in some cases yield well conditioned parameter estimates. In 20] a balanced realization with some restrictions to make it canonical is proposed. This model would thus be well conditioned. The identication method however includes a di

(29) cult constrained optimization problem and no convergence analysis is given.. 6.1 A review of some results. In this section we will consider state-space systems of the form (4) without noise 3 , i.e. H (q) = 0. The matrix. Wo =. 1 X. (AT )k C T CAk. k=0. (62). is known as the observability Gramian for the state space system (4). The eigenvalues of this matrix describe how the state variables inuence the output signal y. This matrix also satises the following Lyapunov equation. Wo = AT WoA + C T C: The dual matrix. Wc =. 1 X k=0. Ak BB T (AT )k. (63). is called the controllability Gramian. This matrix describes how the inputs u inuence on the state vector x. Wc also satises. Wc = AWcAT + BB T : The Gramian matrices are symmetric by construction and if the state space realization is minimal the matrices are also positive denite. These Gramian 3. We could also consider the noise e(t) as an input and let B = B K ]. 22.

(30) matrices are the discrete time counterpart to the continuous time Gramians described in 9]. In 22] Moore showed that for every transfer function G(q) there exists a state space realization where Wc = Wo = " and " diagonal. This realization is called the balanced realization. Moore used this realization to perform model reduction. A change of basis of the state vector x~ = T ;1x gives new Gramian matrices W~ c = T ;1WcT ;T and W~ o = T T WoT The eigenvalues of the matrix WcWo are thus invariant under similarity transformations T . The Hankel singular values 4] of G(q) are dened as. i =4 i(WcWo)]1=2  i = 1 : : : n where i(WcWo) denotes the i'th eigenvalues of WcWo .. 6.2 The sensitivity problem. In computer implementations it is important to consider so called nite word length e ects, i.e. what e ect the use of nite representation of real numbers have for an algorithm. In our case we are interested in which state-space realization are least sensitive to these e ects and thus minimize the degradation of the system performance. As a measure of sensitivity of a transfer function with respect to the parameters we will use the expression in 28].       @G (z) 2 @G (z) 2 @G (z) 2    M =  @A  +  @B  +  @C  F L1 F L2 F L2 !2  Z  Z  @G(ei! ) 2 i! )  @G ( e 1 1     d ! + 2 ;  @B  d ! = 2 ;  @A F F   2 Z  @G(ei! )  +1   d! 2 ;  @C F 23. (64).

(31) where the Frobenius norm is kX k2F = tr X X and X  is the transposed conjugate of X . This scalar expression gives a measure of the sensitivity of the transfer function over the whole frequency range ; ]. The mix between L1 and L2 norms is motivated by the analytical properties of the rst term in (64). For single input single output systems (m = 1 p = 1) the measure M can be shown 30] to have an upper bound S given by. M

(32) S =4 tr(Wc) tr(Wo ) + tr(Wc) + tr(Wo ):. (65). For general multivariable systems the above bound is generalized to 19]. M

(33) S = tr(Wc) tr(Wo) + p tr(Wc) + m tr(Wo). (66). where p and m is the number of system outputs and inputs respectively.. Theorem 6.1 Let Wc and Wo be the n n Gramian matrices for a minimal state space realization of a transfer function G(z ). Then. X n !2 n X S = tr(Wc) tr(Wo ) + p tr(Wc) + m tr(Wo)  i + 2 i i=1. i=1. where fi g is the Hankel singular values of G(z ). The equality holds if and only if p Wc = m Wo:. Proof. See 19]. 2. In 30] it is shown that realizations of systems with p = m = 1 (SISO) which satises Wc = Wo also minimize the true measure M . From the theorem it immediately follows that balanced realizations of square multivariable systems p = m achieves the minimal value of the upper bound S . If the multivariable system has p 6= m a minimum sensitivity realization is obtained from the balanced realization via a state transformation matrix T = (p=m)1=4 I . We also note that if a realization satises Wc = Wo an orthonormal state transformation T (TT T = I ) will also give a new realization with the Gramian matrices still equal. This shows that there exist innitely many realizations which satisfy Wc = Wo. 24.
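The quantities used above and in the next section, the Gramians, the sensitivity bound S of (66), and a balancing transformation in the spirit of Moore [22], can be computed as in the sketch below. This is my own Python/SciPy illustration (the report's numerical work was done in MATLAB); it assumes A has all eigenvalues strictly inside the unit circle so that the Lyapunov equations have solutions.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov, cholesky, svd

def gramians(A, B, C):
    """Discrete-time Gramians:  Wc = A Wc A' + B B',  Wo = A' Wo A + C' C."""
    Wc = solve_discrete_lyapunov(A, B @ B.T)
    Wo = solve_discrete_lyapunov(A.T, C.T @ C)
    return Wc, Wo

def sensitivity_bound(A, B, C):
    """Upper bound S in (66): tr(Wc) tr(Wo) + p tr(Wc) + m tr(Wo)."""
    Wc, Wo = gramians(A, B, C)
    p, m = C.shape[0], B.shape[1]
    return np.trace(Wc) * np.trace(Wo) + p * np.trace(Wc) + m * np.trace(Wo)

def balance(A, B, C):
    """State transformation T that makes both Gramians equal to the diagonal
    matrix of Hankel singular values (Moore [22]); returns (T, sigma).
    The balanced system is (inv(T) @ A @ T, inv(T) @ B, C @ T)."""
    Wc, Wo = gramians(A, B, C)
    R = cholesky(Wc, lower=True)           # Wc = R R'
    U, s, _ = svd(R.T @ Wo @ R)            # eigen-decomposition of R' Wo R
    sigma = np.sqrt(s)                     # Hankel singular values
    T = R @ U @ np.diag(sigma ** -0.5)     # balancing transformation
    return T, sigma
```

Evaluating sensitivity_bound for different realizations of the same transfer function gives the kind of comparison reported in Example 8.1, with the balanced realization attaining the smallest value of S, as Theorem 6.1 predicts for square systems.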

(34) 7 The Algorithm

In this section we present an algorithm for identification using the proposed fully parametrized model structure. The algorithm is designed with the goal of providing accurate models with low sensitivity with respect to finite word length representations. The sensitivity results presented in the previous section lead us to seek an algorithm which has a balanced realization as its convergence point in D_T. This is easily achieved if, after each step in (61), we adjust the vector θ̂_k so that it represents a balanced realization, through a change of basis T. This means that we not only obtain a balanced realization in the limit but also use one in each step of the identification. This discussion leads us to the following algorithm:

Algorithm 7.1

1. Obtain an initial estimate by the least squares solution of the equations A(q⁻¹)y(t) = B(q⁻¹)u(t), t = 1, ..., N, where A and B are polynomial matrices. Each entry of the A matrix is a monic polynomial 1 + a₁q⁻¹ + ... + a_n q⁻ⁿ of degree n, and the entries of the B matrix are polynomials b₀ + b₁q⁻¹ + ... + b_n q⁻ⁿ, also of degree n.

2. Realize the transfer function A(q⁻¹)⁻¹B(q⁻¹) in state-space form using some basis, e.g. the observability canonical form [9]. This yields a state-space system with npm states.

3. Convert the system to a balanced realization by a state transformation and reduce the system order to n by keeping only the n states corresponding to the n largest values of the Gramian, see [22].

4. If the model includes the matrix K, let K be the solution to the Kalman filtering problem with noise covariances equal to identity matrices. The resulting model is the initial estimate θ̂_b⁰.

5. Solve the minimization problem

   θ̂ᵏ = arg min_{θ ∈ D_M} [ V_N(θ) + (δ/2) |θ − θ̂_b^{k−1}|² ].

(35) 6. Convert the obtained estimate to represent a balanced realization:

   θ̂_bᵏ = b(θ̂ᵏ).

7. Repeat steps 5-6 until a minimum of the criterion is reached. □

Step 4 ensures that the initial predictor will be stable and hence θ̂_b⁰ ∈ D_M. Step 5 of the algorithm can easily be solved by a Gauss-Newton method, see Section 7.1. The balanced realization is obtained via singular value decompositions, see [22]. A different version of the algorithm is obtained if only one numerical iteration is performed in step 5. Practical use shows that both methods give comparable convergence properties.

7.1 Numerical solution by iterative search

Since all state-space model structures give a nonlinear relation between the parameters θ and the predictor ŷ(t|θ), we have to search the parameter space D_M with some iterative method in order to find a minimum of the criterion V_N(θ). A well-known method is the Newton method, which can be described as

θ̂^{i+1} = θ̂^{i} − [V_N''(θ̂^{i})]⁻¹ V_N'(θ̂^{i})   (67)

where θ̂^{i} is the estimate at the i'th step of the iterative algorithm,

V_N'(θ) = (d/dθ) V_N(θ) = −(2/N) Σ_{t=1}^{N} ψ(t,θ) ε(t,θ)   (68)

is the gradient of V_N(θ) with respect to the parameters θ, with

ψ(t,θ) = (d/dθ) ŷ(t|θ),   (69)

and

V_N''(θ) = (d²/dθ²) V_N(θ) = (2/N) Σ_{t=1}^{N} ( ψ(t,θ)ψ(t,θ)ᵀ − ψ'(t,θ)ε(t,θ) )   (70)

(36) is the Hessian. In the neighborhood of the minimum θ = θ₀ the Hessian V_N''(θ) can be approximated by

H(θ) = (1/N) Σ_{t=1}^{N} ψ(t,θ)ψ(t,θ)ᵀ   (71)

since ε(t,θ₀) and ψ'(t,θ₀) are independent. If we use this approximation together with an adjustable step length, we obtain the damped Gauss-Newton method

θ̂^{i+1} = θ̂^{i} − μᵢ [H(θ̂^{i})]⁻¹ V_N'(θ̂^{i})   (72)

where the scalar step length μᵢ is chosen so that the criterion V_N(θ) is decreased in every iteration. If we instead chose H = I we would obtain a gradient method, which is fairly

(37) inefficient close to the minimum compared to the Gauss-Newton method. A thorough treatment of Newton methods can be found in [1]. A condition which has to be met in order to be able to use a Newton method is that the Hessian V_N''(θ) must be nonsingular, so that its inverse is well defined. Using model structures which are identifiable meets this condition in the neighborhood of the minimum if the input is rich enough (persistence of excitation), which results in data Z^N which are informative [13]. The proposed overparametrized model structure does, however, not meet this condition, since V_N(θ) will in this case not have a unique minimum. The introduction of regularization and the minimization of (23) instead of (20) solves this problem, since the approximate Hessian is now given by

H(θ) = (2/N) Σ_{t=1}^{N} ψ(t,θ)ψ(t,θ)ᵀ + δI   (73)

which is positive definite.

8 Norm Minimal Realizations

In this section we will establish a connection between norm minimal realizations and the fully parametrized model structure, together with a particular regularization, namely the choice θ# = 0 in (23). The key result is that the obtained realization is norm minimal and, if the system is of a particularly

(38) simple form this also implies that the realization is a balanced realization except for an orthonormal state transformation. A further specialization of the set in Denition 5.1 will dene norm minimal realizations.. Denition 8.1 Consider a fully parametrized model structure (18) and let set of parameters from DT which satises DN = fj  = arg min 2D jjg T. (74). be called norm-minimal and let the realization of the model M()  2 DN be called a norm-minimal realization. 2. Lemma 8.1 Let T be dened by (31). Then. TT T = I , T T T = I Proof. Follows immediately from the fact that 2 TT T (TT T );1 3 0 0 0 6 77 0 Im (TT T );1 0 0 TT T = 664 75 : 0 0 Im (TT T );1 0 0 0 0 T T T Ip) 2. Lemma 8.2 Consider the fully parametrized model structure (18). Then. i = 1 2 9T :  2 and TT T = I 1 = T Proof. The existence of a T is proven in Lemma 5.2. In 7] it is proven that the state transformation matrix T satises T T T = I . Applying Lemma 8.1 concludes the proof. 2 8i 2 DN . This lemma shows that the set DN  DT does not contain a singleton, but is a hyperplane in the parameter space DM. The following theorem shows that we will obtain a norm minimal realization (in the limit) if we use the regularized prediction error criterion with the special choice # = 0. 28.

(39) Theorem 8.1 Let the data set Z 1 satisfy A1 and consider a fully parametrized model structure (18) which contain the true system and let ^N be dened by (23) and (24) with # = 0. Then lim lim inf j^N ; j = 0 w.p 1 !0 N !1  where DN is dened by (74).. 2DN. Proof. We can consider (24) as a penalty method to solve the following constrained minimization problem min jj2 subject to  E Nlim !1 VN () = V () = tr 0 since we know from Theorem 5.1 that V ()  tr 0 and obtain its minimum for  2 DT (w.p. 1). The theorem in 18, p. 368] then states that the sequence of ^ ^1 = Nlim !1 N will converge to DN as  ! 0 which concludes the proof. 2 In the next section we will point out that a norm minimal realization is close to (in some cases identical) to realizations with low sensitivity to nite word length e ects occurring in computer implementations. The set of models M()  2 DN can also be characterized by a simple matrix equation with a balancing appearance. The following theorem will give the details and can also be found in 7] wherein a complete proof is given. The theorem is also mentioned in 25]. The partial proof given here is however di erent. Theorem 8.2 Let the model M() from the fully parametrized model struc B  C  K . Then ture (18) have matrices A AAT + B B T + K K T = AT A + C T C if and only if.  2 DN 29.

(40) Proof.. Let a model M()  2 DT be represented by the matrices  B  C  K corresponding to a parameter A B C K . The set of matrices A value in the set DN can be characterized via a nonsingular matrix T  B = T;1B C = C T  K = T;1K A = T;1AT where T is given by  ;1 ;1B T ;1K 2 T AT T  J (T ) =  CT 0 0 F T = arg min J (T ) T 6=0 Since k  kF denotes the Frobenius norm we can express J (T ) as J (T ) = tr(T ;1ATT T AT T ;T + T ;1BB T T ;T + T ;1KK T T ;T + T T C T CT ) = tr(T ;T T ;1ATT T AT + T ;T T ;1 BB T + T ;T T ;1KK T + TT T C T C ) The function J (T ) does not have a unique minimum T as already stated in Lemma 8.2. If we let P = TT T and let J(P ) = tr(P ;1APAT + P ;1BB T + P ;1KK T + PC T C ) a minimizing matrix P will then give us the whole set of minimizing matrices T. The existence and uniqeness of a minimizing P is shown in 7]. The minimizing P will then satisfy d J(P ) = 0 dP We thus have according to Lemma A.1-A.3 in appendix that  T P ;1 ; P ;1BB T P ;1 0 = AT P ;1A ; P ;1APA ;P ;1 KK T P ;1 + C T C (75) Take a T which satises TTT = P and insert in (75) and multiply the equation with TT from left and T from right. 0 = TT AT T;T T;1AT ; T;1ATTT AT T;T ; T;1BB T T;T ;T;1 KK T T;T + T T C T C T = AT A + C T C ; AAT ; B B T ; K K T which proves the \if" part. For the \only if" part we refer to 7]. 2 30.
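As a small numerical illustration of Theorem 8.2, the sketch below evaluates the residual of the condition A Aᵀ + B Bᵀ + K Kᵀ = Aᵀ A + Cᵀ C and checks it on a diagonal SISO realization of the form considered in Theorem 8.3 below (A symmetric, C = Bᵀ). The example numbers are arbitrary and the code is mine, not the report's.

```python
import numpy as np

def norm_minimality_residual(A, B, C, K):
    """Frobenius-norm residual of the condition of Theorem 8.2:
       A A' + B B' + K K' - (A' A + C' C).
    It is zero (up to rounding) exactly for norm minimal realizations."""
    lhs = A @ A.T + B @ B.T + K @ K.T
    rhs = A.T @ A + C.T @ C
    return np.linalg.norm(lhs - rhs, 'fro')

# Example: G(z) = sum_i alpha_i / (z - lambda_i) with alpha_i >= 0,
# realized with A = diag(lambda), B = C' = sqrt(alpha), K = 0 (output-error case).
lam = np.array([0.9, 0.5, -0.3])
alpha = np.array([1.0, 0.2, 0.7])
A = np.diag(lam)
B = np.sqrt(alpha).reshape(-1, 1)
C = B.T
K = np.zeros((3, 1))
print(norm_minimality_residual(A, B, C, K))   # essentially zero: realization is norm minimal
```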

(41) 8.1 Norm minimal realizations and sensitivity. The norm minimal realizations are partly related to balanced realizations and thus low sensitivity. We have the following theorem: Theorem 8.3 Assume that a single input single output transfer function G(z) of order n has the following structure n  X i z ; i  i 2 R + (R + = 0 1) i 2 R  i = 1 : : : n i=1 Then all norm minimal state-space realizations M()  2 DN will have the controllability Gramian Wc and the observability Gramian Wo equal. Proof. Let a fully parametrized model M() represent the system and let the parameter  give the following matrices. A = diag(i) i = 1 : : : n B = p1 p2  : : :  pn]T C = p1  p2 : : :  pn]. G(z) =. which gives us. n  X ; 1 G(z) = C (zI ; A) B = z ;i i=1. i. and thus  2 DT . Since A = AT and C T = B this realization trivially satisfy. AAT + BB T = AT A + C T C which shows that that the realization is norm minimal and  2 DN . Furthermore from (62-63) we automatically have Wc = Wo. From Lemma 8.2 we know that all other norm minimal realization can be reached via orthonormal transformation matrices T . Since the transformation matrices are orthonormal we will have Wc = Wo for all norm minimal realizations. 2 The result shows that if we identify systems with the structure in Theorem 8.3 using a fully parametrized state space model together with the criterion (23) and # = 0 we automatically obtain a minimum sensitivity realization. Although the class of systems which satisfy the conditions in 31.

(42) Theorem 8.3 is very small we believe there is a connection between low sensitivity and norm minimal realizations for the whole class of systems (4). To conclude this section we will give some numerical values on the measure S for di erent kinds of state-space realizations of the same system.. Example 8.1 Consider a discrete time Butterworth low pass lter of order n = 4 with cuto frequency =2 rad/s. A state space realization of the lter can be obtained in MATLAB 12] with the following command. A B C D] = butter(4 0:5). (76). If we evaluate S in (65) for some di erent realizations previously mentioned we obtain the following results. Realization Sensitivity S Original from (76) 8.11 Observer canonical form 11.37 Balanced realization 5.20 Norm-minimal realization 5.28 The balanced realization obtains a minimal value which was proven in Theorem 6.1 since the balanced realization satises Wc = Wo. The near optimal value obtained for the norm minimal realization shows the previously stated connection between norm-minimal realizations and low sensitivity. As expected the observability canonical form has the least favorable sensitivity of the four di erent realizations. 2. 9 Examples In this section we will present three examples illustrating the previously discussed properties of the proposed identication algorithm and model structure.. Example 9.1 In 21] Appendix A.2 a turbo generator model with two. inputs, two outputs and six states is presented. We used the continuous time model to generate an estimation data set and a validation set using random binary (;1 +1) signals as inputs. The sample time was set to 0.05. The 32.

(43)
Model   FM       IM ν₆={1,5}   IM ν₆={2,4}   IM ν₆={3,3}   IM ν₆={4,2}   IM ν₆={5,1}
V̂       0.0056   0.0070        0.0123        0.0177        0.0056        0.0059

Table 1: V̂ evaluated for the fully parametrized model (FM) and the five identifiable models (IM) from Example 9.1.

estimation data set and the validation set were 500 and 300 samples long, respectively, with the output sequences corrupted by zero-mean Gaussian noise with covariance Λ₀ = 0.0025 I₂ to make the system of output-error type. The estimation was performed on the estimation data according to Algorithm 7.1, using a fully parametrized model with six states. This gives a total of 64 parameters to estimate, which can be compared with the 40 parameters of any identifiable model structure (15). The regularization parameter δ was set to 10⁻⁵, which makes the numerics well conditioned. To assess the quality of the estimated model we evaluate

V̂ = (1/N) Σ_{t=1}^{N} |y(t) − ŷ(t|θ̂_N)|²

using the independent validation set. To compare the proposed parametrization of the model with the conventional identifiable parametrizations, we identified the five different possible models corresponding to the five sets of multi-indices ν₆. The results are given in Table 1, which clearly shows that the fully parametrized model (FM) is as good as the best identifiable model (IM). It is also interesting to note that all the other identifiable models perform significantly worse. □

Example 9.2 Consider a Butterworth filter of order n = 4 with a narrow passband [0.1, 0.11] rad/s. The corresponding state-space realization can be obtained in MATLAB [12] with the following command:

[A, B, C, D] = butter(2, [0.1 0.11])   (77)

(44)
Model   Fit
1       1.3·10⁻¹³
2       1.1·10⁻¹⁶

Table 2: The performance of the two estimated models in Example 9.2. Model 1: identifiable state-space model. Model 2: fully parametrized state-space model.

Let {u(t)} be a random binary input signal consisting of 400 samples and let {y(t)} be the output from the filter (77) using {u(t)} as the input. Let the estimation data set Z⁴⁰⁰ consist of {u(t)} and {y(t)}. Based on the data set Z⁴⁰⁰ and two different model structures, two models were estimated:

1. An identifiable state-space model of order four.
2. A fully parametrized state-space model of order four.

Model 1 was estimated by first initializing the model with an ARX estimate and then numerically minimizing (20) using the standard commands canstart and pem from [14]. Model 2 was estimated using Algorithm 7.1. The minimizations were run until the minima were reached for both models. Table 2 shows the performance of the two estimated models simulated with {u(t)} as the input signal. The value Fit is the RMS value of the deviations between the true output and the model output. Since the two models theoretically have the same ability to model the underlying system, the difference in performance must be attributed to numerical properties. The flexibility of the second model not only allows the true system to be correctly described but also offers a degree of freedom to obtain better numerical properties. Another possibility is that Model 1 reached a local minimum. □

Example 9.3 In [32] an identification of a glass oven using real data is given as an example to illustrate the subspace identification algorithm presented in that paper. The system has 3 inputs and 6 outputs and is modeled as a

(45)
Model                  Simulation   1-step prediction
Subspace [32]          0.536        0.108
Full param. with K     0.547        0.135
Full param. K = 0      0.511

Table 3: Performance P of the estimated models in Example 9.3.

system of order n = 5. Using the same data and our fully parametrized model structure we identified two different models: one output-error model, i.e. K = 0, and one model where K was also estimated. In Table 3 the prediction error measure P of the two estimated models is compared with the model obtained in [32], where

P = (1/6) Σ_{k=1}^{6} sqrt( Σ_{t=1}^{N} (y_k(t) − ŷ_k(t))² / Σ_{t=1}^{N} y_k(t)² )

is evaluated on independent data. The fully parametrized OE model has the best simulation performance of the models. The other fully parametrized model is slightly worse than the subspace model. This example shows that identification using real data works quite well for the proposed model structure compared with the subspace identification method. □
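For reference, the measure P above, as reconstructed here (my reading of the formula: per-output ratio of the error and output 2-norms, averaged over the six outputs), can be evaluated as follows; the function name is an illustrative assumption.

```python
import numpy as np

def performance_P(y, yhat):
    """Measure P of Example 9.3 (as reconstructed above): the average over
    the p outputs of the RMS error normalized by the RMS of each output.
    y and yhat have shape (N, p)."""
    num = np.sum((y - yhat) ** 2, axis=0)   # per-output squared error
    den = np.sum(y ** 2, axis=0)            # per-output signal energy
    return np.mean(np.sqrt(num / den))
```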

(46) 10 Conclusions

In this report we have introduced a state-space model structure which is fully parametrized, so that the individual parameters are not identifiable. The use of this model structure for identification of multivariable systems relieves us from the search through all possible identifiable parametrizations and allows parametrizations which are well conditioned. In order to minimize the prediction error for the fully parametrized model with an efficient numerical method, regularization is introduced. It is shown that the use of regularization automatically gives the same statistical properties as an identifiable model structure, even though the fully parametrized model structure contains more parameters to estimate.

(47) An identification algorithm is presented which yields, in the limit, a balanced realization of the true system. Some material is reviewed which shows that a balanced realization has low sensitivity with respect to finite word length effects. The proposed algorithm thus produces a numerically sound model of the underlying system. It is also shown that the use of a particular type of regularization gives a model which, in the limit, is norm minimal. A connection between low sensitivity and norm minimal realizations was also established.

(48) A Appendix. Lemma A.1 Let A 2 R nn and P 2 R nn and symmetric. Then. d tr(P ;1APAT ) = d tr(PAT P ;1A) dP dP T ; 1 ; = A P A ; P 1APAT P ;1 Proof. Denote Y = PAT P ;1A and let V = AT P ;1A which gives Y = PV . From 6, p. 64] we have n @p n @v X @yii = X ik ki v ki + pik @P k=1 @P k=1 @P and @vki = ;P ;1AE AT P ;1 ki @P together with @pik = E ik @P where Eik is a matrix with element i k equal to 1 and the rest equal to zero. This gives us n @yii = X ;1 T ;1 @P k=1(Eik vki + pik (;P AEkiA P )) which nally gives us n X n d tr(Y ) = X (Eik vki + pik (;P ;1 AEkiAT P ;1)) dP i=1 k=1 = AT P ;1A ; P ;1APAT P ;1 which concludes the proof. 2. Lemma A.2 Let B 2 R nm and P 2 R nn and symmetric. Then d tr(P ;1BB T ) = d tr(B T P ;1B ) dP dP = ;P ;1 BB T P ;1 37.

(49) Proof. Denote Y = B T P ;1B . From 6, p. 65] we have which directly gives us. @yii = ;P ;1 BE B T P ;1 ii @P. n d tr(Y ) = X ;P ;1 BEii B T P ;1 dP i=1 = ;P ;1 BB T P ;1. 2. Lemma A.3 Let C 2 R pn and P 2 R nn and symmetric. Then d tr(C T CP ) = d tr(CPC T ) dP dP = CT C. Proof. Denote Y = CPC T . From 6, p. 65] we have which gives us. @yii = C T E C ii @P n d tr(Y ) = X C T EiiC = C T C dP i=1. 38. 2.
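The matrix-derivative formulas in Lemmas A.1-A.3 can be sanity-checked numerically. The sketch below (mine, not part of the report) compares the directional derivative of tr(P⁻¹ A P Aᵀ) along a symmetric perturbation, as predicted by Lemma A.1, with a central finite difference; the random matrices are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
M = rng.standard_normal((n, n))
P = M @ M.T + n * np.eye(n)                 # a symmetric positive definite P
Delta = rng.standard_normal((n, n))
Delta = Delta + Delta.T                     # symmetric perturbation direction

f = lambda P: np.trace(np.linalg.inv(P) @ A @ P @ A.T)

# Gradient according to Lemma A.1
Pinv = np.linalg.inv(P)
G = A.T @ Pinv @ A - Pinv @ A @ P @ A.T @ Pinv

eps = 1e-6
numeric = (f(P + eps * Delta) - f(P - eps * Delta)) / (2 * eps)   # central difference
analytic = np.trace(G @ Delta)                                    # directional derivative
print(abs(numeric - analytic))              # should be numerically negligible
```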

(50) References

[1] J. E. Dennis and R. B. Schnabel. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Prentice-Hall, Englewood Cliffs, New Jersey, 1983.
[2] J. C. Doyle, K. Glover, P. P. Khargonekar, and B. A. Francis. State-space solutions to standard H2 and H∞ control problems. IEEE Trans. on Automatic Control, 34(8):832-847, August 1989.
[3] N. R. Draper and R. C. Van Nostrand. Ridge regression and James-Stein estimation: Review and comments. Technometrics, 21(4):451-466, November 1979.
[4] K. Glover. All optimal Hankel-norm approximations of linear multivariable systems and their L∞-error bounds. International Journal of Control, 39(6):1115-1193, 1984.
[5] G. C. Goodwin. Some observations on robust stochastic estimation. In Proc. IFAC Identification and System Parameter Estimation, Beijing, PRC, 1988.
[6] A. Graham. Kronecker Products and Matrix Calculus With Applications. Ellis Horwood Limited, Chichester, England, 1981.
[7] U. Helmke. Balanced realisations for linear systems: a variational approach. SIAM J. Control and Optimization, 31(1):1-15, January 1993.
[8] M. Iwatsuki, M. Kawamata, and T. Higuchi. Statistical sensitivity structures with fewer coefficients in discrete time linear systems. IEEE Trans. on Circuits and Systems, 37(1):72-80, January 1989.
[9] T. Kailath. Linear Systems. Prentice-Hall, Englewood Cliffs, New Jersey, 1980.
[10] G. Li, B. D. O. Anderson, and M. Gevers. Optimal FWL design of state-space digital systems with weighted sensitivity minimization and sparseness considerations. IEEE Trans. on Circuits and Systems I: Fundamental Theory and Applications, 39(5):365-377, May 1992.
[11] G. Li and M. Gevers. Data filtering, reparametrization, and the numerical accuracy of parameter estimators. In Proc. 31st IEEE Conference on Decision and Control, Tucson, Arizona, 1992.
[12] J. Little and L. Shure. Signal Processing Toolbox. The MathWorks, Inc., 1988.
[13] L. Ljung. System Identification: Theory for the User. Prentice-Hall, Englewood Cliffs, New Jersey, 1987.
[14] L. Ljung. Issues in system identification. IEEE Control Systems Magazine, 11(1):25-29, January 1991.
[15] L. Ljung. A simple start-up procedure for canonical form state space identification, based on subspace approximation. In Proc. 30th IEEE Conference on Decision and Control, pages 1333-1336, Brighton, England, December 1991.
[16] L. Ljung, J. Sjöberg, and T. McKelvey. On the use of regularization in system identification. Technical Report LiTH-ISY-I-1379, Department of Electrical Engineering, Linköping University, S-581 83 Linköping, Sweden, August 1992.
[17] D. G. Luenberger. Canonical forms for linear multivariable systems. IEEE Trans. Automatic Control, AC-12:290, 1967.
[18] D. G. Luenberger. Linear and Nonlinear Programming. Addison-Wesley, Reading, Massachusetts, 1984.
[19] W. J. Lutz and S. L. Hakimi. Design of multi-input multi-output systems with minimum sensitivity. IEEE Trans. on Circuits and Systems, 35(9):1114-1121, September 1988.
[20] J. M. Maciejowski. Balanced realisations in system identification. In Proc. 7th IFAC Symposium on Identification & Parameter Estimation, York, UK, 1985.
[21] J. M. Maciejowski. Multivariable Feedback Design. Addison-Wesley, Wokingham, England, 1989.
[22] B. C. Moore. Principal component analysis in linear systems: controllability, observability, and model reduction. IEEE Trans. on Automatic Control, 26(1):17-32, 1981.
[23] C. T. Mullis and R. A. Roberts. Roundoff noise in digital filters: Frequency transformations and invariants. IEEE Trans. on Acoustics, Speech and Signal Processing, 24(6):538-550, December 1976.
[24] C. T. Mullis and R. A. Roberts. Synthesis of minimal roundoff noise fixed point digital filters. IEEE Trans. on Circuits and Systems, 23(9):551-562, September 1976.
[25] J. E. Perkins, U. Helmke, and J. B. Moore. Balanced realisations via gradient flow techniques. Systems & Control Letters, 14:369-379, 1990.
[26] J. Sjöberg. Regularization issues in neural network models of dynamical systems. Linköping Studies in Science and Technology, Thesis No. 366, LiU-TEK-LIC-1993:08, Department of Electrical Engineering, Linköping University, Sweden, 1993.
[27] T. Söderström and P. Stoica. System Identification. Prentice-Hall International, Hemel Hempstead, Hertfordshire, 1989.
[28] V. Tavsanoglu and L. Thiele. Optimal design of state-space digital filters by simultaneous minimization of sensitivity and roundoff noise. IEEE Trans. on Circuits and Systems, 31(10):884-888, October 1984.
[29] L. Thiele. Design of sensitivity and round-off noise optimal state-space discrete systems. Circuit Theory and Applications, 12:39-46, 1984.
[30] L. Thiele. On the sensitivity of linear state-space systems. IEEE Trans. on Circuits and Systems, 33:502-510, 1986.
[31] A. J. M. van Overbeek and L. Ljung. On-line structure selection for multivariable state space models. Automatica, 18(5):529-543, 1982.
[32] P. Van Overschee and B. De Moor. Two subspace algorithms for the identification of combined deterministic-stochastic systems. In Proc. 31st IEEE Conference on Decision and Control, Tucson, Arizona, pages 511-516, 1992.
