Order Estimation for Subspace Methods

(1)

DietmarBauer

Departmentof ElectricalEngineering

Linkoping University, SE-58183 Linkoping,Sweden

WWW: http://www.control.isy.l iu.s e

Email: Dietmar.Bauer@tuwien.ac.at

June 6,2000

REG

LERTEKNIK

AUTO

_{MATIC CONTR}

OL

LINKÖPING

Report no.: LiTH-ISY-R-2263

Submittedto Automatica

TechnicalreportsfromtheAutomaticControlgroupinLinkopingareavailablebyanonymousftp

(2)

(3)

DietmarBauer

Departmentof ElectricalEngineering

Linkoping University, SE-58183 Linkoping,Sweden

WWW: http://www.control.isy.l iu.s e

Email: Dietmar.Bauer@tuwien.ac.at

June 6,2000

Abstract

Inthispaperthe questionofestimatingtheorderinthecontextof subspacemethodsis

addressed. Three dierent approaches are presented andthe asymptoticproperties thereof

derived. Two of these methods are based on the information contained in the estimated

singularvalues,while thethirdmethodisbasedonthe estimatedinnovationvariance. The

case withobservedinputsistreatedaswellasthecasewithoutexogeneousinputs. Thetwo

methodsbasedonthe singularvaluesare showntobeconsistentunderfairly mild

assump-tions, while the sameresult for the third approach is onlyobtained ona generic set. The

formercanbeappliedtoLarimoretypeofproceduresaswellastoMOESPtypeofprocedures,

whereasthelatteris onlyappliedtoLarimoretypeofalgorithms. Thishasimplicationsfor

the estimation of the orderof systems, which are close to the exceptional set, as is shown

in anumerical example. All the estimation methods involve the choice ofa penalty term.

SuÆcientconditionsonthepenaltytermtoguaranteeconsistencyarederived.Theeectsof

dierentchoicesofthepenaltytermareinvestigatedinasimulationstudy.

Keywords: subspacemethods,orderestimation,asymptoticproperties

1 Introduction

There exists an extensive literature for order estimation algorithms forlinear, dynamical, state

space systems. Probablythe most important contribution canbe attributed to (Akaike, 1975)

forintroducing theinformation criteria. These criteriacomparethe model t ontheestimation

dataasmeasuredbyafunctionoftheestimatedinnovationvariancetosomepenaltyterm,which

punisheshighmodelorders. Inotherwords,thehighermodelorderisonlychosen,iftheincrease

intheaccuracyishigherthanacertainthreshold,whichdependsonthesamplesize. Alternatively

they canbeseen asa sequence of tests to identify themodel order,where the size of thetests

is adjusted to the sample size. The properties of these estimation methods are well studied

(Shibata,1980;Akaike,1975;Rissanen,1978)andtheeectsofthechoiceofthepenaltytermare

wellunderstood(seee.g.HannanandDeistler,1988,foracomprehensivediscussionoftheknown

properties). All these estimation methods however rely on the use of the maximum likelihood

estimate for the system for each order. Thus in practice a large number of systems has to be

estimated using numerical search procedures to optimise the likelihood for given system order,

leadingto asometimesprohibitiveamountofcomputations.

Parts of thiswork havebeen done whilethe authorwas holding a post-doc position at the Departmentf.

(4)

proposed quitesometimeago,there exist onlyfewreferencesdealingwith theestimationof the

order in thecontext of subspacemethods. Therst contribution seemsto be due to (Peternell,

1995). This method relies on theinformation oftheestimated canonicalcorrelations, which are

estimated in thesubspacemethods. This leadstoaveryeconomical(in termsof computations)

method, whichhasbeenshowntoleadtoalmostsure(a.s.) consistentestimatesunder theusual

assumptions. Seebelowformoredetailsonthis. Howeverithasbeenobserved,thatthismethod

seems to be relatively sensitive to the choice of certain user parameters in (Bauer, 1998). In

this thesis also an alternativeis analysed, which is a small adaptation of the criterion given in

(Peternell,1995),which seemstobelesssensitive. (Bauer,1998)introducesanothercriterionfor

theLarimore typeof procedures,which is much in thespirit of Akaikes informationcriteria, as

ituses theestimated innovationvariance. This three procedures willbepresentedandanalysed

below. Itwillbeclearfrom theproofshowever,thattheproposedestimationmethodisonlyone

possibility, as the main problem boils down to estimate therank of amatrix. Forthis problem

there arewellestablishedtesting methods, which howevermostlyrelyonthedistribution of the

matrix, whose rank is estimated. Such procedures are presented in (Sorelius, 1999): There the

rankofthecrucialmatrixisfoundbyincreasingthedimensionsofthematrixbyonein onestep

and performing atest onthenewly introduced smallestsingularvalue. This procedure however

has thedisadvantageof simultaneous tests, sincein practice asequence of testswill haveto be

performed,wherethenumberofthetestsandthedependencyofthetestsisunknownatthestart

of the tests. For this reasonit is believed, that estimating the order is more appropriate than

testingfortherankin thepresentsetup.

Theorganisation of the paper is asfollows: In thenext section the model set is stated and

themainassumptionsarepresented. Theestimationalgorithmsarebrie yreviewedinsection3,

wherealso thevariousorderestimation algorithmsarediscussed. Section 4then statesthemain

resultsofthispaperand providesproofsforthem. Asimulationstudyisperformedin section5.

Finallysection6concludes.

2 Model set an assumptions

Inthis paperlinear,nite dimensional, discretetime, time invariant, statespace systemsof the

form x t+1 = Ax t +Bu t +K" t y t = Cx t +Du t +" t (1) wherey t 2R s

denotes theobservedoutput process,u

t 2R

m

denotes theobservedinputprocess

and "

t 2 R

s

the unobserved white noise sequence. x

t 2 R

n

is the state sequence. Here the

true order of thesystem is denoted by n. Thematrices A 2 R nn ;B 2 R nm ;C 2R s n ;D 2 R sm ;K 2R ns

determinethesystem. Inthecasean inputdelayis postulatedD is restricted

to be zero. The system is assumed to be stable, i.e. all eigenvalues of A are assumed to lie

inside the unit circle, and strictly minimum-phase, i.e. theeigenvaluesof A KC are assumed

to lieinside theunit circle. Thesystemmatrices correspondto apairof transferfunctions: Let

k(z)=I+zC(I zA) 1

Kandletl(z)=D+zC(I zA) 1

B,wherezdenotesthebackwardshift

operator. Furthermore letM

n

denote theset of all pairsof transferfunctions (k;l) that permit

astatespacerepresentationof theform (1)fullling thestabilityandthe strictminimum-phase

assumption.

Thewhitenoise"

t

is assumedto bean ergodicmartingale dierencesequencesatisfying the

followingconditions: Ef" t jF t 1 g=0; Ef" t " 0 t jF t 1 g==E" t " 0 t >0 Ef" t;a " t;b " t;c jF t 1 g=! a;b;c ; Ef" 4 t;a g<1 (2)

Here E denotes expectation, F

t

denotes the -algebraspanned by (y

s

;s t) and "

t;a

denotes

thea-thcomponentofthevector"

t

(5)

in (Hannan and Deistler, 1988, Theorem 4.3.2). Corresponding to the input two dierent sets

of assumptions will be introduced for the Larimore type of procedures and the MOESP type of

procedures.

Assumptions1(Larimore type of procedure) The process(u

t

;t2Z)islteredwhitenoise

ofthe formu t = P 1 j=0 K u (j) t j ,where t

isanergodic, martingale dierencesequencefullling

the assumptions stated in equation (2) and being independent of "

t , and where kK u (j)k c u j u for some 0<c u <1;0< u <1. Furthermore u (!)=( P 1 j=0 K u (j)e i!j )( P 1 j=0 K u (j)e i!j ) 0 is assumedtofulll0<cI u cI<1for <!.

Assumptions2(MOESP type ofprocedures) Theinputprocess (u

t ;t2Z)isoftheformu t = v t + P h j=1 c j e i j t wherev t

fulllsassumptions1andthe randomvariablesc

j 2R

m

havezeromean

andnitevariancesandaresuchthatthecorrespondingprocessu

t

isrealvalued. Furthermorethe

process u

t

is assumedto be persistently exciting of order (tobe specied later) in the sense of

(Ljung,1999).

NotethattheassumptionsfortheinputsintheLarimoreprocedurearemoresevere. Thereason

forthisliesinthefact,thatforthisclassofproceduresanecessaryconditionforconsistencyisthat

theintegerparameterptendstoinnity(seebelowfordetails). Thisamountstotheconditionon

theinputtobeofinnitepersistency. FortheMOESPtypeofproceduresnotethattheassumptions

areidenticaltotheassumptionsimposedin theproofoftheasymptoticnormalityin (Bauerand

Jansson, 2000). Note however, that this only occurs, since no minimal conditionsare given in

this paper. In fact it will be clear from the proof given below, which properties for the input

signalarereallyneeded inthis respect. Alsonotethat theconditionsgiventhere permit certain

pseudostationarysequences,i.e. sumsofsinusoids. Howeverinthiscaseitisnecessarytoimpose

thenecessaryrestrictions(i.e. theexistenceofcertainlimits,whichappearintheproof)directly

onthesequenceratherthanusingsuÆcientconditionsontheunderlyingrandomvariables.

3 Estimation algorithms

Inthissectionabriefreviewofthemainstepsintheconsideredsubspaceproceduresisgivenand

theestimationalgorithmsaremotivated. Foramoredetaileddescriptionofsubspacemethodssee

(Larimore,1983;Verhaegen,1994)or (Bauer,1998,Chapter3). LetY + t;f =[y 0 t ;y 0 t+1 ;;y 0 t+f 1 ] 0 andletU + t;f andE + t;f

respectivelybeconstructedanalogously usingu

t and " t respectivelyin the place of y t . Let Z t;p = [y 0 t 1 ;u 0 t 1 ;;y 0 t p ;u 0 t p ] 0

. Here f and p are two integer parameters,

whichhaveto bechosenbythe user. Seebelow forassumptionson thechoice ofthese integers.

Thenitfollowsfromthesystemequations(1)that

Y + t;f =O f K p Z t;p +U f U + t;f +E f E + t;f +O f (A KC) p x t p Here O 0 f = [C 0 ;A 0 C 0 ;(A f 1 ) 0 C 0 ] and K p = [[K ;B KD];(A KC)[K ;B KD];;(A KC) p 1 [K ;B KD]]. Further U f and E f

are block Toeplitz matrices containing the impulse

responsesequences. Theactualformofthesetwomatricesisofnoimportanceforusandthuswe

refertotheoriginalarticlesfordetails. Thisequationbuildsthebasisforallsubspacealgorithms,

whichcanbedescribedasfollows:

1. RegressY + t;f ontoU + t;f andZ t;p toobtainanestimate ^ z ofO f K p andanestimate ^ u ofU f

respectively. Due tonitesampleeects ^

z

willtypicallybeoffull rank.

2. Forgivennndaranknapproximationof ^ z byusingtheSVDof ^ W + f ^ z ^ W p = ^ U n ^ n ^ V 0 n + ^ R. Here ^ n

denotesthediagonalmatrixcontainingthelargestnsingularvaluesin decreasing

order. ^

U

n

contains the corresponding left singular vectors as columns and ^

V

n

the

corre-spondingrightsingularvectors. Finally ^

(6)

leadsto anapproximation ^ O f ^ K p =( ^ W f ) 1 ^ U n ^ n ^ V 0 n ( ^ W p ) 1

. Theactual decompositionof

thismatrixinto ^ O f and ^ K p

hasnoin uenceontheestimatedtransferfunctions.

3. Usingtheestimates ^ O f ; ^ K p and ^ u

obtainthesystemmatrixestimates.

In the second step an order has to be specied. Also the matrices ^ W + f and ^ W p have to be

provided by the user. In the literature several dierent choices have been proposed. For the

matrix ^

W

p

thechoicesarerestrictedto( ^ p ) 1=2 and( ^ ; p ) 1=2 ,where ^ p = 1 T P T t=p+1 Z t;p (Z t;p ) 0

denotes the sample variance of Z

t;p . Further ^ ; p = ^ 1 p ^ u;z ^ 1 u ^ u;z . Here ^ u denotes the samplecovarianceofU + t;f and ^ u;z

thesamplecovarianceofU + t;f andZ t;p . Correspondingto ^ W + f

two choices will be considered: ( ^ +; f ) 1=2 = ^ + y ^ y;u ^ 1 u ^ u;y

using obvious notation, where

y standsfor Y + t;f , and ^ W + f =[K W (i j)] i;j , where w(z) = P 1 j=0 K W (j)z j denotes a frequency weighting. K W (j)=0;j <0andK W

(0)is assumednonsingular. Furthermore w(z)is assumed

to bestable and strictlyminimum-phase. Theintuitionof thisspecialchoiceof theweightingis

to emphasizesomefrequencyrangevia specically designingw(z) to be aband pass lter. The

ideaofthisstepisessentiallytodiscriminatebetweenthenonzero'signal'singularvaluesandthe

noise containedin ^

R, which is in uencedby the weighting, sincethis scalesdiferent directions.

Usingtheinformation containedin theestimatedsingularvalueswill bethebasisfor twoof the

estimationmethods.

FortheLarimoretypeofmethodsalsoanorderestimationalgorithmwillbegiven,whichrelies

ontheestimated innovationvariance. Thusitisnecessarytogivemoredetailsontheestimation

of thesystemmatrices in this case. Note, that from step2 anestimate ^

K

p

isobtained. Thisis

used to estimatethe statesequenceas x^

t = ^ K p Z t;p . Let ha t ;b t i= 1 T P T f t=p+1 a t b 0 t . Inserting the

estimatedstateintothesystemequations(1)oneobtainsestimatesof(A;B;C ;D)fromtheleast

squaressolution: [ ^ A T ; ^ B T ] = h^x t+1 ;x^ t i h^x t+1 ;u t i h^x t ;x^ t i h^x t ;u t i hu t ;x^ t i hu t ;u t i 1 h ^ C T ; ^ D T i = hy t ;x^ t i hy t ;u t i h^x t ;x^ t i h^x t ;u t i hu t ;x^ t i hu t ;u t i 1

If adelay is postulated, then in thesecond least squares problemu

t

is omitted. Thematrix K

and theinnovation sequence are estimated from theresiduals ofthese equations asfollows: Let

^ " t =y t ^ C T ^ x t ^ D T u t . Then ^ =h^" t ;"^ t iand ^ K T =h^x t+1 ;^" t i ^ 1 .

Followingthediscussiongivenabovethereareacoupleofratherobviousalgorithmstoestimate

theorder. Thesewillbepresentedinthefollowing.

3.1 Using the information contained in the singular values

From standard theory it follows, that ^ X f;p = ^ W + f ^ z ^ W p

converges a.s. to the limit X

Æ = W + f O f K p W p , where W + f and W p

denote the a.s. limits of ^ W + f and ^ W p respectively. Here

convergenceoccursintheoperatornormactingon` 2

almostsurely,wherethematricesoccurring

are seenasoperators byadding zerosin the innitematrix representationcorresponding to the

operator. Thereforeit followsfrom the resultsof operator theory (see e.g.Chatelin, 1983)that

thesingularvaluesalsoconverge. Since X

Æ

hasranknonlytherstn singularvaluesofX

Æ are

nonzero,therest beingzero. Thereforetheproblem boilsdown totheassessmentofthe rankof

anoisy matrix. Theproblem gets complicated, sincethedistribution ofthe noiseactingon the

matrixishard toquantify. Thereforeweresortto estimationalgorithmsasopposed tomethods

ofobtainingtheorderviaasequenceoftests(cf.Sorelius,1999). Thesealgorithmssharetheidea

oftheinformationcriteriaofcomparingthesignicanceoftheinclusionof anothercoordinatein

(7)

NIC(n) = M X j=n+1 ^ 2 j +C(T)d(n)=T (3) SVC(n) = ^ 2 n+1 +C(T)d(n)=T (4)

Hered(n)=nm+ns+smdenotesthenumberofparametersofastatespacesystemofordern.

C(T)>0;C(T)=T ! 0isa penalty term,which will be described below in moredetail. In the

denitionM=min ffs;p(s+m)g,thenumberofestimatedsingularvalues. Theestimatedordern,^

say,isobtainedastheminimisingargumentofthesecriterionfunctions. NIChasbeenintroduced

andanalysedin(Peternell,1995). Inthedenition(Peternell,1995)usedadierentchoiceofd(n),

whichhowevercanbereformulatedtotintothepresentsetting. Also(Peternell,1995)onlydealt

withf andpxedandnite,whilethefollowingdiscussionholdsforgeneralchoices. SVChasbeen

proposed as arenementof NICin (Bauer,1998). Themain dierence liesin thefact,that NIC

usestheFrobeniusnormof thematrix ^

R , whereasSVCusesthetwonormtomeasure thesizeof

theneglectedsingularvalues. Forbothcriteriatheorder estimateisobtainedbyminimizing the

aboveexpression. Note,that theseorderestimationtechniquesdonotdependonwhetherMOESP

ortheLarimoretypeofmethodsisusedandthuscanbeusedinalltheseprocedures. Theauthor

wantsto stress, that these are just twoalgorithms, howevermany moreseem possible,since in

principleallthatisdoneistocomparethesizeof ^

Rmeasuredinsomenormtosomesamplesize

dependentpenaltyterm. Alsonote,thatthechoiceoftheweightingmatrices ^ W + f and ^ W p isvery

in uentialfor the outcome of theestimation, aswill be demonstrated in section 5. This might

indeed bedesirable,sincespecial weightingscanbegivenasomewhatheuristicinterpretationas

frequencyshapinglters(cf.McKelvey,1995). Inthiscaseitfollows,thattheweightingmatrices

serveasatooltostresstheimportantfrequenciesfortheidentication, andthusthesedirections

mightbeupweighed,whereasothernotso importantdirectionsaredownweighed.

Note, that bothcriterion functions canbe implemented with almost nocomputational load.

The singular valuesare estimated in the algorithms, therefore only the addition of the penalty

termandtheminimizationoverasmallrangeofintegershasto beperformed.

3.2 Using the estimated innovation covariance

Asecondintuitiveideawould betoestimatetheorderusingtheestimatedinnovationcovariance

in the Larimore type of procedures. Recall that given the state sequence of dimension n, say

^ x n

t

,the innovation varianceis estimatedas ^ n =hy t ^ C n T ^ x n t ^ D n T u t ;y t ^ C n T ^ x n t ^ D n T u t i. Here [ ^ C n T ; ^ D n T

]denotestheestimatesofC andD using theestimated statex^ n

t

. Thenitistemptingto

usethecriterionfunction usedalsointheinformationcriteriaasfollows:

IVC(n)=logdet ^

n

+C(T)d(n)=T (5)

where d(n)and C(T) areidentical tothedenition of SVCandNIC. Againthe orderestimateis

obtainedby minimizingthis function overtheintegers 0nminffs;p(s+m)g. Theauthor

wantstostress,thatthisisnotthestandardinformationcriterion,sincetheestimates ^

n

arenot

themaximumlikelihoodestimatesof theinnovation sequence. Infact itwillbeshown, thatthis

estimationalgorithmmayperformpoorin somesituations.

Fromacomputationalpointofviewthiscriterionisveryattractiveinthecaseofnoexogenous

inputs present in the read out equation, i.e. in the case y

t = Cx

t +"

t

, and additionally the

choice of the weighting ^ W p = ( ^ p ) 1=2

. In this case the choice ^ K n p = ^ V 0 n ( ^ W p ) 1 Z t;p leads to h^x n t ;x^ n t

i = I, i.e. the components of the state are orthogonal and thus the regressionscan be

performedindependently. TheestimationalgorithmthenamountstoestimatingthematrixC for

themaximalstatedimensionandthenonlyadditionsandmultiplicationshavetoperformed. Let

^ C max T = h ^ C n T ; ^ C n max T i then ^ n = ^ max + ^ C n max T ( ^ C n max T ) 0

(8)

InthecaseofexogenousinputspresentoradierentchoiceoftheweightingW

p

onthecontrary

each regressionhas to be performed separately. Note however, that normallythese will be low

dimensionalregressioningeneralandalsoofnottoobignumericalload. Itispossibletoimplement

thesubspaceprocedures such that onlytheestimated covariancesare used ratherthanthe data

itself. Inthiscasethenecessarycovarianceestimatesarealreadycalculatedandthusonlymatrix

inversionshaveto be calculated. Otherwise also in this stepthe necessarycovariancescouldbe

calculated in order to minimize the number of necessary calculations. It will be shown in the

nextsection,thatalthoughthisprocedureseemsappealingonrstsight,itisnotarecommended

procedure. Thusinthisrespecttheresultofthispaperisratherto show,thatusingthismethod

leadsto problems,which aresomewhatunexpected.

4 Main results

Inthissectionthepropertiesofthevariousestimationalgorithmswillbederived. Thediscussion

draws heavily from (Bauer, 1998; Peternell, 1995). Some results for the MOESPcase have been

presentedin(Bauer,1999).

Theresults are mainly based onthe followinglemmas: The rstdeals with the accuracyof

theestimation of samplecovariancesunder thegivenassumptions onthesystemand theinput.

ThesecondonedealswiththelinearisationoftheSVDorSVDrelatedquantities,which willbe

ofimportance mainlyfortheSVCcase.

Lemma4.1 Let (y

t

;t 2 Z) be generated by a system of the form (1), where the noise fullls

the assumptions of section 2. Let ^

z;z (j) = T 1 P T j j=1 z t z 0 t j and let z;z (j) = Ez t z 0 t j , where z t =[y 0 t ;u 0 t ] 0 . Furthermore letH T =o((logT) a )for some0<a<1. Ifu t

fulllsassumptions 1then

max jjjHT k^ z;z (j) z;z k 2 =O(Q T ) (6) If u t

onlyfulllsthe assumptions 2,thenthe statement istruefor H

T

=M<1.

This lemma follows from (Hannan and Deistler, 1988, Theorem 5.3.2 and Chapter 4). The

lemma providesrelativelysharpbounds forthe estimationerror ofthe covariancesequences. In

factitfollowsfromthelawoftheiteratedlogarithmfortheestimatedcovariancesequences,that

except for the exact evaluation of the constant involved in the O(Q

T

) statement the bound is

tight.

Lemma4.2 (Chatelin) Let T

T

denote a sequence of symmetric, compact operators acting on

` 2

, which converges in norm to the operator T

Æ

. Then it follows, that the set of eigenvalues

of T

T

converges to the set of eigenvalues of T

Æ

. Also the corresponding eigenspaces converge in

the gap metric. Let P Æ

i

denote the orthonormal projection matrixonto the space of eigenvectors

corresponding totheeigenvalue Æ i ofT Æ andletP T i and T i

denotethecorresponding quantitiesof

T

. Herefor amultipleeigenvalue Æ i ofT Æ the quantitiesP T i

refer totheorthonormal projection

matrixontothe spacespannedbythe eigenvectorstoalleigenvaluesofT

T convergingto Æ i . Then P T i =P Æ i + X j6=i P Æ j T T T Æ Æ i Æ j P Æ i +o(kT T T Æ k) (7)

Thelemma implies, that the eigenspaces converge, and in particular the projections on the

eigenspacesconvergeatthesamerate,astheerrorin theapproximation.

It has been shown in (Bauer and Jansson, 2000), that the MOESP type of methods leads to

consistentestimatesforthesystemmatrixestimatesonlyingenericcases. ThereforealsotheSVC

criterion can onlyproduce consistent estimates in these cases. Let

u

(!) denote the spectrum

of the stationary process u

t

and assume, that theintegers f and p are used for theestimation.

(9)

n u n

estimatesofthepairoftransferfunctions. Itisalsoshown,thatthissetisgenericinM

n

. However

asthe examplegivenin (Janssonand Wahlberg, 1997) shows, the set is notidentical to M

n in

general. Inthecaseminff;pg>3nithasbeenshownin(Chui,1997)thatinfactthisisthecase,

i.e. the consistency holds for everypair (k;l) 2 M

n

. In fact the suÆcientconditions stated in

(Chui,1997)aremuch sharper.

Theorem4.1 Lettheprocess(y

t

;t2Z)begeneratedbyasystemofthe form (1),wherethetrue

systemorder isequal ton

Æ

, andwherethe white noise process ("

t

;t2Z)fullls the assumptions

of section 2. If the input fullls the assumptions 1, then maxff;pg=o((logT) a

) isassumedfor

some a<1. In this casethe conditions C(T) >0;C(T)=T !0;C(T)=(fploglogT)!1 are

suÆcientfor the a.s. consistency ofthe orderestimate obtainedby minimizingSVC(n).

Ifthe inputfullls the assumptions2with =f +p 1, thenfor eachf andpthereexistsa

set U n (f;p; u (!);)M n

suchthat for (k;l)2U

n (f;p;

u

(!);) the SVCmethod leads to a.s.

consistentestimatesofthe orderunderthe assumptionC(T)>0;C(T)=T !0;C(T)=loglogT !

1. If (k;l) 62 U

n (f;p;

u

(!);) then consistency fails for the same choice of the penalty term

C(T), i.e. lim T!1 ^ n<n Æ a.s.

PROOF:Note,thatunderbothsetofassumptionstheerrorintheestimationoftherstf+p 1

covariances

z

(j)areof orderO(Q

T

)uniformelydueto theLemma4.1. Theestimationusesthe

singular values of ^ X f;p = ^ W + f ^ z ^ W p , which converges to X Æ = W + f O f K p W p

a.s. ashas been

shown e.g. in (Peternell et al., 1996). Here convergence is in operator norm in the embedding

` 2

. Considertheestimation errorin ^

z

rst: Introducethenotationha

t ;b t i=T 1 P T f j=p+1 a t b 0 t . Then h ^ z ; ^ u i =hY + t;f ; Z t;p U + t;f ih Z t;p U + t;f ; Z t;p U + t;f i 1

The estimation error in each entryof these matrices is of the order O(Q

T

) asfollowsfrom the

Lemma4.1togetherwith(HannanandDeistler,1988,Theorem6.6.11.),whichassuresthe

summa-bilityofthecolumnsoftheinverseuniformelyin f andp. Thusconsidertheweightingmatrices:

Recallthattheweightingmatricesarerestrictedtobeeitherdeterministicorchosenasthesquare

rootsofmatriceslikehY + t;f ;Y + t;f i hY + t;f ;U + t;f ihU + t;f ;U + t;f i 1 hU + t;f ;Y + t;f

i. Usingthesamearguments

ashave been used aboveshowsthat the estimation error in the entries of these matrices areof

order O(Q

T

). Therefore also the error in the positivedenite symmetric square root is of the

sameorder,ascanbeseenfrom aTaylorseriesexpansionofthesquareroot, which canbeused

todenethesymmetricsquarerootofanoperator.

Itthus follows,that ^ X f;p ! X Æ , where k ^ X f;p X Æ k 2 =O(Q T p

fp). Thereforethe singular

valuesconvergeat thesamerate. This shows, that underestimationof theorder is notpossible

asymptotically,if

n

Æ

>0,where

i

denotesthesingularvaluesofX

Æ ordereddecreasingly,as SVC(n) = ^ 2 n+1 + d(n)C(T) T = 2 n+1 +(^ 2 n+1 2 n+1 )+ d(n)C(T) T = 2 n+1 +O( p fpQ T + d(n)C(T) T )

Sincethesecondtermtendstozero,theminimumcannotbeattainedatn<n

Æ

. Inthecaseofthe

MOESPprocedureand (k;l)62U

n (f;p;

u

;)then-th singularvalueiszeroand thus consistency

fails, as follows from the same arguments given below, since in that case, the same arguments

show that theasymptotic statedimension isequalto the numberof nonzerosingularvalues for

thelimitingmatrix.

Thereforewehavetoshow,that thetrueordern

Æ willbepreferredton>n Æ asymptotically. Thusconsider ^ n+1 = k ^ W + f ^ z ^ W p ^ U n ^ n ^ V 0 n k 2 =k ^ U 0 ^ X f;p ^ U n ^ n ^ V 0 n k 2 k ^ U 2 ^ U 0 2 ^ X f;p U 2 U 0 2 X Æ k 2

(10)

Here ^ U = [ ^ U n ; ^ U 2;n ]; ^ U n 2 R ; ^ U 2;n 2 R and ^ U 2 = ^ U 2;n Æ

, which explains the last

inequality. Sincetheentriesof ^

X

f;p X

Æ

havebeenshownto beoforderO(Q

T )thenorm k ^ U 2 ^ U 0 2 ^ X f;p X Æ k 2 k ^ U 2 ^ U 0 2 k 2 k ^ X f;p X Æ k Fr =O(Q T p fp)

Therefore itremains to obtaina bound on k ^ U 2 ^ U 0 2 U 2 U 0 2 k 2

. Butthis followsfrom Lemma 4.2,

usingT T = ^ X f;p ^ X 0 f;p ;T 0 =X Æ X 0 Æ

andtheexponentialdecreaseinelementsintherowsofU

n

Æ (cf.

Bauer,1998). Fromthistheresultfollows,since

SVC(n Æ ) SVC(n) = ^ 2 n Æ +1 ^ 2 n+1 +(d(n Æ ) d(n)) C(T) T = C(T) T

[O(fploglogT=C(T))+d(n

Æ

) d(n)]<0

sincefploglogT=C(T)!0andd(n)>d(n

Æ ).

Note,thattheresultalsoprovestheconsistencyoftheNICcriterionforthesamerestrictions

on thepenalty term. Also note, that concerningthe penaltyterm only asuÆcient conditionis

given. Thebound isobtainedbyratherbruteforcearguments,bounding thetwonormwith the

Frobeniusnorm. Inthecase,wheref andptendtoinnityataratelogT itseemstobedesirable

tousealowerpenaltyterm,aswill bearguedinthenumericalexamples.

For the estimation criteria, which are based on an estimate of the innovation variance, the

situation is somewhat dierent. Note that this procedure only appliesfor the Larimoretypeof

procedures. Thereforeassume,thattheinputprocessfulllsassumptions1. Notethatifnodelay

ispostulated ^ n = hy t ;y t i hy t ; ^ x t u t ih ^ x t u t ; ^ x t u t i 1 h ^ x t u t ;y t i = hy t ;y t i hy t ;Z t+1;p+1 i ^ L 0 n ( ^ L n hZ t+1;p+1 ;Z t+1;p+1 i ^ L 0 n ) 1 ^ L n hZ t+1;p+1 ;y t i = hy t ;y t i b H 1 ^ L 0 n ( ^ L n ^ L 0 n ) 1 ^ L n b H 0 1 where ^ L n = 0 0 ^ K p (n) 0 I 0 ; ^ K p (n) = ^ V 0 n ( ^ W p ) 1 . Also ^ L n = ^ L n hZ t+1;p+1 ;Z t+1;p+1 i 1=2 and b H 1 =hy t ;Z t+1;p+1 ihZ t+1;p+1 ;Z t+1;p+1 i 0 =2 .

Firstconsidertheproblemofunderestimatingtheorder. Let

n

denotethelimitof ^ n . Then det[ n Æ ]<det[ n Æ 1

] isasuÆcientconditiontoavoidasymptoticunderestimationoftheorder.

ThisfollowsfromC(T)=T !0. Thisconditionhasbeenanalyzedinmoredetailin(Bauer,1998):

In the special case, where no input is present in the readout equation, i.e. D = 0 and where

^ W p =( ^ p ) 1=2

thisconditionis equivalentto C

0;nÆ

6=0,whereC

0;nÆ

denotes thelast columnof

thelimiting realization of thetrue system. It has beenfound, that this conditionis fullled on

ageneric subsetin some special cases. It is referredto theoriginal work for details. In general

howevertheimplicationsofthisconditionareunknown.

Nextconsiderthequestionofoverestimation: Fornn

Æ

oneobtains

n =

nÆ

andthusthe

estimationerrorhasto beanalysedmoreclosely. Accordingtotheequationaboveoneobtains:

^ n ^ nÆ = b H 1 ^ L 0 n Æ ( ^ L nÆ ^ L 0 n Æ ) 1 ^ L nÆ ^ L 0 n ( ^ L n ^ L 0 n ) 1 ^ L n b H 0 1

Usingthematrixinversionlemmaforpartitionedmatricesone obtains

^ L 0 n ( ^ L n ^ L 0 n ) 1 ^ L n = ^ L 0 n Æ ( ^ L nÆ ^ L 0 n Æ ) 1 ^ L nÆ + ^ P ? n Æ ^ L 0 n;2 ( ^ L n;2 ^ P ? n Æ ^ L 0 n;2 ) 1 ^ L n;2 ^ P ? n Æ where ^ P ? n Æ = I ^ L 0 n Æ ( ^ L nÆ ^ L 0 n Æ ) 1 ^ L nÆ and where ^ L 0 n = [ ^ L 0 n Æ ; ^ L 0 n;2

]. Since the second term is a

projectionoperator ithasnormone. Thustheessentialtermis b H 1 ^ P ? n Æ

(11)

since H 1 ![C ;D] L n Æ L nÆ and ^ P nÆ ! I L nÆ ( L n Æ L nÆ ) L n Æ =P nÆ

. The estimationerrors are

derivedusingthe uniformconvergenceof thecovarianceestimates: Themain emphasis herelies

on ^ K p K p

. It is straightforward to show, using Lemma 4.1 and Lemma 4.2 that there exists

a matrix ^ S T such that k ^ S T ^ K p K p k 2 =O(Q T p p). ApplyingLemma 4.1 to hy t ;Z t+1;p+1 ithis alsoimpliesk b H 1 [C ;D] L n Æ L 0 nÆ k 2 =O(Q T p p)as wellask ^ P ? nÆ P ? nÆ k 2 =O(Q T p p). Therefore consider IVC(n) IVC(n Æ ) = (d(n) d(n Æ )) C(T) T +log det ^ n =det ^ n Æ = (d(n) d(n Æ )) C(T) T +log det h I+( ^ n ^ n Æ ) ^ 1 nÆ i = (d(n) d(n Æ )) C(T) T +tr h ( ^ n ^ nÆ ) ^ 1 n Æ i = (d(n) d(n Æ )) C(T) T +O(Q 2 T p)

asfollowsfromaTaylorseriesexpansionoflog(1+x) . Thus

T C(T) (IVC(n) IVC(n Æ ))=d(n) d(n Æ

)+O(ploglogT=C(T))

Thisshowsthefollowingresult:

Theorem4.2 Lettheprocess(y

t

;t2Z)begeneratedbyasystemofthe form (1),wherethetrue

system(k;l)2M

n

. Let the noise fulll the assumptions of section 2 andlet the input fulll the

assumptions 1. Let thesystembeestimatedaccording tothe Larimore typeofprocedure.

Then theorderestimateobtainedastheminimizing argumentofIVC(n)usingapenaltyterm

C(T)>0;C(T)=T !0and C(T)=(ploglogT)!1 isa.s. consistent, ifdet [

n Æ 1 ]>det [ n Æ ]. If det[ n Æ 1 ]=det[ n Æ

] thenthe orderisunderestimateda.s. asymptotically.

Thetheoremleadstoapenaltyterm,whichhastobeslightlyhigherthanploglogTandtherefore

thechoice logT seemstobeareasonablechoicenotingthat loglogT issmall evenforrelatively

largeT,althoughnottheoreticallyjustiedfortheLarimoretypeofmethods,wheref andptend

to innity. This result isnew, as in (Bauer,1998)much moresevererestrictionson thepenalty

termhavebeenused. Therestriction det[

nÆ 1

]>det [

nÆ

]is worthbeinginvestigated further.

The fundamental dierence of the criterion IVC(n) as compared to the information criteria,

althoughformally denedanalogously,isthat theinnovationvarianceiscalculatedfortruncated

statesonly,ratherthannewlycomputedstates. Howevertherstncomponentsofx

t =K

1 Z

t;1

neednotbegeneratedbyastatespaceequationofordernforn<n

Æ

, i.e. thematrixK

1 might

not havetheshift invariance structure K

2:1 = A n K 1:1

forany matrix

A

n

of dimension nn,

using obviousnotation to denote submatrices. Therefore the criterion onlymeasures the direct

in uence of the statecoordinates on theprediction of y

t

, but it doesnot takeinto account the

dynamical generation of the state. Thus in the case, where a state does not contribute to the

presentoftheoutput,butonlytothefuture,itwill beneglectedaccordingtothecriteriongiven

above. Asthecitedresultsshow,thismightbeanextremelyraresituation. Themainconcernin

thisrespectis,thatinsituations,wherethecontributionissmall,thesamebehaviourisexpected,

i.e. manyobservationwillbeneededin ordertodetectthisstatecomponent. Inthenextsection

anexampleforthiswillbegiven.

5 Numerical Examples

In this section we present three dierent examples in order to compare the various proposed

order estimation methods. The candidate order estimationalgorithms will beSVC(n);IVC(n)

(12)

Est. Order <2 2 >2 <2 2 >2 IVC1 0.00 0.83 0.17 0.00 0.77 0.23 CCA IVC2 1.00 0.00 0.00 0.82 0.18 0.00 SVC1 0.00 0.96 0.04 0.00 0.93 0.07 SVC2 1.00 0.00 0.00 0.86 0.14 0.00 IVC1 0.82 0.03 0.15 0.68 0.06 0.26 N4SID IVC2 1.00 0.00 0.00 1.00 0.00 0.00 SVC1 0.00 0.66 0.34 0.00 0.40 0.60 SVC2 0.45 0.55 0.00 0.04 0.96 0.00

Table1: Heretheprobabilityofestimatingtheindicatedorderfor1000timeseriesofsamplesizeT

isshownfortwodierentweightingschemes(CCAandN4SID)and4dierentestimationmethods:

IVC(n)withC(T)=logT (IVC1)andC(T)=fplogT (IVC2)andSVCwithC(T)=logT (SVC1)

andC(T)=fplogT (SVC2). f =p=p^

AIC

hasbeenused.

in the N4SIDprocedure of thesystem identication toolbox of MATLAB(Ljung, 1991): The idea

hereis toformalisethesearchfora"gap"inthesingularvalues. Theorderisestimatedas

^ n=max n:log^ n > 1 2 (log^ 1 +log^ M )

i.e. thelargestinteger,which isgreaterthanthearithmetic meanofthelargestandthesmallest

nonzero estimated singular value. The three examples include a low order single input single

outputsystemwithoutexogenousinputs,wheretheorderis expectedtobeeasyto nd,another

SISOsystem withoutexogenous inputs, wherethe order is expected tobe hardto identify, and

nally aMIMO systemwith atwo-dimensionalobservedinput. Themain points ofinterest are

theeectsofthechoiceofthepenaltyterm,theweightingmatrices,theintegerparametersf and

p,andofcourseacomparisonbetweenthevarious procedures.

Asarstexampleconsiderthesystemdenedbythefollowingmatrices:

A= 0 1 0:7 0:5 ;K= 1:3 0:3 ;C=[1;0]

This system has Lyapunov balanced Gramianof roughly = diag(2:55;1:78). The system

polesareat0:250:7984iandthezerosat 0:40:4359i. Intheestimationweusetwodierent

weighting schemes: CCA uses ^ W + f = ( ^ + f ) 1=2

and the method using ^

W +

f

= I will be labelled

N4SID. The indices f = p = p^

AIC

are used. For each of the weighting schemes, the order is

estimatedusingfour dierentmethods: IVCandSVCwithC(T)=logT (denoted withIVC1and

SVC1 respectively in the sequel), IVC and SVC with C(T) = fplogT (denoted with IVC2 and

SVC2respectively). Note,that onlyforthelasttwoprocedurestheconsistencyresultshavebeen

derived. 1000 time series of length100and 1000 respectivelyhavebeengenerated and used for

estimation. Table1showstheresultsforT =100andT =1000respectively. Theyshow,thatthe

performanceoftheorderestimationproceduredependsheavilyontheweightingscheme: ForCCA

theIVC1methodworkswell,whereasitshowsproblemstoestimatethetrueorder,whenusedwith

N4SID.This isdueto thefact,that intheLyapunovbalancedrealizationofthetruesystem,the

entryC

1;2

isequalto 0:0146andthusclosetozero. Thisleadstoahighriskofunderestimating

theorderusingIVCtogetherwithN4SIDinthisexample. ForCCAitisobserved,thatashasbeen

expected, the higherpenalty term results in a high risk of underestimation, while reducing the

risk of overestimation. For N4SIDthe SVCmethod outperforms IVC and wealso observe, that

forC(T)=fplogT theaccuracyincreaseswiththesamplesize,whereasthelowerpenaltyterm

does notseem to lead to consistent order estimates. Inthe CCAcase it is seen, that the higher

penaltytermleadsto abig riskofunderestimating theorder. On theother hand fortheN4SID

(13)

^

W

f

T Methodd=2 Method d=4

IVC SVC NIC MOE IVC SVC NIC MOE

CCA 100 3.50 2.30 5.50 3.67 2.07 2.31 6.32 4.03 250 5.80 2.78 7.06 5.23 5.23 2.87 15.25 10.89 500 6.53 3.37 7.64 6.03 5.55 3.43 17.23 13.48 1000 7.53 4.03 7.74 6.65 6.63 4.11 17.90 15.14 lowpass 100 5.12 5.23 5.91 5.24 3.99 5.92 6.85 5.93 250 8.43 6.91 8.04 6.92 11.10 13.12 16.41 13.15 500 9.61 7.68 9.10 7.69 12.64 15.26 19.27 15.34 1000 10.43 8.51 10.10 8.52 14.32 16.68 21.09 16.76 highpass 100 3.97 5.60 6.30 5.64 2.57 6.43 7.21 6.50 250 6.93 7.55 8.75 7.69 6.89 14.50 17.83 15.11 500 8.12 8.38 9.92 8.61 7.82 16.83 20.89 17.66 1000 9.08 9.30 11.04 9.55 9.86 18.34 22.85 19.36

Table2: This table showsthe estimatedmeans of the various order estimation procedures asa

function ofsample sizeand dierentweightingmatrices ^

W +

f

. Here SVCand IVCusethepenalty

termC(T)=logT. Thetablehasbeenproduced using1000replicationsin eachcase.

aboutthechoiceof thepenaltyhasbeenfound. Bothchoicesused in thisexampleare heuristic

andnotmotivatedbyadditionalarguments. Atheoreticaljusticationseemstobeneeded.

Nextthe various order estimation procedures will be tested on an eight order system with

polesat z=0:8e 0:2i ;z =0:7e 0:3i ;z=0:5e 0:5i ;z=e 0:4i

andzerosat z=0:8e 0:1i

;z=

0:4755;z =0:1;z = 0:3;z = 0. Using this exampleextensive simulations comparing the four

orderestimationcriteriahavebeenperformed. Thesystemorderishardtoestimateandconsistent

estimatesoftheorderarenotthemaingoalinthisexample. TheLyapunovbalancedGramianis

equaltodiag(6:85;4:46;1:08;0:39;0:045;0:015;0:0004;0:0002)andthusthesystemisexpectedto

beapproximatedwellusing afourthordersystem. Acoupleofdierentsetupshavebeentested.

Inarstsimulationstudy1000replicationsoftimeseriesofsamplelengthsT =100;250;500and

T =1000havebeengenerated. Inthesubspacealgorithmsthreedierentweightingmatrices ^

W +

f

havebeenapplied: The CCAweights, alow-passlter, generatedusing a6-thorder butterworth

lter with cuto frequency0:5 and thecorrespondinghigh passlter havebeen incorporated.

f =p=d^p

AIC

hasbeenusedinallcases,whered=2andd=4aretried. Theaveragevaluesof

thecorrespondingorderestimatesaregivenin Table2. Itcanbeseen,thatthebehaviourofthe

variousalgorithmsis verydierentfordierentweightings ^

W +

f

. FortheCCAweightingNICgives

valuescloseto thetrueorderfor d=2,while itresultsin overlylargeestimatesford=4. Also

MOEseemsto suer from the biggerchoice of d,whereas bothSVC andIVCare relativelyrobust

withrespecttothischoice. Forthelowpassweightingallestimationproceduresshowatendency

to overestimate thesystemorder byafactorof twofor d=4,and alsotheresultsford=2are

largecompared to the CCAcase. Thesameresultalso holds for thehigh passweighting, except

that the estimates of NIC for d= 4are better than therespective estimates of theother order

estimationprocedures. Thisindicates,thatd=2isthefavourablechoice,ascomparedto d=4.

Theorder estimationprocedures arealso comparedto themoretraditionalmaximumlikelihood

basedinformationcriteria. InFigure 1ahistogramfortheestimatedorders usingIVC,SVC,AIC

andBICisgiven. HereT =100andf =p=2^p

AIC

havebeenused. Itcanbeobserved,thatBIC

tendsto choosen=4withahigh probability,while AICselectsrelativelylargeorders. Thetwo

subspaceorder estimates leadto slightlysmaller order estimates. Especially the resultsfor SVC

andBICseemto becomparable.

Howevertheorderestimatemightbeseentobenottheonlyinterestingindicator. Therefore

also the resulting estimates of the system are considered. The right plot of gure 2 shows the

(14)

1

2

3

4

5

6

7

8

9

10

11

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7 Estimated order, f=p, T=100, CCA

Prob.

Comparison of IVC and SVC to AIC and BIC

IVC

SVC

AIC

BIC

Figure1: InthisguretheorderestimatesobtainedbySVCandIVCarecomparedtotheestimates

obtainedintheMLframeworkusingAICandBIC.T =100andf =p=2^p

AIC

areusedtogether

withtheCCAweightingscheme. Theplotshavebeenobtainedusing100replications.

range [0;] obtained by CCA using d = 2 for the various procedures. Here the sample size is

equalto T =1000. Theguresshow,thattheIVCestimatesareworse,despitethefact,that the

averageestimated orderseemstobethebestforthisscenario. Thisisexplainedintherightplot

ofgure2,whichgivesthehistogramoftheorderestimates: IntheIVCcasethereisarelatively

highportionoflowordersystems(over50%arelessthann=4)occurs,aswellasahighnumber

ofoverlylargeestimates(35%largerthann=10). Thiscombinestoahighbias,whichshowsin

themeansquareerror. TheNICandMOEperformaboutequal,duetotheverysimilardistribution

oftheorderestimates. TheSVCmethodleadstoacomparablemeansquareerror,whilechoosing

smallerorders on average,which might be seenasan advantagefor control design. Theresults

ford=4aresimilarwithoneexception: Contrarytowhat hasbeensaidbefore,theSVCmethod

reactsmuch largerto thechange of d thanNICand MOEwith respect to themean square error.

Thisisduetothefact,thatthereisahigherpercentageoflowordersestimatedinthiscaseleading

toahighbiaserror. TheresultsforNICandMOEarenotthatsensitive,althoughonaveragemuch

higherordersarechosen. Thecorrespondingpicturesaregivenin gure3. Finallyalsotheeect

oftheweightingsonthemeansquareerroris discussed: Figure 4showstwoplots,where theleft

onerefersto T =100and theorder estimate accordingto IVC. Therightplot showsthe result

for MOE and T =1000. In both casesthere is somewhat surprisingly hardly any dierence due

to thechoiceof theweightingmatrices. Asimilar pictureholdsfortheother casesaswell. This

observationisincontrasttotheobservationsintheresultsofsimulationswithxedorder,where

anin uence of the weightingmatrices with respect to themeansquare error hasbeenobserved

(seee.g.Bauer,1998). Itisremarked,that thisobservationmightnotbetypical,but iscertainly

worthtobeinvestigatedfurther.

Asalastexamplealsoasystemwithobservedexogenousinputsistreated:Considerthesystem

givenbythefollowingmatrices:

A= 0:8 0:2 0:4 0:5 ;B= 0 1 1 0:5 ;K= 1:5 0 0:2 0:8 ;D= 0 0 0 0 andC=I 2

,thetwo-by-twoidentitymatrix. Thepolesofthesystemare0:7352and 0:4352,the

zerosof k

0

areat 0:6583and0:2583,whereasthezerosofl

0

areat 0:10000:9327i. Figure5

shows theprobabilities of the order estimatesfor dierent sample sizes, where the input is i.i.d

(15)

0

1

10 −1

10

0

10

1 MSE of estimated transfer function, d=2

Angular frequency /

π

NIC

SVC

IVC

MOE

3

7 >10

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9 Order

Histogram of estimated orders

Rel. frequ.

IVC

SVC

MOE

NIC

Figure2: TheseplotsshowtheresultofthesimulationforT =1000. Theleftpictureshowsthe

averagemeansquareerrorofthetransferfunctionat50equallyspacedfrequenciesintheangular

frequencyrange ! 2 [0;)obtained using thevarious order estimation procedures withthe CCA

weightingschemeandf =p=2^p

AIC

. Therightplotshowsthecorrespondinghistogramforthe

estimatedorders. Theplotshavebeenobtainedusing1000replications.

0

1

10 −1

10

0

10

1 MSE of estimated transfer function, d=4

Angular frequency /

π

NIC

SVC

IVC

MOE

3

7 >10

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1 Histogram of estimated orders, d=4

Order

Rel. frequ.

IVC

SVC

MOE

NIC

Figure3: TheseplotsshowtheresultofthesimulationforT =1000. Theleftpictureshowsthe

averagemeansquareerrorofthetransferfunctionat50equallyspacedfrequenciesintheangular

frequencyrange ! 2 [0;)obtained using thevarious order estimation procedures withthe CCA

weightingschemeandf =p=4^p

AIC

. Therightplotshowsthecorrespondinghistogramforthe

(16)

0

1

10

0 Angular frequency /

π

MSE of estimated transfer function, d=2, T=100

CCA

Low Pass

High Pass

0

1

10 −1

10

0 MSE of estimated transfer function, d=2, T=1000

Angular frequency /

π

CCA

Low Pass

High Pass

Figure 4: These plotsshowtheresultof thesimulationforT =100(leftpicture) andT =1000

(right picture). The pictures shows the average mean square error of the transfer function at

50 equallyspaced frequenciesin the angularfrequency range ! 2[0;) obtainedusing the IVC

procedure (leftplot)and theMOEprocedure (rightplot) forthe threedierent weightingscheme

andf =p=2^p

AIC

. Theplotsarebasedon1000replications.

has atendency to underestimate the order in allcases, NIC only for T = 100. In this example

SVCand NICshowthe best performance. Note, that for thecaseof additionalexogenous inputs

(Peternell,1995)suggeststoused(n)=(p(s+m) n)(fs n),whichessentiallyleadstoabigger

penalty termof the form n(sf +p(m+s))C(T). This examplewaschosen merelyto illustrate

that theproposed methods alsowork in thecaseofexogenous inputs. Itshould be notedagain,

thattheorderestimationtechniques,exceptfortheIVCmethod,applyequallytotheMOESPtype

ofmethods.

6 Conclusions

Inthis paperthequestion oforder estimation in the contextof subspacemethods has been

ad-dressed. Two newprocedures havebeenproposed and analysed. Lowerbounds on the penalty

terminorderfortheestimatestobeconsistenthavebeengiven. Alsothebehaviourofone

partic-ularprocedurein thecaseofinnite-dimensionalmodelswithsummablesecondordermodeshas

beeninvestigated. Themethodusingtheinnovationvariancehasbeenshowntosuerfromsevere

theoreticaldisadvantagesandthustheuseofthis intuitivelyappealingprocedure isdiscouraged.

FortheSVCcriteriontheadvantagescertainlyarethepossibilitytoobtainanestimateoftheorder

withalmostnocomputationalcosts,asonlythepropertiesoftheestimatesofthesingularvalues,

whichareestimated inanycase,areused. Inasimulationstudyithasbeendemonstrated,that

themethodsleadtoreasonableresults. Ithasbeenshown,thatSVCislesssensitiveto thechoice

ofthetruncationintegersf andpthanthecriterionintroducedby(Peternell,1995)orthemethod

usedinthesystemidenticationtoolboxof(Ljung,1991). HowevertheSVCcriterionalsocontains

a subjective component in the choice of the penalty term. In the simulations no clear picture

onhowthis shouldbechosencouldbeobtainedandnoheuristical motivation foranyparticular

(17)

0

1

2

4

10 >10

0

0.5

1

Prob.

T=100

IVC

SVC

MOE

NIC

0

1

2

4

10 >10

0

0.5

1

Prob.

T=500

IVC

SVC

MOE

NIC

0

1

2

4

10 >10

0

0.5

1

Prob.

T=1000

Estimated order, f=p, d=2, CCA, inputs white noise

IVC

SVC

MOE

NIC

Figure 5: This gureshowsthe resultof 1000 simulationruns using samplesizes T =100(top

row), T = 500 (middle row), T = 1000 (bottom row). Each picture shows the probabilities

of estimating the order using the four estimation algorithms: The weightings havebeen chosen

accordingto CCA. Thetruncation indices havebeen chosen asf = p= 2^p

AIC

. The inputs are

i.i.d. uniformelydistributedwhitenoisenormalizedto zeromeanandunit variance.

References

Akaike,H.(1975).Markovianrepresentationofstochasticprocessesbycanonicalvariables.SIAM

JournalofControl 13(1),162{172.

Bauer,D.(1998).SomeAsymptoticTheoryfortheEstimationofLinearSystemsUsingMaximum

LikelihoodMethodsorSubspaceAlgorithms.PhDthesis.TUWien.

Bauer,D. (1999). Orderestimation in thecontextof MOESPsubspaceidentication methods. In:

Proceedings of the ECC'99Conference, Karlsruhe,Germany.

Bauer, D. and M.Jansson (2000).Analysis of theasymptoticpropertiesof theMOESP typeof

subspacealgorithms.Automatica36(4),497{509.

Chatelin,F.(1983).Spectral Approximation ofLinearOperators.AcademicPress.

Chui,N. (1997).SubspaceMethods andInformativeExperimentsforSystemIdentication. PhD

thesis.CambridgeUniversity.

Hannan,E.J.andM.Deistler(1988).TheStatisticalTheoryof LinearSystems.JohnWiley.New

York.

Jansson,M.andBoWahlberg(1997).Counterexampletogeneralconsistencyofsubspacesystem

identicationmethods.In: Proceedings of SYSID'97.Fukuoka,Japan.pp.1677{1682.

Larimore,W. E. (1983).System identication, reduced order ltersand modeling viacanonical

variateanalysis.In: Proc.1983Amer.ControlConference2.(H.S.RaoandP.Dorato,Eds.).

Piscataway,NJ.pp.445{451. IEEEServiceCenter.

Ljung,L. (1991).SystemIdentication Toolbox,User's Guide.TheMathWorks.

(18)

thesis.Dept. ofElectr.Eng.,Linkoping.

Peternell,K.(1995).IdenticationofLinearDynamicSystemsbySubspaceandRealization-Based

Algorithms.PhDthesis.TUWien.

Peternell, K.,W. Scherrerand M.Deistler(1996).Statisticalanalysis ofnovelsubspace

identi-cationmethods. SignalProcessing52, 161{177.

Rissanen,J.(1978).Modelingbyshortestdatadescription.Automatica14, 465{471.

Shibata, R. (1980). Asymptotically eÆcient selection of the order of the model for estimating

parametersofalinearprocess.Ann. Statistics8(1),147{164.

Sorelius, J. (1999).Subspace-Based Parameter EstimationProblems in Signal Processing. PhD

thesis.UppsalaUniversity.

Verhaegen,M.(1994).Identication ofthedeterministicpartofMIMOstatespacemodelsgiven