DietmarBauer
Departmentof ElectricalEngineering
Linkoping University, SE-58183 Linkoping,Sweden
WWW: http://www.control.isy.l iu.s e
Email: Dietmar.Bauer@tuwien.ac.at
June 6,2000
REG
LERTEKNIK
AUTO
MATIC CONTR
OL
LINKÖPING
Report no.: LiTH-ISY-R-2263
Submittedto Automatica
TechnicalreportsfromtheAutomaticControlgroupinLinkopingareavailablebyanonymousftp
DietmarBauer
Departmentof ElectricalEngineering
Linkoping University, SE-58183 Linkoping,Sweden
WWW: http://www.control.isy.l iu.s e
Email: Dietmar.Bauer@tuwien.ac.at
June 6,2000
Abstract
Inthispaperthe questionofestimatingtheorderinthecontextof subspacemethodsis
addressed. Three dierent approaches are presented andthe asymptoticproperties thereof
derived. Two of these methods are based on the information contained in the estimated
singularvalues,while thethirdmethodisbasedonthe estimatedinnovationvariance. The
case withobservedinputsistreatedaswellasthecasewithoutexogeneousinputs. Thetwo
methodsbasedonthe singularvaluesare showntobeconsistentunderfairly mild
assump-tions, while the sameresult for the third approach is onlyobtained ona generic set. The
formercanbeappliedtoLarimoretypeofproceduresaswellastoMOESPtypeofprocedures,
whereasthelatteris onlyappliedtoLarimoretypeofalgorithms. Thishasimplicationsfor
the estimation of the orderof systems, which are close to the exceptional set, as is shown
in anumerical example. All the estimation methods involve the choice ofa penalty term.
SuÆcientconditionsonthepenaltytermtoguaranteeconsistencyarederived.Theeectsof
dierentchoicesofthepenaltytermareinvestigatedinasimulationstudy.
Keywords: subspacemethods,orderestimation,asymptoticproperties
1 Introduction
There exists an extensive literature for order estimation algorithms forlinear, dynamical, state
space systems. Probablythe most important contribution canbe attributed to (Akaike, 1975)
forintroducing theinformation criteria. These criteriacomparethe model t ontheestimation
dataasmeasuredbyafunctionoftheestimatedinnovationvariancetosomepenaltyterm,which
punisheshighmodelorders. Inotherwords,thehighermodelorderisonlychosen,iftheincrease
intheaccuracyishigherthanacertainthreshold,whichdependsonthesamplesize. Alternatively
they canbeseen asa sequence of tests to identify themodel order,where the size of thetests
is adjusted to the sample size. The properties of these estimation methods are well studied
(Shibata,1980;Akaike,1975;Rissanen,1978)andtheeectsofthechoiceofthepenaltytermare
wellunderstood(seee.g.HannanandDeistler,1988,foracomprehensivediscussionoftheknown
properties). All these estimation methods however rely on the use of the maximum likelihood
estimate for the system for each order. Thus in practice a large number of systems has to be
estimated using numerical search procedures to optimise the likelihood for given system order,
leadingto asometimesprohibitiveamountofcomputations.
Parts of thiswork havebeen done whilethe authorwas holding a post-doc position at the Departmentf.
proposed quitesometimeago,there exist onlyfewreferencesdealingwith theestimationof the
order in thecontext of subspacemethods. Therst contribution seemsto be due to (Peternell,
1995). This method relies on theinformation oftheestimated canonicalcorrelations, which are
estimated in thesubspacemethods. This leadstoaveryeconomical(in termsof computations)
method, whichhasbeenshowntoleadtoalmostsure(a.s.) consistentestimatesunder theusual
assumptions. Seebelowformoredetailsonthis. Howeverithasbeenobserved,thatthismethod
seems to be relatively sensitive to the choice of certain user parameters in (Bauer, 1998). In
this thesis also an alternativeis analysed, which is a small adaptation of the criterion given in
(Peternell,1995),which seemstobelesssensitive. (Bauer,1998)introducesanothercriterionfor
theLarimore typeof procedures,which is much in thespirit of Akaikes informationcriteria, as
ituses theestimated innovationvariance. This three procedures willbepresentedandanalysed
below. Itwillbeclearfrom theproofshowever,thattheproposedestimationmethodisonlyone
possibility, as the main problem boils down to estimate therank of amatrix. Forthis problem
there arewellestablishedtesting methods, which howevermostlyrelyonthedistribution of the
matrix, whose rank is estimated. Such procedures are presented in (Sorelius, 1999): There the
rankofthecrucialmatrixisfoundbyincreasingthedimensionsofthematrixbyonein onestep
and performing atest onthenewly introduced smallestsingularvalue. This procedure however
has thedisadvantageof simultaneous tests, sincein practice asequence of testswill haveto be
performed,wherethenumberofthetestsandthedependencyofthetestsisunknownatthestart
of the tests. For this reasonit is believed, that estimating the order is more appropriate than
testingfortherankin thepresentsetup.
Theorganisation of the paper is asfollows: In thenext section the model set is stated and
themainassumptionsarepresented. Theestimationalgorithmsarebrie yreviewedinsection3,
wherealso thevariousorderestimation algorithmsarediscussed. Section 4then statesthemain
resultsofthispaperand providesproofsforthem. Asimulationstudyisperformedin section5.
Finallysection6concludes.
2 Model set an assumptions
Inthis paperlinear,nite dimensional, discretetime, time invariant, statespace systemsof the
form x t+1 = Ax t +Bu t +K" t y t = Cx t +Du t +" t (1) wherey t 2R s
denotes theobservedoutput process,u
t 2R
m
denotes theobservedinputprocess
and "
t 2 R
s
the unobserved white noise sequence. x
t 2 R
n
is the state sequence. Here the
true order of thesystem is denoted by n. Thematrices A 2 R nn ;B 2 R nm ;C 2R s n ;D 2 R sm ;K 2R ns
determinethesystem. Inthecasean inputdelayis postulatedD is restricted
to be zero. The system is assumed to be stable, i.e. all eigenvalues of A are assumed to lie
inside the unit circle, and strictly minimum-phase, i.e. theeigenvaluesof A KC are assumed
to lieinside theunit circle. Thesystemmatrices correspondto apairof transferfunctions: Let
k(z)=I+zC(I zA) 1
Kandletl(z)=D+zC(I zA) 1
B,wherezdenotesthebackwardshift
operator. Furthermore letM
n
denote theset of all pairsof transferfunctions (k;l) that permit
astatespacerepresentationof theform (1)fullling thestabilityandthe strictminimum-phase
assumption.
Thewhitenoise"
t
is assumedto bean ergodicmartingale dierencesequencesatisfying the
followingconditions: Ef" t jF t 1 g=0; Ef" t " 0 t jF t 1 g==E" t " 0 t >0 Ef" t;a " t;b " t;c jF t 1 g=! a;b;c ; Ef" 4 t;a g<1 (2)
Here E denotes expectation, F
t
denotes the -algebraspanned by (y
s
;s t) and "
t;a
denotes
thea-thcomponentofthevector"
t
in (Hannan and Deistler, 1988, Theorem 4.3.2). Corresponding to the input two dierent sets
of assumptions will be introduced for the Larimore type of procedures and the MOESP type of
procedures.
Assumptions1(Larimore type of procedure) The process(u
t
;t2Z)islteredwhitenoise
ofthe formu t = P 1 j=0 K u (j) t j ,where t
isanergodic, martingale dierencesequencefullling
the assumptions stated in equation (2) and being independent of "
t , and where kK u (j)k c u j u for some 0<c u <1;0< u <1. Furthermore u (!)=( P 1 j=0 K u (j)e i!j )( P 1 j=0 K u (j)e i!j ) 0 is assumedtofulll0<cI u cI<1for <!.
Assumptions2(MOESP type ofprocedures) Theinputprocess (u
t ;t2Z)isoftheformu t = v t + P h j=1 c j e i j t wherev t
fulllsassumptions1andthe randomvariablesc
j 2R
m
havezeromean
andnitevariancesandaresuchthatthecorrespondingprocessu
t
isrealvalued. Furthermorethe
process u
t
is assumedto be persistently exciting of order (tobe specied later) in the sense of
(Ljung,1999).
NotethattheassumptionsfortheinputsintheLarimoreprocedurearemoresevere. Thereason
forthisliesinthefact,thatforthisclassofproceduresanecessaryconditionforconsistencyisthat
theintegerparameterptendstoinnity(seebelowfordetails). Thisamountstotheconditionon
theinputtobeofinnitepersistency. FortheMOESPtypeofproceduresnotethattheassumptions
areidenticaltotheassumptionsimposedin theproofoftheasymptoticnormalityin (Bauerand
Jansson, 2000). Note however, that this only occurs, since no minimal conditionsare given in
this paper. In fact it will be clear from the proof given below, which properties for the input
signalarereallyneeded inthis respect. Alsonotethat theconditionsgiventhere permit certain
pseudostationarysequences,i.e. sumsofsinusoids. Howeverinthiscaseitisnecessarytoimpose
thenecessaryrestrictions(i.e. theexistenceofcertainlimits,whichappearintheproof)directly
onthesequenceratherthanusingsuÆcientconditionsontheunderlyingrandomvariables.
3 Estimation algorithms
Inthissectionabriefreviewofthemainstepsintheconsideredsubspaceproceduresisgivenand
theestimationalgorithmsaremotivated. Foramoredetaileddescriptionofsubspacemethodssee
(Larimore,1983;Verhaegen,1994)or (Bauer,1998,Chapter3). LetY + t;f =[y 0 t ;y 0 t+1 ;;y 0 t+f 1 ] 0 andletU + t;f andE + t;f
respectivelybeconstructedanalogously usingu
t and " t respectivelyin the place of y t . Let Z t;p = [y 0 t 1 ;u 0 t 1 ;;y 0 t p ;u 0 t p ] 0
. Here f and p are two integer parameters,
whichhaveto bechosenbythe user. Seebelow forassumptionson thechoice ofthese integers.
Thenitfollowsfromthesystemequations(1)that
Y + t;f =O f K p Z t;p +U f U + t;f +E f E + t;f +O f (A KC) p x t p Here O 0 f = [C 0 ;A 0 C 0 ;(A f 1 ) 0 C 0 ] and K p = [[K ;B KD];(A KC)[K ;B KD];;(A KC) p 1 [K ;B KD]]. Further U f and E f
are block Toeplitz matrices containing the impulse
responsesequences. Theactualformofthesetwomatricesisofnoimportanceforusandthuswe
refertotheoriginalarticlesfordetails. Thisequationbuildsthebasisforallsubspacealgorithms,
whichcanbedescribedasfollows:
1. RegressY + t;f ontoU + t;f andZ t;p toobtainanestimate ^ z ofO f K p andanestimate ^ u ofU f
respectively. Due tonitesampleeects ^
z
willtypicallybeoffull rank.
2. Forgivennndaranknapproximationof ^ z byusingtheSVDof ^ W + f ^ z ^ W p = ^ U n ^ n ^ V 0 n + ^ R. Here ^ n
denotesthediagonalmatrixcontainingthelargestnsingularvaluesin decreasing
order. ^
U
n
contains the corresponding left singular vectors as columns and ^
V
n
the
corre-spondingrightsingularvectors. Finally ^
leadsto anapproximation ^ O f ^ K p =( ^ W f ) 1 ^ U n ^ n ^ V 0 n ( ^ W p ) 1
. Theactual decompositionof
thismatrixinto ^ O f and ^ K p
hasnoin uenceontheestimatedtransferfunctions.
3. Usingtheestimates ^ O f ; ^ K p and ^ u
obtainthesystemmatrixestimates.
In the second step an order has to be specied. Also the matrices ^ W + f and ^ W p have to be
provided by the user. In the literature several dierent choices have been proposed. For the
matrix ^
W
p
thechoicesarerestrictedto( ^ p ) 1=2 and( ^ ; p ) 1=2 ,where ^ p = 1 T P T t=p+1 Z t;p (Z t;p ) 0
denotes the sample variance of Z
t;p . Further ^ ; p = ^ 1 p ^ u;z ^ 1 u ^ u;z . Here ^ u denotes the samplecovarianceofU + t;f and ^ u;z
thesamplecovarianceofU + t;f andZ t;p . Correspondingto ^ W + f
two choices will be considered: ( ^ +; f ) 1=2 = ^ + y ^ y;u ^ 1 u ^ u;y
using obvious notation, where
y standsfor Y + t;f , and ^ W + f =[K W (i j)] i;j , where w(z) = P 1 j=0 K W (j)z j denotes a frequency weighting. K W (j)=0;j <0andK W
(0)is assumednonsingular. Furthermore w(z)is assumed
to bestable and strictlyminimum-phase. Theintuitionof thisspecialchoiceof theweightingis
to emphasizesomefrequencyrangevia specically designingw(z) to be aband pass lter. The
ideaofthisstepisessentiallytodiscriminatebetweenthenonzero'signal'singularvaluesandthe
noise containedin ^
R, which is in uencedby the weighting, sincethis scalesdiferent directions.
Usingtheinformation containedin theestimatedsingularvalueswill bethebasisfor twoof the
estimationmethods.
FortheLarimoretypeofmethodsalsoanorderestimationalgorithmwillbegiven,whichrelies
ontheestimated innovationvariance. Thusitisnecessarytogivemoredetailsontheestimation
of thesystemmatrices in this case. Note, that from step2 anestimate ^
K
p
isobtained. Thisis
used to estimatethe statesequenceas x^
t = ^ K p Z t;p . Let ha t ;b t i= 1 T P T f t=p+1 a t b 0 t . Inserting the
estimatedstateintothesystemequations(1)oneobtainsestimatesof(A;B;C ;D)fromtheleast
squaressolution: [ ^ A T ; ^ B T ] = h^x t+1 ;x^ t i h^x t+1 ;u t i h^x t ;x^ t i h^x t ;u t i hu t ;x^ t i hu t ;u t i 1 h ^ C T ; ^ D T i = hy t ;x^ t i hy t ;u t i h^x t ;x^ t i h^x t ;u t i hu t ;x^ t i hu t ;u t i 1
If adelay is postulated, then in thesecond least squares problemu
t
is omitted. Thematrix K
and theinnovation sequence are estimated from theresiduals ofthese equations asfollows: Let
^ " t =y t ^ C T ^ x t ^ D T u t . Then ^ =h^" t ;"^ t iand ^ K T =h^x t+1 ;^" t i ^ 1 .
Followingthediscussiongivenabovethereareacoupleofratherobviousalgorithmstoestimate
theorder. Thesewillbepresentedinthefollowing.
3.1 Using the information contained in the singular values
From standard theory it follows, that ^ X f;p = ^ W + f ^ z ^ W p
converges a.s. to the limit X
Æ = W + f O f K p W p , where W + f and W p
denote the a.s. limits of ^ W + f and ^ W p respectively. Here
convergenceoccursintheoperatornormactingon` 2
almostsurely,wherethematricesoccurring
are seenasoperators byadding zerosin the innitematrix representationcorresponding to the
operator. Thereforeit followsfrom the resultsof operator theory (see e.g.Chatelin, 1983)that
thesingularvaluesalsoconverge. Since X
Æ
hasranknonlytherstn singularvaluesofX
Æ are
nonzero,therest beingzero. Thereforetheproblem boilsdown totheassessmentofthe rankof
anoisy matrix. Theproblem gets complicated, sincethedistribution ofthe noiseactingon the
matrixishard toquantify. Thereforeweresortto estimationalgorithmsasopposed tomethods
ofobtainingtheorderviaasequenceoftests(cf.Sorelius,1999). Thesealgorithmssharetheidea
oftheinformationcriteriaofcomparingthesignicanceoftheinclusionof anothercoordinatein
NIC(n) = M X j=n+1 ^ 2 j +C(T)d(n)=T (3) SVC(n) = ^ 2 n+1 +C(T)d(n)=T (4)
Hered(n)=nm+ns+smdenotesthenumberofparametersofastatespacesystemofordern.
C(T)>0;C(T)=T ! 0isa penalty term,which will be described below in moredetail. In the
denitionM=min ffs;p(s+m)g,thenumberofestimatedsingularvalues. Theestimatedordern,^
say,isobtainedastheminimisingargumentofthesecriterionfunctions. NIChasbeenintroduced
andanalysedin(Peternell,1995). Inthedenition(Peternell,1995)usedadierentchoiceofd(n),
whichhowevercanbereformulatedtotintothepresentsetting. Also(Peternell,1995)onlydealt
withf andpxedandnite,whilethefollowingdiscussionholdsforgeneralchoices. SVChasbeen
proposed as arenementof NICin (Bauer,1998). Themain dierence liesin thefact,that NIC
usestheFrobeniusnormof thematrix ^
R , whereasSVCusesthetwonormtomeasure thesizeof
theneglectedsingularvalues. Forbothcriteriatheorder estimateisobtainedbyminimizing the
aboveexpression. Note,that theseorderestimationtechniquesdonotdependonwhetherMOESP
ortheLarimoretypeofmethodsisusedandthuscanbeusedinalltheseprocedures. Theauthor
wantsto stress, that these are just twoalgorithms, howevermany moreseem possible,since in
principleallthatisdoneistocomparethesizeof ^
Rmeasuredinsomenormtosomesamplesize
dependentpenaltyterm. Alsonote,thatthechoiceoftheweightingmatrices ^ W + f and ^ W p isvery
in uentialfor the outcome of theestimation, aswill be demonstrated in section 5. This might
indeed bedesirable,sincespecial weightingscanbegivenasomewhatheuristicinterpretationas
frequencyshapinglters(cf.McKelvey,1995). Inthiscaseitfollows,thattheweightingmatrices
serveasatooltostresstheimportantfrequenciesfortheidentication, andthusthesedirections
mightbeupweighed,whereasothernotso importantdirectionsaredownweighed.
Note, that bothcriterion functions canbe implemented with almost nocomputational load.
The singular valuesare estimated in the algorithms, therefore only the addition of the penalty
termandtheminimizationoverasmallrangeofintegershasto beperformed.
3.2 Using the estimated innovation covariance
Asecondintuitiveideawould betoestimatetheorderusingtheestimatedinnovationcovariance
in the Larimore type of procedures. Recall that given the state sequence of dimension n, say
^ x n
t
,the innovation varianceis estimatedas ^ n =hy t ^ C n T ^ x n t ^ D n T u t ;y t ^ C n T ^ x n t ^ D n T u t i. Here [ ^ C n T ; ^ D n T
]denotestheestimatesofC andD using theestimated statex^ n
t
. Thenitistemptingto
usethecriterionfunction usedalsointheinformationcriteriaasfollows:
IVC(n)=logdet ^
n
+C(T)d(n)=T (5)
where d(n)and C(T) areidentical tothedenition of SVCandNIC. Againthe orderestimateis
obtainedby minimizingthis function overtheintegers 0nminffs;p(s+m)g. Theauthor
wantstostress,thatthisisnotthestandardinformationcriterion,sincetheestimates ^
n
arenot
themaximumlikelihoodestimatesof theinnovation sequence. Infact itwillbeshown, thatthis
estimationalgorithmmayperformpoorin somesituations.
Fromacomputationalpointofviewthiscriterionisveryattractiveinthecaseofnoexogenous
inputs present in the read out equation, i.e. in the case y
t = Cx
t +"
t
, and additionally the
choice of the weighting ^ W p = ( ^ p ) 1=2
. In this case the choice ^ K n p = ^ V 0 n ( ^ W p ) 1 Z t;p leads to h^x n t ;x^ n t
i = I, i.e. the components of the state are orthogonal and thus the regressionscan be
performedindependently. TheestimationalgorithmthenamountstoestimatingthematrixC for
themaximalstatedimensionandthenonlyadditionsandmultiplicationshavetoperformed. Let
^ C max T = h ^ C n T ; ^ C n max T i then ^ n = ^ max + ^ C n max T ( ^ C n max T ) 0
InthecaseofexogenousinputspresentoradierentchoiceoftheweightingW
p
onthecontrary
each regressionhas to be performed separately. Note however, that normallythese will be low
dimensionalregressioningeneralandalsoofnottoobignumericalload. Itispossibletoimplement
thesubspaceprocedures such that onlytheestimated covariancesare used ratherthanthe data
itself. Inthiscasethenecessarycovarianceestimatesarealreadycalculatedandthusonlymatrix
inversionshaveto be calculated. Otherwise also in this stepthe necessarycovariancescouldbe
calculated in order to minimize the number of necessary calculations. It will be shown in the
nextsection,thatalthoughthisprocedureseemsappealingonrstsight,itisnotarecommended
procedure. Thusinthisrespecttheresultofthispaperisratherto show,thatusingthismethod
leadsto problems,which aresomewhatunexpected.
4 Main results
Inthissectionthepropertiesofthevariousestimationalgorithmswillbederived. Thediscussion
draws heavily from (Bauer, 1998; Peternell, 1995). Some results for the MOESPcase have been
presentedin(Bauer,1999).
Theresults are mainly based onthe followinglemmas: The rstdeals with the accuracyof
theestimation of samplecovariancesunder thegivenassumptions onthesystemand theinput.
ThesecondonedealswiththelinearisationoftheSVDorSVDrelatedquantities,which willbe
ofimportance mainlyfortheSVCcase.
Lemma4.1 Let (y
t
;t 2 Z) be generated by a system of the form (1), where the noise fullls
the assumptions of section 2. Let ^
z;z (j) = T 1 P T j j=1 z t z 0 t j and let z;z (j) = Ez t z 0 t j , where z t =[y 0 t ;u 0 t ] 0 . Furthermore letH T =o((logT) a )for some0<a<1. Ifu t
fulllsassumptions 1then
max jjjHT k^ z;z (j) z;z k 2 =O(Q T ) (6) If u t
onlyfulllsthe assumptions 2,thenthe statement istruefor H
T
=M<1.
This lemma follows from (Hannan and Deistler, 1988, Theorem 5.3.2 and Chapter 4). The
lemma providesrelativelysharpbounds forthe estimationerror ofthe covariancesequences. In
factitfollowsfromthelawoftheiteratedlogarithmfortheestimatedcovariancesequences,that
except for the exact evaluation of the constant involved in the O(Q
T
) statement the bound is
tight.
Lemma4.2 (Chatelin) Let T
T
denote a sequence of symmetric, compact operators acting on
` 2
, which converges in norm to the operator T
Æ
. Then it follows, that the set of eigenvalues
of T
T
converges to the set of eigenvalues of T
Æ
. Also the corresponding eigenspaces converge in
the gap metric. Let P Æ
i
denote the orthonormal projection matrixonto the space of eigenvectors
corresponding totheeigenvalue Æ i ofT Æ andletP T i and T i
denotethecorresponding quantitiesof
T
T
. Herefor amultipleeigenvalue Æ i ofT Æ the quantitiesP T i
refer totheorthonormal projection
matrixontothe spacespannedbythe eigenvectorstoalleigenvaluesofT
T convergingto Æ i . Then P T i =P Æ i + X j6=i P Æ j T T T Æ Æ i Æ j P Æ i +o(kT T T Æ k) (7)
Thelemma implies, that the eigenspaces converge, and in particular the projections on the
eigenspacesconvergeatthesamerate,astheerrorin theapproximation.
It has been shown in (Bauer and Jansson, 2000), that the MOESP type of methods leads to
consistentestimatesforthesystemmatrixestimatesonlyingenericcases. ThereforealsotheSVC
criterion can onlyproduce consistent estimates in these cases. Let
u
(!) denote the spectrum
of the stationary process u
t
and assume, that theintegers f and p are used for theestimation.
n u n
estimatesofthepairoftransferfunctions. Itisalsoshown,thatthissetisgenericinM
n
. However
asthe examplegivenin (Janssonand Wahlberg, 1997) shows, the set is notidentical to M
n in
general. Inthecaseminff;pg>3nithasbeenshownin(Chui,1997)thatinfactthisisthecase,
i.e. the consistency holds for everypair (k;l) 2 M
n
. In fact the suÆcientconditions stated in
(Chui,1997)aremuch sharper.
Theorem4.1 Lettheprocess(y
t
;t2Z)begeneratedbyasystemofthe form (1),wherethetrue
systemorder isequal ton
Æ
, andwherethe white noise process ("
t
;t2Z)fullls the assumptions
of section 2. If the input fullls the assumptions 1, then maxff;pg=o((logT) a
) isassumedfor
some a<1. In this casethe conditions C(T) >0;C(T)=T !0;C(T)=(fploglogT)!1 are
suÆcientfor the a.s. consistency ofthe orderestimate obtainedby minimizingSVC(n).
Ifthe inputfullls the assumptions2with =f +p 1, thenfor eachf andpthereexistsa
set U n (f;p; u (!);)M n
suchthat for (k;l)2U
n (f;p;
u
(!);) the SVCmethod leads to a.s.
consistentestimatesofthe orderunderthe assumptionC(T)>0;C(T)=T !0;C(T)=loglogT !
1. If (k;l) 62 U
n (f;p;
u
(!);) then consistency fails for the same choice of the penalty term
C(T), i.e. lim T!1 ^ n<n Æ a.s.
PROOF:Note,thatunderbothsetofassumptionstheerrorintheestimationoftherstf+p 1
covariances
z
(j)areof orderO(Q
T
)uniformelydueto theLemma4.1. Theestimationusesthe
singular values of ^ X f;p = ^ W + f ^ z ^ W p , which converges to X Æ = W + f O f K p W p
a.s. ashas been
shown e.g. in (Peternell et al., 1996). Here convergence is in operator norm in the embedding
` 2
. Considertheestimation errorin ^
z
rst: Introducethenotationha
t ;b t i=T 1 P T f j=p+1 a t b 0 t . Then h ^ z ; ^ u i =hY + t;f ; Z t;p U + t;f ih Z t;p U + t;f ; Z t;p U + t;f i 1
The estimation error in each entryof these matrices is of the order O(Q
T
) asfollowsfrom the
Lemma4.1togetherwith(HannanandDeistler,1988,Theorem6.6.11.),whichassuresthe
summa-bilityofthecolumnsoftheinverseuniformelyin f andp. Thusconsidertheweightingmatrices:
Recallthattheweightingmatricesarerestrictedtobeeitherdeterministicorchosenasthesquare
rootsofmatriceslikehY + t;f ;Y + t;f i hY + t;f ;U + t;f ihU + t;f ;U + t;f i 1 hU + t;f ;Y + t;f
i. Usingthesamearguments
ashave been used aboveshowsthat the estimation error in the entries of these matrices areof
order O(Q
T
). Therefore also the error in the positivedenite symmetric square root is of the
sameorder,ascanbeseenfrom aTaylorseriesexpansionofthesquareroot, which canbeused
todenethesymmetricsquarerootofanoperator.
Itthus follows,that ^ X f;p ! X Æ , where k ^ X f;p X Æ k 2 =O(Q T p
fp). Thereforethe singular
valuesconvergeat thesamerate. This shows, that underestimationof theorder is notpossible
asymptotically,if
n
Æ
>0,where
i
denotesthesingularvaluesofX
Æ ordereddecreasingly,as SVC(n) = ^ 2 n+1 + d(n)C(T) T = 2 n+1 +(^ 2 n+1 2 n+1 )+ d(n)C(T) T = 2 n+1 +O( p fpQ T + d(n)C(T) T )
Sincethesecondtermtendstozero,theminimumcannotbeattainedatn<n
Æ
. Inthecaseofthe
MOESPprocedureand (k;l)62U
n (f;p;
u
;)then-th singularvalueiszeroand thus consistency
fails, as follows from the same arguments given below, since in that case, the same arguments
show that theasymptotic statedimension isequalto the numberof nonzerosingularvalues for
thelimitingmatrix.
Thereforewehavetoshow,that thetrueordern
Æ willbepreferredton>n Æ asymptotically. Thusconsider ^ n+1 = k ^ W + f ^ z ^ W p ^ U n ^ n ^ V 0 n k 2 =k ^ U 0 ^ X f;p ^ U n ^ n ^ V 0 n k 2 k ^ U 2 ^ U 0 2 ^ X f;p U 2 U 0 2 X Æ k 2
Here ^ U = [ ^ U n ; ^ U 2;n ]; ^ U n 2 R ; ^ U 2;n 2 R and ^ U 2 = ^ U 2;n Æ
, which explains the last
inequality. Sincetheentriesof ^
X
f;p X
Æ
havebeenshownto beoforderO(Q
T )thenorm k ^ U 2 ^ U 0 2 ^ X f;p X Æ k 2 k ^ U 2 ^ U 0 2 k 2 k ^ X f;p X Æ k Fr =O(Q T p fp)
Therefore itremains to obtaina bound on k ^ U 2 ^ U 0 2 U 2 U 0 2 k 2
. Butthis followsfrom Lemma 4.2,
usingT T = ^ X f;p ^ X 0 f;p ;T 0 =X Æ X 0 Æ
andtheexponentialdecreaseinelementsintherowsofU
n
Æ (cf.
Bauer,1998). Fromthistheresultfollows,since
SVC(n Æ ) SVC(n) = ^ 2 n Æ +1 ^ 2 n+1 +(d(n Æ ) d(n)) C(T) T = C(T) T
[O(fploglogT=C(T))+d(n
Æ
) d(n)]<0
sincefploglogT=C(T)!0andd(n)>d(n
Æ ).
Note,thattheresultalsoprovestheconsistencyoftheNICcriterionforthesamerestrictions
on thepenalty term. Also note, that concerningthe penaltyterm only asuÆcient conditionis
given. Thebound isobtainedbyratherbruteforcearguments,bounding thetwonormwith the
Frobeniusnorm. Inthecase,wheref andptendtoinnityataratelogT itseemstobedesirable
tousealowerpenaltyterm,aswill bearguedinthenumericalexamples.
For the estimation criteria, which are based on an estimate of the innovation variance, the
situation is somewhat dierent. Note that this procedure only appliesfor the Larimoretypeof
procedures. Thereforeassume,thattheinputprocessfulllsassumptions1. Notethatifnodelay
ispostulated ^ n = hy t ;y t i hy t ; ^ x t u t ih ^ x t u t ; ^ x t u t i 1 h ^ x t u t ;y t i = hy t ;y t i hy t ;Z t+1;p+1 i ^ L 0 n ( ^ L n hZ t+1;p+1 ;Z t+1;p+1 i ^ L 0 n ) 1 ^ L n hZ t+1;p+1 ;y t i = hy t ;y t i b H 1 ^ L 0 n ( ^ L n ^ L 0 n ) 1 ^ L n b H 0 1 where ^ L n = 0 0 ^ K p (n) 0 I 0 ; ^ K p (n) = ^ V 0 n ( ^ W p ) 1 . Also ^ L n = ^ L n hZ t+1;p+1 ;Z t+1;p+1 i 1=2 and b H 1 =hy t ;Z t+1;p+1 ihZ t+1;p+1 ;Z t+1;p+1 i 0 =2 .
Firstconsidertheproblemofunderestimatingtheorder. Let
n
denotethelimitof ^ n . Then det[ n Æ ]<det[ n Æ 1
] isasuÆcientconditiontoavoidasymptoticunderestimationoftheorder.
ThisfollowsfromC(T)=T !0. Thisconditionhasbeenanalyzedinmoredetailin(Bauer,1998):
In the special case, where no input is present in the readout equation, i.e. D = 0 and where
^ W p =( ^ p ) 1=2
thisconditionis equivalentto C
0;nÆ
6=0,whereC
0;nÆ
denotes thelast columnof
thelimiting realization of thetrue system. It has beenfound, that this conditionis fullled on
ageneric subsetin some special cases. It is referredto theoriginal work for details. In general
howevertheimplicationsofthisconditionareunknown.
Nextconsiderthequestionofoverestimation: Fornn
Æ
oneobtains
n =
nÆ
andthusthe
estimationerrorhasto beanalysedmoreclosely. Accordingtotheequationaboveoneobtains:
^ n ^ nÆ = b H 1 ^ L 0 n Æ ( ^ L nÆ ^ L 0 n Æ ) 1 ^ L nÆ ^ L 0 n ( ^ L n ^ L 0 n ) 1 ^ L n b H 0 1
Usingthematrixinversionlemmaforpartitionedmatricesone obtains
^ L 0 n ( ^ L n ^ L 0 n ) 1 ^ L n = ^ L 0 n Æ ( ^ L nÆ ^ L 0 n Æ ) 1 ^ L nÆ + ^ P ? n Æ ^ L 0 n;2 ( ^ L n;2 ^ P ? n Æ ^ L 0 n;2 ) 1 ^ L n;2 ^ P ? n Æ where ^ P ? n Æ = I ^ L 0 n Æ ( ^ L nÆ ^ L 0 n Æ ) 1 ^ L nÆ and where ^ L 0 n = [ ^ L 0 n Æ ; ^ L 0 n;2
]. Since the second term is a
projectionoperator ithasnormone. Thustheessentialtermis b H 1 ^ P ? n Æ
since H 1 ![C ;D] L n Æ L nÆ and ^ P nÆ ! I L nÆ ( L n Æ L nÆ ) L n Æ =P nÆ
. The estimationerrors are
derivedusingthe uniformconvergenceof thecovarianceestimates: Themain emphasis herelies
on ^ K p K p
. It is straightforward to show, using Lemma 4.1 and Lemma 4.2 that there exists
a matrix ^ S T such that k ^ S T ^ K p K p k 2 =O(Q T p p). ApplyingLemma 4.1 to hy t ;Z t+1;p+1 ithis alsoimpliesk b H 1 [C ;D] L n Æ L 0 nÆ k 2 =O(Q T p p)as wellask ^ P ? nÆ P ? nÆ k 2 =O(Q T p p). Therefore consider IVC(n) IVC(n Æ ) = (d(n) d(n Æ )) C(T) T +log det ^ n =det ^ n Æ = (d(n) d(n Æ )) C(T) T +log det h I+( ^ n ^ n Æ ) ^ 1 nÆ i = (d(n) d(n Æ )) C(T) T +tr h ( ^ n ^ nÆ ) ^ 1 n Æ i = (d(n) d(n Æ )) C(T) T +O(Q 2 T p)
asfollowsfromaTaylorseriesexpansionoflog(1+x) . Thus
T C(T) (IVC(n) IVC(n Æ ))=d(n) d(n Æ
)+O(ploglogT=C(T))
Thisshowsthefollowingresult:
Theorem4.2 Lettheprocess(y
t
;t2Z)begeneratedbyasystemofthe form (1),wherethetrue
system(k;l)2M
n
. Let the noise fulll the assumptions of section 2 andlet the input fulll the
assumptions 1. Let thesystembeestimatedaccording tothe Larimore typeofprocedure.
Then theorderestimateobtainedastheminimizing argumentofIVC(n)usingapenaltyterm
C(T)>0;C(T)=T !0and C(T)=(ploglogT)!1 isa.s. consistent, ifdet [
n Æ 1 ]>det [ n Æ ]. If det[ n Æ 1 ]=det[ n Æ
] thenthe orderisunderestimateda.s. asymptotically.
Thetheoremleadstoapenaltyterm,whichhastobeslightlyhigherthanploglogTandtherefore
thechoice logT seemstobeareasonablechoicenotingthat loglogT issmall evenforrelatively
largeT,althoughnottheoreticallyjustiedfortheLarimoretypeofmethods,wheref andptend
to innity. This result isnew, as in (Bauer,1998)much moresevererestrictionson thepenalty
termhavebeenused. Therestriction det[
nÆ 1
]>det [
nÆ
]is worthbeinginvestigated further.
The fundamental dierence of the criterion IVC(n) as compared to the information criteria,
althoughformally denedanalogously,isthat theinnovationvarianceiscalculatedfortruncated
statesonly,ratherthannewlycomputedstates. Howevertherstncomponentsofx
t =K
1 Z
t;1
neednotbegeneratedbyastatespaceequationofordernforn<n
Æ
, i.e. thematrixK
1 might
not havetheshift invariance structure K
2:1 = A n K 1:1
forany matrix
A
n
of dimension nn,
using obviousnotation to denote submatrices. Therefore the criterion onlymeasures the direct
in uence of the statecoordinates on theprediction of y
t
, but it doesnot takeinto account the
dynamical generation of the state. Thus in the case, where a state does not contribute to the
presentoftheoutput,butonlytothefuture,itwill beneglectedaccordingtothecriteriongiven
above. Asthecitedresultsshow,thismightbeanextremelyraresituation. Themainconcernin
thisrespectis,thatinsituations,wherethecontributionissmall,thesamebehaviourisexpected,
i.e. manyobservationwillbeneededin ordertodetectthisstatecomponent. Inthenextsection
anexampleforthiswillbegiven.
5 Numerical Examples
In this section we present three dierent examples in order to compare the various proposed
order estimation methods. The candidate order estimationalgorithms will beSVC(n);IVC(n)
Est. Order <2 2 >2 <2 2 >2 IVC1 0.00 0.83 0.17 0.00 0.77 0.23 CCA IVC2 1.00 0.00 0.00 0.82 0.18 0.00 SVC1 0.00 0.96 0.04 0.00 0.93 0.07 SVC2 1.00 0.00 0.00 0.86 0.14 0.00 IVC1 0.82 0.03 0.15 0.68 0.06 0.26 N4SID IVC2 1.00 0.00 0.00 1.00 0.00 0.00 SVC1 0.00 0.66 0.34 0.00 0.40 0.60 SVC2 0.45 0.55 0.00 0.04 0.96 0.00
Table1: Heretheprobabilityofestimatingtheindicatedorderfor1000timeseriesofsamplesizeT
isshownfortwodierentweightingschemes(CCAandN4SID)and4dierentestimationmethods:
IVC(n)withC(T)=logT (IVC1)andC(T)=fplogT (IVC2)andSVCwithC(T)=logT (SVC1)
andC(T)=fplogT (SVC2). f =p=p^
AIC
hasbeenused.
in the N4SIDprocedure of thesystem identication toolbox of MATLAB(Ljung, 1991): The idea
hereis toformalisethesearchfora"gap"inthesingularvalues. Theorderisestimatedas
^ n=max n:log^ n > 1 2 (log^ 1 +log^ M )
i.e. thelargestinteger,which isgreaterthanthearithmetic meanofthelargestandthesmallest
nonzero estimated singular value. The three examples include a low order single input single
outputsystemwithoutexogenousinputs,wheretheorderis expectedtobeeasyto nd,another
SISOsystem withoutexogenous inputs, wherethe order is expected tobe hardto identify, and
nally aMIMO systemwith atwo-dimensionalobservedinput. Themain points ofinterest are
theeectsofthechoiceofthepenaltyterm,theweightingmatrices,theintegerparametersf and
p,andofcourseacomparisonbetweenthevarious procedures.
Asarstexampleconsiderthesystemdenedbythefollowingmatrices:
A= 0 1 0:7 0:5 ;K= 1:3 0:3 ;C=[1;0]
This system has Lyapunov balanced Gramianof roughly = diag(2:55;1:78). The system
polesareat0:250:7984iandthezerosat 0:40:4359i. Intheestimationweusetwodierent
weighting schemes: CCA uses ^ W + f = ( ^ + f ) 1=2
and the method using ^
W +
f
= I will be labelled
N4SID. The indices f = p = p^
AIC
are used. For each of the weighting schemes, the order is
estimatedusingfour dierentmethods: IVCandSVCwithC(T)=logT (denoted withIVC1and
SVC1 respectively in the sequel), IVC and SVC with C(T) = fplogT (denoted with IVC2 and
SVC2respectively). Note,that onlyforthelasttwoprocedurestheconsistencyresultshavebeen
derived. 1000 time series of length100and 1000 respectivelyhavebeengenerated and used for
estimation. Table1showstheresultsforT =100andT =1000respectively. Theyshow,thatthe
performanceoftheorderestimationproceduredependsheavilyontheweightingscheme: ForCCA
theIVC1methodworkswell,whereasitshowsproblemstoestimatethetrueorder,whenusedwith
N4SID.This isdueto thefact,that intheLyapunovbalancedrealizationofthetruesystem,the
entryC
1;2
isequalto 0:0146andthusclosetozero. Thisleadstoahighriskofunderestimating
theorderusingIVCtogetherwithN4SIDinthisexample. ForCCAitisobserved,thatashasbeen
expected, the higherpenalty term results in a high risk of underestimation, while reducing the
risk of overestimation. For N4SIDthe SVCmethod outperforms IVC and wealso observe, that
forC(T)=fplogT theaccuracyincreaseswiththesamplesize,whereasthelowerpenaltyterm
does notseem to lead to consistent order estimates. Inthe CCAcase it is seen, that the higher
penaltytermleadsto abig riskofunderestimating theorder. On theother hand fortheN4SID
^
W
f
T Methodd=2 Method d=4
IVC SVC NIC MOE IVC SVC NIC MOE
CCA 100 3.50 2.30 5.50 3.67 2.07 2.31 6.32 4.03 250 5.80 2.78 7.06 5.23 5.23 2.87 15.25 10.89 500 6.53 3.37 7.64 6.03 5.55 3.43 17.23 13.48 1000 7.53 4.03 7.74 6.65 6.63 4.11 17.90 15.14 lowpass 100 5.12 5.23 5.91 5.24 3.99 5.92 6.85 5.93 250 8.43 6.91 8.04 6.92 11.10 13.12 16.41 13.15 500 9.61 7.68 9.10 7.69 12.64 15.26 19.27 15.34 1000 10.43 8.51 10.10 8.52 14.32 16.68 21.09 16.76 highpass 100 3.97 5.60 6.30 5.64 2.57 6.43 7.21 6.50 250 6.93 7.55 8.75 7.69 6.89 14.50 17.83 15.11 500 8.12 8.38 9.92 8.61 7.82 16.83 20.89 17.66 1000 9.08 9.30 11.04 9.55 9.86 18.34 22.85 19.36
Table2: This table showsthe estimatedmeans of the various order estimation procedures asa
function ofsample sizeand dierentweightingmatrices ^
W +
f
. Here SVCand IVCusethepenalty
termC(T)=logT. Thetablehasbeenproduced using1000replicationsin eachcase.
aboutthechoiceof thepenaltyhasbeenfound. Bothchoicesused in thisexampleare heuristic
andnotmotivatedbyadditionalarguments. Atheoreticaljusticationseemstobeneeded.
Nextthe various order estimation procedures will be tested on an eight order system with
polesat z=0:8e 0:2i ;z =0:7e 0:3i ;z=0:5e 0:5i ;z=e 0:4i
andzerosat z=0:8e 0:1i
;z=
0:4755;z =0:1;z = 0:3;z = 0. Using this exampleextensive simulations comparing the four
orderestimationcriteriahavebeenperformed. Thesystemorderishardtoestimateandconsistent
estimatesoftheorderarenotthemaingoalinthisexample. TheLyapunovbalancedGramianis
equaltodiag(6:85;4:46;1:08;0:39;0:045;0:015;0:0004;0:0002)andthusthesystemisexpectedto
beapproximatedwellusing afourthordersystem. Acoupleofdierentsetupshavebeentested.
Inarstsimulationstudy1000replicationsoftimeseriesofsamplelengthsT =100;250;500and
T =1000havebeengenerated. Inthesubspacealgorithmsthreedierentweightingmatrices ^
W +
f
havebeenapplied: The CCAweights, alow-passlter, generatedusing a6-thorder butterworth
lter with cuto frequency0:5 and thecorrespondinghigh passlter havebeen incorporated.
f =p=d^p
AIC
hasbeenusedinallcases,whered=2andd=4aretried. Theaveragevaluesof
thecorrespondingorderestimatesaregivenin Table2. Itcanbeseen,thatthebehaviourofthe
variousalgorithmsis verydierentfordierentweightings ^
W +
f
. FortheCCAweightingNICgives
valuescloseto thetrueorderfor d=2,while itresultsin overlylargeestimatesford=4. Also
MOEseemsto suer from the biggerchoice of d,whereas bothSVC andIVCare relativelyrobust
withrespecttothischoice. Forthelowpassweightingallestimationproceduresshowatendency
to overestimate thesystemorder byafactorof twofor d=4,and alsotheresultsford=2are
largecompared to the CCAcase. Thesameresultalso holds for thehigh passweighting, except
that the estimates of NIC for d= 4are better than therespective estimates of theother order
estimationprocedures. Thisindicates,thatd=2isthefavourablechoice,ascomparedto d=4.
Theorder estimationprocedures arealso comparedto themoretraditionalmaximumlikelihood
basedinformationcriteria. InFigure 1ahistogramfortheestimatedorders usingIVC,SVC,AIC
andBICisgiven. HereT =100andf =p=2^p
AIC
havebeenused. Itcanbeobserved,thatBIC
tendsto choosen=4withahigh probability,while AICselectsrelativelylargeorders. Thetwo
subspaceorder estimates leadto slightlysmaller order estimates. Especially the resultsfor SVC
andBICseemto becomparable.
Howevertheorderestimatemightbeseentobenottheonlyinterestingindicator. Therefore
also the resulting estimates of the system are considered. The right plot of gure 2 shows the
1
2
3
4
5
6
7
8
9
10
11
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
Estimated order, f=p, T=100, CCA
Prob.
Comparison of IVC and SVC to AIC and BIC
IVC
SVC
AIC
BIC
Figure1: InthisguretheorderestimatesobtainedbySVCandIVCarecomparedtotheestimates
obtainedintheMLframeworkusingAICandBIC.T =100andf =p=2^p
AIC
areusedtogether
withtheCCAweightingscheme. Theplotshavebeenobtainedusing100replications.
range [0;] obtained by CCA using d = 2 for the various procedures. Here the sample size is
equalto T =1000. Theguresshow,thattheIVCestimatesareworse,despitethefact,that the
averageestimated orderseemstobethebestforthisscenario. Thisisexplainedintherightplot
ofgure2,whichgivesthehistogramoftheorderestimates: IntheIVCcasethereisarelatively
highportionoflowordersystems(over50%arelessthann=4)occurs,aswellasahighnumber
ofoverlylargeestimates(35%largerthann=10). Thiscombinestoahighbias,whichshowsin
themeansquareerror. TheNICandMOEperformaboutequal,duetotheverysimilardistribution
oftheorderestimates. TheSVCmethodleadstoacomparablemeansquareerror,whilechoosing
smallerorders on average,which might be seenasan advantagefor control design. Theresults
ford=4aresimilarwithoneexception: Contrarytowhat hasbeensaidbefore,theSVCmethod
reactsmuch largerto thechange of d thanNICand MOEwith respect to themean square error.
Thisisduetothefact,thatthereisahigherpercentageoflowordersestimatedinthiscaseleading
toahighbiaserror. TheresultsforNICandMOEarenotthatsensitive,althoughonaveragemuch
higherordersarechosen. Thecorrespondingpicturesaregivenin gure3. Finallyalsotheeect
oftheweightingsonthemeansquareerroris discussed: Figure 4showstwoplots,where theleft
onerefersto T =100and theorder estimate accordingto IVC. Therightplot showsthe result
for MOE and T =1000. In both casesthere is somewhat surprisingly hardly any dierence due
to thechoiceof theweightingmatrices. Asimilar pictureholdsfortheother casesaswell. This
observationisincontrasttotheobservationsintheresultsofsimulationswithxedorder,where
anin uence of the weightingmatrices with respect to themeansquare error hasbeenobserved
(seee.g.Bauer,1998). Itisremarked,that thisobservationmightnotbetypical,but iscertainly
worthtobeinvestigatedfurther.
Asalastexamplealsoasystemwithobservedexogenousinputsistreated:Considerthesystem
givenbythefollowingmatrices:
A= 0:8 0:2 0:4 0:5 ;B= 0 1 1 0:5 ;K= 1:5 0 0:2 0:8 ;D= 0 0 0 0 andC=I 2
,thetwo-by-twoidentitymatrix. Thepolesofthesystemare0:7352and 0:4352,the
zerosof k
0
areat 0:6583and0:2583,whereasthezerosofl
0
areat 0:10000:9327i. Figure5
shows theprobabilities of the order estimatesfor dierent sample sizes, where the input is i.i.d
0
1
10
−1
10
0
10
1
MSE of estimated transfer function, d=2
Angular frequency /
π
NIC
SVC
IVC
MOE
3
7
>10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
Order
Histogram of estimated orders
Rel. frequ.
IVC
SVC
MOE
NIC
Figure2: TheseplotsshowtheresultofthesimulationforT =1000. Theleftpictureshowsthe
averagemeansquareerrorofthetransferfunctionat50equallyspacedfrequenciesintheangular
frequencyrange ! 2 [0;)obtained using thevarious order estimation procedures withthe CCA
weightingschemeandf =p=2^p
AIC
. Therightplotshowsthecorrespondinghistogramforthe
estimatedorders. Theplotshavebeenobtainedusing1000replications.
0
1
10
−1
10
0
10
1
MSE of estimated transfer function, d=4
Angular frequency /
π
NIC
SVC
IVC
MOE
3
7
>10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Histogram of estimated orders, d=4
Order
Rel. frequ.
IVC
SVC
MOE
NIC
Figure3: TheseplotsshowtheresultofthesimulationforT =1000. Theleftpictureshowsthe
averagemeansquareerrorofthetransferfunctionat50equallyspacedfrequenciesintheangular
frequencyrange ! 2 [0;)obtained using thevarious order estimation procedures withthe CCA
weightingschemeandf =p=4^p
AIC
. Therightplotshowsthecorrespondinghistogramforthe
0
1
10
0
Angular frequency /
π
MSE of estimated transfer function, d=2, T=100
CCA
Low Pass
High Pass
0
1
10
−1
10
0
MSE of estimated transfer function, d=2, T=1000
Angular frequency /
π
CCA
Low Pass
High Pass
Figure 4: These plotsshowtheresultof thesimulationforT =100(leftpicture) andT =1000
(right picture). The pictures shows the average mean square error of the transfer function at
50 equallyspaced frequenciesin the angularfrequency range ! 2[0;) obtainedusing the IVC
procedure (leftplot)and theMOEprocedure (rightplot) forthe threedierent weightingscheme
andf =p=2^p
AIC
. Theplotsarebasedon1000replications.
has atendency to underestimate the order in allcases, NIC only for T = 100. In this example
SVCand NICshowthe best performance. Note, that for thecaseof additionalexogenous inputs
(Peternell,1995)suggeststoused(n)=(p(s+m) n)(fs n),whichessentiallyleadstoabigger
penalty termof the form n(sf +p(m+s))C(T). This examplewaschosen merelyto illustrate
that theproposed methods alsowork in thecaseofexogenous inputs. Itshould be notedagain,
thattheorderestimationtechniques,exceptfortheIVCmethod,applyequallytotheMOESPtype
ofmethods.
6 Conclusions
Inthis paperthequestion oforder estimation in the contextof subspacemethods has been
ad-dressed. Two newprocedures havebeenproposed and analysed. Lowerbounds on the penalty
terminorderfortheestimatestobeconsistenthavebeengiven. Alsothebehaviourofone
partic-ularprocedurein thecaseofinnite-dimensionalmodelswithsummablesecondordermodeshas
beeninvestigated. Themethodusingtheinnovationvariancehasbeenshowntosuerfromsevere
theoreticaldisadvantagesandthustheuseofthis intuitivelyappealingprocedure isdiscouraged.
FortheSVCcriteriontheadvantagescertainlyarethepossibilitytoobtainanestimateoftheorder
withalmostnocomputationalcosts,asonlythepropertiesoftheestimatesofthesingularvalues,
whichareestimated inanycase,areused. Inasimulationstudyithasbeendemonstrated,that
themethodsleadtoreasonableresults. Ithasbeenshown,thatSVCislesssensitiveto thechoice
ofthetruncationintegersf andpthanthecriterionintroducedby(Peternell,1995)orthemethod
usedinthesystemidenticationtoolboxof(Ljung,1991). HowevertheSVCcriterionalsocontains
a subjective component in the choice of the penalty term. In the simulations no clear picture
onhowthis shouldbechosencouldbeobtainedandnoheuristical motivation foranyparticular
0
1
2
4
10
>10
0
0.5
1
Prob.
T=100
IVC
SVC
MOE
NIC
0
1
2
4
10
>10
0
0.5
1
Prob.
T=500
IVC
SVC
MOE
NIC
0
1
2
4
10
>10
0
0.5
1
Prob.
T=1000
Estimated order, f=p, d=2, CCA, inputs white noise
IVC
SVC
MOE
NIC
Figure 5: This gureshowsthe resultof 1000 simulationruns using samplesizes T =100(top
row), T = 500 (middle row), T = 1000 (bottom row). Each picture shows the probabilities
of estimating the order using the four estimation algorithms: The weightings havebeen chosen
accordingto CCA. Thetruncation indices havebeen chosen asf = p= 2^p
AIC
. The inputs are
i.i.d. uniformelydistributedwhitenoisenormalizedto zeromeanandunit variance.
References
Akaike,H.(1975).Markovianrepresentationofstochasticprocessesbycanonicalvariables.SIAM
JournalofControl 13(1),162{172.
Bauer,D.(1998).SomeAsymptoticTheoryfortheEstimationofLinearSystemsUsingMaximum
LikelihoodMethodsorSubspaceAlgorithms.PhDthesis.TUWien.
Bauer,D. (1999). Orderestimation in thecontextof MOESPsubspaceidentication methods. In:
Proceedings of the ECC'99Conference, Karlsruhe,Germany.
Bauer, D. and M.Jansson (2000).Analysis of theasymptoticpropertiesof theMOESP typeof
subspacealgorithms.Automatica36(4),497{509.
Chatelin,F.(1983).Spectral Approximation ofLinearOperators.AcademicPress.
Chui,N. (1997).SubspaceMethods andInformativeExperimentsforSystemIdentication. PhD
thesis.CambridgeUniversity.
Hannan,E.J.andM.Deistler(1988).TheStatisticalTheoryof LinearSystems.JohnWiley.New
York.
Jansson,M.andBoWahlberg(1997).Counterexampletogeneralconsistencyofsubspacesystem
identicationmethods.In: Proceedings of SYSID'97.Fukuoka,Japan.pp.1677{1682.
Larimore,W. E. (1983).System identication, reduced order ltersand modeling viacanonical
variateanalysis.In: Proc.1983Amer.ControlConference2.(H.S.RaoandP.Dorato,Eds.).
Piscataway,NJ.pp.445{451. IEEEServiceCenter.
Ljung,L. (1991).SystemIdentication Toolbox,User's Guide.TheMathWorks.
thesis.Dept. ofElectr.Eng.,Linkoping.
Peternell,K.(1995).IdenticationofLinearDynamicSystemsbySubspaceandRealization-Based
Algorithms.PhDthesis.TUWien.
Peternell, K.,W. Scherrerand M.Deistler(1996).Statisticalanalysis ofnovelsubspace
identi-cationmethods. SignalProcessing52, 161{177.
Rissanen,J.(1978).Modelingbyshortestdatadescription.Automatica14, 465{471.
Shibata, R. (1980). Asymptotically eÆcient selection of the order of the model for estimating
parametersofalinearprocess.Ann. Statistics8(1),147{164.
Sorelius, J. (1999).Subspace-Based Parameter EstimationProblems in Signal Processing. PhD
thesis.UppsalaUniversity.
Verhaegen,M.(1994).Identication ofthedeterministicpartofMIMOstatespacemodelsgiven