Stochastic Approximation Estimates for
Regression Parameter Tracking

Alexander V. Nazin and Lennart Ljung

Department of Electrical Engineering
Linköping University, SE-581 83 Linköping, Sweden
WWW: http://www.control.isy.liu.se
Email: ljung@isy.liu.se

October 2, 2001
Report no.: LiTH-ISY-R-2360
Stochastic Approximation Estimates for Regression Parameter Tracking

Alexander V. Nazin
Institute of Control Sciences, Profsoyuznaya str., 65, 117997 Moscow, Russia
nazine@ipu.rssi.ru

Lennart Ljung
Department of Electrical Engineering, Linköping University, SE-581 83 Linköping, Sweden
ljung@isy.liu.se

April 25, 2001
Abstract

The sequence of estimates formed by the LMS algorithm for a standard linear regression estimation problem is considered. It is known from earlier work that smoothing these estimates by simple averaging will lead, asymptotically, to the recursive least squares algorithm. In this paper it is first shown that smoothing the LMS estimates using a matrix updating will lead to smoothed estimates with optimal tracking properties, also in the case where the true parameters are changing as a random walk. The choice of smoothing matrix should be tailored to the properties of the random walk. Second, it is shown that the same accuracy can be obtained also for a simplified algorithm, SLAMS, which is based on averages and requires much less computation.
1 Introduction

Tracking of time varying parameters is a basic problem in many applications, and there is a considerable literature on this problem. See, among many references, e.g., [12], [6], [5].

One of the most common methods is the least mean squares algorithm, LMS ([12]), which is a simple gradient based stochastic approximation (SA) scheme.*

* The work of the first author was carried out while visiting Linköping University.
For a time-invariant true system, however, the accuracy could in fact be quite bad. It is well known that for such systems the recursive least squares (RLS) algorithm is optimal, but it is on the other hand considerably more complex. A very nice observation, independently made by B. T. Polyak and D. Ruppert, is that this optimal accuracy can asymptotically also be obtained by a simple averaging of the LMS estimates. See [10], [11], and [9], [1] for the analysis.
In [3] it is shown that this asymptotic convergence of the averaged LMS algorithm to the RLS algorithm is obtained also for the tracking case, with a moving true system and constant gain algorithms. This means that in general the averaged algorithm will not give optimal accuracy. Optimal tracking properties will then be obtained by a Kalman-filter based algorithm, where the update direction is carefully tailored to the regressor properties, the character of the changes in the true parameter vector, and the noise level.
In this paper we shall consider a more general post-processing of the LMS estimates, obtained from a constant gain, unnormalized LMS method. The general version of this algorithm we call SLAMS: Smoothed Averaged LMS (allowing a metathesis for pronounceability). It consists of first forming the standard LMS estimates $\hat\theta(t)$, and then forming simple averages of these,

\tilde\theta(t) = \frac{1}{m} \sum_{k=t-m}^{t-1} \hat\theta(k),

and finally smoothing these by a simple exponential smoother, applying a direction correction every $m$-th sample:

\bar\theta(t) = \begin{cases} \bar\theta(t-m) - \mu S\,(\bar\theta(t-m) - \tilde\theta(t)), & t = km,\ k = 1,2,\dots \\ \bar\theta(t-1) - \mu\,(\bar\theta(t-1) - \hat\theta(t-1)), & t \neq km,\ k = 1,2,\dots \end{cases}

This algorithm has the design variables $\gamma$ (the gain of the LMS algorithm), $S$, $m$ and $\mu$. By, for example, choosing $m$ as the dimension of $\theta$, the average number of operations per update in the SLAMS algorithm is still proportional to $\dim \theta$, just as in the simple LMS algorithm.
The main result of this paper is to establish an asymptotic expression for the covariance matrix of the tracking error $\bar\theta(t) - \theta(t)$ ($\theta(t)$ being the true parameter value). We show that by the choice of $S$ and $\mu$ we can obtain the same asymptotic covariance as the optimal Kalman filter gives, regardless of $m$ and $\gamma$ (as long as the latter has a certain size relation to $\mu$).

In Section 2 we formulate the tracking problem and state the basic assumptions. In Sections 3 and 4 we treat a special case of SLAMS, viz. with $m$ fixed to 1. The extension to the general algorithm is done in Section 5.
2 Problem Statement and Basic Assumptions

Consider a discrete-time linear regression model with time-varying parameters,

y(t) = \theta^T(t)\,\varphi(t) + e(t)   (1)

\theta(t) = \theta(t-1) + w(t)   (2)

where $e(t)$ and $w(t)$ stand for the observation error and the parameter change, respectively. Due to the following assumptions, equation (2) describes the evolution of the slowly drifting parameter $\theta(t) \in \mathbb{R}^n$ as a random walk.
Basic Assumptions:

A1 The sequences $\{e(t)\}$, $\{w(t)\}$ and $\{\varphi(t)\}$ are i.i.d. mutually independent sequences of random variables.

A2 The observation error $e(t)$ is unbiased and has a finite variance, that is, $E e(t) = 0$ and $E e^2(t) = \sigma_e^2 \in (0, \infty)$.

A3 The parameter change $w(t)$ is an unbiased variable with positive definite covariance matrix, i.e., $E w(t) = 0$ and $E w(t) w^T(t) = \mu^2 R_w > 0$, where $\mu > 0$ represents the small parameter of the problem under consideration.

A4 The regressor covariance matrix is non-singular, i.e., $E \varphi(t)\varphi^T(t) = Q > 0$; moreover, $E|\varphi(t)|^4 < \infty$.

A5 The initial parameter value $\theta(0)$ is supposed to be fixed (for the sake of simplicity).
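To make the setting concrete, the following sketch (in Python/NumPy, using Gaussian distributions, which is one admissible choice under A1–A5) generates data from the model (1)–(2); the inputs Q_chol and Rw_chol are any square-root factors of $Q$ and $R_w$:

    import numpy as np

    def simulate(T, theta0, Q_chol, Rw_chol, sigma_e, mu, rng):
        """Generate regressors, observations and true parameters from the
        model (1)-(2); Q_chol and Rw_chol are square roots of Q and R_w."""
        n = theta0.size
        theta = theta0.copy()
        Phi, Y, Theta = [], [], []
        for _ in range(T):
            theta = theta + mu * (Rw_chol @ rng.standard_normal(n))  # (2), cov mu^2 R_w
            phi = Q_chol @ rng.standard_normal(n)                    # regressor, cov Q
            y = theta @ phi + sigma_e * rng.standard_normal()        # (1)
            Phi.append(phi)
            Y.append(y)
            Theta.append(theta.copy())
        return np.array(Phi), np.array(Y), np.array(Theta)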
Consider the parameter tracking problem with the performance evaluated as the asymptotic mean square error (MSE). That is, the problem is to design an estimation algorithm which on-line delivers an estimate sequence $\{\bar\theta(t)\}$ on the basis of the observations (1) with a minimal asymptotic error covariance matrix $U$:

U = \lim_{t \to \infty} U_t   (3)

U_t = E(\bar\theta(t) - \theta(t))(\bar\theta(t) - \theta(t))^T   (4)

As is known (see, e.g., [8] and the lower bound below), the matrix $U$ must be proportional to the small parameter $\mu$, that is,

U = \mu U_0 + o(\mu) as $\mu \to 0$.   (5)
Moreover, from the lower bound for the matrix $U_0$ proved in [8] it follows, in particular, that with Gaussian random variables $e(t)$ and $w(t)$

U_0 \ge U_{lb}   (6)

for any parameter estimator, where $U_{lb}$ is a symmetric solution to the equation

U_{lb}\, Q\, U_{lb} = \sigma_e^2 R_w,   (7)

that is,

U_{lb} = \sigma_e\, Q^{-1/2} (Q^{1/2} R_w Q^{1/2})^{1/2} Q^{-1/2}.   (8)

By (6) is meant that $U_0 - U_{lb}$ is a positive semidefinite matrix.
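The closed form (8) is easy to verify numerically. A minimal NumPy check, with arbitrary (hypothetical) test matrices $Q > 0$ and $R_w > 0$, confirms that (8) solves (7):

    import numpy as np

    def psd_sqrt(M):
        """Symmetric square root of a symmetric positive definite matrix."""
        lam, V = np.linalg.eigh(M)
        return (V * np.sqrt(lam)) @ V.T

    rng = np.random.default_rng(0)
    n, sigma_e = 3, 0.5
    G = rng.standard_normal((n, n)); Q = G @ G.T + n * np.eye(n)    # Q > 0
    G = rng.standard_normal((n, n)); R_w = G @ G.T + n * np.eye(n)  # R_w > 0

    Qh = psd_sqrt(Q)
    Qih = np.linalg.inv(Qh)                               # Q^{-1/2}
    U_lb = sigma_e * Qih @ psd_sqrt(Qh @ R_w @ Qh) @ Qih  # eq. (8)
    assert np.allclose(U_lb @ Q @ U_lb, sigma_e**2 * R_w) # eq. (7)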
We further call the matrix $U_0$ in (5) the limiting asymptotic error covariance matrix, since

U_0 = \lim_{\mu \to 0} \mu^{-1} U.   (9)

Below we study how the matrix $U_0$ depends on the parameters both of the problem and of the algorithm described in the following section. We then minimize this matrix over the design parameters of the algorithm.
3 Parameter Tracking Algorithm

We aim at studying the following recursive constant gain SA-like procedure, which we call SLMS (Smoothed LMS):

\hat\theta(t) = \hat\theta(t-1) + \gamma\, \varphi(t)\,(y(t) - \varphi^T(t)\hat\theta(t-1))   (10)

\bar\theta(t) = \bar\theta(t-1) - \mu S\,(\bar\theta(t-1) - \hat\theta(t-1))   (11)

Here $\gamma > 0$ is a scalar step size while $S$ represents an $n \times n$ matrix gain. The relation (10) is exactly the constant (scalar) gain SA algorithm (LMS), while the recursive procedure (11) generates a sequence of smoothed SA estimates.

Special interest might be connected to the particular case of a "scalar matrix" gain $S$, when $S = \alpha I_n$ with a scalar step size $\alpha > 0$ and the identity $n \times n$ matrix $I_n$ (see Subsection 4.1 below). In that case there are no matrix calculations in the algorithm (10), (11), which makes it particularly simple.
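A direct NumPy sketch of the procedure (10), (11) could read as follows; note that the smoothing step (11) uses $\hat\theta(t-1)$ and is therefore applied before the LMS update of the same time step:

    import numpy as np

    def slms(Y, Phi, gamma, mu, S, theta0):
        """SLMS, eqs. (10)-(11): constant-gain LMS followed by exponential
        smoothing with matrix gain mu*S. Returns the smoothed estimates."""
        theta_hat = theta0.copy()   # LMS estimate, eq. (10)
        theta_bar = theta0.copy()   # smoothed estimate, eq. (11)
        out = []
        for t in range(len(Y)):
            theta_bar = theta_bar - mu * (S @ (theta_bar - theta_hat))      # (11)
            phi = Phi[t]
            theta_hat = theta_hat + gamma * phi * (Y[t] - phi @ theta_hat)  # (10)
            out.append(theta_bar.copy())
        return np.array(out)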
Assumptions on the Algorithm Parameters:

B1 $\gamma = o(1)$ as $\mu \to 0$.

B2 $\mu = o(\gamma)$ as $\mu \to 0$.

B3 The matrix $(-S)$ is stable, i.e., the real part of any eigenvalue of $S$ is positive.

Note: Due to assumptions B1–B2, stochastic stability of equations (10), (11) (in the mean-square sense) is obviously ensured (for sufficiently small $\mu$). This implies the existence of the limit in (3) and further on.
4 Main Results

Theorem 1. Let the assumptions A1–A5 and B1–B3 hold, and consider the estimates $\bar\theta(t)$ generated by the algorithm (10), (11). Then the limiting asymptotic error covariance matrix $U_0$, defined by (5), is the solution to the equation

S U_0 + U_0 S^T = R_w + \sigma_e^2\, S Q^{-1} S^T.   (12)

Hence, if $(-S)$ is stable then a unique solution $U_0 = U_0(S)$ to (12) exists which is symmetric and positive definite. Furthermore, the relationship (12) might be considered as an algebraic Riccati equation with respect to $S$. Due to well known properties of the Riccati equation (see, e.g., [4], Lemma 5.1, p. 127) we arrive at the following
Corollary: If the matrix gain $S$ is subject to assumption B3, then the solution $U_0(S)$ to (12) has the following lower bound:

U_0(S) \ge U_{\min} = \sigma_e\, Q^{-1/2} (Q^{1/2} R_w Q^{1/2})^{1/2} Q^{-1/2},   (13)

which coincides with $U_{lb}$ (8) and is attained for $S = S_{\rm opt}$ with

S_{\rm opt} = \sigma_e^{-2}\, U_{\min} Q = \sigma_e^{-1}\, Q^{-1/2} (Q^{1/2} R_w Q^{1/2})^{1/2} Q^{1/2}.   (14)

The corollary above is a special case of Lemma 5.1 in [4]. However, it can be easily proved independently. Indeed, from (12) and the evident matrix inequality

(\sigma_e^2\, S Q^{-1/2} - U_0 Q^{1/2})(\sigma_e^2\, S Q^{-1/2} - U_0 Q^{1/2})^T \ge 0

it directly follows that

U_0\, Q\, U_0 \ge \sigma_e^2 R_w,

where equality is attained for

\sigma_e^2\, S Q^{-1/2} = U_0 Q^{1/2}.   (15)

Consequently, (13) and (14) hold true.
Note: Since both $U_{\min}$ and $Q$ are positive definite matrices, their product $U_{\min} Q$ has only eigenvalues with positive real parts. Hence, the optimal matrix gain $S_{\rm opt}$ (14) meets the stability assumption above.
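The corollary can also be checked numerically by solving (12) with SciPy's Lyapunov solver, reusing the test matrices Q, R_w and U_lb from the sketch in Section 2; for $S = S_{\rm opt}$ the solution of (12) indeed equals $U_{\min} = U_{lb}$:

    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov

    U_min = U_lb                                    # eq. (13) coincides with eq. (8)
    S_opt = U_min @ Q / sigma_e**2                  # eq. (14)

    # U_0(S_opt) from eq. (12):  S U_0 + U_0 S^T = R_w + sigma_e^2 S Q^{-1} S^T
    rhs = R_w + sigma_e**2 * S_opt @ np.linalg.inv(Q) @ S_opt.T
    U_0 = solve_continuous_lyapunov(S_opt, rhs)
    assert np.allclose(U_0, U_min)                  # the bound (13) is attained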
4.1 Scalar Smoothing Gain

Now consider the special case of the "scalar matrix" gain $S = \alpha I_n$, $\alpha > 0$. Then equation (12) implies $U_0 = U_0(\alpha)$ with

U_0(\alpha) = \frac{1}{2}\big(\alpha^{-1} R_w + \alpha\, \sigma_e^2\, Q^{-1}\big).   (16)

Hence, the optimal $\alpha$ in the sense of minimal trace ${\rm Tr}\, U_0$ is as follows:

\alpha_{\rm opt} = \sigma_e^{-1} \Big(\frac{{\rm Tr}\, R_w}{{\rm Tr}\, Q^{-1}}\Big)^{1/2}   (17)

and the minimum trace of the limiting asymptotic error covariance matrix is

{\rm Tr}\, U_0(\alpha_{\rm opt}) = \min_{\alpha > 0} {\rm Tr}\, U_0(\alpha) = \sigma_e\, ({\rm Tr}\, R_w)^{1/2} ({\rm Tr}\, Q^{-1})^{1/2}.   (18)
In general ${\rm Tr}\, U_0(\alpha_{\rm opt}) \ge {\rm Tr}\, U_{lb}$; however, for linearly dependent matrices $R_w^{-1}$ and $Q$, that is,

R_w^{-1} = \lambda Q for some $\lambda \in \mathbb{R}$,   (19)

the traces coincide,

{\rm Tr}\, U_0(\alpha_{\rm opt}) = {\rm Tr}\, U_{lb},   (20)

which means that ${\rm Tr}\, U_0(\alpha)$ attains its lower bound for $\alpha = \alpha_{\rm opt}$ among all possible estimators. The condition (19) is both necessary and sufficient for the equality (20). That follows directly from the well known properties of the corresponding Cauchy–Schwarz inequality for matrix traces [2], that is,

({\rm Tr}\, A B^T)^2 \le {\rm Tr}\, A A^T \cdot {\rm Tr}\, B B^T,   (21)

with equality iff $A$ and $B$ are linearly dependent. The inequality (21) might be applied here for $A = Q^{-1/2}$ and $B^T = (Q^{1/2} R_w Q^{1/2})^{1/2} Q^{-1/2}$.
Note, finally, that under condition (19) the optimal matrix gain $S_{\rm opt}$ (14) is reduced to

S_{\rm opt} = \frac{1}{\sigma_e \sqrt{\lambda}}\, I_n,

which also follows from (17).

Note: If the properties of the regressors $\varphi(t)$ can be chosen, then it is possible to ensure the condition (19) (if the parameter variation $R_w$ is known) by experiment design. Such an experiment would thus give optimal parameter tracking with the simplest possible algorithm.
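Continuing the same numerical sketch, (17) and (18) are immediate; the printed traces coincide only when condition (19) holds:

    import numpy as np

    # scalar gain S = alpha * I_n, eqs. (16)-(18), with Q, R_w, sigma_e as above
    trQinv = np.trace(np.linalg.inv(Q))
    alpha_opt = (np.trace(R_w) / trQinv) ** 0.5 / sigma_e      # eq. (17)
    tr_U0_opt = sigma_e * (np.trace(R_w) * trQinv) ** 0.5      # eq. (18)
    print(tr_U0_opt, np.trace(U_lb))  # equal only when R_w^{-1} = lambda*Q, eq. (19)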
4.2 Proof of Theorem 1

We prove an even more general theorem having its own interest. The generalization consists in introducing a non-singular matrix gain $A$ into procedure (10), that is,

\hat\theta(t) = \hat\theta(t-1) + \gamma A \varphi(t)\,(y(t) - \varphi^T(t)\hat\theta(t-1)).   (22)

Hence, it will be proved that the matrix $U_0$ (5) does not depend on $A$. This result explains why a scalar step size alone is enough for procedure (10), and why we are not able to influence the matrix $U_0$ by the matrix gain $A$ in (22).
Proof: Let the estimates $\hat\theta(t)$ be generated by the more general procedure (22) instead of (10). Let the matrix gain $A$ be non-singular and assume that $(-AQ)$ is stable. Denote the related estimation error covariance matrix by

V_t = E(\hat\theta(t) - \theta(t))(\hat\theta(t) - \theta(t))^T   (23)

and its limit

V = \lim_{t \to \infty} V_t.   (24)

The Lyapunov equation

A Q V + V Q A^T = \gamma \sigma_e^2\, A Q A^T + \gamma^{-1} \mu^2 R_w   (25)

follows directly from well known previous results (see, e.g., [5]). Then assumptions B1, B2 imply the Lyapunov equation

A Q V + V Q A^T = \gamma \sigma_e^2\, A Q A^T + o(\gamma),   (26)

from which it follows, in particular, that^1

\|V\| = O(\gamma) as $\mu \to 0$.   (27)
Furthermore, for the cross covariance matrix

R_t = E(\hat\theta(t) - \theta(t))(\bar\theta(t) - \theta(t))^T   (28)

and its limit

R = \lim_{t \to \infty} R_t   (29)

we obtain from (22), (11) and (1), (2) that

R_t = (I_n - \gamma A Q)\, R_{t-1}\, (I_n - \mu S)^T + \mu V_{t-1} S^T + \mu^2 R_w.   (30)

Letting $t \to \infty$ and taking assumptions B1, B2 into account, we obtain

R = \gamma^{-1} \mu\, Q^{-1} A^{-1} V S^T + o(\mu).   (31)
In a similar manner, we evaluate $U_t$ (defined by (4)) and $U$ (defined by (3)):

U_t = (I_n - \mu S) U_{t-1} (I_n - \mu S)^T + \mu^2 R_w + \mu^2 S V_{t-1} S^T + \mu (I_n - \mu S) R_{t-1}^T S^T + \mu S R_{t-1} (I_n - \mu S)^T

and, taking (27), (31) as well as B1, B2 into account, we find that

S U + U S^T = \mu R_w + R^T S^T + S R + \mu S U S^T + o(\mu).   (32)

Note that (32) is a Lyapunov-type equation with $U$ entering linearly. Hence, $\|U\| = O(\mu)$ as $\mu \to 0$, and substituting (31) into (32) we obtain

S U + U S^T = \mu R_w + \gamma^{-1}\mu\, (S V A^{-T} Q^{-1} S^T + S Q^{-1} A^{-1} V S^T) + o(\mu)
            = \mu R_w + \gamma^{-1}\mu\, S (A Q)^{-1} (A Q V + V Q A^T)(Q A^T)^{-1} S^T + o(\mu).

Finally, using (26), we arrive at

S U + U S^T = \mu R_w + \mu \sigma_e^2\, S Q^{-1} A^{-1} (A Q A^T + o(1)) A^{-T} Q^{-1} S^T + o(\mu)
            = \mu R_w + \mu \sigma_e^2\, S Q^{-1} S^T + o(\mu).   (33)

Thus, the limit matrix $U_0$ defined by (5) meets the equation (12) and does not depend on $A$, and Theorem 1 is proved.
^1 Here and further on we use the matrix norm $\|A\| = ({\rm Tr}\, A A^T)^{1/2}$, which corresponds to the inner product $\langle A, B \rangle = {\rm Tr}\, A B^T$.
5 The SLAMS Algorithm

Let us consider the following modification of the parameter tracking algorithm (10), (11). It contains a natural number $m$ as a parameter:

\hat\theta(t) = \hat\theta(t-1) + \gamma\, \varphi(t)\,(y(t) - \varphi^T(t)\hat\theta(t-1))   (34a)

\tilde\theta(t) = \frac{1}{m} \sum_{\tau=t-m}^{t-1} \hat\theta(\tau)   (34b)

\bar\theta(t) = \begin{cases} \bar\theta(t-m) - \mu S\,(\bar\theta(t-m) - \tilde\theta(t)), & t = km,\ k = 1,2,\dots \\ \bar\theta(t-1) - \mu\,(\bar\theta(t-1) - \hat\theta(t-1)), & t \neq km,\ k = 1,2,\dots \end{cases}   (34c)

Since it is a Smoothing algorithm based on the Averaged estimates from the LMS procedure, we call it SLAMS. Evidently, this algorithm coincides with (10), (11) when $m = 1$. However, when $m > 1$, the procedure (34a)–(34c) takes fewer arithmetic calculations per time unit compared to (10), (11).
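A NumPy sketch of (34a)–(34c) may look as follows (initializing the short histories with the common initial value is a convention of this sketch):

    import numpy as np
    from collections import deque

    def slams(Y, Phi, gamma, mu, S, m, theta0):
        """SLAMS, eqs. (34a)-(34c): LMS estimates, a length-m average, and a
        smoother applying the matrix gain mu*S only every m-th sample."""
        theta_hat = theta0.copy()
        theta_bar = theta0.copy()
        hat_hist = deque([theta0.copy()] * m, maxlen=m)  # theta_hat(t-m..t-1)
        bar_hist = deque([theta0.copy()] * m, maxlen=m)  # theta_bar(t-m..t-1)
        out = []
        for t in range(1, len(Y) + 1):
            if t % m == 0:
                theta_tilde = np.mean(list(hat_hist), axis=0)            # (34b)
                bar_m = bar_hist[0]                                      # theta_bar(t-m)
                theta_bar = bar_m - mu * (S @ (bar_m - theta_tilde))     # (34c), t = km
            else:
                theta_bar = theta_bar - mu * (theta_bar - hat_hist[-1])  # (34c), t != km
            phi = Phi[t - 1]
            theta_hat = theta_hat + gamma * phi * (Y[t - 1] - phi @ theta_hat)  # (34a)
            hat_hist.append(theta_hat.copy())
            bar_hist.append(theta_bar.copy())
            out.append(theta_bar.copy())
        return np.array(out)

The matrix correction costs $O(n^2)$ operations but is applied only every $m$-th step, so with $m = n$ the average cost per update stays proportional to $n$, as noted in the Introduction.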
Theorem 2. Assume that the assumptions A1–A5 and B1–B3 hold, and consider the estimates $\bar\theta(t)$ generated by the algorithm (34a)–(34c). Then for any fixed natural number $m$ the asymptotic error covariance matrix

U^{(m)} = \lim_{t \to \infty} E(\bar\theta(t) - \theta(t))(\bar\theta(t) - \theta(t))^T   (35)

is the solution to the equation

S U^{(m)} + U^{(m)} S^T = m \mu R_w + \frac{\mu \sigma_e^2}{m}\, S Q^{-1} S^T + o(\mu) as $\mu \to 0$.   (36)

Hence, the lower bound to the limiting asymptotic error covariance matrix

U_0^{(m)} = \lim_{\mu \to 0} \mu^{-1} U^{(m)}   (37)

coincides with $U_{lb} = U_{\min}$ (see (8) and (13)), that is,

U_0^{(m)}(S) \ge U_{\min} = \sigma_e\, Q^{-1/2} (Q^{1/2} R_w Q^{1/2})^{1/2} Q^{-1/2}.   (38)

Thus, the lower bound to $U_0^{(m)}$ does not depend on $m$ and is attained for $S = S_{\rm opt}^{(m)}$ with

S_{\rm opt}^{(m)} = m \sigma_e^{-2}\, U_{\min} Q = m \sigma_e^{-1}\, Q^{-1/2} (Q^{1/2} R_w Q^{1/2})^{1/2} Q^{1/2}.   (39)
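As with (14), the claim can be checked numerically (reusing Q, R_w, sigma_e and U_min from the sketches above): solving the limiting form of (36) with $S = S_{\rm opt}^{(m)}$ returns $U_{\min}$ for any $m$:

    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov

    for m in (1, 2, 5, 10):
        S_m = m * U_min @ Q / sigma_e**2                      # eq. (39)
        rhs_m = m * R_w + (sigma_e**2 / m) * S_m @ np.linalg.inv(Q) @ S_m.T
        U0_m = solve_continuous_lyapunov(S_m, rhs_m)          # limiting eq. (36)
        assert np.allclose(U0_m, U_min)                       # bound independent of m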
The proof of Theorem 2 is analogous to that of Theorem 1. Note that a comparison of the equation (36) with (33) shows that the modification of the tracking algorithm suggested above corresponds to a simultaneous $m$-fold increase in the drift covariance matrix $R_w$ and an $m$-fold decrease in the variance of the observation error $\sigma_e^2$ in the right hand side of the Lyapunov equations (12), (33). Since the optimal gain matrix (39) balances the influence of the corresponding summands in the right hand side of (36), this explains why the lower bound (38) does not depend on $m$.

Proof of Theorem 2: Consider a subsequence of time instants $t = km$, $k = 1, 2, \dots$, and first prove the theorem for the partial limit

U^{(m)} = \lim_{k \to \infty} E(\bar\theta(km) - \theta(km))(\bar\theta(km) - \theta(km))^T.   (40)
The estimation error $\bar\delta(t) = \bar\theta(t) - \theta(t)$ for $t = km$ is recursively represented as

\bar\delta(t) = (I_n - \mu S)\, \bar\delta(t-m) + \mu S\, \tilde\delta(t) - (I_n - \mu S) \sum_{\tau=t-m}^{t-1} w(\tau+1).   (41)

Note that the first and the last summands in the r.h.s. of (41) are uncorrelated. Furthermore, the correlation between the second and the last summands is evaluated as $o(\mu^2)$, since by the Cauchy–Schwarz inequality

\|E \tilde\delta(t) w^T(\tau)\| \le \big(\|E \tilde\delta(t) \tilde\delta^T(t)\|\, \|E w(\tau) w^T(\tau)\|\big)^{1/2} = O(\gamma^{1/2}\mu) = o(\mu).

Hence, the covariance matrix (40) meets the equation

S U^{(m)} + U^{(m)} S^T = m \mu R_w + \tilde R_m^T S^T + S \tilde R_m + o(\mu),   (42)

where

\tilde R_m = \lim_{k \to \infty} E\, \tilde\delta(km)\, \bar\delta^T(km - m).   (43)

We move the trivial but rather bulky calculations to the Appendix, where it is proved that

\tilde R_m = \lim_{k \to \infty} E\, \tilde\delta(km)\, \bar\delta^T(km) + o(\mu) = \frac{\mu \sigma_e^2}{2m}\, Q^{-1} S^T + o(\mu).   (44)

Substituting (44) into (42), we arrive at (36) for the partial limit considered.
Now consider the subsequence of time instants $t = km + 1$, $k = 1, 2, \dots$; then we have for the estimation error $\bar\delta(t)$ the simpler recursive-like equation

\bar\delta(t) = (1 - \mu)\, \bar\delta(t-1) + \mu\, \hat\delta(t-1) - w(t),   (45)

from which it follows that

\lim_{k \to \infty} E\, \bar\delta(km+1)\, \bar\delta^T(km+1) = U^{(m)} + O(\mu^2)   (46)

with $U^{(m)}$ as defined by (40); hence, the partial limit (46) also meets the equation (36).

Continuing this finite induction, we obtain

\lim_{k \to \infty} E\, \bar\delta(km+s)\, \bar\delta^T(km+s) = U^{(m)} + O(\mu^2), s = 1, 2, \dots, m-1,

with $U^{(m)}$ as defined by (40). This proves the equation (36) for any partial limit of the matrix sequence $\{U(t)\}$.
The rest of the Theorem is proved in completely the same manner as the corresponding part of Theorem 1.

6 Conclusions

From the obtained results it follows that the optimal limiting asymptotic error covariance matrix $U_{\min}$ (13) for the SLAMS algorithm (34a)–(34c) coincides with the lower bound $U_{lb}$ (8). Thus, Theorem 1 and the lower bound (8) imply that under Gaussian distributions of $e(t)$ and $w(t)$ the algorithm (10), (11) with the optimal matrix gain $S = S_{\rm opt}$ (14) delivers asymptotically optimal estimates among all possible estimators.
An interesting theoretical aspect of this is that it is possible to achieve optimal accuracy with an algorithm that is considerably simpler than the optimal Kalman-filter based algorithm. This might also prove useful in certain practical applications.

It might be seen as a paradox that the result is independent of the integer $m$, which also governs the algorithm complexity. One should bear in mind that the result is asymptotic in $\mu \to 0$. For fixed, non-zero $\mu$ there will be an upper limit on $m$ for which the limit expression is a good approximation of the true covariance matrix.
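As an end-to-end illustration, the sketches above can be combined into a small Monte Carlo experiment; under the assumptions of Theorem 2 the empirical steady-state error covariance should have trace comparable to $\mu\, {\rm Tr}\, U_{\min}$ (the particular values of $T$, $\mu$ and $\gamma$ below are hypothetical choices respecting B1–B2):

    import numpy as np

    # reuses simulate, psd_sqrt, slams and Q, R_w, sigma_e, S_opt, U_min from above
    rng = np.random.default_rng(1)
    n = Q.shape[0]
    T, mu = 400_000, 0.01
    gamma = mu ** 0.5                 # B1-B2: gamma -> 0 and mu = o(gamma)
    m = n                             # matrix correction only every n-th sample
    Phi, Y, Theta = simulate(T, np.zeros(n), psd_sqrt(Q), psd_sqrt(R_w),
                             sigma_e, mu, rng)
    est = slams(Y, Phi, gamma, mu, m * S_opt, m, np.zeros(n))  # S_opt^(m), eq. (39)
    err = est[T // 2:] - Theta[T // 2:]                        # discard the transient
    emp_cov = err.T @ err / len(err)
    print(np.trace(emp_cov), mu * np.trace(U_min))             # comparable for small mu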
Appendix

Proof of (44): First, note that we may use the relations (23)–(27) from the proof of Theorem 1, since the procedure (22) coincides with that of (34a) under $A = I_n$. Thus, putting $A = I_n$ and introducing the tracking error for $\hat\theta(t)$ as

\hat\delta(t) = \hat\theta(t) - \theta(t),   (47)

we obtain the solution to (25):

V = \lim_{t \to \infty} E\, \hat\delta(t)\, \hat\delta^T(t) = \frac{\gamma}{2} \sigma_e^2\, I_n + o(\gamma)   (48)

and consequently

\lim_{t \to \infty} \|E\, \hat\delta(t)\, \hat\delta^T(t)\| = O(\gamma) as $\mu \to 0$.   (49)

Note that (49) can be extended to

\lim_{t \to \infty} \|E\, \hat\delta(t)\, \hat\delta^T(t+\ell)\| = O(\gamma) as $\mu \to 0$, \forall \ell = 1, \dots, m.   (50)

Indeed, for $\ell = 1$ we obtain the equation

E\, \hat\delta(t)\, \hat\delta^T(t+1) = E\, \hat\delta(t)\, \hat\delta^T(t)\, (I_n - \gamma Q),   (51)

which follows from the straightforward recursive equation

\hat\delta(t) = \big(I_n - \gamma\, \varphi(t)\varphi^T(t)\big)\big(\hat\delta(t-1) - w(t)\big) + \gamma\, \varphi(t) e(t).   (52)

Furthermore, from (50) and (34b) the evaluation

\lim_{t \to \infty} \|E\, \tilde\delta(t)\, \tilde\delta^T(t)\| = O(\gamma) as $\mu \to 0$   (53)

follows directly for the second tracking error (i.e., the one for $\tilde\theta(t)$),

\tilde\delta(t) = \tilde\theta(t) - \theta(t) = \frac{1}{m} \sum_{\tau=t-m}^{t-1} \Big(\hat\delta(\tau) - \sum_{s=\tau}^{t-1} w(s+1)\Big),   (54)

since for each $s$ and $\tau$ under consideration

\|E\, w(s+1)\, w^T(s+1)\| = O(\mu^2) = o(\gamma)

and

\|E\, \hat\delta(\tau)\, w^T(s+1)\| \le \big(\|E\, \hat\delta(\tau)\, \hat\delta^T(\tau)\|\, \|E\, w(s+1)\, w^T(s+1)\|\big)^{1/2} \to_{t \to \infty} O(\gamma^{1/2}\mu) = o(\gamma).
Now we are ready to directly evaluate (43), that is,

\tilde R_m = \lim_{k \to \infty} E\, \tilde\delta(km)\, \bar\delta^T(km - m).   (55)

Taking (54) into account, we obtain for $t = km$

E\, \tilde\delta(t)\, \bar\delta^T(t-m) = \frac{1}{m} \sum_{\tau=t-m}^{t-1} E\Big(\hat\delta(\tau) - \sum_{s=\tau}^{t-1} w(s+1)\Big) \bar\delta^T(t-m) \to_{k \to \infty} (1 + O(\gamma)) \lim_{k \to \infty} E\, \hat\delta(km)\, \bar\delta^T(km) + O(\gamma)\, U^{(m)},   (56)

since, due to (52), we obtain for the $t$, $\tau$ and $s$ under consideration

E\, \hat\delta(\tau)\, \bar\delta^T(t-m) = (I_n - \gamma Q)\, E\, \hat\delta(\tau-1)\, \bar\delta^T(t-m) = \dots = (I_n - \gamma Q)^{\tau-t+m}\, E\, \hat\delta(t-m)\, \bar\delta^T(t-m) \to_{k \to \infty} (1 + O(\gamma)) \lim_{k \to \infty} E\, \hat\delta(km)\, \bar\delta^T(km)

and $E\, w(s+1)\, \bar\delta^T(t-m) = 0$.
Thus, we have to evaluate

\hat R_m = \lim_{k \to \infty} E\, \hat\delta(km)\, \bar\delta^T(km).   (57)

Denote

\tilde w(t) = \sum_{\tau=t-m}^{t-1} w(\tau+1).   (58)

Then, by (41),

E\, \hat\delta(t)\, \bar\delta^T(t) = E\, \hat\delta(t) \big[(I_n - \mu S)(\bar\delta(t-m) - \tilde w(t)) + \mu S\, \tilde\delta(t)\big]^T   (59)
  = E\, \hat\delta(t)\, (\bar\delta(t-m) - \tilde w(t))^T (I_n - \mu S)^T   (60)
  + \mu\, E\, \hat\delta(t)\, \tilde\delta^T(t)\, S^T.   (61)

Consequently, applying (52) to the expectation term in (60), we obtain

E\, \hat\delta(t)\, \bar\delta^T(t-m) = (I_n - \gamma Q)\, E\, \hat\delta(t-1)\, \bar\delta^T(t-m) = \dots = (I_n - \gamma Q)^m\, E\, \hat\delta(t-m)\, \bar\delta^T(t-m)   (62)

and, due to (52) and (58),

E\, \hat\delta(t)\, \tilde w^T(t) = \sum_{\tau=t-m}^{t-1} E\, \hat\delta(t)\, w^T(\tau+1)   (63)
  = \sum_{\tau=t-m}^{t-1} (I_n - \gamma Q)^{t-\tau-1}\, E\, \hat\delta(\tau+1)\, w^T(\tau+1) = -\mu^2 \sum_{\tau=t-m}^{t-1} (I_n - \gamma Q)^{t-\tau-1} R_w = O(\mu^2).   (64)

At last, for the expectation term in (61), which turns out to be of the main order $O(\gamma)$, we obtain (recalling that we consider $t = km$)

E\, \hat\delta(t)\, \tilde\delta^T(t) = \frac{1}{m} \sum_{\tau=t-m}^{t-1} E\, \hat\delta(t) \Big(\hat\delta(\tau) - \sum_{s=\tau}^{t-1} w(s+1)\Big)^T   (65)
  = \frac{1}{m} \sum_{\tau=t-m}^{t-1} E\, \hat\delta(t)\, \hat\delta^T(\tau) - \frac{1}{m} \sum_{\tau=t-m}^{t-1} \sum_{s=\tau}^{t-1} E\, \hat\delta(t)\, w^T(s+1).   (66)

Here, for each $\tau, s = t-m, \dots, t-1$, we obtain from (48) and (51)

E\, \hat\delta(t)\, \hat\delta^T(\tau) = (I_n - \gamma Q)^{t-\tau}\, E\, \hat\delta(\tau)\, \hat\delta^T(\tau) \to_{k \to \infty} \frac{\gamma}{2} \sigma_e^2\, I_n + o(\gamma)

and (similarly to (63)–(64))

\|E\, \hat\delta(t)\, w^T(s+1)\| \to_{k \to \infty} O(\mu^2).

Hence, from (65)–(66) it follows that

E\, \hat\delta(t)\, \tilde\delta^T(t) \to_{k \to \infty} \frac{\gamma}{2} \sigma_e^2\, I_n + o(\gamma).   (67)
Taking (59)–(64), (67) and (57) into account, we arrive at the following equation:

\hat R_m = (I_n - \gamma Q)^m\, \hat R_m + \mu \Big(\frac{\gamma}{2} \sigma_e^2\, I_n + o(\gamma)\Big) S^T + O(\mu^2).   (68)

Since

(I_n - \gamma Q)^m = I_n - m \gamma Q + O(\gamma^2) as $\gamma \to 0$,

the asymptotic solution to the equation (68) is as follows:

\hat R_m = \frac{\mu \sigma_e^2}{2m}\, Q^{-1} S^T + o(\mu).   (69)

Thus, combining (43), (56), (57) and (69), we arrive at (44).
Acknowledgement: This work was supported by the Swedish Research Council under the contract on System Modeling. The authors would also like to thank Prof. J. P. LeBlanc (Luleå) for his creative acronyms.
References

[1] H. J. Kushner and J. Yang. Stochastic approximation with averaging of the iterates: Optimal asymptotic rate of convergence for general processes. SIAM Journal on Control and Optimization, 31(4):1045–1062, 1993.

[2] P. Lancaster and M. Tismenetsky. The Theory of Matrices, 2nd ed. Academic Press, Boston, San Diego, 1985.

[3] L. Ljung. Recursive least-squares and accelerated convergence in stochastic approximation schemes. Int. J. Adaptive Control and Signal Processing, 15(2):169–178, March 2001.

[4] L. Ljung and T. Glad. Control Theory. Multivariable and Nonlinear Methods. Taylor & Francis, London and New York, 2000.

[5] L. Ljung and S. Gunnarsson. Adaptive tracking in system identification – a survey. Automatica, 26(1):7–22, 1990.

[6] L. Ljung and T. Söderström. Theory and Practice of Recursive Identification. MIT Press, Cambridge, Mass., 1983.

[7] A. M. Lyapunov. Stability of Motion. Taylor & Francis, London, 1992.

[8] A. V. Nazin and A. B. Yuditskii. Optimal and robust estimation of slowly drifting parameters in linear regression. Automation and Remote Control, 52(6, Part 1):798–807, 1991.

[9] B. T. Polyak and A. B. Juditsky. Acceleration of stochastic approximation by averaging. SIAM Journal on Control and Optimization, 30(4):838–855, 1992.

[10] B. T. Polyak. New method of stochastic approximation type. Automation and Remote Control, 51(7):937–946, Part 2, July 1990.

[11] D. Ruppert. Efficient estimations from a slowly convergent Robbins–Monro process. Technical Report No. 781, Cornell University, 1988.

[12] B. Widrow and S. Stearns. Adaptive Signal Processing. Prentice-Hall, Englewood Cliffs, NJ, 1985.