## Performance Analysis of General Tracking Algorithms

### Lei Guo

### Institute of Systems Science, Chinese Academy of Sciences, Beijing 100080, China

### Lennart Ljung

### Department of Electrical Engineering, Linköping University, S-581 83 Linköping, Sweden

### Abstract

A general family of tracking algorithms for linear regression models is studied. It includes the familiar LMS (gradient approach), RLS (recursive least squares) and KF (Kalman filter) based estimators. The exact expressions for the quality of the obtained estimates are complicated. Approximate, and easy-to-use, expressions for the covariance matrix of the parameter tracking error are developed. These are applicable over the whole time interval, including the transient, and the approximation error can be explicitly calculated.

### I. Introduction

Tracking is the key factor in adaptive algorithms of all kinds. We shall in this contribution study the special case where the underlying model is a linear regression, i.e., the observations are related by

$$y_k = \varphi_k^T\theta_k + v_k, \qquad k \geq 0. \tag{1}$$

Here $y_k$ is an observation made at time $k$, $\varphi_k$ is a $d$-dimensional vector that is known at time $k$, $v_k$ represents a disturbance, and the parameter vector $\theta_k$ describes how the components of $\varphi_k$ relate to the observation $y_k$. It is the objective to estimate the vector $\theta_k$ from the measurements $\{y_t, \varphi_t,\ t \leq k\}$.

Many technical problem formulations fit the structure (1) by choosing $\varphi_k$ and $y_k$ appropriately. See, among many references, for example, [15] and [22].

In order to come up with good algorithms for estimating $\theta_k$, it is natural to introduce some assumptions about the time-variation of this parameter vector. In general we may write

$$\theta_k = \theta_{k-1} + \gamma w_k \tag{2}$$

where $\gamma$ is a scaling constant and $w_k$ is an as yet undefined variable.

The tracking algorithms will provide us with an estimate

$$\hat\theta_k = \hat\theta_k(y^k, \varphi^k) \tag{3}$$

where the superscript $k$ denotes the whole time history: $y^k = \{y_0, y_1, \ldots, y_k\}$, etc.

The work was supported by the National Natural Science Foundation of China and by the Swedish Research Council for Engineering Sciences (TFR).

A prime question concerns of course the quality of such an estimate. We shall evaluate the quality in terms of the covariance matrix of the tracking error

$$\tilde\theta_k = \theta_k - \hat\theta_k. \tag{4}$$

This covariance matrix will be denoted by

$$\Pi_k^0 = E[\tilde\theta_k\tilde\theta_k^T] \tag{5}$$

where expectation is taken over all relevant stochastic variables. A precise definition will be given later.

An exact expression for $\Pi_k^0$ will be very complicated (except in some trivial cases) and it will not be possible to derive it explicitly in closed form. However, the practical importance of having good tracking algorithms and estimates of their quality still makes it vital to be able to work with $\Pi_k^0$.

For that reason, there is a quite substantial literature on the problem of how to approximate $\Pi_k^0$ with expressions $\Pi_k$ that are simple to work with. This literature is (partly) surveyed in [2], [1], [12], and [20].

The current paper has the ambition to give a general result that subsumes and extends most of the earlier results.

Example 1.1 (A Preview Example). Consider the model (1)-(2) under the assumptions that

a). $\varphi_k$ and $\theta_k$ are scalars;

b). $\{\varphi_k\}$, $\{v_k\}$ and $\{w_k\}$ are independent sequences of independent random variables with zero mean values and variances $R_\varphi$, $R_v$ and $Q_w$, respectively;

c). The fourth moment of $\varphi_k$ is $R_4$.

Assume also that the estimate $\hat\theta_k$ is computed by the simple LMS algorithm

$$\hat\theta_{k+1} = \hat\theta_k + \mu\varphi_k(y_k - \varphi_k\hat\theta_k). \tag{6}$$

This case is one (essentially the only one) where a simple exact expression for $\Pi_k^0$ can be calculated. Straightforward calculations give

$$\tilde\theta_{k+1} = (1 - \mu\varphi_k^2)\tilde\theta_k - \mu\varphi_kv_k + \gamma w_{k+1}. \tag{7}$$

Squaring and taking expectations gives

$$\Pi_{k+1}^0 = (1 - 2\mu R_\varphi + \mu^2R_4)\Pi_k^0 + \mu^2R_\varphi R_v + \gamma^2Q_w. \tag{8}$$

This is a linear time-invariant difference equation for $\Pi_k^0$, and it can be explicitly solved. In particular, if $|1 - 2\mu R_\varphi + \mu^2R_4| < 1$ the solution of (8) will converge to $\Pi^*$ with

$$\Pi^* = \frac{1}{1 - \mu R_4/(2R_\varphi)}\,\bar\Pi, \qquad \bar\Pi = \frac{1}{2\mu R_\varphi}\big[\mu^2R_\varphi R_v + \gamma^2Q_w\big]. \tag{9}$$

Simple manipulations then give

$$|\Pi^* - \bar\Pi| \leq \bar\Pi\,\delta(\mu), \qquad \delta(\mu) = \frac{\mu R_4/(2R_\varphi)}{1 - \mu R_4/(2R_\varphi)}.$$

Thus, $\Pi^*$ can be well approximated by $\bar\Pi$ for small $\mu$, since $\delta(\mu) \to 0$ as $\mu \to 0$. $\Box$
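The scalar recursion (8) and its limit can be checked numerically. The sketch below uses illustrative parameter values that are not taken from the paper ($R_4 = 3R_\varphi^2$ would correspond, e.g., to Gaussian regressors); it iterates (8) and compares the limit with $\Pi^*$ from (9) and with the small-$\mu$ approximation $\bar\Pi$:

```python
# Iterate the exact covariance recursion (8) for scalar LMS and compare
# its limit with Pi* from (9) and the small-mu approximation Pi_bar.
# All parameter values below are illustrative assumptions.
mu, gamma = 0.05, 0.01
R_phi, R_v, Q_w = 1.0, 0.5, 1.0
R_4 = 3.0 * R_phi**2                        # fourth moment of phi_k

a = 1.0 - 2.0 * mu * R_phi + mu**2 * R_4    # contraction factor in (8)
b = mu**2 * R_phi * R_v + gamma**2 * Q_w    # driving term in (8)
assert abs(a) < 1.0                         # convergence condition

Pi = 0.0
for _ in range(5000):                       # run (8) past the transient
    Pi = a * Pi + b

Pi_star = b / (1.0 - a)                     # exact limit, i.e. (9)
Pi_bar = b / (2.0 * mu * R_phi)             # small-mu approximation
x = mu * R_4 / (2.0 * R_phi)
delta = x / (1.0 - x)                       # relative error bound delta(mu)
```

Here $|\Pi^* - \bar\Pi| = \bar\Pi\,\delta(\mu)$ holds exactly, which is why $\bar\Pi$ is an adequate substitute for small $\mu$.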

Now, this example was particularly easy, primarily because of the assumed independence among $\{\varphi_k, v_k, w_k\}$, which makes $\varphi_k$ and $\tilde\theta_k$ independent.

In more general cases we have to deal with dependence among $\{\varphi_k\}$, and that is actually at the root of the problem. Generally speaking, if $\{\varphi_k\}$ are weakly dependent, so should $\varphi_k$ and $\tilde\theta_k$ be, provided that $\hat\theta_k$ in (3) depends only to a small extent on the "latest" $\varphi_k$, i.e. if the adaptation rate ($\mu$ in the example) is small and the error equation ((7) in the example) is stable.

The extra term caused by the dependence in the equation corresponding to (8) in the example should then have negligible influence. Indeed, it is the purpose of this contribution to establish this for a fairly general family of tracking algorithms. Despite the simple idea, it turns out to be surprisingly technically difficult to prove. This paper could be said to mark the end of a series of results on performance analysis, starting with Theorem 1 in [12] and then followed by [14], [13] and [10]. There are many related, relevant results using other approaches. We may point to [20], [2], [5], [6], [4], [16], [3], [18], and to the references in these books and papers.

The bottom line of the analysis is a result of the character

$$\|E[\tilde\theta_k\tilde\theta_k^T] - \Pi_k\| \leq \delta(\mu)\,\|\Pi_k\| \tag{10}$$

where $\delta(\mu) \to 0$ as $\mu \to 0$, $\mu$ is a measure of the adaptation rate in the algorithm, and $\Pi_k$ obeys a simple linear, deterministic difference equation (like (8) without the term $\mu^2R_4$).

The point with a result of the character (10) is, clearly, that we can arbitrarily well approximate the actual tracking error covariance matrix with a simple expression that can be easily evaluated and analyzed. The essence of this paper does not lie in the expression for $\Pi_k$ itself; it is not difficult to conjecture that such an approximation should be reasonable. Our contribution is rather to establish the connection in the explicit fashion (10) for a wide family of the most common tracking algorithms. One important step in achieving such results is to first establish that the underlying algorithm is exponentially stable. This is a major problem in itself, and a companion paper [9] is devoted to this step, for the same family of algorithms.

The paper is organized as follows. In Section 2 the tracking algorithms are briefly described. Section 3 gives the main result: that (10) holds under the same general conditions for all algorithms in the family. There we also briefly discuss the practical consequences of the result. In the following section, a more general theorem is presented, which is the basis for the analysis. This theorem is more general, and uses weaker but less explicit conditions. The proof of the main result is then given in Section 5, by showing that the general theorem can be applied to our family of algorithms. Notice that this analysis is of independent interest in that for each individual algorithm, the conditions can be somewhat weakened in different ways.

### II. The Family of Tracking Algorithms

We shall consider the general adaptation algorithm

$$\hat\theta_{k+1} = \hat\theta_k + \mu L_k(y_k - \varphi_k^T\hat\theta_k), \qquad \mu \in (0,1) \tag{11}$$

where the gain $L_k$ is chosen in some different ways:

Case 1: Least Mean Squares (LMS):

$$L_k = \varphi_k. \tag{12}$$

This is a standard algorithm, [21], [22], and has been used in numerous adaptive signal processing applications.

Case 2: Recursive Least Squares (RLS):

$$L_k = P_k\varphi_k \tag{13}$$

$$P_k = \frac{1}{1-\mu}\left[P_{k-1} - \mu\,\frac{P_{k-1}\varphi_k\varphi_k^TP_{k-1}}{1 - \mu + \mu\varphi_k^TP_{k-1}\varphi_k}\right] \tag{14}$$

$$P_0 > 0. \tag{15}$$

This gives an estimate $\hat\theta_k$ that minimizes

$$\sum_{t=1}^{k}(1-\mu)^{k-t}(y_t - \varphi_t^T\theta)^2$$

where $(1-\mu)$ is the "forgetting factor".

Case 3: Kalman Filter (KF) Based Algorithm:

$$L_k = \frac{P_{k-1}\varphi_k}{R + \mu\varphi_k^TP_{k-1}\varphi_k} \tag{16}$$

$$P_k = P_{k-1} - \mu\,\frac{P_{k-1}\varphi_k\varphi_k^TP_{k-1}}{R + \mu\varphi_k^TP_{k-1}\varphi_k} + \mu Q \tag{17}$$

$$(R > 0, \quad Q > 0). \tag{18}$$

Here $R$ is a positive number and $Q$ is a positive definite matrix. This choice of $L_k$ corresponds to Kalman filter state estimation for (1)-(2), and is optimal in the a posteriori mean square sense if $v_k$ and $w_k$ are Gaussian white noises with covariance matrices $R$ and $Q$, respectively, and if $\gamma$ in (2) is chosen as $\mu$.
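In code, the three cases differ only in how $L_k$ (and $P_k$) are computed, while the parameter update (11) is shared. A minimal sketch follows; the function names, the step sizes and the test signal are our own illustrative choices, not from the paper:

```python
import numpy as np

def lms_gain(phi):
    # (12): L_k = phi_k
    return phi

def rls_gain(phi, P, mu):
    # (13)-(14): L_k = P_k phi_k, with the forgetting-factor update of P_k
    denom = 1.0 - mu + mu * (phi @ P @ phi)
    P_new = (P - mu * np.outer(P @ phi, P @ phi) / denom) / (1.0 - mu)
    return P_new @ phi, P_new

def kf_gain(phi, P_prev, mu, R, Q):
    # (16)-(17): Kalman-filter based gain; note that L_k uses P_{k-1}
    denom = R + mu * (phi @ P_prev @ phi)
    L = P_prev @ phi / denom
    P_new = P_prev - mu * np.outer(P_prev @ phi, P_prev @ phi) / denom + mu * Q
    return L, P_new

def update(theta_hat, phi, y, L, mu):
    # (11): common parameter update for all three algorithms
    return theta_hat + mu * L * (y - phi @ theta_hat)
```

Since $P$ enters the corrections as $P\varphi(P\varphi)^T = P\varphi\varphi^TP$ for symmetric $P$, the updates keep $P_k$ symmetric.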

If $\{\varphi_k, y_k, \theta_k\}$ obey (1)-(2) and $\hat\theta_k$ is found using (11), we can write the estimation error $\tilde\theta_k$ as

$$\tilde\theta_{k+1} = (I - \mu F_k)\tilde\theta_k - \mu L_kv_k + \gamma w_{k+1}, \qquad F_k = L_k\varphi_k^T. \tag{19}$$

This is a purely algebraic consequence of (1)-(2) and (11), and holds for whatever sequences $v_k$ and $w_k$.

If we introduce stochastic assumptions about $\{v_k\}$ and $\{w_k\}$, we can use (19) to express the covariance matrix $E[\tilde\theta_{k+1}\tilde\theta_{k+1}^T]$. That will however be quite complex, primarily due to the dependence between $\{L_k, \varphi_k, \tilde\theta_k\}$. The basic approximating expression will instead be based on the following recursion:

$$\Pi_{k+1} = (I - \mu G_k)\Pi_k(I - \mu G_k)^T + \mu^2R_v(k)M_k + \gamma^2Q_w(k+1) \tag{20}$$

where $G_k = EF_k$, $M_k = E[L_kL_k^T]$, $R_v(k) = Ev_k^2$ and $Q_w(k) = E[w_kw_k^T]$. As follows from Example 1.1, this would be the correct expression for the covariance matrix of $\tilde\theta_{k+1}$ if $v_k$ and $w_k$ were white noises and $L_k\varphi_k^T$ was independent of $\tilde\theta_k$, and if a term of size $\mu^2\Pi_k$ was neglected.

Indeed, we shall prove that (20) provides a good approximation of the true covariance matrix in the sense that (10) holds. Note that $\Pi_k$ obeys a simple linear difference equation, and can easily be calculated and examined.

### III. The Main Result

### A. The Assumptions

We shall now consider the algorithm (11) with either of the three choices of the gain $L_k$ discussed in the previous section. For the analysis we shall impose some conditions on the involved variables. These are of the following character.

C1. The regressors $\{\varphi_k\}$ span the regressor space (in order to ensure that the whole parameter vector can be estimated);

C2. The dependence between the regressors $\varphi_k$ and $(\varphi_i, v_{i-1}, w_i)$ decays to zero as the time distance $(k - i)$ tends to infinity;

C3. The measurement error $v_k$ and the parameter drift $w_k$ are of white noise character.

In more exact terms, the three assumptions take the following form:

P1. Let $S_t = E[\varphi_t\varphi_t^T]$, and assume that there exist constants $h > 0$ and $\delta > 0$ such that

$$\sum_{t=k+1}^{k+h}S_t \geq \delta I \qquad \forall k.$$

P2. Let $\mathcal{G}_k = \sigma\{\varphi_j,\ j \geq k\}$ and $\mathcal{F}_k = \sigma\{\varphi_i, v_{i-1}, w_i,\ i \leq k\}$. Assume that $\{\varphi_k\}$ is weakly dependent ($\phi$-mixing) in the sense that there is a function $\phi(m)$ with $\phi(m) \to 0$ as $m \to \infty$, such that

$$\sup_{A\in\mathcal{G}_{k+m},\,B\in\mathcal{F}_k}|P(A\,|\,B) - P(A)| \leq \phi(m) \qquad \forall k,\ \forall m. \tag{21}$$

Also, assume that there is a constant $c_\varphi > 0$ such that $\|\varphi_k\| \leq c_\varphi$ a.s. $\forall k$.

P3. Let $\mathcal{F}_k$ be the $\sigma$-algebra defined in P2, and assume that

$$E[v_k\,|\,\mathcal{F}_k] = 0, \qquad E[w_{k+1}\,|\,\mathcal{F}_k] = E[w_{k+1}v_k\,|\,\mathcal{F}_k] = 0,$$

$$E[v_k^2\,|\,\mathcal{F}_k] = R_v(k), \qquad E[w_kw_k^T] = Q_w(k),$$

$$\sup_k\big\{E[|v_k|^r\,|\,\mathcal{F}_k] + E\|w_k\|^r\big\} \leq M \quad \text{for some } r > 2,\ M > 0.$$

### B. The Result

Now, let $\Pi_k$ be defined by the following linear, deterministic difference equation:

$$\Pi_{k+1} = (I - \mu R_kS_k)\Pi_k(I - \mu R_kS_k)^T + \mu^2R_v(k)R_kS_kR_k + \gamma^2Q_w(k+1) \tag{22}$$

where $S_k = E[\varphi_k\varphi_k^T]$, and $R_k$ is defined as follows:

LMS-case:

$$R_k = I \tag{23}$$

RLS-case:

$$R_k = R_{k-1} - \mu R_{k-1}S_kR_{k-1} + \mu R_{k-1} \qquad (R_0 = P_0) \tag{24}$$

KF-case:

$$R_k = R_{k-1} - \mu R_{k-1}S_kR_{k-1} + \mu Q/R \qquad (R_0 = P_0/R). \tag{25}$$

We then have the following main result.
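The recursion (22) together with (23)-(25) is straightforward to evaluate numerically. A sketch follows; the dimension, the regressor covariance $S$, the noise parameters and the KF design pair $(R, Q)$ are illustrative assumptions, not values from the paper:

```python
import numpy as np

d = 2
mu, gamma = 0.05, 0.01
S = np.diag([1.0, 2.0])            # stationary regressor covariance S_k = S
R_v = 0.5                          # measurement noise variance R_v(k)
Q_w = np.eye(d)                    # drift covariance Q_w(k)
R_kf, Q_kf = 0.5, np.eye(d)        # KF design parameters (R > 0, Q > 0)

def r_step(Rk, case):
    """One step of (23)-(25)."""
    if case == "LMS":
        return np.eye(d)
    if case == "RLS":
        return Rk - mu * Rk @ S @ Rk + mu * Rk
    return Rk - mu * Rk @ S @ Rk + mu * Q_kf / R_kf          # KF-case

def pi_step(Pi, Rk):
    """One step of (22)."""
    A = np.eye(d) - mu * Rk @ S
    return A @ Pi @ A.T + mu**2 * R_v * Rk @ S @ Rk + gamma**2 * Q_w

results = {}
for case in ("LMS", "RLS", "KF"):
    Pi, Rk = np.zeros((d, d)), np.eye(d)
    for _ in range(4000):
        Rk = r_step(Rk, case)
        Pi = pi_step(Pi, Rk)
    results[case] = (Pi, Rk)
```

For stationary regressors the RLS gain matrix $R_k$ settles at $S^{-1}$, and the corresponding $\Pi_k$ settles near the stationary value discussed in Section III.G.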

Theorem 3.1: Consider any of the three basic algorithms in Section 2. Assume that P1, P2 and P3 hold, and let $\Pi_k$ be defined as above. Then $\forall\mu \in (0, \mu^*)$, $\forall k \geq 1$,

$$\|E[\tilde\theta_k\tilde\theta_k^T] - \Pi_k\| \leq c\,\big[\mu\delta(\mu) + \mu^2 + (1-\alpha\mu)^k\big] \tag{26}$$

where $\delta(\mu) \to 0$ (as $\mu \to 0$) is defined by

$$\delta(\mu) \triangleq \min_{m\geq 1}\big\{\sqrt{\mu}\,m + \phi(m)\big\} \tag{27}$$

with $\phi(m)$ as defined in P2, and $\alpha \in (0,1)$, $\mu^* \in (0,1)$, $c > 0$ are constants which may be computed using properties of $\{\varphi_k, v_k, w_k\}$.

The proof is given in Section 5. Let us now discuss the conditions used in the above theorem.

### C. The Degree of Approximation

First of all, it is clear that the quantity $\delta(\mu)$ plays an important role. The faster it tends to zero, the better the approximation obtained. The rate at which it tends to zero is, according to (27), a reflection of how fast $\phi(m)$ (that is, the dependence among the regressors) tends to zero as $m$ increases. For example, if the regressors are $m$-dependent, so that $\varphi_k$ and $\varphi_\ell$ are independent for $|k - \ell| > m$, then $\phi(n) = 0$ for $n > m$ and $\delta(\mu)$ will behave like $\sqrt{\mu}$. Also, if the dependence is exponentially decaying ($\phi(m) \leq Ce^{-\alpha m}$), then we can find that

$$\delta(\mu) < C'\mu^{0.5-\varepsilon}$$

for arbitrarily small, positive $\varepsilon$. This gives a good picture of typical decay rates of $\delta(\mu)$.
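The definition (27) is easy to evaluate numerically. For instance, for the exponentially decaying mixing rate $\phi(m) = Ce^{-\alpha m}$ mentioned above (the constants $C = 1$ and $\alpha = 0.5$ are illustrative choices):

```python
import math

def delta(mu, C=1.0, alpha=0.5, m_max=10_000):
    # (27): delta(mu) = min over m >= 1 of sqrt(mu)*m + phi(m),
    # here with the exponentially decaying phi(m) = C*exp(-alpha*m)
    return min(math.sqrt(mu) * m + C * math.exp(-alpha * m)
               for m in range(1, m_max + 1))

vals = [delta(mu) for mu in (1e-1, 1e-2, 1e-3, 1e-4)]
```

The computed values decrease toward zero roughly like $\mu^{0.5-\varepsilon}$, in line with the bound above.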

### D. Persistence of Excitation: Condition P1

Condition P1 is quite natural and weak, just requiring the regressor covariance matrix to add up to full rank over a given time span of arbitrary length. It has been known to be a necessary condition (in a certain sense) for boundedness of $E\|\tilde\theta_k\|^2$ generated by LMS (cf. [8]); it is also known to be the minimum excitation condition needed for the stability analysis of RLS (cf. [10]).

### E. Boundedness and $\phi$-Mixing of the Regressors: Condition P2

Condition P2 requires boundedness and $\phi$-mixing of the regressors. Although such conditions are standard ones in the literature (e.g. [11]), they can still be considered as restrictive. As seen in several of the results in Section 5, both $\phi$-mixing and boundedness can be weakened considerably when we deal with specific algorithms.

It may also be remarked that when $\{\varphi_k\}$ is unbounded, we can modify the algorithm and make Theorem 3.1 hold true: introduce the normalized signals

$$(\bar y_k,\ \bar\varphi_k,\ \bar v_k) = \frac{1}{\sqrt{1 + \|\varphi_k\|^2}}\,(y_k,\ \varphi_k,\ v_k).$$

Then we have from (1)

$$\bar y_k = \theta_k^T\bar\varphi_k + \bar v_k.$$

Thus, $\{\theta_k\}$ may be estimated based on this normalized linear regression, and Theorem 3.1 can be applied if only $S_k$ and $R_v(k)$ in (22)-(25) are replaced by

$$E\!\left[\frac{\varphi_k\varphi_k^T}{1 + \|\varphi_k\|^2}\right] \quad\text{and}\quad E\!\left[\frac{1}{1 + \|\varphi_k\|^2}\right]R_v(k),$$

respectively.

### F. The Parameter Drift Model: Condition P3

There are two things to mention about Condition P3. First, we note that the martingale difference property of $w_k$ essentially means that the true parameters, according to the model (2), are assumed to follow a random walk. Although this model is quite standard, it has also been criticized as being too restrictive. We believe that a random walk model, in the context of slow adaptation (small $\mu$), captures the tracking behavior of the algorithm very well. This is, in a sense, a worst case analysis, since the future behavior of the model is unpredictable.

We may also note that time-varying covariances $Q_w(k)$ and $R_v(k)$ are allowed. Several of the special model drift cases described in [12] are therefore covered by P3. Other drift models, where the driving noise is colored, can be put into a similar Kalman filter framework. However, to cover also that case with our techniques requires more work.

Condition P3 also introduces assumptions about moments higher than 2. We remark that if we only assume that $\{v_k\}$ and $\{w_k\}$ are bounded in e.g. the mean square sense, then upper bounds for the mean square tracking errors can be established (cf. [8] and [7]). The strengthened assumption in P3 allows us to obtain performance values much more accurate than upper bounds.

### G. The Practical Use of the Theorem

The practical consequence of Theorem 3.1 is that a very simple algorithm, the linear, deterministic difference equation (22), will describe the tracking behavior. Now, this equation is quite easy to analyze. In fact, there is an extensive literature on such analysis, in particular for the special case of LMS. Among many references, we may refer to [12] for a survey of such results. In essence, all these results capture the dilemma between the tracking error ($\Pi$ is large because $\mu$ is small) and the noise sensitivity ($\Pi$ is large because $\mu$ is large), and may point to the best compromises between these requirements.

For example, under weak stationarity of the regressors, $S_k \equiv S$, we find that $R_k$ will converge to $\tilde R$ as $k \to \infty$, where $\tilde R = I$ in the LMS-case, $\tilde R = S^{-1}$ in the RLS-case, and for the KF-case we have to solve

$$\tilde RS\tilde R = Q/R$$

for $\tilde R$. Inserted into (22), this gives the following stationary values $\Pi$ for the tracking error covariance matrix (neglecting the term of order $\mu^2$):

LMS: $\quad S\Pi + \Pi S = \mu R_vS + \dfrac{\gamma^2}{\mu}Q_w$

RLS: $\quad \Pi = \dfrac{1}{2}\left[\mu R_vS^{-1} + \dfrac{\gamma^2}{\mu}Q_w\right]$

KF: $\quad \tilde RS\Pi + (\tilde RS\Pi)^T = \mu R_v\,Q/R + \dfrac{\gamma^2}{\mu}Q_w$

Note that if we have $Q = Q_w$ and $R = R_v$, then the latter equation can be solved as

$$\Pi = \frac{R}{2}\left(\mu + \frac{\gamma^2}{\mu}\right)\tilde R.$$

From these expressions the trade-offs between tracking ability and noise sensitivity are clearly visible.
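The trade-off is easiest to see in the scalar RLS expression: the term $\mu R_vS^{-1}$ grows with $\mu$ (noise sensitivity) while $(\gamma^2/\mu)Q_w$ shrinks with $\mu$ (tracking ability). A numerical sketch with illustrative scalar constants; balancing the two terms gives $\mu_{\mathrm{opt}} = \gamma\sqrt{Q_wS/R_v}$:

```python
import math

R_v, Q_w, S, gamma = 1.0, 1.0, 2.0, 0.01   # illustrative scalar values

def pi_rls(mu):
    # stationary scalar RLS covariance: noise term + tracking term
    return 0.5 * (mu * R_v / S + (gamma**2 / mu) * Q_w)

# sweep the adaptation rate and locate the best compromise numerically
grid = [10 ** (-4 + 3 * i / 999) for i in range(1000)]
mu_best = min(grid, key=pi_rls)

# balancing the two terms gives the analytic optimum
mu_opt = gamma * math.sqrt(Q_w * S / R_v)
```

The numerically best $\mu$ on the grid agrees with the analytic balance point, illustrating the compromise between the two requirements.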

### IV. A General Theorem

In this section, we shall present a general theorem on the performance of the tracking algorithm (11) when the gain $L_k$ is not specified, from which our main result Theorem 3.1 will follow. The general theorem has weaker, but less explicit, assumptions. From now on the treatment and discussion will be more technical. However, the main line of thought in the proofs follows the outline given after Example 1.1 in the Introduction.

### A. Notations

The following notations will be used in the remainder of the paper. These notations are the same as in the companion paper [9].

a). The minimum and maximum eigenvalues of a matrix $X$ are denoted by $\lambda_{\min}(X)$ and $\lambda_{\max}(X)$, respectively, and

$$\|X\| \triangleq \big\{\lambda_{\max}(XX^T)\big\}^{1/2}, \qquad \|X\|_p \triangleq \big\{E(\|X\|^p)\big\}^{1/p}, \quad p \geq 1.$$

b). Let $x = \{x_k(\mu),\ k \geq 1\}$ be a random sequence parameterized by $\mu \in (0,1)$. Denote

$$\mathcal{L}_p(\mu^*) = \Big\{x : \sup_{\mu\in(0,\mu^*]}\,\sup_{k\geq 1}\,\|x_k(\mu)\|_p < \infty\Big\}. \tag{28}$$

c). Let $F = \{F_k(\mu)\}$ be any (square) matrix random process parameterized by $\mu \in (0,1)$. For any $p \geq 1$, $\mu^* \in (0,1)$, define

$$\mathcal{S}_p(\mu^*) = \Big\{F : \Big\|\prod_{j=i+1}^{k}\big(I - \mu F_j(\mu)\big)\Big\|_p \leq M(1-\alpha\mu)^{k-i},\ \forall\mu\in(0,\mu^*],\ \forall k \geq i \geq 0,\ \text{for some } M > 0 \text{ and } \alpha \in (0,1)\Big\};$$

similarly,

$$\mathcal{S}(\mu^*) = \Big\{F : \Big\|\prod_{j=i+1}^{k}\big(I - \mu E[F_j(\mu)]\big)\Big\| \leq M(1-\alpha\mu)^{k-i},\ \forall\mu\in(0,\mu^*],\ \forall k \geq i \geq 0,\ \text{for some } M > 0 \text{ and } \alpha \in (0,1)\Big\}.$$

In what follows, it will be convenient to introduce the sets

$$\mathcal{S}_p \triangleq \bigcup_{\mu^*\in(0,1)}\mathcal{S}_p(\mu^*), \qquad \mathcal{S} \triangleq \bigcup_{\mu^*\in(0,1)}\mathcal{S}(\mu^*). \tag{29}$$

We may call these stability sets. They are related to the stability of the random equation (19) and the deterministic equation (20), respectively. For simplicity, we shall sometimes suppress the parameter $\mu$ in $F_k(\mu)$ when there is no risk of confusion.

d). For scalar random sequences $a = (a_k,\ k \geq 0)$, we set

$$\mathcal{S}^0(\lambda) = \Big\{a : a_k \in [0,1],\ E\prod_{j=i+1}^{k}(1 - a_j) \leq M\lambda^{k-i},\ \forall k \geq i \geq 0,\ \text{for some } M > 0\Big\}.$$

Also,

$$\mathcal{S}^0 \triangleq \bigcup_{\lambda\in(0,1)}\mathcal{S}^0(\lambda). \tag{30}$$

e). Let $p \geq 1$ and let $x \triangleq \{x_i\}$ be any random process. Set

$$\mathcal{M}_p = \Big\{x : \Big\|\sum_{i=m+1}^{m+n}x_i\Big\|_p \leq C_pn^{1/2},\ \forall n \geq 1,\ m \geq 0,\ \text{for some } C_p \text{ depending only on } p \text{ and } x\Big\}.$$

As is known, for example from [10], martingale difference sequences, $\phi$- and $\alpha$-mixing sequences, and linear processes (processes generated from a white noise source via a linear filter with absolutely summable impulse response) are all in the set $\mathcal{M}_p$.

In particular, when $\{x_i\}$ is a martingale difference sequence, by the Burkholder inequality we have ($p > 1$)

$$\Big\|\sum_{i=m+1}^{m+n}x_i\Big\|_p \leq (B_px_p^*)\,n^{1/2}, \qquad \forall n \geq 1,\ m \geq 0 \tag{31}$$

where $x_p^* \triangleq \sup_k\|x_k\|_p$, and $B_p$ is a constant depending on $p$ only (cf. [11]). (This fact will be used frequently in the sequel without explanation.)

f). Let $\{A_k\}$ be a matrix sequence and let $b_k \geq 0$, $\forall k \geq 0$. Then by $A_k = O(b_k)$ we mean that there exists a constant $M < \infty$ such that

$$\|A_k\| \leq Mb_k, \qquad \forall k \geq 0.$$

The constant $M$ may be called the ordo-constant. Throughout the sequel, the ordo-constant does not depend on $\mu$, even if $\{A_k\}$ or $\{b_k\}$ does.

### B. Assumptions

We will first show how the tracking performance can be analyzed, given the exponential stability of the homogeneous part of (19) and a certain weak dependence property of the adaptation gains; we then present more detailed discussions of such properties.

In the sequel, unless otherwise stated, $\mathcal{F}_k$ denotes the $\sigma$-algebra generated by $\{\varphi_i, w_i, v_{i-1},\ i \leq k\}$, and $\{F_k\}$ is defined in (19).

### To establish the general theorem, we need the following assumptions:

(A1). (Exponential stability) There are $\mu^* \in (0,1)$ and $p \geq 2$ such that $\{F_k\} \in \mathcal{S}_p(\mu^*) \cap \mathcal{S}(\mu^*)$.

(A2). (Weak dependence) There is a real number $q \geq 3$ together with a bounded function $\delta_\mu(m) \geq 0$ with

$$\lim_{\substack{m\to\infty\\ \mu\to 0}}\delta_\mu(m) = 0$$

(taking first $m$ to infinity and then $\mu$ to zero) such that $\forall m$, $\forall k$, $\forall\mu \in (0,\mu^*]$,

$$\big\|E[F_k\,|\,\mathcal{F}_{k-m}] - E[F_k]\big\|_q \leq \delta_\mu(m).$$

(A3). $L_i \in \mathcal{F}_i$, $\forall i \geq 1$, and there is $\mu^* \in (0,1)$ such that

$$\{L_i\} \in \mathcal{L}_r(\mu^*), \qquad \{F_i\} \in \mathcal{L}_{2q}(\mu^*)$$

with $r = \big(\tfrac{1}{2} - \tfrac{1}{p} - \tfrac{3}{2q}\big)^{-1}$, and with $p$ and $q$ defined as in (A1) and (A2).

(A4). For all $k \geq 1$ we have

$$E[v_k\,|\,\mathcal{F}_k] = 0, \qquad E[w_{k+1}\,|\,\mathcal{F}_k] = E[w_{k+1}v_k\,|\,\mathcal{F}_k] = 0,$$

$$E[v_k^2\,|\,\mathcal{F}_k] = R_v(k), \qquad E[w_{k+1}w_{k+1}^T] = Q_w(k+1),$$

$$E[|v_k|^r\,|\,\mathcal{F}_k] + E\|w_{k+1}\|^r \leq M < \infty, \qquad \forall k \geq 1,$$

for deterministic quantities $R_v(k)$, $Q_w(k+1)$ and $M$, where $r$ is defined as in (A3).

The key conditions are (A1) and (A2). In general, (A1) can be guaranteed by a certain type of stochastic persistence of excitation condition, which is studied in the companion paper [9], while (A2) can be guaranteed by imposing a certain weak dependence condition on the regressors $\{\varphi_i\}$. More detailed discussions will be given later. At the moment, we just remark that if (A1) and (A2) hold for all $p \geq 1$ and all $q \geq 1$, then in (A3) and (A4) the number $r$ needs only to satisfy $r > 2$.

### C. The General Theorem

Now, recursively define a matrix sequence $\{\hat\Pi_k\}$ as follows:

$$\hat\Pi_{k+1} = (I - \mu E[F_k])\hat\Pi_k(I - \mu E[F_k])^T + \mu^2R_v(k)E[L_kL_k^T] + \gamma^2Q_w(k+1) \tag{32}$$

where $\hat\Pi_0 = E[\tilde\theta_0\tilde\theta_0^T]$, and $R_v(k)$ and $Q_w(k+1)$ are defined in Assumption (A4). Note that this definition is very close to the definition of $\Pi_k$ in (22). We now have a result that is the "mother theorem" of Theorem 3.1:

Theorem 4.1: Let Assumptions (A1)-(A4) hold, let the tracking error $\tilde\theta_k$ be defined by (11) (or (19)), and let $\hat\Pi_k$ be defined by (32). Then $\forall\mu \in (0,\mu^*]$, $\forall k \geq 1$,

$$\|E[\tilde\theta_{k+1}\tilde\theta_{k+1}^T] - \hat\Pi_{k+1}\| \leq c\,\big[\mu\delta(\mu) + \mu^2 + (1-\alpha\mu)^k\big]$$

where $c > 0$ and $\alpha \in (0,1)$ are constants and $\delta(\mu)$ is a function that tends to zero as $\mu$ tends to zero. It is defined by

$$\delta(\mu) \triangleq \min_{m\geq 1}\big\{\sqrt{\mu}\,m + \delta_\mu(m)\big\}.$$

The proof is given in Appendix A.

Next, we show that under more conditions, the expression for $\hat\Pi_k$ in (32) can be further simplified.

Corollary 4.1: Under the conditions of Theorem 4.1, if $F_k = P_k\varphi_k\varphi_k^T$ with $\|\varphi_k\|_{2t} = O(1)$ and $\|F_k\|_t = O(1)$ for some $t > 1$, and if there are some function $\varepsilon(\mu)$, tending to zero as $\mu$ tends to zero, and some deterministic sequence $\{R_k\}$ such that

$$\|P_k - R_k\|_s = O(\varepsilon(\mu)), \qquad \forall k,\ \forall\mu \in (0,\mu^*], \quad s = (1 - t^{-1})^{-1},$$

then we have ($\forall\mu \in (0,\mu^*]$, $\forall k \geq 1$)

$$\|E[\tilde\theta_{k+1}\tilde\theta_{k+1}^T] - \Pi_{k+1}\| \leq c\,\big[\mu(\delta(\mu) + \varepsilon(\mu)) + \mu^2 + (1-\alpha\mu)^k\big] \tag{33}$$

for some constants $c > 0$ and $\alpha \in (0,1)$, where $\Pi_k$ is recursively defined by

$$\Pi_{k+1} = (I - \mu R_kS_k)\Pi_k(I - \mu R_kS_k)^T + \mu^2R_v(k)R_kS_kR_k + \gamma^2Q_w(k+1) \tag{34}$$

with $S_k = E[\varphi_k\varphi_k^T]$ and $\Pi_0 = \hat\Pi_0$.

Proof: By Theorem 4.1, we need only show that

$$\|\hat\Pi_{k+1} - \Pi_{k+1}\| = O\big(\mu\varepsilon(\mu) + \mu^2 + (1-\alpha\mu)^k\big).$$

This can be derived by straightforward calculations based on the equations for $\hat\Pi_k$ and $\Pi_k$, and hence Corollary 4.1 is true. $\Box$

Remark: If, in Condition (A2),

$$\delta_\mu(m) = O\big(\phi(m) + \varepsilon(\mu)\big), \qquad \varepsilon(\mu) = \min_{m\geq 1}\big[\sqrt{\mu}\,m + \phi(m)\big],$$

then $\delta(\mu)$ defined in Theorem 4.1 satisfies $\delta(\mu) = O(\varepsilon(\mu))$. This will be the case for the RLS and KF algorithms in Theorem 3.1, as can be seen from Section V.

The following result also follows directly from Theorem 4.1.

Corollary 4.2: If, in addition to the conditions of Theorem 4.1, $R_v(k) \equiv R_v$, $Q_w(k) \equiv Q_w$, and there are $F$, $G$ and a function $\varepsilon(\mu)$, tending to zero as $\mu$ tends to zero, such that $\forall\mu \in (0,\mu^*]$,

$$\|EF_k - F\| + \|E(L_kL_k^T) - G\| \leq \varepsilon(\mu) \qquad \forall k,$$

then for some $\alpha \in (0,1)$ and for all $\mu \in (0,\mu^*]$, $k \geq 1$,

$$E[\tilde\theta_{k+1}\tilde\theta_{k+1}^T] = \Pi + O\big(\mu[\delta(\mu) + \varepsilon(\mu)] + \mu^2\big) + O\big((1-\alpha\mu)^k\big) \tag{35}$$

where $\Pi$ satisfies the following Lyapunov equation:

$$F\Pi + \Pi F^T = \mu R_vG + \frac{\gamma^2}{\mu}Q_w. \tag{36}$$

Now denote

$$\Pi_{R_v} = R_v\int_0^\infty e^{-Ft}Ge^{-F^Tt}\,dt, \qquad \Pi_{Q_w} = \int_0^\infty e^{-Ft}Q_we^{-F^Tt}\,dt.$$

Then the solution to the Lyapunov equation (36) can be expressed as

$$\Pi = \mu\Pi_{R_v} + \frac{\gamma^2}{\mu}\Pi_{Q_w},$$

in which there is a reminiscence of the results obtained in the simple example discussed in Section 1 (see (9)).
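The decomposition $\Pi = \mu\Pi_{R_v} + (\gamma^2/\mu)\Pi_{Q_w}$ can be checked numerically by solving (36) through vectorization. The matrices $F$, $G$, $Q_w$ and the scalars below are illustrative assumptions ($F$ must have eigenvalues with positive real parts for the integrals to converge):

```python
import numpy as np

mu, gamma, R_v = 0.05, 0.01, 0.5
F = np.array([[1.0, 0.2], [0.0, 2.0]])    # eigenvalues 1 and 2, both positive
G = np.eye(2)
Q_w = np.array([[2.0, 0.5], [0.5, 1.0]])

def lyap(F, C):
    """Solve F X + X F^T = C via Kronecker-product vectorization."""
    n = F.shape[0]
    A = np.kron(F, np.eye(n)) + np.kron(np.eye(n), F)
    return np.linalg.solve(A, C.reshape(-1)).reshape(n, n)

# direct solution of (36) versus superposition of the two parts
Pi_direct = lyap(F, mu * R_v * G + (gamma**2 / mu) * Q_w)
Pi_Rv = R_v * lyap(F, G)          # solves F X + X F^T = R_v G
Pi_Qw = lyap(F, Q_w)              # solves F X + X F^T = Q_w
Pi_closed = mu * Pi_Rv + (gamma**2 / mu) * Pi_Qw
```

By linearity of the Lyapunov operator, the direct solution and the superposition coincide, which is exactly the structure exploited above.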

### D. Discussion on the Assumptions

Now, let us discuss the key assumptions (A1) and (A2). First, assumption (A1) has been studied in the companion paper [9]; here we only give some results concerning $\{F_k\} \in \mathcal{S}$, which will be used shortly in the next section.

Proposition 4.1: Let $\{G_k\}$ be a random matrix process, possibly dependent on $\mu$, with the property

$$E\|G_k\| \leq \mu\varepsilon(\mu) \quad \text{for all small } \mu \text{ and all } k \tag{37}$$

where $\varepsilon(\mu) \to 0$ as $\mu \to 0$. Then $\{F_k\} \in \mathcal{S} \iff \{F_k + G_k\} \in \mathcal{S}$.

Proof: Sufficiency. Recursively define ($\forall x : \|x\| = 1$)

$$x_{k+1} = \big(I - \mu E[F_k + G_k]\big)x_k \quad \forall k \geq m, \qquad x_m = x.$$

Then

$$x_{k+1} = \big(I - \mu E(F_k)\big)x_k - \mu E(G_k)x_k,$$

so that

$$x_{n+1} = \prod_{i=m}^{n}\big[I - \mu E(F_i)\big]x_m - \mu\sum_{i=m}^{n}\prod_{j=i+1}^{n}\big[I - \mu E(F_j)\big]E(G_i)x_i.$$

Consequently, similarly to the proof of Theorem 3.1 in [9], by the Gronwall inequality we have

$$\|x_{n+1}\| \leq 2M(1-\alpha\mu)^{n-m+1}\Big\{1 + \mu\sum_{i=m}^{n}\prod_{j=i+1}^{n}\big(1 + \mu E\|G_j\|\big)E\|G_i\|\Big\}.$$

From this and condition (37), it is not difficult to convince oneself that $\{F_k + G_k\} \in \mathcal{S}$.

Necessity: by using the fact proved above, and noting that $F_k = (F_k + G_k) - G_k$, we know that $\{F_k\} \in \mathcal{S}$. This completes the proof. $\Box$

The following useful result follows from Proposition 4.1 immediately.

Proposition 4.2: Let $F_k = P_kH_k$ and let the following conditions be satisfied:

(i). $\{H_k\} \in \mathcal{L}_t(\mu^*)$, $\mu^* \in (0,1)$, $t \geq 1$;

(ii). $\|P_k - \bar P_k\|_s \leq \mu\varepsilon(\mu)$, $\forall\mu \in (0,\mu^*]$, where $\varepsilon(\mu) \to 0$ as $\mu \to 0$, $s = (1 - t^{-1})^{-1}$, and $\{\bar P_k\}$ is a deterministic process.

Then $\{F_k\} \in \mathcal{S} \iff \{\bar P_kH_k\} \in \mathcal{S}$.

Proof: The result follows directly from Proposition 4.1 if we note that

$$F_k = \bar P_kH_k + (P_k - \bar P_k)H_k. \qquad \Box$$

We now turn to discuss the weak dependence condition (A2).

Example 4.1: Let $\{\varphi_i\}$ satisfy (21), and let $L(\cdot) : R^d \to R^{d\times d}$ be a real matrix function with $\|L(\varphi(k))\|_q = O(1)$ for some $1 \leq q \leq \infty$. Then we have the following inequality (cf. [19]):

$$\big\|E[L(\varphi_k)\,|\,\mathcal{F}_{k-m}] - EL(\varphi_k)\big\|_q = O\big([\phi(m)]^{1-1/q}\big), \qquad \forall k,\ m. \tag{38}$$

Hence, if $F_k = L(\varphi_k)$, then condition (A2) holds.

Note that when $\{\varphi_k\}$ satisfies condition P2 in Section 3, we have, by taking $q = \infty$ in (38),

$$\big\|E[\varphi_k\varphi_k^T\,|\,\mathcal{F}_{k-m}] - E\varphi_k\varphi_k^T\big\|_\infty = O(\phi(m)). \tag{39}$$

This fact will be used in the next section in the proof of Theorem 3.1. $\Box$

Example 4.2. Let $\{\varphi_k\}$ be generated by

$$x_k = Ax_{k-1} + B\xi_k \ \ (A \text{ stable}), \qquad \varphi_k = Cx_k + \eta_k,$$

where $\{(\xi_j, \eta_j),\ j \ge k+1\}$ and $\{v_{j-1}, w_j,\ j \le k\}$ are independent, and $\{(\xi_k, \eta_k)\}$ is an independent sequence. Assume that

$$\sup_k E\|\varphi_k\|^{(b+1)q} < \infty \quad \text{for some } b \ge 0, \ q \ge 1.$$

Then for any function $L(\cdot): R^d \to R^{d\times d}$ with

$$\|L(x) - L(x')\| \le M(\|x\| + \|x'\| + 1)^b\,\|x - x'\|, \qquad \forall x, x',$$

there is a constant $\rho \in (0,1)$ such that (cf. [14]), $\forall m \ge 0$, $\forall k \ge 0$,

$$\|E[L(\varphi_{k+m})\,|\,\mathcal{F}_k] - EL(\varphi_{k+m})\|_q = O(\rho^m).$$

Hence, if $F_k = L(\varphi_k)$, then again, condition (A2) holds. $\square$
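For a concrete scalar instance of this geometric bound, take $x_k = a x_{k-1} + e_k$ with $\varphi_k = x_k$ and $L(x) = x^2$; then $E[L(\varphi_{k+m})\,|\,\mathcal{F}_k] - EL(\varphi_{k+m}) = a^{2m}(x_k^2 - \sigma^2/(1-a^2))$, which decays like $\rho^m$ with $\rho = a^2$. A sketch with illustrative parameters (not from the paper):

```python
import numpy as np

a, sigma2 = 0.8, 1.0                    # stable scalar case: x_k = a x_{k-1} + e_k
var_inf = sigma2 / (1 - a ** 2)         # stationary value of E L(phi)

def cond_mean_sq(xk, m):
    # exact E[x_{k+m}^2 | F_k] for Var(e) = sigma2
    return a ** (2 * m) * xk ** 2 + sigma2 * (1 - a ** (2 * m)) / (1 - a ** 2)

xk = 2.3                                # any realized value of x_k
for m in range(1, 30):
    gap = abs(cond_mean_sq(xk, m) - var_inf)
    assert gap <= (xk ** 2 + var_inf) * (a ** 2) ** m   # O(rho^m), rho = a^2
```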

The following simple result will be useful in the sequel.

Proposition 4.3. Let $F_k = P_k L(\varphi_k)$, and let the following two conditions hold:

(i). There is a bounded deterministic matrix sequence $\{\bar P_k\}$, and a function $\delta(\mu)$ tending to zero as $\mu$ tends to zero, such that

$$\|P_k - \bar P_k\|_s \le \delta(\mu), \qquad \forall \mu \in (0, \mu^*], \ \text{for some } s > 1.$$

(ii). There is a number $r > 1$ such that $\|L(\varphi_k)\|_r = O(1)$, together with a function $\nu(m)$ tending to 0 as $m$ tends to infinity, such that

$$\|E[L(\varphi_{k+m})\,|\,\mathcal{F}_k] - EL(\varphi_{k+m})\|_q \le \nu(m), \qquad \forall k, \ \forall m \qquad \bigl(q = (r^{-1} + s^{-1})^{-1}\bigr).$$

Then condition (A2) holds with $\bar\nu(m) = O(\nu(m) + \delta(\mu))$.

Proof. The result follows directly from the following identity:

$$E[F_{k+m}\,|\,\mathcal{F}_k] - EF_{k+m} = E[(P_{k+m} - \bar P_{k+m})L(\varphi_{k+m})\,|\,\mathcal{F}_k] - E\bigl[(P_{k+m} - \bar P_{k+m})L(\varphi_{k+m})\bigr] + \bar P_{k+m}\bigl\{E[L(\varphi_{k+m})\,|\,\mathcal{F}_k] - EL(\varphi_{k+m})\bigr\}. \qquad \square$$
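The identity in the proof is linear in $L(\varphi)$, so it can be verified exactly on a toy product space where conditioning on $\mathcal{F}_k$ amounts to averaging over an independent "future" coordinate. All values below are invented for illustration:

```python
import numpy as np

Z1 = np.array([-1.0, 0.5, 2.0])        # coordinate generating F_k (equally likely)
Z2 = np.array([-0.7, 0.1, 1.3])        # independent future coordinate
phi = Z1[:, None] + Z2[None, :]        # phi_{k+m} on the product space
L = phi ** 2                           # L(phi), scalar-valued here
P = 1.5 + 0.2 * np.tanh(phi)           # random gain P_{k+m}
Pb = 1.5                               # deterministic approximation \bar P_{k+m}

cond = lambda A: A.mean(axis=1)        # E[ . | F_k]: average over Z2
mean = lambda A: A.mean()              # E[ . ]

lhs = cond(P * L) - mean(P * L)
rhs = (cond((P - Pb) * L) - mean((P - Pb) * L)
       + Pb * (cond(L) - mean(L)))
assert np.allclose(lhs, rhs)
```

Since $P L = (P - \bar P)L + \bar P L$ pointwise and both $E[\cdot\,|\,\mathcal{F}_k]$ and $E[\cdot]$ are linear, the two sides agree exactly, which is all the proof uses.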

### V. Analysis of the Basic Algorithms

In this section, we shall show that, for the basic LMS, RLS and KF algorithms, conditions (A1)-(A3) of the previous section can be guaranteed by imposing some explicit (stochastic excitation and weak dependence) conditions on the regressors $\{\varphi_k\}$, and at the same time prove Theorem 3.1.

### A. Analysis of LMS

For the LMS algorithm defined by (11)-(12), let us introduce the following two kinds of weak dependence conditions:

L1). Condition P2 of Section 3 is satisfied, but with the boundedness condition on $\{\varphi_k\}$ relaxed to the following: there exist positive constants $\varepsilon$, $M$ and $K$ such that

$$E\exp\Bigl\{\varepsilon\sum_{j=i+1}^{n}\|\varphi_j\|^2\Bigr\} \le M\exp\{K(n-i)\}, \qquad \forall n \ge i \ge 0.$$

L1'). The random process $F_k \triangleq \varphi_k\varphi_k'$ has the following expansion:

$$F_k = \sum_{j=0}^{\infty} A_j Z_{k-j} + D_k, \qquad \sum_{j=0}^{\infty}\|A_j\| < \infty,$$

where $\{Z_k\}$ is an independent process such that $\{Z_j,\ j \ge k+1\}$ and $\{v_{j-1}, w_j,\ j \le k\}$ are independent, and which satisfies

$$\sup_k E\exp\{\varepsilon\|Z_k\|^{1+\delta}\} < \infty \quad \text{for some } \varepsilon > 0, \ \delta > 0,$$

and where $\{D_k\}$ is a bounded deterministic process.
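Under L1'), conditioning on the past only leaves the tail of the moving-average expansion, so the conditional bias is controlled by the tail sum $\sum_{j\ge m}\|A_j\|$, exactly as used in the proof of Theorem 5.1 below. A scalar sketch of this tail bound, with an invented summable sequence $A_j$ and bounded drivers $Z_k$:

```python
import numpy as np

rng = np.random.default_rng(4)
n_lags, k = 20, 100
A = 0.5 ** np.arange(n_lags)            # summable expansion coefficients A_j
Z = rng.uniform(-1.0, 1.0, size=k + 1)  # independent drivers, E Z = 0, |Z| <= 1

for m in range(1, n_lags):
    # E[F_k | F_{k-m}] - E F_k = sum_{j >= m} A_j (Z_{k-j} - E Z_{k-j})
    gap = sum(A[j] * Z[k - j] for j in range(m, n_lags))
    assert abs(gap) <= A[m:].sum()      # tail bound, since |Z - EZ| <= 1
```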

Theorem 5.1. Let Conditions P1 and P3 of Section 3 be satisfied. If either L1) or L1') above holds, then Conditions (A1)-(A4) of Theorem 4.1 hold (for all $p \ge 1$, $q \ge 1$) and Theorem 3.1 is true for the LMS case.

Proof. First, in the LMS case, Conditions P1 and L1) (or L1')) ensure that Condition (A1) of Theorem 4.1 holds for all $p \ge 1$ (cf. [9], Theorem 3.3). Next, when L1) holds, by Example 4.1 we know that Condition (A2) is true for all $q \ge 1$. Also, when L1') holds, by the assumed independence we have, for all $q \ge 1$,

$$\|E[F_k\,|\,\mathcal{F}_{k-m}] - EF_k\|_q = \Bigl\|\sum_{j=m}^{\infty}[A_j Z_{k-j} - EA_j Z_{k-j}]\Bigr\|_q = O\Bigl(\sum_{j=m}^{\infty}\|A_j\|\Bigr), \qquad \forall m \ge 1.$$

Hence (A2) holds again for all $q \ge 1$. Moreover, Conditions (A3) and (A4) hold obviously in the present case. Finally, by (39), the result of Theorem 3.1 (in the LMS case) follows directly from Theorem 4.1. This completes the proof. $\square$
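As an illustration of the LMS case of Theorem 3.1, the sketch below tracks a slowly drifting parameter with the gradient update of (11)-(12) and checks that the steady-state tracking error settles far below the initial error. The noise levels and gain are invented for illustration, not the paper's assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
d, n, mu = 2, 3000, 0.1
theta = np.ones(d)                     # true parameter, slow random walk
hat = np.zeros(d)                      # LMS estimate
err2 = []
for k in range(n):
    theta += 0.01 * rng.standard_normal(d)         # theta_k = theta_{k-1} + w_k
    phi = rng.standard_normal(d)                   # regressor
    y = phi @ theta + 0.1 * rng.standard_normal()  # y_k = phi_k' theta_k + v_k
    hat = hat + mu * phi * (y - phi @ hat)         # LMS (gradient) update
    err2.append(np.sum((hat - theta) ** 2))

steady = np.mean(err2[-500:])          # time-averaged squared tracking error
assert steady < 0.2
```

The steady error reflects the familiar trade-off: a measurement-noise term growing with the gain and a lag term shrinking with it, which is the quantity the paper's covariance approximations describe.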

### B. Analysis of RLS

For the RLS algorithm defined by (11), (13) and (14), let us introduce the following two kinds of excitation conditions:

R1). There exist constants $h > 0$, $c > 0$, $\delta > 0$ such that

$$P\Bigl\{\lambda_{\min}\Bigl(\sum_{i=k+1}^{k+h}\varphi_i\varphi_i'\Bigr) \ge c \,\Big|\, \mathcal{F}_k\Bigr\} \ge \delta, \qquad \forall k.$$

R1'). There exists $h > 0$ such that

$$\sup_k E\Bigl[\lambda_{\min}\Bigl(\sum_{i=k+1}^{k+h}\varphi_i\varphi_i'\Bigr)\Bigr]^{-t} < \infty, \qquad \forall t \ge 1.$$

The following weak dependence condition will also be used:

R2). There exists a number $t \ge 5$ such that $\|\varphi_k\|_{4t} = O(1)$, and such that

$$\|E[\varphi_k\varphi_k'\,|\,\mathcal{F}_{k-m}] - E\varphi_k\varphi_k'\|_{2t} \le \nu(m), \qquad \forall k \ge m,$$

where $\nu(m) \to 0$ as $m \to \infty$.

Remark 5.1. Detailed discussions and investigations of the first two conditions above can be found in [10] and [17]. It has been shown in [10] that if Condition P1 and (21) in Section 3 hold, then R1) is true. Also, if $\{\varphi_k\}$ is generated by a linear state space model as in Example 4.2, then R1') can be verified (cf. [17]). Moreover, Condition R2) has been discussed in the last section.

Theorem 5.2. Let Conditions R1) (or R1')) and R2) above be satisfied. Then Conditions (A1)-(A3) of Theorem 4.1 hold (for any $p < 2t$, $q < t$) and Theorem 3.1 is true for the RLS case.

Proof. First, note that

$$\prod_{j=i+1}^{k}(I - F_j) = (1-\mu)^{k-i}P_{k+1}P_{i+1}^{-1}, \qquad \forall k \ge i, \qquad (40)$$

and

$$P_k^{-1} = (1-\mu)P_{k-1}^{-1} + \mu\varphi_k\varphi_k'. \qquad (41)$$

From this and condition R2) it follows that

$$\|P_k^{-1}\|_{2t} = O(1), \qquad \forall \mu \in (0,1). \qquad (42)$$

Also, by Theorem 1 in [10], there is $\lambda \in (0,1)$ such that

$$\{P_k\} \in \mathcal{L}_s(\lambda), \qquad \forall s \ge 1. \qquad (43)$$

Combining (40), (42) and (43), we get

$$\{F_k\} \in \mathcal{S}_p, \qquad \forall p < 2t. \qquad (44)$$
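The product identity (40) can be checked numerically. In the sketch below, the inverse-covariance recursion (41) is iterated and the product is accumulated with later indices on the left; the time-stamping $F_j = \mu P_{j+1}\varphi_{j+1}\varphi_{j+1}'$ is an assumption made here to match the index shift in (40), since the exact definition (14) is not reproduced in this section:

```python
import numpy as np

rng = np.random.default_rng(1)
d, mu, n = 3, 0.1, 12
P_inv = [np.eye(d)]                    # P_0^{-1}
P = [np.eye(d)]
phi = [None]                           # 1-based regressor storage
for j in range(1, n + 1):
    phi.append(rng.standard_normal(d))
    P_inv.append((1 - mu) * P_inv[-1] + mu * np.outer(phi[j], phi[j]))  # (41)
    P.append(np.linalg.inv(P_inv[-1]))

i, k = 2, n - 1
prod = np.eye(d)
for j in range(k, i, -1):              # prod = (I - F_k) ... (I - F_{i+1})
    F_j = mu * P[j + 1] @ np.outer(phi[j + 1], phi[j + 1])
    prod = prod @ (np.eye(d) - F_j)

rhs = (1 - mu) ** (k - i) * P[k + 1] @ P_inv[i + 1]   # right side of (40)
assert np.allclose(prod, rhs)
```

Each factor satisfies $I - \mu P_{j+1}\varphi_{j+1}\varphi_{j+1}' = (1-\mu)P_{j+1}P_j^{-1}$, so the product telescopes, which is what the assertion confirms.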

Now, define ($\bar P_0 = P_0$)

$$\bar P_k^{-1} = (1-\mu)\bar P_{k-1}^{-1} + \mu E(\varphi_k\varphi_k'). \qquad (45)$$

Since either R1) or R1') implies P1 in Section 3 (cf. [10]), by a similar (actually simpler) argument to that used for the proof of (43) we know that $\|\bar P_k\| = O(1)$. We next prove that

$$\|P_k^{-1} - \bar P_k^{-1}\|_{2t} = O(\delta(\mu)), \qquad \delta(\mu) = \min_{m \ge 1}\bigl\{m\sqrt{\mu} + \nu(m)\bigr\}. \qquad (46)$$

First, by (41) and (45),

$$P_k^{-1} - \bar P_k^{-1} = \mu\sum_{i=1}^{k}(1-\mu)^{k-i}\bigl[\varphi_i\varphi_i' - E\varphi_i\varphi_i'\bigr]. \qquad (47)$$

For any fixed $m \ge 1$, denoting

$$\Delta_j(i) = E[\varphi_i\varphi_i'\,|\,\mathcal{F}_{i-j}] - E[\varphi_i\varphi_i'\,|\,\mathcal{F}_{i-j-1}], \qquad 0 \le j \le m-1,$$

we have

$$\varphi_i\varphi_i' - E\varphi_i\varphi_i' = \sum_{j=0}^{m-1}\Delta_j(i) + \bigl\{E[\varphi_i\varphi_i'\,|\,\mathcal{F}_{i-m}] - E\varphi_i\varphi_i'\bigr\}. \qquad (48)$$

Now, since for each $j$ the sequence $\{\Delta_j(i),\ i \ge 1\}$ is a martingale difference, we can apply Lemma A.2 in the Appendix to each such $\{\Delta_j(i),\ i \ge 1\}$ to obtain

$$\Bigl\|\mu\sum_{i=1}^{k}(1-\mu)^{k-i}\sum_{j=0}^{m-1}\Delta_j(i)\Bigr\|_{2t} = O(m\sqrt{\mu}). \qquad (49)$$

Also, by our assumption,

$$\Bigl\|\mu\sum_{i=1}^{k}(1-\mu)^{k-i}\bigl\{E[\varphi_i\varphi_i'\,|\,\mathcal{F}_{i-m}] - E\varphi_i\varphi_i'\bigr\}\Bigr\|_{2t} \le \nu(m). \qquad (50)$$

Hence, (46) follows from (47)-(50) immediately.
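The decomposition (48) is a pure telescoping identity, so it can be checked exactly whenever the conditional expectations are computable, e.g. for a scalar AR(1) regressor with $x_i^2$ playing the role of $\varphi_i\varphi_i'$ (an illustrative model, not the paper's setting):

```python
import numpy as np

rng = np.random.default_rng(2)
a, sigma2, n, m = 0.7, 1.0, 50, 6
e = rng.standard_normal(n + 1)
x = np.zeros(n + 1)
for k in range(1, n + 1):              # AR(1) regressor path, x_0 = 0
    x[k] = a * x[k - 1] + e[k]

i = n
Ex2 = sigma2 / (1 - a ** 2)            # (limiting) second moment

def cond(j):
    # exact E[x_i^2 | F_{i-j}] for the AR(1) model
    return a ** (2 * j) * x[i - j] ** 2 + sigma2 * (1 - a ** (2 * j)) / (1 - a ** 2)

deltas = [cond(j) - cond(j + 1) for j in range(m)]   # martingale differences
assert np.isclose(x[i] ** 2 - Ex2, sum(deltas) + (cond(m) - Ex2))
```

The sum of the differences telescopes to $x_i^2 - E[x_i^2\,|\,\mathcal{F}_{i-m}]$, so the identity holds exactly, independently of the value used for the unconditional moment.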

Similarly to the proof of (44), it is evident that

$$\{P_k\varphi_k\varphi_k'\} \in \mathcal{S}. \qquad (51)$$

Now,

$$\|P_k - \bar P_k\| \le \|P_k\| \cdot \|P_k^{-1} - \bar P_k^{-1}\| \cdot \|\bar P_k\|;$$

from this, (43) and (46), it follows that

$$\|P_k - \bar P_k\|_s = O(\delta(\mu)), \qquad \forall s < 2t \ \ (\text{for small } \mu). \qquad (52)$$

Hence, by Proposition 4.2 and (51), we know that $\{\bar P_k\varphi_k\varphi_k'\} \in \mathcal{S}$. This in conjunction with (44) verifies Condition (A1).

Now, by (52) and R2), from Proposition 4.3 it is evident that Condition (A2) holds for any $q < t$.

To prove (A3), first note that for any $q < t$, (44) implies

$$\{F_k\} \in \mathcal{L}_{2q}(\lambda) \quad \text{for some } \lambda > 0.$$

So we need only to prove that

$$\{L_i\} \in \mathcal{L}_r(\lambda) \quad \text{for } r \ge \Bigl(\frac{1}{2} - \frac{1}{2t} - \frac{3}{2t}\Bigr)^{-1} = \frac{2t}{t-4}.$$

This is true since, by (43) and $\|\varphi_k\|_{4t} = O(1)$,

$$\{L_i\} = \{P_i\varphi_i\} \in \mathcal{L}_r(\lambda), \qquad \forall r < 4t,$$

and since $4t > 2t/(t-4)$ for $t \ge 5$. Hence (A3) holds.

Thus, by taking $t = \infty$ in the above argument, we see that Conditions (A1) and (A2) hold for all $p \ge 1$ and all $q \ge 1$. Hence Theorem 4.1 can be applied to prove Theorem 3.1 for the RLS case, while the expression for $\Sigma_k$ will follow from Corollary 4.1 if we can prove that

$$\|P_k - R_k\|_s = O(\delta(\mu)), \qquad s = \frac{t}{t-1}, \qquad (53)$$

where $P_k$ and $R_k$ are respectively defined by (14) and (24). Furthermore, by (52), it is clear that (53) will be true if

$$\|R_k - \bar P_k\| = O(\delta(\mu))$$

holds. However, this can be verified by using the definitions of $R_k$ and $\bar P_k$ (see Appendix B). Hence the proof is complete. $\square$

### C. Analysis of the KF algorithm

Among the three basic algorithms described in Section 2, the KF algorithm defined by (11), (16) and (17) is the most complicated one to analyze. Let us now introduce the following two conditions on stochastic excitation and weak dependence.

K1). There are constants $h > 0$ and $\lambda \in (0,1)$ (independent of $\mu$) such that

$$\Bigl\{\frac{\sigma_k}{1 + b_{kh+1}}\Bigr\} \in \mathcal{S}^0(\lambda),$$

where $\mathcal{S}^0(\lambda)$ is defined by (30), and $\sigma_k$ and $b_k$ are defined as follows ($\mathcal{G}_k$ is, as before, the sigma-algebra generated by $\{\varphi_i,\ i \le k\}$):

$$\sigma_k \triangleq \lambda_{\min}\Bigl\{E\Bigl[\frac{1}{1+h}\sum_{i=kh+1}^{(k+1)h}\frac{\varphi_i\varphi_i'}{1 + \|\varphi_i\|^2}\,\Big|\,\mathcal{G}_{kh}\Bigr]\Bigr\},$$

$$b_k = (1-\mu)b_{k-1} + \mu\bigl(\|\varphi_k\|^2 + 1\bigr), \qquad \mu \in (0,1).$$
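The quantity $b_k$ in K1) is an exponentially weighted average of $\|\varphi_k\|^2 + 1$, so with bounded regressors it stays bounded uniformly in $k$. The sketch below iterates the recursion and also estimates the excitation matrix inside $\sigma_k$ for $h = 1$ without conditioning; the regressor distribution is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
mu, n = 0.05, 400
phi = rng.uniform(-2.0, 2.0, size=(n, 2))   # bounded illustrative regressors
b, bs = 0.0, []
for k in range(n):
    b = (1 - mu) * b + mu * (phi[k] @ phi[k] + 1.0)  # b_k recursion from K1)
    bs.append(b)

# convex-combination bound: b_k <= max_i (|phi_i|^2 + 1)
assert max(bs) <= np.max(np.sum(phi ** 2, axis=1)) + 1.0 + 1e-9

# the normalized excitation term in sigma_k is positive definite here
S = np.mean([np.outer(p, p) / (1.0 + p @ p) for p in phi], axis=0)
assert np.linalg.eigvalsh(S).min() > 0.05
```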

K2). There exists a number $t \ge 7$, together with a function $\nu(m) \to 0$ (as $m \to \infty$), such that $\|\varphi_k\|_{4t} = O(1)$, and such that

$$\|E[\varphi_k\varphi_k'\,|\,\mathcal{F}_{k-m}] - E\varphi_k\varphi_k'\|_t \le \nu(m), \qquad \forall$$