On the Cramer-Rao bound for Terrain-Aided Navigation
Niclas Bergman
Department of Electrical Engineering, Linköping University, S-581 83 Linköping, Sweden
www: http://www.control.isy.liu.se
email: niclas@isy.liu.se
LiTH-ISY-R-1970, August 26, 1997
Technical reports from the Automatic Control group in Linköping are available as UNIX-compressed PostScript files by anonymous ftp at the address 130.236.20.24 (ftp.control.isy.liu.se).
1 Introduction
The Cramer-Rao bound sets a lower limit on the error covariance matrix of any unbiased estimator. It is calculated from the Fisher information matrix, which depends on the joint probability density function (PDF) of the estimated parameters and the observations. Since the Cramer-Rao bound depends on the true value of the parameters, it is mainly used for analysis. Because it bounds the mean square error performance of any unbiased estimator from below, it can also be used to test how far from optimal a given estimator performs.
This paper is organized as follows. The next section reviews and derives the Cramer-Rao bound. The last section gives a brief background to the terrain-aided navigation (TAN) problem and derives the Cramer-Rao bound for TAN.
2 Theory
This section serves as a background to state estimation and reviews some relations involving the Fisher information matrix and the Cramer-Rao bound. We mainly follow the presentation and notation in [Sch91].
2.1 Maximum likelihood estimation
Let $p_z(x; \theta)$ denote the PDF of the stochastic vector $z$, parameterized by the parameter vector $\theta$. We are interested in estimating the unknown parameter $\theta$ from measurements, or samples, $\hat z$ of $z$. One natural estimate is the maximum likelihood (ML) estimate
$$\hat\theta = \arg\max_\theta \; p_z(\hat z; \theta).$$
This estimate has some interesting properties, such as being asymptotically normal and unbiased.
For ease of notation we make some definitions.

Definition 1
The likelihood function $l(\theta; \hat z)$ is
$$l(\theta; \hat z) \triangleq p_z(\hat z; \theta),$$
the log-likelihood function $L(\theta; \hat z)$ is
$$L(\theta; \hat z) \triangleq \ln p_z(\hat z; \theta),$$
and the score function $s(\theta)$ is
$$s(\theta) \triangleq \frac{\partial}{\partial \theta} L(\theta; \hat z).$$
If $s(\theta)$ is continuously differentiable, the ML estimate satisfies
$$s(\hat\theta) = 0. \qquad (1)$$
We will assume that this equation has only one solution.
Now we extend the idea of ML to the case of stochastic parameters $\theta$. Let $p_{z\theta}(x, \theta)$ denote the joint PDF of the stochastic variables $z$ and $\theta$. For an observed value $\hat z$ of $z$, the ML estimate of $\theta$ is
$$\hat\theta = \arg\max_\theta \; p_{z\theta}(\hat z, \theta).$$
Using
$$p_{z\theta}(x, \theta) = p_{z|\theta}(x | \theta)\, p_\theta(\theta) \qquad (2)$$
it is easy to incorporate the prior information, available before observing $\hat z$, into the ML estimate. Using (1) we see that the ML estimate must satisfy
$$s(\hat\theta) = \frac{\partial}{\partial\theta} \left[ \ln p_{z|\theta}(\hat z | \theta) + \ln p_\theta(\theta) \right] \Big|_{\theta = \hat\theta} = 0.$$
The conditional PDF $p_{z|\theta}(x|\theta)$ plays the same role in the stochastic-parameter case as $p_z(x;\theta)$ does in the non-stochastic case. Alternatively, the conditioning in (2) can be interchanged, yielding
$$p_{z\theta}(x, \theta) = p_{\theta|z}(\theta | x)\, p_z(x),$$
and the ML estimate thus also satisfies
$$s(\hat\theta) = \frac{\partial}{\partial\theta} \ln p_{\theta|z}(\theta | \hat z) \Big|_{\theta = \hat\theta} = 0.$$
The PDF $p_{\theta|z}(\theta|x)$ is the posterior PDF of $\theta$ given $z$; this explains why this ML estimate is often called the maximum a posteriori (MAP) estimate of $\theta$.
In the sequel of this section we work with non-stochastic parameters, but from the discussion above it follows that the stochastic case is handled by substituting $p_{z\theta}(x,\theta)$ for $p_z(x;\theta)$.
2.2 Fisher's information matrix
The first and second central moments are two fundamental properties of an estimator,
$$\mathrm{E}\,\hat\theta \qquad \text{and} \qquad \mathrm{E}\left[(\hat\theta - \mathrm{E}\,\hat\theta)(\hat\theta - \mathrm{E}\,\hat\theta)^T\right].$$
A more interesting measure of the performance of an estimator is the mean deviation from the estimated quantity.
Definition 2
The error covariance matrix $C$ of an estimator is
$$C \triangleq \mathrm{E}\left[(\theta - \hat\theta)(\theta - \hat\theta)^T\right].$$
Using the definition above, the error covariance can be written
$$C = \mathrm{E}\left[(\hat\theta - \mathrm{E}\,\hat\theta)(\hat\theta - \mathrm{E}\,\hat\theta)^T\right] + \left[\mathrm{E}(\theta - \hat\theta)\right]\left[\mathrm{E}(\theta - \hat\theta)\right]^T.$$
For an unbiased estimator the second term vanishes and the error covariance matrix equals the estimator covariance matrix,
$$\mathrm{E}\,\hat\theta = \theta \;\Rightarrow\; C = \mathrm{E}\left[(\hat\theta - \mathrm{E}\,\hat\theta)(\hat\theta - \mathrm{E}\,\hat\theta)^T\right].$$
Let $\|x\|^2$ denote the usual squared 2-norm of a vector, $\|x\|^2 = x^T x$. The diagonal elements of $C$ are the mean square errors of each scalar estimator $\hat\theta_n$; the sum of these is the mean square error of the total estimator,
$$\operatorname{tr} C = \mathrm{E}\left[\|\theta - \hat\theta\|^2\right].$$
The performance of an estimator can be evaluated using Monte Carlo analysis to find this mean square error, evaluating the estimator $M$ times for different noise realizations,
$$\frac{1}{M} \sum_{k=1}^{M} \left\|\theta - \hat\theta^{(k)}\right\|^2 \to \operatorname{tr} C \qquad \text{as } M \to \infty.$$
Above, $\hat\theta^{(k)}$ denotes the estimate at Monte Carlo run $k$. The Fisher information matrix will give us a lower bound on $C$ that can be compared with the Monte Carlo result when evaluating an estimator.
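As a concrete illustration of this Monte Carlo evaluation, the following sketch (not from the report; the toy estimation problem, sample size, and noise level are assumptions of this example) estimates tr C for the sample-mean estimator of a Gaussian mean and compares it with the exact value.

```python
import numpy as np

# Monte Carlo estimate of tr C for a toy problem: estimate the mean
# theta of z ~ N(theta, sigma^2 I) from N samples. The sample-mean
# estimator is unbiased, so its mean square error should approach
# tr C = n * sigma^2 / N as the number of runs M grows.

rng = np.random.default_rng(0)
theta = np.array([1.0, -2.0])    # true parameter (n = 2)
sigma, N, M = 0.5, 10, 20000     # noise level, samples per run, MC runs

sq_err = 0.0
for _ in range(M):
    z = theta + sigma * rng.standard_normal((N, 2))  # one noise realization
    theta_hat = z.mean(axis=0)                       # estimate for this run
    sq_err += np.sum((theta - theta_hat) ** 2)

mse = sq_err / M          # Monte Carlo estimate of tr C
tr_C = 2 * sigma**2 / N   # exact tr C for the sample-mean estimator
print(mse, tr_C)          # the two values should be close
```

For this simple estimator the bound is attained, so the Monte Carlo sum converges to the exact trace of the error covariance.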
Replacing the realization by the stochastic variable in the score function, we get the stochastic score function
$$s(\theta) = \frac{\partial}{\partial\theta} L(\theta; z) = \frac{\partial}{\partial\theta} \ln p_z(z; \theta).$$
Theorem 1
$$\mathrm{E}\, s(\theta) = 0.$$
Proof:
$$\mathrm{E}\, s(\theta) = \mathrm{E}\, \frac{\partial}{\partial\theta} \ln p_z(z;\theta) = \int \left[\frac{\partial}{\partial\theta} \ln p_z(x;\theta)\right] p_z(x;\theta)\, dx = \int \frac{\partial}{\partial\theta} p_z(x;\theta)\, dx = \frac{\partial}{\partial\theta} \int p_z(x;\theta)\, dx = 0. \qquad \Box$$
Definition 3 (The Fisher information matrix)
The Fisher information matrix $J(\theta)$ is the covariance matrix of the score function,
$$J(\theta) \triangleq \mathrm{E}\left[s(\theta)\, s^T(\theta)\right] = \mathrm{E}\left[\left(\frac{\partial}{\partial\theta} \ln p_z(z;\theta)\right)\left(\frac{\partial}{\partial\theta} \ln p_z(z;\theta)\right)^T\right].$$
There is an alternative, and in some cases simpler, expression for the information matrix.
Theorem 2
$$J(\theta) = -\mathrm{E}\left[\frac{\partial}{\partial\theta}\left(\frac{\partial}{\partial\theta} \ln p_z(z;\theta)\right)^T\right].$$
Proof:
Since
$$\frac{\partial}{\partial\theta} \ln p_z(x;\theta) = \frac{1}{p_z(x;\theta)} \frac{\partial}{\partial\theta} p_z(x;\theta) \qquad (3)$$
the second gradient of $\ln p_z(x;\theta)$ is
$$\frac{\partial}{\partial\theta}\left(\frac{\partial}{\partial\theta} \ln p_z(x;\theta)\right)^T = \frac{p_z(x;\theta)\, \frac{\partial}{\partial\theta}\left(\frac{\partial}{\partial\theta} p_z(x;\theta)\right)^T - \left(\frac{\partial}{\partial\theta} p_z(x;\theta)\right)\left(\frac{\partial}{\partial\theta} p_z(x;\theta)\right)^T}{\left(p_z(x;\theta)\right)^2}$$
$$= \frac{\frac{\partial}{\partial\theta}\left(\frac{\partial}{\partial\theta} p_z(x;\theta)\right)^T}{p_z(x;\theta)} - \left(\frac{\partial}{\partial\theta} \ln p_z(x;\theta)\right)\left(\frac{\partial}{\partial\theta} \ln p_z(x;\theta)\right)^T.$$
Taking expectation on both sides yields the desired result since, using (3), we get
$$\mathrm{E}\left[\frac{\frac{\partial}{\partial\theta}\left(\frac{\partial}{\partial\theta} p_z(z;\theta)\right)^T}{p_z(z;\theta)}\right] = \int \frac{\partial}{\partial\theta}\left(\frac{\partial}{\partial\theta} p_z(x;\theta)\right)^T dx = \frac{\partial}{\partial\theta} \int p_z(x;\theta)\left(\frac{\partial}{\partial\theta} \ln p_z(x;\theta)\right)^T dx = \frac{\partial}{\partial\theta}\, \mathrm{E}\, s^T(\theta) = 0$$
by Theorem 1. $\Box$
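The two expressions for the information matrix can be checked numerically in a simple case. The sketch below (an illustration, not from the report) uses a scalar Gaussian $z \sim \mathcal{N}(\theta, \sigma^2)$, for which the score is $s(\theta) = (z - \theta)/\sigma^2$ and $J(\theta) = 1/\sigma^2$; the grid width and resolution are arbitrary choices.

```python
import numpy as np

# Numeric sanity check of Theorem 1, Definition 3 and Theorem 2 for a
# scalar Gaussian z ~ N(theta, sigma^2): the score is
# s(theta) = (z - theta)/sigma^2, so E s = 0 and
# J = E[s^2] = -E[ds/dtheta] = 1/sigma^2.

theta, sigma = 1.5, 0.7
x = np.linspace(theta - 8 * sigma, theta + 8 * sigma, 200001)
dx = x[1] - x[0]
p = np.exp(-0.5 * ((x - theta) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
s = (x - theta) / sigma**2                 # score as a function of the sample

mean_score = np.sum(s * p) * dx            # Theorem 1: should be ~0
J_outer = np.sum(s**2 * p) * dx            # Definition 3: E[s s^T]
J_hess = np.sum((1 / sigma**2) * p) * dx   # Theorem 2: -E[second gradient]
print(mean_score, J_outer, J_hess)         # ~0, ~1/sigma^2, ~1/sigma^2
```

Both Riemann sums agree with $1/\sigma^2$, illustrating that the outer-product and negative-Hessian forms of $J(\theta)$ coincide.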
2.3 The Cramer-Rao bound
For the derivation of the Cramer-Rao bound an intermediate result is needed.
Theorem 3
The cross covariance matrix between the score function and any unbiased estimator is the identity matrix, i.e., if $\mathrm{E}\,\hat\theta = \theta$ then
$$\mathrm{E}\left[s(\theta)(\hat\theta - \theta)^T\right] = I.$$
Proof:
The unbiasedness may be expressed as
$$\int \left(\hat\theta(x) - \theta\right)^T p_z(x;\theta)\, dx = 0.$$
Taking the gradient of this expression with respect to $\theta$ gives
$$\int \frac{\partial}{\partial\theta}\left[\left(\hat\theta(x) - \theta\right)^T p_z(x;\theta)\right] dx = 0$$
$$\int \left[\frac{\partial}{\partial\theta} p_z(x;\theta)\right] \left(\hat\theta(x) - \theta\right)^T dx - \int I\, p_z(x;\theta)\, dx = 0.$$
Using (3), we get
$$\int \left[\frac{\partial}{\partial\theta} \ln p_z(x;\theta)\right] \left(\hat\theta(x) - \theta\right)^T p_z(x;\theta)\, dx = I. \qquad \Box$$
The famous Cramer-Rao lower bound can now be proven.
Theorem 4 (The Cramer-Rao bound)
Let $z$ be an $n$-dimensional stochastic variable with PDF $p_z(x;\theta)$. For any unbiased estimator $\hat\theta(z)$, the error covariance matrix is bounded by
$$C = \mathrm{E}\left[(\theta - \hat\theta)(\theta - \hat\theta)^T\right] \geq J^{-1}(\theta)$$
if $J(\theta) > 0$.
Proof:
(From [Sch91].) Study the stochastic vector
$$\begin{bmatrix} \hat\theta - \theta \\ s(\theta) \end{bmatrix};$$
by Theorem 3 its covariance matrix is
$$Q = \mathrm{E} \begin{bmatrix} \hat\theta - \theta \\ s(\theta) \end{bmatrix} \begin{bmatrix} (\hat\theta - \theta)^T & s^T(\theta) \end{bmatrix} = \begin{bmatrix} C & I \\ I & J \end{bmatrix},$$
which can be diagonalized if $J > 0$:
$$\begin{bmatrix} I & -J^{-1} \\ 0 & I \end{bmatrix} \begin{bmatrix} C & I \\ I & J \end{bmatrix} \begin{bmatrix} I & 0 \\ -J^{-1} & I \end{bmatrix} = \begin{bmatrix} C - J^{-1} & 0 \\ 0 & J \end{bmatrix}.$$
Since $Q$ is a covariance matrix it is positive semidefinite by definition. Further, $Q \geq 0 \Rightarrow A^T Q A \geq 0$, and we get
$$C \geq J^{-1}. \qquad \Box$$
3 Application - Terrain-Aided Navigation
Both civil and military aircraft navigation is based on inertial navigation systems (INS). These dead-reckoning systems measure the accelerations and attitude angle rates of the aircraft. The position and velocity of the aircraft are found by integrating these measurements from a given initial position and velocity. Due to the open-loop configuration of these dead-reckoning systems, their error drifts with time. Using terrain information is one among several ways to limit the errors in an INS.
The principle of terrain-aided navigation is as follows. A combination of the INS and a barometric pressure sensor gives the aircraft's absolute altitude over mean sea-level. With a vertical radar the aircraft measures the distance to the ground below it; the difference between the absolute altitude and this ground clearance measurement is an estimate of the terrain altitude over mean sea-level. By comparing measurements of these terrain altitude variations with a pre-stored map of the terrain, it is possible to estimate the aircraft position. For a more detailed background on the terrain-aided navigation problem, see [Ber96].
3.1 The estimation model
The estimation of the aircraft position in the map is a nonlinear problem. The INS gives estimates of the aircraft movement between measurements, and the map gives the nonlinear relation between the measurements and the position of the aircraft. Let $x_k$ denote the two-dimensional position of the aircraft over the map. Assuming no bias errors in the absolute aircraft altitude, the model of the estimation problem is
$$\begin{aligned} x_{k+1} &= x_k + u_k + v_k \\ y_k &= h(x_k) + e_k \end{aligned} \qquad k = 0, 1, \ldots \qquad (4)$$
The INS provides the estimate $u_k$ of heading and distance flown from the last measurement to the next; the process noise $v_k$ models the error in this estimate. The nonlinear function $h(\cdot)$ is the on-board map, stored as a mesh of terrain-height values at known positions. The terrain height over mean sea-level at any point inside the map can be found through interpolation of the neighboring values. The additive measurement error $e_k$ models errors in the vertical radar, errors in the map, and errors in the absolute aircraft altitude.
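The interpolation step can be sketched as follows. The report does not specify which interpolation scheme is used, so the bilinear scheme, the map array, and the grid spacing below are all assumptions of this illustration.

```python
import numpy as np

# Sketch of evaluating a stored terrain map h(.) at an arbitrary 2-D
# position by bilinear interpolation of the four neighboring grid values.

def terrain_height(pos, height_grid, cell_size):
    """Bilinearly interpolate the terrain height at a 2-D position.

    pos         : (x, y) position in the same units as cell_size
    height_grid : 2-D array of terrain heights at the grid nodes
    cell_size   : grid spacing (assumed equal along both axes)
    """
    gx, gy = pos[0] / cell_size, pos[1] / cell_size
    i, j = int(np.floor(gx)), int(np.floor(gy))   # lower-left grid node
    tx, ty = gx - i, gy - j                       # fractional offsets in [0,1)
    h = height_grid
    return ((1 - tx) * (1 - ty) * h[i, j] + tx * (1 - ty) * h[i + 1, j]
            + (1 - tx) * ty * h[i, j + 1] + tx * ty * h[i + 1, j + 1])

# Made-up example: a 3x3 height mesh with 100 m grid spacing.
grid = np.array([[10.0, 20.0, 30.0],
                 [15.0, 25.0, 35.0],
                 [20.0, 30.0, 40.0]])
print(terrain_height((50.0, 50.0), grid, 100.0))  # prints 17.5 (cell center)
```

At a cell center the result is simply the average of the four corner heights, and at a grid node the stored value is recovered exactly.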
As seen in (4), the problem is linear in the state update but nonlinear in the measurements. Further, the nonlinearity has no structure, since the map is just a set of points on a grid. Thus the estimation problem is non-trivial, and any algorithm solving it must be approximate. For this reason there is a need to know just how well an algorithm can perform over a certain map. The Cramer-Rao bound gives a lower bound on the mean square error performance of any unbiased state estimation algorithm. Comparing the Cramer-Rao bound with Monte Carlo simulations will give an indication of how "approximate" the algorithm is.
Let $\mathcal{N}(m, P)$ denote the $n$-dimensional Gaussian distribution with mean vector $m$ and covariance matrix $P$,
$$\mathcal{N}(m, P) = \frac{1}{\sqrt{(2\pi)^n |P|}} \exp\left(-\frac{1}{2}(x - m)^T P^{-1}(x - m)\right).$$
The noises in the model (4) are assumed to be white, independent of each other, and distributed as
$$v_k \sim p_{v_k}(x) = \mathcal{N}(0, Q_k), \qquad e_k \sim p_{e_k}(x) = \mathcal{N}(0, R_k).$$
The prior distribution of the position is described by
$$x_0 \sim p_{x_0}(x) = \mathcal{N}(\bar x_0, P_0)$$
and is assumed to be independent of both measurement and process noise. Note that the model is two-dimensional with scalar measurements, but we will keep the matrix notation so that the results below are valid for general dimensions of both $x_k$ and $y_k$. We will assume that both $Q_k$ and $R_k$ are strictly positive definite so that their inverses always exist.
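A simulation of the model (4) under these assumptions can be sketched as follows. A smooth analytic terrain function stands in for the stored map $h(\cdot)$, and all numerical values (noise levels, trajectory, terrain shape) are made-up choices of this illustration.

```python
import numpy as np

# Sketch of simulating the TAN model (4):
#   x_{k+1} = x_k + u_k + v_k,   y_k = h(x_k) + e_k
# with white Gaussian v_k ~ N(0, Q) and e_k ~ N(0, R).

rng = np.random.default_rng(1)

def h(x):  # stand-in terrain height [m] at 2-D position x
    return 100.0 + 20.0 * np.sin(x[0] / 500.0) + 10.0 * np.cos(x[1] / 300.0)

Q = np.diag([25.0, 25.0])       # process noise covariance (INS drift)
R = 4.0                         # measurement noise variance (radar, map, ...)
x = np.array([1000.0, 2000.0])  # true initial position
u = np.array([50.0, 10.0])      # INS-reported movement per sample

xs, ys = [], []
for k in range(100):
    ys.append(h(x) + np.sqrt(R) * rng.standard_normal())   # y_k
    x = x + u + rng.multivariate_normal(np.zeros(2), Q)    # x_{k+1}
    xs.append(x.copy())

print(len(xs), len(ys))  # 100 states and 100 measurements generated
```

Runs like this one provide the noise realizations needed for the Monte Carlo evaluation discussed in Section 2.2.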
3.2 The Cramer-Rao bound of the TAN problem
We will consider the one-step prediction problem, finding a lower bound on
$$\mathrm{E}\left[(x_k - \hat x_{k|k-1})(x_k - \hat x_{k|k-1})^T\right];$$
the filter result follows analogously.
The key to the Cramer-Rao bound is the likelihood. Due to the nonlinear measurement equation in (4), the likelihood of all measurements gathered until time $k-1$, given that the actual position at time $k$ is $x_k$, is hard to find. Even though the state update is linear, and thus the states are Gaussian distributed, the unstructured nonlinearity in the measurement equation makes the distribution of the measurements impossible to find. The trick here is to consider the whole sequence of states from the initial time $k = 0$ until the present time and derive the bound for the estimation of this whole sequence. The bound on the one-step predictor performance can then be found from this result.
Denote the complete state history by $X_k = [x_0^T\; x_1^T\; \cdots\; x_k^T]^T$ and similarly $Y_k$ for all measurements taken until time $k$. Using (2) and the whiteness of $v_k$ repeatedly, the prior for $X_k$ is
$$p_{X_k}(X_k) = p_{x_k|X_{k-1}}(x_k | X_{k-1}) \cdots p_{x_1|x_0}(x_1 | x_0)\, p_{x_0}(x_0) = p_{x_0}(x_0) \prod_{i=1}^{k} p_{v_{i-1}}(x_i - x_{i-1} - u_{i-1}). \qquad (5)$$
Likewise, with (2) and the model (4), the likelihood is
$$p_{Y_{k-1}|X_k}(Y_{k-1} | X_k) = p_{y_{k-1}|Y_{k-2}, X_k}(y_{k-1} | Y_{k-2}, X_k) \cdots p_{y_0|X_k}(y_0 | X_k) = \prod_{i=0}^{k-1} p_{e_i}(y_i - h(x_i)) \qquad (6)$$
since $e_k$ is white.
The stochastic score function for the estimation of the complete state history $X_k$ from all measurements up to and including time $k-1$ is
$$s(X_k; Y_{k-1}) = \frac{\partial}{\partial X_k} \ln p_{Y_{k-1}|X_k}(Y_{k-1} | X_k) + \frac{\partial}{\partial X_k} \ln p_{X_k}(X_k). \qquad (7)$$
Denoting the weighted quadratic norm by $\|x\|^2_Q = x^T Q x$ and inserting (5) and (6) in (7), we have
$$s(X_k; Y_{k-1}) = -\frac{\partial}{\partial X_k} \left[ \frac{1}{2} \|x_0 - \bar x_0\|^2_{P_0^{-1}} + \sum_{i=0}^{k-1} \frac{1}{2} \|y_i - h(x_i)\|^2_{R_i^{-1}} + \sum_{i=1}^{k} \frac{1}{2} \|x_i - x_{i-1} - u_{i-1}\|^2_{Q_{i-1}^{-1}} \right].$$
In Appendix B some common formulas for vector gradients are summarized.
Using these we get
$$s(X_k; Y_{k-1}) = \begin{bmatrix} D_0 - P_0^{-1}(x_0 - \bar x_0) \\ D_1 - Q_0^{-1}(x_1 - x_0 - u_0) \\ \vdots \\ D_{k-1} - Q_{k-2}^{-1}(x_{k-1} - x_{k-2} - u_{k-2}) \\ -Q_{k-1}^{-1}(x_k - x_{k-1} - u_{k-1}) \end{bmatrix}$$
where
$$D_i = \frac{\partial h^T(x_i)}{\partial x_i} R_i^{-1}\left(y_i - h(x_i)\right) + Q_i^{-1}(x_{i+1} - x_i - u_i).$$
Taking the gradient of the stochastic score function above yields
$$\frac{\partial}{\partial X_k} s^T(X_k; Y_{k-1}) = \begin{bmatrix} D_0' - P_0^{-1} & Q_0^{-1} & & & \\ Q_0^{-1} & D_1' - Q_0^{-1} & Q_1^{-1} & & \\ & \ddots & \ddots & \ddots & \\ & & Q_{k-2}^{-1} & D_{k-1}' - Q_{k-2}^{-1} & Q_{k-1}^{-1} \\ & & & Q_{k-1}^{-1} & -Q_{k-1}^{-1} \end{bmatrix} \qquad (8)$$
where
$$D_i' = \frac{\partial}{\partial x_i} \left[ \frac{\partial h^T(x_i)}{\partial x_i} R_i^{-1}\left(y_i - h(x_i)\right) \right] - \frac{\partial h^T(x_i)}{\partial x_i} R_i^{-1} \left(\frac{\partial h^T(x_i)}{\partial x_i}\right)^T - Q_i^{-1}.$$
Theorem 2 says that
$$J(X_k) = -\mathrm{E}\left[\frac{\partial}{\partial X_k} s^T(X_k; Y_{k-1})\right],$$
i.e., taking the negative expectation of (8) yields the Fisher information matrix for the complete state history. Since the $y_i$ are the only stochastic entities in (8) and $\mathrm{E}\, y_i = h(x_i)$, we get
$$J(X_k) = \begin{bmatrix} S_0 + P_0^{-1} & -Q_0^{-1} & & & \\ -Q_0^{-1} & S_1 + Q_0^{-1} & -Q_1^{-1} & & \\ & \ddots & \ddots & \ddots & \\ & & -Q_{k-2}^{-1} & S_{k-1} + Q_{k-2}^{-1} & -Q_{k-1}^{-1} \\ & & & -Q_{k-1}^{-1} & Q_{k-1}^{-1} \end{bmatrix} \qquad (9)$$
where
$$S_i = H_i R_i^{-1} H_i^T + Q_i^{-1} \qquad \text{and} \qquad H_i = \frac{\partial h^T(x_i)}{\partial x_i}.$$
Note that the information matrix (9) will depend on the actual state values $X_k$. The information matrix can be rewritten compactly as a recursion in $k$. Introduce the abbreviated notation $J_k = J(X_k)$ and the partitioning
$$J_k = \begin{bmatrix} J_k^{11} & J_k^{12} \\ (J_k^{12})^T & J_k^{22} \end{bmatrix} = \begin{bmatrix} J_{k-1}^{11} & J_{k-1}^{12} & 0 \\ (J_{k-1}^{12})^T & J_{k-1}^{22} + S_{k-1} & -Q_{k-1}^{-1} \\ 0 & -Q_{k-1}^{-1} & Q_{k-1}^{-1} \end{bmatrix} \qquad (10)$$
starting the recursion with
$$J_0 = J_0^{22} = P_0^{-1}.$$
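The growth of the information matrix described by (10) is easy to implement: each step adds $S_{k-1}$ to the current lower-right block and appends one block row and column. The sketch below (an illustration; the dimensions and the random positive definite stand-ins for $P_0$, $Q_i$ and $S_i$ are made-up) builds $J_k$ this way and checks its expected properties.

```python
import numpy as np

# Sketch of the information-matrix recursion (10): J_k is obtained from
# J_{k-1} by adding S_{k-1} to the lower-right block and appending one
# block row/column with -Q_{k-1}^{-1} off-diagonal and Q_{k-1}^{-1} corner.

n = 2
rng = np.random.default_rng(2)

def rand_spd(n):  # random symmetric positive definite stand-in matrix
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

k_max = 4
P0 = rand_spd(n)
Qs = [rand_spd(n) for _ in range(k_max)]
Ss = [rand_spd(n) for _ in range(k_max)]  # stand-ins for H_i R_i^-1 H_i^T + Q_i^-1

J = np.linalg.inv(P0)                     # J_0 = P_0^{-1}
for k in range(1, k_max + 1):
    Qinv = np.linalg.inv(Qs[k - 1])
    m = J.shape[0]
    J_new = np.zeros((m + n, m + n))
    J_new[:m, :m] = J
    J_new[m - n:m, m - n:m] += Ss[k - 1]  # add S_{k-1} to lower-right block
    J_new[m:, m - n:m] = -Qinv            # new off-diagonal blocks
    J_new[m - n:m, m:] = -Qinv
    J_new[m:, m:] = Qinv                  # new corner block Q_{k-1}^{-1}
    J = J_new

# J should be symmetric and, with positive definite building blocks,
# positive definite, matching the block-tridiagonal structure of (9).
print(J.shape, np.allclose(J, J.T), bool(np.all(np.linalg.eigvalsh(J) > 0)))
```

The final matrix has the block-tridiagonal shape of (9), with $(k+1)$ block rows for states $x_0, \ldots, x_k$.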
Alternatively, the information matrix for the filter estimate of $X_k$ given $Y_k$ can be derived. It is straightforward to verify that, with obvious notation, the information matrix obeys the following two-step (measurement update, time update) recursion:
$$J_{k|k} = \begin{bmatrix} J_{k|k-1}^{11} & J_{k|k-1}^{12} \\ (J_{k|k-1}^{12})^T & J_{k|k-1}^{22} + H_k R_k^{-1} H_k^T \end{bmatrix} \qquad (11)$$
$$J_{k+1|k} = \begin{bmatrix} J_{k|k}^{11} & J_{k|k}^{12} & 0 \\ (J_{k|k}^{12})^T & J_{k|k}^{22} + Q_k^{-1} & -Q_k^{-1} \\ 0 & -Q_k^{-1} & Q_k^{-1} \end{bmatrix}. \qquad (12)$$
The Cramer-Rao bound, Theorem 4, says that
$$C_{k+1|k} = \mathrm{E}\left[(X_{k+1} - \hat X_{k+1|k})(X_{k+1} - \hat X_{k+1|k})^T\right] \geq J_{k+1|k}^{-1}.$$
Using the corollary to the block matrix inversion lemma in Appendix A, Corollary 1, we have
$$J_{k+1|k}^{-1} = M_{k+1|k} = \begin{bmatrix} M_{k|k}^{11} & M_{k|k}^{12} & M_{k|k}^{12} \\ (M_{k|k}^{12})^T & M_{k|k}^{22} & M_{k|k}^{22} \\ (M_{k|k}^{12})^T & M_{k|k}^{22} & M_{k|k}^{22} + Q_k \end{bmatrix}$$
where $M_{k|k} = J_{k|k}^{-1}$ is partitioned conformably with (11).
As mentioned above, we assume that $Q_k > 0$. Making the same partitioning of the error covariance,
$$C_{k+1|k} = \mathrm{E}\begin{bmatrix} \tilde X_{k|k} \tilde X_{k|k}^T & \tilde X_{k|k} \tilde x_{k+1|k}^T \\ \tilde x_{k+1|k} \tilde X_{k|k}^T & \tilde x_{k+1|k} \tilde x_{k+1|k}^T \end{bmatrix} = \begin{bmatrix} C_{k|k} & \star \\ \star & C_{k+1|k} \end{bmatrix}$$
where $\tilde X_{k|k} = X_k - \hat X_{k|k}$ and $\tilde x_{k+1|k} = x_{k+1} - \hat x_{k+1|k}$. Summarizing the relations above, the Cramer-Rao bound says that
$$\begin{bmatrix} C_{k|k} & \star \\ \star & C_{k+1|k} \end{bmatrix} \geq \begin{bmatrix} M_{k|k} & \star \\ \star & M_{k|k}^{22} + Q_k \end{bmatrix}.$$
Thus, introducing the notation $P_{k+1}$ for the Cramer-Rao bound on the one-step-ahead predictor error covariance, we have
$$C_{k+1|k} \geq M_{k|k}^{22} + Q_k = P_{k+1}.$$
From (11), (12) and Lemma 1 we get
$$P_{k+1}^{-1} = J_{k+1|k}^{22} - (J_{k+1|k}^{12})^T (J_{k+1|k}^{11})^{-1} J_{k+1|k}^{12} = Q_k^{-1} - \begin{bmatrix} 0 & -Q_k^{-1} \end{bmatrix} (J_{k+1|k}^{11})^{-1} \begin{bmatrix} 0 \\ -Q_k^{-1} \end{bmatrix}$$
$$= Q_k^{-1} - Q_k^{-1} \Big( Q_k^{-1} + H_k R_k^{-1} H_k^T + \underbrace{J_{k|k-1}^{22} - (J_{k|k-1}^{12})^T (J_{k|k-1}^{11})^{-1} J_{k|k-1}^{12}}_{P_k^{-1}} \Big)^{-1} Q_k^{-1}.$$
That is, the Cramer-Rao bound for the complete state sequence says that the one-step-ahead predictor error covariance is bounded by
$$C_{k+1|k} \geq P_{k+1} = \left[ Q_k^{-1} - Q_k^{-1} \left( H_k R_k^{-1} H_k^T + Q_k^{-1} + P_k^{-1} \right)^{-1} Q_k^{-1} \right]^{-1}.$$
Using the matrix inversion lemma, Lemma 2 in Appendix A, twice on the expression above, a familiar recursion for $P_k$ is found:
$$P_{k+1} = \left[ Q_k^{-1} - Q_k^{-1} \left( H_k R_k^{-1} H_k^T + Q_k^{-1} + P_k^{-1} \right)^{-1} Q_k^{-1} \right]^{-1} = Q_k + \left( P_k^{-1} + H_k R_k^{-1} H_k^T \right)^{-1} = P_k - P_k H_k \left( H_k^T P_k H_k + R_k \right)^{-1} H_k^T P_k + Q_k.$$
The last expression coincides with the recursion for the error covariance in the discrete-time Kalman filter; see, e.g., [AM79]. Thus, the expression above is the Riccati recursion for the Kalman filter state error covariance of the model (4), with the nonlinear function $h(\cdot)$ replaced by its gradient evaluated at the true state value at time $k$. This result is rather reassuring since, if the nonlinearity in fact is linear, we can never outperform the Kalman filter, which is known to be the optimal state estimator for the linear model.
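The chain of results above can be checked numerically: the lower-right block of $J_{k+1|k}^{-1}$ built with the updates (11) and (12) should coincide with the Riccati recursion for $P_k$. The sketch below (an illustration; $H_k$, $Q_k$, $R_k$ and $P_0$ are made-up values, with a fixed linearization standing in for the terrain gradient) performs this comparison.

```python
import numpy as np

# Compare two routes to the one-step-ahead Cramer-Rao bound P_k:
#  (a) the Riccati recursion
#      P_{k+1} = P_k - P_k H_k (H_k^T P_k H_k + R_k)^{-1} H_k^T P_k + Q_k
#  (b) the lower-right block of the inverse of the predictor information
#      matrix J_{k+1|k} built with updates (11) and (12).

n, steps = 2, 5
rng = np.random.default_rng(3)
P0 = np.diag([100.0, 100.0])
Q = np.diag([4.0, 4.0])
R = np.array([[1.0]])
Hs = [rng.standard_normal((n, 1)) for _ in range(steps)]  # terrain gradients

# (a) Riccati recursion for the bound.
P = P0.copy()
for H in Hs:
    P = P - P @ H @ np.linalg.inv(H.T @ P @ H + R) @ H.T @ P + Q

# (b) Information matrix of the full state history.
J = np.linalg.inv(P0)        # J_{0|-1} = P_0^{-1}
Qinv = np.linalg.inv(Q)
for H in Hs:
    m = J.shape[0]
    J[m - n:, m - n:] += H @ np.linalg.inv(R) @ H.T   # (11) measurement update
    J_new = np.zeros((m + n, m + n))                  # (12) time update
    J_new[:m, :m] = J
    J_new[m - n:m, m - n:m] += Qinv
    J_new[m:, m - n:m] = -Qinv
    J_new[m - n:m, m:] = -Qinv
    J_new[m:, m:] = Qinv
    J = J_new

P_from_J = np.linalg.inv(J)[-n:, -n:]  # lower-right block of M_{k+1|k}
print(np.allclose(P, P_from_J))        # the two routes should agree
```

Agreement between the two routes mirrors the derivation in the text, where the Schur complement of $J_{k+1|k}$ reduces to the Riccati recursion.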
4 Conclusion
To summarize the result of this text, any unbiased estimator of the states in (4) has an error covariance that, at time $k$, is bounded by the relation
$$\mathrm{E}\left[(x_k - \hat x_{k|k-1})(x_k - \hat x_{k|k-1})^T\right] \geq P_k$$
where
$$P_{k+1} = P_k - P_k H_k \left( H_k^T P_k H_k + R_k \right)^{-1} H_k^T P_k + Q_k$$
and the recursion is initialized with the initial state covariance $P_0$. Further, $H_k$ is the gradient of $h(\cdot)$, evaluated at the true state value at time $k$,
$$H_k = \frac{\partial h^T(x_k)}{\partial x_k}.$$
We refer to [Ber97] for some simulation analysis using the Cramer-Rao bound for TAN.
A Some Matrix Relations
In this appendix some useful matrix identities are summarized; the relations are taken from [Kai80]. The inverse of a block matrix is described in the following lemma.
Lemma 1 (Block matrix inversion)
If $A^{-1}$ exists,
$$\begin{bmatrix} A & D \\ C & B \end{bmatrix}^{-1} = \begin{bmatrix} A^{-1} + A^{-1} D \Delta^{-1} C A^{-1} & -A^{-1} D \Delta^{-1} \\ -\Delta^{-1} C A^{-1} & \Delta^{-1} \end{bmatrix}$$
where $\Delta = B - C A^{-1} D$.
Proof:
Multiplying the original block matrix with the right-hand-side expression yields an identity block matrix. $\Box$
Note that the matrix $\Delta$ is known as the Schur complement of $A$ in $\begin{bmatrix} A & D \\ C & B \end{bmatrix}$.
Corollary 1
If $B^{-1}$ exists,
$$\begin{bmatrix} A & D \\ C & B \end{bmatrix}^{-1} = \begin{bmatrix} \Delta^{-1} & -\Delta^{-1} D B^{-1} \\ -B^{-1} C \Delta^{-1} & B^{-1} + B^{-1} C \Delta^{-1} D B^{-1} \end{bmatrix}$$
where $\Delta = A - D B^{-1} C$.
Proof:
Apply the permutation matrix
$$P = \begin{bmatrix} 0 & I \\ I & 0 \end{bmatrix} = P^{-1}$$
on both sides of the expression in the lemma above. $\Box$
A very useful identity in systems theory is a relation for the inverse of a sum of matrices, known as the matrix inversion lemma.
Lemma 2 (The matrix inversion lemma)
If $A^{-1}$ and $C^{-1}$ exist,
$$(A + BCD)^{-1} = A^{-1} - A^{-1} B \left( D A^{-1} B + C^{-1} \right)^{-1} D A^{-1}.$$
Proof:
Multiplying the right-hand side with $A + BCD$ yields an identity matrix. $\Box$
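The matrix inversion lemma is easy to spot-check numerically. The sketch below (an illustration; the matrix sizes and the well-conditioned random matrices are arbitrary choices) verifies the identity on a random instance.

```python
import numpy as np

# Numeric spot check of the matrix inversion lemma:
# (A + B C D)^{-1} = A^{-1} - A^{-1} B (D A^{-1} B + C^{-1})^{-1} D A^{-1}

rng = np.random.default_rng(4)
n, m = 4, 2
# Shifted random matrices keep A, C and A + BCD comfortably invertible.
A = 5 * np.eye(n) + 0.3 * rng.standard_normal((n, n))
C = 5 * np.eye(m) + 0.3 * rng.standard_normal((m, m))
B = 0.5 * rng.standard_normal((n, m))
D = 0.5 * rng.standard_normal((m, n))

Ainv, Cinv = np.linalg.inv(A), np.linalg.inv(C)
lhs = np.linalg.inv(A + B @ C @ D)
rhs = Ainv - Ainv @ B @ np.linalg.inv(D @ Ainv @ B + Cinv) @ D @ Ainv
print(np.allclose(lhs, rhs))  # the identity should hold to rounding error
```

This is exactly the form used twice in Section 3.2 to turn the inverse-covariance expression for $P_{k+1}$ into the Riccati recursion.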
B Vector Gradients
Here we summarize some definitions and relations for vector gradients; more details can be found in the appendix of Chapter 6 in [Sch91].
Let $a(x)$ denote a scalar function of the $p \times 1$ vector $x$. The gradient of $a(\cdot)$ with respect to $x$ is the $p \times 1$ vector
$$\frac{\partial a(x)}{\partial x} = \begin{bmatrix} \frac{\partial a(x)}{\partial x_1} \\ \frac{\partial a(x)}{\partial x_2} \\ \vdots \\ \frac{\partial a(x)}{\partial x_p} \end{bmatrix}.$$
The generalization to vector-valued functions is straightforward: let $a^T(x)$ be a $1 \times n$ row vector; then
$$\frac{\partial a^T(x)}{\partial x} = \begin{bmatrix} \frac{\partial a_1(x)}{\partial x_1} & \cdots & \frac{\partial a_n(x)}{\partial x_1} \\ \vdots & & \vdots \\ \frac{\partial a_1(x)}{\partial x_p} & \cdots & \frac{\partial a_n(x)}{\partial x_p} \end{bmatrix}.$$