Derivation of a Bayesian Bhattacharyya bound for
discrete-time filtering
Carsten Fritsche
Division of Automatic Control
E-mail: carsten@isy.liu.se
20th June 2017
Report no.: LiTH-ISY-R-3099
Address:
Department of Electrical Engineering, Linköpings universitet
SE-581 83 Linköping, Sweden
WWW: http://www.control.isy.liu.se
Technical reports from the Automatic Control group in Linköping are available from http://www.control.isy.liu.se/publications.
Abstract: The Bayesian Bhattacharyya bound for discrete-time filtering proposed by Reece and Nicholson [1] is revisited. It turns out that the general results presented in [1] are incorrect, as some expectations appearing in the information matrix recursions are missing. This report presents the corrected results, and it is argued that the missing expectations vanish only in a number of special cases. A nonlinear toy example is used to illustrate when this is not the case.
1 Background
Let $\hat{x}(y)$ denote any estimator of an $n_x \times 1$ vector random variable $x = [x_1, x_2, \ldots, x_{n_x}]$, and let $y$ denote the vector of measurements. The Weiss-Weinstein family of lower bounds on the $n_x \times n_x$ MSE matrix is given by
\[
E_{y,x}\{(\hat{x}(y) - x)(\hat{x}(y) - x)^T\} \geq V J^{-1} V^T, \qquad (1)
\]
where the $(i,j)$-th elements of the $n_x \times K$ matrix $V$ and the $K \times K$ matrix $J$ are given by
\[
[V]_{i,j} = E_{y,x}\{x_i\, \psi_j(y,x)\}, \qquad (2a)
\]
\[
[J]_{i,j} = E_{y,x}\{\psi_i(y,x)\, \psi_j(y,x)\}, \qquad (2b)
\]
respectively. Note that $\psi_i(y,x)$ is the $i$-th element of the $K \times 1$ vector $\psi(y,x) = [\psi_1(y,x), \ldots, \psi_K(y,x)]^T$, and $\psi_i(y,x)$ has to satisfy the following condition:
\[
\int \psi_i(y,x)\, p(y,x)\, \mathrm{d}x = 0, \qquad \forall y \ \text{and} \ i = 1, \ldots, K. \qquad (3)
\]
In order for the bound matrix to be of full rank, we require that the dimension $K$ of the vector $\psi(y,x)$ be at least $n_x$. A popular choice for the $N$-th order vector parameter Bayesian Bhattacharyya bound is to assume $K = n_x$, such that $\psi_i(y,x)$ is given as follows:
\[
\psi_i(y,x) = \sum_{n=1}^{N} a_{n,i} \cdot \frac{1}{p(y,x)} \frac{\partial^n p(y,x)}{\partial x_i^n}, \qquad i = 1, \ldots, n_x, \qquad (4)
\]
where $\partial^n p(y,x)/\partial x_i^n$ denotes the $n$-th order partial derivative of $p(y,x)$ with respect to $x_i$, and $a_{n,i}$ are real-valued variables that should be chosen such that the bound expression in (1) is maximized.
Lemma 1. The condition given in (3) is satisfied for $\psi_i(y,x)$ given in (4), if we assume that the partial derivatives $\partial^n p(y,x)/\partial x_i^n$, $n = 0, \ldots, (N-1)$, are absolutely continuous with respect to $x_i$ for almost every (a.e.) $y \in \mathbb{R}^{n_y}$, $\{x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_{n_x}\} \in \mathbb{R}^{n_x - 1}$.
Proof. Inserting (4) into (3) gives
\[
\sum_{n=1}^{N} a_{n,i} \int_{\mathbb{R}^{n_x}} \frac{\partial^n p(y,x)}{\partial x_i^n}\, \mathrm{d}x_1 \cdots \mathrm{d}x_{n_x}
= \sum_{n=1}^{N} a_{n,i} \int_{\mathbb{R}^{n_x-1}} \left[ \frac{\partial^{n-1} p(y,x)}{\partial x_i^{n-1}} \right]_{-\infty}^{\infty} \mathrm{d}x_1 \cdots \mathrm{d}x_{i-1}\, \mathrm{d}x_{i+1} \cdots \mathrm{d}x_{n_x}. \qquad (5)
\]
The equation is trivially zero if we assume $a_{n,i} = 0$ for $n = 1, \ldots, N$. It is also zero for $n = 1, \ldots, N$ if we have
\[
\lim_{x_i \to \pm\infty} \frac{\partial^{n-1} p(y,x)}{\partial x_i^{n-1}} = 0, \quad \text{a.e. } y \in \mathbb{R}^{n_y},\ \{x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_{n_x}\} \in \mathbb{R}^{n_x-1} \ \text{and} \ i = 1, \ldots, n_x, \qquad (6)
\]
i.e., the partial derivatives vanish as the function argument $x_i \to \pm\infty$. But this is nothing else than a property of an absolutely continuous function.
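To make the assumption of Lemma 1 concrete, the following symbolic sketch (our illustration; the jointly Gaussian scalar density is an assumed example, not taken from the report) verifies that the inner integrals in (5) indeed vanish for $n = 1, 2$:

```python
# Symbolic check of the regularity condition underlying Lemma 1 (illustration only).
# Assumes a scalar jointly Gaussian density p(y, x) = N(x; 0, 1) * N(y; x, 1),
# which is absolutely continuous in x for every y.
import sympy as sp

x, y = sp.symbols('x y', real=True)
p = sp.exp(-x**2 / 2) / sp.sqrt(2 * sp.pi) * sp.exp(-(y - x)**2 / 2) / sp.sqrt(2 * sp.pi)

for n in (1, 2):
    # Integrate the n-th partial derivative of p(y, x) w.r.t. x over the real line.
    val = sp.integrate(sp.diff(p, x, n), (x, -sp.oo, sp.oo))
    print(n, sp.simplify(val))  # both integrals simplify to 0, as required by (3) and (5)
```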
Corollary 1. A direct consequence of the absolute continuity of the partial derivatives is that, for $n = 1, \ldots, N-1$,
\[
\lim_{x_i \to \pm\infty} x_i \cdot \frac{\partial^{n-1} p(y,x)}{\partial x_i^{n-1}} = 0 \qquad (7)
\]
holds a.e. $y \in \mathbb{R}^{n_y}$, $\{x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_{n_x}\} \in \mathbb{R}^{n_x-1}$.
With (4), Lemma 1 and Corollary 1, the elements of the matrix $V$ in (2a) can be evaluated by distinguishing the following two cases.
• $i \neq j$:
\[
[V]_{i,j} = E_{y,x}\{x_i\, \psi_j(y,x)\}
= \sum_{n=1}^{N} a_{n,j} \int_{\mathbb{R}^{n_y}} \left[ \int_{\mathbb{R}^{n_x}} x_i\, \frac{\partial^n p(y,x)}{\partial x_j^n}\, \mathrm{d}x \right] \mathrm{d}y
= \sum_{n=1}^{N} a_{n,j} \int_{\mathbb{R}^{n_y}} \left[ \int_{\mathbb{R}^{n_x-1}} x_i \left[ \frac{\partial^{n-1} p(y,x)}{\partial x_j^{n-1}} \right]_{-\infty}^{\infty} \mathrm{d}x_1 \cdots \mathrm{d}x_{j-1}\, \mathrm{d}x_{j+1} \cdots \mathrm{d}x_{n_x} \right] \mathrm{d}y = 0, \qquad (8)
\]
• $i = j$:
\[
[V]_{i,i} = E_{y,x}\{x_i\, \psi_i(y,x)\}
= \sum_{n=1}^{N} a_{n,i} \int_{\mathbb{R}^{n_y}} \int_{\mathbb{R}^{n_x}} x_i\, \frac{\partial^n p(y,x)}{\partial x_i^n}\, \mathrm{d}x\, \mathrm{d}y
= \sum_{n=1}^{N} a_{n,i} \int_{\mathbb{R}^{n_y}} \left[ \int_{\mathbb{R}^{n_x-1}} \left\{ \left[ x_i\, \frac{\partial^{n-1} p(y,x)}{\partial x_i^{n-1}} \right]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} \frac{\partial^{n-1} p(y,x)}{\partial x_i^{n-1}}\, \mathrm{d}x_i \right\} \mathrm{d}x_1 \cdots \mathrm{d}x_{i-1}\, \mathrm{d}x_{i+1} \cdots \mathrm{d}x_{n_x} \right] \mathrm{d}y = -a_{1,i}. \qquad (9)
\]
A last condition for the Bayesian Bhattacharyya bound to exist is that the matrix $J$ is non-singular. This condition, together with the conditions provided in Lemma 1 and Corollary 1, constitutes the so-called regularity conditions of the Bayesian Bhattacharyya bound.
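For illustration (this remark is ours and is not contained in [1]), consider the first-order case $N = 1$, for which the bound (1) can be evaluated in closed form. From (4), (8), (9) and (2b) we obtain
\[
\psi_i(y,x) = a_{1,i}\, \frac{\partial \ln p(y,x)}{\partial x_i}, \qquad
V = -\operatorname{diag}(a_{1,1}, \ldots, a_{1,n_x}) =: -A, \qquad
J = A\, J_B\, A,
\]
where $[J_B]_{i,j} = E_{y,x}\{\partial \ln p(y,x)/\partial x_i \cdot \partial \ln p(y,x)/\partial x_j\}$ denotes the Bayesian information matrix. Provided that all $a_{1,i} \neq 0$ and $J_B$ is invertible,
\[
V J^{-1} V^T = A\, (A\, J_B\, A)^{-1} A = J_B^{-1},
\]
i.e., for $N = 1$ the bound (1) reduces to the Bayesian Cramér-Rao bound, irrespective of the choice of the coefficients $a_{1,i}$.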
2 Bayesian Bhattacharyya bounds for discrete-time filtering

2.1 System Model
Consider the following discrete-time nonlinear system
\[
x_k = f_k(x_{k-1}, v_k), \qquad (10a)
\]
\[
y_k = h_k(x_k, w_k), \qquad (10b)
\]
where $y_k \in \mathbb{R}^{n_y}$ is the measurement vector at discrete time $k$, $x_k \in \mathbb{R}^{n_x}$ is the state vector, and $f_k(\cdot)$ and $h_k(\cdot)$ are arbitrary nonlinear mappings of appropriate dimensions. The noise vectors $v_k \in \mathbb{R}^{n_v}$, $w_k \in \mathbb{R}^{n_w}$ and the initial state $x_0$ are assumed mutually independent white processes with arbitrary but known probability density functions (pdfs). We further introduce $X_k = [x_0^T, \ldots, x_k^T]^T$ and $Y_k = [y_1^T, \ldots, y_k^T]^T$, which denote the collections of augmented state and measurement vectors up to time $k$.
In nonlinear filtering, one is interested in estimating the current state $x_k$ from the sequence of available noisy measurements $Y_k$. The corresponding estimator is denoted as $\hat{x}_k(Y_k)$, which is a function of the measurement sequence $Y_k$. The performance of any estimator $\hat{x}_k(Y_k)$ is commonly measured by the mean-square error (MSE) matrix
\[
M(\hat{x}_k) = E_{x_k, Y_k}\{(\hat{x}_k(Y_k) - x_k)(\cdot)^T\}, \qquad (11)
\]
where $E_{x_k, Y_k}\{\cdot\}$ denotes expectation with respect to the joint density $p(x_k, Y_k)$.
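For a computational view, the following Monte Carlo sketch (our illustration; the scalar model functions and the crude estimator are placeholder assumptions, not taken from [1]) approximates the MSE matrix (11) by simulating the model (10):

```python
# Monte Carlo approximation of the MSE matrix (11) for a generic model (10).
# The scalar model and the naive estimator below are illustrative placeholders only.
import numpy as np

rng = np.random.default_rng(0)

def f(x, v):           # state transition x_k = f(x_{k-1}, v_k); placeholder choice
    return 0.9 * x + v

def h(x, w):           # measurement y_k = h(x_k, w_k); placeholder choice
    return x + 0.5 * x**3 + w

def estimator(y_seq):  # any estimator \hat{x}_k(Y_k); here a crude one for illustration
    return y_seq[-1]   # use the latest measurement as the state estimate

K, n_mc = 10, 5000
err2 = np.zeros(K + 1)
for _ in range(n_mc):
    x = rng.normal()                      # x_0 ~ p(x_0)
    ys = []
    for k in range(1, K + 1):
        x = f(x, rng.normal(scale=0.5))   # process noise v_k
        ys.append(h(x, rng.normal()))     # measurement noise w_k
        err2[k] += (estimator(ys) - x) ** 2
mse = err2 / n_mc                         # Monte Carlo estimate of M(\hat{x}_k), k = 1..K
print(mse[1:])
```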
2.2 Bound proposed by Reece and Nicholson
Instead of directly bounding the MSE matrix $M(\hat{x}_k)$ of the current estimate from below by the Bayesian Bhattacharyya bound, the idea followed by Reece and Nicholson (Tichavský et al. [2] first introduced this approach to find a Bayesian Cramér-Rao bound for discrete-time filtering) is to find a lower bound for the MSE matrix of the sequence of estimators $\hat{X}_k(Y_k) = [\hat{x}_0(Y_k), \ldots, \hat{x}_k(Y_k)]^T$, which is given by
\[
M(\hat{X}_k) = E_{X_k, Y_k}\{(\hat{X}_k(Y_k) - X_k)(\cdot)^T\}. \qquad (12)
\]
This matrix has size $n_x(k+1) \times n_x(k+1)$ and is growing with the time index $k$. The MSE matrix for estimating the state $x_k$ can be found by taking the $n_x \times n_x$ lower-right submatrix of $M(\hat{X}_k)$, which can be expressed mathematically as
\[
M(\hat{x}_k) = U\, M(\hat{X}_k)\, U^T, \qquad (13)
\]
with mapping matrix $U = [0, \ldots, 0, I_{n_x}]$, where $I_{n_x}$ is the $n_x \times n_x$ identity matrix and $0$ is a matrix of zeros of appropriate size. Accordingly, a corresponding Weiss-Weinstein family of lower bounds for the MSE matrix of the state vector sequence $X_k$ can be stated as
\[
M(\hat{X}_k) \geq V^{(k)} \left[ J^{(k)} \right]^{-1} V^{(k),T}, \qquad (14)
\]
with matrix elements
\[
\left[ V^{(k)} \right]_{i,j} = E_{Y_k, X_k}\{x_i\, \psi_j(Y_k, X_k)\}, \qquad (15a)
\]
\[
\left[ J^{(k)} \right]_{i,j} = E_{Y_k, X_k}\{\psi_i(Y_k, X_k)\, \psi_j(Y_k, X_k)\}, \qquad (15b)
\]
and where $x_i$ denotes the $i$-th element of $X_k = [x_1, \ldots, x_{(k+1)n_x}]$. Reece and Nicholson suggested to choose
\[
\psi_i(Y_k, X_k) = \sum_{n=1}^{N} a_{n,i} \cdot \frac{1}{p(Y_k, X_k)} \frac{\partial^n p(Y_k, X_k)}{\partial x_i^n}, \qquad i = 1, \ldots, (k+1)n_x, \qquad (16)
\]
where $a_{n,i}$ are arbitrary real numbers that are chosen such that the bound is maximized. For each state vector $x_\xi$ in the state vector sequence $X_k$, we define a vector $a_\xi = [a_{1,1}(\xi), \ldots, a_{N,1}(\xi), \ldots, a_{1,n_x}(\xi), \ldots, a_{N,n_x}(\xi)]^T$ with the mapping $a_{n,m}(\xi) = a_{n,\xi n_x + m}$ and $\xi = 0, \ldots, k$, such that $A_k = [a_0^T, a_1^T, \ldots, a_k^T]^T$. It is further assumed that the following regularity conditions are fulfilled:
• The partial derivatives $\partial^n p(Y_k, X_k)/\partial x_i^n$, $n = 0, \ldots, (N-1)$, are absolutely continuous with respect to $x_i$ a.e. $Y_k$, $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_{(k+1)n_x}$;
• The limit $\lim_{x_i \to \pm\infty} x_i \cdot \left( \partial^{n-1} p(Y_k, X_k)/\partial x_i^{n-1} \right) = 0$, a.e. $Y_k$, $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_{(k+1)n_x}$;
• The matrix $J^{(k)}$ is non-singular.
Then, a corresponding $N$-th order Bayesian Bhattacharyya bound can be derived, which is given by
\[
M(\hat{X}_k) \geq \max_{A_k}\ V^{(k)} \left[ J^{(k)} \right]^{-1} V^{(k),T}, \qquad (17)
\]
with block-diagonal mapping matrix
\[
V^{(k)} = \operatorname{blkdiag}(V_0, V_1, \ldots, V_k), \qquad (18)
\]
where $V_\xi = -\operatorname{diag}([a_{1,1}(\xi), \ldots, a_{1,n_x}(\xi)])$, and information matrix
\[
J^{(k)} \triangleq \begin{bmatrix}
J^{(k)}(0,0) & J^{(k)}(0,1) & \cdots & J^{(k)}(0,k-1) & J^{(k)}(0,k) \\
J^{(k)}(1,0) & J^{(k)}(1,1) & \cdots & J^{(k)}(1,k-1) & J^{(k)}(1,k) \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
J^{(k)}(k-1,0) & J^{(k)}(k-1,1) & \cdots & J^{(k)}(k-1,k-1) & J^{(k)}(k-1,k) \\
J^{(k)}(k,0) & J^{(k)}(k,1) & \cdots & J^{(k)}(k,k-1) & J^{(k)}(k,k)
\end{bmatrix}, \qquad (19)
\]
which is partitioned into blocks $J^{(k)}(\xi,\eta)$, $\xi, \eta = 0, \ldots, k$, each of size $n_x \times n_x$, and whose $(i,j)$-th element is given by
\[
\left[ J^{(k)}(\xi,\eta) \right]_{i,j} = \sum_{m=1}^{N} \sum_{n=1}^{N} a_{m,i}(\xi)\, a_{n,j}(\eta) \cdot E_{Y_k, X_k}\!\left\{ \frac{1}{p^2(Y_k, X_k)} \frac{\partial^m p(Y_k, X_k)}{\partial x_{\xi,i}^m} \frac{\partial^n p(Y_k, X_k)}{\partial x_{\eta,j}^n} \right\}. \qquad (20)
\]
Note that, due to the symmetry of the matrix $J^{(k)}$, the equality $J^{(k)}(\xi,\eta) = J^{(k)}(\eta,\xi)^T$ holds. Note further that the $n_x(k+1) \times n_x(k+1)$ matrix $J^{(k)}$, the $n_x(k+1) \times n_x(k+1)$ matrix $V^{(k)}$ and the vector of optimization variables $A_k$ are growing with time $k$, which eventually makes the computation of the bound infeasible for large $k$, due to the required inversion of the matrix $J^{(k)}$ and the solution of a large optimization problem in (17). Hence, a recursive solution for computing the bound is desired that avoids the inversion of large matrices such as $J^{(k)}$, and that additionally requires optimizing only a subset of the variables in $A_k$. Since we are only interested in a bound for the MSE matrix $M(\hat{x}_k)$ of the current state $x_k$, it is easy to verify that this bound can be obtained from the $n_x \times n_x$ lower-right submatrix of the right-hand side of (17).
2.3 Recursive Calculation of the Bound
Lemma 2. For $\xi \leq k-2$ it holds that
\[
J^{(k)}(\xi, k) = 0. \qquad (21)
\]
Proof. We show that each $(i,j)$-th entry of the matrix $J^{(k)}(\xi, k)$ is zero for $\xi \leq k-2$, i.e.
\[
\left[ J^{(k)}(\xi, k) \right]_{i,j} = \sum_{m=1}^{N} \sum_{n=1}^{N} a_{m,i}(\xi)\, a_{n,j}(k) \cdot E_{Y_k, X_k}\!\left\{ \frac{1}{p^2(Y_k, X_k)} \frac{\partial^m p(Y_k, X_k)}{\partial x_{\xi,i}^m} \frac{\partial^n p(Y_k, X_k)}{\partial x_{k,j}^n} \right\} = 0. \qquad (22)
\]
The joint density can be decomposed as follows:
\[
p(Y_k, X_k) = p(y_k|x_k)\, p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1}). \qquad (23)
\]
The $n$-th order derivative of $p(Y_k, X_k)$ w.r.t. the state $x_{k,j}$ can be written as
\[
\frac{\partial^n p(Y_k, X_k)}{\partial x_{k,j}^n}
= p(Y_{k-1}, X_{k-1}) \left[ \frac{\partial^n}{\partial x_{k,j}^n}\, p(y_k|x_k)\, p(x_k|x_{k-1}) \right]
= p(Y_{k-1}, X_{k-1}) \left[ \sum_{r=0}^{n} \frac{n!}{r!(n-r)!} \frac{\partial^{n-r} p(y_k|x_k)}{\partial x_{k,j}^{n-r}} \frac{\partial^r p(x_k|x_{k-1})}{\partial x_{k,j}^r} \right], \qquad (24)
\]
where the second equality results from the general Leibniz rule. In case $\xi \leq k-2$, the $m$-th order derivative of $p(Y_k, X_k)$ w.r.t. the state $x_{\xi,i}$ simplifies to
\[
\frac{\partial^m p(Y_k, X_k)}{\partial x_{\xi,i}^m} = p(y_k|x_k)\, p(x_k|x_{k-1})\, \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m}. \qquad (25)
\]
Inserting (24) and (25) into (22) gives
\[
\left[ J^{(k)}(\xi, k) \right]_{i,j} = \sum_{m=1}^{N} \sum_{n=1}^{N} \sum_{r=0}^{n} a_{m,i}(\xi)\, a_{n,j}(k)\, \frac{n!}{r!(n-r)!}\,
E_{X_k, Y_k}\!\left\{ \frac{ \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \frac{\partial^{n-r} p(y_k|x_k)}{\partial x_{k,j}^{n-r}} \frac{\partial^r p(x_k|x_{k-1})}{\partial x_{k,j}^r} }{ p(y_k|x_k)\, p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1}) } \right\}. \qquad (26)
\]
In case $r = 0$, the expectation can be rewritten as
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \frac{\partial^n p(y_k|x_k)}{\partial x_{k,j}^n} \right\}
= E_{x_k}\!\left\{ E_{y_k|x_k}\!\left\{ \frac{1}{p(y_k|x_k)} \frac{\partial^n p(y_k|x_k)}{\partial x_{k,j}^n} \right\} \right\} \cdot E_{X_{k-1}, Y_{k-1}}\!\left\{ \frac{1}{p(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \right\} = 0, \qquad (27)
\]
where we used the regularity condition
\[
E_{y_k|x_k}\!\left\{ \frac{1}{p(y_k|x_k)} \frac{\partial^n p(y_k|x_k)}{\partial x_{k,j}^n} \right\} = \int_{\mathbb{R}^{n_y}} \frac{\partial^n p(y_k|x_k)}{\partial x_{k,j}^n}\, \mathrm{d}y_k = 0. \qquad (28)
\]
In case $r = n$, we can make use of the smoothing property of expectations, yielding
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \frac{\partial^n p(x_k|x_{k-1})}{\partial x_{k,j}^n} \right\}
= E_{x_{k-1}}\!\left\{ E_{X_{k-2}, x_k, Y_k|x_{k-1}}\!\left\{ \frac{1}{p(x_k|x_{k-1})} \frac{\partial^n p(x_k|x_{k-1})}{\partial x_{k,j}^n}\, \frac{1}{p(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \right\} \right\}
= E_{x_{k-1}}\!\left\{ E_{x_k|x_{k-1}}\!\left\{ \frac{1}{p(x_k|x_{k-1})} \frac{\partial^n p(x_k|x_{k-1})}{\partial x_{k,j}^n} \right\} \cdot E_{X_{k-2}, Y_{k-1}|x_{k-1}}\!\left\{ \frac{1}{p(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \right\} \right\} = 0, \qquad (29)
\]
where we used the regularity condition
\[
E_{x_k|x_{k-1}}\!\left\{ \frac{1}{p(x_k|x_{k-1})} \frac{\partial^n p(x_k|x_{k-1})}{\partial x_{k,j}^n} \right\} = \int_{\mathbb{R}^{n_x}} \frac{\partial^n p(x_k|x_{k-1})}{\partial x_{k,j}^n}\, \mathrm{d}x_k = 0. \qquad (30)
\]
In all other cases, we have
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \frac{\partial^{n-r} p(y_k|x_k)}{\partial x_{k,j}^{n-r}} \frac{\partial^r p(x_k|x_{k-1})}{\partial x_{k,j}^r} \right\}
= E_{x_k}\!\left\{ E_{X_{k-1}, Y_k|x_k}\!\left\{ \frac{1}{p(y_k|x_k)} \frac{\partial^{n-r} p(y_k|x_k)}{\partial x_{k,j}^{n-r}}\, \frac{1}{p(x_k|x_{k-1})} \frac{\partial^r p(x_k|x_{k-1})}{\partial x_{k,j}^r}\, \frac{1}{p(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \right\} \right\}
= E_{x_k}\!\left\{ E_{y_k|x_k}\!\left\{ \frac{1}{p(y_k|x_k)} \frac{\partial^{n-r} p(y_k|x_k)}{\partial x_{k,j}^{n-r}} \right\} \cdot E_{X_{k-1}, Y_{k-1}|x_k}\!\left\{ \frac{ \frac{\partial^r p(x_k|x_{k-1})}{\partial x_{k,j}^r} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} }{ p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1}) } \right\} \right\} = 0, \qquad (31)
\]
where we again used the smoothing property of expectations and the regularity condition (28). Hence, for $\xi \leq k-2$ we have
\[
\left[ J^{(k)}(\xi, k) \right]_{i,j} = 0, \qquad (32)
\]
which concludes the proof of Lemma 2.

Lemma 3. For $\xi, \eta \leq k-2$ it holds that
\[
J^{(k)}(\xi, \eta) = J^{(k-1)}(\xi, \eta). \qquad (33)
\]
Proof. We show that each $(i,j)$-th entry of the matrix
\[
\left[ J^{(k)}(\xi,\eta) \right]_{i,j} = \sum_{m=1}^{N} \sum_{n=1}^{N} a_{m,i}(\xi)\, a_{n,j}(\eta) \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(Y_k, X_k)} \frac{\partial^m p(Y_k, X_k)}{\partial x_{\xi,i}^m} \frac{\partial^n p(Y_k, X_k)}{\partial x_{\eta,j}^n} \right\} \qquad (34)
\]
is equal to the $(i,j)$-th entry of the matrix
\[
\left[ J^{(k-1)}(\xi,\eta) \right]_{i,j} = \sum_{m=1}^{N} \sum_{n=1}^{N} a_{m,i}(\xi)\, a_{n,j}(\eta) \cdot E_{X_{k-1}, Y_{k-1}}\!\left\{ \frac{1}{p^2(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \frac{\partial^n p(Y_{k-1}, X_{k-1})}{\partial x_{\eta,j}^n} \right\}, \qquad (35)
\]
i.e., we show that $\forall\, \xi, \eta \leq k-2$
\[
\left[ J^{(k)}(\xi,\eta) \right]_{i,j} = \left[ J^{(k-1)}(\xi,\eta) \right]_{i,j}. \qquad (36)
\]
Inserting (25) into (34) gives
\[
\left[ J^{(k)}(\xi,\eta) \right]_{i,j}
= \sum_{m=1}^{N} \sum_{n=1}^{N} a_{m,i}(\xi)\, a_{n,j}(\eta) \cdot E_{X_k, Y_k}\!\left\{ \frac{p^2(y_k|x_k)\, p^2(x_k|x_{k-1})}{p^2(Y_k, X_k)} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \frac{\partial^n p(Y_{k-1}, X_{k-1})}{\partial x_{\eta,j}^n} \right\}
= \sum_{m=1}^{N} \sum_{n=1}^{N} a_{m,i}(\xi)\, a_{n,j}(\eta) \cdot E_{X_{k-1}, Y_{k-1}}\!\left\{ \frac{1}{p^2(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \frac{\partial^n p(Y_{k-1}, X_{k-1})}{\partial x_{\eta,j}^n} \right\}
= \left[ J^{(k-1)}(\xi,\eta) \right]_{i,j}, \qquad (37)
\]
which concludes the proof of Lemma 3.

Lemma 4. For $\xi \leq k-2$ it holds that
\[
J^{(k)}(\xi, k-1) = J^{(k-1)}(\xi, k-1). \qquad (38)
\]
Proof. We show that each $(i,j)$-th entry of the matrix
\[
\left[ J^{(k)}(\xi, k-1) \right]_{i,j} = \sum_{m=1}^{N} \sum_{n=1}^{N} a_{m,i}(\xi)\, a_{n,j}(k-1) \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(Y_k, X_k)} \frac{\partial^m p(Y_k, X_k)}{\partial x_{\xi,i}^m} \frac{\partial^n p(Y_k, X_k)}{\partial x_{k-1,j}^n} \right\} \qquad (39)
\]
is equal to the $(i,j)$-th entry of the matrix
\[
\left[ J^{(k-1)}(\xi, k-1) \right]_{i,j} = \sum_{m=1}^{N} \sum_{n=1}^{N} a_{m,i}(\xi)\, a_{n,j}(k-1) \cdot E_{X_{k-1}, Y_{k-1}}\!\left\{ \frac{1}{p^2(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \frac{\partial^n p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,j}^n} \right\}, \qquad (40)
\]
i.e., we show that $\forall\, \xi \leq k-2$
\[
\left[ J^{(k)}(\xi, k-1) \right]_{i,j} = \left[ J^{(k-1)}(\xi, k-1) \right]_{i,j}. \qquad (41)
\]
The $n$-th order derivative of $p(Y_k, X_k)$ w.r.t. the state $x_{k-1,j}$ can be written as
\[
\frac{\partial^n p(Y_k, X_k)}{\partial x_{k-1,j}^n}
= p(y_k|x_k) \left[ \frac{\partial^n}{\partial x_{k-1,j}^n}\, p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1}) \right]
= p(y_k|x_k) \left[ \sum_{r=0}^{n} \frac{n!}{r!(n-r)!} \frac{\partial^{n-r} p(x_k|x_{k-1})}{\partial x_{k-1,j}^{n-r}} \frac{\partial^r p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,j}^r} \right]. \qquad (42)
\]
Inserting (25) and (42) into (39) yields
\[
\left[ J^{(k)}(\xi, k-1) \right]_{i,j} = \sum_{m=1}^{N} \sum_{n=1}^{N} \sum_{r=0}^{n} a_{m,i}(\xi)\, a_{n,j}(k-1)\, \frac{n!}{r!(n-r)!}\,
E_{X_k, Y_k}\!\left\{ \frac{ \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \frac{\partial^{n-r} p(x_k|x_{k-1})}{\partial x_{k-1,j}^{n-r}} \frac{\partial^r p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,j}^r} }{ p(x_k|x_{k-1})\, p^2(Y_{k-1}, X_{k-1}) } \right\}. \qquad (43)
\]
In case $r = 0$, we rewrite the expectation using the smoothing property:
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \frac{\partial^n p(x_k|x_{k-1})}{\partial x_{k-1,j}^n} \right\}
= E_{x_{k-1}}\!\left\{ E_{X_{k-2}, x_k, Y_k|x_{k-1}}\!\left\{ \frac{1}{p(x_k|x_{k-1})} \frac{\partial^n p(x_k|x_{k-1})}{\partial x_{k-1,j}^n}\, \frac{1}{p(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \right\} \right\}
= E_{x_{k-1}}\!\left\{ E_{x_k|x_{k-1}}\!\left\{ \frac{1}{p(x_k|x_{k-1})} \frac{\partial^n p(x_k|x_{k-1})}{\partial x_{k-1,j}^n} \right\} \cdot E_{X_{k-2}, Y_{k-1}|x_{k-1}}\!\left\{ \frac{1}{p(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \right\} \right\} = 0, \qquad (44)
\]
where we used the regularity condition
\[
E_{x_k|x_{k-1}}\!\left\{ \frac{1}{p(x_k|x_{k-1})} \frac{\partial^n p(x_k|x_{k-1})}{\partial x_{k-1,j}^n} \right\} = \int_{\mathbb{R}^{n_x}} \frac{\partial^n p(x_k|x_{k-1})}{\partial x_{k-1,j}^n}\, \mathrm{d}x_k = 0. \qquad (45)
\]
In case $r = n$, the expectation can be written as
\[
E_{X_{k-1}, Y_{k-1}}\!\left\{ \frac{1}{p^2(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \frac{\partial^n p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,j}^n} \right\}. \qquad (46)
\]
In all other cases, we have
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(x_k|x_{k-1})\, p^2(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \frac{\partial^{n-r} p(x_k|x_{k-1})}{\partial x_{k-1,j}^{n-r}} \frac{\partial^r p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,j}^r} \right\}
= E_{x_{k-1}}\!\left\{ E_{X_{k-2}, x_k, Y_k|x_{k-1}}\!\left\{ \frac{1}{p(x_k|x_{k-1})} \frac{\partial^{n-r} p(x_k|x_{k-1})}{\partial x_{k-1,j}^{n-r}}\, \frac{ \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \frac{\partial^r p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,j}^r} }{ p^2(Y_{k-1}, X_{k-1}) } \right\} \right\}
= E_{x_{k-1}}\!\left\{ E_{x_k|x_{k-1}}\!\left\{ \frac{1}{p(x_k|x_{k-1})} \frac{\partial^{n-r} p(x_k|x_{k-1})}{\partial x_{k-1,j}^{n-r}} \right\} \cdot E_{X_{k-2}, Y_{k-1}|x_{k-1}}\!\left\{ \frac{ \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \frac{\partial^r p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,j}^r} }{ p^2(Y_{k-1}, X_{k-1}) } \right\} \right\} = 0, \qquad (47)
\]
where we used the smoothing property of expectations and the regularity condition (45). Hence, we arrive at
\[
\left[ J^{(k)}(\xi, k-1) \right]_{i,j}
= \sum_{m=1}^{N} \sum_{n=1}^{N} a_{m,i}(\xi)\, a_{n,j}(k-1) \cdot E_{X_{k-1}, Y_{k-1}}\!\left\{ \frac{1}{p^2(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{\xi,i}^m} \frac{\partial^n p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,j}^n} \right\}
= \left[ J^{(k-1)}(\xi, k-1) \right]_{i,j}, \qquad (48)
\]
which concludes the proof of Lemma 4.
Proposition 1. The block matrix $J^{(k)}(k-1,k-1)$ can be decomposed as follows:
\[
J^{(k)}(k-1,k-1) = J^{(k-1)}(k-1,k-1) + \tilde{J}^{(k)}(k-1,k-1), \qquad (49)
\]
where the $(i,j)$-th element of the matrix $\tilde{J}^{(k)}(k-1,k-1)$ is given by
\[
\left[ \tilde{J}^{(k)}(k-1,k-1) \right]_{i,j} = \sum_{m=1}^{N} \sum_{n=1}^{N} \sum_{r=0}^{m-1} \sum_{s=0}^{n-1} a_{m,i}(k-1)\, a_{n,j}(k-1)\, \frac{m!}{r!(m-r)!} \frac{n!}{s!(n-s)!}
\times E_{X_k, Y_k}\!\left\{ \frac{ \frac{\partial^{m-r} p(x_k|x_{k-1})}{\partial x_{k-1,i}^{m-r}} \frac{\partial^r p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}^r} \frac{\partial^{n-s} p(x_k|x_{k-1})}{\partial x_{k-1,j}^{n-s}} \frac{\partial^s p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,j}^s} }{ p^2(x_k|x_{k-1})\, p^2(Y_{k-1}, X_{k-1}) } \right\}. \qquad (50)
\]
Proof. The $(i,j)$-th entry of the matrix $J^{(k)}(k-1,k-1)$ is given by
\[
\left[ J^{(k)}(k-1,k-1) \right]_{i,j} = \sum_{m=1}^{N} \sum_{n=1}^{N} a_{m,i}(k-1)\, a_{n,j}(k-1)\, E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(Y_k, X_k)} \frac{\partial^m p(Y_k, X_k)}{\partial x_{k-1,i}^m} \frac{\partial^n p(Y_k, X_k)}{\partial x_{k-1,j}^n} \right\}. \qquad (51)
\]
Inserting (42) into (51) yields
\[
\left[ J^{(k)}(k-1,k-1) \right]_{i,j} = \sum_{m=1}^{N} \sum_{n=1}^{N} \sum_{r=0}^{m} \sum_{s=0}^{n} a_{m,i}(k-1)\, a_{n,j}(k-1)\, \frac{m!}{r!(m-r)!} \frac{n!}{s!(n-s)!}
\times E_{X_k, Y_k}\!\left\{ \frac{ \frac{\partial^{m-r} p(x_k|x_{k-1})}{\partial x_{k-1,i}^{m-r}} \frac{\partial^r p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}^r} \frac{\partial^{n-s} p(x_k|x_{k-1})}{\partial x_{k-1,j}^{n-s}} \frac{\partial^s p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,j}^s} }{ p^2(x_k|x_{k-1})\, p^2(Y_{k-1}, X_{k-1}) } \right\}. \qquad (52)
\]
We can extract from the double sum over $r$ and $s$ the term with $r = m$ and $s = n$ (the mixed terms with $r = m,\, s < n$ or $r < m,\, s = n$ vanish due to the smoothing property of expectations and the regularity condition (45)), yielding
\[
\left[ J^{(k)}(k-1,k-1) \right]_{i,j}
= \sum_{m=1}^{N} \sum_{n=1}^{N} a_{m,i}(k-1)\, a_{n,j}(k-1) \cdot E_{X_{k-1}, Y_{k-1}}\!\left\{ \frac{1}{p^2(Y_{k-1}, X_{k-1})} \frac{\partial^m p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}^m} \frac{\partial^n p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,j}^n} \right\}
+ \sum_{m=1}^{N} \sum_{n=1}^{N} \sum_{r=0}^{m-1} \sum_{s=0}^{n-1} a_{m,i}(k-1)\, a_{n,j}(k-1)\, \frac{m!}{r!(m-r)!} \frac{n!}{s!(n-s)!}
\times E_{X_k, Y_k}\!\left\{ \frac{ \frac{\partial^{m-r} p(x_k|x_{k-1})}{\partial x_{k-1,i}^{m-r}} \frac{\partial^r p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}^r} \frac{\partial^{n-s} p(x_k|x_{k-1})}{\partial x_{k-1,j}^{n-s}} \frac{\partial^s p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,j}^s} }{ p^2(x_k|x_{k-1})\, p^2(Y_{k-1}, X_{k-1}) } \right\}
= \left[ J^{(k-1)}(k-1,k-1) \right]_{i,j} + \left[ \tilde{J}^{(k)}(k-1,k-1) \right]_{i,j}. \qquad (53)
\]
With the results of Lemma 2 to Lemma 4 and Proposition 1, it is possible to partition the matrix $J^{(k)}$ as follows:
\[
J^{(k)} = \begin{bmatrix}
A_{k-1} & B_{k-1} & 0 \\
B_{k-1}^T & C_{k-1} + \tilde{J}^{(k)}(k-1,k-1) & J^{(k)}(k-1,k) \\
0 & J^{(k)}(k,k-1) & J^{(k)}(k,k)
\end{bmatrix}, \qquad (54)
\]
where
\[
J^{(k-1)} = \begin{bmatrix} A_{k-1} & B_{k-1} \\ B_{k-1}^T & C_{k-1} \end{bmatrix}. \qquad (55)
\]
We are interested in the inverse of the $n_x \times n_x$ lower-right corner of $\left[ J^{(k)} \right]^{-1}$, which we denote in the following by $J_k$. Block-matrix inversion of $J^{(k)}$ gives
\[
J_k = J^{(k)}(k,k) - \begin{bmatrix} 0 & J^{(k)}(k,k-1) \end{bmatrix}
\begin{bmatrix} A_{k-1} & B_{k-1} \\ B_{k-1}^T & C_{k-1} + \tilde{J}^{(k)}(k-1,k-1) \end{bmatrix}^{-1}
\begin{bmatrix} 0 \\ J^{(k)}(k-1,k) \end{bmatrix} \qquad (56)
\]
\[
= J^{(k)}(k,k) - J^{(k)}(k,k-1) \left[ \tilde{J}^{(k)}(k-1,k-1) + C_{k-1} - B_{k-1}^T A_{k-1}^{-1} B_{k-1} \right]^{-1} J^{(k)}(k-1,k). \qquad (57)
\]
Now, since the inverse of the $n_x \times n_x$ lower-right corner of $\left[ J^{(k-1)} \right]^{-1}$ is $J_{k-1} = C_{k-1} - B_{k-1}^T A_{k-1}^{-1} B_{k-1}$, we obtain a recursion for the information matrix $J_k$, which is given by
\[
J_k = J^{(k)}(k,k) - J^{(k)}(k,k-1) \left[ \tilde{J}^{(k)}(k-1,k-1) + J_{k-1} \right]^{-1} J^{(k)}(k-1,k). \qquad (58)
\]
The corresponding Bayesian Bhattacharyya bound for estimating the current state is given by
\[
M(\hat{x}_k) \geq \max_{A_k}\ V_k^T \left[ J_k \right]^{-1} V_k. \qquad (59)
\]
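The structure of the recursion (58)-(59) can be summarized in the following sketch (our illustration; the blocks $J^{(k)}(k,k)$, $J^{(k)}(k,k-1)$ and $\tilde{J}^{(k)}(k-1,k-1)$ are assumed to be supplied, e.g. obtained by Monte Carlo integration of (20) and (50) for a fixed choice of the coefficients in $A_k$):

```python
# Sketch of the information-matrix recursion (58) and bound (59).
# J_kk, J_kkm1, J_tilde are the blocks J^(k)(k,k), J^(k)(k,k-1) and
# \tilde{J}^(k)(k-1,k-1); they must be computed elsewhere (e.g. by Monte Carlo).
import numpy as np

def bhattacharyya_step(J_prev, J_kk, J_kkm1, J_tilde, V_k):
    """One recursion step: returns (J_k, lower bound on the MSE matrix)."""
    inner = np.linalg.inv(J_tilde + J_prev)        # [J~ + J_{k-1}]^{-1}
    J_k = J_kk - J_kkm1 @ inner @ J_kkm1.T         # eq. (58), using J^(k)(k-1,k) = J^(k)(k,k-1)^T
    bound = V_k.T @ np.linalg.inv(J_k) @ V_k       # eq. (59) for a fixed choice of A_k
    return J_k, bound
```

Maximizing the bound over the coefficients $a_{n,i}$ then corresponds to an outer optimization wrapped around this recursion.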
2.4 Evaluation of the bound up to order N = 2
The recursive computation of the $N$-th order Bayesian Bhattacharyya bound according to (58) requires the evaluation of the terms $J^{(k)}(k,k)$, $J^{(k)}(k-1,k)$ and $\tilde{J}^{(k)}(k-1,k-1)$. As this is generally a tedious procedure, the authors of [1] restrict themselves to the computation of terms up to order $N = 2$ only. In case $N = 2$, the $(i,j)$-th element of the matrix $J^{(k)}(k,k)$ can be written as follows:
\[
\left[ J^{(k)}(k,k) \right]_{i,j} = \sum_{m=1}^{2} \sum_{n=1}^{2} a_{m,i}(k)\, a_{n,j}(k)\, E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(Y_k, X_k)} \frac{\partial^m p(Y_k, X_k)}{\partial x_{k,i}^m} \frac{\partial^n p(Y_k, X_k)}{\partial x_{k,j}^n} \right\}. \qquad (60)
\]
Evaluating (60) thus requires the computation of the following expectation
\[
\left[ J^{(k)}_{mn}(k,k) \right]_{i,j} \triangleq E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(Y_k, X_k)} \frac{\partial^m p(Y_k, X_k)}{\partial x_{k,i}^m} \frac{\partial^n p(Y_k, X_k)}{\partial x_{k,j}^n} \right\}, \qquad (61)
\]
for $m, n = 1, 2$. By making use of the decomposition (23), the derivatives up to second order for $\ell = 1, \ldots, n_x$ can be expressed as
\[
\frac{\partial p(Y_k, X_k)}{\partial x_{k,\ell}} = p(Y_{k-1}, X_{k-1}) \left[ \frac{\partial p(y_k|x_k)}{\partial x_{k,\ell}}\, p(x_k|x_{k-1}) + p(y_k|x_k)\, \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,\ell}} \right], \qquad (62a)
\]
\[
\frac{\partial^2 p(Y_k, X_k)}{\partial x_{k,\ell}^2} = p(Y_{k-1}, X_{k-1}) \left[ \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,\ell}^2}\, p(x_k|x_{k-1}) + 2\, \frac{\partial p(y_k|x_k)}{\partial x_{k,\ell}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,\ell}} + p(y_k|x_k)\, \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,\ell}^2} \right]. \qquad (62b)
\]
For $m = n = 1$, the expectation in (61) can be evaluated as
\[
\left[ J^{(k)}_{11}(k,k) \right]_{i,j}
= E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}. \qquad (63)
\]
With the smoothing property of expectations, we obtain
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
= E_{x_k}\!\left\{ E_{X_{k-1}, Y_k|x_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\} \right\}
= E_{x_k}\!\left\{ E_{y_k|x_k}\!\left\{ \frac{1}{p(y_k|x_k)} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \right\} \cdot E_{x_{k-1}|x_k}\!\left\{ \frac{1}{p(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\} \right\} = 0, \qquad (64)
\]
where the last equality follows from the regularity condition (28). Similarly,
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\} = 0, \qquad (65)
\]
so that we can write
\[
\left[ J^{(k)}_{11}(k,k) \right]_{i,j}
= E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}. \qquad (66)
\]
For $m = 2$ and $n = 1$, the expectation in (61) can be evaluated as
\[
\left[ J^{(k)}_{21}(k,k) \right]_{i,j}
= E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,i}^2} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,i}^2} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ 2 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
+ 2 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p^2(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,i}^2} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,i}^2} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}. \qquad (67)
\]
With the smoothing property of expectations and the regularity condition (28), it is easy to verify that
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,i}^2} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
= E_{x_k}\!\left\{ E_{y_k|x_k}\!\left\{ \frac{1}{p(y_k|x_k)} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,i}^2} \right\} \cdot E_{x_{k-1}|x_k}\!\left\{ \frac{1}{p(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\} \right\} = 0, \qquad (68a)
\]
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p^2(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
= E_{x_k}\!\left\{ E_{y_k|x_k}\!\left\{ \frac{1}{p(y_k|x_k)} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \right\} \cdot E_{x_{k-1}|x_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\} \right\} = 0, \qquad (68b)
\]
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,i}^2} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
= E_{x_k}\!\left\{ E_{y_k|x_k}\!\left\{ \frac{1}{p(y_k|x_k)} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\} \cdot E_{x_{k-1}|x_k}\!\left\{ \frac{1}{p(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,i}^2} \right\} \right\} = 0. \qquad (68c)
\]
Hence, $\left[ J^{(k)}_{21}(k,k) \right]_{i,j}$ can be written as
\[
\left[ J^{(k)}_{21}(k,k) \right]_{i,j}
= E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,i}^2} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
+ 2 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,i}^2} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}. \qquad (69)
\]
For $m = 1$ and $n = 2$, it is easy to verify that, due to symmetry, the expectation in (61) can be expressed as
\[
\left[ J^{(k)}_{12}(k,k) \right]_{i,j}
= E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,j}^2} \right\}
+ 2 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,j}^2} \right\}. \qquad (70)
\]
For $m = n = 2$, the expectation in (61) can be evaluated as
\[
\left[ J^{(k)}_{22}(k,k) \right]_{i,j}
= E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,i}^2} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,j}^2} \right\}
+ 2 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,i}^2} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,i}^2} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,j}^2} \right\}
+ 2 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,j}^2} \right\}
+ 4 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)\, p^2(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ 2 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p^2(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,j}^2} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,i}^2} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,j}^2} \right\}
+ 2 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p^2(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,i}^2} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,i}^2} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,j}^2} \right\}. \qquad (71)
\]
With the smoothing property of expectations and the regularity condition (28), we obtain
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,i}^2} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,j}^2} \right\}
= E_{x_k}\!\left\{ E_{y_k|x_k}\!\left\{ \frac{1}{p(y_k|x_k)} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,i}^2} \right\} \cdot E_{x_{k-1}|x_k}\!\left\{ \frac{1}{p(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,j}^2} \right\} \right\} = 0, \qquad (72a)
\]
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p^2(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,j}^2} \right\}
= E_{x_k}\!\left\{ E_{y_k|x_k}\!\left\{ \frac{1}{p(y_k|x_k)} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \right\} \cdot E_{x_{k-1}|x_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,j}^2} \right\} \right\} = 0. \qquad (72b)
\]
Similarly, we have
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,i}^2} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,j}^2} \right\} = 0, \qquad (72c)
\]
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p^2(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,i}^2} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\} = 0. \qquad (72d)
\]
Hence, $\left[ J^{(k)}_{22}(k,k) \right]_{i,j}$ can be simplified to
\[
\left[ J^{(k)}_{22}(k,k) \right]_{i,j}
= E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,i}^2} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,j}^2} \right\}
+ 2 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,j}^2} \right\}
+ 4 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)\, p^2(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ 2 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,i}^2} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,i}^2} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,j}^2} \right\}. \qquad (73)
\]
As a result, we can write $\left[ J^{(k)}(k,k) \right]_{i,j}$ as follows:
\[
\left[ J^{(k)}(k,k) \right]_{i,j}
= \sum_{m=1}^{2} \sum_{n=1}^{2} a_{m,i}(k)\, a_{n,j}(k)\, E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)} \frac{\partial^m p(y_k|x_k)}{\partial x_{k,i}^m} \frac{\partial^n p(y_k|x_k)}{\partial x_{k,j}^n} \right\}
+ \sum_{m=1}^{2} \sum_{n=1}^{2} a_{m,i}(k)\, a_{n,j}(k)\, E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial^m p(x_k|x_{k-1})}{\partial x_{k,i}^m} \frac{\partial^n p(x_k|x_{k-1})}{\partial x_{k,j}^n} \right\}
+ 4\, a_{2,i}(k)\, a_{2,j}(k)\, E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)\, p^2(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ 2\, a_{1,i}(k)\, a_{2,j}(k)\, E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
+ 2\, a_{2,i}(k)\, a_{1,j}(k)\, E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
+ 2\, a_{2,i}(k)\, a_{2,j}(k)\, E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,j}^2} \right\}
+ 2\, a_{2,i}(k)\, a_{2,j}(k)\, E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,i}^2} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}. \qquad (74)
\]
The expression for $\left[ J^{(k)}(k,k) \right]_{i,j}$ should be equivalent to the expression $[D^{22}_n]_{kl}$ derived in the paper, see Eq. (6) in [1]. However, it can be easily checked that the latter four terms of (74) are missing in [1]. In general, these four terms are not zero, as will be exemplified later on a nonlinear toy example. Furthermore, the third term in (74) can be decomposed into the two forms
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(y_k|x_k)\, p^2(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
= E_{x_{k-1}}\!\left\{ E_{x_k|x_{k-1}}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \cdot E_{y_k|x_k}\!\left\{ \frac{1}{p^2(y_k|x_k)} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\} \right\} \right\} \qquad (75a)
\]
\[
= E_{x_k}\!\left\{ E_{x_{k-1}|x_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\} \cdot E_{y_k|x_k}\!\left\{ \frac{1}{p^2(y_k|x_k)} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\} \right\}. \qquad (75b)
\]
However, it is in general not possible to write
\[
E\!\left\{ \frac{1}{p^2(y_k|x_k)\, p^2(x_k|x_{k-1})} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
= E\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\} \cdot E\!\left\{ \frac{1}{p^2(y_k|x_k)} \frac{\partial p(y_k|x_k)}{\partial x_{k,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}, \qquad (76)
\]
as was done in [1], without imposing further assumptions on $p(y_k|x_k)$ and $p(x_k|x_{k-1})$.
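To illustrate that the factorization (76) can fail and that the omitted terms in (74) need not vanish, the following Monte Carlo sketch (our illustration; the scalar model is a hypothetical example and not necessarily the toy example referred to above) compares both sides of (76) for $i = j$ and evaluates one of the four omitted expectations:

```python
# Monte Carlo check that the factorization (76) fails and that one of the
# "missing" expectations in (74) is non-zero, for a hypothetical scalar model:
#   x_{k-1} ~ N(1, 1)  (non-zero mean so the omitted term does not vanish by symmetry),
#   x_k = 0.9 x_{k-1} + v_k,  v_k ~ N(0, s_v^2),
#   y_k = x_k^3 + w_k,        w_k ~ N(0, s_w^2).
# For this model the inner expectation over y_k is available in closed form:
#   E_{y|x}{ (d ln p(y|x)/dx)^2 } = 9 x^4 / s_w^2.
import numpy as np

rng = np.random.default_rng(1)
s_v, s_w, n_mc = 0.5, 1.0, 10**6

x_prev = rng.normal(loc=1.0, size=n_mc)        # x_{k-1}
v = rng.normal(scale=s_v, size=n_mc)           # process noise v_k
x = 0.9 * x_prev + v                           # x_k

g = (v / s_v**2) ** 2                          # (d ln p(x_k|x_{k-1})/dx_k)^2
h = 9.0 * x**4 / s_w**2                        # E_{y|x}{ (d ln p(y|x)/dx)^2 }

lhs = np.mean(g * h)                           # left-hand side of (76), i = j
rhs = np.mean(g) * np.mean(h)                  # right-hand side of (76), i = j
missing = np.mean(h * (-v / s_v**2))           # one of the four omitted terms in (74), i = j
print(lhs, rhs, missing)                       # lhs != rhs and missing != 0 for this model
```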
In the following, we focus on the evaluation of the matrix $J^{(k)}(k-1,k)$. For $N = 2$, the $(i,j)$-th element of this matrix can be expressed as
\[
\left[ J^{(k)}(k-1,k) \right]_{i,j} = \sum_{m=1}^{2} \sum_{n=1}^{2} a_{m,i}(k-1)\, a_{n,j}(k)\, E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(Y_k, X_k)} \frac{\partial^m p(Y_k, X_k)}{\partial x_{k-1,i}^m} \frac{\partial^n p(Y_k, X_k)}{\partial x_{k,j}^n} \right\}. \qquad (77)
\]
Evaluating (77) thus requires the computation of the following expectation
\[
\left[ J^{(k)}_{mn}(k-1,k) \right]_{i,j} \triangleq E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(Y_k, X_k)} \frac{\partial^m p(Y_k, X_k)}{\partial x_{k-1,i}^m} \frac{\partial^n p(Y_k, X_k)}{\partial x_{k,j}^n} \right\}, \qquad (78)
\]
for $m, n = 1, 2$. The derivatives for $\ell = 1, \ldots, n_x$ are given by
\[
\frac{\partial p(Y_k, X_k)}{\partial x_{k-1,\ell}} = p(y_k|x_k) \left[ \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,\ell}}\, p(Y_{k-1}, X_{k-1}) + p(x_k|x_{k-1})\, \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,\ell}} \right], \qquad (79a)
\]
\[
\frac{\partial^2 p(Y_k, X_k)}{\partial x_{k-1,\ell}^2} = \frac{\partial}{\partial x_{k-1,\ell}} \frac{\partial p(Y_k, X_k)}{\partial x_{k-1,\ell}}
= p(y_k|x_k) \left[ \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k-1,\ell}^2}\, p(Y_{k-1}, X_{k-1}) + 2\, \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,\ell}} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,\ell}} + p(x_k|x_{k-1})\, \frac{\partial^2 p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,\ell}^2} \right]. \qquad (79b)
\]
For $m = n = 1$, the expectation in (78) can be evaluated as
\[
\left[ J^{(k)}_{11}(k-1,k) \right]_{i,j}
= E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(Y_{k-1}, X_{k-1})} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1})} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}. \qquad (80)
\]
With the smoothing property of expectations and the regularity condition (28), we obtain
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
= E_{x_k}\!\left\{ E_{y_k|x_k}\!\left\{ \frac{1}{p(y_k|x_k)} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\} \cdot E_{x_{k-1}|x_k}\!\left\{ \frac{1}{p(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \right\} \right\} = 0, \qquad (81a)
\]
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(Y_{k-1}, X_{k-1})} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
= E_{x_k}\!\left\{ E_{y_k|x_k}\!\left\{ \frac{1}{p(y_k|x_k)} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\} \cdot E_{X_{k-1}, Y_{k-1}}\!\left\{ \frac{1}{p(Y_{k-1}, X_{k-1})} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \right\} \right\} = 0. \qquad (81b)
\]
Similarly, we can write
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1})} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
= E_{x_{k-1}}\!\left\{ E_{Y_{k-1}, X_{k-2}|x_{k-1}}\!\left\{ \frac{1}{p(Y_{k-1}, X_{k-1})} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \right\} \cdot E_{x_k|x_{k-1}}\!\left\{ \frac{1}{p(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\} \right\} = 0, \qquad (81c)
\]
where the last equality follows from the regularity condition (30). Thus, $\left[ J^{(k)}_{11}(k-1,k) \right]_{i,j}$ is given by
\[
\left[ J^{(k)}_{11}(k-1,k) \right]_{i,j} = E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}. \qquad (82)
\]
For $m = 2$ and $n = 1$, the expectation in (78) can be expressed as
\[
\left[ J^{(k)}_{21}(k-1,k) \right]_{i,j}
= E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k-1,i}^2} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k-1,i}^2} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ 2 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
+ 2 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(Y_{k-1}, X_{k-1})} \frac{\partial^2 p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}^2} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1})} \frac{\partial^2 p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}^2} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}. \qquad (83)
\]
With the smoothing property of expectations and the regularity condition (28), it is again easy to show that
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k-1,i}^2} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\} = 0, \qquad (84a)
\]
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\} = 0, \qquad (84b)
\]
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(Y_{k-1}, X_{k-1})} \frac{\partial^2 p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}^2} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \right\} = 0. \qquad (84c)
\]
Similarly, we can write
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1})} \frac{\partial^2 p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}^2} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
= E_{x_{k-1}}\!\left\{ E_{Y_{k-1}, X_{k-2}|x_{k-1}}\!\left\{ \frac{1}{p(Y_{k-1}, X_{k-1})} \frac{\partial^2 p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}^2} \right\} \cdot E_{x_k|x_{k-1}}\!\left\{ \frac{1}{p(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\} \right\} = 0, \qquad (84d)
\]
where the last equality follows from (30). Hence, for $\left[ J^{(k)}_{21}(k-1,k) \right]_{i,j}$ we can write
\[
\left[ J^{(k)}_{21}(k-1,k) \right]_{i,j}
= E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k-1,i}^2} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ 2 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}. \qquad (85)
\]
By further decomposing
\[
p(Y_{k-1}, X_{k-1}) = p(y_{k-1}|x_{k-1})\, p(x_{k-1}|x_{k-2})\, p(Y_{k-2}, X_{k-2}), \qquad (86)
\]
we can write
\[
\frac{1}{p(Y_{k-1}, X_{k-1})} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}}
= \frac{1}{p(y_{k-1}|x_{k-1})} \frac{\partial p(y_{k-1}|x_{k-1})}{\partial x_{k-1,i}} + \frac{1}{p(x_{k-1}|x_{k-2})} \frac{\partial p(x_{k-1}|x_{k-2})}{\partial x_{k-1,i}}, \qquad (87)
\]
and the second expectation in (85) can be rewritten as follows:
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
= E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})\, p(y_{k-1}|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(y_{k-1}|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})\, p(x_{k-1}|x_{k-2})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(x_{k-1}|x_{k-2})}{\partial x_{k-1,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}. \qquad (88)
\]
Since
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})\, p(y_{k-1}|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(y_{k-1}|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\} = 0, \qquad (89)
\]
the matrix element $\left[ J^{(k)}_{21}(k-1,k) \right]_{i,j}$ can finally be written as
\[
\left[ J^{(k)}_{21}(k-1,k) \right]_{i,j}
= E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k-1,i}^2} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ 2 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})\, p(x_{k-1}|x_{k-2})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(x_{k-1}|x_{k-2})}{\partial x_{k-1,i}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}. \qquad (90)
\]
For $m = 1$ and $n = 2$, the expectation in (78) can be written as
\[
\left[ J^{(k)}_{12}(k-1,k) \right]_{i,j}
= E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,j}^2} \right\}
+ 2 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p^2(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p^2(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,j}^2} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(Y_{k-1}, X_{k-1})} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,j}^2} \right\}
+ 2 \cdot E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1})} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\}
+ E_{X_k, Y_k}\!\left\{ \frac{1}{p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1})} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,j}^2} \right\}. \qquad (91)
\]
Using the smoothing property of expectations and the regularity condition (28), it can be shown that
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,j}^2} \right\} = 0, \qquad (92a)
\]
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p^2(x_k|x_{k-1})} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\} = 0, \qquad (92b)
\]
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(Y_{k-1}, X_{k-1})} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \frac{\partial^2 p(y_k|x_k)}{\partial x_{k,j}^2} \right\} = 0, \qquad (92c)
\]
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(y_k|x_k)\, p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1})} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \frac{\partial p(y_k|x_k)}{\partial x_{k,j}} \frac{\partial p(x_k|x_{k-1})}{\partial x_{k,j}} \right\} = 0. \qquad (92d)
\]
Similarly, we can write
\[
E_{X_k, Y_k}\!\left\{ \frac{1}{p(x_k|x_{k-1})\, p(Y_{k-1}, X_{k-1})} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,j}^2} \right\}
= E_{x_{k-1}}\!\left\{ E_{Y_{k-1}, X_{k-2}|x_{k-1}}\!\left\{ \frac{1}{p(Y_{k-1}, X_{k-1})} \frac{\partial p(Y_{k-1}, X_{k-1})}{\partial x_{k-1,i}} \right\} \cdot E_{x_k|x_{k-1}}\!\left\{ \frac{1}{p(x_k|x_{k-1})} \frac{\partial^2 p(x_k|x_{k-1})}{\partial x_{k,j}^2} \right\} \right\} = 0, \qquad (92e)
\]