(1)Determining the initial states in forward-backward ltering Fredrik Gustafsson Submitted to IEEE Trans

(1)

Determining the initial states in forward-backward ltering

Fredrik Gustafsson

Submitted to IEEE Trans. on Signal Processing

Abstract

Forward-backward ltering is a common tool in o-line ltering. Filtering

rst forwards and then backwards and the other way around do not give the same result generally. Here we propose a method to choose the initial state in the lter to eliminate this discrepancy. The objectives are to obtain uniqueness and to remove transients in both ends.

1 Introduction

In batchwise signal processing one has the possibility not only to lter a signal forwards in time as usual, but also apply a lter backwards on the ltered signal. Forward- backward ltering is needed in the following applications for example:

Zero-phase ltering using IIR lters.

Implementation of non-causal Wiener lters or other lters with poles both inside and outside the unit circle.

The rst trick to obtain zero-phase lters is used in many software packages, see for instance the function ^filtfilt in Matlab 2], although it is not much discussed in

Department of Electrical Engineering, Linkoping University, S-58183 Linkoping, Sweden

1

(2)

standard text books. The design of non-causal Wiener lters leads to a lter with one stable part and one unstable one as discussed in for instance 1]. The unstable part must be implemented using backward ltering. In both cases, there might be a problem with transients when applied to nite signals.

A lter ^G(^q) becomes all-pass, or zero phase, when applied rst forwards, ^y^f(^t) =

G(^q)^u(^t), and then backwards on the reversed ltered signal, ^y^R^fb(^t) = ^G(^q)^y^R^f(^t), and

nally the output is reversed again. The superindex ^R stands for reversed sequences, the indexes ^f for forward and^b for backward and ^q denotes the shift operator^qu(^t) =

u(^t+1). We can also rst lter backwards and then forwards and obtain^y^bf(^t). In both cases, the total eect is a zero phase lter with transfer function ^jG(^e^i!)^j². However, we have ^y^fb(^t) ⁶= ^y^bf(^t) generally due to unknown initial conditions, and this is not satisfactory from the logical point of view. Besides, the FB (forward-backward) ltered sequence has visible transients in the end and the BF ltered sequence at the beginning.

One way to obtain symmetry is to implement a lter with poles both inside and outside the unit circle by using the partial fractions of the stable poles and unstable poles respectively. This is the standard way in non-causal Wiener ltering. The stable lter is applied forwards on the input and the unstable lter backwards and the lter output is the sum of these terms. Since both lters are applied to the input, the result is of course independent of the order of computations. However, there still is a problem with transients, this time in both ends, due to unknown initial conditions.

We will here determine the initial conditions that makes^y^bf(^t) =^y^f^b(^t) and at the same time removes the transients. That is, the initial condition of one lter is chosen to match the end of the other lter. Instead of solving the system of equations arising from the condition^y^bf(^t) =^y^fb(^t), we will work in a least squares framework, which immediately gives a feasible implementation. The method is illustrated on ltered white noise. The improvement for short data records or long lter impulse responses is obvious.

2 Determining the initial state

A linear lter can always be expressed as

Y =^{H U} +^Ox⁰

where^Y = (^y⁰^y¹^::^y^N^;1)^T is the vector of outputs,Û = (û⁰û¹^::û^N;1)^T is the vector of inputs to the lter and^x⁰ its initial state. ^His a Toeplitz matrix of impulse response

2

(3)

coecients

H=

0

B

@ h

0 0 0

h

1 h

0 0 ...

... ...

h

N;1

h

1 h

0 1

C

A

(1)

where ^N is the number of inputs, and ^O is an \observability" matrix. If the lter is written in state space form

x(^t+ 1) = ^F^x(^t) +^Gu(^t)

y(^t) = ^Hx(^t) +^Du(^t) then

H =

0

B

@

D 0 0

HG D 0 ...

... ...

HF N;2

G HG D 1

C

A

(2)

O =

0

B

@ H

HF...

HF N;1

1

C

A

(3)

We introduce the row and column reversing operators R and C with the following easily checked properties:

A = ^BC ⁾

A

R = ^B^R^C

A

C = ^BC^C

BC

R = ^B^C^C (^B^C^RC)^RC = ^B^RC^C

A

RC = ^A^T if ^AToeplitz

The forward and backward lters are characterized by the pairs (^H^f^O^f) and (^H^b^O^b).

Using the notation ^x⁰ and ^x^N^;1 for the initial conditions for forward and backward

lters, the result of forward-backward and backward-forward ltering can be written as

Y

fb = (^H^b^Y^f^R+^O^b^x^N;1)^R

= ^H^R^b(^H^fÛ +Ô^f^x⁰)^R+Ô^b^R^x^N;1

= ^H^R^b^H^R^fÛ +^H^b^RÔ^f^R^x⁰+Ô^b^R^x^N;1 3

(4)

and

Y

bf = ^H^f^Y^b^R+^O^f^x⁰

= ^H^f(^H^bÛ^R+Ô^b^x^N^;1)^R+Ô^f^x⁰

= ^H^C^f^H^C^b Û+^H^fÔ^R^b ^x^N;1+Ô^f^x⁰ respectively. Note that^H^R^b^H^f^R⁶=^H^C^f^H^C^b .

It turns out that there always is a solution to ^Y^f^b ^;^Y^bf = 0. We will not prove this statement. Instead we formulate the problem as minimizing^kY^fb^;^Y^bf^k²², which turns out to be easier to solve. We have

Y

fb

;Y

bf = (^H^b^R^H^R^f ^;^H^C^f^H^C^b )Û + (^H^R^bÔ^R^f ^;Ô^f)^x⁰+ (Î^;^H^f)Ô^R^b ^x^N^;1 Since this is a linear expression in^x⁰ and ^x^N;1, the minimizing arguments are given by the least squares estimate

d 0

@ x

0

x

N;1 1

A=^h(^H^R^b Ô^f^R^;Ô^f) (Î^;^H^f)Ô^b^Rⁱ^#^H^R^b^H^R^f ^;^H^C^f^H^C^b Û

Here # denotes pseudo-inverse. By substituting these estimated initial states into the FB and BF lters we get

^

Y

fb = ^H^b^R^H^R^fÛ+^h^H^b^RÔ^f^R Ô^b^Rⁱ^h(^H^R^b Ô^f^R^;Ô^f) (Î^;^H^f)Ô^b^Rⁱ^#^H^R^b^H^R^f ^;^H^C^f^H^C^b Û

^

Y

bf = ^H^f^C^H^b^CÛ +^hÔ^f ^H^fÔ^R^b ⁱ^h(^H^R^bÔ^f^R^;Ô^f) (Î^;^H^f)Ô^R^bⁱ^#^H^R^b^H^R^f ^;^H^C^f^H^C^b Û The expressions given above are not useful for implementation, since ^N ^N matrices are involved. Next we seek a feasible implementation without computing^H explicitely.

3 Fast implementation

In this section we describe how ^Y^fb, without proof claimed to be the same as ^Y^bf, can be computed eciently. First an exact method is detailed, where the initial states are computed with linear in time complexity, and then a constant complexity approximation is proposed.

We rst compute the nominal values of ^Y^fb and ^Y^bf for zero initial conditions using standard lter routines. That is, we have

Y 0

fb = ^H^R^b^H^f^R^U

Y 0

bf = ^H^C^f^H^C^b^U 4

(5)

We also need arbitrary state space realizations of the lters, (^F^f^G^f^H^f^D^f) and (^F^b^G^b^H^b^D^b). The \observability" matrixes Ô^f and Ô^f are computed according to (3). Now, we just have to compute the quantities ^S^bf =⁴ ^H^bÔ^R^f and ^S^fb =⁴ ^H^fÔ^R^b and we can express the least squares estimate as

d 0

@ x

0

x

N;1 1

A=^h(^S^fb^R ^;^O^f) ^O^R^b ^;^S^bfⁱ^#(^Y^bf⁰ ^;^Y^f⁰^b)^: Thus, we have

Y

fb = ^Y^f⁰^b+^h^S^bf^R Ô^b^Rⁱ^h(^S^fb^R ^;Ô^f) Ô^R^b ^;^S^bfⁱ^#(^Y^bf⁰ ^;^Y^f⁰^b)^: (4) Next is shown that ^S^fb can be computed without forming the ^N ^N matrix ^H^f. The vectors of impulse response coecients can be expressed in ^h^f as

h

f =

2

4

D

f

O

f(1 :^N ^;1:)^G^f

3

5

:

Here ^O^f(1 :^N ^;1:) means rows 1 to ^N^;1 of ^O^f. Note that ^h^f is the rst column of

H

f. Let ^S^fb(^t: :) denote the ^t'th row of ^S^fb and ^h^f(^t) the t'th component of ^h^f. From (1), (2) and (3) we have

S

fb(^N:) = (^h^R^f)^TÔ^b^R= (^h^T^f)^CÔ^b^R=^h^T^fÔ^b

S

fb(^N ^;1:) = ^S^fb(^N:)^F^b^;^h^f(^N)^H^b^F^b^N;1

S

fb(^t^;1:) = ^S^fb(^t:)^F^b^;^h^f(^t)^H^b^F^b^N^;1 Thus, ^S^f^b can be computed by the recursion

S

fb(^N:) = ^h^T^f^O^b

S

fb(^t^;1:) = ^S^f^b(^t:)^F^b^;^h^f(^t)^H^b^F^b^N;1 ^t=^N^N ^;1^:::2^:

and ^S^bf is computed similarly. In this way, we do not have to compute the ^N ^N matrix ^H explicitely, but only the ^N2ⁿ matrices ^O and ^S.

4 Fixed complexity approximation

For long signals, it seems to be overkill to work with the complete impulse response ^h and observability matrix ^O. If we assume that the impulse response is approximately

5

(6)

zero for ^t ^> ^M a logical approximation in (4) is to replace ^F^t by 0 if ^t ^> ^M. This leads to an ^^O having non-zero elements in its rst^M rows only, and ^^S having non-zero elements in its last^M rows. Thus, we are faced with a least squares problem with 2^M equations and 2ⁿ unknowns, so all that is needed is to compute the pseudo-inverse of a 2^M 2ⁿ matrix in (4).

5 Simulations

In this section, a band-pass Butterworth lter applied to a white noise sequence¹ will be examined. The lter is of tenth order with (normalized) cut-o frequencies 0.2 and 0.25 and the noise sequence is of length 200. Figure 1 shows rst^Y^fb and^Y^bf using the initial states in Matlab's ^filtfilt² function in the signal processing toolbox and second the result using zero inital states ^x⁰ and ^x^N;1, i.e. ^Y^f^b = ^H^R^H^R^U and ^Y^bf = ^H^C^H^C^U. Then follows the signals (4) using optimized initial states. Finally, the result using partial fractions,

jH(^q)^j²= ^B(^q)

A(^q)

B(^q^;1)

A(^q^;1) =

C(^q)

A(^q) +

D(^q)

A(^q^;1)

is shown. As seen, the last two curves are identical showing the symmetry of the proposed initialization method. The 2-norms are as follows:

kY fil tfil t

fb

;Y fil tfil t

fb k

2 = 1^:6

kY 0

fb

;Y 0

fb k

2 = 1^:0

k^Y^^fb ^;^Y^^fb^k² = 810^;13

Thus, there is a signicant dierence depending on the order of forward and backward

ltering for the rst two choices of initial conditions. The transient from the lter is clearly visible either in the beginning or the end. The proposed method and the partial fraction method are perfectly symmetric.

To highlight the transients due to unknown initial conditions in the partial fraction method, a very simple zero-phase high-pass lter is used,

y(^t) =

1

q+ 0^:8

2

u(^t) = 125⁼36

q+ 1^:25^; 20⁼9

q+ 0^:8

1The result can be reproduced using Matlab's Gaussian distribution with seed 0

2Version 4.1 and earlier of Matlab has a bug. With the following de nition of the initial state^zi, the function works as intended: ^zi ⁼ (eye(nfilt-1) + aa(2:nfilt)' -eye(nfilt-2) zeros(1,nfilt-2)]])n (bb(2:nfilt)-bb(1)*aa(2:nfilt))'

6

(7)

0 20 40 60 80 100 120 140 160 180 200

−0.5

−0.4

−0.3

−0.2

−0.1 0 0.1 0.2 0.3 0.4 0.5

Forward−backward filtering using Matlab‘s filtfilt

0 20 40 60 80 100 120 140 160 180 200

−0.5

−0.4

−0.3

−0.2

−0.1 0 0.1 0.2 0.3 0.4 0.5

Forward−backward filtering using zero initial conditions

0 20 40 60 80 100 120 140 160 180 200

−0.5

−0.4

−0.3

−0.2

−0.1 0 0.1 0.2 0.3 0.4 0.5

Forward−backward filtering using optimized initial conditions

0 20 40 60 80 100 120 140 160 180 200

−0.5

−0.4

−0.3

−0.2

−0.1 0 0.1 0.2 0.3 0.4 0.5

Forward−backward filtering using partial fractions

Figure 1: The forward-backward and backward-forward ltered signal using rst Mat- labs ltlt, then zero initial condition, optimized initial conditions and nally using partial fractions.

7

(8)

0 20 40 60 80 100 120 140 160 180 200 5

10 15 20 25 30 35 40 45 50

Forward−backward filtering using optimized initial conditions

0 20 40 60 80 100 120 140 160 180 200

−100

−50 0 50 100 150 200

Forward−backward filtering using partial fractions

Figure 2: The ltered signal using optimized initial conditions and partial fractions.

and a DC oset of 100 is added to the ^u(^t) used in the previous example. The result is shown in Figure 2. The transients for the partial fraction method are obvious. Of course, some ad-hoc rules can be used to determine initial conditions to compensate for the DC oset in this example. The point is, that the proposed method computes the initial conditions for each lter to match the end of the other lter response when the transient hopefully is gone.

6 Conclusion

We have here detailed a method to choose the initial conditions for lters to be applied forwards and backwards to a signal. The objectives were to get the same result inde- pendently of the order of ltering and to minimize transients due to osets and trends in the data. Zero initial conditions or other ad-hoc initializations can by examples easily be shown to give undesired transients at the beginning or end of the data sequence depending on the order of application of the forward and backward lters. Transients are also a problem when using partial fractions of the stable and unstable parts. The proposed method gives no transients and is symmetric, which was exemplied by a simulation.

References

1] T. Kailath. Lectures on Wiener and Kalman ltering. Springer-Verlag, Wien, 1981.

2] MATLAB. Signal Processing Toolbox User's Guide. The MathWorks, Inc, 1993.

8