Comparing Estimation Algorithms for Camera Position and Orientation


Institutionen för systemteknik

Department of Electrical Engineering

Internship

Comparing Estimation Algorithms for Camera Position and Orientation

Richard J.B. Pieper
LITH-ISY-EX--07/0331--SE

Linköping 2007

Department of Electrical Engineering
Linköpings tekniska högskola
Linköpings universitet


Comparing Estimation Algorithms for Camera Position and Orientation

Internship

performed at

Division of Automatic Control

Department of Electrical Engineering

Linköpings universitet, Sweden

by

Richard J.B. Pieper
LITH-ISY-EX--07/0331--SE

Supervisor: M.Sc. J.D. Hol
isy, Linköpings universitet

Examiners: Prof. F. Gustafsson
isy, Linköpings universitet

Prof. J.B. Jonker
Laboratory of Mechanical Automation and Mechatronics, University of Twente

Linköping, 23 May, 2007


Avdelning, Institution / Division, Department:
Division of Automatic Control
Department of Electrical Engineering
Linköpings universitet
SE-581 83 Linköping, Sweden

Datum / Date: 2007-05-23
Språk / Language: Engelska/English

URL för elektronisk version:
http://www.control.isy.liu.se
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-9010

ISRN: LITH-ISY-EX--07/0331--SE

Titel / Title:
Jämförelse av Skattningsalgoritmer för Kamerapositionering
Comparing Estimation Algorithms for Camera Position and Orientation

Författare / Author: Richard J.B. Pieper

Sammanfattning / Abstract

State estimation deals with estimation of the state of an object of interest by observing noisy measurements. The process of obtaining the state estimates is called filtering. In this report several filters are compared to an existing one. The new filters deal with nonlinear process and measurement models in a different way than the existing filter: instead of approximating the nonlinear transformations, the probability densities are approximated by a set of points which undergo the nonlinear transformation.

The application for which the filters will be used is to estimate the position and orientation of a camera in a markerless environment, using data from an inertial measurement unit and a camera. It is found that the corresponding process and measurement models contain nonlinearities, and therefore an accuracy improvement is expected with the new filters.

The new filters are variations of the so-called unscented Kalman filter. Also a discussion on the marginalized particle filter is presented. Instead of using randomly chosen samples as in the particle filter methods, the unscented Kalman filter uses deterministically chosen points. The marginalized particle filter partitions the variables of the system in a linear and a nonlinear part. Linear Kalman filters are applied to the linear variables and a particle filter to the nonlinear variables, thus reducing the computational load. Details of various implementations of the filters are given, as well as the motivation for the specific implementations.

Tests are carried out to assess the performance of the filters. This is done with both simulation data and real measurements. A comparison is made to the original extended Kalman filter. The tests are focused on accuracy and computational load.

Results showed that the use of the new filters did not improve accuracy. This is mainly because the nonlinearities are not so severe. Furthermore, the filters had a higher computational load, which is an important aspect in the system reviewed in this report. Therefore the current filter need not be replaced. The unscented Kalman filter is a good alternative to the EKF for new applications, since it can handle the system in a black-box manner, in contrast to the EKF.




Acknowledgments

First of all I would like to thank my direct supervisor Jeroen Hol for the pleasant cooperation and the instructive experience. I would also like to thank Professor Fredrik Gustafsson for letting me be a part of the control group for a short while. Furthermore, my gratitude goes out to everyone at the control group for their dedication to science and for making me feel welcome in Sweden.

Special thanks go out to Dr. Ir. F. van der Heijden and Dr. Ir. Y. Boers for introducing me to the control group.

To the many people who made my stay in Sweden a pleasant one: the graduate students Ahmal, Per, Simon and Tobias, my corridor mates Katarina, Maria, Rokie and Stefan, and the people who came to visit me all the way from The Netherlands (you know who you are!), thanks a lot! Finally I want to thank my parents and my sister for their love and support, and Lenneke for being the great person she is and basically for putting up with me for so long (I'll be home soon!).

Linköping, May 2007

Richard Pieper


Contents

1 Introduction
  1.1 Previous work
  1.2 Report outline

2 Estimation Theory
  2.1 Linear systems
  2.2 Nonlinear systems

3 The Unscented Kalman Filter
  3.1 Unscented Transformation
  3.2 Sigma-point selection
  3.3 The scaled unscented transformation
  3.4 Choice of filter
    3.4.1 Criteria
    3.4.2 Motivation
  3.5 Filter equations
    3.5.1 Unscented Kalman filter with spherical simplex sigma points
    3.5.2 Square-root UKF with spherical simplex sigma points
    3.5.3 Augmented UKF with symmetric sigma points

4 The Marginalized Particle Filter
  4.1 Particle filters
    4.1.1 Importance sampling
    4.1.2 Degeneracy
  4.2 Marginalization
    4.2.1 Mixed linear/nonlinear systems
    4.2.2 Computational complexity
  4.3 Filter equations

5 System Models and Filter Equations
  5.1 Coordinate systems
  5.2 Process model
  5.3 Measurement models
    5.3.1 Inertial Measurement Unit
    5.3.2 Vision system
    5.3.3 Implementation considerations
  5.4 Model summary
  5.5 Derived filters
    5.5.1 Sequential measurement updates
    5.5.2 Non-augmented UKF
    5.5.3 Augmented UKF
    5.5.4 Marginalized particle filter

6 Experiments
  6.1 Test Method
    6.1.1 Simulation
    6.1.2 Real data
    6.1.3 Tests
    6.1.4 Consistency checks
    6.1.5 Filter parameters
  6.2 Results
    6.2.1 Simulation results
    6.2.2 Real data

7 Concluding remarks
  7.1 Conclusions
  7.2 Future work

Bibliography

Notation

A Math
  A.1 Multi-dimensional Taylor series
  A.2 Quaternion Algebra

B Accuracy of the Unscented Kalman Filter


Chapter 1

Introduction

This report is an extension to previous research [1] carried out as a part of the MATRIS project.¹ Running from February 2004, MATRIS aims at developing a system capable of determining the position and orientation of a camera in real time by a combined analysis of the images captured by the camera and measurements obtained with unobtrusive inertial motion sensors. With a known camera position and orientation (from now on simply called pose), it becomes possible to render virtual objects into a realistic scene (so-called augmented reality) in a convincing way. This can be done by superimposing them on camera images or displaying them on see-through head-mounted displays. Typical applications include television broadcasts, education, entertainment and teleconferencing with both real and virtual participants. Although augmented reality (AR) is already being applied in, for instance, sports broadcasts, it is usually limited to stationary camera positions. The goal of MATRIS is to extend these systems to applications involving moving cameras.

The key feature of the system is that it does not require any markers. Therefore the system is applicable in an unprepared environment, in contrast to current AR systems.

1.1 Previous work

In this section a short summary is given of previous work within the MATRIS project. By doing so the need for further research will become clear. Details regarding the information in this section can be found in [1].

To produce a convincing augmented image, the deviation of the estimated pose with respect to the actual pose of the camera must be within certain boundaries. For typical settings in the MATRIS project this means a maximum deviation of 2 mm in the position of the camera and an angular accuracy of 0.1 degrees.

To achieve the desired accuracy multiple sensors are used. The setup includes the camera itself and an Inertial Measurement Unit (IMU).

¹ Markerless real-time Tracking for Augmented Reality Image Synthesis [2]

Figure 1.1. Example of augmented reality: a virtual car is rendered into the scene.

The camera and the complementary computer vision give a good prediction of the pose only when the motion is minimal. The IMU by itself gives a good estimate on a short time scale but tends to drift as time goes on. The idea is to fuse the data of both sensors together, as can be seen in Figure 1.2.

Figure 1.2. Schematic illustration of the approach: computer vision, using a 3D scene model and the camera images, and the IMU feed a sensor fusion block, which outputs the position and orientation.

The computer vision provides a list of 2D/3D correspondences, i.e. a list of 3D coordinates of a certain feature in the environment with corresponding 2D image coordinates. From this relation the camera pose can be estimated. The IMU provides 3D acceleration and 3D angular velocity measurements. Also a signal is sent to the camera in order to synchronise the data from the IMU and the camera.

The data from the IMU and the camera are fused together in a Kalman framework. The mean and covariance of the state of interest (position and orientation) are propagated in time by means of recursion. Correct process and measurement models are essential. The derivation of the process and measurement


models is discussed in detail in [1]. It is found that the models are nonlinear, and an Extended Kalman Filter (more about the EKF can be found in Chapter 2) is used to approximate the nonlinear transformation of the probability distributions.

Tracking accuracies obtained with this system are shown in Figure 1.3. The estimation of the orientation is within bounds, but the error in the position estimation is around 5 mm where a maximum of 2 mm is allowed. In [1] several possible causes are mentioned, one of them being whether or not the developed filter has a satisfactory performance. With the results of this report that question can be answered.

Figure 1.3. RMSE for position (black) and orientation (grey) for different motion types.

1.2 Report outline

In this report two new filters will be developed: an Unscented Kalman Filter and a Marginalized Particle Filter. Chapter 2 deals with general estimation theory. In Chapter 3 the unscented Kalman filter is treated; the general filter equations are derived and various possible implementations are discussed. Chapter 4 discusses the marginalized particle filter: the general particle filter is presented, after which the framework of the marginalized particle filter is discussed and general filter equations are given. Chapter 5 reviews the system models. Adjustments are made to the models found in [1]. Furthermore, characteristics of the general equations with respect to the models used in this report are treated in this chapter. In Chapter 6 tests are specified to assess the performance of the filters against ground truth data; the results of the tests are also presented there. Finally, in Chapter 7 conclusions are drawn from the test results and an answer is given to the question raised in Section 1.1.


Chapter 2

Estimation Theory

This chapter serves as an introduction to estimation theory. Many textbooks can be found on this subject. It will be shown that in order to estimate the states of interest of a process, models of the process and the measurement system are needed; these can be linear, nonlinear or nonlinear with a linear substructure. Here the first two cases will be discussed. The latter case is postponed to Chapter 4 (and especially Section 4.2.1).

2.1 Linear systems

Consider a certain process, for example tracking a target. Measurements are carried out in order to obtain information about the state of the process (e.g. position and velocity). With the measurements comes (electrical) noise and other unpredictable phenomena. This noisy data has to be filtered in order to find the true system state. The goal of the filter is to combine prior knowledge of the state of the process with the (noisy) measurements to give the best possible estimate of the state. Of great importance is Bayes’ theorem for conditional probabilities [3]:

p(x|z) = p(z|x) p(x) / p(z)    (2.1)

with state x, measurements z and p(·|·) a conditional probability density. The density p(z|x) represents the randomness of the measurements given a certain state. The density p(x) represents the prior knowledge of the state. The probability density of the measurements is given by p(z). Finally, the posterior probability p(x|z) represents the knowledge available on the state when the measurements are known. With this probability at hand, calculation of the optimal estimate of the state is possible. For instance the minimum mean square error estimator,

x̂_MMSE(z) = ∫ x p(x|z) dx = E[x|z]    (2.2)

is given by the expected value of the posterior probability density.
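As a numerical illustration of Eqs. (2.1) and (2.2), the posterior density and the MMSE estimate can be evaluated on a grid for a scalar Gaussian example. The sketch below is not part of the original derivation; the prior, the measurement model and all numbers are our own illustration.

```python
import math

# Numerical sketch of Bayes' rule (2.1) and the MMSE estimator (2.2) for a
# scalar state on a grid; the prior, measurement model and numbers are ours.
def gauss(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

step = 0.01
grid = [i * step for i in range(-500, 1001)]    # plausible states in [-5, 10]
z = 3.0                                         # one noisy measurement

prior = [gauss(x, 2.0, 1.0) for x in grid]      # p(x): x roughly 2, variance 1
like  = [gauss(z, x, 0.5) for x in grid]        # p(z|x): z = x + v, v ~ N(0, 0.5)

unnorm = [l * p for l, p in zip(like, prior)]
p_z = sum(unnorm) * step                        # normalising constant p(z)
posterior = [u / p_z for u in unnorm]           # Bayes' rule (2.1)

# MMSE estimate (2.2): the mean of the posterior density.
x_mmse = sum(x * p for x, p in zip(grid, posterior)) * step
print(round(x_mmse, 3))                         # → 2.667 (= 8/3)
```

For this conjugate Gaussian pair the posterior mean is also available in closed form, (2.0/1.0 + 3.0/0.5)/(1/1.0 + 1/0.5) = 8/3, which the grid computation reproduces: the estimate is pulled from the prior mean towards the measurement.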


The idea behind Kalman filtering is to recursively maintain the posterior probability density in time. The recursion starts with the initialization of the prior knowledge of the state, i.e. the expectation of x(0). At that point the information of the current measurement is incorporated to correct the prior knowledge (yielding the posterior probability). Estimation of the state is now possible. The Kalman filter then predicts the state of the system for the next time step, after which it can be corrected with the measurement in the new time step, and so on.

To calculate the densities in Eq. (2.1), models of the process and the measurement system have to be available. In the linear Gaussian case (i.e. linear system models and Gaussian noise), the densities are fully defined by their mean and covariance.

A (dynamic) system can be represented by the following (linear and discrete) state space equations [3]:

x(i + 1) = A(i) x(i) + B(i) u(i) + w(i)
z(i) = C(i) x(i) + v(i)    (2.3)

with A being the model of the process and C the measurement model. In the Gaussian case the noise terms w(i) and v(i) have zero mean and the assumption is often made that they are not correlated to each other. These equations represent the dynamic behaviour of the system, and how the measurements are related to the states. Taking the first and second moments of the first line in Eq. (2.3) yields [3]

E[x(i + 1)] = A(i) E[x(i)] + B(i) u(i)
P_x(i + 1) = A(i) P_x(i) A(i)^T + P_w(i)    (2.4)

where P_w(i) follows from E[w(i) w(j)^T] = P_w(i) δ_ij. The uncertainty of the state is represented by the covariance matrix P_x. Equation (2.4) shows how the mean and covariance are propagated to the next time step, and thus the optimal estimate of the state can be calculated in time.
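The propagation step (2.4) can be sketched numerically as below. The constant-velocity model and its numbers are our own illustration, not a model from this report, and the input term B(i)u(i) is omitted for simplicity.

```python
# Minimal numeric sketch of the propagation step (2.4); the constant-velocity
# model and all numbers are illustrative, not taken from the report.

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

A  = [[1.0, 1.0], [0.0, 1.0]]   # x = [position, velocity], unit time step
Pw = [[0.1, 0.0], [0.0, 0.1]]   # process noise covariance

x = [[2.0], [1.0]]              # E[x(i)] as a column vector
P = [[1.0, 0.0], [0.0, 1.0]]    # Px(i)

# Eq. (2.4), with no input term B(i)u(i):
x_next = mat_mul(A, x)
AP = mat_mul(A, P)
P_next = [[ap_at + pw for ap_at, pw in zip(r1, r2)]
          for r1, r2 in zip(mat_mul(AP, transpose(A)), Pw)]

print(x_next)   # the mean moves with the velocity
print(P_next)   # uncertainty grows through A and the process noise
```

Running the sketch shows the mean advancing to [3, 1] while the position variance grows both through the coupling in A and through the added process noise.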

2.2 Nonlinear systems

When the system models are nonlinear, Eq. (2.3) changes into [3]

x(i + 1) = f(x(i), u(i), i) + w(i)
z(i) = h(x(i), i) + v(i)    (2.5)

and Eq. (2.4) does not hold. In fact, the mean and covariance are no longer sufficient to represent the transformed probability density. The optimal solution requires a complete description of the posterior probability density and hence an impractical number of parameters. Instead, many suboptimal approximations have been derived. The most well known is the Extended Kalman Filter (EKF), which has also been implemented in [1]. The EKF uses first-order Taylor linearizations of the nonlinear system, after which the linear Kalman filter can be applied.


Despite its apparent simplicity, the linearization error in the state estimation may be substantial and can in some cases lead to divergence of the filter. Furthermore, calculation and implementation of the (often large) Jacobian matrices are prone to errors. The following example clearly demonstrates the possible shortcomings of the EKF.

Example 2.1: A simple nonlinear state estimation problem

This example will be used throughout the text to show the differences in performance between various filters. It comprises a simple state estimation problem which has been analysed in many publications, mostly to demonstrate the shortcomings of the extended Kalman filter. In this particular example the true system state will be estimated using noisy measurements and an EKF. The state of the following (nonlinear) system will be estimated:

x_{t+1} = x_t/2 + 25 x_t/(1 + x_t^2) + 8 cos(1.2 t) + w_t    (2.6)
y_t = x_t^2/20 + e_t    (2.7)

where x_0 ~ N(0, 5), and w_t and e_t are mutually independent white Gaussian noise sequences, w_t ~ N(0, 10) and e_t ~ N(0, 1). Figure 2.1 shows that the EKF is clearly not able to cope with the nonlinearities in the system. This calls for better ways to deal with this nonlinear system, and with nonlinear systems in general.

Figure 2.1. The true system state is indicated by the dashed line, the estimated state by the solid line. The measurements are indicated by a '+'.
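The system (2.6)-(2.7) and a scalar EKF for it can be reproduced in a short simulation. This is a sketch under our own assumptions (the random seed, all variable names, and the convention that the cosine term uses the current step index are ours); the scalar Jacobians follow from differentiating (2.6) and (2.7).

```python
import math
import random

# Sketch of the benchmark system (2.6)-(2.7) and a scalar EKF for it.
# The Jacobians F and H come from differentiating (2.6) and (2.7);
# the seed, names and time-index convention are our own choices.
random.seed(0)
Q, R = 10.0, 1.0                                   # process / measurement noise variances

def f(x, t): return x / 2 + 25 * x / (1 + x * x) + 8 * math.cos(1.2 * t)
def F(x):    return 0.5 + 25 * (1 - x * x) / (1 + x * x) ** 2   # df/dx
def h(x):    return x * x / 20
def H(x):    return x / 10                                      # dh/dx

# Simulate the true state and the measurements.
x = random.gauss(0, math.sqrt(5))                  # x0 ~ N(0, 5)
xs, ys = [], []
for t in range(100):
    x = f(x, t) + random.gauss(0, math.sqrt(Q))
    xs.append(x)
    ys.append(h(x) + random.gauss(0, math.sqrt(R)))

# Scalar extended Kalman filter.
xhat, P = 0.0, 5.0
est = []
for t, y in enumerate(ys):
    xpred = f(xhat, t)                             # time update, linearised at xhat
    Ppred = F(xhat) ** 2 * P + Q
    S = H(xpred) ** 2 * Ppred + R                  # innovation variance
    K = Ppred * H(xpred) / S                       # Kalman gain
    xhat = xpred + K * (y - h(xpred))              # measurement update
    P = (1 - K * H(xpred)) * Ppred
    est.append(xhat)

rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(xs, est)) / len(xs))
print(round(rmse, 2))                              # typically large for this system
```

Note that h(x) = x²/20 discards the sign of the state, so the linearised measurement update can lock onto the wrong sign, which is one reason the EKF performs poorly here.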


Chapter 3

The Unscented Kalman Filter

Julier and Uhlmann [5] proposed a method to circumvent the problems inherent to the EKF. It led to a formulation in which the system can be treated as a 'black box': calculation of the Jacobian matrices is not necessary, so implementation of new system functions is straightforward. Furthermore it has the same computational complexity as the EKF.

In this chapter the filter equations of the unscented Kalman filter (UKF) will be derived. First the Unscented Transformation will be discussed.

3.1 Unscented Transformation

The linear Kalman filter uses the fact that a Gaussian random variable (GRV) which undergoes a linear operation remains a GRV. In the nonlinear case this does not hold, and other methods have to be applied to transform the mean and covariance from the current time step to the next. The unscented transform is based on the intuition that [5] it is easier to approximate a Gaussian distribution than it is to approximate an arbitrary nonlinear function or transformation. So instead of using Taylor approximations and Jacobian matrices of the nonlinear transformations, the aim is to approximate the untransformed density, which, being a Gaussian distribution, is relatively straightforward.

The Gaussian distribution is approximated by a set of points which captures the mean and covariance of the distribution. These sigma points are then passed through the nonlinear model of the system, yielding a set of transformed points. By choosing appropriate weights, which obey the constraint that the sum of the weights equals 1, the weighted average and the weighted outer product of the transformed points give the mean and covariance of the transformed distribution. The weights can be positive or negative. The method is illustrated in Fig. 3.1, where both the linearization approach of the EKF and the unscented transformation are depicted.

Figure 3.1. Linearization approach and the unscented transformation. The dots represent the sigma points. The thick line is an arbitrary nonlinear function.

Choosing the set of sigma points is discussed in Section 3.2. For now, assume that 2n + 1 sigma-point vectors X_i are needed to represent the probability density of a random variable x, and that appropriate weights W_i are available. Consider the following nonlinear transformation

y = f(x)    (3.1)

Passing the sigma-point vectors through the nonlinear function yields the transformed sigma points,

Y_i = f(X_i)    (3.2)

from which the mean and covariance of the transformed probability density can be approximated:

ȳ = Σ_{i=0}^{2n} W_i Y_i    (3.3)
P_y = Σ_{i=0}^{2n} W_i (Y_i − ȳ)(Y_i − ȳ)^T    (3.4)
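A scalar sketch of the transformation (3.1)-(3.4): for x ~ N(1, 2) and y = x², three symmetric sigma points with n + κ = 3 reproduce the exact transformed mean and variance, whereas a first-order linearization around the mean does not. The prior, the map and all names are our own illustration.

```python
import math

# Sketch of the unscented transformation (3.1)-(3.4) for a scalar state,
# using the symmetric sigma-point set (3.5)-(3.6) with n + kappa = 3.
# The prior N(1, 2) and the map y = x^2 are illustrative choices.
mean, var = 1.0, 2.0
n, kappa = 1, 2.0                          # n + kappa = 3, suggested for Gaussians

spread = math.sqrt((n + kappa) * var)
X = [mean, mean + spread, mean - spread]   # 2n + 1 = 3 sigma points
W = [kappa / (n + kappa)] + [1 / (2 * (n + kappa))] * 2

f = lambda x: x * x                        # the nonlinear transformation (3.1)
Y = [f(x) for x in X]                      # transformed points (3.2)

y_mean = sum(w * y for w, y in zip(W, Y))                  # Eq. (3.3)
y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(W, Y))   # Eq. (3.4)

# For x ~ N(1, 2) and y = x^2 the exact mean is 3 and the exact variance 16;
# the unscented estimates match both here, while an EKF-style linearisation
# around the mean would give mean 1 and variance 8.
print(y_mean, y_var)
```

The exact agreement in this example comes from the fact that, for a scalar Gaussian and n + κ = 3, the symmetric set also matches the fourth moment; for general dimensions and transformations the unscented estimates are approximations.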

3.2 Sigma-point selection

The major difference between the Unscented Transform and other approximations such as Monte Carlo sampling is that the sigma points are deterministically chosen to represent the untransformed mean and covariance. This yields a required set of points that is much smaller than with Monte Carlo methods, and hence a smaller computational load. Since the original formulation, much research has focused on achieving the same accuracy with fewer samples. A survey of the literature showed the following schemes:

1. Symmetric sigma points [5] (1995)
2. Simplex sigma points [6] (2002)
3. Minimal skew simplex points [6] (2002)
4. Spherical simplex points [7] (2003)

The interested reader is referred to the following articles, which discuss further sigma point selection schemes: The higher order unscented filter (HOUF) [8], A more robust unscented transform [9] and Second-order simplex sigma points. The first scheme as published operates on one-dimensional systems. If it is expected or experienced that the higher order moments have a significant influence, one can extend the scheme to the actual state dimension. In this report the assumption is made that the needed accuracy will be within reach using one of the other sigma point schemes, and the effort to extend the HOUF to the actual dimension of the system is not undertaken. The second article discusses the use of extra sigma points (test points), which can improve the estimated mean and variance in case of discontinuities in the nonlinear functions. The last selection scheme was unfortunately not available to the author. In the remainder of this section the four schemes listed above will be discussed in more detail.

Symmetric sigma points

Julier and Uhlmann proposed the following sigma-point selection,

X_0 = x̄
X_i = x̄ + (√((n + κ) P_x))_i
X_{i+n} = x̄ − (√((n + κ) P_x))_i    (3.5)

with weights

W_0 = κ/(n + κ)
W_i = 1/(2(n + κ))
W_{i+n} = 1/(2(n + κ))    (3.6)

where n is the dimension of the state variable, P_x is the state covariance matrix and (·)_i is the i-th column of the matrix square root. Rewriting the root term in Eq. (3.5),

(√(n + κ) √P_x)_i = √(n + κ) (σ)_i    (3.7)

points out that the i-th column corresponds to the standard deviation in the direction of one of the principal axes.


The term κ can be interpreted as a scaling factor. It can be used to tune the higher order moments of the approximation. When the state variable can be considered Gaussian, setting the scaling factor according to n + κ = 3 is a good choice. For other prior state distributions, setting n + κ = k will minimize the fourth order moment (the kurtosis) k [5]. The matrix square root can be calculated by a numerically efficient operation such as Cholesky decomposition. The matrix containing the sigma vectors is of size n × (2n + 1), so for every state 2n + 1 sigma points are determined.
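The selection (3.5)-(3.6), with the matrix square root taken by a Cholesky factorisation, can be sketched as below for a 2-dimensional state; the mean, covariance and all names are illustrative. By construction, the weighted mean and outer product of the resulting 2n + 1 points recover x̄ and P_x exactly.

```python
import math

# Sketch of the symmetric selection (3.5)-(3.6) for a 2-dimensional state,
# with a hand-rolled Cholesky square root. Numbers and names are ours.
def cholesky2(P):
    """Lower-triangular L with L L^T = P, for a 2x2 SPD matrix."""
    l00 = math.sqrt(P[0][0])
    l10 = P[1][0] / l00
    l11 = math.sqrt(P[1][1] - l10 * l10)
    return [[l00, 0.0], [l10, l11]]

n, kappa = 2, 1.0                          # n + kappa = 3
xbar = [1.0, -1.0]
Px = [[2.0, 0.5], [0.5, 1.0]]

L = cholesky2(Px)
s = math.sqrt(n + kappa)
cols = [[s * L[0][i], s * L[1][i]] for i in range(n)]   # columns of sqrt((n+kappa) Px)

pts = [list(xbar)]                                       # X0 = xbar
pts += [[xbar[0] + c[0], xbar[1] + c[1]] for c in cols]  # Xi
pts += [[xbar[0] - c[0], xbar[1] - c[1]] for c in cols]  # X(i+n)
W = [kappa / (n + kappa)] + [1 / (2 * (n + kappa))] * (2 * n)

# The weighted mean and outer product recover xbar and Px exactly.
m = [sum(w * p[k] for w, p in zip(W, pts)) for k in range(n)]
C = [[sum(w * (p[i] - m[i]) * (p[j] - m[j]) for w, p in zip(W, pts))
      for j in range(n)] for i in range(n)]
print(m, C)
```

The symmetric placement makes the weighted mean trivially equal to x̄, and the paired outer products sum to exactly (n + κ)·P_x times the weight 1/(2(n + κ)) per pair, i.e. P_x.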

Wan and Van der Merwe [10] suggest a slightly different scaling factor. Basically it incorporates the Scaled Unscented Transformation, which is discussed in Section 3.3. The sigma points are chosen according to:

X_0 = x̄
X_i = x̄ + (√((n + λ) P_x))_i        i = 1, ..., n
X_i = x̄ − (√((n + λ) P_x))_{i−n}    i = n + 1, ..., 2n    (3.8)

with weights

W_0^(m) = λ/(n + λ)
W_0^(c) = λ/(n + λ) + (1 − α² + β)
W_i^(m) = W_i^(c) = 1/(2(n + λ))    (3.9)

where λ = α²(n + κ) − n. The parameter α determines the spread of the sigma points around x̄ and is suggested to be set to a small positive value (e.g. 1e−4 ≤ α ≤ 1). For 1-dimensional systems, α = 1 is optimal. κ is the same parameter as used in Eqs. (3.5) and (3.6). For Gaussian distributions β = 2 is optimal. The superscripts (m) and (c) denote the weights for the mean and covariance, respectively.

Simplex sigma points

When the dimension of the state vector is n, the covariance can be represented by n points [6]. For instance, if the state variable is 2-dimensional, at least 2 points are needed to represent the second order moment. To also account for the mean, an additional sigma point is needed, implying a minimal set of n + 1 sigma points (see Figure 3.2).

Figure 3.2. Simplex sigma points: here n = 2 and a third sigma point captures the mean of the 2-dimensional distribution.

The simplex sigma point set does not account for higher order moments of the nonlinear transformation.

Minimal skew simplex points

This led to the derivation of a new set of sigma points [6]. The minimal skew simplex sigma point set incorporates the third order moment (the skew) by minimizing it. Because the skewness can be in any direction, the average error is minimized by assuming a symmetric distribution. The sigma points themselves are positioned asymmetrically around the mean. This sigma point selection needs an additional point, so in total n + 2 points are needed to capture the first and second moments and to minimize the third order moment. The sigma point selection can be found in Algorithm 3.1, where it is noted that the algorithm works for a random variable v with the properties v̄ = 0 and P_v = I. However, this variable can be transformed to a random variable z with mean z̄ and covariance P_z by the linear transformation [6]

z = z̄ + (√P_z)^T v    (3.10)

Spherical simplex points

The sphere that bounds the sigma points in the set derived above grows with the dimension of the state; in fact the radius of the sphere is 2^{n/2}. This can lead to problems with numerical stability, especially for high-dimensional systems. A better sigma point selection algorithm was developed in [7]. The selection of the sigma points can be found in Algorithm 3.2.

The difference with the minimal skew simplex points is that the weight is the same for every sigma point (apart from the 0th point) and that all sigma points (apart from the 0th point) lie on a hypersphere with radius √(n/(1 − W_0)). This set of n + 2 sigma points therefore has better numerical properties.

3.3 The scaled unscented transformation

The sigma point selection schemes discussed so far (except the scheme of Wan and Van der Merwe) share the property that the radius of the sphere that bounds the sigma points increases as the state dimension increases, thereby sampling non-local effects. To overcome this problem a method is proposed in [11], called the scaled unscented transformation, which scales the sigma points of an arbitrary sigma point set. It can be implemented as presented below. The new sigma points are chosen as

X'_i = X_0 + α (X_i − X_0)    (3.11)

with weights

W'_0 = W_0/α² + (1 − 1/α²)
W'_i = W_i/α²,  i ≠ 0    (3.12)


Algorithm 3.1 Minimal skew sigma point selection

1. Choose 0 ≤ W_0 ≤ 1.

2. Choose the weight sequence:

   W_i = (1 − W_0)/2^n,  i = 1, 2
   W_i = 2^{i−1} W_1,    i = 3, ..., n + 1

3. Initialize the vector sequence as:

   X_0^1 = [0],  X_1^1 = [−1/√(2 W_1)],  X_2^1 = [1/√(2 W_1)]

4. Expand the vector sequence for j = 2, ..., n according to:

   X_i^j = [X_0^{j−1}; 0]              for i = 0
   X_i^j = [X_i^{j−1}; −1/√(2 W_j)]    for i = 1, ..., j
   X_i^j = [0_{j−1}; 1/√(2 W_j)]       for i = j + 1

Algorithm 3.2 Spherical simplex sigma point selection

1. Choose 0 ≤ W_0 ≤ 1.

2. Choose the weight sequence:

   W_i = (1 − W_0)/(n + 1),  i = 1, ..., n + 1

3. Initialize the vector sequence as:

   X_0^1 = [0],  X_1^1 = [−1/√(2 W_1)],  X_2^1 = [1/√(2 W_1)]

4. Expand the vector sequence for j = 2, ..., n according to:

   X_i^j = [X_0^{j−1}; 0]                    for i = 0
   X_i^j = [X_i^{j−1}; −1/√(j(j+1) W_1)]     for i = 1, ..., j
   X_i^j = [0_{j−1}; j/√(j(j+1) W_1)]        for i = j + 1
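Algorithm 3.2 can be sketched as follows for a unit random variable (zero mean, identity covariance); the function name and the chosen parameters are ours. The n + 2 resulting points (apart from the 0th) lie on a hypersphere of radius √(n/(1 − W_0)), and their weighted first and second moments match the represented density.

```python
import math

# Sketch of Algorithm 3.2 (spherical simplex sigma points) for a unit
# random variable; function name and parameter choices are illustrative.
def spherical_simplex(n, W0):
    W1 = (1.0 - W0) / (n + 1)
    # step 3: initial one-dimensional sequence
    X = [[0.0], [-1.0 / math.sqrt(2 * W1)], [1.0 / math.sqrt(2 * W1)]]
    # step 4: expand dimension by dimension
    for j in range(2, n + 1):
        a = 1.0 / math.sqrt(j * (j + 1) * W1)
        new = [X[0] + [0.0]]                            # i = 0
        new += [X[i] + [-a] for i in range(1, j + 1)]   # i = 1, ..., j
        new += [[0.0] * (j - 1) + [j * a]]              # i = j + 1
        X = new
    return X, [W0] + [W1] * (n + 1)

X, W = spherical_simplex(3, 0.5)
# all points except the 0th lie on a sphere of radius sqrt(n/(1 - W0))
print([round(math.sqrt(sum(c * c for c in x)), 6) for x in X[1:]])
```

A transformed random variable with mean z̄ and covariance P_z is then obtained from these unit points via the linear transformation of Eq. (3.10).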


where α has the same function and suggested value as in Eq. (3.9). Furthermore, some of the transformations are adjusted to incorporate the scaling factor:³

Y'_i = f(X'_i)    (3.13)

ȳ = Σ_{i=0}^{p} W'_i Y'_i    (3.14)

P_y = Σ_{i=0}^{p} W'_i (Y'_i − ȳ)(Y'_i − ȳ)^T + (1 + β − α²)(Y'_0 − ȳ)(Y'_0 − ȳ)^T    (3.15)
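A scalar sketch of Eqs. (3.11)-(3.12), applied to the symmetric set of Section 3.2 (all numbers and names are illustrative): after scaling, the weights still sum to 1 and the represented mean and covariance are unchanged; only the spread of the points around X_0 shrinks by the factor α.

```python
import math

# Sketch of the scaled unscented transformation (3.11)-(3.12), applied to
# the scalar symmetric set; all numbers and names are illustrative.
mean, var = 1.0, 2.0
n, kappa, alpha = 1, 2.0, 0.1

spread = math.sqrt((n + kappa) * var)
X = [mean, mean + spread, mean - spread]
W = [kappa / (n + kappa)] + [1 / (2 * (n + kappa))] * 2

Xs = [X[0] + alpha * (x - X[0]) for x in X]                # Eq. (3.11)
Ws = [W[0] / alpha ** 2 + (1 - 1 / alpha ** 2)] \
     + [w / alpha ** 2 for w in W[1:]]                     # Eq. (3.12)

# The scaled set still represents the same mean and variance, but with the
# points pulled towards X0 by the factor alpha (less non-local sampling).
m = sum(w * x for w, x in zip(Ws, Xs))
v = sum(w * (x - m) ** 2 for w, x in zip(Ws, Xs))
print(round(m, 6), round(v, 6))
```

Note that W'_0 becomes strongly negative for small α; the represented second moment is still correct because the inflated weights W'_i exactly compensate for the α²-shrunken deviations.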

3.4 Choice of filter

3.4.1 Criteria

To make a well-founded decision, a list of criteria was adopted based on the application for the MATRIS project. The following criteria were taken into consideration:

1. Accuracy
2. Computational load
3. Numerical stability

In what follows a motivation is given for the particular implementation.

Accuracy

In order to meet the tight bounds determined for the MATRIS project, high accuracy has to be maintained during operation (see Section 1.1). In fact, the accuracy achieved with the current filter has been the motive for this report. A substantial deviation of the camera pose will lead to non-credible augmentation.

Computational load

This is somewhat related to the accuracy in the sense that any lag in the estimation of the pose will be very much visible to the user of the system. This lag is mostly due to the computational load needed for real-time estimation of the pose. Since the computational load is directly related to the number of sigma points, the implementation with the least number of sigma points is preferred, while maintaining the needed accuracy. Furthermore, when the process and measurement noise are purely additive, a computationally less demanding formulation becomes applicable [12]. However, care must be taken with this formulation since it has been reported that in some cases it can lead to a decrease in accuracy [13]. Furthermore, in [14] (pp. 109) two different ways are presented on how to actually incorporate the additive noise. A trade-off has to be made between emphasis on the information captured by the sigma points about odd moments and computational load.

³ As mentioned earlier, when the sigma point selection of Wan and Van der Merwe is chosen

Numerical stability

Of course the filter has to perform without divergence or deterioration of the estimates, so this is actually a criterion that has to be fulfilled at all times. The reason it is mentioned here is the choice between the original implementation and a so-called square-root implementation [15]. This formulation is reported to yield the same computational load but with improved numerical stability and guaranteed positive semi-definiteness of the state covariance matrix.⁴

3.4.2 Motivation

Preliminary simulations with various implementations of the filters showed that the implementation with the spherical simplex sigma points yielded the fastest estimations, as was expected, with the same accuracy as the more extensive sigma point selection schemes. Numerical instabilities were not encountered, so there seems to be no need for a square-root implementation, which also turned out to be slower. Therefore an Unscented Kalman filter will be derived based on Algorithm 3.2. Since the process and measurement noise are assumed to be purely additive, there is no need to augment the state vector as in the original formulations in e.g. [10]. Instead the filter is based on the filter equations in [12, 15], but with the reduced sigma point set. Since the estimates have to be made in real-time, the choice is made in favor of the faster incorporation of the additive process noise. A square-root implementation will also be developed, in case numerical stability issues do arise.

3.5 Filter equations

In this section the Unscented Transform will be used in a Kalman filtering framework. Three different implementations are presented. First the main algorithm will be derived, which is the UKF with the spherical sigma point set; after that the square-root version of the previous filter is derived; and finally the augmented UKF with the scaled symmetric sigma point set is presented. The algorithms are summed up in Appendix C. Many variations on the filters below are possible, but the differences are merely in the choice of the sigma point selection and whether or not the augmented version of the UKF is chosen. The algorithms presented here are of the general type. The characteristics of the filters for the system which is discussed in this report are postponed to Chapter 5.



3.5.1 Unscented Kalman filter with spherical simplex sigma points

The recursion starts by exploiting the knowledge of the state, letting its estimate equal the expectation of the state and the estimated covariance equal to

\[
\hat{x}_0 = E[x_0], \qquad P_0 = E\left[(x_0 - \hat{x}_0)(x_0 - \hat{x}_0)^T\right] \tag{3.16}
\]

A sigma point set \(\mathcal{X}_{init}\) is determined according to Algorithm 3.2. Then, for k ∈ {1, . . . , ∞}, the filter consists of a prediction and a correction step. First the sigma point set is transformed to the actual mean and covariance of the prior probability,

\[
\mathcal{X}_{k-1} = \hat{x}_{k-1} + \left(\sqrt{P_{k-1}}\right)^T \mathcal{X}_{init} \tag{3.17}
\]

after which the points are passed through the nonlinear process model:

\[
\mathcal{X}^*_{k|k-1} = f\left[\mathcal{X}_{k-1}\right] \tag{3.18}
\]

Now a prediction is available of the mean and covariance in the next timestep. The mean is captured by the following approximation:

\[
\hat{x}^-_k = \sum_{i=0}^{n+1} W_i \mathcal{X}^*_{i,k|k-1} \tag{3.19}
\]

The state covariance is then approximated by

\[
P^-_k = \sum_{i=0}^{n+1} W_i \left[\mathcal{X}^*_{i,k|k-1} - \hat{x}^-_k\right]\left[\mathcal{X}^*_{i,k|k-1} - \hat{x}^-_k\right]^T + Q \tag{3.20}
\]

This prediction is then corrected using the current measurement. First a new sigma point set has to be drawn because of the added process noise Q in Eq. (3.20). The initial sigma point set is transformed to the new mean and covariance

\[
\mathcal{X}_{k-1} = \hat{x}^-_k + \left(\sqrt{P^-_k}\right)^T \mathcal{X}_{init} \tag{3.21}
\]

where the matrix square root can be performed by the numerically efficient Cholesky decomposition. This set of points is passed through the measurement models to make a prediction of the measurement, which is then compared to the actual measurement to yield a correction term for the mean and covariance. Passing the sigma points through the measurement model yields a transformed set of sigma points according to

\[
\mathcal{Z}_{k|k-1} = h\left[\mathcal{X}_{k-1}\right] \tag{3.22}
\]

after which the predicted measurement mean can be approximated

\[
\hat{z}^-_k = \sum_{i=0}^{n+1} W_i \mathcal{Z}_{i,k|k-1} \tag{3.23}
\]


Next the observation covariance and the state-observation correlation matrices are determined, incorporating the measurement noise covariance \(R_{meas}\)

\begin{align}
P_{\tilde{z}_k\tilde{z}_k} &= \sum_{i=0}^{n+1} W_i \left[\mathcal{Z}_{i,k|k-1} - \hat{z}^-_k\right]\left[\mathcal{Z}_{i,k|k-1} - \hat{z}^-_k\right]^T + R_{meas} \tag{3.24} \\
P_{x_k z_k} &= \sum_{i=0}^{n+1} W_i \left[\mathcal{X}^*_{i,k|k-1} - \hat{x}^-_k\right]\left[\mathcal{Z}_{i,k|k-1} - \hat{z}^-_k\right]^T \tag{3.25}
\end{align}

Finally the Kalman gain is computed and the correction can be made,

\begin{align}
K_k &= P_{x_k z_k} P^{-1}_{\tilde{z}_k\tilde{z}_k} \tag{3.26} \\
\hat{x}_k &= \hat{x}^-_k + K_k\left(z_k - \hat{z}^-_k\right) \tag{3.27} \\
P_k &= P^-_k - K_k P_{\tilde{z}_k\tilde{z}_k} K^T_k \tag{3.28}
\end{align}

yielding the estimated state vector and state covariance. In the following example the result of this UKF with a simple state estimation problem is presented.

Example 3.1: Comparison EKF and UKF

In this example a comparison is shown for the state estimation problem introduced in Example 2.1. The two filters compared are an extended Kalman filter and an unscented Kalman filter with the spherical simplex sigma point set. The figure clearly demonstrates the better ability of the UKF to deal with the nonlinearities. The plot was made using W0 = 0.2.


Figure 3.3. With the same computational complexity, the UKF gives a better estimate of the nonlinear system than the EKF.



3.5.2 Square-root UKF with spherical simplex sigma points

The square-root (from now on abbreviated SR) version has some computational advantages, such as guaranteed positive semi-definiteness of the state covariance matrix. The key idea behind the SR filter is not to propagate the state covariance \(P_k\) but its matrix square root \(S_k\), where \(S_k S_k^T = P_k\). The SR-UKF makes use of three linear algebra techniques, namely QR decomposition, Cholesky factor updating and efficient least squares, of which the details can be found in [12]. These techniques are available in Matlab as

• [Q,R] = qr(A',0)
• S = cholupdate(S,columnvector,'sign')
• the '/' operator

The result of the economy QR decomposition of matrix A is the upper triangular matrix R. The second command performs the rank-1 update or downdate (depending on 'sign') of matrix S with columnvector. The '/' operator yields a QR decomposition with pivoting.
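Outside Matlab the rank-1 Cholesky update has to be written out explicitly. The following NumPy sketch (hedged; this `cholupdate` is a re-implementation of the standard algorithm, not Matlab's function) updates an upper-triangular factor R with RᵀR = P to the factor of P ± x xᵀ in O(n²) operations, which is what makes the SR formulation efficient.

```python
import numpy as np

def cholupdate(R, x, sign=1):
    """Rank-1 update (sign=+1) or downdate (sign=-1) of an upper-triangular
    factor R with R.T @ R = P, returning the factor of P + sign * x x^T."""
    R, x = R.astype(float).copy(), x.astype(float).copy()
    n = len(x)
    for k in range(n):
        r = np.sqrt(R[k, k] ** 2 + sign * x[k] ** 2)
        c, s = r / R[k, k], x[k] / R[k, k]
        R[k, k] = r
        if k + 1 < n:
            # rotate the remainder of row k, then fold it back into x
            R[k, k + 1:] = (R[k, k + 1:] + sign * s * x[k + 1:]) / c
            x[k + 1:] = c * x[k + 1:] - s * R[k, k + 1:]
    return R
```

A downdate can fail (the square root turns imaginary) when P − x xᵀ is not positive definite, which is exactly the situation the SR-UKF avoids by construction.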

The SR version has the same structure as the non-SR implementation; only the changes are presented here. For the whole algorithm a reference is made to Appendix C. The recursion starts with

\[
\hat{x}_0 = E[x_0], \qquad S_0 = \text{chol}\left\{E\left[(x_0 - \hat{x}_0)(x_0 - \hat{x}_0)^T\right]\right\} \tag{3.29}
\]

where chol denotes a Cholesky decomposition. The sigma points are chosen according to:

\[
\mathcal{X}_{k-1} = \hat{x}_{k-1} + \left(S_{k-1}\right)^T \mathcal{X}_{init} \tag{3.30}
\]

The calculation of the mean is the same as in the non-SR version (Eqs. (3.18) and (3.19)). The covariance is calculated according to

\[
S^-_k = \text{qr}\left\{\left[\sqrt{W_1}\left(\mathcal{X}^*_{1:n+1,k|k-1} - \hat{x}^-_k\right) \quad \sqrt{Q}\right]\right\} \tag{3.31}
\]

Next, the rank-1 Cholesky update is performed

\[
S^-_k = \text{cholupdate}\left\{S^-_k,\; \mathcal{X}^*_{0,k} - \hat{x}^-_k,\; \text{sign}(W_0)\right\} \tag{3.32}
\]

Again a new sigma point set is drawn to incorporate the process noise. The points are then passed through the measurement model and the mean of the observation is calculated. The update step for the observation covariance again makes use of the QR decomposition and the Cholesky update

\begin{align}
S_{\tilde{z}_k} &= \text{qr}\left\{\left[\sqrt{W_1}\left(\mathcal{Z}_{1:n+1,k} - \hat{z}_k\right) \quad \sqrt{R_{meas}}\right]\right\} \tag{3.33} \\
S_{\tilde{z}_k} &= \text{cholupdate}\left\{S_{\tilde{z}_k},\; \mathcal{Z}_{0,k} - \hat{z}_k,\; \text{sign}(W_0)\right\} \tag{3.34}
\end{align}


The state-observation correlation matrix is the same as in the non-SR implementation. The correction follows according to

\begin{align}
K_k &= \left(P_{x_k z_k} / S^T_{\tilde{z}_k}\right) / S_{\tilde{z}_k} \tag{3.35} \\
\hat{x}_k &= \hat{x}^-_k + K_k\left(z_k - \hat{z}^-_k\right) \tag{3.36} \\
U &= K_k S_{\tilde{z}_k} \tag{3.37} \\
S_k &= \text{cholupdate}\left\{S^-_k,\; U,\; -1\right\} \tag{3.38}
\end{align}

The cholupdate performs the rank-1 downdate with the columns of U. The Kalman gain follows from the fact that [15]

\[
K_k \left(S_{\tilde{z}_k} S^T_{\tilde{z}_k}\right) = P_{x_k z_k} \tag{3.39}
\]

3.5.3 Augmented UKF with symmetric sigma points

The augmented version differs from the previous algorithms in that the process and measurement noise are added to the state and state covariance matrix before determining the sigma points. As a result the noises need not be purely additive, in contrast to the previous algorithms. Once again the recursion starts with

\[
\hat{x}_0 = E[x_0], \qquad P_0 = E\left[(x_0 - \hat{x}_0)(x_0 - \hat{x}_0)^T\right] \tag{3.40}
\]

after which they are augmented as follows

\[
\hat{x}^a_0 = E\left[x^a\right] = \begin{bmatrix} \hat{x}^T_0 & 0 & 0 \end{bmatrix}^T, \qquad
P^a_0 = E\left[(x^a_0 - \hat{x}^a_0)(x^a_0 - \hat{x}^a_0)^T\right] =
\begin{bmatrix} P_0 & 0 & 0 \\ 0 & Q & 0 \\ 0 & 0 & R_{meas} \end{bmatrix} \tag{3.41}
\]

where the superscript a denotes the augmented state and state covariance. Next the sigma points are determined according to the desired selection scheme. For the symmetric scaled sigma points this gives

\[
\mathcal{X}^a_{k-1} = \begin{bmatrix} \hat{x}^a_{k-1} & \hat{x}^a_{k-1} + \sqrt{(n+\lambda)P^a_{k-1}} & \hat{x}^a_{k-1} - \sqrt{(n+\lambda)P^a_{k-1}} \end{bmatrix} \tag{3.42}
\]

where \(\mathcal{X}^a = [(\mathcal{X}^x)^T\; (\mathcal{X}^v)^T\; (\mathcal{X}^n)^T]^T\). The set of weights is chosen according to Eq. (3.8). Next the set of sigma points is split up and passed into the nonlinear system equation

\[
\mathcal{X}^x_{k|k-1} = f\left[\mathcal{X}^x_{k-1}, \mathcal{X}^v_{k-1}\right] \tag{3.43}
\]

after which the predicted mean and covariance can be calculated

\begin{align}
\hat{x}^-_k &= \sum_{i=0}^{2n} W^{(m)}_i \mathcal{X}_{i,k|k-1} \tag{3.44} \\
P^-_k &= \sum_{i=0}^{2n} W^{(c)}_i \left[\mathcal{X}_{i,k|k-1} - \hat{x}^-_k\right]\left[\mathcal{X}_{i,k|k-1} - \hat{x}^-_k\right]^T \tag{3.45}
\end{align}
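The scaled symmetric sigma point set used here can be sketched as follows (illustrative NumPy sketch; the function name and default parameters are assumptions). With α = 1 and β = 0 the weighted sample mean and covariance reproduce x̂ and P exactly, which is the defining property of the set.

```python
import numpy as np

def scaled_symmetric(x, P, alpha=1.0, beta=0.0, kappa=0.0):
    """Scaled symmetric sigma point set (2n+1 points) with separate mean
    and covariance weights, as used by the augmented UKF."""
    n = len(x)
    lam = alpha ** 2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * P)   # columns give the spread directions
    X = np.hstack([x[:, None], x[:, None] + S, x[:, None] - S])
    Wm = np.r_[lam / (n + lam), np.full(2 * n, 0.5 / (n + lam))]
    Wc = Wm.copy()
    Wc[0] += 1.0 - alpha ** 2 + beta        # covariance correction term
    return X, Wm, Wc
```

Applying this to the augmented state of Eq. (3.41) yields the 2n+1 columns of Eq. (3.42).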



Notice the difference with Eq. (3.20) in terms of the different weights and the absence of the process noise covariance Q. Since the process noise is already incorporated in the predicted sigma points, there is no need to draw a new set of sigma points. Instead, the measurement noise sigma points are used together with the predicted sigma points to determine the predicted measurement

\begin{align}
\mathcal{Z}_{k|k-1} &= h\left[\mathcal{X}^x_{k|k-1}, \mathcal{X}^n_{k|k-1}\right] \tag{3.46} \\
\hat{z}^-_k &= \sum_{i=0}^{2n} W^{(m)}_i \mathcal{Z}_{i,k|k-1} \tag{3.47}
\end{align}

The measurement update is performed identically to the non-augmented version, except for the measurement noise

\begin{align}
P_{\tilde{z}_k\tilde{z}_k} &= \sum_{i=0}^{2n} W^{(c)}_i \left[\mathcal{Z}_{i,k|k-1} - \hat{z}^-_k\right]\left[\mathcal{Z}_{i,k|k-1} - \hat{z}^-_k\right]^T \tag{3.48} \\
P_{x_k z_k} &= \sum_{i=0}^{2n} W^{(c)}_i \left[\mathcal{X}_{i,k|k-1} - \hat{x}^-_k\right]\left[\mathcal{Z}_{i,k|k-1} - \hat{z}^-_k\right]^T \tag{3.49} \\
K_k &= P_{x_k z_k} P^{-1}_{\tilde{z}_k\tilde{z}_k} \tag{3.50} \\
\hat{x}_k &= \hat{x}^-_k + K_k\left(z_k - \hat{z}^-_k\right) \tag{3.51} \\
P_k &= P^-_k - K_k P_{\tilde{z}_k\tilde{z}_k} K^T_k \tag{3.52}
\end{align}

Example 3.2: Augmented VS non-augmented

To see how the augmented UKF behaves in comparison to the non-augmented UKF, the states of the state estimation problem introduced in Example 2.1 will be estimated. This is especially interesting since it is the same system as the authors of [13] used in their article to demonstrate the difference in accuracy between the augmented and non-augmented UKF. Here the comparison is made for both the symmetric sigma point set and the spherical simplex sigma point set. The plots were made using the following parameters: α = 1, β = 0, κ = 0 and W0 = 0.2, and can be found on page 22.

The plots show the RMSE for 100 independent runs. All filters were given the same initial conditions. From Fig. 3.4 it is clear that the UKF with the symmetric sigma point set confirms the statements made in [13]. Surprisingly, the difference between the augmented and non-augmented UKFs with the spherical sigma point set is much smaller. In fact, the non-augmented UKF ('Spherical') even has a lower RMSE, as seen in Table 3.1. Unfortunately this sigma point set is not discussed in detail in [13]. From this example it can be concluded that the UKF with the spherical sigma point set is less sensitive to whether the augmented or non-augmented implementation is chosen. Also, the augmented UKF with the symmetric sigma point set has a lower error than both of the implementations with the spherical simplex sigma points.


                 Symmetric   Spherical
Augmented          10.4392     12.5590
Non-augmented      14.8880     12.3354

Table 3.1. RMSE, augmented vs. non-augmented. For the UKFs with the spherical simplex sigma point set the difference in RMSE is much smaller than for the symmetric sigma point set. The RMSE of the non-augmented UKF with the spherical simplex sigma point set is in this case even lower than that of the augmented version.

(a) Scaled symmetric sigma point set.

(b) Spherical simplex sigma point set.

Figure 3.4. RMSE plots. Top: augmented vs. non-augmented for the symmetric sigma point set; bottom: the same comparison for the UKFs with the spherical simplex sigma point set.


Chapter 4

The Marginalized Particle Filter

Another way to deal with nonlinear and/or non-Gaussian systems is the use of Sequential Monte Carlo methods, commonly known as particle filters. This chapter handles a variation on the standard particle filter, which is presented in Section 4.2. In Section 4.3 the filter equations are given. First, the general characteristics of the particle filter are discussed.

4.1 Particle filters

The particle filter (from now on abbreviated PF) approximates the state probabilities rather than adjusting the model equations, as is done in the Extended Kalman Filter. The difference with the Unscented Kalman Filter is that the particles are chosen randomly instead of deterministically and, in contrast to the UKF, the accuracy of the PF is directly related to the number of particles.

4.1.1 Importance sampling

In particle filtering, the approximation of a probability density is done by K point masses drawn from p(·), which are modelled using the Dirac function δ(·) [16]

\[
\hat{p}_K(x) \cong \frac{1}{K}\sum_{i=1}^{K} \delta(x - x^i) \tag{4.1}
\]

Now, let us reconsider Eq. (2.2), but this time generalized for the expectation of any function g(x)

\[
E\left[g(x)|z\right] = \int_x g(x)p(x|z)\,dx \tag{4.2}
\]


Perfect sampling approximates Eq. (4.2) with the use of Eq. (4.1). Let \(x^{(i)}, i = 1, \ldots, K\) be samples drawn from the conditional probability density p(x|z); then [3]

\[
E\left[g(x)|z\right] \cong \frac{1}{K}\sum_{i=1}^{K} g(x^{(i)}) \tag{4.3}
\]

The posterior density p(x|z) is not known beforehand. To eliminate the need for the posterior probability density the particle filter uses a so-called proposal (or importance) density q(x). Rewriting Eq. (4.2) yields

\begin{align}
E\left[g(x)|z\right] &= \int g(x)p(x|z)\,dx = \int g(x)\frac{p(x|z)}{q(x)}q(x)\,dx \nonumber \\
&= \int g(x)\frac{p(z|x)p(x)}{p(z)q(x)}q(x)\,dx = \int g(x)\frac{w(x)}{p(z)}q(x)\,dx \nonumber \\
&= \frac{1}{p(z)}\int g(x)w(x)q(x)\,dx \tag{4.4}
\end{align}

where Bayes' theorem is used. The weights are given by

\[
w(x) = \frac{p(z|x)p(x)}{q(x)} \tag{4.5}
\]

The importance weights involve the likelihood p(z|x), the prior density p(x) and the importance density q(x). Substitution of

\[
p(z) = \int p(z|x)p(x)\,dx = \int \frac{p(z|x)p(x)}{q(x)}q(x)\,dx = \int w(x)q(x)\,dx \tag{4.6}
\]

in Eq. (4.4) together with Eq. (4.3) yields

\[
E\left[g(x)|z\right] \cong \frac{\sum_{i=1}^{K} w(x^{(i)})\,g(x^{(i)})}{\sum_{i=1}^{K} w(x^{(i)})} = \sum_{i=1}^{K} w^{(i)}_{norm}\, g(x^{(i)}) \tag{4.7}
\]



where \(w^{(i)}_{norm} = w^{(i)}/\sum_i w^{(i)}\) denote the normalized importance weights. So drawing a set of samples from the importance density, together with the corresponding weights, enables us to determine the various moments of the posterior density. For example, the covariance matrix is found by substitution of \(g(x) = (x - \hat{x})(x - \hat{x})^T\). The MMSE estimator is given by the substitution of g(x) = x. Since a representation of the posterior probability is available, other estimation criteria can be adopted as well, in contrast to Kalman filtering.
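The self-normalized estimate of Eq. (4.7) can be demonstrated with a small sketch (the densities are hypothetical, chosen only for illustration: the target is N(1, 1), the Gaussian proposal is N(0, 2²), and g(x) = x, so the true value is 1). The normalizing constants cancel in the ratio, so only unnormalized weights are needed.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 200_000
x = rng.normal(0.0, 2.0, K)                      # samples from q(x) = N(0, 2^2)
# unnormalized log-weights log(p(x)/q(x)); shared constants cancel in Eq. (4.7)
logw = -0.5 * (x - 1.0) ** 2 + 0.5 * (x / 2.0) ** 2 + np.log(2.0)
w = np.exp(logw - logw.max())                    # subtract max for numerical stability
w_norm = w / w.sum()
estimate = np.sum(w_norm * x)                    # E[x] under the target, approx. 1
```

Working with log-weights, as above, is standard practice to avoid underflow when the likelihood is peaked.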

4.1.2 Degeneracy

Often the prior distribution is chosen as the importance density for its convenience; the weights then only depend on the likelihood function (see Eq. (4.5)). Though it is an easy-to-implement solution, it has some drawbacks, especially in high signal-to-noise systems where the likelihood is very peaked. Figure 4.1 shows the particles representing the likelihood and the prior distribution at one moment in time for the system presented in Examples 2.1 and 3.1. Looking at Fig. 4.1 one can imagine that when the likelihood is very peaked, only a few particles will be of any importance. A similar reasoning holds for likelihoods which lie in the tail of the prior distribution; again only a small number of particles will incorporate the information from the likelihood function. Thus the choice of the importance density is of paramount importance in order to move the samples to regions where they are most needed, and it has a great impact on filter performance. In the worst case so few particles will be of importance that after a few iterations one of the weights becomes 1, while the rest will be close to zero (and thus numerically insignificant). This phenomenon is called degeneracy. Choosing another importance density can in that case greatly improve performance.

Figure 4.1. Left: the solid line represents the estimated path, the dashed line the true path. Right: particles marked with a '+' represent the likelihood. Plotted at the arbitrary position 0.1 are the particles drawn from the prior distribution, indicated with an 'o'.

Another problem is that the variance of the importance weights increases stochastically in time. This can also lead to degeneracy. A method to counteract this behaviour is resampling: samples with low importance weights are removed and samples with high importance weights are multiplied, so as to keep a constant number of samples.

Resampling leads to using particles with a high probability many times and removing particles with low probability. This results in dependency among the samples, a phenomenon called sample impoverishment. Though there exist some counteracting methods, one has to be careful with resampling; it is wise only to resample when really needed. A criterion for when to resample is given below,

\[
N_{eff} = \left(\sum_{i=1}^{N} \left(w^i\right)^2\right)^{-1} \tag{4.8}
\]

with \(N_{eff}\) representing a measure of the number of effective particles. Resampling occurs when the number of effective particles is below a certain threshold, for instance \(N_{eff} < 2N_s/3\), with \(N_s\) the number of particles used for the estimation. There are various ways of resampling; in [16] four different methods are discussed.
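Eq. (4.8) amounts to one line of code; a minimal sketch (the function name is illustrative), which also normalizes the weights first so it can be applied directly to the unnormalized weights of Eq. (4.5):

```python
import numpy as np

def n_eff(w):
    """Effective number of particles, Eq. (4.8)."""
    w = np.asarray(w, dtype=float)
    w = w / w.sum()              # normalize, so Eq. (4.8) applies
    return 1.0 / np.sum(w ** 2)
```

The two extremes behave as expected: uniform weights give N_eff = N, while a single dominant weight gives N_eff = 1, the fully degenerate case.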

Recursive importance sampling and resampling is given in basic form by the condensation algorithm [3]. It consists of the Sampling Importance Resampling (SIR) algorithm followed by a time update step. Many variations on the basic algorithm exist, see e.g. [4]. In Algorithm 4.1 (page 31) only the basic particle filter is presented. Resampling is done by systematic resampling¹ and the proposal density is the transition prior.

Example 4.1: Comparison EKF, UKF and PF

In this example it is shown how the basic particle filter performs compared to the EKF and UKF on the previously introduced state estimation problem. The filters compared are an extended Kalman filter, an unscented Kalman filter with the spherical simplex sigma point set (W0 = 0.2) and a particle filter with 1000 particles. The results are shown in Figure 4.2. For clarity the EKF is left out in the left plot.

Both the UKF and the PF are at some points completely off (e.g. around 55 seconds), but overall the PF estimates the states with lower error. This is made clear by the right plot, where the RMSE is plotted for 100 different runs. The right plot also shows that the variance of the error is lower. This gain in accuracy comes at the expense of extra computational load. Furthermore it is noted that the problem at hand has a bimodal nature, while the UKF approximates a Gaussian distribution. The PF clearly has a better ability to cope with this bimodal nature, since it approximates the true distribution.

¹This type of resampling is most appropriate in terms of variance of the weights and


(a) True and estimated states.

(b) Root mean squared error over 100 runs.

Figure 4.2. Comparison of the EKF, UKF and PF.

4.2 Marginalization

The particle filter suffers from a high computational load to achieve the desired accuracy, especially for high-dimensional systems. In [17] a method is presented to alleviate this burden. The idea is to marginalize out the states that are linear in the dynamic models and apply a linear Kalman filter for their estimation. The nonlinear states are estimated with a particle filter. This technique is often referred to as Rao-Blackwellization, and the resulting particle filter is referred to here as the marginalized particle filter.

4.2.1 Mixed linear/nonlinear systems

When some of the variables of interest appear linearly in the system equations, the state vector can be partitioned into a linear part and a nonlinear part, yielding

\[
x = \begin{bmatrix} x^n \\ x^l \end{bmatrix} \tag{4.9}
\]

where \(x^n\) denotes the nonlinear variables and \(x^l\) denotes the linear variables. To be able to implement the MPF the system equations have to be split up into a linear part and a nonlinear part. First, recall the general nonlinear state-space equations as introduced in Chapter 2, which are repeated here for convenience²

\begin{align}
x_{k+1} &= f(x_k, u_k, k) + w_k \nonumber \\
z_k &= h(x_k, k) + v_k \tag{4.10}
\end{align}


Substituting the partitioned state vector of Eq. (4.9) yields the mixed linear/nonlinear state-space models

\begin{align}
x^n_{k+1} &= f^n\left(x^n_k, x^l_k, u_k, k\right) + w^n_k \nonumber \\
x^l_{k+1} &= f^l\left(x^n_k, x^l_k, u_k, k\right) + w^l_k \tag{4.11} \\
z_k &= h\left(x^n_k, x^l_k, k\right) + v_k \nonumber
\end{align}

By writing out the linear function evaluations as matrix operations, the following general model is derived for nonlinear systems with linear sub-structures

\begin{align}
x^n_{k+1} &= f^n_k(x^n_k, u_k) + A^n_k(x^n_k)x^l_k + G^n_k(x^n_k)w^n_k \nonumber \\
x^l_{k+1} &= f^l_k(x^n_k, u_k) + A^l_k(x^n_k)x^l_k + G^l_k(x^n_k)w^l_k \tag{4.12} \\
z_k &= h_k(x^n_k) + C_k(x^n_k)x^l_k + v_k \nonumber
\end{align}

which corresponds to the general model given in [17]. The matrices A, C and G are arbitrary. The state noise is assumed white and Gaussian distributed with

\[
w_k = \begin{bmatrix} w^n_k \\ w^l_k \end{bmatrix} \sim \mathcal{N}(0, Q_k), \qquad
Q_k = \begin{bmatrix} Q^n_k & Q^{ln}_k \\ \left(Q^{ln}_k\right)^T & Q^l_k \end{bmatrix} \tag{4.13}
\]

the measurement noise is assumed white and Gaussian distributed,

\[
v_k \sim \mathcal{N}(0, R_{meas}) \tag{4.14}
\]

and \(x^l_0\) is Gaussian,

\[
x^l_0 \sim \mathcal{N}\left(\bar{x}_0, \bar{P}_0\right) \tag{4.15}
\]

Note that in the additive noise case the term \(Q^{ln}_k\) is zero.

In two special cases the Gaussian noise assumptions can be relaxed, namely if

• the nonlinear state equation is independent of the linear states, and/or
• the measurement equation does not depend on the linear state variables.

In the first case \(A^n_k(x^n_k) = 0\) and the nonlinear state equation is solely used in the particle filter part of the algorithm, and hence the process noise for the nonlinear state variables can be arbitrarily distributed. In the second case \(C_k(x^n_k) = 0\) and the measurement noise can be arbitrarily distributed, with the same reasoning as mentioned above.

Systems can be written in the form of the general model given by Eq. (4.12), and in some cases the model can be simplified. For instance, in the often encountered case of linear state equations and a nonlinear measurement equation, Eq. (4.12) can be written as

\begin{align}
x^n_{k+1} &= A^n_{n,k}x^n_k + A^n_{l,k}x^l_k + B^n_k u_k + G^n_k w^n_k \nonumber \\
x^l_{k+1} &= A^l_{n,k}x^n_k + A^l_{l,k}x^l_k + B^l_k u_k + G^l_k w^l_k \tag{4.16} \\
z_k &= h_k(x^n_k) + v_k \nonumber
\end{align}

with \(w^n_k \sim \mathcal{N}(0, Q^n_k)\) and \(w^l_k \sim \mathcal{N}(0, Q^l_k)\). Since \(C_k(x^n_k) = 0\) here, the distribution of the measurement noise may be arbitrary.



4.2.2 Computational complexity

In [18] an analysis of the computational complexity of the marginalized particle filter is performed. One might be tempted to simply marginalize out all the linear state variables; actually this is not as trivial as it may seem. It has been found that, for the case of linear state equations and nonlinear measurements, the computational complexity is always reduced when the MPF is used instead of the PF, while at the same time the estimates are improved. But for the case of linear state equations and mixed nonlinear/linear measurements the computational complexity is not always reduced by using the MPF. For systems with a high state dimension and many marginalized states the standard PF becomes more efficient than the MPF. In this case, if computational load is an issue, one can choose to increase the number of states estimated by the PF. The computational complexity for other systems can be analysed using the approach proposed in [18].

4.3 Filter equations

Again the aim is to recursively determine the posterior density p(x|z), but without spending many particles on variables that appear linearly in the equations. Not only will this reduce the computational load, but since the Kalman filter is an optimal solution and the PF merely an approximation to the optimal solution, the estimates can also be expected to improve. The linear state variables can analytically be marginalized out by [17]

\[
p(x_k|\mathbf{Z}_k) = p\left(x^l_k, \mathbf{X}^n_k \,|\, \mathbf{Z}_k\right) = \underbrace{p\left(x^l_k \,|\, \mathbf{X}^n_k, \mathbf{Z}_k\right)}_{KF}\; \underbrace{p\left(\mathbf{X}^n_k \,|\, \mathbf{Z}_k\right)}_{PF} \tag{4.17}
\]

with \(\mathbf{X}^n_k = \{x^n_i\}_{i=0}^{k}\) and \(\mathbf{Z}_k = \{z_i\}_{i=0}^{k}\), and where it is indicated whether the corresponding density is determined by the linear Kalman filter (KF) or the particle filter (PF). The marginalized particle filter is simply an extension of the standard particle filter as given in Algorithm 4.1; only the prediction step is changed to incorporate the Kalman filter equations. The marginalized particle filter equations presented here apply to the general model given by Eq. (4.12). The equations are summed up in Algorithm C.5.

The MPF is initialized by choosing the particles according to the nonlinear variables and setting the linear state variables as follows

\begin{align}
x^{n,(i)}_{0|-1} &\sim p_{x^n_0}\left(x^n_0\right) \tag{4.18} \\
\left\{x^{l,(i)}_{0|-1},\; P^{(i)}_{0|-1}\right\} &= \left\{\bar{x}^l_0,\; \bar{P}_0\right\} \tag{4.19}
\end{align}

after which steps 2 to 4 of the basic particle filter can be performed with the set \(x^{n,(i)}_{0|-1}\) instead of samples of the whole state dimension. Note that in the resampling step both linear and nonlinear variables are resampled using the same weights. Equation (4.19) points out what is actually happening in the MPF: one Kalman filter is associated with each particle. Note that when \(C_k(x^n_k) = 0\) the covariance is the same for every particle, so only one covariance matrix has to be initialized (in contrast to the particles for the state variables).

From here it is assumed that step 4 of the basic particle filter has just been executed, so a set of resampled particles \(x^{n,(i)}_{selected}\) is available. Instead of the normal time update, a PF time update and the Kalman filter equations will be performed. First a Kalman filter measurement update is executed

\begin{align}
S_k &= C_k P_{k|k-1} C^T_k + R_{meas,k} \tag{4.20} \\
K_k &= P_{k|k-1} C^T_k S^{-1}_k \tag{4.21} \\
\hat{x}^l_{k|k} &= \hat{x}^l_{k|k-1} + K_k\left(z_k - h_k(x^n_{selected}) - C_k\hat{x}^l_{k|k-1}\right) \tag{4.22} \\
P_{k|k} &= P_{k|k-1} - K_k S_k K^T_k \tag{4.23}
\end{align}

where the particle indices are left out for notational clarity. These matrix operations have to be calculated for every Kalman filter associated with a particle. The estimates, as expected means of the linear state variables and their covariances, are now available as follows

\begin{align}
\hat{x}^l_{k|k} &= \sum_{i=1}^{K} w^{(i)}_{norm}\,\hat{x}^{l,(i)}_{k|k} \tag{4.24} \\
\hat{P}_{k|k} &= \sum_{i=1}^{K} w^{(i)}_{norm}\left[P^{(i)}_{k|k} + \left(\hat{x}^{l,(i)}_{k|k} - \hat{x}^l_{k|k}\right)\left(\hat{x}^{l,(i)}_{k|k} - \hat{x}^l_{k|k}\right)^T\right] \tag{4.25}
\end{align}

where \(w^{(i)}_{norm}\) are the normalized importance weights given by step 2 in Algorithm 4.1 on page 31.
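Eqs. (4.24)–(4.25) amount to the mean and covariance of a mixture of the per-particle Kalman filters; a minimal sketch (function name illustrative):

```python
import numpy as np

def mpf_estimate(w_norm, xl, P):
    """Combine per-particle Kalman means xl (K x n) and covariances P
    (K x n x n) with normalized weights, Eqs. (4.24)-(4.25)."""
    w_norm = np.asarray(w_norm, dtype=float)
    xhat = w_norm @ xl                                        # (4.24)
    d = xl - xhat
    Phat = (np.einsum('i,ijk->jk', w_norm, P)
            + np.einsum('i,ij,ik->jk', w_norm, d, d))         # (4.25)
    return xhat, Phat
```

The second term in (4.25) is the spread-of-the-means contribution; without it the mixture covariance would be underestimated.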

Next a particle filter time update is performed: for i = 1, . . . , K a new set of samples is predicted according to

\[
x^{n,(i)}_{k+1|k} \sim p\left(x^n_{k+1} \,|\, x^{n,(i)}_{selected}, z_k\right) \tag{4.26}
\]

Finally a Kalman filter time update is performed,

\begin{align}
N_k &= A^n_k P_{k|k}\left(A^n_k\right)^T + G^n_k Q^n_k\left(G^n_k\right)^T \tag{4.27} \\
L_k &= \bar{A}^l_k P_{k|k}\left(A^n_k\right)^T N^{-1}_k \tag{4.28} \\
\hat{x}^{l,(i)}_{k+1|k} &= \bar{A}^l_k\hat{x}^{l,(i)}_{k|k} + G^l_k\left(Q^{ln}_k\right)^T\left(G^n_k Q^n_k\right)^{-1}\tilde{z}^{(i)}_k + f^l_k(x^{n,(i)}_{selected}) + L_k\left(\tilde{z}^{(i)}_k - A^n_k\hat{x}^{l,(i)}_{k|k}\right) \tag{4.29} \\
P_{k+1|k} &= \bar{A}^l_k P_{k|k}\left(\bar{A}^l_k\right)^T + G^l_k\bar{Q}^l_k\left(G^l_k\right)^T - L_k N_k L^T_k \tag{4.30}
\end{align}

with

\begin{align}
\tilde{z}^{(i)}_k &= x^{n,(i)}_{k+1|k} - f^n_k(x^{n,(i)}_{selected}) \tag{4.31} \\
\bar{A}^l_k &= A^l_k - G^l_k\left(Q^{ln}_k\right)^T\left(G^n_k Q^n_k\right)^{-1}A^n_k \tag{4.32} \\
\bar{Q}^l_k &= Q^l_k - \left(Q^{ln}_k\right)^T\left(Q^n_k\right)^{-1}Q^{ln}_k \tag{4.33}
\end{align}



where it is noted that if the linear and nonlinear process noises \(w^l_k\) and \(w^n_k\) are independent, then \(Q^{ln}_k = 0\) and hence \(\bar{A}^l_k = A^l_k\) and \(\bar{Q}^l_k = Q^l_k\).

Algorithm 4.1 Basic particle filter

For k ∈ {1, . . . , ∞},

1. Initialize the particles

\[
\left\{x^{(i)}_{0|-1}\right\}_{i=1}^{K} \sim p(x_0) \quad \text{and set } k = 0 \tag{4.34}
\]

2. Measurement update using importance sampling

\begin{align}
w^{(i)} &= p\left(z_k|x^{(i)}\right) \tag{4.35} \\
w^{(i)}_{norm} &= w^{(i)} \Big/ \sum_{i=1}^{K} w^{(i)} \tag{4.36}
\end{align}

3. Estimate

\[
E\left[g(x)|z\right] \cong \sum_{i=1}^{K} w^{(i)}_{norm}\, g\left(x^{(i)}\right) \tag{4.37}
\]

4. Systematic resampling

• Generate K ordered numbers according to

\[
u_m = \frac{(m-1) + \tilde{u}}{K}, \qquad \tilde{u} \sim U(0, 1) \tag{4.38}
\]

• The resampled particles are obtained by producing \(n_i\) copies of particle \(x^i\) to yield \(x^i_{selected}\), where

\[
n_i = \text{the number of } u_m \in \left(\sum_{s=1}^{i-1} w^{(s)}_{norm,k},\; \sum_{s=1}^{i} w^{(s)}_{norm,k}\right] \tag{4.39}
\]

5. Time update, predict new particles

\[
x^{(i)}_{k+1|k} \sim p\left(x_{k+1|k}|x^i_{selected}\right), \qquad i = 1, \ldots, K \tag{4.40}
\]
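The systematic resampling step (Eqs. (4.38)–(4.39)) can be sketched in a few lines (illustrative; `np.searchsorted` against the cumulative weights implements the interval membership of Eq. (4.39)):

```python
import numpy as np

def systematic_resample(w_norm, rng):
    """Systematic resampling, Eqs. (4.38)-(4.39): one uniform draw gives K
    ordered positions; returns the index of the particle each u_m falls into."""
    K = len(w_norm)
    u = (np.arange(K) + rng.uniform()) / K      # (4.38)
    cum = np.cumsum(w_norm)
    return np.searchsorted(cum, u)              # (4.39): interval membership
```

Because a single uniform draw positions all K points, the resampled counts can differ from their expected values by at most one, which is why this scheme has low weight variance.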


Chapter 5

System Models and Filter Equations

In this chapter the system models will be derived. With the process model a prediction can be made of the state at the next timestep. With the observation models a prediction of the measurement at the current timestep can be made. Combining this prediction with the actual measurement yields a correction of the predicted state. The more accurately the models represent the real dynamics and measurements of the system, the higher the accuracy of the estimates can be expected to be. As mentioned in the introduction, the complete system fuses information given by an IMU and a camera. Measurement models are given in Section 5.3. In Section 5.2 the process model will be derived. First the different coordinate systems are defined.

5.1 Coordinate systems

To define the positions and rotations of the camera, four coordinate systems have to be taken into consideration: one for the camera, one for the sensor, one coordinate system at a certain distance from the optical centre of the camera, and one fixed system. Quite obviously the fixed system is chosen to be fixed to the earth¹ and is indicated by w. The origins of the moving systems for the camera (c) and the sensor (s) are positioned in the optical centre of the camera and at a fixed distance from the camera respectively (see Fig. 5.1). The coordinate system for the image frame (i) is located at a distance from the optical centre of the camera (the so-called focal length). This way the coordinates of the camera and the sensor can be expressed in the fixed frame. The sensor is rigidly attached to the camera and so it moves together with the camera. The same goes for the image frame.

The state vector is chosen somewhat differently than in [1], where the inertial quantities were treated as input signals. Although the dimension of the state vector increases by treating the inertial quantities as measurement signals (from 10 to 22), it also has certain advantages, such as the ability to predict the angular velocity and accelerations. Furthermore the transition from continuous- to discrete-time models is simplified. The state vector also incorporates the offsets of the inertial sensors.

¹Thereby ignoring the earth's rotation.

Figure 5.1. The coordinate systems with their unit vectors

5.2 Process model

In this report the state vector is defined by the following quantities

\[
x = \left[s^w, \dot{s}^w, \ddot{s}^w, q^{sw}, \omega^s_{sw}, b^s_a, b^s_\omega\right]^T \tag{5.1}
\]

with \(s^w\), \(\dot{s}^w\) and \(\ddot{s}^w\) the position, velocity and acceleration of the sensor expressed in world coordinates. \(q^{sw}\) is the quaternion² describing the rotation from the world frame to the sensor frame and \(\omega^s_{sw}\) is the angular velocity. Furthermore the state vector incorporates the accelerometer offset \(b^s_a\) and the gyroscope offset \(b^s_\omega\).

There is no knowledge available about the jerk, the angular acceleration and the time derivatives of the offsets. Therefore they are considered zero, with the uncertainty expressed by the added noises. The time derivative of the rotation quaternion follows from [1]. The differential equations of the process can now be described by

\[
\dot{x} \equiv \frac{\partial}{\partial t}
\begin{bmatrix} s^w \\ \dot{s}^w \\ \ddot{s}^w \\ q^{sw} \\ \omega^s_{sw} \\ b^s_a \\ b^s_\omega \end{bmatrix}
= f(x) + w =
\begin{bmatrix} \dot{s}^w \\ \ddot{s}^w \\ 0 \\ -\frac{1}{2}\omega^s_{sw} \odot q^{sw} \\ 0 \\ 0 \\ 0 \end{bmatrix}
+
\begin{bmatrix} 0 \\ 0 \\ w_a \\ 0 \\ w_\omega \\ w_{b,a} \\ w_{b,\omega} \end{bmatrix} \tag{5.2}
\]



where \odot indicates a quaternion multiplication. Integrating this continuous-time model with respect to time yields the discrete-time model:

x_{t+T_s} \equiv
\begin{bmatrix} s^w \\ \dot{s}^w \\ \ddot{s}^w \\ q^{sw} \\ \omega^s_{sw} \\ b^s_a \\ b^s_\omega \end{bmatrix}_{t+T_s}
=
\begin{bmatrix} s^w + T_s \dot{s}^w + \frac{1}{2}T_s^2 \ddot{s}^w \\ \dot{s}^w + T_s \ddot{s}^w \\ \ddot{s}^w \\ \Delta q \odot q^{sw} \\ \omega^s_{sw} \\ b^s_a \\ b^s_\omega \end{bmatrix}_t
+
\begin{bmatrix} 0 \\ 0 \\ w_a \\ 0 \\ w_\omega \\ w_{b,a} \\ w_{b,\omega} \end{bmatrix}_t \quad (5.3)

where

\Delta q = \left( \cos\left\| \tfrac{T_s}{2}\,\omega^s_{sw} \right\|,\;
-\frac{\tfrac{T_s}{2}\,\omega^s_{sw}}{\left\| \tfrac{T_s}{2}\,\omega^s_{sw} \right\|}
\sin\left\| \tfrac{T_s}{2}\,\omega^s_{sw} \right\| \right) \quad (5.4)

with sample time T_s and w \sim \mathcal{N}(0, Q). Notice the difference with Eq. (2.5), since u \in \emptyset.
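The prediction step of Eqs. (5.3)–(5.4) can be sketched in code. The following is a minimal Python illustration (the report itself uses Matlab); the state layout and helper names are chosen here for illustration only:

```python
import numpy as np

def quat_mult(p, q):
    """Quaternion product p (x) q, scalar part first."""
    p0, pv = p[0], p[1:]
    q0, qv = q[0], q[1:]
    return np.concatenate((
        [p0 * q0 - pv @ qv],
        p0 * qv + q0 * pv + np.cross(pv, qv)))

def delta_q(omega, Ts):
    """Incremental rotation quaternion of Eq. (5.4)."""
    v = -0.5 * Ts * omega
    n = np.linalg.norm(v)
    if n < 1e-12:                       # avoid division by zero
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(n)], np.sin(n) * v / n))

def predict(x, Ts):
    """One noise-free prediction step of Eq. (5.3).

    Assumed state layout (22 elements): s (0:3), s_dot (3:6),
    s_ddot (6:9), q_sw (9:13), omega (13:16), b_a (16:19), b_w (19:22).
    """
    s, sd, sdd = x[0:3], x[3:6], x[6:9]
    q, w = x[9:13], x[13:16]
    xn = x.copy()
    xn[0:3] = s + Ts * sd + 0.5 * Ts**2 * sdd   # constant-acceleration motion
    xn[3:6] = sd + Ts * sdd
    xn[9:13] = quat_mult(delta_q(w, Ts), q)     # quaternion increment
    return xn                                    # remaining states unchanged
```

Note that acceleration, angular velocity and the sensor offsets are propagated unchanged, exactly as in Eq. (5.3); in the filter their uncertainty grows through the process noise covariance Q.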

5.3 Measurement models

5.3.1 Inertial Measurement Unit

The IMU returns acceleration and angular velocity data. Gravity is inherently measured by the IMU, and the observation model has to account for that. Apart from that, the IMU has a certain offset in its measurements, which also has to be accounted for. The noises are assumed to be additive and Gaussian.

The equations describing the measurements by the gyroscope and the accelerometer can be characterised by (see Eq. (2.5))

z_\omega = \omega^s_{sw} + b^s_\omega + v_\omega \quad (5.5)

z_a = q^{sw} \odot (\ddot{s}^w - g^w) \odot q^{ws} + b^s_a + v_a \quad (5.6)

with g^w = [0, 0, -g]^T the gravity vector. To be able to implement Eq. (5.6) in Matlab, Eq. (A.7) is used:

z_a = q^{sw} \odot (0, a) \odot q^{ws} + b^s_a + v_a
    = (q_0^2 - q \cdot q)\,a + 2\,(a \cdot q)\,q + 2 q_0\,(q \times a) + b^s_a + v_a \quad (5.7)

where a = \ddot{s}^w - g^w and the quaternion has been split into a scalar part (q_0) and a vector part (q). A '·' denotes an inner product and a '×' denotes a cross product.
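The scalar/vector form of Eq. (5.7) translates directly into code. The following is a Python sketch (rather than the Matlab used in the report) of the noise-free accelerometer model; the function name and interface are illustrative:

```python
import numpy as np

def accel_measurement(q_sw, s_ddot, b_a, g=9.81):
    """Predicted accelerometer reading from Eq. (5.7), noise term omitted.

    q_sw is the unit quaternion (scalar part first) rotating the
    world frame into the sensor frame.
    """
    q_sw = np.asarray(q_sw, dtype=float)
    s_ddot = np.asarray(s_ddot, dtype=float)
    b_a = np.asarray(b_a, dtype=float)
    gw = np.array([0.0, 0.0, -g])            # gravity vector g^w
    a = s_ddot - gw                          # vector part of (0, a)
    q0, q = q_sw[0], q_sw[1:]
    rotated = ((q0**2 - q @ q) * a           # (q0^2 - q.q) a
               + 2.0 * (a @ q) * q           # 2 (a.q) q
               + 2.0 * q0 * np.cross(q, a))  # 2 q0 (q x a)
    return rotated + b_a
```

For a sensor at rest with the sensor frame aligned with the world frame, this returns approximately [0, 0, g] plus the bias, i.e. the accelerometer measures the reaction to gravity.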

5.3.2 Vision system

The projection of a scene onto the image plane of the camera is modelled by the pinhole camera model. This yields the following observation model [1]
