
Institutionen för systemteknik

Department of Electrical Engineering

Examensarbete

Estimation and Adaptive Smoothing of Camera Orientations for Video Stabilization and Rolling Shutter Correction

Master's thesis carried out in Automatic Control at the Institute of Technology at Linköping University

by

Nicklas Forslöw

LiTH-ISY-EX--11/4474--SE

Linköping 2011

Department of Electrical Engineering, Linköpings universitet

Estimation and Adaptive Smoothing of Camera Orientations for Video Stabilization and Rolling Shutter Correction

Master's thesis carried out in Automatic Control at the Institute of Technology at Linköping University

by

Nicklas Forslöw

LiTH-ISY-EX--11/4474--SE

Supervisor: Jonas Callmer, ISY, Linköpings universitet
Examiner: David Törnqvist, ISY, Linköpings universitet

Linköping, 21 June 2011


Avdelning, Institution

Division, Department

Computer Vision Laboratory Department of Electrical Engineering Linköpings universitet

SE-581 83 Linköping, Sweden

Datum / Date: 2011-06-21
Språk / Language: Engelska / English
Rapporttyp / Report category: Examensarbete

URL för elektronisk version: http://www.cvl.isy.liu.se, http://www.ep.liu.se
ISRN: LiTH-ISY-EX--11/4474--SE

Titel / Title

Estimering och adaptiv glättning av kameraorienteringar för videostabilisering och korrektion av bilddistorsion orsakad av kamera med rullande slutare

Estimation and Adaptive Smoothing of Camera Orientations for Video Stabilization and Rolling Shutter Correction

Författare / Author

Nicklas Forslöw



Abstract

Most mobile video-recording devices of today, e.g. cell phones and music players, make use of a Rolling Shutter camera. The camera captures video by recording every frame line by line from top to bottom, leading to image distortion when either the target or the camera is moving. Capturing video by hand also leads to visible frame-to-frame jitter.

This thesis presents algorithms for the estimation of camera orientations using an accelerometer and a gyroscope. These estimates can be used to reduce the image distortion caused by camera motion using image processing. In addition, an adaptive low-pass filtering algorithm used to produce a smooth camera motion is presented. Using the smooth motion, the frame-to-frame jitter can be reduced.

The algorithms are implemented on the iPod 4 and two output videos are evaluated in a blind experiment with 30 participants. Here, videos are compared to those of competing video stabilization software. The results indicate that the iPod 4 application performs as well as or better than its competitors. In addition, the iPod 4 accelerometer and gyroscope are compared to high-end reference sensors in terms of bias and variance.

Sammanfattning

It is common for today's mobile phones to use a camera with a rolling shutter for video recording. These cameras capture each image row by row from top to bottom, which leads to image distortion when either the target or the camera is moving. Recorded videos are also perceived as shaky, since the camera is not held still during recording.

In this thesis, algorithms for estimating the orientation of the camera using an accelerometer and a gyroscope are presented. The estimates are then used to reduce the image distortion that arises when the camera moves. An adaptive algorithm for low-pass filtering of the camera motion is presented, and the smooth motion it produces is then used to reduce the shakiness of the videos.

The algorithms are implemented on the iPod 4 and the resulting videos are evaluated in a blind test with 30 participants. The videos are compared with versions stabilized by competing software. The results show that the iPod application produces results as good as or better than the competition. The accelerometer and gyroscope of the iPod 4 are also compared with high-end reference sensors.


Acknowledgments

Writing this thesis has had its ups and downs, but there are a lot of people who have brightened the experience. My examiner David Törnqvist and supervisor Jonas Callmer have provided invaluable help during this work. Without your guidance it would have taken twice the time.

Thanks also go out to Per-Erik Forssén and Erik Ringaby for your constant belief in the success of this work. Discussing problems and ideas with you has led to many improvements, especially on the iPod application.

My friend, fellow student and co-worker Gustav Hanning, with whom I have exchanged views on most topics imaginable: whenever I hit a dead end, consulting with you usually points me in the right direction.

Last but not least I would like to thank my family and close friends for the support you have provided, not only during this master's thesis but throughout my entire education at LiU.


Contents

1 Introduction
   1.1 Background
   1.2 Problem Formulation
   1.3 Rectification and Stabilization Basics
   1.4 Purpose and Goal
   1.5 Limitations
   1.6 Related Work
   1.7 Report Outline

2 State Space Modeling
   2.1 Coordinate Systems
   2.2 Measurements
   2.3 Time-Continuous Model
      2.3.1 Motion Model
      2.3.2 Measurement Model
   2.4 Discrete-Time Motion Model
   2.5 Linearized Measurement Model

3 Filtering
   3.1 Orientation Estimation
      3.1.1 EKF
      3.1.2 EKF Initialization
      3.1.3 EKF Smoother
   3.2 LP-Filtering
      3.2.1 Windowing
      3.2.2 CUSUM LS Filter
      3.2.3 LP-Filtering On CUSUM LS Segments

4 Rectification And Stabilization
   4.1 Image Rectification
   4.2 Video Stabilization

5 iPod 4 Accelerometer And Gyroscope Evaluation
   5.1 IMU
   5.2 Accelerometer
      5.2.1 Bias and Scaling Estimation
      5.2.2 Variance Estimation
   5.3 Gyroscope
      5.3.1 Bias And Variance Estimation
   5.4 Summary

6 Filtering Evaluation
   6.1 Orientation Estimation
   6.2 LP-Filtering
      6.2.1 CUSUM LS Filter
      6.2.2 LP Filtering On CUSUM Segments

7 iPod/iPhone 4 Application
   7.1 Overview
   7.2 Rectification And Stabilization
   7.3 User Study
      7.3.1 Movie Stiller
      7.3.2 iMovie '11
      7.3.3 Deshaker
      7.3.4 iPod/iPhone 4 Application
      7.3.5 Method
      7.3.6 Results
      7.3.7 Discussion

8 Conclusions and Future Work
   8.1 Conclusions
   8.2 Future Work

Bibliography

A Quaternions
   A.1 Unit Quaternion Properties


Chapter 1

Introduction

In this chapter the background of the thesis is presented. The problems that the thesis aims to solve are formulated together with the purpose and goal. A brief discussion of the limitations is given, as well as a summary of related research in the area. Lastly, the chapter ends with an outline of the report.

1.1 Background

Most mobile video-recording devices of today, e.g. cell phones and music players, make use of a Rolling Shutter (RS) camera. An RS camera captures video by recording every frame line-by-line from top to bottom of the image. This is in contrast to a Global Shutter camera where each video frame is captured in its entirety at once.

The RS technique gives rise to image distortion in situations where either the device or the target is moving. Since the device is almost always in motion while capturing video, there is a need to correct this type of distortion. Recently it has become common to equip handheld devices with sensors such as accelerometers and gyroscopes to detect orientation and motion in order to enhance the user experience. This information can also be used to correct RS distortion generated by motion of the device, via image processing of a recorded video. Figure 1.1 displays an RS distorted image captured while the recording device was in motion.

Capturing video by hand also leads to visible frame-to-frame jitter. The recorded video is perceived as "shaky" and is not enjoyable to watch. By calculating a smooth motion close to the actual motion of the device and employing image processing, much of this jitter can be reduced.


Figure 1.1. RS distorted image captured by an iPod 4 when the device was moving from left to right. Since the RS camera records the frame line by line from top to bottom, the flagpoles have a "slanting" look.

1.2 Problem Formulation

The procedure of correcting video distortions caused by motion of an RS camera device that includes a gyroscope and an accelerometer can be divided into the following main steps.

I. Simultaneously capture video and collect accelerometer and gyroscope data.

II. Estimate the orientation of the device using accelerometer and gyroscope data.

III. Manipulate the video frames according to the estimated motion of the device.

In the first step, data from three different sources are collected. Due to the low processing power of today's handheld devices it is not always possible to simultaneously log data from all sources at the highest available frequency without risking sudden drops in frame rate. Therefore the objective is to log sensor data at as high a frequency as possible while keeping a constant frame rate.

Step two aims at producing the best possible estimates of the orientation of the device using the available sensor data. By deriving mathematical models for the sensors and the rotational motion of the device one can employ sensor fusion algorithms to compute these estimates.

The third and last step utilizes the estimates to perform image processing on each frame of the video in order to correct the distortions.

In addition to correcting the RS distortion, the recorded video is also to be stabilized. By applying Low-Pass (LP) filtering on the estimated rotational motion, a new smooth motion can be computed. Using this motion as a reference, each video frame can be processed to achieve a stable output video.

This thesis will focus on step II as well as the additional LP-filtering, leaving the image processing and data collection steps to a parallel thesis [1] by Gustav Hanning.

1.3 Rectification and Stabilization Basics

To better understand the basic principles behind image rectification and stabilization we look at two examples. For simplicity we restrict the camera to movement in a plane, only allowing translations along the horizontal and vertical axes.

First the problem of correcting Rolling Shutter distortion, i.e. image rectification, is addressed. The distortion arises when a target is captured during a camera motion, since the frame is recorded line by line from top to bottom. To the left in Figure 1.2 we see a target, illustrated by dark gray squares in a 16x12 image, captured when the camera was still. The middle illustration displays the target captured during a camera translation to the left, leading to RS distortion. Now, if we know the relative camera position from row one to an arbitrary row in the image, it is possible to move the pixels to where they would have been had the whole image been captured at once. A rectified version of the middle illustration is seen to the right. Since rows 7-12 are moved to the left, we lose image information in the pixels, colored black, in the bottom right corner. This information loss will be referred to as "black borders" throughout this thesis.

Figure 1.2. Illustration of the principle behind RS correction. To the left we see a target (gray) captured when the camera was still. In the middle the same target is captured, but during a camera motion to the left. To the right a rectified version of the middle image is displayed. Since rows 7-12 are moved to the left, we lose image information in the pixels, colored black, in the bottom right corner.
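The row-by-row correction described above can be sketched in code. The following is a minimal Python illustration, not the thesis implementation: the image is a list of pixel rows, the per-row camera displacement is assumed to be known, and each row is shifted back, leaving black fill where no image information exists.

```python
def rectify_rows(image, row_shifts, black=0):
    """Undo rolling-shutter skew by shifting each row back by the camera
    displacement accumulated when that row was read out.

    image      -- list of rows (lists of pixel values)
    row_shifts -- horizontal camera displacement (in pixels) of each row
                  relative to row 0
    black      -- fill value for pixels with no image information
    """
    rectified = []
    for row, shift in zip(image, row_shifts):
        width = len(row)
        out = [black] * width
        for x, pixel in enumerate(row):
            x_new = x - shift  # move the pixel to where it "should" have been
            if 0 <= x_new < width:
                out[x_new] = pixel
        rectified.append(out)
    return rectified
```

Shifting the lower rows leaves filled-in black pixels at the row edges, which is exactly the "black borders" effect described above.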


Next the problem of video stabilization is addressed. A handheld camera is always subject to small movements. This leads to frame-to-frame jitter, or a "shaky" video. Suppose that a person tries to hold the camera still while recording the target to the left in Figure 1.2. Because the camera is not always still, there is a possibility that every now and then one of the top images in Figure 1.3 is captured. If the camera trajectory is known, one can apply an LP-filter to receive a smooth trajectory. Then the difference between the two trajectories can be used to move the pixels of the individual frames to get a stable output video. Suppose that this smooth trajectory corresponds to an entirely still camera, where the captured images always have the target in the center. The bottom row of Figure 1.3 shows how all the pixels of an image are moved when the stabilization is applied. As with the RS correction, this leads to "black borders" at the edges of the frames.

Figure 1.3. Illustration of the principle behind video stabilization. If the camera is entirely still during the capture of a video, the target (gray) should be in the center, as in Figure 1.2. Due to small camera movements, the top images are sometimes captured. Knowing the smooth camera trajectory, in this example a camera without motion, all the pixels of an image can be moved, as illustrated by the bottom images, resulting in a video where the target is always in the center. The downside of the stabilization is the resulting black pixels where no image information is present.

1.4 Purpose and Goal

The purpose of the thesis is to implement and evaluate a video stabilization algorithm and a rolling shutter correction algorithm on a mobile device, using gyroscope and accelerometer data to track the orientation of the device. The built-in sensors will also be compared to external reference sensors. The device used in the thesis is mainly the iPod 4, although some testing is also done on the similar unit, the iPhone 4.

The goal is to create an iPhone/iPod application that decreases the video distortion caused by motion of the device and its rolling shutter camera. The generated sequences should match those from commercial products such as iMovie '11 [2]. That program only uses information extracted from the video for correction, whereas this project also utilizes the gyroscope and accelerometer. The result of this thesis should also be on par with applications such as Movie Stiller [3], which stabilizes videos captured by the iPhone 3GS, iPhone 4 and iPod 4.

1.5 Limitations

This thesis will focus on correcting RS distortion generated by motion of the device rather than motion of the target. In order to correct distortions caused by moving targets, additional tracking algorithms applied to the video sequences would be necessary.

This thesis will focus on an offline approach to video rectification and stabilization, i.e. gyroscope, accelerometer and video data will first be simultaneously recorded and stored for later post-processing. This makes the algorithms presented in this thesis applicable to most video recording devices equipped with an accelerometer and a gyroscope. Due to the low processing power of most handheld devices, an online approach would be very difficult to carry out without reducing the frame rate of the processed video.

As mentioned in Section 1.2, the collection of sensor data and video capturing, along with the image processing, will not be covered thoroughly in this thesis. Detailed explanations of these issues are available in [1].

1.6 Related Work

David Törnqvist [4] derives a state space model of a rigid body system including measurements from an Inertial Measurement Unit (IMU). Here, quaternions are used to describe the orientation of the body relative to a global coordinate system. In addition several filtering algorithms are presented. A simplified version of this model will be used in this thesis for orientation estimation.

Forssén and Ringaby [5] present a method for rectifying video sequences, i.e. correcting image distortion caused by RS cameras. They model the distortions as being caused by the 3D motion of the camera and use a KLT tracker to estimate this motion. The rectification algorithm presented in this thesis is based on this work.


Joshi et al. [6] present a model to recover the camera rotation and translation by integrating the measured acceleration and angular velocities of a camera platform using accelerometers and gyroscopes. This information is then used to remove motion blur from images.

1.7 Report Outline

Chapter 2 presents both a time-continuous and a time-discrete model of the rotational motion of a rigid body equipped with accelerometer and gyroscope.

Chapter 3 introduces the algorithms used for orientation estimation along with the additional LP-filtering.

Chapter 4 deals with how the orientation estimates from Chapter 3 can be used to rectify and stabilize a video.

Chapter 5 contains an evaluation of the iPod 4 gyroscope and accelerometer. Here, high-end Xsens sensors are used as reference.

Chapter 6 includes an evaluation of the algorithms used for orientation estimation. In addition, an example of how the LP-filtering algorithm processes real data is presented.

Chapter 7 showcases the iPod/iPhone application created during this thesis. It also includes a user study comparing output videos of the application to those of other competing software.

Chapter 8 rounds off the thesis with conclusions and a brief discussion regarding future work.


Chapter 2

State Space Modeling

In this chapter both time-discrete and time-continuous state space models of the motion of the device are derived. We start off by introducing the two coordinate systems and the measurements used in the models. Then the time-continuous model is derived, followed by the time-discrete one, which will later be used to estimate the orientation of the device.

2.1 Coordinate Systems

To describe the orientation of a mobile device, two coordinate systems are used. The origins of both systems are located at the center of the device. The first system is fixed to the world, having its Z-axis pointing up, and will serve as a reference for the second, local system following the device. The coordinate systems, illustrated by Figure 2.1, will from here on be referred to as the F-system and the D-system. To describe the orientation of the D-system with respect to the F-system, a unit quaternion $q = (q_0, q_1, q_2, q_3)^T$ is used. For more information on quaternions, see Appendix A.

2.2 Measurements

If the orientation of the device is to be estimated, measurements of the current state of the system are needed. Assuming that the built-in sensors are an accelerometer and a gyroscope, it is possible to measure the acceleration $a = (a_x, a_y, a_z)^T$ and the angular velocity $\omega = (\omega_x, \omega_y, \omega_z)^T$. Note that the measurements are given in the local D-system following the device.


Figure 2.1. Coordinate systems: the F-system XYZ and the D-system xyz.

2.3 Time-Continuous Model

The time-continuous model has linear dynamics and a nonlinear measurement equation, and can be written on the form

$$\dot{q}_t = A(\cdot)q_t + B(\cdot)v_t$$
$$y_t = h(q_t) + e_t$$

The model is a down-scaled version of the one found in [4], which also includes the global position as a state and incorporates magnetometer measurements into the measurement model.

2.3.1 Motion Model

When modeling the motion of the device we choose to only consider the orientation. This approach is based on the results of [5], where a pure rotation model yielded the best result for rolling-shutter rectification. Since the orientation of the device is described by a unit quaternion, $q_t$ is modeled as a state. The derivative of the unit quaternion, derived in Section A.2, is given by

$$\dot{q}_t = A(\omega_t)q_t = \frac{1}{2}\begin{pmatrix} 0 & -\omega_{x,t} & -\omega_{y,t} & -\omega_{z,t} \\ \omega_{x,t} & 0 & \omega_{z,t} & -\omega_{y,t} \\ \omega_{y,t} & -\omega_{z,t} & 0 & \omega_{x,t} \\ \omega_{z,t} & \omega_{y,t} & -\omega_{x,t} & 0 \end{pmatrix} q_t. \quad (2.1)$$

Instead of modeling $\omega_t$ as a state and including the gyroscope measurements in the measurement model, the measured angular velocity is used directly in the motion model. Since the angular velocity is measured, and thus distorted by noise, this is not mathematically correct. Instead we introduce the measurement noise $v_t \sim N(0, Q)$ and model the angular velocity as $\omega_t + v_t$. Inserted into (2.1) we get

$$\dot{q}_t \approx A(\omega_t + v_t)q_t. \quad (2.2)$$

Using (A.16), (2.2) can be rewritten as

$$\dot{q}_t = A(\omega_t)q_t + B(q_t)v_t = \frac{1}{2}\begin{pmatrix} 0 & -\omega_{x,t} & -\omega_{y,t} & -\omega_{z,t} \\ \omega_{x,t} & 0 & \omega_{z,t} & -\omega_{y,t} \\ \omega_{y,t} & -\omega_{z,t} & 0 & \omega_{x,t} \\ \omega_{z,t} & \omega_{y,t} & -\omega_{x,t} & 0 \end{pmatrix} q_t + \frac{1}{2}\begin{pmatrix} -q_{1,t} & -q_{2,t} & -q_{3,t} \\ q_{0,t} & -q_{3,t} & q_{2,t} \\ q_{3,t} & q_{0,t} & -q_{1,t} \\ -q_{2,t} & q_{1,t} & q_{0,t} \end{pmatrix} v_t, \quad (2.3)$$

forming the time-continuous motion model.
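The matrices in (2.1) and (2.3) can be checked numerically against quaternion multiplication: $A(\omega)q$ equals $\tfrac{1}{2}\,q \otimes (0, \omega)$, and $A(\omega + v)q = A(\omega)q + B(q)v$. The following Python sketch (an illustration, not the thesis implementation) writes out both matrix products explicitly.

```python
def quat_mult(p, q):
    """Hamilton product p (x) q for quaternions (q0, q1, q2, q3)."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return [p0*q0 - p1*q1 - p2*q2 - p3*q3,
            p0*q1 + p1*q0 + p2*q3 - p3*q2,
            p0*q2 - p1*q3 + p2*q0 + p3*q1,
            p0*q3 + p1*q2 - p2*q1 + p3*q0]

def A_times_q(w, q):
    """A(omega) q from equation (2.1), including the factor 1/2."""
    wx, wy, wz = w
    q0, q1, q2, q3 = q
    return [0.5 * (-wx*q1 - wy*q2 - wz*q3),
            0.5 * ( wx*q0 + wz*q2 - wy*q3),
            0.5 * ( wy*q0 - wz*q1 + wx*q3),
            0.5 * ( wz*q0 + wy*q1 - wx*q2)]

def B_times_v(q, v):
    """B(q) v from equation (2.3), including the factor 1/2."""
    q0, q1, q2, q3 = q
    vx, vy, vz = v
    return [0.5 * (-q1*vx - q2*vy - q3*vz),
            0.5 * ( q0*vx - q3*vy + q2*vz),
            0.5 * ( q3*vx + q0*vy - q1*vz),
            0.5 * (-q2*vx + q1*vy + q0*vz)]
```

Because $A(\omega)q$ is linear in $\omega$, splitting the argument into $\omega$ and $v$ reproduces the decomposition in (2.3) exactly.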

2.3.2 Measurement Model

Having incorporated the gyroscope measurements into the motion model, the measurement model only comprises equations describing the acceleration. An accelerometer measures proper acceleration, which is the acceleration relative to free fall. The acceleration, measured in the D-system, can be divided into two parts: the gravitational acceleration $g^d_t$ and the movement acceleration $a^d_t$. However, when recording a video the user usually tries to hold the camera in a steady position. Therefore we assume that only the gravitational acceleration is affecting the device, by letting $a^d_t = 0$. We get

$$y_t = a^d_t + g^d_t + e_t \approx g^d_t + e_t, \quad (2.4)$$

where $e_t$ is the measurement noise. In order to model the gravitational acceleration, which we know always points down towards the ground, we use our knowledge of the device orientation $q_t$. A unit quaternion $q$ has the corresponding rotation matrix

$$R(q) = \begin{pmatrix} q_0^2 + q_1^2 - q_2^2 - q_3^2 & 2(q_1q_2 + q_0q_3) & 2(q_1q_3 - q_0q_2) \\ 2(q_1q_2 - q_0q_3) & q_0^2 - q_1^2 + q_2^2 - q_3^2 & 2(q_2q_3 + q_0q_1) \\ 2(q_1q_3 + q_0q_2) & 2(q_2q_3 - q_0q_1) & q_0^2 - q_1^2 - q_2^2 + q_3^2 \end{pmatrix}. \quad (2.5)$$

Using this rotational representation and the gravity vector $g^f = (0, 0, -g)^T$, the final measurement equation can be written as

$$y_t = -R(q_t)g^f + e_t = h(q_t) + e_t. \quad (2.6)$$

2.4 Discrete-Time Motion Model

In order to handle discrete-time gyro measurements, a discrete version of the motion model (2.3) must be derived. The aim is to write the discrete-time motion model on the form

$$q_t = F_{t-1}(\cdot)q_{t-1} + G_{t-1}(\cdot)v_{t-1}. \quad (2.7)$$

Assume that the angular velocity $\omega$ is constant during the sampling interval. Then the model (2.1) can be considered a piecewise constant and time-invariant system, and the matrix of the noise-free time-discrete motion model, $S_{t-1}(\omega_{t-1})$, can be calculated as

$$S_{t-1}(\omega_{t-1}) = e^{A(\omega_{t-1})T}, \quad (2.8)$$

where T is the sampling interval. Doing some additional calculations, as in [4], (2.8) can be rewritten as

$$S_{t-1}(\omega_{t-1}) = C(\omega_{t-1}) + D(\omega_{t-1})A(\omega_{t-1}), \quad (2.9)$$

where

$$C(\omega_{t-1}) = \cos\left(\frac{\|\omega_{t-1}\|T}{2}\right) I, \qquad D(\omega_{t-1}) = \frac{2\sin\left(\frac{\|\omega_{t-1}\|T}{2}\right)}{\|\omega_{t-1}\|}.$$

Now, consider the noise-free time-discrete motion model

$$q_t = S_{t-1}(\omega_{t-1})q_{t-1}. \quad (2.10)$$

Introducing the measurement noise of the angular velocity as in (2.2) and applying (A.16), we can write the motion model as

$$q_t = [C(\omega_{t-1}) + D(\omega_{t-1})A(\omega_{t-1})]q_{t-1} \approx [C(\omega_{t-1}) + D(\omega_{t-1})A(\omega_{t-1} + v_{t-1})]q_{t-1} = [C(\omega_{t-1}) + D(\omega_{t-1})A(\omega_{t-1})]q_{t-1} + D(\omega_{t-1})B(q_{t-1})v_{t-1}, \quad (2.11)$$

resulting in

$$F_{t-1}(\omega_{t-1}) = C(\omega_{t-1}) + D(\omega_{t-1})A(\omega_{t-1}) \quad (2.12)$$
$$G_{t-1}(\omega_{t-1}, q_{t-1}) = D(\omega_{t-1})B(q_{t-1}). \quad (2.13)$$

Optionally, additional approximations can be made to $C(\omega_{t-1})$ and $D(\omega_{t-1})$ by assuming that $\|\omega_{t-1}\|T$ is small due to the high sampling frequency.
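The closed-form discretization (2.9) can be sketched as follows; a plain-Python illustration (not the thesis implementation) under the assumption of a constant angular velocity over the sampling interval. Since $A(\omega)$ is skew-symmetric, the resulting transition matrix is orthogonal and preserves the unit norm of the quaternion.

```python
import math

def A_matrix(w):
    """A(omega) from equation (2.1), including the factor 1/2."""
    wx, wy, wz = w
    m = [[0.0, -wx, -wy, -wz],
         [wx, 0.0,  wz, -wy],
         [wy, -wz, 0.0,  wx],
         [wz,  wy, -wx, 0.0]]
    return [[0.5 * v for v in row] for row in m]

def discrete_step(q, w, T):
    """One step q_t = S(omega) q_{t-1} with S = C + D*A, equations (2.8)-(2.10)."""
    wn = math.sqrt(sum(v * v for v in w))
    if wn == 0.0:
        return list(q)  # no rotation during the interval
    C = math.cos(wn * T / 2.0)
    D = 2.0 * math.sin(wn * T / 2.0) / wn
    A = A_matrix(w)
    # apply S = C*I + D*A to the quaternion
    return [C * q[i] + D * sum(A[i][j] * q[j] for j in range(4))
            for i in range(4)]
```

As a check, a constant rotation of $\pi$ rad/s about the z-axis for one second maps the identity quaternion to $(0, 0, 0, 1)$, the quaternion of a half turn about z.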

2.5 Linearized Measurement Model

If linear estimation theory is to be used, the nonlinear measurement model needs to be linearized. In Chapter 3 the Extended Kalman Filter (EKF) will be introduced and later used to estimate the orientation of the device. There, a prediction step or time update is made before incorporating the measurements from the accelerometer into the estimate. Denote this estimate $\hat{q}_{t|t-1}$. Now, we can use a first-order Taylor expansion around $\hat{q}_{t|t-1}$ to approximate the measurement model as

$$y_t \approx h_t(\hat{q}_{t|t-1}) + H_t(\hat{q}_{t|t-1})(q_t - \hat{q}_{t|t-1}) + e_t, \quad (2.14)$$

where

$$H_t(\hat{q}_{t|t-1}) = \left.\frac{\partial h_t(q)}{\partial q}\right|_{q=\hat{q}_{t|t-1}} = 2g\begin{pmatrix} -\hat{q}_{2,t|t-1} & \hat{q}_{3,t|t-1} & -\hat{q}_{0,t|t-1} & \hat{q}_{1,t|t-1} \\ \hat{q}_{1,t|t-1} & \hat{q}_{0,t|t-1} & \hat{q}_{3,t|t-1} & \hat{q}_{2,t|t-1} \\ \hat{q}_{0,t|t-1} & -\hat{q}_{1,t|t-1} & -\hat{q}_{2,t|t-1} & \hat{q}_{3,t|t-1} \end{pmatrix}. \quad (2.15)$$

To get rid of the nonlinear term $h_t(\hat{q}_{t|t-1})$, we form a new measurement variable $\tilde{y}_t$ by rearranging equation (2.14):

$$\tilde{y}_t = y_t - h_t(\hat{q}_{t|t-1}) + H_t(\hat{q}_{t|t-1})\hat{q}_{t|t-1} = H_t(\hat{q}_{t|t-1})q_t + e_t. \quad (2.16)$$
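The analytic Jacobian (2.15) can be verified against a finite-difference approximation of $h$. The following is a small illustrative Python sketch, not the thesis implementation; $g = 9.81$ is an assumed value.

```python
G = 9.81  # assumed gravitational acceleration (m/s^2)

def h(q):
    """h(q) = -R(q) g^f with g^f = (0, 0, -G)^T, i.e. G times the third
    column of R(q) from equation (2.5)."""
    q0, q1, q2, q3 = q
    return [G * 2 * (q1*q3 - q0*q2),
            G * 2 * (q2*q3 + q0*q1),
            G * (q0*q0 - q1*q1 - q2*q2 + q3*q3)]

def H(q):
    """Analytic Jacobian dh/dq from equation (2.15)."""
    q0, q1, q2, q3 = q
    return [[2 * G * v for v in row] for row in
            [[-q2,  q3, -q0,  q1],
             [ q1,  q0,  q3,  q2],
             [ q0, -q1, -q2,  q3]]]

def H_numeric(q, eps=1e-6):
    """Central-difference approximation of dh/dq."""
    cols = []
    for j in range(4):
        qp = list(q); qp[j] += eps
        qm = list(q); qm[j] -= eps
        hp, hm = h(qp), h(qm)
        cols.append([(hp[i] - hm[i]) / (2 * eps) for i in range(3)])
    return [[cols[j][i] for j in range(4)] for i in range(3)]
```

Since $h$ is quadratic in $q$, the central difference agrees with the analytic Jacobian up to floating-point rounding.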


Chapter 3

Filtering

This chapter will focus on estimating the orientation of the device based on the state space model derived in Chapter 2. In addition, an LP-filtering algorithm applied to the estimated orientation will be presented.

3.1 Orientation Estimation

Using the state space model from Chapter 2, one can estimate the states $q_t$ using various filtering algorithms. Due to the nonlinearity of the measurement model, a linear filtering solution is not applicable. One possible algorithm is the particle filter; however, the number of particles needed to ensure convergence would be large and not suited for a device with low computational capabilities. Instead this thesis opts for the more efficient Extended Kalman Filter (EKF) algorithm, which handles the nonlinearities by linearization. The time-discrete state space model derived in Chapter 2 can be written on the form

$$q_t = F_{t-1}(\omega_{t-1})q_{t-1} + G_{t-1}(\omega_{t-1}, q_{t-1})v_{t-1}$$
$$y_t = h(q_t) + e_t \quad (3.1)$$

To simplify the notation, and to stress the fact that the dynamic part of the model is linear, the dependencies of $F_{t-1}$ and $G_{t-1}$ will be omitted throughout this chapter.

3.1.1 EKF

The EKF is based on the Kalman Filter (KF), which is the Best Linear Unbiased Estimator (BLUE) and a commonly used estimator for linear systems. The EKF performs a linearization of the nonlinearities present in the system model around the predicted and/or filtered estimates of the state, and then applies the KF to the linearized system. Hence, the EKF does not guarantee optimality. However, this model approximation works well for mildly nonlinear systems.

Assume that the noise terms $v_{t-1}$ and $e_t$ in (3.1) are zero-mean normally distributed random variables with covariance matrices $Q_{t-1}$ and $R_t$, respectively. Furthermore, assume $q_i \sim N(\mu_i, P_i)$, where $q_i$ is the initial state. With these assumptions the EKF can be applied to the derived state space model using the following algorithm:

Algorithm 1: Extended Kalman Filter (EKF)

1. Initialize: set $\hat{q}_{0|0} = q_i$ and $P_{0|0} = P_i$.

2. Time update:
$$\hat{q}_{t|t-1} = F_{t-1}\hat{q}_{t-1|t-1}$$
$$\hat{q}_{t|t-1} := \hat{q}_{t|t-1}/\|\hat{q}_{t|t-1}\|$$
$$P_{t|t-1} = F_{t-1}P_{t-1|t-1}F_{t-1}^T + G_{t-1}Q_{t-1}G_{t-1}^T$$

3. Measurement update:
$$K_t = P_{t|t-1}H_t^T\left(H_tP_{t|t-1}H_t^T + R_t\right)^{-1}$$
$$\hat{q}_{t|t} = \hat{q}_{t|t-1} + K_t\left(y_t - h(\hat{q}_{t|t-1})\right)$$
$$\hat{q}_{t|t} := \hat{q}_{t|t}/\|\hat{q}_{t|t}\|$$
$$P_{t|t} = (I - K_tH_t)P_{t|t-1}$$

4. Set t := t + 1 and repeat from step 2.

Note that q is normalized after each update in order to keep q a unit quaternion. The normalization is in general not a part of the EKF algorithm. The initialization step is discussed in Section 3.1.2. For more information on the EKF, refer to [7]. A derivation of the EKF algorithm can be found in [8].
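One iteration of Algorithm 1 can be sketched with NumPy as follows. This is a minimal illustration with generic model matrices, not the thesis implementation; the function name and argument layout are hypothetical.

```python
import numpy as np

def ekf_step(q, P, F, G, Q, y, h, H, R):
    """One EKF iteration (time update + measurement update) from Algorithm 1.

    q, P -- previous filtered state (4,) and covariance (4, 4)
    F, G -- discrete-time motion model matrices
    Q, R -- process and measurement noise covariances
    y    -- accelerometer measurement (3,)
    h, H -- measurement function and its Jacobian, evaluated at the prediction
    """
    # Time update
    q_pred = F @ q
    q_pred = q_pred / np.linalg.norm(q_pred)  # keep q a unit quaternion
    P_pred = F @ P @ F.T + G @ Q @ G.T

    # Measurement update
    Ht = H(q_pred)
    S = Ht @ P_pred @ Ht.T + R            # innovation covariance
    K = P_pred @ Ht.T @ np.linalg.inv(S)  # Kalman gain
    q_filt = q_pred + K @ (y - h(q_pred))
    q_filt = q_filt / np.linalg.norm(q_filt)
    P_filt = (np.eye(len(q)) - K @ Ht) @ P_pred
    return q_filt, P_filt
```

With a stationary device and a measurement that matches the predicted gravity direction, the innovation is zero and the estimate is unchanged, as expected.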

3.1.2 EKF Initialization

To use the EKF we need a guess of the initial state $q_0 = (q_0, q_1, q_2, q_3)^T$ and the state covariance $P_0$. Measurements from the accelerometer, $a_0 = (a_x, a_y, a_z)^T$, are used to set the initial quaternion state. Note that these measurements are given in the D-system, while the initial quaternion state describes the orientation of the device relative to the F-system. Denote the gravitational vector $g^f = (0, 0, g)^T$, given in the F-system. Once again assume that only the gravitational acceleration is affecting the device. Using the rotational representation of $q_0$, the acceleration measurement is then approximately given by

$$a_0 \approx R(q_0)g^f. \quad (3.2)$$

Based only on the accelerometer it is impossible to get a global estimate of the initial state, i.e. one can never know the orientation of the device with reference to the north pole. Imagine the device lying on a flat surface parallel to the ground. Assuming that the accelerometer is noise free, a typical measurement would be $a_t = (0, 0, g)^T$. Rotating the device around the z-axis, and thus changing the orientation, would generate the same measurement, proving this fact. However, introducing an additional vector, based on magnetometer measurements, would make a global estimate possible. In our case a global estimate of the orientation is not needed and north is instead chosen arbitrarily. A rotation matrix R is composed of the orthonormal vectors $r_1$, $r_2$ and $r_3$ as

$$R = \begin{pmatrix} r_1 & r_2 & r_3 \end{pmatrix} = \begin{pmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{pmatrix}. \quad (3.3)$$

When $R = R(q_0)$, these vectors form an orthonormal basis describing the orientation of the D-system relative to the F-system. Replacing $R(q_0)$ in (3.2) with R, and knowing that $g = \|g^f\| \approx \|a_0\|$, we get

$$r_3 g \approx a_0 \;\Rightarrow\; r_3 \approx a_0/\|a_0\|, \quad (3.4)$$

where $r_3$ corresponds to the z-axis of the D-system relative to the F-system. Since no knowledge of the remaining axes is available, we choose $r_1$ arbitrarily but not aligned with $r_3$. Let $u = (u_1, u_2, u_3)^T$, $\|u\| = 1$, $u \nparallel r_3$; then

$$r_1 = \frac{u - (u \cdot r_3)r_3}{\|u - (u \cdot r_3)r_3\|}. \quad (3.5)$$

The Gram-Schmidt step in (3.5) ensures that $r_3$ and $r_1$ are orthonormal. $r_2$ can then be calculated by the cross product

$$r_2 = r_1 \times r_3. \quad (3.6)$$

Using the rotation matrix R and the matrix $R(q_0)$, we can calculate $q_0 = (q_0, q_1, q_2, q_3)^T$ as

$$\sqrt{1 + r_{11} - r_{22} - r_{33}} = \sqrt{1 + 3q_1^2 - q_2^2 - q_3^2 - q_0^2} = \sqrt{4q_1^2} = 2q_1 = b$$
$$\frac{r_{23} - r_{32}}{2b} = \frac{2(q_2q_3 + q_0q_1) - 2(q_2q_3 - q_0q_1)}{2b} = \frac{2q_0q_1}{b} = q_0$$
$$\frac{b}{2} = q_1$$
$$\frac{r_{21} + r_{12}}{2b} = \frac{2(q_1q_2 - q_0q_3) + 2(q_1q_2 + q_0q_3)}{2b} = \frac{2q_1q_2}{b} = q_2$$
$$\frac{r_{13} + r_{31}}{2b} = \frac{2(q_1q_3 - q_0q_2) + 2(q_1q_3 + q_0q_2)}{2b} = \frac{2q_1q_3}{b} = q_3$$
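The initialization can be sketched as follows. This Python illustration follows (3.4)-(3.5) but makes two deliberate, labeled deviations: it takes $r_2 = r_3 \times r_1$ so that the basis is guaranteed right-handed, and it uses a standard trace-based rotation-to-quaternion conversion (valid when the trace is positive) instead of the single-branch formula above, which divides by $2q_1$.

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

def initial_quaternion(a0, u=(1.0, 0.0, 0.0)):
    """Initial orientation estimate from a static accelerometer reading a0.
    u is the arbitrarily chosen 'north' direction, not parallel to r3."""
    r3 = normalize(a0)                                        # (3.4)
    d = sum(ui * ri for ui, ri in zip(u, r3))
    r1 = normalize([ui - d * ri for ui, ri in zip(u, r3)])    # (3.5)
    r2 = cross(r3, r1)  # r3 x r1 keeps the basis right-handed (cf. (3.6))
    R = [[r1[i], r2[i], r3[i]] for i in range(3)]  # columns r1, r2, r3
    # Trace-based extraction, consistent with the R(q) convention of (2.5)
    t = R[0][0] + R[1][1] + R[2][2]
    if t <= 0:
        raise ValueError("large rotation: use a four-branch conversion")
    s = 2.0 * math.sqrt(1.0 + t)  # s = 4*q0
    return [s / 4.0,
            (R[1][2] - R[2][1]) / s,
            (R[2][0] - R[0][2]) / s,
            (R[0][1] - R[1][0]) / s]
```

For a device lying flat, $a_0 = (0, 0, g)^T$ gives the identity quaternion, and in general the third column of $R(q_0)$ reproduces the measured gravity direction, as (3.2) requires.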


3.1.3 EKF Smoother

This thesis presents an off-line approach i.e. we have access to all measurements

yt, t = 1, . . . , N when estimating the orientation. We want to use this information

to find the best possible state estimate ˆqt|1:N. The EKF is essentially an online

algorithm, where the estimated state ˆqtis based on measurements yt, t = 1, . . . , t.

To utilize all available measurements when estimating ˆqt, a second filtering pass is

also performed, improving the EKF estimates. The second pass estimates ˆqt|1:N

using an Extended Kalman Filter Smoother (EKF Smoother or EKFS) which is presented below.

Algorithm 2: Extended Kalman Filter Smoother (EKF Smoother)

Available observations are y_t, t = 1, . . . , N.

1. Forward filter: Run the EKF and store both time and measurement updates, q̂_{t|t}, q̂_{t|t−1}, P_{t|t}, P_{t|t−1}.

2. Backward filter: Apply the following time recursion backwards in time:

    q̂_{t−1|1:N} = q̂_{t−1|t−1} + P_{t−1|t−1} F_{t−1}^T P_{t|t−1}^{−1} (q̂_{t|1:N} − q̂_{t|t−1})
    q̂_{t−1|1:N} := q̂_{t−1|1:N} / ||q̂_{t−1|1:N}||

The covariance matrix of the estimation error P_{t|N} is

    P_{t−1|N} = P_{t−1|t−1} + P_{t−1|t−1} F_{t−1}^T P_{t|t−1}^{−1} (P_{t|N} − P_{t|t−1}) P_{t|t−1}^{−1} F_{t−1} P_{t−1|t−1}.

The algorithm, which corresponds to the Rauch-Tung-Striebel formulas found in [7], is derived in [8].
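The two passes of Algorithm 2 can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function name `rts_backward_pass` and the convention of storing the forward-pass quantities in plain arrays are assumptions, and the quaternion state is treated as an ordinary 4-vector that is renormalized after each correction, as in step 2.

```python
import numpy as np

def rts_backward_pass(q_filt, q_pred, P_filt, P_pred, F):
    """Backward pass of Algorithm 2 (Rauch-Tung-Striebel form).

    q_filt[t], P_filt[t]: measurement updates (t|t) stored by the forward EKF;
    q_pred[t], P_pred[t]: time updates (t|t-1); F[t]: linearized transition
    matrix at time t. Returns the smoothed, renormalized state estimates."""
    N = len(q_filt)
    q_smooth = [None] * N
    q_smooth[-1] = q_filt[-1]
    for t in range(N - 1, 0, -1):
        # Smoother gain: P_{t-1|t-1} F_{t-1}^T P_{t|t-1}^{-1}
        G = P_filt[t - 1] @ F[t - 1].T @ np.linalg.inv(P_pred[t])
        q = q_filt[t - 1] + G @ (q_smooth[t] - q_pred[t])
        q_smooth[t - 1] = q / np.linalg.norm(q)  # keep a unit quaternion
    return q_smooth
```

The corresponding covariance recursion for P_{t−1|N} is omitted here, since only the smoothed states are needed for the later rectification.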

3.2 LP-Filtering

When capturing a movie using a handheld device it is almost impossible to hold the device steady. Due to user induced shakes and/or small movements, the resulting video will be perceived as "shaky". In order to stabilize this kind of video sequence, a smooth orientation estimate is needed. One can think of this smoothed estimate as the rotational motion that the device would have had if the orientation estimate was free from small shakes or movements. To achieve this smooth motion, LP-filtering is performed on the estimated state q̂_{t|1:N}, i.e. the output of the EKF Smoother. How this LP-filtered version of q̂_{t|1:N} is utilized to stabilize a video will be explained further in Chapter 4. In this section we assume that the time interval between each estimate of q̂_{t|1:N} is constant.


3.2.1 Windowing

A simple way to low-pass filter a one dimensional signal is to perform windowing. This thesis uses a common Hann or Hanning window due to the good trade-off between noise suppression and amplitude preservation. Let the length of the Hanning window be M; then the corresponding weights w_m can be calculated as

    w_m = 0.5 (1 + cos(2πm / (M − 1))),   −(M − 1)/2 ≤ m ≤ (M − 1)/2,
    w_m = 0,   elsewhere.    (3.7)

The window can then be applied in the time domain at time t on q̂_{t|1:N} as

    q̂_t^LP = Σ_{m=−(M−1)/2}^{(M−1)/2} w_m q̂_{t+m|1:N},    (3.8)
    q̂_t^LP := q̂_t^LP / ||q̂_t^LP||.
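The windowing in (3.7)-(3.8) can be sketched as below. The helper names are hypothetical, and the edge handling (truncating the window near the sequence boundaries) is an assumption not specified in the text. Note that the weights need no normalization, since the averaged quaternion is renormalized to unit length anyway.

```python
import numpy as np

def hann_weights(M):
    """Centered Hann window of odd length M, as in (3.7)."""
    m = np.arange(M) - (M - 1) // 2
    return 0.5 * (1 + np.cos(2 * np.pi * m / (M - 1)))

def smooth_quaternions(q, M):
    """Weighted average (3.8) of unit quaternions q (N x 4), renormalized.

    Assumes the quaternions lie on a consistent hemisphere (no sign flips),
    which holds when neighbouring estimates are close to each other."""
    w = hann_weights(M)
    h = (M - 1) // 2
    N = len(q)
    out = np.empty_like(q)
    for t in range(N):
        lo, hi = max(0, t - h), min(N, t + h + 1)
        ww = w[lo - (t - h): M - ((t + h + 1) - hi)]  # truncate at the edges
        avg = (ww[:, None] * q[lo:hi]).sum(axis=0)
        out[t] = avg / np.linalg.norm(avg)            # back to a unit quaternion
    return out
```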

In order to achieve enough smoothing when the orientation estimate only suffers from small user induced shakes, the window length M needs to be large. However, a large M will create an inaccurate motion for fast motion changes. Therefore, we want to use windows of variable length, where the length is based on how fast q̂_{t|1:N} changes. To solve this problem we use a change detection algorithm to detect when fast changes occur. The idea is to divide q̂_{t|1:N} into different segments. These segments can then be filtered as described above but with Hanning windows of different lengths.

3.2.2 CUSUM LS Filter

The fast orientation changes are detected using the Cumulative Sum Least Squares Filter (CUSUM LS Filter), which is an adaptive change detection filter. The objective is to split up the orientation estimate q̂_{t|1:N} into S segments which can then be LP-filtered as described in the previous section. The input to the filter is the norm of the angular velocity, ||ω_k||, measured by the gyroscope. ||ω_k|| is used since we are only interested in detecting when the orientation changes fast, not how it changes. Let y_k = ||ω_k||. We model this new measurement as

    y_k = θ_k + e_k,    (3.9)

where k = 1, . . . , N and e_k is the measurement noise. The LS estimate θ̂ of the signal is given by

    θ̂_k = (1/k) Σ_{i=1}^{k} y_i.    (3.10)

Define the residuals as

    ε_k = y_k − θ̂_{k−1},    (3.11)


and the distance measures s_k^1 and s_k^2 as

    s_k^1 = ε_k,   s_k^2 = −ε_k,    (3.12)

where θ̂_{k−1} is the LS estimate based on measurements up to k − 1. The distance measures will serve as input to the test statistics, g_k^1 and g_k^2, defined as

    g_k^1 = max(g_{k−1}^1 + s_k^1 − ν, 0),   g_k^2 = max(g_{k−1}^2 + s_k^2 − ν, 0),    (3.13)

where ν is a small drift term subtracted at each time instance. The reason for having two test statistics is the ability to handle positive as well as negative changes in ||ω_k||. The CUSUM LS algorithm, also described in [9], is given below.

Algorithm 3: Cumulative Sum Least Squares Filter (CUSUM LS Filter)

    θ̂_k = (1/(k − k_0)) Σ_{i=k_0+1}^{k} y_i
    ε_k = y_k − θ̂_{k−1}
    s_k^1 = ε_k,   s_k^2 = −ε_k
    g_k^1 = max(g_{k−1}^1 + s_k^1 − ν, 0)
    g_k^2 = max(g_{k−1}^2 + s_k^2 − ν, 0)
    Alarm if g_k^1 > h or g_k^2 > h.
    After an alarm, reset g_k^1 = 0, g_k^2 = 0 and k_0 = k, add k to I.

Output: the index vector I

In each iteration θ̂_k is estimated according to (3.10). Next, the distance measures are evaluated and used to calculate the test statistics g_k^1 and g_k^2. Note that the test statistics sum up their input with the intention of giving an alarm when the sum exceeds the threshold h. To prevent positive drift the term ν is subtracted, while the negative drift is handled by letting g_k^1 = g_k^2 = 0 for negative values. When an alarm is signaled the filter is reset and a new LS estimation starts. For fast changes in ||ω_k|| this will result in alarms close to one another, i.e. small segments, while a slowly changing ||ω_k|| will yield no alarms. Note that θ̂_k is the mean of ||ω_k|| over an interval. This is a good model since it does not capture fast changes very well, which is exactly what we want. The output is the index vector I containing the alarm indices. These indices will make up the segments of q̂_{t|1:N}. Both ν and h are design parameters set to achieve alarms at the desired levels.
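A minimal sketch of Algorithm 3, under the assumption that the LS estimate is restarted from the current sample after an alarm (the text only says that "a new LS estimation starts"); the function name and the initial estimate θ̂_0 = 0 are likewise assumptions.

```python
def cusum_ls(y, nu, h):
    """Two-sided CUSUM LS change detector (Algorithm 3).

    y: sequence of ||omega_k||, nu: drift term, h: alarm threshold.
    Returns the index vector I of alarm times."""
    alarms = []
    g1 = g2 = 0.0
    k0 = -1       # last reset index; the current segment starts at k0 + 1
    s = 0.0       # running sum of y over the current segment
    theta = 0.0   # LS estimate from the previous step (first residual is y_0)
    for k, yk in enumerate(y):
        eps = yk - theta              # residual (3.11)
        g1 = max(g1 + eps - nu, 0.0)  # detects positive changes
        g2 = max(g2 - eps - nu, 0.0)  # detects negative changes
        s += yk
        theta = s / (k - k0)          # LS estimate = mean over the segment
        if g1 > h or g2 > h:
            alarms.append(k)
            g1 = g2 = 0.0
            k0, s = k, 0.0
            theta = yk                # restart the LS estimate at the new level
    return alarms
```

A step change in the input yields an alarm shortly after the change, while a constant signal yields none, which is exactly the segmentation behaviour described above.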

3.2.3 LP-Filtering On CUSUM LS Segments

When the CUSUM LS Filter has found the segments of q̂_{t|1:N} we can perform the filtering. Denote the set of segments S = {s_1, s_2, s_3, . . . , s_n}, where each segment s_i consists of a start and end index, s_{i,s} and s_{i,e}. The algorithm presented uses two main filters, both Hanning windows as described in Section 3.2.1. Let these two windows be of length K and L, where K > L. In order to receive symmetrical windows, K and L need to be odd integers.

First we find the subset of S where the segments s_i fulfill

    length(s_i) > K.    (3.14)

Denote this set of big segments BS = {bs_1, bs_2, bs_3, . . . , bs_k}. In these segments ||ω|| is small or changes slowly. This corresponds to segments where no rapid movements are made by the user, which can thus be LP-filtered with the wider window of length K. Let SS = S − BS, SS = {ss_1, ss_2, ss_3, . . . , ss_l}, be the set of remaining small segments. These are the segments where ||ω|| changes rapidly, which can thus be filtered with the narrow window of length L.

Now, consider the segment bs_i ∈ BS. The neighboring segment forward in time can either be of type bs or ss. If the type is bs, a fast but short change has occurred in ||ω|| at the edge of bs_i. We need to filter with the more narrow window of length L, but only for a very short time, since it is followed by a big segment of type bs suited for heavy LP-filtering. If the type is ss, this indicates that ||ω|| is changing rapidly, at least enough to yield two close CUSUM alarms. It is also likely that the ss segment is followed by yet another segment of type ss. In this case we need to filter with the narrow window but for a longer period of time.

In order to achieve a smooth transition between the two windows we let the length of the windows increase or decrease with a step size a = 2. This will also guarantee that the transitioning window's symmetrical shape is kept.

Now, we have two cases: either the transition is from window K to L or from L to K. In the first case the transition starts at index s_{i,e} − (K − L) and ends at index s_{i,e}, where s_i ∈ BS. In the second case the transition starts at index s_{i,e} and ends at index s_{i,e} + (K − L), where s_i ∈ S. Note that this covers the transitions between the segment pairs bs → ss, ss → bs and bs → bs. Let c be the transition length, where c = K − L, and denote the set of transition indices TI = {ti_1, ti_2, . . . , ti_{2k}}, where

    ti_i = bs_{i,s} − c,  if i is even,
    ti_i = bs_{i,e},      if i is odd.    (3.15)

The algorithm below can then be used to achieve the desired LP-filtering. The procedure is illustrated in Figure 3.1.


Algorithm 4: Windowing On CUSUM LS Segments

1. Run the CUSUM LS Filter on ||ω|| to determine S.
2. Let K and L be the lengths of the wide and narrow window respectively.
3. Find BS ⊆ S where s_i ∈ S and length(s_i) > K.
4. Initiate with window K if s_1 ∈ BS; else, set the initial window to L.
5. Calculate the transition indices ti_i ∈ TI from BS.
6. LP-filter q̂_{t|1:N} using windowing according to Section 3.2.1, but let the window start a smooth length transition at each index ti_i. If the current window is of length K, make the transition to L and vice versa.

Output: q̂_t^LP

Figure 3.1. At the top we see the segments s ∈ S found by the CUSUM LS Filter. In the middle the big segments bs ∈ BS have been found. The indices ti ∈ TI are calculated according to (3.15). We can then use windows of different lengths on the different parts of the signal. In this case we use windows K → L → K. The dotted parts at the bottom correspond to intervals where the window lengths increase or decrease.

Figures showing the result of the algorithm's performance on real data will be presented in Section 6.2.2.
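The smooth length transitions of Algorithm 4 can be sketched as a per-sample window-length schedule. This is an illustrative reading, not the thesis code: it assumes the ramp toward the other window length starts exactly at each transition index and proceeds in steps of a = 2, which keeps the length odd throughout.

```python
def window_lengths(n, ti, K, L, start_wide, a=2):
    """Per-sample window lengths for the windowing in Algorithm 4.

    n: number of samples; ti: transition indices from (3.15);
    K/L: wide/narrow window lengths (odd, K > L);
    start_wide: True if the first segment is a big segment."""
    lengths = []
    cur = K if start_wide else L
    target = cur
    ti = set(ti)
    for t in range(n):
        if t in ti:
            target = L if target == K else K  # flip between the two windows
        if cur < target:
            cur = min(cur + a, target)        # ramp up in steps of a
        elif cur > target:
            cur = max(cur - a, target)        # ramp down in steps of a
        lengths.append(cur)
    return lengths
```

The resulting length at each index could then drive a truncated Hann window of that length, as in Section 3.2.1.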


Chapter 4

Rectification And Stabilization

Here, the general ideas behind the rectification and stabilization algorithms are presented. The algorithms are based on the work in [5]. For a more thorough explanation refer to [1]. We assume that we have the orientation estimates q̂_{t|1:N} from the EKF filters in Chapter 3, as well as the LP-filtered version, q̂_t^LP.

4.1 Image Rectification

When an image is captured by a rolling shutter camera, the image is recorded line-by-line from top to bottom. If the camera is moving while capturing, each row will be recorded at a different camera orientation. The idea is to transform such an image to make it look like all the rows were captured at the same time instance, using the orientation estimates q̂_{t|1:N}. In particular we are interested in how the orientation has changed from the first row r_0 to an arbitrary row r_i, where i = 0, 1, . . . , l and l is the index of the last row.

Now, assume that the frame time stamps correspond to the capture of the first row of an image. Also assume that the read-out time of the RS camera t_r is known, i.e. the time between the capturing of the first and last row of an image. Then we can use interpolation to find the orientation of the camera at the time instances of each row. Denote the time stamp of a frame t_f. Then the capture time instance of row r_i, denoted t_{ri}, can be calculated as

    t_{ri} = t_f + (i/l) t_r.    (4.1)

Linear interpolation, defined as

    q_{LERP} = (1 − w) q_1 + w q_2,   w ∈ [0, 1],    (4.2)


can be used on q̂_{t|1:N} to find the orientation estimate of each image row. Note that the interpolated quaternion needs to be normalized to maintain it as a unit quaternion. It is important to understand that the interpolated quaternion does not move with constant speed when w changes from 0 to 1. However, the smaller the difference between q_1 and q_2, the closer we are to a constant speed. In our case q_1 and q_2 are always close to one another. For more information regarding quaternion interpolation refer to [10]. Denote the orientation estimate corresponding to row i of an image q_{ri}. Then the relative orientation change from the first row r_0 to an arbitrary row r_i is given by

    q_{r0 ri} = q_{r0}^{−1} q_{ri}.    (4.3)
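Equations (4.1)-(4.2) for per-row orientations can be sketched as follows; the function name, the clamped `searchsorted` neighbour lookup and the (w, x, y, z) quaternion layout are assumptions for illustration.

```python
import numpy as np

def row_orientation(q_frames, t_frames, t_f, i, l, t_r):
    """Normalized linear interpolation (4.1)-(4.2) of the orientation of row i.

    q_frames: N x 4 unit quaternions with time stamps t_frames (sorted);
    t_f: frame time stamp; l: index of the last row; t_r: read-out time."""
    t_ri = t_f + (i / l) * t_r                  # capture time of row i (4.1)
    j = np.searchsorted(t_frames, t_ri)         # neighbouring estimates
    j = min(max(j, 1), len(t_frames) - 1)       # clamp to a valid interval
    q1, q2 = q_frames[j - 1], q_frames[j]
    w = (t_ri - t_frames[j - 1]) / (t_frames[j] - t_frames[j - 1])
    q = (1 - w) * q1 + w * q2                   # LERP (4.2)
    return q / np.linalg.norm(q)                # renormalize to a unit quaternion
```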

A camera can be regarded as a mapping from the 3D world to a 2D image. Let x represent an image point described in homogeneous coordinates and let X be the corresponding point in space. Then the mapping from the 3D world to the image is given by

    x = KX,    (4.4)

where K is the Camera Calibration Matrix described in [11]. It is also possible to perform the opposite mapping, i.e. map a point in the image plane to a point in space. This relation is given by

    X = K^{−1} x.    (4.5)

In order to rectify a point x on row i in the image, we start by mapping it to 3D space using (4.5). Then the orientation change q_{r0 ri} from the first row to row r_i is calculated. Next, we rotate the point in space by inserting q_{r0 ri} into

    (0, X′)^T = q^{−1} (0, X)^T q.    (4.6)

Finally, the new position of the rotated point in the image is given by

    x′ = KX′.    (4.7)
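The back-project/rotate/re-project chain (4.5)-(4.7) can be sketched as below, assuming unit quaternions in (w, x, y, z) order so that the inverse in (4.6) is the conjugate; the helper names are hypothetical.

```python
import numpy as np

def quat_mult(a, b):
    """Hamilton product of quaternions in (w, x, y, z) order."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def rectify_point(x, K, q):
    """Rectify homogeneous image point x via (4.5)-(4.7)."""
    X = np.linalg.solve(K, x)                    # back-project, (4.5)
    qc = q * np.array([1.0, -1.0, -1.0, -1.0])   # conjugate = inverse of unit q
    Xq = np.concatenate(([0.0], X))              # embed X as a pure quaternion
    Xrot = quat_mult(quat_mult(qc, Xq), q)[1:]   # rotate, (4.6)
    xp = K @ Xrot                                # re-project, (4.7)
    return xp / xp[2]                            # normalize homogeneous coords
```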

Figure 4.1 displays a frame captured during a motion from right to left, together with its rectified counterpart. Note that all rows have been aligned to the first row.


(a) Frame of a video captured with iPod 4. (b) Rectified frame. All rows have been aligned to the first row.

Figure 4.1. The first image shows a frame captured during a motion from right to left. The bottom image is a rectified version of the same frame. Note that all rows have been aligned to the first row.

4.2 Video Stabilization

In Chapter 3 an LP-filtering algorithm was presented, describing how the orientation estimates q̂_{t|1:N} could be filtered in order to get a smooth rotational motion, q̂_t^LP. After an image has been rectified, using the steps presented in the previous section, the orientation estimate q_{r0} represents the orientation of the device for the entire image. It is possible to change this reference orientation to an arbitrary row of the image. The most logical row to use as reference is the middle row. Denote the orientation estimate of the middle row q_{rm}. Using (4.3) we can calculate q_{r0 rm} and process each point of the previously rectified image using (4.5), inserting q_{r0 rm} into

    (0, X′)^T = q (0, X)^T q^{−1},    (4.8)

and apply (4.7) to change the reference row from r_0 to r_m, see Figure 4.2(a).

In order to stabilize the video we use the LP-filtered orientation estimates q̂_t^LP. We start off by calculating the orientation change from q_{rm} to q_{rm}^{LP} using (4.3). Then we use the same procedure as before according to (4.5), (4.8) and (4.7). This can be seen as moving all the pixels of an image to the locations where they would have been, had the camera trajectory been smooth, see Figure 4.2(b).

Instead of doing this in two separate steps, i.e. changing reference row and stabilizing according to this reference, we can calculate the total orientation change from q_{r0} to q_{rm}^{LP} as

    q_{r0} → q_{rm}^{LP} = (q_{r0}^{−1} q_{rm}) (q_{rm}^{−1} q_{rm}^{LP}) = q_{r0}^{−1} q_{rm}^{LP},    (4.9)

and then apply (4.5), (4.8) and (4.7). In fact, one can perform both stabilization and rectification in one step. Consider the point x on row i in the image. The total orientation change q(x) needed to rectify and stabilize this point is given by

    q(x) = q_{r0}^{−1} q_{ri} (q_{r0}^{−1} q_{rm}^{LP})^{−1}.    (4.10)

Applying (4.5) and plugging q(x) into (4.6) followed by (4.7) will result in the new rectified and stabilized position of x in the image.
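The combined orientation change (4.10) is just a few quaternion products; a sketch under the same (w, x, y, z) convention as above, with hypothetical helper names:

```python
import numpy as np

def qmult(a, b):
    """Hamilton product of quaternions in (w, x, y, z) order."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def qinv(q):
    """Inverse of a unit quaternion is its conjugate."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def total_change(q_r0, q_ri, q_rm_lp):
    """Combined rectification + stabilization rotation q(x) of (4.10)."""
    return qmult(qmult(qinv(q_r0), q_ri), qinv(qmult(qinv(q_r0), q_rm_lp)))
```

When the row, first-row and smoothed orientations coincide, (4.10) reduces to the identity rotation, i.e. the pixel stays where it is.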


(a) Rectified frame. The reference has been changed from first row to middle row. (b) Stabilized and rectified frame.

Figure 4.2. We continue with the example in Figure 4.1. In Figure 4.1(b) all rows were aligned to the first row. In the top image we have changed reference from first to middle row using the orientation change q_{r0 rm}. Note that the black borders are more evenly distributed in this image, which is the main reason for choosing the middle row as reference. The bottom image shows a stabilized and rectified version of the frame. From the top image with middle row as reference, the image can be stabilized using a single orientation change, namely q_{rm} to q_{rm}^{LP}.


Chapter 5

iPod 4 Accelerometer And Gyroscope Evaluation

Having a good understanding of the sensor output is important when analyzing the output of the filtering algorithms. Therefore, this chapter looks at the performance of the accelerometer and gyroscope equipped on the iPod 4 compared to external reference sensors. The reference sensors used are components of an Inertial Measurement Unit (IMU) from Xsens Technologies B.V. [12], named the Xsens MT motion sensor. This is a high end IMU product, although a couple of years old. The main focus of the evaluation is the estimation and comparison of bias and variance of the two units.

5.1 IMU

An IMU is an electronic device using a combination of accelerometer and gyroscope in order to measure the acceleration and angular velocities affecting the unit. IMUs are typically used in aircraft and Unmanned Aerial Vehicles (UAVs) but have recently made their way into more commercial products such as mobile phones. This has become possible due to Micro-Electro-Mechanical Systems (MEMS) technology making the units cheap, small and low on power consumption. Measurements are given in three dimensions, making it possible to track the orientation of the unit. However, the output is distorted by noise as well as bias. In the following sections comparisons will be made between the two IMUs in terms of bias and variance in an attempt to evaluate the two units.


Figure 5.1. The rig used in the experiments. The orange Xsens MT motion sensor and the iPod 4 are attached on opposite sides of a wooden plate.

5.2 Accelerometer

First we take a closer look at the accelerometer measurements. During all the experiments in this section the accelerometer of the iPod 4 was set to sample at the highest possible frequency of 100 Hz. However, the iPod does not sample at a constant frequency and was clocked at ∼95 Hz. A constant frequency of 100 Hz was used on the Xsens sensor. In order to compare the units a rig was set up according to Figure 5.1. Having the units attached on opposite sides of the plate ensures that the two sensors give roughly the same output. In fact, in theory, the gyroscopes could be placed anywhere on the plate and should still produce the same measurements. This is not true for the accelerometers.

Figure 5.2 shows a plot of the raw acceleration data of the two units along the x-axis. We clearly see that the iPod data is scaled compared to the data of the Xsens sensor. In order to use the iPod data in the models described in Chapter 2 we need to change the sign of the measurements as well as estimate the scaling factor λ.

5.2.1 Bias and Scaling Estimation

Estimating the bias of an accelerometer is not as easy as one would think. An initial approach was to let the sensor be at rest on a flat surface parallel to the ground and simply calculate the mean for the two axes "free" from the gravity component. However, finding a surface that is entirely flat, i.e. a surface where the gravity component only affects one axis, is difficult. Instead we use a different but more tedious approach where a set of individual measurement sessions is used. In each session the accelerometer is at rest but placed in an orientation different from


Figure 5.2. Raw x-axis accelerometer data from the iPod 4 (solid) and the Xsens MT motion sensor (dashed). Note that the iPod 4 data is scaled and thus gives smaller values, as well as of the opposite sign.

previous sessions. Denote the accelerometer measurements a_k = (a_{x,k}, a_{y,k}, a_{z,k})^T. For each session the mean acceleration, a_m^i = (a_{x,m}^i, a_{y,m}^i, a_{z,m}^i)^T, is calculated according to

    a_m^i = (1/N) Σ_{k=1}^{N} a_k,    (5.1)

where N is the number of measurements of the particular session with index i = 1, . . . , M. If the set of all a_m are considered points in space they should all lie on the surface of a sphere. Figure 5.3 displays a_m from 14 different measurement sessions using the iPod 4. The idea is to fit a sphere to these points and thus get an estimate of the acceleration bias, b^a = (b_x^a, b_y^a, b_z^a)^T, from the translation of the sphere relative to the origin. The measurements are scaled by an unknown factor λ, which can be estimated by scaling the sphere, since we know that the radius should be equal to the gravity magnitude g. A sphere is mathematically described by

    f(x) = (x − x_c)^2 + (y − y_c)^2 + (z − z_c)^2 − r^2 = 0,    (5.2)

where x = (x_c, y_c, z_c, r)^T. We use the Gauss-Newton method for nonlinear least squares [13] to find x using the points a_m^i. This is done by iteratively solving the linear system

    (J^T(x_l) J(x_l)) s_l = −J^T(x_l) f(x_l),    (5.3)

for the Newton step s_l, where

    J(x_l) = [ 2(x_{c,l} − a_{x,m}^1)   2(y_{c,l} − a_{y,m}^1)   2(z_{c,l} − a_{z,m}^1)   −2r_l
               2(x_{c,l} − a_{x,m}^2)   2(y_{c,l} − a_{y,m}^2)   2(z_{c,l} − a_{z,m}^2)   −2r_l
                       ⋮                        ⋮                        ⋮                  ⋮
               2(x_{c,l} − a_{x,m}^M)   2(y_{c,l} − a_{y,m}^M)   2(z_{c,l} − a_{z,m}^M)   −2r_l ]    (5.4)


Figure 5.3. Mean acceleration from 14 measurement sessions using the iPod 4. The accelerometer was at rest having different orientations for all sessions. A sphere can be fitted to the points in order to estimate the bias. Note that the acceleration is scaled by an unknown factor λ, which can be estimated by scaling the fitted sphere.

and

    f(x_l) = [ (x_{c,l} − a_{x,m}^1)^2 + (y_{c,l} − a_{y,m}^1)^2 + (z_{c,l} − a_{z,m}^1)^2 − r_l^2
               (x_{c,l} − a_{x,m}^2)^2 + (y_{c,l} − a_{y,m}^2)^2 + (z_{c,l} − a_{z,m}^2)^2 − r_l^2
                       ⋮
               (x_{c,l} − a_{x,m}^M)^2 + (y_{c,l} − a_{y,m}^M)^2 + (z_{c,l} − a_{z,m}^M)^2 − r_l^2 ].    (5.5)

x is then updated using the Newton step according to

    x_{l+1} = x_l + s_l,    (5.6)

and the process is repeated until convergence. Since we know that the bias is small and that the radius of the sphere is close to 1, the initial guess x_0 = (0, 0, 0, 1)^T was used. The scaling factor λ is calculated as

    λ = g/r.    (5.7)

The same experiments were carried out on the Xsens sensor but with initial guess

x_0 = (0, 0, 0, g)^T. Table 5.1 shows the estimated bias and scaling for the two units. Note that the bias has been scaled with λ. We conclude that both units have a similar bias magnitude, although the Xsens MT motion sensor bias is slightly smaller. However, we see a difference for b_z^a, where the iPod 4 performs far worse than the Xsens sensor. One possible explanation is that the Xsens MT motion sensor is a high end product and should be well calibrated in terms of bias compared to the iPod 4.
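The Gauss-Newton fit (5.2)-(5.6) can be sketched compactly; a fixed iteration count replaces the unspecified convergence test, and the function name is an assumption.

```python
import numpy as np

def fit_sphere(points, x0, iters=20):
    """Gauss-Newton sphere fit, eqs (5.2)-(5.6).

    points: M x 3 array of mean accelerations a_m^i;
    x0: initial guess (xc, yc, zc, r). Returns the fitted (xc, yc, zc, r);
    the bias estimate is the sphere center, and λ = g / r as in (5.7)."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        c, r = x[:3], x[3]
        d = c - points                               # M x 3 center offsets
        f = (d ** 2).sum(axis=1) - r ** 2            # residual vector, (5.5)
        J = np.hstack([2 * d, -2 * r * np.ones((len(points), 1))])  # (5.4)
        s = np.linalg.solve(J.T @ J, -J.T @ f)       # normal equations, (5.3)
        x = x + s                                    # Newton step, (5.6)
    return x
```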


Unit     b_x^a [m/s^2]   b_y^a [m/s^2]   b_z^a [m/s^2]   λ
iPod 4   -0.0628         0.0185          -0.1756         9.8193
Xsens    0.0566          -0.0159         -0.0200         0.9996

Table 5.1. Comparison of the estimated acceleration bias and scaling factors of the two units. Note that the bias is of roughly the same magnitude except for b_z^a.

Figure 5.4. Bias corrected and scaled accelerometer data from the iPod 4 (solid) and the Xsens MT motion sensor (dashed). The same data set was used in Figure 5.2.

Figure 5.4 displays a bias corrected and λ scaled version of the same accelerometer data used in Figure 5.2. We still see a small difference between the two measurement sets. It is difficult to know if this is hardware related or due to the placement of the units. Note that the accelerometers could behave nonlinearly. This could also contribute to the difference seen in Figure 5.4.

5.2.2 Variance Estimation

Now that we have estimated the acceleration bias we move on to the variance. Since we have already gathered data during M = 14 measurement sessions for each unit as described in the previous section, we use the same data for variance estimation. For each session the variance a_v^i = (a_{v,x}^i, a_{v,y}^i, a_{v,z}^i)^T is calculated as

    a_v^i = (1/N) Σ_{k=1}^{N} (a_k − a_m^i)^2,    (5.8)

using a_m^i from the previous section. Then the mean of all a_v^i is calculated to determine the final variance estimate v^a = (v_x^a, v_y^a, v_z^a)^T according to

    v^a = (1/M) Σ_{i=1}^{M} a_v^i.    (5.9)


Table 5.2 presents the estimated variance for the two units. Note that the variance has been scaled with λ^2.

Unit     v_x^a [m^2/s^4]   v_y^a [m^2/s^4]   v_z^a [m^2/s^4]
iPod 4   3.8435e-04        3.5923e-04        6.1966e-04
Xsens    1.3874e-04        1.3229e-04        3.9424e-04

Table 5.2. Comparison of the estimated acceleration variance of the two units. As with the acceleration bias, both units have variance of similar magnitude. Still, the Xsens MT motion sensor slightly outperforms the iPod 4.
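Equations (5.8)-(5.9) amount to averaging per-session sample variances; a sketch with a hypothetical function name:

```python
import numpy as np

def session_variance(sessions):
    """Per-axis noise variance, eqs (5.8)-(5.9).

    sessions: list of (N_i x 3) arrays of accelerometer samples at rest.
    Each session's per-axis variance (5.8) is computed about its own mean,
    then the session variances are averaged (5.9)."""
    per_session = [((a - a.mean(axis=0)) ** 2).mean(axis=0)  # (5.8)
                   for a in sessions]
    return np.mean(per_session, axis=0)                      # (5.9)
```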

5.3 Gyroscope

The same rig as in Figure 5.1 was used to get an initial idea of the gyroscope measurements of the two units. As with the accelerometer we set the iPod sampling frequency at the highest possible setting, i.e. 100 Hz. In this case the actual sampling frequency was estimated at ∼60 Hz and was not constant. The Xsens MT motion sensor was sampled at a constant frequency of 100 Hz. Figure 5.5 shows the raw gyroscope measurements of the two units about the x-axis.

Figure 5.5. Raw gyroscope data about the x-axis using the rig shown in Figure 5.1. iPod 4 (solid), Xsens MT motion sensor (dashed). We see that the two units give very similar output.

5.3.1 Bias And Variance Estimation

When evaluating the gyroscope we decided to estimate the order of magnitude of bias and variance. Figure 5.6 shows the raw gyroscope data about the x-axis of


Figure 5.6. Raw gyroscope data about the x-axis logged when the units were at rest. We see that there is some quantization in the iPod data.

the units for the duration of about a minute. Note that there is some quantization in the iPod data.

A series of experiments were conducted. We used 10 different measurement sessions for each unit. In each session the gyroscope was at rest, producing similar measurements as in Figure 5.6. The duration of each session was about 10 seconds. Figure 5.7 (iPod 4) and Figure 5.8 (Xsens) display the estimated bias and variance.

In Figure 5.7 we see clear variations in magnitude of the bias, especially about the z-axis. The same is true for the Xsens gyroscope, although the variation is not as distinct for a particular axis. We conclude that the bias of both units is subject to changes in magnitude between sessions. The bias of both units does in fact vary over time, but the experiments suggest that the bias is restricted to specific intervals. We chose not to investigate this further and instead focus on a more compact representation of the bias and variance to better compare the units. This was done by calculating the mean of the bias and variance in Figure 5.7


Figure 5.7. Gyro bias and variance for 10 measurement sessions when the iPod 4 was at rest. Note the different magnitude of the bias, especially about the z-axis.

Figure 5.8. Gyro bias and variance for 10 measurement sessions when the Xsens MT motion sensor was at rest. We see the same tendencies as in Figure 5.7, although the variation in bias is not as distinct as for the iPod.
