UKF-SLAM Implementation for the Optical Navigation System of a Lunar Lander



Laura García

School of Electrical Engineering

Thesis submitted for examination for the degree of Master of Science in Technology.

Deutsches Zentrum für Luft- und Raumfahrt (DLR), Germany, 22.08.2017

Thesis supervisor:

D.Sc. Arto Visala

Thesis advisors:

M.Sc. Nikolaus Ammann

Ph.D. Anita Enmark


DISCLAIMER

This project has been funded with support from the European Commission. This publication reflects the views only of the author, and the Commission cannot be held responsible for any use which may be made of the information contained therein.


Author: Laura García

Title: UKF-SLAM Implementation for the Optical Navigation System of a Lunar Lander

Date: 22.08.2017  Language: English  Number of pages: 6+59

Department of Electrical Engineering and Automation

Professorship: AS-84 Automation Technology

Supervisor: D.Sc. Arto Visala

Advisors: M.Sc. Nikolaus Ammann, Ph.D. Anita Enmark

This thesis presents a state estimator for the optical navigation system of an autonomous lunar lander. The proposed estimator addresses the Simultaneous Localization and Mapping (SLAM) problem in combination with the Unscented Kalman Filter. The ultimate goal of the thesis is to verify whether the state estimation can rely on a relative positioning scheme when measurements of the position of the vehicle in world coordinates are not available. Therefore, an image feature tracking routine is proposed as a reference for localization. In order to provide this functionality and guarantee reliable state estimation at any time, the estimator is founded on a monocular SLAM approach together with the Inverse Depth Parametrization. This work describes the UKF-SLAM routine in detail, with the corresponding state transition model and observation model. In the context of SLAM, a novel approach for the initialization of newly observed image features in the state vector and covariance matrix, based on the Unscented Transform, is introduced into the tracking routine. Subsequently, the development method and validation tests are presented. Finally, the results of the simulations and the final conclusions are discussed. The results confirm the capability of the UKF-SLAM scheme as a suitable estimator for the aforementioned navigation system, both in the absence and in the presence of absolute position measurements.

Keywords: Unscented Kalman Filter, Simultaneous Localization and Mapping, Inverse Depth Parametrization, Pinhole Camera Model


Contents

Abstract iii

Contents iv

Nomenclature vi

1 Introduction 1

2 Background 4

2.1 The Unscented Kalman Filter . . . 4

2.2 Simultaneous Localization and Mapping . . . 7

2.2.1 Formulation of the SLAM Problem . . . 7

2.2.2 Solutions to the SLAM Problem . . . 9

2.2.3 Monocular SLAM . . . 10

2.3 The Inverse Depth Parametrization . . . 10

2.3.1 Undelayed Initialization . . . 11

2.3.2 Points at Infinity . . . 11

2.3.3 Landmark Model . . . 12

2.4 Pinhole Camera Model . . . 13

3 The Navigation System 15

3.1 Scenario Description . . . 15

3.2 System Architecture . . . 16

3.3 State Definition . . . 18

3.3.1 Total State . . . 18

3.3.2 Error State . . . 18

3.3.3 Landmark State . . . 19

3.3.4 Error and Landmark State Covariance . . . 21

3.4 Strapdown Algorithm . . . 21

3.5 UKF-SLAM Estimator . . . 22

3.5.1 UKF-SLAM algorithm . . . 23

3.5.2 State Transition Model . . . 26

3.5.3 Observation Model . . . 27

3.5.4 Landmark Initialization . . . 29

3.5.5 Transition Function to Initialize Landmark Parameters . . . . 30


3.5.6 Inverse Depth Initialization . . . 31

3.5.7 The Singular Value Decomposition of the Covariance Matrix . . 32

3.5.8 Landmark Map Management . . . 33

4 Research Method 35

5 Results and Analysis 39

6 Conclusions 55

7 Future Work 57

References 58


Nomenclature

B subscript for the Body coordinate frame.

m subscript for the Moon-Centered, Moon-Fixed (MCMF) coordinate frame.

rm position of the vehicle in the m coordinate frame. rm ∈ ℝ³

r̃m prediction of the position in the m coordinate frame. r̃m ∈ ℝ³

δrm position error contained in the prediction r̃m. δrm ∈ ℝ³

vm velocity of the vehicle in the m coordinate frame. vm ∈ ℝ³

ṽm prediction of the velocity in the m coordinate frame. ṽm ∈ ℝ³

δvm velocity error contained in the prediction ṽm. δvm ∈ ℝ³

qmB attitude of the vehicle represented as a rotation quaternion from the B coordinate frame to the m coordinate frame. qmB = [qw, qx, qy, qz] ∈ ℝ⁴

q̃mB attitude prediction as a rotation quaternion from the B to the m coordinate frame. q̃mB ∈ ℝ⁴

RmB attitude represented as a transformation matrix from the B to the m coordinate frame.

δθmm attitude error contained in the prediction q̃mB, expressed in the axis-angle representation in m coordinates. δθmm ∈ ℝ³

baB linear acceleration bias in the B coordinate frame. baB ∈ ℝ³

b̃aB linear acceleration bias prediction in the B coordinate frame. b̃aB ∈ ℝ³

δbaB linear acceleration bias error. δbaB ∈ ℝ³

bωB angular rate bias in the B coordinate frame. bωB ∈ ℝ³

b̃ωB angular rate bias prediction in the B coordinate frame. b̃ωB ∈ ℝ³

δbωB angular rate bias error. δbωB ∈ ℝ³

x total state vector. x ∈ ℝ¹⁶

x̃ total state vector prediction. x̃ ∈ ℝ¹⁶

xukf−slam state vector of the UKF-SLAM-based estimator. xukf−slam ∈ ℝ^(15+6n), where n is the number of landmarks.

δx error state vector contained in xukf−slam. δx ∈ ℝ¹⁵

l landmark state vector contained in xukf−slam. l ∈ ℝ^(6n), where n is the number of landmarks.

P covariance matrix of the state xukf−slam. P ∈ ℝ^((15+6n)×(15+6n))

Abbreviations

UT Unscented Transform

UKF Unscented Kalman Filter

EKF Extended Kalman Filter

SLAM Simultaneous Localization and Mapping

GRV Gaussian Random Variable

MCMF Moon-Centered, Moon-Fixed


1 Introduction

Several deep-space exploration missions involve autonomous landing on celestial bodies and therefore require robust and versatile autonomous navigation systems. Autonomous navigation in environments outside the Earth's atmosphere requires positioning references other than GPS data. The absence of GPS information increases the complexity of localization schemes and entails the fusion of readings from multiple sensors.

Optical navigation systems are becoming widely used for such missions, since optical measurements are very reliable and can be acquired with short delay. For landing procedures, terrain-based optical navigation can be performed.

The German Aerospace Center (DLR) is developing technology for optical navigation systems within the framework of the ATON (Autonomous Terrain-based Optical Navigation) project. The underlying mission for the development and testing of the algorithms is based on a simulation of a Moon landing [1].

The ATON state estimator, described in [1], is founded on the Unscented Kalman Filter (UKF) and fuses sensor measurements of different kinds to compute the total state or total navigation solution. There are two algorithms in charge of preprocessing the raw optical sensor data and providing the vehicle's position in world coordinates: the first consists of a crater detection and matching routine introduced by Maass in [2] and [3], while the second relies on a shadow-pattern matching routine described by Kaufmann et al. in [4]. Both algorithms are subject to mismatches and map gaps; hence, absolute positioning data may not always be available along the descent trajectory. In such a scenario, a third algorithm addressing the Simultaneous Localization And Mapping (SLAM) problem is proposed for relative positioning [1].

The present work proposes a modification to the state estimator in [1] by including the functionality of the third algorithm in the filtering routine, so that relative positioning information is available at any time and is fused with the measurements from the other two algorithms. Consequently, the third algorithm is no longer a side routine called only when the measurements from the other algorithms are lacking; instead, it is executed in every iteration of the state estimation routine. This way, the UKF-based state estimator in [1] is upgraded to a UKF-SLAM estimator.

The UKF-SLAM routine is fed with various sensor measurements, such as absolute


Figure 1: Schematic of the lunar lander, as described in [1]

position, attitude, altitude and image feature coordinates corresponding to 3D objects in the field of view of a ground-facing camera. These feature coordinates are fed to a feature tracking routine in order to provide information on the position of the vehicle relative to the map of objects observed by the camera. The observation models linking state variables and sensor measurements are presented for each type of sensor measurement.

This thesis proposes a monocular or single-camera approach, combined with the Inverse Depth Parametrization, to perform SLAM. This approach overcomes some flaws of stereo vision, such as the increasing uncertainty in range estimation as the distance to the object increases, and the larger baselines required for aircraft operating at altitudes of 100 meters and beyond [14]. Since monocular vision provides only bearing information, the depth to the physical objects observed by the camera is modeled by the Inverse Depth Parametrization as a Gaussian random variable and is included in the state vector and covariance matrix. This parametrization was introduced in [16] and is described in detail in the present thesis.

The thesis goals are listed as follows:

– Assess the expediency of the UKF-SLAM estimator, combined with the Inverse Depth Parametrization, to execute accurate state estimation in the absence of absolute position references, and compare its performance with the UKF-based estimator.

– Determine the limitations and requirements of the image feature tracking routine as reference for positioning.

– Confirm the sufficiency of a single ground-facing camera, jointly with the Inverse Depth Parametrization, for successful image feature tracking.

– Derive suitable state transition and measurement models for the given state variables (position, velocity and attitude of the lander) and the available sensor measurements (absolute position, attitude, altitude and image feature coordinates).

– Propose an empirical initial mean value and deviation of the inverse depth parameter.

– Derive a landmark initialization method to include newly observed landmark information in the state vector and covariance matrix.


2 Background

This chapter presents the theoretical framework of the thesis. It starts by describing the Unscented Kalman Filter and its advantages over the widely used Extended Kalman Filter. Next, the definition of and solutions to the SLAM problem are presented, followed by a short section on monocular SLAM. The chapter continues by introducing the Inverse Depth Parametrization and its advantages for landmark modeling, and concludes with a brief overview of the pinhole camera model, used to emulate the projection of landmarks onto the image plane as performed by a real camera.

2.1 The Unscented Kalman Filter

The Unscented Kalman Filter is a variation of the standard Kalman Filter, based on the Unscented Transform, for non-linear estimation.

The Kalman Filter is a two-step recursive estimator, which uses the estimated state from the previous time step and the current measurement to compute the estimate for the current state. The filter comprises a predict phase and an update phase. The predict phase derives the estimate of the current state, also known as the a priori state estimate, from the estimate at the previous time step. In the update step, the a priori prediction is combined with current measurements to refine the state estimate. The result is the a posteriori state estimate [24].

The Kalman Filter approximates the state distribution by a Gaussian random variable and propagates it through the system dynamics. For non-linear dynamics, the most common version of the filter is the Extended Kalman Filter. The EKF propagates the state variables analytically through a first-order linearization of the nonlinear system dynamics, an operation that can introduce large errors in the true posterior mean and covariance of the transformed GRV. Such errors may degrade the performance and lead to filter divergence [5].

The Unscented Kalman Filter (UKF), proposed by Julier and Uhlmann [6], addresses this problem by introducing a deterministic sampling approach. The state distribution, also approximated by a GRV, is represented by a set of carefully chosen sample points. These points accurately capture the true mean and covariance of the distribution and, when propagated through the true nonlinear system, capture the posterior mean and covariance accurately to the 3rd order (Taylor series expansion) for nonlinear transformations. The EKF, in contrast, only achieves first-order accuracy. Moreover, the computational complexity of both filter schemes is the same [5].

The Unscented Transform is an algorithm for computing the statistics of a random variable propagated through a nonlinear transformation [6]. From [5], the steps of the Unscented Transform are presented as follows. Consider propagating a random variable x, of dimension L, through a nonlinear function, y = g(x). Assume x has mean x̃ and covariance Px. To compute the statistics of y, a matrix X of 2L + 1 sigma vectors Xi is obtained (with corresponding weights Wi), according to the following equations [5]:

X0 = x̃

Xi = x̃ + (√((L + λ)Px))i,  i ∈ [1, L]

Xi+L = x̃ − (√((L + λ)Px))i,  i ∈ [1, L]

Ws0 = λ/(L + λ)

Wc0 = λ/(L + λ) + (1 − α² + β)

Wsi = Wci = 1/(2(L + λ))   (1)

where λ = α²(L + κ) − L is a scaling parameter. α determines the spread of the sigma points around x̃ and is usually set to a small positive value (e.g., 1e-3). κ is a secondary scaling parameter which is usually set to 0, and β is used to incorporate prior knowledge of the distribution of x; for Gaussian distributions, β = 2 is optimal. (√((L + λ)Px))i is the i-th row of the matrix square root [5]. These sigma vectors are propagated through the nonlinear function,

Yi = g(Xi),  i ∈ [0, 2L]   (2)

and the mean and covariance for y are approximated using a weighted sample mean and covariance of the posterior sigma points [5],

ỹ = Σ(i=0..2L) Wsi Yi   (3)

Py = Σ(i=0..2L) Wci (Yi − ỹ)(Yi − ỹ)ᵀ   (4)

The Extended Kalman Filter (EKF) has become a standard technique in a number of non-linear estimation and machine learning applications. These include estimating the state of a non-linear dynamic system, estimating parameters for non-linear system identification (e.g., learning the weights of a neural network), and dual estimation (e.g., the Expectation Maximization (EM) algorithm), where both states and parameters are estimated simultaneously [5].

This method differs from other Bayesian estimators like the Sequential Monte Carlo method (particle filter), which requires a significantly higher number of sample points in an attempt to propagate an accurate (possibly non-Gaussian) distribution of the state [5]. As a result of handling more samples, Monte Carlo sampling implies an additional computational effort. Figure 2 shows an example for a 2-dimensional system: the left plot shows the true mean and covariance propagation using Monte Carlo sampling; the central plots show the results using a linearization approach as done by the EKF; the right plots show the performance of the UT (note that only 5 sigma points are required).

Figure 2: Example of the UT for mean and covariance propagation. From left to right: Monte-Carlo sampling, first-order linearization (EKF), and UT [5].

The Unscented Kalman Filter (UKF) performs the UT recursively to estimate the state vector and covariance matrix. The state vector is redefined as the concatenation of the original state and the mean of the system and measurement noise variables.

Similarly, the covariance matrix is redefined as the concatenation of the original covariance matrix and the system and measurement noise covariance [5]. The resulting augmented state vector and covariance are shown as follows:

xak = [xkᵀ, wkᵀ, vkᵀ]ᵀ,  Pak = [[Pk, 0, 0], [0, Qk, 0], [0, 0, Rk]]   (5)

where wk is the mean of the system noise and vk is the mean of the measurement noise. The UT sigma point generation scheme in equation 1 is applied to this new


augmented state to calculate the corresponding sigma matrix, Xak. These sigma points are then propagated, in the predict phase, through the state transition model. The resulting transformed sigma points are later propagated through the observation model in the update phase. The UKF equations are presented later in section 3.5.1. Note that no explicit calculation of Jacobians or Hessians is necessary to implement this algorithm. Furthermore, the overall number of computations is of the same order as for the EKF.

2.2 Simultaneous Localization and Mapping

The Simultaneous Localization and Mapping (SLAM) problem is concerned with the task of building a map of an unknown environment while, at the same time, navigating the environment using the map [12]. SLAM is widely used in autonomous vehicle applications.

In this thesis, the unknown environment is the lunar surface and is regarded as static, which means that the physical elements or landmarks used to build the map are non-moving objects. The landmarks are observed by the camera and used as a reference for localization and state estimation. They can be any scenery feature likely to be re-observed multiple times, such as stars (visible in the first phase of the descent), craters and shadow patterns.

In SLAM, both the trajectory of the vehicle and the location of all landmarks are estimated on-line without the need for any a priori knowledge of location [13].

As illustrated in [13], consider a mobile vehicle moving through an environment, taking relative observations of a number of unknown landmarks using a sensor located on the vehicle. This is shown in figure 3. At a time instant k, the following quantities are defined:

• xk: The state vector describing the location and orientation of the vehicle.

• uk: The control vector, applied at time k − 1 to drive the vehicle to a state xk at time k.

• mi: A vector describing the location of the i-th landmark, whose true location is assumed time invariant.

• zik: An observation taken from the vehicle of the location of the i-th landmark at time k.

2.2.1 Formulation of the SLAM Problem

In probabilistic form, the Simultaneous Localization and Mapping problem requires that the probability distribution

P (xk, m|Z0:k, U0:k, x0) (6)


Figure 3: The foundation of the SLAM problem. A simultaneous estimate of both vehicle and landmark locations is required. The true locations are never known or measured directly. Observations are made between true vehicle and landmark locations [13]

be computed for all times k. This probability distribution represents the joint posterior density of the landmark locations and vehicle state, at time k, given the recorded observations and control inputs up to and including time k, together with the initial state of the vehicle. In general, a recursive solution to the SLAM problem is desirable [13].

The joint posterior requires that a state transition model and an observation model be defined, describing the effect of the control input and observation, respectively.

The state transition model or vehicle dynamics can be described in terms of a probability distribution on state transitions in the form:

P (xk|xk−1, uk) (7)

That is, the state transition is assumed to be a Markov process in which the next state xk depends only on the previous state xk−1 and the applied control uk, and is independent of both the observations and the map.

The observation model describes the probability of making an observation zk when the vehicle location and landmark locations are known, and is generally described in the form:

P (zk|xk, m) (8)

It is reasonable to assume that once the vehicle location and map are defined, observations are conditionally independent given the map and the current vehicle state [13].


The SLAM algorithm is implemented in a standard two-step recursive form, with a prediction (time-update) and a correction (measurement-update):

A time-update,

P(xk, m | Z0:k−1, U0:k, x0) = ∫ P(xk | xk−1, uk) · P(xk−1, m | Z0:k−1, U0:k−1, x0) dxk−1   (9)

and a measurement-update as follows,

P(xk, m | Z0:k, U0:k, x0) = P(zk | xk, m) · P(xk, m | Z0:k−1, U0:k, x0) / P(zk | Z0:k−1, U0:k)   (10)

Equations 9 and 10 provide a recursive procedure for calculating the joint posterior P(xk, m | Z0:k, U0:k, x0) for the vehicle state xk and map m at a time k, based on all observations Z0:k and all control inputs U0:k up to and including time k. The recursion is a function of a vehicle model P(xk | xk−1, uk) and an observation model P(zk | xk, m).
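To make the structure of this recursion concrete, here is a minimal Python sketch (not from the thesis) for a discretized 1-D state, where the hypothetical `transition` and `likelihood` arrays stand in for P(xk | xk−1, uk) and P(zk | xk, m):

```python
import numpy as np

def time_update(belief, transition):
    # Equation (9): integrate (here: sum) the motion model against the prior.
    return transition.T @ belief

def measurement_update(predicted, likelihood):
    # Equation (10): weight by the observation model and renormalize.
    posterior = likelihood * predicted
    return posterior / posterior.sum()

belief = np.array([0.25, 0.50, 0.25])        # P(x_{k-1} | Z_{0:k-1})
transition = np.array([[0.8, 0.2, 0.0],      # transition[i, j] = P(j | i)
                       [0.0, 0.8, 0.2],
                       [0.0, 0.0, 1.0]])
likelihood = np.array([0.1, 0.7, 0.2])       # P(z_k | x_k)
belief = measurement_update(time_update(belief, transition), likelihood)
```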

2.2.2 Solutions to the SLAM Problem

A solution to the SLAM problem involves finding an appropriate representation for the observation model in equation 8 and state transition model in equation 7, in order to compute the prior and posterior distributions in equations 9 and 10.

By far the most common representation is in the form of a state-space model with additive Gaussian noise, leading to the use of the Extended Kalman Filter (EKF) to solve the SLAM problem [13]. On the basis of the advantages of the Unscented Kalman Filter over the Extended Kalman Filter, the present thesis uses the UKF instead.

Based on the vehicle model description for the EKF-SLAM in [13], the state transition for the UKF-SLAM scheme can be described in the form:

P(xk | xk−1, uk) ↔ xk = f(xk−1, uk) + wk   (11)

where f models the vehicle kinematics and wk is additive, zero-mean, uncorrelated Gaussian motion noise with covariance Qk. The observation model is described in the form:

P(zk | xk, m) ↔ zk = h(xk, m) + vk   (12)

where h describes the geometry of the observation and vk is additive, zero-mean, uncorrelated Gaussian observation noise with covariance Rk.

With these definitions, the standard UKF method in [6] can be applied to compute the mean

[x̂k|k, m̂k]ᵀ = E[[xk, m]ᵀ | Z0:k]   (13)

and covariance

Pk|k = [[Pxx, Pxm], [Pxmᵀ, Pmm]]k|k = E[(xk − x̂k, m − m̂k)ᵀ (xk − x̂k, m − m̂k) | Z0:k]   (14)

2.2.3 Monocular SLAM

As mentioned before, SLAM addresses the problem of estimating landmark positions given observations of the unknown environment and control inputs. A vision-based SLAM approach performs observations by means of one or more cameras. The first case is also known as the monocular approach and the second as the stereo approach.

Stereo cameras provide bearing and range measurements, allowing the computation of the 3D position of an observed landmark. However, the 3D range of a stereo camera is limited, and the uncertainty of the estimated landmark position increases with its distance from the camera [14]. While this rising uncertainty can be taken into account during the estimation process, depending on the camera parameters (base length, focal length, pixel size), the range measurement of a stereo camera bears no viable information for landmarks located at a certain distance and further away. The maximum observable range may be sufficiently large for ground-operating robots with acceptable baseline lengths (around 10 cm). Flying robots, however, especially airships that operate at altitudes of 100 meters and beyond, need larger baselines [14].

Due to these flaws of stereo approaches, several authors have addressed the problem with a single camera. Since one camera cannot provide range measurements, the resulting algorithms are often called bearing-only SLAM or simply monocular SLAM. This approach can be implemented in combination with the Inverse Depth Parametrization, a landmark model that encodes the depth to the landmark by its inverse, as a Gaussian distributed variable.

2.3 The Inverse Depth Parametrization

The Inverse Depth Parametrization is a unified and straightforward parametrization for image feature locations that can handle both the initialization and the standard tracking of close as well as very distant features within the standard EKF framework, and thus within the UKF framework too [16].

Parametrizing the inverse depth of an image feature along the semi-infinite ray, from the position at which it was first observed to the true 3D object, allows a Gaussian distribution to cover the high uncertainty in depth, spanning a range from the vicinity of the object to infinity [16].

An important advantage of this parametrization is that it does not require any additional algorithm to initialize the landmark upon feature observation. Landmarks are included right away in the state vector and thus immediately contribute to the state estimates. The new features are also promptly correlated with all other features in the map.

The angle formed by two rays connecting the 3D landmark and the camera's optical center from different viewpoints or camera positions is addressed as the feature's parallax. With a single camera, the depth to the landmark can only be estimated by re-observing the landmark at different parallax angles. Another advantage of the inverse depth parametrization is its highly linear behavior at low parallax angles, which prevail in the first observations of a feature.

The only drawback of the inverse depth scheme, in contrast to other parametrizations, is the computational cost of storing and handling six parameters per landmark, which significantly enlarges the state vector and covariance matrix.

2.3.1 Undelayed Initialization

The common approach to cope with feature initialization within the monocular SLAM framework consists of handling newly observed features separately from the main map, compiling and processing information gathered over several frames in order to reduce depth uncertainty before incorporating the features into the estimation routine. Such delayed initialization schemes, e.g. [20] and [21], have the downside that new features, held outside the main probabilistic state, cannot contribute to the estimation of the camera position until finally included in the map. Further, features that retain low parallax over many frames, such as those very far from the camera or close to the motion epipole, are usually rejected completely because they never pass the test for inclusion [16].

Recently, several undelayed initialization schemes have been proposed, which still treat new features in a special way but allow them to influence the state estimation immediately. The premise of these schemes is that features with highly uncertain depths are extremely useful as bearing references for orientation estimation, even though they provide little information on camera motion. An example of such an approach is found in [22]. The inverse depth scheme is also an undelayed landmark initialization approach; however, it models the inverse depth parameter as a Gaussian random variable initialized heuristically to cover a wide range from an arbitrary minimum depth to infinity.

2.3.2 Points at Infinity

Another motivation to use the inverse depth parametrization is the possibility of handling features at all depths, including points at infinity. This makes it possible to use stars as landmarks during the first phases of the descent. Features at infinity exhibit no parallax during camera motion due to their extremely large depth. A star, for instance, is always observed at the same image location despite the displacement of the camera.

Such a feature does not provide any information on the camera translation, but is a perfect bearing reference for estimating rotation. The homogeneous coordinate systems of visual projective geometry allow the explicit representation of points at infinity, and they have proven to play an important role during off-line structure and motion estimation [16].

2.3.3 Landmark Model

The UKF-SLAM estimator builds a map of physical elements (stars, terrain features, shadow patterns, etc.) observed by the ground-facing camera along the landing trajectory. These elements are addressed as landmarks, since they represent a reference for relative positioning. The landmarks are perceived by the camera as 2D image features and can be represented as 3D elements by means of a parametrization, which is addressed in this section. The 3D representation of a landmark is simply the 3D Cartesian position of the physical element observed by the camera. Therefore, every 2D image feature from the camera can be associated with a 3D landmark. Landmark information is included in the state vector and covariance matrix of the estimator, to keep track of the landmark and vehicle positions simultaneously.

The 3D position of a landmark can be expressed in terms of its Cartesian coordinates, also called the Euclidean parametrization in this project,

l = (X, Y, Z)T (15)

Furthermore, the inverse depth parametrization describes the 3D landmark in terms of six parameters as follows:

l = (x, y, z, θ, φ, ρ)T (16)

where:

- x, y, z are the coordinates of the camera position at which the image feature was observed for the first time, i.e., the camera's optical center.

- θ and φ are, respectively, the azimuth and elevation angles (coded in the world frame) of the image feature when it was seen for the first time. These angles define a unit directional vector m(θ, φ) parallel to the ray connecting the camera and the 3D landmark. Refer to Figure 4.

- ρ = 1/d is the inverse of the depth d from the camera's optical center to the landmark.

The 3D position of the landmark can be retrieved from these six parameters as follows:

(X, Y, Z)ᵀ = (x, y, z)ᵀ + (1/ρ) m(θ, φ)   (17)


Figure 4: Illustration of the inverse depth parametrization for 3D landmark modeling. The world and camera coordinate frames are denoted as W and C respectively. The camera frame origin has a position rWC and an attitude qWC with respect to the world frame. The position of the camera at which the landmark was first observed is denoted as (xi, yi, zi), the depth to the landmark as di and the directional vector as m. The parallax angle α is formed by the ray from (xi, yi, zi) to the landmark and the ray from the current position of the camera to the landmark. hXYZ and hρ are the observation vectors based on the Euclidean and the inverse depth parametrization, respectively. This image was taken from [16].

where the directional vector m is retrieved from the azimuth and elevation angles as follows,

m = (cos φ sin θ, −sin φ, cos φ cos θ)ᵀ   (18)
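A minimal Python sketch of this back-projection (equations (17)-(18); not part of the thesis code) could look as follows:

```python
import numpy as np

def inverse_depth_to_cartesian(l):
    """Recover the 3D landmark position from the six inverse-depth
    parameters l = (x, y, z, theta, phi, rho), per equations (17)-(18)."""
    x, y, z, theta, phi, rho = l
    # Unit directional vector m(theta, phi), equation (18).
    m = np.array([np.cos(phi) * np.sin(theta),
                  -np.sin(phi),
                  np.cos(phi) * np.cos(theta)])
    return np.array([x, y, z]) + m / rho

# Example: a feature first seen from the origin, 50 m away (rho = 1/50).
landmark = inverse_depth_to_cartesian((0.0, 0.0, 0.0, 0.1, -0.2, 0.02))
```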

2.4 Pinhole Camera Model

The pinhole camera model establishes the geometric relationship between a 3D point and its corresponding 2D projection onto the image plane. This geometric mapping from 3D to 2D is called a perspective projection. The center of the projection, the point in which all the rays intersect, is addressed as the optical center, and the line perpendicular to the image plane passing through the optical center is called the optical axis; see figure 5.

The pinhole camera model was used in this thesis to emulate the projection of the physical objects, observed by the ground-facing camera, onto the image plane as a real camera does. The input is the 3D position of a world point and the output is the pixel coordinates of the projection.

For a camera with the optical axis being collinear to the Z-axis and the optical center being located at the origin of a 3D coordinate system, the projection of a 3D point (X, Y, Z)T, in world coordinates, onto the image plane at pixel position (u, v)T


Figure 5: The ideal pinhole camera model describes the relationship between a 3D point (X,Y,Z)T and its corresponding 2D projection (u,v) onto the image plane [25]

can be written as:

u = f · X/Z   (19)

v = f · Y/Z   (20)

where f denotes the focal length. To avoid such a non-linear division operation, the previous relations can be reformulated using the projective geometry framework, as:

(λu, λv, λ)ᵀ = (fX, fY, Z)ᵀ   (21)

This relation can be then expressed in matrix notation by:

λ (u, v, 1)ᵀ = [[f, 0, 0, 0], [0, f, 0, 0], [0, 0, 1, 0]] (X, Y, Z, 1)ᵀ   (22)

where λ = Z is the homogeneous scaling factor.
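A short Python sketch of equation (22) (illustrative only; the point and focal length values are arbitrary):

```python
import numpy as np

def project_pinhole(p, f):
    """Project a 3D point (X, Y, Z) onto the image plane with the ideal
    pinhole model of equation (22): camera at the origin, optical axis
    along Z, focal length f."""
    K = np.array([[f, 0.0, 0.0, 0.0],
                  [0.0, f, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])
    uvw = K @ np.append(p, 1.0)      # homogeneous projection
    return uvw[:2] / uvw[2]          # (u, v) = (f*X/Z, f*Y/Z)

uv = project_pinhole(np.array([2.0, -1.0, 10.0]), f=0.05)
```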


3 The Navigation System

This chapter opens with a review of the operational scenario and the architecture of the navigation system proposed in [1] for the ATON project. The landing phases and limitations of this scenario were taken into account to design the UKF-SLAM estimator; furthermore, the ATON architecture was adapted to serve as the framework to develop and test the estimator. The chapter continues with the state definition, a review of the Strapdown navigation algorithm presented in [1], and a detailed description of the UKF-SLAM estimator. A novel method for initializing newly observed features in the state vector and covariance matrix is presented. This chapter also describes the state transition and observation models.

3.1 Scenario Description

The operational scenario presented in [1] is assumed as the scenario of the navigation system for which the UKF-SLAM estimator is designed. The characterization of the scenario is important to identify the limitations and requirements along the landing trajectory that must be fulfilled by the estimator.

The landing trajectory, from the descent orbit insertion maneuver down to the touchdown at the landing site, can be divided into four phases [1]:

An initial phase characterized by flying through the shadow of the Moon. In the absence of sunlight, the images from the camera cannot provide any information.

Therefore, the state estimation is computed only from inertial measurements and attitude from the star tracking camera.

The second phase is characterized by the presence of recognizable craters in the camera images, which are matched against an onboard catalogue. Based on the crater positions, the absolute position of the vehicle in MCMF coordinates can be estimated and used to update the vehicle state estimate.

The number of recognizable craters decreases as altitude is lost; hence, crater-matching-based absolute positioning becomes more and more difficult to achieve with accuracy. As a result, the third phase is characterized by the need for relative positioning, performed by the monocular SLAM algorithm fed with arbitrary image features in the field of view of a ground-facing camera.


The last phase is reached when the landing site becomes visible. During this stage, a shadow pattern matching algorithm is used for absolute positioning to estimate the navigation state of the vehicle. The algorithm matches the shadow information extracted from images against the predicted shadow pattern from Digital Elevation Models of the landing site. Additionally, the laser altimeter is given more weight in the fusion of the sensor data.

The advantage of a UKF-SLAM-based estimator is that the landmark or feature tracking routine can be performed at any time, as long as valid image features are available in the field of view of the camera. Hence, the resulting landmark map becomes a suitable reference for localization with and without other absolute position references. This compensates for the lack of positioning references in phase three and constitutes a redundant reference during phases two and four.

The next section reviews the architecture of the navigation filter presented in [1].

This architecture is assumed as the framework in which the UKF-SLAM operates.

3.2 System Architecture

The architecture in [1] comprises two parallel execution paths running at different rates. One is characterized by a Strapdown navigation routine that predicts the next vehicle state based on inertial measurements and the current state estimate; the other features a UKF estimator that computes the error in the vehicle state prediction and in the inertial measurement bias estimate. The final navigation solution is obtained by combining the output of both paths, the state prediction and its error.

This architecture inspired the implementation in this work, which consists of a single execution path with two routines: the Strapdown and the UKF-SLAM.

The routines are executed at a rate of 1 to 100: the UKF-SLAM estimator is updated once every 100 updates of the Strapdown routine. As the Strapdown prediction, updated every 0.01 seconds, drifts away from the true vehicle state over time, the error is suppressed every second by updating the UKF-SLAM estimator. The flowchart in figure 6 illustrates the execution path or workflow of the implementation.

The positive integer i is initialized to 1 and grows to 100, the execution rate. In each iteration, new sensor measurements are retrieved, the error compensation module is executed and the Strapdown prediction is updated. When i = 100, i is reset to 1, the UKF-SLAM is executed and the final navigation solution, denoted as x, combines the output from the two routines. For 1 ≤ i < 100, i is increased by one and the total navigation solution is simply the Strapdown prediction.
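Schematically, this execution path could be written as follows (a sketch only; `read_imu`, `strapdown`, `ukf_slam` and the other helpers are placeholders for the modules described here, not the actual DLR implementation):

```python
def navigation_loop(x, bias, steps, dt=0.01, rate=100):
    """One strapdown prediction per IMU sample (every 0.01 s); one
    UKF-SLAM update per `rate` predictions (every second)."""
    i = 1
    for _ in range(steps):
        a_raw, w_raw = read_imu()                    # raw inertial data
        a_B, w_B = a_raw + bias.a, w_raw + bias.w    # error compensation, eqs. 23-24
        x_pred = strapdown(x, a_B, w_B, dt)          # prediction
        if i == rate:
            dx, bias = ukf_slam(x_pred, read_sensors())
            x = combine(x_pred, dx)                  # total navigation solution
            i = 1
        else:
            x = x_pred                               # solution is the prediction
            i += 1
    return x
```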

Figure 7 depicts the different dependencies between modules and the inputs to the navigation filter. The inputs are divided into two types: raw inertial unit measurements, such as vehicle’s linear acceleration ˜aB and vehicle’s angular rate ˜ωB, and a set of various sensor measurements with mean h and standard deviation σh. The sensor measurements available are absolute vehicle position, attitude, altitude and image feature coordinates.

The system architecture includes an Error Compensation module, described in


Figure 6: Flowchart of the algorithm used to test the UKF-SLAM routine.

Figure 7: Block diagram of the system architecture, inspired by the architecture of the navigation filter in [1]

[1], which is triggered by new inertial measurements. In this module, the incoming raw inertial sensor measurements ˜aB and ˜ωB are corrected by adding to them the respective sensor bias estimate, baB and bωB, computed by the UKF-SLAM routine.

As a result, the corrected inertial sensor measurements are defined in equations 23 and 24. The corrected measurements are fed to the Strapdown algorithm to predict the next navigation solution. The Strapdown algorithm is described in detail in section 3.4.

aB = ˜aB+ baB (23)

ωB = ˜ωB+ bωB (24)

3.3 State Definition

The state space of this thesis comprises the definitions in [1] for the total state, the error state and the error state covariance, and includes new definitions for the landmark state and landmark error covariance. The error state and landmark state are merged into one state vector, updated by the estimator. Similarly, the error state covariance and landmark state covariance are combined into one single covariance matrix. Thus, x denotes the total state or total navigation solution, δx denotes the error state, l stands for the landmark state, and P for the covariance matrix combining both the error state covariance and the landmark state covariance. The ultimate output of the navigation system is the total state x. These definitions are explained in detail as follows.

3.3.1 Total State

This is the most important output of the navigation system since it constitutes the state of the vehicle. As defined in [1], it consists of the absolute position rm, velocity vm and attitude qmB of the vehicle (refer to equation 25). Additionally, the bias in the inertial measurements is included in the state, where baB and bwB stand for the bias in acceleration and angular rate respectively; both expressed in body frame.

An update of this state is triggered by a new inertial measurement and is computed by the Strapdown algorithm. This occurs every 0.01 seconds.

x = (rm, vm, qmB, baB, bωB)ᵀ ∈ ℝ¹⁶   (25)

3.3.2 Error State

Since the Strapdown prediction of the total state inherits errors from the imperfect inertial measurements (sensors have noise and bias), it has to be corrected to remain reliable. Therefore, an error state δx is defined in equation 26. As mentioned in [1], it comprises the absolute position error δrm, the absolute velocity error δvm and the attitude error δθmm. It also includes the errors in the acceleration and angular rate bias estimates, δbaB and δbwB respectively.


δx = (δrm, δvm, δθmm, δbaB, δbwB)ᵀ ∈ ℝ¹⁵   (26)

The error state is continuously updated by the UKF-SLAM estimator, which fuses all available measurements to keep the variance of the error state as small as possible. This update occurs every second.

The Kalman Filter equations and the Unscented Transformation assume the state variable components to be uncorrelated [1]. Hence, the error state variables must comply with this condition. Since the norm of a rotation quaternion q is equal to one, i.e., ‖q‖ = 1, a different representation is needed to express the attitude error δqmm in the error state.

As a result, the attitude error δθmm is represented in the axis-angle notation [1],

δθmm = e · θ,  e ∈ ℝ³,  eᵀ · e = 1   (27)

where e is the axis of rotation and θ the rotation angle. The three components of the vector δθmm are uncorrelated.

The transformation of the attitude error from the axis-angle representation δθmm to a rotation quaternion δqmm is done by means of equation 28.

δqmm = [cos(θ/2), e · sin(θ/2)]ᵀ   (28)

Finally, the total navigation solution x is obtained by combining the prediction from the Strapdown algorithm and the error state. As the error state is updated every second while the Strapdown output is computed every 0.01 seconds, the total state is simply the Strapdown prediction whenever an error state update is not available.

x = (r̃m + δrm, ṽm + δvm, δqmm ⊗ q̃mB, b̃aB + δbaB, b̃wB + δbwB)ᵀ   (29)
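As an illustration of equations (28) and (29), a Python sketch of this combination step (not from the thesis; the state layouts follow equations (25) and (26)):

```python
import numpy as np

def quat_mult(a, b):
    """Hamilton product a ⊗ b of quaternions stored as [qw, qx, qy, qz]."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([aw*bw - ax*bx - ay*by - az*bz,
                     aw*bx + ax*bw + ay*bz - az*by,
                     aw*by - ax*bz + ay*bw + az*bx,
                     aw*bz + ax*by - ay*bx + az*bw])

def combine(x_pred, dx):
    """Apply the estimated error state to the strapdown prediction."""
    r, v, q = x_pred[:3], x_pred[3:6], x_pred[6:10]
    ba, bw = x_pred[10:13], x_pred[13:16]
    dr, dv, dth = dx[:3], dx[3:6], dx[6:9]
    dba, dbw = dx[9:12], dx[12:15]
    # Axis-angle attitude error -> error quaternion, equation (28).
    angle = np.linalg.norm(dth)
    axis = dth / angle if angle > 1e-12 else np.zeros(3)
    dq = np.concatenate(([np.cos(angle / 2)], axis * np.sin(angle / 2)))
    # Equation (29): additive corrections plus quaternion composition.
    return np.concatenate((r + dr, v + dv, quat_mult(dq, q),
                           ba + dba, bw + dbw))
```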

3.3.3 Landmark State

The UKF-SLAM estimator builds a map of 3D landmarks corresponding to the image features retrieved by the ground-facing camera along the landing trajectory.

This map serves as a complementary or unique reference for localization depending on whether the system has absolute position information available. A landmark state vector and covariance matrix are defined to store the information of the landmarks


corresponding to the image features currently in the field of view. Both the landmark state vector and covariance are updated by the estimator. The information in the landmark state vector is saved to the map to keep track of all the landmarks observed along the way to the landing site.

Image features appear in and disappear from the field of view of the camera as the vehicle moves; therefore, the landmark state is a vector of variable length, comprising only the landmarks that correspond to the features currently visible in the field of view. Similarly, the landmark state covariance matrix varies in size depending on the number of features in the field of view.

In this project, landmarks are modeled by two different parametrizations: the Euclidean and the Inverse Depth, described in section 2.3.3. Ultimately, the Inverse Depth Parametrization was preferred over the Euclidean for landmark modeling; however, the latter was useful in an early development stage to validate and test the observation model.

The Inverse Depth Parametrization describes the 3D landmark position by six parameters: the camera position, denoted by x, y, z, at which the corresponding image feature was observed for the first time; the azimuth and elevation angles in the world frame, θ and φ, of the image feature when it was observed for the first time; and the depth to the landmark encoded by its inverse, ρ = 1/d. The landmark state for a single landmark i is given in equation 30.

li = (xi, yi, zi, θi, φi, ρi)ᵀ   (30)

The 3D Cartesian position of the landmark can be retrieved from these parameters as follows:

(Xi, Yi, Zi)ᵀ = (xi, yi, zi)ᵀ + (1/ρi) m(θi, φi)   (31)

m = (cos φi sin θi, −sin φi, cos φi cos θi)ᵀ   (32)

The landmark state vector and covariance matrix are updated only if the landmark tracking functionality is enabled; otherwise only the error state vector and covariance are updated in an iteration of the estimator. The resulting state vector of the UKF-SLAM estimator, when landmark tracking is active, is presented in equation 33.

xukf−slam = [δxᵀ, l1ᵀ, l2ᵀ, l3ᵀ, ..., lnᵀ]ᵀ ∈ ℝ^(15+6n)   (33)

where n denotes the number of landmarks in the state vector and each li is represented by the Inverse Depth Parametrization.
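For illustration, stacking the state of equation (33) in Python (a sketch, not the thesis implementation):

```python
import numpy as np

def assemble_state(dx, landmarks):
    """Stack the 15-D error state and n six-parameter inverse-depth
    landmarks into the UKF-SLAM state vector of equation (33)."""
    return np.concatenate([dx] + [np.asarray(l, dtype=float) for l in landmarks])

dx = np.zeros(15)
lms = [(0.0, 0.0, 0.0, 0.1, -0.2, 0.02)]      # one hypothetical landmark
x_ukf_slam = assemble_state(dx, lms)          # shape: (15 + 6*1,)
```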


3.3.4 Error and Landmark State Covariance

The covariance matrix P comprises the variances and covariances of the variables in the UKF-SLAM state vector in equation 33. This matrix represents the uncertainty in the error state and in the landmark state; therefore, it tracks the uncertainty of the navigation solution.

P ∈ ℝ^((15+6n)×(15+6n))   (34)

where n is the number of landmarks in the landmark state.

3.4 Strapdown Algorithm

This section reviews the Strapdown algorithm proposed in [1]. It computes the prediction of the next navigation solution, ˜x(t + ∆t), based on the current total navigation solution x(t) and the last (corrected) inertial measurements aB and ωB, where ∆t is the time step between two inertial measurements.

The predicted total state x̃ consists of the predicted position r̃m, the predicted velocity ṽm, and the predicted attitude q̃mB. The sensor bias prediction is equal to the current bias estimate.

The gravitational acceleration is modeled in accordance with the ellipsoid model of the Moon. The gravitational vector gm is derived from Newton’s law of gravitation and depends on the current position rm. Refer to equation 35.

gm(rm) = −rm · µ / ‖rm‖³   (35)

where µ = G · M is the standard gravitational parameter of the Moon, with the gravitational constant G = 6.674 · 10⁻¹¹ N m² kg⁻² and the Moon's mass M = 7.353 · 10²² kg.

The kinematic equations from [1] used to describe the vehicle dynamics are presented in equation 38. Since these equations are first-order approximations and the inertial measurements are obtained at a high rate (to minimize the integration error), the Euler integration method can be used to obtain the next estimate x̃(t + ∆t).

x̃(t + ∆t) = x(t) + ẋ(t) · ∆t   (36)

where the derivative of the total state ẋ is presented in equation 38.

ẋ(t) = f(x(t), [aB, ωB, ωm](t))   (37)

ẋ(t) = (ṙm, v̇m, q̇mB)ᵀ = (vm, RmB aB + gm(rm) − 2 · ωm × vm, RmB ΩB − Ωm RmB)ᵀ   (38)


where RmB ∈ ℝ^(3×3) is the matrix representation of the rotation quaternion qmB, ωm is the angular rate of the Moon, and Ω is the skew-symmetric matrix representing an angular rate vector ω:

Ω = [[0, −ωz, ωy], [ωz, 0, −ωx], [−ωy, ωx, 0]] ∈ ℝ^(3×3)   (39)

The term 2 · ωm× vm, in equation 38, is called Coriolis acceleration and ΩmRmB is the change in attitude due to the angular rate of the Moon [1].
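A Python sketch of one Euler integration step of equations (35)-(39) (illustrative only; the attitude is carried here as the rotation matrix RmB rather than the quaternion, following the matrix kinematics in equation (38)):

```python
import numpy as np

MU_MOON = 6.674e-11 * 7.353e22   # standard gravitational parameter [m^3 s^-2]

def skew(w):
    """Skew-symmetric matrix Ω of an angular-rate vector, equation (39)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def strapdown_step(r, v, R, a_B, w_B, w_moon, dt=0.01):
    """One Euler step of equations (36)-(38) for position, velocity and
    attitude, given bias-compensated inertial measurements a_B, w_B."""
    g = -r * MU_MOON / np.linalg.norm(r)**3              # equation (35)
    r_next = r + v * dt
    v_next = v + (R @ a_B + g - 2.0 * np.cross(w_moon, v)) * dt
    R_next = R + (R @ skew(w_B) - skew(w_moon) @ R) * dt
    return r_next, v_next, R_next
```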

3.5 UKF-SLAM Estimator

The UKF-SLAM estimator is a routine that addresses the Simultaneous Localization and Mapping problem jointly with the Unscented Kalman Filter. In this thesis,

the estimator updates the error state, the landmark state and the corresponding covariance; moreover, the landmark tracking functionality can be enabled and disabled anytime. If disabled, the landmark state vector and covariance are not updated.

As demonstrated in [1], a UKF estimator is sufficient to compute the error in the total state accurately. However, a UKF-SLAM estimator additionally provides relative positioning information, which is required when absolute positioning data is missing, for instance during phase 3 of the operational scenario.

Two requirements were defined for the landmark tracking routine to work. One is the availability of a set of image features likely to be re-observed by the ground-facing camera; the other is the convergence of the sensor bias estimate to its true value.

The latter concerns the correlation between landmark parameters, as modeled by the Inverse Depth Parametrization, and state variables such as position and attitude.

The 3D landmark position, retrieved from the state variables and the corresponding image feature, is used in the update phase to refine the predicted error and landmark state computed in the predict phase. If the bias estimate has not yet converged, the inertial measurements are not properly compensated and the total state estimates become highly erroneous, making the filter diverge or converge very slowly when landmark information is used to refine the state prediction. To avoid this, the landmark tracking functionality of the estimator was enabled in the simulations only after the sensor bias estimate had converged by means of a base UKF scheme as in [1].

When the landmark tracking functionality is active, the state vector xukf−slam comprises the error state δx and the landmark state l, where l ∈ ℝ^(6n) (refer to equation 33), and the corresponding covariance is P ∈ ℝ^((15+6n)×(15+6n)), where n is the number of landmarks in the landmark state. Otherwise, the state vector xukf−slam is simply the error state, with the corresponding covariance P ∈ ℝ^(15×15).
