Institutionen för systemteknik
Department of Electrical Engineering
Master's Thesis (Examensarbete)
Target Classification Based on Kinematics
Master's thesis carried out in Automatic Control at the Institute of Technology, Linköping University
by
Robert Hallberg
LiTH-ISY-EX--12/4622--SE
Linköping 2012
Supervisors: Per-Johan Nordlund, SAAB AB
             Zoran Sjanic, ISY, Linköpings universitet
Examiner: Martin Enqvist, ISY, Linköpings universitet
Division, Department: Division of Automatic Control, Department of Electrical Engineering, SE-581 83 Linköping
Date: 2012-09-02
Language: English
Report category: Examensarbete (Master's thesis)
URL for electronic version: http://www.ep.liu.se
ISRN: LiTH-ISY-EX--12/4622--SE
Title: Klassificering av flygande objekt med hjälp av kinematik (Target Classification Based on Kinematics)
Author: Robert Hallberg

Abstract
Modern aircraft are getting more and better sensors. As a result of this, the pilots are getting more information than they can handle. To solve this problem one can automate the information processing and instead provide the pilots with conclusions drawn from the sensor information. An aircraft's movement can be used to determine which class (e.g. commercial aircraft, large military aircraft or fighter) it belongs to. This thesis focuses on comparing three classification schemes: a Bayesian classification scheme with uniform priors, the Transferable Belief Model and a Bayesian classification scheme with entropic priors.
The target is modeled by a jump Markov linear system that switches between different modes (fly straight, turn left, etc.) over time. A marginalized particle filter that spreads its particles over the possible mode sequences is used for state estimation. Simulations show that the results from Bayesian classification scheme with uniform priors and the Bayesian classification scheme with entropic priors are almost identical. The results also show that the Transferable Belief Model is less decisive than the Bayesian classification schemes. This effect is argued to come from the least committed principle within the Transferable Belief Model. A fixed-lag smoothing algorithm is introduced to the filter and it is shown that the classification results are improved. The advantage of having a filter that remembers the full mode sequence (such as the marginalized particle filter) and not just determines the current mode (such as an interacting multiple model filter) is also discussed.
Keywords: Target Classification, Transferable Belief Model, Entropic Priors, Jump Markov Linear System, Marginalized Particle Filter, Fixed-lag Smoothing, Interacting Multiple Model
Acknowledgments
First of all, I would like to thank SAAB AB for the means and opportunity to do this thesis. Dr. Per-Johan Nordlund has been my main supervisor at SAAB, and he has fielded most of my technical questions. Dr. Sören Molander at SAAB has also helped me a lot, especially with the Transferable Belief Model.
My supervisor at the university, Ph.D. student Zoran Sjanic, has handled most questions regarding the report and its proofreading. He and my examiner, Assoc. Prof. Martin Enqvist, have also provided me with interesting feedback.
Linköping, June 2012 Robert Hallberg
Contents
Notation

1 Introduction
1.1 Background
1.2 Problem Formulation
1.3 Purpose
1.4 Limitations
1.5 Thesis Outline

2 Bayesian Classification
2.1 The Basics of Bayesian Probability Theory
2.2 Classification
2.3 Example

3 Transferable Belief Model
3.1 Belief on a Discrete Frame
3.2 Belief, Plausibility and Commonality
3.3 Conjunctive Rule of Combination
3.4 Pignistic Probability
3.5 Generalized Bayesian Theorem
3.6 Least Committed BBA
3.7 Belief on a Continuous Frame
3.8 Least Committed BBD
3.9 Summary
3.10 Example

4 Entropic Priors
4.1 General Entropic Priors
4.2 Entropic Priors Induced by Gaussian Likelihoods
4.3 Example
4.4 Problem with Low Values

5 System Modeling
5.1 Jump Markov Linear System
5.2 Motion Model
5.3 Measurement Model

6 State Estimation
6.1 Marginalized Particle Filter
6.1.1 Tracking
6.1.2 Classification
6.1.3 Class Probability Calculations
6.1.4 Overview
6.2 Marginalized Particle Filter with Fixed Lag

7 Results
7.1 Flight Trajectory
7.2 Results from Standard MPF
7.3 Results from MPF with Fixed-Lag
7.4 Validation

8 Discussion
8.1 Likelihood Functions
8.2 Selection of Sampling Time
8.3 Selection of Modes
8.4 Selection of Process Noise
8.5 Selection of Number of Particles
8.6 The Quadratic Scalar Value
8.7 Importance of the Initialization of the Filters
8.8 Strength of TBM
8.8.1 TBM without LC BBD
8.8.2 TBM without GBT and BetP
8.9 Recursion
8.10 Importance of Choice of Filter

9 Conclusions and Future Work
9.1 Conclusions
9.2 Future Work

A Bayesian Probability Laws
B Belief, Plausibility and Commonality
C Approximating the CDF of a χ²-distribution
C.1 The χ²-distribution
C.2 The Central Limit Theorem
C.3 Approximate a χ²-distribution with a N-distribution
C.4 Approximate the CDF of a N-distribution
E The Kalman Filter
E.1 Standard Form
E.2 Fixed-Lag Smoothing
F Algorithms with the Marginalized Particle Filters
F.1 Algorithm with the Standard MPF
F.2 Algorithm for the Fixed-Lag MPF
G Interacting Multiple Models Filter
G.1 IMM Algorithm
G.2 Classification
H Simulation Parameters
Bibliography
Notation

Notation   Meaning
R          The set of all real numbers
N          Gaussian (normal) distribution
χ²         Chi-squared distribution
a_{1:n}    a_1, a_2, . . . , a_n
Abbreviations

Abbreviation   Meaning
PMF Probability Mass Function
PDF Probability Density Function
BT Bayes’ Theorem
TBM Transferable Belief Model
BBA Basic Belief Assignment
GBT Generalised Bayesian Theorem
LC Least Committed
BBD Basic Belief Density
CDF Cumulative Distribution Function
JMLS Jump Markov Linear System
TPM Transition Probability Matrix
CV Constant Velocity
CT Coordinated Turn
KF Kalman Filter
PF Particle Filter
MPF Marginalized Particle Filter
MMSE Minimum Mean Square Error
MC Monte Carlo
MCMC Markov Chain Monte Carlo
1 Introduction

1.1 Background
Modern aircraft are getting more and more sensor types, and the existing sensor types are getting better and better. As a result of this, the pilots are getting more information than they can handle, but what they really need is knowledge, knowledge that is based on conclusions drawn from that information.
It would be preferable if the pilots got information not only about where other aircraft are, but also which class they belong to (commercial jet, bomber, fighter, ship, helicopter, etc.). The classification system in this thesis uses the kinematic behavior to classify the targets. Note that several different sensor readings could be used for classification, since the targets have several distinguishing attributes, such as acceleration, speed, altitude, size, shape, electro-magnetic emissions, etc. Thus, the classification system in this thesis should be considered part of a larger classification system that uses several attributes.
1.2 Problem Formulation
The state vector of the target at time k is denoted by x_k ∈ R^{n_x}, where x_k contains information about the target's position, speed and possibly its acceleration. The target class c_i, where i ∈ {1, 2, . . . , n} for n different classes, is a non-evolving (time-invariant) attribute that describes the target's maneuverability. The classes can be defined in various ways, e.g., maximum altitude, minimum and maximum speed, or accelerations in various directions.

The target's motion is described by

x_{k+1} = f_k(x_k, c_i) + w_k(c_i)    (1.1)

where f_k is a possibly non-linear function in x_k, and w_k is the process noise. The function f_k implicitly contains the class definitions. The process noise represents unforeseen disturbances (e.g. wind) and other un-modeled effects that might affect the target's movement. The measurement, y_k ∈ R^{n_y}, at time k is described by

y_k = h_k(x_k) + v_k    (1.2)

where h_k is a possibly non-linear function and v_k is the measurement noise.

The objective of target classification is to determine the target's class, c_i, given the measurements, y_k, and full or partial information about f_k, h_k and the two noise sources.
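A minimal sketch of the model (1.1)–(1.2) can make the setup concrete. The coordinated-turn motion model, the turn rate, the noise levels and the planar state are illustrative assumptions, not the thesis's actual parameter choices:

```python
import numpy as np

# Illustrative instantiation of (1.1)-(1.2): a planar target flying with
# (nearly) constant speed that turns at a fixed rate, observed through
# noisy position-only measurements. All numbers are hypothetical.

rng = np.random.default_rng(0)

def f(x, omega, T=1.0):
    """Coordinated-turn motion model; state x = [px, py, vx, vy]."""
    px, py, vx, vy = x
    c, s = np.cos(omega * T), np.sin(omega * T)
    return np.array([px + vx * T, py + vy * T,
                     c * vx - s * vy, s * vx + c * vy])

def h(x):
    """Position-only measurement model h_k."""
    return x[:2]

x = np.array([0.0, 0.0, 100.0, 0.0])   # initial state
meas = []
for k in range(50):
    x = f(x, omega=0.05) + rng.normal(0, 0.1, 4)   # process noise w_k
    meas.append(h(x) + rng.normal(0, 5.0, 2))      # measurement noise v_k
meas = np.array(meas)
print(meas.shape)   # (50, 2)
```

With a class-dependent bound on omega, this is exactly the kind of turn-rate behavior the classes in this thesis are distinguished by.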
1.3 Purpose

The problem described in Section 1.2 can be solved in various ways. The focus in this thesis is the use of a probability model. Bayesian probability theory is the standard choice for solving such a problem, but as will be described later in this thesis, it has some drawbacks when handling this type of problem. The Transferable Belief Model is another probability theory that is becoming increasingly popular for classification purposes, and it does not have the mentioned drawbacks of Bayesian probability theory. Using entropic priors to initialize the Bayesian classification scheme, instead of the standard choice of uniform priors, is one way of compensating for the mentioned drawbacks.

The main purpose of this thesis is to compare the three classification schemes (Bayesian with uniform priors, the Transferable Belief Model and Bayesian with entropic priors), in terms of how they work and what results they give.
1.4 Limitations

As the main focus of this thesis is not to solve the problem as well as possible, but to compare different classification schemes, some simplifying limitations will be introduced. The target is assumed to move with constant speed. It is also assumed to only be able to make turns in two dimensions, even though it moves in all three dimensions. Thus, the classes are distinguished by the sharpness of the turns they can perform. The measurements y_k, which realistically would come from several sensors and possibly several co-operating aircraft, are assumed to come from a single sensor that measures only the target's position. These limitations should not affect the comparison of the different classification schemes.
1.5 Thesis Outline
After this introductory chapter, three chapters (Chapters 2, 3 and 4) describe the theory behind the three classification schemes. Chapter 5 describes the modelling of the system and Chapter 6 how the problem is solved. The results and a method for comparing the different classification schemes are given in Chapter 7. Chapter 8 contains various remarks and observations that may be of interest. Finally, some conclusions and suggestions for future work are given in Chapter 9.
There are also several appendices. The appendices contain either what could be seen as prerequisites for reading the thesis, additional explanatory information, or information that would be needed to reproduce the results in this thesis, i.e., information that probably is not necessary for the average reader.
2 Bayesian Classification

This chapter describes the basics of the Bayesian classification scheme and gives an example of a case where the Bayesian classification scheme does not work in a desirable way.

2.1 The Basics of Bayesian Probability Theory
As is stated in Smets [1991], every belief model consists of (at least) two components: a static one and a dynamic one. The static component describes how the belief is stored within the model, and the dynamic component how to update the belief given a new piece of information. In Bayesian probability theory, belief is stored as a probability mass, P, for each element θ ∈ Θ, where Θ contains all possible elements, such that P(θ) ∈ [0, 1] and Σ_{θ∈Θ} P(θ) = 1. P(θ) is called a probability mass function, PMF. This corresponds to the static component.
Suppose that the probability for event A is known and that event B has then occurred. The new probability for A, given the new information, can be expressed as

P(A | B) = P(A ∩ B) / P(B).    (2.1)

This is called the conditional probability, and it corresponds to the dynamic component.
2.2 Classification

The objective of classification, according to Ristic et al. [2004], is to determine the target's class probabilities given the measurements, i.e., p(c_i | y_{1:k}) ∀c_i, where y_{1:k} = {y_1, y_2, . . . , y_k} are the measurements and c_i ∈ Θ = {c_1, c_2, . . . , c_n} is the class. The class likelihood functions, p(y_{1:k} | c_i), are assumed to be known. With the use of Bayes' theorem, BT (see Appendix A), the class probabilities can be computed as

P(c_i | y_{1:k}) = p(y_{1:k} | c_i) P(c_i) / p(y_{1:k}) = p(y_{1:k} | c_i) P(c_i) / Σ_{c_j∈Θ} p(y_{1:k} | c_j) P(c_j),    (2.2)

where the two equalities follow from (A.1) and (A.2), respectively, and P(c_i) = P(c_i | ∅) ∀c_i are the prior class probabilities. In general, uniform priors, i.e. P(c_i) = 1/n, are used when there is no knowledge about the initial priors.
Equation (2.2) can be rewritten as

P(c_i | y_{1:k}) = p(y_k | y_{1:k-1}, c_i) P(c_i | y_{1:k-1}) / Σ_{c_j∈Θ} p(y_k | y_{1:k-1}, c_j) P(c_j | y_{1:k-1})    (2.3)

by using the chain rule (see (A.3)). Although this expression is equivalent to (2.2), the choice between them may have an impact on the result, since the expressions are approximated in the filters being used. More about this is described in Section 8.9.

The calculation of p(y_k | y_{1:k-1}, c_i) depends on the implementation of the filter being used, but in general it is computed as N(y_k; ŷ_{k|1:k-1,c_i}, Σ_{k|1:k-1,c_i}), where ŷ_{k|1:k-1,c_i} is the expected value of y_k, given previous measurements and an assumed class, with corresponding covariance Σ_{k|1:k-1,c_i}.

Given p(y_{1:k} | c_i) ∀c_i ∈ Θ, the class probability is easily calculated using (2.2). Figure 2.1 gives an overview of the Bayesian classification scheme.
Figure 2.1: An overview of the Bayesian classification scheme: the class likelihoods p(y_{1:k} | c_i) are mapped by Bayes' theorem (BT) to the class probabilities P(c_i | y_{1:k}).
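The recursive update (2.3) can be sketched as follows. The Gaussian predictive moments (ŷ and σ per class) are made-up placeholders, not output from the thesis's filters; only the update structure is the point:

```python
import numpy as np

# Sketch of the recursive Bayesian update (2.3): at each time step the class
# posterior is multiplied by the predictive likelihood p(y_k | y_{1:k-1}, c_i)
# and renormalized. The predictive means/spreads below are hypothetical.

def bayes_update(prior, y, y_hat, sigma):
    """prior[i] = P(c_i | y_{1:k-1}); y_hat[i], sigma[i]: predictive moments."""
    lik = np.exp(-0.5 * ((y - y_hat) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
    post = lik * prior
    return post / post.sum()

p = np.ones(3) / 3                      # uniform priors P(c_i) = 1/n
y_hat = np.zeros(3)                     # each class predicts y = 0
sigma = np.array([0.31, 0.93, 1.54])    # class-dependent predictive spread
for y in [0.1, -0.2, 0.05]:             # a few small measurements
    p = bayes_update(p, y, y_hat, sigma)
print(p)   # the tightest class accumulates the most probability
```

Each call is one application of (2.3); chaining the calls realizes the full recursion over y_{1:k}.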
2.3 Example

Suppose we have three classes, Θ = {c_1, c_2, c_3}, which are defined by the possible acceleration each class can perform. The acceleration should be within the intervals [−1g, 1g], [−3g, 3g] and [−5g, 5g] for class c_1, c_2 and c_3, respectively, where g = 9.81 m/s².
Also, suppose that the likelihood functions, p(y | c_i), of these classes can be modeled as zero-mean Gaussian densities, with standard deviations σ_1 = 0.31g, σ_2 = 0.93g and σ_3 = 1.54g for classes c_1, c_2 and c_3, respectively. These values of the standard deviations are derived from the condition that P(|y| < γ_i) = 0.99876, where γ_i is the acceleration limit for each class and y is the measured acceleration. The likelihood functions, as functions of the measured acceleration, are shown in Figure 2.2.
Figure 2.2: Likelihood function for each class, described in Section 2.3, as a function of acceleration.
The result in Figure 2.3 is obtained by using (2.2) for a single time step with uniform priors. Observe that for small accelerations, the Bayesian classification scheme greatly favors the target being of class c_1, even though targets of class c_2 and c_3 are equally capable of performing such small accelerations.

Figure 2.3: The class probability, using the Bayesian classification scheme, for each class, described in Section 2.3, as a function of acceleration.
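The single-step posterior behind Figure 2.3 is easy to reproduce. This is a direct numerical evaluation of (2.2) for the three Gaussian likelihoods above, not the thesis's simulation code:

```python
import numpy as np

# Single-step Bayesian classification from Section 2.3: three classes with
# zero-mean Gaussian likelihoods and uniform priors. At y = 0 the tightest
# class c1 is strongly favored even though all classes allow y = 0.

sigma = np.array([0.31, 0.93, 1.54])   # standard deviations in units of g

def class_posterior(y):
    lik = np.exp(-0.5 * (y / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
    prior = np.ones(3) / 3             # uniform priors
    post = lik * prior
    return post / post.sum()

print(np.round(class_posterior(0.0), 3))   # [0.652 0.217 0.131]
```

At y = 0 the posterior is simply proportional to 1/σ_i, which is exactly the c_1-favoring effect the chapter criticizes.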
3 Transferable Belief Model

The Transferable Belief Model, TBM, is an interpretation of the Dempster-Shafer theory. Hence, there are many similarities between the two theories. This thesis describes the TBM without going into detail about the similarities. Most of this chapter is a summary of Ristic and Smets [2005], with some clarifying examples.

According to Smets and Kennes [1994], the TBM distinguishes between two levels: the credal level and the pignistic level. Both names are derived from Latin: credo means "I believe" and pignus can be translated as "a bet". Beliefs are stored and updated on the credal level. The pignistic level is only used when a decision needs to be made, i.e., when one needs to make bets based on the beliefs. From the TBM's point of view, standard (Bayesian) probability can be considered a pignistic probability.
3.1 Belief on a Discrete Frame

Consider the following discrete set of mutually exclusive events, named the frame of discernment,

Θ = {θ_1, θ_2, . . . , θ_n}.    (3.1)

For example, θ_i could correspond to "the aircraft belongs to class i", where i ∈ {1, 2, . . . , n}. The set of all subsets of Θ is denoted 2^Θ and is defined as 2^Θ = {A | A ⊆ Θ}.
Example 3.1. If Θ = {θ_1, θ_2} then 2^Θ = {∅, {θ_1}, {θ_2}, {θ_1, θ_2}}.

The belief in the TBM is represented by a so-called Basic Belief Assignment, BBA. Each set A ∈ 2^Θ gets a quantity of belief assigned to it, denoted m(A) (where m is short for mass), such that m(A) ∈ [0, 1] and Σ_{A∈2^Θ} m(A) = 1. This corresponds to the static component (see Section 2.1) in this belief model. Compared with the PMF in Bayesian probability theory, belief can be assigned not only to singletons of Θ, but to all possible subsets of Θ, i.e. 2^Θ. Assignment of belief to a subset containing several events can be useful in situations where the evidence does not support a specific event, due to lack of information.
3.2 Belief, Plausibility and Commonality

Three functions can be defined to express the belief in other ways than the BBA: the belief function bel, the plausibility function pl and the commonality function q, defined in the following equations.

bel(A) = Σ_{B | B ⊆ A, B ≠ ∅} m(B)    ∀A ⊆ Θ    (3.2)
pl(A) = Σ_{B | B ∩ A ≠ ∅} m(B)    ∀A ⊆ Θ    (3.3)
q(A) = Σ_{B | A ⊆ B} m(B)    ∀A ⊆ Θ    (3.4)

The belief function, bel(A), corresponds to the total belief that supports A without also supporting its complement, Ā.
Example 3.2. If Θ = {θ_1, θ_2, θ_3} then θ̄_1 = {θ_2, θ_3}. The masses m(θ_1), m({θ_1, θ_2}), m({θ_1, θ_3}) and m({θ_1, θ_2, θ_3}) all support θ_1, but the three latter masses also support θ̄_1, thus bel(θ_1) = m(θ_1).
The function pl(A) represents the total belief that potentially could support A or already supports A, i.e., the total belief that does not support its complement Ā. Thus, the plausibility function, defined in (3.3), can also be expressed as pl(A) = bel(Θ) − bel(Ā). The commonality function, q(A), does not have as intuitive an interpretation as the belief and plausibility functions. It is defined in this thesis only because it is mathematically useful.

All three functions have a one-to-one mapping with the BBA, i.e., if one of them is known, all the others can be computed too. As the definitions in (3.2), (3.3) and (3.4) may be a little hard to grasp, Appendix B provides some clarifying examples for the belief, plausibility and commonality functions.
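The three definitions translate almost literally into code when subsets are encoded as frozensets. This is a generic sketch of (3.2)–(3.4), checked against the BBA m_1 used later in Example 3.3:

```python
# Sketch of the belief, plausibility and commonality functions (3.2)-(3.4)
# for a BBA on a small discrete frame, with subsets encoded as frozensets.

def bel(m, A):
    return sum(v for B, v in m.items() if B and B <= A)   # B subset of A, B nonempty

def pl(m, A):
    return sum(v for B, v in m.items() if B & A)          # B intersects A

def q(m, A):
    return sum(v for B, v in m.items() if A <= B)         # B superset of A

m1 = {frozenset(): 0.1, frozenset({1}): 0.2,
      frozenset({2}): 0.3, frozenset({1, 2}): 0.4}

print(round(bel(m1, frozenset({1})), 3))   # 0.2
print(round(pl(m1, frozenset({1})), 3))    # 0.6
print(round(q(m1, frozenset({1})), 3))     # 0.6
```

Note that bel({1}) only counts the singleton mass, while pl and q also pick up m({1, 2}), matching the interpretations above.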
3.3 Conjunctive Rule of Combination

Let m be a BBA defined on the frame Θ, with some belief assigned to A ⊆ Θ. Suppose some new information is obtained, i.e., that subset B ⊆ Θ is true. The belief that previously was assigned to A will now have to be transferred to A ∩ B, hence the name of this model, the transferable belief model. This updating of belief corresponds to the dynamic component (see Section 2.1) in this belief model.

The transfer of belief is done in the following way. Let m_1 and m_2 be two BBAs defined on the same frame Θ (e.g. m_1 could correspond to the initial belief, while m_2 corresponds to the new information in the example above). The new BBA corresponding to the combined information from the two previous BBAs can then be calculated with the conjunctive rule of combination as

m_{12}(A) = (m_1 ∩ m_2)(A) = Σ_{B,C | B,C ⊆ Θ, B∩C=A} m_1(B) m_2(C)    ∀A ⊆ Θ    (3.5)

where ∩ is the symbol for the conjunctive rule of combination.

It may be complicated to determine which subsets B and C should be used. Luckily, the conjunctive rule of combination can be expressed much more easily using the commonality function instead,

q_{12}(A) = q_1(A) q_2(A)    ∀A ⊆ Θ    (3.6)

where q_i corresponds to the commonality function of m_i, i ∈ {1, 2} (Smets [1990, Chap. 5]).
Example 3.3. Let m_1 and m_2 be two BBAs defined by Table 3.1a. By computing their commonality functions, q_1 and q_2, with (3.4), the results in the same table are obtained.

A     ∅     {1}    {2}    {1, 2}
m_1   0.1   0.2    0.3    0.4
m_2   0.1   0.4    0.2    0.3
q_1   1     0.6    0.7    0.4
q_2   1     0.7    0.5    0.3

(a) Initial values.

A       ∅      {1}    {2}    {1, 2}
q_12    1      0.42   0.35   0.12
m_12    0.35   0.30   0.23   0.12

(b) Values after the combination.

Table 3.1: Values for Example 3.3.

Furthermore, by combining the commonality functions using (3.6) to get q_12, the result in Table 3.1b is obtained. As mentioned in Section 3.2, the commonality function has a one-to-one mapping with the BBA. In order to go from q_12 to m_12, apply

m(A) = Σ_{B | A ⊆ B} (−1)^{|B|−|A|} q(B)    ∀A ⊆ Θ    (3.7)

where |B| − |A| is the difference in cardinality between B and A. This equation is a rewritten version of Eq. (3.2) in Smets [1990].
The conjunctive rule of combination is both commutative, i.e. m_1 ∩ m_2 = m_2 ∩ m_1, and associative, i.e. (m_1 ∩ m_2) ∩ m_3 = m_1 ∩ (m_2 ∩ m_3). Thus, in order to combine more than two BBAs, simply apply the conjunctive rule of combination several times, m_{1:k} = m_1 ∩ m_2 ∩ · · · ∩ m_k.
3.4 Pignistic Probability

As the name implies, pignistic probability is the probability on the pignistic level. The belief on the credal level is converted into a pignistic probability by the use of the pignistic transformation, denoted BetP,

BetP(θ_i) = Σ_{A | θ_i ∈ A ⊆ Θ} (1/|A|) · m(A)/(1 − m(∅))    (3.8)

where |A| is the cardinality of A, i.e. the number of elements in A.
Example 3.4. Suppose the belief on the credal level for Θ = {θ_1, θ_2} is given by Table 3.2.

A       ∅     θ_1   θ_2   {θ_1, θ_2}
m(A)    0.1   0.2   0.3   0.4

Table 3.2: BBA in Example 3.4.

The pignistic probability can then be calculated by

BetP(θ_1) = (1/1) · 0.2/(1 − 0.1) + (1/2) · 0.4/(1 − 0.1) ≈ 0.44
BetP(θ_2) = (1/1) · 0.3/(1 − 0.1) + (1/2) · 0.4/(1 − 0.1) ≈ 0.56.

Note that Σ_{θ_i∈Θ} BetP(θ_i) = 1.
It is worth noticing in (3.8) that different BBAs can be transformed into the same pignistic probability; thus the pignistic transform is a many-to-one mapping. More about this is described in Section 3.6.

It can be shown that bel(θ_i) ≤ BetP(θ_i) ≤ pl(θ_i) if m(∅) = 0, which motivates the interpretation of bel and pl as lower and upper limits of the pignistic probability, respectively.
3.5 Generalized Bayesian Theorem

The generalized Bayesian theorem, GBT, is the credal version of Bayes' theorem (defined in Appendix A). The notation m^{domain}(subset | condition) will now be used to express m and its related function pl correctly. The domain is the set of elements on which the BBA is defined. Thus, the previous notation m(A) corresponds to m^Θ(A), since the values of m were defined on the frame Θ.

Now, let Θ = {θ_1, θ_2, . . . , θ_n} as earlier (where θ_i could correspond to class i) and y_k ∈ R (where y_k could correspond to the measurement at time k). Suppose we know the conditional BBAs m^R(y_k | θ_i), i ∈ {1, 2, . . . , n}, for every y_k ∈ R. Note that a BBA on R has not been defined in this thesis yet; more about this is described in Section 3.7. The GBT then provides the means to calculate m^Θ(A | y_k) for every A ⊆ Θ by

m^Θ(A | y_k) = ∏_{θ_i ∈ A} pl^R(y_k | θ_i) ∏_{θ_i ∈ Ā} (1 − pl^R(y_k | θ_i))    (3.9)

where pl^R(y_k | θ_i) corresponds to m^R(y_k | θ_i) via (3.3). Note that pl^R(y_k | θ_i) is still regarded as "unknown" so far in this thesis, so we will still need to compute it in order to fully understand how to go from measurements on the pignistic level to beliefs on the credal level.
3.6 Least Committed BBA

Suppose we know the pignistic probability on a discrete frame Θ and would like to get its corresponding BBA. As mentioned in Section 3.4, the transformation from a pignistic probability to a belief on the credal level is a one-to-many transformation. The collection of all BBAs that correspond to the same pignistic probability is referred to as the set of isopignistic BBAs. So, how should we choose one of the isopignistic BBAs? One solution is to be cautious and choose the BBA with the smallest amount of committed belief (Smets [1998, chap. 3.4]).

In order to compare the amount of committed belief, we say that m_1 is q-less committed than m_2 if and only if q_1(A) ≥ q_2(A) ∀A ⊆ Θ. Suppose that the elements θ_i ∈ Θ fulfill BetP(θ_1) ≥ BetP(θ_2) ≥ · · · ≥ BetP(θ_n) (if not, relabel the elements). Then the least committed, LC, BBA (denoted m_LC) is computed by

m_LC(A_i) = |A_i| (BetP(θ_i) − BetP(θ_{i+1}))    ∀i ∈ {1, 2, . . . , n}    (3.10)

where A_i = {θ_1, θ_2, . . . , θ_i} and BetP(θ_{n+1}) = 0. The reader is referred to Ristic and Smets [2005, Sec. II.B] for details.
Example 3.5. Let Θ = {θ_1, θ_2} and suppose that the pignistic probabilities BetP(θ_1) = 0.6 and BetP(θ_2) = 0.4 are known. By using (3.10), m_LC(θ_1) = 0.2 and m_LC({θ_1, θ_2}) = 0.8, which implies that m_LC(θ_2) = 0.

An isopignistic BBA to m_LC, denoted m_1, is given by Table 3.3a. Note that applying the pignistic transform, given by (3.8), to m_1 gives back BetP(θ_1) = 0.6 and BetP(θ_2) = 0.4.

A      ∅   θ_1   θ_2   {θ_1, θ_2}
m_1    0   0.6   0.4   0
m_LC   0   0.2   0     0.8

(a) BBAs.

A      ∅   θ_1   θ_2   {θ_1, θ_2}
q_1    1   0.6   0.4   0
q_LC   1   1     0.8   0.8

(b) Commonality functions.

Table 3.3: BBAs and commonality functions for Example 3.5.

The corresponding commonality functions, calculated using (3.4), are given in Table 3.3b. Note that q_LC(A) ≥ q_1(A) ∀A ⊆ Θ.
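The construction (3.10) sorts the elements by decreasing pignistic probability and assigns mass to the nested sets A_i. A sketch reproducing Example 3.5:

```python
# Sketch of the least committed BBA construction (3.10) from a pignistic
# probability, reproducing Example 3.5.

def least_committed(betp):
    """betp: dict theta -> BetP(theta). Returns the LC BBA m_LC."""
    order = sorted(betp, key=betp.get, reverse=True)   # relabel so BetP decreases
    vals = [betp[t] for t in order] + [0.0]            # append BetP(theta_{n+1}) = 0
    m = {}
    for i in range(len(order)):
        A = frozenset(order[:i + 1])                   # nested set A_i
        mass = (i + 1) * (vals[i] - vals[i + 1])       # |A_i| * (BetP_i - BetP_{i+1})
        if mass > 0:
            m[A] = mass
    return m

m_lc = least_committed({'t1': 0.6, 't2': 0.4})
print(m_lc)   # mass ~0.2 on {'t1'} and 0.8 on {'t1', 't2'}
```

The resulting BBA is consonant (its focal sets are nested), which is exactly what makes it the cautious, least committed choice.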
As stated in Section 3.5, we want to compute pl^R(y_k | θ_i), and since the frame there is R (i.e. continuous, not discrete), the method for getting the LC BBA described in this section cannot be used. The way to compute pl^R(y_k | θ_i) is described in Section 3.8.

3.7 Belief on a Continuous Frame

As mentioned in Section 3.6, the measurements used as input to the classification in this thesis are continuous. Thus the TBM has to be applied to a continuous frame instead of a discrete one. This change of frame is not straightforward. However, since the classes are discrete, only the process of getting the LC isopignistic BBA from a continuous frame is of interest in this thesis, and only that part of the continuous TBM is described here (interested readers are referred to Caron et al. [2008]). Basically, in the continuous TBM, the BBAs become basic belief densities, BBDs, the pignistic probabilities become pignistic densities, and so on.
3.8 Least Committed BBD

In most of the literature, the way to determine the LC isopignistic BBD is only derived for the univariate Gaussian PDF. Since the measurements in this thesis are in three dimensions, and the filters used in this thesis (described later in Chapter 6) use a composition of several Gaussian variables, we need a way to determine the LC isopignistic BBD for a mixture of multivariate Gaussian PDFs.

Suppose that p(y) is a mixture of Gaussian PDFs, given by

p(y) = Σ_{j=1}^{M} β_j N(y; ŷ_j, Σ_j)    (3.11)

where y ∈ R^{n_y}, and ŷ_j and Σ_j are the mean and covariance matrix of the jth mixture component, respectively (for example, j could correspond to particle j if a particle filter is being used). The parameters β_j are the weights of the components, and satisfy Σ_{j=1}^{M} β_j = 1. Then, according to Caron et al. [2008, Chap. 3-4], the LC isopignistic BBD of p(y) is given by

pl^{R^{n_y}}(y) = 1 − Σ_{j=1}^{M} β_j F_{n_y+2}((y − ŷ_j)^T Σ_j^{−1} (y − ŷ_j))    (3.12)

where F_p is the cumulative distribution function, CDF, of a χ² distribution with p degrees of freedom, given by

F_p(χ²) = ∫_0^{χ²} u^{p/2−1} / (2^{p/2} Γ(p/2)) e^{−u/2} du    (3.13)

and where Γ is the gamma function, given by

Γ(z) = ∫_0^∞ t^{z−1} e^{−t} dt.    (3.14)
3.9 Summary

With the TBM classification scheme, the class likelihoods p(y_{1:k} | c_i) are transformed into plausibility functions on the credal level, using the LC BBD method described in Section 3.8. Belief is then assigned to all subsets of Θ by applying the GBT to the plausibility functions. Finally, the belief is transformed back to a pignistic probability using the pignistic transform.

Figure 3.1 provides an overview of the TBM classification scheme.

Figure 3.1: An overview of the TBM classification scheme: the likelihoods p(y_{1:k} | c_i) are mapped on the credal level to pl^{R^{n_y}}(y_{1:k} | c_i) by the LC BBD, then to m^Θ(A | y_{1:k}) by the GBT, and finally back to the pignistic probabilities P(c_i | y_{1:k}) by BetP.
3.10 Example

Consider the same example as in Section 2.3. The result of using the TBM classification scheme on the same likelihood functions, with no knowledge about any previous beliefs (which corresponds to the use of uniform priors in the Bayesian example), is presented in Figure 3.2.

Figure 3.2: The class probability, using the TBM classification scheme, for each class as a function of acceleration.

Notice that, unlike the Bayesian classification scheme, for small accelerations the TBM does not favor class c_1.
4 Entropic Priors

Consider the example in Section 2.3 again. Another way to deal with the problem that the Bayesian classification scheme favors class c_1 for small accelerations is to compensate for that effect by using priors other than uniform ones.
4.1 General Entropic Priors

One type of prior that shows promising results is the entropic prior (Palmieri and Ciuonzo [2010]). The idea is to find the prior distribution that maximizes the joint entropy of the measurements and the classes. Entropy can be interpreted as a measure of the uncertainty of a random variable. Entropy, H(X), is in general defined by

H(X) = −Σ_{x∈X} P(x) log(P(x))    (4.1)

where X ∈ 𝒳 is a discrete random variable (Cover and Thomas [1991, Eq. 2.1]). The joint entropy for the discrete random variables X ∈ 𝒳 and Y ∈ 𝒴 is defined by

H(X, Y) = −Σ_{x∈𝒳} Σ_{y∈𝒴} P(x, y) log(P(x, y))    (4.2)

according to Cover and Thomas [1991, Eq. 2.8].

Now let 𝒳 be all possible measurements, R^{n_y}, and 𝒴 be all possible classes, Θ = {c_1, c_2, . . . , c_n}. With the use of BT (A.1), the joint entropy of the measurements and the classes can then be expressed by

H(y, Θ) = Σ_{c_i∈Θ} ∫_{y∈R^{n_y}} p(y | c_i) P(c_i) log( 1 / (p(y | c_i) P(c_i)) ) dy    (4.3)
as stated in Palmieri and Ciuonzo [2010, Eq. 6]. According to Palmieri and Ciuonzo [2010], the prior that maximizes the joint entropy is given by

P(c_i) = e^{H(y|c_i)} / Σ_{c_j∈Θ} e^{H(y|c_j)}    (4.4)

where the conditional entropy, H(y | c_i), is given by

H(y | c_i) = −∫_{y∈R^{n_y}} p(y | c_i) log(p(y | c_i)) dy.    (4.5)

The derivation can also be found in Appendix D.
4.2
Entropic Priors Induced by Gaussian Likelihoods
Let p(\mathbf{y}_{1:k}|c_i) = \mathcal{N}(\mathbf{y}_{1:k}; \hat{\mathbf{y}}_{1:k,i}, \Sigma_{1:k,i}), where \hat{\mathbf{y}}_{1:k,i} is the expected value and \Sigma_{1:k,i} is the covariance matrix of \mathbf{y}_{1:k} for class c_i ∈ Θ. The conditional entropy can then be expressed by

H(\mathbf{y}_{1:k}|c_i) = \frac{1}{2} \ln((2\pi e)^{k n_y} |\Sigma_{1:k,i}|)  (4.6)

where |\Sigma_{1:k,i}| is the determinant of the covariance matrix for class c_i, Cover and Thomas [1991, Theorem 9.4.1].
The multivariate Gaussian PDF is given by

\mathcal{N}(y_{1:k}; \hat{y}_{1:k}, \Sigma_{1:k}) = \frac{1}{(\sqrt{2\pi})^{k} |\Sigma_{1:k}|^{1/2}} e^{-\frac{1}{2}(y_{1:k} - \hat{y}_{1:k})^T \Sigma_{1:k}^{-1} (y_{1:k} - \hat{y}_{1:k})}.  (4.7)

Note that y_{1:k} is a vector of k elements while \mathbf{y}_{1:k} (in bold) contains k · n_y elements, hence the exponent k in the denominator.
The conditional entropy is then used in (4.4) to calculate the Gaussian entropic prior

P(c_i) = \frac{e^{H(\mathbf{y}_{1:k}|c_i)}}{\sum_{c_l \in \Theta} e^{H(\mathbf{y}_{1:k}|c_l)}} = \frac{(2\pi e)^{k n_y/2} |\Sigma_{1:k,i}|^{1/2}}{\sum_{c_l \in \Theta} (2\pi e)^{k n_y/2} |\Sigma_{1:k,l}|^{1/2}} = \frac{|\Sigma_{1:k,i}|^{1/2}}{\sum_{c_l \in \Theta} |\Sigma_{1:k,l}|^{1/2}}.  (4.8)
Note that the entropic prior is dependent on information from the measurements, and cannot be determined before the simulation starts. A prior is usually something that is known prior to the simulation, hence the name. Thus, this way of looking at priors might seem less intuitive than "normal" priors. The class probability can then be expressed by
P(c_i|\mathbf{y}_{1:k}) = \frac{|\Sigma_{1:k,i}|^{1/2} \mathcal{N}(\mathbf{y}_{1:k}; \hat{\mathbf{y}}_{1:k,i}, \Sigma_{1:k,i})}{\sum_{c_l \in \Theta} |\Sigma_{1:k,l}|^{1/2} \mathcal{N}(\mathbf{y}_{1:k}; \hat{\mathbf{y}}_{1:k,l}, \Sigma_{1:k,l})} = \frac{e^{-\frac{1}{2}(\mathbf{y}_{1:k} - \hat{\mathbf{y}}_{1:k,i})^T \Sigma_{1:k,i}^{-1} (\mathbf{y}_{1:k} - \hat{\mathbf{y}}_{1:k,i})}}{\sum_{c_l \in \Theta} e^{-\frac{1}{2}(\mathbf{y}_{1:k} - \hat{\mathbf{y}}_{1:k,l})^T \Sigma_{1:k,l}^{-1} (\mathbf{y}_{1:k} - \hat{\mathbf{y}}_{1:k,l})}}.  (4.9)
Note that the entropic prior cancels out |\Sigma_{1:k,l}|^{1/2} from the likelihood function. Thus, the entropic prior does not need to be computed explicitly.
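The cancellation in (4.9) can be verified numerically. The sketch below uses a scalar-Gaussian toy setup, with all numbers invented for illustration: computing the posterior explicitly with the entropic prior (4.8) gives the same result as the closed form where the determinant factors have cancelled.

```python
import numpy as np

# Illustrative scalar-Gaussian setup: one measurement, three class
# hypotheses with class-dependent variance (sigma2). Values are made up.
mu = np.array([0.0, 0.0, 0.0])
sigma2 = np.array([0.1, 1.0, 4.0])
y = 0.8  # a single observed measurement

# Entropic prior (4.8): proportional to |Sigma|^(1/2).
prior = np.sqrt(sigma2) / np.sum(np.sqrt(sigma2))

# Gaussian likelihood for each class.
lik = np.exp(-0.5 * (y - mu)**2 / sigma2) / np.sqrt(2 * np.pi * sigma2)

# Posterior via Bayes with the entropic prior...
post = prior * lik / np.sum(prior * lik)

# ...equals the closed form (4.9): the |Sigma|^(1/2) factors cancel.
quad = (y - mu)**2 / sigma2
post_closed = np.exp(-0.5 * quad) / np.sum(np.exp(-0.5 * quad))
print(np.allclose(post, post_closed))  # True
```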
4.3
Example
Again, consider the same example as in Section 2.3. Figure 4.1 shows the result of using the Bayesian classification scheme with entropic priors. Notice that the result is almost identical to the result in Section 3.10 from the TBM classification scheme. The use of entropic priors seems to solve the problem with class c1 being favored for low accelerations in the Bayesian classification scheme.
Figure 4.1: The class probability, using the Bayesian classification scheme with entropic priors, for each class as a function of acceleration.
4.4
Problem with Low Values
However, as k grows, e^{-\frac{1}{2}(\mathbf{y}_{1:k} - \hat{\mathbf{y}}_{1:k,i})^T \Sigma_{1:k,i}^{-1} (\mathbf{y}_{1:k} - \hat{\mathbf{y}}_{1:k,i})} will decrease rapidly. Sooner or later the computational device that is used to calculate this value will round it off to 0 for all classes, which leads to division by zero in (4.9). This problem can easily be solved by instead using
P(c_i|\mathbf{y}_{1:k}) = \frac{e^{-\frac{1}{2} r_k(c_i)}}{\sum_{j=1}^{n} e^{-\frac{1}{2} r_k(c_j)}} = \frac{e^{-\frac{1}{2} r_{k,0}} e^{-\frac{1}{2} r'_k(c_i)}}{\sum_{j=1}^{n} e^{-\frac{1}{2} r_{k,0}} e^{-\frac{1}{2} r'_k(c_j)}} = \frac{e^{-\frac{1}{2} r'_k(c_i)}}{\sum_{j=1}^{n} e^{-\frac{1}{2} r'_k(c_j)}}  (4.10)

where

r_k(c_i) = (\mathbf{y}_{1:k} - \hat{\mathbf{y}}_{1:k,i})^T \Sigma_{1:k,i}^{-1} (\mathbf{y}_{1:k} - \hat{\mathbf{y}}_{1:k,i}),  (4.11)

r'_k(c_i) = r_k(c_i) - r_{k,0}  (4.12)

and

r_{k,0} = \min_{c_i} r_k(c_i).  (4.13)
The probability P(c_i|\mathbf{y}_{1:k}) remains the same, while the numerator and denominator are potentially greater than before. This improvement of the original method has been developed during this thesis project and has turned out to be quite useful.
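The trick in (4.10)-(4.13) is essentially the standard shift used in log-sum-exp computations. A minimal Python sketch, with illustrative quadratic values only:

```python
import numpy as np

def class_probs(r):
    """Numerically safe version of (4.10): subtract the smallest quadratic
    value r_{k,0} before exponentiating, so at least one exponent is 0."""
    r = np.asarray(r, dtype=float)
    r_shift = r - r.min()          # r'_k(c_i) = r_k(c_i) - r_{k,0}
    w = np.exp(-0.5 * r_shift)
    return w / w.sum()

# Naive evaluation underflows: exp(-0.5*3000) rounds to 0 for all classes.
r = np.array([3000.0, 3010.0, 3025.0])
naive = np.exp(-0.5 * r)
print(naive.sum())                 # 0.0 -> division by zero in (4.9)
print(class_probs(r))              # well-defined probabilities, sum to 1
```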
5
System Modeling
The focus of this thesis is the comparison of the different classification schemes, not the system modeling. Thus, there will be no evaluation of different motion or measurement models. For simplicity a 3-dimensional Cartesian coordinate system is used in this thesis, see Figure 5.1.
Figure 5.1: Cartesian coordinate system. All vectors are orthogonal.
Aircraft move continuously in time, but the modeled system will be discrete. Thus, there is an implicit discretization (with the zero-order hold assumption) within the modeling that will not be explained in this thesis. The reader is referred to Gustafsson [2010, Sec. 12.1] for details about how the discretization is performed. The effects of the zero-order hold assumption will briefly be mentioned in Section 8.2.
5.1
Jump Markov Linear System
An aircraft usually flies with constant speed and constant direction. When the aircraft changes course, it would be most natural to turn with constant angular velocity. Given these two assumptions, it is preferable to model the system as a jump Markov linear system, JMLS, where the aircraft flies in a specific mode (e.g. fly straight, turn right or turn left) at each time step. Each mode is then modeled as a linear system.
We consider the same system as in Orguner [2010], namely

x_k = F(m_k) x_{k-1} + G(m_k) w_k  (5.1)
y_k = H(m_k) x_k + J(m_k) v_k  (5.2)

where x_k ∈ R^{n_x} is the base state, which contains position and velocity information, m_k ∈ {1, 2, ..., N_m} is the mode state, y_k ∈ R^{n_y} is the measurement, w_k ∈ R^{n_w} is the process noise distributed as w_k ∼ \mathcal{N}(w_k; 0, Q), v_k ∈ R^{n_v} is the measurement noise distributed as v_k ∼ \mathcal{N}(v_k; 0, R), and the matrices F, G, H and J are known functions of the mode state, m_k.
The jumps between the modes are modeled as a time-invariant Markov chain (hence the name of the system, JMLS), with transition probability matrix, TPM, \Pi = [\pi_{ij}], where \pi_{ij} = P(m_k = j | m_{k-1} = i). The TPM is assumed to be known.
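As an illustration of how such mode jumps behave, the following Python sketch simulates the Markov chain for a hypothetical 3-mode TPM. The numbers are made up for this example; the TPMs actually used in the thesis are given in Appendix H.

```python
import numpy as np

# Hypothetical 3-mode TPM; each row i holds P(m_k = j | m_{k-1} = i)
# and must sum to one.
Pi = np.array([[0.90, 0.05, 0.05],
               [0.10, 0.85, 0.05],
               [0.10, 0.05, 0.85]])

rng = np.random.default_rng(0)

def sample_mode_sequence(m0, steps):
    """Simulate the time-invariant Markov chain of mode jumps."""
    m, seq = m0, [m0]
    for _ in range(steps):
        m = rng.choice(len(Pi), p=Pi[m])  # sample next mode from row m
        seq.append(m)
    return seq

print(sample_mode_sequence(0, 10))  # mostly mode 0 with occasional jumps
```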
5.2
Motion Model
The base state is given by

x_k = [pos_X  pos_Y  pos_Z  v_X  v_Y  v_Z]^T
where pos_α denotes the position and v_α the velocity along the α-direction, for α ∈ {X, Y, Z}. As stated in Section 5.1, the aircraft often flies with constant speed in a constant direction. This motivates the use of the constant velocity model (CV model), given by
F_{CV} = \begin{pmatrix} I_3 & I_3 T_s \\ 0_{3\times3} & I_3 \end{pmatrix}  (5.3)

for one mode, where I_3 is a 3 × 3 identity matrix and T_s is the sampling time.
When the aircraft turns, it will most likely do so with constant angular velocity. Thus, the coordinated turn model (CT model), Gustafsson [2010, Tab. 13.4], given by

F_{CT}(\omega) = \begin{pmatrix}
1 & 0 & 0 & \sin(\omega T_s)/\omega & -(1-\cos(\omega T_s))/\omega & 0 \\
0 & 1 & 0 & (1-\cos(\omega T_s))/\omega & \sin(\omega T_s)/\omega & 0 \\
0 & 0 & 1 & 0 & 0 & T_s \\
0 & 0 & 0 & \cos(\omega T_s) & -\sin(\omega T_s) & 0 \\
0 & 0 & 0 & \sin(\omega T_s) & \cos(\omega T_s) & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix},  (5.4)
will be used to model the turns for different angular velocities in the XY-plane, denoted by ω. As stated in Section 1.4, only uniform turns in the XY-plane will be considered, and the target always has the same speed, |v_k|.
We consider three classes, Θ = {c1, c2, c3}, and five modes in the modeling. The connections between the classes and modes are given in Table 5.1, where the modes are also specified. Thus, c3 is the most agile class and has all five modes to choose from. The motions defined by the modes are illustrated in Figure 5.2. As a remark, with the initial speed that will be used in this thesis, the modes approximately correspond to 0g, ±2g and ±4g turns, where g = 9.81 m/s².
m_k | F                    | Description      | In c1 | In c2 | In c3
1   | F_CV                 | Fly straight     | Yes   | Yes   | Yes
2   | F_CT(ω = 5°·π/180°)  | Turn, easy left  | No    | Yes   | Yes
3   | F_CT(ω = −5°·π/180°) | Turn, easy right | No    | Yes   | Yes
4   | F_CT(ω = 10°·π/180°) | Turn, hard left  | No    | No    | Yes
5   | F_CT(ω = −10°·π/180°)| Turn, hard right | No    | No    | Yes
Table 5.1: Definition of the five modes, and their relation to the three classes.
Figure 5.2: The motion of the target when using the five different modes.

The process noise is for simplicity modeled as independent of the mode, as the major part of it is due to the same unforeseen factors (wind, atmospheric pressure, etc.). It affects the motion according to the G-matrix, given by

G = \begin{pmatrix} I_3 T_s^2/2 \\ I_3 T_s \end{pmatrix}.  (5.5)
To summarize, the final motion equation can be expressed by

x_k = F(m_k) x_{k-1} + G w_k.  (5.6)
5.3
Measurement Model
An aircraft gets measurements of its surroundings both from its own sensors and as information from other friendly aircraft or possibly other information nodes (e.g., radar stations on the ground). All information is then fused together and a combined estimate is acquired. For simplicity, we consider the combined estimate as a single position measurement in Cartesian coordinates, i.e.,
H = \begin{pmatrix} I_3 & 0_{3\times3} \end{pmatrix}.  (5.7)
The measurement noise is assumed to be additive and also independent of the mode state.
To summarize, we have the measurement equation
y_k = H x_k + v_k  (5.8)

where v_k ∈ R^{n_v} is the measurement noise distributed as v_k ∼ \mathcal{N}(v_k; 0, R). Note that the measurement equation is entirely independent of the mode, m_k.
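The motion and measurement models above can be put together into a small simulation. The sketch below is illustrative only: the sampling time, noise levels, and mode sequence are hypothetical stand-ins (the values actually used in the thesis are listed in Appendix H).

```python
import numpy as np

Ts = 1.0  # hypothetical sampling time [s]
I3, Z3 = np.eye(3), np.zeros((3, 3))

# Constant velocity model (5.3).
F_CV = np.block([[I3, Ts * I3], [Z3, I3]])

def F_CT(w):
    """Coordinated turn model (5.4) for angular velocity w in the XY-plane."""
    s, c = np.sin(w * Ts), np.cos(w * Ts)
    return np.array([
        [1, 0, 0,  s / w,       -(1 - c) / w, 0],
        [0, 1, 0,  (1 - c) / w,  s / w,       0],
        [0, 0, 1,  0,            0,           Ts],
        [0, 0, 0,  c,           -s,           0],
        [0, 0, 0,  s,            c,           0],
        [0, 0, 0,  0,            0,           1]])

G = np.vstack([Ts**2 / 2 * I3, Ts * I3])   # (5.5)
H = np.hstack([I3, Z3])                    # (5.7)

rng = np.random.default_rng(1)
q, r = 0.1, 10.0  # hypothetical process / measurement noise std deviations

x = np.array([15000., 45000., 5000., 220., 10., 0.])  # initial state (Sec. 7.1)
F = {1: F_CV, 2: F_CT(5 * np.pi / 180), 3: F_CT(-5 * np.pi / 180)}
for mode in [1, 1, 2, 2, 1]:               # a short, hand-picked mode sequence
    x = F[mode] @ x + G @ (q * rng.standard_normal(3))
    y = H @ x + r * rng.standard_normal(3) # position measurement (5.8)
print(y)  # noisy 3-D position after five steps
```

Note that the rotation block of F_CT leaves the speed unchanged, consistent with the constant-speed assumption.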
6
State Estimation
In words, the system (described in Chapter 5) can be seen as a linear state space model at each individual time step, but the system may change from one linear state space model to another between time steps. If the mode state were known for all time steps, it would be possible to estimate x_k with a standard Kalman filter, KF (see Appendix E), but only the probability of the mode changing from one state to another (i.e., the TPM) is known.
The conditional PDF of the base state (x_k) is expressed by

p(x_k|\mathbf{y}_{1:k}, c_i) = \sum_{\forall m_{1:k}} p(x_k, m_{1:k}|\mathbf{y}_{1:k}, c_i) = \sum_{\forall m_{1:k}} p(x_k|m_{1:k}, \mathbf{y}_{1:k}, c_i) P(m_{1:k}|\mathbf{y}_{1:k}, c_i)  (6.1)

where m_{1:k} = \{m_1, m_2, \ldots, m_k\}.
If N_m > 1, the number of possible values of m_{1:k} will grow exponentially with k, making it too computationally heavy to solve for high values of k. As N_m = 5 in our case, we need to approximate the expression somehow. One alternative is to use a marginalized particle filter, MPF, which is described in Section 6.1. Section 6.2 describes a fixed-lag variant of the MPF, which reduces the impact of noise at the cost of the classification being delayed.
6.1
Marginalized Particle Filter
A standard particle filter, PF, spreads particles in all dimensions of the state it is supposed to estimate. As the number of particles needed to ensure a reliable result grows exponentially with the dimension, a lot of particles would be needed in our case. The MPF (also known as the Rao-Blackwellized particle filter) solves this problem by "dividing" the state into two parts. One part is solved with a standard KF and the other part is handled by a PF. In our case, the mode state (m_k) is handled with a PF while the base state (x_k) is solved by using a standard KF. Now, the particles only have to be spread in one dimension, m_k. Caron et al. [2008, Sec. 6] describes an MPF for classifying a target with the TBM classification scheme. The MPF in this section is an extended version of that MPF, with the two Bayesian classification schemes added. In order to classify the target we need one MPF per class. The overall class probability is then computed by comparing how well each class-matched filter works. An overview of this can be seen in Figure 6.1, where n classes are considered.
Figure 6.1: Overview of the marginalized particle filters.
What happens inside each MPF at each time step can be divided into two parts: tracking and classification, described in Section 6.1.1 and Section 6.1.2, respectively. For clarity, the class dependence will be omitted for most of the variables in both sections. The class probability calculation is straightforward, but is briefly described in Section 6.1.3 anyway. An overview of the final algorithm is then given in Section 6.1.4, while the full algorithm is given in Appendix F.1.
6.1.1
Tracking
As mentioned earlier, this MPF tracking algorithm originates from Caron et al. [2008, Sec. 6.2]. An overview of the tracking can be seen in Figure 6.2. Let the particles in the MPF be represented with index (j), j ∈ {1, 2, ..., N_p}, where N_p denotes the number of particles being used in the MPF. Each particle is associated with a mode chain, m^{(j)}_{1:k}, and a weight \tilde{\omega}^{(j)}_k.
Figure 6.2: Overview of the tracking in a single class’ MPF for a single time step.
Each time step starts with sampling: the previous mode state is used together with the information in the TPM to sample a new mode state for the current time step, i.e.,
\tilde{m}^{(j)}_k \sim p(m_k | m^{(j)}_{k-1}, \Pi),  (6.2)
where Π refers to the TPM. The mode state sample \tilde{m}^{(j)}_k is then used together with the state estimate \hat{x}^{(j)}_{k-1|k-1} and its covariance matrix \Sigma^{(j)}_{k-1|k-1} from the previous time step in a KF to compute \{\hat{x}^{(j)}_{k|k}, \Sigma^{(j)}_{k|k}, \varepsilon^{(j)}_{k|k}, S^{(j)}_{k|k}\} (see Appendix E.1 for details), where ε is the innovation and S its covariance matrix.
Given these new values, the weight can be updated by
\tilde{\omega}^{(j)}_k \propto \tilde{\omega}^{(j)}_{k-1} \mathcal{N}(\varepsilon^{(j)}_{k|k}; 0, S^{(j)}_{k|k})  (6.3a)

and

\sum_{j=1}^{N_p} \tilde{\omega}^{(j)}_k = 1.  (6.3b)
P(m_{1:k}|\mathbf{y}_{1:k}, c_i) from (6.1) can then be approximated by

P(m_{1:k}|\mathbf{y}_{1:k}, c_i) \approx \sum_{j=1}^{N_p} \tilde{\omega}^{(j)}_k \delta(m_{1:k} - \tilde{m}^{(j)}_{1:k})  (6.4)

where \tilde{m}^{(j)}_{1:k} = \{\tilde{m}^{(j)}_k, m^{(j)}_{1:k-1}\} and \delta(x) = 1 if x = 0, and \delta(x) = 0 if x \neq 0. With that approximation, the whole posterior can be expressed by

p(x_k|\mathbf{y}_{1:k}, c_i) \approx \sum_{j=1}^{N_p} \tilde{\omega}^{(j)}_k p(x_k|\tilde{m}^{(j)}_{1:k}, \mathbf{y}_{1:k}, c_i)  (6.5)

where p(x_k|\tilde{m}^{(j)}_{1:k}, \mathbf{y}_{1:k}, c_i) = \mathcal{N}(x_k; \hat{x}^{(j)}_{k|k}, \Sigma^{(j)}_{k|k}).
For each class, the combined estimate (using the minimum mean square error estimator, MMSE) is given by

\hat{x}^{MMSE}_{k|k} = \sum_{j=1}^{N_p} \tilde{\omega}^{(j)}_k \hat{x}^{(j)}_{k|k}.  (6.6)
After the computation of the combined estimate, the particles are resampled, i.e., the probable particles are duplicated and the unlikely particles are deleted. \tilde{m}^{(j)}_k is resampled into m^{(j)}_k. More specifically, each m^{(j)}_k has probability \tilde{\omega}^{(j)}_k of being equal to \tilde{m}^{(j)}_k. The weight, \tilde{\omega}^{(j)}_k, is then set to 1/N_p for all particles.
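The resampling step can be sketched as follows. This is a minimal multinomial resampler; the thesis does not specify which resampling variant is used, so this is one common choice shown for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

def resample(modes, weights):
    """Multinomial resampling: duplicate probable particles, drop unlikely
    ones, then reset all weights to 1/Np."""
    Np = len(modes)
    idx = rng.choice(Np, size=Np, p=weights)   # draw index j with prob. w^(j)
    return modes[idx], np.full(Np, 1.0 / Np)

modes = np.array([1, 2, 3, 4, 5])
weights = np.array([0.70, 0.20, 0.05, 0.04, 0.01])
new_modes, new_w = resample(modes, weights)
print(new_modes)  # mode 1 is likely duplicated several times
print(new_w)      # uniform weights 1/Np
```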
After the resampling, the particles are used in the classification, as described in Section 6.1.2, before the next time step starts.
6.1.2
Classification
As mentioned earlier, this classification algorithm originates from Caron et al. [2008, Sec. 6.3]. An overview of the classification step is given in Figure 6.3. The class likelihood, p(\mathbf{y}_{1:k}|c_i), is in some form needed in all described classification schemes, and is given by

p(\mathbf{y}_{1:k}|c_i) = \sum_{m_{1:k}} p(\mathbf{y}_{1:k}|m_{1:k}, c_i) P(m_{1:k}|c_i) \approx \sum_{j=1}^{N_p} \omega^{(j)}_k p(\mathbf{y}_{1:k}|m^{(j)}_{1:k})  (6.7)

for each class, where \omega^{(j)}_k is the classification weight, given by

\omega^{(j)}_k \propto \tilde{\omega}^{(j)}_k p(\mathbf{y}_{1:k}|m^{(j)}_{1:k})  (6.8a)
and

\sum_{j=1}^{N_p} \omega^{(j)}_k(c_i) = 1.  (6.8b)

Figure 6.3: Overview of the classification in a single class' MPF for a single time step.
p(\mathbf{y}_{1:k}|m^{(j)}_{1:k}) can be expressed by

p(\mathbf{y}_{1:k}|m^{(j)}_{1:k}) = p(\mathbf{y}_{1:k-1}|m^{(j)}_k, m^{(j)}_{1:k-1}) \, p(y_k|\mathbf{y}_{1:k-1}, m^{(j)}_{1:k})  (6.9)

where p(y_k|\mathbf{y}_{1:k-1}, m^{(j)}_{1:k}) = \mathcal{N}(\varepsilon^{(j)}_{k|k}; 0, S^{(j)}_{k|k}). Note that the dependency on m^{(j)}_k in the first factor can be removed due to a feature of the Markov modelling, which states that measurements are independent of future mode states.
The quadratic scalar value, used in both the TBM classification scheme and the Bayesian classification scheme with entropic priors, is computed recursively by

r_k(\mathbf{y}_{1:k}, m^{(j)}_{1:k}) = r_{k-1}(\mathbf{y}_{1:k-1}, m^{(j)}_{1:k-1}) + (\varepsilon^{(j)}_{k|k})^T (S^{(j)}_{k|k})^{-1} \varepsilon^{(j)}_{k|k}.  (6.10)
Once that value is obtained, the plausibility can be computed with

pl_R(\mathbf{y}_{1:k}|c_i) = 1 - \sum_{j=1}^{N_p} \omega^{(j)}_k F_{\dim(\mathbf{y}_{1:k})+2}(r_k(\mathbf{y}_{1:k}, m^{(j)}_{1:k}))  (6.11)

for each class, according to Caron et al. [2008, Eq. (49)], where F_p(x) is the CDF of a \chi^2-distribution and is given by (3.13).
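A small self-contained sketch of the plausibility computation in (6.11), with a hand-rolled χ²-CDF based on the series expansion of the regularized lower incomplete gamma function so that no external libraries are needed. The particle weights and quadratic values here are made-up examples.

```python
import math

def chi2_cdf(x, df):
    """CDF of a chi-square distribution, F_df(x), via the series expansion
    of the regularized lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    a, xh = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = a
    while True:
        n += 1.0
        term *= xh / n
        total += term
        if term < total * 1e-15:
            break
    return total * math.exp(a * math.log(xh) - xh - math.lgamma(a))

# Plausibility (6.11) for one class: a weighted mix over particles.
weights = [0.5, 0.3, 0.2]     # classification weights (illustrative)
r_values = [2.0, 8.0, 20.0]   # quadratic values r_k (illustrative)
dim = 3                       # dim(y_1:k) for a single 3-D position measurement
pl = 1.0 - sum(w * chi2_cdf(r, dim + 2) for w, r in zip(weights, r_values))
print(round(pl, 4))
```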
For the Bayesian classification scheme with entropic priors, the approximation

e^{-\frac{1}{2}(\mathbf{y}_{1:k} - \hat{\mathbf{y}}_{1:k})^T \Sigma_{1:k}^{-1} (\mathbf{y}_{1:k} - \hat{\mathbf{y}}_{1:k})} \approx \sum_{j=1}^{N_p} \omega^{(j)}_k e^{-\frac{1}{2} r_k(\mathbf{y}_{1:k}, m^{(j)}_{1:k})}  (6.12)

is used as input to the class probability calculations. The proposed way to handle the problem with low values, described in Section 4.4, is used in a corresponding way in the approximation, but is not included here for clarity.
6.1.3
Class Probability Calculations
The standard Bayesian class probabilities, P_{Bayes}(c_i|\mathbf{y}_{1:k}), are calculated by using p(\mathbf{y}_{1:k}|c_i) from (6.7) in (2.2). In order to get P_{TBM}(c_i|\mathbf{y}_{1:k}), simply apply the GBT (see (3.9)) on the plausibilities in (6.11) to get the BBAs, and then the pignistic transformation (see (3.8)) on the BBAs. The Bayesian class probabilities with entropic priors, P_{Entropic}(c_i|\mathbf{y}_{1:k}), are calculated by using (4.9) with the approximation in (6.12).
6.1.4
Overview
Each time step, k, in the MPF can be summarized with the following algorithm.

MPF, single time step algorithm
For each class:
  Sampling: \tilde{m}^{(j)}_k \sim p(m_k | m^{(j)}_{k-1}, \Pi)
  KF(\tilde{m}^{(j)}_k, \hat{x}^{(j)}_{k-1|k-1}, \Sigma^{(j)}_{k-1|k-1}) \Rightarrow \hat{x}^{(j)}_{k|k}, \Sigma^{(j)}_{k|k}, \varepsilon^{(j)}_{k|k}, S^{(j)}_{k|k}
  Update weight: \tilde{\omega}^{(j)}_k \propto \tilde{\omega}^{(j)}_{k-1} \mathcal{N}(\varepsilon^{(j)}_{k|k}; 0, S^{(j)}_{k|k}), \sum_{j=1}^{N_p} \tilde{\omega}^{(j)}_k = 1
  \hat{x}^{MMSE}_{k|k} = \sum_{j=1}^{N_p} \tilde{\omega}^{(j)}_k \hat{x}^{(j)}_{k|k}
  Resampling: \tilde{m}^{(j)}_k \xrightarrow{\tilde{\omega}^{(j)}_k} m^{(j)}_k
  Classification(\varepsilon^{(j)}_{k|k}, S^{(j)}_{k|k})
Class probability calculations
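The weight-update and MMSE steps of the algorithm above can be sketched numerically as below. The innovations and per-particle estimates here are random stand-ins; in the real filter they come from the per-particle Kalman filters.

```python
import numpy as np

def update_weights(w_prev, eps, S):
    """Weight update (6.3): multiply each particle's previous weight by the
    Gaussian innovation likelihood N(eps; 0, S), then normalize."""
    d = eps.shape[1]
    lik = np.array([
        np.exp(-0.5 * e @ np.linalg.solve(Si, e))
        / np.sqrt((2 * np.pi)**d * np.linalg.det(Si))
        for e, Si in zip(eps, S)])
    w = w_prev * lik
    return w / w.sum()

rng = np.random.default_rng(3)
Np, d = 4, 3
w_prev = np.full(Np, 1.0 / Np)
eps = rng.standard_normal((Np, d))            # stand-in innovations
S = np.stack([np.eye(d)] * Np)                # stand-in innovation covariances
x_hat = rng.standard_normal((Np, 6))          # stand-in per-particle estimates

w = update_weights(w_prev, eps, S)
x_mmse = w @ x_hat                            # MMSE estimate (6.6)
print(w.sum(), x_mmse.shape)
```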
6.2
Marginalized Particle Filter with Fixed Lag
The general idea behind fixed-lag smoothing is to wait a fixed number of time steps, L, before estimating the states at time step k. The estimation result should improve with the additional information from \mathbf{y}_{k+1:k+L}, at the cost of a delayed estimation. However,
implementing this in a PF for a JMLS is not trivial. As stated in Doucet et al. [2001, Sec. IV.A], there are different strategies to consider when choosing how to sample the particles with fixed lag.
One strategy is to sample the particles at k + L from the distribution at k, i.e., using the TPM L times, which would be computationally cheap compared to the alternatives. However, since the particles have been sampled L times, only a few particles will be "close" to the true trajectory, which makes the estimation poor. This problem is known as "sample depletion". This strategy is illustrated in Figure 6.4a.
Another way is to use all the measurements up to time step k + L to sample the particles at k. This would include running KFs for every possible mode sequence from time step k + 1 to k + L. The result would be optimal, but the computational cost would be too expensive, as [number of modes]^L >> 1. This strategy is illustrated in Figure 6.4b. The proposed way to deal with the problem in Doucet et al. [2001] is to use a Markov chain Monte Carlo (MCMC) algorithm that performs internal time steps for each time step from k to k + L, including sampling, running KFs with the measurements, and resampling. This would only have time complexity O(L) compared to O([number of modes]^L) as in the second alternative, but with an estimation result close to the optimal, i.e., a preferable compromise between time complexity and estimation result. A general MCMC strategy is illustrated in Figure 6.4c. Note that the particles are less spread than in Figure 6.4a. However, there seems to be an error in that algorithm (in the updating of the particles' base states) that makes it impossible to use (perhaps a typo or a forgotten initialization). Unfortunately, it was not possible to correct it during this thesis project.
All three alternatives would improve the sampling, which would make it possible to run the MPF with fewer particles. As none of the strategies was attractive to use, a simpler version of fixed lag was used in this thesis instead. The idea is to let the sampling and resampling continue as before, but use a KF with fixed lag (described in Appendix E.2) that computes additional delayed estimates. The most delayed estimate is then used in the classification instead of the present estimate. The following algorithm describes a single step in the proposed MPF with fixed lag.
MPF with fixed lag, single time step algorithm
For each class:
  Sampling: \tilde{m}^{(j)}_k \sim p(m_k | m^{(j)}_{k-1}, \Pi)
  KF(\tilde{m}^{(j)}_k, [\hat{x}^{(j)}_{l|k-1}, \Sigma^{(j)}_{l|k-1}]_{l=k-L-1}^{k-1}) \Rightarrow [\hat{x}^{(j)}_{l|k}, \Sigma^{(j)}_{l|k}, \varepsilon^{(j)}_{l|k}, S^{(j)}_{l|k}]_{l=k-L}^{k}
  Update weight: \tilde{\omega}^{(j)}_k \propto \tilde{\omega}^{(j)}_{k-1} \mathcal{N}(\varepsilon^{(j)}_{k|k}; 0, S^{(j)}_{k|k}), \sum_{j=1}^{N_p} \tilde{\omega}^{(j)}_k = 1
  \hat{x}^{MMSE}_{k-L|k} = \sum_{j=1}^{N_p} \tilde{\omega}^{(j)}_k \hat{x}^{(j)}_{k-L|k}
  Resampling: \tilde{m}^{(j)}_k \xrightarrow{\tilde{\omega}^{(j)}_k} m^{(j)}_k
  Classification(\varepsilon^{(j)}_{k-L|k}, S^{(j)}_{k-L|k})
Class probability calculations
(a) Sampling L times, where L = 3. 400 particles.
(b) Consider all possible mode sequences and then use the best.
(c) A general MCMC algorithm. 400 particles.
Figure 6.4: Illustrations of the three sampling strategies described in Section 6.2. Each particle is represented by a dot, each mode sequence by a line and the true mode sequence by a dashed line. Note that the figures only illustrate how the particles are spread in each algorithm, not how the particles' states are updated. Thus, the problematic MCMC algorithm can be shown here too.
7
Results
In this chapter, a trajectory is defined in Section 7.1 and the results for the trajectory with the two estimation methods from Chapter 6, the standard MPF and the MPF with fixed lag, are presented in Section 7.2 and Section 7.3, respectively. Section 7.4 describes one way to measure the usability of each method and presents the results of it.
7.1
Flight Trajectory
Suppose that the target has the initial base state x_0 = [15000  45000  5000  220  10  0]^T and then switches between flying straight and turning according to Table 7.1.
Event                    | Turn angle [°/s] | Time [s]
Start of trajectory      | 0                | 1
Start of easy left turn  | 5                | 80
End of easy left turn    | 0                | 97
Start of hard right turn | -10              | 161
End of hard right turn   | 0                | 169
End of trajectory        | 0                | 240
Table 7.1: Specified flight trajectory.
The target motion can then be seen in Figure 7.1.
Figure 7.1: The flight trajectory, seen in two dimensions. The target starts in (15km,45km), makes a left turn at (33km,46km), makes a right turn at (34km,63km) and the trajectory ends at (51km,64km).
7.2
Results from Standard MPF
The results from the standard MPF are given in Figure 7.2 for the trajectory given in Section 7.1. The parameters for the result are given in Table H.1a, the TPMs in (H.1), (H.2) and (H.3), and the initial state and covariance in (H.4). The results from the Bayesian approach with uniform priors are almost identical to the results from the Bayesian approach with entropic priors; more about this will be described in Section 8.1. Unlike the example in Section 2.3, the Bayesian approach with uniform priors does not favor class c1 when flying straight. This is also covered in Section 8.1.
The results from the TBM seem to be less decisive than the results from Bayes, i.e., the TBM is less sensitive to noise, but the turns do not have as high an impact on the classification result as with Bayes (notice that the class probability for c2 does not reach zero in Figure 7.2d). This could be either an advantage or a disadvantage depending on the application. In this case, where we imagine that a lot of different attributes (kinematic behavior, shape, electro-magnetic emissions, etc.) are fused together in a large classification system, being indecisive is preferred, since an inaccurate classification of the kinematic behavior might damage the overall classification in the large system.
Also note that class c2 seems to be favored after the first turn for all three classification schemes. This probably has to do with the number of particles being used in each MPF.
(a) Bayes with uniform priors, N = 1
(b) Bayes with uniform priors, N = 10
(c) Transferable Belief Model, N = 1
(d) Transferable Belief Model, N = 10
(e) Bayes with entropic priors, N = 1
(f) Bayes with entropic priors, N = 10

Figure 7.2: Results from the different classification schemes for the flight trajectory in Table 7.1, with the standard MPF and with filter parameters from Table H.1a. N is the number of Monte Carlo (MC) simulations.
(a) Bayes with uniform priors, N = 1
(b) Bayes with uniform priors, N = 10
(c) Transferable Belief Model, N = 1
(d) Transferable Belief Model, N = 10
(e) Bayes with entropic priors, N = 1
(f) Bayes with entropic priors, N = 10

Figure 7.3: Results from the different classification schemes for the flight trajectory in Table 7.1, with the MPF with fixed lag and with filter parameters from Table H.1b. N is the number of MC simulations.