Trajectory and Pulse Optimization for Active Towed Array Sonar using MPC and Information Measures


UPTEC F 20047

Degree project, 30 credits, September 2020


Fabian Ekdahl Filipsson


Faculty of Science and Technology, UTH unit

Visiting address: Ångströmlaboratoriet, Lägerhyddsvägen 1, House 4, Floor 0

Postal address: Box 536, 751 21 Uppsala

Telephone: 018 – 471 30 03

Fax: 018 – 471 30 00

Website: http://www.teknat.uu.se/student

Abstract


In underwater tracking and surveillance, the active towed array sonar presents a way of discovering and tracking adversarial submerged targets that try to stay hidden. The configuration consists of listening and emitting hydrophones towed behind a ship.

However, it has inherent limitations, and the characteristics of sound in the ocean are complex. By varying the emitted pulse form and the trajectory of the ship, the measurement accuracy may be improved. This type of optimization constitutes a sensor management problem. In this thesis, a model of the tracking scenario has been constructed, derived from Cramér-Rao bound analyses. A model predictive control approach, together with information measures, has been used to optimize a filter's estimated state of the target. The simulations have been carried out in the MATLAB environment. Different combinations of decision horizons, information measures, and variations of the Kalman filter have been studied. It has been found that the accuracy of the extended Kalman filter is too low to give consistent results given the studied information measures. However, the unscented Kalman filter is sufficient for this purpose.

Subject reader: Dave Zachariah. Supervisor: Isaac Skog


Popular Science Summary

One of the research areas in Sweden today is underwater surveillance. One way to detect and track submarines that try to hide is to use towed sonar systems that actively search for hostile targets by means of sound pulses. The idea is to use an arrangement of hydrophones that convert pressure differences in the water into electrical signals. The geometry makes it possible to determine the range to, the direction to, and the axial speed of a target. The arrangement is towed at a considerable distance from the ship so that self-noise is minimized. The idea behind active towed array sonar (ATAS) is simple in theory but difficult to implement in practice. The acoustics of the ocean are complicated, and ATAS has limitations to contend with. During turns the array bends, which makes it difficult to measure correctly, and different types of sound pulses give different accuracy for range and speed determination. By choosing the right route and pulse form, the accuracy of measurements of a target can be improved.

This type of optimization corresponds to a sensor management problem. In this report, a state model for a tracking ATAS has been built using Cramér-Rao lower bound analysis. The MPC method has been used to optimize a given filter's estimated state of a target. To evaluate the information gained from different measurements, the method has used different optimality criteria. Different combinations of decision horizons, optimality criteria, and variations of the Kalman filter have been studied. It has been noted that the accuracy of the extended Kalman filter is too low to give consistent results for the optimality criteria investigated. The unscented Kalman filter, however, fulfils this purpose.


Acknowledgements

The author would like to thank Isaac Skog for proposing the thesis work and for providing guidance and help throughout the period. Thanks also go to Dave Zachariah, who was the subject reviewer of the project. In addition, Andreas Nöjd deserves a thank you for providing insights into the order of magnitude of the parameters of the model. Furthermore, thank you to FOI Kista for the opportunity to carry out the master's thesis at your facilities, and to everybody else at FOI who gladly listened to and shared ideas.

In closing, this thesis marks the end of my degree in the Master's Programme in Engineering Physics (300 credits). It is thus also appropriate to show my gratitude towards my family for their support throughout the years. Thank you.

Warmly, Fabian Ekdahl Filipsson September 25, 2020


Notation

Table 1: Sets, spaces and subspaces.

Notation            Meaning
R                   Real numbers
R^n                 Real n-vector
R^(m×n)             Real m × n matrix
K, K⁰               Sets of time steps of interest
U^(pulse)           Control set of pulse forms
U^(trajectory)      Control set of trajectories
[a b ...]           Row vector with first element a and second b

Table 2: Symbols, operators, and functions.

Notation            Meaning
I                   Identity matrix
I(·)                Fisher information matrix
I                   Sound intensity
P_signal            Power of a signal
A^T                 Transpose of matrix (or vector) A
det(A)              Determinant of matrix A
‖A‖_F               Frobenius norm of matrix A
Cov(θ)              Covariance of vector θ
Var(θ)              Variance of variable θ
CRB(θ)              Cramér-Rao bound for variable θ
E(θ)                Mean of vector θ
diag(θ)             Diagonal matrix with vector θ spanning the diagonal
blkdiag(A,B)        Block diagonal matrix of the matrices A and B
k                   Time step
r_k                 Range at time k
ṙ_k                 Axial velocity at time k
γ_k                 Bearing at time k
β_k                 Angle with respect to Cartesian frame at time k
y_k                 Measurement vector at time k
x_k                 Generic state at time k
s_k                 Position state at time k
v_k                 Velocity state at time k
x^t_k               State of target at time k
x^p_k               State of platform at time k
v^p                 The speed of the platform
u_k                 A generic control input at time k
u^*_k               The optimal control input at time k
u^(pulse)_k         Control input of pulse forms, at time k
u^(trajectory)_k    Control input of trajectories, at time k
x̂^t_m|n             Filter estimate of x^t at time m given data up to time n
P_m|n               Filter covariance matrix at time m given data up to time n
e_k                 Measurement noise vector at time k
w_k                 Process noise vector at time k
σ_w²                Variance of process noise
R_k                 Measurement error covariance matrix at time k
∇_θ                 Gradient operator with respect to the parameters in a vector θ
J(·)                Future costs
J^*(·)              Optimal future costs
g_k(·)              Stage cost
g_N(·)              Terminal cost
N(µ, σ²)            Gaussian distribution with mean µ and variance σ²
c                   A generic constant
v_sound             The speed of sound in water
τ                   Time step length (sampling interval)
∆ω                  Relative Doppler shift
∆ω⁰                 Doppler shift
ω₀                  Initial carrier frequency of signal

Table 3: Acronyms and abbreviations.

Notation            Meaning
DP                  Dynamic programming
TAS                 Towed array sonar
ATAS                Active towed array sonar
MPC                 Model predictive control
CRB                 Cramér-Rao bound
SNR                 Signal to noise ratio
CV                  Constant velocity
rMSE                Root mean square error
TL                  Transmission loss
PDF                 Probability density function


Chapter 1

Introduction

One of the focus areas of military research in Sweden is underwater tracking and surveillance techniques. A subarea of this is active towed array sonar techniques, which can be used to scan the ocean for hidden vessels. This thesis deals with this subarea and investigates optimizing its usage in a defined tracking scenario.

1.1 Background

In underwater surveillance, sonar is the dominating form of detection and tracking. To detect sound, a hydrophone converts pressure fluctuations into readable electric signals.

Of particular interest is sound created by adversarial vessels that attempt to stay hidden.

However, other sounds in the sea are also picked up, such as those of waves, marine life, and industry.

Commonly, hydrophones measure omnidirectionally. By placing multiple hydrophones in various geometries, such as in a straight array, one enables measuring directions as well.

Configurations exist such as hull-mounted arrays, where the hydrophones are placed along each hull of a vessel. However, these suffer from recording much of the noise from their own ship, referred to as self-noise. Another configuration, which overcomes this problem, is the towed array sonar (TAS). In the TAS, the array is not mounted on the ship but on a cable that is towed behind it. This reduces unwanted self-noise immensely. The earliest known experiments with TAS had a detection range of about 2000 meters [1]. Today, detection ranges of up to 100 km can be obtained [2].

The TAS can only listen and is therefore referred to as a passive sonar. In active sonar, pulses are emitted and reflected, and the echoes are listened for as the sound propagates back again. By also towing a projector that emits pulses behind the ship, the towed array sonar can "be active" as well. This is referred to as active towed array sonar (ATAS). An illustration of an ATAS system is shown in Figure 1.1. In this thesis, only the ATAS will be considered.

The ATAS considered in the thesis is capable of measuring range to a target. It can also measure the axial velocity between the target and itself through the doppler shift of the received signals. Lastly, the ATAS can measure bearing to the target. Here, bearing refers to the angle of the signal with respect to the aperture of the array.


The idea behind ATAS is simple in concept but complex in implementation. The acoustic characteristics of the ocean are complex and can deviate locally in unexpected ways. Sound is refracted and bent as the water's density changes with temperature and salinity [2]. The geometry and type of ocean floor dictate how sound propagates in shallow water, and noise caused by both human activity and nature is constantly present. Lastly, the physical configuration of the ATAS has inherent limitations that need to be accounted for.

To improve the measuring of a target’s state the crew of the ATAS varies the trajectory of the ship and decides on sending pulses with varying properties. These decisions have so far been tuned manually by the crew based on previous experience.

More specifically, broadband pulses give superior range accuracy but worse accuracy in the speed of the target, whereas narrowband pulses give worse range accuracy but better speed accuracy. Furthermore, different positions and orientations of the ATAS and the target relative to each other give varying accuracy in the measured range, velocity, and bearing. However, changing the direction of the ATAS bends the cable carrying the hydrophones and distorts the sensor data. In fact, a turn may take an entire kilometer, and once it is completed the crew has to wait a whole minute before they can assume that the cable is straight enough to measure again. Even when going straight, turbulence in the water can bend the array or shift it so that it lies as much as 10° off its assumed direction. Moreover, the speed of the ship may affect the buoyancy of the cable and present further challenges. As highlighted, there are many considerations to be taken into account when using ATAS systems.

Figure 1.1: An illustration of active towed array sonar. An array of listening hydrophones is towed behind an ally ship (Blue). In addition, an emitting hydrophone is towed closer to the ship. Sound waves are emitted that hit an adversarial submarine (Red) and are reflected back to the listening hydrophones on the array.


1.2 Problem Formulation

The problem to be considered in this thesis is that of an ATAS tracking a target. One may think of the ATAS as a moving platform with attached sensors. The task of determining the trajectory and signal pulses of the ATAS can be viewed as a sensor management problem, as defined in [3]. By changing the operational configuration of the system, it can react to and learn from previous measurements. The configuration is changed both to satisfy operational constraints, such as not driving the boat too fast, and to achieve operational objectives, such as measuring accurately enough. To satisfy constraints and achieve objectives, one typically seeks an optimal policy for determining the sensor configuration for the next measuring stage. The question then arises of how to construct a policy that determines a decision given the currently available information.

The currently available information may be quantified and judged by converting it to scalar measures, such as information measures, and then acted upon so as to satisfy the operational constraints [4]. The aim of the thesis is to investigate how sensor management and information-based route planning can be used to improve tracking for an ATAS estimating the range, velocity, and bearing of a target.

1.3 Related Work

Planning is a large research field in which several communities work independently on the same problems. Several subfields exist, among them control, robotics, data fusion, and artificial intelligence. One may divide planning into path planning, which concerns kinematics, and sensor planning, which concerns the configuration of the sensor itself, as done in [5]. The licentiate thesis [6] deals with the path planning part and has been an inspiration for this master's thesis.


Chapter 2

Theory

The following chapter puts the problem presented in Chapter 1 into a mathematical framework. It starts off with a description of deterministic dynamic programming, which is commonly used to solve optimization problems. It proceeds by looking at what measurement accuracy one can expect from ATAS, using Cramér-Rao bound analyses. Next, the model for the specific scenario is defined. After this, two filters are presented for the scenario model. Then, information measures are presented as a means of quantifying the goodness of estimates, and they are tied together with the dynamic programming framework. The chapter ends by formulating the problem in Chapter 1 as an optimization problem.

2.1 Deterministic Dynamic Programming

Within the area of sensor management, different techniques for solving optimization problems exist, and one of these is dynamic programming (DP) [7]. The following theory on DP is based on [8].

Consider a discrete-time dynamic system

$$x_{k+1} = f_k(x_k, u_k), \quad k = 0, 1, \dots, N-1 \tag{2.1}$$

Figure 2.1: A state $x_k$ evolves over time in stages (steps). It starts at $x_0$ and finishes at a terminal node, denoted $x_N$. The state evolves according to $x_{k+1} = f_k(x_k, u_k)$, where $u_k$ is a control input. At every stage a cost $g_k(x_k, u_k)$ is incurred. At the last stage, however, no control input can be chosen and a terminal cost $g_N(x_N)$ is incurred.


where $x_k \in \mathbb{R}^{n \times 1}$ is the state of the system, $k$ is a time index, and $u_k$ is a control input that belongs to a control set $U_k(x_k)$, which in turn depends on $x_k$. $N$ is the number of times control is applied, also referred to as the horizon. As shown in Figure 2.1, the state $x_k$ transitions to the next state $x_{k+1}$ and, in doing so, incurs a stage cost $g_k(x_k, u_k)$.

Sequentially, at every stage, the next control input $u_k$ is chosen as

$$u_k = \arg\min_{u} \left[ g_k(x_k, u) + J^*(x_{k+1}) \right] \tag{2.2}$$

where $J^*(x_{k+1})$ is the optimal future cost starting from $x_{k+1}$, the state reached from $x_k$ by applying $u$. For an initial state $x_0$, the goal is to minimize, over a sequence of controls $[u_0, \dots, u_{N-1}]$, the cost function

$$J(x_0; [u_0, \dots, u_{N-1}]) = g_N(x_N) + \sum_{k=0}^{N-1} g_k(x_k, u_k) \tag{2.3}$$

where $g_N(x_N)$ is the terminal cost, which can be seen as the cost of transitioning from the terminal state $x_N$ to a fictional state that comes after this last state.

In DP, the principle of optimality states that if $[u_0^*, \dots, u_{N-1}^*]$ is an optimal control sequence, then its tail $[u_k^*, \dots, u_{N-1}^*]$ is an optimal solution of the tail subproblem that starts at stage $k$. This gives the general DP algorithm, which starts with tail subproblems of length one and works backwards. More specifically, start with $J_N^*(x_N) = g_N(x_N)$ for every possible terminal node $x_N$ and, for $k = N-1, \dots, 0$, go backwards by calculating

$$J_k^*(x_k) = \min_{u_k \in U_k} \left[ g_k(x_k, u_k) + J_{k+1}^*(f_k(x_k, u_k)) \right]. \tag{2.4}$$

The optimal cost for the problem is obtained as $J_0^*(x_0)$. The optimal control sequence may then be identified by "going forward". That is, after "going backwards", $J_k^*$ is known for every state and stage, and the control sequence is obtained from

$$u_k^* = \arg\min_{u_k \in U_k} \left[ g_k(x_k^*, u_k) + J_{k+1}^*(f_k(x_k^*, u_k)) \right] \tag{2.5}$$

for $k = 0, \dots, N-1$, where $x_0^* = x_0$ and $x_{k+1}^* = f_k(x_k^*, u_k^*)$.

The reason for using the word deterministic comes from that the stage transitions and incurred costs presented here do not depend on any stochastic variable.
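The backward and forward passes in (2.4) and (2.5) can be sketched in a few lines of Python. This is a minimal illustration for a finite state space, not the implementation used in the thesis. The function name `solve_dp` and the toy problem (a position on the integers 0 to 2, a fuel cost of 0.5 per move, and a quadratic distance cost to position 2) are assumptions made only for this example.

```python
def solve_dp(x0, N, states, controls, f, g, g_N):
    """Finite-horizon deterministic DP: backward pass (2.4), forward pass (2.5).

    states:   iterable of all (hashable) states
    controls: controls(k, x) -> iterable of admissible controls U_k(x)
    f:        f(k, x, u) -> next state, which must be a member of `states`
    g, g_N:   stage cost g(k, x, u) and terminal cost g_N(x)
    """
    states = list(states)
    # J[k][x] holds the optimal cost-to-go J_k^*(x).
    J = [dict() for _ in range(N + 1)]
    for x in states:
        J[N][x] = g_N(x)
    # Backward pass: k = N-1, ..., 0.
    for k in range(N - 1, -1, -1):
        for x in states:
            J[k][x] = min(g(k, x, u) + J[k + 1][f(k, x, u)]
                          for u in controls(k, x))
    # Forward pass: follow the minimizing control from x0.
    x, sequence = x0, []
    for k in range(N):
        u_star = min(controls(k, x),
                     key=lambda u: g(k, x, u) + J[k + 1][f(k, x, u)])
        sequence.append(u_star)
        x = f(k, x, u_star)
    return J[0][x0], sequence


# Hypothetical toy problem (illustration only).
def toy_controls(k, x):
    return (0,) if x >= 2 else (0, 1)   # stay, or move one step forward


def toy_f(k, x, u):
    return x + u


def toy_g(k, x, u):
    return (x - 2) ** 2 + 0.5 * u       # distance penalty plus fuel


def toy_g_N(x):
    return (x - 2) ** 2


best_cost, best_sequence = solve_dp(0, 2, range(3), toy_controls,
                                    toy_f, toy_g, toy_g_N)
print(best_cost, best_sequence)  # 6.0 [1, 1]
```

Storing the full table `J[k][x]` is what makes the forward pass cheap, at the price of memory that grows with the number of states and stages.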

Example 2.1.1. Consider the problem of a reconnaissance boat located at a dock. The boat can only scout in the x,y-plane. At time step k = 0, a submarine emerges from the depths of the ocean a distance away from the dock. It will stay in place, but after three time steps it will submerge again. The boat's goal is to estimate the position of the submarine before it disappears. To get good measurements, the boat can leave the dock and move right, left, or forward depending on its current position. However, each action costs fuel and results in a different measurement accuracy. At the end, the boat has to return to the dock.

In Figure 2.2, the lowest optimal cost and the optimal control sequence for the problem can be found by following the blue nodes. By going backwards and comparing future costs the optimal cost for the problem can be found. By going forward and comparing all the immediate costs and future costs the optimal control sequence can be found.

Figure 2.2: Every node displays the optimal cost for the subproblem. A node is blue if it results in an optimal path and red if it does not. The arrows correspond to the right, left, or forward movements and display the incurred cost of that particular movement. The leaf nodes display the terminal cost of returning to the dock, which is also the optimal cost at the last state.

End of example

2.1.1 Model Predictive Control

In model predictive control (MPC), a decision horizon is defined for the optimization problem. As in the DP approach presented previously, an optimal control sequence $[u_0^*, \dots, u_{N-1}^*]$ is found, but in MPC only the first control $u_0^*$ is applied and the rest are discarded. The optimization is then repeated over a shifted decision horizon.
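The receding-horizon idea can be illustrated with a small Python sketch. It assumes a one-dimensional toy problem in which a pursuer on the integers tries to stay close to a target, predicts the target as stationary over the horizon, and finds the best sequence by brute-force enumeration. The function names `plan` and `run_mpc` and all numbers are hypothetical and chosen only for illustration.

```python
from itertools import product


def plan(x, target, horizon, controls=(-1, 0, 1)):
    """Return the control sequence minimizing the summed squared distance
    to `target` over the horizon (the target is predicted as stationary)."""
    best_seq, best_cost = None, float("inf")
    for seq in product(controls, repeat=horizon):
        pos, cost = x, 0.0
        for u in seq:
            pos += u
            cost += (pos - target) ** 2
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq


def run_mpc(x0, target_path, horizon):
    """Re-plan at every step and apply only the first control of each plan."""
    x, applied = x0, []
    for target in target_path:
        u = plan(x, target, horizon)[0]   # the rest of the plan is discarded
        applied.append(u)
        x += u
    return x, applied


final_x, applied_controls = run_mpc(0, [3, 3, 3, 2, 1], horizon=2)
print(final_x, applied_controls)  # 1 [1, 1, 1, -1, -1]
```

Because the plan is recomputed whenever a new target position arrives, the pursuer reverses direction as soon as the target starts to move back, even though each individual plan assumed a stationary target.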

Example 2.1.2. Consider Example 2.1.1 again, with the exception that the submarine will not submerge after a given time. Also, the submarine will move around rather than stay in place. An MPC approach is now suitable for the problem. After each completed stage, the reconnaissance boat is presented with a new position of the submarine and re-evaluates the best route for tracking.

End of example

2.2 Measurement accuracy

As discussed in the introduction, a multitude of factors affect the accuracy of a sonar measurement. In this report, the focus is mainly on the signal to noise ratio (SNR). Note that the literature cited in the following sections assumes a narrowband signal, mostly because broadband analyses can be very difficult. Thus, to conform to the problem formulation of the thesis with regard to narrowband and broadband pulses, the assumption is made that the broadband pulse is only slightly broader in its spectrum than the narrowband pulse. For a source on narrowband requirements, please see [9].


2.2.1 Fisher Information & Cramér-Rao Bound

A lower bound on the variance of an estimate can be found (if it exists) through Cramér-Rao bound (CRB) analyses. Consider a random variable $Y$ and the corresponding probability density function (PDF) $f(y|\theta)$, where $\theta$ is an unknown but deterministic parameter.

A sample of data (a realization of $Y$) provides some information about the parameter $\theta$. The Fisher information quantifies how much information is obtained. For the multivariate case, where $Y \sim f(y|\theta)$ and $\theta = [\theta_1, \dots, \theta_N]$, the Fisher information is a matrix defined as

$$I(\theta) = E\left\{ \nabla_\theta \log f(y|\theta) \, \nabla_\theta \log f(y|\theta)^T \right\}. \tag{2.6}$$

Here $\nabla_\theta$ is the gradient with respect to the parameter vector $\theta$, and $\log f(y|\theta)$ is the log-likelihood function.

Consider any unbiased¹ estimator $\hat{\theta}$ of $\theta$. The inverse of the Fisher information matrix then bounds the covariance matrix of the estimator,

$$\mathrm{Cov}(\hat{\theta}) \geq I(\theta)^{-1}, \tag{2.7}$$

where the diagonal elements of $I(\theta)^{-1}$ represent the CRB of the individual parameters in the parameter vector $\theta$. The matrix inequality here means that the matrix $\mathrm{Cov}(\hat{\theta}) - I(\theta)^{-1}$ is positive semidefinite.
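As a small numerical illustration of the bound, consider the textbook case of estimating the mean $\theta$ from $n$ independent samples of $N(\theta, \sigma^2)$ with known $\sigma$. The Fisher information is then $n/\sigma^2$, so the CRB is $\sigma^2/n$, and the sample mean attains it. The Python sketch below checks this by Monte Carlo simulation. The example, the function names, and the chosen numbers are illustrative assumptions and do not come from the thesis.

```python
import random


def crb_gaussian_mean(sigma, n):
    """CRB for the mean of n i.i.d. N(theta, sigma^2) samples (sigma known)."""
    fisher_information = n / sigma ** 2
    return 1.0 / fisher_information


def empirical_variance_of_sample_mean(theta, sigma, n, trials, seed=0):
    """Monte Carlo estimate of the variance of the sample-mean estimator."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        sample = [rng.gauss(theta, sigma) for _ in range(n)]
        estimates.append(sum(sample) / n)
    mean = sum(estimates) / trials
    return sum((e - mean) ** 2 for e in estimates) / trials


crb = crb_gaussian_mean(sigma=2.0, n=10)
empirical = empirical_variance_of_sample_mean(theta=1.0, sigma=2.0,
                                              n=10, trials=20000)
print(crb, empirical)
```

The empirical variance should land close to the bound of 0.4. For estimators that are not efficient, it would lie above the bound.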

2.2.2 Range Analyses

For measuring range, r, to a target, only a single sensor is needed. Literature [10, appendix 7A.3] states that the CRB for range estimation using one sensor is proportional to the reciprocal of SNR (See appendix A.1 for details). Thus the CRB of range is defined as

$$\mathrm{CRB}(r) \equiv \frac{c_r}{\mathrm{SNR}} \tag{2.8}$$

where c_r is a constant for scaling the variance appropriately. Note however, that using an array of sensors gives more accurate estimates because of noise suppression [11]. In addition, the beam forming that is made possible with an array of sensors can reduce noise from unwanted directions [12].

2.2.3 Bearing Analyses

The paper [13] derives the CRB for the estimate of direction of arrival (DOA) for a uniform linear array. Using that result and accounting only for SNR and bearing, $\gamma$, the lowest achievable variance may be expressed as

$$\mathrm{CRB}(\gamma) \equiv \frac{c_\gamma}{\mathrm{SNR} \times \sin^2(\gamma)} \tag{2.9}$$

where $c_\gamma$ is a constant for scaling the variance appropriately.

¹ For a biased estimator the CRB looks slightly different.


2.2.4 Doppler Analyses

The axial speed $\dot{r}$ of a target is directly proportional to the measured Doppler shift. According to [14], the CRB on a target's Doppler shift for an active moving array equals the CRB for a non-moving array plus a term that accounts for the Doppler shift caused by the movement of the array. More specifically, [14] states for the Doppler shift that

$$\mathrm{CRB}(\Delta\omega^{0}) = \mathrm{CRB}(\Delta\omega) + (\Delta\omega_A)^2 \sin^2(\gamma)\, \mathrm{CRB}(\gamma) \tag{2.10}$$

where

$$\Delta\omega_A = \frac{2\omega_0}{v_{\mathrm{sound}}}\, v_A, \tag{2.11}$$

and

$$\Delta\omega^{0} = \Delta\omega + \Delta\omega_A \cos(\gamma). \tag{2.12}$$

Here, $\Delta\omega$ is the relative Doppler shift measured by the array, $\gamma$ is the bearing to the target, $\Delta\omega_A \cos(\gamma)$ is the Doppler shift due to the array movement, $v_A$ is the magnitude of the array's velocity, and $\omega_0$ is the initial carrier frequency of the emitted pulse.

Assuming $\mathrm{CRB}(\Delta\omega) = c_{\dot{r},1}/\mathrm{SNR}$, where $c_{\dot{r},1}$ is a constant, and using the result in (2.9), the CRB for estimating radial velocity may be written (accounting only for SNR and array velocity) as

$$\mathrm{CRB}(\dot{r}) \equiv \frac{c_{\dot{r},1}}{\mathrm{SNR}} + \frac{c_{\dot{r},2}\, v_A^2}{\mathrm{SNR}} \tag{2.13}$$

where $c_{\dot{r},1}$ and $c_{\dot{r},2}$ are constants for scaling the variance appropriately.
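The step from (2.10) to (2.13) can be spelled out as follows. This is a sketch of one way to read it, assuming that the factor converting Doppler shift into radial velocity is absorbed into the scaling constants. The symbol $\kappa$ is introduced here only for illustration and does not appear elsewhere in the thesis.

```latex
\begin{align*}
\mathrm{CRB}(\Delta\omega^{0})
  &= \mathrm{CRB}(\Delta\omega)
     + (\Delta\omega_A)^2 \sin^2(\gamma)\,\mathrm{CRB}(\gamma) \\
  &= \frac{c_{\dot r,1}}{\mathrm{SNR}}
     + \left(\frac{2\omega_0}{v_{\mathrm{sound}}}\right)^{2} v_A^{2}\,
       \sin^2(\gamma)\,\frac{c_\gamma}{\mathrm{SNR}\,\sin^2(\gamma)} \\
  &= \frac{c_{\dot r,1}}{\mathrm{SNR}}
     + \underbrace{\left(\frac{2\omega_0}{v_{\mathrm{sound}}}\right)^{2}
       c_\gamma}_{\kappa}\,\frac{v_A^{2}}{\mathrm{SNR}} .
\end{align*}
```

The $\sin^2(\gamma)$ factors cancel, so the array-motion term no longer depends on the bearing. It has the same form as the second term in (2.13), with $\kappa$ playing the role of the second scaling constant up to the conversion factor mentioned above.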

2.2.5 SNR Dependency

The SNR is often expressed on a logarithmic scale as $\mathrm{SNR}_{\mathrm{dB}} = 10\log(\mathrm{SNR})$ [dB]. It depends on the source level of the signal, its transmission loss, and the noise in the ocean.

On a logarithmic scale, the transmission loss (TL) reduces $\mathrm{SNR}_{\mathrm{dB}}$ by

$$\mathrm{TL} = 10\log\left(\frac{I_0}{I_1}\right) \text{ [dB]} \tag{2.14}$$

where $I_0$ is a reference intensity at 1 meter from the source and $I_1$ is the intensity at a distance $r$ from the source. Recall that intensity is power per unit area, $I = \mathrm{Power}/\mathrm{Area}$. Assuming that the ocean is deep enough for the sound to expand spherically, $I_1 = P_{\mathrm{signal}}/(4\pi r^2)$. For a "shallow" ocean with depth $h$, and assuming total reflection of the sound against the ocean floor, $I_1 = P_{\mathrm{signal}}/(2\pi r h)$, i.e., a cylindrical expansion of the sound. However, since total reflection never occurs in reality, the intensity is reduced by somewhere between $10\log(r)$ [dB] and $20\log(r)$ [dB] as the distance $r$ increases.

In addition to the geometrical expansion of the sound, the ocean's natural absorption of sound, through friction between water molecules, and other anomalies affect the transmission loss TL as well. Therefore, in practice $\mathrm{SNR}_{\mathrm{dB}}$ is often assumed to decrease by $17\log(r)$ dB [15], and the assumption is made that

$$\mathrm{SNR} \equiv \frac{c}{r^{1.7}} \tag{2.15}$$

where $c$ is a generic constant for scaling the SNR correctly.
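The relation between the decibel rule and the power law in (2.15) can be checked numerically. The Python sketch below is only an illustration. The function names are hypothetical, the logarithm is taken to be base 10, and the reference distance is assumed to be 1 meter.

```python
import math


def transmission_loss_db(r, spreading=17.0):
    """Transmission loss in dB relative to 1 m, modeled as spreading*log10(r).

    spreading = 20 corresponds to spherical and 10 to cylindrical expansion;
    17 is the practical value quoted in the text.
    """
    return spreading * math.log10(r)


def snr_linear(r, c=1.0):
    """Linear-scale SNR according to (2.15), with a generic constant c."""
    return c / r ** 1.7


r = 1000.0
tl_practical = transmission_loss_db(r)            # 17 * log10(r)
tl_from_snr = 10.0 * math.log10(snr_linear(1.0) / snr_linear(r))
print(transmission_loss_db(r, 10.0), tl_practical, transmission_loss_db(r, 20.0))
print(tl_from_snr)
```

The drop in SNR between 1 m and `r`, expressed in dB, coincides with the 17 log(r) rule. This is why the exponent 1.7 appears in (2.15).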

2.3 Tracking Scenario Model

For the ATAS problem considered in this thesis, let the state of the target be

$$x^t = \begin{bmatrix} s^t \\ v^t \end{bmatrix} \tag{2.16}$$

where $s^t = [x^t \;\; y^t]^T$ and $v^t = [\dot{x}^t \;\; \dot{y}^t]^T$. Here, $x$ and $y$ represent the position and $\dot{x}$ and $\dot{y}$ the velocity of the target with respect to a Cartesian frame. In a similar fashion, let the state of the platform be

$$x^p = \begin{bmatrix} s^p \\ v^p \end{bmatrix} \tag{2.17}$$

where $s^p = [x^p \;\; y^p]^T$ and $v^p = [\dot{x}^p \;\; \dot{y}^p]^T$.

The following two sections present the system and measurement equations of the state space model for the tracking scenario. The reader is referred to appendix A.2 for a recap of the state space model in linear and functional form.

2.3.1 System Equation

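Assume the state of the target evolves according to a constant velocity model, that is

$$x^t_{k+1} =
\begin{bmatrix}
1 & 0 & \tau & 0 \\
0 & 1 & 0 & \tau \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix} x^t_k +
\begin{bmatrix}
\frac{\tau^2}{2} & 0 \\
0 & \frac{\tau^2}{2} \\
\tau & 0 \\
0 & \tau
\end{bmatrix} w_k \tag{2.18}$$

where $\tau$ is the sampling interval. Further, the process noise $w_k \in \mathbb{R}^{2 \times 1}$, with variance $\sigma_w^2$, is assumed to be distributed as

$$w_k \sim \mathcal{N}(0, \sigma_w^2 I).$$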
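A constant velocity model of this form can be simulated directly. The sketch below is a plain-Python illustration, not the thesis's MATLAB code. It assumes that the state is stored as a list `[x, y, x_dot, y_dot]`, and the function names `cv_step` and `simulate` are hypothetical.

```python
import random


def cv_step(state, tau, sigma_w, rng):
    """One step of the constant velocity model (2.18).

    state = [x, y, x_dot, y_dot]; the process noise w_k has two independent
    zero-mean Gaussian components with standard deviation sigma_w.
    """
    x, y, vx, vy = state
    wx = rng.gauss(0.0, sigma_w)
    wy = rng.gauss(0.0, sigma_w)
    return [
        x + tau * vx + 0.5 * tau ** 2 * wx,
        y + tau * vy + 0.5 * tau ** 2 * wy,
        vx + tau * wx,
        vy + tau * wy,
    ]


def simulate(state0, steps, tau, sigma_w, seed=0):
    """Return the list of states x_0, ..., x_steps."""
    rng = random.Random(seed)
    path = [list(state0)]
    for _ in range(steps):
        path.append(cv_step(path[-1], tau, sigma_w, rng))
    return path


# Without process noise the target moves in a straight line.
noise_free = simulate([0.0, 0.0, 1.0, 2.0], steps=4, tau=0.5, sigma_w=0.0)
noisy = simulate([0.0, 0.0, 1.0, 2.0], steps=4, tau=0.5, sigma_w=0.1)
print(noise_free[-1])  # [2.0, 4.0, 1.0, 2.0]
```

Setting `sigma_w` to zero is a convenient sanity check, since the position must then equal the initial position plus elapsed time multiplied by the velocity.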

2.3.2 Measurement Equation

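With an ATAS we can measure range, axial speed (Doppler), and bearing. Let the measurement equation be

$$y_k = \begin{bmatrix} r_k \\ \dot{r}_k \\ \gamma_k \end{bmatrix} + e_k \tag{2.19}$$

where $e_k \in \mathbb{R}^{3 \times 1}$ is the measurement error, distributed as

$$e_k \sim \mathcal{N}(0, R_k)$$

with

$$R_k = \begin{bmatrix}
\sigma_r^2(r_k) & 0 & 0 \\
0 & \sigma_{\dot{r}}^2(r_k, v_k^p) & 0 \\
0 & 0 & \sigma_\gamma^2(r_k, \gamma_k)
\end{bmatrix}. \tag{2.20}$$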

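Here,

\begin{align}
r_k &= \lVert s^t_k - s^p_k \rVert \tag{2.21a} \\
\dot{r}_k &= \frac{(v^t_k - v^p_k)^T (s^t_k - s^p_k)}{\lVert s^t_k - s^p_k \rVert} \tag{2.21b} \\
\gamma_k &= \beta_k - \psi_k \tag{2.21c} \\
\beta_k &= \arctan\left(\frac{y^t_k - y^p_k}{x^t_k - x^p_k}\right) \tag{2.21d} \\
\psi_k &= \arctan\left(\frac{\dot{y}^p_k}{\dot{x}^p_k}\right), \tag{2.21e}
\end{align}

where $\gamma_k$, $\beta_k$, and $\psi_k$ are defined as in Figure 2.3, and the variances of range, axial speed, and bearing are (in agreement with the CRB analyses in Section 2.2)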

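\begin{align}
\sigma_r^2(r_k) &= c_r\, r_k^{1.7} \tag{2.22a} \\
\sigma_{\dot{r}}^2(r_k, v_k^p) &= c_{\dot{r},1}\, r_k^{1.7} + c_{\dot{r},2}\, r_k^{1.7} (v_k^p)^2 \tag{2.22b} \\
\sigma_\gamma^2(r_k, \gamma_k) &= \frac{c_\gamma\, r_k^{1.7}}{\sin^2(\gamma_k) + {}} \tag{2.22c}
\end{align}

with the platform speed

$$v^p_k = \lVert v^p_k \rVert. \tag{2.23}$$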

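One may also consider the following modifications of (2.22a), (2.22b), and (2.22c):

\begin{align}
\sigma_r^2(r_k) &= c_{r,1} + c_{r,2} (r_k - c_{r,3})^{1.7} \tag{2.24a} \\
\sigma_{\dot{r}}^2(r_k, v_k^p) &= c_{\dot{r},1} + c_{\dot{r},2}\, r_k^{1.7} + c_{\dot{r},3}\, r_k^{1.7} (v_k^p)^2 \tag{2.24b} \\
\sigma_\gamma^2(r_k, \gamma_k) &= c_{\gamma,1} + \frac{c_{\gamma,2}\, r_k^{1.7}}{\sin^2(\gamma_k) + {}} \tag{2.24c}
\end{align}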

where $c_{r,3}$ implies a sensing "sweet spot". One could imagine that this corresponds to the range at which spherical waves "become" planar. Further, adding the constants with subscript 1 to the model makes it possible to set the other constants to zero in order to validate the model in simulations. The values of the constants depend on the type of pulse used, and the pulse may be specified by a control input $u^{(pulse)}_k$. All in all, the measurement equation may be expressed as

$$y_k = h(x^t_k, e_k; x^p_k, u^{(pulse)}_k). \tag{2.25}$$

Lastly, in order to observe a correct direction $\psi$ of the platform, its speed cannot be zero. This is nevertheless a reasonable condition, since a platform that stands still will not have a straight array.

Figure 2.3: Visual illustration of the bearing $\gamma = \beta - \psi$. The target is shown in red and the ship towing the ATAS in blue. Four coordinate systems are shown: Map, Target, Platform, and Platform⁰. The position vectors $s^t$ and $s^p$ relate the Target and Platform coordinate systems to the reference system denoted Map. Platform⁰ is a rotation of Platform by the angle $\psi$.
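The noise-free part of the measurement model in (2.21) and the variances in (2.22) can be sketched in Python as follows. Several choices are assumptions made for this illustration and not taken from the thesis. States are stored as `[x, y, x_dot, y_dot]`. `atan2` is used instead of a plain arctangent so that the quadrant is resolved. All scaling constants default to 1. The additional term in the denominator of (2.22c) is assumed to be a small positive constant, here called `eps`, that keeps the bearing variance finite when the target lies along the array axis.

```python
import math


def measure(target, platform):
    """Noise-free range, axial speed and bearing, following (2.21a)-(2.21e)."""
    dx, dy = target[0] - platform[0], target[1] - platform[1]
    dvx, dvy = target[2] - platform[2], target[3] - platform[3]
    r = math.hypot(dx, dy)
    r_dot = (dvx * dx + dvy * dy) / r
    beta = math.atan2(dy, dx)                  # angle to target in the map frame
    psi = math.atan2(platform[3], platform[2])  # platform heading (speed != 0)
    return r, r_dot, beta - psi


def measurement_variances(r, gamma, platform_speed, c_r=1.0, c_rdot1=1.0,
                          c_rdot2=1.0, c_gamma=1.0, eps=1e-6):
    """Diagonal of R_k following (2.22a)-(2.22c); eps is an assumed regularizer."""
    s = r ** 1.7
    var_r = c_r * s
    var_r_dot = c_rdot1 * s + c_rdot2 * s * platform_speed ** 2
    var_gamma = c_gamma * s / (math.sin(gamma) ** 2 + eps)
    return var_r, var_r_dot, var_gamma


target = [3.0, 4.0, 0.0, 0.0]      # stationary target
platform = [0.0, 0.0, 1.0, 0.0]    # platform moving along the x-axis
r, r_dot, gamma = measure(target, platform)
speed = math.hypot(platform[2], platform[3])
variances = measurement_variances(r, gamma, speed, eps=0.0)
print(r, r_dot, gamma)
```

In this geometry the range is 5 and the axial speed is negative, because the platform is closing in on the target. The bearing variance is larger than the range variance by the factor $1/\sin^2(\gamma)$.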


2.4 Filters

For the state space model presented in Section 2.3, it is appropriate to apply the extended Kalman filter (EKF) and the unscented Kalman filter (UKF). The reader is referred to [16] for a thorough description and derivation of the EKF. For an in-depth explanation of the UKF, the reader may see [17]. Moreover, the algorithms for the filters, applied to the ATAS problem, can be found in appendices A.3 and A.4, together with a summary of how they work. Still, a few words will be said here about their accuracy.

It has been shown that the UKF gives more accurate estimates than the EKF for moving targets with nonlinear kinematics [18], [19]. Theoretically, the UKF captures the nonlinear transformation of the filter estimate and covariance matrix to a higher order than the EKF [17]. However, the EKF may have better initial error convergence than the UKF [20]. In addition, the EKF is simpler to implement.

Both the EKF and the UKF propagate the first and second moments of the probability distribution of the target's state. The expected value is here denoted $\hat{x}^t_{k|k}$ and the covariance matrix $P_{k|k}$. The subscript $k|m$ should be read: at time $k$, given data up to time $m$.

2.5 Information Measures

To evaluate the goodness of an estimate, ˆx^t_k|k, a scalar quantification of it is needed.

Its covariance matrix P_k|k contains information about the uncertainty of the estimate.

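The eigenvalues, $\lambda_1, \dots, \lambda_n$, of the covariance matrix can be used to calculate quantitative information measures such as the trace, Frobenius norm, or determinant of the matrix:

\begin{align}
\mathrm{tr}(P_{k|k}) &= \sum_{i=1}^{n} \lambda_i && \text{Trace (A-optimality)} \tag{2.26a} \\
\lVert P_{k|k} \rVert_F &= \sqrt{\mathrm{tr}(P_{k|k} P_{k|k}^T)} = \sqrt{\sum_{i=1}^{n} \lambda_i^2} && \text{Frobenius norm} \tag{2.26b} \\
\det(P_{k|k}) &= \prod_{i=1}^{n} \lambda_i && \text{Determinant (D-optimality)} \tag{2.26c}
\end{align}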

Under a Gaussian assumption, the covariance matrix $P_{k|k}$ defines a confidence ellipse for an estimate. The trace is proportional to the sum of the squared semi-axis lengths of this ellipse, and the determinant to the square of its area [4]. Note that adding eigenvalues only makes sense if they have the same unit.
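The identities in (2.26) are easy to verify for a symmetric two-by-two matrix, for which the eigenvalues have a closed form. The Python sketch below is an illustration only. The function names and the example matrix are assumptions.

```python
import math


def eig_sym_2x2(p11, p12, p22):
    """Eigenvalues of the symmetric matrix [[p11, p12], [p12, p22]],
    returned with the largest first."""
    mean = 0.5 * (p11 + p22)
    radius = math.hypot(0.5 * (p11 - p22), p12)
    return mean + radius, mean - radius


def information_measures(p11, p12, p22):
    """Trace, Frobenius norm and determinant computed from the eigenvalues."""
    lam1, lam2 = eig_sym_2x2(p11, p12, p22)
    return {
        "trace": lam1 + lam2,
        "frobenius": math.sqrt(lam1 ** 2 + lam2 ** 2),
        "determinant": lam1 * lam2,
    }


# Example covariance [[2, 1], [1, 2]] with eigenvalues 3 and 1.
measures = information_measures(2.0, 1.0, 2.0)

# The same quantities computed directly from the matrix entries.
direct = {
    "trace": 2.0 + 2.0,
    "frobenius": math.sqrt(2.0 ** 2 + 1.0 ** 2 + 1.0 ** 2 + 2.0 ** 2),
    "determinant": 2.0 * 2.0 - 1.0 * 1.0,
}
print(measures)
print(direct)
```

In practice the trace and determinant would be computed from the matrix entries, which avoids the eigendecomposition. The eigenvalue form is mainly useful for interpretation.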

2.5.1 Information Measures in the MPC Framework

Since an EKF or a UKF is used to estimate the state of the target, it is suitable in the MPC algorithm to use a cost function based on the covariance matrices, $P_{k|k}$, of the filters. The terminal cost and stage cost are thus defined as

\begin{align}
g_N(P_{N|N}) &= \mathrm{optimality}(A P_{N|N} A^T), & k &= N \tag{2.27a} \\
g_k(P_{k|k}) &= \mathrm{optimality}(A P_{k|k} A^T), & k &\in K \tag{2.27b}
\end{align}

where $K$ is the set of future time steps of interest, up to but not including $N$. In the ATAS problem, not every predicted future state will be of interest. Further, $A = [I \;\; 0]$, where $I$ is a two-by-two identity matrix. That is, only the elements of the covariance matrix $P_{k|k}$ that correspond to position are considered, $\mathbb{R}^{4 \times 4} \to \mathbb{R}^{2 \times 2}$. Here,

$$\mathrm{optimality}(A P_{k|k} A^T) =
\begin{cases}
\mathrm{trace}(A P_{k|k} A^T) & \text{if A-optimality} \\
\det(A P_{k|k} A^T) & \text{if D-optimality} \\
\lVert A P_{k|k} A^T \rVert_F & \text{if Frobenius norm.}
\end{cases}$$

2.6 Formulating the ATAS Problem as an Optimization Problem

To recap, the ATAS sensor management problem considered in this thesis is that of optimizing a sensor platform's estimates by varying the trajectory and the pulse forms emitted. This may be formulated as the optimization problem

$$\begin{aligned}
\text{minimize:} \quad & \sum_{k \in K^{0}} g_k(P_{k|k}) \\
\text{subject to:} \quad & \hat{x}^t_{k+1|k+1},\; P_{k+1|k+1} = \mathrm{filter}(\hat{x}^t_{k|k}, P_{k|k};\; x^p_k, y_k, u^{(pulse)}_k) \\
& x^p_{k+1} = f(x^p_k, u^{(trajectory)}_k) \\
& x^t_{k+1} = f(x^t_k, w_k) \\
& y_k = h(x^t_k, e_k;\; x^p_k, u^{(pulse)}_k) \\
& u^{(pulse)}_k \in U^{(pulse)} \\
& u^{(trajectory)}_k \in U^{(trajectory)}(x^p_k) \\
& K^{0} = \{0, n, 2n, 3n, \dots, Nn\} \\
& \hat{x}^t_{0|0} = E(x^t_0), \quad P_{0|0} = \mathrm{Cov}(x^t_0)
\end{aligned} \tag{2.28}$$

where a cost function based on information measures is minimized. The summation in the cost function is taken over a set of time steps denoted $K^{0}$. These are the time steps at which the platform has completed a trajectory. If every trajectory takes $n$ time steps, this results in $K^{0} = \{0, n, 2n, 3n, \dots, Nn\}$. Here, $N$ represents the terminal node and is referred to as the decision horizon in the MPC framework.

The minimization is subject to how the target’s state estimate, ˆx^t_k|k, and corresponding covariance matrix, P_k|k, are propagated through the filter dynamics of an EKF or a UKF.


Moreover, the propagated moments depend on the current state of the platform, $x^p_k$, the measurements $y_k$, and the control input $u^{(pulse)}_k$.

The control input $u^{(pulse)}_k$ holds information about what pulse to use when generating the measurements $y_k$. This is done by scaling the variance appropriately through the constants in (2.24a), (2.24b), and (2.24c). The control input $u^{(pulse)}_k$ also tells the filter what pulse was used. Continuing, the control input $u^{(trajectory)}_k$ holds information about how to update the platform's state. The set $U^{(trajectory)}(x^p_k)$ of available control inputs for the trajectory depends on the current state $x^p_k$ of the platform.


Chapter 3

Method

The following chapter describes how the investigation of sensor management and information based route planning has been done. It describes how the system is simulated, how the sensor management system is implemented, how the parameters of the system have been chosen to represent a realistic scenario, and lastly, how the simulations are evaluated.

3.1 Simulation Setup

The problem to be optimized is that of (2.28). When the simulation starts, the platform will already have found the target. That is, no reconnaissance is performed, only tracking. The state of the target evolves according to a CV-model as in (2.18). The evolution of the platform’s state is described in a separate section. At a certain distance between the platform and target, the platform is assumed to have lost the target. A threshold distance is thus defined for this purpose. If the platform does not lose the target, the simulation will run until the defined number of time steps has been reached.

The platform and the target move in discrete time steps. At every time step in which the states are updated, an estimate of the target’s state is made.

The simulations are illustrated in a sensor management system framework in Figure 3.1. Here, the state of the target is estimated using either an EKF or a UKF. The first and second moments of the estimate, ˆx^t_k|k and P_k|k, are then used to predict future system performance of the filter. This is done by simulating different control inputs and finding the shortest path in a tree structure of information measures. In accordance with the definition of MPC, only the first pair of control inputs along the shortest path is applied. The pair of control inputs consists of the choice of pulse form, u^(pulse)_k, and the choice of trajectory, u^(trajectory)_k. The sensor selector updates the trajectory and pulse form according to the control inputs. This will result in new measurements for the filter to process. A new prediction of future system performance is, however, not performed until the current trajectory has finished.
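The tree search described above can be sketched as follows. This is a minimal illustration and not the thesis implementation: the functions admissible, predict and cost are hypothetical stand-ins for the input set, the predicted filter moments after one trajectory, and the information measure, and the toy model with a scalar variance and a geometry gain is invented only to make the example runnable. The term for k = 0 in (2.28) is omitted since it does not depend on the decisions.

```python
def best_path(state, admissible, predict, cost, horizon):
    """Return (total cost, input sequence) of the cheapest path of depth `horizon`."""
    if horizon == 0:
        return 0.0, []
    best = (float("inf"), [])
    for u in admissible(state):
        nxt = predict(state, u)  # predicted state after one trajectory
        tail_cost, tail = best_path(nxt, admissible, predict, cost, horizon - 1)
        total = cost(nxt) + tail_cost  # information measure at the decision time
        if total < best[0]:
            best = (total, [u] + tail)
    return best

# --- Hypothetical toy model: state = (variance, gain) ---
TRAJECTORIES = ("forward", "left turn")
PULSES = ("broadband", "narrowband")

def admissible(state):
    # Toy assumption: every (trajectory, pulse) pair is always available.
    return [(t, p) for t in TRAJECTORIES for p in PULSES]

def predict(state, u):
    variance, gain = state
    trajectory, pulse = u
    if trajectory == "forward":
        factor = gain if pulse == "broadband" else 1.1 * gain
        return (variance * factor, gain)
    # Turning: no measurements, so the variance grows, but the toy geometry improves.
    return (1.2 * variance, 0.5 * gain)

def cost(state):
    return state[0]

start = (1.0, 0.9)
cost_1, path_1 = best_path(start, admissible, predict, cost, horizon=1)
cost_3, path_3 = best_path(start, admissible, predict, cost, horizon=3)
first_input_short = path_1[0]  # only the first pair is applied
first_input_long = path_3[0]
```

In this toy model a decision horizon of one selects going forward, whereas a horizon of three selects the turn first, since the temporary loss of measurements is outweighed by the later reduction in variance. The exhaustive search grows exponentially with the horizon, so the sketch is only practical for short horizons.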


[Block diagram. Blocks: Control input sets (Trajectory: forward, left turn, right turn, acceleration, deceleration; Pulse form: broadband, narrowband), Optimization (MPC), Sensor selector, Signal processing (EKF/UKF), and Predict system performance (EKF/UKF with information measures). Signal labels: measurements; estimate ˆx^t_k|k & P_k|k.]

Figure 3.1: The sensor management system for the ATAS problem.

3.1.1 Platform Trajectory Kinematics

The target’s kinematics were presented in Section 2.3 as moving according to a CV model, see (2.18). When it comes to the platform, it can perform one of five movements: turn left, turn right, accelerate, decelerate or go straight forward. Every one of these actions takes a number of time steps to perform. Moving forward for a fixed number of steps is self-explanatory. Turning is performed by assuming a constant turn rate during a 90° turn, thereby completing the turn in the defined number of steps, with the last step taken in the final heading direction. Lastly, accelerating is performed by assuming a constant acceleration during every discrete time step of the movement, starting at one speed and ending at another once the trajectory is complete. While the platform does anything other than going forward at constant speed, measurements are assumed unavailable.
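The three kinds of movement can be sketched as below. The state layout (x, y, heading, speed), the unit time step, the convention that a left turn is counter-clockwise, and the choice to rotate before each step so that the last step is taken in the final heading are all assumptions made for this illustration; the thesis text does not specify them.

```python
import math

def _advance(x, y, heading, speed, dt):
    """Move one time step along the current heading."""
    return x + speed * dt * math.cos(heading), y + speed * dt * math.sin(heading)

def forward(state, n_steps, dt=1.0):
    """Go straight at constant speed for n_steps."""
    x, y, heading, speed = state
    for _ in range(n_steps):
        x, y = _advance(x, y, heading, speed, dt)
    return (x, y, heading, speed)

def turn(state, n_steps, direction, dt=1.0):
    """90-degree turn at constant turn rate; direction=+1 is left, -1 is right."""
    x, y, heading, speed = state
    step_angle = direction * (math.pi / 2) / n_steps
    for _ in range(n_steps):
        heading += step_angle  # rotate first, so the last step uses the final heading
        x, y = _advance(x, y, heading, speed, dt)
    return (x, y, heading, speed)

def change_speed(state, n_steps, final_speed, dt=1.0):
    """Accelerate or decelerate with constant acceleration over n_steps."""
    x, y, heading, speed = state
    step_speed = (final_speed - speed) / n_steps
    for _ in range(n_steps):
        speed += step_speed
        x, y = _advance(x, y, heading, speed, dt)
    return (x, y, heading, speed)

# Hypothetical start: at the origin, heading along the x-axis, speed 2.
start = (0.0, 0.0, 0.0, 2.0)
after_forward = forward(start, 5)
after_left = turn(start, 3, +1)
after_accel = change_speed(start, 4, 3.0)
```

Deceleration is obtained from change_speed by passing a final speed lower than the current one.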

3.1.2 Pulse Forms

The narrowband and broadband pulse forms are incorporated into the simulations by scaling the variance of the range and Doppler parts of the measurements accordingly (see the constants in (2.24a) to (2.24c)). Note that the pulse form stays the same while a trajectory is being traversed.
