
Channel Equalization Using Machine Learning for Underwater Acoustic Communications


Academic year: 2021


Master of Science Thesis in Electrical Engineering

Department of Electrical Engineering, Linköping University, 2020

Channel Equalization using Machine Learning for Underwater Acoustic Communications


Martin Allander
LiTH-ISY-EX--20/5301--SE

Supervisor: Dr. Özlem Tugfe Demir, isy, Linköping University
Systems Engineer Oskar Axelsson, Saab Dynamics
Examiner: Associate Professor Emil Björnson, isy, Linköping University

Division of Communication Systems
Department of Electrical Engineering
Linköping University
SE-581 83 Linköping, Sweden

Copyright © 2020 Martin Allander

Summary

Wireless underwater acoustic communication is a developing field with several applications. The underwater acoustic channel is highly distinctive, and its behavior depends strongly on the environment in which the communication takes place. Compared with wireless radio communication, the bandwidth used is much smaller and the Doppler effect is much more pronounced, owing to the slower propagation speed of sound. Literature published in recent years highlights the benefits of machine-learning-assisted channel estimation and equalization over traditional signal processing methods. Machine learning can be advantageous because designing algorithms for underwater communication is difficult, as general channel models have proven hard to find. This study aims to explore whether machine-learning-assisted channel estimation and equalization can offer improved performance compared with traditional methods. The study examines supervised machine learning with a deep neural network and a recurrent neural network, to see whether the networks can improve performance in terms of the number of bit errors. A channel simulator with environment-specific input is used to study a number of different scenarios. The simulation results are intended to identify interesting environments in which to test the neural networks. The results indicate that, in highly time-varying channels, machine learning can lower the bit error rate if the networks are trained with prior information about the channel. Using machine learning without prior information about the channel yielded no improvement in performance.


Abstract

Wireless underwater acoustic (uwa) communications is a developing field with various applications. The underwater acoustic communication channel is very special and its behavior is environment-dependent. Compared to wireless radio communication, the uwa channel is characterized by low available bandwidth and a severe motion-induced Doppler effect. Recent literature suggests that machine learning (ml)-based channel estimation and equalization offer benefits over traditional techniques (a decision feedback equalizer) in uwa communications. ml can be advantageous because designing algorithms for uwa communication is difficult, as finding general channel models has proven to be hard. This study aims to explore whether ml-based channel estimation and equalization, as part of a sophisticated physical layer structure, can offer improved performance. In the study, supervised ml using a deep neural network and a recurrent neural network is utilized to improve the bit error rate. A channel simulator with environment-specific input is used to study a wide range of channels. The simulations are used to determine in which environments ml should be tested. It is shown that in highly time-varying channels, ml outperforms traditional techniques if trained with prior information about the channel. However, utilizing ml without prior information about the channel yielded no improvement in performance.


Acknowledgments

A big thank you to Oskar Axelsson for hosting me at Saab Dynamics, for giving me the freedom to shape this thesis around my interests, and for all the support and feedback. At Saab Dynamics, I would also like to thank Per Abramhamsson and Simon Keisala: Per for helping me understand all the complications in modeling sonar propagation, and Simon for all the help with tweaking the neural networks. Of course, a final thank you to Emil Björnson and Özlem Tugfe Demir. I appreciate all the feedback on the report and the support.

Linköping, June 2020
Martin Allander


Contents

List of Figures
List of Tables
Notation

1 Introduction
  1.1 Motivation
  1.2 Purpose
  1.3 Problem Formulation
  1.4 Limitations
  1.5 Background

2 Theoretical Background
  2.1 Underwater Channel Characteristics
    2.1.1 Attenuation
    2.1.2 Noise
    2.1.3 Multipath Propagation
    2.1.4 Doppler Effect
    2.1.5 Scattering
  2.2 Channel Models
  2.3 Baseline Physical Layer
    2.3.1 Transmitter
    2.3.2 Receiver
  2.4 Artificial Neural Networks
    2.4.1 Input and Output
    2.4.2 The Neuron
    2.4.3 The Network
    2.4.4 Training
    2.4.5 Designs Considerations
    2.4.6 Recurrent Neural Networks
    2.4.7 Long Short-Term Memory Architecture
  2.5 Previous Work
    2.5.1 Machine Learning in Wireless Radio Communication
    2.5.2 Machine Learning in Underwater Acoustic Communication
    2.5.3 Key Takeaways

3 Method
  3.1 System Model
    3.1.1 Baseline Receiver
    3.1.2 Machine Learning Receiver
    3.1.3 Choice of Parameters
    3.1.4 Bit Error Rate Definition
  3.2 Software Simulation Environment
    3.2.1 Machine Learning Software
    3.2.2 Channel Model
  3.3 Channel Simulation Configuration
    3.3.1 Bathymetry
    3.3.2 Sound Speed Profiles
    3.3.3 Bottom Sediment Types
    3.3.4 Channel Geometry and General Parameters
    3.3.5 Channel Variations
    3.3.6 Noise
  3.4 Channel Simulations
    3.4.1 Time-Variant Filter
    3.4.2 Simulation Loop
  3.5 Artificial Neural Network Structure
    3.5.1 Deep Neural Network
    3.5.2 Long Short-Term Memory Network
  3.6 Artificial Neural Network Experiments
    3.6.1 Training Data Generation
  3.7 Miscellaneous Studies

4 Results
  4.1 Artificial Neural Network Experiments
    4.1.1 High Bit Error Rate Channels
    4.1.2 Low Bit Error Rate Channels
  4.2 Deployment Strategies
    4.2.1 Low to Moderate Time-Variance
    4.2.2 High Time-Variance
  4.3 Miscellaneous Studies

5 Discussion
  5.1 The Results
    5.1.1 Artificial Neural Network Structure
    5.1.2 Deployment Strategies
    5.1.3 Miscellaneous Studies
  5.2 Error Sources
    5.2.2 Machine Learning Software
  5.3 Relation to Other Work
  5.4 Sources
  5.5 The Thesis in a Larger Perspective

6 Conclusions
  6.1 Evaluation
  6.2 Future Work
  6.3 Final Words

A Bathymetry Profiles
  A.1 Shallow Scenario
  A.2 Deep Scenario

B Sound Speed Profiles
  B.1 Shallow Scenario
  B.2 Deep Scenario

C Channel Simulations

D Time-varying Channels

List of Figures

1.1 Picture of an underwater sensor node from Saab Dynamics.
2.1 Transmission loss as a function of frequency, at 1 km range and with k = 1.7.
2.2 An illustration of the frss modulation format, compared to traditional dsss, from [32].
2.3 Illustration of an lstm block and its connections, from [6].
4.1 Performance comparison of the lstm, dnn and dfe-pll in high ber channel.
4.2 Performance comparison of the lstm, dnn and dfe-pll in low ber channel.
4.3 ber as a function of snr, using three different equalizers. Shallow profile, clay bottom ssp 2019-03-16.
4.4 ber as a function of snr, using three different equalizers. Shallow profile, clay bottom ssp 2019-06-15.
4.5 ber as a function of snr, using three different equalizers. Shallow profile, clay bottom ssp 2019-08-21.
4.6 ber as a function of snr, using two different equalizers. Shallow profile, clay bottom ssp 2019-01-07.
4.7 ber as a function of snr, for online and offline-trained lstm in cgn and wgn. Shallow obstacle profile, clay bottom ssp 2019-03-16.
A.1 Illustration of the shallow flat bottom profile.
A.2 Illustration of the shallow slope bottom profile.
A.3 Illustration of the shallow obstacle bottom profile.
A.4 Illustration of the deep flat bottom profile.
A.5 Illustration of the deep slope bottom profile.
A.6 Illustration of the deep obstacle bottom profile.
B.1 Sound speed as a function of depth, data from 2019-03-16 04:45 REF M1V1.
B.2 Sound speed as a function of depth, data from 2019-06-15 14:40 REF M1V1.
B.3 Sound speed as a function of depth, data from 2019-08-21 21:12 REF M1V1.
B.4 Sound speed as a function of depth, data from 2019-09-14 05:18 REF M1V1.
B.5 Sound speed as a function of depth, data from 2019-01-07 13:50 Släggö.
B.6 Sound speed as a function of depth, data from 2019-03-05 07:23 Släggö.
B.7 Sound speed as a function of depth, data from 2019-05-06 07:27 Släggö.
B.8 Sound speed as a function of depth, data from 2019-07-02 08:24 Släggö.
B.9 Sound speed as a function of depth, data from 2019-11-04 08:45 Släggö.
C.1 Channel simulation, shallow scenario with sandy bottom type.
C.2 Channel simulation, shallow scenario with clay bottom type.
C.3 Channel simulation, deep scenario with sandy bottom type.
C.4 Channel simulation, deep scenario with clay bottom type.
D.1 Channel impulse response for deep scenario, slope profile, sand bottom ssp 2019-03-05.
D.2 Channel impulse response for deep scenario, slope profile, sand bottom ssp 2019-05-06.
D.3 Channel impulse response for deep scenario, obstacle profile, clay bottom ssp 2019-01-07.
D.4 Channel impulse response for shallow scenario, obstacle profile, clay bottom ssp 2019-03-16.
D.5 Channel impulse response for shallow scenario, obstacle profile, clay bottom ssp 2019-06-15.
D.6 Channel impulse response for shallow scenario, obstacle profile,

List of Tables

1.1 Comparison of fundamental physical properties for radio communication and uwa communication.
3.1 Configurable equalizer parameters.
3.2 Bottom sediment properties for two different kinds of bottom.
3.3 Channel geometry and general channel/simulation properties.
3.4 Configurable small-scale settings.
3.5 Configurable large-scale (L-S) settings.
3.6 Configurable Doppler effect parameters.
3.7 dnn layer structure.
3.8 lstm layer structure.
3.9 Training options.

Notation

Abbreviations

Abbreviation   Meaning
adam           Adaptive moment estimation
ann            Artificial neural network
auv            Autonomous underwater vehicle
ber            Bit error rate
cgn            Colored Gaussian noise
dfe            Decision feedback equalizer
dfe-pll        Decision feedback equalizer with a phase-locked loop
dnn            Deep neural network
dsss           Direct sequence spread spectrum
frss           Frequency repetition spread spectrum
fsk            Frequency-shift keying
GPU            Graphical processing unit
isi            Intersymbol interference
lstm           Long short-term memory
mcss           Multi-carrier spread spectrum
ml             Machine learning
ofdm           Orthogonal frequency division multiplexing
pll            Phase-locked loop
psd            Power spectral density
qpsk           Quadrature phase-shift keying
relu           Rectified linear unit
rls            Recursive least squares
rnn            Recurrent neural network
siso           Single input single output
snr            Signal-to-noise ratio
ssp            Sound speed profile
uwa            Underwater acoustic
wgn            White Gaussian noise
wssus          Wide-sense stationary uncorrelated scattering

1 Introduction

1.1 Motivation

Wireless terrestrial communications is a game-changing technology and a defining technology of the 20th and 21st centuries. Huge efforts have been made by industry and the research community to improve and optimize all aspects of modern wireless communications. Modern wireless communications are based on electromagnetic waves, which travel at the speed of light. Underwater communication is not as thoroughly explored as its terrestrial radio counterpart, since the applications are not necessarily as broad and mainstream. However, applications do exist, both civilian and military. Examples of civilian applications are oceanographic studies, in terms of undersea exploration, environmental monitoring, and disaster prevention. Examples of military applications are communications between submarines, surveillance systems, and mine reconnaissance. In other words, underwater networks have interesting and varied applications. A picture of an underwater sensor node designed by Saab Dynamics is shown in Figure 1.1.

Electromagnetic and optical waves are not feasible in underwater communications as they are quickly absorbed by the water. To communicate under water at ranges longer than a few meters, acoustic waves are therefore utilized, although acoustic waves come with some undesired properties. Table 1.1 presents some metrics, comparing an example of the underwater acoustic (uwa) channel with an example of the common terrestrial radio channel, to give the reader some understanding of the implications of using acoustic waves compared to electromagnetic waves. Data for the radio channel is from the long term evolution standard, more commonly known as 4G [1]. As can be seen in the table, the utilizable bandwidth and propagation speed in uwa communications are on vastly different scales compared to radio communications.

Figure 1.1: Picture of an underwater sensor node from Saab Dynamics.

Table 1.1: Comparison of fundamental physical properties for radio communication and uwa communication.

Property                 Radio channel        uwa channel
Wave propagation speed   3 · 10^8 m/s         1500 m/s
Bandwidth                20 MHz               5 - 10 kHz
Center frequency         700 MHz - 2.7 GHz    5 - 30 kHz

Communication networks are often modeled by dividing the stack into different layers; an example is the OSI model [2]. Each layer has different tasks, and the lowest level is the physical layer. The physical layer has the task of transmitting the raw bit-stream over the physical channel [2]. The purpose of the transmitter is to map binary data onto a waveform which can be transmitted over the medium. The medium can, for example, be a radio link or an acoustic link. The physical medium disturbs the transmitted waveform by, among other things, introducing noise and attenuating the signal. The receiver has the task of picking up the disturbed waveform and decoding the information correctly. The physical layer transmitter and receiver should, therefore, be designed according to the communication channel in which they operate.


The nodes in an underwater network can be static or moving. A static node can be a sensor gathering data, while a moving node can be an autonomous underwater vehicle (auv). Underwater networks are often built ad hoc, as the application and the environment between nodes vary a lot. For example, a node that collects data from the ocean needs to establish a communication link with a node on the surface. The distance between the nodes can be several kilometers, which poses a significant challenge due to the severe signal attenuation at long distances. Another very different scenario would be two auvs wanting to establish a communication link in the Baltic Sea (around 80 meters deep). Here the difficulties do not arise from the distance but from echoes caused by sound reflecting off the surrounding surfaces. The environment in the underwater channel between the nodes is also subject to constant change due to factors such as marine wildlife, ocean vessels, and tides.

With the very different propagation environments described above, designing a physical layer that can handle such varied and difficult conditions is a challenge. The literature contains many proposals for transmitter/receiver structures for uwa communications, with many variations. In modern wireless radio communication standards such as 4G and 5G, orthogonal frequency division multiplexing (ofdm) is a commonly used method. In ofdm the available frequency band is divided into many sub-bands, each behaving similarly to an additive white Gaussian noise (wgn) channel. ofdm reduces modem complexity and enables high data rates. ofdm is applicable in uwa communication and can be beneficial. However, ofdm does not always provide the optimal solution in uwa communication, as shown in [31]. The preferred physical layer design depends on whether the application is long or short range, high or low signal-to-noise ratio (snr), and shallow or deep. Therefore, more complicated physical layer designs exist for uwa that take the specifics of the uwa channel into consideration and can provide benefits over ofdm. Developing physical layers for uwa communication is non-trivial and, with the different possible environments, uniformly good performance is not easily guaranteed. This is made worse by the fact that testing is expensive and time-consuming.

uwa has some leading characteristics, explained in Section 2.1, but modeling the uwa channel is non-trivial, as detailed in Section 2.2. Finding general models has proven to be difficult, which leads to a model deficit. Physical layer designs are therefore often sub-optimal and unable to perform well in all conditions. In this case, machine learning (ml) is a good candidate to combat the model deficit. Recent literature highlights ml as an alternative or complementary approach to classical signal processing, to improve and generalize the performance of physical layer algorithms. ml has also gained popularity in other fields such as speech recognition and image processing. It is therefore of great interest to study whether ml algorithms can be utilized to improve the performance of a communication system when the underwater channel introduces difficulties. The desirable property of an ml-based system is that it can generalize, performing well in multiple circumstances, if trained correctly.

1.2 Purpose

Researchers in cooperation with Saab Dynamics have suggested a physical layer (transmitter and receiver) protocol based on the frequency repetition spread spectrum (frss). The protocol is motivated by its reliability and performance at low snr, where it outperforms ofdm [31]. The existing frss transmitter and receiver will be used as a baseline for the project. The purpose is then to investigate whether the channel estimation and equalization utilized by the frss, based on a decision feedback equalizer with a phase-locked loop (dfe-pll), can be improved by utilizing ml. Channel estimation and equalization are considered among the most difficult tasks in uwa communication, due to the sparse time-varying multipath propagation. The purpose of the study is to explore whether a receiver based on ml methods like deep neural networks (dnns) or recurrent neural networks (rnns) can offer improved performance. The performance will be studied in terms of coded bit error rate (ber) as a function of the snr, with the baseline dfe-pll compared to the developed ml-based receiver.

1.3 Problem Formulation

The thesis aims to study

1. in which environments dnn or rnn-based channel estimation and channel equalization can improve performance compared to dfe-pll;

2. why it can offer improved performance and how much the performance can be improved;

3. and possibilities to develop a solution where the rnn or dnn has no prior knowledge of the deployment scenario, either an online-training or a data-driven approach.

Various channel simulations will be performed to identify environments where the performance can be improved, which will be a sizeable part of the work.

1.4 Limitations

To limit the scope of the thesis, several limitations are imposed. Some of the initial assumptions are described here; further limitations are introduced later, as theory and models are presented.

The distance between transmitter and receiver is assumed to be 1000 meters. It was also decided that the underwater environments should resemble the conditions in Swedish waters. Swedish waters, mainly the Baltic Sea and the waters in Skagerrak and Kattegatt, are shallow, so the intention is to study shallow-water communications. Two depths, a shallow case (18 meters) and a deep case (72 meters), are studied to represent different environments. Note that a depth of 72 meters is still considered shallow-water communications in the general field. It is assumed that the communication nodes utilize hydrophones in a single input single output (siso) setup, i.e., the transmitter and receiver each use only one hydrophone. The beam pattern is assumed to be omnidirectional. The hardware used at Saab Dynamics does not provide a truly omnidirectional pattern, but this assumption reduces the complexity and is common in the literature. Transmitter and receiver locations are assumed to be static, only drifting slightly in the environment. Further limitations introduced in the thesis are intended to replicate a realistic simulation.

The computational complexity and hardware limitations of uwa communication modems will be disregarded.

1.5 Background

This thesis is written in cooperation with Saab Dynamics in Linköping. Saab Dynamics is a subsidiary of Saab AB. Saab AB provides high-technology solutions within military defense, civil defense, and aerospace. Saab Dynamics, in turn, supplies a wide range of products, such as torpedoes, missiles, ground combat equipment, and various naval solutions. The naval solutions include auvs, remotely-operated vehicles, underwater networks, and torpedoes (a mix of civil and military products). For example, remotely-operated vehicles can perform repair missions on underwater pipelines or oceanographic studies.

Thus, underwater wireless communications is an area of great interest for Saab Dynamics, as it can allow these products to become wireless and autonomous. Saab Dynamics works with many partners to promote and participate in uwa wireless communications research. As an active participant in the research community and involved in product development, Saab Dynamics has an interest in uwa channel modeling and signal processing.

2 Theoretical Background

This chapter describes the fundamental theory important to the study. First, an introduction to the underwater channel is given in Section 2.1 and the channel models are discussed in Section 2.2. Based on theoretical knowledge about uwa communications, the baseline frss physical layer is described in Section 2.3. Section 2.4 describes the basics of artificial neural networks (anns) and the two types studied in this thesis, dnns and rnns. The chapter ends with a review of related work in communications in Section 2.5.

2.1 Underwater Channel Characteristics

Due to the usage of acoustic signals, the signals transmitted in uwa are inherently wideband [25], i.e., the bandwidth of the signal B is large relative to the carrier frequency fc. As highlighted in Table 1.1, the carrier frequency is substantially lower compared to electromagnetic waves, making the available bandwidth lower. The low bandwidth also means that the supported data rates are quite low, as the capacity of the channel increases with the available bandwidth [29]. The intuition behind the result is that more available bandwidth means that more information can be loaded into one transmission. In radio communications, assumptions are often based on the fact that fc ≫ B; this is not viable in the uwa channel [25]. To get a further understanding of the difficulties of the acoustic channel, some crucial aspects will be discussed below in separate sections.
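To make the wideband property concrete, the ratio B/fc can be evaluated for one operating point per channel. The values below are picked from the ranges in Table 1.1 purely for illustration (fc = 2 GHz for radio, and B = 5 kHz at fc = 10 kHz for uwa are assumptions, not figures used elsewhere in the thesis):

```latex
\[
\left.\frac{B}{f_c}\right|_{\text{radio}} = \frac{20\ \text{MHz}}{2\ \text{GHz}} = 0.01,
\qquad
\left.\frac{B}{f_c}\right|_{\text{uwa}} = \frac{5\ \text{kHz}}{10\ \text{kHz}} = 0.5 .
\]
```

Under these assumed values the uwa signal occupies a fraction of its carrier frequency that is fifty times larger than in the radio case, which is why narrowband approximations do not carry over.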

2.1.1 Attenuation

The amount of power captured at the receiver determines whether we can extract any data from the signal or only receive noise. Thus, understanding the attenuation of a signal through a communication medium is crucial in all kinds of communication systems. In the uwa channel, the signal attenuation is frequency-dependent. The attenuation, i.e., the path loss A(l, f), can be described according to [25] as:

$$A(l, f) = (l/l_r)^{k}\,\alpha(f)^{\,l - l_r}, \tag{2.1}$$

where l is the propagation distance compared to a reference distance l_r, f is the frequency of the signal, and α(f) is the absorption coefficient, which increases with increasing frequency. The exponent k, i.e., the path loss exponent, models the spreading factor in the water, which is a factor between 1 and 2. The frequency dependence of α(f) implies that low-frequency components of the signal transmitted through the water will be received with higher power than the high-frequency components. The frequency-dependent absorption coefficient can be described by Thorp's empirical formula [16] in dB/km:

$$10 \log \alpha(f) = 0.11\,\frac{f^2}{1 + f^2} + 44\,\frac{f^2}{4100 + f^2} + 2.75 \cdot 10^{-4} f^2 + 0.003, \tag{2.2}$$

where f is in kHz. The frequency-dependent transmission loss is visualized in Figure 2.1 for a typical bandwidth of 5 − 10 kHz.

Figure 2.1: Transmission loss as a function of frequency, at 1km range and with k = 1.7.
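As a rough numerical companion to (2.1) and (2.2), the following minimal sketch evaluates the path loss in dB. The function names and the reference distance of 1 m are assumptions made here for illustration; the thesis does not state which reference distance was used for Figure 2.1.

```python
import math


def thorp_absorption_db_per_km(f_khz):
    """Thorp's empirical absorption coefficient, 10 log alpha(f), in dB/km (f in kHz)."""
    f2 = f_khz ** 2
    return 0.11 * f2 / (1 + f2) + 44 * f2 / (4100 + f2) + 2.75e-4 * f2 + 0.003


def path_loss_db(l_m, f_khz, k=1.7, l_ref_m=1.0):
    """Path loss 10 log A(l, f) in dB: spreading term plus absorption term.

    l_ref_m = 1.0 is an assumed reference distance (hypothetical choice).
    """
    spreading = k * 10 * math.log10(l_m / l_ref_m)
    # Thorp's formula is in dB/km, so the distance is converted from m to km.
    absorption = (l_m - l_ref_m) / 1000.0 * thorp_absorption_db_per_km(f_khz)
    return spreading + absorption


if __name__ == "__main__":
    for f in (5.0, 7.5, 10.0):
        print(f"{f:4.1f} kHz: {path_loss_db(1000.0, f):.2f} dB")
```

With k = 1.7 and a range of 1000 m, the spreading term dominates and the absorption term adds a frequency-dependent contribution on the order of a decibel across the 5 to 10 kHz band.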

The choice of the spreading factor is determined by the physical properties of the channel. A spreading factor of k = 2 corresponds to spherical spreading, where the power is spread over the surface of a sphere and the transmission loss increases with the square of the range [30, p. 101]. The choice of k ≈ 2 is comparable to deep ocean communications, where the sound waves propagate through the ocean with few obstacles. A spreading factor of k = 1 corresponds to cylindrical spreading [30, p. 102]. Cylindrical spreading occurs when the sound does not propagate freely horizontally or vertically, for example, when the vertical propagation is limited by the seafloor and surface. Cylindrical spreading can occur both at moderate and long ranges [30, p. 102], where the sound is trapped between the seafloor and surface.

The reality in most scenarios is somewhere in between cylindrical and spherical spreading. With our interest in shallow-water communication, a choice of k = 2 is not realistic as the sound does not propagate freely in the medium. A choice of k = 1 is not realistic as the trapped sound at a depth of 18 or 72 meters still suffers from attenuation when reflected on the surfaces, and the bending due to sound speed variations yields an inhomogeneous propagation.
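The effect of the spreading factor alone can be illustrated with a short worked calculation. It uses the 1000 m range from Section 1.4 and the value k = 1.7 from Figure 2.1; the reference distance of 1 m is an assumption made here for illustration:

```latex
\[
10 \log \left(\frac{l}{l_r}\right)^{k} = 10\,k \log \frac{l}{l_r},
\qquad \frac{l}{l_r} = \frac{1000\ \text{m}}{1\ \text{m}}
\;\Rightarrow\;
\begin{cases}
k = 1: & 30\ \text{dB} \\
k = 1.7: & 51\ \text{dB} \\
k = 2: & 60\ \text{dB}
\end{cases}
\]
```

Under this assumption, the intermediate choice of k lies 21 dB above pure cylindrical spreading and 9 dB below pure spherical spreading.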

2.1.2 Noise

All communication channels are subject to disturbing noise. A common assumption is to study the performance of a communication system in the presence of ambient wgn, where "white" means that the power spectral density (psd) of the noise is constant in the frequency range of interest and "Gaussian" describes the probability density function of the noise. In the uwa channel, the ambient noise may be modeled as Gaussian [25], although no specific motivation is provided. Some of the most important articles cited in this thesis assume the noise is Gaussian, but without further references or motivation [23, 26, 32]. The noise psd in uwa communication is in fact colored (frequency-dependent), similar to the path loss [30, p. 206]. Both [25] and [30, p. 210] suggest that the noise psd decreases at approximately 18 dB/decade with increasing frequency.
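The 18 dB/decade decay can be written as a simple log-linear psd model. The sketch below is only an illustration: the function name, the reference level of 50 dB, and the reference frequency of 1 kHz are hypothetical placeholders, not values taken from [25] or [30].

```python
import math


def noise_psd_db(f_khz, ref_level_db=50.0, f_ref_khz=1.0, slope_db_per_decade=18.0):
    """Colored-noise psd level in dB at frequency f_khz.

    The level drops by slope_db_per_decade for every tenfold increase in
    frequency. ref_level_db and f_ref_khz are assumed placeholder values.
    """
    return ref_level_db - slope_db_per_decade * math.log10(f_khz / f_ref_khz)


if __name__ == "__main__":
    # Difference in noise level across a 5 to 10 kHz band.
    print(noise_psd_db(5.0) - noise_psd_db(10.0))
```

Across one octave, such as 5 to 10 kHz, this model gives a difference of 18 log10(2), roughly 5.4 dB, so the lower part of the band sees noticeably more noise than the upper part.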

In the water, there are many sources of interference, such as ocean wildlife and ship noise. These noise sources can differ a lot depending on the environment, for example in a harbor or the middle of the sea. Attempts have been made to model specific interference sources, such as shipping lanes [3] and shrimp clicking [5]. Due to the unpredictability of some of these interference sources, they can potentially disrupt even the best receivers designed under assumptions of colored or white Gaussian noise.

2.1.3 Multipath Propagation

The environment between the receiver and the transmitter is not obstacle-free; therefore, the acoustic signal is reflected while propagating in the environment. Thus the receiver can pick up multiple delayed instances (from multiple paths) of the originally transmitted signal, as reflected components are received. Multiple delayed instances of signals at the receiver cause intersymbol interference (isi). isi is an issue that must be dealt with by the receiver; if it is not, it can disrupt any attempt at communication. Multipath propagation is commonly modeled as a tapped delay line impulse response [24], where the tap gains are modeled by stochastic processes. In standard terrestrial wireless communications, the number of multipath components can be large. In the uwa channel, the number of propagation paths is not necessarily as large. The issue is rather that the isi can last several symbol intervals [26], as sound travels at a much slower speed in water than electromagnetic waves do in air.
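A tapped delay line with only a few, widely separated taps already shows how a symbol leaks into later symbol intervals. The sketch below is a minimal illustration with fixed (non-random) tap gains; the function name and the example tap values are assumptions, not parameters from the thesis.

```python
def tapped_delay_line(symbols, taps):
    """Pass a symbol sequence through a sparse tapped delay line.

    taps maps a delay (in symbol intervals) to a tap gain.
    Returns the received sequence, including the tail caused by the delays.
    """
    n_out = len(symbols) + max(taps)
    received = [0.0] * n_out
    for n, s in enumerate(symbols):
        for delay, gain in taps.items():
            received[n + delay] += gain * s
    return received


if __name__ == "__main__":
    # Direct path plus one echo arriving two symbol intervals later.
    print(tapped_delay_line([1, -1, 1], {0: 1.0, 2: 0.5}))
```

In this example the third received sample is 1.5 rather than 1, because the echo of the first symbol overlaps the third symbol, which is the isi the equalizer has to undo.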

Time-Varying Multipath Propagation

It is important to note that the multipath propagation properties shift slowly over time. The time-varying multipath channel is described by a time-variant filter h(τ; t), as described in [2, p. 132]. The variable τ corresponds to the delays in the impulse response, i.e., filter taps. The variable t describes how h(τ; t) varies with time. When the impulse response varies faster with respect to τ than with respect to t, the filter can be considered a sequence of time-invariant filters [2, p. 132]. The output of the filter is given by the convolution:

$$y(t) = \int_{-\infty}^{\infty} h(\tau; t)\, x(t - \tau)\, d\tau, \tag{2.3}$$

where x(t) is the input signal to the system.
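A discrete-time counterpart of (2.3), y[n] = Σ_k h[k; n] x[n − k], can be sketched as follows. This is an illustrative sketch only; the function name and the way the time-varying taps are supplied (a callable returning the tap gains at output time n) are assumptions and not the filter implementation used in the thesis.

```python
def time_variant_filter(x, taps_at):
    """Apply a time-variant FIR filter to the sequence x.

    taps_at(n) returns the list of tap gains h[k; n] valid at output time n,
    so the impulse response is allowed to change from sample to sample.
    """
    y = []
    for n in range(len(x)):
        taps = taps_at(n)
        acc = 0.0
        for k, gain in enumerate(taps):
            if n - k >= 0:  # samples before the start of x are treated as zero
                acc += gain * x[n - k]
        y.append(acc)
    return y


if __name__ == "__main__":
    # Direct path with unit gain and an echo whose gain grows slowly with time.
    print(time_variant_filter([1.0, 1.0, 1.0, 1.0], lambda n: [1.0, 0.1 * n]))
```

If taps_at returns the same list for every n, the function reduces to an ordinary time-invariant convolution, which corresponds to the case where the channel can be treated as a sequence of time-invariant filters.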

Environmental Variations

The behavior of multipath propagation is determined by the appearance of the physical channel [24]. In shallow waters, the channel impulse response is determined by reflections on the surface and bottom, as well as other objects and the direct path [24]. The appearance of the bottom is called bathymetry and can be compared to topography on land. Bathymetry between two communication nodes depends on where the nodes are deployed. Hence, there is no single bathymetry valid for all communications. The importance of understanding the bathymetry for communications will be highlighted in Section 3.3.1. The surface properties between the communication nodes are not static either. Due to tides, waves, and other phenomena, the surface is subject to variations over time, whereas the bathymetry can be considered rather static once known. Attempts at modeling the behavior of the surface will be rather limited in this thesis, but the impact of surface variations should be mentioned. Entire studies have been devoted to modeling the impact of waves on communications and have concluded that it is significant [15]. Even details such as air bubbles due to crashing waves affect communications.

Sound Speed Variations

The speed of sound in water varies with depth [30, p. 111], due to varying levels of temperature, pressure, salinity, etc. This is important to consider, as it means that the sound waves do not propagate homogeneously in the water. Sound waves are bent, which causes further unpredictability. The sound speed profile (ssp) can vary largely with the seasons. In the spring, cold water from rivers cools down the surface water, while the deeper ocean water keeps a stable temperature, causing a "knee" in the sound speed profile; Figure B.5 illustrates this well. Similar behaviors can be observed in the summer when water close to the surface is heated up. This affects the geometry of the multipath propagation [24].

2.1.4

Doppler Effect

The Doppler effect is the frequency shift in the observed signal due to the movement of the transmitter or receiver relative to the path traveled by the signal. The Doppler effect is present in all non-stationary communication channels, but as with the effects in the previous sections, it is more severe in the underwater case. The magnitude of the Doppler effect is proportional to a = v/c, where c is the propagation speed of the signal and v is the velocity of the moving transmitter/receiver. To give some insight, c ≈ 1500 m/s for underwater sound, while radio waves travel at the speed of light, $c \approx 3 \cdot 10^8$ m/s. Hence, for a particular object velocity v, the Doppler effect is higher by a factor of 200000 for the uwa channel. As [25] points out, there are few comparable scenarios in radio communications; similar Doppler effects arise only in low-orbit satellite communications. The Doppler effect is introduced when we consider a moving transmitter or receiver such as an auv, but even without intentional motion, underwater nodes are subject to drifting with waves, tides, and currents [25]. So the Doppler effect cannot be disregarded even for a non-moving setup.
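To make the comparison concrete, the following sketch evaluates a = v/c for the same platform speed in both media. The platform speed and the carrier frequency are illustrative assumptions, not values taken from this thesis.

```python
# Doppler factor a = v / c for the same platform speed in water and for radio.
# The values of v and f_c are illustrative assumptions, not thesis parameters.

C_SOUND_WATER = 1500.0   # m/s, approximate speed of sound in water
C_LIGHT = 3.0e8          # m/s, approximate speed of light


def doppler_factor(v: float, c: float) -> float:
    """Relative Doppler factor a = v / c."""
    return v / c


v = 2.0        # m/s, hypothetical speed of a moving node
f_c = 12e3     # Hz, hypothetical acoustic carrier frequency

a_uwa = doppler_factor(v, C_SOUND_WATER)
a_radio = doppler_factor(v, C_LIGHT)

ratio = a_uwa / a_radio      # equals C_LIGHT / C_SOUND_WATER
shift_hz = a_uwa * f_c       # first-order frequency shift of the carrier

print(f"a (uwa)   = {a_uwa:.3e}")
print(f"a (radio) = {a_radio:.3e}")
print(f"ratio     = {ratio:.0f}")
print(f"carrier shift at {f_c:.0f} Hz: {shift_hz:.1f} Hz")
```

The ratio depends only on the two propagation speeds, which is why the factor of 200000 holds for any velocity v.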

2.1.5

Scattering

The sea contains inhomogeneities which intercept and reradiate portions of the acoustic signal [30, p. 237]. The reradiation is called scattering. The sum of the total scattering is called reverberation, and [30, p. 237] names three types of reverberation: sea-surface reverberation, bottom reverberation, and volume reverberation. The first two are self-explanatory, and volume reverberation occurs due to marine wildlife, other objects, and the inhomogeneous structure of the sea itself.

2.2

Channel Models

Modeling of the uwa channel is a major obstacle to achieving reliable communications in the uwa channel [12]. Good channel models that can be implemented in software are essential to simulate physical layers, as sea tests are expensive and troubleshooting is more difficult. Creating a channel model that takes into consideration all the aspects mentioned in Section 2.1 is non-trivial. In wireless communications, a common approach to deal with multipath propagation and fading is to create stochastic channel models. A common assumption is the wide-sense stationary uncorrelated scattering (wssus) channel [2]. Due to its analytical tractability, a similar model would be desired in uwa communications, but studies conclude that wssus assumptions are violated by non-stationary behavior in the uwa channel [33]. Also, as summarized in [18], the signal envelope has been reported to follow Rayleigh, Rice, and Lognormal distributions in different studies. According to another article [24], claims of shallow-water medium-range channels following a Rayleigh fading model have been challenged. Models such as Rayleigh fading might be feasible but are constantly challenged, and the uwa channel behavior is difficult to generalize. It is concluded that the contradicting results show that a realistic uwa channel simulator is required [18]. A promising approach is ray theory; both [24] and [12] highlight the usage of ray theory to model multipath propagation in the uwa channel.

2.3

Baseline Physical Layer

The dfe-pll has been considered a suitable and popular receiver structure for uwa communications [12]. The structure was initially suggested in a series of two articles [23] and [26]. The design is motivated by the unique characteristics of the uwa channel, namely large Doppler fluctuations and long time-varying multipath. The structure has been combined with the multi-carrier spread spectrum (mcss) modulation technique in [34]. The authors suggest a physical layer structure based on mcss, where the receiver utilizes the dfe-pll [34]. mcss was later renamed to frss and was motivated specifically by outperforming the other candidates, namely direct sequence spread spectrum (dsss) and frequency-shift keying (fsk), in low snr scenarios [32]. It did not give the highest data rates at high snrs compared to dsss and fsk, but the low snr performance is very desirable in a tough underwater channel. The frss transmitter and receiver are vital components in this thesis.

2.3.1

Transmitter

The effective data rate (bit/s) and the stability of the reception are determined by the spreading factor, which in turn is set by a rate parameter R. A total of six configurations of R are suggested in [32], R ∈ {1, 2, 3, 4, 5, 6}, but only four are available in the implementation used here. The spreading factor is $K = 2^R - 1$. A choice of R = 1 yields a single sub-band, while R = 4 yields fifteen sub-bands. A higher R yields a more stable transmission, but the increased spreading factor reduces the effective data rate. Therefore, the choice should be made based on the transmission conditions.

When R is chosen, the data symbols are mapped onto the $K = 2^R - 1$ sub-bands. Initial training symbols are prepended and information bits are continuously multiplexed with periodic training symbols. The waveform is prepended with a preamble, see [32], which is utilized for detection, synchronization, and Doppler estimation. Figure 2.2 from [32] illustrates the frss modulation format for R = 2 compared to dsss.
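The relation between the rate parameter and the number of sub-bands can be tabulated with a few lines of code. This is only a sketch of the relation K = 2^R − 1 stated above; the function name is a hypothetical helper and is not part of the frss reference implementation.

```python
def num_subbands(rate: int) -> int:
    """Spreading factor / number of sub-bands K = 2^R - 1 for rate parameter R."""
    if rate < 1:
        raise ValueError("rate parameter R must be a positive integer")
    return 2 ** rate - 1


for R in range(1, 7):
    print(f"R = {R}: K = {num_subbands(R)} sub-bands")
```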


Figure 2.2: An illustration of the frss modulation format, compared to traditional dsss, from [32].

Preamble

All rates use an m-sequence preamble as described in [32]. The m-sequence is mapped onto a raised cosine carrier function. The raised cosine has a roll-off factor of β = 1/3. The preamble is then moved to the passband with carrier frequency $f_c$.

Channel Coding

Before the bits are channel coded they are scrambled. After scrambling, the bits are channel coded using a 1/2 rate convolutional encoder [32]. The bits are then interleaved, i.e., rearranged to protect the data from burst errors.

Modulation and Training Symbols

The symbol sequence created by the channel coding is modulated using Gray-coded quadrature phase-shift keying (qpsk). The training symbols mentioned in Section 2.3.1 are a set of known qpsk symbols, inserted at a rate of one training symbol for every two data symbols. A sequence of random qpsk symbols is prepended to the signal to enable initial equalizer training. The resulting sequence, which is ready to be modulated onto the sub-bands, is denoted z(n).

FRSS Generation

The sequence z(n) is mapped onto the $K = 2^R - 1$ sub-bands utilizing a raised cosine pulse with a roll-off factor of β = 1/3. After modulation onto the sub-band waveforms, the signal is concatenated with the preamble, with a short pause of duration ∆t in between. The final output from the frss transmitter is a time-continuous signal x(t).
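The symbol mapping and training-symbol insertion described under Modulation and Training Symbols can be sketched as follows. The specific Gray mapping, the position of the training symbol within each group of three, and all function names are assumptions made for illustration; [32] defines the actual format.

```python
import cmath
import math
import random

# Assumed Gray mapping: neighbouring constellation points differ in one bit.
GRAY_QPSK = {
    (0, 0): cmath.exp(1j * math.pi / 4),
    (0, 1): cmath.exp(3j * math.pi / 4),
    (1, 1): cmath.exp(5j * math.pi / 4),
    (1, 0): cmath.exp(7j * math.pi / 4),
}


def qpsk_modulate(bits):
    """Map an even-length bit sequence to Gray-coded QPSK symbols."""
    if len(bits) % 2:
        raise ValueError("QPSK needs an even number of bits")
    return [GRAY_QPSK[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def insert_training(data_symbols, training_symbols):
    """Place one known training symbol after every two data symbols."""
    out, training = [], iter(training_symbols)
    for i, symbol in enumerate(data_symbols, start=1):
        out.append(symbol)
        if i % 2 == 0:
            out.append(next(training))
    return out


rng = random.Random(0)
coded_bits = [rng.randint(0, 1) for _ in range(16)]   # stand-in for coded, interleaved bits
data = qpsk_modulate(coded_bits)                       # 8 data symbols
training = qpsk_modulate([rng.randint(0, 1) for _ in range(8)])  # 4 known symbols
z = insert_training(data, training)                    # 12 symbols in total
print(len(data), len(training), len(z))
```

With one training symbol per two data symbols, a third of the transmitted symbols carry no information, which is the price paid for continuous equalizer tracking.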

2.3.2

Receiver

The receiver can be divided into several parts as well. The processes are, in order, acquisition, equalization, adaptive filtering, phase tracking, log-likelihood computation, and soft Viterbi decoding. The input to the frss receiver is a time-continuous signal u(t) which has passed through the uwa channel.

Acquisition and Pre-processing

The received signal u(t) is demodulated to the baseband from the carrier frequency $f_c$. The baseband signal is then passed to a Doppler filter bank, i.e., the signal is correlated with a bank of Doppler-shifted replicas of the preamble. A brick wall filter is applied to remove out-of-band noise [34]. A filtered signal $y_i(t)$ per sub-band i is obtained from the pre-processing.

Equalization

The purpose of the equalizer is to eliminate the isi introduced by the channel. The input to the equalizer is a time segment of length $T_{eq}$ of the pre-processed signal. $T_{eq}$ should be larger than the channel delay spread, and it determines the parameter L. The equalizer is trained using the training sequences described in Section 2.3.1. Training is performed by feeding the samples to the filter and updating the filter coefficients based on the error between the known training symbols and the filter output. The output of the equalizer is denoted $\hat{z}(n)$. The equalization is performed by a fractionally spaced dfe-pll as described in [23] and [26]. A symbol estimate is obtained by multiplying the input signal $y^{(m)}_{k,n}$ with the filter coefficients $c^{(m)}_{k,n}$, see equation (9) in [34]:

$$\hat{z}^{(m)}(n) = \sum_{k=1}^{K} \left(y^{(m)}_{k,n}\right)^{T} c^{(m)}_{k,n\pm1}, \tag{2.4}$$

where $y^{(m)}_{k,n}$ is given by equation (10) in [34]:

$$y^{(m)}_{k,n} = \begin{bmatrix} y_k(nT - (L-1)bT/(2a)) \\ y_k(nT - (L-3)bT/(2a)) \\ \vdots \\ y_k(nT + (L-3)bT/(2a)) \\ y_k(nT + (L-1)bT/(2a)) \end{bmatrix} \exp\left[-j\theta^{(m)}_k(n \pm 1)\right]. \tag{2.5}$$


The signal $y_k(t)$ is down-sampled to a samples per symbol. The dfe has a tap spacing of $bT/a$, where a and b are design parameters [34] and T is the symbol period. $\theta^{(m)}_k$ represents the approximated phase shift of the carrier frequency for sub-band k.
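The stacking in equation (2.5) and the summation in equation (2.4) can be sketched as below. Consecutive sampling instants in equation (2.5) differ by bT/a, which the helper reproduces. The toy signals, the placeholder coefficients, and all names are hypothetical; the sketch leaves out the decision-feedback and tracking parts of the dfe-pll.

```python
import cmath


def tap_offsets(L, a, b, T):
    """Offsets -(L-1)bT/(2a), -(L-3)bT/(2a), ..., +(L-1)bT/(2a) around nT."""
    return [(2 * l - (L - 1)) * b * T / (2 * a) for l in range(L)]


def stack_input(y_k, n, L, a, b, T, theta=0.0):
    """Length-L equalizer input for symbol n, built from a callable y_k(t)."""
    rotation = cmath.exp(-1j * theta)   # phase correction, e.g. from a pll
    return [y_k(n * T + dt) * rotation for dt in tap_offsets(L, a, b, T)]


def symbol_estimate(inputs, coeffs):
    """Sum over sub-bands k of the inner product of input and coefficients."""
    return sum(
        sum(y * c for y, c in zip(y_k, c_k)) for y_k, c_k in zip(inputs, coeffs)
    )


# Hypothetical toy setup: K = 3 sub-bands, L = 5 taps, a = 4, b = 1, T = 1 ms.
K, L, a, b, T = 3, 5, 4, 1, 1e-3
signals = [
    (lambda t, k=k: cmath.exp(2j * cmath.pi * 100 * (k + 1) * t)) for k in range(K)
]
inputs = [stack_input(y_k, n=10, L=L, a=a, b=b, T=T) for y_k in signals]
coeffs = [[1.0 / (K * L)] * L for _ in range(K)]   # placeholder coefficients

offsets = tap_offsets(L, a, b, T)
print(offsets)
print(symbol_estimate(inputs, coeffs))
```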

Adaptive Filtering and Phase Tracking

The parameters utilized in the equalization process are constantly tracked. The known training symbols are used to update $c^{(m)}_{k,n}$ using an adaptive filter; in [34] a recursive least squares (rls) filter is suggested. For more details on adaptive filtering and the rls filter, see [8]. The phase shift $\theta^{(m)}_k$ is approximated by a pll utilizing the known training symbols; for details, see [34].

Log-likelihood Ratio

The symbol estimates from the equalization, $\hat{z}(n)$, are then used to calculate a log-likelihood ratio. The periodic training symbols are utilized in this calculation [34], which compensates for the bias of the equalizer. The output of the log-likelihood ratio computation is fed to the Viterbi decoder.

Viterbi Decoding

The scaled symbols are deinterleaved using the interleaving sequence known from the transmitter. The symbols are then put through a soft-decision Viterbi decoder [32]. The final step is to unscramble the symbols, which yields the desired bit sequence.

2.4

Artificial Neural Networks

In this section, the basics of anns are described, as dnns and rnns are specific types of anns that have been utilized in this thesis.

anns are categorized as an ml algorithm. They have attracted great attention in recent years, and ml is often considered synonymous with anns, even though there are other kinds of algorithms in ml. anns have become very popular due to their ability to solve very complex and non-linear problems and have proven to be applicable in a wide array of subjects. Applications include computer vision, signal processing, medical diagnosis, etc. Fundamental research and theory in the area were developed in the 1970s, but the field has gained popularity in the past 10-20 years due to the increased accessibility of computing power, which enables the training of the parameters in larger anns. In this section, the fundamental theory is presented to introduce the reader to the basics of anns. This section only studies supervised learning with anns, where the network is trained by feeding it input/output combinations of the correct behavior. The goal of the training is to learn a function that takes previously unseen data and performs the correct function mapping, i.e., generalization. The basics of anns in this section are based on [19].


2.4.1

Input and Output

Before explaining how anns work, the input and output have to be explained. As mentioned before, anns have been applied to a variety of problems, and the input and output of an ann vary depending on the application. The input data to an ann is often referred to as features, where selected features of the data are used. If the user wants to identify cats in an image, the features can be 256 × 256 pixel values and the desired output is 0 if there is no cat and 1 if there is a cat. In a signal processing example, the feature can be an entire received signal in the time domain, its Fourier transform, or other properties of the signal. Feature selection is of utmost importance, and selecting the correct feature and feature size is crucial. It is desirable to pre-process information to reduce the input dimension and aid the neural network in learning already known properties. However, this might come at the cost of lost information, which can degrade the performance. The important part is that there is a good data set with correctly annotated input and output pairs $x_n$ and $y_n$, n ∈ {1, ..., N}. The purpose of the network is then to build an approximation of the desired function f, so that $y_n = f(x_n)$ for all n.

2.4.2

The Neuron

The neuron is the fundamental computational block of an ann. The neuron has a set of input features $x_n$, and each feature $x_n$ is multiplied with a weight $w_n$, n ∈ {1, ..., N}. The weighted inputs are summed according to:

$$z = \sum_{n=1}^{N} w_n x_n + b, \tag{2.6}$$

where b corresponds to an introduced bias weight. The purpose of the bias weight is to be able to represent the output of the neuron in a possibly wider range compared to the input domain. The output of the neuron y is finally given by y = g(z), where g(z) corresponds to the activation function. The purpose of the activation is to perform a non-linear mapping of the weighted summation to the output so that it can be decided if the neuron was activated or not. Activation functions are non-linear functions (linear functions may only be utilized in the output layer), since this enables anns to capture non-linear behaviors, a necessity to solve complex problems. Some examples of popular activation functions are given in Section 2.4.5.
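As a minimal sketch, a single neuron can be written as a weighted sum followed by an activation. The input values, weights, and the choice of tanh as activation are arbitrary illustrative assumptions.

```python
import math


def neuron(x, w, b, g=math.tanh):
    """Single neuron: z = sum_n w_n * x_n + b, followed by y = g(z)."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    z = sum(w_n * x_n for w_n, x_n in zip(w, x)) + b
    return g(z)


x = [0.5, -1.0, 2.0]     # hypothetical input features
w = [0.2, 0.4, -0.1]     # hypothetical weights
b = 0.3                  # hypothetical bias weight

print(neuron(x, w, b))                    # non-linear output, tanh activation
print(neuron(x, w, b, g=lambda z: z))     # linear activation, i.e. z itself
```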

2.4.3

The Network

The artificial neural network is built of multiple layers of neurons. In the simplest case, the input features (input layer) are connected to a set of neurons, which then produce the output (output layer). To solve complex problems, layers of neurons are added in between the input and output layers; these are called hidden layers. In multi-class classification tasks, a softmax layer is commonly used as the output layer, so that each class is given a probability between zero and one, based on which a classification decision is made.
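The softmax mapping from output-layer values to class probabilities can be sketched as follows. The input scores are hypothetical, and subtracting the maximum before exponentiating is a common numerical safeguard rather than something prescribed by this thesis.

```python
import math


def softmax(scores):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1]          # hypothetical output-layer values
probs = softmax(scores)
decision = max(range(len(probs)), key=probs.__getitem__)   # most probable class
print(probs, decision)
```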

2.4.4

Training

Once the network architecture is built, the final step is training the network. The training of an ann is the process of updating the weights w (including bias weights) in all the neurons in the network. First, we have to formulate a problem that can be optimized. Therefore, a loss function is introduced; in this case the mean square error loss function is selected, but note that there are other variants available. The mean square error loss function is given as:

$$\mathcal{L}(w) = \sum_{m} \sum_{k} \left(y_{k,m} - p_{k,m}(w)\right)^2, \tag{2.7}$$

where k runs over all training samples and m over all output nodes. Note that the training target $y_{k,m}$ is known and the network predicts the output $p_{k,m}(w)$. It is desired to minimize $\mathcal{L}(w)$, which is done by gradient descent:

$$w^{t+1}_{i,j} = w^{t}_{i,j} - \eta \frac{\partial \mathcal{L}(w)}{\partial w_{i,j}}, \tag{2.8}$$

where the parameter η is called the learning rate and the derivative $\partial \mathcal{L}(w)/\partial w_{i,j}$ is found using partial derivatives and the chain rule, which requires the activation functions to be differentiable or piecewise differentiable. The learning rate determines how fast the weights in the network are updated. A high learning rate leads to faster training, but the updates may overshoot and training may become unstable or diverge. A low learning rate leads to slower training and may not converge within the available training time.
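The update rule can be illustrated on the smallest possible case, a single linear neuron p = w·x + b fitted to noise-free data. The data, the learning rate, and the number of steps are illustrative assumptions; the loss is the sum of squared errors as in equation (2.7).

```python
def loss(w, b, xs, ys):
    """Sum of squared errors between targets y_k and predictions w*x_k + b."""
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys))


def gradient(w, b, xs, ys):
    """Partial derivatives of the loss with respect to w and b (chain rule)."""
    dw = sum(-2.0 * x * (y - (w * x + b)) for x, y in zip(xs, ys))
    db = sum(-2.0 * (y - (w * x + b)) for x, y in zip(xs, ys))
    return dw, db


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # generated by y = 2x + 1
w, b, eta = 0.0, 0.0, 0.02         # eta is the learning rate

for step in range(2000):
    dw, db = gradient(w, b, xs, ys)
    w, b = w - eta * dw, b - eta * db   # gradient descent update

print(w, b, loss(w, b, xs, ys))
```

For this particular data set the iteration is stable only for learning rates below roughly 0.06; larger values make the weights grow without bound, which illustrates the sensitivity to the learning rate.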

The process of feeding training samples through the network is called forward propagation. The process of updating the weights based on the loss function is called backward propagation. This can be done for one training sample at a time or utilizing multiple samples, which is called batch learning. In batch learning, training samples are put into batches, and then forward and backward propagation are performed. The number of training samples in one batch is called the batch size or mini-batch size, as common sizes are relatively small numbers such as 16 or 32.
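Splitting a training set into mini-batches can be sketched as below. Shuffling the samples before batching is a common practice assumed here, not something stated in the text, and the sample data are hypothetical.

```python
import random


def minibatches(samples, batch_size, rng):
    """Shuffle the training samples and yield consecutive batches."""
    order = list(range(len(samples)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [samples[i] for i in order[start:start + batch_size]]


rng = random.Random(0)
samples = [(float(i), 2.0 * i + 1.0) for i in range(100)]   # hypothetical (x, y) pairs
batches = list(minibatches(samples, batch_size=32, rng=rng))
print([len(batch) for batch in batches])   # the last batch holds the remainder
```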

2.4.5

Designs Considerations

Some important design considerations will be discussed in this section, to give some insight into the design of an ann.

Activation Function

There exist many possible activation functions; here three classical activation functions are presented. The first one is the sigmoid function:

$$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{2.9}$$

The sigmoid function has a range between 0 and 1 and is easy to apply. However, it suffers from some undesirable properties:

• Vanishing gradient problem, i.e., the partial derivatives become very small when x is very negative or very positive, which leads to slow updates for early layers in the network.

• The range of the derivative of the sigmoid function is very narrow, which leads to indistinct gradient values.

• It is not zero-centered. Hence, negative-valued outputs cannot be represented by a sigmoid function.
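The properties listed above can be checked numerically. The sketch below uses the standard identity σ'(x) = σ(x)(1 − σ(x)) for the derivative of the sigmoid; the sample points are arbitrary.

```python
import math


def sigmoid(x):
    """Sigmoid activation, with outputs strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Derivative of the sigmoid, sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


for x in (-10.0, -2.0, 0.0, 2.0, 10.0):
    print(f"x = {x:6.1f}  sigma = {sigmoid(x):.5f}  "
          f"derivative = {sigmoid_derivative(x):.5f}")
```

The derivative peaks at 0.25 for x = 0 and is close to zero for large |x|, so gradients shrink each time they pass through a saturated sigmoid layer.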

The hyperbolic tangent function:

f (x) = tanh(x) (2.10)

is a popular choice. It is zero-centered and its derivative is not as narrow as that of the sigmoid function. It does, however, suffer from the vanishing gradient problem. The rectified linear unit (relu):

f (x) = max(0, x) (2.11)

is a function that does not suffer from the vanishing gradient problem, and it is not zero-centered. It is worth noting that Equation (2.11) is not suitable for the output layer and should only be used in the hidden layers if the range of the output is not restricted to the 0-1 interval. A variant of the relu is the leaky relu, which behaves similarly to the relu, except that it lets small negative values through. It can be described as:

$$f(x) = \begin{cases} x, & x > 0 \\ 0.01x, & \text{otherwise.} \end{cases} \tag{2.12}$$

Number of Layers

The number of layers and neurons is a design consideration. The number of layers can be increased to improve the ability to capture complexities in data, i.e., a more shallow network might be well suited for simpler tasks and vice versa. The number of neurons in each layer can also be increased to capture more complex structures, but should also be of suitable dimensions compared to the dimensions of the input.

There are some drawbacks to increasing the number of layers. One danger is that the network is more prone to over-fitting on the training data. Increasing the number of layers also increases training and run times. When working with a larger number of layers, it is also important to make sure that earlier layers in the network are updated properly, to avoid the vanishing gradient problem. A dnn is an ann where there is more than one hidden layer.


Optimization Method

Equation (2.8) presents the simple gradient descent method. There are a lot of variants of the basic gradient descent available. Different methods can reduce training time or modify the learning rate while training. The learning rate should also be selected carefully. An example of a popular optimization method is adaptive moment estimation (adam), which updates the learning rate while training [13].

Loss Function

The mean squared error function (2.7) is a popular choice of loss function; however, there are other options available. The choice of loss function depends on how the problem is designed.

Data Set

The performance of anns is only as good as the provided data set. If the data set is not general enough, the ann might only become applicable to specific situations. The training data has to be selected such that the network can generalize to the variety of data that will appear when using the ann, regardless of how many neurons the network has or how deep it is.

2.4.6

Recurrent Neural Networks

rnns are a class of anns in which the connections between neurons form a directed graph that may contain cycles. Compared to feedforward anns (where information only moves in one direction, from the input layer to the output layer), rnns can use their internal state/memory to aid in the solving of tasks [10]. Essentially, rnns can utilize information from previous inputs, a property that the basic feedforward network lacks. To learn time-varying underwater channels, this property could be of use. There are several rnn architectures. The only rnn structure provided by MATLAB's Deep Learning Toolbox (version 2019b) is the long short-term memory (lstm) structure.

2.4.7

Long Short-Term Memory Architecture

An lstm layer is built from a set of recurrently connected blocks, called memory blocks. Each block contains a set of cells, where each cell is connected to three multiplicative gates: the input gate, the output gate, and the forget gate. The gates regulate the flow of information into the cell by being open or closed, thus controlling its behavior [6]. An illustration of an lstm is presented in Figure 2.3, from [6]. The gates open or close based on the signals they receive, so information is let through or blocked based on the signal strength and importance. The signals that lead into the gates are weighted as shown in Figure 2.3. This means that the network learns how to value the signal strength, i.e., learning how to handle the data. The input gate thus controls when data is allowed to enter the cell, the output gate controls when data is allowed to leave the cell, and the forget gate controls when data is allowed to be deleted. The concept of gates allows the neural network to capture remote dependencies, i.e., long-term memory, as the name implies.

Figure 2.3: Illustration of an lstm block and its connections, from [6].
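The role of the three gates can be sketched with one time step of a scalar lstm cell. This is the commonly used formulation without peephole connections; the exact variant shown in [6] and the one implemented in the Deep Learning Toolbox may differ. The parameter names and values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a scalar LSTM cell with parameters p (dict of weights)."""
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])      # input gate
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])      # forget gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])      # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])    # candidate input
    c = f * c_prev + i * g        # keep part of the old state, admit new input
    h = o * math.tanh(c)          # expose part of the cell state as output
    return h, c


# Hypothetical weights; in practice they are learned during training.
params = {name: 0.5 for name in ("wi", "ui", "bi", "wf", "uf", "bf",
                                 "wo", "uo", "bo", "wg", "ug", "bg")}
h, c = 0.0, 0.0
for x in [1.0, -0.5, 0.25]:
    h, c = lstm_step(x, h, c, params)
    print(f"x = {x:5.2f}  h = {h:.4f}  c = {c:.4f}")
```

A forget gate close to one carries the cell state c over many time steps, which is how the cell retains long-term information.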

Bidirectional LSTMs

A bidirectional lstm runs the input in both directions, yielding backward and forward information of the sequence. This allows the network to combine information from the past and the future. This property has proven to be useful, as the network learns the context of the provided information better compared to a basic lstm.

2.5

Previous Work

Before explaining the outlined method in this thesis, recent work and progress within the field will be explained to give some insight into the new approach this thesis takes.

2.5.1

Machine Learning in Wireless Radio Communication

Terrestrial wireless radio communications have a plethora of literature on utilizing ml for performing channel estimation and equalization. One can learn a lot from this field, but its assumptions are not consistent with those for the uwa channel. In one article, a dnn is trained online and offline, exploiting knowledge from training symbols to learn a time-varying channel [17]. An interesting approach to this problem is to develop a network that can learn time-varying properties. Recent literature in radio communications highlights that rnns, which in contrast to dnns can learn temporal dynamic behavior, can be suitable for channel estimation/equalization. rnns are exploited in [4] and [7] to perform online pilot-assisted channel estimation/equalization. rnns utilize the previous outputs of the network to create an internal state, which they can use to process new inputs. The number of works on lstms in communications is rather limited; they have, however, proven successful in speech recognition [14].

2.5.2

Machine Learning in Underwater Acoustic Communication

Recent literature on uwa communication suggests that different parts of the receiver chain, or the entire chain, can be replaced by dnns with promising results. In one study, a version of the dfe-pll receiver is replaced by a dnn with improved system performance, namely lower bers compared to the normal receiver at the same snr [37]. The authors considered a single sub-band system, while in this thesis a multi-sub-band system is adopted. A related study proposes a system where the entire ofdm receiver chain is replaced by a dnn [36], suggesting that dnns are promising for improving the ber in multi-sub-band systems. However, ofdm is very different from frss in many aspects; most importantly, the isi is removed by the cyclic prefix in ofdm, which simplifies the rest of the reception. Another work replaces the channel estimation in an ofdm system with a dnn [11]. In [36] and [11], the authors observed lower bers at all snrs compared to the traditional receiver structures that they used as benchmarks.

One common factor in [11, 36, 37] is that the dnns only consist of three to four hidden layers and have a common topology. The topology follows the pattern that the number of neurons in each hidden layer is half that of the previous layer. If the input layer is of size 1024 as in [36], the first hidden layer should be of size 512, the next 256, and so on. Building a network with a similar structure should be a good start.
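The halving pattern can be written down directly. The helper below is a hypothetical illustration of the topology rule, not code from the cited works.

```python
def halving_layer_sizes(input_size: int, num_hidden: int) -> list:
    """Hidden-layer sizes where each layer has half the neurons of the previous one."""
    sizes, size = [], input_size
    for _ in range(num_hidden):
        size //= 2
        if size < 1:
            raise ValueError("too many hidden layers for this input size")
        sizes.append(size)
    return sizes


print(halving_layer_sizes(1024, 3))   # input size 1024 as in [36]
print(halving_layer_sizes(1024, 4))
```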

2.5.3

Key Takeaways

The strengths of rnns are highlighted for wireless radio communication [4, 7], but similar studies in uwa communication were not found. The studies related to uwa communication highlight the usage of dnns [11, 36, 37]. In both areas, training symbol-aided online training has proven effective for rnns and dnns. Training symbol-aided training is very interesting, as the frss training symbols could be utilized in such an approach.

This study will take a new approach by comparing the two different structures in uwa communications and a rather unique multi-sub-band system. The cited


3

Method

In this chapter, the method to answer the problems posed in Section 1.3 is outlined. First, the theory from previous chapters is used to describe the system model. Then the software used to simulate the system is described in Section 3.2. Details regarding the intricate channel simulation are described in Section 3.3, outlining the possible channels. Section 3.4 outlines how the different configurations are simulated. The content in the sections until Section 3.5 serves the purpose of identifying interesting deployment scenarios for the anns, i.e., answering in which environments performance can be improved. The final questions of how much performance can be increased and of deployment strategies (generalizable performance) are answered in the final sections. Section 3.5 proposes ann structures based on the related work, and Section 3.6 describes how anns are deployed to answer the outlined problem formulation. Finally, some miscellaneous studies are described.

3.1

System Model

The intention with this section is to describe how the ml receiver and baseline receiver were implemented on a schematic level. Both systems utilized the same functions and shared most properties, besides the equalization.

Assuming that the frss transmitter described in Section 2.3.1 yields a time-continuous signal x(t), x(t) was then sent through the underwater multipath channel according to equation (2.3), yielding the channel output y(t). The received signal at the hydrophone is u(t), where n(t) is additive noise:

u(t) = y(t) + n(t). (3.1)

The noise n(t) can be either colored or white, see Section 3.3.6. What is important is that the received signal is inevitably embedded in noise. The received signal u(t) was fed to the different receivers.
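Equation (3.1) can be sketched for the white-noise case as below. The snr is taken here as the ratio of average signal power to average noise power over the whole sampled band, which is an assumption; the definition used in the simulations of this thesis may differ. The test tone and sample rate are hypothetical.

```python
import math
import random


def add_white_noise(y, snr_db, rng):
    """Return u = y + n, with white Gaussian n scaled to the requested SNR."""
    signal_power = sum(s * s for s in y) / len(y)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in y]


rng = random.Random(1)
fs, f0 = 48_000.0, 1_000.0        # hypothetical sample rate and tone frequency
y = [math.sin(2 * math.pi * f0 * i / fs) for i in range(48_000)]
u = add_white_noise(y, snr_db=10.0, rng=rng)

noise = [ui - yi for ui, yi in zip(u, y)]
measured = 10 * math.log10(sum(s * s for s in y) / sum(n * n for n in noise))
print(f"measured SNR: {measured:.2f} dB")
```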

3.1.1

Baseline Receiver

The baseline receiver utilizes all the steps outlined in Section 2.3.2. In the end, a sequence of estimated information bits is generated. By comparing the estimated bit sequence and the original bit sequence, the ber could be calculated.

3.1.2

Machine Learning Receiver

For the performance of the ann receiver to be comparable to that of the baseline receiver, only the equalization process described in Section 2.3.2 is replaced. First, the frss receiver performs the pre-processing described in Section 2.3.2. The next step was the equalization, which was performed by an ann. Similar to equation (2.5), the signals were stacked, but the phase offset was disregarded in this implementation, yielding the following expression:

$$y_{k,n} = \begin{bmatrix} y_k(nT - (L-1)bT/(2a)) \\ y_k(nT - (L-3)bT/(2a)) \\ \vdots \\ y_k(nT + (L-3)bT/(2a)) \\ y_k(nT + (L-1)bT/(2a)) \end{bmatrix}. \tag{3.2}$$

$y_{k,n}$ for each symbol n was considered as the input to the ann. The parameters a and b are design parameters that had to be considered; they essentially determined how many samples the signal was represented by, i.e., the length of the vector $y_{k,n}$. The values selected for a and b are presented in Section 3.1.3 and yielded a vector of length 90. The input was then fed to the ann, which yielded a symbol estimate $\hat{z}(n)$ for each n. The estimate is fed to the log-likelihood ratio computation and Viterbi decoding as described in Section 2.3.2. In the end, a sequence of estimated information bits was generated. By comparing the estimated bit sequence and the original bit sequence, the ber could be calculated.

3.1.3

Choice of Parameters

Each rate had a different number of sub-bands; hence the signal dimensions vary between rates. Thus, in this thesis an ann is configured for one rate configuration. Out of the four available rates, the rate R = 2, which utilizes three sub-bands, was studied in this thesis. The choice of R = 2 was motivated by the fact that fewer sub-bands reduce the input dimension to the ann. R = 1 was not considered, as part of what makes the thesis unique compared to [37] is the study of equalization in a multi-sub-band configuration.

The choice of equalizer parameters a, b and Teq (Teq determines the size of L) for the ann are presented in Table 3.1.

Table 3.1: Configurable equalizer parameters.

Property    baseline       ann
a           undisclosed    4
b           undisclosed    1
Teq         undisclosed    24 ms

The choices of a, b, and $T_{eq}$ utilized by the baseline frss receiver are undisclosed but are comparable to the ones chosen for the ann. $T_{eq}$ was chosen by studying the simulated channel impulse responses, which had a maximum delay spread in the range of 20 ms, see Appendix D. Thus, 24 ms was chosen to leave some margin. A spacing of b = 1 was chosen as high signal fidelity was considered beneficial, i.e., the spacing between samples was minimal. The parameter a was chosen to be comparable to the baseline equalizer.

3.1.4

Bit Error Rate Definition

The ber refers to the channel-coded ber, i.e., the fraction of bits in a received packet that are in error after decoding. As mentioned, the ber could be calculated by comparing the estimated bit sequence and the original bit sequence. The ber curve as a function of snr is the performance metric utilized in this report. At least 10 errors per point on the curve are used to estimate the ber. In some channels, and at high snr, this could require an extensive number of simulations.
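The ber estimate and the minimum-error criterion can be sketched as follows. The packets are hypothetical toy data, and the function names are illustrative rather than taken from the simulation code.

```python
def bit_errors(sent, received):
    """Count positions where the decoded bits differ from the transmitted bits."""
    if len(sent) != len(received):
        raise ValueError("bit sequences must have equal length")
    return sum(s != r for s, r in zip(sent, received))


def estimate_ber(packets, min_errors=10):
    """Accumulate errors over packets; flag whether min_errors was reached."""
    errors = total = 0
    for sent, received in packets:
        errors += bit_errors(sent, received)
        total += len(sent)
    ber = errors / total if total else float("nan")
    return ber, errors >= min_errors


# Hypothetical packets: (transmitted bits, decoded bits).
packets = [
    ([0, 1, 1, 0, 1, 0, 0, 1], [0, 1, 0, 0, 1, 0, 0, 1]),   # one bit error
    ([1, 1, 0, 0, 1, 0, 1, 0], [1, 1, 0, 0, 1, 0, 1, 0]),   # error-free
]
ber, reliable = estimate_ber(packets)
print(ber, reliable)
```

In a simulation loop, packets would be generated until the flag becomes true, which is why low-ber points at high snr require many more simulated packets.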

The ber can also be affected by the reception, namely by false alarms or undetected packets. The receiver has a probability of giving a false alarm, i.e., a detection occurs in the absence of a signal. There is also a probability that a received packet is not detected. False alarms and undetected packets affect the modem performance and can thus be represented in the ber curve, but in this study perfect reception is assumed, i.e., false alarms and undetected packets are disregarded. False alarms are disregarded in simulations because they occur with an increased probability as the number of simulations is increased, which can skew ber calculations when a higher number of simulations is needed to estimate the ber. By the same reasoning, undetected packets are also disregarded.

3.2

Software Simulation Environment

The frss transmitter and receiver described in Section 2.3 came with a MATLAB reference implementation. In this section, the software and setup used to simulate the underwater channel and the ann framework will be described.

3.2.1

Machine Learning Software

MATLAB has a Deep Learning Toolbox [28], which was utilized to implement the dnn and rnn. The Deep Learning Toolbox contains many advanced features and provided the necessary tools to design and build the anns within this thesis. The toolbox also allows the user to speed up training with a graphics processing unit (GPU), which was available in the setup. An advantage of utilizing the toolbox was that the thesis could focus more on designing network structures and testing different models, rather than spending time on implementing basic functions. An alternative would have been to use the popular frameworks PyTorch or TensorFlow, which are available for Python, but the frss code was given in MATLAB, so MATLAB's toolbox was chosen.

3.2.2 Channel Model

As simple channel models based on assumptions regarding the uwa channel have proven to be unrealistic, as mentioned in Section 2.2, one must resort to more complex channel models. Each of the aspects mentioned in Section 2.1 needed to be taken into consideration. A common choice for modeling multipath propagation and attenuation in underwater conditions is Bellhop [20]. Bellhop is a model based on ray tracing and can utilize information about ssps and bathymetry, along with other environmental properties, to generate a channel impulse response. Bellhop is relatively complicated as it requires a lot of input, but it has become popular as it is considered somewhat realistic. There are plenty of models [21, 35, 38] that utilize Bellhop as a baseline for generating the channel impulse response. [35] proposes a channel model and studies the impact of noise and the Doppler effect to generate a complete channel model. Another study [21] proposes a statistical model based on large-scale and small-scale effects of the movements and environmental conditions. Doppler effects are also studied in detail. A third article [38] utilizes Bellhop to simulate the physical layer in a NET-layer simulator.

Out of the possible channel models, [21] was chosen for this thesis. All the important aspects mentioned in Section 2.1 are modeled in [21], which makes the model very complete, and it is referenced in recent uwa literature [36, 37]. An additional benefit is that the model has an openly available implementation in MATLAB [22]. The model described in [21] contains a lot of hyperparameters, and the next section is devoted to motivating the input to the simulation.

3.3 Channel Simulation Configuration

The channel simulator, based on [21], yielded an estimated time-variant channel impulse response h(τ; t) based on the configured parameters. The channel simulator took the environmental parameters and passed them to Bellhop. The Bellhop simulation was deterministic; a specific configuration always yielded the same signal arrivals. Based on the signal arrivals, the time-variant channel impulse response was generated from the small-scale variations (Table 3.4) and Doppler variations (Table 3.6).
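To make the role of h(τ; t) concrete, the sketch below applies a sampled time-variant impulse response to a transmitted signal in Python. It is only an illustration under stated assumptions: the array layout `h[n, k]` (tap `k` at output sample `n`), the function name, and the discrete-time form y[n] = Σ_k h[n, k] x[n − k] are choices made for this example and are not taken from the simulator in [21] or its MATLAB implementation [22].

```python
import numpy as np


def apply_time_varying_channel(x, h):
    """Return y[n] = sum_k h[n, k] * x[n - k], with x[m] = 0 for m < 0.

    x : transmitted samples, shape (N,)
    h : assumed layout, shape (N, K); row n holds the K delay taps valid at time n
    """
    x = np.asarray(x, dtype=complex)
    h = np.asarray(h, dtype=complex)
    n_samples, n_taps = h.shape
    if x.size != n_samples:
        raise ValueError("h needs one row of taps per input sample")
    y = np.zeros(n_samples, dtype=complex)
    for k in range(n_taps):
        if k >= n_samples:
            break  # delays longer than the signal contribute nothing
        # Delay the input by k samples, then weight it by the k-th tap at each time.
        delayed = np.concatenate([np.zeros(k, dtype=complex), x[: n_samples - k]])
        y += h[:, k] * delayed
    return y
```

When every row of `h` is identical, the result reduces to an ordinary convolution truncated to the input length, which is a convenient sanity check. Additive noise is deliberately left out of this sketch.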


In this section, the configuration utilized to simulate h(τ; t) will be described. Bathymetry, ssps, and bottom sediment, described in Section 3.3.1 to Section 3.3.3, were input to the Bellhop simulator, while the general parameters and channel variations were input to the statistical model from [21].

3.3.1 Bathymetry

As mentioned in Section 1.4, two depths were considered. For each depth, three general bathymetry profiles were considered, yielding six possible bathymetries. The purpose was to capture general scenarios. The three general scenarios selected were a flat ocean bottom, a slope, and an obstacle. Motivation and plots of the bathymetries can be found in Appendix A.

3.3.2 Sound Speed Profiles

Nine different ssps were utilized in simulations, four for the shallow channel and five for the deeper channel. The ssps were chosen to correspond to the varying seasons. The ssps were taken from the Swedish Meteorological and Hydrological Institute's weather buoys in Skagerrak and the Baltic Sea [27]. Data from the buoy located at Släggö was chosen due to its proximity to the coast, where the effects of freshwater rivers can be noticed. Data from the buoy REF M1V1, located between Öland and Småland, was chosen for the shallow scenario, as it was one of the few buoys with sound speed data in shallower waters. For each buoy, a set of the available dates with data was selected; the chosen dates are found in Appendix B. Data points were written down on paper and then manually entered into MATLAB, creating an approximation of the plots provided by the Swedish Meteorological and Hydrological Institute. ssps were picked at specific times, attempting to capture the varying seasons and their effects on the sea. Plots of the ssps and descriptions of their behavior are found in Appendix B.
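Turning a handful of manually read data points into a sampled profile can be sketched as below. This Python example is only an assumption about how such an approximation might be built: the function name, the linear interpolation, the 1 m depth grid, and any sample values passed to it are illustrative and are not the thesis data or its MATLAB code.

```python
import numpy as np


def build_ssp(depths_m, speeds_mps, step_m=1.0):
    """Interpolate a few (depth, sound speed) points onto a regular depth grid.

    Linear interpolation and the grid spacing are assumptions for illustration.
    """
    depths = np.asarray(depths_m, dtype=float)
    speeds = np.asarray(speeds_mps, dtype=float)
    if depths.shape != speeds.shape or depths.size < 2:
        raise ValueError("need at least two matching (depth, speed) points")
    # Sort by depth so that points can be entered in any order.
    order = np.argsort(depths)
    depths, speeds = depths[order], speeds[order]
    if np.any(np.diff(depths) <= 0):
        raise ValueError("depths must be distinct")
    grid = np.arange(depths[0], depths[-1] + step_m / 2, step_m)
    return grid, np.interp(grid, depths, speeds)
```

For example, `build_ssp([0, 10, 20], [1480, 1470, 1475])` (made-up values) returns 21 depth samples from 0 m to 20 m with the sound speed interpolated between the three entered points.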

3.3.3 Bottom Sediment Types

The Bellhop simulator allowed the user to specify properties of the sea bottom sediment. Altering this parameter yielded significant changes in the simulation, so two setups were considered. The properties of the bottom sediment determined how much reflection and reverberation the bottom generated. Two kinds of bottom sediment with very different properties were considered. The sediment types and properties are presented below in Table 3.2; the data was obtained from seafloor measurements [9].

Table 3.2: Bottom sediment properties for two different kinds of bottom.

Sediment type    Sediment sound speed    Wet bulk density
Silty sand       1657.49 m/s             1.91 g/cm³
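As a rough illustration of why sound speed and density govern how strongly the bottom reflects, the Python sketch below computes the normal-incidence pressure reflection coefficient R = (Z₂ − Z₁)/(Z₂ + Z₁), where Z = ρc is the characteristic acoustic impedance. This is a textbook simplification and not what Bellhop computes internally: it ignores grazing angle and sediment attenuation, and the water properties (1000 kg/m³, 1500 m/s) are assumed round numbers, not values from the thesis.

```python
def normal_incidence_reflection(rho_sediment_g_cm3, c_sediment_mps,
                                rho_water_kg_m3=1000.0, c_water_mps=1500.0):
    """Pressure reflection coefficient at normal incidence, water over sediment.

    The default water density and sound speed are assumed values for illustration.
    """
    rho_sediment_kg_m3 = rho_sediment_g_cm3 * 1000.0  # g/cm^3 -> kg/m^3
    z_water = rho_water_kg_m3 * c_water_mps            # impedance of the water
    z_sediment = rho_sediment_kg_m3 * c_sediment_mps   # impedance of the sediment
    return (z_sediment - z_water) / (z_sediment + z_water)


# Silty sand row of Table 3.2: 1657.49 m/s and 1.91 g/cm^3.
r_silty_sand = normal_incidence_reflection(1.91, 1657.49)
```

Under these assumptions the silty sand bottom gives a coefficient of about 0.36. A sediment whose impedance is closer to that of water would reflect less, which is consistent with the observation that changing the sediment type changes the simulated channel noticeably.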
