IN
DEGREE PROJECT MECHANICAL ENGINEERING, SECOND CYCLE, 30 CREDITS
STOCKHOLM SWEDEN 2020 ,
Inference of transfer functions and prediction of vessel responses
using machine learning.
JOHN CONALL HARRINGTON
KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ENGINEERING SCIENCES
Abstract
Knowledge of vessel responses to waves is of the utmost importance for vessel opera
tions. The responses affect which routes a vessel can take, what cargo it can carry, the conditions it’s crew will experience and much more. This can pose a problem for perfor
mance optimisation companies such as GreenSteam, partners in this project, for whom the vessel transfer functions are generally not available.
This project aims to use bayesian machine learning methods to infer transfer functions and predict vessel responses. Publically available directional wave spectra are combined with highfrequency motion measurements from a vessel to train a model to create the transfer functions, which can be integrated to get the motions. If successful, this would be a relatively inexpensive method for computing transfer functions on any vessel for which the required measurements are available. Though not many vessels measure this data currently, the industry is moving towards more data collection, so that number is likely to rise.
The results identify a number of issues in the available data which must be overcome to produce usable results from these methods. Though results are not optimal, they show a promising start and a route is proposed for future research in this area.
Summary in Swedish / Sammanfattning på Svenska
Kunskap om fartygens gensvar i vågor är av yttersta vikt för alla fartygsoperationer. Gensvaren påverkar vilka rutter ett fartyg kan ta, vilken last det kan bära, vilka förhållanden besättnin
gen kommer att utsättas för och mycket mer. Vanligen är transferfunktionerna för far
tygsrörelser inte tillgängliga vilket utgör ett problem vid prestandaoptimering vilket företag som GreenSteam, partner i detta projekt, arbetar med.
Projektet syftar till att använda Bayesianska maskininlärningsmetoder för att bestämma överföringsfunktionerna och förutsäga fartygsgensvar. Offentligt tillgängliga riktningsvågspek
tra kombineras med högfrekventa rörelsemätningar från fartyg för att träna en modell att skapa överföringsfunktioner. Funktioner som sen kan användas för att bestämma rörelserna under andra vågförhållanden. Om det lyckas skulle det vara en relativt bil
lig metod för att beräkna överföringsfunktioner till vilket fartyg som helst där erforderlig mätdata finns tillgänglig. Även om de flesta fartyg för närvarande inte mäter rörelser, går branschen mot ökad datainsamling, så antalet fartyg med rörelseinformation kommer sannolikt att öka.
Resultaten identifierar ett antal krav på tillgänglig data som måste övervinnas för att mask
ininlärningsmetoderna ska ge användbara resultat. Studiens resultat är inte optimala men
är en lovande start och en väg framåt föreslås för vidare forskning inom detta område.
Acknowledgements
John Conall Harrington
Ulrik Dam Nielsen, Assistant Professor, DTU
Thank you to Ulrik for his excellent supervision and constructive advice and criticism throughout the project.
Karl Garme, Assistant Professor, KTH
Thank you to Karl for providing information and resources on small craft transfer functions, as well as the opportunity to present the thesis project at KTH’s annual Naval Architecture Student Conference.
Daniel Jacobsen, CSO, GreenSteam
Thank you to Daniel for his machine learning expertise, and for allowing me to collect data
for this project through GreenSteam.
Contents
Abstract . . . . ii
Summary in Swedish / Sammanfattning på Svenska . . . . ii
Acknowledgements . . . . iii
1 Introduction 1 1.1 Machine Learning projects . . . . 1
1.2 Previous Work . . . . 2
1.3 Motivation . . . . 2
1.4 Objectives . . . . 3
2 Data 5 2.1 Waves . . . . 5
2.2 Motions . . . . 9
2.3 Conditions . . . 12
2.4 Using the Data . . . 12
3 Methodology 15 3.1 Modelling Procedure . . . 15
3.2 Model Design . . . 15
3.3 Inference . . . 23
3.4 Model Criticism . . . 23
3.5 Modifications for small craft . . . 25
4 Results and Discussion 27 4.1 Sensitivity Analysis . . . 27
4.2 Pure Neural Network . . . 28
4.3 Correction Coefficient Neural Network . . . 33
5 Other Methods 37 5.1 Waves as a latent variable . . . 37
5.2 Pooled models using known transfer functions . . . 38
5.3 Putting it all together . . . 39
6 Conclusions 41 Bibliography 42 A All Result Plots 45 A.1 Pure NN . . . 45
A.2 Closed Form NN . . . 94
B Machine Learning Resources 108
1 Introduction
The modern world is in the midst of a data revolution. We are collecting more data ev
ery day from every aspect of our society and our world, and the maritime industry is no exception [1].
As the amount data we have on ships, shipping and related fields grows, it becomes increasingly hard to efficiently process and draw insight from. Our standard models can be improved by more data, but they can be limited by their simplicity.
Knowledge of response is of the utmost importance for vessel operations. The vessel responses affect which routes it can take, what cargo it can carry, the conditions it’s crew will experience and much more. Currently, vessel responses can only be predicted with knowledge of the corresponding transfer functions, which convert given wave spectra into response spectra for that vessel. These transfer functions must be measured from model tests or calculated using panel code or strip theory methods. These methods of course require access to the ship’s hull lines, and are, with the exception of strip theory, often expensive and timeconsuming; they also have the disadvantage of being limited in accu
racy to the speed and loading conditions for which they were calculated. For performance optimisation companies, or any other 3rd party that is not the ship builder or owner, these approaches are not viable.
The aim of this project is to use machine learning methods to learn the vessel transfer functions from measured motions and estimated wave spectra. This has a much lower cost than model testing or panel codes, and it’s accuracy is theoretically limited only by the availability of data; it will predict transfer functions reasonably accurately, limited by the quality of the data and measurement noise, for a vessel in any condition that has been seen in the data, and can be updated easily to benefit from additional data collection.
As the transfer function is the missing piece of the puzzle, it is not possible to learn transfer functions directly. Rather, the transfer function will be the internal result of a model trained to predict the vessel responses. The model will then take the relevant wave spectra, calculate the predicted response spectra and compare this with the observed response spectra.
This project has been completed in partnership with GreenSteam, a marine data intelli
gence company specializing in improving vessel efficiency through Machine Learning.
1.1 Machine Learning projects
Machine learning (ML) projects are best done as a team, with both domain and ML experts.
This paper concentrates on those aspects of the machine learning problems that should be considered by the domain expert (in this case the naval architect), and avoids the heavier ML aspects. However, an understanding of the underlying concepts is extremely beneficial to the domain expert when working on a machine learning project, as it enables them to understand the limitations and capabilities of the models, and to follow discussions and decisions made by other team members.
For this reason, a basic overview of the relevant machine learning concepts is given when they are relevant, and should suffice to follow the decisions made in this paper. Mathe
matical details of algorithms used for optimisation/inference are not given. It should also
be noted that said algorithm choice may not be entirely optimal; the purpose of this project
is not so much to get the best/most optimal result but to explore a simple approach to the problem, an initial foray into the field to improve our understanding of the available data and inform future research.
Readers looking to learn more about the underlying machine learning techniques should refer to appendix B.
1.2 Previous Work
Estimation of vessel responses has been studied extensively. The majority of studies attempting to estimate vessel response, in either the time or frequency domain, do so in one of 4 ways: Through closedform formulae [2], which cannot capture the relationship accurately as they simply do not take enough information as input, or through model testing, strip theory [3] or panel codes [4], which all require access to the ship’s lines.
In the closely related field of sea state estimation, machine learning has already made an appearance. Mak [5] made timedomain predictions of relative wave direction using shortterm memory convolutional neural networks, and Nielsen [6] has used bayesian optimisation, a similar approach to machine learning, to estimate directional wave spectra from moving vessels (a topic which will be touched upon in chapter 5).
1.3 Motivation
Why learn transfer functions?
Knowledge of vessel transfer functions is of great benefit to maritime operations. In com
bination with knowledge of the sea state in which the vessel will operate, it allows us to calculate the responses of the vessel, whether those be motions, accelerations or bending moments. The results of these calculations are of the utmost importance for the captain or operator when making decisions about routing and safety.
For owneroperators, the lines plans of a vessel are generally available, so strip theory or panel code methods can be employed. This study then, focuses on what is done when these are not necessarily available.
Until recently, estimating vessel responses using wave spectra as an input (and therefore assuming the wave spectra to be truth) would have been infeasible due to the level of uncertainty surrounding wave conditions. In recent years, the development of 3rd gen
eration spectral wave models has led to good information on wave spectra being freely available through datasets like the ERA5 [7], used in this study. This, combined with the rapidly increasing availability of highfrequency data measurements from vessels, means work such as this is now feasible.
Why machine learning?
A simple maritime example of the power of machine learning is the learning of speed
power curves as a baseline for measuring hull fouling. At the very least, every vessel will have a baseline speedpower curve from sea trial data. Some vessels may have speed
powerdrafttrim curves, covering a far wider set of conditions but still limited. What if the vessel is going into strong winds or waves? What if they are accelerating? The standard approach to this problem is to filter these data points and compare the baseline mea
surements to similar conditions, but in some cases this can cause loss 90% of available data.
A machine learning algorithm can take all data from, for example, the first 3 months of
a vessels lifetime, and use it to create a baseline that accounts for the effects of every
variable. This means that any data point measured can be compared directly to that
baseline, no matter the conditions, and the difference seen will be the effect of fouling.
GreenSteam have successfully developed and deployed models for this purpose, as well as for the more difficult task of optimising vessel trim [8].
The transfer function is controlled by the interaction between the water and the hull, so for any one vessel will change with draft, trim, speed and water depth. Machine learning allows all of these, and any other factors expected to have an influence, to be taken into account together, using all available data.
1.4 Objectives
The primary objective of this work is to make a first attempt at the inference of transfer
functions with machine learning. The quality of results is not expected to necessarily be
usefull, but regardless the study is valuable in it’s ability to expose the issues present in
the available data, and to give direction to future research.
2 Data
The vessel used for this study is a RoPAX vessel with particulars as seen in Table 2.1.
During the period of data collection the vessel was operating a continuous back and forth route across the North Sea.
L
BP185m
B 30m
T 7m
ST W
op19kn
∇ 24000
Table 2.1: Vessel Particulars
This study uses publically available data on ocean waves (section 2.1) along with data collected aboard the vessel by GreenSteam on the vessels motions (section 2.2) and general operating condition (section 2.3).
2.1 Waves
Ocean waves are irregular, and it is nearimpossible to accurately predict or calculate wave motions more than a few minutes ahead of current measurements. Instead, waves are modelled using a stochastic approach.
The stochastic approach to wave modelling relies on the phenomenon of superposition;
that when two or more waves cross each other at any point the resulting displacement will
be the sum of the displacements of the individual wave displacements at that point. If this
theory is applied to many regular waves of different directions, frequencies and phases
all interacting with each other, the result is an irregular seaway. Any irregular seaway can
be described then, by some combination of regular waves, as shown in figure 2.1.
Figure 2.1: An irregular seaway can be described by a combination of regular waves of different amplitude, phase, frequency and direction [9]
Descriptions of seaways in this manner are given as directional wave spectra, such as
that shown in figure 2.2, using the oceanographic convention where the direction is the
direction in which the waves are travelling, not where they are coming from, as is used
in meteorology. These represent the amplitude (or the energy density, in the case of
spectral densities) of waves in the seaway at every direction and frequency. Phase is
assumed to be random, such that if one were to assign a random phase to every direction
and frequency and create a sinusoid at the given amplitude for each, the result would
be probabilistically equal to the true seaway. Thus, the directional wave spectra is a stochastic description of the true seaway.
Figure 2.2: Example of directional wave spectra
Directional wave spectra can be hindcast or forecast through the use of 3rdgeneration models, which integrate the full energy balance equation with no prior restrictions on the spectral shape [7]. The wave spectra available in the ERA5 dataset from the European Centre for Mediumrange Weather Forecasts (ECMWF) are the result of such a model, and will be considered as true for this study.
The ERA5 wave spectra can be downloaded through the Climate Data Store (CDS) API.
The dataset contains an hourly .36/.36 degree lat/lon grid, giving directional wave spectra at 24 directions (θ) and 36 frequencies (ω) from 0 to 0.5Hz.
Once downloaded, relevant points are extracted by selecting the nearest neighbour in time and space for each point along a vessel’s route. Interpolation was considered, but it was decided that interpolation between wave spectra was unlikely to give accurate results as the different wave frequencies propagate at different rates.
To avoid the complications inherent in the onetothree relationship between encounter and absolute frequencies for a vessel with a speed of advance [10], the decision was made to perform all calculations in encounter frequency (the ”one” side of the oneto
three). The wave spectra (S) is therefore converted to encounter spectra (S
e), using the heading (β) and speed through water (U ) of the vessel to adjust the frequencies to account for the doppler shift due to the vessels advance speed.
ω
e= ω − ω
2U
g cos(θ − β) (2.1)
To maintain conservation of energy at the new frequencies, the spectral densities must
also be adjusted. Differentiating the above relationship gives the relationship between the
spacing of the frequencies in the absolute and encounter domains:
δω δω
e= g
g − 2ωU cos(θ − β) (2.2)
Figure 2.3: Conservation of energy in encounter wave spectra [11]
To maintain conservation of energy when converting to encounter frequency, the energy in the shifted set of frequencies must be the same as that in the original, as shown in Figure 2.3:
S(ω
e) = S(ω) δω δω
e= S(ω) g
g − 2ωU cos(θ − β)
(2.3)
Figure 2.4: An example of absolute (left) and encounter (right) wave spectra, with direc
tions relative to the vessel so that 0 degrees is following waves
To maintain the same set of frequencies at all points, the results are interpolated onto a linearly spaced set of 50 frequencies from 0.050.5Hz. To maintain a uniform set of directions for all points, the new wave directions (θ − β), which are different at each point due to changes in heading, are interpolated onto the original set of 24 directions. This produces encounter wave spectra like those seen in figure 2.4.
2.2 Motions
A floating body moves in 6 degrees of freedom, shown in figure 2.5.
Figure 2.5: Vessel degrees of freedom
Motion data is collected inhouse by GreenSteam specifically for this project.
The ship motions are measured using a 6 d.o.f. accelerometer installed by simply bolting it into a cabinet on the vessel’s bridge, 3m aft and 13m above midships. To ensure good resolution of the expected motions in the range 0.051Hz, data is recorded at 10Hz. This data can be accessed over a secure VPN connection to a GreenSteam server running onboard the vessel.
Due to bandwidth limitations, the raw data cannot be downloaded at 10Hz, so it is trans
formed into response spectra onboard the ship then downloaded in that form.
The ship response spectra are by definition nonstationary as conditions are never truly stationary in a real seaway, so accurate estimation of the response spectrum is nontrivial.
There are two primary methods for response spectra estimation: Parametric and non
parametric. The parametric approach assumes that the process follows some model, and can therefore produce accurate spectral estimates from much shorter time series, while the nonparametric make no assumptions so removes any bias in the model, but require much longer time series for an accurate estimate. Some testing showed that the spectral peaks from 1 hour of data are well reproduced by the nonparametric Welch method (multiple overlapping fast fourier transforms) down to 10 minute periods, so this method is used for this project. Further reduction in the length of the time series results in some spectra becoming unreasonable.
The motions measured aboard the vessel are accelerations in the translational degrees of
freedom, velocities in the rotational. These must be integrated to get displacements. This
integration could be performed in either the time or frequency domain, but timedomain integration would have to be run aboard the ship and is therefore a more difficult option for this project. Frequency domain integration however can be less accurate at low fre
quencies, so may not be usable. To assess this, a comparison is made between time and frequency domain integrated response for a small subset of times.
Time domain integration aboard the ship is performed by first normalising the timeseries motion data by subtracting it’s mean to prevent the integration drifting.
˙ x(t) =
Z
T0
(¨ x(t) − ¯¨x(t))dt (2.4)
x(t) = Z
T0
( ˙ x(t) − ¯˙x(t))dt (2.5)
This integration is performed using the trapezium rule, then the spectra are generated and saved, ready to be downloaded from the vessel.
Frequency domain integration is performed by taking a desired displacement x(t) and considering it’s Fourier transform X(f ):
x(t) = Z
X(ω)e
ωitdω
˙ x(t) =
Z
X(ω)e ˙
ωitdω
(2.6)
By definition, velocity is the rate of change of displacement:
˙
x(t) = d
dt [x(t)] (2.7)
Substituting the fourier transform of x(t):
˙
x(t) = d dt
Z
X(ω)e
ωitdω
= Z
X(ω) d
dt [e
ωit]dω
= Z
ωiX(ω)e
ωitdω
(2.8)
Comparing this to the fourier representation of ˙x(t):
Z
ωiX(ω)e
ωitdω ˙ x(t) = Z
X(ω)e ˙
ωitdωωiX(ω) = ˙ X(ω) (2.9)
What we receive from the vessel after calculation onboard is the velocity spectral density:
S
x˙(ω) = | ˙ X(ω)|
2(2.10)
Therefore our desired displacement spectral density will be found by:
S
x(ω) = |X(ω)|
2=
X(ω) ˙ ωi
2
= | ˙ X(ω)|
2ω
2= S
x˙ω
2(2.11)
To go from acceleration to displacement, we perform the same operation twice:
S
x= S
¨xω
4(2.12)
These frequency domain integral methods have one primary issue: As ω approaches 0, the S
xderived from either S
x˙or S
x¨will become indeterminate. This is typically referred to as ”1/f noise”, as for a sampling frequency f it typically becomes a problem at ω = 1/f . As the sampling frequency is 10Hz, a highpass filter is applied at 0.1Hz to remove the worst of this noise. This is a problem for wave responses, as the filter (shown in Figure 2.6) affects data in the range 00.2Hz, where the majority of vessel responses are present.
Figure 2.6: The lowpass filter applied to frequency domain integrated spectra, with verti
cal line at 0.1Hz
The results of these integrated spectra differ, primarily in the low frequencies. A compar
ison of the results for pitch is shown in figure 2.7, which also shows the relevant wave spectra at the same times in onedimensional form. From this comparison, we can see that while the frequency domain integration is clearly missing motions in the very low fre
quencies, the wave spectra is 0 at those frequencies anyway, meaning the model will have
no chance of predicting the true response. For this reason, the issues with the frequency
domain integration are considered to not have a large effect on the final results, and the
frequency domain integration method is used throughout.
Figure 2.7: Comparison of time and frequency domain integration methods
2.3 Conditions
Alongside the measured motions, this study uses data from the GreenSteam Dynamic Trim Optimiser system, which uses a number of onboard sensors installed by Green
Steam. These sensors download mean values for every 5 minute period to an internal GreenSteam system from which the data can be exported.
The data used for this study was the vessel location, speed, heading, draft and trim, as well as the water depth at that location.
Signal Units Measurement method
Lat/Lon degrees GPS
Speed through Water kn Estimated from GPS Speed over Ground and hind
cast ocean currents Heading degrees Onboard compass
Draft m Calculated from freeboard measured by onboard radars midships
Trim m Calculated from freeboard measured by onboard radars fore and aft
Water Depth m Echo sounder readings
Table 2.2: Data sources for nonresponse vessel data
2.4 Using the Data
When data has been collected from all three sources, it is compiled into a single dataset using the python library xarray, which allows for storing gridded data in many dimensions.
The data structure for N datapoints is shown in table 2.3.
Name Shape Dimensions
Time N
Direction 24xN Time
Frequency 50xN Time
Latitude 1xN Time
Longitude 1xN Time
Speed Through Water 1xN Time
Heading 1xN Time
Draft 1xN Time
Trim 1xN Time
Depth 1xN Time
Heave 50xN Frequency, Time
Surge 50xN Frequency, Time
Sway 50xN Frequency, Time
Pitch 50xN Frequency, Time
Roll 50xN Frequency, Time
Yaw 50xN Frequency, Time
Wave Spectrum 24x50xN Direction, Frequency, Time Table 2.3: Full dataset structure.
This structure keeps all data associated with a single timestamp together and allows easy
access by selecting that timestamp, facilitating easy traintest splitting and analysis, as
will be discussed in chapter 3.
3 Methodology
3.1 Modelling Procedure
Building, testing and using machine learning models should follow a defined scientific procedure. In this study, that procedure is what is known as Box’s Loop [12].
Figure 3.1: Box’s Loop modelling procedure
Box’s loop consists of three primary steps, shown in figure 3.1: Model design, inference and model criticism.
In the model design stage (section 3.2), relevant naval architecture and statistical mod
elling techniques are combined to design a model based on the internal physical structure believed to exist in the data. Once a model has been designed, inference (the learning pro
cess) is performed through whichever algorithm has been chosen. Finally, in the model criticism stage, the results of the model are compared to a subset of the data which was excluded from the inference stage, to assess the model performance. If not satisfied with model performance, the model design is modified accordingly.
For this study the focus will be on model design and criticism, as these are the stages where domain expertise is most useful. Inference will be covered only very briefly.
In figure 3.1, the data is seen to be split into two parts, the train data and the test data.
Train data is given to the model at the inference stage, and is what the model uses to learn the distribution of parameters. The test data is totally excluded from this learning process, so that it can be used to make unbiased criticism of the model in the final stage.
As this study uses time series data, the data is split randomly by timestamp, with 90% of data used for training and 10% used for test.
3.2 Model Design
3.2.1 Transfer functions for ship motion
Vessel responses in waves are modelled through the theory of linear superposition. This
theory states, in essence, that the ship acts as a linear filter for the waves, such that any
regular wave encountering the vessel at a given frequency will produce a regular response at that same frequency.
Figure 3.2: The ship as a linear filter [6]
This simple theory is extended to the real, irregular seaway by superposition. Section 2.1 describes how regular waves superpose into the irregular seaway; thus the vessel re
sponses to those regular waves also superpose. With the theory of linear superposition in hand, we only need to model the vessel response to regular waves and that knowledge can be easily extended to the real seaway.
The equation of motion for the response to regular, and therefore all, waves derives from Newton’s 2
ndlaw. It analagous to that of a mechanical oscillator and has the same com
ponents of mass M
ij; added mass a
ij; damping b
ij, in this case specifically hydrodynamic damping; stiffness c
ij, the hydrostatic restoring force; and an excitation force F provided by the waves.
X
6 j=1[(M
ij+ a
ij)¨ x
j(t) + b
ij· ˙x
j(t) + c
ij· x
j(t)] = F (t) (3.1)
The excitation from a regular wave with amplitude A and encounter frequency ω
eis mod
elled simply as:
F (t) = X
i· Ae
−iωet(3.2)
Where X
iis the complex exciting force per wave amplitude.
As the vessel will, according to the linear filter analogy, oscillate with the same frequency and phase as the wave, it follows that some function Z
jmust relate the amplitude of one to the other. This is the transfer function:
x
j(t) = ϕ
je
−iωet= Z
j· Ae
−iωet(3.3) Z
j= ϕ
jA (3.4)
Substituting x
jand F (t) back into equation (3.1):
X
6 j=1ϕ
j−(M
ij+ a
ij)ω
e2− ib
ijω
e+ c
ij= A · X
i(3.5)
By standard matrixinversion techniques we are able to arrive at the full equation for the transfer function:
Z
j= ϕ
jA =
X
6 i=1X
i−(M
ij+ a
ij)ω
e2− ib
ijω
e+ c
ij −1(3.6)
The response spectrum R of a body moving in a given motion ij (see figure 2.5) in a sea
way can therefore be described by multiplying the wave spectrum of that seaway with the complex transfer function (Z
i· Z
j∗). As the vessel is affected by waves from all directions, this operation must be integrated about the vessel.
R
ij(ω
e, S, ¯ s) = Z
π−π
Z
i(ω
e, θ, ¯ s) · Z
j∗(ω
e, θ, ¯ s) · S(ω
e, θ)dθ, i, j = {1...6} (3.7)
To develop these equations into a usable machine learning model, bayesian methods are employed.
3.2.2 Bayesian learning methods
Bayesian methods [13] have two primary advantages over normal statistics/ML methods:
They take into account prior information, and they produce distributions rather than pre
dictions.
Distributions have an advantage over predictions because they are, generally speaking, a better fit to reality. It is incredibly rare to find a dataset without some form of noise, whether it be from unknown variables or poor sensors.
Bayesian methods use a prior distribution set by us essentially the probability we think there is of that event. This enables us to teach the model what we expect from it, even before it sees any data.
In more detail, Bayes theorem states:
P (A |B) = P (B |A) · P (A)
P (B) (3.8)
Where P (A |B) is the posterior, P (B|A) is the likelihood and P (A) is the prior. A visualisa
tion is shown in Figure 3.3, though it should be noted that the data informs the likelihood.
Figure 3.3: Visualisation of Bayesian Inference [14]
When we work with bayesian methods, we define stochastic variables which have priors, and the model calculates their likelihood and posterior. These can then be combined in any way we choose (addition, mutliplication, etc...) to produce the desired model. Our aim being to model the expected physical structure of the data, the ideal method involves developing prior estimates for each of the transfer function’s components X
i, M
ij, a
ij, b
ijand c
ijso that we can model them as stochastic variables. Z
jwould then be a determin
istic product of these stochastic variables, meaning that it is purely the result of combining those variables and has no uncertainty of it’s own.
R
ij, being the observed variable, must be stochastic as there is always some noise in any measurement
1. If measurement sources are good the noise on R
ijwill be small. In this study it is modelled as a normal distribution, with mean of course at the point defined by the physical equations, and standard deviation of 0.01.
For M
ijand c
ij, developing priors is reasonably simple as these factors are not dependent on the frequency of oscillation and can be estimated from some basic knowledge of vessel dimensions and shape coefficients. For the hydrodynamic components X
i, a
ijand b
ijhowever, this is more complicated. The physical components of exciting force X
i, added mass a
ijand damping b
ijall depend on the vessel shape and are related to one another. A combination of their similar dependencies and their interaction within the transfer function makes it very difficult to learn these functions accurately, as they are able to compensate one another. This also leads to numerical problems during inference, as if added mass or damping become very small the exciting force nears infinity.
Generally, problems of this sort can be solved in Bayesian learning by putting strong priors on the difficult components that help the model keep them in the expected range and avoid numerical problems (though one has to be careful that these priors represent a true physical belief rather than what one hopes to see [15]). However without an idea of the vessels lines, it has not been possible to find in literature any method of estimating
1
Even if we could find an entirely noisefree measurement, inference on deterministic observed variables is
very difficult and no good algorithm for it has yet been found, so it is naturally not supported by any probabilistic
machine learning library.
added mass, damping or exciting force directly that could simply be inserted into the above transfer function to give some (poor) estimate of a transfer function, meaning the formulation of priors for them is very difficult. It has also not been possible to find any estimates of the shape of these functions to use to limit the work required to optimise for them.
For these reasons, the first principles method cannot be used without knowledge of the ships lines. This is discussed further in section 5.1.
3.2.3 Simplified transfer functions
As modelling the components of the transfer function proved difficult, the problem was simplified, reducing the transfer functions to a learned ”black box” function. As the under
lying libraries used to perform the inference do not support imaginary numbers, only the real part of the transfer function can be estimated, so coupling is ignored. This reduces the equation for the response (in the model, the mean of the observed stochastic variable R) to:
R
i(ω
e, θ, ¯ s) = Z
π−π
|Z
i(ω
e, θ, ¯ s) |
2· S(ω
e, θ)dθ, i = {1...6} (3.9) Where Z
i(ω
e, θ, ¯ s) is the learned transfer function. This transfer function will be estimated using neural networks.
3.2.4 Neural networks for transfer functions
With the transfer function as a single learned parameter matrix, we will need to give the algorithm more freedom to compute it. The model will be required to take the vessel conditions and produce estimates of the transfer function with no internal guides as to it’s structure. For this we will use a neural network. A brief introduction to neural networks is included, but more can indepth information be found in [16].
Introduction
The neural network is, at it’s simplest, a series of matrix operations. A single layer of a neural network with input x and output y can be explained by a simple system of equations:
y
′= ¯ w · ¯x + ¯b (3.10)
y = y
′e
αy′1 + e
αy′(3.11)
The first of these likely looks familiar, as it is the equation for a straight line. The only difference is that every variable is a vector or matrix, examples of which can be seen in figure 3.4. We refer to the parameters w and b as weight and bias, and these are what the model will learn in order to make predictions. The second equation is known as an activation function, in this case the swish activation function [17] which is used in this study with α = 0.1.
Neural networks contain this same operation many times over, with different weights, dif
ferent biases and potentially different activation functions. Figure 3.4 shows a NN with
two layers, represented by it’s system of equations, a visualisation of the matrices and a
neural diagram. The first layer takes a dataset with 4 columns (features) and transforms it
into 3 “features” Z, called nodes. The second layer transforms this into the single desired
output Y .
Figure 3.4: Neural Network explained in 3 ways. [18]
As this study uses bayesian methods, weights, biases and outputs shown in figure 3.4 are distributions, allowing propagation of uncertainties right through the network.
Design
Returning to the problem at hand, the above information can be used to begin to formulate a neural network designed to predict the transfer function. The input and output layers of the network are defined by the data they represent:
The input to the neural net will be the matrix of vessel conditions. As there are 4 columns in this matrix (with heading excluded as the wave spectra are already rotated to be relative to heading), the input layer weights must have shape (4, X), where X is the number of nodes desired in the next layer.
As the transfer function Z
imust be multiplied by the wave spectra S, the shape of the neural network output must match that of the wave spectra. This means the final output of the network should have shape (24, 50). However, this can be further simplified using the physical concept that the transfer functions are symmetrical about the ship’s centerline.
This reduces the number of directions by half, leaving an output shape of (12, 50). In order to achieve this output, the weights for the final layer must have shape (X, 12, 50), and the biases shape (12, 50).
The number of hidden nodes X, and the number of layers of those nodes, are difficult to
decide without experience in neural network design, so are determined by testing multiple
configurations and comparing the results in the model criticism stage (section 3.4).
Figure 3.5: Full model diagram with small neural network. The linear filter box represents equation (3.9)
For all layers, each weight and bias is modelled as a normal distribution with mean at 0 and a standard deviation of 1.
As an example, figure 3.5 shows the full model flow for a smaller example neural network, with 2 hidden layers of 5 neurons each. Every straight, whitetipped arrow in that diagram represents the operations in equations (3.10) and (3.11).
The results for models of this type can be seen in section 4.2.
3.2.5 Neural networks for correction
The pure neural nets provide a very general solution, but we can further simplify the prob
lem and potentially improve our results by instead using the neural net to learn a correction coefficient to be applied to a known transfer function. In this study closed form expressions for the transfer functions will be used, but the same could be done with other, potentially more accurate, starting points such as the results of strip theory calculations.
For this case the problem is modelled as above, but the learned transfer function Z
iis replaced with equation (3.12).
Z
i(θ, ¯ s) = CF
i(θ, ¯ s)i · (1 + Ω
i(θ, ¯ s)) (3.12)
Where Ω(θ, s) is the result of the neural network and CF (θ, s) the closed form transfer
function, which will be described in section 3.2.6. The only change to the design of the
neural network is the activation function of the final layer. Rather than using the swish
function, a tanh function is used to constrain the output between 1 and 1.
Figure 3.6: Correction coefficient model diagram
3.2.6 Closed form transfer functions
In this study, the closed form transfer functions developed in [2] are used as a starting point from which the real transfer functions can be learned, in an attempt to reduce the work of the model.
Though [2] provides equations for 3 motions, for simplicity only heave and pitch are con
sidered here. These transfer functions are calculated using the vessel dimensions, speed and heading, and evaluated at the same 12 directions and 50 frequencies present in the wave spectral data.
Though strip theory transfer functions are not available for the primary vessel of this study, a comparison is made between the strip theory and closed form transfer functions of another, slightly large vessel. This provides some intuition as to the quality of the closed form transfer functions.
Figure 3.7: Percentage error in closed form transfer function for heave
Figure 3.7 shows the percentage error for the closed form transfer functions for heave at
every direction and frequency. It can be seen that the closed form functions are good at low frequencies in all directions, which might be expected as heave transfer functions must be 1 when ω = 0, but become less accurate at higher frequencies. There are particularly problems in beam seas (around 90 degrees), when the transfer functions actually achieve 3700% error (the graph scale has been truncated to 200% to maintain readability).
For pitch, shown in figure 3.8, the error takes a similar shape but has far fewer peaks above the 200% mark. The transfer function at 90 degrees however seems to be close to 100% error at every point.
Figure 3.8: Percentage error in closed form transfer function for pitch
3.3 Inference
Inference is performed using AutoDifferential Variational Inference (ADVI) [19] in PyMC3 [20], with data in minibatches to improve performance and reduce memory usage [21].
For readers looking to learn more about how this step works or how to actually use these methods, a list of useful resources is given in appendix B.
3.4 Model Criticism
The model designs presented in section 3.2 were developed through the iterative process of Box’s loop. This means that throughout the development process inference was run and the resulting models were criticised and refined.
In many machine learning courses and applications, model criticism is simplified to just
”testing”, and consists of simply comparing the model predictions for the test set to the truth, often through some error metric such as rootmeansquared error or meanabsolute error. The reasons this is not enough for this project are threefold:
• The use of bayesian methods means we are interested not just in the predicted value, but in the entire posterior distribution.
• The model is designed to have an informative internal structure, so the internal re
sults should be criticised as well as the predictions.
• The final result is 2dimensional, so a single error score cannot fully capture it’s accuracy.
For these reasons, the model criticism is a more complex process.
Model criticism for this project was primarily done visually, through a number of plots of the data, but a few scoring metrics were used as initial measures of basic performance.
The first metric used to criticise any model is available in realtime during the inference process, and is called the evidence lower bound, generally referred to as ELBO. This is the metric that the optimisation algorithm used is attempting to reduce, so plotting it in realtime during the inference process allows an immediate assessment of whether the model is learning something. If ELBO is going down, all is probably well, if ELBO never moves it is probably not worth letting the inference continue to run as it is not working, and there must be a problem with the model definition. Eventually, ELBO should stabilise, signifying that the model has converged.
Once a model has converged and the inference process is completed the train data used for inference is swapped out for the test data. The results of the model are assessed by sampling the posterior distributions of the model variables, in this case the transfer function Z and response R.
500 samples are taken for each datapoint in the test set, enough to give a good represen
tation of the distribution of the posterior for each point.
With samples taken, the results can be plotted for visual assessment. The plots used for model criticism are as follows:
• A plot of the mean response of each datapoint, alongside the true responses. This gives an immediate overview of whether the model is in the right area.
• A plot of each of the 12 directions of the transfer function for the point with the largest response, and the same for 5 random response points, with 90 and 95%
confidence intervals. These are used to visually assess the shape of the transfer functions based on the expected shapes.
• A plot of the predicted responses at the same 6 points as above, again with 90 and 95% confidence intervals. This is used to check whether the predictions fall within the confidence interval, and to check that the shape of the response spectra are reasonable.
• For the correction coefficient model, plots of the closed form transfer functions at the above points for comparison to the final learned transfer functions.
Alongside these plots, the rootmeansquare error (equation (3.13), using the mean of the posterior as the predicted value y
pred) is calculated at every frequency, giving an immedi
ate overview of which frequency areas the model is more or less accurate in. The same is done at every timestamp, which can then be compared to input data to highlight any particular conditions in which the model might be struggling to make good predictions.
RM SE = v u u t 1
N X
N k=1(y
true,k− y
pred,k)
2(3.13)
Examples of all of the above plots can be seen in Chapter 4.
3.5 Modifications for small craft
The introduction of small, high speed (planing) craft adds an interesting twist to the model design. Planing craft at slow speeds behave as displacement vessels, so the modelling techniques already discussed are fully capable of dealing with those speeds. However, above speeds of
VL= 2 the response magnitude starts to become nonlinear with wave height.
This introduces a problem for our linear filter theory, as it violates it’s basic assumption of linearity. Fortunately, this problem can be circumvented with a small change. The filter equation (3.9) is still valuable because it allows us to use a transfer function relating the amplitude of waves to that of the response, only that transfer function must now depend on the wave height to some power. This power will change with speed so, taking the output of the neural nets from above as C
j, the transfer function could be modelled as:
Z
j(ω, θ, ¯ s) = S(ω, θ)
k· C
j(ω, θ, ¯ s) (3.14) Where the power k is some function of the froude number. This function may need to be as flexible as another small neural net, but to start off with a simple linear function should probably be tested.
Unfortunately, not enough data was available to make these tests as part of this project.
4 Results and Discussion
4.1 Sensitivity Analysis
The first requirement was to determine a reasonable number of hidden nodes and hidden layers for the networks. This was done by running a set of different options for each and assessing the rate and extent to which the model learns using the ELBO plots. Every test run eventually converged to a similar point, so the table below shows only the number of iterations and the running time for each.
10 25 75 100 200
1 45000 / 3min 45000 / 3min 7500 / 7min 5000 / 12min 4000 / 31min 2 55000 / 6min 35000 / 6min 6000 / 7min 6000 / 16min 6000 / 33min 4 / 4min 35000 / 4min 8000 / 10min 8000 / 19min / 42min 6 14000 / 7min 40000 / 7min 25000 / 9min 25000 / 24min 45000 / 43min
Table 4.1: Sensitivity Analysis Results
Table 4.1 shows the convergence point in number of iterations and the total runtime for 30k iterations for all models. Along the top are the number of hidden neurons and down the left side is the number of hidden layers. The chosen point is in bold, chosen because adding more neurons drastically increased runtime.
The sensitivity analysis showed little difference in the final results at any point, so the speed of convergence was the only factor in choosing the model. For this reason the chosen model was the 1 hidden layer of 200 nodes. The test was run for both pure and correction coefficient networks with very similar results. The ELBO plot for the chosen point is shown in figure 4.1 (this example being from heave).
Figure 4.1: ELBO plot for pure neural net heave model training
For the closedform models, the ELBO plots showed little learning at any point, figure 4.2,
so the same point was used as for the pure models, just for simplicity.
Figure 4.2: ELBO plot for closed form heave model training
4.2 Pure Neural Network
The quality of results was similar across 5 of the 6 degrees of freedom, with the exception of yaw, where the influence of rudder motions likely makes the data less accurate as a measurement of waveinduced response, so the focus of the discussion here will be on heave motion. The plots of the other motions can be found in appendix appendix A.
First, observe the predicted mean values for every datapoint. These show that the model has correctly estimated the rough location of many peaks around 0.1Hz but often fails to capture peaks lower than this and, at least at the most visible highest values, made reasonable attempts at estimating the height of those peaks. It is also immediately obvious that at the lowest plotted frequencies of 0.05Hz the predicted values are often smaller than truth.
Figure 4.3: All test set heave responses, predicted and measured
Moving on to look at individual example responses, first the maximum response in fig
ure 4.4. This shows that while the mean predicted value is a little low, the model starts to predict the spike at the higher frequencies. Unfortunately the predicted value drops off before the peak as the waves spectra goes to nearzero, forcing the reponse down with it. This implies that one of two things is true here: Either the wave spectra is not accurate and there were more low frequency waves or, perhaps more likely, the height of the measured peak is the result of the frequencydomain integration issues previously discussed.
Figure 4.4: Maximum test set heave response, predicted and measured
The first randomly chosen point showed the opposite effect. The measured response begins to dip down again as the highpass filter comes into effect, but the prediction has a second spike. This is likely due to the fact that generally the waves are not large in this frequency, and the integration means the measured responses are larger than they should be. This leads to the model pushing the values of the transfer function up very high at these low frequencies, creating large spikes at these frequencies whenever waves are present.
Figure 4.5: Random test set heave response 1, predicted and measured
The second randomly chosen point again has a spike at low frequencies, then drops down
to underpredict the response peak.
Figure 4.6: Random test set heave response 2, predicted and measured
The transfer functions for all points tend not to vary, implying that the model has learnt very little of the effects of the vessel conditions, instead being the same at every timestamp.
This is most likely because other factors, such as the measurement issues at low frequen
cies and the quality of the wave data, introduce so much uncertainty that the affects of
these cannot be captured.
Figure 4.7: Maximum test set heave transfer functions
Across all points seen in all degrees of freedom, the lowest frequencies show issues.
Above, two reasons for this have been suggested, but there is a need to assess which is the primary source of uncertainty in this area. To assess this, the model was rerun with the raw measured acceleration spectra, preintegration, as the truth. The results of this model are therefore unaffected by the integration issues, so the problems with low
frequency waves are shown more clearly.
Predicting Acceleration
In vertical acceleration we see a much better agreement towards the low frequencies in
general, which is expected as the accelerations will become 0 in 0Hz waves, so should
be easy to predict.
Figure 4.8: All test set vertical acceleration responses, predicted and measured
The max response for vertical acceleration in figure 4.9 however, shows that at least some of the problems at low frequencies stem from inaccuracies in the wave data. The predicted motion can be seen to go to zero with the waves at a frequency where the measured motion us still close to it’s peak value.
Figure 4.9: Maximum test set vertical acceleration response, predicted and measured
The majority of the random responses for vertical acceleration are similar to figure 4.10,
showing very small responses with the models overpredicting. These points all seem to
be in small wave heights.
Figure 4.10: Random test set vertical acceleration response 1, predicted and measured
The 3rd randomly chosen point however was more interesting. Figure 4.11 shows a point where the measured responses go to zero at a similar point to the waves. Here the model generally does a reasonable job of predicting the response, getting the peak location and height correct.
Figure 4.11: Random test set vertical acceleration response 3, predicted and measured
4.3 Correction Coefficient Neural Network
In the closed form network the same issues appear. The network does a good job of
making the predictions generally for heave, but again the mean predictions in figure 4.12
and the maximum response in figure 4.13 shows a drop off as waves go to zero before
the response does.
Figure 4.12: All test set heave responses, predicted (with closedform) and measured
Figure 4.13: Maximum test set heave response, predicted (with closedform) and mea
sured
The transfer functions again reinforce the issues at the low frequencies. At every point, the model is trying to account for the small waves and large measured responses at these frequencies by pushing the closed form model up to double it’s original value the maxi
mum it is capable of increasing it.
Figure 4.14: Maximum test set heave transfer functions, predicted (with closedform)
Again, similar results are found with the closed form model for pitch.
5 Other Methods
In the previous chapter it could be seen clearly that the predictions made by these models were unreliable. Here some more advanced machine learning methods are proposed to advance the state of research, still using only currently available data. Some of the later proposals assume accelerometers similar to that fitted for this study will be fitted to many more vessels.
It should be noted that there may be simpler, nonML methods that would improve the results, such as using a robust algorithm for integration at low frequencies or running a re
analysis of wave data to a higher resolution. These methods may well improve the results of the current models, but the end result would still be improved further by techniques similar to those outlined below.
5.1 Waves as a latent variable
Aside from low frequency integration issues, the most prominent source of uncertainty in this study was the lack of resolution in the wave data, in all dimensions.
Research into the Ship as a Wave Buoy method has shown that it is possible to make estimates of the sea state at a location based on response measurements from on board a vessel with knowledge of the transfer functions. Thus, it is reasonable to assume that, given knowledge of the transfer functions, one could use those same responses to im
prove upon the starting point given by the ERA5 spectra, or of course to learn entirely new spectra independent of ERA5 data. We could then use these improved spectra to attain better results in the inference of transfer functions.
The paradox in the above statement of course, is that knowledge of the transfer functions is required to learn the wave spectra, and knowledge of the wave spectra required to learn the transfer functions. This is what has lead to the current state of research, where we are primarily interested in improving our wave estimation using those vessels for which good transfer functions are available to us because, as this paper demonstrates, the wave data is not yet good enough for us to do the inverse. However, for the Ship as a Wave Buoy estimations to be practically useful to the maritime industry it will need to be scaled up, which will require transfer functions to be known for a vast number of vessels.
Fortunately, machine learning can help solve this.
If we take the primary goal to be that of this paper, the prediction of vessel responses, we can formulate a model which enables us to improve our wave spectra estimates while learning the transfer functions, by modelling the wave spectra as a latent variable.
Doing this would most likely need to involve the introduction of a time dependency, such
that each point is influenced by those before it. A simplified potential model is shown in
figure 5.1.
Figure 5.1: An example of a model structure with waves as a latent variable
This model would take the ERA5 data, or perhaps a parametrised version
1, as the prior estimate of the wave spectra, allowing the model to update it based on the resulting mo
tions.
Required Work:
• Improving wave spectra with machine learning (with known transfer functions)
• Latent variable model for predicting vessel responses and improving wave spectra
5.2 Pooled models using known transfer functions
Transfer functions calculated through standard methods such as strip theory are known to give good agreement with experimental results. Thus, for vessels where such trans
fer functions are readily available there is little requirement for further work, though small correction coefficients may provide some benefit. These vessels with known transfer func
tions can help us solve the problems faced in section 5.1.
Some basic knowledge of vessel type and dimensions allows us to sort vessels into clus
ters, or pools, of potentially similar ships. Within these pools priors for components of the transfer function can be shared between vessels, and each vessel can effectively learn from every other vessel. Thus, if each pool contains some vessels which have known transfer functions, these can be split into their components of exciting force, added mass, damping, mass and hydrostatic restoring force and used to inform all other members of the pool. The better starting point this gives might allow the previously impossible com
ponents to be separated and learnt reliably. Separation may also be helped by the appli
1