
Master’s Thesis

Interpolation and visualization of sparse GPR data

Rickard Sjödin

rickard.sjodin@outlook.com

Supervisor: Joakim Paulsson
Internal supervisor: Adrian Hjältén

Examiner: Peter Olsson


Ground Penetrating Radar is a tool for mapping the subsurface in a non-invasive way. The radar instrument transmits electromagnetic waves and records the resulting scattered field. Unfortunately, the data from a survey can be hard to interpret, and this is especially true for non-experts in the field. The data are also usually in 2.5D, or pseudo 3D, meaning that the vast majority of the scanned volume is missing data. Interpolation algorithms can, however, approximate the missing data, and the result can be visualized in an application to ease the interpretation.

This report has focused on comparing different interpolation algorithms, with extra focus on behaviour when the data get sparse. The compared methods were: linear, inverse distance weighting, ordinary kriging, thin plate splines and fk domain zone-pass POCS. They were all found to have some strengths and weaknesses in different aspects, although ordinary kriging was found to be the most accurate and created the fewest artefacts. Inverse distance weighting performed surprisingly well considering its simplicity and low computational cost. A web-based, easy-to-use visualization application was developed in order to view the results from the interpolations. Some of the tools implemented include time slice, crop of a 3D cube, and iso surface.

Contents

1 Introduction
1.1 Background
1.2 Aim
1.3 Goals
1.4 Limitations
2 Theory
2.1 GPR data
2.1.1 The limits of spatial sampling
2.1.2 GPR data preprocessing
2.2 Interpolation methods
2.2.1 Linear interpolation (LERP)
2.2.2 Shepard's algorithm (IDW)
2.2.3 Ordinary Kriging (OK)
2.2.4 Thin Plate Splines (TPS)
2.2.5 Fk domain zone-pass POCS Algorithm (FKPOCS)
2.3 3D Visualization Application
3 Method
3.1 Data acquisition
3.2 Interpolation prerequisites
3.2.1 Evaluation strategy
3.2.2 Empirical variogram
3.2.3 Parameter evaluation
3.3 Evaluation of goodness
3.3.1 Root Mean Squared Error
3.3.2 Pearson Correlation Coefficient
4 3D Visualization Application - Method and Results
4.1 Visualization tools in Threejs
5 Interpolation Results
5.1 MIRA HDR data set
5.2 WideRange data set
5.3 Computational costs
6 Discussion
7 Conclusions

1 Introduction

1.1 Background

Ground Penetrating Radar, or GPR for short, is a widely used method for non-invasive surveying. It is used to map subsurface geological structures and objects by using electromagnetic (EM) waves. This means that GPR can map and locate subsurface objects without any risk of damaging either the object itself or its surrounding environment. GPR has extensive application areas in geotechnical engineering [1], pipeline inspection [2], archaeology [3] and more.

A typical GPR instrument works in the following way: The EM waves are transmitted by the instrument into the subsurface in a cone-like pattern and are reflected by objects with different dielectric properties from the surrounding material. The reflected waves, superimposed from all directions, are recorded by the instrument. The time from transmission to reception of the waves can be translated into depth. The recorded amplitude as a function of time is commonly called a trace. By acquiring traces at various positions across the surface, it is possible to get a map of structures beneath the surveyed area. The spatial pattern is typically along transect lines. The data from a single transect line can roughly be seen as a 2D cut-through of the subsurface, and are commonly viewed as a 2D image called a profile. The reason why we do not get a 3D image even though the EM waves propagate in all three dimensions is that the receiver cannot discriminate the direction of the reflected waves, only the time of arrival. Moving the instrument along a transect thus yields two-dimensional data: one spatial and one temporal.

A significant complication of the tool, however, is that the data from a survey can be hard to interpret for non-experts in the field [4]. It is often recommended to take a course in data interpretation in order to get the most out of the survey in question. In order to ease the interpretation, a 3D view of the subsurface is usually constructed, since it is more intuitive (although not trivial) to understand. This is partly because it enables viewing the surveyed area from the top, by slicing the volume at a specific time (depth) and viewing two spatial dimensions. This way, it is easier to spot objects and recognize shapes stretching out spatially.

To acquire a 3D map, the survey is carried out along multiple transect lines in close proximity. Naturally, the closer the transects (profiles) are to each other, the more information is possible to acquire. However it is usually costly, mainly in terms of time and energy, to survey in dense patterns. This is especially true when the area of interest has a difficult terrain, or for some other reason is hard to survey. But from a sparse data acquisition, i.e. when the profiles are far apart, the 3D volume is missing the vast majority of the data. This is where the need for interpolation techniques comes in. By interpolation it is possible to fill the gaps by approximating the missing data.

It is common to survey relatively dense parallel profiles and interpolate with a simple linear method [5]. But as the profiles get further apart and if the survey pattern changes, this naive method is prone to give false and misleading results.

There are very few studies that have investigated different interpolation techniques applied to GPR, and no known study in particular on the cross-line direction (across multiple profiles) when the profiles are far apart. Samet et al. [6] investigated the optimal profile interval and compared three different interpolation methods: linear, cubic and cubic splines. Their focus was to find out how high the profile and trace resolution have to be in order to preserve most of the vital information. They concluded that a profile distance of 0.25 m when using a 400 MHz antenna was a good compromise between resolution and survey cost. Their choice of interpolation method showed little to no impact.

Safont et al. [7] compared different methods when recovering missing traces in-line of a profile, and proposed a new statistical method named Expectation assuming an Independent Component Analyzers Mixture Model (E-ICAMM). They demonstrated the superiority of the new method compared to Kriging and Splines. However, they did not investigate the performance in the cross-line direction of sparse data.

Yi et al. [8] proposed a method to interpolate irregularly sampled data via a zone-pass filter in the frequency-wavenumber domain. They argued that this method could potentially interpolate beyond the Nyquist criterion (see section 2.1.1). They concluded that the method performed well even when the data were sparse relative to the Nyquist criterion. However, in their 3D investigation they removed traces at random rather than along transect lines.

Where many earlier works have focused on preserving most of the information, in this study we accept the fact that much information will be lost. We focus on how we can minimize this loss by investigating which interpolation method is the least bad, given that a sparse data set has been collected.

1.2 Aim

The aim of this thesis is to lay a stable foundation for future visualization tools, so that GPR non-specialists can more easily interpret the results of a GPR survey, in particular from sparsely collected data.

1.3 Goals

The thesis consists of two main goals and one secondary goal. These are:

1. (Main) Interpolate densely sampled GPR data to the satisfaction of the product owner

2. (Main) Interpolate sparsely sampled GPR data to the satisfaction of the product owner

3. (Secondary) Create an application which can visualize the results

1.4 Limitations

We will only consider interpolation and not extrapolation, meaning that we will not evaluate the results of data points outside the range of the existing data.


We only consider the case where the velocities of the propagating waves are unknown. This means we are restricted to one time axis and can not convert it to distance (depth).


2 Theory

This section is composed of three parts. The first part is a summary of GPR instruments, the resulting data, and the post-processing methods relevant to the interpolation. The second part goes through the different interpolation methods that have been investigated, and the third part introduces the visualization application that was made.

2.1 GPR data

The Ground Penetrating Radar instrument consists of two main components: one transmitting antenna (Tx) and one receiving antenna (Rx). The transmitter sends an electromagnetic pulse in a cone-like pattern. The pulse is reflected back towards the receiver when it propagates from one material to another with different permittivity, ε [9]. This is illustrated in figure 2.1 below.

The timestamp for each reflection is recorded, making it possible to build up a waveform called a trace, or A-scan. The recorded amplitude at a specific time is called a sample. The receiver cannot discriminate the direction from which the reflections are coming and records the superimposed result from all directions. In a typical GPR instrument, multiple traces are acquired with some spatial distance along a transect line, building up a 2D data set commonly called a profile, or B-scan. An example profile can be seen in figure 2.2. High amplitude is seen in black and white, while zero amplitude is gray. Each of the columns is a resulting trace from a specific point in space.

Figure 2.1: Illustration of a GPR instrument scanning the subsurface. The transmitter (Tx) is sending a signal into the ground with permittivity ε₁, which is scattered from the object with permittivity ε₂. The reflected signal returns to the instrument and is recorded by the receiver (Rx).

Objects will leave marks in the profile in the form of hyperbolas. Examples of hyperbolas can be seen to the left in the figure of the profile (figure 2.2). This is an effect following from the cone-like transmitter signal, since there will be a reflection from the object even when the antenna is not directly above it.

The time interval from the transmitted signal to the received reflection is related to the Euclidean distance, but the relation is often unknown since it depends on the velocity of the EM wave. It is possible to approximate this velocity, most commonly by considering the shape of the hyperbolas. The problem gets extra complicated when the penetrated medium does not have a uniform permittivity.

Since the data acquisition involves moving the instrument and collecting data in the spatial dimensions, it is vital to keep track of the instrument's position. There are two main ways to do this. A common strategy is to survey in straight lines, with distance being measured by a wheel on the instrument. In this case it is important to be careful and avoid diverting from the lines, because the instrument only considers the distance travelled and has no information about any change of direction. The other option is to make use of a GPS tracking system. This will result in (x, y) values for each trace, giving more freedom in the survey pattern and allowing faster acquisition times. The downside of this approach is that the interpolation procedure gets more complicated, as the data points can no longer be trivially regularized to a grid without introducing some errors. It is also important that the GPS is tracking with high accuracy in order to keep the errors at a minimum.

Figure 2.2: Example of a profile gathered from a GPR instrument. Large amplitude is seen in black and white, while gray is no amplitude. Each of the columns is called a trace. To the left it is possible to spot hyperbolas, which indicate that objects with different permittivity have reflected the incoming wave.


2.1.1 The limits of spatial sampling

When considering reconstruction of a data set, it can be useful to know the theoretical limits. The Nyquist criterion states that the spatial distance between traces should be no more than a quarter of a wavelength [10] in order to retain the information from the scattered field. Any larger distance can result in severe loss of information and aliasing. Aliasing is an effect where the discrete sampling of the original signal makes multiple signals indistinguishable from each other. This makes the original signal difficult or impossible to reconstruct accurately.

Since the criterion depends on wavelength, a higher frequency GPR instrument needs a smaller profile distance in order to reconstruct the missing data accurately. However, in this report, we will focus on large distances beyond this limit.

2.1.2 GPR data preprocessing

There are a few processing schemes that improve the visibility and interpretation of the data and thus improve the 3D visualization. A summary of the most common processing algorithms used throughout this report is listed below.

Gain Function

As the transmitted EM waves propagate through the medium, they dissipate energy. This results in a weaker response from an object when it is further away from the instrument. A gain function is usually applied to the data to correct for this. This is most often done by visual inspection, as automatic gain functions could potentially give unsatisfactory results. In this report, a gain function was applied to the data sets based on visual inspection.

Hilbert Transform (Amplitude Envelope)

In a 3D view of the data, the wave pattern of the reflection itself is not usually interesting, but rather the amplitude. The conversion of a signal w(t) to its corresponding amplitude envelope [11] is given by

$$H(w)(t) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{w(\tau)}{t - \tau}\, d\tau. \qquad (2.1)$$

An example of the effect of this can be seen in figure 2.3 below. Each of the traces (the columns in a profile, see figure 2.2) was processed in this way prior to interpolation.

Figure 2.3: Example of the conversion from a signal to its amplitude envelope. The original signal is seen in blue and its resulting envelope in orange.
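As an illustration of this step, the sketch below computes the amplitude envelope of a single trace with SciPy's Hilbert transform; the function and the synthetic signal are only illustrative and do not come from the thesis code.

```python
import numpy as np
from scipy.signal import hilbert

def amplitude_envelope(trace):
    """Return the amplitude envelope of a 1D trace.

    The analytic signal w + i*H(w) is computed via SciPy's Hilbert
    transform; its magnitude is the envelope shown in figure 2.3.
    """
    analytic = hilbert(trace)   # complex analytic signal
    return np.abs(analytic)     # envelope = |w + iH(w)|

# Example: envelope of a decaying oscillation (stand-in for a GPR trace)
t = np.linspace(0.0, 50e-9, 512)                        # 50 ns time window
trace = np.exp(-t / 15e-9) * np.sin(2 * np.pi * 500e6 * t)
envelope = amplitude_envelope(trace)
```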

Band-pass filter

Since the frequency of the GPR instrument is known, we can filter surrounding frequencies to highlight the signal from the GPR. This was done by a Fourier transform of the signal and applying a filter in the frequency domain.
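A minimal sketch of such a frequency-domain filter is given below, assuming a hard pass band around the antenna frequency; the exact band limits and any windowing used in the thesis are not specified, so the values here are placeholders.

```python
import numpy as np

def bandpass_fft(trace, dt, f_low, f_high):
    """Zero all frequency components outside [f_low, f_high].

    A hard mask in the frequency domain; a real processing chain would
    likely taper the mask to reduce ringing.
    """
    spectrum = np.fft.rfft(trace)
    freqs = np.fft.rfftfreq(len(trace), d=dt)
    mask = (freqs >= f_low) & (freqs <= f_high)
    return np.fft.irfft(spectrum * mask, n=len(trace))

# Example: keep 300-900 MHz around a 670 MHz antenna, 0.1 ns sampling
filtered = bandpass_fft(np.random.randn(1024), dt=0.1e-9,
                        f_low=300e6, f_high=900e6)
```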

2.2 Interpolation methods

We have data in locations x = (x, y, t). Since we assume that the velocities of the propagating waves are unknown, we can not convert the time axis to distance. Methods based on euclidean distance are therefore naturally restricted to spatial 2D interpolation at each time t. We can formulate the problem in the following way:

We seek to estimate the unknown value ĝ(xi) at location xi, given the observed values gj at sample locations xj = (xj, yj) at a certain time tl. In general, it is assumed that g(xi) are observations of an unknown function f(x, y) which is to be approximated, f̂ ≈ f, such that f̂(xj) = f(xj).

2.2.1 Linear interpolation (LERP)

Linear interpolation approximates a function h(x), x ∈ ℝ¹, according to

$$\hat{h}(x) = h(x_0) + \frac{h(x_1) - h(x_0)}{x_1 - x_0}(x - x_0). \qquad (2.2)$$

This linear interpolation requires the data to be on a regular grid. The interpolation is then carried out in the perpendicular direction from the profiles. More details about how this was done are given in section 3.2.

This method is coordinate dependent, meaning that a rotation of the data set or the coordinate system will lead to a different result.

2.2.2 Shepard’s algorithm (IDW)

Shepard's algorithm [12], also commonly known as Inverse Distance Weighting (IDW), approximates g(x) according to

$$\hat{g}(\mathbf{x}) = \frac{\sum_{\alpha=1}^{N} W_\alpha(\mathbf{x})\, g(\mathbf{x}_\alpha)}{\sum_{\alpha=1}^{N} W_\alpha(\mathbf{x})}, \qquad (2.3)$$

where N is the number of observed values and Wα(x) is the weight function

$$W_\alpha(\mathbf{x}) = \frac{1}{\|\mathbf{x} - \mathbf{x}_\alpha\|^d}. \qquad (2.4)$$

A common criticism of the method is that points far away from the approximated point get too much impact on the result [13]. Therefore, modifications are often made to make it more localized. This can be done in two alternative ways: by restricting the weights Wα to the k nearest neighbours, or to the neighbours inside a radius R.

Parameters

d - The order by which the weight function decreases with distance. The lower the number, the smoother and flatter the approximated function will be. Typical value: 2.

k - Number of closest data points to be included in the calculations. Typical value: 50.

R - Maximum distance a neighbour can be away from the data point to be included in the calculations. Typical value: 1 m.
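A minimal sketch of the k-nearest-neighbour variant of eqs. (2.3)-(2.4) is shown below, using a k-d tree for the neighbour search; the parameter values and array names are illustrative only and this is not the implementation used in the thesis.

```python
import numpy as np
from scipy.spatial import cKDTree

def idw_interpolate(sample_xy, sample_values, query_xy, k=50, d=2):
    """Shepard's algorithm restricted to the k nearest neighbours (eqs. 2.3-2.4)."""
    tree = cKDTree(sample_xy)
    dist, idx = tree.query(query_xy, k=k)   # distances and neighbour indices
    dist = np.maximum(dist, 1e-12)          # avoid division by zero at sample points
    weights = 1.0 / dist ** d               # W_alpha = 1 / ||x - x_alpha||^d
    return np.sum(weights * sample_values[idx], axis=1) / np.sum(weights, axis=1)

# Example: interpolate one time slice onto a regular 150 x 150 grid
rng = np.random.default_rng(0)
xy = rng.uniform(0, 10, size=(2000, 2))     # scattered trace positions (m)
vals = np.sin(xy[:, 0]) + 0.1 * rng.standard_normal(2000)
gx, gy = np.meshgrid(np.linspace(0, 10, 150), np.linspace(0, 10, 150))
grid = idw_interpolate(xy, vals, np.column_stack([gx.ravel(), gy.ravel()]))
grid = grid.reshape(150, 150)
```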

2.2.3 Ordinary Kriging (OK)

Ordinary kriging uses the covariance function between sample locations to derive optimal weights in a Best Linear Unbiased Estimator sense [14]. It estimates g(x) according to

$$\hat{g}(\mathbf{x}) = \sum_{k=1}^{N} W_k\, g(\mathbf{x}_k), \qquad (2.5)$$

where $\sum_{k=1}^{N} W_k = 1$. We assume that the data are a part of a realization of an intrinsic random function with a variogram γ(h), where h is distance.

The estimation variance σ̂² = var(ĝ(x0) − g(x0)) is the variance of

$$\hat{g}(\mathbf{x}_0) - g(\mathbf{x}_0) = \sum_{k=1}^{n} W_k g(\mathbf{x}_k) - 1 \cdot g(\mathbf{x}_0) = \sum_{k=0}^{n} W_k g(\mathbf{x}_k) \qquad (2.6)$$

with the weight W0 = −1 and $\sum_{k=0}^{n} W_k = 0$. This gives

$$\hat{\sigma}^2 = E(\hat{g}(\mathbf{x}_0) - g(\mathbf{x}_0))^2 = -\gamma(\|\mathbf{x}_0 - \mathbf{x}_0\|) - \sum_{k=1}^{n}\sum_{l=1}^{n} W_k W_l\, \gamma(\|\mathbf{x}_k - \mathbf{x}_l\|) + 2\sum_{k=1}^{n} W_k\, \gamma(\|\mathbf{x}_k - \mathbf{x}_0\|). \qquad (2.7)$$

By minimizing the estimation variance we obtain the ordinary kriging system, which is the following system of equations

$$\begin{pmatrix} \gamma(\|\mathbf{x}_1 - \mathbf{x}_1\|) & \dots & \gamma(\|\mathbf{x}_1 - \mathbf{x}_n\|) & 1 \\ \vdots & \ddots & \vdots & \vdots \\ \gamma(\|\mathbf{x}_n - \mathbf{x}_1\|) & \dots & \gamma(\|\mathbf{x}_n - \mathbf{x}_n\|) & 1 \\ 1 & \dots & 1 & 0 \end{pmatrix} \begin{pmatrix} W_1 \\ \vdots \\ W_n \\ \mu \end{pmatrix} = \begin{pmatrix} \gamma(\|\mathbf{x}_1 - \mathbf{x}_0\|) \\ \vdots \\ \gamma(\|\mathbf{x}_n - \mathbf{x}_0\|) \\ 1 \end{pmatrix} \qquad (2.8)$$

where μ is a parameter to ensure that $\sum_{k=1}^{n} W_k = 1$.

Since the true variogram γ is unknown, we estimate it by calculating the empirical variogram. This is done by computing

$$\hat{\gamma}(h) = \frac{1}{2}\frac{1}{N(h)} \sum_{k=1}^{N(h)} \left(g(\mathbf{x}_k + h) - g(\mathbf{x}_k)\right)^2. \qquad (2.9)$$

There are several different functions used to model γ. One of the most common functions for the variogram is the spherical model, given as

$$\gamma(h) = \begin{cases} c_0 + c\left(\dfrac{3h}{2a} - \dfrac{1}{2}\dfrac{h^3}{a^3}\right), & 0 < h \le a \\[4pt] c_0 + c, & h > a \end{cases} \qquad (2.10)$$

The constants c0, c and a are found by fitting the function to the empirical variogram in (2.9). This can be done automatically or by inspection.
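The sketch below solves the ordinary kriging system (2.8) for a single query point with numpy, assuming the spherical model (2.10) with already fitted constants c0, c and a; it is a minimal illustration under those assumptions, not the implementation used in the thesis.

```python
import numpy as np

def spherical_variogram(h, c0, c, a):
    """Spherical model, eq. (2.10); gamma(0) is taken as 0."""
    gamma = np.where(h <= a, c0 + c * (1.5 * h / a - 0.5 * (h / a) ** 3), c0 + c)
    return np.where(h == 0, 0.0, gamma)

def ordinary_kriging(sample_xy, sample_values, query_xy, c0=0.0, c=1.0, a=2.0):
    """Solve the OK system (2.8) and return the estimate at one query point."""
    n = len(sample_xy)
    d_ss = np.linalg.norm(sample_xy[:, None, :] - sample_xy[None, :, :], axis=-1)
    d_s0 = np.linalg.norm(sample_xy - query_xy, axis=-1)

    A = np.ones((n + 1, n + 1))               # left-hand matrix with Lagrange row/column
    A[:n, :n] = spherical_variogram(d_ss, c0, c, a)
    A[n, n] = 0.0
    b = np.append(spherical_variogram(d_s0, c0, c, a), 1.0)

    weights = np.linalg.solve(A, b)[:n]       # drop the Lagrange multiplier mu
    return weights @ sample_values

# Example: estimate the value at one grid node from four nearby samples
xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
vals = np.array([1.0, 2.0, 1.5, 2.5])
estimate = ordinary_kriging(xy, vals, np.array([0.5, 0.5]), c0=0.0, c=1.0, a=2.0)
```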

2.2.4 Thin Plate Splines (TPS)

A function which is only dependent on the magnitude of its argument is called radial. An example of this is g(x) = φ(‖x‖) = φ(r), where ‖x‖ = r. This means that φ is constant for vectors of the same length.

The goal is to find an approximation ĝ(x) as a linear combination of radial basis functions according to

$$\hat{g}(\mathbf{x}) = \sum_{k=1}^{n} W_k\, \phi(\|\mathbf{x} - \mathbf{x}_k\|). \qquad (2.11)$$

We set the constraint

$$\sum_{k=1}^{n} W_k\, \phi(\|\mathbf{x}_j - \mathbf{x}_k\|) = g(\mathbf{x}_j). \qquad (2.12)$$

This can be written in matrix form as

$$\begin{pmatrix} \phi(\|\mathbf{x}_1 - \mathbf{x}_1\|) & \phi(\|\mathbf{x}_1 - \mathbf{x}_2\|) & \dots & \phi(\|\mathbf{x}_1 - \mathbf{x}_n\|) \\ \phi(\|\mathbf{x}_2 - \mathbf{x}_1\|) & \phi(\|\mathbf{x}_2 - \mathbf{x}_2\|) & \dots & \phi(\|\mathbf{x}_2 - \mathbf{x}_n\|) \\ \vdots & \vdots & \ddots & \vdots \\ \phi(\|\mathbf{x}_n - \mathbf{x}_1\|) & \phi(\|\mathbf{x}_n - \mathbf{x}_2\|) & \dots & \phi(\|\mathbf{x}_n - \mathbf{x}_n\|) \end{pmatrix} \begin{pmatrix} W_1 \\ W_2 \\ \vdots \\ W_n \end{pmatrix} = \begin{pmatrix} g_1 \\ g_2 \\ \vdots \\ g_n \end{pmatrix}$$

and in compact form

$$\Phi W = \mathbf{g}. \qquad (2.13)$$

We require Φ to be non-singular in order for (2.13) to have a unique solution. By choosing φ(r) = r²log(r), we acquire thin plate splines [15], which are analogous to the commonly used cubic splines [16] in the 1D case.

A significant advantage of this method is that it has no free parameters which have to be tuned to the data. However, since the solution involves calculating the inverse of a matrix, it is relatively heavy computationally.
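For reference, SciPy ships a thin plate spline interpolant; the sketch below applies it to scattered (x, y) samples of one time slice. Note that SciPy's RBFInterpolator augments the pure radial sum in eq. (2.11) with a low-order polynomial term, so it is a close but not identical variant of the formulation above, and the sample data are synthetic.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# Scattered sample positions (x, y) and observed amplitudes at one time t_l
rng = np.random.default_rng(1)
sample_xy = rng.uniform(0, 10, size=(500, 2))
sample_values = np.cos(sample_xy[:, 0]) * np.sin(sample_xy[:, 1])

# Thin plate spline interpolant (phi(r) = r^2 log r plus a polynomial term)
tps = RBFInterpolator(sample_xy, sample_values, kernel='thin_plate_spline')

# Evaluate on a regular 150 x 150 grid
gx, gy = np.meshgrid(np.linspace(0, 10, 150), np.linspace(0, 10, 150))
grid = tps(np.column_stack([gx.ravel(), gy.ravel()])).reshape(150, 150)
```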


2.2.5 Fk domain zone-pass POCS Algorithm (FKPOCS)

This algorithm is a combination of a conventional Projection Onto Convex Sets (POCS) algorithm [17] and zone-pass filtering in the frequency-wavenumber domain (fk), in an attempt to circumvent the problem of aliasing which occurs when the sampling is too sparse. For further information, see [8]. This is the only method to utilize all three dimensions.

The fk domain is calculated from

$$G_i(k_x, k_y, f) = \mathcal{F}(g_i(x, y, t)), \qquad (2.14)$$

where $\mathcal{F}$ is the Fourier transform. The cone-shaped zone-pass filter is given by

$$H_i(k_x, k_y, f) = \begin{cases} 1, & -b_w f < \pm\sqrt{k_x^2 + k_y^2} < b_w f, \\ 0, & \text{otherwise} \end{cases} \qquad (2.15)$$

$$b_w = \tan\left[\frac{\pi}{2}\left(1 - \frac{i}{N}\right)\right], \qquad (2.16)$$

where kx, ky are the wavenumbers in the x and y directions respectively, f is the frequency, i is the iteration number, N is the total number of iterations and bw is the widening angle of the filter.

The filtered fk domain at iteration i is denoted by G′i(kx, ky, f) and is given by

$$G'_i(k_x, k_y, f) = G_i(k_x, k_y, f)\, H_i(k_x, k_y, f). \qquad (2.17)$$

The time domain is then recovered by the inverse Fourier transform according to

$$g'_i(x, y, t) = \mathcal{F}^{-1}(G'_i(k_x, k_y, f)). \qquad (2.18)$$

By denoting the locations of missing data points M, we let the input of the next iteration be

$$g_{i+1}(x, y, t) = \begin{cases} g'_i(x, y, t), & (x, y, t) \in M, \\ g_i(x, y, t), & \text{otherwise}. \end{cases} \qquad (2.19)$$

In other words, the resulting values after returning to the time domain are kept only at the locations where there were missing data points to begin with, and the known data points stay the same.

In the early stages of the algorithm, when i is small, the widening angle of the cone-shaped filter bw is also small. This means that lower wave-frequency components are interpolated first, and higher components get interpolated afterwards.

When the data are sparse, the iterations at lower wave-frequency components are more important than the higher ones, since they iteratively bridge the gaps. Therefore, to speed up the convergence, we can modify the angular function in eq. (2.16) to

$$b_w = \tan\left[\frac{\pi}{2}\left(1 - \frac{i^2}{N^2}\right)\right]. \qquad (2.20)$$

One problem with the algorithm is that it will continue even though it has already converged. It is possible to have a break condition that is dependent on the relative change between iterations. However, one problem that arises from this has to do with computational time: it can be unnecessarily expensive to calculate the relative change after each iteration. A possible compromise could be to only calculate the relative change every ∼5 iterations, depending on the particular data set.

Parameters

N - Maximum number of iterations. Too low a number will prevent the algorithm from converging, but too high a number will make the computation unnecessarily heavy. A higher resolution and/or sparser data will require a higher number. Typical value: 200.
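A heavily simplified sketch of the iteration is given below, assuming the data are already on a regular (x, y, t) grid with a boolean mask marking the known traces. The cone filter follows eq. (2.15), but the widening schedule is a simple linear opening standing in for eqs. (2.16) and (2.20); the sketch shows the structure of the loop rather than reproducing [8].

```python
import numpy as np

def fkpocs(volume, known_mask, n_iter=200):
    """Minimal sketch of the fk-domain zone-pass POCS iteration.

    volume     : 3D array indexed (x, y, t), zeros at missing traces
    known_mask : boolean array, True where original traces exist
    """
    g = volume.copy()
    nx, ny, nt = g.shape
    kx = np.fft.fftfreq(nx)[:, None, None]
    ky = np.fft.fftfreq(ny)[None, :, None]
    f = np.fft.fftfreq(nt)[None, None, :]
    k_r = np.sqrt(kx**2 + ky**2)                       # radial wavenumber
    k_max = k_r.max()

    for i in range(1, n_iter + 1):
        # cone opening grows with the iteration number; early iterations
        # pass only low wavenumbers (assumed schedule, see lead-in text)
        bw = (i / n_iter) * k_max / (np.abs(f).max() + 1e-12)
        G = np.fft.fftn(g)                             # eq. (2.14)
        H = (k_r < bw * np.abs(f)).astype(float)       # zone-pass filter, eq. (2.15)
        g_filt = np.real(np.fft.ifftn(G * H))          # eqs. (2.17)-(2.18)
        g = np.where(known_mask, volume, g_filt)       # eq. (2.19): keep known traces
    return g
```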

2.3 3D Visualization Application

There exists a multitude of different 3D GPR visualization software packages with years of work behind them. Examples of well developed tools are GPR Slice [18] and Reflexw [19]. However, most of them lack an important aspect: a user-friendly experience. Therefore, there is a demand for powerful yet user-friendly software.

The goal of the application is to have the following properties:

• User-friendly

• Modern look

• Web-based

(19)

• Ability to interpolate GPR data with various methods

• Ability to visualize interpolation results in 3D

• Ability to crop/time slice results

• Ability to set threshold value and view as iso surface

The purpose of the application is to ease the interpretation of the collected data and quickly and seamlessly view the results in 3D. This application module is planned to be a part of a future GPR processing application.


3 Method

3.1 Data acquisition

The data used in this study were collected with two different GPR instruments, the MALÅ Easy Locator Pro WideRange HDR and the MALÅ MIRA HDR. The WideRange (WR) instrument is an all-round product which is built to be easy to use. It is capable of scanning at two different frequencies at once, which can be advantageous since there is a trade-off between resolution and penetration depth that depends on frequency. Some of the technical details about the instrument can be seen in table 3.1.

The MIRA HDR, on the other hand, is the successor of the flagship of MALÅ GPR instruments: the MIRA. It was still under development when the data were collected. The MIRA HDR, just as the MIRA, is an array solution, which means that it produces multiple profiles in parallel and essentially yields a 3D scan, or C-scan. The MIRA HDR that was used had 22 channels, which means that 22 parallel profiles were produced simultaneously, at a width of about 1.5 m.

The WR survey was carried out on a parking lot outside the Umeå office of Guideline Geo. The area was 13 × 9 m² and profiles were collected at an equidistant interval of 0.5 m, in a grid pattern. The position was tracked by a GPS system. The frequency of the antenna for the data used for evaluation was 670 MHz.

The MIRA HDR survey was carried out as a part of a product test over a large area of 60 × 35 m². However, only a subset with interesting utilities, with an area of 11 × 7 m², was used in the evaluation process. As with the WR case, the tracking was done by GPS. The frequency of the antennas was 500 MHz.

Figure 3.1: Data collection with the MALÅ Easy Locator Pro WideRange HDR.

Table 3.1: MALÅ Easy Locator Pro WideRange HDR
Technology: MALÅ HDR Technology
Effective bandwidth: 80-950 MHz
SNR: >101 dB
Scans/second: >500
Operating time: Up to 8 hrs
Positioning: Built-in DGPS, external GPS, wheel encoder

3.2 Interpolation prerequisites

As mentioned briefly in section 2.1, there are two main ways of keeping track of the position of each trace, and each method requires a different approach in order to interpolate. When having parallel profiles with a fixed interval, the profiles have the same number of traces. This lets us approach the problem in the following manner:

Let two profiles X⁽¹⁾, X⁽²⁾ be represented as matrices, with samples at row i and a trace at column j (see figure 2.2). Imagine we want to interpolate a profile X̂ that lies directly between them. We can then approximate the profile as a function of X⁽¹⁾, X⁽²⁾ according to

$$\hat{X}_{ij} = f\!\left(X^{(1)}_{ij}, X^{(2)}_{ij}\right). \qquad (3.1)$$

For instance, a linear interpolation approximates the profile as

$$\hat{X}_{ij} = a X^{(1)}_{ij} + (1 - a) X^{(2)}_{ij}, \qquad (3.2)$$

where a is a constant between 0 and 1.
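A small sketch of this profile-to-profile linear interpolation is given below; the profile shapes and the number of inserted profiles are arbitrary examples, not values from the thesis.

```python
import numpy as np

def interpolate_between_profiles(profile_1, profile_2, n_new):
    """Create n_new equally spaced profiles between two parallel profiles (eq. 3.2).

    profile_1, profile_2 : 2D arrays (samples x traces) of equal shape.
    The first returned profile lies closest to profile_1.
    """
    fractions = np.linspace(1.0, 0.0, n_new + 2)[1:-1]   # values of a, endpoints excluded
    return [a * profile_1 + (1.0 - a) * profile_2 for a in fractions]

# Example: three synthetic profiles between two measured ones
p1 = np.random.randn(256, 300)    # 256 samples per trace, 300 traces
p2 = np.random.randn(256, 300)
between = interpolate_between_profiles(p1, p2, n_new=3)
```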

This approach fails when we are dealing with GPS positions, since we get scattered data at positions (x, y). In order to deal with this problem most effectively, interpolation methods that can handle scattered data need to be used. This includes the tested methods IDW (section 2.2.2), OK (section 2.2.3) and TPS (section 2.2.4).

However, since both OK and TPS involve inverting a matrix, with a time complexity of O(n³), this approach becomes unfeasible for straightforward use, and the data must be downsampled and/or segmented. Another way to get around this problem, however suboptimal, is by binning the data. This will downsample the data to an arbitrary level and enables the methods based on eq. (3.1) to be compared with the rest of the methods. The binning procedure can be explained as follows:

Let the grid value at each grid point ḡi(xi, yi, ti) be the average of the surrounding observations. Let Δx, Δy, Δt be the spatial resolution; then

$$\bar{g}_i = \frac{1}{n} \sum_{l=1}^{n} g(\mathbf{x}_l) \qquad (3.3)$$

where xl are the n sample locations in the cube bounded by

$$x_i - \Delta x/2 < x_l < x_i + \Delta x/2, \quad y_i - \Delta y/2 < y_l < y_i + \Delta y/2, \quad t_i - \Delta t/2 < t_l < t_i + \Delta t/2.$$

It is easy to see that an increase in resolution will result in an ever closer match with the scattered positions; however, the amount of data points needed to be interpolated increases along with it.


The MIRA HDR survey already had a very dense data set, with a profile interval of 6.5 cm and trace interval of 4.5 cm. By creating a grid with the same cell size and binning the data, we acquire a sufficiently accurate result.
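A minimal binning sketch, assuming scattered (x, y, t) samples and using SciPy's binned statistics for the averaging in eq. (3.3), is shown below; the grid shape, extents and sample data are placeholders.

```python
import numpy as np
from scipy.stats import binned_statistic_dd

def bin_to_grid(xyt, values, shape, extent):
    """Average scattered samples into a regular (x, y, t) grid (eq. 3.3).

    xyt    : (N, 3) array of sample locations
    values : (N,) array of recorded amplitudes
    shape  : grid resolution, e.g. (150, 150, 150)
    extent : ((xmin, xmax), (ymin, ymax), (tmin, tmax))
    """
    grid, _, _ = binned_statistic_dd(xyt, values, statistic='mean',
                                     bins=shape, range=extent)
    return grid   # cells without samples are NaN and are left for the interpolation

# Example: bin 100k scattered samples into a 150^3 cube
rng = np.random.default_rng(2)
xyt = rng.uniform([0, 0, 0], [13, 9, 100], size=(100_000, 3))
vals = rng.standard_normal(100_000)
cube = bin_to_grid(xyt, vals, shape=(150, 150, 150),
                   extent=((0, 13), (0, 9), (0, 100)))
```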

3.2.1 Evaluation strategy

The evaluation process was done through a cross-validation scheme. In the case of the WR data, all of the profiles were binned to a regular grid. A new grid with every other profile was then created. Since there are two permutations in which this can be done, both of them were actualized and the result was calculated as the average of the two interpolation results. This then proceeds with every third profile, every fourth and so on. This can be translated into profile distance since we have approximately equidistant profiles.

The MIRA HDR investigation process was done slightly differently from the WideRange survey, but still very similar. Since we can acquire a dense 3D grid without the need for interpolation, we let this be the baseline and then remove slices from the grid at ever increasing distance. Analogous with the WR case, we first keep only every other slice, then every third and so on, and calculate the result from the average of all possible permutations.

3.2.2 Empirical variogram

The empirical variograms were calculated before binning the data to a regular grid in both of the data sets in order to avoid the errors introduced by binning.

Both variograms were calculated at constant times tl, since the distance h is only defined in the (x, y) plane. The results from each time were averaged into a single variogram for each data set.
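A sketch of the empirical variogram estimate (2.9) for one time slice of scattered data is shown below; the lag bins and sample data are illustrative, and the per-time results would then be averaged as described above.

```python
import numpy as np
from scipy.spatial.distance import pdist

def empirical_variogram(xy, values, lag_edges):
    """Estimate gamma(h) per eq. (2.9) for one time slice of scattered data."""
    h = pdist(xy)                                            # pairwise distances
    sq_diff = pdist(values[:, None], metric='sqeuclidean')   # (g_k - g_l)^2 per pair
    gamma, n_pairs = [], []
    for lo, hi in zip(lag_edges[:-1], lag_edges[1:]):
        mask = (h >= lo) & (h < hi)
        gamma.append(0.5 * sq_diff[mask].mean() if mask.any() else np.nan)
        n_pairs.append(mask.sum())
    return np.asarray(gamma), np.asarray(n_pairs)

# Example: variogram of one time slice up to 5 m lag
rng = np.random.default_rng(3)
xy = rng.uniform(0, 10, size=(800, 2))
vals = np.sin(xy[:, 0]) + 0.1 * rng.standard_normal(800)
lags = np.linspace(0, 5, 26)
gamma, counts = empirical_variogram(xy, vals, lags)
```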

3.2.3 Parameter evaluation

Two of the investigated methods had free parameters: IDW and FKPOCS.

The parameters were tuned in order to get a fair comparison between the methods. However, a complication of the parameter evaluation is that the optimal parameter settings depend on the particular data set, the sparsity of the data and the resolution of the grid. Since the main focus of this investigation is on sparse data, the parameters were set with a high profile interval in both of the data sets. The resolution for the WR data set was set to a median value of the typical values used in the evaluation process, which was 150×150×150 (x, y, t).

The IDW method has, as mentioned in section 2.2.2, three parameters in total. However, we had the option to either restrict the weights to the k nearest neighbours or only the neighbours inside a radius R. Since the profiles vary in distance, the most reasonable approach was to consider the k nearest neighbours to ensure that a sufficient number of weights were used. This strategy was confirmed to be beneficial in testing. To evaluate the parameters d and k, all possible combinations of d = 0, 1, ..., 5 and k = 5, 10, ..., 200 were evaluated. The combinations were initially compared via the objective measures in section 3.3, and the best visual result from the top competing combinations was chosen for each data set.
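The parameter sweep can be sketched as a plain grid search over d and k scored with the relative RMSE of section 3.3.1; the snippet below is an illustration that accepts any IDW routine with the signature of the idw_interpolate sketch in section 2.2.2 and is not the evaluation script used in the thesis.

```python
import numpy as np
from itertools import product

def grid_search_idw(interpolate_fn, sample_xy, sample_values,
                    holdout_xy, holdout_values,
                    d_values=range(0, 6), k_values=range(5, 205, 5)):
    """Score every (d, k) combination by relative RMSE on held-out profiles.

    interpolate_fn(sample_xy, sample_values, query_xy, k=..., d=...) is any
    IDW routine, for instance the sketch in section 2.2.2.
    """
    scores = {}
    for d, k in product(d_values, k_values):
        est = interpolate_fn(sample_xy, sample_values, holdout_xy, k=k, d=d)
        scores[(d, k)] = np.mean(np.abs(est - holdout_values) / np.abs(holdout_values))
    best = min(scores, key=scores.get)
    return best, scores
```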

For FKPOCS, the number of iterations was set to an exaggerated number to ensure convergence. This way we can be confident that we are comparing it justly with the rest of the methods.

3.3 Evaluation of goodness

At each step, the interpolation results were evaluated using the following metrics:

• Root Mean Square Error (RMSE)

• Pearson Correlation Coefficient (PCC)

In addition to these, other important aspects were measured:

• Computational cost

• Visual inspection

The evaluation process of the methods is not straightforward, and evaluating with objective measures comes with some caveats. Often in real-life scenarios, the interesting information in the data stems from man-made objects underground. A good example of this is pipes. Some methods could be judged to be better than others by RMSE or PCC, but by visual inspection it could be possible that these methods do not reconstruct pipe-like shapes as well.

Detailed descriptions of the evaluation measures are given in the following sections.


3.3.1 Root Mean Squared Error

The relative root mean squared error (RMSE) is calculated by

$$\mathrm{RMSE} = \frac{1}{n} \sum_{i}^{N} \frac{\sqrt{(\hat{g}(\mathbf{x}_i) - g(\mathbf{x}_i))^2}}{g(\mathbf{x}_i)} \qquad (3.4)$$

where g(xi) is the real observed value. Since we have ĝ(xj) = g(xj) at the true data points xj, we let the normalization factor n be the number of points that are approximated, i.e. the number of points where xi ≠ xj.

3.3.2 Pearson Correlation Coefficient

The Pearson Correlation Coefficient [20] is used as a measure of how similar two matrices are to each other. It is given by

$$r_{X,Y} = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y} \qquad (3.5)$$

where X, Y are matrices, σX, σY are the standard deviations of each matrix, and cov(X, Y) is the covariance of the two matrices. The covariance is given by

$$\mathrm{cov}(X, Y) = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} (X_{ij} - \bar{X})(Y_{ij} - \bar{Y}) \qquad (3.6)$$

where m is the number of rows, n the number of columns and X̄, Ȳ are the averages of X and Y, respectively.
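The two measures can be sketched directly from eqs. (3.4)-(3.6); the functions below are illustrative helpers, with the caveat that the relative RMSE assumes non-zero reference values at the evaluated points.

```python
import numpy as np

def relative_rmse(estimate, truth, interpolated_mask):
    """Relative RMSE of eq. (3.4), averaged over the interpolated points only."""
    err = np.sqrt((estimate - truth) ** 2) / truth   # assumes truth != 0 at these points
    return err[interpolated_mask].mean()

def pearson_cc(X, Y):
    """Pearson correlation coefficient between two equally shaped arrays, eqs. (3.5)-(3.6)."""
    Xc, Yc = X - X.mean(), Y - Y.mean()
    return (Xc * Yc).mean() / (X.std() * Y.std())
```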


4 3D Visualization Application - Method and Results

Since the application was to be web-based, it was developed with HTML, JavaScript (JS) and CSS, together with the useful and popular JS framework React.js [21]. The 3D visualizations were developed with the JS library Three.js [22]. Three.js is built on WebGL, a powerful 3D graphics API which can utilize the GPU for the heavy computations that come with 3D visualizations.

The interpolation calculations were done in Python, utilizing the algorithms from the evaluation process. The scripts were made available via a back-end server based on Flask [23], which the front-end application can call to get the results from the desired interpolation method along with its parameters.

Git was used as a version control system during development.

4.1 Visualization tools in Threejs

Threejs lets you define a mesh via 3D points (x, y, z) in 3D space and provides tools for easy yet powerful visualizations. Following below is a description of each visualization object that was implemented in the application.

(27)

Figure 4.1: Screenshot of the application that was created.

3D Cube

There are some pre-made meshes already implemented, such as a 3D cube.

This is the first object that was used to visualize the interpolation results and was set as a standard object for newly calculated interpolation results.

Each side of the cube was given a texture corresponding to the data values at each position. Each data point was given a color based on its value, via a defined colormap. More about this under the Colorslider and colormap title further down.

In order for an opaque 3D cube to be helpful for visualization, a crop tool was created. The cropping tool made it possible to cut the cube along any dimension x, y, t to visualize the insides of the cube. An example of this can be seen in figure 4.2 a)-b). By moving the slider to the right (4.2 c, highlighted in blue) after selecting the desired dimension (t in this case), the cube is cropped. The data textures are thus updated for each step of the slider. The calculations happen live as the slider moves, letting the user see how the inside of the cube looks with little delay.

Figure 4.2: Screenshots from the 3D cube view. In a) we see the full cube, and a cropped version in b). In c) we see the toolbar with the cropping tool highlighted in blue. Note to the upper left of the highlighted area we have the option to select the desired dimension of cropping, with dimension t being highlighted in the figure.

Time slice

Strongly related to the cube is the time slice. The time slice lets the user cut through the cube one layer at a time. The mesh is a simple plane instead of a cube, moving in the t direction when moving the slider. Just as the cube, the plane has a data texture on each side of the plane which updates in real time with the slider. This is one of the more common visualization tools for viewing 3D GPR data.

Colorslider and colormap

As mentioned above, both the 3D cube and the time slice utilize data textures for visualization. Since each data set is different, it is important to be able to change the colormap of the textures in order to raise the contrast of interesting patterns in the data. The implemented colormap works as follows: Three colors are chosen. The first color corresponds to the lowest value in the data set, the second to the median value and the third to the highest value. Every value in between is given a color based on interpolation between them. This colormap can then be changed by a slider with three nodes, letting the user select which values should correspond to which color.

See figure 4.3 for an illustrated example. Another feature implemented related to the colorslider was the ability to change any of the three base colors via a color-picker.

Figure 4.3: Illustration of the effect that changing the colorslider can have. By changing the colormap, interesting objects in the data can be visualized more clearly.

Iso surface

The iso surface is a way of seeing ”through” the cube by letting high values in the 3D cube be visible, hiding the low and making a smooth surface from the remaining points. This works in the following way: A threshold is set, and a surface of constant value is calculated via the marching cubes algorithm [24]. Briefly described, the algorithm loops through each cell in the grid and calculates which of the eight corners of a cell is above the threshold value.

Once the corners are identified, the corresponding mesh points are identified for the particular case via a table. A mesh can be created from the points, and after the whole volume is done, an iso surface is created. An example of an iso surface can be seen in figure 4.4.
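In the application this step is done in JavaScript with Three.js; purely as an illustration of the algorithm, the sketch below extracts an iso surface from a synthetic amplitude cube with scikit-image's marching cubes implementation, which is an assumption for demonstration and not the code used in the application.

```python
import numpy as np
from skimage import measure

# Synthetic amplitude cube with a bright pipe-like feature stretched along y
x, y, t = np.mgrid[0:64, 0:64, 0:64]
cube = np.exp(-((x - 32) ** 2 + (t - 40) ** 2) / 20.0)

# Surface of constant value at the chosen threshold
threshold = 0.5
verts, faces, normals, values = measure.marching_cubes(cube, level=threshold)
# verts and faces define the triangle mesh that the viewer would render
```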

Just as for the 3D cube, it is possible to crop the iso surface with the crop slider described above. This makes it easy to highlight specific areas in the cube, as viewed in the figure below where the pipe-shaped object was highlighted.

(30)

Figure 4.4: Iso surface illustration. In a) we see a zoomed view of the pipe- shaped object in the data. The pipe was isolated by the cropping tool. By looking closely at the object it is possible to spot the individual mesh triangles calculated from the marching cubes algorithm described in the text. In b) we get a clearer view of where in the volume the pipe is located, and in c) we demonstrate a possibility to view the iso surface together with the raw data profiles. Note the white outline of the iso surface, which activates by mouse hovering, letting the user see the shape through the profiles.

Profiles in 3D

It can be useful to have the original data to compare with the interpolated results. This makes it possible to complement the information from the two different data types. For instance, if the interpolation result shows a pipe-like shape, it can be confirmed by looking at the original 2D profiles at the exact location.

To implement the profiles in 3D view, a custom mesh had to be created in order to work with an arbitrary shape from (x,y) coordinates. The next step was to match the position, rotation and scale with the interpolation results, i.e. the 3D Cube, time slice and iso surface. Once done, a data texture based on the raw GPR data was applied to the custom mesh. This was mapped correctly so as not to warp the scaling of the data caused by the uneven surface. A number of example profiles in the 3D view can be seen in figure 4.5 and figure 4.4 c).

(31)

Figure 4.5: Profiles viewed in 3D. In a) we have an example of parallel profiles and in b) we see data collected in an S-pattern. In b) we also see the combination of a time slice which can be moved up and down, which lets the user compare the interpolated results to the raw data.

Quickly after implementation it was clear that it was essential to be able to dynamically remove profiles of choice since they are completely opaque. If there is an interesting area behind a profile it should be possible to seamlessly remove it to view the area in question. In the current implementation, this is possible to do via right-clicking the profile to be removed.


5 Interpolation Results

5.1 MIRA HDR data set

Figures 5.1-5.3 are showing time slices of the results from each method, with the baseline grid in the upper left. We can see a linear object classified as a pipe in the area. In the figures we can tell how well the methods were able to reconstruct the shape of the pipe and how different methods introduce artefacts.

Worth noting is that the linear method (LERP) was interpolating in the x-direction, which was advantageous in this case since the pipe aligns approximately along the same axis. None of the other methods have instructions about direction. The parameters used for IDW in this data set were k = 60, d = 3.

The figures show only one particular permutation of the profiles, meaning that the result could look different even with the same profile interval. In the baseline it is possible to spot empty areas seen in black, the largest one at around x = 4 m stretching out vertically. This is because the MIRA HDR did not fully cover the surveyed area. We can see in the time slices that the methods interpolated the missing area. The objective measures did not evaluate the interpolation result in these areas since the baseline values were unknown.

Figure 5.1: Time slice from the MIRA HDR data set with a profile distance of 0.2 m. At this short distance, it is hard to spot any discrepancies between the methods. The methods had 25% of the baseline data available while the rest was interpolated.

Figure 5.1 is showing the time slices with a profile distance of 0.2 m. It is only possible to spot minor discrepancies between the methods. Perhaps the most clear anomaly is from FKPOCS, which shows a very subtle striped pattern. More apparent differences are not easily spotted until we have a profile distance of 0.5 m, seen in figure 5.2. In the figure we can see that LERP is showing a very subtle smearing effect, which the other methods do not have. IDW is showing a striped pattern vertically along the profiles, which makes the data look incoherent. OK, TPS and FKPOCS are showing similar results, with OK and TPS being slightly smoother at the interpolated points.

All these effects are even more prominent in figure 5.3, with a profile distance of 1 m. LERP shows a clear smearing effect, and the image looks blurry. IDW still shows a vertical striped behaviour, and is the worst method at reconstructing the pipe. OK and TPS are still nearly identical to each other and it is difficult to spot any differences between the two. They both smooth the data in between the profiles more than the other methods, making the real data points easy to spot. Similarly to IDW, but much more subtly, we can see a vertical striped pattern. This cannot be said of FKPOCS, which also does not smooth the interpolated data points as much as OK and TPS. This makes the data look a bit more coherent.

Figure 5.2: Time slice from the MIRA HDR data set with a profile distance of 0.5 m. The methods interpolated from 12.5% of the available original data. At this distance we start to see some variety. LERP has a minor smearing effect and IDW is showing subtle vertical stripes.

Figure 5.3: Time slice from the MIRA HDR data set with a profile distance of 1 m. The methods interpolated from 6.25% of the available original data. LERP has a clear smearing effect which is making the image look blurred. All methods except FKPOCS have a vertical stripe effect with IDW having the most. OK and TPS are very similar to each other.

RMSE and Pearson Correlation Coefficient

The results from the objective measures can be seen in figure 5.4. Keep in mind that the objective measures are averages over all time slices and not only the time slice viewed in the previous section.

According to RMSE, all of the methods are performing similarly with the exception of FKPOCS. LERP starts diverging slightly at around 0.5 m.

IDW is performing surprisingly well considering the artefacts seen in the time slices. In agreement with the visual results, OK and TPS are nearly identical in this measure.

The PCC is telling a slightly different story, however. Here we see that OK has a minor advantage over TPS, and is competing with IDW for the top spot. Interestingly, FKPOCS is performing better than LERP for the most part in this measure even though it was considerably worse at RMSE.

The PCC is more in line with the subjective viewing result from the previous section, except for IDW which is still performing better than expected.

Figure 5.4: Relative RMSE in the left figure indicates that the methods are performing similarly, except FKPOCS, which has the largest error. LERP gets slightly worse as the distance increases. The PCC shows some more discrepancies even though the general trend is the same.

5.2 WideRange data set

Time slices from the WR data set can be seen in figures 5.5-5.6 below. The resolution used was 150×150×100 (x, y, t). Note that we do not have a dense 3D grid to start with, since the smallest profile interval was 0.5 m. This makes it harder to compare with a ground truth as we could do for the MIRA HDR data set, and interpolating the profiles to get a baseline could be misleading even though it is relatively dense. Therefore we only see a time slice of the real data points after the profiles were binned to a 3D grid. Note also that in the baseline, we can see that the lines are not exactly of equal length and are not perfectly straight. This is different from the MIRA HDR data previously, where the baseline was set and we sliced the grid at varying distances along perfect lines. The parameters used for IDW in this data set were k = 80, d = 3.

By looking closely we can see a pipe-shaped object to the right in the area, and it can be seen more clearly after interpolation.

Figure 5.5: Time slice from the WR data set with a profile distance of 1 m. LERP shows a smearing effect, but the data looks coherent and the pipe was reconstructed well. IDW, OK and TPS look similar to each other, although OK created the least artefacts. FKPOCS reconstructed the pipe surprisingly well considering it had no information about interpolation direction, but suffers from artefacts in the form of ringing effects.

We see a bigger discrepancy between the different methods in this data set than in the MIRA HDR. When the profile distance increases, LERP shows its one-dimensional nature clearly, although it possibly reconstructed the pipe-shaped object the best. Note that, similarly to the MIRA HDR data set, the angle of the pipe aligned well with the interpolation direction, which naturally makes the reconstruction easier for this particular method.

The worst method at reconstructing the pipe was IDW, closely followed by OK and TPS. FKPOCS did a decent job of reconstructing the pipe, especially considering it had no explicit direction of interpolation. The method, however, also creates some severe artefacts, mostly in the form of ringing effects. The method which created the fewest artefacts was OK. Similar behaviours can be seen when increasing the profile distance to 2 m. This can be seen in figure 5.6.

Figure 5.6: Time slice from the WR data set with a profile distance of 2 m. The smearing effects from LERP are extremely clear. The size of the object in the top middle of the area is, for instance, greatly exaggerated in the direction of interpolation. IDW and OK are very similar to each other. TPS greatly under- and overshot some areas, seen as the very dark and very bright spots. FKPOCS did a decent job of not over-smoothing the data but got some severe ringing artefacts.

At this profile distance, the TPS method created excessively large artefacts, which can be seen as the large white and black areas. A common theme for IDW, OK and TPS is that the original data points are clearly visible. All of these methods seem to have an implicit smoothing effect. This is not the case with FKPOCS, however. For better or for worse, it tried harder to extend the details of the original data further from the profiles. The method did create some artefacts, partly in the form of a ringing effect which we also saw in the previous figure.

RMSE and Pearson Correlation Coefficient

In figure 5.7 below we see the results from the objective measures for the WR data set. The immediately noteworthy fact from the graphs is the poor result from TPS. The artefacts viewed in the time slices, where it clearly under- and overshot, unsurprisingly caused the objective measures to highlight this distinctly. OK performed the best, followed by IDW.

Looking at the PCC measure, FKPOCS performed the least well at short profile distances, but caught up to LERP and IDW at large distances and seemed to reach a stagnation point. The other methods kept getting worse with increasing profile distance. Just as for RMSE, OK performed best overall, followed by IDW.

Figure 5.7: The RMSE and PCC from the WR data set. OK performed the best in both measures. TPS performed very poorly at distances above 2 m.

Another noteworthy fact is that we do not reach any stagnation point in the RMSE graph, but rather see the increasing error trend even when going from a profile distance of 2.5 m to 3 m. This tells us that the profiles have not gotten completely independent even after such a large distance.


5.3 Computational costs

Before going into the results, some context must be given to the tests. The most essential aspect to consider in the following graphs is the shape of the curves, followed by the actual values. The main reason is that the actual values depend heavily on the specific implementation, programming language, and hardware. However, the general shape should be more robust to these variables.

Any curve getting steeper with an increasing number of data points (i.e. a time complexity larger than O(n)) is bad news for its applicability to GPR. The number of data points is usually high enough to make such a method impractical. Of course, this depends on the purpose of the specific use case, and a high computational cost could sometimes be worth it given a better interpolation result. It could potentially also be worth downsampling the data set to enable a better interpolation method.

It is important to mention that it is difficult to compare FKPOCS with the other methods regarding computational time. This is in part because it is an iterative method, and its convergence properties depend on more than just the number of data points.

The computation times for each method are seen in figure 5.8. In a) we see from the increasing slopes of OK and TPS that they indicate a time complexity larger than O(n). FKPOCS is, as mentioned earlier, hard to judge in this manner, and on top of that we see an unstable behaviour. This behaviour persisted throughout several different tested data sets, and the causal reason is unclear.

Figure 5.8: The computational cost of each method as a function of data points. In a) we see that OK and TPS show an increasing slope with an increasing number of data points, making them less efficient for large data sets. FKPOCS is difficult to compare with the rest of the methods since it is iterative, and it shows an unstable behaviour. LERP and IDW are best compared in b), which has the number of data points increased by an order of magnitude compared to a). IDW follows a linear trend. The number of data points is too low for LERP to show any sign of linear time complexity; it is unsurprisingly the fastest, considering its simplicity.

The obvious winners are LERP and IDW, which are plotted in b) to visualize their comparative performance. The number of data points is increased by one order of magnitude compared to a) in order to see the scaling better. IDW shows a linear behaviour, making it efficient for large data sets. LERP actually shows constant-time behaviour even at this scale, indicating that it is incredibly fast.


6 Discussion

LERP

Any method based on eq. (3.1), which includes the linear method, could be both beneficial and disadvantageous when considering larger distances. For instance, the reconstruction of the pipe in figures 5.1-5.3 would have been considerably worsened if we had rotated the coordinate system about 45°. However, in cases where we know the angle of the pipe and the main goal of the interpolation is to visualize it, the strategy can be advantageous. This is because we can decide to set the coordinates in line with the pipe and carry out the interpolation in that direction.

Worth noting is that this method causes a smearing effect, which could give a false impression of the size of an object seen in the data. For example, two small independent objects in two profiles next to each other could be joined together by the interpolation and look like a pipe, even though they are not. The risk of two independent objects being joined together worsens with larger distance, since the profiles get increasingly independent. The smearing effect also makes the image look blurry, which makes it difficult to look at and interpret.

IDW

IDW performed surprisingly well considering its simplicity. Its biggest weakness was giving off striped artefacts in line with the profiles, which made them almost seem independent. However, this behaviour was not as prominent at a large profile distance of 2 m, as seen in figure 5.6. This could be because of the parameters, which were set at a large distance. In more dense data, other values of k and d could potentially lessen the artefacts. Perhaps the most efficient way to solve this would be to have the parameters change dynamically based on the neighbourhood. This is similar to how OK works and could be the reason why IDW gives similar results. However, the high computational cost of OK makes it impractical in most cases, and a slightly modified IDW with dynamic parameters could be a decent middle ground.

One potential reason why IDW performed so well in the objective measures is that it never over- or undershoots, meaning that the estimated value is never lower or higher than the lowest or highest value of the nearby points. Since so much information is lost when increasing the distances, this could be a ”safe” strategy for approximating a point.

If the computational cost is essential, IDW is by far the most efficient method tested. It can very quickly give a good approximation of the unobserved data. The biggest weakness of the method is the parameter dependency.

OK

OK performed well in both the visual and objective measures. It performed slightly better than TPS in the MIRA HDR data set and performed the best in the WR data set. However, regarding the computational cost it performed the worst, and given the small advantage in the visual and objective measures, it is hard to justify its use unless computational cost is a non-issue. Another drawback is the need to calculate an empirical variogram, which is another computationally expensive task. On top of that, a function has to be fitted to the calculated variogram. In its defence, it did not seem to be very sensitive to variation of the variogram function parameters. Another positive aspect is that the method produced very few artefacts.

TPS

TPS performed very well in the MIRA HDR data set and was extraordinarily similar to OK visually. However, it could not deal with the WR data set well at all. This is most likely because of the non-perfect transect lines in the data set. TPS tries to create smooth surfaces, and when two very close data points in the grid have different values, the gradient could cause TPS to over- or undershoot when approximating points far away from the original data points. This would explain why it gets worse with larger profile distance. Because of this, TPS should not be used in a context like the WR data set, i.e. after binning sparse data to a grid. TPS did not perform well regarding computational cost either, which essentially means that it is dependent on downsampling to have any practical viability. Therefore an alternative downsampling scheme to binning would be important to justify the use of the method. The biggest advantage of the method is that it has no free parameters.

FKPOCS

Perhaps the most interesting method tested was FKPOCS. It was also the only tested method to take advantage of all three dimensions. It did underperform in the objective measures, but visually it had some unique properties. For example, in the WR data set it reconstructed the pipe second best after LERP, even though it is coordinate independent. It also did a great job of making the data look coherent. This is in particular true compared to IDW, OK and TPS, which smoothed the data and made the original data points stand out. Unfortunately, the coherency came at the cost of distracting artefacts, mainly in the form of ringing effects. The ringing effects were not as severe in the MIRA HDR data set, and the visual results could easily compete with OK and TPS. The artefacts seemed to be more severe at the edges of the data, which could indicate that the source of the artefacts has something to do with boundary conditions. It is very possible that the artefacts could be prevented or minimized by padding the data set.

Regarding computational cost it is, as mentioned in section 5.3, difficult to appropriately compare FKPOCS with the rest of the methods since it is iterative and depends on convergence. For example, when the data get sparse, all the other methods get faster since there are fewer data points in the calculations. However, the opposite is true for FKPOCS: a larger distance between the data points leads to slower convergence. To give a rough idea of the performance, it was performing somewhere between IDW and TPS when running the evaluation scripts. The use of the method could therefore be viable if coherent-looking data is desired, with the caveat that some artefacts are also introduced.

General discussion

Overall, the visual results from both of the data sets gave a good indication of what kind of properties each interpolation method has. For instance, some of the methods smoothed the data more than others. The method with the largest amount of smoothing was OK. This property may be why it performed so well in the objective measures. However, smoothing of the data is not easily classified as either a good or bad property. A good thing about smoothing is that artefacts are not as easily created. It can also be good to see which data points stem from the interpolation and which are from the real data, because it makes it clear what information to trust and what to take with a grain of salt. The worst aspect of smoothing is that the data look less coherent. The risk is also that interesting objects can have their edges smoothed, which makes it harder to separate them from the background. This can be significant, since one of the most important aspects of the interpolation is to reconstruct the shape of objects. This is a reason why the objective measures should not be considered too heavily when judging how good an interpolation method is. Perhaps algorithms that are more inclined to find and reconstruct patterns in the data could prove beneficial. Of all the tested methods, FKPOCS came the closest to this. This behaviour could be especially beneficial when the data get sparse, since looking at the neighbouring points gets more and more meaningless and the large patterns in the data gradually become the most important piece of information.

3D visualization application

The developed application proved to be a very useful tool during the evaluation process of the different interpolation methods. One of its biggest strengths was the ability to quickly visualize a newly collected data set in 3D. The application played a big part when choosing good time slices in the data sets to visualize in the interpolation results (chapter 5). It also proved useful when choosing the resolution of the grid and parameters for the methods, since it was possible to quickly visualize the results after trying a new setting.

References
