
Ditch detection using refined LiDAR data
A bachelor's thesis at Jönköping University

PAPER WITHIN: Computer Science
AUTHORS: Filip Andersson, Jonatan Flyckt
TUTOR: Niklas Lavesson


This work is part of the Bachelor of Science in Engineering programme. The authors take full responsibility for the opinions, conclusions and findings presented.

Examiner: Ragnar Nohre

Supervisor: Niklas Lavesson

Scope: 15 credits

Date: 2019-06-18

Mailing address: Box 1026
Visiting address: Gjuterigatan 5
Phone: 036-10 10 00


Abstract

In this thesis, a method for detecting ditches using digital elevation data derived from LiDAR scans was developed in collaboration with the Swedish Forest Agency.

The objective was to compare a machine learning based method with a state-of-the-art automated method, and to determine which LiDAR-based features represent the strongest ditch predictors.

This was done by using the digital elevation data to develop several new features, which were used as inputs in a random forest machine learning classifier. The output from this classifier was processed to remove noise, before a binarisation process produced the final ditch prediction. Several metrics including Cohen’s Kappa index were calculated to evaluate the performance of the method. These metrics were then compared with the metrics from the results of a reproduced state-of-the-art automated method.

The 95 % confidence interval for the population Cohen's Kappa metric was calculated to be [0.567, 0.645]. Features based on the Impoundment attribute derived from the digital elevation data were overall the strongest ditch predictors. Our method outperformed the state-of-the-art automated method by a wide margin. This thesis shows that it is possible to use AI and machine learning with digital elevation data to detect ditches to a substantial extent.

Keywords

Machine learning - Geographic information systems - Classification and regression trees - Supervised learning by classification


Acknowledgements

We thank Professor Niklas Lavesson for guidance and instructions during the course of this thesis. We also wish to thank Liselott Nilsson and everyone else at the Swedish Forest Agency for providing the data and the thesis idea, as well as a lot of help along the way. Thank you also to Dr Anneli Ågren at the Swedish University of Agricultural Sciences for input and advice, as well as the labelled data for this project. Lastly, thank you to Sigurd Israelsson at the School of Engineering at Jönköping University for lending hardware resources for the experiments in this study.


Contents

1 Introduction
2 Context
  2.1 Available data
  2.2 Current situation
3 Aim and scope
4 Background
  4.1 Supervised learning
  4.2 Related work using machine learning
  4.3 Random forests and decision trees
  4.4 Gini importance
5 Evaluation
6 Approach
  6.1 Experiment design
  6.2 Experimental methodology
  6.3 Data preparation
    6.3.1 Training and validation data
    6.3.2 Defining ditches in raster format
  6.4 Reproducing the Whitebox method
  6.5 Feature engineering
    6.5.1 General features
    6.5.2 Custom features
  6.6 Model configuration
  6.7 Post-processing
    6.7.1 Noise reduction and gap filling
    6.7.2 Binarisation with zones
    6.7.3 Cluster removal
7 Results and analysis
  7.1 Experimental results
  7.2 Analysis
8 Discussion
  8.1 Strengths
  8.2 Weaknesses and limitations
  8.3 Comparison to state-of-the-art
  8.4 General discussions
9 Conclusions and future work
References
Appendices
  Appendix A Source code
  Appendix B All feature importances
  Appendix C Individual zone results


1 Introduction

The Swedish Forest Agency (SFA) is a government agency charged with various forest issues, as well as with enforcing the forest policy goals set by the Swedish Parliament (Skogsstyrelsen, 2018). The SFA wishes to map ditches across Sweden. Ditch mapping is important as a foundation in issues such as forest production, nature consideration and infrastructure (L. Nilsson, personal communication, 9 January, 2019). It is also important in water quality assessment, as ditches can be contaminated by chemicals in their surface runoff and shallow subsurface flow, which can cause groundwater contamination and lead to changes in the nature of water flow (Carluer & Marsily, 2004; Dages et al., 2009).

Methods for manual mapping of ditches have high precision, but they incur significant effort and costs (L. Nilsson, personal communication, 9 January, 2019). A high precision manual mapping has been performed in an area called Krycklan in Västerbotten, Sweden (L. Nilsson, personal communication, 9 January, 2019). The automated methods previously applied, however, have been too imprecise for ditch mapping (Svensk, 2016), because the Light Detection and Ranging (LiDAR) measurements are sometimes disrupted by trees, bushes, or grass. These automated methods can therefore leave gaps in the digital elevation model, causing ditches to be incorrectly mapped (Cho, Slatton, Cheung, & Hwang, 2011; Roelens, Höfle, Dondeyne, Van Orshoven, & Diels, 2018). Some studies have achieved good results using automated algorithms configured to fill in cavities in the model without utilising machine learning. However, these algorithms can be hard to tune correctly, and they can still suffer from the disrupted LiDAR scans (Cho et al., 2011).

In light of this, the SFA wishes to examine the possibilities of using artificial intelligence (AI) and machine learning to identify ditches. As this is an open problem with many challenges and no apparent solution, it has given us a unique opportunity to expand the SFA's knowledge of machine learning. The aim of this thesis has not been to give a full solution to the SFA's machine learning goals, but to form a basis on which future work can build. The focus has been on investigating whether a machine learning method can perform better than an established method, as well as determining which data attributes are most relevant for a correct ditch prediction. To explore this, experiments have been conducted with a supervised learning algorithm. Supervised learning has been used in previous studies of geographical data with good results (Gislason, Benediktsson, & Sveinsson, 2006; Roelens et al., 2018; Stanislawski, Brockmeyer, & Shavers, 2018; Sverdrup-Thygeson, Ørka, Gobakken, & Næsset, 2016).


2 Context

A geographic information system (GIS) is a system for storing, analysing, retrieving, and displaying geographical data (Mulik, 1999). This data can be either spatial or attribute data represented on maps or in tables in software systems (Mulik, 1999). One type of GIS data is digital elevation data, which represents elevation differences of geographic surfaces with many different data attributes in a raster format (Esri, 2017).

2.1 Available data

The data available for the work in this study are all attributes originally derived from LiDAR data. This LiDAR data has first been refined into a digital elevation model (DEM). The DEM has then been processed with the Whitebox software (Lindsay, J. B., 2016). All the data attributes are in a raster format with a resolution of 0.25 m² per pixel (L. Nilsson, personal communication, 9 January, 2019). The following data attributes have been extracted from this process:

• Digital Elevation Model
The DEM attribute represents the elevation in metres above sea level.

• Sky View Factor
The Sky View Factor attribute represents, with a value ranging from 0 to 1, how much of the sky is visible from a certain point on the ground (Zakšek, Oštir, & Kokalj, 2011).

The Sky-view factor is defined by the part of the visible sky (Ω) above a certain observation point as seen from a two-dimensional representation (a). The algorithm computes the vertical elevation angle of the horizon γi in n directions to the specified radius R (b). (Zakšek et al., 2011)

See Figure 1 for a visual representation.

• Impoundment index
An attribute representing the size of the impoundment that would be created by inserting a dam at a location in the DEM. The grid cells of the DEM, together with the specified length and height of the dam, produce the Impoundment index. This index is mainly used for mapping drained wetlands with LiDAR data. (Lindsay, J. B., 2016)

The Impoundment index used in this thesis was developed specifically for the Krycklan area (A. Ågren, personal communication, 7 May, 2019).


• High Pass Median Filter - smoothed

High Pass Median Filters (HPMF) are used to help humans and algorithms focus attention on important small-scale features in images. For LiDAR data, the filter can eliminate random variations in laser pulse energy, as well as the effects of attenuation. The filter takes the median value of a neighbourhood of values and replaces all the neighbourhood values with this median value to help smooth out the data. (Mayor, 2007)

• Slope

The Slope attribute represents the degree of slope at a certain point, with a value ranging from 0° to 90°. This attribute contains no information about the direction of the slope.

Figure 1. Sky View Factor illustration (Zakšek et al., 2011).

2.2 Current situation

The SFA has provided refined digital elevation data of the Krycklan area. This area covers approximately 10,000 hectares, and the data is refined from LiDAR data with a very high resolution of an average of 20 laser pulses per m² (Erixon, 2015). The data comes from a scan ordered by the Swedish University of Agricultural Sciences (SLU). This detailed data is currently only available for Krycklan and one other area, while the LiDAR data for the rest of Sweden comes from the Swedish national laser scan and has a lower resolution (Lantmäteriet, 2018). Of the 10,000 hectares in Krycklan, 4,100 hectares were used in our study. The current manual mapping was developed at SLU with the use of a height model and orthophoto, by manually placing ditches as vectors on a two-dimensional map (A. Ågren, personal communication, 25 January, 2019). This detailed mapping has been provided to us by the SFA for the Krycklan area. Although this manual mapping is precise, it requires vast amounts of labour and time resources (L. Nilsson, personal communication, 9 January, 2019). Because of this, it was relevant to explore the possibility of a more automated solution. The current automated solutions can detect parts of ditches, but fail where the LiDAR data has been disrupted by various blocking factors (Svensk, 2016). These disruptions may leave gaps in the ditch model, which need to be manually studied and mapped by a human. The automated mappings also incorrectly classify non-ditch areas as ditches too frequently. One automated study, Delineation of Ditches in Wetlands by Remote Sensing, was performed by Gustavsson and Selberg (2018). In their study, the Whitebox software (Lindsay, J. B., 2016) was used to produce a ditch mapping from the data described in section 2.1. Their method was used as a comparison with the method produced in our study.

3 Aim and scope

With the manually mapped ditches in Krycklan as training and validation data, we have investigated whether it is possible to use a supervised learning algorithm to classify ditches in this area. The SFA has never previously attempted to classify ditches using machine learning (L. Nilsson, personal communication, 9 January, 2019). A comparison was also made between our classifier and the method used by Gustavsson and Selberg (2018). In addition to this, information was collected about which data features contribute the most to ditch predictions.

Two research questions were formed to answer these queries:

• Does the proposed approach detect ditches more accurately than that of Gustavsson and Selberg (2018)?

• Which LiDAR-based features represent the strongest ditch predictors?

We captured various aspects of detection performance through the following measures: accuracy, precision, recall, Cohen's Kappa index, and the area under the precision-recall curve (AUPRC).

We did not make use of the original LiDAR data, but instead only the refined data attributes described in section 2.1. We also only focused on one supervised learning algorithm, and did not make any comparisons between several different machine learning methods. The experiments were performed on a 4,100 hectare area of northern Sweden, which is only roughly 0.009 % of the total area of Sweden. The available data has a resolution of 0.25 m² per raster point (pixel). The data derived from the Lantmäteriet national laser scan has a resolution of 4 m² (Lantmäteriet, 2018). The higher quality resolution and relatively small study area may mean that our results are not generalisable for the entirety of Sweden.

The study aimed to find all the ditches in the study area, but we did not seek any information about the shape or form of individual ditches. The ditches all have a generalised width based on a perceived average ditch width in the area. No information was sought about the depth or water flow of ditches. Both forest and road ditches exist, but we did not separate these two classes of ditches in any way. No focus was put on the computational efficiency of the method.

4 Background

According to Flach (2012), machine learning is the science of systems that improve their performance by using previous experience. Flach defines two phases of a given machine learning implementation: the learning problem and the task. Both phases require inputs, and these inputs are defined by features. Features consist of refined or raw data from a set of data attributes. A learning algorithm uses these features to produce a model by mapping how the features correlate. The task is the so-called black box part, where feature data is sent to a model to produce an output (Flach, 2012). A black box method is a method where the user enters an input and receives an output, but has no insight into why the algorithm makes its prediction (Guidotti et al., 2018). Flach (2012) summarises the machine learning process in the following way: "Machine learning is concerned with using the right features to build the right models that achieve the right tasks." There are different categories of tasks that can be solved with machine learning, e.g. classification, regression, or clustering. In this study, we focused on modelling ditch detection in a way that enabled us to apply classification to solve the detection problem.

4.1 Supervised learning

Supervised learning is a branch of machine learning where labelled data exists and can be used to map features and train the model (Kotsiantis, 2007). Furthermore, supervised learning works by using algorithms that can produce general hypotheses from externally supplied occurrences (Kotsiantis, 2007). These hypotheses can make predictions about future occurrences by building a model where the distribution of classification labels is mapped to correlating features. The model can then be used to label and classify data where the features are known beforehand, but the classification label is unknown.


Kotsiantis (2007) defines several steps in the process of developing a classifier with supervised learning, see Figure 2. The first steps involve identifying and pre-processing the input data, as well as defining the labels in the training set. When this is done, a suitable algorithm is selected. The last step involves using the algorithm on the data to train the classifying model. This step can consist of many iterations where the input parameters are tweaked to produce the best possible model. Because the SFA provided a detailed mapping of ditches over Krycklan, and the data is labelled, it was possible to apply supervised learning in this study.


Figure 2. The steps involved in developing a supervised learning classifying model (Kotsiantis, 2007).

4.2 Related work using machine learning

There are several studies where geographical data has been used to structure models of behaviour and patterns in geographical areas. Sverdrup-Thygeson et al. (2016) used digital elevation data in their study of classifying old near-natural forests versus old managed forests. Their data consisted of digital elevation data refined from LiDAR scans. Three algorithms were compared: generalised linear model, boosted regression trees, and random forest. They concluded that the difference in performance between the algorithms was not statistically significant, indicating that many approaches can be taken when classifying this type of data. Another study using supervised learning with digital elevation data is Random Forest for land cover classification (Gislason et al., 2006). They concluded that random forest is very desirable for multisource classification of remote sensing data where no statistical models are available.

Stanislawski et al. (2018) used a deep learning convolutional neural network method in their study to extract road and drainage valley features from digital elevation data in Iowa, USA. They used labelled data of roads and stream valleys and a DEM derived from LiDAR data to train their model. Stanislawski et al. (2018) used a three metre widening process to produce a raster model from the vectors of the road and stream valley networks. Their study indicates that deep neural networks can be used to detect drainage features from digital elevation data. However, deep neural networks have the downside of requiring vast amounts of processing power (Jaderberg, Vedaldi, & Zisserman, 2014).

Roelens et al. (2018) used random forest to detect ditches in Belgium. They used raw LiDAR data with a resolution of at least 16 laser pulses per m². They managed to attain an overall classification accuracy between 91 and 99 %, and a ditch classification true positive rate (recall) of between 67 and 89 %. Many different features were used, where neighbouring points of a specific LiDAR point were examined. Values of all LiDAR attributes were represented for the neighbouring area for radii of 0.5, 1, 1.5, 2, 3 and 4 metres. Reproducing these neighbouring-area features was of interest when building the model in our study. Roelens et al. (2018) used a Gini importance evaluation to determine that the most important of these neighbouring-area features were the two and four metre radius values.

There was a plethora of possible algorithm choices for the task that this thesis deals with. Several of them (generalised linear model, boosted regression trees, and random forest) can also produce variable weights, highlighting what features are the most important for a given prediction (Sverdrup-Thygeson et al., 2016). Because random forest proved to be suitable for Roelens et al. (2018), it was chosen for our ditch detection problem.

4.3 Random forests and decision trees

Model ensembles is a category of machine learning where a set of models can be combined to increase the diversity and robustness of predictions (Flach, 2012). Random forest is a supervised learning algorithm, first developed by Breiman (2001), which makes use of the model ensemble technique by using a set of decision trees to build a forest of trees. The outputs from these trees are then examined, and a majority vote can be used to produce the final classification output for a specific input (Breiman, 2001). Random forest can also produce a class probability estimation ranging from 0 to 1 for each sample, by dividing the number of tree outputs of a class by the total number of trees.

A decision tree is a data structure where each step down in the tree aims to minimise the entropy of the outputs (Kotsiantis, 2007). Each node in a decision tree represents a feature or an abstraction of features from the inputs, and each split is made by using this feature to separate the inputs. The end node of each tree is called a leaf, and each leaf produces an output value. Figure 3 and Table 1 show a generic decision tree and its data.


Figure 3. An example of a decision tree. Each node splits the data set based on certain feature attributes (Kotsiantis, 2007).

Table 1
The mock data used in Figure 3

at1   at2   at3   at4   Class
a1    a2    a3    a4    Yes
a1    a2    a3    b4    Yes
a1    b2    a3    a4    Yes
a1    b2    b3    b4    No
a1    c2    a3    a4    Yes
a1    c2    a3    b4    No
b1    b2    b3    b4    No
c1    b2    b3    b4    No

One way to increase the diversity of a model ensemble is to use a technique called bootstrap aggregation (bagging), which was developed by Breiman (2001). Bagging takes, with replacement, a set of different random samples for constructing each tree (Flach, 2012). Bagging is particularly useful in constructing decision trees, due to their inherent sensitivity to variations in the data.


Random forest makes use of bagging, as well as subspace sampling. Subspace sampling, developed by Ho (1995), is the process of drawing a random subset of features from the feature set. Using this subspace sampling of features in tandem with bagging helps to produce an even more robust and diverse ensemble, and this is the method referred to as random forest (Flach, 2012). According to Breiman (2001), this reduces generalisation and overfitting errors that other classification algorithms can give rise to.

To determine which split is the best at each node in a tree, random forest can either seek the best reduction of entropy in the training dataset, or use a Gini criterion, which tries to produce pure nodes by sending all data in the largest class to the next node (Breiman, 1996). There are other hyperparameters that can be tuned to achieve better results from the model. For instance, the number of decision trees, or the maximum number of features to use for each subspace sampling, can be adjusted (Pedregosa et al., 2011).

4.4 Gini importance

The random forest algorithm can produce variable weights using Gini importance. Gini importance shows how often a feature is selected for a split for a certain classification (Menze et al., 2009). Menze et al. (2009) explain that the Gini importance can be obtained without much extra processing power, as it is a by-product of the actual training of the classifier. Each node in every randomised tree in the forest is examined, and each feature is given a Gini impurity rating. This impurity rating measures the frequency of incorrect labelling when using a randomly selected feature. As the Gini impurity rating for a feature increases, its Gini importance decreases (Menze et al., 2009). In the final output from the algorithm, the Gini importance for each feature is summed over the trees and presented as a continuous value between 0 and 1. A high value indicates that a feature is important when making a prediction (Menze et al., 2009). The Gini importance can help when developing a model by indirectly giving advice on new features to introduce. The Gini importance was also relevant for answering our second research question in section 3.
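As an illustration of the mechanism described above, the following is a minimal sketch (not the thesis code) of how Gini-based feature importances can be read from a fitted scikit-learn random forest. The feature names and data are hypothetical placeholders.

```python
# Minimal sketch: Gini-based feature importances from a fitted random forest.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

feature_names = ["impoundment_mean_3", "hpmf_mean_4", "slope_std_6"]  # hypothetical
X = np.random.rand(1000, len(feature_names))   # stand-in for per-pixel features
y = np.random.randint(0, 2, size=1000)         # stand-in for ditch / non-ditch labels

forest = RandomForestClassifier(n_estimators=100, criterion="gini", random_state=0)
forest.fit(X, y)

# feature_importances_ sums to 1; a higher value means the feature contributed
# more to reducing Gini impurity across all trees.
for name, importance in sorted(zip(feature_names, forest.feature_importances_),
                               key=lambda pair: pair[1], reverse=True):
    print(f"{name}: {importance:.3f}")
```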


5 Evaluation

The following data was gathered to evaluate the prediction, where positive represents a ditch and negative represents a non-ditch:

• Number of true positives (TP)
• Number of false positives (FP)
• Number of true negatives (TN)
• Number of false negatives (FN)

In An introduction to ROC analysis, Fawcett (2006) explains that the outcome of the prediction can be presented in a confusion matrix to represent the disposition of the set of instances.

The following metrics were extracted from the confusion matrix:

• Recall (true positive rate)
Positives correctly classified in relation to all actual positives.

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

• Precision
Positives correctly classified in relation to all classified positives.

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

• Accuracy
Correct classifications in relation to all attempted classifications.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FN + FP}$$

Using only accuracy as an evaluation metric when dealing with an imbalanced dataset (roughly 98 % of all occurrences are non-ditch) would produce a poor performance assessment (Spelmen & Porkodi, 2018). By simply classifying all pixels as non-ditches, one would by default attain 98 % accuracy. For this reason, the results were also evaluated using the Cohen's Kappa (Cohen's κ) index. The Cohen's κ index measures how much better a prediction is compared to a prediction based purely on chance, where chance would yield a value of zero (Sim & Wright, 2005). With our data, a κ value of roughly zero would be attained by predicting 2 % of the occurrences as ditch pixels completely at random.

The chance rating $P_c$ of a prediction of $n$ occurrences is calculated with:

$$P_c = \frac{\frac{(TP+FN)\cdot(TP+FP)}{n} + \frac{(FN+TN)\cdot(FP+TN)}{n}}{n}$$

Cohen's κ is then calculated as a value between -1 and 1 with:

$$\kappa = \frac{\mathrm{Accuracy} - P_c}{1 - P_c}$$

Values above zero are better than chance and values below zero are worse than chance. Landis and Koch (1977) suggest the benchmarks seen in Table 2 as a general measurement of how good a prediction’s κ rating is.

Table 2
κ analysis

κ value        Performance strength
< 0.00         Poor
0.00 – 0.20    Slight
0.21 – 0.40    Fair
0.41 – 0.60    Moderate
0.61 – 0.80    Substantial
0.81 – 1.00    Almost perfect

Note. Landis and Koch (1977) suggested these benchmarks to judge the performance strength of a classifier using Cohen's κ rating.

The precision-recall curve and the area under the precision-recall curve (AUPRC) are additional metrics that can be used when evaluating datasets with a largely imbalanced class distribution (Fu, Yi, & Pan, 2019). The precision-recall curve has the recall value on the x-axis and the precision value on the y-axis, and the area under the curve traced by these points gives the AUPRC value. The area under this curve is given as a value between zero and one, where a value closer to one is better. The weighting causes the precision-recall curve not to place equal value on true negatives and true positives (Fu et al., 2019). For our ditch detection problem, this means that the AUPRC evaluation metric favours accurately classifying ditch pixels over accurately classifying non-ditch pixels.
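To make the metrics above concrete, here is a minimal sketch (assumed, not the thesis code) of computing them with scikit-learn from flattened per-pixel labels and predictions. The arrays are hypothetical placeholders.

```python
# Minimal sketch: evaluation metrics for an imbalanced ditch / non-ditch problem.
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             cohen_kappa_score, average_precision_score)

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=10_000)     # 1 = ditch, 0 = non-ditch (placeholder)
y_prob = rng.random(10_000)                  # ditch probability from the model (placeholder)
y_pred = (y_prob >= 0.5).astype(int)         # binarised prediction

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("Kappa    :", cohen_kappa_score(y_true, y_pred))
# average_precision_score summarises the area under the precision-recall curve.
print("AUPRC    :", average_precision_score(y_true, y_prob))
```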


Confidence intervals can be calculated from several different predictions. This can be preferable to the more traditionally used significance levels, because confidence intervals do not merely reject or accept a hypothesis (Gardner & Altman, 1986). A confidence interval is given as a range between two values within which, with some level of confidence, the true mean of the population lies. Commonly in research, intervals with a confidence level of 90, 95 or 99 % are used (Gardner & Altman, 1986). The confidence interval is calculated with:

$$\left[\ \bar{x} - \lambda_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}\ ,\ \bar{x} + \lambda_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}\ \right]$$

where $\bar{x}$ is the mean of all the samples and $n$ is the number of samples, and $\lambda_{\alpha/2}$ can be obtained from a z-value table, where $\alpha$ represents 1 minus the confidence level. $\sigma$ is the standard deviation of the samples, given by:

$$\sigma = \sqrt{\sum_{i=1}^{N} (x_i - \mu)^2 \cdot P(\xi = x_i)}$$

where $P(\xi = x_i)$ is the probability of occurrence of a value from the set of values, and $\mu$ is the expectation, given by:

$$\mu = \sum_{i=1}^{N} x_i \cdot P(\xi = x_i)$$

(Vännman, 2002)
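As a worked illustration of the formulas above, the following is a minimal sketch (assumed, not the thesis code) of a 95 % confidence interval computed from hypothetical per-zone κ values, treating each zone result as an equally probable sample.

```python
# Minimal sketch: confidence interval from per-zone metric values (illustrative numbers).
import math

kappa_per_zone = [0.58, 0.62, 0.65, 0.55, 0.61, 0.64, 0.60, 0.57, 0.66, 0.59, 0.63]  # hypothetical
n = len(kappa_per_zone)
mean = sum(kappa_per_zone) / n
# Equal-probability samples, so the expectation-based standard deviation reduces to this form.
sigma = math.sqrt(sum((x - mean) ** 2 for x in kappa_per_zone) / n)

z = 1.96  # z-value for a 95 % confidence level
half_width = z * sigma / math.sqrt(n)
print(f"95 % CI: [{mean - half_width:.3f}, {mean + half_width:.3f}]")
```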

6 Approach

The study was initiated with a review of previous research in the areas of machine learning and GIS. The information was then used to determine which algorithms and methods were the most suitable for the following phase of the study. The last phase consisted of an experiment designed to answer the research questions of the thesis. Based on this review, a hypothesis was also formulated for the experiment.

6.1 Experiment design

The first phase of the experiment involved reproducing an automated method for ditch detection that does not use machine learning. The method chosen for this comparison is from Delineation of Ditches in Wetlands by Remote Sensing by Gustavsson and Selberg (2018). In that study, the Whitebox software (Lindsay, J. B., 2016) was used on a DEM to determine how well different data attributes could be used to detect ditches in raster and polyline formats. Since two of the data attributes (Sky View Factor and Impoundment index) were also available in our dataset, reproducing this method on the Krycklan area produced a good comparison with our model. The second experiment phase involved feature engineering and developing post-processing for the random forest model. The third phase involved evaluating the output from the model and determining the importances of the features used. Lastly, the results from the different methods were compared and analysed to determine how they differed.

6.2 Experimental methodology

The random forest implementation from the Python library scikit-learn was used in the experiment. This library allows the trees to be split with either a Gini or an entropy criterion. The probability prediction in this random forest implementation simply divides the number of trees that output a class by the total number of trees in the forest (Pedregosa et al., 2011).

The independent variables of the experiment were the feature inputs of the learning algorithm, as well as the number of trees and other configurations of the random forest algorithm. The dependent variable was the raster output classified either as ditch or non-ditch. (Zobel, 2015)

To answer the research question ”Does the proposed approach detect ditches more accurately than that of Gustavsson and Selberg (2018)?”, the following hypothesis for the experiment was formulated:

The method proposed in this study outperforms the method by Gustavsson and Selberg (2018) in ditch detection, with respect to Cohen’s Kappa index.

6.3 Data preparation

6.3.1 Training and validation data

To develop and evaluate our model, the raster and ditch label data of Krycklan were manually divided into 21 smaller subsections. From this division, 11 of the subsections were put aside as hold-out data to evaluate the performance of the predictions, and 10 zones were used in the development of the model. This allowed the model to be evaluated on unseen data, strengthening the validity of the experiment. Each zone represents an area of roughly 196 hectares. Figure 4 shows which zones were used for development and evaluation respectively.

Figure 4. Krycklan's location in Sweden. Red zones were used for developing the model and green zones were used for evaluation. Each zone represents 2997 × 2620 pixels (roughly 196 hectares).


With the 11 zones in the hold-out data for the final random forest experiment, a process called leave-one-out cross validation was used. Leave-one-out cross validation is a method where you train a model on all but one of your occurrences, and use that occurrence to evaluate the results (Wong, 2015). Using this technique allowed us to train 11 different random forest classifying models with a large amount of data, and evaluate each model once on a single zone, producing 11 sub-experiments to evaluate the method on.
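The structure of this procedure can be sketched as follows. This is an assumed outline, not the thesis code: synthetic arrays stand in for the per-zone feature matrices and labels, and one model is trained per held-out zone.

```python
# Minimal sketch: leave-one-out cross validation over 11 hold-out zones.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Hypothetical stand-in: per-zone feature matrices and ditch labels.
zones = {z: (rng.random((500, 5)), rng.integers(0, 2, 500)) for z in range(11)}

predictions = {}
for held_out in zones:
    X_train = np.vstack([zones[z][0] for z in zones if z != held_out])
    y_train = np.concatenate([zones[z][1] for z in zones if z != held_out])
    X_test, y_test = zones[held_out]

    model = RandomForestClassifier(n_estimators=200, class_weight="balanced", random_state=0)
    model.fit(X_train, y_train)
    # Ditch-class probability for every pixel in the held-out zone.
    predictions[held_out] = model.predict_proba(X_test)[:, 1]
```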

6.3.2 Defining ditches in raster format

The digital elevation data from the SFA was represented in a raster format, whereas the ditches from SLU were represented as vectors. These vectors contain no information about the width of ditches. To label each individual pixel as either ditch or non-ditch, a conversion from vector to raster format was performed. Because the observed average width of ditches is larger than 0.5 metres, all pixels within a radius of three pixels (1.5 metres) of the vectors were labelled as ditch pixels. Figure 5 A shows the ditches rasterised from vectors and B shows the ditches after widening. The data in Figure 5 B is the labelled data that was used to train the random forest model. A similar approach was taken by Stanislawski et al. (2018) in their study of roads and stream valleys. Due to all ditches varying in width, it was not possible to produce a perfect representation of each ditch. However, this made for a good compromise for the average ditch.

Since our aim was to detect ditches, and not each individual pixel labelled as a ditch, some adjustments were made when evaluating the prediction results. The dataset was divided into a lower-resolution grid, where each grid cell covers six by six pixels (9 m²). Each grid cell containing at least 25 % ditch pixels was then labelled as a ditch. A similar method was used by Stanislawski et al. (2018). See Figure 5 C for a visual representation of these grid zones.
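The two label-processing steps described above can be sketched as follows. This is an assumed illustration, not the thesis code; the small raster and the ditch line are placeholders.

```python
# Minimal sketch: widen one-pixel rasterised ditch vectors by a 3-pixel radius,
# then label a coarser 6x6-pixel grid where at least 25 % of the pixels are ditch.
import numpy as np
from scipy import ndimage

ditch_raster = np.zeros((60, 60), dtype=bool)
ditch_raster[30, 5:55] = True                      # stand-in for a rasterised ditch vector

# Widening: mark every pixel within a 3-pixel (1.5 m) radius of the vector.
yy, xx = np.mgrid[-3:4, -3:4]
disk = (xx**2 + yy**2) <= 3**2
widened = ndimage.binary_dilation(ditch_raster, structure=disk)

# Grid labelling: a 6x6 cell becomes a ditch cell if at least 25 % of its pixels are ditch.
h, w = widened.shape
cells = widened[:h - h % 6, :w - w % 6].reshape(h // 6, 6, w // 6, 6)
grid_labels = cells.mean(axis=(1, 3)) >= 0.25
```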


Figure 5. Processing of ditch labels.

A: Rasterised ditches with a width of one pixel.

B: Ditches after a widening process, seven pixels (3.5 metres) wide. Used as input when training the model.

C: Ditches after a zone conversion. Used for evaluating the results from the model.

6.4 Reproducing the Whitebox method

In Delineation of Ditches in Wetlands by Remote Sensing (Gustavsson & Selberg, 2018), the workflow for ditch detection consisted of a reclassification to remove noise and to define the limits of what to classify as a ditch. The raster data was then imported into ArcMap (Esri, 2017) to convert the raster to vectors (Gustavsson & Selberg, 2018). We only reproduced the reclassification step, as the results needed to be in raster format in order to compare them with our model. The workflow that Gustavsson and Selberg (2018) used for ditch detection is presented below.

• Sky View Factor

The Sky View Factor data has a value between zero and one. The data was binarised to only include values below 0.989. To remove large waterbodies, Gustavsson and Selberg (2018) created a buffer of six metres around polygons of waterbodies. These were converted to pixels and excluded in the result (Gustavsson & Selberg, 2018). Since we had no available data on waterbodies, we could not remove them from the prediction.


• Impoundment Index

The dams constructed in Whitebox (Lindsay, J. B., 2016) were four by four metres in size. After running the impoundment tool, the data was binarised to remove values with a water accumulation below 30 m3. This was done to remove flat areas, but still maintain the pixels with a large water accumulation. (Gustavsson & Selberg, 2018)

6.5 Feature engineering

Developing the random forest model involved examining how different kinds of features affected the prediction. Several possible data manipulation methods could theoretically produce a better prediction. An issue with the previously used automated methods is that they do not correctly detect ditches where the LiDAR has been interrupted by bushes or trees. To combat this, steps were taken to include neighbouring pixels, giving a representation of the area surrounding a specific pixel. A similar approach was taken by Roelens et al. (2018), and it produced positive results in their study.

6.5.1 General features

The features used for training the model (Sky View Factor, Impoundment index, Slope, High Pass Median Filter) are all derivatives of the digital elevation data. These raw features provided a satisfactory foundation for the model, but lacked generalisability in their predictions. More diverse features were extracted using simple statistical aggregates such as mean, median, min, max, and standard deviation. This facilitated finding irregularities in the neighbouring areas around pixels. These features were calculated by gathering all data points within circular radii of different sizes around the studied pixel, before performing one of the statistical aggregations. See Figure 6: B, C, H, and J for graphical representations of some of these features.
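The neighbourhood aggregation described above can be sketched as follows. This is an assumed illustration, not the thesis implementation; generic_filter is slow but keeps the idea of a circular footprint explicit.

```python
# Minimal sketch: per-pixel statistics over a circular neighbourhood of a given radius.
import numpy as np
from scipy import ndimage

attribute = np.random.rand(200, 200)        # stand-in for e.g. the Slope raster

radius = 4
yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
footprint = (xx**2 + yy**2) <= radius**2    # circular neighbourhood mask

# Per-pixel mean and standard deviation over the circular neighbourhood.
neigh_mean = ndimage.generic_filter(attribute, np.mean, footprint=footprint)
neigh_std = ndimage.generic_filter(attribute, np.std, footprint=footprint)
```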

6.5.2 Custom features

Several custom features were also developed in addition to the general features, attempting to specifically target and enhance ditches as well as non-ditches. These will be presented as follows.

The Sky View Factor conic filter uses the Sky View Factor attribute to detect and fill gaps in ditches. This was done by taking the mean of all the pixels covered by a cone-shaped mask, which expands outwards from each examined pixel point. The mean was calculated in eight directions from each pixel, with a radius of 10 pixels outward. If the mean values from two opposing masks were both below a threshold, the pixel was updated with the lowest of these values. This meant that only pixels with strong ditch-indicative values in two opposing directions were updated. This allowed the filter to avoid updating pixels that lay close to cavities or hollows, and to only focus on linear geographical properties. However, this also meant that geographical properties such as streams were amplified as well.

The Impoundment Ditch Amplification feature uses the Impoundment attribute to amplify ditches by using thresholds and classifying pixels that usually indicate ditches with increasing values. Means and medians were used to eliminate noise, and to produce a smoother ditch representation. See Figure 6: K for a graphical representation of this feature.

Similar to the aforementioned Impoundment Ditch Amplification, the HPMF ditch amplification feature classifies pixels based on their likelihood of lying in a ditch. Values were smoothed with medians and means of different radii before receiving another reclassification based on ditch likelihood. A mean was taken one more time to smooth out the reclassified data. See Figure 6: E for a graphical representation of this feature.

The Sky View Factor non-ditch amplification feature amplifies pixels which are not ditches. This aims to help the model exclude hills and streams, which generally leave a deeper impression on the Sky View Factor attribute than ditches do. This observation was used to amplify pixels that exceeded a certain threshold. This feature still misses many stream pixels, and sometimes also picks up pixels from particularly deep ditches. See Figure 6: I for a graphical representation of this feature.

The Slope non-ditch amplification has the same goal as the Sky View Factor-based filter, but uses different thresholds and is based on the Slope attribute instead. This more aggressive filter picks up a much higher percentage of hills and streams, with the downside of sometimes covering ditches as well.

The DEM ditch amplification feature was extracted from the DEM, where differences in elevation of local areas were calculated. Pixels that lay at a lower altitude than the average of a 15 metre radius circle around the examined pixel were marked out, before a morphological grey closing was performed to remove noise from the feature.

A Gabor filter is an image processing filter that can be used to detect lines of a certain orientation in an image (Hong, Wan, & Jain, 1998). A set of 30 Gabor filters, rotated to different angles and with different frequencies, was used to detect lines in all directions. The filters from this set were then combined to amplify ditches. These filters were used to create features from both the HPMF and Sky View Factor attributes. See Figure 6: D and G for graphical representations of these features.

The raw Impoundment feature was used to create a mask, attempting to retain ditches but mark out streams. This was done by using a threshold on the Impoundment index that only marked out areas with a relatively large impoundment, which would indicate that these areas contained streams. After widening the resulting area, this mask was used to remove streams from all the aforementioned custom features, generating one new feature from each. See Figure 6: F and L for graphical representations of features that make use of this mask.
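The Gabor filter bank described above can be sketched as follows. This is an assumed illustration, not the thesis implementation: the orientations, frequencies, and the way responses are combined (per-pixel maximum) are placeholders chosen for the example.

```python
# Minimal sketch: a bank of Gabor filters over several orientations and frequencies,
# combined by taking the strongest per-pixel response to amplify linear structures.
import numpy as np
from skimage.filters import gabor

image = np.random.rand(128, 128)            # stand-in for an HPMF or Sky View Factor raster

orientations = np.linspace(0, np.pi, 10, endpoint=False)   # 10 directions
frequencies = [0.1, 0.2, 0.3]                               # 3 frequencies -> 30 filters

response = np.zeros_like(image)
for theta in orientations:
    for freq in frequencies:
        real, _imag = gabor(image, frequency=freq, theta=theta)
        # Keep the strongest response over all orientations and frequencies.
        response = np.maximum(response, np.abs(real))
```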


Figure 6. Example of 11 of the 81 features for a small sample area, in addition to ditch labels, used by the model:

A: Labelled ditches, B: Slope standard deviation, radius 6, C: HPMF mean, radius 4, D: HPMF Gabor, E: HPMF ditch amplification, F: HPMF ditch amplification - streams removed, G: Sky View Factor Gabor, H: Sky View Factor max, radius 6, I: Sky View Factor non-ditch amplification, J: Impoundment mean, radius 3, K: Impoundment ditch amplification, L: Impoundment ditch amplification - streams removed


6.6 Model configuration

The random forest classifier was trained on all the features seen in Table 3. The testing phase showed that the classifier produced poor results when the ratio of non-ditch to ditch pixels in the training data was very high. A high ratio led to the model not being punished for mislabelling ditches as non-ditches, causing it to prioritise a high accuracy over a high recall. According to Spelmen and Porkodi (2018), an imbalanced dataset causes a minority class to receive a reduced accuracy. As the ditch class is much less common than the non-ditch class, this needed to be addressed when training our model. To balance the model, we attempted to train it with a roughly equal number of ditch pixels and non-ditch pixels.

The first step to create a more balanced dataset was to extract all pixels labelled as ditches, as well as pixels within close proximity of ditches. Secondly, random pixel samples from the entire area were extracted. This allowed the training dataset to be fairly balanced while still containing most of the geographical features of each zone, see Figure 7.

Figure 7. Yellow pixels here indicate the balanced masks used to determine what pixels are used when training the model. A and B represent two of the zones from the training dataset.
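The construction of such a balanced training mask can be sketched as follows. This is an assumed illustration, not the thesis code; the buffer size and sampling fraction are placeholders.

```python
# Minimal sketch: keep all ditch pixels plus a buffer around them, and add a
# random sample of pixels from the whole zone to form the training mask.
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
labels = np.zeros((400, 400), dtype=bool)
labels[200, 50:350] = True                            # stand-in for widened ditch labels

# All ditch pixels and pixels within roughly 5 pixels of a ditch.
near_ditch = ndimage.binary_dilation(labels, iterations=5)

# A random sample of pixels across the entire zone (here about 5 %).
random_sample = rng.random(labels.shape) < 0.05

training_mask = near_ditch | random_sample            # pixels used to train the model
```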


Table 3
The 81 features used when training the model.

Feature/Algorithm a                                     Circular radii b
Sky View Factor raw                                     -
Sky View Factor mean                                    2, 3, 4, 6
Sky View Factor median                                  2, 4, 6
Sky View Factor standard deviation                      2, 4, 6
Sky View Factor min                                     2, 4, 6
Sky View Factor max                                     2, 4, 6
Sky View Factor non-ditch amplification                 -
Sky View Factor conic filter - streams removed          -
Sky View Factor Gabor                                   -
Sky View Factor Gabor - streams removed                 -
Impoundment raw                                         -
Impoundment mean                                        2, 3, 4, 6
Impoundment median                                      2, 4, 6
Impoundment standard deviation                          2, 4, 6
Impoundment min                                         2, 4, 6
Impoundment max                                         2, 4, 6
Impoundment ditch amplification                         -
Impoundment ditch amplification - streams removed       -
HPMF raw                                                -
HPMF mean                                               2, 3, 4, 6
HPMF median                                             2, 4, 6
HPMF standard deviation                                 2, 4, 6
HPMF min                                                2, 4, 6
HPMF max                                                2, 4, 6
HPMF ditch amplification                                -
HPMF ditch amplification - streams removed              -
HPMF Gabor                                              -
HPMF Gabor - streams removed                            -
Slope raw                                               -
Slope mean                                              2, 3, 4, 6
Slope median                                            2, 4, 6
Slope standard deviation                                2, 4, 6
Slope min                                               2, 4, 6
Slope max                                               2, 4, 6
Slope non-ditch amplification                           -

a Displays the algorithm used to produce the feature.
b Represents the radius of the circular mask (if one was used) to determine which neighbouring pixels were included in the feature.


A hyperparameter tuning was performed to determine which parameter values for the random forest algorithm would yield the best results. Evaluating a maximum of 25 features for each node and using 200 trees showed the best results. Setting the class weight to balanced also improved the performance of the classifier. A probability prediction was used instead of a majority vote binary prediction, to allow further post-processing of the prediction.
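Put together, this configuration can be sketched roughly as follows. This is an assumed sketch, not the authors' exact configuration code; the training arrays are placeholders.

```python
# Minimal sketch: scikit-learn random forest with the hyperparameters described above.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_train = rng.random((5_000, 81))               # stand-in for the 81 features per pixel
y_train = rng.integers(0, 2, size=5_000)        # stand-in for ditch / non-ditch labels

model = RandomForestClassifier(
    n_estimators=200,         # number of trees
    max_features=25,          # features evaluated at each node split
    class_weight="balanced",  # compensate for the rare ditch class
    n_jobs=-1,
    random_state=0,
)
model.fit(X_train, y_train)

# A continuous ditch probability per pixel rather than a hard majority vote,
# leaving room for the post-processing steps described in section 6.7.
ditch_probability = model.predict_proba(X_train)[:, 1]
```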

6.7 Post-processing

The model outputs a ditch class probability prediction for each pixel. These probabilities are continuous values between zero and one, where zero indicates a very low probability of a pixel lying in a ditch, and one indicates a very high ditch probability. See Figure 8: A for a graphical representation of a raw prediction for one of the 21 zones of Krycklan.

6.7.1 Noise reduction and gap filling

The probability predictions contained a lot of noise in places far away from ditches, which needed to be excluded. The first step for removing noise was to use a bilateral de-noising filter on the entire prediction image. This left linear properties and pixels with a very high value intact, while lowering the value of pixels that did not contribute to an accurate prediction. See Figure 8: B for a graphical representation.

The second step for removing noise was to use a custom function to remove pixels with a semi-high probability, but that lay far away from any other high probability pixels. A threshold value was used to avoid removing pixels that had a high enough probability, helping to retain pixels that lay in or close to a ditch. The max probability value in a circular radius of 10 pixels was then calculated. If this max value was not high enough, the probability of the examined pixel was lowered. See Figure 8: C for a graphical representation.
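The first two noise-reduction steps can be sketched as follows. This is an assumed illustration, not the thesis code: the bilateral filter parameters and both probability thresholds are placeholders, not the authors' values.

```python
# Minimal sketch: bilateral de-noising of the probability raster, then suppression of
# moderately likely pixels that have no strong neighbour within a 10-pixel radius.
import numpy as np
from scipy import ndimage
from skimage.restoration import denoise_bilateral

probability = np.random.rand(200, 200)          # stand-in for the raw model output

# Step 1: bilateral de-noising preserves linear, high-valued structures.
smoothed = denoise_bilateral(probability, sigma_color=0.1, sigma_spatial=3)

# Step 2: lower pixels that are only moderately likely and far from any strong pixel.
radius = 10
yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
footprint = (xx**2 + yy**2) <= radius**2
local_max = ndimage.maximum_filter(smoothed, footprint=footprint)

keep = (smoothed >= 0.7) | (local_max >= 0.8)   # illustrative thresholds
denoised = np.where(keep, smoothed, smoothed * 0.5)
```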

The third step involved taking measures to try to fill gaps in ditches that the model failed to correctly predict. A similar method to the one described in section 6.5.2 was employed, calculating the mean of cone masks expanding outwards in different directions from the examined pixel. This step also amplified some of the noise that was left, but filling the gaps in the ditches was judged to be more important, as it made the next step more effective. See Figure 8: D for a graphical representation.


6.7.2 Binarisation with zones

The model's ditch prediction is given in the original resolution raster format. Therefore, the same grid conversion was performed on the prediction raster as on the evaluation data. A mean probability was calculated for each six by six pixel grid zone, and the entire zone was classified as ditch if the mean probability exceeded 35 %. This helped to fill in gaps where lone pixels in ditches had been incorrectly classified, and also helped in the next step of the post-processing. See Figure 8: E for a graphical representation.
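This zone binarisation can be sketched in a few lines. This is an assumed illustration, not the thesis code; the raster dimensions are placeholders chosen to be divisible by six.

```python
# Minimal sketch: average ditch probabilities over 6x6-pixel cells and classify a
# whole cell as ditch when the mean exceeds 35 %.
import numpy as np

probability = np.random.rand(2994, 2616)           # stand-in prediction raster

h, w = probability.shape
cells = probability.reshape(h // 6, 6, w // 6, 6)
zone_prediction = cells.mean(axis=(1, 3)) > 0.35   # one boolean per 6x6 zone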

6.7.3 Cluster removal

To remove noise from the binary prediction, a custom cluster detection algorithm was used. By finding the number of connected pixels with a true value and removing the clusters whose size was below a given threshold, minor noise in the prediction could be removed while still retaining most of the ditch pixels. Ditches that have a low probability, and therefore create small clusters, may still be excluded by this method, but the noise removal advantages outweigh the loss in recall. This algorithm operates similarly to a paint fill function in image processing software, with the difference of counting the pixels instead of colouring them. A distance calculation was also performed in tandem with this method to find the largest distance between pixels inside each given cluster. This helped to remove cavities and hollows that were not removed by the initial small cluster removal, but whose shape indicated that they did not represent a ditch. See Figure 8: F for a graphical representation.
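A comparable cluster-removal step can be expressed with standard connected-component labelling, as in the following assumed sketch (not the thesis implementation); both thresholds are illustrative placeholders.

```python
# Minimal sketch: remove small clusters, and clusters whose maximum internal pixel
# distance suggests a compact (non-ditch) shape, from a binary prediction.
import numpy as np
from scipy import ndimage
from scipy.spatial.distance import pdist

binary = np.random.rand(300, 300) > 0.995        # stand-in for the binarised prediction

labelled, n_clusters = ndimage.label(binary)
cleaned = np.zeros_like(binary)

for cluster_id in range(1, n_clusters + 1):
    coords = np.argwhere(labelled == cluster_id)
    if len(coords) < 20:                          # drop very small clusters (illustrative)
        continue
    max_distance = pdist(coords).max()            # largest distance between two pixels
    if max_distance < 15:                         # drop compact, blob-like clusters
        continue
    cleaned[labelled == cluster_id] = True
```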



Figure 8. The different steps taken after a prediction by the model. A more yellow pixel indicates a high probability of a ditch, whereas a purple pixel indicates a low ditch probability: A: Raw probability prediction, B: Prediction after bilateral de-noising, C: Prediction after custom de-noising, D: Prediction after gap filling



Figure 8. (continued) E: Prediction after binarisation with zone probability, F: Final binary prediction after cluster removal

7 Results and analysis

7.1 Experimental results

The results from the experiment with our method for the 11 different zones are summarised in a confusion matrix in Table 4. The results for the reproduced method of Gustavsson and Selberg (2018) can be seen in Table 5 for the Impoundment method, and in Table 6 for the Sky View Factor method. The reason that the reproduced experiments have a lower total percentage of actual ditch pixels is that the grid classification was used in our method, but not in the reproduced methods.


Table 4
Confusion matrix from the results of the random forest model prediction. Displays percentages of predicted ditch and non-ditch occurrences as well as actual occurrences.

                        Actual
Prediction      Ditch %    Non-ditch %   Sum %
Ditch %         1.32 a     0.87 b        2.19
Non-ditch %     0.69 c     97.12 d       97.81
Sum %           2.01       97.99         100.00

a True positives  b False positives  c False negatives  d True negatives

Table 5
Confusion matrix from the reproduced Impoundment method results. Displays percentages of predicted ditch and non-ditch occurrences as well as actual occurrences.

                        Actual
Prediction      Ditch %    Non-ditch %   Sum %
Ditch %         0.19 a     5.66 b        5.85
Non-ditch %     1.26 c     92.89 d       94.15
Sum %           1.45       98.55         100.00

a True positives  b False positives  c False negatives  d True negatives

Table 6
Confusion matrix from the reproduced Sky View Factor method results. Displays percentages of predicted ditch and non-ditch occurrences as well as actual occurrences.

                        Actual
Prediction      Ditch %    Non-ditch %   Sum %
Ditch %         0.09 a     0.31 b        0.40
Non-ditch %     1.36 c     98.24 d       99.60
Sum %           1.45       98.55         100.00

a True positives  b False positives  c False negatives  d True negatives


7.2 Analysis

Table 7 displays all the evaluation metrics for the prediction of our method. The confidence intervals were calculated from the 11 different zones that the experiment was performed on. Since the zones contain different numbers of ditches, the confidence intervals will not be completely accurate, but they produce a close estimation. From Table 7, it can be seen that averaging the results from the 11 zones yields a value very similar to the total results, indicating that the model generally performs equally well on zones with few ditches as on zones with many ditches. For the results of each individual zone, refer to Appendix C.

Table 7
Metrics for the prediction performance of our model.

Metric        Total a   Zone average b   CI 90% c          CI 95% c          CI 99% c
Accuracy %    98.43     98.43            [98.04 , 98.82]   [97.97 , 98.90]   [97.82 , 99.04]
Recall %      65.47     65.25            [59.30 , 71.20]   [58.16 , 72.33]   [55.93 , 74.56]
Precision %   60.11     58.98            [56.54 , 61.43]   [56.06 , 61.90]   [55.15 , 62.82]
κ rating      0.619     0.606            [0.573 , 0.639]   [0.567 , 0.645]   [0.552 , 0.628]
AUPRC         0.631     0.625            [0.590 , 0.659]   [0.584 , 0.665]   [0.571 , 0.678]

a The result of all 11 zone experiments when combined.
b An average score from the 11 different zones that the experiment was performed on.
c Confidence intervals at different confidence levels.

Table 8 shows the evaluation metrics from the experiments with the recreated methods of Gustavsson and Selberg (2018). Of the two methods, the Impoundment method outperformed the Sky View Factor method for all metrics except recall.

Table 8
Metrics from the total results of the two recreated experiments using the thresholds from the thesis by Gustavsson and Selberg (2018).

Metric        Sky View Factor   Impoundment
Accuracy %    93.08             98.32
Recall %      13.00             6.10
Precision %   3.23              22.17
κ rating      0.029             0.090


The κ rating for our method can be seen in the Total column in Table 7 (κ = 0.619). The κ ratings from the two reproduced experiments can be seen in Table 8 (κ = 0.090 and κ = 0.029). Because the κ rating of our method exceeds that of both reproduced methods, the experiments support our hypothesis.

In Table 9, the top 20 features from the random forest model are presented with their importance percentages for the model making a successful prediction. The feature importance was obtained using the Gini impurity for each feature (Menze et al., 2009). For a full list of feature importances for the model, refer to Appendix B.

Table 9
The top 20 features by importance when the model makes a prediction.

Position   Feature a                                            Importance (%)
1.         Impoundment mean 3                                   11.08
2.         Impoundment mean 4                                   7.44
3.         HPMF mean 4                                          7.03
4.         HPMF mean 3                                          3.97
5.         Impoundment median 4                                 2.97
6.         Impoundment mean 2                                   2.53
7.         Sky View Factor Gabor - streams removed              2.03
8.         Impoundment ditch amplification                      1.91
9.         HPMF median 4                                        1.90
10.        Impoundment ditch amplification - streams removed    1.76
11.        Slope standard deviation 6                           1.62
12.        Sky View Factor non-ditch amplification              1.59
13.        HPMF Gabor - streams removed                         1.59
14.        Sky View Factor Gabor                                1.48
15.        HPMF Gabor                                           1.46
16.        Sky View Factor max 6                                1.39
17.        HPMF mean 6                                          1.39
18.        Impoundment standard deviation 4                     1.39
19.        Slope min 6                                          1.31
20.        Slope non-ditch amplification                        1.25

a The number next to some of the features indicates the circular radius used to select the neighbouring pixels included in the feature.


8 Discussion

8.1 Strengths

The results from our method seen in Table 7 show that we managed to attain a Cohen's κ rating in the substantial range, according to the thresholds proposed by Landis and Koch (1977) seen in Table 2. This means that our model performed substantially better than one based purely on chance. The total recall value of 65.47 % shows that we managed to find most of the ditch pixels that exist. It is hard to determine exactly how many of the actual ditches we managed to detect, as pixel classification is not an entirely accurate measurement for assessing ditch detection. The precision of 60.11 % shows that we incorrectly classified a large number of pixels as ditch pixels. However, many of these incorrect classifications lay in very close proximity to a ditch, which has to be taken into consideration when assessing the performance of the method. Considering that pixel classification was used, our results show great promise.

We managed to bridge a lot of gaps and exclude streams to a large extent in our predictions. This was in part due to the features developed, such as the impoundment stream removal features, and in part due to the post-processing of the prediction. The post-processing was successful both in removing noise from the random forest prediction and in bridging some of the gaps that the model was unable to bridge on its own. The homogeneous six by six pixel grids, which helped form more continuous ditches, and our custom removal of both small clusters and clusters with a non-ditch shape both contributed to this.

Several of the custom features were of use. For example, using fingerprint enhancement techniques for gap filling such as Gabor filters (Hong et al., 1998), proved to work well on ditches due to similarities in the structures of gaps in fingerprint ridges and ditches in images. Another example is the Sky View Factor conic filter, and the gap filling in the post-processing, which filled gaps by looking at the means of pixels in opposite directions of an examined pixel. Applying different statistical aggregations using circular radii of different sizes for neighbouring areas of pixels, similar to Roelens et al. (2018), also proved to produce very useful features.

To assure the validity of our results, more than half of the data (11 out of 21 zones) was excluded early in the process and used only in the final experiment. This allowed the model to be tested on areas that had not been seen during development, and that therefore did not have features or post-processing tweaked to their composition. Another strength of our method was the use of leave-one-out cross validation with our 11 hold-out zones. This allowed us to use more training data for each model in the experiment, instead of dividing the data into separate training and testing zones.

8.2 Weaknesses and limitations

Generally, the ditch pixels that our method failed to detect were those where the ditches were shallower and made less of an imprint on the landscape than other ditches, causing the data to have weaker values. Sometimes pixels in these ditches were classified as ditches, but not enough pixels in the surrounding area, causing our post-processing to identify them as noise and remove them. The incorrectly classified ditch pixels that caused the precision to drop were generally pixels that lay in streams. However, small cavities or hilly areas classified as ditches by the model were generally removed in the post-processing.

A potential issue with using our model in other geographical areas is that all the post-processing steps and features were developed based on occurrences in the Krycklan area. Different geographical compositions in other areas could cause the feature algorithms developed in this study to be less effective. Thresholds used to fill gaps in ditches and to remove noise could also prove to be ineffective in other areas. In addition to this, the raw Impoundment feature used in this study was created with specific thresholds for the Krycklan area, and may therefore not work as well in other parts of Sweden.

Because all ditches were widened from vector format to 3.5 metres in raster format, some ditches were made too wide and some too thin. As a result, the model sometimes learned from pixels with faulty labels, which potentially hindered the classification performance. Had we been able to use labels that were correct on a pixel basis, the model could have learned with much higher precision, and the classification results would most likely have benefited as well. This raises the question of whether pixel classification is the right way to tackle the ditch detection problem.
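For context, one common way to produce such widened raster labels from vector ditch lines is to buffer the geometries and burn them into a grid, roughly as sketched below; the file name, pixel size, and buffer distance are placeholder assumptions and not taken from this study.

```python
import geopandas as gpd
import numpy as np
from rasterio import features
from rasterio.transform import from_origin

ditches = gpd.read_file("ditch_lines.shp")    # hypothetical vector ditch lines
buffered = ditches.geometry.buffer(3.5 / 2)   # widen each line to roughly 3.5 m

pixel_size = 0.5                              # metres per pixel (placeholder)
minx, miny, maxx, maxy = buffered.total_bounds
transform = from_origin(minx, maxy, pixel_size, pixel_size)
out_shape = (int(np.ceil((maxy - miny) / pixel_size)),
             int(np.ceil((maxx - minx) / pixel_size)))

# Burn value 1 into every cell touched by a buffered ditch polygon.
labels = features.rasterize(
    ((geom, 1) for geom in buffered),
    out_shape=out_shape, transform=transform, fill=0, dtype="uint8")
```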

While many of the features used in the experiment proved valuable, the project suffered from our lack of experience in feature engineering. Time had to be spent learning what features are and how to make use of them. With more experience, better features could have been developed more efficiently. In some cases the data may also simply have been too weak to detect all ditches and to distinguish them from streams. It is possible that using the raw LiDAR data in addition to the DEM could have provided even better opportunities to detect ditches.


Due to time constraints, no extensive hyperparameter tuning was performed for our random forest model. Since hyperparameter tuning is often more important than the choice of algorithm (Lavesson & Davidsson, 2006), this probably led to lower performance than was possible with our features. The only hyperparameters we examined were the number of trees, the maximum number of features evaluated per node in a tree, and whether the class weight was set to balanced. Many combinations of hyperparameter settings were left untested.
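The sketch below shows one way a broader search could be set up with scikit-learn's GridSearchCV; the parameter grid and the generated data are illustrative assumptions rather than the configuration or data used in this thesis.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Placeholder data standing in for the pixel feature matrix and ditch labels.
X, y = make_classification(n_samples=500, n_features=10, weights=[0.9, 0.1], random_state=0)

param_grid = {
    "n_estimators": [100, 300],
    "max_features": ["sqrt", "log2", 0.3],
    "class_weight": [None, "balanced"],
    "min_samples_leaf": [1, 5, 25],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0, n_jobs=-1),
    param_grid, scoring="f1", cv=3)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```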

8.3 Comparison to state-of-the-art

The results of our method showed a significant improvement over the previously used automated methods for ditch detection. Several issues with previous methods were solved at least to some extent, such as bridging gaps and removing some of the streams from the ditch predictions.

Concerning the comparison with the state-of-the-art method from Gustavsson and Selberg (2018), the threshold values used in their geographical area did not generalise to the Krycklan area. This probably biased the comparison of the methods in our favour. Gustavsson and Selberg (2018) also had data on water bodies in their study area, which they could exclude from the ditch prediction. This data did not exist in our reproduction, which could have caused their method to perform worse than it otherwise would. The results from the prediction highlight one of the bigger problems with automated methods that are tuned to a specific geographical area.

8.4 General discussion

Looking at which features had the highest Gini importance, as seen in Table 9, the features based on the Impoundment attribute generally outperformed features based on the other attributes. As Impoundment measures the volume that a dam would occupy on the landscape, it is logical that this attribute clearly marks out streams and ditches and therefore has a high importance for our model. The features in Table 9 marked with streams removed also used the Impoundment attribute to attempt to remove streams from the feature, further highlighting the effectiveness of this attribute. Of the custom features, those based on Gabor filters performed well. This makes sense, since Gabor filters are designed to mark out linear structures in images, and ditches, being man-made, almost always form straight lines, as opposed to, for instance, streams.
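Gini importances of the kind reported in Table 9 can be read directly from a fitted scikit-learn random forest, as in the minimal sketch below; the generated data and feature names are placeholders for the engineered LiDAR features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Placeholder data standing in for the engineered LiDAR features.
X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# feature_importances_ holds the mean decrease in Gini impurity per feature.
for idx in np.argsort(model.feature_importances_)[::-1]:
    print(f"{feature_names[idx]:<12} {model.feature_importances_[idx]:.3f}")
```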


There were many challenges to overcome during this thesis. Time had to be spent learning about features and other machine learning terminology, as we had never used machine learning methods before. Researching which algorithm to use and which potential features to develop also took up a lot of time, which with more experience could have been spent on developing a better model instead. Much of the development was simply trial and error, where different features and post-processing steps were developed to determine what worked best.

9 Conclusions and future work

This thesis investigates how to use real-world digital elevation data together with manually identified ditches to identify patterns and relationships that can be used for automatic ditch detection. The proposed method significantly outperforms state-of-the-art methods. While the method still has room for improvement, it performs well on the available data. The thesis contributes to an increased understanding of how machine learning can be applied to ditch detection and, more generally, to real-world problems in forestry. An implication of this work is that it may become both faster and easier to map ditches, which can support decision making on which environmental actions need to be taken, and where. From an economic perspective, automation of ditch detection significantly reduces manual labour and cost. In conclusion, this thesis has shown that it is possible to use machine learning with digital elevation data to learn patterns that enable robust detection of ditches.

As mentioned previously, pixel classification may not be the best approach to ditch detection. One could examine larger grids of pixels and perform object segmentation on the images formed by these grids, potentially producing better performance. Using deep learning or other techniques that handle these types of features well could be a good next step for the SFA. The prediction performance could then be assessed, for example, by measuring distances from detected objects to the vector labels, thereby avoiding the artificial labelling and widening of ditches. Being limited by processing power, memory, and the time allotted for this thesis, however, we ruled out the possibility of using deep learning techniques early on.
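One simple way to realise such a distance-based evaluation would be a Euclidean distance transform, as sketched below under the assumption that the vector ditch labels have been burned into a thin (one-pixel-wide) raster on the same grid as the prediction; the pixel size is a placeholder.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def mean_distance_to_labels(pred, label_lines, pixel_size=0.5):
    """Mean distance (in metres) from each predicted ditch pixel to the nearest
    labelled ditch line pixel. Both inputs are binary rasters on the same grid;
    pixel_size is a placeholder for the raster resolution in metres."""
    # distance_transform_edt measures the distance to the nearest zero-valued cell,
    # so inverting the label raster gives the distance to the nearest label pixel.
    dist_to_label = distance_transform_edt(label_lines == 0) * pixel_size
    return dist_to_label[pred == 1].mean()
```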

The work in this thesis can help form a base for the SFA's goals of using machine learning to classify ditches. It gives an insight into which features work well and which do not. Some of the custom features could be reused in the future, with random forest or other machine learning algorithms. Several of the post-processing algorithms, such as the probability prediction noise removal or the cluster removal from the binary prediction, could also be reused.


References

Breiman, L. (1996). Technical note: Some properties of splitting criteria. Machine Learning, 24(1), 41-47. https://doi.org/10.1023/A:1018094028462

Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32. https://doi.org/10.1023/A:1010933404324

Carluer, N., & Marsily, G. D. (2004). Assessment and modelling of the influence of man-made networks on the hydrology of a small watershed: Implications for fast flow components, water quality and landscape management. Journal of Hydrology, 285(1), 76-95. https://doi.org/10.1016/j.jhydrol.2003.08.008

Cho, H.-C., Slatton, K. C., Cheung, S., & Hwang, S. (2011). Stream detection for LiDAR digital elevation models from a forested area. International Journal of Remote Sensing, 32(16), 4695-4721. https://doi.org/10.1080/01431161.2010.484822

Dages, C., Voltz, M., Bsaibes, A., Prévot, L., Huttel, O., Louchart, X., . . . Negro, S. (2009). Estimating the role of a ditch network in groundwater recharge in a Mediterranean catchment using a water balance approach. Journal of Hydrology, 375(3-4), 498-512. https://doi.org/10.1016/j.jhydrol.2009.07.002

Erixon, A. (2015). Kombinerad flygfotografering/laserskanning Hässleby/Krycklan. TerraTec Sweden AB, Johanneshov. (Unpublished document)

Esri. (2017). Digital elevation models. Retrieved 2018-01-20, from https://learn.arcgis.com/en/related-concepts/digital-elevation-models.htm

Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861-874. https://doi.org/10.1016/j.patrec.2005.10.010

Flach, P. (2012). Machine learning: The art and science of algorithms that make sense of data. Cambridge, United Kingdom: Cambridge University Press. https://doi.org/10.1017/CBO9780511973000

Fu, G., Yi, L., & Pan, J. (2019). Tuning model parameters in class-imbalanced learning with precision-recall curve. Biometrical Journal, 61(3), 652-664. https://doi.org/10.1002/bimj.201800148

Gardner, M. J., & Altman, D. G. (1986). Confidence intervals rather than P values: Estimation rather than hypothesis testing. British Medical Journal (Clinical Research Ed.), 292(6522), 746-750.

Gislason, P., Benediktsson, J. A., & Sveinsson, J. R. (2006). Random forests for land cover classification. Pattern Recognition Letters, 27(4), 294-300. https://doi.org/10.1016/j.patrec.2005.08.011

Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., & Pedreschi, D. (2018). A survey of methods for explaining black box models. ACM Computing Surveys, 51(5). https://doi.org/10.1145/3236009

Gustavsson, A., & Selberg, M. (2018). Delineation of ditches in wetlands by remote sensing (Unpublished bachelor's thesis, Uppsala University, Uppsala, Sweden). Retrieved from http://www.diva-portal.org/smash/get/diva2:1221962/FULLTEXT01.pdf

Ho, T. K. (1995). Random decision forests. In Proceedings of the International Conference on Document Analysis and Recognition, ICDAR (Vol. 1, pp. 278-282). https://doi.org/10.1109/ICDAR.1995.598994

Hong, L., Wan, Y., & Jain, A. (1998). Fingerprint image enhancement: Algorithm and performance evaluation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(8), 777-789.

Jaderberg, M., Vedaldi, A., & Zisserman, A. (2014). Speeding up convolutional neural networks with low rank expansions. In Proceedings of the British Machine Vision Conference, Nottingham, United Kingdom. https://doi.org/10.5244/C.28.88

Kotsiantis, S. B. (2007). Supervised machine learning: A review of classification techniques. Informatica (Ljubljana), 31(3), 249-268. https://doi.org/10.1007/s10462-007-9052-3

Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159-174. https://doi.org/10.2307/2529310

Lantmäteriet. (2018). Laser data - laserdata skog. Retrieved from https://www.lantmateriet.se/contentassets/d85c20e0e23846538330674fbfe8c8ac/lidar_data_skog.pdf

Lavesson, N., & Davidsson, P. (2006). Quantifying the impact of learning algorithm parameter tuning. In Proceedings of the National Conference on Artificial Intelligence (Vol. 1, pp. 395-400).

Lindsay, J. B. (2016). Whitebox GAT: A case study in geomorphometric analysis. Computers & Geosciences, 95, 75-84. https://doi.org/10.1016/j.cageo.2016.07.003

Menze, B. H., Kelm, B. M., Masuch, R., Himmelreich, U., Bachert, P., Petrich, W., & Hamprecht, F. A. (2009). A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data. BMC Bioinformatics, 10. https://doi.org/10.1186/1471-2105-10-213

Mulik, S. N. (1999). An introduction to geographical information systems. IETE Technical Review (Institution of Electronics and Telecommunication Engineers, India), 16(5-6), 419-424. https://doi.org/10.1080/02564602.1999.11416861

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., . . . Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825-2830. Retrieved from http://www.jmlr.org/papers/volume12/pedregosa11a/pedregosa11a.pdf

Roelens, J., Höfle, B., Dondeyne, S., Van Orshoven, J., & Diels, J. (2018). Drainage ditch extraction from airborne LiDAR point clouds. ISPRS Journal of Photogrammetry and Remote Sensing, 146, 409-420. https://doi.org/10.1016/j.isprsjprs.2018.10.014

Shane D. Mayor, C. S. U. (2007). Unfiltered vs. filtered - what is the difference? Retrieved 2019-02-11, from http://lidar.csuchico.edu/filtering/

Sim, J., & Wright, C. C. (2005). The kappa statistic in reliability studies: Use, interpretation, and sample size requirements. Physical Therapy, 85(3), 257-268. https://doi.org/10.1093/ptj/85.3.257

Skogsstyrelsen. (2018). Om oss. Retrieved 2019-01-16, from https://www.skogsstyrelsen.se/om-oss/

Spelmen, V. S., & Porkodi, R. (2018). A review on handling imbalanced data. In Proceedings of the 2018 International Conference on Current Trends towards Converging Technologies, ICCTCT 2018. https://doi.org/10.1109/ICCTCT.2018.8551020

Stanislawski, L., Brockmeyer, T., & Shavers, E. (2018). Automated road breaching to enhance extraction of natural drainage networks from elevation models through deep learning. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences - ISPRS Archives, 42(4), 671-678. https://doi.org/10.5194/isprs-archives-XLII-4-597-2018

Svensk, J. (2016). Beräkning av diken från lantmäteriets nationella laserdata. Foran Sverige AB, Linköping. (Unpublished document)

Sverdrup-Thygeson, A., Ørka, H. O., Gobakken, T., & Næsset, E. (2016). Can airborne laser scanning assist in mapping and monitoring natural forests? Forest Ecology and Management, 369, 116-125. https://doi.org/10.1016/j.foreco.2016.03.035

Vännman, K. (2002). Matematisk statistik. Lund, Sweden: Studentlitteratur AB.

Wong, T.-T. (2015). Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation. Pattern Recognition, 48(9), 2839-2846.

Zaksek, K., Oštir, K., & Kokalj, Z. (2011). Sky-view factor as a relief visualization technique. Remote Sensing, 3. https://doi.org/10.3390/rs3020398

Zobel, J. (2015). Writing for computer science (3rd ed.). London, United Kingdom: Springer Publishing Company, Incorporated. https://doi.org/10.1007/978-1-4471-6639-9


Appendices

Appendix A

Source code

All the source code used to develop the features, models, and post-processing, and for performing the experiment, can be found in our GitHub repository here.


Appendix B

All feature importances

Table 10

The importance of 38 of the 81 features when the model makes a prediction.

Position  Feature*                                             Importance (%)
1.        Impoundment mean 3                                   11.08
2.        Impoundment mean 4                                    7.44
3.        HPMF mean 4                                           7.03
4.        HPMF mean 3                                           3.97
5.        Impoundment median 4                                  2.97
6.        Impoundment mean 2                                    2.53
7.        Sky View Factor Gabor - streams removed               2.03
8.        Impoundment ditch amplification                       1.91
9.        HPMF median 4                                         1.90
10.       Impoundment ditch amplification - streams removed     1.76
11.       Slope standard deviation 6                            1.62
12.       Sky View Factor non-ditch amplification               1.59
13.       HPMF Gabor - streams removed                          1.59
14.       Sky View Factor Gabor                                 1.48
15.       HPMF Gabor                                            1.46
16.       Sky View Factor max 6                                 1.39
17.       HPMF mean 6                                           1.39
18.       Impoundment standard deviation 4                      1.39
19.       Slope min 6                                           1.31
20.       Slope non-ditch amplification                         1.25
21.       HPMF min 4                                            1.14
22.       Sky View Factor max 4                                 1.13
23.       Impoundment max 6                                     1.12
24.       Impoundment standard deviation 6                      1.09
25.       Impoundment median 6                                  1.04
26.       Impoundment mean 6                                    1.03
27.       Slope min 4                                           1.01
28.       Impoundment median 2                                  1.01
29.       HPMF min 6                                            1.01
30.       HPMF standard deviation 6                             1.01
31.       Slope median 6                                        1.00
32.       HPMF min 2                                            0.99
33.       HPMF mean 2                                           0.98
34.       Sky View Factor min 6                                 0.96
35.       Impoundment max 4                                     0.96
36.       HPMF ditch amplification - streams removed            0.96
37.       Slope mean 6                                          0.95
38.       HPMF ditch amplification                              0.94

* The number next to some of the features indicates the circular radius used when computing the statistic over each pixel's neighbourhood.

