
Institutionen för systemteknik
Department of Electrical Engineering

Large Scale Terrain Modelling for Autonomous Mining

Master's thesis (Examensarbete) carried out in Automatic Control at the Institute of Technology, Linköping University

by

Johan Norberg

LiTH-ISY-EX--10/4347--SE

Linköping 2010

Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden

Supervisors: Eric Nettleton, RTCMA, University of Sydney
Paul Thompson, RTCMA, University of Sydney
Jonas Callmer, ISY, Linköping University

Examiner: Thomas Schön, ISY, Linköping University

Electronic version: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-57334

Keywords: Gaussian Processes, mine automation, terrain modelling, feature detection, support vector machine


Abstract

This thesis is concerned with the development of a terrain model using Gaussian Processes to support the automation of open-pit mines. Information can be provided from a variety of sources including GPS, laser scans and manual surveys. The information is then fused together into a single representation of the terrain, together with a measure of uncertainty of the estimated model.

The model is also used to detect and label specific features in the terrain. In the context of mining, these features are edges known as toes and crests. A combination of clustering and classification using supervised learning detects and labels these regions.

Data gathered from production iron ore mines in Western Australia and a farm in Marulan outside Sydney is used to demonstrate and verify the ability of Gaussian Processes to estimate a model of the terrain. The estimated terrain model is then used for detecting features of interest.

Results show that the Gaussian Process correctly estimates the terrain and uncertainties, and provides a good representation of the area. Toes and crests are also successfully identified and labelled.



Acknowledgments

This thesis would not have been possible without the help and encouragement from many helpful people.

First of all, I would like to thank the Rio Tinto Centre for Mine Automation and especially my supervisor, Dr. Eric Nettleton, for allowing me to come and do my master’s thesis in Sydney. It has been an interesting, challenging and very fun time and I have learned a lot about academic research during my time here.

Thanks also to my second supervisor, Dr. Paul Thompson, for helping me when I got stuck and for contributing many helpful ideas. Both of you have been of great support and have given me many helpful comments on my work and thesis.

Thanks also to Tim Hale at RTCMA for helping me with the field trials down in Marulan. I could never have imagined that Australia could be that cold.

My examiner and supervisor in Linköping, Dr. Thomas Schön and Jonas Callmer, also deserve to be mentioned for suggesting that I do my thesis in Australia and for helping me get in touch with ACFR. You have also given me many valuable comments on my work and thesis.

Many thanks to all my friends from all over the world that I have met during my studies in Linköping, Amherst and Sydney. You are too many to mention here, but you know who you are. To all the international students: I hope I will see you all soon again somewhere in the world!

Finally, to my family for supporting and caring for me at all times. Your support and encouragement have helped me through these five years as a university student.

Johan Norberg Sydney, July 2010


Contents

1 Introduction
1.1 Mine Automation
1.2 Terrain Modelling in Mining
1.3 Objectives
1.4 Contributions
1.5 Thesis Outline

2 Terrain Estimation
2.1 Related work
2.2 Gaussian Process Estimation
2.2.1 Definition
2.2.2 Covariance Function
2.2.3 Prediction
2.2.4 Learning Hyperparameters
2.3 Gaussian Process Fusion
2.4 Conclusion

3 Terrain Estimation using Gaussian Processes
3.1 System Overview
3.2 Matrix and Vector Operations
3.3 Removal of Mean
3.4 Validation
3.5 Approximations
3.5.1 Learning from a Subset of Data
3.5.2 Local Neighbourhood Approximation
3.5.3 Evaluation
3.6 KD-Tree
3.7 Outlier Detection
3.8 Fusion of Multiple Sensors
3.9 Threading
3.9.1 Threading Performance
3.10 Conclusion

4 Feature Extraction
4.1 Related Work
4.2 Toe and Crest Detection
4.3 k-Means Clustering
4.4 Support Vector Machines
4.4.1 Kernel functions
4.4.2 Multiple output classes
4.5 Conclusion

5 Feature Extraction Implementation
5.1 System Overview
5.2 Segmentation
5.3 Edge Detection
5.3.1 Edge Tracing
5.3.2 Line Fitting
5.4 Classification
5.5 Conclusion

6 Experimental Results on Mining Applications
6.1 Hardware
6.2 System Overview
6.3 Terrain Estimation
6.3.1 Outlier Detection Evaluation
6.3.2 Local Neighbourhood Evaluation
6.3.3 Fusion Evaluation
6.4 Toe and Crest Extraction
6.4.1 Segmentation
6.4.2 Classification
6.5 Conclusion

7 Conclusions
7.1 Summary
7.1.1 Terrain Estimation
7.1.2 Edge Detection
7.2 Results
7.3 Future Work

Bibliography
A Data Sets
B Cholesky Decomposition
C Abbreviations
D Notation


Chapter 1

Introduction

The aim of this thesis is to develop and demonstrate a terrain model to support automation in an open-pit mining environment. This application requires a terrain model that can use information from a wide variety of sources including GPS, manual surveys and laser scans, and fuse them into a single representation of the environment. The information is also used to detect and label key terrain features automatically. In the context of mining, the critical features of interest are edges known as toes and crests.

The algorithms developed in this thesis are demonstrated using data from production iron-ore mines operating in Australia’s Pilbara region as well as data from a farm in Marulan south of Sydney, Australia.

The underlying representation for the terrain model in this thesis is formulated using Gaussian Processes. Gaussian Processes are a non-parametric method that provides a probabilistic approach to estimation of continuous functions. They also incorporate sensor properties and uncertainties into the model.

From the modelled terrain representation, features of interest are extracted with the help of clustering and then classified as a toe or a crest using support vector machines.

1.1 Mine Automation

Complete mine automation has the potential to improve many aspects of mining, including safety, productivity and the economics of the process. Safety is increased since humans can be removed from hazardous environments and from the handling of dangerous materials. The economics are improved by increasing the quality and quantity of the product. However, despite the potential benefits, automation has not been widely deployed in mining. Figure 1.1 shows a panorama of the mine in West Angelas in Australia and Figure 1.2 shows typical vehicles used in open-pit mining.

Previous research and attempts at automation in mining have typically focused only on the automation of the vehicles. However, significant benefits may be gained by extending this and automating the whole process and information flow.


Figure 1.1. A panorama of the open-pit mine in West Angelas.

This includes the integration of both existing and new information systems and processes, and the embedding of best practices in autonomous systems.

LKAB, in cooperation with Tamrock, trialled automation of a load haul dump (LHD) vehicle during the late 1980s to transport extracted ore and rocks in their underground mine [21]. The LHD was equipped with video cameras and a communication and guidance system. Although the tests were successful and the LHD moved almost one million tons of iron ore during the field trials, the project was put on hold for a couple of years. Today, the loading and transporting of ore is fully automated.

In the middle of the 1990s, a program called the "Mining Automation Program", or MAP, was formed by a group of companies operating in the mining industry with the aim of developing autonomous mines [36]. The idea was to provide the mining industry with technology similar to that which transformed the manufacturing industry, with the goal of fully automating development, drilling, loading and transportation in mines. A research mine in Canada was established to test the different techniques and equipment.

A number of technologies had to be invented in order to achieve this. Controlling the mining equipment from the surface required a communication system that worked throughout the mine. For the vehicles and drills to know where they are in the mine, a navigation and positioning system that works underground was also developed. Using teleoperated drills, explosive loaders and LHDs, the operators could observe and operate the mine remotely from the surface, with reduced cost as the result.

Caterpillar has also developed a system for underground autonomous operations [10]. The system allows the operators to supervise operations from a working place far away from the LHD. The operator loads and dumps the LHDs remotely, and they are driven autonomously during hauling. Access restrictions can also be set up in different parts of the mine so that LHDs are not permitted to drive in areas where personnel are currently located.

While many of these programmes have achieved a degree of success in niche areas, there has yet to be a successful "whole of mine" automation system. To address this, Rio Tinto, a global mining company, has started the Rio Tinto Centre for Mine Automation (RTCMA) with the aim of developing a fully automated and remotely operated mine. RTCMA consists of researchers and students from all over the world and is part of the Australian Centre for Field Robotics (ACFR) and the University of Sydney.


Figure 1.2. Pictures from West Angelas of a drill (a) and an excavator filling a truck with rocks (b).


One of the key projects within the RTCMA has been the development of an autonomous blast hole drill rig (see Figure 1.2(a)). This platform was tested in 2008 and has subsequently been deployed at a production iron-ore mine in Australia's Pilbara region. The drill both moves and drills autonomously and uses a combination of GPS and track encoders to determine its position.

In December 2008, Komatsu's FrontRunner Autonomous Haulage System began transporting ore with autonomous trucks and trains at the same mine.


The trucks are equipped with GPS, laser and radar to determine position and detect obstacles [40]. A supervisory system gives the operators full control over the location and status of all the vehicles.

The trucks are automatically guided to the loading point based on the location of the excavator, where they are loaded and then directed to the dump site. If an obstacle is detected on the route, the trucks stop or reduce their speed. They navigate through intersections and refuel themselves, continuously adapting their paths in relation to other vehicles and nearby objects. Operators on manned equipment, such as the excavators, interact with the trucks via a screen, where the trucks can, for example, ask for permission to approach.

1.2 Terrain Modelling in Mining

In order for a vehicle to be able to operate and adapt autonomously to its environment, it needs perception of its surroundings. For example, a drill has to know where it can drive and drill safely and in the most efficient way. The terrain is continuously altered as ore is removed, making an open-pit mine a very dynamic environment. Therefore, it is of the utmost importance to have a rich representation of the mining environment. Traditionally, this has been provided by surveyors who manually measure the mine, but with improved technology and sensors this process could be automated. A manual survey of a mine can be viewed in Figure 1.3.

Figure 1.3. A survey performed in an open-pit mine.


Information about how the terrain varies in different areas must also be provided to many consumers; for example to the vehicles, so that they can plan efficient driving paths. Other reasons for having a good representation of the terrain include estimating the volume of the landmass removed and determining the future development of the mine.

1.3 Objectives

This thesis is part of the work at RTCMA on mine automation. The aim is to demonstrate the ability of Gaussian Processes to estimate and model terrain found in a mining environment and then automatically label features in the estimated model. More specifically, the objectives of this work are:

• Development of an efficient framework to allow fusion of multiple data sources.

• Verification of the estimated model and uncertainties against the ground truth.

• Identification and labelling of features in the estimated terrain; specifically toes and crests.

In the future, this model may be used in production mine sites to build a model of the mining environment to provide knowledge of the surroundings to the operators and the autonomous vehicles.

The ability to automatically detect specific features in the terrain allows faster and more accurate labelling than is possible with manual classification.

1.4 Contributions

The main contributions of the research are:

• Developed a framework to efficiently implement the terrain model. A key feature of this design is the ability to collect sensor data from a variety of sensors and locations, and fuse this data together into a single representation using Gaussian Processes. The different properties of the sensors are exploited to provide the certainty of the estimated terrain model.

• Verified the estimated models using the gathered sensor data to prove that both the elevation and uncertainties are estimated correctly. Data was gathered from a farm in Marulan outside Sydney as well as from a production ore mine in Western Australia.

• Proposed a method to detect and filter outliers using a statistical test. As the collected data is often noisy, outliers must be detected and removed. They will otherwise interfere with the estimation process and cause unnecessary errors in the estimated surface.


• Developed a method for automatic detection and labelling of toes and crests. These are critical terrain features for mining applications as they are used to verify the current shape of the mine and plan the future development. The estimated terrain model can be scanned for interesting edges that are automatically labelled into the correct class. The result is a complete system for estimation, fusion and classification of terrain features in open-pit mines.

1.5 Thesis Outline

The remainder of the thesis is organized as follows. Chapter 2 presents the terrain estimation in detail. The theory behind Gaussian Processes and how they can be used to estimate terrain is described, and an extension to fuse multiple sensors with different parameters is discussed.

The following chapter, Chapter 3, describes the implementation details required to realize the system. Different approximation techniques are described, as well as other ways to speed up the computations.

Chapter 4 discusses detection and labelling of specific key features found in mines; in this work, toes and crests. Different ways to detect and label them are possible and one method is proposed. The theory behind two required algorithms is given in detail.

The implementation details on the toe and crest detection are given in Chapter 5 which describes the proposed method in more detail. Experiments using the developed algorithms for terrain estimation and feature extraction are provided in Chapter 6 where data from iron ore mines from Western Australia is used to evaluate the methods.

Finally, Chapter 7 concludes the thesis, summarizes the results and discusses possible future work that should be done in this area to further improve the estimation and feature detection. The data sets used in the experiments, the Cholesky decomposition of matrices, abbreviations and notation can be found in the appendices.


Chapter 2

Terrain Estimation

This chapter discusses methods for terrain estimation: the theory behind terrain modelling and how Gaussian Processes can be used to model terrain. The chapter also addresses methods for using data from multiple sensors and from different spatial locations in terrain estimation.

In its general form, terrain estimation consists of finding the best estimate of a surface that approximates the true ground. For each point in the xy-plane, $\bar{x} = (x, y)$, the corresponding height $z = f(\bar{x})$ needs to be modelled.

The given input is one or many point clouds consisting of a number of gathered sensor readings. Each point $\bar{q}$ in the point cloud is a point in three dimensions, $\bar{q} = (x, y, z)$. For each point, the elevation uncertainty $\sigma_n^2$ is also known.

When scanning the ground for input data, many challenges emerge. Areas might be occluded from the scanning location depending on the type of sensor used. Thus, the gathered sensor data may contain empty or sparse areas. Different sensors also have different properties with respect to accuracy, range, visibility and noise. All these factors need to be taken into account when performing data gathering and estimation.

It is not desirable to have input data with empty regions; to circumvent this, data can be gathered from additional locations where these regions are visible. When using multiple locations to scan the terrain, a way to fuse the different scans together is required. The same scanner does not necessarily have to be used in all the scans. A common case in the mining environment is that a number of sparse but approximately equally spaced data points collected from a GPS device are available over the whole mine. In addition to this, a number of laser scans are performed at different locations to give a high resolution point cloud that covers part of the area.

2.1 Related work

The problem of generating a 3D representation of terrain is difficult and many different approaches are possible. Digital elevation models (DEM) are usually represented using a grid or as a triangular irregular network to create a 2.5D representation of the terrain to be modelled. This representation is often called a height map or elevation map and stores the elevation of the terrain at each point in a 2D space.

Grid-based methods (Wack and Wimmer [55], Ackermann and Kraus [1]) are popular because of their simple structure. A regular grid is formed and the height is stored in each cell. Due to the regular structure, local features cannot be handled without increasing the resolution of the whole model.

Triangulated irregular networks, studied for example by Peucker et al. [43] and Leal et al. [29], are a way to represent a surface by connecting irregular nodes with lines into non-overlapping triangles. A regular grid would need to be adjusted to the part of the terrain with the most features and would therefore be redundant in smooth areas. A triangulated irregular network solves this problem by locally adapting to the underlying terrain. One way to store the data is in a spatial data structure such as the quad-tree suggested by Pajarola et al. [39].

An extension to elevation maps is suggested by Triebel et al. [51] to handle vertical structures and multiple levels. This approach allows each grid cell to store multiple surfaces, which allows modelling of structures such as bridges, caves and overhanging structures. Wurm et al. [61] use octrees to build a model of a 3D environment.

For grid-based methods, a way to interpolate the surface between the grid points is needed, and the method chosen greatly affects the quality of the generated surface. A review and comparison of many common methods is given by Kidner et al. [24]. It is found that higher order techniques are always better than linear techniques and that a minimum of bicubic interpolation should be used for digital elevation models. Similar results are found by Kidner [25], where higher order models are found to give better estimates. Rees [49] studied the accuracy of interpolating a digital terrain model to a higher resolution. In general, the different interpolation techniques do not handle uncertainties in a statistical way, and if the uncertainty in the terrain model is required, other methods are often needed.

Different methods for scanning the terrain and acquiring data are also possible. Buckley et al. [7] used a helicopter equipped with a laser scanner for data gathering and Perlant [42] used a stereo camera.

The use of machine learning, and more specifically Gaussian Processes, for terrain generation is a way to handle uncertainties in the scanned data and the sensor in a statistically sound way. Gaussian Processes are a non-parametric method and allow for large scale terrain modelling. In the geostatistics field, prediction using Gaussian Processes is popular and is known under the name Kriging [12, 35].

Gaussian Processes for large scale terrain modelling have previously been studied for example by Vasudevan et al. [52] who described the approach in detail. Different kernels and approximation techniques are evaluated together with a comparison with other methods for terrain modelling. Three different data sets with both laser scans and GPS readings from different mines in Australia are used in the evaluation. The suggested approach is to use Gaussian Processes to generate a surface of the point cloud collected by the sensors.

Locally adaptive Gaussian Processes have been studied, for example, by Lang et al. [27] and Plagemann et al. [44]. A local kernel adaptation technique is proposed to account for local variations of the terrain by using different locally varying length scales. The idea is that the length scales differ between areas with large variation in the elevation and areas which are relatively flat. All local length scales are first initialized with the globally learned parameters and then iteratively updated by a learning algorithm until the local parameters have converged. Results show that this approach allows reliable estimation of the terrain and gives a good balance between smoothing and preserving rough terrain.

Fusion of different sensors or sensor readings from different locations has also been studied, for example by Vasudevan et al. [53]. The work gives the fundamentals of Gaussian Process fusion and how different sensors with different noise parameters can be fused together. It also evaluates different sampling and filtering techniques when dealing with multiple sensors.

Vasudevan et al. [54] suggest an extension called multi-output Gaussian Processes to simultaneously handle multiple dependent outputs. It is shown that information from multiple data sets improves the prediction even when the data sets do not overlap.

2.2 Gaussian Process Estimation

Gaussian Processes [48, 13, 52, 59] are a generalization of Gaussian distributions, extending them to infinite dimensions. For regression they solve a typical prediction problem where some noisy observations $z$ are given at locations $\bar{x}$. The goal is to compute the best estimate, $f_*$, of the underlying function $f(\bar{x})$ at a new test point $\bar{x}_*$.

Some statistical assumptions about the underlying function $f(\bar{x})$ are needed. These are represented by the choice of a covariance function, which is explained in more detail in Section 2.2.2. A specific functional model is however not needed, and this may allow the estimated function to fit better to the underlying observations. Therefore, Gaussian Processes are often referred to as non-parametric, since no underlying function is explicitly defined.

2.2.1 Definition

A Gaussian Process is mathematically defined by a mean function $m(\bar{x})$ and a covariance function $k(\bar{x}, \bar{x}')$, where

$$m(\bar{x}) = E\big(f(\bar{x})\big), \qquad (2.1a)$$
$$k(\bar{x}, \bar{x}') = E\big((f(\bar{x}) - m(\bar{x}))(f(\bar{x}') - m(\bar{x}'))\big). \qquad (2.1b)$$

It is here assumed for simplicity that the mean $m(\bar{x})$ is zero, but this is not necessary. A Gaussian Process is then usually written as

$$f(\bar{x}) \sim \mathcal{GP}\big(0, k(\bar{x}, \bar{x}')\big). \qquad (2.2)$$


The observed data is noisy and can be seen as an unknown underlying function $f(\bar{x})$ with some added Gaussian noise $\varepsilon$ with variance $\sigma_n^2$,

$$z = f(\bar{x}) + \varepsilon, \qquad (2.3a)$$
$$\varepsilon \sim \mathcal{N}(0, \sigma_n^2). \qquad (2.3b)$$

For the terrain modelling problem, the given data is a position in the xy-plane, $\bar{x} = (x, y)$, and the observed data at each point is the noisy height of the terrain at that point according to (2.3a).

2.2.2 Covariance Function

The covariance function, or kernel, describes how two points $\bar{x}$ and $\bar{x}'$ depend on each other, and a number of different choices exist. The most popular kernel is perhaps the squared exponential kernel; another example is the neural network kernel, which is used in this work. A number of different kernels are explained below. Each kernel requires some parameters, and the set of all these parameters is referred to as the kernel hyperparameters.

The choice of a kernel and the hyperparameters is everything that is needed to estimate the function value and the uncertainty at an arbitrary position. The covariance kernel is by definition always symmetric and positive-definite.

Squared exponential kernel

The squared exponential kernel is defined as

$$k(\bar{x}, \bar{x}') = \sigma_f^2 \exp\left(-\frac{1}{2}(\bar{x} - \bar{x}')^T \Sigma\, (\bar{x} - \bar{x}')\right), \qquad (2.4a)$$

$$\Sigma = \begin{pmatrix} l_x & 0 \\ 0 & l_y \end{pmatrix}^{-2}, \qquad (2.4b)$$

where $\Sigma$ is the length scale matrix and the parameters $l_x$ and $l_y$ describe how the underlying function varies in the directions of x and y. A large value of the length scale parameters gives points far away from the estimation point a bigger influence over the final estimated height, while a small value limits this influence to nearby points. Furthermore, $\sigma_f^2$ is the signal variance and $\sigma_n^2$ is the noise variance.

This kernel is stationary since it is a function of $\bar{x} - \bar{x}'$ and thus invariant to translation.

For the squared exponential kernel, the set of all hyperparameters becomes $\theta = \{\sigma_f, l_x, l_y, \sigma_n\}$.

Neural network kernel

The neural network kernel [60], with the transfer function modelled as a sigmoid function, is defined as

$$k(\bar{x}, \bar{x}') = \sigma_f^2 \arcsin\left(\frac{\beta + 2\bar{x}^T \Sigma\, \bar{x}'}{\sqrt{(1 + \beta + 2\bar{x}^T \Sigma\, \bar{x})(1 + \beta + 2\bar{x}'^T \Sigma\, \bar{x}')}}\right), \qquad (2.5)$$

where $\beta$ is a bias factor and $\Sigma$ is the same matrix as defined in (2.4b). The neural network kernel is not invariant to translation and is therefore not stationary. The parameters $\sigma_f^2$ and $\sigma_n^2$ are the signal variance and the noise variance respectively. The set of hyperparameters for the neural network kernel is $\theta = \{\sigma_f, \beta, l_x, l_y, \sigma_n\}$.
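To make the kernel concrete, a minimal C++ sketch of (2.5) using Eigen (the linear algebra library used by the implementation in Chapter 3) is given below. The function name and signature are illustrative and not taken from the thesis code; the parameter values are assumed to come from the learned hyperparameters.

#include <cmath>
#include <Eigen/Dense>

// Illustrative sketch of the neural network kernel (2.5). Sigma is the
// length scale matrix of (2.4b), sigma_f2 the signal variance and beta
// the bias factor; all values are assumed to be learned beforehand.
double neuralNetworkKernel(const Eigen::Vector2d& x, const Eigen::Vector2d& xp,
                           double sigma_f2, double beta,
                           const Eigen::Matrix2d& Sigma) {
    const double num = beta + 2.0 * x.dot(Sigma * xp);
    const double dx  = 1.0 + beta + 2.0 * x.dot(Sigma * x);
    const double dxp = 1.0 + beta + 2.0 * xp.dot(Sigma * xp);
    return sigma_f2 * std::asin(num / std::sqrt(dx * dxp));
}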

2.2.3 Prediction

Predicting a value $z_*$ at a test point $\bar{x}_*$ using Gaussian Process regression can be done from the noisy observations after choosing a kernel and hyperparameters. The hyperparameters must be chosen in some way, preferably by learning from the input data, which is described in Section 2.2.4.

The covariance between any two observations is

$$\mathrm{Cov}(z_p, z_q) = k(\bar{x}_p, \bar{x}_q) + \sigma_n^2 \delta_{pq}, \qquad (2.6)$$

where $\delta_{pq}$ is the Kronecker delta, defined as $\delta_{pq} = 1$ if $p = q$ and $0$ otherwise. Using matrix notation, $K(X, X)$ is the matrix of covariances between all observed values of $\bar{x}$, here denoted $X$. For $m$ observed points, the matrix $K(X, X)$ becomes an $m \times m$ matrix,

$$K(X, X) = \begin{pmatrix} k(\bar{x}_1, \bar{x}_1) & k(\bar{x}_1, \bar{x}_2) & \cdots & k(\bar{x}_1, \bar{x}_m) \\ k(\bar{x}_2, \bar{x}_1) & k(\bar{x}_2, \bar{x}_2) & \cdots & k(\bar{x}_2, \bar{x}_m) \\ \vdots & \vdots & \ddots & \vdots \\ k(\bar{x}_m, \bar{x}_1) & k(\bar{x}_m, \bar{x}_2) & \cdots & k(\bar{x}_m, \bar{x}_m) \end{pmatrix}. \qquad (2.7)$$

The matrices $K(X_*, X)$ and $K(X_*, X_*)$ are defined in a similar way,

$$K(X_*, X) = \big(k(\bar{x}_*, \bar{x}_1) \;\; k(\bar{x}_*, \bar{x}_2) \;\; \cdots \;\; k(\bar{x}_*, \bar{x}_m)\big), \qquad (2.8a)$$
$$K(X_*, X_*) = k(\bar{x}_*, \bar{x}_*). \qquad (2.8b)$$

The covariance matrix can therefore be calculated as

$$\mathrm{Cov}(Z) = K(X, X) + \sigma_n^2 I, \qquad (2.9)$$

where $Z$ is the set of all observed values of $f(X)$. The joint distribution of the observed points $Z$ and the test outputs $f_*$ becomes

$$\begin{pmatrix} Z \\ f_* \end{pmatrix} \sim \mathcal{N}\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} K(X, X) + \sigma_n^2 I & K(X, X_*) \\ K(X_*, X) & K(X_*, X_*) \end{pmatrix}\right). \qquad (2.10)$$

We would like to calculate the conditional probability of $f_*$ given $Z$. This probability $p(f_*|Z)$ is also Gaussian distributed, which gives the following conditional distribution and prediction expressions for the estimated height and covariance:

$$f_*|Z \sim \mathcal{N}\big(\hat{f}_*, \mathrm{Cov}(f_*)\big), \qquad (2.11a)$$
$$\hat{f}_* = K(X_*, X)\big(K(X, X) + \sigma_n^2 I\big)^{-1} Z, \qquad (2.11b)$$
$$\mathrm{Cov}(f_*) = K(X_*, X_*) - K(X_*, X)\big(K(X, X) + \sigma_n^2 I\big)^{-1} K(X, X_*). \qquad (2.11c)$$


The process of determining the best estimate $\hat{f}_*$ of an unknown height and its variance at a position $X_*$ thus involves using (2.11b) and (2.11c) to compute the estimated height and variance. One property of Gaussian distributions is that the covariance in (2.11c) does not depend on the observed values $Z$. If instead the estimated observation $\hat{z}_*$ is required, the noise term $\sigma_n^2$ must be added to the covariance computed by (2.11c):

$$\hat{z}_* = \hat{f}_*, \qquad (2.12a)$$
$$\mathrm{Cov}(z_*) = \mathrm{Cov}(f_*) + \sigma_n^2 I. \qquad (2.12b)$$

Using the notation introduced above requires the inverse $(K(X, X) + \sigma_n^2 I)^{-1}$ to be calculated, which is typically an operation of complexity $O(m^3)$, where $m$ is the number of data points. If $m$ is large, this operation becomes very computationally expensive. Solving a linear system by explicit matrix inversion is often the least efficient way. When floating point arithmetic is used, the numerical properties of the solution method must also be taken into consideration. A much better way is to use Gaussian elimination or to perform a matrix decomposition such as LU decomposition to solve the given linear equation system [17].

As stated earlier, the covariance matrix is always symmetric and positive-definite, and this fact can be used to speed up calculations. Cholesky decomposition can be applied to any symmetric positive-definite matrix to factor it into a lower triangular matrix and its conjugate transpose. A more detailed description of Cholesky decomposition can be found in Appendix B. Cholesky decomposition is faster and numerically more stable than directly inverting the matrix. The number of operations used by Cholesky decomposition is almost half that of LU decomposition, which makes it much more efficient for solving linear systems of equations.

Prediction using Gaussian Process regression with Cholesky decomposition in practice is shown in Algorithm 2.1, where the following shorthand notation is used: $K = K(X, X)$, $K_* = K(X_*, X)$ and $K_{**} = K(X_*, X_*)$.

Algorithm 2.1 Gaussian Process regression [48]

Require: $X$ inputs, $Z$ observed points, $k$ covariance function, $\sigma_n$ noise parameter, $X_*$ test inputs
1: $L = \mathrm{cholesky}(K + \sigma_n^2 I)$
2: $\alpha = L^T \backslash (L \backslash Z)$
3: $\hat{f}_* = K_* \alpha$
4: $v = L \backslash K_*^T$
5: $\mathrm{Cov}(f_*) = K_{**} - v^T v$

2.2.4 Learning Hyperparameters

The quality of the output from the regression depends on the choice of covariance function and the choice of hyperparameters. For the neural network kernel, the hyperparameters can be expressed as the set $\theta = \{\sigma_f, l_x, l_y, \beta, \sigma_n\}$. If these parameters are chosen badly, the output will be poor. The hyperparameters usually depend on the data sets and must somehow be chosen before regression can be performed.

Many ways exist to determine the best possible hyperparameters, including the Bayesian approach and cross validation. Here, the Bayesian approach is described, which estimates the hyperparameters by maximizing the log marginal likelihood. The log marginal likelihood can be derived from the marginal likelihood $p(Z|X)$, which is given by the following integral:

$$p(Z|X) = \int p(Z|f, X)\, p(f|X)\, df. \qquad (2.13)$$

An expression for the log marginal likelihood conditioned on the hyperparameters is obtained after integration:

$$\log p(Z|X, \theta) = -\frac{1}{2} Z^T \big(K + \sigma_n^2 I\big)^{-1} Z - \frac{1}{2} \log \big|K + \sigma_n^2 I\big| - \frac{n}{2} \log 2\pi. \qquad (2.14)$$

The best estimate of the hyperparameters occurs when $p(\theta|X, Z)$ is greatest. Assuming little prior knowledge of the value of $\theta$, then according to Bayes' theorem the following relation holds:

$$p(\theta|X, Z) \propto p(Z|X, \theta)\, p(\theta|X). \qquad (2.15)$$

Assuming $p(\theta|X)$ is constant gives

$$p(\theta|X, Z) \propto p(Z|X, \theta) \;\Rightarrow \qquad (2.16a)$$
$$\max p(\theta|X, Z) \;\Leftrightarrow\; \max p(Z|X, \theta). \qquad (2.16b)$$

The best estimate of the hyperparameters is thus obtained by maximizing the log marginal likelihood $\log p(Z|X, \theta)$ defined in (2.14).

The optimization can be performed with standard optimization methods such as conjugate gradient search [19] or the Nelder-Mead simplex method [38]. If the noise term $\sigma_n$ is known for the input data, this term does not have to be optimized and the known value can instead be used to reduce the dimension of the optimization problem.

The marginal likelihood most likely has a number of local maxima, which can make it hard for an optimizer to find the global maximum. Some care must therefore be taken so that a bad local optimum is not chosen. Multiple datasets that are assumed to share the same set of hyperparameters can also be used; this is called multi-task learning. One solution is to add the log marginal likelihoods from the different datasets together and optimize this sum instead [4].
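As an illustration, the log marginal likelihood (2.14) can be evaluated from a Cholesky factor, which avoids forming the inverse and gives the determinant almost for free; the resulting function can then be handed to a Nelder-Mead or conjugate gradient optimizer. This is a hedged sketch with illustrative names, not the thesis implementation.

#include <cmath>
#include <Eigen/Dense>

// Evaluate log p(Z|X, theta) of (2.14) for a kernel matrix K computed with
// the current hyperparameters, and noise variance sigma_n2.
double logMarginalLikelihood(const Eigen::MatrixXd& K,
                             const Eigen::VectorXd& Z, double sigma_n2) {
    const double pi = 3.14159265358979323846;
    const int n = static_cast<int>(Z.size());
    Eigen::MatrixXd Ky = K;
    Ky.diagonal().array() += sigma_n2;          // K + sigma_n^2 I
    Eigen::LLT<Eigen::MatrixXd> llt(Ky);        // Ky = L L^T
    Eigen::MatrixXd L = llt.matrixL();
    Eigen::VectorXd alpha = llt.solve(Z);       // (K + sigma_n^2 I)^{-1} Z
    double halfLogDet = 0.0;                    // 0.5*log|Ky| = sum(log L_ii)
    for (int i = 0; i < n; ++i) halfLogDet += std::log(L(i, i));
    return -0.5 * Z.dot(alpha) - halfLogDet - 0.5 * n * std::log(2.0 * pi);
}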

Example: Gaussian Process Regression

An example of performing Gaussian Process regression in one dimension can be seen in Figure 2.1. In the figure, a function (solid line) is sampled at regular intervals, but with some added noise (circles). A Gaussian Process regression is performed on the sampled data to estimate the function between the sampled points (crosses).


A 95% confidence interval is shown for the estimated points. The covariance function used is a neural network kernel and the hyperparameters are learned from the sampled data using a Nelder-Mead simplex method.


Figure 2.1. An example of regression using Gaussian Processes. The circles are samples with added noise from the solid line. The crosses are estimated points with a 95% confidence interval plotted as lines. The hyperparameters are learned from the input data by maximizing the log marginal likelihood.

2.3 Gaussian Process Fusion

When estimating a large area of terrain, it is often not possible or practical to scan the whole area at once. Limitations in the type of sensors used or the properties of the terrain could make it impossible to get sufficient data to build an accurate model, for example due to range or visibility limitations.

Different sensors also have different properties and that should be taken into consideration when estimating the surface to get the best possible estimate of the terrain area in question. A way to include these different properties in the estimation as well as to fuse different scans together is therefore required.


Estimation using Gaussian Processes as described in Section 2.2 assumes that only one set of data points is used and that the noise is the same for all points. If different sensors with different properties are used, or if data is collected in different locations, the Gaussian Process regression must be altered to handle this new information. Using many sensors or scans also increases the amount of data available during regression and this must be taken into consideration when fusing data together so that the estimation process does not become too complex.

The proposed approach is similar to the method used by Vasudevan et al. [53], but is generalized to allow for individual noise parameters on all sampled points. Depending on the type of sensor used this could be useful, for example if the variance is dependent on the distance or location of the measured point.

The fusion is based on the idea that the hyperparameters are the same among the different data sets, but with different noise parameters. It is also assumed that a regular Gaussian Process regression can be performed on the datasets using different noise parameters, but with the same kernel.

Each sampled point $\bar{x}_i$ thus has a noise parameter $\sigma_i$ associated with it, which depends on the noise of the sensor it was measured with. To handle different noise parameters, (2.11b) and (2.11c) have to be modified accordingly to

$$\hat{f}_* = K(X_*, X)\big(K(X, X) + \Psi\big)^{-1} Z, \qquad (2.17a)$$
$$\mathrm{Cov}(f_*) = K(X_*, X_*) - K(X_*, X)\big(K(X, X) + \Psi\big)^{-1} K(X, X_*), \qquad (2.17b)$$

where $\Psi$ is a diagonal matrix with the different noise parameters $\sigma_i^2$ on the diagonal, $\Psi = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_{n_c}^2)$, and $Z$ is the set of observed values, $Z = [z_0, z_1, \cdots, z_n]^T$.

When computing the variance for the height $z_*$ instead of $f_*$, a noise term $R(X_*)$ has to be added to (2.17b). This term is the expected value of the noise terms for the data points used, which results in [53]

$$\hat{z}_* = \hat{f}_*, \qquad (2.18a)$$
$$\mathrm{Cov}(z_*) = \mathrm{Cov}(f_*) + R(X_*)\, I, \qquad (2.18b)$$
$$R(X_*) = \frac{1}{N} \sum_{i=1}^{N} \sigma_i^2, \qquad (2.18c)$$

where $N$ is the total number of data points used in the estimation of the current point and $\sigma_i$ is the noise parameter for point $i$.

This method takes all scanned points and uses the individual noise parameters to determine the best estimate of a new test point using fused sensor data. For terrain generation this means that multiple data sets can be fused together into a single representation of the terrain which includes all the collected sensor data. The individual sensor properties are also exploited to get the best possible estimate of the terrain.
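A minimal sketch of the fused prediction is given below, mirroring (2.17)-(2.18): the only changes relative to single-sensor prediction are the per-point diagonal matrix Ψ and the averaged noise term R(X_*). The kernel and all names are illustrative placeholders, following the conventions of the sketch in Section 2.2.3.

#include <cmath>
#include <vector>
#include <Eigen/Dense>

// Placeholder kernel, as in the earlier sketch.
double kernelFn(const Eigen::Vector2d& a, const Eigen::Vector2d& b) {
    return std::exp(-0.5 * (a - b).squaredNorm());
}

// Fused prediction (2.17)-(2.18); psi(i) = sigma_i^2 is the noise variance
// of observation i, taken from the sensor that produced it.
void gpPredictFused(const std::vector<Eigen::Vector2d>& X,
                    const Eigen::VectorXd& Z, const Eigen::VectorXd& psi,
                    const Eigen::Vector2d& xs, double& mean, double& var) {
    const int m = static_cast<int>(X.size());
    Eigen::MatrixXd K(m, m);
    Eigen::VectorXd Ks(m);
    for (int i = 0; i < m; ++i) {
        Ks(i) = kernelFn(xs, X[i]);
        for (int j = 0; j < m; ++j) K(i, j) = kernelFn(X[i], X[j]);
    }
    K.diagonal() += psi;                          // K + Psi, eq. (2.17)
    Eigen::LLT<Eigen::MatrixXd> llt(K);
    mean = Ks.dot(llt.solve(Z));                  // f_* of (2.17a)
    Eigen::VectorXd v = llt.matrixL().solve(Ks);
    var = kernelFn(xs, xs) - v.squaredNorm()      // Cov(f_*) of (2.17b)
        + psi.mean();                             // + R(X_*) of (2.18c)
}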

Example: Gaussian Process Fusion

An example of fusion using Gaussian Processes can be seen in Figure 2.2. In this figure, two different sets of samples are taken from the solid line. The sampled points, circles and triangles, have different noise parameters and are fused together using the method described above.

The crosses are the estimated points, shown with a 95% confidence interval. The hyperparameters were learned by maximizing the log marginal likelihood on the first dataset.


Figure 2.2. An example of fusion using Gaussian Processes. The circles and triangles are sampled points with different noise parameters from the solid line. The crosses are estimated points from fusion of the sampled points, with a 95% confidence interval plotted as lines. The hyperparameters are learned from the input data by maximizing the log marginal likelihood.

2.4 Conclusion

This chapter has described the theory of and different approaches to terrain modelling. This work performs terrain estimation using Gaussian Process regression; the definition and the application to terrain modelling have been discussed. Gaussian Processes are non-parametric, incorporate the properties of the sensors used, and require no prior assumptions about the underlying terrain.

The regression method can be extended to allow fusion of different sensors or scans to get a better estimate when limitations in sensor range, visibility or other circumstances occur.


Chapter 3

Terrain Estimation using Gaussian Processes

This chapter contains details on the implementation and the challenges that arise in terrain estimation when dealing with large datasets. A system overview is given and different approximations which are used to speed up calculations are described.

To store the collected data efficiently and reduce the complexity of the problem, the method proposed by Vasudevan et al. [52] is used. An approximation method based on the nearest neighbours is used together with a KD-tree that stores all the sensor data. Threading can also be used when processing the data to further speed up calculations. The collected data is sometimes noisy, as dust is detected by the laser scanner, and therefore outliers must be identified and removed. To achieve this, a statistical test is proposed and described later in this chapter.

To evaluate the performance of the estimator, the model must be validated. k-fold cross validation is used when no reference data is available. Otherwise, the computed model is compared with the true ground heights, with the mean squared error used as the metric.

3.1 System Overview

The system consists of three stages as illustrated in Figure 3.1: input, regression and output. In the input stage, data from one or many sensors is loaded and prepared for processing. Data used for validation is also loaded. When all required data is loaded, the KD-trees (described in Section 3.6) are prepared and built for fast access when required. A grid where the heights need to be estimated is then created and stored.

When all data is loaded and the KD-tree is built, the regression stage begins. If threading is used, the main process creates the requested number of worker threads and fills their input queue with the locations where the heights need to be estimated. The main process waits until all threads are finished and stores the data afterwards. If threading is not used, the main process estimates the unknown heights itself and stores the data afterwards.

After all data is processed, a mesh is created so the surface can be viewed. The mesh is formed from the grid by connecting adjacent points into triangles, and for each vertex the uncertainty at that point is also stored. Finally, the mesh is saved to disk for later processing or viewing.

Figure 3.1. An overview of the three stages in the system and the steps each stage consists of: the input stage (load data, build tree, create grid), the regression stage (perform regression) and the output stage (create mesh, save output).

3.2 Matrix and Vector Operations

For matrix operations, the implementation uses Eigen [15]. Eigen is a library for linear algebra written in C++ and handles both basic arithmetic operations such as addition and multiplication as well as more advanced operations such as linear transformation, decomposition and equation solving for matrices and vectors.

Eigen allows the program to take advantage of the Streaming SIMD Extensions (SSE) found in all newer processors. SSE and its successors form a single instruction, multiple data (SIMD) instruction set that allows multiple operations to be performed in parallel [22].

A common operation such as vector addition in four dimensions, where four floating point numbers are added together, would usually require four add instructions to be issued by the processor. By taking advantage of the SSE instruction set, these calculations can instead be performed by a single instruction, potentially increasing performance.
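The four-float example above can be made concrete: with SSE enabled, Eigen's fixed-size Vector4f addition compiles to a single packed add instead of four scalar adds. A minimal illustration, not taken from the thesis code:

#include <Eigen/Dense>

int main() {
    Eigen::Vector4f a(1.0f, 2.0f, 3.0f, 4.0f);
    Eigen::Vector4f b(5.0f, 6.0f, 7.0f, 8.0f);
    // With SSE enabled this addition is one packed instruction (addps),
    // not four separate scalar additions.
    Eigen::Vector4f c = a + b;
    return c.sum() > 0.0f ? 0 : 1;
}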


Matrix and vector operations are particularly well suited to be implemented as SIMD instructions as many additions and multiplications need to be executed. If the processor has support for SSE2 instructions, then Eigen can be configured to use these instructions.

3.3 Removal of Mean

The GP-regression, described in Section 2.2, requires the mean of the heights in the input data to be zero. Before the tree is built, the mean is removed and saved. When the regression is finished, the previously calculated mean is added to the estimated height.

The precision of floating point calculations must also be taken into account. One issue is the calculation of the neural network kernel, which involves dividing two potentially large numbers and then taking the inverse sine of the result. Precision is lost when dividing two large numbers, and the results will be similar for nearby points. The calculated kernel therefore also loses precision and becomes approximately the same everywhere.

One solution to this problem is to also remove the mean of the locations before building the KD-tree. A mean of zero increases the precision of the division and makes the calculation of the kernel more precise.

3.4 Validation

For validation of the estimated surfaces, two methods are used. k-fold cross validation [26] is used to validate the estimated heights when no other independent height measurements are available. If another dataset with true heights is available, the estimated heights are instead compared to these values.

For evaluation of the model output, the mean squared error (MSE) is used. The mean squared error is the mean of the squared error between all estimated values $f(x_i)$ and true function values $y_i$, computed according to

$$MSE = \frac{1}{N} \sum_{i=1}^{N} \big(y_i - f(x_i)\big)^2. \qquad (3.1)$$

A small mean squared error indicates a good fit of the model to the data.

k-fold cross validation works by randomly dividing the input data into k subsets, $D_1, D_2, \ldots, D_k$ [2]. Estimation is done k times, and each time, $t = 1, 2, \ldots, k$, all subsets except $D_t$ are used in the estimation process. The estimated heights are compared with the data in the held-out subset $D_t$. The mean squared error is computed during each iteration and the average mean squared error is used as the model fit for the whole dataset. The advantage of this method is that it allows all data to be used for both training and validation.

To validate the estimated uncertainty, a 95% confidence interval is used to compare the estimated uncertainty with the true height. For Gaussian distributed data with standard deviation σ, this means that 95% of the estimated points should have a height error of less than 1.96σ metres.
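A small sketch of these two validation checks, the MSE of (3.1) and the fraction of points falling inside the 95% interval, is shown below; the names are illustrative.

#include <cmath>
#include <vector>

// Mean squared error (3.1) between true and estimated heights.
double meanSquaredError(const std::vector<double>& truth,
                        const std::vector<double>& estimate) {
    double sum = 0.0;
    for (std::size_t i = 0; i < truth.size(); ++i) {
        const double e = truth[i] - estimate[i];
        sum += e * e;
    }
    return sum / static_cast<double>(truth.size());
}

// Fraction of points whose error lies inside the 95% confidence interval;
// for a well-calibrated model this should be close to 0.95.
double fractionWithin95(const std::vector<double>& truth,
                        const std::vector<double>& estimate,
                        const std::vector<double>& sigma) {
    std::size_t inside = 0;
    for (std::size_t i = 0; i < truth.size(); ++i)
        if (std::abs(truth[i] - estimate[i]) < 1.96 * sigma[i]) ++inside;
    return static_cast<double>(inside) / static_cast<double>(truth.size());
}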


3.5 Approximations

When working with large datasets, it is often not practically possible to take advantage of all the information given. Some of the information may also be redundant for the problem at hand and can therefore be discarded without a great loss in accuracy.

In the large scale terrain modelling applications considered in this thesis, the datasets typically contain at least one million points. This makes the terrain modelling very computationally demanding if all points are used for estimation. An example is the matrix inversion, which typically has complexity $O(m^3)$, where $m$ is the number of data points. For a large number of data points this would be too slow for an online system, and approximations must be used to achieve satisfactory execution times.

3.5.1 Learning from a Subset of Data

Using a large dataset in determining the hyperparameters described in Section 2.2.4 can be very computationally expensive. Therefore, only a subset of the input data is used during learning. The number of data points used for learning is typically only a few thousand points.

In order to minimize the time needed for modelling a surface, a value of the hyperparameters that works well over all tested datasets is used in this work. Maximization of the log marginal likelihood (2.14) was performed on multiple datasets individually, and a value of the hyperparameters found to give satisfactory results on all datasets was then used for the experiments.

3.5.2 Local Neighbourhood Approximation

The number of datapoints N in the terrain modelling applications considered in this work is typically very large. This has potential to cause significant computational difficulties if not managed.

All N data points are stored in a spatial data structure, and when a GP estimation is to be performed at a test point, only the $n_c$ closest points are fetched from the point cloud and used in the regression. This is termed a local neighbourhood approximation and is based on the assumption that the data nearest the evaluation point is the most informative when an unknown height is to be estimated. Importantly, it reduces the computational complexity extensively for relatively small values of $n_c$, since $n_c \ll N$. However, the approach loses the global continuity of the terrain model, since the surface is approximated by its local neighbourhood.

In this implementation, a KD-tree is used to store the point cloud. More information about KD-trees can be found in Section 3.6. Experiments on real data from mines with different values of $n_c$, and how the runtime and the quality of the estimated surface are affected by $n_c$, are presented in Section 6.3.2. An example using a constructed surface follows below.


3.5.3 Evaluation

To evaluate how the local neighbourhood approximation affects the quality of the generated surface, a surface was constructed and 200 points were sampled from it. These points were used to estimate the surface using a varying number of closest neighbours; the results can be seen in Table 3.1.

Cross validation shows that using only a few of the closest neighbours gives poor performance with a high mean square error and standard deviation. However, using more neighbours lowers both the error and the estimated uncertainties.

No. of neighbours    MSE (m²)    Std. dev. (m)
5                    0.334       0.194
10                   0.182       0.106
20                   0.065       0.083
35                   0.071       0.076
50                   0.126       0.076
75                   0.050       0.073
100                  0.049       0.073
150                  0.033       0.070
All                  0.040       0.070

Table 3.1. The results from performing cross validation on a constructed surface to evaluate the impact of the closest neighbour approximation. A total of 200 points were used in the simulations.

3.6 KD-Tree

A KD-tree is a binary partitioning tree for storing a number of data points in a k-dimensional space [46, 37]. Since it is binary, each node has exactly two children. The KD-tree splits the dataset into two subspaces along one dimension. All data points that have a lower coordinate in the current splitting dimension are assigned to the left subtree and the remaining data points are stored in the right subtree. In two dimensions, the splitting dimension usually alternates between the horizontal and the vertical axes.

Construction of the tree involves selecting a pivot element around which the splitting is performed. One choice for building a balanced tree is an element such that both subspaces contain the same number of data points after splitting, or differ by one in the case of an odd number of points. A balanced tree is one where the two subtrees of each node have roughly the same height.

The KD-tree allows for efficient search for a nearest neighbour, as well as for all points in the tree within a certain range of a query point. For a nearest neighbour search, the leaf node containing the target point is found first. Since the nearest neighbour must lie at least as close as this leaf, parts of the tree do not need to be checked.


The total complexity for finding the closest neighbour in a tree with m points becomes O(log m). To find the n nearest neighbours the process can be repeated by always ignoring the previously found closest points until n points are found.
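A hedged sketch of the neighbour query behind the local neighbourhood approximation of Section 3.5.2 is shown below. The implementation performs this query with a KD-tree (via VTK); to keep the sketch self-contained a std::partial_sort stand-in is used instead, which trades the O(log m) behaviour described above for simplicity.

#include <algorithm>
#include <vector>
#include <Eigen/Dense>

// Return the nc points in the cloud closest to the query location.
// Stand-in for the KD-tree query; O(N log nc) rather than O(log N).
std::vector<Eigen::Vector2d> nearestNeighbours(
        std::vector<Eigen::Vector2d> pts,           // copied on purpose
        const Eigen::Vector2d& query, std::size_t nc) {
    nc = std::min(nc, pts.size());
    std::partial_sort(pts.begin(), pts.begin() + nc, pts.end(),
        [&query](const Eigen::Vector2d& a, const Eigen::Vector2d& b) {
            return (a - query).squaredNorm() < (b - query).squaredNorm();
        });
    pts.resize(nc);   // only the nc closest points enter the regression
    return pts;
}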

3.7 Outlier Detection

The scanned data sometimes contains outliers; that is, observations which are distant from the rest of the data. This could be due to an item such as a leaf, or dust in the air, that the scanner has detected. These points are not desirable as they interfere with the estimation of the terrain and introduce unwanted errors in the model. In mining environments, dust is a common problem as it is produced by most daily operations. Figure 3.2 shows the dust that arises when an excavator loads a truck with ore.

Figure 3.2. Photo from the mine in West Angelas showing the dust produced when an excavator loads a truck with ore.

Outliers are detected using a statistical Z-test [28]. For a point where the height is to be estimated, the closest $N$ points are fetched from the KD-tree. The mean value and variance of these points are then calculated. Since the true variance is not known, it must be estimated from the sample. The sample mean $\bar{x}$ and sample variance $s^2$ are calculated according to

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i, \qquad (3.2a)$$
$$s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2. \qquad (3.2b)$$

If the closest points are independent and numerous enough, they can be assumed to be normally distributed according to the central limit theorem.


A Z-test can therefore be performed on all points to detect any outliers. A point $x_i$ is discarded if its height lies further than $\alpha_d$ standard deviations from the mean:

$$\frac{|x_i - \bar{x}|}{s} > \alpha_d. \qquad (3.3)$$

In practice, a modified Z-test, where the standard deviation also has to be larger than a threshold $\beta_d$, was found to work particularly well. This modification was added to prevent points in regions with low variance from being removed:

$$\frac{|x_i - \bar{x}|}{s} > \alpha_d \;\cap\; s > \beta_d. \qquad (3.4)$$

The thresholds αd and βd both have to be determined from experimental data.
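The modified Z-test of (3.2)-(3.4) can be sketched as follows. The threshold arguments are placeholders only, since α_d and β_d must be determined from experimental data as noted above.

#include <cmath>
#include <vector>

// Return true if the height zi is an outlier among its neighbours,
// according to the modified Z-test (3.4).
bool isOutlier(double zi, const std::vector<double>& neighbourHeights,
               double alpha_d, double beta_d) {
    const std::size_t N = neighbourHeights.size();
    if (N < 2) return false;
    double mean = 0.0;                              // sample mean (3.2a)
    for (std::size_t i = 0; i < N; ++i) mean += neighbourHeights[i];
    mean /= static_cast<double>(N);
    double s2 = 0.0;                                // sample variance (3.2b)
    for (std::size_t i = 0; i < N; ++i)
        s2 += (neighbourHeights[i] - mean) * (neighbourHeights[i] - mean);
    s2 /= static_cast<double>(N - 1);
    const double s = std::sqrt(s2);
    // (3.4): flag only when the deviation is large AND the local spread
    // exceeds beta_d, so points in low-variance regions are kept.
    return s > beta_d && std::abs(zi - mean) / s > alpha_d;
}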

Example: Removing Outliers

An example of why outliers pose a problem is demonstrated in Figure 3.3. In Figure 3.3(a), a surface is estimated on a small area in which the data set contains outliers. A number of large unwanted spikes emerge due to the outliers.

The result of using the outlier filtering method described above is illustrated in Figure 3.3(b). The same data set was used to estimate the surface. All outliers are successfully detected and removed, which results in a much better terrain estimate without the unwanted spikes.

3.8 Fusion of Multiple Sensors

The implementation supports the use of multiple point clouds from different sensors in the regression, as described in Section 2.3. Different sensors or multiple sensor scans can be fused together to get a better estimate of the surface. Each point cloud from a single sensor or location is stored in its own KD-tree, and the noise parameter $\sigma_n$ is stored in the tree for each scanned point.

Using the ideas from [53], the n_c closest points to the point currently being estimated are fetched from each of the KD-trees. For n_p point clouds, this returns a total of n_c n_p points. Of these, only the n_c points closest to the test point are passed to the Gaussian Process for use in the estimation.
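A sketch of this selection step is given below. The Neighbour record and function names are hypothetical, and the per-sensor query results are assumed to already carry squared distances to the test point.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical record for a neighbour returned by a per-sensor KD-tree
// query; sigmaN is the per-point noise parameter stored with the scan.
struct Neighbour {
    double height;
    double sigmaN;
    double distSq;   // squared distance to the test point
};

static bool closerToQuery(const Neighbour& a, const Neighbour& b)
{
    return a.distSq < b.distSq;
}

// Pool the n_c candidates from each of the n_p trees and keep only the
// n_c points nearest the test point for the GP regression.
std::vector<Neighbour> fuseNeighbours(
        const std::vector<std::vector<Neighbour> >& perSensorResults,
        std::size_t nc)
{
    std::vector<Neighbour> pool;
    for (std::size_t i = 0; i < perSensorResults.size(); ++i)
        pool.insert(pool.end(), perSensorResults[i].begin(),
                    perSensorResults[i].end());
    std::sort(pool.begin(), pool.end(), closerToQuery);
    if (pool.size() > nc)
        pool.resize(nc);
    return pool;
}
```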

3.9 Threading

Modern processors grow more and more complex every year in order to increase performance. Traditionally, processor makers have increased the clock frequency to shorten the computation time of each instruction. Today, instead of increasing the clock frequency, the trend among processor manufacturers is to place multiple cores on each processor unit. Each core can execute instructions independently of the other cores, which allows instructions to be executed in parallel. It is not uncommon for a consumer computer to be equipped with a dual-core or quad-core processor.

Figure 3.3. A close view of a region in the Marulan dataset that contains outliers. Figure (a) shows the region without removing any outliers and Figure (b) the same region with outliers detected and removed.

To take advantage of multiple cores on a processor, the program needs support for multiple processes or threads. Each process or thread can run on a different core, which allows a program to use the whole processor for computation instead of just a single core.

Threading is not supported by the C++ standard library and a third-party library must be used. One possible choice is Boost.Thread [5], which provides platform-independent threading support for C++.

For the terrain estimation problem, the height at an unknown position depends only on the nearest points. Nothing prevents multiple test points from being evaluated at the same time, since no information needs to be passed between them during computation; this makes the problem well suited for threading.

The main process creates a number of tasks, each consisting of one or more positions where the height is to be estimated. These tasks are put in a thread-safe queue which is passed to the worker threads.


A user-configurable number of threads are then created to perform the computations. Each thread pops a task from the input queue and calculates the terrain elevation at the positions specified by the task, first querying the KD-tree for the closest points and then computing the heights with the Gaussian Process implementation. The KD-tree implementation uses VTK, which is thread-safe once the tree is built.

After all heights in a task are computed, the thread pushes the estimated heights to an output queue and continues with the next task from the input queue. Once the input queue is empty, the threads terminate. When all worker threads have finished executing, the main process continues and stores the estimated heights from the output queue.
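A condensed sketch of this scheme using Boost.Thread is shown below. It is not the thesis code: Task, Result and the queue type are simplified placeholders (the real tasks carry positions, and the results carry estimated heights and variances).

```cpp
#include <queue>
#include <boost/thread.hpp>
#include <boost/bind.hpp>

struct Task   { /* positions whose heights are to be estimated */ };
struct Result { /* estimated heights and variances */ };

// Minimal thread-safe queue guarded by a single mutex.
template <typename T>
class ThreadSafeQueue {
public:
    void push(const T& item) {
        boost::mutex::scoped_lock lock(mutex_);
        queue_.push(item);
    }
    bool tryPop(T& item) {              // returns false when the queue is empty
        boost::mutex::scoped_lock lock(mutex_);
        if (queue_.empty()) return false;
        item = queue_.front();
        queue_.pop();
        return true;
    }
private:
    std::queue<T> queue_;
    boost::mutex  mutex_;
};

// Each worker pops tasks until the input queue is empty, estimates the
// heights (KD-tree query + GP regression, omitted here) and pushes the
// result to the shared output queue before taking the next task.
void worker(ThreadSafeQueue<Task>& input, ThreadSafeQueue<Result>& output)
{
    Task task;
    while (input.tryPop(task)) {
        Result result;
        // ... query the KD-tree and evaluate the GP for each position ...
        output.push(result);
    }
}

int main()
{
    ThreadSafeQueue<Task>   input;
    ThreadSafeQueue<Result> output;
    // ... fill 'input' with tasks ...
    boost::thread_group workers;
    for (int i = 0; i < 4; ++i)         // user-configurable thread count
        workers.create_thread(
            boost::bind(&worker, boost::ref(input), boost::ref(output)));
    workers.join_all();                 // main process resumes when all finish
    // ... store the estimated heights from 'output' ...
}
```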

3.9.1 Threading Performance

To evaluate the multi-threading performance of the implementation, benchmarks were performed in which a surface consisting of 800 x 800 points was estimated using a varying number of worker threads. The total execution times are presented in Table 3.2, and Figure 3.4 plots the average execution time per test point against the number of threads. A quad-core processor was used in the experiments.

No. of threads    Time (s)
      1             1583
      2              860
      3              671
      4              590
      5              600
      6              618
      8              603

Table 3.2. Results from the benchmarks using a varying number of worker threads. The time listed is the total time required to generate a surface with 800 x 800 points.

The experiments show that increasing the number of threads from one to two decreases the required time by 45%. Using four threads requires only 37% of the execution time of a single thread. However, using more than four threads does not reduce the execution time significantly. This is expected since the processor only has four cores; if more than four threads are used, the available cores must be shared among the threads through context switching, which adds overhead. Techniques such as Hyper-Threading exist that can help reduce this overhead and sometimes increase performance, but these are beyond the scope of this thesis.
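For reference, the parallel speedup implied by the measured times in Table 3.2 can be computed as

S(n) = \frac{T(1)}{T(n)}, \qquad S(2) = \frac{1583}{860} \approx 1.8, \qquad S(4) = \frac{1583}{590} \approx 2.7,

so four cores yield roughly a 2.7-fold speedup rather than the ideal factor of four, which suggests that some part of the computation remains serial or memory-bound.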


Figure 3.4. Average computation time per test point (in ms) for different numbers of threads. The average time is the total time divided by the number of points estimated, in this case 640000.

The results demonstrate that the computation time scales with the number of cores and processors available on the target architecture. There is no impact on the quality of the generated surface; only the runtime is decreased.

3.10 Conclusion

This chapter has addressed practical implementation details for performing terrain estimation using Gaussian Processes. In order to speed up the calculations, an approximation method using only the closest neighbours is proposed. A KD-tree stores all the collected sensor data and is queried for the closest neighbours when estimating a test point.

A method to detect and remove outliers using a statistical Z-test is also proposed, as the collected sensor data is often noisy, especially in mining environments where dust is a big problem.

Since no information except the closest neighbours is required when estimating a test point, the problem can be parallelized to further reduce computation time.


Chapter 4

Feature Extraction

This chapter discusses the theory behind feature detection and classification in terrain modelling applications. The goal is to detect and classify the parts of a mine known as toes and crests. An overview of the terminology used in open-pit mining is provided in Figure 4.1. A bench is a horizontal surface between two bench faces. The toes and crests are the edges where the bench and the bench face meet. In reality, the benches are not necessarily flat, but can be both uneven and vague. ‘Edge’ is used throughout the thesis as a general term for crest and toe.

In Figure 4.2, a view of a typical open-pit mine can be seen. The benches are in this case very distinctive with clear indications of where the toes and the crests are located.

This chapter also develops the theory for two algorithms used in the process of extracting features. k-means clustering is used to find flat regions in the surface and support vector machines (SVMs) are used for classification of data; both are described in detail later in this chapter. Implementation details on how the method works are given in Chapter 5.

4.1 Related Work

Feature detection is in general a very broad term, since many different kinds of features can be detected in images. A well-studied feature is the edge, and several edge detectors have been proposed. A review of the basics behind edge detection, as well as descriptions of different methods, can be found in Marr and Hildreth [34], Peli and Malah [41] and Jain et al. [23]. One popular algorithm is the so-called Canny edge detector, Canny [9]. Edge detectors cannot, however, be directly applied to the problem in this thesis, since the toes and crests are not located at the positions found by general edge detectors. The toes and crests are also often vague, which would make these methods hard to adapt.

Edges are, however, only one of many possible features found in images. Other features include corners, Harris and Stephens [18], and ridges, Lindeberg [31], Eberly et al. [14], but these features are not relevant for the problem in this thesis.


Figure 4.1. An overview of the terminology involving open-pit mining: θ_s is the overall pit slope angle, h_b is the bench height and h_w is the bench width. The labelled parts are the pit floor, crest, toe, bench and bench face.

Figure 4.2. Photo from an iron-ore mine in Australia's Pilbara region showing parts of the pit.

Splitting an image into regions with similar properties is another way to detect features, and this approach is used in this work. Ramos and Muge [47] used k-means clustering together with genetic algorithms to group areas with similar colour. For grey-scale images, the gradient can be used for segmentation; this is used, for example, by the watershed algorithm, Mangan and Whitaker [33]. Pradeep et al. [45] use a method similar to the one in this work, but to locate steps in staircases: the normals to a surface estimated from stereo camera data are calculated, and clustering is then performed on the normals to locate the steps.

After interesting features are detected, they often need to be classified. Classification is a machine learning technique in which objects are assigned to a specific class. Classification methods can be divided into two categories: non-learning based and learning based. Non-learning-based methods require no training data and can directly classify data into a class. A popular method is the nearest-neighbour distance classifier, studied for example by Boiman et al. [3].

Learning-based classifiers, also called supervised classifiers, require a set of training data to learn the classifier's parameters. One popular learning-based classifier, used in this work, is the support vector machine, which has for example been used by Wang and Hu [57] and Bosch et al. [6] to classify images. Song and Civco [50] used support vector machines to classify regions extracted from satellite images as either road or not road. A comparison between different methods, both learning and non-learning based, and their performance on different datasets has been performed by Walt and Barnard [56].

4.2 Toe and Crest Detection

Toes and crests are important features in open-pit mining as they provide a way to verify the current shape of the mine against the predefined plan. In particular, they can be used to verify that the pit slope has the correct angle. This is important since an incorrect wall shape could potentially result in dangerous collapses in the mine.

The objective is to locate and mark all toes and crests. The available data is the estimated mine surface from the Gaussian Process, which provides a model of the elevation together with the associated uncertainties. The edges should be marked in such a way that they are independent of the underlying surface representation and scale.

For this problem, two approaches are possible. The first approach is to create and train a classifier that operates on the estimated terrain surface and classifies every point as toe, crest or neither, and then groups the classified locations together so that they build up continuous regions.

The second approach works in reverse by first extracting all edges that are of interest and then classifying only these regions.

In this work, the second approach is used. The first approach would require a classifier that is very good at separating the three classes, especially the non-edges
