
Institutionen för systemteknik
Department of Electrical Engineering

Master's Thesis (Examensarbete)

Autonomous Crop Segmentation, Characterisation and Localisation

Master's thesis carried out in Automatic Control at the Institute of Technology at Linköping University

by Gustav Jagbrant

LiTH-ISY-EX--13/4719--SE

Linköping 2013

Department of Electrical Engineering, Linköping University


Supervisors: Michael Roth (ISY, Linköping University) and James Underwood (Australian Centre for Field Robotics)
Examiner: Thomas Schön (ISY, Linköping University)


Division, Department: Division of Automatic Control, Department of Electrical Engineering, SE-581 83 Linköping
Date: 2013-09-06
Language: English
Report category: Examensarbete (Master's thesis)
URL for electronic version: http://www.ep.liu.se
ISRN: LiTH-ISY-EX--13/4719--SE
Title (Swedish): Autonom Segmentering, Karakterisering och Lokalisering i Mandelplantager (Autonomous Segmentation, Characterisation and Localisation in Almond Orchards)
Title (English): Autonomous Crop Segmentation, Characterisation and Localisation
Author: Gustav Jagbrant


Keywords: Orchard, Horticulture, Localisation, Hidden Semi-Markov Model, Hidden Markov Model, Segmentation, 3D Data, 3D Point Cloud, 3D Descriptors


Sammanfattning

Since orchards require large areas of land, they are often situated far from major population centres. This makes it difficult to find sufficient labour and limits the possibilities for expansion. By integrating autonomous robots into the operation of the orchards, the work could be made more efficient and the need for labour reduced.

A key problem for all autonomous robots is localisation: how does the robot know where it is? In agricultural robots the standard solution is to use GPS positioning. This is problematic in orchards, however, since the tall and dense vegetation restricts its use to larger robots that reach above the surroundings. To enable the use of smaller robots it is instead necessary to use a GPS-independent localisation system. This is complicated by the similarity of the environment and the lack of distinct landmarks, which makes it unlikely that existing standard solutions will work in this environment. We therefore present a GPS-independent localisation system, specifically aimed at orchards, that exploits the natural structure of the surroundings. In addition, we examine and evaluate three related sub-problems.

The proposed system uses a 3D point cloud created from a 2D LIDAR and the robot's motion. First, it is shown how a hidden semi-Markov model can be used to segment the dataset into individual trees. We then introduce a number of descriptors for describing the geometric shape of the trees, and show how these can be combined with a hidden Markov model to create a robust localisation system. Finally, we propose a method for detecting segmentation errors when new measurements of trees are associated with previously measured trees.

The proposed methods are evaluated individually and show good results. The proposed segmentation method is shown to be accurate and to give rise to few segmentation errors. Furthermore, the introduced descriptors are shown to be sufficiently consistent and informative to enable localisation. The presented localisation method is additionally shown to be robust both to noise and to segmentation errors. Finally, it is shown that a significant majority of all segmentation errors can be detected without incorrectly labelling correct segmentations as incorrect.


Abstract

Orchards demand large areas of land, thus they are often situated far from major population centres. As a result it is often difficult to obtain the necessary personnel, limiting both growth and productivity. However, if autonomous robots could be integrated into the operation of the orchard, the manpower demand could be reduced.

A key problem for any autonomous robot is localisation; how does the robot know where it is? In agriculture robots, the most common approach is to use GPS positioning. However, in an orchard environment, the dense and tall vegetation restricts the usage to large robots that reach above the surroundings. In order to enable the use of smaller robots, it is instead necessary to use a GPS independent system. However, due to the similarity of the environment and the lack of strong recognisable features, it appears unlikely that typical non-GPS solutions will prove successful. Therefore we present a GPS independent localisation system, specifically aimed for orchards, that utilises the inherent structure of the surroundings. Furthermore, we examine and individually evaluate three related sub-problems.

The proposed system utilises a 3D point cloud created from a 2D LIDAR and the robot's movement. First, we show how the data can be segmented into individual trees using a Hidden Semi-Markov Model. Second, we introduce a set of descriptors for describing the geometric characteristics of the individual trees. Third, we present a robust localisation method based on Hidden Markov Models. Finally, we propose a method for detecting segmentation errors when associating new tree measurements with previously measured trees.

Evaluation shows that the proposed segmentation method is accurate and yields very few segmentation errors. Furthermore, the introduced descriptors are determined to be consistent and informative enough to allow localisation. Third, we show that the presented localisation method is robust both to noise and segmentation errors. Finally, it is shown that a significant majority of all segmentation errors can be detected without falsely labeling correct segmentations as incorrect.


Acknowledgments

I would like to start by thanking the people at ACFR in general - you are all a great bunch of people. Apart from that, I would especially like to thank both James Underwood and Juan Nieto for superb advice and supervision. Also, special thanks to James for having a good taste in beers. Furthermore, on a more serious note, thanks to HAL for funding the project under AH11009, Autonomous Perception Systems for Horticulture Tree Crops.

As for my time here, it was an enriching experience. Especially thanks to all friends I’ve made during these months. In order to avoid forgetting someone, I’ll mention none. Thanks for all the times of laughter and fun - We’ll meet again someday.

Back home I would like to extend my special thanks to Fredrik Gustafsson, who first made me aware of the possibility of writing the thesis at ACFR. Furthermore, thanks to Hanna Nyqvist, who helped me with the visa application process. I'd also like to extend additional thanks to my supervisor Michael Roth and my examiner Thomas Schön for the helpful comments and feedback.

Also, since this ends my current studies at Linköping University, I would like to thank all examiners, lecturers, professors, teachers and lab assistants that I've come in contact with over the last five years - with very few exceptions - you all did a great job! (And as for the exception, well, looking back that was pretty hilarious anyway...)

Finally, thanks to Mom and Dad, I hope you didn’t worry too much.

Linköping, August 2013 Gustav Jagbrant


Contents

1 Introduction
    1.1 ACFR and HAL
    1.2 Data Collection and Datasets
    1.3 Related Work
    1.4 Thesis Contributions
    1.5 Outline

2 Hidden Markov Models
    2.1 Definition of Optimal States
    2.2 The Forward Algorithm
    2.3 The Forward-Backward Algorithm
    2.4 The Viterbi Algorithm
    2.5 Use of Multiple Features
    2.6 Avoiding Numerical Underflow
        2.6.1 Forward Algorithm
        2.6.2 Forward-Backward Algorithm
        2.6.3 Viterbi Algorithm

3 Hidden Semi-Markov Models
    3.1 The Viterbi Algorithm
    3.2 Avoiding Numerical Underflow

4 Segmentation
    4.1 Modeling the orchard
        4.1.1 Transition Probabilities
        4.1.2 Initial Probabilities
        4.1.3 Choice of Observation Feature
        4.1.4 Hand-Tuned Observation Distributions
        4.1.5 State Duration Distributions
    4.2 Model Alterations and Extensions
        4.2.1 Learnt Observation Distributions
        4.2.2 Learnt Tree Duration Distributions
        4.2.3 Ground Removal
        4.2.4 Four State Model
        4.2.5 Density Observations
        4.2.6 Height Observations
        4.2.7 Combined Volume and Height Observations
    4.3 Qualitative Evaluation
        4.3.1 HMM vs HSMM
        4.3.2 Hand-Tuned vs Learnt Distributions
        4.3.3 Ground Removal
        4.3.4 Three vs Four States
        4.3.5 Choice of Feature
        4.3.6 Conclusions
    4.4 Quantitative Evaluation
        4.4.1 Method Comparison
        4.4.2 Multiple Row Robustness
        4.4.3 Data Collection Robustness
    4.5 Summary and Further Work

5 Characterisation
    5.1 Descriptors
        5.1.1 Simple Volume Descriptor
        5.1.2 Simple Height Descriptor
        5.1.3 Volume Signature Descriptor
        5.1.4 Height Signature Descriptor
    5.2 Comparing Descriptors
        5.2.1 Simple Descriptor
        5.2.2 Signature Descriptor
    5.3 Localisation Method
    5.4 Maps From Single Descriptors
        5.4.1 Same Side Single Row Matching
        5.4.2 Same Side Multiple Row Matching
        5.4.3 Opposite Side Single Row Matching
    5.5 Maps From Multiple Descriptors
        5.5.1 Descriptors From The Same Side
        5.5.2 Descriptors From The Opposite Side
    5.6 Summary and Further Work

6 Localisation
    6.1 Usage
    6.2 Localisation Method
        6.2.1 Transition Matrix
        6.2.2 Initial Probabilities
        6.2.3 Observation Distributions
        6.2.4 Matching Algorithm
    6.3 Randomised Datasets
        6.3.1 Feature Noise
        6.3.2 Border Noise
        6.3.3 Merge Errors
        6.3.4 Split Errors
        6.3.5 Detection Errors
    6.4 Evaluation
        6.4.1 Counting Correct Matches
        6.4.2 Evaluation Setup
        6.4.3 Single Row Localisation
        6.4.4 Multiple Row Localisation
        6.4.5 Conclusions
    6.5 Summary and Further Work

7 Robust Segmentation
    7.1 Segmentation Error Detection
    7.2 Evaluation
        7.2.1 Results
    7.3 Summary and Further Work

8 Summary and Further Work
    8.1 Summary of Results
    8.2 Further Work

A Some Basic Probability Theorems

B Descriptor Similarity Distributions
    B.1 Similarities to Self
        B.1.1 Simple Volume Descriptor
        B.1.2 Simple Height Descriptor
        B.1.3 Volume Signature Descriptor
        B.1.4 Height Signature Descriptor
    B.2 Similarities to Others
        B.2.1 Simple Volume Descriptor
        B.2.2 Simple Height Descriptor
        B.2.3 Volume Signature Descriptor
        B.2.4 Height Signature Descriptor

C Direction Based Localisation

1 Introduction

Since the year 2000, the population of the earth has increased from approximately 6 billion to 7 billion people [1], and is expected to reach a staggering 9.6 billion by the year 2050 [17]. Additionally, this increase in population is not evenly distributed, but is concentrated in the cities, leading to an unprecedented urbanisation of the world [2]. As a direct consequence, a proportionally smaller rural population must sustain a much larger urban population. In order for such a change to be possible, it is necessary to increase the efficiency of the agricultural sector. To that end, this thesis is part of a project to create a system capable of autonomous orchard surveillance, including mapping, classification and detection.

A key problem for any autonomous robot is localisation; how does the robot know where it is? A common approach in agriculture robots is to use Differential GPS (DGPS). In orchards, however, this approach is only possible for larger and taller robots, as the dense vegetation attenuates the DGPS signal. In order to enable the use of smaller robots, it is instead necessary to use a GPS independent localisation system. However, due to the similarity of the environment and the lack of strong recognisable features, it appears unlikely that standard localisation systems will prove successful. Therefore we present a GPS independent localisation system, specifically aimed for orchards, that utilises the inherent structure of the surroundings. Furthermore, we examine and individually evaluate three related subproblems.

The first task examines how to segment a 3D point cloud of the orchard, collected by a 2D laser scanner, into trees, allowing a surveying robot to assign estimated attributes such as health and yield to individual trees. Second, we examine how the geometric characteristics of the individual trees can be described in order to enable localisation. Third, we present a robust localisation method, based on the previously introduced segmentation and characterisation methods. Finally, we introduce a method for detecting segmentation errors when performing localisation, allowing erroneous attribute estimates to be discarded.

Of importance to note, however, is that the presented localisation algorithm remains to be fully evaluated; the crux being that the presented evaluation is based on 3D point data created from DGPS estimates of the robot's position. Still, we are confident that similar results can be obtained using odometry or inertial navigation system (INS) data. Nevertheless, this final evaluation was left outside the scope of the thesis due to time constraints.

In this chapter, the background and goal of the thesis are explained. Section 1.1 introduces the Australian Centre for Field Robotics, where this thesis was written, and Horticulture Australia Limited. Section 1.2 explains the datasets used and how they were collected. Thereafter, Section 1.3 discusses related research, while Section 1.4 describes the contributions of the thesis. Finally, Section 1.5 gives an outline of the entire thesis.

1.1 ACFR and HAL

The Australian Centre for Field Robotics (ACFR) is a teaching and research centre based at the University of Sydney, working both within the research community as well as with commercial partners to further the advance of field robotics. One of its partners is Horticulture Australia Limited, a not-for-profit industry owned company investing in research and development that may prove beneficial both to the horticulture industry as well as the wider community.

At present, agriculture accounts for 2 percent of the GDP of Australia. However, given the proximity to the densely populated East Asia, the government forecasts that the agricultural sector could account for 5 percent of GDP by 2050. In order to achieve this goal, producers must overcome both high minimum wages and workforce limitations. For this reason, HAL has partnered with ACFR in order to further automation in the horticulture sector.

1.2 Data Collection and Datasets

Before the work on the thesis commenced, a field trial was performed at an almond orchard in Mildura (Victoria, Australia). The data used in this thesis was collected by the robot, known as Shrimp, seen in Figures 1.1 and 1.2. The robot was equipped with numerous sensors, of which two are utilised in this thesis. The first sensor is a Novatel navigation system, utilising DGPS to estimate the robot's position. The second is a SICK LMS-291 2D laser scanner (LIDAR). The LIDAR is mounted vertically and oriented to the right, giving a vertical line of measurements at a frequency of 75 Hz. Data from these two sensors is combined to create a 3D point cloud, an excerpt of which is presented in Figure 1.3.

Figure 1.1: Shrimp seen from the left.

We make use of 8 different datasets collected during the trial. Three of these were collected by the robot moving in a straight line past both sides of the same row. A fourth dataset was collected by the robot traversing 6 different rows, including the row in the smaller datasets. Furthermore, a fifth dataset was collected by the robot traversing the same row in a sinusoidal motion, resulting in slightly distorted data.

Additionally, we utilise 3 datasets containing only empty space, without any trees. To be noted is that these empty datasets were collected on the path right before entering the row of trees, and not in an empty area inside the actual rows. As a result, the ground structure of these datasets differs from that of the empty spaces in the inner parts of the rows, as the ground in the rows has a hill-like structure not shared by the area next to the rows.

In order to simplify later references to the datasets, we define them as follows:

• The 3 shorter datasets - The 3 datasets collected by the robot moving in a straight line past both sides of the same row.

• The longer dataset - The dataset collected by the robot traversing 6 rows in a straight line.

• The 4 shorter datasets - The 3 shorter datasets and the corresponding row of the longer dataset.

• The distorted dataset - The dataset collected by the robot moving in a sinusoidal motion.


Figure 1.2: Shrimp seen from the right. The position of the vertically mounted SICK LIDAR, partially occluded by another sensor, is shown.

Figure 1.3: An excerpt of the created point cloud. Every point is represented by an X, Y and Z coordinate.

Moreover, we often view the same row seen from different sides as separate rows, unless the similarity between the two sides is explicitly investigated. With this view, the shorter and distorted datasets contain 2 rows and the longer dataset contains 10 rows.

Finally, in order to give an estimate of the size of the datasets, it should be mentioned that each row contains 58 trees, has an extent of roughly 320 meters, and is represented by approximately 1.5 million points. The empty datasets are significantly smaller, each having a length of approximately 10 meters.

1.3 Related Work

In this section, we briefly examine previous work related to orchard segmentation, characterisation of 3D data, and localisation.


There exist numerous approaches to segmenting 3D data; however, many of these are general solutions, created to segment data consisting of undefined types of objects. Assuming that better results can be obtained by orchard specific methods, allowing integration of the known orchard structure, we focus on these methods. One such method was presented in [9], using Gaussian mixture models with an unknown number of clusters to segment the individual points into trees. The method was evaluated in a peach orchard, showing good results. However, most trees studied in that work were completely spatially separated, whereas the almond trees in our study often have overlapping canopies.

Instead of a per-point segmentation, the authors of [18] propose a method based on splitting the row into "slices" of 0.2 meters. Based on the height of each slice, the dataset is segmented into trees by employing a Hidden Semi-Markov Model. Furthermore, this method allows an a priori estimate of the widths of the trees to be incorporated into the segmentation. Nevertheless, the research presented in [18] did not focus on the accuracy of the segmentation, but only on the number of found trees, making it difficult to determine the precision of the method.

In order to characterise the segmented trees, it is necessary to use a descriptor to describe the 3D data. There exists a multitude of 3D descriptors, one of the more well known being spin-images [4], which describe the local neighbourhood with regard to an estimated normal direction. Other descriptors based on local normal estimation include shape contexts [3] [16] and point feature histograms [14] [15]. Furthermore, there exist other descriptors, such as shape distributions [10] and shape functions [19], which are based on the distribution of the points.

An interesting localisation model was presented in [6], where Dynamic Time Warping was used to perform sequence-based visual localisation. Most intriguingly, the author showed that highly accurate localisation can be achieved through sequence matching even when the information in each individual element is severely limited.

Another approach to localisation, which allows more modeling freedom, is to use Hidden Markov Models (HMMs). This was used in [12] to perform indoor localisation based on the recognition of landmarks such as doors and corridors. Additionally, there exist more complex HMM localisation methods that integrate both landmark observations and robot odometry [5].

1.4 Thesis Contributions

In this thesis we aim to advance research on the perception of orchards. All work is performed using data from the almond orchard presented in Section 1.2. It is believed that the same approach will work in other orchards that share the same structure.


First, we examine how to segment the orchard into individual trees. In order to do so, we utilise the Hidden Semi-Markov Model approach presented in [18]. We extend the proposed model, present a number of different variants of the segmentation method, and evaluate their performance.

Second, we investigate how to characterise the segmented trees. One significant attribute of the utilised datasets is that the trees are sparsely sampled. This is problematic for descriptors dependent on the estimation of surface normals, as the sparse sampling is likely to result in inaccurate and inconsistent normal estimation. Furthermore, if the robot moves at non-uniform speed, or happens to turn slightly, the trees will be non-uniformly sampled. This is problematic when utilising distribution-based descriptors, as they assume that the sampling density is consistent.

Recognising these problems, we define a set of descriptors for the specific application. We show how these can be used to build a map of the orchard and introduce a simple localisation algorithm. Additionally, we investigate how the map of the orchard may best be built from multiple measurements of each tree.

In the following part of the thesis, we utilise a Hidden Markov Model and the previously introduced descriptors to create a robust localisation method. We evaluate its performance and show that it is robust both to measurement noise and to segmentation errors.

A problem with building the map of the orchard from multiple measurements is that segmentation errors, though empirically rare, risk corrupting the map. In order to address this problem, we end the thesis by presenting a simple method for detecting segmentation errors when performing localisation.

In conclusion, the contributions presented in the thesis can be summarised as follows:

• A method for segmenting individual trees from a 3D point cloud is presented.

• A set of tree descriptors is defined and their usefulness evaluated.

• A robust localisation method is introduced and its performance evaluated.

• A method for detecting segmentation errors when performing localisation is presented.

1.5 Outline

Below we present an outline of the thesis with a short summary of each chapter:

• Chapter 2 explains the theory of Hidden Markov Models (HMMs).


• Chapter 3 explains the theory of Hidden Semi-Markov Models (HSMMs).

• Chapter 4 presents an HSMM based method for segmenting the 3D data into individual trees.

• Chapter 5 introduces and evaluates a set of descriptors for characterising the segmented trees.

• Chapter 6 presents a robust HMM based localisation method.

• Chapter 7 presents a method for detecting segmentation errors when performing localisation.

• Chapter 8 summarises the presented results and gives an outline of further work.


2 Hidden Markov Models

Hidden Markov Models (HMMs) are used in a variety of areas such as speech recognition, computational biology and computer vision. At the core of the model is a set of states, each state producing an observation based on a probabilistic distribution. At each time, the model is in one active state, the state currently producing the observation. Furthermore, the probabilities of transitioning between the different states are also modeled. Having modeled the complete HMM, it is possible to estimate the most probable sequence of states given a sequence of observations.

In this chapter we present and derive three common algorithms for estimating the sequence of states. Furthermore, we explain how the model may be utilised with multiple observation features, and how to avoid numerical underflow problems. Additionally, to aid in the derivation of the introduced algorithms, the utilised probability theorems are presented in Appendix A. However, before presenting the optimisation methods, it is necessary to define the notation.

First of all, the probability of transitioning from state $q_t = i$ to state $q_{t+1} = j$ is defined to be independent of both previous states and of time. These assumptions make it possible to create a transition matrix $A$ with elements $A_{ij} = P(q_{t+1} = j \mid q_t = i)$. The states and state transitions can be visualised in a state transition diagram, an example of which is seen in Figure 2.1.

Second, the observation distribution of each state, $P(x \mid q)$, is modeled. This allows the likelihood of state $q_t = j$ given measurement $x_t$ to be calculated. In order to simplify notation, the likelihoods of $q_t = j$ for all observations are collected in a vector $B_j$ such that $B_j(t) = P(x_t \mid q_t = j)$.


Figure 2.1: A state transition diagram of a Hidden Markov Model with 3 states.

To use the model it is necessary to define the a priori probabilities of each state. These prior probabilities are denoted $\pi_j = P(q_1 = j)$, where $\sum_j \pi_j = 1$.

To further simplify the notation we define $x_{1:t}$ to be the sequence of observations from $x_1$ to $x_t$. Similarly, $q_{1:t}$ is defined to be the state sequence from $q_1$ to $q_t$. Furthermore, the total number of states is denoted $N$.

2.1 Definition of Optimal States

A common application of Hidden Markov Models is to find the most probable state sequence given a sequence of observations. In order to investigate the problem, however, it is necessary to properly define the optimality criterion. In many applications it is of interest to find the individually most probable states given all measurements, formally expressed as

$$\hat{q}^{\mathrm{FB}} = \Big(\arg\max_{q_1} P(q_1, x_{1:T}),\ \arg\max_{q_2} P(q_2, x_{1:T}),\ \ldots,\ \arg\max_{q_T} P(q_T, x_{1:T})\Big). \tag{2.1}$$

One problem with this optimisation is that it is only possible in an offline situation. The equivalent causal problem is expressed as

$$\hat{q}^{\mathrm{F}} = \Big(\arg\max_{q_1} P(q_1, x_1),\ \arg\max_{q_2} P(q_2, x_{1:2}),\ \ldots,\ \arg\max_{q_T} P(q_T, x_{1:T})\Big). \tag{2.2}$$

The common denominator of Equations (2.1) and (2.2) is that they both target the individually most probable states. This approach is applicable in many instances. However, it risks creating invalid state sequences if there exists any element $A_{ij} = 0$ in the transition matrix. In such situations it is better to optimise with regard to the jointly most probable state sequence, formally expressed as

$$\hat{q}^{\mathrm{V}} = \arg\max_{q_1, q_2, \ldots, q_t} P(q_{1:t}, x_{1:t}). \tag{2.3}$$

All three optimisation criteria are common in practical problems, with optimisation algorithms available for each one. The non-causal optimisation of Equation (2.1) is performed using the Forward-Backward algorithm. Removing the anti-causal part of the Forward-Backward algorithm yields the Forward algorithm, used for optimising Equation (2.2). The optimisation of Equation (2.3) is done using the Viterbi algorithm [8].

Before proceeding, it may be of interest to note that the presented optimality criteria can be formulated with conditional probabilities instead of joint probabilities; in fact, they are presented using conditional probabilities in many textbooks. We choose joint probabilities over conditional probabilities because this removes a scale factor. The choice of optimal sequence is unaffected, since the scale factor is the same for all states, but the derivation of the optimisation algorithms is slightly simplified.

2.2 The Forward Algorithm

The Forward algorithm solves (2.2) using recursive updates. Begin by introducing

$$\alpha_t(j) = P(q_t = j, x_{1:t}), \tag{2.4}$$

recursively updated by

$$\begin{aligned}
\alpha_t(j) &= P(q_t = j, x_{1:t}) \\
&= P(x_t \mid q_t = j)\, P(q_t = j, x_{1:t-1}) \\
&= P(x_t \mid q_t = j) \sum_i P(q_t = j, x_{1:t-1}, q_{t-1} = i) \\
&= P(x_t \mid q_t = j) \sum_i P(q_t = j \mid x_{1:t-1}, q_{t-1} = i)\, P(x_{1:t-1}, q_{t-1} = i) \\
&= P(x_t \mid q_t = j) \sum_i P(q_t = j \mid q_{t-1} = i)\, P(q_{t-1} = i, x_{1:t-1}) \\
&= B_j(t) \sum_i A_{ij}\, \alpha_{t-1}(i).
\end{aligned} \tag{2.5}$$

The recursion is initialised by

$$\alpha_1(j) = P(q_1 = j, x_1) = P(x_1 \mid q_1 = j)\, P(q_1 = j) = B_j(1)\, \pi_j. \tag{2.6}$$

Having calculated $\alpha$, the most probable state at each time is calculated as

$$q_t^{*} = \arg\max_j \alpha_t(j). \tag{2.7}$$

The complete algorithm is presented in Algorithm 1.

Algorithm 1 The HMM Forward Algorithm
    for j = 1 → N do                         ▷ Initialisation
        α_1(j) = B_j(1) π_j
    end for
    for t = 2 → T do                         ▷ Recursion
        for j = 1 → N do
            α_t(j) = B_j(t) Σ_i A_ij α_{t−1}(i)
        end for
    end for
    for t = 1 → T do                         ▷ State maximisation
        q*_t = arg max_j α_t(j)
    end for
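To make the recursion concrete, here is a minimal NumPy sketch of Algorithm 1. It is an illustration rather than the thesis implementation, and the toy model at the bottom uses invented numbers.

```python
import numpy as np

def forward(A, pi, B):
    """HMM Forward algorithm (Algorithm 1).
    A[i, j] = P(q_{t+1}=j | q_t=i), pi[j] = P(q_1=j), B[t, j] = P(x_t | q_t=j).
    Returns alpha[t, j] = P(q_t=j, x_{1:t}) and the causally most probable
    state at each time, i.e. the solution to (2.2)."""
    T, N = B.shape
    alpha = np.zeros((T, N))
    alpha[0] = B[0] * pi                      # initialisation, Equation (2.6)
    for t in range(1, T):
        alpha[t] = B[t] * (alpha[t - 1] @ A)  # recursion, Equation (2.5)
    return alpha, alpha.argmax(axis=1)

# Toy 2-state example (numbers are illustrative only).
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])
B = np.array([[0.7, 0.1],
              [0.6, 0.2],
              [0.1, 0.9]])
alpha, states = forward(A, pi, B)
```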

2.3 The Forward-Backward Algorithm

The Forward-Backward algorithm is the non-causal extension of the Forward algorithm, employing a similar recursion to solve (2.1). Begin by introducing

$$\begin{aligned}
\gamma_t(j) &= P(q_t = j, x_{1:T}) \\
&= P(x_{t+1:T} \mid x_{1:t}, q_t = j)\, P(x_{1:t}, q_t = j) \\
&= P(x_{1:t}, q_t = j)\, P(x_{t+1:T} \mid q_t = j) \\
&= \alpha_t(j)\, \beta_t(j)
\end{aligned} \tag{2.8}$$

where

$$\alpha_t(j) = P(q_t = j, x_{1:t}), \tag{2.9}$$

$$\beta_t(j) = P(x_{t+1:T} \mid q_t = j). \tag{2.10}$$

The recursive update of $\alpha$ is performed using (2.5), while $\beta$ is updated by employing

$$\begin{aligned}
\beta_t(j) &= P(x_{t+1:T} \mid q_t = j) \\
&= \sum_i P(x_{t+1:T}, q_{t+1} = i \mid q_t = j) \\
&= \sum_i P(x_{t+1} \mid x_{t+2:T}, q_{t+1} = i, q_t = j)\, P(x_{t+2:T}, q_{t+1} = i \mid q_t = j) \\
&= \sum_i P(x_{t+1} \mid q_{t+1} = i)\, P(x_{t+2:T} \mid q_{t+1} = i, q_t = j)\, P(q_{t+1} = i \mid q_t = j) \\
&= \sum_i P(x_{t+1} \mid q_{t+1} = i)\, P(x_{t+2:T} \mid q_{t+1} = i)\, P(q_{t+1} = i \mid q_t = j) \\
&= \sum_i B_i(t+1)\, \beta_{t+1}(i)\, A_{ji}.
\end{aligned} \tag{2.11}$$

The $\alpha$ and $\beta$ variables are initialised by

$$\alpha_1(j) = P(q_1 = j, x_1) = P(x_1 \mid q_1 = j)\, P(q_1 = j) = B_j(1)\, \pi_j \tag{2.12}$$

and

$$\beta_T(j) = P(x_{T+1:T} \mid q_T = j) = P(\emptyset \mid q_T = j) = 1. \tag{2.13}$$

Having calculated $\alpha$ and $\beta$, $\gamma$ is calculated by

$$\gamma_t(j) = \alpha_t(j)\, \beta_t(j) \tag{2.14}$$

and the most probable state at each time as

$$q_t^{*} = \arg\max_j \gamma_t(j). \tag{2.15}$$

The complete algorithm is presented in Algorithm 2.

Algorithm 2 The HMM Forward-Backward Algorithm
    for j = 1 → N do                         ▷ Initialisation
        α_1(j) = B_j(1) π_j
        β_T(j) = 1
    end for
    for t = 2 → T do                         ▷ Recursion - forward
        for j = 1 → N do
            α_t(j) = B_j(t) Σ_i A_ij α_{t−1}(i)
        end for
    end for
    for t = T − 1 → 1 do                     ▷ Recursion - backward
        for j = 1 → N do
            β_t(j) = Σ_i B_i(t+1) β_{t+1}(i) A_ji
        end for
    end for
    for t = 1 → T do                         ▷ State maximisation
        for j = 1 → N do
            γ_t(j) = α_t(j) β_t(j)
        end for
        q*_t = arg max_j γ_t(j)
    end for


2.4 The Viterbi Algorithm

The Viterbi algorithm solves (2.3) by employing a dynamic programming approach [8] [11]. Start by defining

$$\delta_t(j) = \max_{q_1, q_2, \ldots, q_{t-1}} P(q_{1:t-1}, q_t = j, x_{1:t}), \tag{2.16}$$

thus $\delta$ is the joint probability of the most likely state sequence and the observations. The update step is calculated as

$$\begin{aligned}
\delta_t(j) &= \max_{q_1, \ldots, q_{t-1}} P(q_{1:t-1}, q_t = j, x_{1:t}) \\
&= \max_{q_1, \ldots, q_{t-1}} P(x_t \mid q_{1:t-1}, q_t = j, x_{1:t-1})\, P(q_{1:t-1}, q_t = j, x_{1:t-1}) \\
&= \max_{q_1, \ldots, q_{t-1}} P(x_t \mid q_t = j)\, P(q_t = j \mid q_{1:t-1}, x_{1:t-1})\, P(q_{1:t-1}, x_{1:t-1}) \\
&= \max_{q_1, \ldots, q_{t-1}} P(x_t \mid q_t = j)\, P(q_t = j \mid q_{t-1})\, P(q_{1:t-1}, x_{1:t-1}) \\
&= \max_i \Big( P(x_t \mid q_t = j)\, P(q_t = j \mid q_{t-1} = i) \max_{q_1, \ldots, q_{t-2}} P(q_{t-1} = i, q_{1:t-2}, x_{1:t-1}) \Big) \\
&= \max_i B_j(t)\, A_{ij}\, \delta_{t-1}(i).
\end{aligned} \tag{2.17}$$

Initialisation of $\delta$ is done by

$$\delta_1(j) = P(q_1 = j, x_1) = P(x_1 \mid q_1 = j)\, P(q_1 = j) = B_j(1)\, \pi_j. \tag{2.18}$$

In order to find the most probable sequence, the previous maximising state is stored in a variable $\phi$ for each time and state:

$$\phi_t(j) = \arg\max_i B_j(t)\, A_{ij}\, \delta_{t-1}(i). \tag{2.19}$$

Having calculated $\delta$ for the entire sequence, the most probable final state is found as

$$q_T^{*} = \arg\max_j \delta_T(j). \tag{2.20}$$

Given the final state, the most probable state sequence is found using $\phi$ to backtrack along the sequence:

$$q_t^{*} = \phi_{t+1}(q_{t+1}^{*}). \tag{2.21}$$

The complete algorithm is presented in Algorithm 3.

Algorithm 3 The HMM Viterbi Algorithm
    for j = 1 → N do                         ▷ Initialisation
        δ_1(j) = B_j(1) π_j
        φ_1(j) = 0
    end for
    for t = 2 → T do                         ▷ Recursion
        for j = 1 → N do
            δ_t(j) = max_i B_j(t) A_ij δ_{t−1}(i)
            φ_t(j) = arg max_i B_j(t) A_ij δ_{t−1}(i)
        end for
    end for
    q*_T = arg max_j δ_T(j)                  ▷ Termination
    for t = T − 1 → 1 do                     ▷ Back-tracking
        q*_t = φ_{t+1}(q*_{t+1})
    end for
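A corresponding NumPy sketch of Algorithm 3 follows; it assumes the same array conventions as the forward sketch in Section 2.2 and is, again, only an illustration.

```python
import numpy as np

def viterbi(A, pi, B):
    """HMM Viterbi algorithm (Algorithm 3): returns the jointly most
    probable state sequence for per-time likelihoods B[t, j]."""
    T, N = B.shape
    delta = np.zeros((T, N))
    phi = np.zeros((T, N), dtype=int)
    delta[0] = B[0] * pi                      # initialisation, Equation (2.18)
    for t in range(1, T):
        cand = delta[t - 1][:, None] * A      # cand[i, j] = delta_{t-1}(i) A_ij
        phi[t] = cand.argmax(axis=0)          # store maximising predecessor (2.19)
        delta[t] = B[t] * cand.max(axis=0)    # recursion, Equation (2.17)
    q = np.zeros(T, dtype=int)
    q[-1] = delta[-1].argmax()                # termination, Equation (2.20)
    for t in range(T - 2, -1, -1):            # back-tracking, Equation (2.21)
        q[t] = phi[t + 1][q[t + 1]]
    return q
```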


2.5 Use of Multiple Features

Previous sections have treated the observation $x$ as a scalar value; however, it is possible to use multivariate observations. Let

$$P(\bar{x}_t \mid q_j) = P((x_t^1, x_t^2) \mid q_j) = P(x_t^1, x_t^2 \mid q_j), \tag{2.22}$$

where $x_t^1$ and $x_t^2$ denote observations of different features such as height and width. Furthermore, assuming that $x_t^1$ and $x_t^2$ are independent, (2.22) may be simplified to

$$P(\bar{x}_t \mid q_j) = P(x_t^1 \mid q_j)\, P(x_t^2 \mid q_j). \tag{2.23}$$

This allows multiple features to be used simply by replacing $B_j(t) = P(x_t \mid q_j)$ with $B_j(t) = P(x_t^1 \mid q_j)\, P(x_t^2 \mid q_j)$ in the optimisation algorithm.

Finally, it should be observed that it is possible to use different features for different states without any changes to the optimisation algorithm. Setting

$$B_j(t) = P(x_t^1 \mid q_j) \tag{2.24}$$

and

$$B_i(t) = P(x_t^2 \mid q_i), \tag{2.25}$$

this equals defining

$$P(x_t^2 \mid q_j) = P(x_t^1 \mid q_i) = K = 1 \tag{2.26}$$

over an area of $x$ large enough that all observations fall within it. This results in an incorrect probability for all states, due to the fact that $\int_a^b 1\,\mathrm{d}x \neq 1$ in the general case. However, since the scale is the same for all state sequences, this does not affect the choice of state sequence.
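In code, the feature combination in (2.23) reduces to a single elementwise product of the per-feature likelihood tables; a small sketch with invented numbers, reusing the array conventions of the earlier sketches:

```python
import numpy as np

# Likelihood tables for two independent features (e.g. height and width):
# B1[t, j] = P(x1_t | q_t = j), B2[t, j] = P(x2_t | q_t = j).
B1 = np.array([[0.7, 0.2],
               [0.5, 0.3]])
B2 = np.array([[0.4, 0.6],
               [0.8, 0.1]])

# Equation (2.23): combined likelihood B_j(t) = P(x1_t | q_j) P(x2_t | q_j).
B = B1 * B2   # can now be passed to forward() or viterbi() unchanged
```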

2.6 Avoiding Numerical Underflow

As all three presented algorithms contain long products of factors smaller than 1, they are all susceptible to underflow problems. This can be resolved either through scaling [7] or by working in the logarithmic domain. The standard solution for the Forward and Forward-Backward algorithms is the scaling method; however, this method does not generalise to HSMMs [7], therefore we present the logarithmic approach. The core idea of the logarithmic approach is to optimise the logarithm of the probability. This effectively removes the risk of underflow since

$$\log(a \cdot b) = \log a + \log b. \tag{2.27}$$

2.6.1 Forward Algorithm

The logarithmic version of the Forward algorithm is presented in Algorithm 4.

Algorithm 4 The Logarithmic HMM Forward Algorithm
    π̃ = log(π)                              ▷ Calculate log probabilities
    Ã = log(A)
    B̃ = log(B)
    for j = 1 → N do                         ▷ Initialisation
        α_1(j) = B̃_j(1) + π̃_j
    end for
    for t = 2 → T do                         ▷ Recursion
        for j = 1 → N do
            α_t(j) = B̃_j(t) + log Σ_i exp(Ã_ij + α_{t−1}(i))
        end for
    end for
    for t = 1 → T do                         ▷ State maximisation
        q*_t = arg max_j α_t(j)
    end for
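As an illustration of the log-domain trick, here is a sketch of Algorithm 4 using NumPy and scipy.special.logsumexp. Zero entries in A or pi become -inf under the logarithm, which logsumexp handles correctly.

```python
import numpy as np
from scipy.special import logsumexp

def forward_log(A, pi, B):
    """Log-domain forward algorithm (Algorithm 4). Inputs are ordinary
    probabilities; logarithms are taken here. B[t, j] = P(x_t | q_t = j)."""
    T, N = B.shape
    with np.errstate(divide="ignore"):        # log(0) = -inf is intended
        logA, logpi, logB = np.log(A), np.log(pi), np.log(B)
    alpha = np.zeros((T, N))
    alpha[0] = logB[0] + logpi
    for t in range(1, T):
        # log-sum-exp replaces the sum of products, avoiding underflow
        alpha[t] = logB[t] + logsumexp(logA + alpha[t - 1][:, None], axis=0)
    return alpha, alpha.argmax(axis=1)
```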


2.6.2 Forward-Backward Algorithm

The logarithmic version of the Forward-Backward algorithm is presented in Algorithm 5.

Algorithm 5 The Logarithmic HMM Forward-Backward Algorithm
    π̃ = log(π)                              ▷ Calculate log probabilities
    Ã = log(A)
    B̃ = log(B)
    for j = 1 → N do                         ▷ Initialisation
        α_1(j) = B̃_j(1) + π̃_j
        β_T(j) = 0
    end for
    for t = 2 → T do                         ▷ Recursion - forward
        for j = 1 → N do
            α_t(j) = B̃_j(t) + log Σ_i exp(Ã_ij + α_{t−1}(i))
        end for
    end for
    for t = T − 1 → 1 do                     ▷ Recursion - backward
        for j = 1 → N do
            β_t(j) = log Σ_i exp(B̃_i(t+1) + β_{t+1}(i) + Ã_ji)
        end for
    end for
    for t = 1 → T do                         ▷ State maximisation
        for j = 1 → N do
            γ_t(j) = α_t(j) + β_t(j)
        end for
        q*_t = arg max_j γ_t(j)
    end for

2.6.3 Viterbi Algorithm

Due to the fact that

$$\log(\max f(\,\cdot\,)) = \max(\log f(\,\cdot\,)), \tag{2.28}$$

the derivation of the modified Viterbi algorithm is straightforward. The resulting algorithm is presented in Algorithm 6.

Algorithm 6 The Logarithmic HMM Viterbi Algorithm
    π̃ = log(π)                              ▷ Calculate log probabilities
    Ã = log(A)
    B̃ = log(B)
    for j = 1 → N do                         ▷ Initialisation
        δ_1(j) = B̃_j(1) + π̃_j
        φ_1(j) = 0
    end for
    for t = 2 → T do                         ▷ Recursion
        for j = 1 → N do
            δ_t(j) = max_i (B̃_j(t) + Ã_ij + δ_{t−1}(i))
            φ_t(j) = arg max_i (B̃_j(t) + Ã_ij + δ_{t−1}(i))
        end for
    end for
    q*_T = arg max_j δ_T(j)                  ▷ Termination
    for t = T − 1 → 1 do                     ▷ Back-tracking
        q*_t = φ_{t+1}(q*_{t+1})
    end for


3 Hidden Semi-Markov Models

The Hidden Semi-Markov Model (HSMM) is an extension of the Hidden Markov Model where the time spent in each state is explicitly modeled. In this chapter we introduce the model and the corresponding optimisation algorithm used in the thesis. However, we first show why this extension is important.

First, the probability of a Hidden Markov Model remaining in the same state for exactly $d$ observations is

$$P(\text{exactly } d \text{ consecutive observations of state } j) = (A_{jj})^{d-1}(1 - A_{jj}). \tag{3.1}$$

Since

$$(A_{jj})^{d-1}(1 - A_{jj}) \propto \exp(d \log(A_{jj})), \tag{3.2}$$

the probability of the model remaining in the same state effectively decreases exponentially with time, limiting the usefulness of HMMs. To further stress this point, Figure 3.1 compares the implicit duration distribution of a HMM with the explicit duration distribution of a HSMM.

As mentioned, the key feature of the HSMM is that the duration spent in each state is explicitly modeled using a probability distribution. This allows the transitions between states to be defined in the same way as in HMMs while allowing complete freedom in modeling the duration distribution. Taken together, this allows most of the simplicity of the HMM to be kept while making the model much more accurate for many problems.


Figure 3.1: The implicit duration distribution of a HMM compared to an explicitly defined duration distribution of a HSMM. The shape of the HMM distribution is restricted, while the shape of the HSMM distribution can be arbitrarily defined.
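The contrast in Figure 3.1 is easy to reproduce numerically: the HMM duration in (3.1) is geometric, always peaking at $d = 1$, while the HSMM duration can be given any shape, for example a discretised Gaussian. The parameter values below are illustrative only.

```python
import numpy as np
from scipy.stats import norm

d = np.arange(1, 36)                  # durations 1..35 observations

# Implicit HMM duration, Equation (3.1): geometric, peaks at d = 1.
A_jj = 0.9
p_hmm = A_jj ** (d - 1) * (1 - A_jj)

# Explicit HSMM duration: e.g. a discretised, truncated Gaussian,
# which can peak at any desired duration.
p_hsmm = norm.pdf(d, loc=25, scale=5)
p_hsmm /= p_hsmm.sum()                # normalise over the allowed durations
```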

In order to specify the notation, we define

$$C_j(d) = P(\text{exactly } d \text{ consecutive observations of state } j). \tag{3.3}$$

Furthermore, the integer variable $D$ is introduced to define the maximum duration spent in a state. This variable adds no information to the model from a theoretical viewpoint; however, it is necessary to limit the computational complexity of the optimisation algorithms.

As only the HSMM variant of the Viterbi algorithm is used in this thesis, the HSMM variants of the Forward and Forward-Backward algorithms are not presented here. However, they are explained in detail in [7].

3.1 The Viterbi Algorithm

In order to derive the HSMM version of the Viterbi algorithm we make use of the notation introduced in [7]:

• $G_t = (Q_t, L_t)$ is a generalised form of the ordinary state, containing both the ordinary state, $Q_t$, and the duration of the state, $L_t$. The suffix $t$ means that the state ends at time $t$, i.e. $Q_{t+1} \neq Q_t$.

• $G_{t_p}$ is the previous state, i.e. the state exactly before $G_t$.

• $\bar{G}_t$ denotes all states prior to $G_t$.

• $Y(G_t)$ denotes all observations of $G_t$. In the same way, $Y(\bar{G}_t)$ denotes all observations of the states prior to $G_t$.


This allows a $\delta$, similar to the one used in the HMM Viterbi algorithm, to be introduced as

$$\begin{aligned}
\delta_t(g) &= \max_{\bar{G}_t} P(G_t = g, \bar{G}_t, Y(G_t), Y(\bar{G}_t)) \\
&= \max_{\bar{G}_t} P(Y(G_t) \mid G_t = g)\, P(G_t = g, \bar{G}_t, Y(\bar{G}_t)) \\
&= \max_{\bar{G}_t} P(Y(G_t) \mid G_t = g)\, P(G_t = g \mid \bar{G}_t)\, P(\bar{G}_t, Y(\bar{G}_t)) \\
&= \max_{G_{t_p} = g'} \Big( P(Y(G_t) \mid G_t = g)\, P(G_t = g \mid G_{t_p} = g') \max_{\bar{G}_{t_p}} P(G_{t_p} = g', \bar{G}_{t_p}, Y(G_{t_p}), Y(\bar{G}_{t_p})) \Big).
\end{aligned} \tag{3.4}$$

Due to the fact that $t_p$ is a random variable, Equation (3.4) cannot be implemented in practice. In order to resolve this, we insert $G_t = (Q_t, L_t)$. Furthermore, we introduce a variable $F$ to denote when a state finishes:

$$F_t = \begin{cases} 1, & \text{if } Q_{t+1} \neq Q_t \\ 0, & \text{else.} \end{cases} \tag{3.5}$$

We also use that $P(j, d \mid i, d') = P(d \mid j)\, P(j \mid i)$, thus assuming both that the duration of the current state is independent of the previous state and that the transition probabilities are independent of the durations of the states. Applying this to Equation (3.4) yields

$$\begin{aligned}
\delta_t(j, d) &= \max_{\bar{G}_t} P(Q_t = j, L_t = d, F_t = 1, y_{1:t}, \bar{G}_t) \\
&= \max_{i, d'} \Big( P(y_{t-d+1:t} \mid j, d)\, P(j, d \mid i, d') \max_{\bar{G}_{t-d}} P(Q_{t-d} = i, L_{t-d} = d', F_{t-d} = 1, y_{1:t-d}, \bar{G}_{t-d}) \Big) \\
&= \max_{i, d'} \left( \Big( \prod_{s=t-d+1}^{t} B_j(s) \Big) P(d \mid j)\, P(j \mid i)\, \delta_{t-d}(i, d') \right) \\
&= \max_{i, d'} \left( \Big( \prod_{s=t-d+1}^{t} B_j(s) \Big) C_j(d)\, A_{ij}\, \delta_{t-d}(i, d') \right).
\end{aligned} \tag{3.6}$$

This is the same update step as is presented in [13] and [20].

It can be observed that the update step is similar to that of the ordinary HMM algorithm, the differences being that the duration probabilities are factored in and that it is necessary to search for the maximising arguments over the durations of the states. Furthermore, the $\delta$ in the HMM algorithm is defined as the joint probability of being in state $j$ at time $t$; in the HSMM algorithm it is instead defined as the joint probability of transitioning from state $j$ at time $t$.

In order to do a proper initialisation it is necessary to initialise $\delta$ not only for $t = 1$ but for all $t \leq D$. The initialisation is expressed as

$$\delta_t(j, t) = \Big( \prod_{s=1}^{t} B_j(s) \Big) C_j(t)\, \pi_j. \tag{3.7}$$

As in the HMM algorithm, the sequence path has to be stored in a storage variable $\phi$. In order to store the path sequence, it is necessary to store both the previous state and the duration spent in the previous state. For the update steps this is expressed as

$$(i^{*}, d^{*}) = \arg\max_{1 \leq i \leq N,\, 1 \leq d' \leq D} \left( \Big( \prod_{s=t-d+1}^{t} B_j(s) \Big) C_j(d)\, A_{ij}\, \delta_{t-d}(i, d') \right),$$
$$\phi_t(j, d) = (t - d, i^{*}, d^{*}).$$

Furthermore, it is necessary to initialise the storage variable as

$$\phi_t(j, t) = (0, 0, 0)$$

for all $t \leq D$.

The last state is found as

$$(j_n, d_n) = \arg\max_{1 \leq j \leq N,\, 1 \leq d' \leq D} \delta_T(j, d'), \qquad t_n = T,$$

and the back-tracking as

$$(t_n, j_n, d_n) = \phi_{t_{n-1}}(j_{n-1}, d_{n-1}). \tag{3.8}$$


Algorithm 7 The HSMM Viterbi Algorithm
    for j = 1 → N do                         ▷ Initialisation
        for t = 1 → D do
            δ_t(j, t) = (Π_{s=1..t} B_j(s)) C_j(t) π_j
            φ_t(j, t) = (0, 0, 0)
        end for
    end for
    for t = 2 → T do                         ▷ Recursion
        for j = 1 → N do
            for d = 1 → D do
                if d ≥ t then
                    continue
                end if
                δ_t(j, d) = max_{1≤i≤N, 1≤d′≤D} (Π_{s=t−d+1..t} B_j(s)) C_j(d) A_ij δ_{t−d}(i, d′)
                (i*, d*) = arg max_{1≤i≤N, 1≤d′≤D} (Π_{s=t−d+1..t} B_j(s)) C_j(d) A_ij δ_{t−d}(i, d′)
                φ_t(j, d) = (t − d, i*, d*)
            end for
        end for
    end for
    (j_n, d_n) = arg max_{1≤j≤N, 1≤d′≤D} δ_T(j, d′)   ▷ Termination
    t_n = T
    while (t_n − d_n) > 1 do                 ▷ Back-tracking
        (t_n, j_n, d_n) = φ_{t_n}(j_n, d_n)
    end while


3.2 Avoiding Numerical Underflow

Similarly to the HMM optimisation algorithms, the HSMM variants suffer from underflow problems. As mentioned in Section 2.6, for HMMs this may be resolved either through the use of scaling [11] or by working in the logarithmic domain. However, it is shown in [7] that scaling is not a valid approach for HSMMs, thus only the logarithmic method is applicable. Using the same approach as presented in Section 2.6.3 results in Algorithm 8.


Algorithm 8 The Logarithmic HSMM Viterbi Algorithm
    π̃ = log(π)                              ▷ Calculate log probabilities
    Ã = log(A)
    B̃ = log(B)
    C̃ = log(C)
    for j = 1 → N do                         ▷ Initialisation
        for t = 1 → D do
            δ_t(j, t) = (Σ_{s=1..t} B̃_j(s)) + C̃_j(t) + π̃_j
            φ_t(j, t) = (0, 0, 0)
        end for
    end for
    for t = 2 → T do                         ▷ Recursion
        for j = 1 → N do
            for d = 1 → D do
                if d ≥ t then
                    continue
                end if
                δ_t(j, d) = max_{1≤i≤N, 1≤d′≤D} (Σ_{s=t−d+1..t} B̃_j(s)) + C̃_j(d) + Ã_ij + δ_{t−d}(i, d′)
                (i*, d*) = arg max_{1≤i≤N, 1≤d′≤D} (Σ_{s=t−d+1..t} B̃_j(s)) + C̃_j(d) + Ã_ij + δ_{t−d}(i, d′)
                φ_t(j, d) = (t − d, i*, d*)
            end for
        end for
    end for
    (j_n, d_n) = arg max_{1≤j≤N, 1≤d′≤D} δ_T(j, d′)   ▷ Termination
    t_n = T
    while (t_n − d_n) > 1 do                 ▷ Back-tracking
        (t_n, j_n, d_n) = φ_{t_n}(j_n, d_n)
    end while
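For concreteness, a compact log-domain sketch of the HSMM Viterbi recursion follows. It is our own illustrative implementation, not the thesis code: logB[t, j] holds log P(x_t | q = j), logC[d-1, j] holds log C_j(d), prefix sums replace the repeated products in (3.6), and segments are returned as (start, end, state) with 0-indexed inclusive bounds.

```python
import numpy as np

def hsmm_viterbi_log(A, pi, logB, logC):
    """Log-domain HSMM Viterbi, sketching Algorithm 8 under our own
    array conventions (see the lead-in text)."""
    T, N = logB.shape
    D = logC.shape[0]
    with np.errstate(divide="ignore"):        # log(0) = -inf is intended
        logA, logpi = np.log(A), np.log(pi)
    # Prefix sums turn the products of B_j(s) in (3.6) into differences.
    cum = np.vstack([np.zeros(N), np.cumsum(logB, axis=0)])
    delta = np.full((T, N, D), -np.inf)       # delta[t, j, d-1]
    phi = {}                                  # back-pointers: (t, j, d) -> (t', i, d')
    for t in range(T):
        for j in range(N):
            for d in range(1, min(t + 1, D) + 1):
                obs = cum[t + 1, j] - cum[t + 1 - d, j]   # segment log-likelihood
                if t - d + 1 == 0:            # initial segment, Equation (3.7)
                    delta[t, j, d - 1] = obs + logC[d - 1, j] + logpi[j]
                else:                         # best predecessor ending at t - d
                    prev = logA[:, j][:, None] + delta[t - d]
                    i, dp = np.unravel_index(prev.argmax(), prev.shape)
                    delta[t, j, d - 1] = obs + logC[d - 1, j] + prev[i, dp]
                    phi[(t, j, d)] = (t - d, i, dp + 1)
    j, d0 = np.unravel_index(delta[T - 1].argmax(), delta[T - 1].shape)
    t, d = T - 1, d0 + 1                      # termination
    segments = [(t - d + 1, t, j)]
    while (t, j, d) in phi:                   # back-tracking
        t, j, d = phi[(t, j, d)]
        segments.append((t - d + 1, t, j))
    return segments[::-1]
```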


4 Segmentation

A key part of running an orchard is keeping an up-to-date inventory of the trees. This is necessary both to estimate crop yield and health, and for general farm management. Currently, much of this labour is performed manually on selected parts of the orchard, and the results are extrapolated to account for the entire orchard. Automation of this process would thus not only reduce labour, but also yield more accurate estimates. However, in order to perform estimates on a per-tree level, it is necessary to determine the extent of the individual trees. We denote this process as "segmentation", as it involves dividing the rows into segments based on the extent of the individual trees.

We base the segmentation on the 3D point cloud presented in Section 1.2. The reason for basing the segmentation on 3D data is that it has a high signal-to-noise ratio, due to the contrasting depths in the data. Furthermore, it is robust to illumination variance. This can be contrasted with the possible use of visual sensors, which suffer both from the general similarity of the environment and from sensitivity to illumination changes.

We introduce a method based on a Hidden Semi-Markov Model (HSMM), integrating knowledge about both tree spacing and measured tree features. In order to improve the performance further, we propose a number of different variants of the introduced method. Finally, we evaluate the different methods and compare their results.


4.1 Modeling the orchard

A 3 state HSMM model for modeling an orchard was presented in [18]. Let the model contain 3 states: tree, gap and border. The tree state represents a tree in the orchard, gap represents a space without a tree, and border represents the transition between the other two states. Of importance is that border does not only represent the transition between tree and gap, but also from tree to tree and from gap to gap. Based on this model, we define an individual tree to be a continuous sequence of tree observations that are not separated by a border state.

The observations of the HSMM are calculated by extracting features from the 3D point cloud. This is done by dividing the rows into vertical "slices" of a specified width and extracting a feature from each slice. This approach gives a sequence of observations that can be input to the Viterbi algorithm.
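As a concrete picture of this step, here is a minimal slicing sketch. It is our own illustration rather than the thesis pipeline; it assumes the row runs along the x axis of the point cloud, and the two features computed here anticipate Section 4.1.3.

```python
import numpy as np

def extract_slice_features(points, resolution=0.2):
    """Divide a row's point cloud (an (N, 3) array of x, y, z with x along
    the row) into vertical slices of the given width, and compute a simple
    per-slice feature sequence (here: max height and point count)."""
    x = points[:, 0]
    edges = np.arange(x.min(), x.max() + resolution, resolution)
    n = len(edges) - 1
    idx = np.clip(np.digitize(x, edges) - 1, 0, n - 1)   # slice of each point
    height = np.zeros(n)
    density = np.zeros(n)
    for s in range(n):
        in_slice = points[idx == s]
        density[s] = len(in_slice)
        if len(in_slice):
            height[s] = in_slice[:, 2].max()
    return height, density
```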

In order to finalise the model, it is necessary to specify the state duration distributions. These distributions are a key part of the model and allow a priori knowledge of tree width to be integrated into the model.

4.1.1 Transition Probabilities

In order to set the notation, define the states as

$$\begin{pmatrix} q_1 \\ q_2 \\ q_3 \end{pmatrix} = \begin{pmatrix} q_{\text{tree}} \\ q_{\text{gap}} \\ q_{\text{border}} \end{pmatrix}. \tag{4.1}$$

To ensure that the time spent in each state is controlled solely by the state duration distributions, the self-transition probability is set to 0 for all states. Furthermore, the transition probability between tree and gap is set to 0, which gives that the transition probability from tree and from gap to border is 1. Finally, the transition probabilities from border to tree and to gap are each set to 0.5. The complete transition matrix is given as

$$A = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 1 \\ 0.5 & 0.5 & 0 \end{pmatrix}. \tag{4.2}$$


Figure 4.1: The state transition diagram of the orchard model. Note that the explicit state duration distributions are not shown.

4.1.2 Initial Probabilities

To adhere to the convention of bounding gaps and trees by borders, the initial probabilities are defined as

$$\pi = \begin{pmatrix} 0 & 0 & 1 \end{pmatrix}. \tag{4.3}$$

4.1.3 Choice of Observation Feature

Identifying the state sequence depends on how informative the observation features are. There exists a multitude of possible features and feature combinations. The method presented in [18] utilises height features. A problem with the height feature is that it is heavily affected by small branches, making it hard to detect the borders between overlapping trees.

Another feature is volume, a benefit being that it is less affected by minor branches. A third feature is a point count, giving a rough estimate of the density of the slice. In order to set the notation, we denote these 3 features the original features, to distinguish them from the features that may be derived from them.

Furthermore, the level of noise has a great effect on the performance of the model, and is thus necessary to take into consideration when choosing a feature. As the noise level is closely related to the slice width, we discuss the choice of slice resolution in the next section.

Resolution

The resolution is defined to be the width of each slice. It is difficult to quantify the exact effects of the resolution, but some broad points can be noted. A larger resolution, i.e. thinner slices, allows for a more exact segmentation of the dataset. On the other hand, too small a resolution makes the features more susceptible to noise. It is thus necessary to balance segmentation precision against noise level. Having no method for performing such an analysis theoretically, the problem was examined by calculating and plotting the three original features for different resolutions. The usefulness of each resolution was determined by how easily the borders between the individual trees could be detected by visual inspection. From the results of the analysis, it was determined that a resolution of 0.2 m is suitable. Having a larger resolution than this had significant negative effects on the noise level, while a lower resolution did not improve the noise level by any significant degree.

Height

The height feature is calculated by finding the largest height within the slice. In order to estimate its usefulness, the feature was calculated and plotted for one row. Extracts of the row, presented in Figures 4.2 and 4.3, show that some trees and gaps are distinct. However, they also show that the border seems to be unclear between many larger trees, presumably because of overlapping branches between the trees.

Figure 4.2: The measured height of the first 80 meters of an examined row.

Figure 4.3: The measured height from 240 to 320 meters of an examined row.

Density

The density feature is calculated by counting the number of points in the slice. One negative aspect of the feature is that it is correlated with the speed of the robot. It is possible that this could be handled by dividing the point count by the current speed of the robot, though this may be problematic if the robot is not moving in a straight line. Nevertheless, most datasets were collected by the robot moving in a straight line at uniform speed, so the negative effects of the problem should be negligible. Addressing it was therefore deemed outside the scope of the thesis.

The usefulness of the density feature was evaluated in the same way as that of the height feature. The plots in Figures 4.4 and 4.5 show that most trees are distinct. Furthermore, the feature seems not to be as heavily affected by overlapping branches as the height feature. A negative aspect, however, is that it does not appear to be as sensitive to small trees as the height feature.

Figure 4.4: The measured density of the first 80 meters of an examined row.

Figure 4.5: The measured density from 240 to 320 meters of an examined row.

Volume

In order to calculate the volume feature, the slice is first split into uniform voxels of width 0.2 m. Subsequently, the volume is computed by adding the volumes of all voxels containing at least one point. The usefulness of the feature was evaluated in the same way as the height and density features. Figures 4.6 and 4.7 show that most trees are very distinct. Furthermore, the feature seems to be less noisy than the density feature, while suffering from the same lack of sensitivity to small trees. Taking these points into consideration, it was decided to use volume as the primary observation feature.

Figure 4.6: The measured volume of the first 80 meters of an examined row.

Figure 4.7: The measured volume from 240 to 320 meters of an examined row.
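A minimal sketch of the voxel-occupancy volume computation just described; the (N, 3) array layout is our assumption, and the 0.2 m voxel width comes from the text.

```python
import numpy as np

def slice_volume(slice_points, voxel=0.2):
    """Volume of one slice: quantise the slice's points (an (N, 3) array of
    x, y, z) to uniform voxels and sum the volume of occupied voxels."""
    if len(slice_points) == 0:
        return 0.0
    # Count unique occupied voxels; each contributes voxel**3 of volume.
    occupied = np.unique(np.floor(slice_points / voxel).astype(int), axis=0)
    return len(occupied) * voxel ** 3
```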

Volume Difference

The authors of [18] used height difference to indicate border states, a valid approach as borders are characterised by quick changes in height. A problem, however, is that difference calculations amplify the noise in the data. As can be seen in Figure 4.8, the noise level is clearly too large for the feature to be useful. In order to negate the effect of the noise, it was decided to introduce an ad hoc feature termed "difference of moving averages".

Figure 4.8: The estimated volume difference using a simple 1-step difference.

The "difference of moving averages" feature is computed by calculating the difference between one large moving average and one small moving average. The large moving average, calculated by filtering the volume feature with a rectangular filter of width 30, gives an estimate of the volume in a larger neighbourhood around the current slice. The small moving average, calculated by filtering the volume feature with a rectangular filter of width 5, gives a smoothed estimate of the volume at the current slice. By this definition, the feature yields large values at borders between large trees. Similarly, it yields large negative values both at tree centres and at the borders between gaps and trees. An example of the resulting feature values is presented in Figure 4.9.

Figure 4.9: The measured volume (—) and the difference of moving averages (—).
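As a minimal sketch of this feature, assuming the volume feature is available as a one-dimensional array with one value per slice, the two moving averages can be realised as normalised rectangular filters (here via `np.convolve`; the function name and the handling of the array edges are assumptions for illustration):

```python
import numpy as np

def difference_of_moving_averages(volume, wide_width=30, narrow_width=5):
    """Difference between a wide and a narrow moving average of the
    per-slice volume feature. Positive peaks suggest borders between
    large trees; negative values suggest tree centres or gap borders."""
    wide = np.convolve(volume, np.ones(wide_width) / wide_width, mode="same")
    narrow = np.convolve(volume, np.ones(narrow_width) / narrow_width, mode="same")
    return wide - narrow
```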

4.1.4 Hand-Tuned Observation Distributions

The authors of [18] showed that it is possible to achieve good results using hand-tuned observation distributions. Using the same approach, the volume observation distributions were hand-tuned by visually inspecting Figures 4.6 and 4.7. The observation distribution of gap was set such that it is more likely when a small volume is observed. Similarly, the observation distribution of tree was set such that it is more likely when a large volume is observed. The observation distribution based on the “difference of moving averages” was set to be moderately large for small values and large for large values. The reason for keeping the distribution moderately large for small values is that these may indicate both a tree centre and a border between a gap and a tree. The resulting observation distributions are presented in Figures 4.10 and 4.11.


Figure 4.10: The observation distributions for the tree (—) and gap (—) states.


Figure 4.11: The observation distribution for the border state.
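The actual hand-tuned curves are those shown in Figures 4.10 and 4.11. Purely as an illustration of the qualitative behaviour described above, the likelihoods could be approximated with simple logistic functions; the thresholds and slopes below are illustrative assumptions, not the values used in the thesis:

```python
import numpy as np

def logistic(x, centre, steepness):
    return 1.0 / (1.0 + np.exp(-steepness * (x - centre)))

def tree_likelihood(volume):
    # More likely the larger the observed volume (illustrative threshold).
    return logistic(volume, centre=0.5, steepness=10.0)

def gap_likelihood(volume):
    # More likely the smaller the observed volume.
    return 1.0 - logistic(volume, centre=0.5, steepness=10.0)

def border_likelihood(volume_difference):
    # Large for large values and moderately large for small values,
    # since small values may indicate either a tree centre or a
    # border between a gap and a tree.
    return 0.4 + 0.6 * logistic(volume_difference, centre=0.3, steepness=10.0)
```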

4.1.5 State Duration Distributions

The state duration distributions are used to embed knowledge of the spatial dimensions of the orchard into the HSMM model. The authors of [18] approached the problem by modelling the duration of the trees and gaps with a Gaussian distribution centred around a mean. Furthermore, the standard deviation of the Gaussian was made fairly broad to allow for smaller trees. Finally, the Gaussian was truncated at a certain point so as to set an upper limit on the tree width. This approach is reasonable for trees; however, it is problematic for gaps. The reason is that the gap between a replanted tree and its full-grown neighbours is at most half the width of a tree. Modelling the width of the gap with the same distribution as the trees would thus introduce a bias into the model, possibly forcing small trees to be labeled as gaps. Furthermore, there exist many small gaps between large trees in the dataset, and their extent is badly modelled by a Gaussian based on the width of the trees. Taking these issues into consideration, the duration distribution of the gap state is set to be uniform.

In order to make the borders as short as possible, the duration distribution of the border state is set to a Kronecker delta, thus limiting the duration of the border to 1.

In order to limit both the computational complexity and the width of the trees, the maximum duration D is set to 35 observations (7 meters). The Gaussian tree distribution is set to have a mean of 25 observations (5 meters) and a standard deviation of 5 observations (1 meter). The duration distributions are displayed in Figures 4.12 and 4.13.

Figure 4.12: The state duration distributions for the tree (—) and gap (—) states.

Figure 4.13: The duration distribution for the border state.
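A short numerical sketch of these duration distributions, using the values stated above (D = 35, tree mean 25, tree standard deviation 5); the array representation, with index i holding the probability of duration i + 1, is an assumption made for illustration:

```python
import numpy as np

D = 35  # maximum duration: 35 observations, i.e. 7 meters
durations = np.arange(1, D + 1)

# Tree: Gaussian with mean 25 (5 m) and std 5 (1 m), truncated at D
# and renormalised.
tree_duration = np.exp(-0.5 * ((durations - 25) / 5) ** 2)
tree_duration /= tree_duration.sum()

# Gap: uniform over all allowed durations.
gap_duration = np.full(D, 1.0 / D)

# Border: Kronecker delta, restricting the border to a single observation.
border_duration = np.zeros(D)
border_duration[0] = 1.0
```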


4.2 Model Alterations and Extensions

A 3 state model was presented in Section 4.1. In this section we discuss how this model can be altered and extended to increase both its performance and its simplicity of use. First, we present how the observation and duration distributions can be learnt from a labeled dataset. Second, we extend the model by introducing a ground removal pre-processing step. Third, we describe how the model can be extended to a 4 state model in order to increase its performance with regard to small trees. Finally, we also examine the possible usage of the height and density features.

4.2.1 Learnt Observation Distributions

Instead of using hand-tuned observation distributions, such as those presented in Section 4.1.4, it is possible to learn the distributions from a labeled dataset. The histograms of the observations for the 3 different states are presented in Figures 4.14, 4.15 and 4.16. Figures 4.14 and 4.15 show that both the tree and gap distributions are fairly close to Gaussian, while the border distribution in Figure 4.16 is clearly non-Gaussian. Despite this, we still decide to model all observation distributions as Gaussian in order to investigate whether it is possible to achieve good results using a very simple learning method.


Figure 4.14: The histogram of volume measurements given tree.

Assuming that the distributions are Gaussian, they can be learnt by estimating the mean and variance of the observations for the different states. Performing this operation, using a dataset labeled with the model presented in Section 4.1, yields the observation distributions presented in Figure 4.17.
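A minimal sketch of this estimation step, assuming the labeled dataset is given as an array of volume measurements and a matching array of state labels (the function and variable names are hypothetical):

```python
import numpy as np

def learn_gaussian_distributions(volumes, labels):
    """Estimate a Gaussian observation distribution (mean, standard
    deviation) for each state from labeled volume measurements."""
    params = {}
    for state in ("tree", "gap", "border"):
        samples = volumes[labels == state]
        params[state] = (samples.mean(), samples.std())
    return params
```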



Figure 4.15: The histogram of volume measurements given gap.


Figure 4.16: The histogram of volume measurements given border.


Figure 4.17: The learnt observation distributions for tree (—), gap (—) and border (—).
