
Institutionen för systemteknik
Department of Electrical Engineering

Master's Thesis (Examensarbete)

Autonomous Mapping and Exploration of Dynamic Indoor Environments

Master's thesis carried out in Automatic Control at the Institute of Technology, Linköping University

by

Joel Fåk and Tomas Wilkinson

LiTH-ISY-EX--13/4722--SE

Linköping 2013

Department of Electrical Engineering, Linköpings universitet


Autonomous Mapping and Exploration of Dynamic Indoor Environments

Master's thesis carried out in Automatic Control at the Institute of Technology, Linköping University

by

Joel Fåk and Tomas Wilkinson

LiTH-ISY-EX--13/4722--SE

Supervisors: Jonas Linder, ISY, Linköpings universitet
             Joakim Rydell, Totalförsvarets forskningsinstitut (FOI)
             Erika Emilsson, Totalförsvarets forskningsinstitut (FOI)

Examiner: Christian Lundquist, ISY, Linköpings universitet


Avdelning, Institution / Division, Department: Avdelningen för Reglerteknik (Division of Automatic Control), Department of Electrical Engineering, SE-581 83 Linköping
Datum / Date: 2013-09-03
Språk / Language: Engelska/English
Rapporttyp / Report category: Examensarbete (Master's thesis)
URL för elektronisk version: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-97609
ISRN: LiTH-ISY-EX--13/4722--SE
ISBN: —
Serietitel och serienummer / Title of series, numbering: —
ISSN: —
Titel / Title: Autonom kartläggning och utforskning av dynamiska inomhusmiljöer / Autonomous Mapping and Exploration of Dynamic Indoor Environments
Författare / Author: Joel Fåk and Tomas Wilkinson


Nyckelord / Keywords: Autonomous; Mapping; Dynamic; SLAM; Simultaneous Localization and Mapping; D* lite; 3D; Iterative Closest Point; Scan Match; Exploration Planning; Octomap; Occupancy Grid


Abstract

This thesis describes all the parts needed to build a complete system for autonomous indoor mapping in 3D. The robotic platform used is a two-wheeled Segway, operating in a planar environment. This, together with wheel odometers, an Inertial Measurement Unit (IMU), two Microsoft Kinects and a laptop, comprises the backbone of the system, which can be divided into three parts:

• The localization and mapping part, which fundamentally is a SLAM (simultaneous localization and mapping) algorithm implemented using the registration technique Iterative Closest Point (ICP). Along with the map being in 3D, it is also designed to handle the mapping of dynamic scenes, something absent from the standard SLAM design.

• The planning used by the system is twofold: first, the path planning - finding a path from the current position to a destination - and second, the target planning - determining where to go next given the current state of the map and the robot.

• The third part of the system is the control and collision systems, which, while they have not received much focus, are necessary for a fully autonomous system.

Contributions made by this thesis include: the 3D map framework OctoMap is extended to handle the mapping of dynamic scenes; a new method for target planning, based on image processing, is presented; and a calibration procedure for the robot is derived that gives a full six degree of freedom pose for each Kinect.

Results show that our calibration procedure produces an accurate pose for each Kinect, which is crucial for a functioning system. The dynamic mapping is shown to outperform the standard occupancy grid in fundamental situations that arise when mapping dynamic scenes. Additionally, the results indicate that the target planning algorithm provides a fast and easy way to plan new target destinations. Finally, the entire system's autonomous mapping capabilities are evaluated together, producing promising results. However, the evaluation also highlights some problems that limit the system's performance, such as the inaccuracy and short range of the Kinects and noise added and reinforced by the multiple subsystems.


Sammanfattning

This thesis describes the parts required to build a complete system that autonomously maps indoor environments in three dimensions. The robot platform is a Segway, capable of moving in a plane. The Segway, together with an inertial measurement unit, two Microsoft Kinects and a laptop, forms the basis of the system, which can be divided into three parts:

• A localization and mapping part, which at its core is a SLAM algorithm (simultaneous localization and mapping) based on the registration method Iterative Closest Point (ICP). The map that is built is in three dimensions and is also meant to handle the mapping of dynamic environments, something the original formulation of the SLAM problem cannot do.

• An automatic planning part, which consists of two components: path planning, which is to find a way from the current position to the chosen target, and target planning, which is to choose a target to go to given the current map and the robot's current position.

• The system's third part is the control and collision systems. These systems have not been the focus of this work, but they are nevertheless essential for an autonomous system to function.

This thesis contributes the following: OctoMap, a framework for mapping in 3D, has been extended to handle the mapping of dynamic environments; a new method for target planning, based on image processing, is presented; and a calibration procedure for the robot has been developed that gives the full pose of each Kinect relative to the robot.

The results show that our calibration procedure gives an accurate pose for each Kinect, which is crucial for the system to work. The method for mapping dynamic environments is shown to perform well in fundamental situations that arise when mapping dynamic environments. Furthermore, the target planning algorithm is shown to provide a fast and simple way of planning targets to go to. Finally, the entire system's autonomous mapping capability is evaluated, with promising results. However, the results also highlight problems that limit the system's performance, for example the Kinects' inaccuracy and short range, and noise added and amplified by the various subsystems.


Acknowledgments

We would like to thank our supervisors at Totalförsvarets Forskningsinstitut, Dr Joakim Rydell and Erika Emilsson, for letting us carry out this thesis and for their support and advice for the duration of this project. We would also like to thank our supervisor and our examiner at Linköping University, Jonas Linder and Dr Christian Lundquist, for their input and guidance.

Linköping, June 2013 Joel Fåk & Tomas Wilkinson


Contents

1 Introduction
  1.1 Background
    1.1.1 SLAM - Simultaneous Localization and Mapping
    1.1.2 Dynamic maps
    1.1.3 Planning
  1.2 Preconditions
  1.3 Goals
  1.4 Exclusions
  1.5 Method
  1.6 Thesis Outline
2 System Overview
  2.1 Hardware Assembly
  2.2 Software Design
  2.3 Run Modes
  2.4 Third Party Libraries
    2.4.1 Mobile Robot Programming Toolkit
    2.4.2 OctoMap
    2.4.3 Point Cloud Library
    2.4.4 Eigen
3 Registration
  3.1 Iterative Closest Point
    3.1.1 Limitations
    3.1.2 Versions of ICP
4 Map Representation
  4.1 Occupancy Grid
    4.1.1 Updating The Occupancy Grid
  4.2 Dynamic Mapping Occupancy Grid
    4.2.1 Static Map Update
    4.2.2 Dynamic Map Update
  4.3 Map Implementation
5 Simultaneous Localization and Mapping
  5.1 Measurements
  5.2 Motion Model
    5.2.1 Differential Drive
    5.2.2 Odometry Motion Model
  5.3 SLAM Algorithm
  5.4 Results with Dynamic Mapping in 3D
  5.5 Synchronization Problem Results
6 Planning
  6.1 Map Projection
  6.2 Target Planning
    6.2.1 Target Planning Results and Discussion
  6.3 Dynamic Path Planning
    6.3.1 Path Planning Results and Discussion
7 Reactive Robot Control
  7.1 Collision Detection and Avoidance
    7.1.1 Collision Detection
    7.1.2 Collision Handling
  7.2 Control
8 Automatic Calibration of Kinect Poses
  8.1 Homogeneous representation of Kinect pose
  8.2 Collecting data and estimating alias transformation for the motions
  8.3 Finding the floor plane
  8.4 Calculate orientation – yaw, pitch and roll
  8.5 Calculate location – x and y translation
  8.6 Results of Calibration Algorithm
9 System Results and Discussion
10 Conclusions and Future Work
  10.1 Future Work
A Occupancy Grid Update
B Hardware Details
  B.1 Segway
  B.2 IMU
  B.3 Kinects
  B.4 Xbox Controller
  B.5 Computer
C Complete System Result Images

1 Introduction

This is a Master’s thesis about autonomous robotic exploration and mapping. It is carried out at the Swedish Defence Research Agency (FOI).

1.1 Background

Autonomous robotic mapping is a field of research that has received a lot of attention in the last three decades. The simultaneous localization and mapping (SLAM) problem, where the robot has to build a map of its environment while simultaneously localizing itself within that map, is considered by many to be one of the major problems in achieving truly autonomous robots. If an environment is too dangerous, too remote or too expensive for human access, a mobile robot needs to be able to reliably build a map of its surroundings and navigate according to that map. Furthermore, SLAM is a requirement for the existence of personal robot assistants or autonomous robotic cars. However, there is more to an autonomous robot than just SLAM. For instance, the robot has to operate safely and not collide with people or objects, as well as be able to plan, by itself, what to do and where to move next.

1.1.1 SLAM - Simultaneous Localization and Mapping

The SLAM problem can be divided into two main classes: the online SLAM problem and the full SLAM problem. The full SLAM problem comprises estimating the entire trajectory of the robot along with the map, whereas the online SLAM problem is only concerned with the latest robot pose rather than the full path. The robot's pose comprises the robot's position in the 2D plane, i.e. x and y coordinates, along with the robot's heading, φ. This thesis only addresses the online SLAM problem.


The first SLAM algorithm to be applied, and arguably the most influential, is an algorithm based on the extended Kalman filter (EKF) and maximum likelihood data association using a feature-based map representation [Thrun et al., 2005]. This approach suffers from a number of limiting assumptions and approximations which are, in many situations, not true. The EKF assumes additive Gaussian noise and is not able to process negative information, that is, information regarding the absence of landmarks in sensor readings cannot be processed [Thrun et al., 2005]. Another drawback of the EKF is its sensitivity to data association errors, caused by its inability to process negative information and by its lack of a mechanism for representing uncertainty over data associations. If an incorrect data association is made, it can never be undone, and too many data association errors will cause the EKF to fail [Montemerlo and Thrun, 2007, p.7].

Since the first SLAM algorithm, many more have been developed to handle the situations that the EKF approach cannot. One of the most notable is the FastSLAM algorithm [Montemerlo et al., 2002], which is based on Rao-Blackwellized particle filters [Doucet et al., 2000]. It has enjoyed a lot of success without being limited by many of the assumptions and approximations of the EKF; most notably, it is not restricted to additive Gaussian noise or single-hypothesis data association, and it can incorporate negative information.

A variety of sensors can be, and have been, used in SLAM systems. The most common is the 2D laser range scanner, whose accuracy allows for high-precision map building. However, laser scanners are generally quite expensive, which sparked the development of systems using other, less expensive sensors to obtain range measurements, most notably visual sensors, i.e. cameras. Mono and stereo camera systems used to be the most popular camera setups, resulting in a number of successful SLAM systems [Davison, 2003, Herath et al., 2006]. Recently, the arrival of RGBD sensors such as the Microsoft Kinect has led to the development of SLAM algorithms explicitly designed for RGBD sensors [Engelhard et al., 2011, Izadi et al., 2011]. In fully autonomous robot systems, range sensors are usually paired with any number of other sensors, such as odometry and inertial measurement units (IMUs), to help with the robot localization.

1.1.2 Dynamic maps

A major issue that SLAM algorithms need to overcome, in order to be applicable in all but a few situations, is the assumption of a static environment; that is, they must handle dynamic objects. Depending on the context, dynamic objects could be people walking in front of the robot, cars driving along a road, light-weight furniture such as chairs or coffee tables, or a combination of several different objects.

During the last decade and a half, numerous approaches to the dynamic SLAM problem have been published. These approaches can be divided into two categories: those that filter out sensor measurements caused by non-static objects [Fox et al., 1999] and those that explicitly model the environment dynamics [Wolf and Sukhatme, 2005, Meyer-Delius et al., 2012].


1.1.3 Planning

In addition to solving the SLAM problem, a fully autonomous exploration robot has to make decisions about where to go and how to get there. The latter problem is referred to as path planning and has many uses beyond SLAM. In robotics, the search space is continuous, and most approaches reduce the problem to a discrete graph-search problem [Russell and Norvig, 2003]. This area has been extensively researched over the years, resulting in multiple well-known solutions. The A* algorithm uses heuristics to find the optimal path while visiting a minimal number of nodes during the search, and works well in a static world [Russell and Norvig, 2003]. In a dynamic environment, an incremental search algorithm like D* or D* lite is more efficient, since the arc cost parameters can change during the calculations [Stentz, 1994, Koenig and Likhachev, 2005]. Focused D* (FD*) combines the strengths of A* and D* by using a heuristic during the search and can be extended into Focused Hierarchical D* (FHD*) [Seder et al., 2011]. This method introduces map hierarchies in order to avoid redoing searches in locations that are not affected by the detected changes in the environment, while still calculating the optimal route.
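As a concrete point of reference for the graph-search formulation, the sketch below shows plain A* on a 4-connected occupancy grid in C++ (the implementation language of this project). It is illustrative only: the names and the uniform step cost are made up here, and the planner described later is based on D* lite, not on this code.

    #include <climits>
    #include <cstdlib>
    #include <functional>
    #include <queue>
    #include <tuple>
    #include <vector>

    // Minimal A* on a 4-connected grid (true = blocked cell). Returns the
    // number of steps from (sr, sc) to (gr, gc), or -1 if the goal is
    // unreachable. Manhattan distance is admissible on a 4-connected grid,
    // so the first time the goal is popped the cost is optimal.
    int astar(const std::vector<std::vector<bool>>& blocked,
              int sr, int sc, int gr, int gc) {
        const int R = blocked.size(), C = blocked[0].size();
        auto h = [&](int r, int c) { return std::abs(r - gr) + std::abs(c - gc); };

        using Node = std::tuple<int, int, int, int>;  // (f = g + h, g, row, col)
        std::priority_queue<Node, std::vector<Node>, std::greater<Node>> open;
        std::vector<std::vector<int>> g(R, std::vector<int>(C, INT_MAX));

        g[sr][sc] = 0;
        open.emplace(h(sr, sc), 0, sr, sc);
        const int dr[] = {1, -1, 0, 0}, dc[] = {0, 0, 1, -1};

        while (!open.empty()) {
            auto [f, cost, r, c] = open.top();
            open.pop();
            if (r == gr && c == gc) return cost;  // optimal by admissibility
            if (cost > g[r][c]) continue;         // stale queue entry, skip
            for (int k = 0; k < 4; ++k) {
                const int nr = r + dr[k], nc = c + dc[k];
                if (nr < 0 || nr >= R || nc < 0 || nc >= C || blocked[nr][nc])
                    continue;
                if (cost + 1 < g[nr][nc]) {       // found a cheaper way in
                    g[nr][nc] = cost + 1;
                    open.emplace(g[nr][nc] + h(nr, nc), g[nr][nc], nr, nc);
                }
            }
        }
        return -1;
    }

D* lite makes this kind of search incremental: when arc costs change, only the affected part of the search is repaired instead of replanning from scratch.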

While path finding is a general problem, the decision of where to go is strongly coupled to the localization and mapping problem and is referred to as autonomous exploration. In a recent study by Juliá et al. [2012], an analysis of the state-of-the-art solutions was made. They divided the methods into two main classes: classic and integrated strategies.

The classic methods focus on gaining as much new information as possible while ignoring the uncertainty in the robot's position. One approach involves finding borders between open space and unexplored regions, called frontiers, and then navigating to the one closest to the agent [Yamauchi, 1997].

When performing SLAM, there is always uncertainty about the robot's pose, and the quality of the maps built by the robot is highly dependent on its pose. Some actions, like closing loops or returning to past positions, may reduce the uncertainty and increase the quality of the map [Surmann et al., 2003]. The integrated methods take this into account and try to minimize the uncertainty in addition to maximizing the information gain. One way to accomplish this, while using EKF-SLAM, is to observe landmarks with low uncertainty when the robot has high uncertainty, and vice versa [Tao et al., 2007].

Another way to increase the quality of the maps is to minimize the reconstruction error by looking at multiple expected future poses [Haner and Heyden, 2011]. This optimizes locally and allows the robot to take a small detour from its planned path in order to get a better view of the surroundings.

1.2 Preconditions

In the autumn of 2012, a group of students from Linköping University created an autonomous map-building robot [Torp, 2012]. The robot platform consisted of a Segway RMP50 with odometric information, an inertial sensor and a SICK LMS511-20100 laser scanner.

The robot had the ability to create a map of its surroundings, choose an unexplored destination, plan a path to it, move along the path and avoid new obstacles. The main shortcoming of the system was that the robot did not use the created map to localize itself. This means that imperfections in the odometric and inertial data displace the robot relative to the map, and the continued map building becomes erroneous. The path planning was not very robust, and the robot often failed to find unexplored locations.

The software was written in Matlab with MEX interfaces for the hardware communication.

1.3 Goals

The goal of this Master's thesis is to develop an autonomous, exploring robot that builds three-dimensional maps of indoor environments. The main goal can be broken down into the following subgoals:

1. Build a SLAM implementation that works in a static indoor environment with rooms and corridors. The maps produced should represent the three-dimensional structures of the available parts of the building. The main focus is to create maps usable for automatic planning and visual off-site inspection.

   • Distinguish navigable from impassable areas. The latter include narrow passages and ascending and descending stairs.

   • Connect rooms and hallways correctly to enable planning in the environment.

   • Implement an occupancy grid map [Elfes, 1989], such as [Wurm et al., 2010] and [Wolf and Sukhatme, 2005], with support for visual inspection.

2. The robot should be able to work in a dynamic environment. Specifically, the SLAM goals (1.) should still be met in the following scenarios:

   • Objects moving in front of the robot, e.g. people walking nearby. These objects should not be added to the map, but the planner should still take them into account in the exploration and path planning.

   • Environment changes between two separate visits of the robot, e.g. opened and closed doors and chairs being moved. The map should be updated in order to accommodate the changes.

3. The robot should work autonomously, which requires automatic planning of the exploration.

   (a) Exploration planning is to find unexplored regions of the map and choose the order in which they should be visited. The main goal is a planner that creates a plan and dynamically updates it in order to fulfil the following requirements:

      • Visit all available locations to create a full map for planning purposes.

      • Scan all accessible surfaces to enable exhaustive visual inspection.

   (b) Path planning is to calculate paths to the desired targets. The main goal is to create near-optimal paths that are dynamically updated in order to consider a changing environment.

1.4 Exclusions

In order to reduce the complexity of the problem at hand and to allow us to focus in more detail on certain parts of the system, other parts will not be examined beyond having a working implementation. These parts include the control of the robot, the collision avoidance system and the interfaces to some of the hardware, namely the odometry information from the Segway and the IMU. There are no additional precision requirements for the map once the criteria described in Section 1.3, Goal 1, are met. Furthermore, there are no speed or efficiency demands on the system, and the mapping will be limited to indoor office environments.

1.5 Method

At first, a review of current literature is to be done to determine the present state of autonomous mapping research, along with an investigation into what software is available for free use. In addition, the current system is to be installed on a new computer.

After switching sensors, getting the system into a working state is the top priority. Once this has been done, the subsystems will be developed incrementally in order to keep the system operational and enable continuous testing of the entire system.

Evaluation of the system will be done by letting the robot map known indoor environments for which we have access to some sort of ground truth information, e.g. a map or blueprint, to compare against. Some metric could be used that takes into account information such as angles between walls and floors, which should be approximately 90 degrees, and that walls and floors should be connected. Furthermore, even though no demands have been made regarding the performance of the system, it is still of interest and thus merits evaluation.

The design of the current system is to be reworked. The plan is to move large parts of the code to C++, run some of the algorithmic sub-steps in parallel and utilize existing open source libraries.

1.6 Thesis Outline

The report starts with an overview of the system design, both hardware and software, in Chapter 2, where some design choices are explained and motivated. Chapter 3 describes the registration method used in this thesis, while the map representation used is addressed in Chapter 4. The SLAM method used in the thesis is outlined in Chapter 5 and is based on the registration and map representation described in the preceding chapters. The SLAM ability is evaluated and discussed at the end of that chapter.

Chapter 6 addresses the problems of finding potential targets to visit and how to plan routes. Results and discussion of these subsystems' accuracy and performance are included.

Chapter 7 deals with the task of moving the robot along a path while avoiding collisions with permanent and dynamic objects.

In Chapter 8, a calibration method created for this thesis is described. The purpose of the method is to find the 6 DoF poses of the Kinect cameras.

In Chapter 9, the results of the whole system are reported and discussed. Chapter 10 describes deficiencies and proposals for future work.

2 System Overview

An autonomous robot requires a set of sensors and actuators together with multiple software components to make it fully functional. This chapter describes how the robotic system used in this project is designed. The system consists of a Segway RMP50 robot connected to a laptop that performs the needed calculations. The robot also contains wheel encoders and a velocity controller. An Inertial Measurement Unit (IMU), two Microsoft Kinects and a hand controller are also connected to the laptop, which runs Matlab to provide an environment for the software. These connections can be seen in the UML deployment diagram in Figure 2.1. The details of the hardware items are described in Appendix B.

Figure 2.1: Deployment diagram of the system. It describes the relations between the sensors, robot, hand controller and computer.


2.1 Hardware Assembly

During the project, the Kinects were positioned on the robot such that the overlap of their fields of view was as small as possible. This was done in order to minimize the interference between the two Kinects' IR projectors. The Kinects are located approximately 0.6 m and 0.46 m above the floor, roughly along the axis of rotation of the Segway. Furthermore, the Kinects have been mounted on the Segway upside down to avoid the stand, which can wobble and induce an unknown, varying pitch angle. A picture of the assembly can be seen in Figure 2.2.

Figure 2.2: The hardware setup. The system includes a Segway RMP50 robot, an IMU, two Kinect cameras, an Xbox hand controller and a laptop. More information on the hardware can be found in Appendix B.


2.2 Software Design

The software consists of several subsystems with different responsibilities. Many calculations are time and/or memory consuming, and the various tasks must not interfere too much with each other; for example, the robot has to avoid collisions with unexpected obstacles even while heavy planning calculations are being performed. The system therefore had to be carefully designed. In this section, the design and implementation used in this project are described.

The tasks that need to be performed can be divided into four separate but interdependent groups:

• Reactive motion control
  – Collision avoidance, see Section 7.1
  – Path following and motion control, see Section 7.2

• Sensor data collection

• SLAM, see Chapter 5
  – Registration, see Chapter 3
  – Map building, see Chapter 4

• Planning
  – Target planning, see Section 6.2
  – Path planning, see Section 6.3

Between these parts there are many dependencies. For example, the collision avoidance system needs current sensor data, while the path and motion controls need the latest plan from the planner. The planner depends on the robot's current position and the map from the SLAM calculations. The map building process needs the odometric data from the motion controls as well as the latest data from the Kinects.

In order to handle all these dependencies, the mediator design pattern is utilized [Gamma et al., 2003]. In short, a mediator can be described as a central point of communication, responsible for sending and collecting data to and from the correct destinations and sources. That way, the modules do not need to know about each other, and new modules can easily be added. A downside of this pattern is that the mediator itself can become unwieldy. An overview of the system can be seen in Figure 2.3.
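A minimal C++ sketch of the pattern, with hypothetical topic names; the actual Mediator routes data between the concrete modules shown in Figure 2.3:

    #include <functional>
    #include <map>
    #include <string>
    #include <vector>

    // Illustrative mediator: modules publish data under a topic and
    // subscribe to topics, so no module needs a reference to any other.
    class Mediator {
    public:
        using Handler = std::function<void(const std::string& payload)>;

        void subscribe(const std::string& topic, Handler h) {
            handlers_[topic].push_back(std::move(h));
        }
        void publish(const std::string& topic, const std::string& payload) {
            for (auto& h : handlers_[topic]) h(payload);
        }
    private:
        std::map<std::string, std::vector<Handler>> handlers_;
    };

    // Usage: the SLAM module publishes a new pose; the planner reacts to
    // it without either module knowing about the other.
    //   Mediator m;
    //   m.subscribe("pose", [](const std::string& p) { /* re-plan from p */ });
    //   m.publish("pose", "x=1.0 y=2.0 theta=0.1");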

In order to make the various tasks independent of the execution times of the other tasks, the system is designed to work concurrently. This means that the four subsystems are executed in individual threads and no task has to wait for another task to finish. If the computer running the system has four processor cores or more, all tasks can be executed at the same time.


Due to the heavy calculations needed for SLAM and planning, these parts are implemented in C++. The Kinect interface is also implemented in C++, since its data structures are first and foremost used for the SLAM. The control sequence of the program, as well as the collision avoidance and motion control, are implemented in Matlab. These tasks have to be performed at a high rate, but since the calculations needed are relatively lightweight, Matlab works well.

Figure 2.3: Component diagram showing an overview of the software. On the Matlab side are the Target Planner, Path Follower and Reactive Obstacle Detection; on the C++ side of the MEX interface are the Mediator, Path Planner, Slammer, Map Builder, Scan Handler Strategy, Scan Match Strategy, Registrator, Sensory Grabber, Kinect and Log Reader modules. The Mediator is the coordinator between all other modules, making them independent of each other. The Sensory Grabber is constantly updated with the latest sensor data, whether read from the Kinects or from a log. The Slammer performs the SLAM process, see Chapters 3, 4 and 5. The Path Planner handles the dynamic path searching, see Section 6.3. The target planning, Section 6.2, and the path following and reactive obstacle detection, Chapter 7, are run in the main Matlab program.


2.3 Run Modes

The software has three modes that control how the calculations are executed:

• Run - This is the live mode, where the robot actually drives around and builds maps in real time. As a result, scans are only processed as fast as the mapping thread is able to. Since this is much slower than 30 frames per second, which is the rate of the Kinects, many scans are lost.

• Playback - In this mode, data is processed offline. This allows for processing of every frame captured by the Kinects. As a consequence, the results produced in this mode are generally the best. However, this mode is limited to separate evaluation of individual subsystems, such as the mapping parts or the planning module. Since the robot does not follow the plans, the complete system cannot be evaluated.

• Simulate - This mode is designed to mimic run mode, but uses recorded datasets instead of live frames. It works by having a separate thread read the datasets at a certain speed, determined by the timestamps on the recordings. This makes the mapping thread miss frames at the same rate as in run mode. The system evaluation properties are the same as for playback.

2.4 Third Party Libraries

The software makes use of third party C++ libraries. Below follows an overview of the different libraries and how they are used in this project.

2.4.1 Mobile Robot Programming Toolkit

The Mobile Robot Programming Toolkit (MRPT) [Blanco-Claraco et al., 2013] is a large collection of algorithms, drivers and tools created to aid the development of robotic software. It includes implementations of algorithms for localization, SLAM, computer vision and motion planning. In addition, it includes facilities for using different sensors, among them the Microsoft Kinect. MRPT is integrated with OctoMap, the Point Cloud Library and Eigen, described in the following sections. For these reasons, MRPT was chosen as a foundation for this project.

2.4.2 OctoMap

OctoMap [Wurm et al., 2010] is an efficient implementation of a 3D occupancy grid. It is based on the octree, a data structure for spatial subdivision in 3D. The octree representation offers several advantages. First and foremost, it is very space efficient, storing only cells that it has some knowledge about and employing a compression step whenever possible. Another advantage is that it does not need to know the extent of the map in advance; instead, the map is dynamically expanded when needed. Additionally, the map is multi-resolution by design. A drawback of OctoMap is the lack of optimization in the current version, but the source code of the project indicates that future releases may be parallelized. The many advantages of OctoMap make it a good candidate for the map implementation in this project.
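A minimal sketch of the stock OctoMap API; note that the dynamic update rules of Chapter 4 are this thesis's extension and are not part of the stock library. The resolution and coordinates below are placeholder values:

    #include <octomap/octomap.h>

    int main() {
        octomap::OcTree tree(0.05);  // 5 cm voxel resolution

        // Integrate one endpoint as occupied and another as free.
        tree.updateNode(octomap::point3d(1.0f, 0.5f, 0.2f), true);
        tree.updateNode(octomap::point3d(2.0f, 0.0f, 0.2f), false);

        // Query the occupancy probability of a voxel.
        octomap::OcTreeNode* n = tree.search(1.0, 0.5, 0.2);
        if (n != nullptr) {
            double p = n->getOccupancy();  // probability in [0, 1]
            (void)p;
        }

        tree.writeBinary("map.bt");  // compact binary serialization
        return 0;
    }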

2.4.3 Point Cloud Library

The Point Cloud Library (PCL) [PCL, 2013] is an open source, cross-platform framework for 2D and 3D image and point cloud processing. It includes algorithms for feature estimation, segmentation, object recognition and much more. In this project, PCL is used for point cloud registration, normal estimation, plane localization and filtering, among other things.

2.4.4 Eigen

Eigen [Eig, 2013] is a C++ library for linear algebra, with an efficient and straightforward interface for matrix and vector calculations.

3 Registration

This chapter introduces the problem of registration and gives an overview of some of the popular algorithms used for 3D matching of range scans. Registration refers to the problem of matching two point cloud scans, a reference scan and a current scan, where the relative pose difference between the two scans is unknown. The objective is to find the transformation between the two scans that minimizes some error metric, e.g. the distance between corresponding points.

3.1 Iterative Closest Point

The Iterative Closest Point algorithm, first described in Besl and McKay [1992], can be summarized as an iterative refinement of the relative pose between corresponding points in two scans. The point correspondences are found using the nearest neighbour criterion with the Euclidean distance metric, between a point in the current scan and all the points in the reference scan. See Algorithm 1 for a general pseudocode formulation of the ICP algorithm.

3.1.1 Limitations

The ICP algorithm has some limitations in its original formulation, some of which have been overcome in other incarnations. The point matching scheme suffers from a fundamental limitation in that the Euclidean nearest neighbour generally does not correspond to the same point on the scanned surface. ICP works well if the distance between the two scans is relatively small. If the two scans are far apart from each other, ICP may converge to a local minimum and not achieve a good result. However, provided a good initial solution, convergence to the global minimum is nonetheless possible.

Algorithm 1 A general formulation of the ICP algorithm in pseudocode, where R is the reference scan, C is the current scan, ε_t is the transformation error tolerance and d(x, y) is a distance metric.

1:  function ICP(R, C, ε_t)
2:    while true do
3:      R′ = ∅
4:      for point c(i) ∈ C do
5:        Find r∗ ∈ R : d(r∗, c(i)) = min_{r∈R} d(r, c(i))
6:        Add r∗ to R′ with index i
7:      end for
8:      Estimate transform T∗ = argmin_T d(T · R′, C)
9:      ε = accumulated error between T∗ · R′ and C
10:     if ε < ε_t then
11:       return T∗
12:     end if
13:   end while
14: end function

Determining the point correspondences has the worst time complexity of the algorithm's steps [Besl and McKay, 1992], and it is also where most of ICP's execution time is spent [Magnusson, 2009]. Because of this, ICP as a whole is quite computationally expensive. Another limitation of the algorithm is that, since it is point based, it does not consider the local shape of the matching surface when determining the transformation, thereby losing a lot of information. An example of registration using ICP is shown in Figure 3.1.
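As an illustration of what such a registration step can look like in practice, the sketch below uses PCL's point-to-point ICP (PCL is the library used for registration in this project); the clouds, the iteration cap and the names are placeholders, not the settings used in the thesis:

    #include <pcl/point_types.h>
    #include <pcl/point_cloud.h>
    #include <pcl/registration/icp.h>

    // Clouds assumed to be filled elsewhere (e.g. from the Kinects).
    pcl::PointCloud<pcl::PointXYZ>::Ptr reference(new pcl::PointCloud<pcl::PointXYZ>);
    pcl::PointCloud<pcl::PointXYZ>::Ptr current(new pcl::PointCloud<pcl::PointXYZ>);

    void registerClouds() {
        pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ> icp;
        icp.setInputSource(current);    // scan to be transformed
        icp.setInputTarget(reference);  // scan to match against
        icp.setMaximumIterations(50);

        pcl::PointCloud<pcl::PointXYZ> aligned;
        icp.align(aligned);  // runs the iterative refinement

        if (icp.hasConverged()) {
            // 4x4 rigid transform taking the current scan onto the reference.
            Eigen::Matrix4f T = icp.getFinalTransformation();
            double fitness = icp.getFitnessScore();  // mean squared distance
            (void)T; (void)fitness;
        }
    }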

3.1.2 Versions of ICP

There exists a plethora of refinements and modifications of the basic ICP algorithm that improve its performance and accuracy. A few notable variants are:

• Published independently soon after the paper by Besl and McKay [1992] was a method by Chen and Medioni [1991] that made use of an alternative distance metric: a point-to-plane distance, which minimizes the difference between a point in the current scan and the tangent plane of the closest point in the reference scan.

• A method by Druon et al. [2006] makes use of colour information to perform the correspondence matching. This addition is designed to improve ICP when the geometric information is not enough to provide a good result.

• To increase robustness, Chetverikov et al. [2002] introduced an outlier rejection scheme to the basic ICP algorithm, naming it the Trimmed ICP algorithm.

Figure 3.1: An illustration of the registration of two point clouds using ICP. (a) The initial, unregistered cloud in yellow, and the reference cloud in light blue. (b) The registered cloud in white, and the reference cloud in light blue. Figure 3.1a shows the two point clouds before registration; note the poor overlap between the clouds. Figure 3.1b shows the result of the registration. The overlap between the clouds is significantly higher after the registration.

4 Map Representation

Historically, map representations can be divided into two categories, metric maps and topological maps [Thrun, 2002], although more recently hybrid approaches that combine metric and topological maps have been formulated [Blanco et al., 2007]. Metric maps represent the world by storing geometrical information about the environment; popular approaches include landmark maps and occupancy grids. Topological maps store the connectivity of different locations of interest, along with information about how to navigate between these locations. One fundamental problem with map representations is the discretization of the continuous world, which can be done in infinitely many ways, resulting in a space of hypothetical maps that is infinite-dimensional.

In order to store three-dimensional maps, this thesis uses the OctoMap framework. OctoMap is based on the theory of occupancy grids, but extends the technique with a third dimension and a compact, efficient way of storing the map, based on octrees. For this thesis, the standard OctoMap framework is extended to handle the mapping of dynamic scenes. This is done by modifying the way the cells in the grid are updated: a different set of update rules is used instead of the standard ones.

This chapter first introduces the basic theory of the occupancy grid before moving on to the method described by Wolf and Sukhatme [2005] for mapping dynamic scenes using an occupancy grid. Finally, OctoMap is introduced, along with the contributions of this thesis to the OctoMap framework.


4.1 Occupancy Grid

The occupancy grid is one of the most popular and widely used map representations. It represents the map as a field of binary random variables arranged in an evenly spaced grid structure. Each cell in the grid corresponds to the level of occupancy of the space that the cell covers in the world. Each cell is then usually classified as either occupied or free, depending on whether it is above or below a certain threshold.

Occupancy grids are most commonly used to describe a plane in the 3D world, similar to a floor plan or blueprint. This representation is usually paired with sensors that capture 2D information, such as a laser range scanner. The occupancy grid makes the strong assumption that the state of the map is static, i.e. nothing is moving in the world. This is a rather coarse approximation that limits the use of the standard occupancy grid formulation to a relatively small set of situations.

More formally, the goal is to calculate the posterior over the map given the data,

$$p(m \mid z_{1:t}, x_{1:t})$$

where $m$ is the map, $z_{1:t}$ is the set of all measurements up to time $t$ and $x_{1:t}$ is the sequence of all robot poses up to time $t$, that is, its path. The map is defined as the finite set of all grid cells,

$$m = \{m_i\}_{i=1}^{N}$$

where $N$ is the number of cells in the map. Each cell has an associated probability of occupancy, i.e. the probability that the cell is occupied rather than free. In order to make probabilistic updating tractable, each grid cell is assumed to be independent of all other cells, making the estimation of the probability of occupancy for a cell a binary estimation problem with a static state. The independence assumption allows cells to be updated individually, so the final posterior over all the cells is the product of the marginal distributions,

$$p(m \mid z_{1:t}, x_{1:t}) = \prod_i p(m_i \mid z_{1:t}, x_{1:t})$$

4.1.1 Updating The Occupancy Grid

The update procedure for the occupancy grid is an incarnation of the binary Bayes filter. The log odds ratio of a binary state $x$ is defined as

$$l(x) = \log o(x) = \log \frac{p(x)}{1 - p(x)}$$

The log odds ratio is favoured over the odds ratio due to its numerical stability and computational efficiency.
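For concreteness, a probability p = 0.7 corresponds to a log odds of l ≈ 0.85, and the update below then becomes a simple addition of such values. A minimal pair of conversion helpers in C++:

    #include <cmath>

    // Log odds of a probability p: l(p) = log(p / (1 - p)).
    double logOdds(double p) { return std::log(p / (1.0 - p)); }

    // The inverse mapping, recovering the probability from a log-odds
    // value; large |l| saturates toward 0 or 1 instead of overflowing.
    double probability(double l) { return 1.0 / (1.0 + std::exp(-l)); }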


The map update equation for the occupancy grid is

$$\log o(p(m^{(i)} \mid z_{1:t}, x_{1:t})) = \log o(p(m^{(i)} \mid z_t, x_{1:t})) + \log o(p(m^{(i)} \mid z_{1:t-1}, x_{1:t})) - \log o(p(m^{(i)})) \tag{4.1}$$

For more details on how (4.1) is derived, see Appendix A.

Using (4.1), the algorithm for updating the occupancy grid is straightforwardly implemented. Algorithm 2 updates the occupancy grid one time step, at time $t$, where $l_0 = \log o(p(m^{(i)}))$ is the map prior, $l_{t-1,i} = \log o(p(m^{(i)} \mid z_{1:t-1}, x_{1:t}))$ is the previous cell value and $l_{t,i} = \log o(p(m^{(i)} \mid z_t, x_{1:t}))$ is the inverse measurement model. The function inverse_sensor_model(m^(i), z_t, x_t) determines whether or not a cell is occupied in the latest measurement, and returns a value $l_{occ} > 0.5$ if the cell is occupied and a value $l_{free} < 0.5$ if the cell is free.

Algorithm 2 Occupancy grid mapping at time t, where z_t are the measurements, x_t is the pose and m is the map.

1:  function UpdateMap(z_t, x_t, m)
2:    for cell m^(i) ∈ m do
3:      if m^(i) is in the perceptual field of z_t then
4:        l_{t-1,i} = get_occupancy(m^(i))
5:        l_{t,i} = l_{t-1,i} + log(inverse_sensor_model(m^(i), z_t, x_t)) − l_0
6:      else
7:        l_{t,i} = l_{t-1,i}
8:      end if
9:    end for
10:   return m
11: end function

4.2 Dynamic Mapping Occupancy Grid

A fundamental assumption made by the standard occupancy grid implementation is a static environment. To bypass this restriction, an approach by Wolf and Sukhatme [2005] is used to handle the mapping of dynamic environments. Additionally, this method is extended to three dimensions when implemented in the OctoMap framework.

This approach is based on keeping two occupancy grids simultaneously, denoted S for the static map and D for the dynamic map. The static map stores information concerning the static parts of the environment, e.g. walls, floors and other structural information. The dynamic map stores information regarding the dynamic parts of the environment, such as people moving in view of the sensor, or a chair that has moved. Put differently, S and D should be disjoint.

An object is classified as dynamic if it has been observed to move, either in the view of the sensors or between two non-contiguous measurements. For example, a chair is observed to be in one place when going down a corridor in one direction. When going back in the opposite direction, the chair is observed again, but now in a different location.

In more formal terms, the problem is to estimate the posteriors of the two maps, i.e.

$$p(S_t \mid z_{1:t}, x_{1:t}) \tag{4.2}$$

and

$$p(D_t \mid z_{1:t}, x_{1:t}) \tag{4.3}$$

where $S_t$ is the static map at time $t$, $D_t$ is the dynamic map at time $t$ and, as usual, $z_{1:t}$ and $x_{1:t}$ are the measurements and poses, respectively, up to time $t$.

The two maps are updated independently, according to a slightly modified update equation compared with the standard occupancy grid formulation presented above. Since the world is no longer assumed static and environmental change is dealt with explicitly, previous knowledge of the environment is incorporated in the update and compared with the measurements, in order to differentiate the static parts of the environment from the dynamic ones. Incorporating this information, (4.2) and (4.3) become

$$p(S_t \mid z_{1:t}, x_{1:t}, S_{t-1}) \tag{4.4}$$

and

$$p(D_t \mid z_{1:t}, x_{1:t}, S_{t-1}) \tag{4.5}$$

Note that incorporating the previous map is only possible because the static map is time invariant; incorporating the previous dynamic map is thus of no use. In addition to being classified as occupied or free, a cell may be classified as unknown. This is needed because once a cell has been deemed static, it cannot be classified as non-static.

4.2.1 Static Map Update

Just as with the standard occupancy grid update (see Appendix A), denoting cell index $i$ at time $t$ with $s_t^{(i)} \in S_t$, Bayes' rule is applied to (4.4), giving

$$p(s_t^{(i)} \mid z_{1:t}, x_{1:t}, S_{t-1}) = \frac{p(z_t \mid z_{1:t-1}, s_t^{(i)}, s_{t-1}^{(i)}, x_{1:t})\, p(s_t^{(i)} \mid z_{1:t-1}, x_{1:t-1}, s_{t-1}^{(i)})}{p(z_t \mid z_{1:t-1}, x_{1:t}, s_{t-1}^{(i)})} \tag{4.6a}$$

$$= \frac{p(z_t \mid s_t^{(i)}, s_{t-1}^{(i)}, x_{1:t})\, p(s_t^{(i)} \mid z_{1:t-1}, x_{1:t-1}, s_{t-1}^{(i)})}{p(z_t \mid z_{1:t-1}, x_{1:t}, s_{t-1}^{(i)})} \tag{4.6b}$$

Applying Bayes' rule again, this time to the first factor of (4.6b), gives

$$p(z_t \mid s_t^{(i)}, s_{t-1}^{(i)}, x_{1:t}) = \frac{p(s_t^{(i)} \mid z_t, s_{t-1}^{(i)}, x_{1:t})\, p(z_t \mid s_{t-1}^{(i)}, x_{1:t})}{p(s_t^{(i)} \mid s_{t-1}^{(i)}, x_{1:t})} \tag{4.7}$$


Table 4.1: Inverse sensor model for the static map.

S_{t-1}     z_t         p(s_t^{(i)} | z_{1:t-1}, x_{1:t}, S_{t-1})
Free        Free        Low
Unknown     Free        Low
Occupied    Free        Low
Free        Occupied    Low
Unknown     Occupied    High
Occupied    Occupied    High

Inserting (4.7) into (4.6b) results in

$$p(s_t^{(i)} \mid z_{1:t}, x_{1:t}, S_{t-1}) = \frac{p(s_t^{(i)} \mid z_t, s_{t-1}^{(i)}, x_{1:t})\, p(z_t \mid s_{t-1}^{(i)}, x_{1:t})\, p(s_t^{(i)} \mid z_{1:t-1}, x_{1:t-1}, s_{t-1}^{(i)})}{p(s_t^{(i)} \mid s_{t-1}^{(i)}, x_{1:t})\, p(z_t \mid z_{1:t-1}, x_{1:t}, s_{t-1}^{(i)})} \tag{4.8}$$

Applying the same procedure to $p(\lnot s_t^{(i)} \mid z_{1:t}, x_{1:t}, S_{t-1})$ gives

$$p(\lnot s_t^{(i)} \mid z_{1:t}, x_{1:t}, S_{t-1}) = \frac{p(\lnot s_t^{(i)} \mid z_t, s_{t-1}^{(i)}, x_{1:t})\, p(z_t \mid s_{t-1}^{(i)}, x_{1:t})\, p(\lnot s_t^{(i)} \mid z_{1:t-1}, x_{1:t-1}, s_{t-1}^{(i)})}{p(\lnot s_t^{(i)} \mid s_{t-1}^{(i)}, x_{1:t})\, p(z_t \mid z_{1:t-1}, x_{1:t}, s_{t-1}^{(i)})} \tag{4.9}$$

Taking the odds of $p(s_t^{(i)} \mid z_{1:t}, x_{1:t}, S_{t-1})$, i.e. dividing (4.8) by (4.9), and using the independence of $s_t^{(i)}$ from $x_{1:t}$ gives

$$\frac{p(s_t^{(i)} \mid z_{1:t}, x_{1:t}, S_{t-1})}{p(\lnot s_t^{(i)} \mid z_{1:t}, x_{1:t}, S_{t-1})} = \frac{p(s_t^{(i)} \mid z_t, s_{t-1}^{(i)}, x_{1:t})\, p(s_t^{(i)} \mid z_{1:t-1}, x_{1:t-1}, s_{t-1}^{(i)})\, p(\lnot s_t^{(i)} \mid s_{t-1}^{(i)})}{p(\lnot s_t^{(i)} \mid z_t, s_{t-1}^{(i)}, x_{1:t})\, p(\lnot s_t^{(i)} \mid z_{1:t-1}, x_{1:t-1}, s_{t-1}^{(i)})\, p(s_t^{(i)} \mid s_{t-1}^{(i)})} \tag{4.10}$$

Finally, cancelling some terms and taking the logarithm gives

$$\log o(p(s_t^{(i)} \mid z_{1:t}, x_{1:t}, S_{t-1})) = \log o(p(s_t^{(i)} \mid z_t, x_{1:t}, S_{t-1})) + \log o(p(s_t^{(i)} \mid z_{1:t-1}, x_{1:t}, S_{t-1})) - \log o(p(s_t^{(i)} \mid S_{t-1})) \tag{4.11}$$

Hence, the algorithm for updating the static map is the standard occupancy grid update, Algorithm 2, except for the way the inverse sensor model is computed. Instead of returning a value based only on whether or not a cell is observed to be free, it returns a low or high value (less or greater than 0.5) according to Table 4.1.


Table 4.2: Inverse sensor model for the dynamic map.

S_{t-1}     z_t         p(d_t^{(i)} | z_{1:t-1}, x_{1:t}, S_{t-1})
Free        Free        Low
Unknown     Free        Low
Occupied    Free        Low
Free        Occupied    High
Unknown     Occupied    Low
Occupied    Occupied    Low

4.2.2 Dynamic Map Update

The dynamic map update is derived analogously to the static map update. Denoting the cell with index $i$ at time $t$ with $d_t^{(i)} \in D_t$, the final map update becomes

$$\log o(p(d_t^{(i)} \mid z_{1:t}, x_{1:t}, S_{t-1})) = \log o(p(d_t^{(i)} \mid z_t, x_{1:t}, S_{t-1})) + \log o(p(d_t^{(i)} \mid z_{1:t-1}, x_{1:t}, S_{t-1})) - \log o(p(d_t^{(i)} \mid S_{t-1})) \tag{4.12}$$

The dynamic map is updated in the same manner as the static map, according to Algorithm 2, but with a different table of inverse sensor model update values, Table 4.2.
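The two tables can be read as a pair of small lookup functions. The following is a hypothetical C++ rendering; the actual Low/High probabilities are tuning parameters, and the values used here are made up:

    enum class CellState { Free, Unknown, Occupied };

    // Assumed placeholder probabilities for "Low" and "High".
    constexpr double kLow = 0.3, kHigh = 0.7;

    // Static map inverse sensor model (Table 4.1): a cell only counts as
    // static if it is observed occupied and was not previously known free.
    double staticModel(CellState prev, CellState meas) {
        if (meas == CellState::Occupied && prev != CellState::Free) return kHigh;
        return kLow;
    }

    // Dynamic map inverse sensor model (Table 4.2): a cell is dynamic only
    // if something is observed where the static map says space was free.
    double dynamicModel(CellState prev, CellState meas) {
        if (meas == CellState::Occupied && prev == CellState::Free) return kHigh;
        return kLow;
    }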

4.3 Map Implementation

The occupancy grid implementation used in this thesis is the OctoMap framework and library, described in Section 2.4.2. OctoMap is a 3D mapping library, and thus the cells in its occupancy grid are three-dimensional. These cells are commonly referred to as voxels, volumetric pixels, where each voxel represents a volume in 3D space. As the probabilistic updating of the OctoMap is, in theory, identical to that of the occupancy grid, replacing its update rule with the update rule described in Section 4.2 is easily done. For this thesis, the OctoMap framework is extended to support the update rules from the previous section, allowing for mapping of dynamic scenes in 3D. A sample image of an OctoMap can be seen in Figure 4.1.

Figure 4.1: A sample OctoMap seen from above. The light blue voxels are occupied, the red voxels are unknown and the free voxels are transparent.

5 Simultaneous Localization and Mapping

This chapter presents the main loop of the SLAM algorithm, which utilizes the previously presented algorithms to perform simultaneous localization and mapping. The algorithm can be broken down into two main steps: processing of the actions received from the odometry and IMU, and processing of the 3D scans received from the Kinects.

5.1 Measurements

The measurements available to the system include:

• Odometric information from the wheels of the Segway, i.e. how far its wheels have turned.

• Angular velocity from the IMU.

• 3D scans from the two Kinects.

This information is integrated and fused over time to localize the robot as well as to produce a 3D map.

The Kinect measurements are composed of a set of 3D points that form a point cloud. The points are triangulated from the depth images taken by the two Kinects. When triangulating the 3D points, the pose of each Kinect needs to be taken into account, as their respective coordinate systems do not coincide with the Segway's coordinate system. Their poses with respect to the Segway are estimated using the calibration procedure described in Chapter 8. Mathematically, this can be thought of as applying a rigid transformation to the points after they have been triangulated in the Kinects' coordinate systems.
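Applying such a calibrated pose is a one-line rigid transform with Eigen (Section 2.4.4). In the sketch below, the angle and offset are placeholder values; the real values come from the calibration in Chapter 8:

    #include <Eigen/Geometry>

    // Transform a point from a Kinect's coordinate system into the
    // Segway's, given the Kinect's (assumed) calibrated pose.
    Eigen::Vector3f kinectToSegway(const Eigen::Vector3f& p_kinect) {
        Eigen::Isometry3f T = Eigen::Isometry3f::Identity();
        T.rotate(Eigen::AngleAxisf(0.1f, Eigen::Vector3f::UnitZ()));  // yaw
        T.pretranslate(Eigen::Vector3f(0.0f, 0.0f, 0.6f));            // mount height
        return T * p_kinect;  // point expressed in the Segway frame
    }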

5.2 Motion Model

The robot platform is restricted to operate in a planar environment, and the kinematic state can therefore be described by three variables: x, y and θ, where the former two are Cartesian coordinates positioning the robot in relation to an external frame and the latter is the angular orientation of the robot in the same frame. The state vector (x, y, θ)^T is hereafter referred to as the pose of the robot.

5.2.1 Differential Drive

The Segway RMP50 is a differential wheeled robot, i.e. its movement is based on two separately driven wheels placed on either side of its body. The direction is changed by varying the relative speed of the wheels. A differential wheeled robot moves around a point along the wheel axis called the ICC, the Instantaneous Center of Curvature, at radius r, see Figure 5.1. With completely linear motion, i.e. no rotation whatsoever, the ICC is located infinitely far away [Hellström, 2011].

Figure 5.1: Illustration of the differential drive motion model. ICC is the instantaneous center of curvature and r is the distance between the ICC and the robot's centre. d is the distance between the two wheels. v is the translational velocity of the robot, while ω is the angular velocity. v_l and v_r are the speeds of the wheels relative to the ground.

The Segway RMP50 is, like many other commercial robots, controlled by providing the translational and angular velocities v and ω. From this input, the robot's current and future poses can be estimated using the relationship

$$v = \omega \cdot r \tag{5.1}$$

which describes motion along an arc of varying radius. By integrating the velocities over time, the robot's path can be estimated.

5.2.2 Odometry Motion Model

While the translational and angular velocities act as control signals to the robot, their accuracy might not be as high as desired for good pose estimation in a SLAM algorithm. Higher accuracy can often be obtained in practice by measuring the revolutions of the robot's wheels and using IMU measurements [Thrun et al., 2005]. By inserting the angular velocity ω, the radius r to the ICC and the wheel separation d into (5.1), the velocity of the right wheel v_r and of the left wheel v_l are

$$v_r = \omega\left(r + \frac{d}{2}\right) = v + \frac{d}{2}\,\omega \tag{5.2a}$$

$$v_l = \omega\left(r - \frac{d}{2}\right) = v - \frac{d}{2}\,\omega \tag{5.2b}$$

Solving for the robot's velocities v and ω and the radius r gives

$$v = \frac{v_r + v_l}{2} \tag{5.3a}$$

$$\omega = \frac{v_r - v_l}{d} \tag{5.3b}$$

$$r = \frac{d}{2}\cdot\frac{v_r + v_l}{v_r - v_l} \tag{5.3c}$$

Since v_r and v_l can be obtained from the wheel encoders, the pose can be estimated.
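Recovering the body velocities from the wheel speeds, (5.3a) and (5.3b), is a two-line computation; a small illustrative C++ helper (names not from the thesis code):

    // Body velocities from wheel speeds, per (5.3); d is the wheel separation.
    struct BodyVelocity {
        double v;      // translational velocity [m/s]
        double omega;  // angular velocity [rad/s]
    };

    BodyVelocity fromWheelSpeeds(double vr, double vl, double d) {
        return { (vr + vl) / 2.0, (vr - vl) / d };
    }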

By using a series of angular and translational velocities, the pose change can be estimated by simulating the movement according to Algorithm 3. The algorithm is designed for angular velocities from an IMU with a higher measurement rate than the translational velocities from odometry. By iterating over all angular velocity measurements and updating the pose, a smooth curve is generated.

5.3 SLAM Algorithm

Before being fed into the SLAM algorithm, a two-phase preprocessing step is applied to the point clouds. This is done to remove dynamic objects from the scans and to compensate for the varying point density between objects that are close and objects that are far away. First, a voxel grid filter is applied to the point clouds, downsampling each cloud so that it has only one point per voxel, i.e. it is of uniform point density. Second, by utilizing the information present in the dynamic map, dynamic objects in the new scan can be removed so that they do not affect the result of the registration. This is done by iterating over all the points in the scan and removing points that fall within a voxel that is occupied in the dynamic map.
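The first phase corresponds directly to PCL's VoxelGrid filter (Section 2.4.3). A sketch, with an assumed 5 cm leaf size:

    #include <pcl/point_types.h>
    #include <pcl/point_cloud.h>
    #include <pcl/filters/voxel_grid.h>

    // Downsample a cloud so that each occupied voxel is reduced to its
    // centroid, giving roughly uniform point density.
    pcl::PointCloud<pcl::PointXYZ>::Ptr
    downsample(const pcl::PointCloud<pcl::PointXYZ>::ConstPtr& cloud) {
        pcl::PointCloud<pcl::PointXYZ>::Ptr filtered(new pcl::PointCloud<pcl::PointXYZ>);
        pcl::VoxelGrid<pcl::PointXYZ> grid;
        grid.setInputCloud(cloud);
        grid.setLeafSize(0.05f, 0.05f, 0.05f);  // voxel edge length in metres
        grid.filter(*filtered);
        return filtered;
    }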

Algorithm 3 The Differential Drive Simulation algorithm, where Ω is a sequence of angular velocities, t_total is the time since the previous simulation, o_l and o_r are odometry measurements for the left and right wheel and K_l and K_r are conversion factors from wheel encoder tics to m/s.

1:  function DriveSimulation(Ω, t_total, o_l, o_r, K_l, K_r)
2:    (x, y, θ)^T = 0
3:    numsteps = length(Ω)
4:    t_step = t_total / numsteps
5:    d_step = (K_l · o_l + K_r · o_r) / (2 · numsteps)
6:    for each angular velocity ω ∈ Ω do
7:      θ_step = ω · t_step
8:      x += cos(θ + θ_step) · d_step
9:      y += sin(θ + θ_step) · d_step
10:     θ += θ_step
11:   end for
12:   return (x, y, θ)^T
13: end function
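A direct C++ transcription of Algorithm 3 (illustrative, not the thesis code), using the convention that the heading is advanced before each position update:

    #include <cmath>
    #include <vector>

    struct Pose2D { double x = 0, y = 0, theta = 0; };

    // Integrate IMU angular velocities over the odometry distance.
    Pose2D driveSimulation(const std::vector<double>& omega, double t_total,
                           double o_l, double o_r, double K_l, double K_r) {
        Pose2D p;
        const std::size_t n = omega.size();
        if (n == 0) return p;                              // nothing to integrate
        const double t_step = t_total / n;                 // time per IMU sample
        const double d_step = (K_l * o_l + K_r * o_r) / (2.0 * n);
        for (double w : omega) {
            p.theta += w * t_step;                         // update heading first
            p.x += std::cos(p.theta) * d_step;
            p.y += std::sin(p.theta) * d_step;
        }
        return p;
    }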

Although the first preprocessing step potentially reduces the accuracy of the registration, it is necessary in order to reduce the computational demand of the registration, thus reducing the synchronization problem described below.

The robot's hardware interfaces do not provide timestamps for the data gathered from the different sensors, so there is no synchronization. This is problematic, since the Kinects have a latency of approximately 100 ms while the IMU and odometers have hardly any latency at all. If the robot makes any sudden movements, the odometry and IMU will have started registering the movement and changing the pose of the robot while the Kinects still have not sensed any movement. Results that show the synchronization problem can be seen in Section 5.5.

Algorithm 4 performs one time step's localization and mapping update, at time t. Lines 2 through 4 simply add the accumulated pose increments to the current pose. Line 5 merges the two Kinects' scans and performs the preprocessing steps. SelectReferenceScan() on line 6 selects the target for the registration algorithm. Depending on the desired characteristics, the target can be chosen to be the cloud most recently inserted into the map or the map itself, or the approaches can be combined. Since the map has a fixed resolution, and therefore a fixed distance between its points, the accuracy is lower when matching with the map. On the other hand, matching with the previous cloud does not allow the robot to localize itself within the map, as only the relative transformation between the current and most recent pose is estimated. If the registration matches poorly and the IMU or odometers drift at the same time, the robot will lose track of its pose and any insertions into the map will be incorrect from that point on. The combined strategy cures some of the ailments of the other strategies: first, attempt to match with the map, and if it succeeds, insert the observations into the map; if the registration fails, try to match with the previous point cloud instead, and then insert if that registration succeeded.

Line 7 performs the actual registration between the two scans and saves the result. Lines 8 to 10 update the current pose with the registration result if the registration converged well enough. Finally, the map is updated with the new observations O, taken from the corrected pose p_t.

Algorithm 4 The SLAM algorithm with edge cases and some data flow removed for a simpler overview. Here, A is a sequence of pose changes, O is a set of 3D observations, p_t is the robot's current pose, m^S_t is the static map at time t and m^D_t is the dynamic map at time t.

1:  function SLAM(A, O, p_t, m^S_t, m^D_t)
2:      for each pose increment a ∈ A do
3:          increment p_t with a
4:      end for
5:      C = PreprocessScans(O, p_t)
6:      R = SelectReferenceScan()
7:      PerformRegistration(R, C)
8:      if registration is good enough then
9:          update p_t with the corrected pose
10:     end if
11:     update m^S_t and m^D_t with O and p_t
12:     return [p_t, m^S_t, m^D_t]
13: end function
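In Python form, one step of Algorithm 4 might look as follows. This is a sketch only: the pose-composition operator and the helper callables (preprocess, select_reference, register) are assumed interfaces, not code from the thesis.

def slam_step(pose_increments, observations, pose, static_map, dynamic_map,
              preprocess, select_reference, register):
    # Lines 2-4: dead-reckoned pose prediction from odometry and IMU.
    for a in pose_increments:
        pose = pose.compose(a)  # assumed pose-composition operator
    # Line 5: merge, downsample and dynamic-filter the Kinect scans.
    cloud = preprocess(observations, pose)
    # Lines 6-10: register against the chosen reference and accept the
    # corrected pose only if the registration converged well enough.
    corrected, ok = register(select_reference(), cloud)
    if ok:
        pose = corrected
    # Line 11: insert the observations into both maps from the pose.
    static_map.update(observations, pose)
    dynamic_map.update(observations, pose)
    return pose, static_map, dynamic_map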

5.4 Results with Dynamic Mapping in 3D

There exists a set of scenarios that are relevant to a mapping robot attempting to map its surroundings, and these scenarios are reflected in the datasets used for the evaluation. In the following section, the term "regular update" refers to the standard occupancy grid update, which is not designed to handle dynamic scenes. The term "dynamic update" refers to the update presented in Section 4.2, i.e. the map update designed to handle dynamic scenes.

The following datasets are used to test the proficiency of the map building when dynamic objects are present.

1. One person changes position out of view - A person is standing by the window when the robot, which is manually steered, first passes by. When the robot returns to the same place, the person has moved and sat down on a nearby bench.


2. One person moving in view of a stationary robot - The robot is stationary while a person initially stands still, then moves to an adjacent position before moving back again, all in view of the Kinects.

3. Simulated crowd moving in front of a stationary robot - The robot is stationary, facing two people who move in front of it carrying rectangular, 1.5 by 1 m, planar screens in order to simulate a large crowd passing in front of the robot. The point of this is to show that the information provided by the dynamic map can be utilized to remove points belonging to dynamic objects from the measurements used for the registration, thus reducing the risk of the registration diverging, which may occur when a large crowd or large objects move uniformly in front of the robot.

The results from dataset 1, using the dynamic update and running in simulate mode (see Section 2.3), can be seen in Figures 5.2 and 5.3. The 2D occupancy grids in the figures are projected from the 3D occupancy grids, as described in Section 6.1. Figure 5.2 shows the state of the maps about halfway through the robot's trip. Here, a person standing still along the wall to the left has been misclassified as a static object and hence is present in the static map. Figure 5.3 shows the final state of the map when the robot's trip is complete. Since the person has moved from their initial position, they are removed from the static map. The person is instead detected in previously free space, causing them to be classified as a dynamic object and placed in the dynamic map.
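The projection itself is described in Section 6.1; as a rough illustration only, one plausible rule marks a 2D cell as occupied if any voxel in its vertical column is occupied. The sketch below implements that rule and is not necessarily the exact projection used in the thesis.

import numpy as np

def project_to_2d(occupied_voxels, grid_shape):
    # occupied_voxels: iterable of (i, j, k) indices of occupied voxels,
    # assumed to lie within grid_shape; grid_shape: (rows, cols).
    grid = np.zeros(grid_shape, dtype=bool)
    for i, j, _ in occupied_voxels:
        grid[i, j] = True  # any occupied voxel in the column marks the cell
    return grid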

A cluster of voxels from the person remains in the static map in Figure 5.3b. This is because the top of the person is outside the field of view of the Kinects as the robot moves towards its final position, so those voxels are never re-observed and cleared.

The results show the capability of the system to identify parts of the map that were previously thought to be static and reclassify them as dynamic as more information is gained, resulting in a more correct map of the static environment.

The second dataset addresses a different scenario than the first. The results presented in Figures 5.4 and 5.5 show the difference between the maps produced with and without mapping of dynamic scenes enabled, that is, between the regular occupancy grid update and the update presented in Section 4.2. The results show a person standing still in front of a desk, to the left. The person is detected as a static object and inserted into the static map. The person then moves to the right and is moved into the dynamic map by the dynamic update, while still being classified as static in the map that uses the regular update.

Datasets 1 and 2 show that the system is robust against dynamic objects present in the scene while mapping. Compared to the regular occupancy grid update, the dynamic update produces a more correct map of the static environment. The dynamic update handles well both the scenario of an object moving out of the robot's view, as in dataset 1, and that of objects moving in the robot's view, as in dataset 2.


(a) The occupancy grid after the initial viewing, with both the dynamic and static maps projected.

(b) The dynamic and static 3D maps after the initial view. The map on the left is the dynamic map and the one on the right is the static map. The path of the robot is shown in yellow and its current position in bright red.

Figure 5.2: Results from dynamic objects dataset 1, using the dynamic update. A person, marked by the orange box, is initially falsely detected as a static object and is inserted into the static map. The object can also be seen in Figure 5.2a.


(a) The occupancy grid after the second view, with both the dynamic and static maps projected.

(b) The dynamic and static 3D maps after the second view. The map on the left is the dynamic map and the one on the right is the static map. The path of the robot is shown in yellow.

Figure 5.3: Results from dynamic objects dataset 1, using the dynamic update, some time after the maps in Figure 5.2 were created. Here, the robot has returned to scan a previously scanned area, but the person, marked with an orange box, has moved and is now sitting on a bench on the other side of the room. As a result, the person has been removed from the static map and now appears in the dynamic map instead.


(a) The initial view of the static map with the regular update, projected to an occupancy grid.

(b) The initial view of the static map with the dynamic update, projected to an occupancy grid.

(c) The initial view of the dynamic and static maps, using the dynamic update.

(d) The initial view of the occupancy grid map with the regular update function.

Figure 5.4: Results from dynamic objects dataset 2. A person, marked by an orange box, is initially standing still; the person is detected as a static object and is inserted into the static map.


(a) The static map, with the regular update, projected as an occupancy grid after the person has moved.

(b) The static map, with the dynamic update, projected as an occupancy grid after the person has moved.

(c) The dynamic and static maps after the person has moved, using the dynamic update.

(d) The regular update occupancy grid map after the person has moved.

Figure 5.5: Results from dynamic objects dataset 2, some time after the maps in Figure 5.4 were created. The person, marked by an orange box, has moved to an adjacent position. With the regular occupancy grid update, the person is still in the map and is detected as a static object. However, with the dynamic update, the person has been detected as a dynamic object and is thus in the dynamic map and not in the static map.


The third dataset is designed to illustrate what happens if a large group of people passes in front of the robot, both with the regular update rule and with the dynamic update. As the robot remained completely still during the experiment, any movement registered during the experiment is due to the registration wrongfully correcting the robot's pose. Therefore, the total movement of the robot can be used as a measure of the error. These values are presented in Table 5.1.

(a) Regular update, 2D. (b) Regular update, 3D.

(c) Dynamic update, 2D. (d) Dynamic update, 3D.

Figure 5.6: Results from dynamic objects dataset 3. This dataset was recorded with a stationary robot and multiple people moving in the scene. (a) and (b) show maps created with registration activated and the regular map update method, while (c) and (d) use registration and the dynamic map update, along with filtering out points that have been determined to be dynamic in the dynamic map.

The results from the third dataset demonstrate the advantages of the dynamic map: by very simple means, it provides information that greatly improves the quality of the registration and, by extension, of the map, by removing objects that are present in the dynamic map. Table 5.1 gives numerical values for the errors, where it is clear that the dynamic update produced errors up to an order of magnitude smaller in the x direction and heading angle compared to the regular update.
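One way to compute such an error measure from a logged pose trajectory is sketched below, accumulating the absolute pose changes of the supposedly stationary robot. The logging format is an assumption, and the thesis may report the error differently (for example as the final displacement).

import math

def stationarity_error(poses):
    # poses: list of (x, y, theta) estimates over the experiment. Any
    # accumulated change stems from wrongful registration corrections.
    ex = ey = eth = 0.0
    for (x0, y0, th0), (x1, y1, th1) in zip(poses, poses[1:]):
        ex += abs(x1 - x0)
        ey += abs(y1 - y0)
        # Wrap the heading difference to (-pi, pi] before accumulating.
        d = (th1 - th0 + math.pi) % (2.0 * math.pi) - math.pi
        eth += abs(d)
    return ex, ey, eth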
