
Institutionen för systemteknik
Department of Electrical Engineering

Examensarbete (Master's Thesis)

3D camera with built-in velocity measurement

Master's thesis carried out in Computer Vision at the Institute of Technology, Linköping University

by

Mattias Josefsson

LiTH-ISY-EX--11/4523--SE

Linköping 2011

3D camera with built-in velocity measurement

Master's thesis carried out in Computer Vision at the Institute of Technology, Linköping University

by

Mattias Josefsson

LiTH-ISY-EX--11/4523--SE

Supervisors: Henrik Turbell (industrial supervisor) and Kristoffer Öfjäll (ISY, Linköpings universitet)
Examiner: Klas Nordberg (ISY, Linköpings universitet)

Avdelning, Institution / Division, Department:
Computer Vision Laboratory, Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden

Datum / Date: 2011-11-10
Språk / Language: Engelska / English
Rapporttyp / Report category: Examensarbete (Master's thesis)
URL för elektronisk version: http://www.ep.liu.se, http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-68632
ISRN: LiTH-ISY-EX--11/4523--SE

Titel / Title: 3D-kamera med inbyggd hastighetsmätning / 3D camera with built-in velocity measurement
Författare / Author: Mattias Josefsson

Nyckelord / Keywords: computer vision, motion estimation, optical flow, feature detection, corner detection, edge detection, velocity measurement, encoder-free laser triangulation


Abstract

In today’s industry 3D cameras are often used to inspect products. The camera produces both a 3D model and an intensity image by capturing a series of profiles of the object using laser triangulation. In many of these setups a physical encoder is attached to, for example, the conveyor belt that the product is travelling on. The encoder is used to get an accurate reading of the speed that the product has when it passes through the laser. Without this, the output image from the camera can be distorted due to a variation in velocity.

In this master's thesis a method for integrating the functionality of this physical encoder into the software of the camera is proposed. The object is scanned together with a pattern; with the help of this pattern the velocity can be determined and the object restored to its original proportions.

Sammanfattning

In today's industry, 3D cameras are often used to inspect products. The camera produces a 3D model and an intensity image by assembling a series of profile images of the object, obtained through laser triangulation. In many of these setups a physical encoder is used that reflects the speed of, for example, the conveyor belt on which the product lies. Without this encoder, the image captured by the camera can become distorted due to variations in velocity.

In this thesis a method is presented for integrating the functionality of the encoder into the camera's software. This requires a pattern to be placed alongside the object that is to be scanned. The pattern is located in the image captured by the camera, and with its help the velocity can be determined and the object's correct proportions restored.


Acknowledgments

A lot of people have supported me during this thesis, and to some of them I am especially grateful for all their help.

First of all I would like to thank my supervisor Henrik Turbell at the company, for always being there for me and for giving me the opportunity to do this master's thesis.

My supervisor at the university, Kristoffer Öfjäll, for the quick responses to all the questions I have had regarding the thesis.

Finally Klas Nordberg, my examiner, for introducing and educating me in the field of computer vision.


Contents

1 Introduction
  1.1 Background
  1.2 Thesis Objectives
  1.3 Disposition
  1.4 Notation and Symbols
  1.5 Definitions and Abbreviations

2 Background Theory and Related Research
  2.1 Introduction
  2.2 Features
  2.3 Corner Detection
    2.3.1 Moravec Operator
    2.3.2 Harris Corner Detector
  2.4 Directional NMS
  2.5 Edge Detection
  2.6 Laser Triangulation
    2.6.1 Occlusion
    2.6.2 Reflectivity
  2.7 Least Squares Minimization
  2.8 Related Research
    2.8.1 Optical Flow
    2.8.2 Feature Based Motion Estimation

3 Proposed Method
  3.1 Introduction
  3.2 Patterns
  3.3 Finding Feature Points
  3.4 Feature Mapping
  3.5 Edge Detection
  3.6 Polynomial Edge Fitting and Sub-pixel Detection of Corners
  3.7 Real World Lengths
  3.8 Position of Intermediate Profiles
  3.9 Restoring the Image

4 Experiments
  4.1 Introduction
  4.2 Accuracy Measurement
    4.2.1 Generation of Synthetic Data
    4.2.2 Corner Detection
    4.2.3 Feature Mapping
    4.2.4 Edge Detection
    4.2.5 Polynomial Edge Fitting and Sub-pixel Detection of Corners
  4.3 Position of Intermediate Profiles
  4.4 Restoring the Image and Error Measurements
  4.5 Skewed Pattern or Laser
  4.6 Maximum and Minimum Velocities when Scanning

5 Results and Discussion
  5.1 Introduction
  5.2 Experiments
  5.3 Conclusion
  5.4 Future work

Bibliography

Chapter 1

Introduction

1.1 Background

The use of 3D cameras has grown over the years, and in today's industry they have become standard equipment in the manufacturing process. They are used for verifying that packages are intact, checking whether a unit is missing some part from an earlier assembly stage, or even sorting mail. These are just a few examples of all the possible areas of use. Most 3D cameras use laser triangulation to measure objects. The camera captures slices of the object where the laser hits, called profiles. Movement is therefore required to collect a series of these profiles that represents the whole object. Because of this the 3D camera is usually used in conjunction with a conveyor belt in an industrial environment. One of the most common setups uses an external encoder that sends a series of pulses reflecting the speed of the conveyor belt. Without this encoder the proportions of the scan can become distorted. With the velocity data provided by the encoder, the measured points can easily be transformed into a calibrated (x, y, z)-point cloud and displayed as a 3D model.

1.2 Thesis Objectives

The objective of this thesis is to investigate and develop a MATLAB script that can determine the velocity of an object being scanned. The script will not use an external encoder; instead it uses only the data gathered from the 3D camera. In other words, it integrates the functionality of the physical encoder into the software of the camera. To make the work more feasible, the area of usage is focused on the cases where the objects being scanned reside on an assembly plate. One might ask why this is desirable: why is the physical encoder an issue? Since it is an extra component that needs to be bought or manufactured and assembled with the rest of the necessary parts, it is an issue of cost and time. Integrating the two would make the camera more versatile and also a more desirable product. Though this method may not be as accurate as using the physical encoder, it is still highly suitable for demonstration and laboratory environments where very high measurement precision is not required.

The main approach is to place a suitable pattern made of a sturdy material on the side of the plate; when scanned, the camera will generate a 2D intensity image containing the pattern. This intensity image should not be confused with the range image used to create the actual 3D model. In the intensity image the pattern should be detected and measured with high precision, so that it is possible to accurately determine how much it has been distorted compared to the actual pattern. With the distortion known, the velocity can be calculated and the proportions restored. The tasks of this thesis are listed below.

• Position and stretch estimation of the pattern
• Sub-pixel precision
• Determine the location of all the scanned profiles
• Investigate how high and how low velocities can be handled
• Investigate how a skewed laser or pattern affects the result
• Test different patterns
• Determine one or several error measurements
• Construct a demonstrator that works in real time with the camera

1.3 Disposition

This thesis is divided into several chapters, which are briefly described below.

1. Introduction: This chapter provides the reader with some background information on the problem and the purpose of this thesis. Some notations and abbreviations that are used throughout the document are also presented.

2. Background Theory and Related Research: The second chapter explains the necessary theory behind the methods and algorithms used throughout the thesis. It then continues with some related research and commonly used methods for motion estimation, and derivations of these.

3. Proposed Method: In the third chapter the details of the methods used and the steps taken in the thesis work are explained.

4. Experiments: The fourth chapter is of a more practical nature and describes the experiments that were conducted. The chapter starts by explaining the experiments done on synthetic data, e.g. a generated pattern without noise. Thereafter experiments with real data from the 3D camera are presented.


5. Results and Discussion: In this final chapter there is a summary of the results of both the synthetic and real experiments. Were the results as expected? Did they differ? How much and why? It is also discussed why certain methods were used over others, what could have been done differently, and some suggestions are given on what can be done in the future regarding the subject.

1.4 Notation and Symbols

A, [x, y]    Matrix
A^T          Transpose of the matrix A
f, F         Functions
(x, y)       Image coordinates
X            Real world coordinates

1.5 Definitions and Abbreviations

MO           Moravec Operator
NMS          Non-maxima suppression
det(A)       Determinant of the matrix A
trace(A)     Sum of the elements on the main diagonal of the matrix A
∇f(x, y)     Gradient of f(x, y)
arg(x, y)    The angle in radians from the x axis to the point (x, y)
Profile      A column of pixels retrieved from the camera


Chapter 2

Background Theory and Related Research

2.1 Introduction

Before going into the proposed method for this thesis, it is necessary to know some of the basic theory and methods used.

In this chapter several topics will be covered, such as features, corner and edge detection, non-maxima suppression, laser triangulation, least squares minimization, and research regarding motion estimation.

2.2 Features

There is no universal definition of what a feature is, since it depends on the application. A feature is in general defined as a part of an image that is interesting for the problem at hand. In computer vision it is often important to be able to find something specific, e.g. features in images, and for example track them across a series of images. Another area of use is to extract data that can be related to real world dimensions. With that said, a feature can be anything distinct in an image, like corners, edges, textures or even whole objects. In this thesis features such as corners and edges are of great interest, and the detection of these is discussed in the following sections.

2.3 Corner Detection

Corners are intuitively defined as points of high curvature on region boundaries, according to [1]. They are widely used as feature points because of their invariance to rotation, translation and, to some extent, even scale. Corners are easier and better to track than lines when there is an issue of correspondence; this is due to the so-called aperture problem. If a moving line is observed through a small aperture, only the motion perpendicular to the line can be measured. There is no way to know if there is any movement in the parallel direction, since neither the start nor the end point can be observed. The motion of corners, on the other hand, can easily be detected since they do not suffer from the aperture problem. This is illustrated in Figure 2.1.

Figure 2.1. The aperture problem: (a) ambiguity of lines; (b) unambiguity of corners.

Corner detectors take advantage of the fact that the direction of the gradient at the tip of a corner is ambiguous. Other definitions of corners are described in [2]. To mention some: a pixel with a local neighbourhood containing two distinct and different edge directions, or a local intensity extremum. It can also be line endings or fast changes in the curvature.

2.3.1 Moravec Operator

One of the first and simplest of the corner detectors is the Moravec Operator (MO) [3]. It considers a local patch in an image f and checks the average intensity variation when the patch is shifted a small amount in eight specific directions. There are basically three cases that can occur.

1. If the intensity in the patch is approximately constant, this will result in little variation when the window is shifted.

2. If the patch resides on an edge, then shifts parallel to the edge will generate a small change in intensity while shifts perpendicular to the edge will result in a much larger change.

3. If the patch is covering a corner or an isolated point, then shifts in any direction will result in a large change in intensity. Thus, the corner can be detected by finding the minimum change generated by the shifts: the intensity of the pixels surrounding the corner will vary much more than that of the corner pixel.

From the cases above the MO can be written as

MO(i, j) = \frac{1}{8} \sum_{k=i-1}^{i+1} \sum_{l=j-1}^{j+1} \left( f(k, l) - f(i, j) \right)^2    (2.1)


where i, j and k, l are the coordinates of the inspected pixel and of the eight surrounding pixels, respectively.
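Equation (2.1) translates almost directly into code. Below is a minimal sketch in Python/NumPy (the thesis implementation was written in MATLAB); the function name is illustrative, and the wrap-around border handling via np.roll is a simplification at the image edges.

import numpy as np

def moravec_response(f):
    """Simplified Moravec response, equation (2.1): for each pixel,
    the average squared difference to its eight surrounding pixels."""
    f = f.astype(float)
    resp = np.zeros_like(f)
    # Accumulate (f(k, l) - f(i, j))^2 over the eight neighbour offsets.
    for dk in (-1, 0, 1):
        for dl in (-1, 0, 1):
            if dk == 0 and dl == 0:
                continue
            shifted = np.roll(np.roll(f, dk, axis=0), dl, axis=1)
            resp += (shifted - f) ** 2
    return resp / 8.0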

2.3.2 Harris Corner Detector

There are many other methods that give much better results than the MO. A common method is the Harris Corner Detector [4]. The Harris Corner Detector solves the issues that the MO has, such as having a rectangular patch, an anisotropic response and responding too readily to edges. This is accomplished by calculating the auto-correlation energy in a local region, given by

E(\Delta x, \Delta y) = \sum_{x_i \in W} \sum_{y_i \in W} g(x, y) \left( f(x_i, y_i) - f(x_i - \Delta x, y_i - \Delta y) \right)^2    (2.2)

g(x, y) = \exp\left( -\frac{x^2 + y^2}{2} \right)    (2.3)

where g(x, y) is a Gaussian weight function and W is a patch from an image f, shifted by \Delta x, \Delta y around x, y. The shift can be approximated by a first order Taylor expansion, and after some calculations, described in more detail in [2], the matrix

A = g(x, y) \begin{pmatrix} \left(\frac{\partial f}{\partial x}\right)^2 & \frac{\partial f}{\partial x}\frac{\partial f}{\partial y} \\ \frac{\partial f}{\partial x}\frac{\partial f}{\partial y} & \left(\frac{\partial f}{\partial y}\right)^2 \end{pmatrix}    (2.4)

can be derived. This matrix represents the local structure of the neighbourhood, and by looking at the eigenvalues λ1 and λ2 of the matrix A it is possible to see how the intensity varies. This can be shown through principal component analysis (PCA) and is explained in [2]. A more detailed explanation can be found in [5]. As with the MO, there are three cases that need to be considered.

1. Both λ1 and λ2 are small. This means that the neighbourhood of the examined pixel from the image f is flat; there is little variation in its neighbourhood and thus no edges or corners.

2. One of λ1 and λ2 is small and the other is large. This indicates that there is variation in one direction and an edge is present in the local neighbourhood.

3. Both λ1 and λ2 are large. This indicates that a corner has been found, since there are variations in every direction.

A pleasing fact that was pointed out in [4] is that these eigenvalue computations can be avoided by calculating the response function R

R(A) = \lambda_1 \lambda_2 - k (\lambda_1 + \lambda_2)^2    (2.5)

or

R(A) = \det(A) - k \cdot \mathrm{trace}^2(A)    (2.6)


The parameter k should reside between 0.02 and 0.15 according to the literature. To find the corners, a threshold should be introduced to find the maximal values of the response, and Non-Maxima Suppression (NMS) should also be performed to avoid maxima being too close to each other.
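As a rough illustration of how the response of equations (2.4)-(2.6) can be computed, here is a Python/NumPy sketch using SciPy's Gaussian filtering; sigma and k are illustrative values (k chosen inside the 0.02-0.15 range quoted above), and this is not the thesis's MATLAB implementation. Thresholding and NMS would follow as described.

import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(f, sigma=1.0, k=0.04):
    """Harris response R = det(A) - k * trace(A)^2, per equations (2.4)-(2.6)."""
    f = f.astype(float)
    # Regularized derivatives: image smoothed and differentiated per axis.
    fx = gaussian_filter(f, sigma, order=(0, 1))
    fy = gaussian_filter(f, sigma, order=(1, 0))
    # Elements of the structure matrix A, weighted by a Gaussian window.
    Axx = gaussian_filter(fx * fx, sigma)
    Axy = gaussian_filter(fx * fy, sigma)
    Ayy = gaussian_filter(fy * fy, sigma)
    det = Axx * Ayy - Axy ** 2
    trace = Axx + Ayy
    return det - k * trace ** 2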

2.4 Directional NMS

NMS was introduced as a step in Canny edge detection [6]. NMS finds the local maxima in an image. Every pixel with a non-zero magnitude is inspected, and two additional pixels in the direction of the gradient, rounded to the closest 45° multiple, are considered. These pixels can be located at any radius from the inspected pixel; in this thesis the two adjacent pixels are selected. If any of these has a greater magnitude than the inspected pixel, it is removed. This results in an image where only the local maxima remain.
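A straightforward, unoptimized sketch of this directional NMS in Python follows; it assumes precomputed gradient images gx, gy and a magnitude image, and checks the two adjacent pixels along the rounded gradient direction as described above.

import numpy as np

def directional_nms(mag, gx, gy):
    """Keep a pixel only if neither of its two neighbours along the
    gradient direction (rounded to 45 degrees) has a larger magnitude."""
    out = np.zeros_like(mag)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    h, w = mag.shape
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            if mag[i, j] == 0:
                continue
            a = angle[i, j]
            if a < 22.5 or a >= 157.5:       # gradient roughly horizontal
                n1, n2 = mag[i, j - 1], mag[i, j + 1]
            elif a < 67.5:                    # roughly 45 degrees
                n1, n2 = mag[i - 1, j + 1], mag[i + 1, j - 1]
            elif a < 112.5:                   # roughly vertical
                n1, n2 = mag[i - 1, j], mag[i + 1, j]
            else:                             # roughly 135 degrees
                n1, n2 = mag[i - 1, j - 1], mag[i + 1, j + 1]
            if mag[i, j] >= n1 and mag[i, j] >= n2:
                out[i, j] = mag[i, j]
    return out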

2.5 Edge Detection

Edges can be seen as pixels separating two regions in an image and can be defined as having two components, magnitude and direction [2]. The magnitude is the magnitude of the gradient, and the edge direction is rotated -90° with respect to the gradient direction ψ. They are calculated as

|\nabla g(x, y)| = \sqrt{ \left(\frac{\partial g}{\partial x}\right)^2 + \left(\frac{\partial g}{\partial y}\right)^2 }    (2.7)

\psi = \arg\left( \frac{\partial g}{\partial x}, \frac{\partial g}{\partial y} \right)    (2.8)

According to [7], edges correspond to a significant amount of variation in reflectance, illumination, orientation or depth of scene surfaces. Steps are the most common variation found in images and generally occur between regions where the intensity is almost constant but at different levels of grey. Two other variations that are often encountered are lines and junctions. Lines appear in images as local extrema, for example where a thin object occludes another object or the background. Junctions are intersections between two or more edges and can be regarded as analogous to corners.

Edge detectors usually have three stages of operation: smoothing, differentiation and labeling. Smoothing reduces the noise and regularizes the image, differentiation is the computation of the derivatives, and labeling is the actual marking of the edge. There are many kinds of edge detectors, and most of them are designed for step edges, since these are the most common. The detectors can be divided into three categories.

1. Finding extrema using the first-order derivative defined by the gradient.

2. Finding zero-crossings in the second-order derivative along the gradient.


3. Using piecewise continuous functions to describe the image intensity.

The first category uses gradient operators defined for a small neighbourhood, expressed as convolution masks. Edges are found by convolving the image or the desired area with these masks. The operators have one mask for each direction in which edges should be detected. One of the most common of these operators is the Sobel operator, a very simple detector that is often used to find horizontal and vertical edges. The two Sobel masks are

\begin{pmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix}    (2.9)

There are also Sobel masks for diagonal edge detection.

The second category originates from the Marr-Hildreth edge detector [8]. Where the first-order derivative has its extremum at an edge in the image, the second-order derivative is zero at the same position. What makes the zero-crossing better than the extremum is that it is much easier to detect the zero-crossing with high precision. This is illustrated in Figure 2.2.

Figure 2.2. Illustration of the zero-crossing in 1D: (a) f(x); (b) f′(x); (c) f′′(x).

The third category will not be described here. For details refer to [9].

2.6 Laser Triangulation

The term triangulation comes from the geometry of the setup, which is used by the 3D camera in this thesis to collect its data. There is a camera and a laser emitter that produces a sheet of light referred to as the laser plane. The distance between the camera and the laser emitter is known, as well as the angle of the laser plane. This makes it possible to determine the location where the laser plane hits a surface, as demonstrated in Figure 2.3.

The camera detects where the laser plane hits a surface and creates a profile image, see Figure 2.4. Depending on the height of the object, the laser line will appear closer to or further away from the camera. By taking a series of these profiles while the object passes through the laser plane, it is possible to produce a height image that can be used to create a 3D model.
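As a toy illustration of the triangulation relation, consider the simplified geometry of a camera looking straight down with the laser plane tilted by the angle α from the vertical; this geometry is an assumption for the example, since the real camera uses a full calibration.

import math

def height_from_offset(d, alpha):
    """Toy triangulation: under the assumed geometry, a surface raised
    by h shifts the laser line sideways by d = h * tan(alpha), so
    h = d / tan(alpha). d is in millimetres, alpha in radians."""
    return d / math.tan(alpha)

# Example: a 2.0 mm sideways shift of the laser line with a 30 degree
# laser angle corresponds to a height of about 3.46 mm.
print(height_from_offset(2.0, math.radians(30)))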

Figure 2.3. Laser triangulation setup (baseline l, laser angle α).

Figure 2.4. Example of the profile images produced when an object intersects the laser plane: (a) object passing through the laser plane; (b) profile image.

2.6.1 Occlusion

Since both the camera and the laser emitter are in fixed positions, there are situations where the entire object can not be scanned due to occlusion. There are two types of occlusion: camera and laser occlusion. Camera occlusion is a result of the camera being mounted at such an angle that it is not possible for the camera to see the laser. Laser occlusion occurs when the camera can see what is to be captured but the laser can not hit the desired surface. Figure 2.5 demonstrates these two types of occlusion.

Figure 2.5. Two types of occlusion: (a) camera occlusion; (b) laser occlusion.

2.6.2 Reflectivity

What the camera looks for in the captured image is the laser line produced by the laser plane intersecting a surface. The laser will reflect differently depending on what surface it hits; e.g. a white surface will reflect more than a black one. Since the laser's intensity has approximately the form of a Gaussian perpendicular to the laser plane, the difference in reflectivity can cause troublesome artefacts at the edges of surfaces with high contrast. The camera finds the peak with the highest value on the laser line, so at edges with high contrast the camera can detect a faulty peak, as shown in Figure 2.6. This shift in the peak detection results in a shift in the measured profile height and can sometimes give noticeable artefacts at the border between two areas with high contrast; e.g. a flat surface can look uneven. In this case the captured image can be skewed in such a way that the edges of an object or pattern are not exactly where they should be; instead they have been moved a short distance. This translation is a very small one and might not affect measurements too much, but it is still something to take into consideration. See the following example for clarification.

Example 2.1

If the laser is moving from left to right, sweeping over a black rectangle on a white surface, the left side of the rectangle will be shifted a very small distance to the right. The right side of the rectangle will analogously be shifted to the left.

Figure 2.6. At the border between two surfaces with different reflectivity the wrong peak may be observed, causing artefacts and small measurement errors: (a) profile of the laser line; (b) a faulty peak is detected (correct peak vs. observed peak).

2.7 Least Squares Minimization

It is often desirable to find the best-fitting curve to a given set of data points. One of the most common techniques is to minimize the sum of the squared residuals of the points from the curve, see Figure 2.7.

Figure 2.7. Least squares minimization; residuals are the vertical distances between the data points and the curve.

In general, a kth degree polynomial is defined as

y = a_0 + a_1 x + \cdots + a_k x^k    (2.10)

and the sum of the squared residuals is given by

R^2 = \sum_{i=1}^{n} \left( y_i - \left( a_0 + a_1 x_i + \cdots + a_k x_i^k \right) \right)^2    (2.11)


where i is the index for the data points. From these equations it is possible to form the Vandermonde matrix and obtain the matrix equation for a least squares fit,

\underbrace{\begin{pmatrix} 1 & x_1 & \cdots & x_1^k \\ 1 & x_2 & \cdots & x_2^k \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & \cdots & x_n^k \end{pmatrix}}_{X} \underbrace{\begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_k \end{pmatrix}}_{a} = \underbrace{\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}}_{y}    (2.12)

where the Vandermonde matrix is the matrix X. The coefficients a for the best-fitting polynomial can be obtained, in matrix notation, through the following steps,

y = Xa    (2.13)

X^T y = X^T X a    (2.14)

a = (X^T X)^{-1} X^T y    (2.15)
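The steps (2.12)-(2.15) translate almost line by line into code. A Python/NumPy sketch follows (the thesis used MATLAB, and in practice a routine like numpy.polyfit is the numerically robust choice); the function name and example data are illustrative.

import numpy as np

def polyfit_normal_equations(x, y, k):
    """Least squares polynomial fit via the Vandermonde matrix and the
    normal equations, equations (2.12)-(2.15)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Vandermonde matrix X with columns 1, x, x^2, ..., x^k.
    X = np.vander(x, k + 1, increasing=True)
    # a = (X^T X)^{-1} X^T y, solved without forming an explicit inverse.
    a = np.linalg.solve(X.T @ X, X.T @ y)
    return a  # coefficients a_0 ... a_k

# Example: fit a second degree polynomial (the data lie on y = x^2 + 1).
coeffs = polyfit_normal_equations([0, 1, 2, 3], [1, 2, 5, 10], k=2)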

2.8 Related Research

There has not been any previous research in this particular area at the company. However, there are some standard procedures for calculating velocities in images that relate to this thesis; these are often referred to as motion estimation. It is important to note that motion estimation uses two or more images, in contrast to the single intensity image that this thesis focuses on. This intensity image can contain several different velocities over several time intervals.

Motion estimation is widely used in video compression, and it is the process of determining motion vectors that describe the correspondence between pixels in two or more images. There are two basic approaches: pixel based (direct) and feature based (indirect). The pixel based approach, as described in [10], uses information collected from every pixel in the image, in contrast to the feature based approach [11], which extracts features and then computes the image correspondence. With that said, the following methods are the most common ways to determine velocities in computer vision. How they relate to this thesis is described at the end of each respective method.

2.8.1 Optical Flow

A popular pixel based method is optical flow. Optical flow describes how intensity moves in a time-varying image. It is assumed that the intensity structures in a local region are approximately constant during the motion computation; in other words, intensity values can move but not vary over time. This is of course not always true in practice: in an outdoor environment the illumination changes with the sun, and shadows and other occlusions arise. To cope with this it is also assumed that the motion takes place during a short period of time; then the previous assumption holds. The image intensity function I(x, y, t) can then be written as

I(x, y, t) = I(x + \Delta t \cdot v_x,\; y + \Delta t \cdot v_y,\; t + \Delta t)    (2.16)

where x and y are pixel coordinates and v_x and v_y are the velocities in the x and y directions. \Delta t is such a short time interval that the velocity can be considered constant. If this formula is rewritten with the help of a Taylor expansion with respect to \Delta t, the following can be derived after some simple calculations:

\frac{\partial I}{\partial x} v_x + \frac{\partial I}{\partial y} v_y + \frac{\partial I}{\partial t} = 0    (2.17)

This expression is called the optical flow equation or the brightness constancy constraint equation (BCCE) and is a fundamental part of optical flow estimation. A noticeable problem is that there are two unknown variables; this constraint alone is therefore not sufficient to calculate both v_x and v_y. It is however possible to compute the component of the motion along the local gradient of I. This is the aperture problem, described in Section 2.3.

To be able to solve equation 2.17, another set of equations is needed. There are many different optical flow methods that provide additional conditions and constraints; to mention some, there are differential, frequency based, correlation based and multiple motion methods. All of these are described in [12].

Were the optical flow problem to be considered only in one dimension, it would be somewhat related to the method presented in this thesis: if a known pattern is stretched with various velocities and the optical flow problem is solved for this pattern, the main objective of this thesis would be reached. However, this is impractical, because for the optical flow solution to give a reliable result the two images that are compared need to be correctly oriented relative to each other and have the same illumination. This means that the pattern obtained from the camera must have the same illumination and be detected, translated, scaled and maybe even rotated before the optical flow method can be applied.

2.8.2 Feature Based Motion Estimation

The feature based method focuses on extracting features where it is possible to get good correspondence between images. Point features are often located using the Harris corner detector, described in Section 2.3.2. This is done in at least two images. When features have been extracted in both images, they need to be paired together, either by a coarse-to-fine approach or by comparing a local area around a feature in the first image to the features in the other image. The paired features then go through feasibility tests to sort out faulty correspondences and to remove outliers that might not be features at all. A common method for checking correspondences and rejecting outliers is RANSAC [13].

The Lucas-Kanade method (LK) is a popular feature based method that finds the disparity between corresponding points in two images, and it is used for tracking, among other areas. A quick derivation and explanation of how the LK method works will now be presented, based on [14]. First off, since it is a feature based method it works with patches or templates around the feature points, one for each image. The translation describing the displacement d between these two patches is

I(\mathbf{x} + \mathbf{d}) = J(\mathbf{x})    (2.18)

I(x) is the template from the first image and J(x) from the second, where \mathbf{x} = [x, y]^T are the pixel coordinates and \mathbf{d} = [d_x, d_y]^T is the displacement in the x and y directions. The d that fulfils equation 2.18 is the one that minimizes

\epsilon = \iint_W w(\mathbf{x}) \left[ I(\mathbf{x} + \mathbf{d}) - J(\mathbf{x}) \right]^2 d\mathbf{x}    (2.19)

where w(x) is a weight function. Approximation of I(x + d) using a first order Taylor expansion yields

I(\mathbf{x} + \mathbf{d}) \approx I(\mathbf{x}) + \mathbf{d}^T \nabla I(\mathbf{x})    (2.20)

which, when inserted into 2.19, gives

\epsilon = \iint_W w(\mathbf{x}) \left[ I(\mathbf{x}) - J(\mathbf{x}) + \mathbf{d}^T \nabla I(\mathbf{x}) \right]^2 d\mathbf{x}    (2.21)

The weight function w(x) can be a Gaussian, but for simplicity it is here set to 1. To minimize \epsilon, set its derivative to zero:

\frac{\partial \epsilon}{\partial \mathbf{d}} \approx 2 \iint_W \left[ I(\mathbf{x}) - J(\mathbf{x}) + \mathbf{d}^T \nabla I(\mathbf{x}) \right] \nabla I(\mathbf{x}) \, d\mathbf{x} = 0    (2.22)

Rearrange the terms to get

\left( \iint_W \nabla I(\mathbf{x}) (\nabla I(\mathbf{x}))^T d\mathbf{x} \right) \mathbf{d} = \iint_W \left[ J(\mathbf{x}) - I(\mathbf{x}) \right] \nabla I(\mathbf{x}) \, d\mathbf{x}    (2.23)

Finally, solve the equation

T_{2D} \mathbf{d} = \mathbf{s}    (2.24)

to find the displacement d between the two features. T_{2D} is the 2D structure tensor for the template I(x). This equation is called the Lucas-Kanade equation. To relate to this thesis, the indirect method plays a big role, as extraction of the features in the pattern is the initial step of the proposed method.
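For a single pair of patches, the Lucas-Kanade equation (2.24) reduces to a 2x2 linear solve. A minimal Python/NumPy sketch under the same assumptions as the derivation (equally sized grayscale patches, w(x) = 1); a practical tracker would iterate this step and warp the patch between iterations.

import numpy as np

def lk_displacement(I, J):
    """One Lucas-Kanade step, equations (2.23)-(2.24): solve T d = s
    for the displacement d = (dx, dy) between patches I and J."""
    I = I.astype(float)
    J = J.astype(float)
    gy, gx = np.gradient(I)  # gradient of the template (rows = y, cols = x)
    # 2D structure tensor T (left-hand side), summed over the patch.
    T = np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                  [np.sum(gx * gy), np.sum(gy * gy)]])
    diff = J - I
    s = np.array([np.sum(diff * gx), np.sum(diff * gy)])
    return np.linalg.solve(T, s)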


Chapter 3

Proposed Method

3.1 Introduction

The 3D camera produces a grayscale intensity image when the assembly plate passes through the laser. On the side of this plate resides a pattern, which along with its features can be identified. By comparing these features to their real world counterparts, the velocity of the plate can be determined. This might be solved with a wide range of different types of patterns, but this thesis has its primary focus on patterns with distinct corners.

In this chapter every part of the procedure is explained in detail. Note: unless a reference is given, everything in this chapter was developed during this thesis.

3.2 Patterns

Determining which pattern to use is the first task. Different patterns have both advantages and disadvantages, and in this thesis several patterns were examined. Each pattern is a new iteration of the previous one, in an attempt to improve on it. In this section the different patterns are discussed in terms of what possible advantages and disadvantages they might have. Later on, in Chapter 4, the patterns are compared to one another in both synthetic and real experiments, which also gives a more detailed explanation of why each new iteration was needed. Through experiments it has been determined that a pattern should preferably not exceed a width of 100 pixels; otherwise the features in the pattern can become too ambiguous, and this also leaves a sufficient amount of space for the object being scanned.

A pattern that quickly comes to mind is a series of triangles, like the one in Figure 3.1. What speaks for this kind of pattern are the distinct corners and the symmetry of triangles: it is easy to calculate the angles and the width at a certain point.

After some minor experiments using the pattern in Figure 3.1 it became clear that it could not be used as-is, since the corner detector had a hard time finding the tips of the triangles when there was no space between them. Introducing some spacing between the triangles improved the results and eased the process of mapping the corners correctly. This pattern was the first to be tested more thoroughly, and its properties are listed below.

Figure 3.1. A simple pattern consisting of six triangles: (a) triangles without spacing; (b) triangles with spacing.

Positive
• Simple pattern
• Easily detectable features
• High precision end points

Negative
• Few feature points may lead to an insufficient amount of data
• May be difficult to find vertical lines with high precision
• Space inefficient
• No overlap

The second iteration of the pattern, shown in Figure 3.2, has the advantage of being more space efficient: more figures can be fitted in the same area as the triangle pattern. There are also more features in this pattern, which should provide more information and make the results more accurate. The edges also overlap each other, giving the possibility of several measurements over these sections. Because of the shape, it is possible to find the wedge-shaped feature both on the front and on the back of every arrow.

Positive
• Relatively simple pattern
• Easily detectable features
• More space efficient
• More feature points
• Overlapping wedges

Figure 3.2. Arrow pattern.

Negative
• Corners with large angles produce a lower response from the Harris operator than others
• No high precision end points
• Does not scale well to small sizes

The arrow pattern's response from the Harris operator tends to overlap at the top corners, making them indistinguishable from each other. To solve this problem, every corner was designed to have a 90° angle. In addition, the pattern was made thicker: two times the distance between the arrows. This resolved the issue of having a large angle at some corners and also made sure that the responses from the Harris operator did not overlap. The modified arrow pattern can be seen in Figure 3.3.

Figure 3.3. Modified arrow pattern.

Positive
• All those of the previous patterns

Negative
• No high precision end points

This pattern looks like it should provide a better result than the previous patterns. It gives a better Harris response and has overlapping features, but both wedges can not be used due to the reflectivity problem: if both are used there will be an offset in both directions. Since this is an issue, another pattern is needed, one that does not require both wedges to be analysed. Considering that this pattern is good in every aspect except that it only overlaps with the back wedge, it could be modified to gain more overlap. Figure 3.4 shows the final iteration, the sharp modified arrow (SMA) pattern. To be able to gain more overlap without making the pattern wider, the angle at the tip of the pattern had to be made sharper. An angle of approximately 34° was selected.

Figure 3.4. Sharp modified arrow pattern.

Positive
• All those of the previous patterns
• Much overlap
• No need to analyse the back wedge

Negative
• No high precision end points

The new pattern provides a good response from the Harris operator and overlaps the next three tips, which should provide enough data to get a good result. As noted above, all the patterns are susceptible to the reflectivity problem. How this affects the results, and why the back wedge can not be used, is explained in Chapter 4.

3.3 Finding Feature Points

Feature points in this thesis are, as described earlier, corners. These corners are detected using the Harris corner detector; see Section 2.3.2 for a more detailed explanation. The Harris detector is probably the most commonly used method for finding features such as corners because of its properties: it provides both an isotropic and a rotationally invariant response. The gradient used in the Harris detector is computed as a regularized derivative, i.e. the derivative convolved with a Gaussian filter g(σ). To make the computation faster, the gradient is computed in the x and y directions separately. The gradient in the x direction is then written as

\partial_x f(x) = \partial_x (f * g(\sigma))(x) = \left( f * \frac{-x}{\sigma^2} g(\sigma) \right)(x)    (3.1)

where f(x) is the intensity image and g(σ) the Gaussian filter. The gradient in the y direction is computed in a similar fashion. Since the derivative of the Gaussian can be precomputed, the regularized gradient can be obtained with a single convolution.

The resulting response from the Harris detector is an image where corners appear as light areas, the brightest point being the actual detected corner. To obtain the pixel where the response is greatest, a simple thresholding can sometimes work, but it often happens that the Harris response has multiple peaks near each other, and it is then hard to determine which peak is the correct one with a simple threshold. To solve this problem NMS is used, see Section 2.4. After applying NMS to the Harris response only the local maxima remain, and the locations of the corners have been found.

These corner locations are however not very precise; they can be located a few pixels away from the actual corner. Because of this, the corners should not be used directly to measure distances in the image; a much more precise location is needed. This is discussed in Section 3.6.

3.4 Feature Mapping

Feature mapping is the process of determining which of the found corners correspond to each of the individual parts of the pattern. This is done through an exhaustive search for three corners that match certain criteria. All the tested patterns share a similar feature: they all contain wedges. These wedges are shown in Figure 3.5.

Figure 3.5. Common wedge feature in the patterns: (a) triangle; (b) arrow; (c) modified arrow; (d) SMA.

Each wedge consists of three features. The two features at the baseline (the vertical line between the end points) should have approximately the same angle to the tip of the wedge, see Figure 3.6; this is a criterion they must uphold. There are of course other corners that can fulfil this criterion, so additional constraints are introduced to pair the correct corners together. It is assumed that the pattern is travelling in one direction, so the tip of the wedge will always be on a certain side of the other two corners. The two corners at the baseline should not be too far apart in the x direction. Also, the angle between the corners can not be too small or too large, i.e. the end points should not lie too close (large angle) or too far away (small angle) from the tip. To limit the search even more, the process will not look for a match further than five corners ahead; it is unreasonable that a matching corner should be further away than this.

There is also a probe test, see Figure 3.7. Originating from the chosen corners, a line of pixels is selected and their values averaged. The average values from the probe tests are then compared to one another, and if they differ by less than 50% the corners are regarded as a match for that wedge. A final test is to roughly check the orientation of the corners. Each wedge should have three corners with predefined orientations depending on the pattern. Since it is a rough estimation of the orientation, a large variance is allowed. By looking at the first order derivative at the corner pixels, the orientation θ is calculated simply by

\theta = \arctan\left( \frac{\nabla_y I}{\nabla_x I} \right)    (3.2)

These gradient vectors can be seen in Figure 3.8.
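A sketch of this orientation test in Python/NumPy, assuming precomputed derivative images; the 45° tolerance is an illustrative stand-in for the "large variance" mentioned above, not a value from the thesis.

import numpy as np

def corner_orientation(gx, gy, r, c):
    """Rough corner orientation, equation (3.2): the angle of the image
    gradient at the corner pixel (r, c), computed quadrant-aware."""
    return np.arctan2(gy[r, c], gx[r, c])

def orientation_matches(theta, expected, tol=np.deg2rad(45)):
    """Accept a corner if its orientation lies within a generous tolerance
    of the orientation predefined for that corner of the wedge."""
    d = np.angle(np.exp(1j * (theta - expected)))  # wrap to [-pi, pi]
    return abs(d) < tol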

Figure 3.6. The angle at the corners should be approximately the same: (a) upper corner; (b) lower corner.

Figure 3.7. The probe test: (a) tip corner; (b) upper corner; (c) lower corner.


Figure 3.8. Gradient vectors for the SMA pattern.

3.5 Edge Detection

When all the corners are known and mapped to the pattern, it is possible to find the edges connecting them. These edges can then be used to find where the corners are located with sub-pixel precision and to determine how the pattern has been distorted. The edges are found using the zero-crossing method described in Section 2.5. Since the patterns contain straight lines between corners, it can be assumed that the distorted edge lies in the proximity of these straight lines. This line gives the first starting point for the algorithm; the starting point is then updated with the help of the found zero-crossings to get closer to the edge. For every x value between two corners, the corresponding y value is found using the zero-crossing method, which limits the search to one dimension. The algorithm starts at the first starting point and checks if there exists a zero-crossing within a certain radius. If no zero-crossing is found, the search radius is increased, but only up to a specified limit to avoid finding faulty or multiple zero-crossings. When a zero-crossing is found it will be between two pixels; to get the sub-pixel position, basic linear interpolation is used,

\frac{y - y_0}{x - x_0} = \frac{y_1 - y_0}{x_1 - x_0} \;\overset{y\,=\,0}{\Longleftrightarrow}\; x = x_0 - y_0 \left( \frac{x_1 - x_0}{y_1 - y_0} \right)    (3.3)

where x_0, x_1 are the x values of the pixels next to the zero-crossing and y_0, y_1 are

their corresponding y values. An illustration of the linear interpolation of the zero-crossing is shown in Figure 3.10. The interpolated value then becomes the new y value for the search area: if the edge is bent, the starting position will always be closer to the edge when the zero-crossing is used rather than the straight line between the two corners, see Figure 3.9. Here the edges are detected in a scan of the SMA pattern, so only the front edge is detected. A scan containing the arrow or modified arrow pattern would detect both the front and back edges.
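Equation (3.3) amounts to a single line of code. A sketch with illustrative sample values:

def zero_crossing_subpixel(x0, y0, x1, y1):
    """Sub-pixel zero-crossing between two samples via the linear
    interpolation of equation (3.3): solve for y = 0 on the line
    through (x0, y0) and (x1, y1)."""
    return x0 - y0 * (x1 - x0) / (y1 - y0)

# Example: the second derivative changes sign between pixels 12 and 13.
print(zero_crossing_subpixel(12, 0.8, 13, -0.4))  # about 12.667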

Figure 3.9. (a) Edge detection; (b) close up on the edge. First starting points (magenta markers) and starting points (red markers) for finding the zero-crossings. The straight line (cyan markers) is used as the first start point and to check if the wrong edge is being followed. Note: the left- and rightmost edges are also detected, but their starting points and straight lines are not shown.

Ideally this should be performed for every pixel between the two corners, but a problem arises when the zero-crossing is to be detected near the corners. At the corners the two edges are close to each other, and it is possible to detect several zero-crossings, which makes it hard to determine which intersection is the desired one. Because of this some of the points near the corners are ignored; there are still enough points left to get a good estimate of where the edge is. How many points need to be ignored can be hard to determine, so if the algorithm starts to follow the wrong edge, e.g. the lower instead of the upper, this is detected and the y value of the straight line is used to get back to the right edge. This is done by comparing the direction of the straight line between the two corners with that of the newly found zero-crossings: if they have opposite slopes, the wrong edge is being followed. When all the zero-crossings for the points between the two corners have been found, it is possible to fit a polynomial to the data that describes the edge, which is discussed in the next section.


Figure 3.10. Linear interpolation.

3.6 Polynomial Edge Fitting and Sub-pixel Detection of Corners

At this stage a sufficient amount of data has been collected to be able to fit a polynomial to the edges of the pattern. This polynomial can then be used to find a much more precise position of the corner by finding the intersection of two polynomials, e.g. the two slopes in the triangle and arrow patterns. These polynomials are also used later on to determine how the pattern has been distorted due to acceleration or deceleration; more on that in Section 3.8.

When fitting a polynomial to the edge data there are two things that are of importance. What degree of the polynomial should be used? And should every data point be used?

With the data at hand, the polynomial for the edge is found through least squares minimization, see Section 2.7. The degree of the polynomial would need to be very high to capture every variation in the velocity. This is however not very effective and often introduces unwanted oscillations and overshoots in the fit. Since the idea is to use the polynomials both to see how the velocity has changed and to determine the sub-pixel position of a corner, by finding the intersection of two polynomials, it is easiest to use a polynomial of the second degree. The movement of the assembly plate is done by hand, and through experiments this movement has been seen to be similar to a second degree curve: the plate is most likely pushed at an almost constant, slightly accelerating or decelerating pace for short periods of time. This also makes it easier to analytically find the intersection of two second degree polynomials than of those of higher degrees; simply solve the equation system

\begin{cases} y = Ax^2 + Bx + C \\ y = Dx^2 + Ex + F \end{cases}    (3.4)

where A, B, C are the coefficients for the first polynomial and D, E, F are the coefficients for the second. The polynomial fitting and intersection of these can be seen in Figure 3.11.
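Solving system (3.4) reduces to finding the roots of the difference of the two quadratics. A Python/NumPy sketch; the selection of the root nearest the Harris corner, which the method also needs, is omitted here.

import numpy as np

def intersect_quadratics(p, q):
    """Intersection of two second degree polynomials, equation (3.4):
    solve (A-D)x^2 + (B-E)x + (C-F) = 0 and return the (x, y) pairs.
    p = (A, B, C), q = (D, E, F)."""
    a, b, c = p[0] - q[0], p[1] - q[1], p[2] - q[2]
    roots = np.roots([a, b, c])  # also handles the degenerate linear case
    xs = roots[np.isreal(roots)].real
    return [(x, p[0] * x**2 + p[1] * x + p[2]) for x in xs]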

For polynomials of higher degrees, iterative methods like the Newton-Raphson method can be used to find an approximate solution. In some situations a higher degree would be more suitable.

In an attempt to get better results when finding the sub-pixel positions of the corners, a more local adaptation was introduced: instead of using all the data points for the edge, only half of them are used. The idea is that a second degree polynomial fitted locally behaves more like one of a higher degree near the corners. This seemed promising, but it too only gave better results in some situations. Instead, the local polynomials serve as a backup in case something goes wrong with the primary polynomial fitting on all the edge data; more on this in Chapter 4.

Figure 3.11. Polynomial fitting to a stretched triangle and the intersection of the fitted curves: (a) triangle pattern; (b) curve fitting; (c) intersection of the two curves.

The intersection of the polynomials gives the position of the corner with sub-pixel precision, which is more accurate than the one obtained from the Harris detector. This procedure is performed for the whole pattern to find all the corners located at the tips. For the triangle pattern it is also used to find the two corners at the baseline. This case is unique to the triangle pattern and can not be solved with the method mentioned above, since one of the polynomials is (close to) vertical and has an infinitely steep slope. Instead an iterative method similar to Newton-Raphson is used: the vertical polynomial is expressed as a function of y instead of x, so the two functions would be

\begin{cases} y = Ax^2 + Bx + C \\ x = Dy^2 + Ey + F \end{cases}    (3.5)

Then, originating from the Harris corner, start by solving for y, and use this new y value to find a new x. This process is iterated a specified number of times, or until the position no longer changes by more than a certain threshold.

When the sub-pixel position of the corners has been found, it is possible to accurately measure how far the pattern has travelled given a specific corner.

3.7 Real World Lengths

After the coordinates of the corners have been located with sub-pixel accuracy, the next step is to map them to their real world correspondences. This is done with the use of a model of the pattern defined in millimetres, which contains the x coordinates of all the corners in the pattern. The model is of course relative: the first corner is located at the origin and the remaining corners lie at fixed distances from this origin. Figure 3.12 shows this mapping for a scan with the SMA pattern. Both the tips and the end points of the different patterns need to be mapped to their corresponding real world coordinates. The patterns do however differ slightly in how the end points are mapped, but more on that in Chapter 4. The end points do not have a sub-pixel position like the tips; this is however not a big issue, since their positions do not need sub-pixel precision, as described in Section 3.8. Example 3.1 explains how the mapping works.

With this information, together with the polynomials that describe how the edges are bent, it would be possible to restore the pattern, but there would be no way to measure how accurate the restoration is. The physical encoder indicates very precisely how the pattern has been distorted, but it does not measure in millimetres; it has its own unit of measurement.

Whenever the physical encoder rotates it increments its value, and it is connected to the assembly plate in such a way that when the plate moves, the encoder rotates. If the assembly plate is pulled a certain distance, the encoder will generate a specific value for that distance. By knowing how many increments the encoder generates per millimetre, it is easy to convert between these units of measurement. Measuring the encoder value for a length of 20 centimetres 15 times gave an average encoder value of 11713, which gives 58.656 encoder values per millimetre in this particular setup.


Example 3.1

This example shows what the real world model for the triangle pattern looks like. If there are six triangles, and each triangle is assumed to be 12 mm long with a spacing of 1 mm between them, the real world model is as follows.

Corner    Millimetre
1         0
2         12
3         13
4         25
...       ...
10        64
11        65
12        77

Figure 3.12. Real world correspondence mapping for the SMA pattern.

3.8 Position of Intermediate Profiles

At this point the real world positions of the corner points have been found, and if the speed had been constant this would be enough; but since this is not the case, the position of every intermediate profile needs to be found. As noted in Section 3.6, the polynomials fitted to the edges of the pattern describe how it has been distorted. At this step the patterns with overlapping features have a clear advantage, since they provide several measurements per profile. At minimum two curves overlap at all times, as with the triangle pattern and at the beginning of the different arrow patterns.

In Figure 3.12 all the corners and end points can be seen with their corresponding real world positions, represented in encoder values. As seen, there are gaps in between them. It is these gaps that the polynomials for that particular pattern shall fill. The polynomials that were fitted to the edge data are represented in a coordinate system of pixels, but now the representation needs to be in encoder values, so a scaling is needed. Figure 3.13 illustrates the task at hand: the curve f with start and stop values f_1 and f_2 needs to be scaled so that it fits the points g_1 and g_2. The function g(x) can then be written as g(x) = s f(x) + o. To find the scaling s and the offset o, solve the equation system

\begin{cases} g_1 = s f_1 + o \\ g_2 = s f_2 + o \end{cases} \iff \begin{cases} s = \dfrac{g_1 - g_2}{f_1 - f_2} \\ o = \dfrac{g_2 f_1 - g_1 f_2}{f_1 - f_2} \end{cases}    (3.6)

Even though the curve is scaled, its shape is preserved in such a way that it still represents the distortion of the pattern. Performing this operation on all the polynomials results in the curves shown in Figure 3.14. A close up on the polynomials and an overlap can be seen in Figure 3.15.
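Equation (3.6) in code form; the endpoint numbers in the usage example are made up for illustration.

def scale_to_endpoints(f1, f2, g1, g2):
    """Solve equation (3.6): find s and o so that g(x) = s*f(x) + o maps
    the curve's endpoint values f1, f2 onto the known encoder values g1, g2."""
    s = (g1 - g2) / (f1 - f2)
    o = (g2 * f1 - g1 * f2) / (f1 - f2)
    return s, o

# Example: a fitted curve spanning [3.0, 9.5] in pixel units, mapped onto
# encoder values [0, 480] (illustrative numbers only).
s, o = scale_to_endpoints(3.0, 9.5, 0.0, 480.0)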

Figure 3.13. The polynomials need to be scaled in order to fit into the new coordinate system.

With all the curves in place, they need to be weighted together to form a single curve that can represent the position of the profiles: a position curve.

Figure 3.14. The polynomials describing the stretch are fitted in between the known points in the pattern.

Figure 3.15. A close up on the polynomials and the overlap at the second tip.

The simplest way would be to just take the average of the overlapping curves, but this could result in sharp transitions where new overlaps begin. To reduce these transitions a weight function is used, like a Gaussian or, in this case, a cos² function, see Figure 3.16. The resulting weighted curve p(x) can then be calculated as

p(x) = \frac{w_1(x) f_1(x) + w_2(x) f_2(x) + \cdots + w_n(x) f_n(x)}{w_1(x) + w_2(x) + \cdots + w_n(x)}    (3.7)

where w_n(x) is the weight function for polynomial n at profile x. Each w_n(x) is of course individual for every polynomial and not global. The resulting curve p(x) can be seen in Figure 3.17, and in Figure 3.18 the same curve can be seen in millimetres. Also note that the curve is only weighted up to the last tip, since this is the last position known with high precision. With this curve it is finally possible to restore the scanned image to its original proportions.
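Equation (3.7) as a sketch, assuming the scaled curves and their weight windows have already been sampled over a common profile range (with weight 0 where a curve does not overlap); the cos² window mentioned above is one choice for the weights.

import numpy as np

def weighted_position_curve(curves, weights):
    """Blend the overlapping scaled curves into one position curve,
    per equation (3.7). curves and weights are lists of equally long
    float arrays indexed by profile."""
    num = np.zeros_like(curves[0], dtype=float)
    den = np.zeros_like(curves[0], dtype=float)
    for f, w in zip(curves, weights):
        num += w * f
        den += w
    return num / den  # den > 0 wherever at least one curve overlaps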

Figure 3.17. The resulting weighted curve p(x).

Figure 3.18. The resulting weighted curve expressed in real world millimetres.

3.9 Restoring the Image

With the position curve p(x) obtained, it is time to restore the proportions of the image. First, a fixed resolution needs to be chosen; this resolution relates to real world proportions. For example, a resolution of 1/10 of a millimetre means that for every 1/10 mm the profile corresponding to that value is found according to the curve q(X) in Figure 3.19, which is defined as

p(q(X)) = Xc,    (3.8)

where X is the real world coordinate and c is the number of encoder values per millimetre. In the setup used for the experiments c = 58.656.

Since the profiles lie in a discrete range, the corresponding profile is interpolated between the two closest profiles; Figure 3.22 further illustrates this. This operation is then performed from the lowest up to the last known real world coordinate, specified by q(X). Since the values of q(X) are relative to the first corner, the resulting profile number is added to the profile number of the first corner; see Example 3.2. Figures 3.20 and 3.21 show the original scan and the resulting image. This concludes the proposed method; its results are discussed in Chapter 5. The next chapter covers experiments for every part discussed here and also covers error measurements concerning the proposed method.

Example 3.2

Using the same q(X) as above, the first value is zero since it starts at the tip of the first wedge, and the tip of the last wedge in this case is at 128.75 mm. This is also the total length in mm of the resulting image. Specifying the resolution as 0.115 mm yields the following correspondence.

Real world coordinate (mm)    Corresponding profile
  0                           start + 0.1956
  0.1150                      start + 0.9439
  0.2300                      start + 1.6922
  0.3450                      start + 2.3965
  ...                         ...
  128.3400                    start + 423.8692
  128.4550                    start + 424.1553
  128.5700                    start + 424.4359
  128.6850                    start + 424.7165

where start in this case is equal to 24 and the corresponding profile is interpolated as described above.
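Continuing the sketch above, the correspondence table of Example 3.2 could be generated along the following lines (illustrative only; start = 24 and the end coordinate come from this particular experiment):

```matlab
% Build the Example 3.2 table: real-world coordinate vs. corresponding profile.
res   = 0.115;                      % resolution chosen in the example (mm)
start = 24;                         % profile number of the first corner
X     = 0:res:128.685;              % real-world coordinates up to the last tip
prof  = interp1(p, profiles, c*X);  % fractional profiles, relative to the curve
T     = [X(:), start + prof(:)];    % one row per table entry
```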


Figure 3.19. q(X), a function of real world coordinates instead of profiles. (Plot: real world X coordinate (millimetre) vs. profile.)

[Figure 3.20. The original scan of the pattern. Plot: x coordinate (profiles) vs. y coordinate (pixels).]


Figure 3.21. Pattern restored to its original shape. (Plot: x coordinate (profiles) vs. y coordinate (pixels).)

Figure 3.22. Interpolation between two profiles. The chosen resolution requires the profile corresponding to 57.5 mm (red dot), but there are only corresponding profiles for 57.47 and 57.84 mm. Linear interpolation is performed to get the corresponding profile at 57.5 mm. (Plot: real world X coordinate (millimetres) vs. profile; marked points at X = 57.47, profile 259 and X = 57.84, profile 260.)


Chapter 4

Experiments

4.1 Introduction

During the course of this thesis a lot of experimentation has been performed, both on synthetic data and on real data captured from the camera. This chapter covers all the experiments that were performed and their outcomes. The structure is similar to that of the chapter discussing the proposed method. The first section describes the purpose of having synthetic patterns and how they are generated. After that the corner and edge detection is covered, followed by the polynomials describing the edges, finding the corners with sub-pixel precision, interpolation of the x position of all profiles between the known corners, restoring the image, a skewed pattern or laser, and maximum and minimum velocities. The synthetic experiments are compared to their real counterparts.

Not all figures from the experiments are shown in this chapter; some can instead be found in the appendix or at the end of each section. Those shown here are meant to illustrate the key features of every part of the proposed method and the significant differences between the patterns.

The setup for the experiments consists of a camera, a laser, a movable assembly plate and a metal framework to hold everything together, shown in Figure 4.1. The assembly plate is pulled by hand through the laser. To make the scanning easier an IR sensor detects when the object is about to pass through the laser and activates the camera. With this setup it is somewhat troublesome to measure the velocity and express it in standard metric units; instead the velocity is represented in encoder values, using the information gathered from the physical encoder.

4.2 Accuracy Measurement

The synthetic experiments are the ones conducted on patterns and data created in MATLAB. By performing experiments purely in MATLAB, much more control over the input and output is gained. The experiments can be performed with or without noise, and every point and curve is known for control measurements.



Figure 4.1. The setup used for the experiments.

This is an important part of the development process, since it is easier to understand and interpret the results, and it provides a ground truth for the real experiments using the same pattern. The real experiments are of course the ones performed on the intensity images captured from the actual camera as the pattern moves under the laser. These experiments are more likely to contain noise and other irregularities that the synthetic experiments do not.

4.2.1 Generation of Synthetic Data

The generation of the synthetic data needs to be as accurate as possible in order to get good results. MATLAB can generate images with vector graphics, which would provide very exact results if they could be used as test data. The problem is that images in MATLAB are stored as M×N matrices, so the vector image would need to be rasterized, i.e. converted to pixels. A simple rasterisation process would just round off the exact position given by the vector to a discrete value. This simple but inexact process can be seen in Figure 4.2.

Instead of letting MATLAB rasterize the pattern, the rasterisation is performed by a supersampling script, allowing the process to be controlled. It works in a similar manner, but in each pixel that a vector line passes, the covered pixel area is calculated. Depending on how much of the pixel is covered, the grayscale value differs: a pixel entirely outside the shape is 0 and one entirely inside is 1. If a pixel has 1/9 of its area covered, its grayscale value is 1/9. With this technique the pattern gets an anti-aliasing effect



(a) Vector graphics. (b) Automatic rasterisation. (c) Controlled rasterisation.

Figure 4.2. Generation of synthetic patterns.

Figure 4.3. Intersection of a line and a pixel (marked in red). The intersecting line covers a ninth of the pixel; the color of that pixel will therefore be one ninth (assuming the scale is from 0 to 1).

and a much smoother, more realistic transition from foreground to background; see Figure 4.3.
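The following MATLAB sketch shows the idea of such controlled rasterisation. It is not the thesis script: instead of exact polygon clipping it approximates the covered area by testing a k × k subgrid per pixel, and the function handle inShape is an assumed interface for the vector shape.

```matlab
% Controlled rasterisation by supersampling: the grey value of each pixel is
% the fraction of its k-by-k sub-samples that fall inside the shape.
function img = rasterize(inShape, width, height, k)
    % inShape - vectorised predicate, inShape(x, y) -> logical array
    img  = zeros(height, width);
    offs = ((1:k) - 0.5) / k;          % sub-sample offsets within one pixel
    [sx, sy] = meshgrid(offs, offs);
    for r = 1:height
        for c = 1:width
            % pixel (r,c) covers [c-1,c] x [r-1,r] in image coordinates
            inside = inShape(c - 1 + sx, r - 1 + sy);
            img(r, c) = mean(inside(:));  % covered area fraction = grey value
        end
    end
end
```

A hypothetical test shape could then be rasterised with, e.g., img = rasterize(@(x,y) y >= 0 & y <= x & y <= 100 - x, 100, 100, 8);.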

4.2.2 Corner Detection

The technique for detecting corners is explained in Section 2.3. In Figure 4.4 the synthetically generated triangle can be seen along with the result of the Harris operator. The extracted corners after performing NMS are shown in Figure 4.5. NMS is not strictly necessary for the synthetic patterns since there is no noise; a simple threshold would be sufficient, but NMS provides a more principled way of finding the correct maxima. Even when NMS is used, a threshold is still needed to filter out maxima with too low a value. NMS is, however, necessary for finding the correct corners in the pattern obtained from the camera. That pattern



could contain noise, and the corners may not be as unambiguous as in the synthetic pattern.
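For reference, a minimal MATLAB sketch of the Harris response followed by NMS is shown below. The σ and filter size follow the values used in the experiments, while the Harris constant, the 5 × 5 NMS neighbourhood and the relative threshold are assumptions for the example; the thesis implementation may differ.

```matlab
% Harris corner response with non-maximum suppression (NMS).
% I is a grayscale image scaled to [0,1].
sigma = 0.9; fsize = 5; kappa = 0.04;
g = fspecial('gaussian', fsize, sigma);
[Ix, Iy] = gradient(double(I));
% Structure tensor: smoothed outer products of the image gradient.
A = imfilter(Ix.^2,  g);
B = imfilter(Iy.^2,  g);
C = imfilter(Ix.*Iy, g);
H = (A.*B - C.^2) - kappa*(A + B).^2;   % Harris response
% NMS: keep pixels that are the maximum of their 5x5 neighbourhood and
% exceed a threshold relative to the strongest response.
localMax = (H == imdilate(H, ones(5)));
corners  = localMax & (H > 0.01 * max(H(:)));
[rows, cols] = find(corners);           % integer corner candidates
```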

(a) Original image. (b) Harris response.

(c) Harris response, tip of the triangle.

(d) Harris response, bottom-right of the triangle.

(e) Harris response, bottom-left of the triangle.

Figure 4.4. Harris operator applied to a synthetically generated triangle, σ = 0.9, filter size 5 × 5. Note that the triangle has been rotated for convenience.

(a) Corner after NMS, tip of the triangle.

(b) Corner after NMS, top of the triangle.

(c) Corner after NMS, bottom of the triangle.

Figure 4.5. NMS applied to the Harris response for the synthetically generated triangle.

When using the Harris operator on the corresponding real images taken from the camera, the response depends on the velocity of the assembly plate when the image was captured. If the plate moves at a constant and rather slow speed, the edges will be straight and the Harris response will look very much like the synthetic one. The image in Figure 4.6 was captured at an approximately constant speed, and the Harris operator performs equally well on the synthetic and the real pattern. Found corners have a disparity of 2–3 pixels from their true positions,



which is good enough as a starting point.

(a) Original image.

(b) Harris response.

(c) Harris response, top of the triangle.

(d) Harris response, bottom-right of the triangle.

(e) Harris response, bottom-left of the triangle.

Figure 4.6. Harris operator applied to a scanned triangle, σ = 0.9, filter size 5 × 5. Note that the triangle has been rotated for convenience.

The arrow pattern described in Section 3 is better than the triangles in several respects. One problem that becomes noticeable when applying the Harris operator, however, is that the first two corners, Figures 4.7(c) and 4.7(d), have a much lower response than the rest. This is because the angle of these corners is much greater than that of the others. In this case a lower acceptance threshold is needed, which also introduces the risk of accepting noise as corners. This is one of the main reasons why a new iteration of the pattern was made.

According to the experiments, σ should be chosen between 0.9 and 1.5 to give good responses. A lower value gives a sharper, crisper response while a larger one results in a much smoother response. For the patterns where the corners are far apart the choice is rather arbitrary, but for the SMA pattern σ should be chosen towards the lower end of the interval; otherwise the corner responses might overlap, making them hard to separate.

4.2.3 Feature Mapping

The feature mapping process became better with every new parameter introduced. Just checking the angles goes a long way, but there is no way to know whether the three features matched together represent the front or the back wedge of the pattern (Figure 3.5). It also runs into problems when a corner goes missing, e.g. when the Harris response is too low or the pattern travels at such a speed that it becomes too compressed. This results in the wrong corners being matched together, as in the feature mapping shown in Figure 4.8(a). However,
