Detection of interesting areas in images by using convexity and rotational symmetries

Department of Science and Technology (Institutionen för teknik och naturvetenskap)

Linköpings Universitet

SE-601 74 Norrköping, Sweden

Linda Karlsson

Master's thesis in image processing

at Linköpings Tekniska Högskola, Campus Norrköping

Supervisor: Astrid Lundmark

Examiner: Björn Kruse

Report category: Examensarbete (degree project), D-uppsats (D-level thesis)

Language: English

Title: Detection of interesting areas in images by using convexity and rotational symmetries.

Author: Linda Karlsson

Abstract: There are several methods available to find areas of interest, but most fail at detecting such areas in cluttered scenes. In this paper two methods are presented and tested from a qualitative perspective. The first is the darg operator, which is used to detect three dimensional convex or concave objects by calculating the derivative of the argument of the gradient in one direction for four rotated versions of the image. The four results are thereafter added together in their original orientation. A multi scale version is recommended, since the standard deviation of the Gaussians that are combined with the derivatives otherwise controls the scale of the objects that are detected.

Another feature detected in this paper is rotational symmetries, which are found with the help of approximative polynomial expansion. This approach is used in order to minimize the number and sizes of the filters used when a representation of the orientation is correlated with filters matching rotational symmetries of order 0, 1 and 2. With this method a particular type of rotational symmetry can be extracted by using both the order and the orientation of the result. To improve the method's selectivity a normalized inhibition is applied to the result, which causes a much weaker response in the two other symmetry orders when one is high.

Neither method is sufficient by itself to give a definite answer as to whether the image contains an area of interest, since several other things have these types of features. They can on the other hand give an indication of where in the image the feature is found.

ISRN: LITH-ITN-MT-EX--02/31--SE

Keywords: Detection, convexity, shape from shading, rotational symmetries, polynomial expansion, normalized inhibition

URL for electronic version: www.ep.liu.se/exjobb/

Abstract

There are several methods available to find areas of interest, but most fail at detecting such areas in cluttered scenes. In this paper two methods are presented and tested from a qualitative perspective. The first is the darg operator, which is used to detect three dimensional convex or concave objects by detecting zero-crossings in the argument of the gradient. The detection is performed on four rotated versions, which are added together into one operator. A multi scale version is recommended, since the standard deviation of the Gaussian lowpass filters included in the derivative filters otherwise controls the scale of the objects that are detected.

Another feature investigated in this paper is rotational symmetries, which are found with the help of approximative polynomial expansion. This approach is used in order to minimize the number and sizes of the filters used when a representation of the orientation is correlated with filters matching rotational symmetries of order 0, 1 and 2. With this method a particular type of rotational symmetry can be extracted by using both the order and the orientation of the result. To improve the method's selectivity a normalized inhibition is applied to the result, which causes a much weaker response in two of the resulting symmetry order values when one is high.

Neither method is sufficient by itself to give a definite answer as to whether the image contains an area of interest, since several other things have these types of features. They can on the other hand give an indication of where in the image the feature is found.

Acknowledgements

I wish to express my gratitude to the following people, who in different ways have helped and supported me throughout the making of this Master's thesis.

• My supervisor at SAAB Bofors Dynamics, Astrid Lundmark.

• My examiner Björn Kruse at the Department of Science and Technology at Linköping University.

• The people at SAAB Bofors Dynamics, especially Leif Haglund.

• My opponent Marie Gunnarsson.

Contents

1 Introduction
1.1 Background
1.2 Problem specification
1.3 Outline of the paper

2 Basic concepts
2.1 Camouflage
2.2 Edge-based detection
2.2.1 Basic filters used in edge-based detection
2.2.2 Problems with edge-based detection
2.3 Double angle representation

3 Convexity/concavity detection
3.1 Introduction
3.2 The yarg operator
3.2.1 Definition of the argument of the gradient
3.2.2 The argument of the gradient of a paraboloid
3.2.3 Definition of the yarg operator
3.2.4 What is detected by the yarg operator?
3.2.5 Dependence on orientation
3.3 darg: A quasi-isotropic form of yarg
3.3.1 Properties

4.1 Definition of a rotational symmetry
4.2 Detection by correlation
4.3 Polynomial expansion
4.3.1 Local polynomial expansion using normalized convolution
4.3.2 Approximative local expansion
4.4 Normalized inhibition
4.5 Single scale detection
4.6 Multi scale detection

5 Implementation
5.1 The darg operator
5.1.1 The Gaussian filters
5.1.2 A multi scale version of darg
5.2 Rotational symmetries

6 Results and discussion
6.1 The darg operator
6.1.1 Test images
6.1.2 Planar objects
6.1.3 Standard deviation σ2
6.1.4 Noise
6.1.5 Non-linearities
6.1.6 Threshold
6.2 The multi scale version of darg
6.2.1 Threshold
6.2.2 The standard deviation σ2
6.2.4 Detection of the whole area of interest
6.2.5 Noise
6.2.6 Enlargement of image
6.3 Rotational symmetries
6.3.1 Test images
6.3.2 Detection by approximative polynomial expansion
6.3.3 Normalized inhibition
6.3.4 Planar objects compared to three dimensional convex objects
6.3.5 Non-linearities
6.3.6 Noise

7 Conclusions
7.1 The darg operator
7.2 Rotational symmetries
7.3 Recommendations for future work

A Invariance of illumination

List of Figures

2.1 Double angle representation
2.2 A circle and star pattern with double angle representation
3.1 Illumination by directed light
3.2 Intensity function of a paraboloid
3.3 Gradient of an intensity function
3.4 The gradient of a paraboloid
3.5 Argument of gradient
3.6 yarg of a paraboloid
3.7 Zero crossing
3.8 Gradient of an edge
3.9 Gradient of a paraboloid
3.10 yarg's response for cylinders
3.11 αarg
3.12 darg of a paraboloid
3.13 Counter shading
3.14 Counter-shading among animals
3.15 darg of a counter-shaded cylinder
4.1 Local orientation of a circle and a star pattern
4.2 Rotational symmetries of order n = 0, 1, 2, ...
4.3 Examples of rotational symmetries which are not of the n:th order symmetries
4.5 Correlator structure for two-dimensional polynomial expansion
4.6 Correlator structure for approximative polynomial expansion
4.7 Two star patterns which cover different amounts of area
4.8 Filter kernels
4.9 A single scale and a multi scale correlator structure
5.1 A summary of the darg operator
5.2 The arc-tan function
5.3 σ1 = σ2 vs. σ1 ≠ σ2
5.4 A summary of calculating rotational symmetries
6.1 Test images I
6.2 Test images II
6.3 darg for variations of σ2 in testimage.m I
6.4 Different amounts of noise added to doe_hanford_mod
6.5 Different amounts of noise added to doe_hanford_mod
6.6 darg applied to transformations of an image I
6.7 darg applied to transformations of an image II
6.8 darg applied to transformations of an image III
6.9 Definition of colors for representing percentage values
6.10 Thresholds for darg of different σ2 of forest_tank
6.11 Thresholds for darg of different σ2 of doe_hanford_mod
6.12 Multi scale darg for different σ2's applied on forest_tank
6.13 Multi scale darg for different σ2's applied on doe_hanford_mod
6.14 Different sizes of the Gaussian applied on forest_tank
6.15 Different sizes of Gaussian applied on doe_hanford_mod
6.17 Multi scale darg applied on noisy versions of doe_hanford_mod
6.18 Multi scale darg applied on sampled and enlarged images
6.19 Test images for rotational symmetries
6.20 Test image ball
6.21 Color reference image
6.22 The rotational symmetries s_n of star and circles
6.23 s_n with and without normalized inhibition of circle_and_cross
6.24 Rotational symmetries for five different σ
6.25 s_0^i and s_1^i calculated from transformed images
6.26 s_2^i calculated from transformed images
6.27 s_0^i calculated from transformed images for different γ's
6.28 A part of forest_tank with noise applied to it
6.29 Rotational symmetries of a part of forest_tank
6.30 s_0^i of a part of forest_tank subjected to noise. Scale 1 and 2
6.31 Scale 3 of s_0^i and scale 1 of s_1^i for a part of forest_tank subjected to noise
6.32 s_1^i of a part of forest_tank subjected to noise. Scale 2 and 3
6.33 s_1^i of a part of forest_tank subjected to noise. Scale 3 and 4
6.34 s_2^i of a part of forest_tank subjected to noise. Scale 1 and 2

List of Tables

6.1 darg for different σ2 in forest environment
6.2 darg for different σ2 in a city environment
6.3 darg for different amounts of noise added to forest_tank
6.4 darg for images with different noises added to sotacvistir
6.5 Different structures of the multi scale darg for forest_tank
6.6 Different structures of the multi scale darg for doe_hanford_mod
B.1 darg for images with different noises added to doe_hanford_mod I
B.2 darg for images with different noises added to doe_hanford_mod II
B.3 Ranking and Misses/Hits for original doe_hanford_mod

1 Introduction

1.1 Background

In all types of digital communication some form of information is transported over a medium. When the information consists of a series of images, the large quantity of information creates a problem. The amount of information can be decreased with the help of compression, but there is a limit to how much an image can be compressed, since information is lost in the compression process. The demand for quality at the receiver end puts a restraint on the amount of compression which can be used. In a military surveillance application it is possible to decrease the amount of information which needs to be transmitted by analyzing which types of information the other party must receive. Most of the information captured by the surveillance equipment is not of interest, and it is therefore only desirable to transmit the image when it contains something of interest. In other words, the image is only transmitted when an area of interest within the image is detected.

Since it is reasonable to assume that an area of interest will be present in a series of images it is also desirable to improve the compression of these images. This can be achieved by compressing the areas of the image which are not classified as an area of interest harder than the actual areas of interest.

The detection of areas of interest is a complex issue with several problems and a variety of methods dealing with them. Depending on the background and the appearance of objects, the rate of success can be rather different from one method to another. One issue which is considered problematic is finding objects in a cluttered scene, or in other words a scene consisting of a multitude of objects. Another related issue is when the object itself is camouflaged.

1.2 Problem specification

The aim of this paper is to present and evaluate different methods of feature detection in order to find objects of interest in complex images. Two different methods will be analyzed. The first is a context free method of detecting three dimensional convex surfaces, using the intensity information of the image. The second is a method of detecting rotational symmetries. The methods are evaluated from a qualitative perspective on single images.

1.3 Outline of the paper

Chapter 2: The concepts of camouflage, edge-based detection and double angle representation are introduced. The information in this chapter can be considered as background information.

Chapter 3: Presents the darg operator, which is a method for detecting three dimensional convex or concave objects.

Chapter 4: The concept of n:th order rotational symmetries is introduced and a method using approximative polynomial expansion to detect rotational symmetries is presented. The selectivity of the method is improved by normalized inhibition. A multi scale version of the method is presented at the end of the chapter.

Chapter 5: The implementation of the methods presented in chapters 3 and 4 is discussed. The chapter contains a summary of the two methods along with a discussion about the Gaussian filters used in the calculation of darg and an introduction of a multi scale version of the darg operator.

Chapter 6: Presents the outcome of the tests.

Chapter 7: A conclusion based on the theory and the test results is found in this chapter.

Appendix A: A theorem which proves the invariance to illumination of the darg operator.

Appendix B: Tables containing information about the effect of different amounts of noise on darg.

2 Basic concepts

In this chapter basic terms and areas of interest are defined, which will later be used in the methods of detection used in this Master’s thesis.

2.1 Camouflage

Animals and humans use camouflage as a way to conceal themselves from unwanted surveillance. This can appear in several different forms. The most common is in the case of animals, the prey hiding from a predator, or in the case of human-made camouflage, a soldier avoiding detection from the enemy. In order to achieve this objective, several methods can be used such as blending into the background or hiding in an environment which contains a large amount of distracting objects.

In this thesis, objects subjected to human-made camouflage are to be detected. In this case there are numerous ways of attempting to blend into the environment. Usually this is achieved by using uniforms or covers with patterns that imitate the distribution of colors and edges found in the surroundings. Another way of camouflaging is to use material from the surroundings, for example by covering the object with branches from a tree.

2.2 Edge-based detection

2.2.1 Basic filters used in edge-based detection

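In an image an edge can be defined as a thin area where a sharp intensity transition is present. Since a sharp transition causes a high response in the derivative perpendicular to the edge, the gradient of the image can be used as a simple edge detector. The partial derivatives of I(x, y), which is the intensity function of the image, can be approximated by the filters in eq. 2.1.

$$\frac{\partial}{\partial x} \approx d_x = \begin{pmatrix} -1 & 0 & 1 \end{pmatrix}, \qquad \frac{\partial}{\partial y} \approx d_y = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} \tag{2.1}$$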

This type of filter can be extended into other types of edge detection filters. The Sobel filter in eq. 2.3, for example, is a combination of the gradient filters in eq. 2.1 and a mean-value filter like the one in eq. 2.2.

$$M_{\text{horizontal}} = \frac{1}{4}\begin{pmatrix} 0 & 0 & 0 \\ 1 & 2 & 1 \\ 0 & 0 & 0 \end{pmatrix} \tag{2.2}$$

$$S_{0^\circ} = \frac{1}{4}\begin{pmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{pmatrix} \tag{2.3}$$

By rotating the horizontal Sobel filter in eq. 2.3, Sobel filters for vertical and oblique directions can be created.
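The relation between eq. 2.1, 2.2 and 2.3 can be illustrated with a minimal NumPy sketch. The variable names are chosen here for illustration only, and the separable construction is one way of reading "combination", not necessarily how the filters were built in the thesis.

```python
import numpy as np

# Derivative filter d_y from eq. 2.1 (a column) and the smoothing row
# [1 2 1] / 4 that forms the middle row of eq. 2.2.
d_y = np.array([[-1.0], [0.0], [1.0]])
smooth_x = np.array([[1.0, 2.0, 1.0]]) / 4.0

# The outer product of the two parts reproduces the 3x3 filter of eq. 2.3.
sobel_0 = d_y @ smooth_x

# Rotating the kernel by 90 degrees gives the filter for the other axis.
sobel_90 = np.rot90(sobel_0)

print(sobel_0 * 4)
print(sobel_90 * 4)
```

Oblique versions would need a 45 degree rotation of the kernel, which is not covered by `np.rot90` and is left out of this sketch.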

There are several other operators used for edge detection which originate from the gradient, and Ohm [5] recommends the Laplacian of Gaussian operator. This operator consists of a Laplacian operator combined with a two dimensional Gaussian. The Laplacian operator itself is created from the mathematical function ∂²I/∂x² + ∂²I/∂y², where I is the intensity function. The operator can be calculated either by convolving the result of the gradient with a gradient filter, or by convolving the gradient filter with itself and then applying the result to the original image.
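The equivalence of the two routes can be checked in one dimension with a small sketch. The test signal is a hypothetical example, and only the x part of the Laplacian is shown; the y part would be handled in the same way.

```python
import numpy as np

d = np.array([-1.0, 0.0, 1.0])   # 1-D derivative filter from eq. 2.1

# Convolving the derivative filter with itself gives a second-derivative filter.
d2 = np.convolve(d, d)

# Hypothetical test signal I(x) = x^2 sampled at integer x, so I''(x) = 2.
signal = np.arange(10, dtype=float) ** 2

# Route 1: apply the derivative filter twice.
twice = np.convolve(np.convolve(signal, d), d)
# Route 2: apply the combined filter once.
once = np.convolve(signal, d2)

print(d2)
print(once[4:10])
```

Because the filter [-1 0 1] spans two pixels, the combined filter measures four times the second derivative, so the interior samples come out as 8 rather than 2. Values closer to the ends of the output are affected by the signal boundary.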

2.2.2 Problems with edge-based detection

As camouflage often means hiding the edges of an object, by adding many strong extra edges or by making the natural edges less apparent relative to the surroundings, edge-based detection is often unsuccessful. The main reason is that a convex object has a convex intensity function due to the illumination of the sun, so its edges tend not to be very sharp because of the gradual change in intensity. This fact, combined with the presence of several other stronger edges, leads to detection of the wrong edges.

One example of this is a soldier wearing a uniform with a pattern resembling the surrounding forest. When detecting edges, the edges created by the pattern and the edges in the surroundings will most likely give a stronger response than the actual edges of the soldier. Another example is camouflage adapted to surroundings with a simple structure, such as a white camouflage in an area covered with snow. Since the soldier now has the same color as the background, the actual edges will be rather weak and most likely there will be edges in the surroundings which are much stronger.

Related to the problem with detection of camouflaged objects is detection of an object in a complex environment. An object in a forest is for example difficult to detect even without camouflage because the large amount of strong edges present makes it difficult to find one particular edge.

2.3 Double angle representation

When representing the local orientation of an image, there are several courses of action which could be taken. The classical approach is to produce a two dimensional vector pointing in the dominant direction with a magnitude equal to the energy in the dominant direction, such as the image gradient.

In the calculation of the rotational symmetries in chapter 4 the double angle representation, which uses the information in the gradient, will be used. The representation can be defined as a vector or complex number z which points in the 2θ-direction when the direction of orientation is θ, as is illustrated in fig. 2.1.

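Figure 2.1: Double angle representation. The complex number z, drawn in the plane spanned by Re z and Im z, has the argument 2θ.

The image gradient is first determined with the help of a differentiated Gaussian filter.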

$$\nabla I(x, y) \approx \big([d_x * g_y] * I(x, y),\; [g_x * d_y] * I(x, y)\big) \tag{2.4}$$

where g_t is a one dimensional Gaussian in the variable t with zero mean and standard deviation σ, and d_t is the derivative of g_t.

When the gradient has been determined, the next step is to calculate the actual double angle representation z for each pixel.

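$$z = \left(I_{x,\sigma}^2 + I_{y,\sigma}^2\right)^{\gamma} e^{2i \arctan(I_{y,\sigma}/I_{x,\sigma})} \tag{2.5}$$

where $I_{x,\sigma} \approx \frac{\partial}{\partial x} I(x, y)$ and $I_{y,\sigma} \approx \frac{\partial}{\partial y} I(x, y)$. The energy sensitivity is controlled by the parameter γ.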
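A minimal NumPy sketch of eq. 2.4 and 2.5 is given below. The helper names, the truncation of the Gaussian at 3σ, the replication of border pixels and the default values of σ and γ are all assumptions made for this illustration and are not taken from the thesis.

```python
import numpy as np

def gaussian_pair(sigma):
    """Sampled 1-D Gaussian g_t and its derivative d_t (truncated at 3*sigma, an assumption)."""
    radius = max(1, int(np.ceil(3 * sigma)))
    t = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-t**2 / (2 * sigma**2))
    g /= g.sum()
    return g, -t / sigma**2 * g

def filter_1d(image, kernel, axis):
    """Convolve along one axis, replicating border pixels (an assumed border rule)."""
    r = len(kernel) // 2
    pad = [(0, 0), (0, 0)]
    pad[axis] = (r, r)
    padded = np.pad(image, pad, mode="edge")
    return np.apply_along_axis(np.convolve, axis, padded, kernel, mode="valid")

def gradient(image, sigma):
    """Eq. 2.4: x runs along columns (axis 1), y along rows (axis 0)."""
    g, d = gaussian_pair(sigma)
    ix = filter_1d(filter_1d(image, d, 1), g, 0)
    iy = filter_1d(filter_1d(image, g, 1), d, 0)
    return ix, iy

def double_angle(image, sigma=1.0, gamma=0.5):
    """Eq. 2.5. arctan2 replaces arctan(Iy/Ix) to avoid division by zero;
    after doubling, the two angles agree up to a multiple of 2*pi."""
    ix, iy = gradient(image, sigma)
    return (ix**2 + iy**2) ** gamma * np.exp(2j * np.arctan2(iy, ix))

# Two ramps whose gradients point in opposite directions.
_, x = np.mgrid[0:21, 0:21].astype(float)
z_up = double_angle(x)
z_down = double_angle(-x)
print(abs(z_up[10, 10] - z_down[10, 10]))  # close to zero
```

The two ramps have gradients in the directions θ and θ + π, and the printed difference illustrates that both are mapped to the same descriptor z.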

Compared with the image gradient, the double angle representation has at least two advantages.

1 Ambiguities in the representation are avoided, since an orientation with the direction θ and another with the direction θ + π are represented by the same descriptor. In other words, two gradients which point in opposite directions have the same representation, as can be seen in fig. 2.1.

2 According to Granlund and Knutsson in [2] maximally different or maximally incompatible patterns should be maximally different in their representation. This is achieved when the double angle representation is used.

As an example of why the last advantage is important in the case of rotational symmetries, which are defined in section 4.1, a symmetry in the form of a circle is used. The pattern most incompatible with the circle is a star, since the direction of the gradient of a star is always orthogonal to the direction of the gradient of the circle. Therefore, with the double angle representation a star can be described in the same manner as a circle, but with π added to the phase. This change of π in the phase causes the vectors to point in the opposite direction.
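The phase shift can be written out as a short worked equation. It is assumed here, for illustration only, that φ is the polar angle of a pixel relative to the center of the pattern, that the gradient of the circle pattern is radial and that of the star pattern tangential, and that the magnitude of z is normalized to one.

$$\theta_{\text{circle}} = \varphi, \qquad \theta_{\text{star}} = \varphi + \frac{\pi}{2}$$

$$z_{\text{circle}} = e^{2i\varphi}, \qquad z_{\text{star}} = e^{2i(\varphi + \pi/2)} = e^{i\pi} e^{2i\varphi} = -z_{\text{circle}}$$

The orthogonal gradient directions thus become opposite vectors after the angle is doubled, which is the maximal difference asked for in the second advantage.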

3 Convexity/concavity detection

3.1 Introduction

In military applications, mostly vehicles, humans or buildings are to be detected. The one property that is present in all these types of objects is that they consist of three dimensional volumes and in many cases parts of the object or the whole object can be considered to be at least partially three dimensional convex or concave. This can be detected with the help of a shape from shading algorithm, which detects shape with the help of the intensity function of the image.

In order to detect convexity or concavity, it must first be determined how these properties appear in an image. In outdoor environments all objects are more or less subjected to illumination from one major light source, the sun. The sun can be described as a directed point light source which, because of its distance from the earth, has parallel rays. Other forms of lighting can also exist, such as reflections or light sources independent of the sun. A cylinder illuminated from above can be seen in fig. 3.1.

Figure 3.1: The cylinder is illuminated by a directed light, which results in a convex intensity function.

When a convex object is illuminated in this way, its intensity function takes the form of a convex intensity function. According to the Lambertian reflection model the intensity function has its peak where the surface normal of the object points in the direction of the incoming light, and it gradually darkens as the difference between the direction of the incoming light and the surface normal becomes larger. A concave object which is illuminated in the same manner as the convex object also has a convex intensity function. An example of this is a hole in the ground, which would appear light in the middle and dark around the edges.

In the ideal case the three dimensional concave intensity function can be represented by the function of a paraboloid, and the corresponding convex intensity function by the same paraboloid with reversed sign:

$$f(x, y) = a(x - \varepsilon)^2 + b(y - \eta)^2 \tag{3.1}$$

where a > 0 and b > 0 are constants which determine the shape of the paraboloid and (ε, η) are the coordinates of the center of the paraboloid.

Figure 3.2: Intensity function of a paraboloid when a and b have the proportions 1 : 3 to one another.

An ideal three dimensional convex intensity function can thus be represented by the same function with a < 0, b < 0 as the only difference.

One other important issue to consider is that objects are often only partially convex, or concave, which implies that partially convex objects should also be detected.

3.2 The yarg operator

In the article by Tankus [7] a shape from shading algorithm in the form of an operator, darg, is defined, which is expected to detect three dimensional convex intensity functions. As a first step in describing the darg operator, the yarg operator will be defined, which is later applied as a part of the darg operator.

3.2.1 Definition of the argument of the gradient

In ordinary edge-based detectors, the magnitude of the gradient is used as a basis of detection, but this operator uses the argument instead. The argument of the gradient can be interpreted as the orientation of the gradient in a particular pixel.

The two dimensional gradient map of the intensity function I(x, y) can be defined as:

$$\nabla I(x, y) = \left(\frac{\partial}{\partial x} I(x, y),\; \frac{\partial}{\partial y} I(x, y)\right) \tag{3.2}$$

Since an image consists of discrete values the gradient map can be approximated by:

$$\nabla I(x, y) \approx \big([d_x * g_y] * I(x, y),\; [g_x * d_y] * I(x, y)\big) \tag{3.3}$$

where g_t is a one dimensional Gaussian with zero mean and standard deviation σ1, and d_t is the derivative of g_t. An example can be found in fig. 3.3, where the gradient calculated according to eq. 3.3 is displayed.

Figure 3.3: a. Intensity function. b. ∂I/∂x. c. ∂I/∂y. Both derivatives are calculated using σ = 0.4.

From these Cartesian coordinates a transform into polar coordinates is necessary in order to determine the orientation of the gradient.

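$$\theta(x, y) = \arg(\nabla I(x, y)) = \arctan\left(\frac{\partial}{\partial y} I(x, y),\; \frac{\partial}{\partial x} I(x, y)\right) \tag{3.4}$$

The two dimensional arctan, which takes the y component as its first argument, is in this case defined by the following function:

$$\arctan(y, x) = \begin{cases} \arctan(y/x) & x \ge 0 \\ \arctan(y/x) + \pi & x < 0,\ y \ge 0 \\ \arctan(y/x) - \pi & x < 0,\ y < 0 \end{cases} \tag{3.5}$$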
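The piecewise definition in eq. 3.5, read with the y component as the first argument as in the call in eq. 3.4, coincides with the two-argument arctangent found in most numerical libraries. The sketch below compares a direct transcription with NumPy's `np.arctan2`. The function name `arctan_2d` and the separate handling of x = 0, where the quotient y/x is undefined, are choices made for this illustration.

```python
import numpy as np

def arctan_2d(y, x):
    """Two-argument arctan following eq. 3.5, with x == 0 handled as a limit."""
    if x > 0:
        return np.arctan(y / x)
    if x < 0:
        return np.arctan(y / x) + (np.pi if y >= 0 else -np.pi)
    # x == 0: take the limiting value instead of dividing by zero.
    return np.sign(y) * np.pi / 2

points = [(1, 1), (1, -1), (-1, -1), (-1, 1), (1, 0), (-1, 0), (0, -1)]
values = [arctan_2d(y, x) for y, x in points]
reference = [np.arctan2(y, x) for y, x in points]
print(np.allclose(values, reference))
```

In an implementation the library function can therefore be applied directly to the two gradient components.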

3.2.2 The argument of the gradient of a paraboloid

Since the objects to be detected are three dimensional convex or concave, the argument of the gradient of an intensity function in the shape of a paraboloid is of interest. The gradient of the paraboloid is needed to determine the argument.

Figure 3.4: The gradient of a paraboloid in a. the x direction. b. the y direction.

When the gradient is known, the next step is to apply the previously defined two dimensional argument of the gradient. Using eq. 3.4 and eq. 3.5 produces a result as in fig. 3.5.

In the case of a paraboloid the following can be noticed (see also fig. 3.5):

1 Along each radial ray emerging from the center of the paraboloid, which in this case is considered to be the origin, the value of the argument is constant.

2 If the negative x-axis is used as a reference, the value of the argument increases gradually from −π to π when the angle from the present ray to the negative x-axis is increased.

3 A discontinuity is present along the negative x-axis, where the argument makes a jump from π to −π.

Figure 3.5: Argument of the gradient of the paraboloid, calculated from eq. 3.4 and eq. 3.5.
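Properties 1 and 3 can be verified from eq. 3.1 and eq. 3.4 with a short worked derivation. The polar coordinates (r, φ) around the center (ε, η) are symbols introduced here only for illustration, with x − ε = r cos φ and y − η = r sin φ.

$$\nabla f = \big(2a(x - \varepsilon),\; 2b(y - \eta)\big) = 2r\,(a\cos\varphi,\; b\sin\varphi)$$

$$\theta = \arctan(2rb\sin\varphi,\; 2ra\cos\varphi) = \arctan(b\sin\varphi,\; a\cos\varphi)$$

The two dimensional arctan is unchanged when both arguments are scaled by the same positive factor, so θ does not depend on r and is constant along each ray. In the special case a = b > 0 the result is θ = φ with φ in (−π, π], which jumps between π and −π exactly where the ray passes the negative x-axis.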

The last property, discontinuity, is the most important feature of the argument of the paraboloid. In most other applications discontinuities are usually an unwanted trait, which is to be avoided, but in this case the operator is based on such a discontinuity ray.

3.2.3 Definition of the yarg operator

In order to detect the discontinuity along the negative x-axis a derivative in the y direction is used. The Gaussian used for this derivative has the standard deviation σ2, which need not be the same as σ1.

$$y_{\mathrm{arg}} = \frac{\partial}{\partial y}\theta(x, y) \approx [G_{\sigma}(x) * D_{\sigma}(y)] * \theta(x, y) \tag{3.6}$$

Here G_σ is a one dimensional Gaussian and D_σ its derivative, with σ = σ2.

This function will theoretically yield an infinite response to a paraboloid along the negative x-axis. In practice, however, the response is finite but strong, which can be seen in fig. 3.6.

As seen in fig. 3.6, the difference between the detection of three dimensional convex and concave paraboloids is that one gives a high negative response and the other a high positive response. The response takes the same form in both cases, except that for concave objects the ray is positioned along the negative x-axis, while for convex objects it is positioned along the positive x-axis. The ray appears along the part of the x-axis where the derivative with respect to x is negative.
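The behaviour described above can be reproduced with a minimal NumPy sketch of eq. 3.3, 3.4 and 3.6 applied to a synthetic paraboloid. The helper names, the 3σ truncation, the replicated borders, the image size and the values σ1 = σ2 = 1 are assumptions made for this illustration. Which of the two paraboloids gives the positive response depends on the direction chosen for the y axis, so only the position of the ray and the opposite signs are examined.

```python
import numpy as np

def gaussian_pair(sigma):
    """Sampled 1-D Gaussian and its derivative (truncated at 3*sigma, an assumption)."""
    radius = max(1, int(np.ceil(3 * sigma)))
    t = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-t**2 / (2 * sigma**2))
    g /= g.sum()
    return g, -t / sigma**2 * g

def filter_1d(image, kernel, axis):
    """Convolve along one axis, replicating border pixels (an assumed border rule)."""
    r = len(kernel) // 2
    pad = [(0, 0), (0, 0)]
    pad[axis] = (r, r)
    padded = np.pad(image, pad, mode="edge")
    return np.apply_along_axis(np.convolve, axis, padded, kernel, mode="valid")

def yarg(image, sigma1=1.0, sigma2=1.0):
    """x runs along columns (axis 1) and y along rows (axis 0)."""
    g1, d1 = gaussian_pair(sigma1)
    ix = filter_1d(filter_1d(image, d1, 1), g1, 0)    # eq. 3.3, x component
    iy = filter_1d(filter_1d(image, g1, 1), d1, 0)    # eq. 3.3, y component
    theta = np.arctan2(iy, ix)                         # eq. 3.4
    g2, d2 = gaussian_pair(sigma2)
    return filter_1d(filter_1d(theta, g2, 1), d2, 0)   # eq. 3.6

def strongest(response):
    """Row, column and value of the response with the largest magnitude."""
    row, col = np.unravel_index(np.argmax(np.abs(response)), response.shape)
    return int(row), int(col), float(response[row, col])

n, c = 41, 20
y, x = np.mgrid[0:n, 0:n].astype(float)
concave = (x - c) ** 2 + (y - c) ** 2   # eq. 3.1 with a = b = 1
convex = -concave                       # same paraboloid, reversed sign

peak_concave = strongest(yarg(concave))
peak_convex = strongest(yarg(convex))
print(peak_concave, peak_convex)
```

In this sketch the strongest response to the concave paraboloid lies on the center row to the left of the center, and the strongest response to the convex paraboloid lies to the right of the center with the opposite sign.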

Figure 3.6: yarg of a. a concave paraboloid b. a convex paraboloid

3.2.4 What is detected by the yarg operator?

If the appearance of the paraboloid is considered and compared to that of an ordinary edge, two observations concerning the gradient can be made.

1 If the edge is perpendicular to one direction, a zero-crossing in the gradient in this direction will be found in the shape of a line following the edge. A zero-crossing of a function can be defined as a point where the function is zero and has negative values on one side and positive values on the other, as can be seen in fig. 3.7.

Now considering the paraboloid instead, such a zero-crossing in the gradient in a given direction will always be present. The paraboloid can be rotated and the zero-crossing will still be found. The paraboloid can therefore be considered as an edge which crosses the center of the paraboloid, when viewed from an arbitrary direction.

2 The gradient of the paraboloid differs from that of an edge in its much smoother appearance. This is because the intensity function of a paraboloid is much smoother than the steep slopes of the intensity function around an edge.


Figure 3.8: a. Original image. Gradient of an edge with respect to b. x and c. y.

Figure 3.9: a. Original image. Gradient of a paraboloid with respect to b. x and c. y.

In fig. 3.8 and fig. 3.9 the values of the gradients are not equally scaled because the difference in the max values of an edge and a paraboloid are so large that the gradient of the paraboloid in that case would appear to be flat.

To summarize, zero-crossings are to be detected, and the instrument used for this is the argument of the gradient. The discontinuity ray along the negative x-axis is an indication of a straight line of zero-crossings of the gradient in the y direction.

3.2.5 Dependence on orientation

In the example of an ideal three dimensional convex intensity function such as a paraboloid, the yarg operator is almost independent of orientation. In cases where the shape is more oval than circular, in other words a ≠ b, there is a slight difference due to the steepness of the slope of the gradient.

When objects which are only partially convex are to be detected, the orientation creates a problem. An example of such an object is a cylinder, which is only convex in one direction.

The strongest response comes from the horizontal cylinder, which has the same gradient in the y direction as the corresponding ideal convex function. For the vertical cylinder this gradient is almost flat, with a rather weak response at the top and bottom edges. The response to an oblique cylinder is weaker than that to a horizontal cylinder, but stronger than that to a vertical cylinder.

Figure 3.10: yarg's response for cylinders with different orientations. a. horizontal b. oblique c. vertical

3.3 darg: A quasi-isotropic form of yarg

Since yarg is dependent on rotation, an isotropic operator would be a more appropriate choice. This also remedies the difference between concave and convex objects, which both give a strong response but along different parts of the x-axis.

To create an isotropic operator it is necessary to begin by calculating rotated versions of the image, to which the yarg operator is applied. The result of these calculations is referred to as αarg, where α denotes the rotation angle. The different αarg's can be produced in three steps.

1 Rotate the original image by π − α counterclockwise.

2 Calculate yarg for the rotated version.

3 Rotate the result back to the original angle by rotating it by α − π counterclockwise.

Figure 3.11: a. αarg of a paraboloid for α = 0°. b. αarg of a paraboloid for α = 90°. c. αarg of a paraboloid for α = 180°. d. αarg of a paraboloid for α = 270°.

To calculate a quasi-isotropic extension of yarg, it is enough to extend the method to include a summation of the following four values of αarg: α = 0°, 90°, 180°, 270°.

(32)

The resulting operator is called dargand is defined as:

darg=

α=0◦_,90◦_,180◦_,270◦

αarg (3.7)

Figure 3.12: darg of a paraboloid

The summation of the four αarg's causes the middle points, which have strong values in all of the αarg's, to give a much higher response than the rest of the paraboloid. The middle of the paraboloid is the only place within this object which has a zero-crossing in the gradient of both x and y.

Unlike the yarg operator, which only detects zero-crossings of the gradient in the y direction, the darg operator also detects zero-crossings of the gradient in the x direction. The highest response is given by areas where zero-crossings of the gradient in both directions are present.
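The rotate, apply, rotate back and sum procedure can be illustrated with a minimal sketch. It is a simplification and not the implementation used in this work: the Gaussian smoothing is left out, derivatives are plain finite differences, and the function names (`gradient`, `y_arg`, `rot90`, `d_arg`) are chosen here for illustration. Because only quarter-turn rotations are needed, they are done by exact re-indexing of the pixel grid.

```python
import math

def gradient(img):
    """Finite-difference gradient of a 2-D list: returns (d/dx, d/dy).

    x runs along columns and y along rows; one-sided differences at borders.
    """
    h, w = len(img), len(img[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            c0, c1 = max(c - 1, 0), min(c + 1, w - 1)
            r0, r1 = max(r - 1, 0), min(r + 1, h - 1)
            gx[r][c] = (img[r][c1] - img[r][c0]) / (c1 - c0)
            gy[r][c] = (img[r1][c] - img[r0][c]) / (r1 - r0)
    return gx, gy

def y_arg(img):
    """y derivative of the argument of the gradient (no Gaussian smoothing)."""
    gx, gy = gradient(img)
    theta = [[math.atan2(b, a) for a, b in zip(row_x, row_y)]
             for row_x, row_y in zip(gx, gy)]
    return gradient(theta)[1]

def rot90(img):
    """Rotate a 2-D list by a quarter turn."""
    return [list(row) for row in zip(*img)][::-1]

def d_arg(img):
    """Sum of y_arg over four quarter-turn rotations, each rotated back."""
    h, w = len(img), len(img[0])
    total = [[0.0] * w for _ in range(h)]
    rotated = img
    for k in range(4):
        response = y_arg(rotated)
        for _ in range((4 - k) % 4):   # undo the k quarter turns
            response = rot90(response)
        for r in range(h):
            for c in range(w):
                total[r][c] += response[r][c]
        rotated = rot90(rotated)
    return total

# Test image: a paraboloid with its peak in the middle of a 21 x 21 grid.
size, centre = 21, 10
paraboloid = [[-float((c - centre) ** 2 + (r - centre) ** 2)
               for c in range(size)] for r in range(size)]
response = d_arg(paraboloid)
```

In this toy setting the magnitude of `response` at the centre pixel is larger than along the axes, which matches the behaviour described for fig. 3.12. The sign of the response depends on the sign convention of the derivatives chosen in the sketch.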

3.3.1 Properties

The properties considered in this section are those of yarg, but they apply to darg as well.

Response to planar objects and edges: An operator created with the intention of detecting three dimensional convexity or concavity should avoid finding objects which are planar. Planar objects of constant albedo can be described as having linear intensity functions instead of convex ones. Since yarg detects zero-crossings, which are not present in linear intensity functions, a zero response is the result. Edges on the other hand, which are a form of zero-crossings, will give a response which in theory is considered to be finite, while the response of the paraboloid is infinite. In practice, both responses are finite but the response of the paraboloid tends to be stronger than that of the edge.

the edge and the paraboloid will become greater due to the dependence on orientation of the edge. While the paraboloid will give equally strong responses to the different αarg, the edge will only give a strong response to particular αarg's.

Invariance of Illumination: In outdoor pictures, the illumination of a particular scene might vary for a number of different reasons. Different seasons, the time of the day and the weather are but a few examples. Invariance to illumination is therefore a much wanted property in an operator.

By the theorem in appendix A.1 it is proven that yarg is invariant under any differentiable, strictly monotonically increasing transformation of the intensity function. Linear transformations, positive powers (where f(x, y) > 0) and logarithms are examples of such transformations. The invariance property also holds for compositions and linear combinations, with positive coefficients, of the previously stated examples, as such functions are also differentiable and strictly monotonically increasing.
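The core of the argument can be sketched in one line. This is only an outline; the full statement and proof are those of appendix A.1. The symbol T for the intensity transformation is introduced here for illustration, and I denotes the intensity function.

```latex
\nabla (T \circ I)(x, y) = T'\bigl(I(x, y)\bigr)\, \nabla I(x, y),
\qquad T' > 0
\;\Longrightarrow\;
\arg \nabla (T \circ I)(x, y) = \arg \nabla I(x, y)
```

Multiplying the gradient by a positive factor changes its length but not its direction, so any quantity derived from the argument of the gradient alone is left unchanged.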

Linear dependence on scale when the image is sampled: According to the article written by Tankus [7] there is a linear dependency on scale, where the darg of the scaled image I(sx, sy) is the scaled darg of the original image multiplied by a factor s.

$$d_{\mathrm{arg}}^{I(sx,sy)}(x, y) = s\, d_{\mathrm{arg}}^{I(x,y)}(sx, sy) \tag{3.8}$$
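For a continuous image and with the Gaussian smoothing ignored, the factor s in eq. 3.8 can be traced to the chain rule. The following is a sketch under those assumptions, written for the y derivative of the argument; the symbols J and θ are introduced here for illustration, with θ_I = arg ∇I and s > 0.

```latex
J(x, y) = I(sx, sy)
\;\Longrightarrow\;
\nabla J(x, y) = s\, (\nabla I)(sx, sy)
\;\Longrightarrow\;
\theta_J(x, y) = \theta_I(sx, sy)
\;\Longrightarrow\;
\frac{\partial \theta_J}{\partial y}(x, y) = s\, \frac{\partial \theta_I}{\partial y}(sx, sy)
```

The same reasoning applies to each rotated version, and therefore to their sum.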

One of the three following conditions is needed in order for the linear dependence to hold.

1 The image is continuous.

2 The sub sampling of the image is dense enough to faithfully represent the orig-inal image after the scaling.

3 Choosing the right σ. According to Tankus, at an image location where darg → ∞, a change in σ will only affect the size of the domain where the in theory infinite response should occur, but not the relative strength of the response. This infinite response reduces the effect of a change in either standard deviation or scale. In 6.1.3 the correctness of this statement will be discussed.

Invariance of orientation: The issue of orientation was dealt with in the previous section and it was stated that yarg is dependent on orientation. This was one of the reasons for creating an isotropic operator darg, which is almost invariant to orientation.


3.4 Counter-shading

As with most visual properties of a picture, it is possible to prevent detection of a convex object by applying a shading that counters the shading which occurs when a light source hits the object. This is called counter-shading and is explained by Thayer's principle [6].

If we paint a cylinder or sphere in graded tints of gray the darkest part facing toward the source of light and the lightest away from it, the body’s own shade so balances this color scheme that the outlines become dissolved.

In other words, the object is so shaded that it neutralizes the shades appearing from a point light source.

Figure 3.13: a. One-colored cylinder illuminated by a point light source. b. Counter-shaded cylinder with ambient lighting. c. Counter-shaded cylinder illuminated by a point source.

Counter-shading is found to be a common feature within the animal kingdom. Several types of animals have coloration that changes gradually from dark to light when viewed from top to bottom. Since most animals live in an environment where the one major source of light is the sun, which can be defined as a point light source, this type of change in color can be thought of as counter-shading.

In the article by Tankus [7] it is indirectly suggested that this phenomenon in the animal kingdom exists because of an ability of animals to trace their prey by convexity detection. This conclusion is based on the fact that a useful ability of some kind in an animal is the result of years of evolution. It is then likely that something like counter-shading would not be present without a reason.

When the darg operator is applied to a counter-shaded object it results in much lower values than for the same object without the counter-shading, as can be seen in fig. 3.15. The most important reason for the weakness in the response is the fact that the intensity function is turned from convex to almost planar by the counter-shading. Therefore it can be concluded that counter-shading can be used to hide an object from the algorithm presented in this paper.

Figure 3.14: Example of animals who use counter-shading [6].

Figure 3.15: a. darg of a plain cylinder. b. darg of a counter-shaded cylinder. Both are subjected to directed light. The maximum darg value of the counter-shaded cylinder is only 0.0256 compared to a maximum of 0.0418 when darg is applied to the cylinder in fig. 3.13 a.


4 Detection of rotational symmetries

In this chapter the concept of rotational symmetries is introduced and the theory behind a method for detecting such symmetries is presented.

4.1 Definition of a rotational symmetry

Let r and ϕ denote polar coordinates. Then according to Johansson in [4] a rotational symmetry is defined as a signal f(r, ϕ) where ẑ = z/|z| depends only on ϕ, when z is the double angle description of the local orientation of f.

Figure 4.1: Rotational symmetry examples. a. Circle. b. Local orientation of a circle. c.


studying the objects' local orientation it is apparent that the only difference in the variation of the phase is a constant difference of π.

$$z_{\mathrm{circle}} = |z|\, e^{i2\varphi} \tag{4.1}$$

$$z_{\mathrm{star}} = |z|\, e^{i2\varphi} e^{i\pi} \tag{4.2}$$

In this paper the concept of rotational symmetries denotes the special case of rotational symmetries which are called the n:th order rotational symmetries. They are defined as:

$$c(r, \varphi)\, e^{i(n\varphi + \alpha)} \tag{4.3}$$

where n, which denotes angular modulation speed, determines which order of symmetry it belongs to. The type of rotational symmetry can be specified even further by using the phase α to determine which member of the order is found.

The local orientation z and the polar coordinates are reshaped into vectors, which can be described as elements in $\mathbb{C}^{n^2}$, and are denoted as z, r and ϕ. This reshaping is made because the objective is to find patterns in a limited square-shaped area within an image.

The symmetries which are of most use, since they are the most common visual patterns in daily life according to Johansson in [3], are those of order n = 0, 1, 2. These three different orders describe particular types of rotational symmetries in the form of gray level patterns.

1 The zeroth order n = 0: Lines are described, since this actually is an averaging of the orientation image.

2 The first order n = 1: Parabolic symmetries, or in other words curvature, corners and line-endings, are described. The direction of the corner is the same as the phase α.

3 The second order n = 2: Circular symmetries, such as stars, spirals and circular-like patterns, are described.

Figure 4.2: Examples of rotational symmetries of order a. n = 0, b. n = 1, and c. n = 2. [3]

Other rotational symmetry patterns than the n:th order symmetries might be of interest. In describing rotational symmetries which are not of the n:th order, the normalized orientation descriptor ẑ can be used. These descriptors can be expressed in the form

$$\hat{z} \sim \sum_{n} s_n e^{in\varphi}. \tag{4.4}$$

Examples of rotational symmetries which are not of the n:th order are found in fig. 4.3.

Figure 4.3: Examples of rotational symmetries which are not of the n:th order symmetries [4]

4.2 Detection by correlation

A simple approach for detecting rotational symmetry patterns of the n:th order is to correlate the double angle representation z with a filter a · bn. This filter represents one of the n orders, with the help of the circular harmonic basis function $b_n = e^{in\varphi}$ and an applicability function a. The applicability function can be considered to be a window for the basis function. If the applicability function is chosen as a Gaussian,

is controlled by the size of the Gaussian, or more precisely the standard deviation of the Gaussian.

Figure 4.4: The filters a · bn when a is a Gaussian for rotational symmetries of the a. 0:th b. 1:st and c. 2:nd order. d. A color reference image, where the color shows the orientation and the intensity the magnitude of the local vector.

When correlating the filter a · bn with z, the scalar products in eq. 4.5 are calculated for each of the local neighborhoods.

$$s_0 = \langle a \cdot b_0, z \rangle, \qquad s_1 = \langle a \cdot b_1, z \rangle, \qquad s_2 = \langle a \cdot b_2, z \rangle \tag{4.5}$$

When

$$s = \begin{pmatrix} s_0 \\ s_1 \\ s_2 \end{pmatrix}, \qquad B = \begin{pmatrix} | & | & | \\ b_0 & b_1 & b_2 \\ | & | & | \end{pmatrix} \tag{4.6}$$

the scalar products in 4.5 can be represented in the form of matrices, where $W_a = \mathrm{diag}(a)$.

$$s = B^{*} W_a z \tag{4.7}$$

The resulting components sn of the matrix s in 4.7 have the following form when a rotational symmetry of the n:th order with $z = |z|\, e^{i(n\varphi + \alpha)}$ is present in the image [4]


From the resulting sn in eq. 4.8 it can be interpreted that a high magnitude |sn| is an indication of an n:th order symmetry, while the phase of sn pinpoints the actual class member.
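The role of magnitude and phase can be illustrated numerically. The sketch below computes the three scalar products of eq. 4.5 for a single synthetic patch. It rests on assumptions that are not taken from the text: the scalar product conjugates the filter, the Gaussian applicability is left unnormalized, the centre pixel (where ϕ is undefined) is skipped, and the name `symmetry_responses` is hypothetical.

```python
import cmath
import math

def symmetry_responses(z, sigma):
    """Return [s0, s1, s2] with s_n = <a * b_n, z> for a square patch z.

    z is a 2-D list of complex numbers with odd side length.
    """
    size = len(z)
    half = size // 2
    s = [0j, 0j, 0j]
    for row in range(size):
        for col in range(size):
            x, y = col - half, row - half
            if x == 0 and y == 0:
                continue  # the angle phi is undefined at the centre
            phi = math.atan2(y, x)
            a = math.exp(-(x * x + y * y) / (2 * sigma ** 2))
            for n in range(3):
                # conjugate of the filter a * exp(i n phi), times the signal
                s[n] += a * cmath.exp(-1j * n * phi) * z[row][col]
    return s

# Synthetic second order pattern z = exp(i(2 phi + alpha)).
size, sigma, alpha = 15, 3.0, 0.7
half = size // 2
patch = [[cmath.exp(1j * (2 * math.atan2(r - half, c - half) + alpha))
          for c in range(size)] for r in range(size)]
s = symmetry_responses(patch, sigma)
```

For this ideal pattern only `s[2]` has a significant magnitude and its phase, `cmath.phase(s[2])`, recovers α. The other two responses cancel because of the symmetry of the sampling grid.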

Several problems appear with the use of this method. In the thesis by Johansson [4] it is expressed that the method is not as selective as desired. Lines in the outer area of the local neighborhood are detected as not only a zeroth order, but also as a first and a second order rotational symmetry. Another example of the method’s problems with selectivity is that corners are not only detected as a first order symmetry, but since it is almost in the shape of a part of a circle it will also give a noticeable response in s2. A solution to the problem will be presented in section 4.4.

When the algorithm is put to practical use, a problem concerning the time complexity of the algorithm will arise. Since the filters a · bn are two dimensional, the amount of time necessary for calculating the scalar products will become rather large when large symmetries are to be detected.

4.3 Polynomial expansion

In order to deal with the problems mentioned in the previous section, polynomial expansion is presented. The method is based on the fact that polynomials can be created if the basis functions $b_n = e^{in\varphi}$ are weighted with $r^n$. It should be noted that the polar coordinate r and the matrix r are not the same.

$$\begin{cases} b_0 = e^{i0\varphi} = (x + iy)^0 = 1 \\ r\, b_1 = r e^{i1\varphi} = (x + iy)^1 = x + iy \\ r^2 b_2 = r^2 e^{i2\varphi} = (x + iy)^2 = x^2 - y^2 + i2xy \end{cases} \tag{4.9}$$

Since the weight $r^n$ does not affect the membership of a basis function to an n:th order rotational symmetry class it is, according to Johansson [4], possible to use polynomial model parameters to detect rotational symmetries.
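The identities in eq. 4.9 are easy to check numerically. The following small sketch evaluates the weighted basis function in polar form and compares it with the polynomial form; the function name `weighted_basis` is chosen here for illustration only.

```python
import cmath

def weighted_basis(x, y, n):
    """Evaluate r^n * exp(i n phi) at the Cartesian point (x, y)."""
    point = complex(x, y)
    r, phi = abs(point), cmath.phase(point)
    return (r ** n) * cmath.exp(1j * n * phi)

# Compare with the polynomial form (x + iy)^n at an arbitrary point.
x, y = 3.0, -2.0
differences = [abs(weighted_basis(x, y, n) - complex(x, y) ** n)
               for n in range(3)]
```

All entries of `differences` are zero up to rounding, for example $(3 - 2i)^2 = 5 - 12i$ for n = 2.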

4.3.1 Local polynomial expansion using normalized convolution

The problem which needs to be solved by the local polynomial expansion is that a signal f is to be approximated [1]. In order to do that, a model of the N-dimensional signal is defined as:


desired to model f with a second degree polynomial. The two-dimensional case of eq. 4.10 is then written as:

$$f(x, y) \sim r_1 + r_2 x + r_3 y + r_4 x^2 + r_5 y^2 + r_6 xy \tag{4.11}$$

A second degree polynomial basis $P_2$ in $\mathbb{R}^2$ is then defined as:

$$P_2 = \{1,\ x,\ y,\ x^2,\ y^2,\ xy\} \tag{4.12}$$

The model needs to be applied to a limited square-sized area in a pixel-discretized image, and therefore it is necessary to reshape the local signal and basis functions into vectors, allowing them to be described as elements in $\mathbb{R}^{n^2}$. Eq. 4.11 can now be written in the following manner.

$$f \sim P_2 r \tag{4.13}$$

where the polynomial basis $P_2$ and the model coefficients r are defined as:

$$P_2 = \begin{pmatrix} | & | & | & | & | & | \\ 1 & x & y & x^2 & y^2 & xy \\ | & | & | & | & | & | \end{pmatrix}, \qquad r = \begin{pmatrix} r_1 \\ r_2 \\ r_3 \\ r_4 \\ r_5 \\ r_6 \end{pmatrix} \tag{4.14}$$

The polynomial expansion problem in the form of a weighted least squares problem is in the thesis by Johansson [4] defined as:

$$\arg\min_{r \in \mathbb{C}^6} \| f - P_2 r \|_W = \arg\min_{r \in \mathbb{C}^6} (f - P_2 r)^{*}\, W\, (f - P_2 r) \tag{4.15}$$

where $W = W_a W_c$. $W_a$ denotes the applicability weight and $W_c$ denotes the signal certainty weight.


It is assumed that the certainty is large enough to allow the existence of an inverse of $P_2^T W P_2$. The certainty indicates our confidence in the values of f. In this paper full certainty is assumed, which gives $W_c = I$, and it can therefore be omitted from the calculation. The applicability function a is chosen as a Gaussian function, primarily because it is both isotropic and separable [1].

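The solution of the polynomial expansion problem is:

$$
\begin{aligned}
r &= (P_2^T W_a P_2)^{-1} P_2^T W_a f \\
&= \begin{pmatrix}
1 & 0 & 0 & \sigma^2 & \sigma^2 & 0 \\
0 & \sigma^2 & 0 & 0 & 0 & 0 \\
0 & 0 & \sigma^2 & 0 & 0 & 0 \\
\sigma^2 & 0 & 0 & 3\sigma^4 & \sigma^4 & 0 \\
\sigma^2 & 0 & 0 & \sigma^4 & 3\sigma^4 & 0 \\
0 & 0 & 0 & 0 & 0 & \sigma^4
\end{pmatrix}^{-1}
\begin{pmatrix}
\langle 1 \cdot g, f \rangle \\ \langle x \cdot g, f \rangle \\ \langle y \cdot g, f \rangle \\ \langle x^2 \cdot g, f \rangle \\ \langle y^2 \cdot g, f \rangle \\ \langle xy \cdot g, f \rangle
\end{pmatrix} \\
&= \begin{pmatrix}
2 & 0 & 0 & -\frac{1}{2\sigma^2} & -\frac{1}{2\sigma^2} & 0 \\
0 & \frac{1}{\sigma^2} & 0 & 0 & 0 & 0 \\
0 & 0 & \frac{1}{\sigma^2} & 0 & 0 & 0 \\
-\frac{1}{2\sigma^2} & 0 & 0 & \frac{1}{2\sigma^4} & 0 & 0 \\
-\frac{1}{2\sigma^2} & 0 & 0 & 0 & \frac{1}{2\sigma^4} & 0 \\
0 & 0 & 0 & 0 & 0 & \frac{1}{\sigma^4}
\end{pmatrix}
\begin{pmatrix}
\langle 1 \cdot g, f \rangle \\ \langle x \cdot g, f \rangle \\ \langle y \cdot g, f \rangle \\ \langle x^2 \cdot g, f \rangle \\ \langle y^2 \cdot g, f \rangle \\ \langle xy \cdot g, f \rangle
\end{pmatrix}
\end{aligned}
\tag{4.16}
$$

when $P_2^T W_a P_2$, which does not depend on the signal, is computed with the assumption of continuous functions. The matrix $P_2^T W_a f$ consists of correlations of the signal with the filters in $P_2^T W_a$. These correlations can be made according to the structure in fig. 4.5, which uses a total of 9 one-dimensional filters.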
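The inverse that appears in eq. 4.16 can be checked by multiplying the two matrices for some value of σ. The sketch below does this with plain Python lists; the function names `gram`, `gram_inverse` and `matmul` are chosen here for illustration, and the basis order is (1, x, y, x², y², xy).

```python
def gram(sigma):
    """P2^T Wa P2 for a normalized continuous Gaussian applicability."""
    s2, s4 = sigma ** 2, sigma ** 4
    return [
        [1, 0, 0, s2, s2, 0],
        [0, s2, 0, 0, 0, 0],
        [0, 0, s2, 0, 0, 0],
        [s2, 0, 0, 3 * s4, s4, 0],
        [s2, 0, 0, s4, 3 * s4, 0],
        [0, 0, 0, 0, 0, s4],
    ]

def gram_inverse(sigma):
    """Closed-form inverse of gram(sigma)."""
    s2, s4 = sigma ** 2, sigma ** 4
    return [
        [2, 0, 0, -1 / (2 * s2), -1 / (2 * s2), 0],
        [0, 1 / s2, 0, 0, 0, 0],
        [0, 0, 1 / s2, 0, 0, 0],
        [-1 / (2 * s2), 0, 0, 1 / (2 * s4), 0, 0],
        [-1 / (2 * s2), 0, 0, 0, 1 / (2 * s4), 0],
        [0, 0, 0, 0, 0, 1 / s4],
    ]

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

product = matmul(gram(1.5), gram_inverse(1.5))
```

`product` equals the 6 × 6 identity matrix up to rounding, which confirms that the two matrices are inverses of each other.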

4.3.2 Approximative local expansion

An approximation of the result in eq. 4.16 can be achieved by fitting the signal onto an alternative basis function. The basis $D_2$ uses the fact that the derivatives of a Gaussian can be represented as a multiplication of two functions, where the original Gaussian is one of the functions. Therefore if the basis $D_2$ is defined as:

$$D_2 = \begin{pmatrix} | & | & | & | & | & | \\ 1 & -x & -y & x^2 - \sigma^2 & y^2 - \sigma^2 & xy \\ | & | & | & | & | & | \end{pmatrix} \tag{4.17}$$

and

$$
\begin{aligned}
g(t) &= \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2}\left(\frac{t}{\sigma}\right)^2} \\
g'(t) &= -\frac{t}{\sigma^2}\, g(t) \\
g''(t) &= \frac{t^2 - \sigma^2}{\sigma^4}\, g(t)
\end{aligned}
\tag{4.18}
$$

then the basis $D_2$ can be weighted with $W_a$, which results in eq. 4.19

$$W_a D_2 = \begin{pmatrix} | & | & | & | & | & | \\ g & \sigma^2 g_x & \sigma^2 g_y & \sigma^4 g_{xx} & \sigma^4 g_{yy} & \sigma^4 g_{xy} \\ | & | & | & | & | & | \end{pmatrix} \tag{4.19}$$

in which $W_a = \mathrm{diag}(g)$ and $g_x, \ldots, g_{xy}$ denote partial derivatives up to the second degree of the Gaussian. The matrix G is calculated from the functions of 4.18, where the derivatives not defined in these functions can be calculated from the functions because of the separability of the Gaussian.

Figure 4.5: Correlator structure for two-dimensional polynomial expansion where all filters are one-dimensional and combined with an applicability factor. The first filtering is made in the x direction and the second along the y direction.


A relationship between the basis $D_2$ and the basis $P_2$, which is defined in eq. 4.12, can be established as:

$$D_2 = P_2 T_{PD} \tag{4.20}$$

where

$$T_{PD} = \begin{pmatrix}
1 & 0 & 0 & -\sigma^2 & -\sigma^2 & 0 \\
0 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix} \tag{4.21}$$

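If $P_2$ is replaced with eq. 4.20 in the polynomial expansion solution in eq. 4.16, then:

$$
\begin{aligned}
r &= (P_2^T W_a P_2)^{-1} P_2^T W_a f \\
&= (T_{PD}^{-T} D_2^T W D_2 T_{PD}^{-1})^{-1}\, T_{PD}^{-T} D_2^T W f \\
&= T_{PD} (D_2^T W D_2)^{-1} D_2^T W f
\end{aligned}
\tag{4.22}
$$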

When calculating eq. 4.23 the same calculations are made as in the original polynomial expansion solution, with one exception. The actual fitting of the signal f onto a basis function is made with $D_2$, instead of $P_2$, and is thereafter transformed onto the original model $f \sim P_2 r$ by the transformation matrix $T_{PD}$. When full certainty is assumed, as in eq. 4.16, the solution has the form:

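$$
r = T_{PD} (D_2^T W_a D_2)^{-1} D_2^T W_a f =
\begin{pmatrix}
1 & 0 & 0 & -\frac{1}{2\sigma^2} & -\frac{1}{2\sigma^2} & 0 \\
0 & -\frac{1}{\sigma^2} & 0 & 0 & 0 & 0 \\
0 & 0 & -\frac{1}{\sigma^2} & 0 & 0 & 0 \\
0 & 0 & 0 & \frac{1}{2\sigma^4} & 0 & 0 \\
0 & 0 & 0 & 0 & \frac{1}{2\sigma^4} & 0 \\
0 & 0 & 0 & 0 & 0 & \frac{1}{\sigma^4}
\end{pmatrix}
\begin{pmatrix}
\langle 1 \cdot g, f \rangle \\ \langle -x \cdot g, f \rangle \\ \langle -y \cdot g, f \rangle \\ \langle (x^2 - \sigma^2) \cdot g, f \rangle \\ \langle (y^2 - \sigma^2) \cdot g, f \rangle \\ \langle xy \cdot g, f \rangle
\end{pmatrix}
\tag{4.23}
$$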


result of the last matrix is first produced and then multiplied with the first matrix. A correlator structure must be defined in order to calculate $D_2^T W_a f$. Since this involves a correlation of the image with the filters:

$$g,\quad \sigma^2 g_x,\quad \sigma^2 g_y,\quad \sigma^4 g_{xx},\quad \sigma^4 g_{yy},\quad \sigma^4 g_{xy} \tag{4.24}$$

an approximation of the filters can be made in the form of a two dimensional Gaussian filter combined with derivative filters as in eq. 4.25.

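$$
\begin{aligned}
\sigma^2 g_x &\approx g * d_x \\
\sigma^2 g_y &\approx g * d_y \\
\sigma^2 g_{xx} &\approx g * d_x * d_x \\
\sigma^2 g_{yy} &\approx g * d_y * d_y \\
\sigma^2 g_{xy} &\approx g * d_x * d_y
\end{aligned}
\tag{4.25}
$$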

where $d_x$ and $d_y$ are one dimensional filters along the x and y direction respectively. As can be seen in the correlator structure in fig. 4.6, only two one dimensional Gaussian filters and five derivative filters are needed, compared to the nine filters in fig. 4.5.
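How well a small derivative filter applied to a sampled Gaussian reproduces the analytic derivative can be checked in one dimension. The sketch below is an illustration under assumptions of its own: the derivative filter is taken to be the central difference (−1, 0, 1)/2, so the scale factors of eq. 4.25 are not modelled, and the names `gaussian` and `central_difference` are hypothetical.

```python
import math

def gaussian(t, sigma):
    """Normalized one dimensional Gaussian, as in eq. 4.18."""
    return math.exp(-0.5 * (t / sigma) ** 2) / math.sqrt(2 * math.pi * sigma ** 2)

def central_difference(samples):
    """Apply the kernel (-1, 0, 1) / 2 to the interior samples."""
    return [(samples[i + 1] - samples[i - 1]) / 2
            for i in range(1, len(samples) - 1)]

sigma = 3.0
ts = list(range(-12, 13))
g = [gaussian(t, sigma) for t in ts]

approx = central_difference(g)                        # sampled g filtered with d_x
exact = [-t / sigma ** 2 * gaussian(t, sigma) for t in ts[1:-1]]   # analytic g'(t)

max_error = max(abs(p - q) for p, q in zip(approx, exact))
peak = max(abs(q) for q in exact)
```

For σ = 3 the largest deviation `max_error` is a few percent of `peak`. The approximation becomes tighter for larger σ and coarser for small σ, where the Gaussian is sampled sparsely.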


4.4 Normalized inhibition

Previously in section 4.2 a problem concerning the selectivity of the method was introduced. Something which gives a strong response for one of the three orders might also give a noticeable response in one of the other two. This can be prevented by letting the results of the different orders inhibit one another so that if one response is high, the other ones become lower than before.

In order to perform the inhibition it is necessary to first normalize the results with respect to the amount of certainty. This ensures that the magnitude of sn will always be between 0 and 1. Normalization is also a solution to the problem that two rotational symmetry patterns of the same class, which differ in the amount of area covered, have the same phase but different magnitudes. Two such rotational symmetries are illustrated in fig. 4.7. The result s2 of the left star will be much stronger than that of the right star.

Figure 4.7: Two star patterns which cover different amounts of area.

The normalization with respect to the amount of certainty is defined as:

$$s_n = \frac{\langle a \cdot b_n, z \rangle}{\langle a, |z| \rangle} \tag{4.26}$$

By normalizing the response it is ensured that the magnitude of the response will be between 0 and 1. Thereafter the inhibition can be performed by calculating the following functions.

$$
\begin{aligned}
|s_{p0}| &= |s_0| (1 - |s_1|)(1 - |s_2|) \\
|s_{p1}| &= |s_1| (1 - |s_0|)(1 - |s_2|) \\
|s_{p2}| &= |s_2| (1 - |s_0|)(1 - |s_1|)
\end{aligned}
\tag{4.27}
$$

The functions in 4.27 are used to calculate the inhibition of sn when normalized convolution is used. As was stated earlier, the method of using polynomial expansion instead is recommended, since it is a much faster algorithm. The method of normalized inhibition must therefore be adapted to the result of the polynomial expansion. Since z is approximated by Pr, a simple substitution in the normalization calculation 4.26 is the more correct approach, but according to Johansson in [4] the calculation of the amount of certainty ⟨a, |Pr|⟩ would be much more computationally complex than the original ⟨a, |z|⟩. When keeping the original amount of certainty as the normalization factor, function 4.26 is modified in the following manner.

$$s_n \sim \frac{\langle a \cdot b_n, P r \rangle}{\langle a, |z| \rangle} \tag{4.28}$$

In order to use the inhibition scheme it is still necessary that sn only has values between 0 and 1. This objective can be achieved with one additional calculation before the inhibition is carried out, which simply gives the value 1 when |sn| exceeds the threshold 1. The inhibition rule is then changed to:

$$
\begin{aligned}
|s_{p0}| &= h(|s_0|)\,(1 - h(|s_1|))\,(1 - h(|s_2|)) \\
|s_{p1}| &= h(|s_1|)\,(1 - h(|s_0|))\,(1 - h(|s_2|)) \\
|s_{p2}| &= h(|s_2|)\,(1 - h(|s_0|))\,(1 - h(|s_1|))
\end{aligned}
\tag{4.29}
$$

where

$$h(t) = \begin{cases} 1 & \text{if } t > 1 \\ t & \text{otherwise} \end{cases} \tag{4.30}$$
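The clipped inhibition rule is simple enough to state directly in code. The following is a minimal sketch operating on the three normalized magnitudes of one pixel; the function names `clip_unit` and `inhibit` are chosen here for illustration.

```python
def clip_unit(t):
    """The function h of eq. 4.30: values above 1 are set to 1."""
    return 1.0 if t > 1 else t

def inhibit(magnitudes):
    """Apply the inhibition rule of eq. 4.29 to [|s0|, |s1|, |s2|]."""
    h0, h1, h2 = (clip_unit(m) for m in magnitudes)
    return [
        h0 * (1 - h1) * (1 - h2),
        h1 * (1 - h0) * (1 - h2),
        h2 * (1 - h0) * (1 - h1),
    ]

# A strong first order response suppresses the two weaker ones.
example = inhibit([0.2, 0.9, 0.5])
# A response above 1 is clipped and removes the other two completely.
clipped = inhibit([0.3, 1.4, 0.2])
```

In the first example the dominant order keeps a value of 0.36 while the others drop to 0.01 and 0.04. In the second example the clipped response gives 0.56 and the remaining two become zero.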

4.5 Single scale detection

If a local polynomial expansion model of the local orientation, which uses normalized convolution, is applied then:

$$z \sim P r, \quad \text{where } r = (P^T W_a P)^{-1} P^T W_a z \tag{4.31}$$

The approximation of the double angle representation z onto the polynomial expansion Pr is determined by approximative polynomial expansion. This implies that z is first approximated onto Dr and is thereafter transformed into Pr by multiplication with the matrix $T_{PD}$.

The correlation of the filters a · bn with the orientation in the form of the double angle representation z for each local neighborhood is defined in 4.5. These correlations can be approximated by substituting z with the polynomial model parameters r.

$$
s = B^{*} W_a z \sim B^{*} W_a P r =
\begin{pmatrix}
1 & 0 & 0 & \sigma^2 & \sigma^2 & 0 \\
0 & \sigma\sqrt{\frac{\pi}{8}} & -i\sigma\sqrt{\frac{\pi}{8}} & 0 & 0 & 0 \\
0 & 0 & 0 & \frac{\sigma^2}{2} & -\frac{\sigma^2}{2} & -i\frac{\sigma^2}{2}
\end{pmatrix} r
\tag{4.32}
$$

This implies that instead of calculating the rather inefficient correlations it is enough to compute a polynomial model and from this use the parameters r to approximate s. In order to simplify even more it is possible to combine the calculation of the polynomial model and the transformation from r to s. Merging equations 4.31 and 4.32 together gives:

$$s = B^{*} W_a P r = B^{*} W_a P (P^T W_a P)^{-1} P^T W_a z \tag{4.33}$$

In other words since

$$W_a P (P^T W_a P)^{-1} P^T W_a B = W_a \begin{pmatrix} 1 & \frac{1}{\sigma}\sqrt{\frac{\pi}{8}}\,(x + iy) & \frac{1}{4\sigma^2}\,(x^2 - y^2 + 2ixy) \end{pmatrix} \tag{4.34}$$

the correlation of the polynomial model with the filter kernels a · bn is equivalent to a correlation of the orientation z with the filter kernels in fig. 4.8.

In order to increase the selectivity of the result, equations 4.28, 4.29 and 4.30 are applied to the resulting s.
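The mapping from the model parameters r to the responses s in eq. 4.32 amounts to three short linear combinations. The sketch below writes them out for one pixel; the function name `responses_from_model` is hypothetical and the coefficient order follows the basis (1, x, y, x², y², xy).

```python
import math

def responses_from_model(r, sigma):
    """Map the six model coefficients to [s0, s1, s2] as in eq. 4.32."""
    r1, r2, r3, r4, r5, r6 = r
    k1 = sigma * math.sqrt(math.pi / 8)
    k2 = sigma ** 2 / 2
    s0 = r1 + sigma ** 2 * (r4 + r5)
    s1 = k1 * (r2 - 1j * r3)
    s2 = k2 * (r4 - r5 - 1j * r6)
    return [s0, s1, s2]

sigma = 2.0
# Model of z = x + iy, a pure first order pattern.
first_order = responses_from_model([0, 1, 1j, 0, 0, 0], sigma)
# Model of z = (x + iy)^2 = x^2 - y^2 + 2ixy, a pure second order pattern.
second_order = responses_from_model([0, 0, 0, 1, -1, 2j], sigma)
```

For the first model only s1 is non-zero, and for the second only s2, which illustrates how each polynomial term feeds exactly one symmetry order.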


Figure 4.8: Filter kernels matching eq. 4.34. a. $a \cdot 1$ b. $a \cdot \frac{1}{\sigma}\sqrt{\frac{\pi}{8}}\,(x + iy)$ c. $a \cdot \frac{1}{4\sigma^2}\,(x^2 - y^2 + 2ixy)$. d. Color reference image, where the color represents the orientation and the intensity the magnitude of the local vector.

4.6 Multi scale detection

The single scale polynomial expansion algorithm presented in 4.5 can be extended into a multi scale version. The single scale method, which includes approximative local polynomial expansion and normalized inhibition can be generalized to detection of rotational symmetries in several different scales. The multi scale algorithm is defined as:

1 Compute a local polynomial expansion model of z in several scales. Such a model is calculated with the correlator in fig. 4.6 as a basis. The single scale correlator consists of two main parts, a Gaussian filter g followed by a derivative structure ∂. In order to extend the correlator to the multi scale case, a low pass hierarchy needs to be calculated using Gaussian filters and an attached derivative structure for each scale, as can be seen in fig. 4.9. At the same time a low pass hierarchy of |z| is produced.

2 For each scale, the scalar product ⟨a · bn, Pr⟩ is calculated using eq. 4.32.

3 Thereafter the output of the previous step is normalized with the output certainty according to eq. 4.28 for each step.

4 The inhibition rule in eq. 4.29 is used to compute the normalized inhibition for each scale.


Figure 4.9: Correlator structures for approximative local polynomial expansion in two


5 Implementation

During the implementation of the darg operator and the calculation of rotational symmetries a couple of basic considerations were taken. Since the objective was to study these methods from a qualitative point of view for single two dimensional images, the time complexity issue has not been considered as a priority.

The programs which were used are in the form of functions in Matlab and include parts which are programmed in C but imported into the Matlab environment. The programs used for calculating rotational symmetries were created by Björn Johansson at the Computer Vision Laboratory at Linköping University in Sweden.

5.1 The darg operator

This operator is used to detect three dimensional convex intensity functions, by detecting zero-crossings of the gradient. The theory behind the method is presented in chapter 3. A summary of the darg operator can be viewed in fig. 5.1, in which the different operations are described along with the corresponding images, when using the intensity function of a paraboloid as an input. The parameters σ1 and σ2 were chosen as 0.4 and 8.0, respectively.

5.1.1 The Gaussian filters

The darg operator is based on derivatives which are subjected to a two dimensional low pass filter in the form of a Gaussian. The separability of the Gaussian can be exploited by using two separate one dimensional Gaussian filters to minimize calculations. The two one dimensional filters give the same result as a convolution of the image with the corresponding two dimensional filter, when one of the one dimensional filters is first convolved with the image and thereafter the other filter is convolved with the result of the first convolution.
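The equivalence between two one dimensional passes and a single two dimensional filtering can be verified on a small example. The sketch below is an illustration in plain Python and not the Matlab code used in this work; the function names are hypothetical, only the fully overlapping ("valid") part of the result is computed, and the kernel is symmetric so that correlation and convolution coincide.

```python
def filter_rows(img, kernel):
    """Filter every row with a 1-D kernel, keeping fully overlapping positions."""
    n = len(kernel)
    return [[sum(kernel[j] * row[c + j] for j in range(n))
             for c in range(len(row) - n + 1)] for row in img]

def transpose(img):
    """Swap rows and columns of a 2-D list."""
    return [list(col) for col in zip(*img)]

def separable_filter(img, kernel):
    """Filter along x, then along y, with the same 1-D kernel."""
    along_x = filter_rows(img, kernel)
    return transpose(filter_rows(transpose(along_x), kernel))

def full_2d_filter(img, kernel):
    """Filter with the 2-D kernel formed as the outer product of the 1-D kernel."""
    n = len(kernel)
    h, w = len(img), len(img[0])
    return [[sum(kernel[i] * kernel[j] * img[r + i][c + j]
                 for i in range(n) for j in range(n))
             for c in range(w - n + 1)] for r in range(h - n + 1)]

image = [[(3 * r + 5 * c * c) % 11 for c in range(7)] for r in range(6)]
kernel = [0.25, 0.5, 0.25]
two_passes = separable_filter(image, kernel)
one_pass = full_2d_filter(image, kernel)
```

Both results agree up to rounding. For a kernel of length n the separable version needs about 2n multiplications per pixel instead of n², which is the saving referred to above.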


Figure 5.1: A summary of the steps necessary when applying the darg operator onto an image. The dx and dy modules denote a convolution with a derivative filter and each is followed by a convolution with a two dimensional Gaussian, denoted as g. The first set of derivatives uses a Gaussian of standard deviation σ1, and for the last y derivative a Gaussian of standard deviation σ2 is applied.

The one dimensional Gaussian can be described as:

$$g(t) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{t}{\sigma}\right)^2} \tag{5.1}$$

where σ denotes the standard deviation.

Since the image data is discrete, the Gaussian filters must also be discrete. One additional term, n, denoting the size of the filter needs to be introduced. n must be odd to avoid displacement of the result.

In order to determine the values of a Gaussian filter, the standard deviation is used as an input, which determines both the size n and the values of the filter coefficients. The following formula has been used to calculate n:

n= 2(σ


Using formula 5.2, the size of a Gaussian one dimensional filter is determined so that the filter has a value of 20 percent of the maximum value at the edges of the filter. The resulting n together with σ is used to produce the filter.
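One way to build a discrete filter that satisfies the 20 percent criterion is sketched below. It is not formula 5.2 itself but a hypothetical construction derived from the criterion: the half width is the rounded distance at which the Gaussian has dropped to the given fraction of its maximum. Normalizing the coefficients to unit sum and the names `gaussian_filter` and `edge_fraction` are further assumptions made for this illustration.

```python
import math

def gaussian_filter(sigma, edge_fraction=0.2):
    """Odd-length 1-D Gaussian filter whose end values are roughly
    edge_fraction times the centre value."""
    # Solve exp(-t^2 / (2 sigma^2)) = edge_fraction for t and round.
    half = max(1, round(sigma * math.sqrt(-2 * math.log(edge_fraction))))
    weights = [math.exp(-0.5 * (t / sigma) ** 2)
               for t in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]   # length n = 2 * half + 1

coefficients = gaussian_filter(8.0)
n = len(coefficients)
edge_to_centre = coefficients[0] / coefficients[n // 2]
```

For σ = 8 the ratio `edge_to_centre` is close to 0.2. For very small σ, rounding to whole pixels makes the ratio deviate noticeably from the target.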

As not only the Gaussian but also a derivative is necessary, a Gaussian is used for each direction of the gradient, and the derivative is approximated by convolution with the corresponding filter in 5.3.

$$\frac{\partial}{\partial x} \approx d_x = \begin{pmatrix} -1 & 0 & 1 \end{pmatrix}, \qquad \frac{\partial}{\partial y} \approx d_y = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}. \tag{5.3}$$

In order to optimize the performance of the program, two different Gaussian filters are created. In detecting three dimensional convex or concave objects with the darg operator, edges will also give a noticeable response. In order to make the operator more selective, certain properties of the argument can be used.

1 The argument is calculated for each pixel and therefore only the gradient in that pixel will affect the result.

2 Because of the appearance of the arc tan function in fig. 5.2, strongly positive and strongly negative values of $q = \frac{\partial I(x, y)/\partial y}{\partial I(x, y)/\partial x}$ will be suppressed. This is because of the two horizontal asymptotes $\arctan(q) \to \pm\pi/2$ when $q \to \pm\infty$.

A convex object was previously described as an edge with slowly increasing or decreasing slope of the gradient around the zero-crossing, while an ordinary edge gives a much higher value of the derivative but only along a thin area. If then a Gaussian with a low σ1 is chosen, the ordinary edges will stay thin in comparison to the convex object. The low σ1 causes the difference in the strength of the gradient for a line to be much stronger than for a high σ1. On the other hand, when applying the arc tan function to the gradient, the higher values will be suppressed.

The output from the arc tan function is subjected to a filtration similar to that of the gradient in the y direction. With the intention of producing a strong reaction to the more widespread results of the convex object, a σ2 larger than σ1 is used. As the high values of the edges were suppressed in the earlier step, the sum of the values of a large neighborhood is much greater for the convex objects, which can contribute with a larger amount of values than the thin edge.