LiTH-ISY-R-2312, 2000-10-30, ISSN 1400-3902
Curvature Detection using Polynomial Fitting on
Local Orientation
Björn Johansson
Computer Vision Laboratory
Department of Electrical Engineering
Linköping University, SE-581 83 Linköping, Sweden
[email protected]
November 23, 2000
Abstract
This report describes a technique to detect curvature. The technique uses local polynomial fitting on a local orientation description of an image. The idea is based on the theory of rotational symmetries, which describes curvature, circles, star patterns, etc. The local polynomial fitting is shown to be equivalent to calculating partial derivatives on a lowpass version of the local orientation. The new method can therefore be implemented very efficiently, both in the single-scale case and in the multiscale case.
Contents

1 Introduction
2 Notations
3 Normalized convolution
4 Local orientation in double angle representation, z
5 Rotational symmetries
   5.1 Short introduction
   5.2 Rotational symmetries and normalized convolution
   5.3 Rotational symmetries and normalized inhibition
6 Using polynomial fitting to detect curvature
   6.1 Normalized convolution case
   6.2 Normalized inhibition case
7 Efficient polynomial fitting using Gaussian derivatives
   7.1 Relation between polynomials and partial derivatives
8 Using partial derivatives to detect curvature
   8.1 Normalized convolution case
   8.2 Normalized inhibition case
   8.3 Compensating for numerical problems
9 Complexity comparisons
10 Multiscale filtering
11 Experiments
   11.1 Approximating Gaussian derivatives
   11.2 Rotational symmetries using polynomial fitting
   11.3 Multiscale detection of rotational symmetries
12 Conclusions
1 Introduction
Human vision seems to work in a hierarchical way, in that we first extract low level features such as local orientation and color and then higher level features. No one knows for sure what these high level features are, but there are some indications that curvature, circles, spiral and star patterns are among them [10], [17]. Perceptual experiments also indicate that corners and curvature are very important features in the process of recognition, and one can often recognize an object from its curvature alone [2], [4].
One way to detect the features mentioned above is to use the theory of rotational symmetries. They have been described in the literature a number of times, see e.g. [7], [3], [6], [14], [15], [5], [12], [13]. The idea is to correlate with suitable filters on a local orientation description of an image. The filters mainly calculate the local polar Fourier transform of the local orientation and can detect curvature, circles, star patterns, etc. They are not very selective, but can be made more selective using the normalized convolution technique or by letting the filter results inhibit each other.
One problem with the theory described previously is that it is difficult to calculate the results efficiently. The filters can in some cases be made approximately separable into a small number of 1D-filters using SVD (see [12]) or, if we want to detect the symmetries in several scales, we can create a filter-net using a technique similar to the one described in [1] (the results from this are not yet reported). The design of the filter-net is not automatic but involves a lot of engineering, which makes it a bit inflexible, and the SVD approach is difficult to generalize efficiently to multiscale filtering.
This report describes the rotational symmetries in terms of polynomials. The symmetries can then be detected by local polynomial fitting on the local orientation. The polynomial fitting can be calculated efficiently by means of 1D-filters. The polynomial fitting is also shown to be equivalent to calculating partial derivatives of a lowpass version of the local orientation (the local orientation correlated with a Gaussian). This makes the polynomial fitting even more efficient, especially if we want to detect large features or features in several scales.
This approach creates fewer practical problems. The single-scale/multiscale filters become very efficient and can be created automatically, which makes this method more flexible.
The layout of the report is as follows:
Sections 3, 4, and 5 briefly describe the underlying theory needed to understand the rest of the report. Section 6 introduces the new theory to detect curvature using polynomial fitting on local orientation. Sections 7 and 8 describe how to use partial derivative filters to increase the efficiency of the algorithm. Sections 9 and 10 discuss computational complexity and how to detect the symmetries in several scales. The remaining sections 11 and 12 contain some experiments and conclusions.
2 Notations
Two scalar (inner) products are used in this report, one weighted and one unweighted:

$$\langle z, b\rangle_W = b^* W z$$

$$\langle z, b\rangle = b^* z$$

where $b, z \in \mathbb{C}^M$, $W$ is a diagonal matrix with weight coefficients, and $^*$ indicates conjugate transpose.
$\hat z$ is defined as $\hat z = z/|z|$.
Additional notations are introduced when needed.
3 Normalized convolution
This section contains a short summary of the normalized convolution theory. The theory is thoroughly described in [9], [8] (and also in [16], [20]). Normalized convolution approximates a signal with a linear combination of a set of basis functions. It takes into account the uncertainty in the signal values and also permits spatial localization of the basis functions, which may have infinite support.
Let $\{b_n\}_1^N$ be a set of $N$ linearly independent vectors in $\mathbb{C}^M$. Assume we want to represent a vector $f \in \mathbb{C}^M$ as a linear combination of $\{b_n\}$, i.e.

$$f \sim \sum_{n=1}^{N} s_n b_n = Bs \qquad (1)$$

where

$$B = \begin{pmatrix} | & | & & | \\ b_1 & b_2 & \cdots & b_N \\ | & | & & | \end{pmatrix}, \qquad s = \begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_N \end{pmatrix} \qquad (2)$$
With each signal vector $f \in \mathbb{C}^M$ we attach a certainty vector $c \in \mathbb{R}^M$ indicating our confidence in the values of $f$. Similarly we attach an applicability function $a \in \mathbb{R}^M$ to the basis functions $\{b_n\} \in \mathbb{C}^M$ to use as a window for spatial localization of the basis functions. We then use weighted least squares to approximate the signal, where the weight is a function of the certainty and the applicability:
$$\arg\min_{s\in\mathbb{C}^N} \|f - Bs\|_W^2 = \arg\min_{s\in\mathbb{C}^N}\,(f - Bs)^* W (f - Bs) \qquad (3)$$
This has the effect that elements in $f$ with a low certainty and elements in $b_n$ with a low applicability value have less influence on the solution than elements with a high certainty and applicability. The solution becomes
$$s = (B^*WB)^{-1}B^*Wf = \tilde B^* f \quad\text{where}\quad \tilde B = WB(B^*WB)^{-1} \qquad (4)$$
The columns of $\tilde B$ are called the dual basis of $\{b_n\}$.
In terms of inner products the solution can be written as
$$s = \begin{pmatrix} \langle b_1, b_1\rangle_W & \cdots & \langle b_1, b_N\rangle_W \\ \vdots & \ddots & \vdots \\ \langle b_N, b_1\rangle_W & \cdots & \langle b_N, b_N\rangle_W \end{pmatrix}^{-1} \begin{pmatrix} \langle b_1, f\rangle_W \\ \vdots \\ \langle b_N, f\rangle_W \end{pmatrix} \qquad (5)$$
The signal $f$ can for instance be a local area in an image, and the basis functions $b_n$ can for example be polynomials, Fourier functions or other useful analyzing functions (reshaped as vectors). Doing this approximation in each local area of a (complex-valued) image can be efficiently implemented by means of convolutions, hence the name normalized convolution. If we use the pointwise multiplication '·', we can rewrite equation 5 as

$$s = \begin{pmatrix} \langle a\cdot b_1\cdot\bar b_1, c\rangle & \cdots & \langle a\cdot b_1\cdot\bar b_N, c\rangle \\ \vdots & \ddots & \vdots \\ \langle a\cdot b_N\cdot\bar b_1, c\rangle & \cdots & \langle a\cdot b_N\cdot\bar b_N, c\rangle \end{pmatrix}^{-1} \begin{pmatrix} \langle a\cdot b_1, c\cdot f\rangle \\ \vdots \\ \langle a\cdot b_N, c\cdot f\rangle \end{pmatrix} \qquad (6)$$

The left arguments in the scalar products, $a\cdot b_i\cdot\bar b_j$ and $a\cdot b_i$, can be interpreted as filters that are to be correlated with the signals $c$ and $c\cdot f$ respectively.
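As a concrete illustration of equation 6, the following is a minimal brute-force sketch in Python/NumPy (no separable speedups yet). All names are our own, and the basis functions $b_n$ and the applicability $a$ are assumed to be sampled on a common odd-sized grid:

```python
import numpy as np
from scipy.signal import fftconvolve

def correlate(img, kernel):
    # correlation = convolution with the flipped, conjugated kernel
    return fftconvolve(img, np.conj(kernel[::-1, ::-1]), mode='same')

def normalized_convolution(f, c, basis, a):
    """Brute-force version of eq. 6. f: complex signal image; c: real
    certainty image; basis: list of N sampled basis functions b_n;
    a: applicability window of the same (odd) size as the b_n."""
    N = len(basis)
    G = np.empty(f.shape + (N, N), dtype=complex)   # <a*bi*conj(bj), c>
    h = np.empty(f.shape + (N,), dtype=complex)     # <a*bi, c*f>
    for i, bi in enumerate(basis):
        h[..., i] = correlate(c * f, a * bi)
        for j, bj in enumerate(basis):
            G[..., i, j] = correlate(c, a * bi * np.conj(bj))
    # tiny diagonal bias keeps zero-certainty areas solvable (cf. sec. 8.3)
    G += 1e-9 * np.eye(N)
    return np.linalg.solve(G, h[..., None])[..., 0]
```

Sections 6-8 are about computing exactly these quantities much more cheaply.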
If the overall signal certainty $c$ is too low we cannot rely on the result $s$. For the signal $f$ we had a certainty measure $c$ indicating how well we can rely on the information in $f$. We can apply the same philosophy to the solution $s$ and use an output certainty $c_{out}$ indicating how well we can rely on $s$. There exist several suggestions for $c_{out}$, see [8]. The one used in this report is from [19]:

$$c_{out} = \left( \frac{\det(B^*W_aW_cB)}{\det(B^*W_aB)} \right)^{1/N} \qquad (7)$$

which measures how much less distinguishable the basis functions become when we include the uncertainty, compared to full certainty. Note that even if the basis functions are orthogonal in the case of full certainty ($c \equiv 1$), they may not necessarily be so when the certainty is varying. The $1/N$ exponent makes $c_{out}$ proportional to $c$.
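Given the per-pixel Gram matrices from the sketch above, equation 7 is a determinant ratio; a small sketch (again with our own names):

```python
import numpy as np

def output_certainty(G, G1, N):
    """Eq. 7. G = B*WaWcB per pixel (shape (..., N, N)); G1 = B*WaB,
    computed with full certainty (constant or per pixel). Determinants
    of Hermitian Gram matrices are real and non-negative, so the real
    part is taken for numerical robustness."""
    ratio = np.linalg.det(G).real / np.linalg.det(G1).real
    return np.clip(ratio, 0.0, None) ** (1.0 / N)
```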
4 Local orientation in double angle representation, z
There exist a number of ways to detect local orientation in images. We will not deal with these methods in this report but rather concentrate on orientation representations.
Figure 1: Double angle representation of local orientation.
The classical representation of local orientation is simply a 2D-vector pointing in the dominant direction with a magnitude that reflects the orientation dominance (e.g. the energy in the dominant direction). An example of this is the image gradient.
A better representation is the double angle representation, where we have a vector, or a complex number z, pointing in the double angle direction. This means that if the orientation has the direction θ we represent it with a vector pointing in the 2θ-direction, i.e. z = ei2θ. Figure 1 illustrates the idea.
This representation has at least two advantages:
• We avoid ambiguities in the representation: It does not matter if we choose to say that the orientation has the direction θ or, equivalently, θ + π. In the double angle representation both choices get the same descriptor $e^{i2\theta}$.

• Averaging the double angle orientation description field makes sense. One can argue that two orthogonal orientations should have maximally different representations, e.g. vectors that point in opposite directions. This is for instance useful in color images and vector fields when we want to fuse descriptors from several channels into one description.
The double angle descriptor will in this report be denoted by a complex number $z$. The argument, $\arg z$, points out the dominant orientation and the magnitude, $c = |z|$, usually represents the certainty (cf. section 3) or energy of the orientation. An example can be found in figure 2, which contains a simple gray-level image and its local orientation description. The local orientation is represented in color, which is a powerful way to visualize complex numbers.
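How $z$ is obtained is outside the scope of this report; for completeness, one common estimator (not prescribed here) is the squared complex gradient, sketched below with assumed parameter values:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def double_angle_orientation(img, sigma=1.5):
    """One common estimate of z = c * exp(i*2*theta): square the complex
    gradient so that opposite gradient directions map to the same
    descriptor, then average locally (averaging is meaningful in the
    double angle representation). sigma = 1.5 is an arbitrary choice."""
    gx = sobel(img, axis=1)                  # horizontal derivative
    gy = sobel(img, axis=0)                  # vertical derivative
    z = (gx + 1j * gy) ** 2                  # double-angle representation
    return (gaussian_filter(z.real, sigma)
            + 1j * gaussian_filter(z.imag, sigma))
```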
In the rest of this report we will use the local orientation to detect curvature.
5 Rotational symmetries
The rotational symmetry filters were invented around 1981 by Granlund and Knutsson, but were first mentioned in a patent from 1986 [14] and later in [11], [6], [7]. This section contains a short summary of the rotational symmetry theory.
Figure 2: A simple gray-level image (left) and its local orientation description $z$ (right). The color represents the phase $\angle z$ and the intensity represents the magnitude $|z| = c$.
5.1 Short introduction
The reader can verify that a circle can be described in the double angle representation as $ce^{i2\varphi}$ and a star pattern as $ce^{i2\varphi}e^{i\pi}$. Therefore we can use the same filter $b = e^{i2\varphi}$ and correlate it with the local orientation $z$:

$$(z \star b)(x) = \sum_{y\in\mathbb{Z}^2} z(y - x)\,b^*(y) \qquad (8)$$
to detect both patterns. We will get a high response for both patterns, but with different phase (0 and π respectively). The circle and star patterns described above are only two of many in a family of patterns called rotational symmetries:

$$\text{n:th order rotational symmetry:}\quad b_n(\varphi) = e^{in\varphi}$$
Each symmetry order describes a whole class of gray-level patterns. Some examples are found in figure 3. The most useful symmetries (i.e. the most common visual patterns in our daily lives) are:

• $b_0$: Describes lines.

• $b_1$: Describes curvature and line-endings. (They are also called parabolic symmetries, referring to the gray-level patterns they describe.)

• $b_2$: Describes circles, stars and spiral patterns. (Also called circular symmetries.)
Thus if we use $b_0$, $b_1$, and $b_2$ as filters and correlate on the local orientation description $z$ of an image we can detect a lot of interesting patterns. The top row in figure 13 shows the result of correlating the local orientation $z$ in figure 2 with the filters $a(r)\cdot b_n(\varphi)$, $n = 0, 1, 2$, i.e. in each point in the image we calculate

$$\begin{aligned} s_0 &= \langle a\cdot b_0, z\rangle \\ s_1 &= \langle a\cdot b_1, z\rangle \\ s_2 &= \langle a\cdot b_2, z\rangle \end{aligned} \quad\Longleftrightarrow\quad s = B^* W_a z \qquad (9)$$

$a(r)$ is in this case a Gaussian and is related to the applicability function mentioned in section 3 (a window for the basis functions $b_n$). The filters $a(r)\cdot b_n(\varphi)$ are shown in figure 4 using the color representation mentioned in section 4.

Figure 3: Some examples of 0:th, 1:st, and 2:nd order symmetry patterns (pattern classes 0, 1, and 2) and their corresponding local orientation descriptions $e^{i(0\varphi+\alpha)}$, $e^{i(1\varphi+\alpha)}$, and $e^{i(2\varphi+\alpha)}$. Note that all patterns in a column have the same orientation description when the orientation certainty is disregarded.
Figure 4: Rotational symmetry filters $a\cdot b_n$, where $b_n = e^{in\varphi}$.
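For reference, the filters in figure 4 can be sampled directly; a sketch with our own names, truncating the Gaussian at 3σ (an assumption) and using the image-coordinate convention $\varphi = \arctan2(y, x)$:

```python
import numpy as np

def symmetry_filters(sigma, orders=(0, 1, 2)):
    """Sample the filters a(r) * exp(i*n*phi) of figure 4 on a grid,
    with a Gaussian applicability a(r) truncated at 3*sigma."""
    r_max = int(np.ceil(3 * sigma))
    y, x = np.mgrid[-r_max:r_max + 1, -r_max:r_max + 1]
    r = np.hypot(x, y)
    phi = np.arctan2(y, x)                   # image-coordinate convention
    a = np.exp(-0.5 * (r / sigma) ** 2)
    return [a * np.exp(1j * n * phi) for n in orders]
```

The responses of equation 9 are then the correlations of $z$ with each of these filters, e.g. using the `correlate` helper from the sketch in section 3.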
Note that the responses $s_n$ can be thought of as coefficients of a local polar Fourier transform. The responses are not very selective, as can be seen in the top row in figure 13. The first and second order filters also detect lines, which is undesirable if we want to use them as curvature detectors. Therefore techniques have been developed to make the responses more selective and sparse. Two of them are discussed in sections 5.2 and 5.3.
5.2 Rotational symmetries and normalized convolution
This technique is described further in [13]. The idea is to use the normalized convolution theory discussed in section 3 to make the responses more selective. We simply use

$$b_0 = e^{i0\varphi} = 1, \qquad b_1 = e^{i1\varphi} = e^{i\varphi}, \qquad b_2 = e^{i2\varphi} \qquad (10)$$

as basis functions. The goal is then to approximate the signal $f = \hat z = z/|z|$ with a linear combination of the basis functions, i.e.

$$\hat z \sim s_0 b_0 + s_1 b_1 + s_2 b_2 \qquad (11)$$
using $c = |z|$ as certainty and a Gaussian function as applicability $a$. The solution becomes

$$s = (B^*WB)^{-1}B^*W\hat z = (B^*WB)^{-1}B^*W_a z \quad\text{where}\quad W = W_aW_c \qquad (12)$$
The middle row in figure 13 contains the result when multiplied with the output certainty in equation 7. The result is now much more selective and sparse. It is important to use the magnitude $|z|$ as certainty and not as part of the signal. If we had used a constant certainty and $z$ as signal, we would arrive at the same result as in the previous section (top row in figure 13). This is because the basis functions are orthogonal in full certainty and therefore equation 4 simply becomes $s = B^*Wz$, i.e. $s_n = \langle a\cdot b_n, z\rangle$.
5.3 Rotational symmetries and normalized inhibition
This technique is described further in [12]. The normalized convolution implies that we calculate $(B^*WB)^{-1}$ in each local area of the image. To avoid these computations we can instead let the filter responses in equation 9 inhibit each other, so that if one response is high the other ones get lower. This can be thought of as a fuzzy classification. Before we inhibit we normalize with the amount of certainty, i.e. $\frac{\langle a\cdot b_n, z\rangle}{\langle a, c\rangle}$. This ensures (easy to show) that the magnitude of the response lies between 0 and 1, which makes the inhibition easier. The normalized inhibition is calculated as

$$\begin{aligned}
s_0 &= \langle a\cdot b_0, z\rangle \cdot \Big(1-\frac{\langle a\cdot b_1, z\rangle}{\langle a, c\rangle}\Big)\Big(1-\frac{\langle a\cdot b_2, z\rangle}{\langle a, c\rangle}\Big) \\
s_1 &= \langle a\cdot b_1, z\rangle \cdot \Big(1-\frac{\langle a\cdot b_0, z\rangle}{\langle a, c\rangle}\Big)\Big(1-\frac{\langle a\cdot b_2, z\rangle}{\langle a, c\rangle}\Big) \\
s_2 &= \langle a\cdot b_2, z\rangle \cdot \Big(1-\frac{\langle a\cdot b_0, z\rangle}{\langle a, c\rangle}\Big)\Big(1-\frac{\langle a\cdot b_1, z\rangle}{\langle a, c\rangle}\Big)
\end{aligned} \qquad (13)$$
The result when calculating this on the image in figure 2 is shown in the bottom row in figure 13.
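A sketch of equation 13, assuming the three responses $\langle a\cdot b_n, z\rangle$ and the image $\langle a, c\rangle$ have already been computed. The printed equation leaves the complex/magnitude convention of the inhibition factor implicit; the sketch uses magnitudes, which keeps the factors real and in [0, 1] as the text states:

```python
import numpy as np

def normalized_inhibition(resp, ac):
    """Sketch of eq. 13. resp: list of the three complex responses
    <a*b_n, z>; ac: the image <a, c>. Factors use magnitudes so that
    they stay real in [0, 1]; whether the original uses the complex
    value instead is a convention we do not fix here."""
    eps = 1e-12
    ni = [np.clip(np.abs(r) / np.maximum(ac, eps), 0.0, 1.0) for r in resp]
    out = []
    for n in range(3):
        factor = np.ones_like(ac)
        for m in range(3):
            if m != n:
                factor = factor * (1.0 - ni[m])
        out.append(resp[n] * factor)
    return out
```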
All of the three methods explained above (equations 9, 12, and 13) involve correlating the local orientation $z$ with a number of 2D-filters, $a\cdot b_n$. These filters can be made approximately separable into a small number of 1D-filters using SVD (Singular Value Decomposition) (see [12]) or, if we want to detect the symmetries in several scales, we can create a filter-net using a technique similar to the one described in [1] (the results from this are not yet reported). The design of the filter-net is not automatic but involves a lot of engineering, which makes it a bit inflexible, and the SVD approach is difficult to generalize efficiently to multiscale filtering.
The next section introduces a technique based on polynomial fitting on the local orientation. This approach creates fewer practical problems. The single-scale/multiscale filters become very efficient and can be created automatically, which makes this method more flexible.
6 Using polynomial fitting to detect curvature
As mentioned above, the rotational symmetry theory can be interpreted as a local polar Fourier transform of the orientation $z$. The method below instead describes the rotational symmetries in terms of polynomials. The trick is to modify the basis functions in equation 10 into

$$\begin{aligned}
b_0 &= e^{i0\varphi} &&= (x + iy)^0 &&= 1 \\
b_1 &= re^{i1\varphi} &&= (x + iy)^1 &&= x + iy \\
b_2 &= r^2e^{i2\varphi} &&= (x + iy)^2 &&= x^2 - y^2 + i2xy
\end{aligned} \qquad (14)$$
The new basis functions weighted with a Gaussian applicability are shown in figure 5.
Figure 5: Rotational symmetry filters $a\cdot b_n$, where $b_n = r^ne^{in\varphi}$.
Both of the methods described above to make the responses more selective, i.e. normalized convolution (sec. 5.2) and normalized inhibition (sec. 5.3), can still be used in this approach. Sections 6.1 and 6.2 describe how.
First we define the n:th order polynomial basis, $P_n$, as the basis containing all monomials $x^py^q$ up to the n:th order. In this report we will use $P_2$ and $P_4$, which have the following appearance:

$$P_2 = \begin{pmatrix} | & | & | & | & | & | \\ 1 & x & y & x^2 & xy & y^2 \\ | & | & | & | & | & | \end{pmatrix}$$

$$P_4 = \begin{pmatrix} 1 & x & y & x^2 & xy & y^2 & x^3 & x^2y & xy^2 & y^3 & x^4 & x^3y & x^2y^2 & xy^3 & y^4 \end{pmatrix} \qquad (15)$$

(each entry denoting a column vector).
Using this notation we can write the three basis functions in equation 14 as

$$B = \begin{pmatrix} | & | & | \\ b_0 & b_1 & b_2 \\ | & | & | \end{pmatrix} = \begin{pmatrix} | & | & | & | & | & | \\ 1 & x & y & x^2 & xy & y^2 \\ | & | & | & | & | & | \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & i & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 2i \\ 0 & 0 & -1 \end{pmatrix} = P_2 T_{P_2B} \qquad (16)$$

$T_{P_2B}$ is the transformation matrix from the polynomial basis $P_2$ to the rotational symmetry basis.
6.1 Normalized convolution case
To find the normalized convolution solution, equation 6 (and 12), we have to calculate the scalar products

$$\langle g\cdot b_n, z\rangle \quad\text{and}\quad \langle g\cdot b_n\cdot\bar b_m, c\rangle$$

(assuming we use a Gaussian applicability $g$). We will now show that these can be written in terms of scalar products between the signal/certainty and the polynomial basis. We start with $\langle g\cdot b_n, z\rangle$:

$$\begin{pmatrix} \langle b_0, \hat z\rangle_W \\ \langle b_1, \hat z\rangle_W \\ \langle b_2, \hat z\rangle_W \end{pmatrix} = \begin{pmatrix} \langle g, z\rangle \\ \langle x\cdot g, z\rangle - i\langle y\cdot g, z\rangle \\ \langle x^2\cdot g, z\rangle - \langle y^2\cdot g, z\rangle - i2\langle xy\cdot g, z\rangle \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & -i & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & -2i & -1 \end{pmatrix} \begin{pmatrix} \langle g, z\rangle \\ \langle x\cdot g, z\rangle \\ \langle y\cdot g, z\rangle \\ \langle x^2\cdot g, z\rangle \\ \langle xy\cdot g, z\rangle \\ \langle y^2\cdot g, z\rangle \end{pmatrix} = T^*_{P_2B} P_2^T W_g z \qquad (17)$$
where $W_g = \mathrm{diag}(g)$. A more compact way to arrive at the same result is to use matrix calculations:

$$\begin{pmatrix} \langle b_0, \hat z\rangle_W \\ \langle b_1, \hat z\rangle_W \\ \langle b_2, \hat z\rangle_W \end{pmatrix} = B^*W_gW_c\hat z = B^*W_gz = (P_2T_{P_2B})^*W_gz = T^*_{P_2B}P_2^TW_gz \qquad (18)$$

The other scalar products, $\langle g\cdot b_n\cdot\bar b_m, c\rangle$, can also be expressed in terms of the
polynomial basis:

$$\begin{aligned}
\langle b_0, b_0\rangle_W &= \langle g, c\rangle \\
\langle b_0, b_1\rangle_W &= \langle x\cdot g, c\rangle + i\langle y\cdot g, c\rangle \\
\langle b_0, b_2\rangle_W &= \langle x^2\cdot g, c\rangle - \langle y^2\cdot g, c\rangle + i2\langle xy\cdot g, c\rangle \\
\langle b_1, b_1\rangle_W &= \langle x^2\cdot g, c\rangle + \langle y^2\cdot g, c\rangle \\
\langle b_1, b_2\rangle_W &= \langle x^3\cdot g, c\rangle + \langle xy^2\cdot g, c\rangle + i\big(\langle x^2y\cdot g, c\rangle + \langle y^3\cdot g, c\rangle\big) \\
\langle b_2, b_2\rangle_W &= \langle x^4\cdot g, c\rangle + \langle y^4\cdot g, c\rangle + 2\langle x^2y^2\cdot g, c\rangle \\
\langle b_1, b_0\rangle_W &= \overline{\langle b_0, b_1\rangle_W}, \quad \langle b_2, b_0\rangle_W = \overline{\langle b_0, b_2\rangle_W}, \quad \langle b_2, b_1\rangle_W = \overline{\langle b_1, b_2\rangle_W}
\end{aligned} \qquad (19)$$
which written in matrix form becomes

$$\begin{pmatrix} \langle b_0,b_0\rangle_W \\ \langle b_0,b_1\rangle_W \\ \langle b_0,b_2\rangle_W \\ \langle b_1,b_1\rangle_W \\ \langle b_1,b_2\rangle_W \\ \langle b_2,b_2\rangle_W \end{pmatrix} = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & i & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 2i & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & i & 1 & i & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 2 & 0 & 1
\end{pmatrix} P_4^TW_gc = U_P P_4^TW_gc \qquad (20)$$

Consequently, if we want to calculate the normalized convolution we can first calculate the scalar products $\langle g\cdot x^py^q, z\rangle$ and $\langle g\cdot x^py^q, c\rangle$ and then form linear combinations of the results in order to get the scalar products $\langle a\cdot b_n, z\rangle$ and $\langle a\cdot b_n\cdot\bar b_m, c\rangle$. The correlations with the polynomial basis functions can be made by means of separable 1D-filters and are therefore more efficient than correlating with the original basis functions $b_n$ directly.
Figure 6 shows the correlator structure needed to calculate the normalized convolution solution.
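The separable structure of figure 6 can be sketched as follows (the names and the 3σ kernel truncation are our assumptions). For the Gram matrix in equation 20 the same routine is applied to the certainty $c$ with `max_order=4`:

```python
import numpy as np
from scipy.ndimage import correlate1d

def _corr1d(img, k, axis):
    # correlate a (possibly complex) image with a real 1D kernel
    return (correlate1d(img.real, k, axis=axis)
            + 1j * correlate1d(img.imag, k, axis=axis))

def poly_gauss_correlations(img, sigma, max_order=2):
    """Correlate img with every filter x^p y^q g(x) g(y), p+q <= max_order,
    using only 1D filters -- the separable structure behind figure 6."""
    r = int(np.ceil(3 * sigma))
    t = np.arange(-r, r + 1, dtype=float)
    g = np.exp(-0.5 * (t / sigma) ** 2)
    k = [t ** m * g for m in range(max_order + 1)]   # 1D kernels x^m g(x)
    return {(p, q): _corr1d(_corr1d(img, k[p], 1), k[q], 0)
            for p in range(max_order + 1)
            for q in range(max_order + 1 - p)}
```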
6.2 Normalized inhibition case
Finding the normalized inhibition solution is simpler than the normalized convolution case. As seen from equation 13 we have to calculate the scalar products

$$\langle g\cdot b_n, z\rangle \quad\text{and}\quad \langle g, c\rangle$$

The scalar products $\langle g\cdot b_n, z\rangle$ are calculated in the same way as in the previous case (equation 18), and the remaining scalar product, $\langle g, c\rangle$, is simply the certainty correlated with a Gaussian filter.
We have to make one modification to the normalized inhibition in equation 13, though. The normalized inhibition factor, $n_{in} = \frac{\langle g\cdot b_n, z\rangle}{\langle g, c\rangle}$, described in section 5.3 was based on the property $0 \le n_{in} \le 1$. This does not hold for the new basis functions $b_1 = re^{i\varphi}$ and $b_2 = r^2e^{i2\varphi}$. But we can modify the inhibition factor and instead use, for example,

$$n_{in} = h\!\left( \frac{\langle g\cdot b_n, z\rangle}{\langle g, c\rangle} \bigg/ \frac{\langle g\cdot r^n, 1\rangle}{\langle g, 1\rangle} \right) \quad\text{where}\quad h(s) = \begin{cases} 0 & \text{if } s < 0 \\ 1 & \text{if } s > 1 \\ s & \text{otherwise} \end{cases} \qquad (21)$$

The term $\frac{\langle g\cdot r^n, 1\rangle}{\langle g, 1\rangle}$ is the value of $\frac{\langle g\cdot b_n, z\rangle}{\langle g, c\rangle}$ if we have full certainty and $\hat z = e^{in\varphi} = \hat b_n$. We can now, just as in equation 13, calculate the normalized inhibition as

$$\begin{aligned}
s_0 &= \langle g\cdot b_0, z\rangle \cdot (1 - n_{i1})(1 - n_{i2}) \\
s_1 &= \langle g\cdot b_1, z\rangle \cdot (1 - n_{i0})(1 - n_{i2}) \\
s_2 &= \langle g\cdot b_2, z\rangle \cdot (1 - n_{i0})(1 - n_{i1})
\end{aligned} \qquad (22)$$
Figure 7 shows the correlator structure needed to calculate the normalized inhibition solution.
Figure 6: Correlator structure for the normalized convolution case using a polynomial basis. There is understood to be a Gaussian applicability factor in each box as well. The resulting filter kernels are found as the rightmost column multiplied with a Gaussian.
Figure 7: Correlator structure for the normalized inhibition case using a polynomial basis. There is understood to be a Gaussian applicability factor in each box as well. The resulting filter kernels are found as the rightmost column multiplied with a Gaussian.
7 Efficient polynomial fitting using Gaussian derivatives
As mentioned in the previous section, the polynomial fitting can be calculated using separable 1D-filters. However, the algorithm can be made even more efficient if we use derivatives of Gaussians. The idea is that a polynomial multiplied with a Gaussian, $x^my^ng$, can be expressed in terms of partial derivatives of the Gaussian, $\frac{\partial^{m+n}}{\partial x^m\partial y^n}g$, and vice versa. In linear algebra terms we can say that the polynomial basis $x^my^ng$ and the partial derivative basis $\frac{\partial^{m+n}}{\partial x^m\partial y^n}g$ span the same subspace, and therefore there exists a linear transformation between them.
Correlating with the partial derivative basis can be made more efficient than correlating with the polynomial basis. The partial derivatives can be calculated by first convolving with a Gaussian filter, followed by small derivating filters, typically of the type [−1 0 1].
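A small 1D illustration of this idea, under the assumption that the Gaussian kernel is truncated at 3σ and that a plain central difference stands in for the optimized derivating filters of section 11.1:

```python
import numpy as np

def gauss_then_diff(signal, sigma, order):
    """Approximate the order:th Gaussian derivative response of a 1D
    signal: one correlation with a sampled Gaussian, then `order`
    passes of the small central-difference filter [-1 0 1]/2."""
    r = int(np.ceil(3 * sigma))
    t = np.arange(-r, r + 1, dtype=float)
    g = np.exp(-0.5 * (t / sigma) ** 2)
    g /= g.sum()
    out = np.correlate(signal, g, mode='same')
    d = np.array([-1.0, 0.0, 1.0]) / 2.0
    for _ in range(order):
        out = np.correlate(out, d, mode='same')
    return out
```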
7.1 Relation between polynomials and partial derivatives
The relation between partial derivatives of a Gaussian and polynomials multiplied with a Gaussian is, in the one-dimensional case,

$$\begin{aligned}
g(x) &= e^{-\frac{1}{2}\left(\frac{x}{\sigma}\right)^2} \\
g'(x) &= -\frac{x}{\sigma^2}\,g(x) \\
g''(x) &= -\frac{1}{\sigma^2}\,g(x) - \frac{x}{\sigma^2}\,g'(x) = -\frac{1}{\sigma^2}\,g(x) + \frac{x^2}{\sigma^4}\,g(x) = \frac{x^2-\sigma^2}{\sigma^4}\,g(x) \\
g'''(x) &= \frac{2x}{\sigma^4}\,g(x) + \frac{x^2-\sigma^2}{\sigma^4}\,g'(x) = \frac{2x}{\sigma^4}\,g(x) - \frac{x^2-\sigma^2}{\sigma^4}\frac{x}{\sigma^2}\,g(x) = \frac{3\sigma^2x - x^3}{\sigma^6}\,g(x) \\
g''''(x) &= \frac{3\sigma^2 - 3x^2}{\sigma^6}\,g(x) + \frac{3\sigma^2x - x^3}{\sigma^6}\,g'(x) = \frac{3\sigma^2 - 3x^2}{\sigma^6}\,g(x) - \frac{3\sigma^2x - x^3}{\sigma^6}\frac{x}{\sigma^2}\,g(x) = \frac{x^4 - 6\sigma^2x^2 + 3\sigma^4}{\sigma^8}\,g(x)
\end{aligned} \qquad (23)$$

The 2D case can easily be derived from the 1D case.
Let us introduce the (a bit sloppy) notation $\partial x^m\partial y^n = \sigma^{2(m+n)}\frac{\partial^{m+n}g}{\partial x^m\partial y^n}$ and define the n:th order partial derivative basis, $D_n$, as the basis containing all partial derivatives in 2D up to the n:th order. In this report we will use $D_2$ and $D_4$. They are given in plain language below:

$$D_2 = \begin{pmatrix} | & | & | & | & | & | \\ 1 & \partial x & \partial y & \partial x^2 & \partial x\partial y & \partial y^2 \\ | & | & | & | & | & | \end{pmatrix}$$

$$D_4 = \begin{pmatrix} 1 & \partial x & \partial y & \partial x^2 & \partial x\partial y & \partial y^2 & \partial x^3 & \partial x^2\partial y & \partial x\partial y^2 & \partial y^3 & \partial x^4 & \partial x^3\partial y & \partial x^2\partial y^2 & \partial x\partial y^3 & \partial y^4 \end{pmatrix} \qquad (24)$$

(each entry denoting a column vector, as in equation 15). Let $T_{P_nD_n}$ denote the transformation matrix between the partial derivative basis $D_n$ and the polynomial basis (multiplied with the Gaussian) $W_gP_n$, i.e.

$$D_n = W_gP_nT_{P_nD_n} \qquad (25)$$

The transformation matrices for n = 2 and n = 4 are

$$T_{P_2D_2} = \begin{pmatrix}
1 & 0 & 0 & -\sigma^2 & 0 & -\sigma^2 \\
0 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix} \qquad (26)$$

and

$$T_{P_4D_4} = \begin{pmatrix}
1 & 0 & 0 & -\sigma^2 & 0 & -\sigma^2 & 0 & 0 & 0 & 0 & 3\sigma^4 & 0 & \sigma^4 & 0 & 3\sigma^4 \\
0 & -1 & 0 & 0 & 0 & 0 & 3\sigma^2 & 0 & \sigma^2 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 & 0 & 0 & 0 & \sigma^2 & 0 & 3\sigma^2 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & -6\sigma^2 & 0 & -\sigma^2 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & -3\sigma^2 & 0 & -3\sigma^2 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & -\sigma^2 & 0 & -6\sigma^2 \\
0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix} \qquad (27)$$
8 Using partial derivatives to detect curvature
We now use the theory in section 7 to make a very efficient correlator structure to detect the rotational symmetries.
8.1 Normalized convolution case
We start with the normalized convolution case discussed in section 6.1.
Instead of correlating the signal $z$ with $P_2$, as we did in equation 18, we can correlate with $D_2$. If the relation above is inserted into equation 18 we get

$$\begin{pmatrix} \langle b_0, \hat z\rangle_W \\ \langle b_1, \hat z\rangle_W \\ \langle b_2, \hat z\rangle_W \end{pmatrix} = T^*_{P_2B}P_2^TW_gz = T^*_{P_2B}T^{-T}_{P_2D_2}D_2^Tz = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & i & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & -2i & -1 \end{pmatrix} D_2^Tz = T^*_{D_2B}D_2^Tz \qquad (28)$$

where

$$T_{D_2B} = T^{-1}_{P_2D_2}T_{P_2B} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & -i & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 2i \\ 0 & 0 & -1 \end{pmatrix} \qquad (29)$$
In the same way we can correlate the certainty $c$ with $D_4$ instead of $P_4$. We can now rewrite equation 20 as

$$\begin{pmatrix} \langle b_0,b_0\rangle_W \\ \langle b_0,b_1\rangle_W \\ \langle b_0,b_2\rangle_W \\ \langle b_1,b_1\rangle_W \\ \langle b_1,b_2\rangle_W \\ \langle b_2,b_2\rangle_W \end{pmatrix} = U_PP_4^TW_gc = U_PT^{-T}_{P_4D_4}D_4^Tc = U_DD_4^Tc \qquad (30)$$

where

$$U_D = U_PT^{-T}_{P_4D_4} = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & -i & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 2i & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
2\sigma^2 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & -4\sigma^2 & -4i\sigma^2 & 0 & 0 & 0 & -1 & -i & -1 & -i & 0 & 0 & 0 & 0 & 0 \\
8\sigma^4 & 0 & 0 & 8\sigma^2 & 0 & 8\sigma^2 & 0 & 0 & 0 & 0 & 1 & 0 & 2 & 0 & 1
\end{pmatrix} \qquad (31)$$

The resulting correlator structure for the normalized convolution case using a partial derivative basis is shown in figure 8. This graph should be compared with the one in figure 6. Note that the number of large 1D-filters has been reduced to only two Gaussian filters for the signal correlation and two for the certainty correlation.
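Putting equations 28-29 together with the smoothing-plus-small-filters idea of section 7 gives a compact single-scale sketch. The names and the 3σ truncation are ours, and the signs of $s_1$ and $s_2$ depend on the correlation convention, so treat them as indicative:

```python
import numpy as np
from scipy.ndimage import correlate1d

def _corr1d(img, k, axis):
    # correlate a (possibly complex) image with a real 1D kernel
    return (correlate1d(img.real, k, axis=axis)
            + 1j * correlate1d(img.imag, k, axis=axis))

def symmetry_responses(z, sigma):
    """s0, s1, s2 of eq. 28: one separable Gaussian on z, then small
    [-1 0 1]/2 central-difference filters, then the fixed linear map
    of eq. 29 (signs follow one correlation convention)."""
    r = int(np.ceil(3 * sigma))
    t = np.arange(-r, r + 1, dtype=float)
    g = np.exp(-0.5 * (t / sigma) ** 2)
    g /= g.sum()
    d = np.array([-1.0, 0.0, 1.0]) / 2.0
    zg = _corr1d(_corr1d(z, g, 1), g, 0)            # Gaussian lowpass
    zx, zy = _corr1d(zg, d, 1), _corr1d(zg, d, 0)   # first derivatives
    zxx = _corr1d(zx, d, 1)                         # second derivatives
    zxy = _corr1d(zx, d, 0)
    zyy = _corr1d(zy, d, 0)
    s0 = zg
    s1 = sigma**2 * (-zx + 1j * zy)
    s2 = sigma**4 * (zxx - 2j * zxy - zyy)
    return s0, s1, s2
```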
8.2 Normalized inhibition case
This case is easier than the previous one. We simply use the result in equation 28. The resulting correlator structure for the normalized inhibition case is shown in figure 9. This graph should be compared to the one in figure 7.
8.3 Compensating for numerical problems
When we approximate the polynomial filters with partial derivatives we will come across some numerical problems, especially in the normalized convolution case. An easy way to realize this is in the case of only one basis function $b$. Then the normalized convolution solution becomes

$$s = \frac{\langle a\cdot b, c\cdot f\rangle}{\langle a\cdot b\cdot\bar b, c\rangle} \qquad (32)$$
Figure 8: Correlator structure for the normalized convolution case using a partial derivative basis. The symbols $\partial x$ and $\partial y$ should be interpreted as $\sigma^2\frac{\partial}{\partial x}$ and $\sigma^2\frac{\partial}{\partial y}$ respectively. The rightmost column contains the resulting filter kernels.
Figure 9: Correlator structure for the normalized inhibition case using a partial derivative basis. The symbols $\partial x$ and $\partial y$ should be interpreted as $\sigma^2\frac{\partial}{\partial x}$ and $\sigma^2\frac{\partial}{\partial y}$ respectively. The rightmost column contains the resulting filter kernels.
Normally this is well behaved, but if $a\cdot b$ is approximated by one filter and $a$ is approximated by another filter, there can be numerical problems in areas with low certainty. The errors will have a high influence on the result if we have high signal certainty in elements where the corresponding filter approximations are poor and low certainty elsewhere. Then both the numerator and the denominator will be small, and an error in the approximations can send the solution far away from the true one.
To avoid this problem we can introduce a small constant bias certainty $\delta_c$. The solution (equation 4) will then become

$$s = (B^*W_a(W_c + \delta_cI)B)^{-1}B^*W_a(W_c + \delta_cI)\hat z = (B^*W_aW_cB + \delta_cB^*W_aB)^{-1}B^*W_a(z + \delta_c\hat z) \qquad (33)$$
$B^*W_aB$ is diagonal because the basis functions are orthogonal in full certainty. The diagonal has the values

$$\begin{aligned}
\langle b_0, b_0\rangle_W &= \langle a, 1\rangle \\
\langle b_1, b_1\rangle_W &= \langle x^2\cdot a, 1\rangle + \langle y^2\cdot a, 1\rangle = \langle r^2\cdot a, 1\rangle \\
\langle b_2, b_2\rangle_W &= \langle x^4\cdot a, 1\rangle + \langle y^4\cdot a, 1\rangle + 2\langle x^2y^2\cdot a, 1\rangle = \langle r^4\cdot a, 1\rangle
\end{aligned}$$
In order to calculate $(B^*W_aW_cB + \delta_cB^*W_aB)^{-1}$ we can first calculate $B^*WB$ as before and then add the constant (signal-independent) matrix $\delta_cB^*W_aB$ before we calculate the inverse. This is better than adding the bias to the certainty before the correlations, because then we would have to temporarily increase the image size. This is not necessary when we calculate $B^*W_a(z + \delta_c\hat z)$, because we can pretend that the signal is zero outside the image.

Note that adding a constant bias to the certainty is closely related to regularization, which is a general technique to solve ill-posed problems, see [18]. The idea is to add a small bias to the diagonal elements before inverting the matrix, i.e. $(G + \delta I)^{-1}$.
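A sketch of the regularized per-pixel solve in equation 33, assuming the per-pixel matrices and right-hand sides have already been computed (all names and the value of $\delta_c$ are ours):

```python
import numpy as np

def regularized_solve(G, h, h_hat, a, r2, delta_c=0.05):
    """Sketch of eq. 33. G: per-pixel B*WaWcB, shape (..., 3, 3);
    h, h_hat: per-pixel B*Wa z and B*Wa z_hat, shape (..., 3);
    a, r2: applicability window and squared radius on the filter grid;
    delta_c = 0.05 is an assumed value, not taken from the report."""
    # B*WaB is diagonal: <a,1>, <r^2 a,1>, <r^4 a,1> (see above)
    diag_full = np.array([a.sum(), (r2 * a).sum(), (r2 ** 2 * a).sum()])
    G_reg = G + delta_c * np.diag(diag_full)   # constant, signal independent
    rhs = h + delta_c * h_hat                  # B*Wa (z + delta_c * z_hat)
    return np.linalg.solve(G_reg, rhs[..., None])[..., 0]
```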
9 Complexity comparisons
Table 1 contains a comparison between the different correlator structures used to calculate $B^*W_gz$ and $B^*WB$ in the normalized convolution case, and $B^*W_gz$ and $\langle g, c\rangle$ in the normalized inhibition case.

Table 1: Comparison between the different methods. The values are numbers of operations (1 addition + 1 multiplication) per pixel. The numbers include correlation with a basis and the following linear combination of the correlations.

Normalized convolution        $B^*W_gz$      $B^*WB$
  using $b_n$ directly        $3M^2$         $6M^2$
  via polynomial basis        $9M+3$         $18M+9$
  via derivative basis        $(2M+5)+3$     $(2M+12)+15$

Normalized inhibition         $B^*W_gz$      $\langle g,c\rangle$
  using $b_n$ directly        $3M^2$         $2M$
  via polynomial basis        $9M+3$         $2M$
  via derivative basis        $(2M+5)+3$     $2M$
Independently of which correlator structure we choose, we get some additional operations after the correlations before we arrive at the final solution:

• The normalized convolution case (equation 6) involves the inverse of the 3 × 3 matrix, $(B^*WB)^{-1}$, which is then multiplied with a 3 × 1 vector $B^*W\hat z$. We also have to calculate the output certainty in equation 7.

• The normalized inhibition case also involves some extra operations to calculate the final solution, see equations 21 and 22.
10 Multiscale filtering
If we want to detect symmetry features in several scales, this can be done very efficiently using the partial derivative basis.
Take for example the normalized inhibition case: In each scale we correlate the signal $z$ with a Gaussian filter (using two separable 1D-filters) followed by a net of derivating filters, see figure 10. We also correlate the certainty with the same Gaussian filter. In the multiscale case we can first build a lowpass hierarchy using Gaussian filters and then in each scale correlate with a derivating filter net, see figure 11.
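A sketch of the multiscale structure in figure 11, reusing the hypothetical `symmetry_responses` from the sketch in section 8; the certainty channel, handled the same way, is omitted for brevity:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def _smooth(img, sigma):
    # Gaussian lowpass for complex images
    return (gaussian_filter(img.real, sigma)
            + 1j * gaussian_filter(img.imag, sigma))

def multiscale_responses(z, n_scales, sigma1=1.2):
    """The structure of figure 11: smooth and downsample alternately,
    running the single-scale derivative net (symmetry_responses from
    the section 8 sketch) in every scale."""
    results = []
    for _ in range(n_scales):
        results.append(symmetry_responses(z, sigma1))
        z = _smooth(z, sigma1)[::2, ::2]       # lowpass, decimate by 2
    return results
```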
11 Experiments
11.1 Approximating Gaussian derivatives
Figure 12 shows the approximating derivative filters in 1D up to the fourth derivative. They can be optimized using least squares. The derivatives can be quite well approximated using only 3 × 1 filters, especially when σ is large.
11.2 Rotational symmetries using polynomial fitting
Figure 14 contains the result of correlating the local orientation with the basis functions $b_n = r^ne^{in\varphi}$ (equation 14), using polynomial basis functions. The result when using partial derivative basis functions is almost identical (and therefore not shown here). Note that we detect somewhat larger symmetries in this case compared to when we used the basis functions $b_n = e^{in\varphi}$ (figure 13). This is because the new basis functions have more weight on large radii (compare figures 4 and 5). If we want to detect smaller features we can simply choose a smaller applicability. The result using the new basis functions $b_n = r^ne^{in\varphi}$ is, besides the feature size, fairly comparable to the result using the basis functions $b_n = e^{in\varphi}$.
11.3 Multiscale detection of rotational symmetries
The multiscale algorithm described in section 10 is tested on the same test image. The correlator structure in figure 11 is used.
First, a lowpass hierarchy is created using Gaussian filters and downsampling alternately. The same Gaussian filter, with $\sigma_1 = 1.2$, is used in all scales. The resulting standard deviation in scale $n$ becomes

$$\sigma_n = \sqrt{\sigma_{n-1}^2 + (2^{n-1}\sigma_1)^2}$$

Second, the lowpass hierarchy is correlated with a derivating filter net in each scale. The derivating filters are optimized using least squares. Note that the derivating filters should be considered to 'spread out' with increasing scale due to the downsampling step, i.e. if we had not downsampled they would have the appearance

Scale 1: ∼ [−1 0 1]
Scale 2: ∼ [−1 0 0 0 1]
Scale 3: ∼ [−1 0 0 0 0 0 1]
...

and should be optimized accordingly.
Third, the result from the correlator structure is used to calculate the final rotational symmetry responses. The result is found in figure 15.
Figure 10: Single-scale correlator structure for the normalized inhibition case using a partial derivative basis (same as in figure 9). The correlation of the certainty with a Gaussian is not included here.

Figure 11: Multiscale correlator structure for the normalized inhibition case using a partial derivative basis.

Figure 12: Approximation of derivatives of a Gaussian with σ = 3. Dashed line: true derivative. Solid line: approximation with derivating filters. Dash-dotted line: the error.

Figure 13: Rotational symmetry responses using the basis in equation 10. Top: correlating with each basis function separately (eq. 9). Middle: using normalized convolution (eq. 12). Bottom: using normalized inhibition (eq. 13). The applicability is chosen as a Gaussian with σ = 5.

Figure 14: Rotational symmetry responses using the basis in equation 14. Top: using normalized convolution. Bottom: using normalized inhibition. The applicability is chosen as a Gaussian with σ = 5.

Figure 15: Multiscale rotational symmetry responses on the test image in figure 2.
The result of multiscale filtering on some other test images is shown in figures 16 and 17. The first image is the famous test image 'lenna'. Note that we for example detect the eyes in one scale and the head in a larger scale.
The second image is an aerial image used in the WITAS project, see [21]. The goal here is to generate useful landmarks such as bends in the road (high curvature points), crossroads (cf. star shapes), and roundabouts (circle-like patterns) for use in navigation of autonomous aircraft.
To detect the symmetries in 5 scales from local orientation on a 256 × 256 image takes about 1.5 seconds in Matlab on a 299 MHz SUN Ultra 10.
12 Conclusions
This report introduces a new variant of rotational symmetries, which are used to describe curvature on local orientation. The new variant uses polynomial fitting, which can be made more efficient than correlating with the original basis functions. To further decrease the computational complexity, the polynomial fitting is shown to be equivalent to correlating with partial derivatives of Gaussians, which can be implemented very efficiently using a Gaussian filter followed by small derivating filters.
The results on a simple test image show that the new method gives subjectively the same result as the old one and is much faster to compute.
This report indicates that if we use a Gaussian applicability there is a strong relationship between the three bases

$$\begin{Bmatrix} b_0 = 1 \\ b_1 = x + iy \\ b_2 = (x + iy)^2 \end{Bmatrix} \longleftrightarrow \begin{Bmatrix} b_0 = 1 \\ b_1 = \partial x + i\partial y \\ b_2 = (\partial x + i\partial y)^2 \end{Bmatrix} \longleftrightarrow \begin{Bmatrix} b_0 = 1 \\ b_1 = e^{i\varphi} \\ b_2 = e^{i2\varphi} \end{Bmatrix}$$

i.e. polynomial fitting, partial derivatives and the polar Fourier transform basically measure the same things.
13 Acknowledgment

This work was supported by the Swedish Foundation for Strategic Research, project VISIT (VIsual Information Technology), and has been conducted in association with project WITAS. The author would like to thank the people at CVL for helpful discussions.
Figure 16: Multiscale rotational symmetry responses on the 'lenna' image.

Figure 17: Multiscale rotational symmetry responses on the aerial image from the WITAS project.
References

[1] M. Andersson, J. Wiklund, and H. Knutsson. Sequential filter trees for efficient 2D, 3D and 4D orientation estimation. Report LiTH-ISY-R-2070, ISY, SE-581 83 Linköping, Sweden, November 1998.

[2] F. Attneave. Some informational aspects of visual perception. Psychological Review, 61, 1954.

[3] H. Bårman and G. H. Granlund. Corner detection using local symmetry. In Proceedings from SSAB Symposium on Picture Processing, Lund University, Sweden, March 1988. SSAB. Report LiTH-ISY-I-0935, Computer Vision Laboratory, Linköping University, Sweden, 1988.

[4] Irving Biederman. Recognition-by-components: A theory of human image understanding. Psychological Review, 94(2):115–147, 1987.

[5] J. Bigün. Optimal orientation detection of circular symmetry. Report LiTH-ISY-I-0871, Computer Vision Laboratory, Linköping University, Sweden, 1987.

[6] J. Bigün. Local Symmetry Features in Image Processing. PhD thesis, Linköping University, Sweden, 1988. Dissertation No 179, ISBN 91-7870-334-4.

[7] Josef Bigün. Pattern recognition in images by symmetries and coordinate transformations. Computer Vision and Image Understanding, 68(3):290–307, 1997.

[8] G. Farnebäck. Spatial Domain Methods for Orientation and Velocity Estimation. Lic. Thesis LiU-Tek-Lic-1999:13, Dept. EE, Linköping University, SE-581 83 Linköping, Sweden, 1999. Thesis No. 755, ISBN 91-7219-441-3.

[9] G. Farnebäck. A unified framework for bases, frames, subspace bases, and subspace frames. In Proceedings of the 11th Scandinavian Conference on Image Analysis, pages 341–349, Kangerlussuaq, Greenland, June 1999. SCIA.

[10] Jack L. Gallant, Jochen Braun, and David C. Van Essen. Selectivity for polar, hyperbolic, and cartesian gratings in macaque visual cortex. Science, 259:100–103, January 1993.

[11] G. H. Granlund and H. Knutsson. Signal Processing for Computer Vision. Kluwer Academic Publishers, 1995. ISBN 0-7923-9530-1.

[12] Björn Johansson and Gösta Granlund. Fast selective detection of rotational symmetries using normalized inhibition. In Proceedings of the 6th European Conference on Computer Vision, volume I, pages 871–887, Dublin, Ireland, June 2000.

[13] Björn Johansson, Hans Knutsson, and Gösta Granlund. Detecting rotational symmetries using normalized convolution. In Proceedings of the 15th International Conference on Pattern Recognition, volume 3, pages 500–504, Barcelona, Spain, 2000.

[14] H. Knutsson and G. H. Granlund. Apparatus for determining the degree of variation of a feature in a region of an image that is divided into discrete picture elements. US Patent 4,747,151, 1988. (Swedish patent 1986).

[15] H. Knutsson, M. Hedlund, and G. H. Granlund. Apparatus for determining the degree of consistency of a feature in a region of an image that is divided into discrete picture elements. US Patent 4,747,152, 1988. (Swedish patent 1986).

[16] H. Knutsson and C-F. Westin. Normalized and differential convolution. In CVPR, pages 515–523. IEEE, June 1993.

[17] Gerald Oster. Phosphenes. Scientific American, 222(2):82–87, 1970.

[18] A. N. Tikhonov and V. Y. Arsenin. Solutions of Ill-Posed Problems. W. H. Winston, Washington, DC, 1977.

[19] C-J. Westelius. Focus of Attention and Gaze Control for Robot Vision. PhD thesis, Linköping University, SE-581 83 Linköping, Sweden, 1995. Dissertation No 379, ISBN 91-7871-530-X.

[20] C-F. Westin. A Tensor Framework for Multidimensional Signal Processing. PhD thesis, Linköping University, SE-581 83 Linköping, Sweden, 1994. Dissertation No 348, ISBN 91-7871-421-4.