
LEARNING CORNER ORIENTATION USING CANONICAL CORRELATION

B. Johansson

Computer Vision Laboratory

Dept. of Electrical Engineering

Linköping University

bjorn@isy.liu.se

M. Borga, H. Knutsson

Medical Informatics

Dept. of Biomedical Engineering

Linköping University

{knutte,magnus}@imt.liu.se

ABSTRACT

This paper shows how canonical correlation can be used to learn a detector for corner orientation that is invariant to corner angle and intensity. Pairs of images with the same corner orientation but different angle and intensity are used as training samples. Three different image representations are examined: intensity values, products between intensity values, and local orientation. The last representation gives a well behaved result that is easy to decode into the corner orientation. To reduce dimensionality, parameters from a polynomial model fitted to the different representations are also considered. This reduction did not affect the performance of the system.

1. INTRODUCTION

It is often difficult to design detectors for complex features analytically. An alternative approach is to have a system learn a feature detector from a set of training data. One method based on canonical correlation is described in [1], [2], [3], where a system learns a detector for local orientation that is invariant to signal phase by showing the system pairs of sinusoidal patterns that have the same orientation but different phase. Products between pixel values were used as input samples, and the system learned linear combinations of the sample components which could be decoded into the orientation angle in a simple manner. It turned out that the linear combinations could be interpreted as quadrature or Gabor filters. This paper shows that the same technique can be used to learn a descriptor of corner orientation which is invariant to corner angle and intensity. Three input representations are examined: intensity values, products between intensity values, and local orientation in double angle representation. The dimensionality of the input data can be quite large, especially if we use products between intensity values. Therefore, to reduce the dimensionality, parameters from a polynomial expansion model on the respective representations are also explored as input data.

This work was supported by the Foundation for Strategic Research, project VISIT - VISual Information Technology.

2. CANONICAL CORRELATION

Assume that we have two stochastic variables x ∈ C^M1 and y ∈ C^M2 (M1 and M2 do not have to be equal). Canonical correlation analysis, CCA, can be defined as the problem of finding two sets of basis vectors, one for x and the other for y, such that the correlations between the projections of the variables onto these basis vectors are mutually maximized. For the case of only one pair of basis vectors we have the projections x = ŵ_x^* x and y = ŵ_y^* y (^* denotes conjugate transpose), and the correlation is written as

$\rho = \frac{E[xy^*]}{\sqrt{E[|x|^2]\,E[|y|^2]}} = \frac{\hat{w}_x^* C_{xy} \hat{w}_y}{\sqrt{\hat{w}_x^* C_{xx} \hat{w}_x \;\hat{w}_y^* C_{yy} \hat{w}_y}}$   (1)

where C_xy = E[xy^*], C_xx = E[xx^*], C_yy = E[yy^*]. It can be shown that the maximal canonical correlation can be found by solving an eigenvalue system [1]. The first eigenvectors ŵ_x1, ŵ_y1 are the projections that have the highest correlation ρ_1. The next two eigenvectors have the second highest correlation, and so on. It can also be shown that the different projections are uncorrelated.
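As a concrete illustration, the following is a minimal sketch of CCA for real-valued data, computed via whitening and an SVD, which is equivalent to the eigenvalue formulation in [1]. The function name, data layout, and regularization term are assumptions for the sketch, not the authors' implementation (which also handles complex data).

```python
import numpy as np
from scipy.linalg import sqrtm, svd

def cca(X, Y, n_components=6, reg=1e-6):
    """CCA sketch: rows of X (N x M1) and Y (N x M2) are paired samples.
    Returns canonical correlations rho_k and projection vectors (columns)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    N = X.shape[0]
    Cxx = X.T @ X / N + reg * np.eye(X.shape[1])  # regularized covariances
    Cyy = Y.T @ Y / N + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / N
    # Whiten both variables; the singular values of the whitened
    # cross-covariance are the canonical correlations rho_k.
    Cxx_inv_sqrt = np.linalg.inv(np.real(sqrtm(Cxx)))
    Cyy_inv_sqrt = np.linalg.inv(np.real(sqrtm(Cyy)))
    U, rho, Vt = svd(Cxx_inv_sqrt @ Cxy @ Cyy_inv_sqrt)
    Wx = Cxx_inv_sqrt @ U[:, :n_components]   # vectors w_xk as columns
    Wy = Cyy_inv_sqrt @ Vt[:n_components].T   # vectors w_yk as columns
    return rho[:n_components], Wx, Wy
```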

3. EXPERIMENT SETUP

In the experiments we have N pairs of training samples, (I_x^(n), I_y^(n)); see figure 1 for examples. Each pair has the same corner orientation but differs in other properties. The orientation varies between 0° and 360° with a resolution of 5°, giving a total of 72 values. The corner angle varies between 60° and 120° with a resolution of 5°, giving a total of 13 values. When using the intensity value representation it is not possible to learn corner intensity invariance; the representation is not descriptive enough. Therefore, in this case we do not vary the intensity. For the other representations both the corner angle and the corner intensity vary as in figure 1. The training pairs consist of all combinations of the images above that have the same corner orientation but differ in corner angle. This gives a total of N = 72 × 13² = 12168 pairs. Gaussian noise was finally added. The noise actually helps the learning algorithm to find smoother and more robust vectors w_xk. This is not surprising, since the algorithm finds projections that are invariant to the noise (because the noise is not a common property), which implies a low-pass characteristic of the projections.

[Fig. 1. Examples of pairs of training samples for the experiments. Panels: 10 pairs of training samples (I_x, I_y) without noise; the same examples with noise added, PSNR = 10 dB; the corresponding local orientation (Z_x, Z_y) in double angle representation.]
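For concreteness, a sketch of how such a training set might be generated is given below. The wedge renderer and its parametrization are hypothetical stand-ins for the images in figure 1, not the authors' generator.

```python
import numpy as np

def render_corner(orientation, angle, M=9, intensity=1.0):
    """Hypothetical corner renderer: pixels whose direction from the patch
    center lies within the wedge of the given opening angle (degrees),
    centered on the given orientation, are set to `intensity`."""
    u, v = np.meshgrid(np.arange(M) - M // 2, np.arange(M) - M // 2, indexing="ij")
    phi = np.degrees(np.arctan2(v, u))
    d = (phi - orientation + 180.0) % 360.0 - 180.0   # signed angular distance
    return intensity * (np.abs(d) <= angle / 2.0)

# All pairs with the same corner orientation but different corner angle:
orientations = np.arange(0, 360, 5)   # 72 values
angles = np.arange(60, 125, 5)        # 13 values
pairs = [(render_corner(phi, ax), render_corner(phi, ay))
         for phi in orientations for ax in angles for ay in angles]
assert len(pairs) == 72 * 13**2       # N = 12168
# Gaussian noise (PSNR = 10 dB in figure 1) would be added to each image here.
```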

For the gray-level experiments an image size of 5 × 5 was used. For the local orientation experiments an image size of 9 × 9 was used. The gradient (I_x, I_y) was then computed using 5 × 5 differentiated Gaussian filters with σ = 1.2. The double angle representation is then computed as z = (I_x + iI_y)². This representation has an argument that is double the local orientation angle. Z will therefore be invariant to edge sign (i.e. a positive or a negative edge with the same orientation), which in turn means that the representation is more invariant to intensity. The border values were finally removed in order to avoid border effects, leaving a 5 × 5 local orientation image.
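A sketch of this computation under the stated filter size and σ; the separable filter construction and the cropping convention are assumptions:

```python
import numpy as np
from scipy.ndimage import convolve

def double_angle(image, sigma=1.2, size=5):
    """z = (Ix + i*Iy)^2 from separable 5x5 differentiated Gaussian filters.
    The argument of z is twice the local orientation angle."""
    image = np.asarray(image, dtype=float)
    r = size // 2
    u = np.arange(-r, r + 1, dtype=float)
    g = np.exp(-u**2 / (2 * sigma**2))
    g /= g.sum()                       # 1-D Gaussian
    dg = -u / sigma**2 * g             # 1-D Gaussian derivative
    Ix = convolve(convolve(image, dg[None, :]), g[:, None])   # d/dx
    Iy = convolve(convolve(image, g[None, :]), dg[:, None])   # d/dy
    z = (Ix + 1j * Iy) ** 2            # edge sign cancels in the square
    return z[r:-r, r:-r]               # drop border values (9x9 -> 5x5)
```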

4. REPRESENTATIONS AND INTERPRETATIONS

As input to the CCA we have N pairs of training samples, (x^(n), y^(n)), in the form of one of the representations mentioned before. This section discusses some details regarding these representations and how to interpret the resulting CCA-vectors ŵ_xk for the different representations.

Intensity values, I

Let I_x^(n) be an M × M image corresponding to one training sample. Let i_x^(n) denote the same image after reshaping it into an M² × 1 vector, i.e. i_x^(n) = vec(I_x^(n)). i_x^(n) will be used as input samples to the system:

$x^{(n)} = i_x^{(n)} = \mathrm{vec}(I_x^{(n)}), \qquad y^{(n)} = i_y^{(n)} = \mathrm{vec}(I_y^{(n)})$   (2)

In this case the resulting CCA-vectors ŵ_xk can simply be interpreted as linear filters on the gray-level image.

Products between intensity values, I × I

Products between intensity values can be generated with the outer product, i_x i_x^T. The input to the learning algorithm is then this outer product reshaped into an M⁴ × 1 vector:

$x^{(n)} = \mathrm{vec}(i_x^{(n)} i_x^{(n)T}), \qquad y^{(n)} = \mathrm{vec}(i_y^{(n)} i_y^{(n)T})$   (3)

(In practice the dimensionality is reduced to M²(M² + 1)/2 due to the symmetry property of i_x^(n) i_x^(n)T.) The projection of x onto a resulting CCA-vector can then be written as

$w_{xk}^T x = i_x^T W_{xk} i_x$   (4)

where W_xk is w_xk reshaped into an M² × M² matrix. We can use the eigensystem of W_xk and write

$w_{xk}^T x = i_x^T \Big( \sum_j \lambda_{kj} \hat{e}_{kj} \hat{e}_{kj}^T \Big) i_x = \sum_j \lambda_{kj} (i_x^T \hat{e}_{kj})^2$   (5)

Hence, the projection can be computed as projections of the image i_x onto ê_kj followed by a weighted sum of squares. Note that ê_kj can be viewed as linear filters on the image.

If we can remove some of the terms in the sum we can save a lot of computations. It would be tempting to keep the terms corresponding to the largest eigenvalues, but it turns out that |E[λ_kj (ê_kj^T i_x)²]| = |λ_kj| E[(ê_kj^T i_x)²] is a more relevant significance measure, since it measures the average energy in the subspace defined by ê_kj. The significance measure coincides with the eigenvalues in the experiments in this paper, but this is not always the case.
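A sketch of this interpretation step (the function and variable names are assumptions): reshape a CCA-vector into the matrix W_xk, take its eigensystem as in eq. (5), and rank the eigenvectors by the average-energy measure above.

```python
import numpy as np

def rank_filters(w_xk, ix_samples, M):
    """Eigensystem of W_xk (eq. 5) and the significance measure
    |lambda_kj| * E[(e_kj^T i_x)^2], estimated over the training samples.
    ix_samples has shape (N, M*M); w_xk has length (M*M)**2."""
    W = w_xk.reshape(M * M, M * M)
    W = 0.5 * (W + W.T)                     # symmetrize
    lam, E = np.linalg.eigh(W)              # W = sum_j lam_j e_j e_j^T
    energy = np.mean((ix_samples @ E) ** 2, axis=0)  # E[(e_j^T i_x)^2]
    significance = np.abs(lam) * energy
    order = np.argsort(significance)[::-1]  # most significant filters first
    return lam[order], E[:, order], significance[order]
```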

Local orientation, Z

In this case the double angle representation is used as input samples,

$x^{(n)} = \mathrm{vec}(Z_x^{(n)}), \qquad y^{(n)} = \mathrm{vec}(Z_y^{(n)})$   (6)

and the resulting CCA-vectors can be interpreted as linear filters on the Z image. Note that in this case both the training samples and the CCA-vectors will be complex valued.

Polynomial model

To reduce dimensionality we model the intensity and local orientation with a second degree polynomial:

$f(u, v) \sim r_1 + r_2 u + r_3 v + r_4 u^2 + r_5 v^2 + r_6 uv$   (7)

The parameter vector r_f = (r_1, ..., r_6)^T is found from a weighted least squares problem as r_f = A f, where the matrix A is a function of the polynomial basis functions, see [4]. Note that a projection on r_f can be transformed to a projection on f, since w_rf^T r_f = (A^T w_rf)^T f.

r_I, r_I × r_I, and r_Z are then used as representations.
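A sketch of the parameter extraction (the uniform default weighting and the grid convention are assumptions here; see [4] for the full weighted formulation):

```python
import numpy as np

def poly_params(f_patch, weights=None):
    """Least-squares fit of f(u,v) ~ r1 + r2*u + r3*v + r4*u^2 + r5*v^2 + r6*u*v
    (eq. 7) over an M x M patch; returns the parameter vector r_f = A f."""
    M = f_patch.shape[0]
    u, v = np.meshgrid(np.arange(M) - M // 2, np.arange(M) - M // 2, indexing="ij")
    B = np.stack([np.ones_like(u), u, v, u**2, v**2, u * v],
                 axis=-1).reshape(-1, 6).astype(float)
    w = np.ones(B.shape[0]) if weights is None else weights.reshape(-1)
    # Weighted least squares: A = (B^T W B)^{-1} B^T W, so r_f = A @ f.
    A = np.linalg.solve(B.T @ (w[:, None] * B), (w[:, None] * B).T)
    return A @ f_patch.reshape(-1)   # also works for complex Z patches
```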


[Fig. 2. Resulting ρ_k for the different experiments: I, r_I, I × I (only ρ_1-ρ_50 shown), r_I × r_I, Z, and r_Z.]

[Fig. 3. Results from the I experiment: CCA-vectors w_xk, magnitudes |DFT(w_xk)|, and projections w_xk^T i_x over orientations 0°-360°.]

5. RESULTS

The resulting canonical correlations ρ_k for the different representations are shown in figure 2. In all cases we get about 4-8 large correlations, and the remaining ones are significantly smaller. The absolute values are not critical as long as they are fairly high, since they depend on the noise added.

I case

Figure 3 shows the six first CCA-vectors and the projection of the noise-free data (with the same intensity) onto the vectors. Note that there are several curves, corresponding to different corner angles. The first two CCA-vectors are simply orthogonal edge filters, and the projections vary as sinusoidal functions with 90° phase difference. The next two are sensitive to the double angle of the orientation.

I × I case

Figure 4 shows the eigenvalues λ_kj and significance measures for the six first CCA-vectors in the I × I experiment. Only two eigenvectors ê_kj were significant, and the figure also shows the projection of the noise-free data onto these vectors. Figure 5 shows the vectors and their corresponding Fourier transforms. If closely examined they can be interpreted as local edge filters!

[Fig. 4. Results from the I × I experiment: eigenvalues λ_kj, significance measures |E[λ_kj(ê_kj^T i_x)²]|, and the projection λ_k1(ê_k1^T i_x)² + λ_k2(ê_k2^T i_x)² over orientations 0°-360°, for k = 1, ..., 6.]

[Fig. 5. Results from the I × I experiment: eigenvectors ê_k1, ê_k2 and their magnitudes |DFT(ê_k1)|, |DFT(ê_k2)|.]

Z case

Figure 6 shows the five first CCA-vectors and their projections onto noise-free data (argument and magnitude are shown). They can actually be interpreted as rotational symmetry filters, which are well known to detect complex curvature, see e.g. [5]. Optimal patterns for these filters can be derived [6]. A prototype pattern for each filter is shown in figure 7. All patterns that can be described as rotations or parts of the prototype pattern (e.g. one of the trajectories) are also detected by the corresponding filter.

Polynomial cases

The resulting CCA-vectors for the polynomial experiments, when transformed to linear filters on the corresponding original representation, turned out to be very similar to the results for the original representations. They are therefore not shown in this paper due to lack of space.

6. DECODING CORNER ORIENTATION

The corner orientation angle can be decoded from the projections w_xk^* x, depending on the representation:

I case

In this case we can simply take the angle of the vector (w_x1^T i_x, w_x2^T i_x). The left column in figure 8 shows this estimate as a function of the true value for noise-free data (the offset is not important, since the system is unaware of the orientation reference value). The estimate is even more invariant to corner angle than the projections alone. However, as said before, the projections are not invariant to intensity, and this simple decoding function will fail when the intensity varies.

As an evaluation measure, 1000 noisy images with random corner orientation, angle, and intensity were used. The angular error was computed and the mean angular error was removed. Finally the standard deviation of the error was computed. The result is also shown in figure 8.

[Fig. 6. Results from the Z experiment: CCA-vectors w_k, magnitudes |w_k^* z_x|, and arguments arg(w_k^* z_x) over orientations 0°-360°.]

[Fig. 7. Z result interpreted as rotational symmetries: w_k1 ∼ e^{-1iφ}, w_k2 ∼ e^{-2iφ}, w_k3 ∼ e^{0iφ}, w_k4 ∼ e^{1iφ}, w_k5 ∼ e^{3iφ}.]
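A sketch of this decoding step for the I case (the function and argument names are assumptions):

```python
import numpy as np

def decode_orientation_I(w_x1, w_x2, ix):
    """I-case decoding: angle of the 2-vector (w_x1^T i_x, w_x2^T i_x).
    The result is offset by an unknown constant reference, as noted above."""
    return np.arctan2(w_x2 @ ix, w_x1 @ ix)
```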

I × I case

The projections do not behave as nicely as in the previous case and are therefore more difficult to decode. But the projections are fairly invariant to corner angle and intensity, and it should therefore in theory be possible to find a decoding function. This is not investigated further in this paper, though.

Z case

Since the first and second projections are sensitive to the third and fourth power of the orientation respectively, we can decode the projections into a corner orientation angle by taking the argument of the quotient (w_k2^* z_x)/(w_k1^* z_x). The magnitudes of the projections can be used as a certainty measure. The evaluation of this decoding function is shown in the right column in figure 8. Another decoding function could be to use the phase of the fourth projection, w_k4, since this is approximately the identity mapping. But the result would be less accurate, as can be inferred from the projection in figure 6.

[Fig. 8. Left column: I case, decoding function angle(w_x2^T i_x, w_x1^T i_x). Right column: Z case, decoding function arg(w_x2^* z_x / w_x1^* z_x). Top row: estimate vs. true orientation for noise-free data. Bottom row: angular error on noisy data; std = 13.8° (I case) and std = 19.5° (Z case).]
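A corresponding sketch for the Z case; the particular certainty heuristic below is an assumption:

```python
import numpy as np

def decode_orientation_Z(w_k1, w_k2, zx):
    """Z-case decoding: argument of (w_k2^* z_x) / (w_k1^* z_x).
    np.vdot conjugates its first argument, giving the projection w^* z."""
    p1 = np.vdot(w_k1, zx)
    p2 = np.vdot(w_k2, zx)
    orientation = np.angle(p2 / p1)     # corner orientation (up to an offset)
    certainty = min(abs(p1), abs(p2))   # assumed certainty measure
    return orientation, certainty
```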

7. DISCUSSION

It has often been argued, partly motivated by biological vision systems, that local orientation information should be used to detect more complex features. The results in the I × I and Z experiments further motivate this idea. Note that the I × I and Z representations are closely related, since the double angle Z is calculated from products between image gradient components. It may be possible to use the result from the quadratic model experiments, but the local orientation helps the system to learn a more well behaved representation which is easier to decode.

It may be possible to use the same technique to learn other features and invariances. One drawback can be the amount of necessary training data. Preliminary experiments show that by using the polynomial model the number of training pairs can be smaller than if we use the image or local orientation directly. This is because the number of training samples needed is generally proportional to the number of input parameters.

8. REFERENCES

[1] M. Borga, Learning Multidimensional Signal Processing, Ph.D. thesis, Linköping University, SE-581 83 Linköping, Sweden, 1998. Dissertation No. 531, ISBN 91-7219-202-X.

[2] H. Knutsson and M. Borga, "Learning Visual Operators from Examples: A New Paradigm in Image Processing," in Proc. of ICIAP'99. Invited paper.

[3] M. Borga and H. Knutsson, "Finding Efficient Nonlinear Visual Operators using Canonical Correlation Analysis," in Proc. of SSAB-2000, Halmstad, pp. 13-16.

[4] G. Farnebäck, "Spatial Domain Methods for Orientation and Velocity Estimation," Lic. Thesis LiU-Tek-Lic-1999:13, Dept. EE, Linköping University, SE-581 83 Linköping, Sweden, 1999. Thesis No. 755, ISBN 91-7219-441-3.

[5] B. Johansson and G. Granlund, "Fast Selective Detection of Rotational Symmetries using Normalized Inhibition," in Proc. of ECCV-2000, vol. I, pp. 871-887.

[6] B. Johansson, "Backprojection of Some Image Symmetries Based on a Local Orientation Description," Report LiTH-ISY-R-2311, Dept. EE, Linköping University, SE-581 83 Linköping, Sweden, October 2000.
