
Finding Efficient Nonlinear Visual Operators using Canonical Correlation Analysis

Magnus Borga
magnus@isy.liu.se

Hans Knutsson
knutte@isy.liu.se

Computer Vision Laboratory
Dept. of Electrical Engineering
Linköping University
SE-585 94 Linköping
Sweden

Abstract

This paper presents a general strategy for designing efficient visual operators. The approach is highly task oriented and what constitutes the relevant information is defined by a set of examples. The examples are pairs of images displaying a strong dependence in the chosen feature but are otherwise independent. Particularly important concepts in the work are mutual information and canonical correlation. Visual operators learned from examples are presented, e.g. local shift-invariant orientation operators and image-content invariant disparity operators. Interesting similarities to biological vision functions are observed.

1 Introduction

The need for a generally applicable method for learning is evident in problems involving vision. The dimensionality of typical inputs often exceeds 10^6, effectively ruling out any type of complete analysis. In common practice, vision problems are handled by reducing the dimensionality to typically < 10 by throwing away almost all available information in a basically ad hoc manner. This approach is, however, likely to fail if, as is frequently the case for vision problems, the mechanisms by which the necessary information can be extracted are not well understood. For this reason, designing systems capable of learning the relevant information extraction mechanisms appears to be the only possible way to proceed.

Hebbian learning methods like Oja's rule [12] and self-organizing feature maps [10] are related to the principal components of the signal distribution and, hence, base their selection of basis vectors on signal variance.

However, when the problem involves an analysis of the relations between two sets of data, the principal components of either set are not relevant. In recent years, unsupervised learning algorithms based on information-theoretic measures have attracted increasing interest. Examples of this approach are the infomax principle [11], the Imax principle [2] and Independent Components Analysis [7]. Mutual-information-based learning has been used e.g. for blind separation and blind deconvolution [3] and disparity estimation in random-dot stereograms [2].

A set of linear basis functions, having a direct relation to maximum mutual information, can be obtained by canonical correlation analysis (CCA) [8]. CCA finds two sets of basis functions, one in each signal space, such that the correlation matrix between the signals described in the new basis is a diagonal matrix. The basis vectors can be ordered such that the first pair of vectors $\mathbf{w}_{x1}$ and $\mathbf{w}_{y1}$ maximizes the correlation between the projections $(\mathbf{x}^T \mathbf{w}_{x1},\ \mathbf{y}^T \mathbf{w}_{y1})$ of the signals $\mathbf{x}$ and $\mathbf{y}$ onto the two vectors respectively. A subset of the vectors containing the $N$ first pairs defines a linear rank-$N$ relation between the sets that is optimal in a correlation sense. In other words, it gives the linear combination of one set of variables that is the best predictor and at the same time the linear combination of another set which is the most predictable. It has been shown that finding the canonical correlations is equivalent to maximizing the mutual information between the sets if the underlying distributions are elliptically symmetric [9].
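For jointly Gaussian variables this connection can be made concrete. The following is a sketch of the standard decomposition (not a derivation from this paper): the mutual information splits into an additive sum over the canonical correlations $\rho_n$,

```latex
I(\mathbf{x};\mathbf{y}) \;=\; -\frac{1}{2}\sum_{n=1}^{N}\log\!\left(1-\rho_n^{2}\right)
```

so each canonical pair contributes an independent term, and pairs with $\rho_n$ close to one dominate the total mutual information.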

2 Canonical correlation analysis

Consider two random variables, $\mathbf{x}$ and $\mathbf{y}$, from a multi-normal distribution:

$$\begin{pmatrix} \mathbf{x} \\ \mathbf{y} \end{pmatrix} \in N\!\left( \begin{pmatrix} \mathbf{x}_0 \\ \mathbf{y}_0 \end{pmatrix},\ \begin{pmatrix} \mathbf{C}_{xx} & \mathbf{C}_{xy} \\ \mathbf{C}_{yx} & \mathbf{C}_{yy} \end{pmatrix} \right), \quad (1)$$

where $\mathbf{C} = \begin{pmatrix} \mathbf{C}_{xx} & \mathbf{C}_{xy} \\ \mathbf{C}_{yx} & \mathbf{C}_{yy} \end{pmatrix}$ is the covariance matrix. $\mathbf{C}_{xx}$ and $\mathbf{C}_{yy}$ are nonsingular matrices and $\mathbf{C}_{xy} = \mathbf{C}_{yx}^T$.

Consider the linear combinations $x = \mathbf{w}_x^T(\mathbf{x} - \mathbf{x}_0)$ and $y = \mathbf{w}_y^T(\mathbf{y} - \mathbf{y}_0)$ of the two variables respectively. The correlation $\rho$ between $x$ and $y$ is given by equation (2), see for example [1]:

$$\rho = \frac{\mathbf{w}_x^T \mathbf{C}_{xy} \mathbf{w}_y}{\sqrt{\mathbf{w}_x^T \mathbf{C}_{xx} \mathbf{w}_x \ \mathbf{w}_y^T \mathbf{C}_{yy} \mathbf{w}_y}}. \quad (2)$$

A complete description of the canonical correlations is given by:

$$\begin{pmatrix} \mathbf{C}_{xx} & \mathbf{0} \\ \mathbf{0} & \mathbf{C}_{yy} \end{pmatrix}^{-1} \begin{pmatrix} \mathbf{0} & \mathbf{C}_{xy} \\ \mathbf{C}_{yx} & \mathbf{0} \end{pmatrix} \begin{pmatrix} \hat{\mathbf{w}}_x \\ \hat{\mathbf{w}}_y \end{pmatrix} = \rho \begin{pmatrix} \lambda_x \hat{\mathbf{w}}_x \\ \lambda_y \hat{\mathbf{w}}_y \end{pmatrix}, \quad (3)$$

where $\rho, \lambda_x, \lambda_y > 0$ and $\lambda_x \lambda_y = 1$. Equation (3) can be rewritten as:

$$\begin{cases} \mathbf{C}_{xx}^{-1}\mathbf{C}_{xy}\hat{\mathbf{w}}_y = \rho\lambda_x \hat{\mathbf{w}}_x \\ \mathbf{C}_{yy}^{-1}\mathbf{C}_{yx}\hat{\mathbf{w}}_x = \rho\lambda_y \hat{\mathbf{w}}_y \end{cases} \quad (4)$$

Solving (4) gives $N$ solutions $\{\rho_n, \hat{\mathbf{w}}_{xn}, \hat{\mathbf{w}}_{yn}\}$, $n = 1, \ldots, N$. $N$ is the minimum of the input dimensionality and the output dimensionality. The linear combinations, $x_n = \hat{\mathbf{w}}_{xn}^T \mathbf{x}$ and $y_n = \hat{\mathbf{w}}_{yn}^T \mathbf{y}$, are termed canonical variates and the correlations, $\rho_n$, between these variates are termed the canonical correlations [8]. An important aspect in this context is that the canonical correlations are invariant to affine transformations of $\mathbf{x}$ and $\mathbf{y}$. Also note that the canonical variates corresponding to the different roots of (4) are uncorrelated, implying that:

$$\begin{cases} \mathbf{w}_{xn}^T \mathbf{C}_{xx} \mathbf{w}_{xm} = 0 \\ \mathbf{w}_{yn}^T \mathbf{C}_{yy} \mathbf{w}_{ym} = 0 \end{cases} \quad \text{if } n \neq m \quad (5)$$

It should be noted that (3) is a special case of the generalized eigenproblem [4]:

$$\mathbf{A}\mathbf{w} = \lambda\mathbf{B}\mathbf{w}.$$
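As a sketch of how (4) can be solved numerically (a minimal NumPy illustration with toy data, not the implementation used in the paper), the two coupled equations can be folded into a single eigenproblem for $\hat{\mathbf{w}}_x$:

```python
import numpy as np

def cca(X, Y):
    """Canonical correlation analysis via the eigenproblem form of (4).

    X: (samples, dx), Y: (samples, dy). Returns the canonical
    correlations rho (descending) and basis vectors Wx, Wy as columns.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / (n - 1)
    Cyy = Y.T @ Y / (n - 1)
    Cxy = X.T @ Y / (n - 1)
    # Substituting one equation of (4) into the other gives
    # Cxx^-1 Cxy Cyy^-1 Cyx w_x = rho^2 w_x.
    M = np.linalg.solve(Cxx, Cxy) @ np.linalg.solve(Cyy, Cxy.T)
    evals, Wx = np.linalg.eig(M)
    order = np.argsort(-evals.real)
    rho = np.sqrt(np.clip(evals.real[order], 0.0, 1.0))
    Wx = Wx.real[:, order]
    # Recover w_y from the second equation of (4): w_y ∝ Cyy^-1 Cyx w_x.
    Wy = np.linalg.solve(Cyy, Cxy.T @ Wx)
    Wy = Wy / np.linalg.norm(Wy, axis=0)
    return rho, Wx, Wy

# Toy data (an assumed example): one common component hidden in noise.
rng = np.random.default_rng(0)
s = rng.normal(size=(2000, 1))
X = np.hstack([s + 0.1 * rng.normal(size=(2000, 1)), rng.normal(size=(2000, 2))])
Y = np.hstack([rng.normal(size=(2000, 2)), s + 0.1 * rng.normal(size=(2000, 1))])
rho, Wx, Wy = cca(X, Y)
```

On such data the first canonical correlation comes out close to one while the remaining ones stay small, and the variates of different pairs are mutually uncorrelated as in (5).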

3 Learning from Examples

Descriptors for higher-order features are in practice impossible to design by hand due to the overpowering amount of possible signal combinations. In [4] it is shown how canonical correlation analysis can be used to find operators that represent relevant local features in images.

The basic idea behind the CCA approach, illustrated in figure 1, is to analyse two signals where the feature that is to be represented generates dependent signal components. The signal vectors fed into the CCA are image data mapped through a function $f$. If $f$ is the identity operator (or any other full-rank linear function), the CCA finds the linear combinations of pixel data that have the highest correlation. In this case, the canonical correlation vectors can be seen as linear filters. In general, $f$ can be any vector-valued function of the image data, or even different functions $f_x$ and $f_y$, one for each signal space. The choice of $f$ is of major importance as it determines the representation of input data for the canonical correlation analysis.

Figure 1: A symbolic illustration of the method of using CCA for finding feature detectors in images. The desired feature (orientation: here illustrated by a solid line) is varying equally in both image sequences while other features (here illustrated with dotted curves) vary in an uncorrelated way. The input to the CCA is a function $f$ of the image.

Figure 2: Plot of canonical correlation values (canonical correlation versus correlation number).

Local orientation

It is shown in [4] and [6] that if $f$ is an outer product and the image pairs contain sine wave patterns with equal orientations but different phase, the CCA finds linear combinations of the outer products that convey information about local orientation and are invariant to local phase. Figures 2, 3 and 4 show results from a similar experiment, this time using image pairs of edges having equal orientation and different, independent positions. Independent white Gaussian noise to a level of 12 dB SNR was added to all images. Figure 2 shows the values of the 20 first canonical correlations. The values appear to come in pairs, the first two values being 0.98, demonstrating that the mutual information mediated through local orientation is high.
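Why the outer product buys phase invariance can be illustrated with a minimal sketch (a hypothetical 1-D analogue, not the authors' 2-D experiment): for sine patterns with random phase, any purely linear description averages out, while the outer-product (quadratic) description is stable.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(8)

def f_outer(x):
    # f maps the signal to its outer product, exposing quadratic
    # (energy-like) combinations of the samples to the linear CCA stage.
    return np.outer(x, x).ravel()

# Sine patterns with a fixed frequency but random phase.
phases = rng.uniform(0.0, 2.0 * np.pi, size=1000)
X = np.array([np.sin(0.5 * t + p) for p in phases])

# Linear description: the mean pattern vanishes under random phase...
lin_mean = X.mean(axis=0)
# ...while the mean outer-product description is phase invariant
# (its entries approach 0.5 * cos(0.5 * (i - j))).
quad_mean = np.array([f_outer(x) for x in X]).mean(axis=0)
```

The linear mean collapses toward zero, whereas the quadratic mean retains the frequency structure of the pattern, which is the property the phase-invariant orientation operators exploit.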

Figure 3 shows the projections of Fourier components on canonical correlation vectors 1 to 8. The result shows that angular operators of orders 2, 4, 6 and 8 are important information carriers.

Figure 3: Projections of Fourier components on canonical correlation vectors 1 to 8. The result shows that angular operators of orders 2, 4, 6 and 8 are important information carriers.

The operators are close to shift-invariant, having a position-dependent variation in the order of 5%. Comparing to figure 2, it can be seen that the decrease in the canonical correlation values corresponds to an increase in the angular order of the operators.

'Complex cells'

Performing an eigenvalue decomposition of the canonical correlation vectors, the corresponding linear combinations in the outer product space can be seen as quadratic combinations of linear filters [4]. The linear filters (eigenimages) obtained display a clear tendency to form pairs of odd and even filters having similar spectra. Such quadrature filter pairs allow for a local shift-invariant feature and are functionally similar to the orientation selective 'complex cells' found in biological vision. Figure 4 shows the spectra of four such filter pairs. The top two are from canonical correlation vector one and display selectivity to orientations 45 and 135 deg. The bottom two are from canonical correlation vector two.

Figure 4: Spectra of eigenimages interpreted as complex quadrature filter pairs. Top two from canonical correlation vector 1. Bottom two from canonical correlation vector 2.

Local disparity

An important problem in computer vision that is suitable to handle with CCA is stereo vision, since data in this case naturally appear in pairs. In [4, 5] a novel stereo vision algorithm that combines CCA and phase analysis is presented. It is demonstrated that the algorithm can handle traditionally difficult problems such as: 1. producing multiple disparity estimates for semi-transparent images (see figure 6), 2. maintaining accuracy at disparity edges, and 3. allowing differently scaled images.

Canonical correlation analysis is used to create adaptive linear combinations of quadrature filters. These linear combinations are new quadrature filters that are adapted in frequency response and spatial position, maximizing the correlation between the filter outputs from the two images. Figure 5 shows the filters obtained for two white noise images where disparity changes linearly with horizontal position. Note that the obtained filters have adapted to the effect of the disparity gradient through a relative offset in center frequency. The disparity estimate is obtained by analysing the phase of the scalar product of the adapted filters. A result for depth estimates using semi-transparent images is shown in figure 6.
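The phase principle behind the disparity estimate can be sketched as follows (a simplified global estimate with a single fixed Gabor filter, not the CCA-adapted filters of [4, 5]): a shift of the underlying signal rotates the phase of the narrowband filter output, so the argument of the scalar product of the two filter outputs, divided by the center frequency, recovers the disparity.

```python
import numpy as np

rng = np.random.default_rng(2)

# A fixed quadrature (complex Gabor) filter at center frequency w0.
n = np.arange(-8, 9)
w0 = np.pi / 4
gabor = np.exp(-n**2 / 18.0) * np.exp(1j * w0 * n)

# Left/right views: the same white-noise row shifted by a known disparity.
signal = rng.normal(size=256)
d_true = 2
left = signal
right = np.roll(signal, d_true)

qL = np.convolve(left, gabor, mode="valid")
qR = np.convolve(right, gabor, mode="valid")

# The phase of the scalar product of the filter outputs encodes the
# shift: arg( sum qL * conj(qR) ) ≈ w0 * d.
d_est = np.angle(np.sum(qL * np.conj(qR))) / w0
```

The estimate is accurate as long as the true shift keeps the phase difference within one period of the filter's center frequency; the adapted filters of the paper extend this basic mechanism.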

4 The Future

The concept of mutual information provides a solid and general basis for the study of a broad spectrum of problems including signal operator design and analysis.

Figure 5: The filters created by CCA (left and right filters with their spectra over [-π, π]). Solid lines show the real parts and dashed lines show the imaginary parts.

Figure 6: The result of the stereo algorithm (disparity versus vertical position) for two random dot images corresponding to two semi-transparent crossing planes.

A general approach, illustrated in figure 7, is to maximize mutual information subject to constraints given by a chosen model space. This could be done by varying not only the linear projections, i.e. the CCA part, but also the functions $f_x$ and $f_y$.

Finding suitable function classes and efficient parameterisations/implementations for these functions is still the central issue and will be an important theme in our continued investigations.
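A toy illustration of this programme (hypothetical scalar data and a hand-picked candidate set, purely to show the search principle): when the dependence between two signals is invisible to any linear projection, varying $f$ over a model space and keeping the candidate whose output correlates best recovers it.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hidden dependence: y depends on x only through x**2, so a linear
# description of the raw x fails while f(x) = x**2 succeeds.
x = rng.normal(size=5000)
y = x**2 + 0.1 * rng.normal(size=5000)

# A tiny model space of candidate functions f (an assumed example).
candidates = {
    "identity": lambda v: v,
    "abs": np.abs,
    "square": np.square,
}

def corr(a, b):
    # Magnitude of the sample correlation coefficient.
    return abs(np.corrcoef(a, b)[0, 1])

# Vary f and keep the candidate maximizing the correlation.
scores = {name: f_and_score for name, f_and_score in
          ((name, corr(f(x), y)) for name, f in candidates.items())}
best = max(scores, key=scores.get)
```

Here the identity scores near zero while the quadratic candidate scores near one; the open problem named above is doing this search over rich, efficiently parameterised function classes rather than a hand-picked list.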

Figure 7: A general approach for finding maximum mutual information.

Acknowledgement

We would like to thank The Swedish Research Council for Engineering Sciences (TFR) and The Swedish National Board for Industrial and Technical Development.

References

[1] T. W. Anderson. An Introduction to Multivariate Statistical Analysis. John Wiley & Sons, second edition, 1984.

[2] S. Becker and G. E. Hinton. Self-organizing neural network that discovers surfaces in random-dot stereograms. Nature, 355(9):161–163, January 1992.

[3] A. J. Bell and T. J. Sejnowski. An information-maximization approach to blind separation and blind deconvolution. Neural Computation, 7:1129–59, 1995.

[4] M. Borga. Learning Multidimensional Signal Processing. PhD thesis, Linköping University, SE-581 83 Linköping, Sweden, 1998. Dissertation No 531, ISBN 91-7219-202-X.

[5] M. Borga and H. Knutsson. Estimating Multiple Depths in Semi-transparent Stereo Images. In Proceedings of the Scandinavian Conference on Image Analysis, Greenland, June 1999. SCIA. Also as Technical Report LiTH-ISY-R-2248.

[6] M. Borga, H. Knutsson, and T. Landelius. Learning Canonical Correlations. In Proceedings of the 10th Scandinavian Conference on Image Analysis, Lappeenranta, Finland, June 1997. SCIA.

[7] P. Comon. Independent component analysis, a new concept? Signal Processing, 36(3):287–314, April 1994.

[8] H. Hotelling. Relations between two sets of variates. Biometrika, 28:321–377, 1936.

[9] J. Kay. Feature discovery under contextual supervision using mutual information. In International Joint Conference on Neural Networks, volume 4, pages 79–84. IEEE, 1992.

[10] T. Kohonen. Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43:59–69, 1982.

[11] R. Linsker. Development of feature-analyzing cells and their columnar organization in a layered self-adaptive network. In Rodney M. L. Cotteril, editor, Computer Simulation in Brain Science, chapter 27, pages 416–431. Cambridge University Press, 1988.

[12] E. Oja. A simplified neuron model as a principal component analyzer. J. Math. Biology, 15:267–273, 1982.

[13] C. E. Shannon. A mathematical theory of communication. The Bell System Technical Journal, 1948. Also in N. J. A. Sloane and A. D. Wyner (ed.), Claude Elwood Shannon Collected Papers.
