Finding Efficient Nonlinear Visual Operators using
Canonical Correlation Analysis
Magnus Borga
magnus@isy.liu.se
Hans Knutsson
knutte@isy.liu.se
Computer Vision Laboratory
Dept. of Electrical Engineering
Linköping University
SE-585 94 Linköping
Sweden
Abstract
This paper presents a general strategy for designing efficient visual operators. The approach is highly task oriented, and what constitutes the relevant information is defined by a set of examples. The examples are pairs of images displaying a strong dependence in the chosen feature but are otherwise independent. Particularly important concepts in the work are mutual information and canonical correlation. Visual operators learned from examples are presented, e.g. local shift-invariant orientation operators and image content invariant disparity operators. Interesting similarities to biological vision functions are observed.
1 Introduction
The need for a generally applicable method for learning is evident in problems involving vision. The dimensionality of typical inputs often exceeds $10^6$, effectively ruling out any type of complete analysis. In common practice, vision problems are handled by reducing the dimensionality to typically $<10$ by throwing away almost all available information in a basically ad hoc manner. This approach is, however, likely to fail if, as is frequently the case for vision problems, the mechanisms by which the necessary information can be extracted are not well understood. For this reason, designing systems capable of learning the relevant information extraction mechanisms appears to be the only possible way to proceed.

Hebbian learning methods like Oja's rule [12] and self-organizing feature maps [10] are related to the principal components of the signal distribution and, hence, base their selection of basis vectors on signal variance.
However, when the problem involves an analysis of the relations between two sets of data, the principal components of either set are not relevant. In recent years, unsupervised learning algorithms based on information-theoretic criteria have attracted increasing interest. Examples of this approach are the infomax principle [11], the Imax principle [2] and Independent Component Analysis [7]. Mutual information based learning has been used e.g. for blind separation and blind deconvolution [3] and disparity estimation in random-dot stereograms [2].
A set of linear basis functions, having a direct relation to maximum mutual information, can be obtained by canonical correlation analysis (CCA) [8]. CCA finds two sets of basis functions, one in each signal space, such that the correlation matrix between the signals described in the new bases is a diagonal matrix. The basis vectors can be ordered such that the first pair of vectors $\mathbf{w}_{x1}$ and $\mathbf{w}_{y1}$ maximizes the correlation between the projections $(\mathbf{x}^T\mathbf{w}_{x1},\, \mathbf{y}^T\mathbf{w}_{y1})$ of the signals $\mathbf{x}$ and $\mathbf{y}$ onto the two vectors respectively. A subset of the vectors containing the $N$ first pairs defines a linear rank-$N$ relation between the sets that is optimal in a correlation sense. In other words, it gives the linear combination of one set of variables that is the best predictor and, at the same time, the linear combination of the other set which is the most predictable. It has been shown that finding the canonical correlations is equivalent to maximizing the mutual information between the sets if the underlying distributions are elliptically symmetric [9].
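For the jointly Gaussian case this connection can be stated explicitly. The following identity is a standard result (not derived in this paper) relating the mutual information to the canonical correlations $\rho_n$:

$$I(\mathbf{x}; \mathbf{y}) = -\frac{1}{2} \sum_{n=1}^{N} \log\left(1 - \rho_n^2\right),$$

so maximizing the canonical correlations directly maximizes the mutual information between the two sets.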
2 Canonical correlation analysis
Consider two random variables, $\mathbf{x}$ and $\mathbf{y}$, from a multi-normal distribution:

$$\begin{pmatrix} \mathbf{x} \\ \mathbf{y} \end{pmatrix} \sim N\!\left( \begin{pmatrix} \mathbf{x}_0 \\ \mathbf{y}_0 \end{pmatrix},\; \begin{pmatrix} \mathbf{C}_{xx} & \mathbf{C}_{xy} \\ \mathbf{C}_{yx} & \mathbf{C}_{yy} \end{pmatrix} \right), \qquad (1)$$

where $\mathbf{C} = \begin{pmatrix} \mathbf{C}_{xx} & \mathbf{C}_{xy} \\ \mathbf{C}_{yx} & \mathbf{C}_{yy} \end{pmatrix}$ is the covariance matrix. $\mathbf{C}_{xx}$ and $\mathbf{C}_{yy}$ are nonsingular matrices and $\mathbf{C}_{xy} = \mathbf{C}_{yx}^T$. Consider the linear combinations $x = \mathbf{w}_x^T(\mathbf{x} - \mathbf{x}_0)$ and $y = \mathbf{w}_y^T(\mathbf{y} - \mathbf{y}_0)$ of the two variables respectively. The correlation $\rho$ between $x$ and $y$ is given by (2), see for example [1]:

$$\rho = \frac{\mathbf{w}_x^T \mathbf{C}_{xy} \mathbf{w}_y}{\sqrt{\mathbf{w}_x^T \mathbf{C}_{xx} \mathbf{w}_x \;\; \mathbf{w}_y^T \mathbf{C}_{yy} \mathbf{w}_y}}. \qquad (2)$$

A complete description of the canonical correlations is given by:

$$\begin{pmatrix} \mathbf{C}_{xx} & \mathbf{0} \\ \mathbf{0} & \mathbf{C}_{yy} \end{pmatrix}^{-1} \begin{pmatrix} \mathbf{0} & \mathbf{C}_{xy} \\ \mathbf{C}_{yx} & \mathbf{0} \end{pmatrix} \begin{pmatrix} \hat{\mathbf{w}}_x \\ \hat{\mathbf{w}}_y \end{pmatrix} = \rho \begin{pmatrix} \lambda_x \hat{\mathbf{w}}_x \\ \lambda_y \hat{\mathbf{w}}_y \end{pmatrix}, \qquad (3)$$

where $\rho, \lambda_x, \lambda_y > 0$ and $\lambda_x \lambda_y = 1$. Equation (3) can be rewritten as:

$$\begin{cases} \mathbf{C}_{xx}^{-1} \mathbf{C}_{xy} \hat{\mathbf{w}}_y = \rho \lambda_x \hat{\mathbf{w}}_x \\ \mathbf{C}_{yy}^{-1} \mathbf{C}_{yx} \hat{\mathbf{w}}_x = \rho \lambda_y \hat{\mathbf{w}}_y \end{cases} \qquad (4)$$
Solving (4) gives $N$ solutions $\{\rho_n, \hat{\mathbf{w}}_{xn}, \hat{\mathbf{w}}_{yn}\}$, $n = 1, \ldots, N$, where $N$ is the minimum of the input dimensionality and the output dimensionality. The linear combinations $x_n = \hat{\mathbf{w}}_{xn}^T \mathbf{x}$ and $y_n = \hat{\mathbf{w}}_{yn}^T \mathbf{y}$ are termed canonical variates, and the correlations $\rho_n$ between these variates are termed the canonical correlations [8]. An important aspect in this context is that the canonical correlations are invariant to affine transformations of $\mathbf{x}$ and $\mathbf{y}$. Also note that the canonical variates corresponding to the different roots of (4) are uncorrelated, implying that:

$$\begin{cases} \mathbf{w}_{xn}^T \mathbf{C}_{xx} \mathbf{w}_{xm} = 0 \\ \mathbf{w}_{yn}^T \mathbf{C}_{yy} \mathbf{w}_{ym} = 0 \end{cases} \quad \text{if } n \neq m. \qquad (5)$$
It should be noted that (3) is a special case of the generalized eigenproblem [4]:

$$\mathbf{A}\mathbf{w} = \lambda \mathbf{B}\mathbf{w}.$$
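To make the solution procedure concrete, the following is a minimal NumPy sketch of CCA (our illustration; the paper contains no code). Rather than forming (3) directly, it uses the equivalent whitened formulation in which the canonical correlations appear as singular values, which is numerically better behaved; the regularisation constant `eps` and all names are our choices.

```python
import numpy as np

def cca(X, Y, eps=1e-6):
    """CCA of two sample matrices X: (samples, dx), Y: (samples, dy).
    Returns canonical correlations (descending) and the basis vectors,
    one pair of vectors per column of Wx, Wy."""
    X = X - X.mean(axis=0)                        # subtract x0 and y0, cf. (1)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + eps * np.eye(X.shape[1])  # eps guards against
    Cyy = Y.T @ Y / n + eps * np.eye(Y.shape[1])  # singular covariances
    Cxy = X.T @ Y / n
    # Whitened form of (3)-(4): the singular values of Lx^{-1} Cxy Ly^{-T}
    # are the canonical correlations rho_n.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Lx, Cxy)                  # Lx^{-1} Cxy
    M = np.linalg.solve(Ly, M.T).T                # ... Ly^{-T}
    U, rho, Vt = np.linalg.svd(M, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U)                 # back to original coordinates
    Wy = np.linalg.solve(Ly.T, Vt.T)
    return rho, Wx, Wy

# Toy usage: two data sets sharing two latent sources.
rng = np.random.default_rng(0)
s = rng.standard_normal((1000, 2))
X = s @ rng.standard_normal((2, 5)) + 0.5 * rng.standard_normal((1000, 5))
Y = s @ rng.standard_normal((2, 4)) + 0.5 * rng.standard_normal((1000, 4))
rho, Wx, Wy = cca(X, Y)
print(rho)            # two large correlations, the rest near zero
```

In this formulation the uncorrelatedness property (5) holds by construction: $W_x^T C_{xx} W_x$ and $W_y^T C_{yy} W_y$ are (numerically) identity matrices.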
3 Learning from Examples
Descriptors for higher order features are in practice impossible to design by hand due to the overpowering amount of possible signal combinations. In [4] it is shown how canonical correlation analysis can be used to find operators that represent relevant local features in images.
The basic idea behind the CCA approach, illustrated in Figure 1, is to analyse two signals where the feature that is to be represented generates dependent signal components. The signal vectors fed into the CCA are image data mapped through a function $f$. If $f$ is the identity operator (or any other full-rank linear function), the CCA finds the linear combinations of pixel data that have the highest correlation. In this case, the canonical correlation vectors can be seen as linear filters. In general, $f$ can be any vector-valued function of the image data, or even different functions $f_x$ and $f_y$, one for each signal space. The choice of $f$ is of major importance as it determines the representation of input data for the canonical correlation analysis.
Figure 1: A symbolic illustration of the method of using CCA for finding feature detectors in images. The desired feature (orientation, here illustrated by a solid line) varies equally in both image sequences, while other features (here illustrated with dotted curves) vary in an uncorrelated way. The input to the CCA is a function $f$ of the image.
Figure 2: Plot of canonical correlation values (canonical correlation vs. correlation number).
Local orientation
It is shown in [4] and [6] that if $f$ is an outer product and the image pairs contain sine wave patterns with equal orientations but different phase, the CCA finds linear combinations of the outer products that convey information about local orientation and are invariant to local phase. Figures 2, 3 and 4 show results from a similar experiment, this time using image pairs of edges having equal orientation and different, independent positions. Independent white Gaussian noise to a level of 12 dB SNR was added to all images. Figure 2 shows the values of the 20 first canonical correlations. The values appear to come in pairs, the first two values being 0.98, demonstrating that the mutual information mediated through local orientation is high.
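As an illustration of this experimental setup, the sketch below generates pairs of oriented sine-wave patches with a shared orientation but independent phases, maps them through an outer-product $f$, and applies the `cca()` function from the sketch in Section 2. Patch size, sample count, noise level and the patch generator are our illustrative choices, not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(0)
P = 5                                          # patch side (hypothetical)
u, v = np.meshgrid(np.arange(P), np.arange(P))

def sine_patch(theta, phase, freq=0.8):
    """Oriented sine-wave patch; theta sets the orientation, phase its position."""
    return np.cos(freq * (u * np.cos(theta) + v * np.sin(theta)) + phase)

def f_outer(patch):
    """f as an outer product: quadratic features of the vectorized patch."""
    p = patch.ravel()
    return np.outer(p, p).ravel()

X, Y = [], []
for _ in range(2000):
    theta = rng.uniform(0, np.pi)              # orientation shared by the pair
    for feats in (X, Y):
        patch = sine_patch(theta, rng.uniform(0, 2 * np.pi))  # independent phase
        feats.append(f_outer(patch + 0.2 * rng.standard_normal((P, P))))
rho, Wx, Wy = cca(np.array(X), np.array(Y))    # cca() from the Section 2 sketch
print(rho[:8])                                 # leading correlations are high
```

Because the phases are independent, purely linear features of the pixels are uncorrelated between the two images; only the quadratic (outer-product) representation exposes the shared orientation.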
Figure 3 shows the projections of Fourier components onto canonical correlation vectors 1 to 8. The result shows that angular operators of orders 2, 4, 6 and 8 are important information carriers. The operators are close to shift-invariant, having a position dependent variation in the order of 5%. Comparing to Figure 2, it can be seen that the decrease in the canonical correlation values corresponds to an increase in the angular order of the operators.

Figure 3: Projections of Fourier components onto canonical correlation vectors 1 to 8. The result shows that angular operators of orders 2, 4, 6 and 8 are important information carriers.
'Complex cells'. Performing an eigenvalue decomposition of the canonical correlation vectors, the corresponding linear combinations in the outer product space can be seen as quadratic combinations of linear filters [4]. The linear filters (eigenimages) obtained display a clear tendency to form pairs of odd and even filters having similar spectra. Such quadrature filter pairs allow for a local shift-invariant feature representation and are functionally similar to the orientation selective 'complex cells' found in biological vision. Figure 4 shows the spectra of four such filter pairs. The top two are from canonical correlation vector one and display selectivity to orientations 45 and 135 degrees. The bottom two are from canonical correlation vector two.
Figure 4: Spectra of eigenimages interpreted as complex quadrature filter pairs. Top two from canonical correlation vector 1; bottom two from canonical correlation vector 2.
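The eigenvalue decomposition described above can be sketched directly, continuing the names (`P`, `Wx`) from the previous sketch: a canonical correlation vector in the outer-product space is reshaped into a symmetric matrix whose eigenvectors, reshaped to patches, are the eigenimages, since $\mathbf{w}_x^T f(\mathbf{p}) = \mathbf{p}^T W \mathbf{p} = \sum_i \lambda_i (\mathbf{e}_i^T \mathbf{p})^2$.

```python
import numpy as np

W = Wx[:, 0].reshape(P * P, P * P)   # first canonical correlation vector
W = 0.5 * (W + W.T)                  # symmetrize; f(p) = vec(p p^T) is symmetric
lam, E = np.linalg.eigh(W)           # quadratic form: sum_i lam_i * (e_i . p)^2
order = np.argsort(-np.abs(lam))     # strongest quadratic components first
eigenimages = [E[:, i].reshape(P, P) for i in order[:4]]
# Odd/even eigenimage pairs with similar magnitude spectra act as quadrature
# filter pairs; inspect e.g. np.abs(np.fft.fft2(eigenimages[0], (32, 32))).
```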
Local disparity
An important problem in computer vision that is suitable to handle with CCA is stereo vision, since data in this case naturally appear in pairs. In [4, 5] a novel stereo vision algorithm that combines CCA and phase analysis is presented. It is demonstrated that the algorithm can handle traditionally difficult problems such as: (1) producing multiple disparity estimates for semi-transparent images (see Figure 6), (2) maintaining accuracy at disparity edges, and (3) allowing differently scaled images.
Canonical correlation analysis is used to create adaptive linear combinations of quadrature filters. These linear combinations are new quadrature filters that are adapted in frequency response and spatial position, maximizing the correlation between the filter outputs from the two images. Figure 5 shows the filters obtained for two white noise images where disparity changes linearly with horizontal position. Note that the obtained filters have adapted to the effect of the disparity gradient through a relative offset in center frequency. The disparity estimate is obtained by analysing the phase of the scalar product of the adapted filters. A result for depth estimates using semi-transparent images is shown in Figure 6.
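The phase-analysis step can be illustrated with a minimal one-dimensional sketch. It assumes given complex (quadrature) filters; here a generic Gabor filter with centre frequency `k0` stands in for the adapted CCA filters, so the numbers are illustrative only. The classical phase-difference rule $\arg(q_l \, \overline{q_r})/k_0$ recovers the local shift:

```python
import numpy as np

def gabor(k0=np.pi / 4, sigma=3.0, radius=8):
    """A 1-D complex Gabor filter: Gaussian envelope times e^{i k0 x}."""
    x = np.arange(-radius, radius + 1)
    return np.exp(-x**2 / (2 * sigma**2)) * np.exp(1j * k0 * x)

def disparity_estimate(left_row, right_row, k0=np.pi / 4):
    g = gabor(k0)
    ql = np.convolve(left_row, g, mode="same")   # complex filter responses
    qr = np.convolve(right_row, g, mode="same")
    return np.angle(ql * np.conj(qr)) / k0       # phase difference -> shift

row = np.random.default_rng(1).standard_normal(256)
right = np.roll(row, 3)                          # right image shifted 3 pixels
est = disparity_estimate(row, right)
print(np.median(est[20:-20]))                    # approximately 3
```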
4 The Future
The concept of mutual information provides a solid and general basis for the study of a broad spectrum of problems, including signal operator design and analysis.
Figure 5: The filters created by CCA. Left and right filters (taps over positions −8 to 8) and their spectra (over $[-\pi, \pi]$). Solid lines show the real parts and dashed lines show the imaginary parts.
Figure 6: The result of the stereo algorithm for two random dot images corresponding to two semi-transparent crossing planes (disparity plotted against vertical position).
The general approach, illustrated in Figure 7, is to maximize mutual information subject to constraints given by a chosen model space. This could be done by varying not only the linear projections, i.e. the CCA part, but also the functions $f_x$ and $f_y$.
Finding suitable function classes and efficient parameterisations/implementations for these functions is still the central issue and will be an important theme in our continued investigations.
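As a purely hypothetical illustration of this idea, one can search a small discrete family of candidate functions $f$ and keep the one whose leading canonical correlation (a proxy for mutual information under the Gaussian assumption) is largest. The candidate set below is ours, not the authors'; `cca()`, `sine_patch()` and `f_outer()` are from the earlier sketches.

```python
import numpy as np

rng = np.random.default_rng(2)
candidates = {"identity": lambda p: p.ravel(), "outer": f_outer}
for name, f in candidates.items():
    X, Y = [], []
    for _ in range(1500):
        theta = rng.uniform(0, np.pi)            # shared orientation
        for feats in (X, Y):
            patch = sine_patch(theta, rng.uniform(0, 2 * np.pi))
            feats.append(f(patch + 0.2 * rng.standard_normal((P, P))))
    rho, _, _ = cca(np.array(X), np.array(Y))
    print(name, rho[0])   # "outer" wins: phase-invariant orientation information
```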
Figure 7: A general approach for finding maximum mutual information. The signals $s_x$ and $s_y$ are mapped through functions $f$ into $x$ and $y$, which are fed to the CCA producing the correlation $\rho$.
Acknowledgement
We would like to thank the Swedish Research Council for Engineering Sciences (TFR) and the Swedish National Board for Industrial and Technical Development.

References
[1] T. W. Anderson. An Introduction to Multivariate Statistical Analysis. John Wiley & Sons, second edition, 1984.

[2] S. Becker and G. E. Hinton. Self-organizing neural network that discovers surfaces in random-dot stereograms. Nature, 355(9):161–163, January 1992.

[3] A. J. Bell and T. J. Sejnowski. An information-maximization approach to blind separation and blind deconvolution. Neural Computation, 7:1129–1159, 1995.

[4] M. Borga. Learning Multidimensional Signal Processing. PhD thesis, Linköping University, SE-581 83 Linköping, Sweden, 1998. Dissertation No. 531, ISBN 91-7219-202-X.

[5] M. Borga and H. Knutsson. Estimating multiple depths in semi-transparent stereo images. In Proceedings of the Scandinavian Conference on Image Analysis, Greenland, June 1999. SCIA. Also as Technical Report LiTH-ISY-R-2248.

[6] M. Borga, H. Knutsson, and T. Landelius. Learning canonical correlations. In Proceedings of the 10th Scandinavian Conference on Image Analysis, Lappeenranta, Finland, June 1997. SCIA.

[7] P. Comon. Independent component analysis, a new concept? Signal Processing, 36(3):287–314, April 1994.

[8] H. Hotelling. Relations between two sets of variates. Biometrika, 28:321–377, 1936.

[9] J. Kay. Feature discovery under contextual supervision using mutual information. In International Joint Conference on Neural Networks, volume 4, pages 79–84. IEEE, 1992.

[10] T. Kohonen. Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43:59–69, 1982.

[11] R. Linsker. Development of feature-analyzing cells and their columnar organization in a layered self-adaptive network. In Rodney M. J. Cotterill, editor, Computer Simulation in Brain Science, chapter 27, pages 416–431. Cambridge University Press, 1988.

[12] E. Oja. A simplified neuron model as a principal component analyzer. J. Math. Biology, 15:267–273, 1982.

[13] C. E. Shannon. A mathematical theory of communication. The Bell System Technical Journal, 27:379–423, 623–656, 1948. Also in N. J. A. Sloane and A. D. Wyner (eds.), Claude Elwood Shannon: Collected Papers.