Learning Canonical Correlations

M. Borga, H. Knutsson, T. Landelius
Computer Vision Laboratory
Department of Electrical Engineering
Linköping University
S-581 83 Linköping, Sweden
Abstract
This paper presents a novel learning algorithm that finds the linear combination of one set of multi-dimensional variates that is the best predictor, and at the same time finds the linear combination of another set which is the most predictable. This relation is known as the canonical correlation and has the property of being invariant with respect to affine transformations of the two sets of variates. The algorithm successively finds all the canonical correlations, beginning with the largest one. It is shown that canonical correlations can be used in computer vision to find feature detectors by giving examples of the desired features. When used on the pixel level, the method finds quadrature filters, and when used on a higher level, the method finds combinations of filter output that are less sensitive to noise compared to vector averaging.
1 Introduction
Acommonprobleminneuralnetworksandlearning,incapacitatingmanytheoreticallypromisingalgorithms,isthe
high dimensionalityof theinput-outputspace. Asanexample,typicaldimensionalitiesforsystemshavingvisual
inputsfarexceedacceptablelimits. Forthisreason,apriorirestrictionsmustbeinvoked. Acommonrestrictionis
to useonlylocallylinearmodels. ToobtaineÆcientsystems,thedimensionalities ofthemodelsshouldbeaslow
aspossible. The useoflocally low-dimensional linearmodelswill in mostcasesbe adequateifthesubdivisionof
theinputandoutputspacesaremadeadaptively[3,11].
An important problem is to find the best directions in the input and output spaces for the local models. Algorithms like the Kohonen self-organizing feature maps [10] and others that work with principal component analysis will find directions where the signal variances are high. This is, however, of little use in a response generating system. Such a system should find directions that efficiently represent signals that are important rather than signals that have large energy.
In general the input to a system comes from a set of different sensors, and it is evident that the range of the signal values from a given sensor is unrelated to the importance of the received information. The same line of reasoning holds for the output, which may consist of signals to a set of different effectuators. For this reason the correlation between input and output signals is interesting, since this measure of input-output relation is independent of the signal variances. However, correlation alone is not necessarily meaningful. Only input-output pairs that are regarded as relevant should be entered in the correlation analysis.
Relating only the projections of the input, $x$, and output, $y$, onto two vectors, $w_x$ and $w_y$, establishes a one-dimensional linear relation between the input and output. We wish to find the vectors that maximize $\mathrm{corr}(x^T w_x,\, y^T w_y)$, i.e. the correlation between the projections. This relation is known as canonical correlation [6]. It is a statistical method of finding the linear combination of one set of variables that is the best predictor, and at the same time finding the linear combination of another set which is the most predictable.

It has been shown [7] that finding the canonical correlations is equivalent to maximizing the mutual information between the sets $X$ and $Y$ if $x$ and $y$ come from elliptical symmetric random distributions.
In section 2, a brief review of the theory of canonical correlation is given. In section 3 we present an iterative learning rule, equation 7, that finds the directions and magnitudes of the canonical correlations. To illustrate the algorithm's behaviour, some experiments are presented and discussed in section 4. Finally, in section 5, we discuss applications in computer vision.
2 Canonical Correlation
Consider two random variables, $x$ and $y$, from a multi-normal distribution:

\[
\begin{pmatrix} x \\ y \end{pmatrix} \sim
N\!\left( \begin{pmatrix} x_0 \\ y_0 \end{pmatrix},
\begin{pmatrix} C_{xx} & C_{xy} \\ C_{yx} & C_{yy} \end{pmatrix} \right)
\tag{1}
\]

where $C = \begin{pmatrix} C_{xx} & C_{xy} \\ C_{yx} & C_{yy} \end{pmatrix}$ is the covariance matrix. $C_{xx}$ and $C_{yy}$ are nonsingular matrices and $C_{xy} = C_{yx}^T$. Consider the linear combinations $x' = w_x^T (x - x_0)$ and $y' = w_y^T (y - y_0)$ of the two variables respectively. The correlation $\rho$ between $x'$ and $y'$ is given by equation 2, see for example [2]:

\[
\rho = \frac{w_x^T C_{xy} w_y}{\sqrt{\,w_x^T C_{xx} w_x \; w_y^T C_{yy} w_y\,}}.
\tag{2}
\]
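As a concrete illustration, equation 2 can be evaluated numerically from sample covariance estimates. The sketch below (Python with NumPy; the toy data, dimensions, and seed are our own assumptions, not from the paper) computes the correlation of the projections for a given pair of directions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: the first x dimension is linearly related to
# the first y dimension; the remaining dimensions are independent noise.
n = 5000
x = rng.standard_normal((n, 3))
y = np.column_stack([x[:, 0] + 0.5 * rng.standard_normal(n),
                     rng.standard_normal(n)])

# Sample estimates of the covariance blocks C_xx, C_xy, C_yy.
xc, yc = x - x.mean(0), y - y.mean(0)
Cxx = xc.T @ xc / n
Cxy = xc.T @ yc / n
Cyy = yc.T @ yc / n

def rho(wx, wy):
    """Correlation between the projections x^T wx and y^T wy (equation 2)."""
    return (wx @ Cxy @ wy) / np.sqrt((wx @ Cxx @ wx) * (wy @ Cyy @ wy))

print(rho(np.array([1.0, 0, 0]), np.array([1.0, 0])))  # related pair: high
print(rho(np.array([0, 1.0, 0]), np.array([0, 1.0])))  # unrelated pair: near zero
```

Note that $\rho$ is invariant to rescaling of $w_x$ and $w_y$, so only the directions of the vectors matter.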
The directions of the partial derivatives of $\rho$ with respect to $w_x$ and $w_y$ are given by:

\[
\begin{cases}
\dfrac{\partial \rho}{\partial w_x} \overset{!}{=} C_{xy}\hat w_y - \dfrac{\hat w_x^T C_{xy} \hat w_y}{\hat w_x^T C_{xx} \hat w_x}\, C_{xx}\hat w_x \\[2ex]
\dfrac{\partial \rho}{\partial w_y} \overset{!}{=} C_{yx}\hat w_x - \dfrac{\hat w_y^T C_{yx} \hat w_x}{\hat w_y^T C_{yy} \hat w_y}\, C_{yy}\hat w_y
\end{cases}
\tag{3}
\]

where '$\hat{\ }$' indicates unit length and '$\overset{!}{=}$' means that the vectors, left and right, have the same directions. A complete description of the canonical correlations is given by:

\[
\begin{pmatrix} C_{xx} & 0 \\ 0 & C_{yy} \end{pmatrix}^{-1}
\begin{pmatrix} 0 & C_{xy} \\ C_{yx} & 0 \end{pmatrix}
\begin{pmatrix} \hat w_x \\ \hat w_y \end{pmatrix}
= \rho \begin{pmatrix} \lambda_x \hat w_x \\ \lambda_y \hat w_y \end{pmatrix}
\tag{4}
\]

where $\rho, \lambda_x, \lambda_y > 0$ and $\lambda_x \lambda_y = 1$. Equation 4 can be rewritten as:

\[
\begin{cases}
C_{xx}^{-1} C_{xy} \hat w_y = \rho \lambda_x \hat w_x \\
C_{yy}^{-1} C_{yx} \hat w_x = \rho \lambda_y \hat w_y
\end{cases}
\tag{5}
\]
Solving equation 5 gives $N$ solutions $\{\rho_n, \hat w_{xn}, \hat w_{yn}\}$, $n = 1, \ldots, N$, where $N$ is the minimum of the input dimensionality and the output dimensionality. The linear combinations $x_n = \hat w_{xn}^T x$ and $y_n = \hat w_{yn}^T y$ are termed canonical variates, and the correlations $\rho_n$ between these variates are termed the canonical correlations [6]. An important aspect in this context is that the canonical correlations are invariant to affine transformations of $x$ and $y$. Also note that the canonical variates corresponding to the different roots of equation 5 are uncorrelated, implying that:

\[
\begin{cases}
w_{xn}^T C_{xx} w_{xm} = 0 \\
w_{yn}^T C_{yy} w_{ym} = 0
\end{cases}
\quad \text{if } n \neq m.
\tag{6}
\]
It should be noted that equation 4 is a special case of the generalized eigenproblem [4]:

\[ A w = \mu B w. \]

The solution to this problem can be found by finding the vectors $w$ that maximize the Rayleigh quotient:

\[ r = \frac{w^T A w}{w^T B w}. \]
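For reference, equation 5 can also be solved directly in batch form: eliminating $\hat w_y$ yields the ordinary eigenproblem $C_{xx}^{-1} C_{xy} C_{yy}^{-1} C_{yx} \hat w_x = \rho^2 \hat w_x$. The sketch below takes this closed-form route on toy data of our own construction; it is not the paper's iterative algorithm, which is introduced in the next section:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data with one known canonical correlation: both sets share the
# latent signal s, giving a true largest correlation of 1/1.25 = 0.8.
n = 20000
s = rng.standard_normal(n)
x = np.column_stack([s + 0.5 * rng.standard_normal(n),
                     rng.standard_normal(n),
                     rng.standard_normal(n)])
y = np.column_stack([s + 0.5 * rng.standard_normal(n),
                     rng.standard_normal(n)])

xc, yc = x - x.mean(0), y - y.mean(0)
Cxx, Cyy, Cxy = xc.T @ xc / n, yc.T @ yc / n, xc.T @ yc / n

# Eliminating w_y from equation 5: Cxx^{-1} Cxy Cyy^{-1} Cyx w_x = rho^2 w_x.
M = np.linalg.inv(Cxx) @ Cxy @ np.linalg.inv(Cyy) @ Cxy.T
eigvals, _ = np.linalg.eig(M)
rhos = np.sort(np.sqrt(np.clip(eigvals.real, 0.0, None)))[::-1]
print(rhos)  # canonical correlations, largest first
```

This batch solution requires forming and inverting the full covariance blocks, which is exactly what becomes prohibitive in the high-dimensional settings that motivate the iterative rule below.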
3 Learning Canonical Correlations
We have developed a novel learning algorithm that finds the canonical correlations and the corresponding canonical variates by an iterative method. The update rule for the vectors $w_x$ and $w_y$ is given by:

\[
\begin{cases}
w_x \leftarrow w_x + \beta_x\, x\,(y^T \hat w_y - x^T w_x) \\
w_y \leftarrow w_y + \beta_y\, y\,(x^T \hat w_x - y^T w_y)
\end{cases}
\tag{7}
\]

where $x$ and $y$ both have zero mean. To see that this rule finds the directions of the canonical correlation, we look at the expected change, in one iteration, of the vectors $w_x$ and $w_y$:

\[
\begin{cases}
E\{\Delta w_x\} = \beta_x E\{x y^T \hat w_y - x x^T w_x\} = \beta_x (C_{xy}\hat w_y - \|w_x\|\, C_{xx}\hat w_x) \\
E\{\Delta w_y\} = \beta_y E\{y x^T \hat w_x - y y^T w_y\} = \beta_y (C_{yx}\hat w_x - \|w_y\|\, C_{yy}\hat w_y)
\end{cases}
\]

Identifying with equation 3 gives:

\[
E\{\Delta w_x\} \overset{!}{=} \frac{\partial \rho}{\partial w_x}
\quad \text{and} \quad
E\{\Delta w_y\} \overset{!}{=} \frac{\partial \rho}{\partial w_y}
\tag{8}
\]

with

\[
\|w_x\| = \frac{\hat w_x^T C_{xy} \hat w_y}{\hat w_x^T C_{xx} \hat w_x}
\quad \text{and} \quad
\|w_y\| = \frac{\hat w_y^T C_{yx} \hat w_x}{\hat w_y^T C_{yy} \hat w_y}.
\]

This shows that the expected changes of the vectors $w_x$ and $w_y$ are in the same directions as the gradient of the canonical correlation $\rho$, which means that the learning rule in equation 7 on average is a gradient search on $\rho$. The quantities $\rho$, $\lambda_x$ and $\lambda_y$ are found as:

\[
\rho = \sqrt{\|w_x\|\,\|w_y\|}, \qquad
\lambda_x = \lambda_y^{-1} = \sqrt{\frac{\|w_x\|}{\|w_y\|}}.
\tag{9}
\]
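A minimal sketch of the update rule in equation 7 follows (Python/NumPy). A fixed gain $\beta$ is used for both vectors, whereas the paper uses the adaptive coefficients of equation 13; the toy data and all constants are our own:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data with a known largest canonical correlation of 0.8 between the
# first dimensions of x and y (shared latent signal s).
n = 20000
s = rng.standard_normal(n)
X = np.column_stack([s + 0.5 * rng.standard_normal(n),
                     rng.standard_normal(n)])
Y = np.column_stack([s + 0.5 * rng.standard_normal(n),
                     rng.standard_normal(n)])

beta = 0.02
wx = rng.standard_normal(2)
wy = rng.standard_normal(2)
for x, y in zip(X, Y):
    wxh = wx / np.linalg.norm(wx)            # unit-length versions
    wyh = wy / np.linalg.norm(wy)
    wx = wx + beta * x * (y @ wyh - x @ wx)  # equation 7
    wy = wy + beta * y * (x @ wxh - y @ wy)

# After convergence, equation 9 gives the correlation estimate.
rho_est = np.sqrt(np.linalg.norm(wx) * np.linalg.norm(wy))
```

With these settings the estimate approaches the true value; in practice the adaptive gains of section 4.2 remove the need to choose $\beta$ by hand.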
3.1 Learning of successive canonical correlations
The learning rule maximizes the correlation and finds the directions $\hat w_{x1}$ and $\hat w_{y1}$ corresponding to the largest correlation, $\rho_1$. To find the second largest canonical correlation and the corresponding canonical variates of equation 5, we use the modified learning rule

\[
\begin{cases}
w_x \leftarrow w_x + \beta_x\, x\,\big((y - y_1)^T \hat w_y - x^T w_x\big) \\
w_y \leftarrow w_y + \beta_y\, y\,\big((x - x_1)^T \hat w_x - y^T w_y\big)
\end{cases}
\tag{10}
\]

where

\[
x_1 = \frac{x^T \hat w_{x1}\; v_{x1}}{\hat w_{x1}^T v_{x1}}
\quad \text{and} \quad
y_1 = \frac{y^T \hat w_{y1}\; v_{y1}}{\hat w_{y1}^T v_{y1}}.
\]

$v_{x1}$ and $v_{y1}$ are estimates of $C_{xx}\hat w_{x1}$ and $C_{yy}\hat w_{y1}$ respectively and are estimated using the iterative rule:

\[
\begin{cases}
v_{x1} \leftarrow v_{x1} + \alpha\,(x x^T \hat w_{x1} - v_{x1}) \\
v_{y1} \leftarrow v_{y1} + \alpha\,(y y^T \hat w_{y1} - v_{y1})
\end{cases}
\tag{11}
\]

The expected change of $w_x$ and $w_y$ is then given by

\[
\begin{cases}
E\{\Delta w_x\} = \beta_x \left( C_{xy}\!\left[ \hat w_y - \hat w_{y1} \dfrac{\hat w_{y1}^T C_{yy} \hat w_y}{\hat w_{y1}^T C_{yy} \hat w_{y1}} \right] - \|w_x\|\, C_{xx}\hat w_x \right) \\[2ex]
E\{\Delta w_y\} = \beta_y \left( C_{yx}\!\left[ \hat w_x - \hat w_{x1} \dfrac{\hat w_{x1}^T C_{xx} \hat w_x}{\hat w_{x1}^T C_{xx} \hat w_{x1}} \right] - \|w_y\|\, C_{yy}\hat w_y \right)
\end{cases}
\tag{12}
\]

It can be seen that the parts of $w_x$ and $w_y$ parallel to $C_{xx}\hat w_{x1}$ and $C_{yy}\hat w_{y1}$ respectively will vanish ($w_x^T w_{x1} \to 0$ and $w_y^T w_{y1} \to 0$ in equation 10). In the subspaces orthogonal to $C_{xx}\hat w_{x1}$ and $C_{yy}\hat w_{y1}$, the learning rule will be equivalent to that given by equation 7. In this way the parts of the signals correlated with $w_{x1}^T x$ (and $w_{y1}^T y$) are disregarded, leaving the rest unchanged. Consequently the algorithm finds the second largest correlation $\rho_2$ and the corresponding vectors $w_{x2}$ and $w_{y2}$. Successive canonical correlations can be found by repeating the procedure.
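The modified rule can be sketched in the same style (Python/NumPy; the toy data with two known correlations, 0.8 and 0.5, are our own, and for brevity the first pair of directions $\hat w_{x1}, \hat w_{y1}$ is taken as already known rather than learned):

```python
import numpy as np

rng = np.random.default_rng(3)

# Two shared latent signals: true correlations 1/1.25 = 0.8 and 1/2.0 = 0.5.
n = 40000
s1, s2 = rng.standard_normal(n), rng.standard_normal(n)
X = np.column_stack([s1 + 0.5 * rng.standard_normal(n),
                     s2 + 1.0 * rng.standard_normal(n),
                     rng.standard_normal(n)])
Y = np.column_stack([s1 + 0.5 * rng.standard_normal(n),
                     s2 + 1.0 * rng.standard_normal(n)])

wx1 = np.array([1.0, 0.0, 0.0])    # first canonical directions (assumed known)
wy1 = np.array([1.0, 0.0])
vx1, vy1 = wx1.copy(), wy1.copy()  # estimates of Cxx wx1 and Cyy wy1 (eq. 11)
wx = rng.standard_normal(3)
wy = rng.standard_normal(2)
beta, alpha = 0.02, 0.02
for x, y in zip(X, Y):
    vx1 += alpha * (x * (x @ wx1) - vx1)            # equation 11
    vy1 += alpha * (y * (y @ wy1) - vy1)
    x1 = (x @ wx1) * vx1 / (wx1 @ vx1)              # components to remove
    y1 = (y @ wy1) * vy1 / (wy1 @ vy1)
    wxh = wx / np.linalg.norm(wx)
    wyh = wy / np.linalg.norm(wy)
    wx = wx + beta * x * ((y - y1) @ wyh - x @ wx)  # equation 10
    wy = wy + beta * y * ((x - x1) @ wxh - y @ wy)

rho2_est = np.sqrt(np.linalg.norm(wx) * np.linalg.norm(wy))
```

The subtraction of $x_1$ and $y_1$ removes the drive along the first canonical pair, so the rule settles on the second pair instead.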
4 Performance
In this section, two different experiments are presented to illustrate the efficiency and performance of the proposed algorithm.
Figure 1: Left: Number of flops until convergence for RLS (dashed line) and for our algorithm (solid line). Right: Number of flops per iteration for RLS (dashed line) and for our algorithm (solid line).
4.1 O(n) speedup
In the first experiment, we demonstrate the advantage of using our canonical correlation algorithm to find the proper subspace for a low-dimensional linear model in a high-dimensional space. A set of training data was generated with n-dimensional input vectors x and 8-dimensional output vectors y. There were two linear relations between x and y and, hence, the proper subspaces should be two-dimensional in the input and output spaces respectively.

In the experiment, the canonical correlation algorithm was run until it found the proper subspace. This can be determined by the algorithm since it, besides the directions of canonical correlation, gives an estimate of each canonical correlation. After convergence, a standard recursive least squares (RLS) algorithm was applied to find the linear relations between the two-dimensional subspaces. This algorithm was iterated until convergence, i.e. until the error was below a certain threshold. The error was defined as

\[ \epsilon = \| B C^T x - y \|^2 \]

where B is the (small) matrix found by the RLS algorithm and C is the matrix found by the canonical correlation algorithm extracting the relevant subspace.

As a comparison, the standard RLS algorithm was used directly on the data, iterated until convergence with the same threshold as in the first experiment. The error in this case was defined as

\[ \epsilon = \| A x - y \|^2 \]

where A is the (large) matrix found by the RLS algorithm. The results are plotted in figure 1.
In both experiments the dimensionality n of the input space was 16, 32, 64 and 128 (computational problems with the standard RLS method set the upper limit). The complexity was measured by counting the number of floating point operations (flops) until convergence (left) and per iteration (right) for the standard RLS method (marked with rings) and for our method (marked with stars). The lines show the estimated number of flops for larger dimensionalities. They were calculated by fitting polynomials to the data. For our method, a second and a first order polynomial was sufficient for the data in the left and right figures respectively. For the standard RLS method, a third and a second order polynomial respectively had to be used, in accordance with theory. Note the logarithmic scale.
The results show that our algorithm is of order O(n²) when run until convergence (O(n) per iteration), while the standard RLS method is of order O(n³) (O(n²) per iteration).
The dimensionality of the linear relation can, of course, in general not be known in advance. This is, however, not a problem, since the canonical correlation algorithm first finds the largest correlation and then proceeds with successively smaller ones.
4.2 High dimensional spaces
The second experiment illustrates the algorithm's ability to handle high-dimensional spaces. The dimensionality of x is 800 and the dimensionality of y is 200, so the total dimensionality of the signal space is 1000.

Rather than tuning parameters to produce a nice result for a specific distribution, we have used adaptive update factors and parameters producing similar behaviour for different distributions and different numbers of dimensions. Also note that the adaptability allows a system without a pre-specified time-dependent update rate decay. The coefficients $\beta_x$ and $\beta_y$ were in the experiment calculated according to equation 13:

\[
\begin{cases}
\beta_x = a_x E_x^{-1} \\
\beta_y = a_y E_y^{-1}
\end{cases}
\quad \text{where} \quad
\begin{cases}
E_x \leftarrow E_x + b\,(\| x x^T w_x \| - E_x) \\
E_y \leftarrow E_y + b\,(\| y y^T w_y \| - E_y)
\end{cases}
\tag{13}
\]
To get a smooth and yet fast behaviour, an adaptively time-averaged set of vectors, $w_a$, was calculated. The update speed was made dependent on the consistency in the change of the original vectors $w$ according to equation 14:

\[
\begin{cases}
w_{ax} \leftarrow w_{ax} + c\,\|\bar\Delta_x\|\,\|w_x\|^{-1} (w_x - w_{ax}) \\
w_{ay} \leftarrow w_{ay} + c\,\|\bar\Delta_y\|\,\|w_y\|^{-1} (w_y - w_{ay})
\end{cases}
\quad \text{where} \quad
\begin{cases}
\bar\Delta_x \leftarrow \bar\Delta_x + d\,(\Delta w_x - \bar\Delta_x) \\
\bar\Delta_y \leftarrow \bar\Delta_y + d\,(\Delta w_y - \bar\Delta_y)
\end{cases}
\tag{14}
\]

This process we call adaptive smoothing.
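These two mechanisms can be sketched as follows (Python/NumPy; the constants a, b, c, d are our own placeholder values, as the paper does not state them):

```python
import numpy as np

def adaptive_gain(x, wx, Ex, a=0.5, b=0.01):
    """Equation 13: gain beta_x = a/E_x, where E_x tracks the magnitude
    of the decay term x x^T w_x of the update rule."""
    Ex = Ex + b * (np.linalg.norm(x * (x @ wx)) - Ex)
    return a / Ex, Ex

def adaptive_smooth(w, wa, dbar, dw, c=0.1, d=0.01):
    """Equation 14: the averaged vector wa follows w at a speed set by
    the consistency (running mean dbar) of the recent changes dw."""
    dbar = dbar + d * (dw - dbar)
    wa = wa + c * np.linalg.norm(dbar) / np.linalg.norm(w) * (w - wa)
    return wa, dbar

# Small demonstration: E_x settles near E{||x x^T w_x||}, and wa tracks
# a fixed w when the reported changes dw are consistent.
rng = np.random.default_rng(4)
wx = np.array([1.0, 0.0])
Ex = 1.0
for _ in range(2000):
    beta, Ex = adaptive_gain(rng.standard_normal(2), wx, Ex)

w = np.array([2.0, 0.0])
wa, dbar = np.zeros(2), np.zeros(2)
for _ in range(2000):
    wa, dbar = adaptive_smooth(w, wa, dbar, np.array([0.1, 0.0]))
```

When the changes $\Delta w$ point in random directions, their running mean is small and $w_a$ barely moves; consistent changes speed the tracking up, which is the intended behaviour of equation 14.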
The experiment has been carried out using a randomly chosen distribution of an 800-dimensional x variable and a 200-dimensional y variable. Two x and two y dimensions were partly correlated. The variances in the 1000 dimensions are of the same order of magnitude.
Figure 2: Left: The estimated first canonical correlation (solid line) as a function of the number of actual events, and the true correlation in the current directions found by the algorithm (dotted line). The dimensionality of one set of variables is 800 and of the second set 200. Right: The log of the angular error as a function of the number of actual events on a logarithmic scale.
To the left in figure 2, the estimated first canonical correlation as a function of the number of actual events (solid line) and the true correlation in the current directions found by the algorithm (dotted line) are shown.

To the right in the same figure, the effect of the adaptive smoothing is shown. The angle errors of the smoothed estimates (solid and dashed curves) are much more stable and decrease more rapidly than the 'raw' estimates. The errors after 3·10⁴ samples are in the order of a few degrees. (It should be noted that this is an extreme precision since, with a resolution of 3 degrees, a low estimate of the number of different orientations in a 1000-dimensional space is 10¹⁵⁰⁰.) The angular errors were calculated as the angle between the vectors $w_a$ and the exact solutions $\hat e$ (known from the x-y sample distribution), i.e. $\arccos(\hat w_a^T \hat e)$.
5 Applications in computer vision
At the Computer Vision Laboratory in Linköping, we have developed a tensor representation of image features (see e.g. [8, 9]) that has received attention in the computer vision society. A possible extension of the tensor concept towards more robust estimation, representation of certainty, representation of higher order features, etc. involves higher order filter combinations.

As an example, consider a three-dimensional filtering with a 7×7×7 neighborhood on three scales. This gives approximately 1000 filter responses. A complete second order function of this filtering involves about 10⁶ signals.

The selection among all different possible filter combinations to design a tensor representation is very difficult, and a learning system is called for. A standard optimization method based on mean square error is, however, not very useful, since it is the shape of the tensor that is of interest rather than its size. If we, for example, want to have a tensor representing the orientation of a signal, we want the tensor to carry as much information as possible about the orientation and not the magnitude of the signal.

For this reason, the canonical correlation algorithm is a suitable method, since it is based on mutual information maximization rather than mean square error [1]. It can also handle high-dimensional signal spaces, which is essential in a further development of the tensor concept in image processing.
Experiments show that the canonical correlation algorithm can be used to find filters that describe a particular feature in an image invariant with respect to other features. The features to be described are learned by giving examples that are presented in pairs to the algorithm in such a way that the desired feature, e.g. orientation, is equal for each pair while other features, e.g. phase, are presented in an unordered way.
5.1 Learning low level operations
In the first experiment, we show that quadrature filters are found by this method when products of pixel data are presented to the algorithm. Quadrature filters can be used to describe lower order features, e.g. local orientation.

Let $I_x$ and $I_y$ be a pair of 5×5 images. Each image consists of a sine wave pattern and additive Gaussian noise. A sequence of such image pairs is constructed so that, for each pair, the orientation is equal in the two images while the phase differs in a random way. The images have independent noise. Each image pair is described by vectors $i_x$ and $i_y$.
Let $X$ and $Y$ be the outer products of the image vectors, i.e. $X = i_x i_x^T$ and $Y = i_y i_y^T$, and rearrange the matrices X and Y into vectors x and y respectively. Now, we have a sequence of pairs of 625-dimensional vectors describing the products of pixel data from the images.

The sequence consists of 6500 examples, i.e. 20 examples per degree of freedom. (The outer product matrices are symmetric and, hence, the number of free parameters is $\frac{n^2 + n}{2}$, where n is the dimensionality of the image vector.)
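The construction of the training pairs can be sketched as follows (Python/NumPy; the sine-wave generator and its parameters are our own hypothetical stand-ins for the image model described above):

```python
import numpy as np

rng = np.random.default_rng(5)

def sine_image(theta, phase, snr_db, size=5):
    """A 5x5 sine-wave pattern with orientation theta, plus Gaussian noise."""
    u, v = np.meshgrid(np.arange(size), np.arange(size))
    signal = np.sin(np.cos(theta) * u + np.sin(theta) * v + phase)
    noise_std = np.sqrt(np.mean(signal**2) / 10 ** (snr_db / 10))
    return signal + noise_std * rng.standard_normal((size, size))

def product_vector(img):
    """Vectorize the image, form the outer product, flatten to 625 dims."""
    i = img.ravel()
    return np.outer(i, i).ravel()

# One training pair: same orientation, independent phase and noise.
theta = rng.uniform(0, np.pi)
ix = sine_image(theta, rng.uniform(0, 2 * np.pi), snr_db=10)
iy = sine_image(theta, rng.uniform(0, 2 * np.pi), snr_db=10)
x, y = product_vector(ix), product_vector(iy)
```

Feeding the algorithm second-order products rather than raw pixels is what allows a linear method to pick up phase-invariant (quadrature) structure.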
For a signal-to-noise ratio (SNR) of 0 dB, there were 6 significant¹ canonical correlations, and for an SNR of 10 dB there were 10 significant canonical correlations. The two most significant correlations for the 0 dB case were both 0.7, which corresponds to an SNR² of 3.7 dB. For the 10 dB case, the two highest correlations were both 0.989, corresponding to an SNR of 19.5 dB.
The projections of image signals x for orientations between 0 and π onto the 10 first canonical correlation vectors $w_x$ from the 10 dB case are shown to the left in figure 3. The signals were generated with random phase and without noise. As seen in the figure, the first two canonical correlations are sensitive to the double angle of the orientation of the signal and invariant with respect to phase. The two curves are 90° out of phase and, hence, form a quadrature pair [5]. The following curves show the lower correlations, which are sensitive to the fourth, sixth, eighth, and tenth multiples of the orientation, and they also form quadrature pairs.
To be able to interpret the canonical correlation vectors $w_x$, we can write the vectors as 25×25 matrices $W_x$ and then do an eigenvalue decomposition, i.e. $W_x = \sum_i \lambda_i \hat e_i \hat e_i^T$. (Note that the data was generated as outer products, resulting in positive semidefinite symmetric matrices.) The eigenvectors $\hat e_i$ can be seen as linear filters acting on the image. If the eigenvectors are rearranged into 5×5 matrices, "eigenimages", they can be interpreted in terms of image features. To the right in figure 3, the two most significant eigenimages are shown for the first (top) and second (bottom) canonical correlations respectively. We see that these eigenimages form two quadrature filters sensitive to two perpendicular orientations.
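This interpretation step can be sketched as (Python/NumPy; the rank-one test vector in the demonstration is our own):

```python
import numpy as np

def eigenimages(w_x, img_size=5):
    """Reshape a 625-dim canonical correlation vector into a 25x25 matrix
    W_x, eigendecompose it, and return the eigenvalues together with the
    eigenvectors reshaped into 5x5 'eigenimages', most significant first."""
    n = img_size ** 2
    W = w_x.reshape(n, n)
    W = 0.5 * (W + W.T)        # enforce symmetry numerically
    lam, E = np.linalg.eigh(W)
    order = np.argsort(-np.abs(lam))
    return lam[order], [E[:, i].reshape(img_size, img_size) for i in order]

# Demonstration on a known rank-one case: a vector built as vec(f f^T)
# must return f itself (up to sign and scale) as the leading eigenimage.
f = np.sin(np.linspace(0, np.pi, 25))
lam, imgs = eigenimages(np.outer(f, f).ravel())
```

Because the training vectors are flattened outer products, each $W_x$ is (close to) symmetric positive semidefinite, so `eigh` is the appropriate decomposition here.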
¹ By significant, we mean that they differ from the random correlations caused by the limited set of samples. The random correlations, in the case of 20 samples per degree of freedom, are approximately 0.2 (given by experiments).

² The relation between correlation and SNR in this case is defined by the correlation between two signals with the same SNR, i.e. $\rho = \mathrm{SNR}/(\mathrm{SNR} + 1)$ with the SNR expressed as a linear ratio.
Figure 3: Left: Projections of outer product vectors x onto the 10 first canonical correlation vectors. Right: Eigenvectors of the canonical correlation vectors viewed as images, "eigenimages".
5.2 Learning higher level operations
In this experiment, we use the output from neighboring sets of quadrature filters rather than pixel values as input to the algorithm. This can be justified by the fact that we have seen that quadrature filters can be developed on the lowest (pixel) level using this algorithm. We will show that canonical correlation can find a way of combining filter output from a local neighborhood to get orientation estimates that are less sensitive to noise than the standard vector averaging method [5].

Let $q_x$ and $q_y$ be 16-dimensional complex vectors of filter responses from four quadrature filters from each position in a 2×2 neighborhood. (The content in each position could be calculated using the method in the previous experiment.) Let X and Y be the real parts³ of the outer products of $q_x$ and $q_y$ with themselves respectively, and rearrange X and Y into 256-dimensional vectors x and y. For each pair of vectors, the local orientation was equal while the phase and noise differed randomly. The SNR was 0 dB. The two largest canonical correlations were both 0.8. The corresponding vectors detected the double angle of the orientation invariant with respect to phase.
New data were generated using a rotating sine-wave pattern with an SNR of 0 dB and projected onto the two first canonical correlation vectors. The orientation estimates are shown to the left in figure 4 together with estimates using vector averaging on the same data. In the right figure, the angular error is shown for both methods. The mean absolute angular error was 16° for the canonical correlation method and 22° for the vector averaging method, i.e. an improvement by 27%. Note that the neighborhood is very small (2×2), but preliminary tests indicate that the attainable improvement in noise reduction increases rapidly with neighborhood size.
References

[1] S. Becker and G. E. Hinton. Learning mixture models of spatial coherence. Neural Computation, 5(2):267-277, March 1993.

[2] R. D. Bock. Multivariate Statistical Methods in Behavioral Research. McGraw-Hill series in psychology. McGraw-Hill, 1975.

[3] M. Borga. Hierarchical Reinforcement Learning. In S. Gielen and B. Kappen, editors, ICANN'93, Amsterdam, September 1993. Springer-Verlag.
Figure 4: Left: Orientation estimates for a rotating sine-wave pattern on a 2×2 neighborhood with an SNR of 0 dB, using filter combinations found by canonical correlations (middle) and vector averaging (bottom). The correct double angle is shown as reference (top). Right: Angular errors for 1000 different samples using canonical correlations (middle) and vector averaging (bottom).
[4] M. Borga. Reinforcement Learning Using Local Adaptive Models, August 1995. Thesis No. 507, ISBN 91-7871-590-3.

[5] G. H. Granlund and H. Knutsson. Signal Processing for Computer Vision. Kluwer Academic Publishers, 1995. ISBN 0-7923-9530-1.

[6] H. Hotelling. Relations between two sets of variates. Biometrika, 28:321-377, 1936.

[7] J. Kay. Feature discovery under contextual supervision using mutual information. In International Joint Conference on Neural Networks, volume 4, pages 79-84. IEEE, 1992.

[8] H. Knutsson. Representing local structure using tensors. In The 6th Scandinavian Conference on Image Analysis, pages 244-251, Oulu, Finland, June 1989. Report LiTH-ISY-I-1019, Computer Vision Laboratory, Linköping University, Sweden, 1989.

[9] H. Knutsson. Tensor Based Spatio-temporal Signal Analysis. In H. I. Christensen and J. L. Crowley, editors, Vision as Process. Springer, 1995. Basic Research Series.

[10] T. Kohonen. Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43:59-69, 1982.

[11] T. Landelius and H. Knutsson. The Learning Tree, a New Concept in Learning. In Proceedings of the 2:nd