Learning Canonical Correlations

M. Borga, H. Knutsson, T. Landelius
Computer Vision Laboratory
Department of Electrical Engineering
Linköping University
S-581 83 Linköping, Sweden
Abstract
This paper presents a novel learning algorithm that finds the linear combination of one set of multi-dimensional variates that is the best predictor, and at the same time finds the linear combination of another set which is the most predictable. This relation is known as the canonical correlation and has the property of being invariant with respect to affine transformations of the two sets of variates. The algorithm successively finds all the canonical correlations, beginning with the largest one. It is shown that canonical correlations can be used in computer vision to find feature detectors by giving examples of the desired features. When used on the pixel level, the method finds quadrature filters, and when used on a higher level, the method finds combinations of filter output that are less sensitive to noise compared to vector averaging.
1 Introduction
Acommonprobleminneuralnetworksandlearning,incapacitatingmanytheoreticallypromisingalgorithms,isthe
high dimensionalityof theinput-outputspace. Asanexample,typicaldimensionalitiesforsystemshavingvisual
inputsfarexceedacceptablelimits. Forthisreason,apriorirestrictionsmustbeinvoked. Acommonrestrictionis
to useonlylocallylinearmodels. ToobtaineÆcientsystems,thedimensionalities ofthemodelsshouldbeaslow
aspossible. The useoflocally low-dimensional linearmodelswill in mostcasesbe adequateifthesubdivisionof
theinputandoutputspacesaremadeadaptively[3,11].
An important problem is to find the best directions in the input and output spaces for the local models. Algorithms like the Kohonen self-organizing feature maps [10] and others that work with principal component analysis will find directions where the signal variances are high. This is, however, of little use in a response generating system. Such a system should find directions that efficiently represent signals that are important rather than signals that have large energy.
In general the input to a system comes from a set of different sensors, and it is evident that the range of the signal values from a given sensor is unrelated to the importance of the received information. The same line of reasoning holds for the output, which may consist of signals to a set of different effectuators. For this reason the correlation between input and output signals is interesting, since this measure of input-output relation is independent of the signal variances. However, correlation alone is not necessarily meaningful. Only input-output pairs that are regarded as relevant should be entered in the correlation analysis.
Relating only the projections of the input, $x$, and output, $y$, onto two vectors, $w_x$ and $w_y$, establishes a one-dimensional linear relation between the input and output. We wish to find the vectors that maximize $\mathrm{corr}(x^T w_x,\, y^T w_y)$, i.e. the correlation between the projections. This relation is known as canonical correlation [6]. It is a statistical method of finding the linear combination of one set of variables that is the best predictor, and at the same time finding the linear combination of another set which is the most predictable.

It has been shown [7] that finding the canonical correlations is equivalent to maximizing the mutual information between the sets $X$ and $Y$ if $x$ and $y$ come from elliptical symmetric random distributions.
In section 2, a brief review of the theory of canonical correlation is given. In section 3 we present an iterative learning rule, equation 7, that finds the directions and magnitudes of the canonical correlations. To illustrate the algorithm's behaviour, some experiments are presented and discussed in section 4. Finally, in section 5, we discuss applications in computer vision.
2 Canonical Correlation
Consider two random variables, $x$ and $y$, from a multi-normal distribution:

\[
\begin{pmatrix} x \\ y \end{pmatrix} \sim
N\!\left( \begin{pmatrix} x_0 \\ y_0 \end{pmatrix},
\begin{pmatrix} C_{xx} & C_{xy} \\ C_{yx} & C_{yy} \end{pmatrix} \right)
\tag{1}
\]

where $C = \begin{pmatrix} C_{xx} & C_{xy} \\ C_{yx} & C_{yy} \end{pmatrix}$ is the covariance matrix. $C_{xx}$ and $C_{yy}$ are nonsingular matrices and $C_{xy} = C_{yx}^T$. Consider the linear combinations $x' = w_x^T (x - x_0)$ and $y' = w_y^T (y - y_0)$ of the two variables respectively. The correlation $\rho$ between $x'$ and $y'$ is given by equation 2, see for example [2]:

\[
\rho = \frac{w_x^T C_{xy} w_y}{\sqrt{\,w_x^T C_{xx} w_x \; w_y^T C_{yy} w_y\,}}.
\tag{2}
\]
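As a concrete illustration, equation 2 can be evaluated numerically from sample covariance estimates. The sketch below (Python with NumPy; the toy data, dimensions, and seed are our own assumptions, not from the paper) computes the correlation of the projections for a given pair of directions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: the first x dimension is linearly related to
# the first y dimension; the remaining dimensions are independent noise.
n = 5000
x = rng.standard_normal((n, 3))
y = np.column_stack([x[:, 0] + 0.5 * rng.standard_normal(n),
                     rng.standard_normal(n)])

# Sample estimates of the covariance blocks C_xx, C_xy, C_yy.
xc, yc = x - x.mean(0), y - y.mean(0)
Cxx = xc.T @ xc / n
Cxy = xc.T @ yc / n
Cyy = yc.T @ yc / n

def rho(wx, wy):
    """Correlation between the projections x^T wx and y^T wy (equation 2)."""
    return (wx @ Cxy @ wy) / np.sqrt((wx @ Cxx @ wx) * (wy @ Cyy @ wy))

print(rho(np.array([1.0, 0, 0]), np.array([1.0, 0])))  # related pair: high
print(rho(np.array([0, 1.0, 0]), np.array([0, 1.0])))  # unrelated pair: near zero
```

Note that $\rho$ is invariant to rescaling of $w_x$ and $w_y$, so only the directions of the vectors matter.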
The directions of the partial derivatives of $\rho$ with respect to $w_x$ and $w_y$ are given by:

\[
\begin{cases}
\dfrac{\partial \rho}{\partial w_x} \overset{!}{=} C_{xy}\hat w_y - \dfrac{\hat w_x^T C_{xy} \hat w_y}{\hat w_x^T C_{xx} \hat w_x}\, C_{xx}\hat w_x \\[2ex]
\dfrac{\partial \rho}{\partial w_y} \overset{!}{=} C_{yx}\hat w_x - \dfrac{\hat w_y^T C_{yx} \hat w_x}{\hat w_y^T C_{yy} \hat w_y}\, C_{yy}\hat w_y
\end{cases}
\tag{3}
\]

where '$\hat{\ }$' indicates unit length and '$\overset{!}{=}$' means that the vectors, left and right, have the same directions. A complete description of the canonical correlations is given by:

\[
\begin{pmatrix} C_{xx} & 0 \\ 0 & C_{yy} \end{pmatrix}^{-1}
\begin{pmatrix} 0 & C_{xy} \\ C_{yx} & 0 \end{pmatrix}
\begin{pmatrix} \hat w_x \\ \hat w_y \end{pmatrix}
= \rho \begin{pmatrix} \lambda_x \hat w_x \\ \lambda_y \hat w_y \end{pmatrix}
\tag{4}
\]

where $\rho, \lambda_x, \lambda_y > 0$ and $\lambda_x \lambda_y = 1$. Equation 4 can be rewritten as:

\[
\begin{cases}
C_{xx}^{-1} C_{xy} \hat w_y = \rho \lambda_x \hat w_x \\
C_{yy}^{-1} C_{yx} \hat w_x = \rho \lambda_y \hat w_y
\end{cases}
\tag{5}
\]
Solving equation 5 gives $N$ solutions $\{\rho_n, \hat w_{xn}, \hat w_{yn}\}$, $n = 1, \ldots, N$, where $N$ is the minimum of the input dimensionality and the output dimensionality. The linear combinations $x_n = \hat w_{xn}^T x$ and $y_n = \hat w_{yn}^T y$ are termed canonical variates, and the correlations $\rho_n$ between these variates are termed the canonical correlations [6]. An important aspect in this context is that the canonical correlations are invariant to affine transformations of $x$ and $y$. Also note that the canonical variates corresponding to the different roots of equation 5 are uncorrelated, implying that:

\[
\begin{cases}
w_{xn}^T C_{xx} w_{xm} = 0 \\
w_{yn}^T C_{yy} w_{ym} = 0
\end{cases}
\quad \text{if } n \neq m.
\tag{6}
\]
It should be noted that equation 4 is a special case of the generalized eigenproblem [4]:

\[ A w = \mu B w. \]

The solution to this problem can be found by finding the vectors $w$ that maximize the Rayleigh quotient:

\[ r = \frac{w^T A w}{w^T B w}. \]
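For reference, equation 5 can also be solved directly in batch form: eliminating $\hat w_y$ yields the ordinary eigenproblem $C_{xx}^{-1} C_{xy} C_{yy}^{-1} C_{yx} \hat w_x = \rho^2 \hat w_x$. The sketch below takes this closed-form route on toy data of our own construction; it is not the paper's iterative algorithm, which is introduced in the next section:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data with one known canonical correlation: both sets share the
# latent signal s, giving a true largest correlation of 1/1.25 = 0.8.
n = 20000
s = rng.standard_normal(n)
x = np.column_stack([s + 0.5 * rng.standard_normal(n),
                     rng.standard_normal(n),
                     rng.standard_normal(n)])
y = np.column_stack([s + 0.5 * rng.standard_normal(n),
                     rng.standard_normal(n)])

xc, yc = x - x.mean(0), y - y.mean(0)
Cxx, Cyy, Cxy = xc.T @ xc / n, yc.T @ yc / n, xc.T @ yc / n

# Eliminating w_y from equation 5: Cxx^{-1} Cxy Cyy^{-1} Cyx w_x = rho^2 w_x.
M = np.linalg.inv(Cxx) @ Cxy @ np.linalg.inv(Cyy) @ Cxy.T
eigvals, _ = np.linalg.eig(M)
rhos = np.sort(np.sqrt(np.clip(eigvals.real, 0.0, None)))[::-1]
print(rhos)  # canonical correlations, largest first
```

This batch solution requires forming and inverting the full covariance blocks, which is exactly what becomes prohibitive in the high-dimensional settings that motivate the iterative rule below.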
3 Learning Canonical Correlations
We have developed a novel learning algorithm that finds the canonical correlations and the corresponding canonical variates by an iterative method. The update rule for the vectors $w_x$ and $w_y$ is given by:

\[
\begin{cases}
w_x \leftarrow w_x + \beta_x\, x\,(y^T \hat w_y - x^T w_x) \\
w_y \leftarrow w_y + \beta_y\, y\,(x^T \hat w_x - y^T w_y)
\end{cases}
\tag{7}
\]

where $x$ and $y$ both have zero mean. To see that this rule finds the directions of the canonical correlation, we look at the expected change, in one iteration, of the vectors $w_x$ and $w_y$:

\[
\begin{cases}
E\{\Delta w_x\} = \beta_x E\{x y^T \hat w_y - x x^T w_x\} = \beta_x (C_{xy}\hat w_y - \|w_x\|\, C_{xx}\hat w_x) \\
E\{\Delta w_y\} = \beta_y E\{y x^T \hat w_x - y y^T w_y\} = \beta_y (C_{yx}\hat w_x - \|w_y\|\, C_{yy}\hat w_y)
\end{cases}
\]

Identifying with equation 3 gives:

\[
E\{\Delta w_x\} \overset{!}{=} \frac{\partial \rho}{\partial w_x}
\quad \text{and} \quad
E\{\Delta w_y\} \overset{!}{=} \frac{\partial \rho}{\partial w_y}
\tag{8}
\]

with

\[
\|w_x\| = \frac{\hat w_x^T C_{xy} \hat w_y}{\hat w_x^T C_{xx} \hat w_x}
\quad \text{and} \quad
\|w_y\| = \frac{\hat w_y^T C_{yx} \hat w_x}{\hat w_y^T C_{yy} \hat w_y}.
\]

This shows that the expected changes of the vectors $w_x$ and $w_y$ are in the same directions as the gradient of the canonical correlation $\rho$, which means that the learning rule in equation 7 on average is a gradient search on $\rho$. The quantities $\rho$, $\lambda_x$ and $\lambda_y$ are found as:

\[
\rho = \sqrt{\|w_x\|\,\|w_y\|}, \qquad
\lambda_x = \lambda_y^{-1} = \sqrt{\frac{\|w_x\|}{\|w_y\|}}.
\tag{9}
\]
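A minimal sketch of the update rule in equation 7 follows (Python/NumPy). A fixed gain $\beta$ is used for both vectors, whereas the paper uses the adaptive coefficients of equation 13; the toy data and all constants are our own:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data with a known largest canonical correlation of 0.8 between the
# first dimensions of x and y (shared latent signal s).
n = 20000
s = rng.standard_normal(n)
X = np.column_stack([s + 0.5 * rng.standard_normal(n),
                     rng.standard_normal(n)])
Y = np.column_stack([s + 0.5 * rng.standard_normal(n),
                     rng.standard_normal(n)])

beta = 0.02
wx = rng.standard_normal(2)
wy = rng.standard_normal(2)
for x, y in zip(X, Y):
    wxh = wx / np.linalg.norm(wx)            # unit-length versions
    wyh = wy / np.linalg.norm(wy)
    wx = wx + beta * x * (y @ wyh - x @ wx)  # equation 7
    wy = wy + beta * y * (x @ wxh - y @ wy)

# After convergence, equation 9 gives the correlation estimate.
rho_est = np.sqrt(np.linalg.norm(wx) * np.linalg.norm(wy))
```

With these settings the estimate approaches the true value; in practice the adaptive gains of section 4.2 remove the need to choose $\beta$ by hand.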
3.1 Learning of successive canonical correlations
The learning rule maximizes the correlation and finds the directions $\hat w_{x1}$ and $\hat w_{y1}$ corresponding to the largest correlation, $\rho_1$. To find the second largest canonical correlation and the corresponding canonical variates of equation 5, we use the modified learning rule

\[
\begin{cases}
w_x \leftarrow w_x + \beta_x\, x\,\big((y - y_1)^T \hat w_y - x^T w_x\big) \\
w_y \leftarrow w_y + \beta_y\, y\,\big((x - x_1)^T \hat w_x - y^T w_y\big)
\end{cases}
\tag{10}
\]

where

\[
x_1 = \frac{x^T \hat w_{x1}\; v_{x1}}{\hat w_{x1}^T v_{x1}}
\quad \text{and} \quad
y_1 = \frac{y^T \hat w_{y1}\; v_{y1}}{\hat w_{y1}^T v_{y1}}.
\]

$v_{x1}$ and $v_{y1}$ are estimates of $C_{xx}\hat w_{x1}$ and $C_{yy}\hat w_{y1}$ respectively and are estimated using the iterative rule:

\[
\begin{cases}
v_{x1} \leftarrow v_{x1} + \alpha\,(x x^T \hat w_{x1} - v_{x1}) \\
v_{y1} \leftarrow v_{y1} + \alpha\,(y y^T \hat w_{y1} - v_{y1})
\end{cases}
\tag{11}
\]

The expected change of $w_x$ and $w_y$ is then given by

\[
\begin{cases}
E\{\Delta w_x\} = \beta_x \left( C_{xy}\!\left[ \hat w_y - \hat w_{y1} \dfrac{\hat w_{y1}^T C_{yy} \hat w_y}{\hat w_{y1}^T C_{yy} \hat w_{y1}} \right] - \|w_x\|\, C_{xx}\hat w_x \right) \\[2ex]
E\{\Delta w_y\} = \beta_y \left( C_{yx}\!\left[ \hat w_x - \hat w_{x1} \dfrac{\hat w_{x1}^T C_{xx} \hat w_x}{\hat w_{x1}^T C_{xx} \hat w_{x1}} \right] - \|w_y\|\, C_{yy}\hat w_y \right)
\end{cases}
\tag{12}
\]

It can be seen that the parts of $w_x$ and $w_y$ parallel to $C_{xx}\hat w_{x1}$ and $C_{yy}\hat w_{y1}$ respectively will vanish ($w_x^T w_{x1} \to 0$ and $w_y^T w_{y1} \to 0$ in equation 10). In the subspaces orthogonal to $C_{xx}\hat w_{x1}$ and $C_{yy}\hat w_{y1}$, the learning rule will be equivalent to that given by equation 7. In this way the parts of the signals correlated with $w_{x1}^T x$ (and $w_{y1}^T y$) are disregarded, leaving the rest unchanged. Consequently the algorithm finds the second largest correlation $\rho_2$ and the corresponding vectors $w_{x2}$ and $w_{y2}$. Successive canonical correlations can be found by repeating the procedure.
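The modified rule can be sketched in the same style (Python/NumPy; the toy data with two known correlations, 0.8 and 0.5, are our own, and for brevity the first pair of directions $\hat w_{x1}, \hat w_{y1}$ is taken as already known rather than learned):

```python
import numpy as np

rng = np.random.default_rng(3)

# Two shared latent signals: true correlations 1/1.25 = 0.8 and 1/2.0 = 0.5.
n = 40000
s1, s2 = rng.standard_normal(n), rng.standard_normal(n)
X = np.column_stack([s1 + 0.5 * rng.standard_normal(n),
                     s2 + 1.0 * rng.standard_normal(n),
                     rng.standard_normal(n)])
Y = np.column_stack([s1 + 0.5 * rng.standard_normal(n),
                     s2 + 1.0 * rng.standard_normal(n)])

wx1 = np.array([1.0, 0.0, 0.0])    # first canonical directions (assumed known)
wy1 = np.array([1.0, 0.0])
vx1, vy1 = wx1.copy(), wy1.copy()  # estimates of Cxx wx1 and Cyy wy1 (eq. 11)
wx = rng.standard_normal(3)
wy = rng.standard_normal(2)
beta, alpha = 0.02, 0.02
for x, y in zip(X, Y):
    vx1 += alpha * (x * (x @ wx1) - vx1)            # equation 11
    vy1 += alpha * (y * (y @ wy1) - vy1)
    x1 = (x @ wx1) * vx1 / (wx1 @ vx1)              # components to remove
    y1 = (y @ wy1) * vy1 / (wy1 @ vy1)
    wxh = wx / np.linalg.norm(wx)
    wyh = wy / np.linalg.norm(wy)
    wx = wx + beta * x * ((y - y1) @ wyh - x @ wx)  # equation 10
    wy = wy + beta * y * ((x - x1) @ wxh - y @ wy)

rho2_est = np.sqrt(np.linalg.norm(wx) * np.linalg.norm(wy))
```

The subtraction of $x_1$ and $y_1$ removes the drive along the first canonical pair, so the rule settles on the second pair instead.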
4 Performance
In this section, two different experiments are presented to illustrate the efficiency and performance of the proposed algorithm.
Figure 1: Left: Number of flops until convergence for RLS (dashed line) and for our algorithm (solid line). Right: Number of flops per iteration for RLS (dashed line) and for our algorithm (solid line).
4.1 O(n) speedup
In the first experiment, we demonstrate the advantage of using our canonical correlation algorithm to find the proper subspace for a low-dimensional linear model in a high-dimensional space. A set of training data was generated with n-dimensional input vectors x and 8-dimensional output vectors y. There were two linear relations between x and y and, hence, the proper subspaces should be two-dimensional in the input and output spaces respectively.

In the experiment, the canonical correlation algorithm was run until it found the proper subspace. This can be determined by the algorithm since it, besides the directions of canonical correlation, gives an estimate of each canonical correlation. After convergence, a standard recursive least squares (RLS) algorithm was applied to find the linear relations between the two-dimensional subspaces. This algorithm was iterated until convergence, i.e. until the error was below a certain threshold. The error was defined as

\[ \epsilon = \| B C^T x - y \|^2 \]

where B is the (small) matrix found by the RLS algorithm and C is the matrix found by the canonical correlation algorithm extracting the relevant subspace.

As a comparison, the standard RLS algorithm was used directly on the data, iterated until convergence with the same threshold as in the first experiment. The error in this case was defined as

\[ \epsilon = \| A x - y \|^2 \]

where A is the (large) matrix found by the RLS algorithm. The results are plotted in figure 1.
In both experiments the dimensionality n of the input space was 16, 32, 64 and 128 (computational problems with the standard RLS method set the upper limit). The complexity was measured by counting the number of floating point operations (flops) until convergence (left) and per iteration (right) for the standard RLS method (marked with rings) and for our method (marked with stars). The lines show the estimated number of flops for larger dimensionalities. They were calculated by fitting polynomials to the data. For our method, a second and a first order polynomial was sufficient for the data in the left and right figures respectively. For the standard RLS method, a third and a second order polynomial respectively had to be used, in accordance with theory. Note the logarithmic scale.
The results show that our algorithm is of order O(n²) when run until convergence (O(n) per iteration), while the standard RLS method is of order O(n³) (O(n²) per iteration).
The dimensionality of the linear relation can, of course, in general not be known in advance. This is, however, not a problem, since the canonical correlation algorithm first finds the largest correlation and then proceeds with successively smaller ones.
4.2 High dimensional spaces
The second experiment illustrates the algorithm's ability to handle high-dimensional spaces. The dimensionality of x is 800 and the dimensionality of y is 200, so the total dimensionality of the signal space is 1000.

Rather than tuning parameters to produce a nice result for a specific distribution, we have used adaptive update factors and parameters producing similar behaviour for different distributions and different numbers of dimensions. Also note that the adaptability allows a system without a pre-specified time-dependent update rate decay. The coefficients $\beta_x$ and $\beta_y$ were in the experiment calculated according to equation 13:

\[
\begin{cases}
\beta_x = a_x E_x^{-1} \\
\beta_y = a_y E_y^{-1}
\end{cases}
\quad \text{where} \quad
\begin{cases}
E_x \leftarrow E_x + b\,(\| x x^T w_x \| - E_x) \\
E_y \leftarrow E_y + b\,(\| y y^T w_y \| - E_y)
\end{cases}
\tag{13}
\]
To get a smooth and yet fast behaviour, an adaptively time-averaged set of vectors, $w_a$, was calculated. The update speed was made dependent on the consistency in the change of the original vectors $w$ according to equation 14:

\[
\begin{cases}
w_{ax} \leftarrow w_{ax} + c\,\|\bar\Delta_x\|\,\|w_x\|^{-1} (w_x - w_{ax}) \\
w_{ay} \leftarrow w_{ay} + c\,\|\bar\Delta_y\|\,\|w_y\|^{-1} (w_y - w_{ay})
\end{cases}
\quad \text{where} \quad
\begin{cases}
\bar\Delta_x \leftarrow \bar\Delta_x + d\,(\Delta w_x - \bar\Delta_x) \\
\bar\Delta_y \leftarrow \bar\Delta_y + d\,(\Delta w_y - \bar\Delta_y)
\end{cases}
\tag{14}
\]

This process we call adaptive smoothing.
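These two mechanisms can be sketched as follows (Python/NumPy; the constants a, b, c, d are our own placeholder values, as the paper does not state them):

```python
import numpy as np

def adaptive_gain(x, wx, Ex, a=0.5, b=0.01):
    """Equation 13: gain beta_x = a/E_x, where E_x tracks the magnitude
    of the decay term x x^T w_x of the update rule."""
    Ex = Ex + b * (np.linalg.norm(x * (x @ wx)) - Ex)
    return a / Ex, Ex

def adaptive_smooth(w, wa, dbar, dw, c=0.1, d=0.01):
    """Equation 14: the averaged vector wa follows w at a speed set by
    the consistency (running mean dbar) of the recent changes dw."""
    dbar = dbar + d * (dw - dbar)
    wa = wa + c * np.linalg.norm(dbar) / np.linalg.norm(w) * (w - wa)
    return wa, dbar

# Small demonstration: E_x settles near E{||x x^T w_x||}, and wa tracks
# a fixed w when the reported changes dw are consistent.
rng = np.random.default_rng(4)
wx = np.array([1.0, 0.0])
Ex = 1.0
for _ in range(2000):
    beta, Ex = adaptive_gain(rng.standard_normal(2), wx, Ex)

w = np.array([2.0, 0.0])
wa, dbar = np.zeros(2), np.zeros(2)
for _ in range(2000):
    wa, dbar = adaptive_smooth(w, wa, dbar, np.array([0.1, 0.0]))
```

When the changes $\Delta w$ point in random directions, their running mean is small and $w_a$ barely moves; consistent changes speed the tracking up, which is the intended behaviour of equation 14.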
The experiment has been carried out using a randomly chosen distribution of an 800-dimensional x variable and a 200-dimensional y variable. Two x and two y dimensions were partly correlated. The variances in the 1000 dimensions are of the same order of magnitude.
Figure 2: Left: The estimated first canonical correlation (solid line) as a function of the number of actual events, and the true correlation in the current directions found by the algorithm (dotted line). The dimensionality of one set of variables is 800 and of the second set 200. Right: The log of the angular error as a function of the number of actual events on a logarithmic scale.
To the left in figure 2, the estimated first canonical correlation as a function of the number of actual events (solid line) and the true correlation in the current directions found by the algorithm (dotted line) are shown.

To the right in the same figure, the effect of the adaptive smoothing is shown. The angle errors of the smoothed estimates (solid and dashed curves) are much more stable and decrease more rapidly than the 'raw' estimates. The errors after 3·10⁴ samples are in the order of a few degrees. (It should be noted that this is an extreme precision since, with a resolution of 3 degrees, a low estimate of the number of different orientations in a 1000-dimensional space is 10¹⁵⁰⁰.) The angular errors were calculated as the angle between the vectors $w_a$ and the exact solutions $\hat e$ (known from the x-y sample distribution), i.e. $\arccos(\hat w_a^T \hat e)$.
5 Applications in computer vision
At the Computer Vision Laboratory in Linköping, we have developed a tensor representation of image features (see e.g. [8, 9]) that has received attention in the computer vision society. A possible extension of the tensor concept towards more robust estimation, representation of certainty, representation of higher order features, etc. involves higher order filter combinations.

As an example, consider a three-dimensional filtering with a 7×7×7 neighborhood on three scales. This gives approximately 1000 filter responses. A complete second order function of this filtering involves about 10⁶ signals.

The selection among all different possible filter combinations to design a tensor representation is very difficult, and a learning system is called for. A standard optimization method based on mean square error is, however, not very useful, since it is the shape of the tensor that is of interest rather than its size. If we, for example, want to have a tensor representing the orientation of a signal, we want the tensor to carry as much information as possible about the orientation and not the magnitude of the signal.

For this reason, the canonical correlation algorithm is a suitable method, since it is based on mutual information maximization rather than mean square error [1]. It can also handle high-dimensional signal spaces, which is essential in a further development of the tensor concept in image processing.
Experiments show that the canonical correlation algorithm can be used to find filters that describe a particular feature in an image invariant with respect to other features. The features to be described are learned by giving examples that are presented in pairs to the algorithm in such a way that the desired feature, e.g. orientation, is equal for each pair while other features, e.g. phase, are presented in an unordered way.
5.1 Learning low level operations
In the first experiment, we show that quadrature filters are found by this method when products of pixel data are presented to the algorithm. Quadrature filters can be used to describe lower order features, e.g. local orientation.

Let $I_x$ and $I_y$ be a pair of 5×5 images. Each image consists of a sine wave pattern and additive Gaussian noise. A sequence of such image pairs is constructed so that, for each pair, the orientation is equal in the two images while the phase differs in a random way. The images have independent noise. Each image pair is described by vectors $i_x$ and $i_y$.
Let $X$ and $Y$ be the outer products of the image vectors, i.e. $X = i_x i_x^T$ and $Y = i_y i_y^T$, and rearrange the matrices X and Y into vectors x and y respectively. Now, we have a sequence of pairs of 625-dimensional vectors describing the products of pixel data from the images.

The sequence consists of 6500 examples, i.e. 20 examples per degree of freedom. (The outer product matrices are symmetric and, hence, the number of free parameters is $\frac{n^2 + n}{2}$, where n is the dimensionality of the image vector.)
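The construction of the training pairs can be sketched as follows (Python/NumPy; the sine-wave generator and its parameters are our own hypothetical stand-ins for the image model described above):

```python
import numpy as np

rng = np.random.default_rng(5)

def sine_image(theta, phase, snr_db, size=5):
    """A 5x5 sine-wave pattern with orientation theta, plus Gaussian noise."""
    u, v = np.meshgrid(np.arange(size), np.arange(size))
    signal = np.sin(np.cos(theta) * u + np.sin(theta) * v + phase)
    noise_std = np.sqrt(np.mean(signal**2) / 10 ** (snr_db / 10))
    return signal + noise_std * rng.standard_normal((size, size))

def product_vector(img):
    """Vectorize the image, form the outer product, flatten to 625 dims."""
    i = img.ravel()
    return np.outer(i, i).ravel()

# One training pair: same orientation, independent phase and noise.
theta = rng.uniform(0, np.pi)
ix = sine_image(theta, rng.uniform(0, 2 * np.pi), snr_db=10)
iy = sine_image(theta, rng.uniform(0, 2 * np.pi), snr_db=10)
x, y = product_vector(ix), product_vector(iy)
```

Feeding the algorithm second-order products rather than raw pixels is what allows a linear method to pick up phase-invariant (quadrature) structure.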
For a signal-to-noise ratio (SNR) of 0 dB, there were 6 significant¹ canonical correlations, and for an SNR of 10 dB there were 10 significant canonical correlations. The two most significant correlations for the 0 dB case were both 0.7, which corresponds to an SNR² of 3.7 dB. For the 10 dB case, the two highest correlations were both 0.989, corresponding to an SNR of 19.5 dB.
The projections of image signals x for orientations between 0 and π onto the 10 first canonical correlation vectors $w_x$ from the 10 dB case are shown to the left in figure 3. The signals were generated with random phase and without noise. As seen in the figure, the first two canonical correlations are sensitive to the double angle of the orientation of the signal and invariant with respect to phase. The two curves are 90° out of phase and, hence, form a quadrature pair [5]. The following curves show the lower correlations, which are sensitive to the fourth, sixth, eighth, and tenth multiples of the orientation, and they also form quadrature pairs.
To be able to interpret the canonical correlation vectors $w_x$, we can write the vectors as 25×25 matrices $W_x$ and then do an eigenvalue decomposition, i.e. $W_x = \sum_i \lambda_i \hat e_i \hat e_i^T$. (Note that the data was generated as outer products, resulting in positive semidefinite symmetric matrices.) The eigenvectors $\hat e_i$ can be seen as linear filters acting on the image. If the eigenvectors are rearranged into 5×5 matrices, "eigenimages", they can be interpreted in terms of image features. To the right in figure 3, the two most significant eigenimages are shown for the first (top) and second (bottom) canonical correlations respectively. We see that these eigenimages form two quadrature filters sensitive to two perpendicular orientations.
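This interpretation step can be sketched as (Python/NumPy; the rank-one test vector in the demonstration is our own):

```python
import numpy as np

def eigenimages(w_x, img_size=5):
    """Reshape a 625-dim canonical correlation vector into a 25x25 matrix
    W_x, eigendecompose it, and return the eigenvalues together with the
    eigenvectors reshaped into 5x5 'eigenimages', most significant first."""
    n = img_size ** 2
    W = w_x.reshape(n, n)
    W = 0.5 * (W + W.T)        # enforce symmetry numerically
    lam, E = np.linalg.eigh(W)
    order = np.argsort(-np.abs(lam))
    return lam[order], [E[:, i].reshape(img_size, img_size) for i in order]

# Demonstration on a known rank-one case: a vector built as vec(f f^T)
# must return f itself (up to sign and scale) as the leading eigenimage.
f = np.sin(np.linspace(0, np.pi, 25))
lam, imgs = eigenimages(np.outer(f, f).ravel())
```

Because the training vectors are flattened outer products, each $W_x$ is (close to) symmetric positive semidefinite, so `eigh` is the appropriate decomposition here.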
¹ By significant, we mean that they differ from the random correlations caused by the limited set of samples. The random correlations, in the case of 20 samples per degree of freedom, are approximately 0.2 (given by experiments).

² The relation between correlation and SNR in this case is defined by the correlation between two signals with the same SNR, i.e. $\rho = \mathrm{SNR}/(\mathrm{SNR} + 1)$ with the SNR expressed as a linear ratio.
Figure 3: Left: Projections of outer product vectors x onto the 10 first canonical correlation vectors. Right: Eigenvectors of the canonical correlation vectors viewed as images, "eigenimages".
5.2 Learning higher level operations
In this experiment, we use the output from neighboring sets of quadrature filters rather than pixel values as input to the algorithm. This can be justified by the fact that we have seen that quadrature filters can be developed on the lowest (pixel) level using this algorithm. We will show that canonical correlation can find a way of combining filter output from a local neighborhood to get orientation estimates that are less sensitive to noise than the standard vector averaging method [5].

Let $q_x$ and $q_y$ be 16-dimensional complex vectors of filter responses from four quadrature filters from each position in a 2×2 neighborhood. (The content in each position could be calculated using the method in the previous experiment.) Let X and Y be the real parts³ of the outer products of $q_x$ and $q_y$ with themselves respectively, and rearrange X and Y into 256-dimensional vectors x and y. For each pair of vectors, the local orientation was equal while the phase and noise differed randomly. The SNR was 0 dB. The two largest canonical correlations were both 0.8. The corresponding vectors detected the double angle of the orientation invariant with respect to phase.
New data were generated using a rotating sine-wave pattern with an SNR of 0 dB and projected onto the two first canonical correlation vectors. The orientation estimates are shown to the left in figure 4 together with estimates using vector averaging on the same data. In the right figure, the angular error is shown for both methods. The mean absolute angular error was 16° for the canonical correlation method and 22° for the vector averaging method, i.e. an improvement by 27%. Note that the neighborhood is very small (2×2), but preliminary tests indicate that the attainable improvement in noise reduction increases rapidly with neighborhood size.
References

[1] S. Becker and G. E. Hinton. Learning mixture models of spatial coherence. Neural Computation, 5(2):267-277, March 1993.

[2] R. D. Bock. Multivariate Statistical Methods in Behavioral Research. McGraw-Hill series in psychology. McGraw-Hill, 1975.

[3] M. Borga. Hierarchical Reinforcement Learning. In S. Gielen and B. Kappen, editors, ICANN'93, Amsterdam, September 1993. Springer-Verlag.
Figure 4: Left: Orientation estimates for a rotating sine-wave pattern on a 2×2 neighborhood with an SNR of 0 dB, using filter combinations found by canonical correlations (middle) and vector averaging (bottom). The correct double angle is shown as reference (top). Right: Angular errors for 1000 different samples using canonical correlations (middle) and vector averaging (bottom).
[4] M. Borga. Reinforcement Learning Using Local Adaptive Models, August 1995. Thesis No. 507, ISBN 91-7871-590-3.

[5] G. H. Granlund and H. Knutsson. Signal Processing for Computer Vision. Kluwer Academic Publishers, 1995. ISBN 0-7923-9530-1.

[6] H. Hotelling. Relations between two sets of variates. Biometrika, 28:321-377, 1936.

[7] J. Kay. Feature discovery under contextual supervision using mutual information. In International Joint Conference on Neural Networks, volume 4, pages 79-84. IEEE, 1992.

[8] H. Knutsson. Representing local structure using tensors. In The 6th Scandinavian Conference on Image Analysis, pages 244-251, Oulu, Finland, June 1989. Report LiTH-ISY-I-1019, Computer Vision Laboratory, Linköping University, Sweden, 1989.

[9] H. Knutsson. Tensor Based Spatio-temporal Signal Analysis. In H. I. Christensen and J. L. Crowley, editors, Vision as Process. Springer, 1995. Basic Research Series.

[10] T. Kohonen. Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43:59-69, 1982.

[11] T. Landelius and H. Knutsson. The Learning Tree, a New Concept in Learning. In Proceedings of the 2:nd