or,
Relative Orientation from Extended Sequences
of Sparse Point and Line Correspondences
Using the Affine Trifocal Tensor * **
Lars Bretzner and Tony Lindeberg
Computational Vision and Active Perception Laboratory (CVAP)
Dept. of Numerical Analysis and Computing Science,
KTH, S-100 44 Stockholm, Sweden
Abstract. This paper addresses the problem of computing three-dimensional
structure and motion from an unknown rigid configuration of points and lines
viewed by an affine projection model. An algebraic structure, analogous to the
trilinear tensor for three perspective cameras, is defined for configurations
of three centered affine cameras. This centered affine trifocal tensor contains
12 non-zero coefficients and involves linear relations between point
correspondences and trilinear relations between line correspondences. It is
shown how the affine trifocal tensor relates to the perspective trilinear
tensor, and how three-dimensional motion can be computed from this tensor in a
straightforward manner. A factorization approach is also developed to handle
point features and line features simultaneously in image sequences. This theory
is applied to a specific problem in human-computer interaction of capturing
three-dimensional rotations from gestures of a human hand. Besides the obvious
application, this test problem illustrates the usefulness of the affine
trifocal tensor in a situation where sufficient information is not available to
compute the perspective trilinear tensor, while the geometry requires point
correspondences as well as line correspondences over at least three views.
1 Introduction
The problem of deriving structural information and motion cues from image
sequences arises as an important subproblem in several computer vision tasks.
In this paper, we are concerned with the computation of three-dimensional
structure and motion from point and line correspondences extracted from a rigid
three-dimensional object of unknown shape, using the affine camera model.
* The support from the Swedish Research Council for Engineering Sciences, TFR,
is gratefully acknowledged. Email: bretzner@nada.kth.se, tony@nada.kth.se
** In Proc. 5th European Conference on Computer Vision (H. Burkhardt and
B. Neumann, eds.), vol. 1406 of Lecture Notes in Computer Science, (Freiburg,
Germany),
[...] from perspective and orthographic projection have been presented by
(Ullman 1979, Maybank 1992, Huang & Lee 1989, Huang & Netravali 1994) and
others. With the introduction of the affine camera model (Koenderink & van
Doorn 1991, Mundy & Zisserman 1992) a large number of approaches have been
developed, including (Shapiro 1995, Beardsley et al. 1994, McLauchlan et al.
1994, Torr 1995) to mention just a few. Line correspondences have been studied
by (Spetsakis & Aloimonos 1990, Weng et al. 1992), and factorization methods
for points and lines constitute a particularly interesting development (Tomasi
& Kanade 1992, Morita & Kanade 1997, Quan & Kanade 1997, Sturm & Triggs 1996).
These directions of research have recently been combined with the ideas behind
the fundamental matrix (Longuet-Higgins 1981, Faugeras 1992, Xu & Zhang 1997)
and have led to the trilinear tensor (Shashua 1995, Hartley 1995, Heyden 1995)
as a unified model for point and line correspondences for three cameras, with
interesting applications (Beardsley et al. 1996) as well as a deeper
understanding of the relations between point features and line features over
multiple views (Faugeras & Mourrain 1995, Heyden et al. 1997).
The subject of this paper is to build upon the above-mentioned works, and to
develop a framework for handling point and line features simultaneously for
three or more affine views. Initially, we shall focus on image triplets and
show how an affine trifocal tensor can be defined for three centered affine
cameras. This tensor has an algebraic structure similar to that of the
trilinear tensor for three perspective cameras. Compared to the trilinear
tensor, however, it has the advantage that it contains a smaller number of
coefficients, which implies that fewer feature correspondences are required to
determine this tensor. It will also be shown that motion estimation from this
tensor is more straightforward.
This theory will then be applied to the problem of computing changes in
three-dimensional orientation from a sparse set of point and line
correspondences. Specifically, it will be demonstrated how a straightforward
man-machine interface for 3-D orientation interaction (Lindeberg & Bretzner
1998) can be designed based on the theory presented and using no other user
equipment than the operator's own hand. For more details, see (Bretzner &
Lindeberg 1998).
2 Geometric problem and extraction of image features
A specific application we are interested in is to measure changes in the
orientation of a human hand, as a straightforward interface to transfer 3-D
rotational information to a computer using no other user equipment than the
operator's own hand. In contrast to previous approaches for human-computer
interaction that are based on detailed geometric hand models (such as (Lee &
Kunii 1995, Heap & Hogg 1996)) we shall here explore a model based on
qualitative features only. This model involves the thumb, the index finger and
the middle finger, and for each finger the position of the fingertip and the
orientation of the finger are measured in the image domain. Successful tracking
of these image features over [...]
which is assumed to be rigid. It is worth noting that neither the trajectories
of point features nor those of line features per se are sufficient to compute
the motion information we are interested in. The problem requires the
combination of point and line features. Moreover, due to the small number of
image features, the information is not sufficient to compute the trilinear
tensor for perspective projection (see the next section). For this reason, we
shall use an affine projection model, and the affine trifocal tensor will be a
key tool.
The trajectories of image features used as input are extracted using a
framework for feature tracking with automatic scale selection reported in
(Bretzner & Lindeberg 1996, Bretzner & Lindeberg 1997). Blob features
corresponding to the finger tips are computed from points $(x, y; t)$ in
scale-space (Koenderink 1984, Lindeberg 1994) at which the squared normalized
Laplacian
\[ (\nabla^2_{norm} L)^2 = t^2 (L_{xx} + L_{yy})^2 \tag{1} \]
assumes maxima with respect to scale and space simultaneously (Lindeberg
1994). Such points are referred to as scale-space maxima of the normalized
Laplacian. In a similar way, ridge features are detected from scale-space
maxima of a normalized measure of ridge strength defined by (Lindeberg 1996)
\[ \mathcal{A} L^2_{\gamma\text{-}norm}
   = t^{4\gamma} (L_{pp}^2 - L_{qq}^2)^2
   = t^{4\gamma} \left( (L_{xx} - L_{yy})^2 + 4 L_{xy}^2 \right)^2, \tag{2} \]
where $L_{pp}$ and $L_{qq}$ are the eigenvalues of the Hessian matrix and the
normalization parameter $\gamma = 0.875$. At each ridge feature, a windowed
second moment matrix (Förstner & Gülch 1987, Bigün et al. 1991, Lindeberg 1994)
\[ \mu = \iint_{(\xi, \eta) \in \mathbb{R}^2}
   \begin{pmatrix} L_x^2 & L_x L_y \\ L_x L_y & L_y^2 \end{pmatrix}
   g(\xi, \eta; s) \, d\xi \, d\eta \tag{3} \]
is computed using a Gaussian window function $g(\xi, \eta; s)$ centered at the
spatial maximum of $\mathcal{A} L_{\gamma\text{-}norm}$ and with the
integration scale $s$ tuned by the detection scale of the scale-space maximum
of $\mathcal{A} L_{\gamma\text{-}norm}$. The eigenvector of $\mu$ corresponding
to the largest eigenvalue gives the orientation of the finger.
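As a rough illustration of the orientation estimate in (3), the following sketch replaces the integral by a weighted sum over a few sampled gradient vectors and reads off the direction of the eigenvector with the largest eigenvalue in closed form. The function names, the sample gradients and the weights are hypothetical and chosen only for the example; the tracker described in the text is not reproduced here.

```python
import math

def second_moment_matrix(gradients, weights):
    """Weighted sum of outer products of gradient vectors (Lx, Ly).

    This is a discrete stand-in (an assumption of this sketch) for the
    windowed integral in (3); the weights play the role of the window g.
    Returns the three distinct entries (m11, m12, m22).
    """
    m11 = m12 = m22 = 0.0
    for (lx, ly), w in zip(gradients, weights):
        m11 += w * lx * lx
        m12 += w * lx * ly
        m22 += w * ly * ly
    return m11, m12, m22

def dominant_orientation(m11, m12, m22):
    """Angle in radians of the eigenvector belonging to the largest
    eigenvalue of the symmetric matrix [[m11, m12], [m12, m22]]."""
    return 0.5 * math.atan2(2.0 * m12, m11 - m22)

# Hypothetical samples: all gradients lie along the diagonal direction.
gradients = [(1.0, 1.0), (2.0, 2.0), (-1.0, -1.0)]
weights = [1.0, 0.5, 0.5]

theta = dominant_orientation(*second_moment_matrix(gradients, weights))
```

The closed form avoids a general eigenvalue solver, which seems adequate for a 2x2 symmetric matrix; the sign of the gradients does not matter because only outer products enter the sum.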
The left column in figure 3 shows an example of image trajectories obtained in
this way. An attractive property of this feature tracking scheme is that the
scale selection mechanism adapts the scale levels to the local image structure.
This gives the ability to track image features over large size variations,
which is particularly important for the ridge tracker. Provided that the
contrast to the background is sufficient, this scheme gives feature
trajectories over large numbers of frames, using a conceptually very simple
interframe matching mechanism.
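The way scale selection adapts to image structure can be illustrated on an idealized case. The sketch below assumes a unit-mass Gaussian blob of variance `t0`, for which the scale-space representation at scale `t` is again a Gaussian of variance `t0 + t`, and searches a hypothetical grid of scales for the maximum of the squared normalized Laplacian (1) at the blob centre. The blob model, the grid and the function names are assumptions made for this example only.

```python
import math

def normalized_laplacian_sq_at_center(t, t0):
    """Value of (t (Lxx + Lyy))^2 at the centre of a unit-mass Gaussian
    blob of variance t0 observed at scale t.

    The smoothed blob is a Gaussian of variance t0 + t, whose Laplacian at
    the centre equals -1 / (pi (t0 + t)^2).
    """
    laplacian = -1.0 / (math.pi * (t0 + t) ** 2)
    return (t * laplacian) ** 2

def select_scale(t0, scales):
    """Scale on the grid at which the normalized response is largest."""
    return max(scales, key=lambda t: normalized_laplacian_sq_at_center(t, t0))

# Hypothetical scale grid 0.25, 0.5, ..., 50.0.
scales = [0.25 * k for k in range(1, 201)]

# Selected scale for blobs of three different sizes.
selected = {t0: select_scale(t0, scales) for t0 in (2.0, 8.0, 32.0)}
```

In this idealized setting the selected scale coincides with the blob variance, so a blob that grows in the image is followed by a proportionally coarser detection scale.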
3 The trifocal tensor for three centered affine cameras
To capture motion information from the projections of an unknown configuration
of lines in 3-D, it is necessary to have at least three independent views. A
canonical model for describing the geometric relationships between point
correspondences and line correspondences over three perspective views is
provided by the [...]
frames can be obtained by factorizing a matrix with image measurements into the
product of two matrices of rank 3, one representing motion, and the other one
representing shape (Tomasi & Kanade 1992, Ullman & Basri 1991). Frameworks for
capturing line correspondences over multiple affine views have been presented
by (Quan & Kanade 1997) and for point features under perspective projection by
(Sturm & Triggs 1996).
The subject of this section is to combine the idea behind the trilinear tensor
for simultaneous modelling of point and line correspondences over three views
with the affine projection model. It will be shown how an algebraic structure
closely related to the trilinear tensor can be defined for three centered
affine cameras. This centered affine trifocal tensor involves linear relations
between the point features and trilinear relationships between the line
features.
3.1 Perspective camera and three views
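Consider a point $P = (x, y, 1, \lambda)^T$ which is projected by three camera
matrices $M = [I, 0]$, $M' = [A, u']$ and $M'' = [B, u'']$ to the image points
$p$, $p'$ and $p''$:
\[ p = \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
     = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}
       \begin{pmatrix} x \\ y \\ 1 \\ \lambda \end{pmatrix}, \tag{4} \]
\[ p' = \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix}
      = \begin{pmatrix}
          a^1_1 & a^1_2 & a^1_3 & u^{\prime 1} \\
          a^2_1 & a^2_2 & a^2_3 & u^{\prime 2} \\
          a^3_1 & a^3_2 & a^3_3 & u^{\prime 3}
        \end{pmatrix}
        \begin{pmatrix} x \\ y \\ 1 \\ \lambda \end{pmatrix}
      = \begin{pmatrix}
          a^{1T} p + \lambda u^{\prime 1} \\
          a^{2T} p + \lambda u^{\prime 2} \\
          a^{3T} p + \lambda u^{\prime 3}
        \end{pmatrix}, \tag{5} \]
\[ p'' = \begin{pmatrix} x'' \\ y'' \\ 1 \end{pmatrix}
       = \begin{pmatrix}
           b^1_1 & b^1_2 & b^1_3 & u^{\prime\prime 1} \\
           b^2_1 & b^2_2 & b^2_3 & u^{\prime\prime 2} \\
           b^3_1 & b^3_2 & b^3_3 & u^{\prime\prime 3}
         \end{pmatrix}
         \begin{pmatrix} x \\ y \\ 1 \\ \lambda \end{pmatrix}
       = \begin{pmatrix}
           b^{1T} p + \lambda u^{\prime\prime 1} \\
           b^{2T} p + \lambda u^{\prime\prime 2} \\
           b^{3T} p + \lambda u^{\prime\prime 3}
         \end{pmatrix}. \tag{6} \]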
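Following (Faugeras & Mourrain 1995) and (Shashua 1997), let us introduce the
following two matrices
\[ r^{\mu}_j = \begin{pmatrix} -1 & 0 & x' \\ 0 & -1 & y' \end{pmatrix}, \qquad
   s^{\nu}_k = \begin{pmatrix} -1 & 0 & x'' \\ 0 & -1 & y'' \end{pmatrix}. \tag{7} \]
Then, in terms of tensor notation (where $i, j, k \in [1, 3]$,
$\mu, \nu \in [1, 2]$ and we throughout follow the Einstein summation
convention that a double occurrence of an index implies summation over that
index) the relations between the image coordinates and the camera geometry can
be written
\[ \lambda \, r^{\mu}_j u^{\prime j} + r^{\mu}_j a^j_i p^i = 0, \qquad
   \lambda \, s^{\nu}_k u^{\prime\prime k} + s^{\nu}_k b^k_i p^i = 0. \tag{8} \]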
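By introducing the trifocal tensor (Shashua 1995, Hartley 1995)
\[ T_i^{jk} = a^j_i u^{\prime\prime k} - b^k_i u^{\prime j}, \tag{9} \]
and eliminating $\lambda$ from the two relations in (8), we obtain
\[ r^{\mu}_j s^{\nu}_k T_i^{jk} p^i = 0. \tag{10} \]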
Written out explicitly, this expression corresponds to the following four
relations between the projections $p$, $p'$ and $p''$ of $P$ (Shashua 1997):
\[ \begin{aligned}
x'' T_i^{13} p^i - x'' x' T_i^{33} p^i + x' T_i^{31} p^i - T_i^{11} p^i &= 0, \\
y'' T_i^{13} p^i - y'' x' T_i^{33} p^i + x' T_i^{32} p^i - T_i^{12} p^i &= 0, \\
x'' T_i^{23} p^i - x'' y' T_i^{33} p^i + y' T_i^{31} p^i - T_i^{21} p^i &= 0, \\
y'' T_i^{23} p^i - y'' y' T_i^{33} p^i + y' T_i^{32} p^i - T_i^{22} p^i &= 0.
\end{aligned} \tag{11} \]
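The point relations can be checked numerically. The following sketch builds the tensor from the definition $T_i^{jk} = a^j_i u''^k - b^k_i u'^j$ for two arbitrarily chosen cameras $[A, u']$ and $[B, u'']$, projects one point $P = (x, y, 1, \lambda)^T$ into the three views, and evaluates the four contractions $r^\mu_j s^\nu_k T_i^{jk} p^i$. All numerical values and function names are hypothetical; the code is only meant as a consistency check of the algebra, not as an estimation method.

```python
def trifocal_tensor(A, u1, B, u2):
    """T[i][j][k] = a^j_i u''^k - b^k_i u'^j with 0-based indices,
    where A[j][i] = a^j_i and B[k][i] = b^k_i."""
    return [[[A[j][i] * u2[k] - B[k][i] * u1[j] for k in range(3)]
             for j in range(3)] for i in range(3)]

def project(M, u, p, lam):
    """Image point (x, y, 1) of P = (p, lam) under the camera [M, u]."""
    w = [sum(M[r][c] * p[c] for c in range(3)) + lam * u[r] for r in range(3)]
    return [w[0] / w[2], w[1] / w[2], 1.0]

def point_residuals(T, p, p1, p2):
    """The four values r^mu_j s^nu_k T_i^{jk} p^i for mu, nu in {1, 2}."""
    r = [[-1.0, 0.0, p1[0]], [0.0, -1.0, p1[1]]]
    s = [[-1.0, 0.0, p2[0]], [0.0, -1.0, p2[1]]]
    return [sum(r[m][j] * s[n][k] * T[i][j][k] * p[i]
                for i in range(3) for j in range(3) for k in range(3))
            for m in range(2) for n in range(2)]

# Arbitrary illustrative cameras (assumed values, not from the paper).
A = [[0.9, -0.1, 0.2], [0.1, 0.95, -0.3], [0.05, 0.02, 1.0]]
u1 = [0.5, -0.2, 0.1]
B = [[0.8, 0.2, -0.1], [-0.2, 0.9, 0.3], [-0.03, 0.04, 1.0]]
u2 = [-0.4, 0.3, 0.2]

# One point P = (x, y, 1, lam) and its three projections.
p, lam = [0.3, -0.2, 1.0], 0.7
p1 = project(A, u1, p, lam)
p2 = project(B, u2, p, lam)

T = trifocal_tensor(A, u1, B, u2)
residuals = point_residuals(T, p, p1, p2)  # all four should vanish
```

Up to an overall sign, the four residuals are the left-hand sides of the four point relations, so they should be zero to within rounding error for any consistent triple of projections.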
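Given three corresponding lines, $l^T p = 0$, $l^{\prime T} p' = 0$ and
$l^{\prime\prime T} p'' = 0$, each image line defines a plane through the
center of projection, given by $L^T P = 0$, $L^{\prime T} P = 0$ and
$L^{\prime\prime T} P = 0$, where
\[ \begin{aligned}
L^T &= l^T M = (l_1, \; l_2, \; l_3, \; 0), \\
L^{\prime T} &= l^{\prime T} M'
  = (l'_j a^j_1, \; l'_j a^j_2, \; l'_j a^j_3, \; l'_j u^{\prime j}), \\
L^{\prime\prime T} &= l^{\prime\prime T} M''
  = (l''_k b^k_1, \; l''_k b^k_2, \; l''_k b^k_3, \; l''_k u^{\prime\prime k}).
\end{aligned} \tag{12} \]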
Since $l$, $l'$ and $l''$ are assumed to be projections of the same
three-dimensional line, the intersection of the planes $L$, $L'$ and $L''$ must
degenerate to a line and
\[ \operatorname{rank} \begin{pmatrix}
l_1 & l'_j a^j_1 & l''_k b^k_1 \\
l_2 & l'_j a^j_2 & l''_k b^k_2 \\
l_3 & l'_j a^j_3 & l''_k b^k_3 \\
0 & l'_j u^{\prime j} & l''_k u^{\prime\prime k}
\end{pmatrix} = 2. \tag{13} \]
All $3 \times 3$ minors must be zero, and removing each of the first three rows
in turn leads to the following trilinear relationships, out of which two are
independent:
\[ \begin{aligned}
(l_2 T_3^{jk} - l_3 T_2^{jk}) \, l'_j l''_k &= 0, \\
(l_1 T_3^{jk} - l_3 T_1^{jk}) \, l'_j l''_k &= 0, \\
(l_1 T_2^{jk} - l_2 T_1^{jk}) \, l'_j l''_k &= 0.
\end{aligned} \tag{14} \]
These expressions provide a compact characterization of the trilinear line
relations first introduced by (Spetsakis & Aloimonos 1990).
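The line relations can be checked in the same spirit as the point relations. The sketch below takes two 3-D points spanning a line, projects them with $M = [I, 0]$, $M' = [A, u']$ and $M'' = [B, u'']$, forms the image lines as cross products of the projected points, and evaluates the three expressions in (14). The cameras, the points and all function names are assumptions made for this example.

```python
def trifocal_tensor(A, u1, B, u2):
    """T[i][j][k] = a^j_i u''^k - b^k_i u'^j with 0-based indices."""
    return [[[A[j][i] * u2[k] - B[k][i] * u1[j] for k in range(3)]
             for j in range(3)] for i in range(3)]

def cross(a, b):
    """Cross product; gives the homogeneous line through two image points."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def project(M, u, P):
    """Homogeneous image point of P = (X, Y, Z, W) under the camera [M, u]."""
    return [sum(M[r][c] * P[c] for c in range(3)) + u[r] * P[3]
            for r in range(3)]

def line_residuals(T, l, l1, l2):
    """Left-hand sides of the three relations in (14)."""
    # t[i] = T_i^{jk} l'_j l''_k
    t = [sum(T[i][j][k] * l1[j] * l2[k] for j in range(3) for k in range(3))
         for i in range(3)]
    return [l[1] * t[2] - l[2] * t[1],
            l[0] * t[2] - l[2] * t[0],
            l[0] * t[1] - l[1] * t[0]]

# Arbitrary illustrative cameras and 3-D points (assumed values).
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
zero = [0.0, 0.0, 0.0]
A = [[0.9, -0.1, 0.2], [0.1, 0.95, -0.3], [0.05, 0.02, 1.0]]
u1 = [0.5, -0.2, 0.1]
B = [[0.8, 0.2, -0.1], [-0.2, 0.9, 0.3], [-0.03, 0.04, 1.0]]
u2 = [-0.4, 0.3, 0.2]
Pa = [0.3, -0.2, 1.0, 0.7]
Pb = [-0.5, 0.4, 1.0, 0.2]

# Image lines through the projections of Pa and Pb in each view.
l = cross(project(I3, zero, Pa), project(I3, zero, Pb))
l1 = cross(project(A, u1, Pa), project(A, u1, Pb))
l2 = cross(project(B, u2, Pa), project(B, u2, Pb))

T = trifocal_tensor(A, u1, B, u2)
residuals = line_residuals(T, l, l1, l2)  # all three should vanish
```

The three residuals are, up to sign, the components of the cross product of $l$ with the vector $T_i^{jk} l'_j l''_k$, which is one way to see why only two of the three relations are independent.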
In summary, each point correspondence gives four equations, and each line
correspondence two. Hence, $K$ points and $L$ lines are (generically)
sufficient to express a linear algorithm for computing the trilinear tensor (up
to scale) if [...]
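Consider next a point $Q = (x, y, \lambda, 1)^T$ which is projected to the
image points $q$, $q'$ and $q''$ by three affine camera matrices $M$, $M'$ and
$M''$, respectively:
\[ q = \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = M Q
     = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
       \begin{pmatrix} x \\ y \\ \lambda \\ 1 \end{pmatrix}, \tag{15} \]
\[ q' = \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = M' Q
      = \begin{pmatrix}
          c^1_1 & c^1_2 & c^1_3 & v^{\prime 1} \\
          c^2_1 & c^2_2 & c^2_3 & v^{\prime 2} \\
          0 & 0 & 0 & 1
        \end{pmatrix}
        \begin{pmatrix} x \\ y \\ \lambda \\ 1 \end{pmatrix}, \tag{16} \]
\[ q'' = \begin{pmatrix} x'' \\ y'' \\ 1 \end{pmatrix} = M'' Q
       = \begin{pmatrix}
           d^1_1 & d^1_2 & d^1_3 & v^{\prime\prime 1} \\
           d^2_1 & d^2_2 & d^2_3 & v^{\prime\prime 2} \\
           0 & 0 & 0 & 1
         \end{pmatrix}
         \begin{pmatrix} x \\ y \\ \lambda \\ 1 \end{pmatrix}. \tag{17} \]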
Here, the parameterization of $Q$ differs from $P$, since for an image point
$q = (x, y, 1)^T$ the projection (15) implies that the three-dimensional point
is on the ray $Q = (x, y, \lambda, 1)^T$ for some $\lambda$. By eliminating
$\lambda$, we obtain the following linear relationships between the image
coordinates of $q$, $q'$ and $q''$:
\[ \begin{aligned}
(c^1_3 d^1_1 - c^1_1 d^1_3) x + (c^1_3 d^1_2 - c^1_2 d^1_3) y
  + d^1_3 x' - c^1_3 x''
  + (c^1_3 v^{\prime\prime 1} - d^1_3 v^{\prime 1}) &= 0, \\
(c^2_3 d^1_1 - c^2_1 d^1_3) x + (c^2_3 d^1_2 - c^2_2 d^1_3) y
  + d^1_3 y' - c^2_3 x''
  + (c^2_3 v^{\prime\prime 1} - d^1_3 v^{\prime 2}) &= 0, \\
(c^1_3 d^2_1 - c^1_1 d^2_3) x + (c^1_3 d^2_2 - c^1_2 d^2_3) y
  + d^2_3 x' - c^1_3 y''
  + (c^1_3 v^{\prime\prime 2} - d^2_3 v^{\prime 1}) &= 0, \\
(c^2_3 d^2_1 - c^2_1 d^2_3) x + (c^2_3 d^2_2 - c^2_2 d^2_3) y
  + d^2_3 y' - c^2_3 y''
  + (c^2_3 v^{\prime\prime 2} - d^2_3 v^{\prime 2}) &= 0.
\end{aligned} \tag{18} \]
This structure corresponds to the trilinear constraint (11) for perspective
projection, and we shall refer to it as the affine trifocal point constraint.
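A small numerical check of the affine trifocal point constraint may be helpful. The sketch below represents the second and third affine cameras by their upper two rows, written here as a 2x3 matrix plus a translation (`C`, `v1` and `D`, `v2`), projects one point $Q = (x, y, \lambda, 1)^T$, and evaluates the four linear expressions obtained by eliminating $\lambda$ between each coordinate of $q'$ and each coordinate of $q''$. All values and names are hypothetical.

```python
def affine_project(C, v, X):
    """Image coordinates of the 3-D point X = (x, y, lam) under the affine
    camera whose upper two rows are [C, v]."""
    return [sum(C[r][c] * X[c] for c in range(3)) + v[r] for r in range(2)]

def affine_point_residuals(C, v1, D, v2, x, y, q1, q2):
    """Left-hand sides of the four affine point relations.

    Index a selects the coordinate of q' (x' or y'), index b the coordinate
    of q'' (x'' or y''); the order is (x', x''), (y', x''), (x', y''),
    (y', y'').
    """
    res = []
    for b in range(2):
        for a in range(2):
            res.append((C[a][2] * D[b][0] - C[a][0] * D[b][2]) * x
                       + (C[a][2] * D[b][1] - C[a][1] * D[b][2]) * y
                       + D[b][2] * q1[a] - C[a][2] * q2[b]
                       + (C[a][2] * v2[b] - D[b][2] * v1[a]))
    return res

# Arbitrary illustrative affine cameras (assumed values).
C = [[0.9, -0.2, 0.4], [0.1, 0.8, -0.5]]
v1 = [0.3, -0.1]
D = [[0.7, 0.3, -0.6], [-0.2, 0.9, 0.2]]
v2 = [-0.4, 0.25]

# One point on the ray through q = (x, y, 1), at depth parameter lam.
x, y, lam = 0.5, -0.3, 1.7
q1 = affine_project(C, v1, [x, y, lam])
q2 = affine_project(D, v2, [x, y, lam])

residuals = affine_point_residuals(C, v1, D, v2, x, y, q1, q2)
```

Each expression is linear in the image coordinates, in contrast to the perspective case, where products such as $x' x''$ appear.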
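Three lines $l^T q = 0$, $l^{\prime T} q' = 0$ and $l^{\prime\prime T} q'' = 0$
in the three images define three planes $L^T Q = 0$, $L^{\prime T} Q = 0$ and
$L^{\prime\prime T} Q = 0$ in three-dimensional space with
\[ \begin{aligned}
L^T &= l^T M = (l_1, \; l_2, \; 0, \; l_3), \\
L^{\prime T} &= l^{\prime T} M'
  = (l'_1 c^1_1 + l'_2 c^2_1, \;\; l'_1 c^1_2 + l'_2 c^2_2, \;\;
     l'_1 c^1_3 + l'_2 c^2_3, \;\;
     l'_1 v^{\prime 1} + l'_2 v^{\prime 2} + l'_3), \\
L^{\prime\prime T} &= l^{\prime\prime T} M''
  = (l''_1 d^1_1 + l''_2 d^2_1, \;\; l''_1 d^1_2 + l''_2 d^2_2, \;\;
     l''_1 d^1_3 + l''_2 d^2_3, \;\;
     l''_1 v^{\prime\prime 1} + l''_2 v^{\prime\prime 2} + l''_3).
\end{aligned} \]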
Since $l$, $l'$ and $l''$ are projections of the same three-dimensional line,
the intersection of $L$, $L'$ and $L''$ must degenerate to a line and
\[ \operatorname{rank} \begin{pmatrix}
l_1 & l'_1 c^1_1 + l'_2 c^2_1 & l''_1 d^1_1 + l''_2 d^2_1 \\
l_2 & l'_1 c^1_2 + l'_2 c^2_2 & l''_1 d^1_2 + l''_2 d^2_2 \\
0 & l'_1 c^1_3 + l'_2 c^2_3 & l''_1 d^1_3 + l''_2 d^2_3 \\
l_3 & l'_1 v^{\prime 1} + l'_2 v^{\prime 2} + l'_3
    & l''_1 v^{\prime\prime 1} + l''_2 v^{\prime\prime 2} + l''_3
\end{pmatrix} = 2. \tag{19} \]
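The rank condition for affine cameras can also be verified on a synthetic example. The sketch below projects two 3-D points with three affine camera matrices, forms the image lines, back-projects them to planes $L = M^T l$, stacks the planes as columns of a 4x3 matrix, and checks that all four 3x3 minors vanish, which shows that the rank is at most 2. The cameras, the points and the helper names are assumptions of this example.

```python
def cross(a, b):
    """Cross product; gives the homogeneous line through two image points."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def project(M, Q):
    """Image point M Q for a 3x4 camera matrix M and Q = (X, Y, Z, 1)."""
    return [sum(M[r][c] * Q[c] for c in range(4)) for r in range(3)]

def plane(M, l):
    """Plane M^T l obtained by back-projecting the image line l."""
    return [sum(M[r][c] * l[r] for r in range(3)) for c in range(4)]

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Arbitrary illustrative affine cameras (assumed values).
M0 = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
M1 = [[0.9, -0.2, 0.4, 0.3], [0.1, 0.8, -0.5, -0.1], [0.0, 0.0, 0.0, 1.0]]
M2 = [[0.7, 0.3, -0.6, -0.4], [-0.2, 0.9, 0.2, 0.25], [0.0, 0.0, 0.0, 1.0]]
cameras = [M0, M1, M2]

# Two points spanning a 3-D line.
Qa = [0.5, -0.3, 1.7, 1.0]
Qb = [-0.4, 0.6, 0.9, 1.0]

lines = [cross(project(M, Qa), project(M, Qb)) for M in cameras]
planes = [plane(M, l) for M, l in zip(cameras, lines)]

# 4x3 matrix with the planes L, L', L'' as columns.
rows = [[planes[c][r] for c in range(3)] for r in range(4)]

# The four 3x3 minors, obtained by deleting one row at a time.
minors = [det3([rows[r] for r in range(4) if r != skip]) for skip in range(4)]
```

Each back-projected plane contains both 3-D points, so the three planes share a common line and the minors should be zero to within rounding error.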