VETENSKAP OCH KONST
KUNGL TEKNISKA HÖGSKOLAN
Swedish Institute of Computer Science
Toward Human-Robot Collaboration
Kristian T. Simsarian
Dissertation, March 2000
Kungl Tekniska Högskolan, March 2000
© Kristian T. Simsarian 2000
NADA, KTH, 100 44 Stockholm
ISRN KTH/NA/P--00/04--SE
ISSN 1101-2250: TRITA-NA-P00/04
ISBN 91-7170-544-9
SICS, Box 1263, 164 29 Kista
ISSN 1101-1335: ISRN SICS-D--28--SE
KTH Högskoletryckeriet
Abstract
Recently robots have been launched as tour-guides in museums, as lawnmowers, as in-home vacuum cleaners, and as remotely operated machines in so-called distant, dangerous and dirty applications. While the methods to endow robots with a degree of autonomy have been a strong research focus, the methods for human-machine control have not been given as much attention. As autonomous robots become more ubiquitous, the methods we use to communicate task specifications to them become more crucial. This thesis presents a methodology and a system for the supervisory collaborative control of a remote semi-autonomous mobile robot. The presentation centers on three main aspects of the work and offers a description of the system and the motivations behind the design. The supervisory system for human specification of robot tasks is based on a Collaborative Virtual Environment (CVE) which provides an effective framework for scalable robot autonomy, interaction and environment visualization. The system affords the specification of deictic commands to the semi-autonomous robot via the spatial CVE interface. Spatial commands can be specified in a manner that takes into account some specific everyday notions of collaborative task activity. Visualization of the remote environment is accomplished by combining the virtual model of the remote environment with video from the robot camera. Finally, the system underwent a study with users that explored design and interaction issues within the context of performing a remote search task. Examples of study issues center on the presentation of the CVE, understanding robot competence, presence, control and interaction. One goal of the system presented in the thesis is to provide a direction in human-machine interaction from a form of direct control to an instance of human-machine collaboration.
Acknowledgments
This work has been a long time in the making. This list is representative of that time and of the people I met along the way. For all their help in direct and indirect ways I wish to express my sincere appreciation to the following people:
To Jan-Olof Eklundh and later Henrik Christensen, who enabled me to perform this work by providing guidance and resources. Without these two, and the final deadlines, these words would not be on this page right now. Their patience and rigor have been invaluable over the years. Their contribution here is manifest.
To Tom Olson, my first advisor, and N. "Nandhu" Nandhukumar, who led me through some rigorous research that became my Master's thesis, and to Chris Koeritz, who supplied the office stimulation and distraction necessary to jump that first hurdle. To Maja Mataric for being a research colleague and friend, from whom I drew much research inspiration, and from whom I learned to refine my research writing skills. To Rod Brooks, without whose example and encouragement I might have left robotics research years ago. To Luc Steels for encouraging me to pursue ideas about robot behaviors and human interaction. To Tom Mitchell for being encouraging and for posing the "fetching" challenge to robotics researchers, and in general to the organizers of the NATO ASI Biology and Technology of Autonomous Agents in 1993. It was that event that changed the course my life took for the next eight years.
To Michael Lebowitz, Peter Allen, and Julian Hochberg at Columbia University, who started me on my research career with research assistantships and projects that taught me, by example, that research or researchers do not have to be stuffy or boring. To Mike Gorman, Bernie Carlson, and the "Repo research team" at UVa, who showed me that researchers can be nutty, in a fun way. To Peter Hindle and Mr. Haas, my high school and elementary school Mathematics teachers, whose influence carries strongly into the work I do today. To Bryce Lambert, my high-school English teacher, who took his knife and cut the word "very" from my essays and taught me that "interesting" was a meaningless word; your voice has been in my head as I wrote this.
To Amber Settle and Mark Fasciano, who supplied me with good friendship, good food and sane voices when things were looking bleak. I thank them for their direct and indirect help in finding a healthy direction for myself and my research. To Barry Guiduli, who has been a good and constant friend, always challenging me intellectually and physically.
To Anne Smith, who encouraged me to move ahead, not just in words, but by example. To my friends Anne Wright and Suelette Dreyfus, both of whom now have Ph.D.s, and who have tried to challenge, poke, probe, goad and encourage me into finishing; I thank you for that. To Robert Eklund, who has provided much intellectual stimulation over the years and who has not yet finished his Ph.D., and to whom I hope to provide a similar example.
To Carl Feynman, who showed me how to break problems down and perform mid-tech experiments, resulting in my learning how simple research can be. To Chris Csikszentmihalyi for his technical and intellectual stimulation, artful inspirations over the years, and friendship, helping my work to be better and more orthogonal than it might have been otherwise. To Ben Bederson and Allison Druin, who have supplied commentary and encouragement throughout the years and who together with Susan
To Jan Seide for discussions about relativism, reductionism and functionalism and for providing references over the years to the debates in physics and science (and for all the music). To Julio Fernandez for being a good co-teacher in behavioral robotics and for providing me with an enriching teaching experience in Spain. To Phil Agre for being the intellectual that he is, breaking ground that helped me to realize there are many paths one can take, and that one should trust one's own intuition.
To Lennart Fahlen, who has tried his best to help me finish and provide me with resources despite competing interests, and to Janusz Launberg, who has been active in finding resources to enable this work.
To John Bowers for providing an intellectual mentorship, an example of someone who can intellectually and rigorously pursue what they enjoy and be darn good at it, as well as for reading thesis drafts and offering comments that provoke and validate. To Yngve Sundblad, Tom Rodden, and Jon O'Brien for reading versions of my thesis and providing me with commentary and support I might not otherwise receive within the CSCW discipline. Also I thank these three for giving me the encouragement to finish.
To Tomas Uhlin, the only person I knew before coming to Sweden, who encouraged me to start working on my Ph.D. here and who provided encouragement throughout the years. To Lars Bretzner and Peter Nordlund, who have provided friendship and examples of good research work, and especially Lars for helping guide me through the bureaucratic process of finishing. To all the people at CVAP-CAS who have helped in seemingly small ways over the years: letting me in the door, helping to carry the robot, lending me a cable, or some advice; that help was invaluable.
To Martin Nilsson, who originally brought me to Sweden, and who has directly and indirectly supported this work. Without Martin, this thesis would certainly not exist, I would not have spent time in Sweden, and I would not have had this fine robot to do experiments on. To Per Kreuger, who introduced me to Yoga and provided a generally positive orthogonal influence, and to Alf, who tried to help in his own way. To Sebastian Thrun for his advice and for his work with the Rhino system, which became the low-level software platform for the robot in these experiments. To Jussi Karlgren, Ivan Bretan, et al. for their work on the DIVERSE project and for providing stimulating discussions over the years. To Tomas Axling and Scott McGlashan for their work on the next version of the speech system, and to Tomas for his help in constructing an interface to the robot.
To the administration at SICS for helping with all details along the way, and especially to Lotta Jorsatter, who was able to get the robot in and out of the country a number of times, and to Marianne Rosenqvist, who has helped at a number of important times.
To Kristina Höök, Lars Oestreicher, and Kerstin Severinson-Eklundh for helping to form the ideas that became the Study of Use in this thesis. As well as to all the colorful people who participated in the study; without their vivid and candid remarks there would not have been much to write about in chapters 5 and 6.
To the entire ICE lab for all their help and support over the years. Certainly without dive this system would not have been built. To Karl-Peter Åkesson, especially, for his exjobb on the Reality Portal system in this system and his co-authorship on a series of papers upon which some of this work is based. To Emmanuel Frecon for some of the excellent images created for papers, a few of them re-appearing in this thesis. To Olov Stahl for constructing and working on the video stream within dive. To Marten Stenius and Daniel Adler for helping in the CeBit demonstration in 1996. As well as to Par Hansson and Anders Walberg for doing fine work in the Grotto with user interfaces and helping to improve a number of dive features used in the construction of this system. To Lasse Nilsander and Peter Elin for being responsive to technical problems encountered along the way.
To Dodi, whose energy and help were valuable. To Åsa Rex, who provided good conversation and helped to make my initial time here a lot more fun.
To Jerry Garcia, Robert Johnson and other 'friends', as well as to the Ashland and Larrivee guitar makers, for providing inspiration and a respite from the technical.
As is the way when one does their best work, many of the above are both friends and colleagues. Even more significant is the contribution of my family. To J.R. Simsarian, my father, for awakening my technical interests at a very young age and for helping me out financially when I needed it during my educational journey. To Carol and Gordon Loughlin for providing me an asylum from all the craziness, and giving me love, sunshine and happy times.
Most of all, my appreciation and love goes to my mother, Astrid Tollefsen, for endowing me with initiative and the quality of not being afraid to work hard, and for all her support and love over the years. Certainly I would not be at this stage without her.
Contents

1 Introduction
1.1 Objectives
1.2 Approach
1.3 Contributions of this Thesis
1.4 Structure of the Thesis

2 Background and Related Work
2.1 Introduction
2.2 Genesis of the System
2.3 Mobile Robotics
2.3.1 Applications of autonomous and semi-autonomous systems
2.4 Presence
2.5 Virtual Environments
2.5.1 Enhancing environment visualization: Augmented Virtuality and Augmented Reality
2.5.2 3D TV and virtuality
2.6 Software Agents and Robot Assistants
2.6.1 Derivative software agents
2.6.2 Interface agent debate
2.6.3 Attributes vs. Endowment
2.6.4 Differentiating robots and software agents
2.7 Computer Supported Collaborative Work
2.8 Robot User Interfaces
2.8.1 Model-based supervisory control
2.8.2 Teleassistance
2.8.3 MissionLab Toolset
2.8.4 Multi-Agent Supervisory Control
2.9 Emergence of Service Robots

3 Approach: Human-Robot Collaboration
3.1 Introduction
3.3 Modes of Human-Robot Task Interaction
3.4 Channels for Human-Robot Interaction
3.4.1 Visualizing the environment
3.4.2 Interacting with the robot
3.5 A System to Support Human-Robot Collaboration
3.5.1 Supervisory collaborative framework
3.5.2 Environment visualization
3.5.3 Interactive control
3.6 Technical work and evaluation

4 Method: The Human-Robot System
4.1 Introduction
4.2 Supervisory Collaborative Framework
4.2.1 CVE platform and model
4.2.2 Virtual Robot Agent
4.2.3 Robot semi-autonomy
4.3 Environment Visualization
4.3.1 Video in the CVE
4.3.2 Reality Portals
4.4 Command Interaction
4.4.1 Direct and deictic interaction
4.4.2 Spatial model
4.4.3 Speech interface
4.5 Demonstrations and Study of Use
4.5.1 CeBit 1996
4.5.2 Ad hoc demonstrations
4.5.3 Study of Use
4.6 Summary

5 Study of Use: Human-Robot System
5.1 Finding Remote Flags: a Study of Use
5.2 Description of Study
5.2.1 The Task
5.2.2 Task interaction
5.2.3 Technical setup
5.2.4 The system functionality in the study
5.2.5 The Participants
5.2.6 Research questions
5.2.7 Methodology
5.2.8 Procedure
5.3 Report of Study
5.3.1 Discussion category overview

6 Study Findings
6.1 Study Discussion
6.2 Design Suggestions
6.3 Development During Study
6.4 Study Implications on Design
6.4.1 Task-dependent system features
6.4.2 CVE issues
6.4.3 Environment visualization issues
6.4.4 Trust, Presence, feedback
6.4.5 Relationship and Division of Labor
6.4.6 Control issues
6.4.7 Applications suggested
6.5 Summary

7 Conclusion
7.1 Supervisor-Assistant Control Framework
7.1.1 CVE design
7.1.2 Semi-autonomy design
7.2 Environment Visualization
7.3 Robot Control and Interaction
7.4 Experience of Building a 'User' System
7.5 Study of Use
Introduction
WE are in an age where the number of autonomous machines around us is increasing. Though this has been true for many decades, it is now becoming true with respect to the general citizen. This marks a shift of deployment from the laboratory and the specialized user to the non-expert. This is also true in the sense of who is specifying the robot tasks. Recently we have seen the launch of mobile robots as tour-guides in museums, as lawnmowers, as home vacuum cleaners, as home-care aids, as intelligent play toys, as remotely operated machines through the WWW, and generally as telerobots in so-called distant, dangerous and dirty applications. One can reasonably expect this trend to continue. Many of the capabilities of these autonomous robots can be seen as the direct result of basic research within the robot research community. While the methods to endow robots with some degree of autonomy have been a strong focus of research, the methods for human-machine control have not been given as much attention. Rarely are methods for human-robot interaction the focus of mobile robot research. With this increased presence of robots comes a need for more exploration in the way the interaction takes place between humans and autonomous machines. As autonomous robots become more common, the methods we use to communicate task specifications to them become more crucial. If we can assume that in the end it is the machines that will do our bidding, then the methods for the communication of our goals and the methods for promoting user understanding of a robot and its environment must take a more central role.
For many of these tasks there is a need for the human user to be aware of the robot's remote environment and of the robot's capabilities. Such awareness includes being able to visualize the robot's situation and environment, and once provided with this awareness, having the methods for instantiating commands for a robot to perform. Underlying this awareness and the control methods is an infrastructure framework within which the interaction itself takes place.
Figure 1.1: This figure shows the three main parts that make up the complete system: the human user, the Collaborative Virtual Environment, and the remote robot. It also depicts the CVE as the medium of communication between the human and the robot.
This framework generally consists of the metaphors employed and the technical implementation that supports the control and task specification of the autonomous system.
1.1 Objectives
This thesis looks at some particular solutions: how a shared representation of a robot environment can be provided to a user, the methods by which supervisory control can be specified by a human supervisor to a robot assistant, and the guiding metaphors and structure in which this interaction takes place. In particular, the thesis presents the technical and methodological aspects of constructing a spatial interface for remote robot control and its supporting framework. The framework consists of an infrastructure and a guiding metaphor. The infrastructure is composed of a Collaborative Virtual Environment (CVE) and a semi-autonomous robot, and the guiding metaphor is that of supervisory control. Under supervisory control, a human operator specifies high-level tasks which the robot then carries out with a degree of autonomy. The methods for robot environment awareness are seen here as methods for environment visualization, and the methods for control are the different interaction mechanisms for communicating task commands to the robot. For the interaction, an approach employing Collaborative Virtual Environments is used in combination with a deictic speech and mouse-based gesture interface. For the visualization of the remote robot working environment, a number of solutions are implemented, progressing toward a blend of real video textures from the remote site and the 3D graphic model of the working environment, creating an instance of Augmented Virtuality.
Thus, the practical problem in this research is how a human can communicate spatio-temporal task specifications to a robot, along with the principles and techniques that help guide this interaction. In response to that problem, this work: identifies the central interface elements of this interaction that need attention, implements and demonstrates the prototypes, and performs a design study with users to look for actionable insights into the construction of human-robot systems.
1.2 Approach
Key problems of interest in human-robot interaction lie in visualization of the remote environment, multi-modal specification of spatio-temporal tasks, and in the framework for the robot-human relationship. Work in the robot-human communication area seeks to improve robot utility, ease of expressing tasks, the flexibility of the communication, and to generally better the understanding of how to build robot-human interfaces.
The guiding metaphor for interaction between the operator and robot is to consider the robot as an assistant. It is this concept that drives the conceptualization of semi-autonomy, both within the robot sub-system as well as when viewed from the outside by the operator. The operator and the robot together form a supervisor-assistant relationship, and it is through this partnership that they form a collaboration with respect to a given task. The target categories of tasks for this system are 'point-to-point' navigation, 'go-and-look' search and 'pick-and-place' manipulation. Although the system comprises several sub-systems, the desired effect is that the system be viewed as a medium for human-machine interaction, where the details of the sub-systems unify to implement a system centered around the activity of task specification and interaction. That is to say, interaction is with the task, not just the robot.
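The three target task categories can be pictured as a small command vocabulary. The sketch below is illustrative only; the type names and fields are assumptions for this illustration, not the interface of the thesis system.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Hypothetical command types mirroring the three target task categories.

@dataclass
class PointToPoint:
    """'point-to-point' navigation to a position in the environment model."""
    goal: Tuple[float, float]

@dataclass
class GoAndLook:
    """'go-and-look' search of a named region."""
    region: str

@dataclass
class PickAndPlace:
    """'pick-and-place' manipulation: fetch obj from source to destination."""
    obj: str
    source: str
    destination: str

Command = Union[PointToPoint, GoAndLook, PickAndPlace]

def describe(cmd: Command) -> str:
    """Render a command as the kind of utterance a supervisor might give."""
    if isinstance(cmd, PointToPoint):
        return f"go to {cmd.goal}"
    if isinstance(cmd, GoAndLook):
        return f"search {cmd.region}"
    return f"take {cmd.obj} from {cmd.source} to {cmd.destination}"
```

For example, `describe(PickAndPlace("cup", "kitchen", "desk"))` yields "take cup from kitchen to desk", the fetching pattern this system targets.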
The user interface enables communication between the robot and the human operator. This is done through visual and spatial interaction, complemented by gesture and speech. The framework for the interface is based on a Collaborative Virtual Environment (CVE) that provides a medium for spatial interaction, with gestural commands (e.g. via mouse or other device) as well as speech as the main input between the robot and the human. Most task specifications are spatial and take advantage of the spatial nature of the 3D interface and robot environment. One of the underlying theories in this work is that within the 3D environment, spatial commands can be specified in a way that builds on specific everyday notions of collaborative communication and task specification.
A goal of exploring different techniques for environment visualization is to provide the operator with the ability to explore the model and video of the remote environment without the constraint of robot movement limitations (e.g. from direct manipulation of the robot) or from being limited by the camera field of view. These constitute a set of spatial-temporal problems which find partial solutions in the work here. These solutions are methods for storing the remote images in the 3D model. Specifically, an augmented 3D virtual environment is created which contains real-world images as object textures and allows the operator to explore a virtual representation of the real space. The advantage of using this 3D graphical environment is that it can be manipulated in a way not subject to the temporal, spatial, and physical constraints of the real world. It also has the advantage that selected parts of the real world that may be irrelevant to a particular task can be omitted from the rendering. Thus the parts of the world of interest can be extracted and made salient. A subjective video-augmented graphical view of the robot environment can be created, allowing the operator to work with the elements of the robot environment that are relevant to the current task. Different methods for this visualization are explored and presented in this thesis and then explored further in the design study.
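Pasting real video onto model objects as textures requires knowing where model geometry lands in the camera image. As a minimal illustration of the geometry involved, the sketch below projects model points with a standard pinhole camera model; the function names, focal length and image center are assumptions for illustration, not values or code from the thesis system.

```python
def project_point(p, f=500.0, cx=320.0, cy=240.0):
    """Pinhole projection of a camera-frame point (x right, y up,
    z forward) into pixel coordinates (u, v)."""
    x, y, z = p
    if z <= 0:
        return None                 # point is behind the camera
    u = cx + f * (x / z)            # horizontal pixel coordinate
    v = cy - f * (y / z)            # image v grows downward
    return (u, v)

def face_to_pixels(corners, **cam):
    """Project a model face's 3D corners; the enclosed image region is
    what would be cropped and pasted onto the face as a texture."""
    return [project_point(c, **cam) for c in corners]
```

Given the robot camera's pose in the shared model, each visible face of a modeled object can be mapped to an image region this way, and the cropped pixels applied as that face's texture.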
1.3 Contributions of this Thesis
The main contribution of this thesis is embodied in the form of a system implementation. In addition to the components that compose this implementation is a study of use of the system with a number of users. These contributions can be summarized as the following:

- Framework for Human-Robot collaboration;
- Techniques for remote environment visualization;
- Human-robot multimodal deictic task specification;
- Robot semi-autonomy;
- Demonstrations and study of use.
Framework for Human-Robot Collaboration
The fundamental element of the technical work in this thesis is the framework for the human-robot collaboration system. This is an infrastructure consisting of a Collaborative Virtual Environment and three-dimensional model of the remote environment, a remote robot, a set of communication and display sub-systems, and the physical set-up of the environment of use.
Techniques for Remote Environment Visualization
A significant element of a human-robot system is the ability to visualize the remote environment. This ability is embodied by a set of systems that allow remotely sensed information, e.g. video, to be displayed in the environment. This thesis reports the different methods of visualization that have been employed in the system, how they have been used and how they are constructed.

Visualization of the remote environment is accomplished by unifying the graphical model of the robot environment with video from the robot camera. The system combines 3D graphical environments with live video and images generated from the robot's exploration of the remote location. Different techniques for this combination are presented. These techniques differ in their method of incorporating the remote robot video into the 3D environment. The technical basis for the visualization methods is grounded in Augmented Reality (AR) and Augmented Virtuality (AV). This visualization of the remote environment forms an important part of the human-robot interaction.
Human-Robot Multimodal Deictic Task Specification
The human-robot collaborative framework provides mechanisms for pointing and interacting with the three-dimensional environment. The system for pointing in the world has been augmented with a speech control system that together enables deictic task specification. The speech and pointing systems contain a grammar of commands and a model of interaction that enable reference to objects in the environment, and transitively, the real world. Because the three-dimensional model is representative of the real-world environment, selecting objects in the graphical environment can also be seen as a method for selecting objects in the remote robot environment.
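One way to picture how speech and pointing combine is to let a demonstrative in the utterance ("that", "there") be bound to whatever the pointer currently selects in the 3D model. The sketch below is a hypothetical illustration of that binding; the word list and result layout are assumptions, not the grammar of the thesis system.

```python
from typing import Optional

def resolve_deictic(utterance: str, pointed_object: Optional[str]) -> dict:
    """Bind demonstratives in a spoken command to the object currently
    selected by the operator's pointer in the 3D model."""
    tokens = utterance.lower().split()
    bound = [pointed_object if t in ("this", "that", "there") else t
             for t in tokens]
    if None in bound:
        # a deictic word was spoken but nothing is pointed at
        raise ValueError("deictic reference without a pointed object")
    return {"verb": bound[0], "args": bound[1:]}
```

For instance, saying "pick up that" while pointing at the model object `red-box` resolves to a command on `red-box`, and because the model mirrors the real space, on the corresponding real object.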
Robot semi-autonomy
The system here attempts to move beyond simple human control of a robot by giving the robot basic competence. There is a scale of autonomy. On one end of this scale is full autonomy, where a robot works completely independently. On the other end of this scale is complete operator control, where the robot is simply a remotely operated vehicle. In this system, the concept of semi-autonomy is explored, where the robot and the human cooperate to complete tasks such as moving to pointed-to objects. These capabilities, in combination with real-world range sensing and communications with the other subsystems, form the robot sub-system of semi-autonomy. The human then provides commands that employ these capabilities in a sequence to perform the target tasks of fetching.
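The autonomy scale can be caricatured in a few lines: an arbiter that, depending on the autonomy level, passes through operator commands, defers to the robot's own plan, or mixes the two. All names and the mixing rule below are assumptions for illustration, not the arbitration used in the thesis system.

```python
def next_action(level, operator_cmd, robot_plan):
    """Pick the next action given an autonomy level in [0, 1].

    level 0.0 is pure teleoperation, level 1.0 is full autonomy, and
    anything in between is a simple semi-autonomy rule: the operator's
    command wins when present, otherwise the robot follows its own plan.
    """
    if level == 0.0:
        return operator_cmd          # remotely operated vehicle
    if level == 1.0:
        return robot_plan            # fully independent robot
    return operator_cmd if operator_cmd is not None else robot_plan
```

At intermediate levels the human supplies the high-level sequence while the robot fills in the gaps with its own competences, which is the division of labor this thesis pursues.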
Demonstrations and Study of Use
Real-world demonstrations are important both to disseminate the research ideas embodied in a system as well as to learn and understand more about the system, its shortcomings, its advantages, its use, its potential users, and possible applications. Reported in this thesis are the findings of these studies and demonstrations. Most significant of these is the recent Study of Use. This Study of Use encompassed setting up an environment, task design, and study execution with users employing the system for a particular remote search task. The study collects their reactions and forms a set of implications for system design. The study is primarily composed of the techniques of observation, a survey questionnaire, and a qualitative interview centered on design. In addition to that study, a number of formal and ad hoc demonstrations of the system have taken place over the years since the system was developed.
1.4 Structure of the Thesis
The thesis presentation is organized into the following chapters. The Introduction is this chapter, where the main elements of the thesis are provided along with an outline of the entire thesis. This is followed by a chapter exploring Related Research, where work that forms the research background is discussed. Following that is the Approach chapter, where the motivations and system outline are provided. This is followed by the technical presentation in the Methods chapter. The Study of Use chapter presents the results of an explorative design study on the system, and the Findings chapter summarizes the study's main results. The final chapter is the Conclusion and includes reflections and a summary of the main points.
Background and Related Work
2.1 Introduction
The primary question driving this research has been how to perform a task at a distance by employing a remote robot. The resulting solution is an integration of different research disciplines. The system combines real-world autonomous mobile robotics with a simulated virtual environment as a medium for high-level control. The main application for such a system is the usage of mobile robots in the broad area of hazardous or inaccessible environment exploration and remote assistance, while also considering the emerging area of domestic robots. The work is interdisciplinary by nature and crosses many traditional boundaries. Though this is a strength of the work, it also leaves it vulnerable from the perspective of any one particular field. This chapter is an attempt to provide a structure of the work and place it in context with previous research. The primary research areas are autonomous mobile and telerobot work, Artificial Intelligence, software agents, human-machine interfaces, CSCW, and media space interfaces. Much of this work is fundamentally inter-related, making this partitioning somewhat unnatural. The treatment here, by necessity, separates out these disciplines, while an attempt is made to amend this by making note where work spans and connects traditional research area boundaries. The description will begin at a general level, covering related work in these different research fields, and then move on to the specific work on human-robot interfaces and the studies done to evaluate robot interaction systems.
2.2 Genesis of the System
This thesis work has evolved from the author's previous work in robotics, computer vision and artificial intelligence. Mobile robot self-localization was the topic of a Master's thesis [98, 104]. That work inspired later research on a robot perception system that integrated a number of computationally inexpensive techniques for object and location recognition [97]. At the NATO Advanced Study Institute where that work was presented, Thomas Mitchell made a call to the robotics community for a fetching system that could "Take X from Y and bring it to Z" [77]. It is partly that call that inspired this work on a human-robot system. Originally the idea was to employ a human supervisor to handle the hard problems in Artificial Intelligence. Through this human and robot integration, an integrated functioning system could be created that demonstrates robot capabilities in the context of human-guided use. This work was first reported in 1995 [102] and later refined as a proposal for an assistant to persons with disabilities [101]. The system was further expanded with greater facilities for communication and control by integration with a deictic speech control system, which was reported in a book collection [103]. More exploration was done on the visualization aspects of the system and these results are described in [100]. Later a catalogue of these visualization metaphors was presented in [99] and the most recent presentation in [6]. This thesis is the sum of this work, plus the theoretical underpinnings and the more recent work including a design study with users.
The work has evolved in this direction from a dissatisfaction with the current state of the art, and even the research agenda, of many in the AI community. Many of the hardest problems in AI are those at the highest level, while many of the lower-level problems are finding solutions in mobile robotics. In the last decade, research concentrating on these low-level problems has created different engineering approaches that brought robotics to the stage of obstacle avoidance and simple navigation. However, few results in "top-down" research have progressed to real solutions that enable a mobile robot to make autonomous decisions within a real unstructured setting. Such unsolved high-level problems include deciding which tasks are important to perform next (given virtually unlimited possibilities), or the more specific problem of how to decompose tasks into subtasks. It was this absence of high-level competence that drove this work in the direction of adding a human supervisor. In a supervisor scenario the human becomes responsible for high-level decisions. The robot, then, is responsible for tasks at its given level of competence: e.g. avoiding obstacles, maintaining simple trajectories, grasping and employing simple perceptual processing. Finding the appropriate mode of collaboration is a point of adjustment and agreement, tacit or explicit, between the human and the robot system. The system in this thesis is an example of how such a human-robot system might be constructed to enable expansion of robot competence while maintaining a working system.
Specifically, the intended domain of tasks for the system is centered on fetching: "Take X from Y and bring it to Z". This task breaks down into locating Y, moving to Y, identifying X, acquiring X, locating Z and moving to Z. Thus the human operator's role is to supply the instantiations of these variables. With a human operator the task for the robot is considerably easier. In simple domains, the locomotion, the identification and the grasping become engineering solutions, yet the system as a whole becomes powerful. As the mobile robot becomes capable of higher-level tasks, it is the intention of the system to enable the operator to assume a greater monitoring role. That is, the robot is under the high-level task direction of the human, but the robot can assume more responsibility for the task as its task competence and autonomy increase.
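Read as pseudocode, one reading of "Take X from Y and bring it to Z" is the sketch below, where the human supervisor instantiates the variables and the robot works through the resulting subtask sequence. Here X is taken as the object, Y the source location and Z the destination; the step names are illustrative, not the thesis system's actual vocabulary.

```python
def decompose_fetch(x, y, z):
    """Expand 'Take X from Y and bring it to Z' into ordered subtasks,
    with X the object, Y the source location and Z the destination."""
    return [
        ("locate",   y),   # find the source location
        ("move_to",  y),
        ("identify", x),   # recognize the object to take
        ("acquire",  x),   # grasp it
        ("locate",   z),   # find the destination
        ("move_to",  z),
    ]
```

For example, `decompose_fetch("cup", "kitchen", "desk")` yields the six-step sequence the robot would execute, each step at the robot's given level of competence while the human monitors.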
The subsequent sections present related work in the different research domains related to this system.
2.3 Mobile Robotics
There is a definite advantage in being able to send autonomous robots into hazardous and remote environments, e.g. space, sea, volcanic, mine exploration, nuclear or chemical hazards. Robots can be built to withstand higher environmental tolerances than humans, perform repeatable specialized tasks efficiently, and are expendable. To this end, there has been much research on fully autonomous robots that can navigate into an area, perform actions such as taking samples, performing manipulations, and return the information to base controllers without human assistance.
Beyond those institutionally funded exploration applications (e.g. the Mars Rover [93], or the Dante volcanic explorer [12]), a number of mobile robot applications have been recently launched by commercial companies for use by the average citizen. Such applications include a robot lawnmower [70], a dust vacuum for domestic use [1, 85], a family care assistant [111], a robotic play toy [32], and a museum tour-guide [28]. The presence of robots outside of the laboratory and in use by the non-technical, non-expert operator puts a greater emphasis on how a human controls a robot. Though this thesis work did not originally address questions about robot use for the general citizen, the intention is that some of this research might help inform the enterprise of creating a focus on the interface. Further, by the time of the study presented in chapter 6, one of the foci of the study became the possible use of the robot in a home environment, and the study addresses some of the issues that might be encountered.
2.3.1 Applications of autonomous and semi-autonomous systems
What is the context in which a supervisory robot system may find implementation and use? Red Whitaker has referred to the situations where remote mobile robots are useful as DDD: Dirty, Dangerous, and Distant.2
These are more specifically dirty in the sense of having rather low appeal to the human worker: jobs that, by their nature, involve environments and materials that are fundamentally unappealing, e.g. repair work in sewers. For the most part, dirty jobs are those that do not offer an attractive working environment.
Dangerous work has elements of personal risk. Such tasks might involve work with hazardous materials, such as toxic or caustic chemicals or radioactive materials such as nuclear waste, or other extreme conditions such as cold or heat, e.g. firefighting. A further example is when the nature of the work may create a hazardous environment, such as in structure demolition. Not only is it cumbersome for humans to protect themselves from such hazardous elements, but such protection may impede the efficiency of the task. Moreover, the robot construction might be made to be resistant to such elements and thus be unaffected, or in the worst case, the robot is at least more expendable than its equivalent human worker.
Distantworkiscarriedoutonotherplanets,oreventerrestrialenvironments such as deep sea, deepmining, orhigh altitude where the practical limitsof humancontrol force thesituation towardremotecontrol. When such distant work involves inter-planetary distances there is also a signicant time delay incurredinanycommunication. Thisintroducesnewproblemsandmaycause are-workingofthecontrolstructure. DelaystoMars,forexample,areoverten minutesforone-waycommunication, makingdirectcontrolofanautonomous mobile robot impractical. Some applications, such as deep sea and volcanic
exploration may cross some of the DDD categories. Typical tasks in such
environments areexploration, reconnaissance, inspection,and repair. Thus a situation likethe 1986Chernobylemergencymighthavebenetted byhaving robotsonhandthatcouldbesentinasremoteagentstoexamine,shoveldirt, lay concrete and update thestatus inside the radioactivezones in the plant. As aresultofthat needat Chernobyl andbuilding onexperience withThree
Mile Island, the company RedZonewasformed in 1987 to constructa robot
to sendinto theChernobyl Reactorunit 3
In1999 theRedZonePioneerrobot
wassentin toperformreconnaissanceandbuildamapoftheareausingvideo techniquestogetherwithvirtualreality[116].
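The scale of the inter-planetary delay mentioned above can be checked from the speed of light alone. A back-of-the-envelope sketch (distances are approximate round figures, not from the thesis):

```python
# Back-of-the-envelope check of one-way signal delay to Mars.
# Earth-Mars distance varies from roughly 55 to 400 million km.
C_KM_PER_S = 299_792.458  # speed of light in vacuum, km/s

def one_way_delay_minutes(distance_km: float) -> float:
    """One-way light-time in minutes for a given distance in km."""
    return distance_km / C_KM_PER_S / 60.0

near = one_way_delay_minutes(55e6)   # closest approach: ~3 minutes
far = one_way_delay_minutes(400e6)   # near conjunction: ~22 minutes
print(f"one-way delay: {near:.1f} to {far:.1f} minutes")
```

Even at closest approach the delay is minutes, not milliseconds, which rules out the tight sense-act loop that direct teleoperation assumes.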
A further application for teleoperated mobile robots is control over scale, e.g. micro and macro robots. The category of micro robots broadly includes those that can work in environments at less than human scale. There has been research in what are referred to as "nanorobotics," robots on the scale of a billionth of a meter. Such robots might be sent into the bloodstream to clear the clogged arteries of a patient. A less altruistic application might be insect-sized robots used for surveillance. On the large end of the scale are robots capable of performing work that is beyond human physical limits. Simple examples of these are autonomous logging machines, or the large construction or earth-moving equipment used in mining or mineral extraction and transportation. Though the border between dirty, dangerous, distant, and scale is not always distinguishable, such a work categorization supplies a context in which this system can be considered. The system has been tested over distance (10 m to 2000 km) and although the system has not been used in all these contexts, the examples provide a direction in which such systems might develop and a practical motivation for their existence in the first place.

2. Red Whittaker, Discovery Channel program Robots Rising, 1997.
3. Soviet soldiers and other workers, referred to as "biobots," have been used for tasks in the reactor area since 1986 at great personal health risk. By western standards the environment [...]
Autonomous and Tele-Robot research

Research work in the field of telerobotics and that in autonomous mobile robotics is quite separate. Not only is the research often carried out by different principal investigators, but there are often institutional boundaries to collaboration. Some researchers have tried to bridge the gap between autonomous robots and telerobotic work from both directions. From the perspective of telerobotics this has been an attempt to build more powerful supervisory control systems. From the autonomous robotics perspective, this has often been an attempt to build functioning practical systems that employ the autonomous mobile robot research methods as a foundation for implementing supervisory control systems. However, the community separation (e.g. separate conferences, journals, cultures) often means that communication of the latest techniques and results is lacking.
In two examples of integrating the technology of telerobotics and autonomous robotics there is evidence of the differing perspectives. The autonomous robotics group at the Georgia Institute of Technology has taken a schema-based reactive architecture and used it as a base level for teleoperated control. In their architecture the mobile robot performs simple navigation while the operator's commands can be situated in the system either as another behavior that influences navigation or as a more global process that manipulates system parameters [8]. They have since extended this idea to allow the operator to control group behaviors in a multi-agent environment [9]. Other groups have recognized the need for tele-operators to move away from low-level robot movement control. One effort has created a multi-level architecture for robots to provide higher-level navigation functions, such as path-planning and obstacle avoidance, with operator guidance [24]. One such sophisticated system was created to be a mobile system for persons with disabilities, developed at the University of Delaware. In addition to the concept of semi-autonomy the system employs supervisory control using a planning system based on traditional AI techniques [63, 64]. The system integrates a user, simple visual processing of the environment, and a STRIPS-based planner forming the interaction between the user and the robot. This may well be a solution for using systems such as STRIPS [43] on robots in the real world, but most autonomous robotics researchers abandoned linear STRIPS-style planners decades ago because of simple task interaction problems (as exemplified by the Sussman anomaly). With such a planner the system is committed to having a user present to decompose plans into disjoint sub-tasks.

4. Whereas most intelligent autonomous robot work is conducted in Computer Science, Electrical Engineering, or Artificial Intelligence departments, much teleoperated work is carried out [...]
The field of teleoperated robotics has worked for decades on human-machine interfaces to enable an operator to control a robot remotely in space and in other distant or hazardous environments. There are a number of conferences and journals that disseminate this work [78, 90]. NASA initiatives have included virtual environments and semi-autonomy in that work [54, 35]. In the span of work in telerobotics, the sub-field known as "supervisory control", as defined by Thomas Sheridan [92], is most relevant to this thesis work. In the second edition of the journal Presence, Sheridan defines the term supervisory as it relates to robotics:

Supervisory control does not require physical remoteness, though some degree of functional remoteness is inherent. In the strict sense, supervisory control requires that the machine being supervised has some degree of intelligence and is able to perform automatically once programmed.
This functional remoteness, and specifically the ability of the robot to perform some autonomous tasks, is what has been referred to here as semi-autonomy, and it is that element that is sought in related work. However, the work in this thesis relaxes the notion of autonomy "once programmed." Here the programming is seen as an interactive task where autonomy is flexible and related to a specific context. Interactivity is considered a basic part of the system, and in this way the system is seen to be collaborative.
The most related telerobotic work to this thesis is the subset of work employing 3D virtual environment technology for telerobotics. Though there are many similarities, few systems explore the same set of research notions addressed here. Investigators have worked on the interface between man and machine to enable an operator to control a robot remotely in space [78] and battlefield [10] applications, and have even used simulated environments for predictive task planning [65]. Because the body of research in the field of supervisory telerobotics and virtual environments is large, what is offered in the next section can only be a sampling of the work that has the greatest significance or relevance.
Virtual Environments and Telerobotics

Virtual Environments have been employed with telerobotics in a number of ways. By virtual environment, what is meant is a 3D representation of a robot and the robot's environment. Often, but not always, this environment is interactive and provides a means of probing different robot features. All work, by definition of telerobotics, is carried out over some measure of distance or scale. Often the virtual environment is used to gain visualization of that remote space. Of the different systems that use virtual environments, the differences occur in the use, purpose and goals of the virtual environment vis-à-vis the system as a whole. A categorization of telerobot virtual environment use, with example references, is given here:
Workspace visualization: Allowing the human user to better visualize the robot's 3D workspace in a spatial manner. This might be a static representation or it might be with a form of real-time update of robot position [76].

Immersive telepresence: Virtual environments along with immersive display techniques (e.g. head-mounted displays) have been employed to enable a greater sense of "presence", with the goal of offering the user a sense of being physically present in the remote space [119].

Programming: Using a model of a remote environment, the user can spatially program the robot by, for example, picking binding points for an intended robot path or trajectory [65].

Workspace interaction: Allowing the user to interact with the robot and possibly set up a model of the remote workspace. Sophisticated versions of this work allow the user to build and refine models of the environment [31, 29].

Rehearsal: Before executing a particular task, virtual environments can be used to perform a simulation and allow the user to possibly catch and correct any detected errors [34].
Most of the systems in the above categories, with the exception of attempts at immersive presence, use virtual environments in addition to other interface methods. If the interface consists of a graphical user interface, then the 3D virtual component will often be a window displayed alongside other robot controls. In contrast to this, one goal of this thesis work has been to create a system that is unified in the sense of using the virtual environment interface for the entire task. These different uses of virtual environments will be revisited throughout this discussion.
The group at EPFL has built a virtual environment similar to the present system [79]. Here the robot has reflexes that allow it to respond to the local environment and uses the virtual environment as a medium for communication. Here the user can be an observer of the remote environment of the robot as well as an actor within that environment, depending on the mode of the robot. The purpose of that work is to allow a human supervisor to work at non-human scales, both micro as implemented, and macro as implied.

The work has been implemented on an arm-based Khepera micro-robot, and the goal of the EPFL interface is to allow fine control via the specification of path-control points within the virtual environment. The authors also incorporate a model construction system based on a multi-modal system that incorporates range sensing and vision to build a model of the robot's working environment. Overall the authors use the model for specifying fine-grained control points for navigation and have yet to take advantage of the power of spatiality and deictic reference that the virtual environment affords for higher-level commands and greater robot autonomy.
Rehearsal

The group at Sandia Labs has employed a virtual robot system which exemplifies the rehearsal style of employing virtual environments for robotics [34]. A system has been implemented containing a physical model of the environment and the robot dynamics so that rehearsals of robot tasks can be performed before running the robot on the real task in the physical world. The system is based around a gantry robot system that works in a large 20'x40' workspace. Users can be warned beforehand about impending motion problems. One of the more interesting motivations for this system is financial: the argument is that the virtual system will promote the sharing of expensive capital equipment. However, in their presentation there is little attention or outside reference given to the interface, which is not atypical for reports on human user-robot systems.
One goal of telerobotics is the notion of presence, often attempted through techniques of immersion. The next section details what this might comprise.
2.4 Presence
Presence, in its most basic definition, is the subjective sense of "being there" or being present in a remote or virtual environment.
The Astronaut would receive a sufficient quantity and quality of sensory feedback to feel present at the remote task site and would be able to carry out repairs or detailed inspections as if he were actually there [44].
Researchers have attempted to relate these factors to real world practice [119, 45, 105]. This thesis views the role and degree of presence as a means to enable more engagement in the task work. This concept has also been a way of enabling domain transparency: seeing through the tool to the goal task [37, 56, 55].
Following Welch [119], the terms telepresence and presence will be treated as essentially the same. Welch et al. hypothesize three factors that contribute to the sensation of presence. These are that the user:

1. feels immersed within the VE;
2. feels capable of moving about in it and manipulating its contents;
3. has an intense interest in the interactive task.
Most of these factors are the same as those that contribute to engagement in the real world, and continuing in that direction of transferring real world factors to the virtual, the authors continue:

The development of VE's [Virtual Environments] can be viewed as an attempt to produce by means of a computer program and accompanying hardware (e.g. a dataglove), the same experiences of clarity, completeness, vivacity, continuity, constancy, and presence that occur in normal perception.

This notion of continuity and constancy can be phrased as habitability. Through the factors contributing to the sense of presence as outlined above, e.g. engagement, rationality, empowerment, an impression of the environment being familiar and "making sense" helps to enforce this sense of habitability. Also, in reverse, attempts to make an interface habitable should create a greater sense of presence. One approach to habitability is to transitively extend real world habits and competences into the virtual environment and work with the robot.

Heeter argues for three different types of presence: environmental, social, and individual [53]. Like Welch et al.'s factors, these can be viewed as composing a tripartite space which represents the many variables that form the sensation of presence. Regardless of the structural description, these three components form an encompassing way to discuss the factors that contribute to presence:
Environmental factors: Range of sensory experience and modalities stimulated, amount of sensory resolution, degree of similarity between the observer's body and its virtual representation, presence or absence of stereopsis, B&W vs. color presentation, presence or absence of perceptual constancy during movements, familiarity of the scene.

Social factors: Whether other (simulated) individuals are present in the VE, and the extent to which these others respond or interact with the primary user.

Individual factors: Assumptions that the observers bring to the VE, amount of practice they have had on the VE task, length of exposure and interaction with the VE, the degree to which they have become familiar with (and adapted to) the intersensory and sensorimotor discordances, and individual predispositions to rely or attend to one sensory modality over another.
Clearly all three of these factors, environmental, social and individual, play a role in the sense of presence in a system, including the one in this thesis. In particular, the isolatable environmental factors that are relevant to the sense of presence in this thesis work are related to the presentation of, and interaction with, the graphical environment. These are that the physical presentation offers enough sensory resolution (e.g. pixels) and is sizable enough to display detail at a comfortable distance to the human operator. Other environmental factors include the number of modalities involved and what constitutes the interface.

In the thesis system the other individual present is the robot agent (or agents) and possibly other operators. There is then a social understanding built from the representation of the robot agent in so far as it displays a coherent view on the robot's actions, capabilities, pose and scale within the environment. This is also where the beliefs and understanding of the user vis-à-vis the robot come into play. Does the human supervisor view the robot as an entity? A fellow co-worker? Is the operator working with the robot? Is the operator working through the robot?
Individual factors are by far the most difficult to measure. In order to gauge the influence of many individual factors, rigorous and well calibrated laboratory techniques have been employed to answer perceptual questions, while in-depth qualitative studies of individual responses may need to be performed to understand individual factors such as innate individual assumptions and predispositions. Despite these attempts, many of these factors still remain elusive and indescribable. There are quantitative studies that have attempted to address presence with some rigor (see [105] for an overview). Also, individual factors will be, by definition, highly variable and dependent on each particular user. Most such quantitative studies concentrate on psycho-physical factors. Other factors, such as cultural meanings, semiotics, and social understandings, might only be revealed by qualitative means. Since much interface work is based on the idea of metaphor with the real world, such individual and social-cultural factors may be just as, if not more, important than psycho-physical factors.
One aspect that Heeter's presence factorization does lack is the identification of the interactive senses of continuity and rationality that may enable engagement and create a working interface. The factors that create these senses most likely cross the categorical boundaries in different combinations. For example, not included in the environmental factors is the breadth and type of modalities, yet these are vital to the degree of control the user has and certainly involve environmental, social, and individual factors. This is because the way we interact with machines is informed by the way we interact in everyday social life (see the section on Software Agents).
One more point that should be addressed in a discussion of presence is the effect of "lag time." In many teleoperation systems, the reduction of lag, the time between when a control instruction is given and when its effect is seen, has been given a great deal of attention. It has been supposed that reducing lag gives the operator a sense of "being there," increasing the sense of presence. This is because the motive of many systems is to unify the operator and the remote robot as "one." In this thesis system, operator-robot unity has not been a goal. In fact, the human is seen as a supervisor and the robot as an assistant, two distinct entities with different roles and responsibilities. The lag factors in the thesis system, though important, are not as relevant as in many teleoperation systems. This is primarily because we have sought a higher level of autonomy for the robot than direct control. Lag time is important, however, as it is in most computer interfaces, in confirming to the user that a command has in fact been "understood" or accepted by the system. In a deictic system this can be accomplished by the system responding with a simple "ok," "click," or "flash" when a verbal or gestural command has been accepted. The user will then understand that the command has been acknowledged even though it may take some time for the system to perform the request.
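The acknowledge-then-execute pattern described above can be sketched as follows. This is a minimal illustration with invented names, not the thesis system's actual command path: the interface confirms receipt immediately, while the slow robot action runs afterwards.

```python
import queue
import threading
import time

class CommandChannel:
    """Accepts supervisor commands; acknowledges at once, executes later."""
    def __init__(self):
        self.pending = queue.Queue()
        self.done = []
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, command):
        self.pending.put(command)
        return "ok"            # immediate feedback: command was understood

    def _worker(self):
        while True:
            command = self.pending.get()
            time.sleep(0.1)    # stand-in for slow robot motion / network lag
            self.done.append(command)

channel = CommandChannel()
ack = channel.submit("go to the doorway")
print(ack)                     # the "ok" arrives long before the action finishes
```

The point of the design is that the user's mental model stays intact: the command is acknowledged in interface time, while its execution proceeds in robot time.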
2.5 Virtual Environments
The field of Virtual Reality (VR) has been around for more than a decade. In that time it has passed through a gamut of phases, beginning with fascination, moving on to utopian promises and then to outright hype. By the time of this writing, much of the hype is subsiding, leaving the term "Virtual Reality" a bit tainted, but yet enduring as a serious field of study in itself and still holding some of its fascination. Many of the most successful VR ideas have been adopted and incorporated into the research programs of other fields (e.g. Molecular Modeling, CAD/CAM, Telemedicine, Virtually Augmented Realities, Teleoperated Robotics). In the field that is now virtual reality, some of the most promising work is in social applications. Examples of such social applications include graphical collaborative environments, virtual conferencing, and work applications such as shared CAD/CAM systems and room-sized conferencing and collaborative visualization. The work in this thesis is most influenced by these collaborative applications of VR: virtual worlds that can be shared and in which multiple users can interact. The particular type of system platform upon which the robot system is built is an instance of
a Collaborative Virtual Environment (CVE). In this work we do not use head-mounted displays and instead employ a large screen display. The primary reason is to leave the human user unencumbered, while also providing a useful display and the possibility for collaboration between co-located participants.
In this thesis an immersive virtual environment is used as the interaction medium with the remote world and the robot. Specifically, the work is an application in the SICS Distributed Interactive Virtual Environment (DIVE) system [30, 50]. The DIVE platform has been in existence since 1992 as a research software platform and has grown from a project based in shared multi-media communication. It is essentially a distributed and shared virtual environment system based on multicast and a peer-to-peer model for sharing and updating the environment model. It has a high capacity distribution infrastructure to support multiple users and supports a number of common network standards at different levels (e.g. TCP/IP, ATM, URL, MIME) as well as a number of file formats (e.g. VRML, VR, AC3D). Application functionalities can be programmed in C, OZ, or Tcl and integrated with other tools (e.g. Netscape, Politeam for document sharing, VAP for audio conferencing).
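DIVE's peer-to-peer model means each participant holds a replica of the shared world and applies the same update events as every other peer. The toy sketch below illustrates that replication idea only; the names are invented and DIVE's actual protocol is far more elaborate:

```python
# Each peer keeps a local replica of the shared world; updates are
# broadcast as small events and applied identically at every replica.
class Peer:
    def __init__(self, name):
        self.name = name
        self.world = {}            # entity id -> property dict

    def apply(self, event):
        entity, props = event
        self.world.setdefault(entity, {}).update(props)

def broadcast(peers, event):
    """Stand-in for multicast delivery: one event reaches every replica."""
    for p in peers:
        p.apply(event)

peers = [Peer("operator"), Peer("robot-proxy")]
broadcast(peers, ("robot", {"x": 2.0, "y": 1.5, "heading": 90}))
assert peers[0].world == peers[1].world   # replicas stay consistent
```

The attraction of this model for a CVE is that there is no central server in the update path: any peer, including the robot's proxy, can originate state changes.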
Recent research in immersive virtual environments and human-computer interaction at SICS has worked on building a framework for "natural" interaction. Aspects of this work are the study of interaction between agents, human and others, in a shared virtual environment [15], and the construction of mechanisms for human users to interact with the virtual environment [60]. Some of the work in this thesis, the 3D mouse pointer in particular, has come out of this work with "natural" interaction [51]. That work has tried to elicit possible meanings for natural while realizing that there may be no one definition of "natural." The definition depends on context and varies with respect to such factors as the culture, the task, and the individual's abilities. Instead a concept that is far more complex and less generalizable may be more suitable, such as the concept of "appropriate." Thus the existence of a general interaction framework comes into question, leaving the need to create interfaces specific to particular tasks and contexts. This idea of appropriate interaction does share some resonance with modern trends in computer infrastructures [117] and interfaces [82]. One simple approach is to search for multiple methods of performing a task, each context dependent.
2.5.1 Enhancing environment visualization: Augmented Virtuality and Augmented Reality
Augmented Reality (AR) is one branch of research in computer generated immersive environments that has found a number of promising applications. The visualization sub-system model presented in this thesis (Reality Portals) has the capability to use a base 3D polygonal model of the world and, with the aid of calibration, overlay video into the graphical world. It also includes the possibility, with slightly modified routines, to do the opposite: to enhance a video image with correctly positioned graphics. With such systems it is possible, for instance, to enhance or augment the video stream with graphical information that may not be visible but may be useful. This overlaying and mixing of graphics into a video stream is referred to as Augmented Reality (AR). The complementary operation of laying video onto graphic worlds has been referred to as Augmented Virtuality (AV).

6. Co-located collaboration is not directly addressed in this thesis, but the concept has guided some of the interaction decisions and a new project co-written by the thesis author [...]
The classic examples of Augmented Reality employ a display that is worn on the head, which enables the wearer to see both the real world and a graphics display that is overlaid onto a semi-transparent screen. In this manner a user of such a system would see the physical world augmented with "appropriate" graphics. A user might use such a system to repair a laser printer [39], or repair an automobile engine with annotated instructions appearing on the lens of the see-through glasses [66]. In the previous examples the user was present in the physical environment. Similar operations can be performed remotely. Milgram et al. have used a telerobot system delivering video of a remote scene to a special display, worn or observed by the operator, that is enhanced with interactive graphics. Such applications include virtual tape-measurement on real scenes as well as graphical vehicle guidance [75], and enhanced displays for teleoperated control [73]. In addition to this standard notion of AR, a virtual environment can be embellished with real-world images.

Milgram [75] has attempted to map out a taxonomy of "mixed realities", those that blend graphics and real world video. One axis of this taxonomy spans from Augmented Virtuality to Augmented Reality, with citations and positioning of much of the work in between. In addition, Benford et al. have created a similar mapping of these techniques and applied them to the field of entertainment, and in particular to employing CVEs for inhabited and interactive television [18]: a place where television streams, videoconferencing, and traditional collaborative virtual environment systems coincide. The base techniques for all these systems are the same and include models of the real and virtual scenes, methods for locating views (camera and graphical) within those models, and methods for mixing video and graphical streams.
Corby and Nas describe an ROV that roams a nuclear core for the purpose of inspection [59]. They have motivated the use of a robot by stating that it can be employed without causing reactor downtime. Also, human core inspection is problematic and would be considered a dangerous environment. The cost of nuclear reactor downtime is enormous (e.g. when measured by the metrics of money and quality of service) and must be minimized. To perform the task, their system includes a three-scene display, with three points of view including the position of the robot within a map of the core and the on-board robot video. There are significant problems with having video as the only source of visualizing the actual remote environment; it is that reason that motivates their three-screen solution. The following quote from their report highlights the problem, which elements of augmented virtuality, such as Reality Portals, seek to overcome.
The field of view is often very small compared to the size of the reactor. [...] The path that the ROV must take is often a very tortuous one involving many translations and rotations to enter restrictive areas. It is often impossible to determine the next step based only on the video image coming back from the ROV camera. Finally, because the camera is mounted on a 2 DOF arm, it is even more difficult for the inspector to verify that the current image is of the correct area viewed in the expected configuration. N. Corby & C. Nas, GE Corp R&D [59]
This is to say that merely having the video of a remote environment does not afford an understanding of the physical structure of the remote environment. One approach to a solution is a combination of graphical and video elements, as in AR and AV. One specific solution is to enhance a 3D virtual environment with appropriately positioned video textures, providing an environment with the greater physical structure of the remote location and the details from the video images. Reality Portals are an instance of such a system.
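Both directions of mixing (graphics onto video, and video onto the model) rest on the same calibration step: knowing how a 3D point maps to a pixel in the camera image. A minimal pinhole-camera sketch follows; the parameter values are illustrative only, and the Reality Portals calibration itself is a separate matter:

```python
# Pinhole projection: a 3D point in camera coordinates -> pixel coordinates.
# fx, fy: focal lengths in pixels; cx, cy: principal point (image centre).
def project(point, fx=800.0, fy=800.0, cx=320.0, cy=240.0):
    x, y, z = point
    assert z > 0, "point must be in front of the camera"
    return (fx * x / z + cx, fy * y / z + cy)

# A point 4 m ahead and 1 m to the right lands right of image centre:
u, v = project((1.0, 0.0, 4.0))
print(u, v)   # 520.0 240.0
```

Running this mapping for the vertices of a model polygon tells the system which video pixels image that surface, which is the information needed to place a video texture correctly in the virtual environment, or, inverted, to anchor graphics onto the video stream.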
2.5.2 3DTV and virtuality
The Reality Portals AV work can also be viewed as an instantiation of work in the general area of immersive 3D video, or 3DTV. In this immersive video, a user is not restricted to the 2D plane for video but can interactively choose views and explore video regions. For our purposes, we see the Reality Portal system as a means of filtering out non-essential details from a potentially cluttered real world scene. Institutionally supported programs in the field of 3D video are being carried out at UCSD by Ramesh Jain's group [61], and also at CMU by Takeo Kanade's group [88]. Both of those applications have focused on creating mixed realities that can be interactively accessed. In this work we have created a tool that can be used for near real-time remote investigation. It is "near real-time" because there are some delays introduced by the segmentation and texture application process. These delays are approximately 1-3 seconds in the demonstration prototype, though there is significant room for optimization. Kanade's research is intended for broadcast applications, and Jain's lab has started to explore real-time security applications.
2.6 Software Agents and Robot Assistants
The research and development of software agents has grown in the past decade from a sub-field inside AI to a bona fide area of its own with workshops, symposia, proceedings, books, collections, research prototypes and commercial applications. There are elements in this thesis work that are related to some of the work in the field of software agents. The relationship centers around the design of an embodied assistant. These similarities are more clearly seen from the software agent researcher's point of view than from the robotics researcher's. There are software agent researchers that see very little difference between their work and that in intelligent robotics [112]. This section, rather than being a complete survey of agent-related work, presents how the field of software agents is related to the present thesis work and a number of factors that go into the design of agents, then presents a counter-argument to contest the view that work with software agents is the same as work with physical robots, and then steps back from the differences to point out the implications.
2.6.1 Derivative software agents
One reason for a connection between the work in the software agent (SA) community and that in the robot community is that key persons working with software agents have come from the fields of robotics and AI (e.g. Pattie Maes, Walter Van de Velde, Barbara Hayes-Roth). Thus some of the work in software agents has built on work with artificially intelligent robotic agents. This includes architecture models for software agents that have come directly from the models developed within the field of artificial intelligence [37, 112]. In fact, as evidenced by the following quotation, many researchers in the agent community frame their work with respect to robot work.

The idea of an agent originated with John McCarthy in the mid-1950s, and the term was coined by Oliver Selfridge a few years later, when they were both at the Massachusetts Institute of Technology. They had in view a system that, when given a goal, could carry out the details of the appropriate computer operations and could ask for and receive advice, offered in human terms, when it was stuck. An agent would be a 'soft robot' living and doing its business within the computer's world. Alan Kay [62].
For the purposes of this section, this quote by Alan Kay is offered as a definition of software agents. Also in this quote, the connection of an agent being a "soft robot" is made explicit. In the literature already cited, there are frequent comparisons between robots and agents. The above quote also points out the human-interaction aspect of work with software agents. It is this aspect that most separates the work in software agents from that in artificial intelligence; much of the software agent work has more in common with the human computer interaction (HCI) community than it does with the AI community. Identifying this split brings up at least two comments: first, that at least part of the software agent community is derivative of the AI community, and second, that many of the hard questions about AI techniques may have become irrelevant in the constructed environments of user-interactive software agents.

It is the aspect of a computer-based system interacting with a human that ties this thesis work with the concept of a software agent. In both systems the agent is in service of the human user, it is the human that provides the primary goal, and there is an interaction discourse between the human user and the agent. In particular, the supervisor-assistant metaphor of teleoperated control is an example of such a human-agent interaction system.
However, claims that interaction with robots can be simplified to interaction with software agents (and later perhaps generalized to the hardware robot) are mistaken. The primary differences can be seen from various perspectives. One difference lies at the root of the debate of new AI vs. GOFAI (Good Old Fashioned AI), e.g. the behavior-based school and the goal-oriented school. This debate has also been framed as a top-down vs. bottom-up division of approaches. In that debate it is argued that robots need to be embodied from the beginning; the solutions developed in simulations will not transfer to real robots [27, 87]. In this thesis, the term agent is mostly not used outside of this section, and instead 'robot assistant' is employed.
Interactive software agents have triggered a large debate that has yet to occur in robotics, though it may come with greater domestic deployment. Some of the issues in this debate are just as pertinent to the design of interactive robots.
2.6.2 Interface agent debate
Human Computer Interaction researchers debate whether intelligence at the interface is good HCI design [96, 56, 72]. In terms of teleoperated human-robot interaction it is "intelligence" (or autonomous competence) that will distinguish autonomous or semi-autonomous teleoperated robots (supervisory control) from purely remote-controlled applications (teleoperation). The controversy centers on where the designers place the "intelligence." In this thesis work it is not intelligence in the interface; it is intelligence (however limited), in the form of autonomy, in the robot that the user interacts with. The interface, as a system, is a way of accomplishing a set of tasks.
There is a debate about software agents that is often heated. Lanier writes: "The idea of intelligent agents is both wrong and evil [...] I believe that this is an issue of real consequence to the near-term future of culture and society" [68]. Much of Lanier's argument against agents centers on the fear that users will be "dumbed down" to the level of the implemented agent. Researchers in HCI are voicing related views about agents. One example is Ben Shneiderman, when he writes:
I am concerned that if designers are successful in convincing the users that computers are intelligent, then the users will have a reduced sense of responsibility for failures. The tendency to blame the machine is already widespread, and I think we will be on dangerous ground if we encourage this trend. [96]
This is similar to what Foner refers to as the social contract: "Most [commercial agent offerings] tend to excessively anthropomorphize the software, and then conclude that it must be an agent because of that very anthropomorphization, while simultaneously failing to provide any sort of discourse or 'social contract' between the user and the agent" [46]. It is this discourse that is both hard to define and hard to design. Phoebe Sengers has addressed part of this issue by pointing out the lack of attention to the signification of agent behaviors from the user's perspective. She writes: "what matters is not the internally-defined code as understood by the designer but the impression the agent makes on the user" [91]. The system constructed by Sengers gives special attention to the subtle cultural aspects of agent behavior representation within the context of use.
One aspect emerging with the research and development in software agents is that there is more to the interface than efficiency; there may even be entertainment value. However, these do not have to be polar opposites, work vs. recreation; there are notions of 'fun work,' and it may be those experiences that are quite desirable. This may in fact contradict some of Shneiderman's mantras about direct manipulation and "predictability," "controllability," and "mastery" being the primary elements of concern in an interface [96], and point to other, less quantifiable interface properties. This is to say that there is more to the interface than efficiency, and "time to completion" may not be the best way to measure the success of the interface as a whole [56]. There are other qualitative aspects that need to be taken into account. Such qualities are much harder to identify and have more to do with personalized notions of productivity, intuition, and consistency.
2.6.3 Attributes vs. Endowment
This debate about "good and evil" is as much about what users understand the interface to be as about its actual constitution. This leads to a further discussion about user endowment and system attributes. There is a delicate balance between appealing to a "life-like" metaphor in a software agent and, in the pursuit of that goal, promising too much competence in the presentation. As an example of the power of this seemingly subtle difference, consider the debate about whether the graphical agent of the Olga project should have 'antennae': one position was to add a visual cue that the agent was not a human; the opposing view was to purposely make it look more human [110]. What the user puts onto the system is endowment; what the system actually has capacity for is its attributes.
The warnings from Shneiderman and Lanier mentioned above are partly centered on the idea of personification. Presentation may lead users to endow the interface with other traits: e.g., if the representation is human, there may be an expectation of other human abilities. Inattentive presentation may lead to false expectations, leading to disappointment, leading to rejection of the interface and tools. A related warning is that the presentation of autonomy in the interface will in the end "disempower the user." This may result from a sense of intimidation from the competence the user endows the interface with. These warnings point to the power of the interface. The Eliza syndrome and the Turing Test are demonstrations of this power. The intended effect of providing such an interface is to achieve a form of cohesion or habitability in the interface and in the dialogue that is effected. Thus a designer may not be able to simply refer to control and mastery as goals; many of the desirable properties of an interface may be subjective.
This distinction between endowed intelligence and innate intelligence is an issue that arises quite often in AI and in robotics, as well as in the field of software agents. This is precisely what Horswill built into his Polly tour-guiding robot system [57]. Upon start-up, from the robot's speaker came: "Hi, I am Polly. I am the MIT AI lab tour guide robot! ... and I do not understand what I am saying right now." The robot would then move around the space and conduct a tour. With this statement it was made clear that the robot did not understand its own speech, and also, by inference, that it would most likely not understand a user's speech either. What might appear to be a touch of whimsy from the interface designer is actually a very strong interface statement about the robot's capabilities, helping to define the assumptions under which the robot operates its tours.
Another example from the same lab is the 'Cog' robot project. Regarding a session of taping a demonstration video, Rod Brooks had the following to say:
"Cynthia and the robot were taking turns. The robot had no notion of taking turns, but Cynthia knew how to take turns and she capitalized on the dynamics of the robot. [..] It seems to me that people cannot but help themselves interacting with artifacts the way they interact with people."
Rod Brooks indicated that the turn-taking seemed to be an activity that users endow on a robot. Thus the fact that the human and robot take turns is