Learning in a Reactive Robotic Architecture
Thord Andersson
[Cover figure: the SSE of the response channels u and v during training periods]
LIU-TEK-LIC-2000:13
Department of Electrical Engineering
Linköping University, SE-581 83 Linköping, Sweden
© 2000 Thord Andersson
Department of Electrical Engineering
Linköping University
SE-581 83 Linköping
Sweden
Abstract

In this licentiate thesis, we discuss how to generate actions from percepts within an autonomous robotic system. In particular, we discuss and propose an original reactive architecture suitable for response generation, learning and self-organization.

The architecture uses incremental learning and supports self-organization through distributed dynamic model generation and self-contained components. Signals to and from the architecture are represented using the channel representation, which is presented in that context.

The components of the architecture use a novel and flexible implementation of an artificial neural network. The learning rules for this implementation are derived.

A simulator is presented. It has been designed and implemented in order to test and evaluate the proposed architecture.

Results of a series of experiments on the reactive architecture are discussed and accounted for. The experiments have been performed within three different scenarios, using the developed simulator.

The problem of information representation in robotic architectures is illustrated by a problem of anchoring symbols to visual data. This is presented in the context of the WITAS project.
Acknowledgements

No work exists in isolation and this thesis is by no means an exception. It is impossible to enumerate all the people who have influenced and helped me through the years, but you should all know that I am very grateful.

First of all, I wish to thank my supervisor, professor Gösta Granlund, for introducing me to the Mysteries of Vision and for giving me the pleasure and opportunity to work in his laboratory.

I also wish to thank all of my wonderful colleagues and friends at the Computer Vision Laboratory. You are truly great people!

Special thanks to Johan Wiklund for our hacking discussions and for always keeping the computers in tip-top shape.

I would also like to thank Per-Erik Forssén, Björn Johansson and Åsa Johansson for proof-reading the manuscript. All remaining errors are to be blamed on me, due to final changes.

Finally, I would like to thank all of my family, especially Åsa, for their constant support, love and patience.

The research presented in this thesis was done within the WITAS project, funded by the Knut and Alice Wallenberg Foundation, which is gratefully acknowledged.
Contents

Abstract
Acknowledgements
1 Introduction
  1.1 Motivation
  1.2 Contributions
  1.3 Thesis outline
2 Framework and considerations
  2.1 Introduction
  2.2 What is a robotic system or robotics?
    2.2.1 Brief history
    2.2.2 Presentation of a classic robotic structure
    2.2.3 Active vision and reactive systems
    2.2.4 Three-layer architectures
  2.3 Scenario: The Infant Robot
    2.3.1 The scenario
    2.3.2 Response generation
  2.4 Learning and self-organization
    2.4.1 Why learning?
    2.4.2 What should we learn?
    2.4.3 Learning paradigms
  2.5 Fuzzy matching of visual cues
    2.5.1 Introduction
    2.5.2 The WITAS project
    2.5.3 Fuzzy-set representation of visual cues
    2.5.5 Fuzzy signature matching at work
    2.5.6 Conclusions
3 The proposed reactive architecture
  3.1 Introduction
  3.2 Structure description
    3.2.1 The channel representation
    3.2.2 The mapping functions
    3.2.3 The learning approach
    3.2.4 The evaluation phase
    3.2.5 The creation of a new model
    3.2.6 The optimization phase
4 Simulator design and implementation
  4.1 Introduction
  4.2 Design considerations
    4.2.1 The simulator
    4.2.2 The computation structure
    4.2.3 The artificial neural network
    4.2.4 The optimization method
    4.2.5 The resilient backpropagation algorithm
  4.3 The implementation
5 Experiments
  5.1 Introduction
  5.2 Scenario 1
    5.2.1 Setup
    5.2.2 Results
  5.3 Scenario 2
    5.3.1 Setup
    5.3.2 Results
  5.4 Scenario 3
    5.4.1 Setup
    5.4.2 Results
  5.5 Summary
6 Summary
  6.1 Summary and discussion
  6.2 Future research
Appendices
  A The training parameters, scenario 1 and 2
1 Introduction
1.1 Motivation
In this licentiate thesis, we will discuss how to generate actions from percepts within an autonomous robotic system. In particular, we will discuss a reactive architecture suitable for response generation, learning and self-organization.

Robots are currently mainly used as advanced automatons in the industry. With hard-coded behaviours and limited abilities to sense and adapt, they can perform simple and repetitive tasks such as welding at a construction line or handling materials in a storage.

Autonomous robots are designed to be more sensitive to their surroundings. They can change their behaviours and plans according to changes in the environment in order to fulfil some specified goal. An example of this kind of robot is the microrover Sojourner [36], which on July 4, 1997 landed on Mars and performed a number of experiments. However, the behaviours and actions of these robots are still the results of algorithms, goals and representations, designed and arbitrarily crafted by hand. An autonomous robot of this kind will not be able to adapt to a situation which its human designers have not foreseen. Thus, a truly autonomous robot has to be capable of learning from its experiences. Learning methodologies such as reinforcement learning [29] show promising results [8, 28] in adapting or learning successful behaviours in robotic systems. An autonomous robot must, in addition, have the ability to use sensor data for perception. There are a lot of interesting philosophical reasons why a system capable of vision and perception has to acquire information actively by itself, see [15, 17, 21]. One of these is that responses of the system can be used to organize the barrage of input data coming from the sensors.
A robot should, for instance, learn the relations between its responses and its percepts on a reactive level, i.e., if a robot turns its head 5 degrees, how will its percepts change? If a robot could learn these basic relations, they would be valuable as components in a perception structure.

Using responses in order to organize percepts seems to be an important part in the development of the senses in mammals. In an experiment performed by Held and Hein [23], two kittens were raised in the same environment, attached to each other via a carousel apparatus. One of the kittens could not move freely but was passively moved around by the other kitten via the carousel. After some time, the kitten which could control its movements developed normal sensory-motor coordination, while the other failed to do so until being freed for several days [17, 23].
1.2 Contributions
The main contributions in this thesis are presented in chapters 3, 4, 5 and section 2.5. The most important individual contributions are:

- An original reactive architecture for response generation. It supports incremental learning and self-organization through distributed dynamic model generation and self-contained components. (Chapter 3)

- A simulator which has been implemented in order to test and evaluate the architecture. (Chapters 4 and 5)

- A flexible implementation of an artificial neural network (ANN) with self-contained nodes and links. Rules for back propagation of error gradients have been derived for this implementation and for its extended definition of nodes and links. (Chapter 4)
Section 2.5 discusses how to anchor symbols to visual data and is co-authored with Dr. Silvia Coradeschi 1 and Dr. Alessandro Saffiotti 2 in ref. [1], partly published in ref. [9].

Section 4.2.5 discusses properties of the error back propagation algorithm, in particular in relation to the RPROP [31] algorithm. This has been published in [2], co-authored with Mikael Karlsson.
1.3 Thesis outline
In chapter 2, the background or context of this thesis is presented. We introduce a scenario that illustrates what we are striving to attain, and discuss issues in response generation and learning. The chapter ends with an example of an autonomous robotic system that illustrates some problems in information representation in these kinds of systems.

1. Currently at Örebro University, Dept. of Technology and Science, Sweden.
2. Currently at Örebro University, Dept. of Technology and Science, Sweden.
Chapter 3 introduces a reactive architecture which is suitable for response generation, learning and self-organization. The channel representation is also discussed.

A simulator has been developed in order to test different ideas and for evaluation of the proposed architecture. The design and implementation of this simulator is the topic of chapter 4.

Experiments on the proposed architecture using the simulator are presented in chapter 5, and chapter 6 concludes the thesis with a summary and a proposal for future work.
2 Framework and considerations
2.1 Introduction
In this thesis, we will discuss how to generate actions from percepts within an autonomous robotic system. In particular, we will discuss a reactive architecture suitable for response generation, learning and self-organization. Robots will moreover be viewed as Information Processing Structures (IPS). This means that we will not deal with important aspects of robotics like locomotion, sensor and effector technology per se.

This chapter begins with a brief introduction to robotics, its history and the architectures being used. It then continues with an example scenario, introducing the issues which will be the subject for the rest of the thesis. The chapter ends with a section in which we will discuss the problems of information representation in the context of a real application, the WITAS project.
2.2 What is a robotic system or robotics?
In robotics, the goal is to construct technical systems that with the help of sensors (a camera for instance), effectors (e.g., legs, gripping tools and wheels) and control systems can perform advanced, or perhaps tedious, tasks that previously only could be done by humans. Examples of such tasks in the industry are repetitive work on a production line (welding for instance), material handling (storage and delivery of material) and sheep shearing (!) [33]. However, these robots are often not much more than advanced automatons; their behaviours are hard-coded and their abilities to sense and adapt are very limited.
Examples of tasks demanding autonomous robots are those where behaviour has to be determined by the robot's own experience as much as by any built-in knowledge, for instance exploration and work in hazardous or remote environments. Examples of such environments are disaster areas, nuclear power-plants, the deep sea, volcanoes, space etc.
Robotics is of course a huge scientific field that covers or overlaps many other fields and subfields like artificial intelligence (AI), control theory, computer vision, signal processing and others. The interdisciplinary aspect of robotics is very interesting because it puts every field or component in context; the demands and requirements on a component often become quite different when it is put into context compared to when it is studied and functioning in isolation. Many functionalities from many disciplines have to work together and share the same resources, otherwise the system (the robot for instance) will not work.

We will, as mentioned earlier, in this thesis discuss robots as Information Processing Structures (IPS). This means that we will not deal with important aspects of robotics like locomotion, sensor and effector technology per se. Instead we will focus on an architecture which processes signals from the sensors and generates signals which evoke actions.
2.2.1 Brief history
According to Webster's dictionary [39], a robot is:

1. a machine that looks like a human being and performs various complex acts (as walking or talking) of a human being; also: a similar but fictional machine whose lack of capacity for human emotions is often emphasized

2. an efficient, insensitive, often brutalized person who functions automatically

3. an automatic apparatus or device that performs functions ordinarily ascribed to human beings or operates with what appears to be almost human intelligence

4. a mechanism guided by automatic controls

The Czech writer and artist Josef Capek coined the word Robot from the Czech words for serf (robotnik) and forced labor (robota) in his short story Opilec from 1917. His brother, the Czech playwright Karel Capek, made the word Robot well known with his play R.U.R (Rossum's Universal Robots), which opened in Prague in January 1921 [33]. The theme of the play is the dehumanization of man in a technological civilization. In the play, the robots are not mechanical but chemical. In an essay written in 1935, Capek, in the third person, writes about mechanical robots [10]:
It is with horror, frankly, that he rejects all responsibility for the idea that metal contraptions could ever replace human beings, and that by means of wires they could awaken something like life, love, or [...] of machines, or a grave offense against life.

The Author of Robots Defends Himself - Karel Capek, Lidove noviny, June 9, 1935, translation: Bean Comrada
The term Robotics was coined by the scientist and writer Isaac Asimov, referring to the use and study of Robots. The word Robotics was first used in his science fiction short story Runaround, published in 1942. In 1950 he published I, Robot, a collection of several of these small stories, where he also introduces his famous three Laws of Robotics [5]. He later added a zeroth law.

Robotics is, again according to Webster's dictionary [39], technology dealing with the design, construction, and operation of robots in automation.

Early robots in the research community were Grey Walter's Elsie the tortoise (Machina speculatrix) [3] in 1953 and the Johns Hopkins Beast in 1960. The first modern industrial robots were the Unimates, created by George Devol and Joe Engelberger in the 1950's and 60's [33]. Engelberger started a manufacturing company, Unimation (for Universal Automation), and has therefore been called the father of robotics.
2.2.2 Presentation of a classic robotic structure
Traditionally, a robotic system has looked something like figure 2.1.
[Figure: block diagram with Sensors, a Perception/Vision module, an AI/Control module and an Effector module, connected to the Environment via projections, sensor output and response output]

Figure 2.1: A classic robotic system
From the Sensors we get projections of the Environment. These projections are typically in the form of scalars, vectors and/or arrays. The data from the sensors flow into the Perception/Vision module, where the data is processed according to some built-in rules and models. The refined information, typically symbolic object representations, flows into the AI/Control module, which updates its model of the world. Based on this model and on its previous actions, the module now generates the next action. This action is then performed and executed by the Effector module, which in turn affects the environment.

This configuration works well when the system operates in a restricted, well controlled environment, as for example an assembly-station in an industry. In these cases, we know exactly which interactions with the environment are possible, which are known etc. The system can consequently be tailored for the specific task with appropriate algorithms for, e.g., the image processing system and the control system.

The advantages with this kind of systems are evident; when the development of the system is finished, the system is ready for action and mass production, no training or learning is necessary. The disadvantages are the other side of that coin; if the environment, the objects, the tasks or even the system itself deviates from what is specified, the system will not work, and it will not adapt to the new demands either, no matter the amount of time.
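The data flow just described can be condensed into a minimal sense-plan-act loop. The sketch below is only illustrative: the module functions, the confidence threshold and the action names are hypothetical placeholders, not taken from the thesis.

```python
# Minimal sketch of the classic sense-plan-act pipeline of figure 2.1.
# All module implementations are hypothetical placeholders.

def perceive(projections):
    """Perception/Vision module: reduce raw sensor data to symbolic objects."""
    return [obj for obj in projections if obj.get("confidence", 0) > 0.5]

def plan(world_model, objects):
    """AI/Control module: update the world model and pick the next action."""
    world_model.update({obj["name"]: obj for obj in objects})
    return "approach" if world_model else "search"

def act(action):
    """Effector module: execute the chosen action."""
    return f"executing {action}"

world_model = {}
projections = [{"name": "box", "confidence": 0.9},
               {"name": "noise", "confidence": 0.2}]
objects = perceive(projections)          # projections -> symbolic objects
action = plan(world_model, objects)      # world model -> next action
print(act(action))                       # the cycle ends in an effector command
```

Note that everything the system can do is fixed at design time inside `perceive` and `plan`, which is exactly the rigidity discussed above.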
If we try to develop this kind of system for a dynamic environment where a more rich and flexible set of responses is needed, we will soon run into problems:

Perception/Vision Module: General algorithms with high demands for computational resources are needed. The module still needs models for what it is expected to process and find in the scenery.

Perception-Control Bandwidth: The output channel from the perception module often has a very low bandwidth compared to its input. The contextual information present in the input is peeled off in the perception module, and the information is iconified according to the models that are built into the perception module. This maimed information gives small opportunities for a rich and flexible response generation.

AI/Control Module: The module acts according to the rules built into the module on the reduced information from the perception module. The rules must be specified by the developer, and this means that the developer has to predict all the events and objects the system will encounter.

In conclusion we can point out that the complexity of the environment must be built into this classical system for it to function properly. If we have a fairly uncontrollable environment and nontrivial tasks for the system to perform, we see that this is an impossible task; due to the combinatorial explosion, the system will encounter events that it is not built to handle.
2.2.3 Active vision and reactive systems
In Active Vision one tries to make the system more efficient by letting the AI/Control module control the Perception module. In this way, only the information currently needed by the AI/Control system is calculated and produced by the Perception module. Only certain parts and aspects of the sensor information are processed. By utilizing information gained from the history of the system, other tasks such as tracking become simpler.

In recent years, reactive robot systems have gained considerable attention. The Perception module is in this kind of systems coupled directly to the Effector module, so that certain percepts trigger a certain behavior. In slightly more advanced reactive systems, priorities are added to the behaviors so that some of the behaviors can override others.
The beauty of this is that, instead of trying to put the complexity of the environment into the system and force the system to certain behaviors, we let the complexity of the environment be reflected in the simple rules of the reactive robot, and it can in this way obtain seemingly complex behaviors. Braitenberg showed in his gedankenexperiments that simple sensorimotor transformations could result in complex behaviors in small simple robots [4]. Among the robots based on Braitenberg's designs, we find creatures such as the timid shadow seeker, the paranoid shadow-fearing robot and an insecure wall follower (after [3]). These robots obviously do not possess the traits timid, paranoid and insecure, but it is interesting that an observer can get such impressions of such extremely simple creatures.
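As an illustration of how simple such sensorimotor transformations can be, here is a sketch of one classic Braitenberg-style wiring, crossed excitatory connections from two light sensors to two wheels; the gain and the sensor readings are arbitrary assumptions, not taken from [4].

```python
# Sketch of a Braitenberg-style vehicle: two light sensors wired to two
# wheel motors. Crossed excitatory connections steer the vehicle toward
# the stronger stimulus. The gain value is an arbitrary illustration.

def wheel_speeds(left_sensor, right_sensor, gain=1.0):
    """Crossed wiring: each wheel is driven by the opposite-side sensor."""
    left_motor = gain * right_sensor
    right_motor = gain * left_sensor
    return left_motor, right_motor

# Light source to the left: the left sensor reads more light ...
left, right = wheel_speeds(left_sensor=0.8, right_sensor=0.2)
# ... so the right wheel spins faster and the vehicle turns left,
# toward the light. An observer might call this behavior "attraction".
assert right > left
```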
The problem with pure reactive systems is that it is hard to choose the set of behaviors needed to get the performance one wants. In addition, many tasks demand that this set of behaviors change with the environment and/or with time. By introducing hybrid reactive/deliberative robotic systems these problems are solved; a classic planner or controller is introduced, and it will control and alter the reactive configuration as needed.
2.2.4 Three-layer architectures
As we have seen, neither the classic sense-plan-act architecture nor the pure reactive architecture are up to the tasks of robotics. Recently [11], an architecture surfaced which encompasses ideas from both the reactive and the deliberative architectures. This three-layer architecture consists of three components:

A reactive feedback control mechanism: This layer implements reactive couplings or mappings between sensors and actuators. These mappings should be continuous and fast; constant bounded in time and space complexity. In addition, the mappings should use no information about the state of the world.

A reactive plan execution mechanism: This layer implements more complex behaviors by controlling which mappings and parameters the controller should use at a given time. The plans it is executing come from the deliberative layer or are input by a designer at compile-time. This layer can afford more time consuming calculations than the reactive layer, but must be alert enough to notice changes in the reactive behaviors.

A deliberative planner: This layer performs time consuming tasks such as planning and other exponential search based algorithms. Some of the computationally heavy signal processing tasks also belong here.
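The division of labor between the three layers can be sketched as follows; all function names, behaviors and the canned plan are hypothetical illustrations, not from [11].

```python
# Sketch of the three-layer architecture: a fast reactive controller, a
# sequencer selecting which reactive mapping is active, and a (slow)
# deliberative planner producing plans. All names are illustrative.

def reactive_layer(percept, mapping):
    """Fast, stateless sensor-to-actuator mapping (constant time/space)."""
    return mapping(percept)

def sequencer(plan, step):
    """Select the mapping the controller should use at this point in the plan."""
    behaviors = {
        "avoid": lambda p: -p,        # steer away from the stimulus
        "approach": lambda p: p,      # steer toward the stimulus
    }
    return behaviors[plan[step]]

def deliberative_planner(goal):
    """Slow layer: here just a canned plan standing in for a real search."""
    return ["approach", "avoid"]

plan = deliberative_planner("reach target")
mapping = sequencer(plan, step=0)
command = reactive_layer(percept=0.5, mapping=mapping)
assert command == 0.5   # "approach" passes the stimulus straight through
```

The reactive layer never touches the plan; it only executes whichever mapping the sequencer has currently installed.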
We will in section 2.5 discuss an example of a robotic system which uses a modern deliberative/reactive architecture. We will also look closer at the problem of information representation between the layers in this architecture. In particular we will discuss how to connect, or anchor, symbolic descriptions used at higher layers to sensory data at lower levels. We will cover this in the context of a particular problem: anchoring symbols to visual data within the WITAS project.
2.3 Scenario: The Infant Robot
We will now discuss a scenario which describes what kind of robotic architecture we are considering in this thesis. The scenario will also help us to understand what kind of problems we have to deal with when designing such an architecture.
2.3.1 The scenario
Imagine that we have a robot equipped with a lot of sensors for vision, audio, sonar, tactile sensing etc. In addition, the robot also has proprioceptive sensors which measure the physical state of the robot (for example orientation, changes in velocity and the positions and the angles of its joints). The robot can affect the environment with a set of effectors; with some it can change its position (locomotion) and with some it can manipulate objects.

Consider now the case where the Information Processing Structure (IPS) of the robot has been designed and built using architectures similar to those discussed in section 2.2. In this case, the designer has to provide rules or models on how to combine the sensor input (also known as data fusion), and how to relate these inputs to the actions of the robot.

In this scenario, however, we want the robot to learn, through its own experience, how the percepts from the sensors are coupled to the responses of its effectors. Like an infant, the robot must learn how to organize its sensory stimuli by experimenting with its environment. The experiment with the kittens in section 1.1 is a good example of this. The IPS of this robot must contain a reactive architecture which supports learning and self-organization. It is an architecture of this kind we will discuss and study in the chapters to follow.

In the next section we will look at different aspects of architectures and response generation.
2.3.2 Response generation
The goal of the IPS is to generate actions so that the system functions properly in the environment. What does this mean, and what implications does it have for the design of the IPS and the robotic system?

A common and intuitive approach used in classic AI was discussed in section 2.2.2. In this approach the environment is interpreted, classified and labeled into symbolic objects. The system then typically updates its internal world-model using these object-descriptions and reasons about the world and what actions to make. This is a very natural approach, because we humans consciously reason in terms of objects and their relations. We must, however, be careful with this introspective method, because there are many results indicating that our conscious self and our conscious experiences of the world reside at the very end of the neural processing in the brain [15]. This conscious part handles what psychologists and AI-researchers call declarative knowledge, i.e., symbolic facts and events that we can reason about consciously, describe and communicate verbally. The processing that precedes the conscious part handles, among other things, motoric skills with a strong contextual dependence. This procedural knowledge is rich in information and may be difficult to express in language. A typical example is to describe how to ride a bicycle. Since it is these response generating motoric skills we are interested in (we are not striving to attain language, logic and consciousness), there is no reason to assume that the technical system we strive to obtain must handle objects and object-representations in forms that are similar to our conscious experiences.
Something these classic solutions have in common is that they in the perception module postulate that we must know what and where an object is in the environment before proper responses can be generated. Granlund [15, 17] argues that this conception is fundamentally wrong and that we in this area are fooled by the luxury of our own consciousness. There are many different understandings on this topic, but Granlund argues that many misunderstandings arise from the fact that people do not realize that different approaches are needed depending on whether the goal is generation of responses or object related tasks such as communication or measurements. Therefore, Granlund postulates that there are two different domains in Spatial-Cognitive Information Processing where completely different rules apply. Table 2.1 shows the differences between the two domains.

If we use the classical approach and try to identify objects and label the scenario before the generation of responses, we can get into trouble. In the process of labeling we deliberately cut the contextual links from the objects in order to get them as invariant as possible (for template matching for instance), and we do not include information in the object descriptions about what responses were involved in order to gain the description in the first place. But this information is of vital importance when we want to generate responses which are highly contextual in their nature. Classical systems solve this by trying, after identification and estimation of features, to see how objects relate to each other; i.e., they try to recreate the context after they have thrown it away!
The system we are interested in and are going to investigate further in this thesis is the reactive domain in Spatial-Cognitive Information Processing. In such a system, we do not strive to attain consciousness or some kind of understanding of the environment. We are only trying to generate effective responses by associating percepts with responses on different levels of abstraction. In this domain we do not try to find objects and identify these. In fact, the representations used in this domain refer more to situations than to something we usually recognize as an object; a book is for example something completely different from the same book rotated 90 degrees, because the responses needed to deal with it are different. This processing is in addition nonsymbolic and continuous, in contrast to our conscious symbolic reasoning.
Table 2.1: The two domains in Spatial-Cognitive Information Processing.

Reactive - Associative                                         | Deliberative - Symbolic
Concatenated local maps                                        | Globality of spaces
View-centered representation                                   | Object-centered representation
Limited invariance of objects with reference to observer state | Highly invariant object representation
Context specific                                               | Contextually invariant
Low resolution linkage structures using feedback and servoing  | High resolution implementations of precise mathematical models
Associative learning                                           | Prescribed models
Distributed systems                                            | Centralized systems
Operations and data mixed. Neural networks                     | Separation between operations and data. Conventional computers
Interacting with environment                                   | One-way projective
Information is acquired by system on its own terms             | Information is input by designer
Motor functions are part of percept organization structure     | Output is arranged by designer
Semantic representation. Use of experienced coincidences       | Symbolic representation. Use of prescribed rules
Use of dynamics for acquisition and redundant operation        | Prescribed modes of interpretation
Self-organizing structure                                      | Functionality surrounded by more intelligent shell
Procedural                                                     | Declarative
Action                                                         | Communication/Language
2.4 Learning and self-organization
2.4.1 Why learning?
What the architectures we have seen in section 2.2 have in common is that no learning or adaptivity has taken place. The cost for this, as discussed earlier, is that the models of the environment and/or proper behaviors have to be built into the system from the start. If we have the situation that we know the models will change (but not how), or the models are unknown or very hard to establish, learning and adaptivity of the system are needed.

But how much and what should the system learn? If we specify too little and give the system too many degrees of freedom to learn, the learning time will be huge, if the learning converges at all. If we specify too much, the learning will probably be fast, but the resulting system may be too restricted to solve the problem.

This problem is known as the Bias-Variance dilemma [28]. What we want to do is to find just the amount of bias needed for the system to learn the problem-domain in consideration in a fast and reliable manner. Finding this bias is in general not easy.

Specifying what a system should learn is another problem. For a given problem one can almost always find an infinite number of solution approaches and solution structures.
2.4.2 What should we learn?
First we must decide on what time-scale our system is operating. This scale decides what we can afford to learn. Learning, for instance, sensor-types, algorithms, hardware wiring and other fundamental structural requirements for a solution operates on an evolutionary time-scale. On this scale whole systems are evaluated against each other and the fittest systems are allowed to evolve. Learning methods on this time-scale are genetic algorithms (GAs) [6, 24], including genetic programming methods (GPs) [27].

The time-scale we are interested in here is the scale of the individual system. On this scale we want the system to explore its environment (the problem-domain) and learn from its experiences. Given that the system is restricted with some well chosen bias, we want the learning to be fast. With fast we mean that the system should learn to produce the right response or behaviour after just a couple of tries or examples. This is sometimes called instance-based learning.

What the system should learn depends on what bias the system is restricted or initiated with. For some setup, the system could for example learn to associate percepts with responses in an associative memory, or learn to define and combine local models for generation of responses. Different methods for learning on this time-scale are supervised, unsupervised and reinforcement learning (see section 2.4.3).
According to Ballard [6], the system should first learn to react. The fastest way to calculate something is to look up the answer in a look-up table or memory. A proper response pattern can be generated as fast as the implementation allows. We can however not store everything; new associative couplings must be sufficiently different from the existing memories to be worth storing, but at the same time similar enough to the existing memories so they can be related.

Ballard also suggests that larger behaviours or programs can be formed by using sequences of these associative memories. These programs use hidden Markov models (HMMs) to define the problem state space and operators that express the transitions between the states. Reinforcement learning is then used to find paths through the HMMs that tend to give high rewards.
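The look-up idea, together with the storage criterion above (sufficiently different to be worth storing, yet similar enough to be relatable), can be sketched as a small associative memory. The Euclidean distance measure and both thresholds are arbitrary assumptions for the illustration.

```python
# Sketch of an associative percept-response memory with a novelty criterion:
# store a new coupling only if it is sufficiently different from all stored
# percepts (worth storing) but close enough to at least one (relatable).
# Thresholds and the distance measure are arbitrary illustrations.

def distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

class AssociativeMemory:
    def __init__(self, novelty=0.1, relatedness=1.0):
        self.pairs = []            # list of (percept, response) couplings
        self.novelty = novelty
        self.relatedness = relatedness

    def recall(self, percept):
        """Look up the response of the nearest stored percept."""
        if not self.pairs:
            return None
        return min(self.pairs, key=lambda pr: distance(pr[0], percept))[1]

    def store(self, percept, response):
        dists = [distance(p, percept) for p, _ in self.pairs]
        if not dists:                                        # first memory
            self.pairs.append((percept, response))
        elif self.novelty < min(dists) < self.relatedness:   # novel yet relatable
            self.pairs.append((percept, response))

mem = AssociativeMemory()
mem.store((0.0, 0.0), "rest")
mem.store((0.5, 0.0), "reach")      # novel but relatable: stored
mem.store((0.5, 0.001), "reach2")   # too close to an existing memory: ignored
assert mem.recall((0.6, 0.1)) == "reach"
```

Recall is a pure look-up, so the response pattern is generated as fast as the nearest-neighbour search allows.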
2.4.3 Learning paradigms

Supervised Learning

[Figure: block diagrams of a System and a Trainer exchanging input x and output y, with an error signal ε (top) or a reward r (bottom)]

Figure 2.2: Supervised learning (top), reinforcement learning (bottom)
In supervised learning (see figure 2.2), the teacher shows input data (x) to the system in training, which returns its corresponding output data (y). The teacher knows the correct answers and provides the system with an error signal (ε), showing the system how it should change in order to decrease the errors in its output. We may think of the teacher as having complete knowledge of the problem domain. Eventually, the system will emulate the teacher. This kind of learning is very common in the field of Artificial Neural Networks (ANNs), see also section 4.2.3.
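As a minimal illustration of the error signal ε at work, consider a single linear unit trained with a delta-rule update; the learning rate, the data and the target function are assumptions for this example only, not the ANN implementation of chapter 4.

```python
# Minimal sketch of supervised learning with an error signal: a linear
# unit y = w*x is nudged by the teacher's error eps = t - y (delta rule).
# The target function, learning rate and data are arbitrary illustrations.

def train(samples, lr=0.1, epochs=50):
    w = 0.0
    for _ in range(epochs):
        for x, t in samples:
            y = w * x                 # system output
            eps = t - y               # teacher's error signal
            w += lr * eps * x         # change the system to reduce the error
    return w

# The teacher knows the correct answers, here t = 2*x.
samples = [(1.0, 2.0), (2.0, 4.0), (-1.0, -2.0)]
w = train(samples)
assert abs(w - 2.0) < 1e-3            # the system has emulated the teacher
```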
Unsupervised Learning

Unsupervised learning can be thought of as a special case of supervised learning where the teacher is built into the system. In this case, one usually wants the system to learn a representation of the input. A task independent measure of quality of this representation replaces the teacher. Auto-association and clustering are typical examples.
Reinforcement Learning

In reinforcement learning (see figure 2.2), the learning system has a critic as a teacher. The difference between reinforcement learning and supervised learning is that the critic does not know the answers or how to react in a certain state. The critic can only give a measure, a reward (r), of how well or how badly the system has performed. This is a more general method than supervised learning, because it is much easier to generate a reward signal than to generate the right answers or reactions in a certain state. However, the learning is more difficult, because the teacher cannot provide any guidance on how the system should change in order to improve its performance. In addition, the reward signal may be the result of actions the system took in its past, i.e., the reward may be delayed. This implies that a system using reinforcement learning must solve the temporal credit assignment problem. See ref. [8, 28] for discussions on reinforcement learning.
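A standard way to handle the delayed reward is temporal-difference learning. The tabular Q-learning sketch below, on a toy four-state chain where only the last transition is rewarded, is a generic illustration (environment and parameters are assumptions), not a method proposed in this thesis.

```python
# Sketch of reinforcement learning with a delayed reward: tabular
# Q-learning on a 4-state chain where only reaching the last state pays.
# Discounted bootstrapping propagates credit back to earlier actions
# (temporal credit assignment). All parameters are illustrations.

import random

N_STATES, ACTIONS = 4, (0, 1)          # action 1 moves right, 0 stays
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma = 0.5, 0.9

def step(state, action):
    next_state = min(state + action, N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0   # delayed reward
    return next_state, reward

random.seed(0)
for _ in range(500):                                      # training episodes
    s = 0
    while s < N_STATES - 1:
        a = random.choice(ACTIONS)                        # explore at random
        s2, r = step(s, a)
        best_next = max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# Moving right is preferred in every non-terminal state, even though the
# critic only rewarded the very last step of each episode.
assert all(Q[(s, 1)] > Q[(s, 0)] for s in range(N_STATES - 1))
```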
2.5 Fuzzy matching of visual cues
2.5.1 Introduction
Autonomous mobile vehicles need to use computer vision capabilities in order to perceive the physical world and to act intelligently in it. These systems also need the ability to perform high-level, abstract reasoning in order to operate reliably in a dynamic and uncertain world without the need for human assistance. For example, a mail delivery robot faced with a closed door should decide whether it is better to plan an alternative way to achieve its goal, or to reschedule its activities and try this delivery again later on. In general, autonomous vehicles need to incorporate a decision-making system that uses perceptual data to guide the activity of the vehicle toward the achievement of the intended task, and also guides the activity of the perceptual subsystem according to the priorities of this task.
An important aspect in integrating the decision-making and the computer vision systems is the connection between the abstract representations used by the symbolic decision-making system to denote a specific physical object, and the data in the computer vision system that correspond to that object. Following [34], we call anchoring the process of establishing this connection. We assume that the decision-making process associates each object to a set of properties that (non-univocally) describe that object. Anchoring this object then means to use the vision apparatus to find an object whose observed features match the properties in this description. For example, suppose that the symbolic system has an object named `car-3' with the description `small red Mercedes on Road-61.' Anchoring this object means to: (i) find an object in the image that matches this description; and (ii) update the description of `car-3' by using the observed features, so that the same object can later be re-identified.
One of the difficulties in the anchoring problem is that the data provided by the vision system are inherently affected by a large amount of uncertainty. This makes it difficult to reliably match observed data against the high-level description of an intended object. In order to improve the reliability of the anchoring process, this uncertainty has to be taken into account in the proper way. A solution is to use techniques based on fuzzy logic to define a degree of matching between a perceptual signature and an object description. The possibility to distinguish between objects that match a given description at different degrees is pivotal to the ability to discriminate perceptually similar objects under poor observation conditions. Moreover, these degrees allow us to consider several possible anchors, ranked by their degree of matching. Finally, these degrees can be used to reason about the quality of an anchor in the decision-making process; for example, we can decide to engage in some active perception in order to get a better view of a candidate anchor.
In the sections to come, we will deal with the anchoring problem in the context of an architecture for unmanned airborne vehicles. This architecture, outlined in the next section, integrates several subsystems, including a vision system and an autonomous decision-making system. In section 2.5.3, we discuss how we represent the inexact data provided by the vision system with fuzzy sets. In section 2.5.4, we show how we compute the degrees of matching between these data and the intended descriptions. Section 2.5.5 illustrates the use of these degrees by going through a couple of examples, run in simulation. Finally, section 2.5.6 discusses the results and traces future directions.
2.5.2 The WITAS project
Figure 2.3: A scene from the WITAS project.
The WITAS project, or The Wallenberg laboratory for research on Information Technology and Autonomous Systems, is a research laboratory within Linköping University, Sweden. As the name indicates, the goal of the project is to do research on information technology and autonomous systems, unmanned aerial vehicles (UAVs) in particular. In more concrete terms this currently means constructing a small, unmanned helicopter with sensors and computers, able to autonomously fly and survey automobile traffic, see figure 2.3.
The general architecture of the system is a standard three-layered agent architecture consisting of a deliberative, a reactive, and a process layer:

- The deliberative layer generates at run-time probabilistic high-level predictions of the behaviors of agents in their environment, and uses these predictions to generate conditional plans.

- The reactive layer performs situation-driven task execution, including tasks relating to the plans generated by the deliberative layer. The reactive layer has access to a library of task and behavior descriptions, which can be executed by the reactive executor.

- The process layer contains image processing and flight control, and can be reconfigured from the reactive layer by means of switching on and off groups of processes.
Besides vision, the sensors and knowledge sources of the system include: a global positioning system (GPS) that gives the position of the vehicle, a geographical information system (GIS) covering the relevant area of operation, and standard sensors for speed, heading and altitude.

The system is fully implemented in its current version. Because of the nature of the work, most of the testing is being made using simulated UAVs in simulated environments, even though real image data has been used to test the vision module. In a second phase of the project, however, the testing will be made using real UAVs. More information about the project can be found at the WITAS web page [38].
Of particular interest for this presentation is the interaction between the reactive layer and the image processing in the process layer. This is done by means of a specialized component for task-specific sensor control and interpretation, called the Scene Information Manager (SIM). This system, illustrated in figure 2.4, is part of the reactive layer and it manages sensor resources: it reconfigures the vision module on the basis of the requests of information coming from the reactive executor, it anchors symbolic identifiers to image elements (points, regions), and it handles simple vision failures, in particular temporary occlusion and errors in car re-identification.

Two of the main aspects of anchoring implemented in the SIM are identification of objects on the basis of a visual signature expressed in terms of concepts, and re-identification of objects that have been previously seen, but have then been out of the image or occluded for a short period.
For identification and re-identification the SIM uses the visual signature of the object, typically color and geometrical description, and the expected positions of the object. For instance, if the SIM has the task to look for a red, small Mercedes near a specified crossing, it provides the vision module with the coordinates of the crossing, the Hue, Saturation and Value defining red, and the length, width
Figure 2.4: Overview of the Scene Information Manager and its interaction with the Vision module and the Reactive Executor.
and area of the car model. These prototypical values each have a degree of inaccuracy, and the SIM also provides the vision module with the intervals inside which the measurement of each of the features is acceptable. The size of the interval depends on how discriminating one wants to be in the selection of the objects and also, in the case of re-identification of an object, on how accurate previous measurements of the object were.
The vision module receives the position where to look for an object and the visual signature of the object, and it is then responsible for performing the processing required to find the objects in the image whose measures are in the acceptability range, and for reporting the information about the objects to the SIM. The vision module moves the camera toward the requested position and, for each object in the image and for each requested feature of the object, it calculates an interval containing the real value. If the generated interval intersects with the interval of acceptability provided in the visual signature for the feature, the feature is considered to be in the acceptability range. The vision module reports information about color, shape, position, and velocity of each object whose features are all in the acceptability range to the SIM.
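This acceptance test can be sketched as follows (a minimal illustration of the interval-intersection rule just described; the feature names and data layout are assumptions, not the thesis implementation):

```python
def intervals_intersect(measured, acceptable):
    """True if the measured interval overlaps the acceptability interval."""
    (m_lo, m_hi), (a_lo, a_hi) = measured, acceptable
    return m_lo <= a_hi and a_lo <= m_hi

def object_accepted(measured_features, signature):
    """An object is reported to the SIM only if every feature requested
    in the signature falls in its acceptability range."""
    return all(intervals_intersect(measured_features[f], signature[f])
               for f in signature)

# Hypothetical example: length and hue intervals for one candidate car
signature = {"length": (3.5, 5.5), "hue": (-30.0, 30.0)}
candidate = {"length": (4.05, 5.83), "hue": (-5.0, 10.0)}
```

Note that a candidate passes as soon as its intervals merely touch the acceptability ranges, which is exactly why several similar objects may be reported back, as discussed next.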
Intersection of intervals is a simple, but not very discriminating, method to identify an object. As a consequence, several objects that are somehow similar to the intended one can be sent back by the vision module to the SIM. The SIM then needs to apply some criteria in order to perform a further selection of the best matching object among those reported by the vision module. The selection of the best matching object should depend on how well the objects match the different aspects of the signature, but also on the accuracy of the measurements performed by the vision and their reliability. In what follows, we show how we address this problem.
2.5.3 Fuzzy-set representation of visual cues
Cues obtained from the vision system, e.g., color, shape, position and velocity, are affected by uncertainty and imprecision in several ways. In this work, we propose to explicitly represent the inexactness which is inherent in these data, and to take this inexactness into account when performing signature matching. In order to justify our representation, we need to analyze the way in which we extract the needed parameters from the image.
Consider the measurement of the shape parameters (length, width and area) of an observed car. Roughly, the measurement starts with a segmented and labeled binary image containing our candidate cars. This binary image is created by combining and thresholding the feature images produced by the different feature channels available, e.g., orientation, color, IR and velocity (currently, we only use the color channels). For each object in the labeled image, we then compute the moment-of-inertia matrix. From this 2×2 matrix, we calculate the two eigenvalues, which correspond to the largest and smallest moment of inertia, respectively, and convert them into the length and width of the object under the assumption that our objects (cars) are rectangular. We also measure the area by counting the pixels that belong to the same object. The length, width and area measures are then converted to metric measures through multiplication by a scale factor describing the meter-per-pixel ratio. This ratio is computed from the field-of-view angle and from the position and angles of the camera.
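The eigenvalue step can be sketched as follows (a sketch under the stated rectangle assumption; it works in pixel coordinates and leaves out the metric scale factor):

```python
import numpy as np

def rectangle_dimensions(mask):
    """Estimate length and width (in pixels) of a labeled object from the
    eigenvalues of its 2x2 moment-of-inertia matrix, assuming the object
    is rectangular. For a uniform rectangle, the variance along the
    length axis is l^2/12, so l = sqrt(12 * lambda_max)."""
    ys, xs = np.nonzero(mask)
    area = xs.size
    coords = np.stack([xs - xs.mean(), ys - ys.mean()])
    inertia = coords @ coords.T / area          # 2x2 covariance matrix
    lam = np.sort(np.linalg.eigvalsh(inertia))  # ascending eigenvalues
    length = np.sqrt(12.0 * lam[1])
    width = np.sqrt(12.0 * lam[0])
    return length, width, area

# A synthetic 30x10-pixel rectangular "car"
mask = np.zeros((50, 60), dtype=bool)
mask[20:30, 15:45] = True
l, w, a = rectangle_dimensions(mask)
```

Multiplying l and w by the meter-per-pixel scale factor (and a by its square) then yields the metric measures used in the text.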
There are a number of factors that influence the correctness of the values measured by the above procedure. First, in the segmentation phase, the discretization of the image limits the precision of the measure. Second, continuing the segmentation phase, we apply some binary operations (e.g., fill and close) on the binary image in order to try to connect and bind segmented pixels into objects. These operations slightly alter the shape, thus limiting the precision. The above two factors together produce a segmentation error, denoted by ε_s. Third, the measurement model may be inaccurate, thus introducing an error, the model error, denoted by ε_m; for example, the above assumption that cars are rectangular is almost never completely true. Note that the impact of the ε_s and ε_m errors on the quality of the measurements depends on the size of the car in the image, which in turn depends on its distance from the camera and on the focal length of the camera. A fourth factor that affects the measurement is the perspective distortion due to the angle α between the normal of the car plane and the optical axis: if the car plane is not perpendicular to the optical axis, the projection of the 3D car on the image plane will be shorter. We denote this perspective error by ε_α. Finally, all the geometric parameters needed to compute the length may themselves be affected by errors and imprecision. For example, the distance from the camera depends on the relative position of the helicopter and the car; and the angle α depends on the slope of the road; both these values may be difficult to evaluate. We summarize the impact of these factors on our measurement in a geometric error term, denoted by ε_g.³
³ There are more sources of errors in this process. For example, when α increases, the car's
The above discussion reveals that there is a great amount of uncertainty that affects the measured value, for example, the length of an object; and that this uncertainty is very difficult to precisely quantify; in other words, we do not have a model of the uncertainty that affects our measures. Similar observations can be made for other features measured by the vision system: for example, the measurement of the color of an object is influenced by the spectral characteristics of the light that illuminates that object. Given this difficult nature of the uncertainty in the data coming from the vision system, we have chosen to represent these data using fuzzy sets [40]. Fuzzy sets offer a convenient way to represent inexact data whose uncertainty cannot be characterized by a precise, stochastic model, but for which we have some heuristic knowledge. For example, Fig 2.5 (left) shows the fuzzy set that represents a given length measurement. For each value x, the value of this fuzzy set at x is a number in the [0, 1] interval that can be read as the degree by which x can be the actual length of the object given our measurement. (See [41] for this possibilistic reading of a fuzzy set.)
Figure 2.5: Fuzzy sets for the measured length (left) and hue (right).
In our work, we use trapezoidal fuzzy sets, both for computational reasons and for ease of construction. The possibilistic semantics give us some simple guidelines on how to build a trapezoidal fuzzy set to represent an inexact measurement. The flat top part of the fuzzy set (its core) identifies those values x that can be fully regarded as the actual length value given our measurement. In the example in Fig 2.5 (left), these values are spread over an interval rather than concentrated in a point because of the segmentation effect: our measurement cannot tell us more than what is allowed by the pixel size. The base of the fuzzy set (its support) identifies those values x that can possibly be regarded as the actual length value: given the errors that may affect our measurement, the actual length may be anywhere in the support interval, but under no circumstances can it be outside this interval. Put differently, the support constitutes a sort of worst-case estimate: however big the error is, the actual value must lie somewhere in this interval. The core, on the other hand, constitutes a best-case estimate: even when there is no error in our measurement, we cannot be more precise than this.
Let us now discuss in detail how we have built the fuzzy set in Fig 2.5.
measured value can be totally invalid if there has been an error in the segmentation and/or labeling phases; for instance, if the car has been merged with its shadow, or with another car in the image.
The vision system has calculated the length to 29.9 pixels, which corresponds to l = 4.23 meters. The segmentation error ε_s is estimated to a constant 1 pixel, which with a scale factor of s = 0.14 meter/pixel gives us ε_s = 0.14 meter. This segmentation error is inherent to our measurement process, no matter how good our models and computations are, and it thus defines the core of the trapezoid in the picture, given by the interval [l − ε_s, l + ε_s] = [4.09, 4.37].
Our estimates for the other errors are all collected in the support of the trapezoid. The model error ε_m is estimated in a coarse but simple way by comparing the measured area a_m with the computed area a_c = w·l (where w is the calculated width). The difference between these areas defines ε_m such that a_m will lie in the interval [(w − ε_m)(l − ε_m), (w + ε_m)(l + ε_m)]. If, for example, a_m is greater than a_c, ε_m becomes (as a simplification we have assumed that ε_m is the same for both the width and the length):

    a_m − (w + ε_m)(l + ε_m) = 0  ⟹  ε_m = −(w + l)/2 + √((w + l)²/4 + (a_m − a_c))    (2.1)

which in our case gives us ε_m = 0.04 m. As for the perspective error ε_α, in our case we have α = 40.3°. If we assume that we measure the projected length as l·cos α, then the worst-case error due to α becomes ε_α = l_MAX(1 − cos α), where l_MAX is the estimation of the maximum object length. If we set l_MAX = 6 m we get ε_α = 1.42 m. Since the support of our fuzzy set must include all the values which are possible in a worst-case error situation, we include all the above errors in it.⁴ This gives us the interval [l − ε_s − ε_m, l + ε_s + ε_m + ε_α] = [4.05, 5.83] for the base of our trapezoid. Note that ε_α only affects the upper bound of the interval, i.e., the car may seem smaller in the image when α increases. The correct length in our example was 4.42 m.
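The whole construction can be collected in a short sketch (illustrative code, not from the thesis; the width w = 1.7 m and measured area a_m = 7.431 m² are hypothetical values chosen so that eq. (2.1) reproduces the ε_m ≈ 0.04 m of the running example):

```python
import math

def length_trapezoid(l, eps_s, w, a_m, alpha_deg, l_max):
    """Core and support intervals of the trapezoidal fuzzy set for a
    length measurement, following the error model in the text
    (the a_m >= a_c case of eq. (2.1))."""
    a_c = w * l
    # model error, eq. (2.1)
    eps_m = -(w + l) / 2 + math.sqrt((w + l) ** 2 / 4 + (a_m - a_c))
    # worst-case perspective error
    eps_a = l_max * (1 - math.cos(math.radians(alpha_deg)))
    core = (l - eps_s, l + eps_s)
    support = (l - eps_s - eps_m, l + eps_s + eps_m + eps_a)
    return core, support

core, support = length_trapezoid(4.23, 0.14, 1.7, 7.431, 40.3, 6.0)
```

With the example numbers this reproduces the core [4.09, 4.37] and support [4.05, 5.83] derived above.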
The construction of the fuzzy sets for the other features follows similar guidelines. For example, Fig 2.5 (right) shows the fuzzy set that represents the observed Hue value. (At the current stage of development, however, we have mainly focused on the shape parameters.) Although the definitions of these fuzzy sets are mostly heuristic, they have resulted in good performance in our experiments.
2.5.4 Fuzzy signature matching
We now focus on the problem of anchoring a high-level description coming from the symbolic system (reactive executor) to the data coming from the vision module. As an example, consider the case in which a task needs to refer to `a small red Mercedes.' The SIM system has to link two types of data: on one side, the description containing the symbols `red,' `small' and `Mercedes' received from the symbolic system; and on the other side, the measurable parameters of the observed objects which are sent by the vision system. Anchoring implies converting these representations to a common frame, and finding the car that best matches the description.⁴ In our case, we have chosen to convert symbols to the universe of the measurable parameters.

⁴ In our current experiments in the simulated environment, we have ε_g = 0 since the helicopter position is exactly known.
In general, symbolic descriptions contain linguistic terms like `red' and `small' that do not denote a unique numerical value. Sticking to a common practice [26, 40], we have chosen to map each linguistic term of this kind to a fuzzy set over the relevant frame. For example, we associate the term `red' with the fuzzy set shown in Fig 2.6 (left): for each possible value h, the value of red(h) measures, on a [0, 1] scale, how much h can be regarded as `red.'⁵ As a second example, Fig 2.6 (right) shows how we represent the length associated to the linguistic term `small-Mercedes' by a fuzzy set over the space of possible lengths. In our system, we use a database that associates each car type to its typical length, size, and area, represented by fuzzy sets. Cars of unknown types are associated with generic fuzzy sets, like the `small' (car) set in the picture. Once again, we only use trapezoidal fuzzy sets for computational reasons.
Figure 2.6: Fuzzy sets associated to the symbols `small-Mercedes' and `red.'
Once we have represented both the desired description and the observed data by fuzzy sets, we can compute their degree of matching using fuzzy set operations. This choice is justified in our case since fuzzy sets can be given a semantic characterization in terms of degrees of similarity [32]. Consider two fuzzy sets A and B over a common domain X which respectively represent the observed data and the target description. The degree of matching of A to B, denoted by match(A, B), is the degree by which the observed value A can be one of those that satisfy our criterion B. In the experiments presented in this note, we use the following measure:

    match(A, B) = ( ∫_{x∈X} min{A(x), B(x)} dx ) / ( ∫_{x∈X} A(x) dx )    (2.2)

Intuitively, (2.2) measures the degree by which A is a (fuzzy) subset of B by looking at how much of A is contained in B. (See, e.g., [12] for different measures.) The behavior of this measure is graphically illustrated in Fig. 2.7. To ensure an efficient computation, we approximate (2.2) by the ratio between the area of the inner trapezoidal envelope of A∩B and the area of A. These areas can be computed very easily when A and B are trapezoidal fuzzy sets.
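A direct numerical evaluation of (2.2) for trapezoidal sets can be sketched as follows (a straightforward discretization for illustration; the thesis instead uses the faster trapezoidal-envelope approximation mentioned above, and the `small' description set here is hypothetical):

```python
import numpy as np

def trapezoid(a, b, c, d):
    """Membership function of a trapezoidal fuzzy set with
    support [a, d] and core [b, c]."""
    def mu(x):
        x = np.asarray(x, dtype=float)
        return np.clip(np.minimum((x - a) / (b - a), (d - x) / (d - c)),
                       0.0, 1.0)
    return mu

def match(A, B, lo, hi, n=20001):
    """Degree by which observation A is a fuzzy subset of description B,
    eq. (2.2), by Riemann summation over [lo, hi]."""
    x = np.linspace(lo, hi, n)
    dx = (hi - lo) / (n - 1)
    return (np.minimum(A(x), B(x)).sum() * dx) / (A(x).sum() * dx)

# Observed length (core [4.09, 4.37], support [4.05, 5.83]) against a
# hypothetical description of a `small' car
A = trapezoid(4.05, 4.09, 4.37, 5.83)
B = trapezoid(3.0, 3.5, 5.0, 6.0)
degree = match(A, B, 0.0, 10.0)
```

Note the asymmetry of (2.2): match(A, B) asks how much of the observation lies inside the description, so swapping the arguments generally yields a different degree.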
Figure 2.7: Three examples of partial matching: (a) match(A, B) = 0.4; (b) match(A, B) = 0.8; (c) match(A, B) = 1.0.
Once we have computed a degree of matching for each individual feature, we need to combine all these degrees together in order to obtain an overall degree of matching between the intended description and a given percept. In our case, we need to combine the degrees of matching of the length, width, area, hue, saturation, and value criteria into one summarized degree of matching. The simplest way to combine our degrees is by using a conjunctive type of combination, where we require that each one of the features matches the corresponding part in the description. Conjunctive combination is typically done in fuzzy set theory by T-norm operators [26, 37], whose most used instances are min, product, and the Łukasiewicz T-norm max(x + y − 1, 0). In our experiments, we have noticed that the latter operator provides the best results. (See [7] for an overview of the use of alternative operators with applications to image processing.)
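Applied pairwise over the feature degrees, the Łukasiewicz combination looks as follows (a small sketch; note that with a color degree of 1.0 it simply passes the shape degree through, which is why the overall column of table 2.2 equals the shape column):

```python
from functools import reduce

def lukasiewicz(x, y):
    """Lukasiewicz T-norm: conjunctive combination of two degrees."""
    return max(x + y - 1.0, 0.0)

def overall_degree(feature_degrees):
    """Fold the per-feature matching degrees into one overall degree."""
    return reduce(lukasiewicz, feature_degrees, 1.0)
```

Replacing `lukasiewicz` with `min` or with multiplication gives the other two T-norms mentioned in the text.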
The overall degree of matching is used by the SIM to select the best anchor among the candidate objects provided by the vision module. For each candidate, the SIM first computes its degree of matching to the intended description, then it ranks these candidates by their degree, and finally returns the full ordered list to the reactive executor. Having a list of candidates is convenient if the currently best one later turns out not to be the one we wanted. Also, it is useful to know how much better the best matching candidate is than the other ones: if the two top candidates have similar degrees of matching, we may decide to engage in further exploratory actions in order to disambiguate the situation before committing to one of them; for instance, we may give the vision system the task to zoom in on each candidate in turn in the hope of getting more precise data.
2.5.5 Fuzzy signature matching at work
We illustrate the use of the fuzzy signature matching by two examples on a scenario taken from the WITAS project. In this scenario, the deliberative system is interested in a red car of a specified model in the vicinity of a given crossing. Four cars are situated around that crossing, moving in different directions. The cars are all red, but of different models: a small van, a big Mercedes, a small Mercedes, and a Lotus. In the first example the helicopter is above the cars. In the second example, discriminating between the cars is made more difficult by the fact that the helicopter views the crossing at an inclination of about 30 degrees (see Fig. 2.8).
Figure 2.8: The simulated scenario for our examples.
In our first example, the deliberative system decides to follow `Van-B', which is described as a red van. The SIM sends the prototype signature of a red van to the vision module. Since all the four cars in the image are red, and they have fairly similar shapes, the vision module returns the observed signatures of all the four cars to the SIM. These signatures are then matched against the desired signature by our routines, resulting in the degrees of matching shown in table 2.2.

    ID   Color   Shape   Overall
    66   1.0     0.58    0.58
    67   1.0     0.38    0.38
    68   1.0     1.0     1.0
    69   1.0     0.0     0.0

    Table 2.2: Degrees of matching, first example: Van-B

The ID is a label assigned by the vision system to each car found in the image. The degree of matching for the color is obtained by combining the individual degrees of Hue, Saturation, and Value; in our case, this will be 1.0 for all the cars as they are all red. The degree of matching for the shape is the combination of the individual degrees of matching of length, width, and area. The overall degree is the Łukasiewicz combination of the color and shape degrees. In this case, car 68 is correctly⁶ identified as the best candidate, and an anchor to that car is thus returned to the deliberation system.
⁶ This verification was done manually off-line by analyzing some additional information, like the extraction of geometrical features.
In the second example, the deliberative system is interested in `Car-D', a red small Mercedes. The SIM sends the corresponding prototypical signature to the vision module, and again gets the signatures of all the four cars in the image as an answer. In this case, however, the helicopter is at a long distance from the crossing and it views the crossing at an inclination of about 30 degrees. By applying our fuzzy signature matching routine, we obtain the degrees presented in table 2.3.

    ID   Color   Shape   Overall
    66   1.0     0.65    0.65
    67   1.0     0.84    0.84
    68   1.0     0.0     0.0
    69   1.0     0.97    0.97

    Table 2.3: Degrees of matching, second example: Car-D

Cars 66, 67 and 69 match the desired description to some degree, while car 68 can safely be excluded. The SIM decides that these degrees are too close to allow a safe discrimination, and it tries to improve the quality of the data by asking the vision module to zoom in on each one of cars 66, 67, and 69 in turn. Using the observed signatures after zooming, the SIM then obtains the new degrees of matching, shown in table 2.4.

    ID   Color   Shape   Overall
    66   1.0     0.30    0.30
    67   1.0     0.70    0.70
    69   1.0     0.21    0.21

    Table 2.4: Degrees of matching, second example after zoom: Car-D
The closer view results in a smaller segmentation error, since the scale factor is smaller, and hence in narrower fuzzy sets. As a consequence, all the degrees of matching have decreased with respect to the previous observation. What matters here, however, is the relative magnitude of the degrees obtained from comparable observations, that is, those which are collected in the above table. These degrees allow the SIM to select car 67 as the best candidate.

The SIM now also has the option to try to further improve its choice by commanding the helicopter to fly over car 67 and take another measurement from above the car, the best observation conditions for the vision system. If we do this, we finally obtain a degree of matching of 1.00 for car 67. Note that this degree could as well have dropped, thus indicating that car 67 was not really the car that we wanted. In this case, the SIM might have used the partial match information and gone back to cars 66 and 69 to get more accurate views.
2.5.6 Conclusions
Anchoring symbols to the physical objects they are meant to denote requires the integration of symbolic and perceptual representations. The past sections considered an instance of the anchoring problem in which we link the car identifiers used at the decision-making level to the perceptual data provided by a vision system. Our experimental results indicate that our technique is adequate to handle the ambiguities that arise when integrating uncertain perceptual data and symbolic representations. In particular, fuzzy signature matching improves our ability to discriminate among perceptually similar objects in difficult situations (e.g., perspective distortion). Moreover, degrees of matching allow us to exclude unlikely candidates, and to rank the likely ones by their similarity to the intended description. Finally, these degrees can help in decision making; for example, if these degrees indicate a large amount of anchoring ambiguity, the system may decide to engage in active information gathering, such as zooming or getting closer to the object in order to obtain better information.
The work reported here is still in progress, and many aspects need to be further developed. First, the treatment of perspective distortions presented here is rather primitive; in our next experiments, we shall use different models of each car viewed from different observation angles. Second, we need to account for still more sources of errors, including the possibility that the detected object is not a car. Third, we need to study more sophisticated forms of aggregation of the individual degrees of matching of different features into an overall degree. For example, in some situations some of the features are more critical than others, and we would like their degree of matching to have a stronger impact on the overall degree. Fourth, we plan to include features of a different nature in the matching process, like the observed position and velocity of the cars. Finally, until now we have only performed experiments in simulation. At the current stage of development of the WITAS project, the vision system takes the video frames produced by a 3D simulator as input. Although this configuration results in some amount of noise and uncertainty in the extracted features, we are aware that a real validation of our technique will only be possible when we have access to the real data from an actual UAV.
The proposed reactive architecture
3.1 Introduction
In this chapter we will discuss the nuts and bolts of a proposed reactive architecture which shows some of the properties discussed in the Infant Robot scenario in section 2.3. First, we will present the representation we have chosen for the input/output signals. Then, we discuss the design of the architecture. The chapter ends with a presentation of the different phases involved in training the architecture, e.g., how to evaluate and create models.
3.2 Structure description
3.2.1 The channel representation
The signals arriving from the sensors to the ISP of the robot can be seen as projections of some variables in the environment, accessible to the robot only through its sensors. The effectors of the robot can be seen as a way of affecting some other variables in the environment. We will onwards denote the variables controllable by the robot response variables, and the variables only viewable via sensors percept variables. How the sensors and effectors represent these variables is very important to know when we design our reactive architecture. We will in the following assume that the percept and response variables are represented using the channel representation [8, 13, 14, 18, 30].

In this representation we let a set of channels represent a variable. Each channel is sensitive to a small part of the variable domain and can be viewed as a band-pass filter. The function we are using for the envelope of the channels is cos². Figure 3.1 shows an example of a channel set representing the scalar variable v in the interval v ∈ [15, 65]. We can also see the particular configuration of channels for v ≈ 40; only two channels are significantly activated. In the example the channels are distributed regularly with 60° overlap and have the same bandwidth, but this is not a requirement in general. With 60° overlap we mean that the closest neighbours to a channel are phase-shifted 60° in relation to that channel. Notice that the channels centered in 5 and 75, respectively, are needed in order to represent the interval [15, 65] in a regular manner.
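A minimal sketch of such an encoder (illustrative parameters: channels centered every 10 units, each a cos² kernel that is zero outside ±15 units, which gives the 60° phase shift between neighbours):

```python
import numpy as np

def channel_encode(v, centers, width=15.0):
    """cos^2 channel responses for a scalar v: channel k responds with
    cos(pi*(v - centers[k]) / (2*width))**2 inside +-width, else 0."""
    d = np.abs(np.asarray(centers, dtype=float) - v)
    return np.where(d < width, np.cos(np.pi * d / (2.0 * width)) ** 2, 0.0)

centers = np.arange(5.0, 76.0, 10.0)  # channels centered at 5, 15, ..., 75
c = channel_encode(40.0, centers)     # only the channels at 35 and 45 respond
```

With this spacing the active channel responses always sum to 1.5 for v in [15, 65], so the sum of the channels stays constant over the represented interval.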
Figure 3.1: The variable v represented with the channel representation.
Using the channel representation for the variables has a number of advantages:

- We can customize the resolution of the variable. In some parts of the variable domain we may want to have high resolution (many channels) and in other parts we may want low resolution (few channels) or perhaps no resolution at all.

- We can represent many values or events at the same time. When we use the channel representation, we usually extend the dimensionality from the original number of dimensions of the variable to the number of channels. However, typically only a small number of the channels (three for instance) are active at the same time, depending on how the channels overlap. The channels that are not active can be used to detect another event or represent another value. In figure 3.2 we can see an example of this where two events are sharing the same channel representation. This is mostly relevant only
- We can use simple processing strategies. Due to the local nature of the channels, it turns out that operations on the variable, such as calculating a function of it, often can be performed using only linear operators. Example: let x be the scalar variable that is represented by a set of channels c. If we want to calculate y* = f(x) we can often do this as y = wᵀc, where w is a vector of weights or parameters. The quality of this approximation, i.e., how similar y is to y*, depends on the rate of change of the function f(x) in relation to the resolution of the channel representation of x.
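This linear read-out can be sketched as follows (an illustrative experiment, not from the thesis: the weights are fitted by least squares, and the target function is deliberately chosen inside the span of the cos² channel basis so that y = wᵀc can reproduce it almost exactly):

```python
import numpy as np

def encode(v, centers, width=15.0):
    # cos^2 channels, zero outside +-width of each center
    d = np.abs(centers - v)
    return np.where(d < width, np.cos(np.pi * d / (2.0 * width)) ** 2, 0.0)

centers = np.arange(5.0, 76.0, 10.0)
xs = np.linspace(15.0, 65.0, 201)
C = np.vstack([encode(v, centers) for v in xs])  # one channel vector per sample
y_star = 0.5 + 0.3 * np.cos(np.pi * xs / 15.0)   # target y* = f(x)
w, *_ = np.linalg.lstsq(C, y_star, rcond=None)   # fit the weight vector
y = C @ w                                        # y = w^T c for every sample
max_err = float(np.max(np.abs(y - y_star)))
```

For a target whose variation is fast compared to the channel spacing, the residual of the same fit grows, which is exactly the resolution trade-off described above.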
Figure 3.2: An example of two events (x ∈ {15, 65}) represented using the same channel set.
3.2.2 The mapping functions
The task of the architecture is, as discussed previously, to learn on-line how the percepts change when the robot is performing some action in the environment. We will in the following denote the configuration, or pattern, of response channels the response pattern, and similarly for the percept channels.

If we represent the input and output variables in the channel representation, how do we, given the previous system state, associate the incoming percept pattern with the current response pattern (the current system state)? We can reformulate this as: given the previous system state, we want to approximate the current activation of a response channel with some function of the activations of the percept channels.
We want to put some demands on this function:

- The function must be robust against variations in the configuration of the percept channels: high resolution, low resolution, mixed scales, different degrees of overlap, etc.; the function must behave reasonably in these cases. This implies that the function should be robust against, or perhaps independent of, the norm of the percept channel set.

- The function must be robust against noise on the percept channels. Their mutual relations must be more important than the absolute activation of a single channel; the information that a percept channel is activated at all says a great deal, due to the locality of the channel representation. The absolute activation of a channel should be less interesting.

- In much the same way, the function must not depend on the shape of the percept channel function (for instance cos²).

- The function should be real-valued and continuous.

- The function must be simple enough to be implemented with local operations using nodes and links of an ANN.

- The function should use as few parameters as possible but be expressive enough to model the response channel.

- If we do not have any activation as input, we do not want any output, i.e., y(0) = 0.
There is of course an (infinite) number of functions that more or less fulfill these demands. Some functions we have looked at are:

1. y_j = wᵀc = ‖w‖‖c‖cos(θ), where w is the parameter (weight) vector, c is the percept channel vector and θ is the angle between the vectors.

2. y_j = wᵀc/‖c‖ = ‖w‖cos(θ)

3. y_j = wᵀc / Σᵢ cᵢ

4. z_k = σ(y_j), where σ(·) is the activation function of an ANN node.
We begin by considering case 1 under the assumption that we have a distribution of channels according to figures 3.3 and 3.4. When we increase x, the different channels will go up and down, but using a channel function of cos² and an overlap of 60 degrees, the sum of the channels, Σᵢ cᵢ