Learning in a Reactive Robotic Architecture
Thord Andersson
[Cover figure: the SSE of the response channels u and v during training periods]
LIU-TEK-LIC-2000:13
Department of Electrical Engineering
Linköping University, SE-581 83 Linköping, Sweden
© 2000 Thord Andersson
Department of Electrical Engineering
Linköping University
SE-581 83 Linköping
Sweden
Abstract

In this licentiate thesis, we discuss how to generate actions from percepts within an autonomous robotic system. In particular, we discuss and propose an original reactive architecture suitable for response generation, learning and self-organization.

The architecture uses incremental learning and supports self-organization through distributed dynamic model generation and self-contained components. Signals to and from the architecture are represented using the channel representation, which is presented in that context.

The components of the architecture use a novel and flexible implementation of an artificial neural network. The learning rules for this implementation are derived.

A simulator is presented. It has been designed and implemented in order to test and evaluate the proposed architecture.

Results of a series of experiments on the reactive architecture are discussed and accounted for. The experiments have been performed within three different scenarios, using the developed simulator.

The problem of information representation in robotic architectures is illustrated by a problem of anchoring symbols to visual data. This is presented in the context of the WITAS project.
Acknowledgements

No work exists in isolation and this thesis is by no means an exception. It is impossible to enumerate all the people who have influenced and helped me through the years, but you should all know that I am very grateful.

First of all, I wish to thank my supervisor, professor Gösta Granlund, for introducing me to the Mysteries of Vision and for giving me the pleasure and opportunity to work in his laboratory.

I also wish to thank all of my wonderful colleagues and friends at the Computer Vision Laboratory. You are truly great people!

Special thanks to Johan Wiklund for our hacking discussions and for always keeping the computers in tip-top shape.

I would also like to thank Per-Erik Forssén, Björn Johansson and Åsa Johansson for proof-reading the manuscript. All remaining errors are to be blamed on me, due to final changes.

Finally, I would like to thank all of my family, especially Åsa, for their constant support, love and patience.

The research presented in this thesis was done within the WITAS project, funded by the Knut and Alice Wallenberg Foundation, which is gratefully acknowledged.
Contents

Abstract
Acknowledgements
1 Introduction
  1.1 Motivation
  1.2 Contributions
  1.3 Thesis outline
2 Framework and considerations
  2.1 Introduction
  2.2 What is a robotic system or robotics?
    2.2.1 Brief history
    2.2.2 Presentation of a classic robotic structure
    2.2.3 Active vision and reactive systems
    2.2.4 Three-layer architectures
  2.3 Scenario: The Infant Robot
    2.3.1 The scenario
    2.3.2 Response generation
  2.4 Learning and self-organization
    2.4.1 Why learning?
    2.4.2 What should we learn?
    2.4.3 Learning paradigms
  2.5 Fuzzy matching of visual cues
    2.5.1 Introduction
    2.5.2 The WITAS project
    2.5.3 Fuzzy-set representation of visual cues
    2.5.5 Fuzzy signature matching at work
    2.5.6 Conclusions
3 The proposed reactive architecture
  3.1 Introduction
  3.2 Structure description
    3.2.1 The channel representation
    3.2.2 The mapping functions
    3.2.3 The learning approach
    3.2.4 The evaluation phase
    3.2.5 The creation of a new model
    3.2.6 The optimization phase
4 Simulator design and implementation
  4.1 Introduction
  4.2 Design considerations
    4.2.1 The simulator
    4.2.2 The computation structure
    4.2.3 The artificial neural network
    4.2.4 The optimization method
    4.2.5 The resilient backpropagation algorithm
  4.3 The implementation
5 Experiments
  5.1 Introduction
  5.2 Scenario 1
    5.2.1 Setup
    5.2.2 Results
  5.3 Scenario 2
    5.3.1 Setup
    5.3.2 Results
  5.4 Scenario 3
    5.4.1 Setup
    5.4.2 Results
  5.5 Summary
6 Summary
  6.1 Summary and discussion
  6.2 Future research
Appendices
  A The training parameters, scenario 1 and 2
1 Introduction
1.1 Motivation
In this licentiate thesis, we will discuss how to generate actions from percepts within an autonomous robotic system. In particular, we will discuss a reactive architecture suitable for response generation, learning and self-organization.

Robots are currently mainly used as advanced automatons in the industry. With hard-coded behaviours and limited abilities to sense and adapt, they can perform simple and repetitive tasks such as welding at a construction line or handling materials in a storage.

Autonomous robots are designed to be more sensitive to their surroundings. They can change their behaviours and plans according to changes in the environment in order to fulfil some specified goal. An example of this kind of robot is the microrover Sojourner [36], which on July 4, 1997 landed on Mars and performed a number of experiments. However, the behaviours and actions of these robots are still the results of algorithms, goals and representations, designed and arbitrarily crafted by hand. An autonomous robot of this kind will not be able to adapt to a situation which its human designers have not foreseen. Thus, a truly autonomous robot has to be capable of learning from its experiences. Learning methodologies such as reinforcement learning [29] show promising results [8, 28] in adapting or learning successful behaviours in robotic systems. An autonomous robot must, in addition, have the ability to use sensor data for perception. There are a lot of interesting philosophical reasons why a system capable of vision and perception has to acquire information actively by itself, see [15, 17, 21]. One of these is that responses of the system can be used to organize the barrage of input data coming from the sensors.
A robot should, for instance, learn the relations between its responses and its percepts on a reactive level, i.e., if a robot turns its head 5 degrees, how will its percepts change? If a robot could learn these basic relations, they would be valuable as components in a perception structure.

Using responses in order to organize percepts seems to be an important part in the development of the senses in mammals. In an experiment performed by Held and Hein [23], two kittens were raised in the same environment, attached to each other via a carousel apparatus. One of the kittens could not move freely but was passively moved around by the other kitten via the carousel. After some time, the kitten which could control its movements developed normal sensory-motor coordination, while the other failed to do so until being freed for several days [17, 23].
1.2 Contributions
The main contributions in this thesis are presented in chapters 3, 4, 5 and section 2.5. The most important individual contributions are:

- An original reactive architecture for response generation. It supports incremental learning and self-organization through distributed dynamic model generation and self-contained components. (Chapter 3)

- A simulator which has been implemented in order to test and evaluate the architecture. (Chapters 4 and 5)

- A flexible implementation of an artificial neural network (ANN) with self-contained nodes and links. Rules for back propagation of error gradients have been derived for this implementation and for its extended definition of nodes and links. (Chapter 4)
Section 2.5 discusses how to anchor symbols to visual data and is co-authored with Dr. Silvia Coradeschi 1 and Dr. Alessandro Saffiotti 2 in ref. [1], partly published in ref. [9].

Section 4.2.5 discusses properties of the error back propagation algorithm, in particular in relation to the RPROP [31] algorithm. This has been published in [2], co-authored with Mikael Karlsson.
1.3 Thesis outline
In chapter 2, the background or context of this thesis is presented. We introduce a scenario that illustrates what we are striving to attain, and discuss issues in response generation and learning. The chapter ends with an example of an autonomous robotic system that illustrates some problems in information representation in these kinds of systems.

1. Currently at Örebro University, Dept. of Technology and Science, Sweden.
2. Currently at Örebro University, Dept. of Technology and Science, Sweden.
Chapter 3 introduces a reactive architecture which is suitable for response generation, learning and self-organization. The channel representation is also discussed.

A simulator has been developed in order to test different ideas and for evaluation of the proposed architecture. The design and implementation of this simulator is the topic of chapter 4.

Experiments on the proposed architecture using the simulator are presented in chapter 5, and chapter 6 concludes the thesis with a summary and a proposal for future work.
2 Framework and considerations
2.1 Introduction
In this thesis, we will discuss how to generate actions from percepts within an autonomous robotic system. In particular, we will discuss a reactive architecture suitable for response generation, learning and self-organization. Robots will moreover be viewed as Information Processing Structures (IPS). This means that we will not deal with important aspects of robotics like locomotion, sensor and effector technology per se.

This chapter begins with a brief introduction to robotics, its history and the architectures being used. It then continues with an example scenario, introducing the issues which will be the subject for the rest of the thesis. The chapter ends with a section in which we will discuss the problems of information representation in the context of a real application, the WITAS project.
2.2 What is a robotic system or robotics?
In robotics, the goal is to construct technical systems that with the help of sensors (a camera for instance), effectors (e.g., legs, gripping tools and wheels) and control systems can perform advanced, or perhaps tedious, tasks that previously only could be done by humans. Examples of such tasks in the industry are repetitive work on a production line (welding for instance), material handling (storage and delivery of material) and sheep shearing (!) [33]. However, these robots are often not much more than advanced automatons; their behaviours are hard-coded and their abilities to sense and adapt are very limited.
Examples of tasks demanding autonomous robots are those where behaviour has to be determined by the robot's own experience as much as by any built-in knowledge, for instance exploration and work in hazardous or remote environments. Examples of such environments are disaster areas, nuclear power-plants, the deep sea, volcanoes, space etc.
Robotics is of course a huge scientific field that covers or overlaps many other fields and subfields like artificial intelligence (AI), control theory, computer vision, signal processing and others. The interdisciplinary aspect of robotics is very interesting because it puts every field or component in context; the demands and requirements on a component often become quite different when it is put into context compared to when it is studied and functioning in isolation. Many functionalities from many disciplines have to work together and share the same resources, otherwise the system (the robot for instance) will not work.

We will, as mentioned earlier, in this thesis discuss robots as Information Processing Structures (IPS). This means that we will not deal with important aspects of robotics like locomotion, sensor and effector technology per se. Instead we will focus on an architecture which processes signals from the sensors and generates signals which evoke actions.
2.2.1 Brief history
According to Webster's dictionary [39], a robot is:

1. a machine that looks like a human being and performs various complex acts (as walking or talking) of a human being; also: a similar but fictional machine whose lack of capacity for human emotions is often emphasized

2. an efficient, insensitive, often brutalized person who functions automatically

3. an automatic apparatus or device that performs functions ordinarily ascribed to human beings or operates with what appears to be almost human intelligence

4. a mechanism guided by automatic controls

The Czech writer and artist Josef Capek coined the word Robot from the Czech words for serf (robotnik) and forced labor (robota) in his short story Opilec from 1917. His brother, the Czech playwright Karel Capek, made the word Robot well known with his play R.U.R (Rossum's Universal Robots), which opened in Prague in January 1921 [33]. The theme of the play is the dehumanization of man in a technological civilization. In the play, the robots are not mechanical but chemical. In an essay written in 1935, Capek, in the third person, writes about mechanical robots [10]:
It is with horror, frankly, that he rejects all responsibility for the idea that metal contraptions could ever replace human beings, and that by means of wires they could awaken something like life, love, or [...] of machines, or a grave offense against life.

The Author of Robots Defends Himself - Karel Capek, Lidove noviny, June 9, 1935, translation: Bean Comrada
The term Robotics was coined by the scientist and writer Isaac Asimov, referring to the use and study of Robots. The word Robotics was first used in his science fiction short story Runaround, published in 1942. In 1950 he published I, Robot, a collection of several of these small stories, where he also introduces his famous three Laws of Robotics [5]. He later added a zeroth law.

Robotics is, again according to Webster's dictionary [39], technology dealing with the design, construction, and operation of robots in automation.

Early robots in the research community were Grey Walter's Elsie the tortoise (Machina speculatrix) [3] in 1953 and the Johns Hopkins Beast in 1960. The first modern industrial robots were the Unimates, created by George Devol and Joe Engelberger in the 1950's and 60's [33]. Engelberger started a manufacturing company, Unimation (for Universal Automation), and has therefore been called the father of robotics.
2.2.2 Presentation of a classic robotic structure
Traditionally, a robotic system has looked something like figure 2.1.
[Figure: block diagram with Sensors, a Perception/Vision module, an AI/Control module and an Effector module, connected to the Environment via projections, sensor output and response output]

Figure 2.1: A classic robotic system
From the Sensors we get projections of the Environment. These projections are typically in the form of scalars, vectors and/or arrays. The data from the sensors flow into the Perception/Vision module, where the data is processed according to some built-in rules and models. The refined information, typically symbolic object representations, flows into the AI/Control module, which updates its model of the world. Based on this model and on its previous actions, the module now generates the next action. This action is then performed and executed by the Effector module, which in turn affects the environment.

This configuration works well when the system operates in a restricted, well controlled environment, as for example an assembly-station in an industry. In these cases, we know exactly which interactions with the environment are possible, which are known etc. The system can consequently be tailored for the specific task with appropriate algorithms for, e.g., the image processing system and the control system.

The advantages with this kind of systems are evident; when the development of the system is finished, the system is ready for action and mass production, no training or learning is necessary. The disadvantages are the other side of that coin; if the environment, the objects, the tasks or even the system itself deviates from what is specified, the system will not work, and it will not adapt to the new demands either, no matter the amount of time.
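The data flow just described can be condensed into a minimal sense-plan-act loop. The sketch below is only illustrative: the module functions, the confidence threshold and the action names are hypothetical placeholders, not taken from the thesis.

```python
# Minimal sketch of the classic sense-plan-act pipeline of figure 2.1.
# All module implementations are hypothetical placeholders.

def perceive(projections):
    """Perception/Vision module: reduce raw sensor data to symbolic objects."""
    return [obj for obj in projections if obj.get("confidence", 0) > 0.5]

def plan(world_model, objects):
    """AI/Control module: update the world model and pick the next action."""
    world_model.update({obj["name"]: obj for obj in objects})
    return "approach" if world_model else "search"

def act(action):
    """Effector module: execute the chosen action."""
    return f"executing {action}"

world_model = {}
projections = [{"name": "box", "confidence": 0.9},
               {"name": "noise", "confidence": 0.2}]
objects = perceive(projections)          # projections -> symbolic objects
action = plan(world_model, objects)      # world model -> next action
print(act(action))                       # the cycle ends in an effector command
```

Note that everything the system can do is fixed at design time inside `perceive` and `plan`, which is exactly the rigidity discussed above.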
If we try to develop this kind of system for a dynamic environment where a more rich and flexible set of responses is needed, we will soon run into problems:

Perception/Vision Module: General algorithms with high demands for computational resources are needed. The module still needs models for what it is expected to process and find in the scenery.

Perception-Control Bandwidth: The output channel from the perception module often has a very low bandwidth compared to its input. The contextual information present in the input is peeled off in the perception module, and the information is iconified according to the models that are built into the perception module. This maimed information gives small opportunities for a rich and flexible response generation.

AI/Control Module: The module acts according to the rules built into the module on the reduced information from the perception module. The rules must be specified by the developer, and this means that the developer has to predict all the events and objects the system will encounter.

In conclusion we can point out that the complexity of the environment must be built into this classical system for it to function properly. If we have a fairly uncontrollable environment and nontrivial tasks for the system to perform, we see that this is an impossible task; due to the combinatorial explosion, the system will encounter events that it is not built to handle.
2.2.3 Active vision and reactive systems
In Active Vision one tries to make the system more efficient by letting the AI/Control module control the Perception module. In this way, only the information currently needed by the AI/Control system is calculated and produced by the Perception module. Only certain parts and aspects of the sensor information are processed. By utilizing information gained from the history of the system, other tasks such as tracking become simpler.

In recent years, reactive robot systems have gained considerable attention. The Perception module is in this kind of systems coupled directly to the Effector module, so that certain percepts trigger a certain behavior. In slightly more advanced reactive systems, priorities are added to the behaviors so that some of the behaviors can override others.
The beauty of this is that, instead of trying to put the complexity of the environment into the system and force the system to certain behaviors, we let the complexity of the environment be reflected in the simple rules of the reactive robot, and it can in this way obtain seemingly complex behaviors. Braitenberg showed in his gedankenexperiments that simple sensorimotor transformations could result in complex behaviors in small simple robots [4]. Among the robots based on Braitenberg's designs, we find creatures such as the timid shadow seeker, the paranoid shadow-fearing robot and an insecure wall follower (after [3]). These robots obviously do not possess the traits timid, paranoid and insecure, but it is interesting that an observer can get such impressions of such extremely simple creatures.
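As an illustration of how simple such sensorimotor transformations can be, here is a sketch of one classic Braitenberg-style wiring, crossed excitatory connections from two light sensors to two wheels; the gain and the sensor readings are arbitrary assumptions, not taken from [4].

```python
# Sketch of a Braitenberg-style vehicle: two light sensors wired to two
# wheel motors. Crossed excitatory connections steer the vehicle toward
# the stronger stimulus. The gain value is an arbitrary illustration.

def wheel_speeds(left_sensor, right_sensor, gain=1.0):
    """Crossed wiring: each wheel is driven by the opposite-side sensor."""
    left_motor = gain * right_sensor
    right_motor = gain * left_sensor
    return left_motor, right_motor

# Light source to the left: the left sensor reads more light ...
left, right = wheel_speeds(left_sensor=0.8, right_sensor=0.2)
# ... so the right wheel spins faster and the vehicle turns left,
# toward the light. An observer might call this behavior "attraction".
assert right > left
```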
The problem with pure reactive systems is that it is hard to choose the set of behaviors needed to get the performance one wants. In addition, many tasks demand that this set of behaviors change with the environment and/or with time. By introducing hybrid reactive/deliberative robotic systems these problems are solved; a classic planner or controller is introduced, and it will control and alter the reactive configuration as needed.
2.2.4 Three-layer architectures
As we have seen, neither the classic sense-plan-act architecture nor the pure reactive architecture are up to the tasks of robotics. Recently [11], an architecture surfaced which encompasses ideas from both the reactive and the deliberative architectures. This three-layer architecture consists of three components:

A reactive feedback control mechanism: This layer implements reactive couplings or mappings between sensors and actuators. These mappings should be continuous and fast; constant bounded in time and space complexity. In addition, the mappings should use no information about the state of the world.

A reactive plan execution mechanism: This layer implements more complex behaviors by controlling which mappings and parameters the controller should use at a given time. The plans it is executing come from the deliberative layer or are input by a designer at compile-time. This layer can afford more time consuming calculations than the reactive layer, but must be alert enough to notice changes in the reactive behaviors.

A deliberative planner: This layer performs time consuming tasks such as planning and other exponential search based algorithms. Some of the computationally heavy signal processing tasks also belong here.
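The division of labor between the three layers can be sketched as follows; all function names, behaviors and the canned plan are hypothetical illustrations, not from [11].

```python
# Sketch of the three-layer architecture: a fast reactive controller, a
# sequencer selecting which reactive mapping is active, and a (slow)
# deliberative planner producing plans. All names are illustrative.

def reactive_layer(percept, mapping):
    """Fast, stateless sensor-to-actuator mapping (constant time/space)."""
    return mapping(percept)

def sequencer(plan, step):
    """Select the mapping the controller should use at this point in the plan."""
    behaviors = {
        "avoid": lambda p: -p,        # steer away from the stimulus
        "approach": lambda p: p,      # steer toward the stimulus
    }
    return behaviors[plan[step]]

def deliberative_planner(goal):
    """Slow layer: here just a canned plan standing in for a real search."""
    return ["approach", "avoid"]

plan = deliberative_planner("reach target")
mapping = sequencer(plan, step=0)
command = reactive_layer(percept=0.5, mapping=mapping)
assert command == 0.5   # "approach" passes the stimulus straight through
```

The reactive layer never touches the plan; it only executes whichever mapping the sequencer has currently installed.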
We will in section 2.5 discuss an example of a robotic system which uses a modern deliberative/reactive architecture. We will also look closer at the problem of information representation between the layers in this architecture. In particular we will discuss how to connect, or anchor, symbolic descriptions used at higher layers to sensory data at lower levels. We will cover this in the context of a particular problem: anchoring symbols to visual data within the WITAS project.
2.3 Scenario: The Infant Robot
We will now discuss a scenario which describes what kind of robotic architecture we are considering in this thesis. The scenario will also help us to understand what kind of problems we have to deal with when designing such an architecture.
2.3.1 The scenario
Imagine that we have a robot equipped with a lot of sensors for vision, audio, sonar, tactile sensing etc. In addition, the robot also has proprioceptive sensors which measure the physical state of the robot (for example orientation, changes in velocity and the positions and the angles of its joints). The robot can affect the environment with a set of effectors; with some it can change its position (locomotion) and with some it can manipulate objects.

Consider now the case where the Information Processing Structure (IPS) of the robot has been designed and built using architectures similar to those discussed in section 2.2. In this case, the designer has to provide rules or models on how to combine the sensor input (also known as data fusion), and how to relate these inputs to the actions of the robot.

In this scenario, however, we want the robot to learn, through its own experience, how the percepts from the sensors are coupled to the responses of its effectors. Like an infant, the robot must learn how to organize its sensory stimuli by experimenting with its environment. The experiment with the kittens in section 1.1 is a good example of this. The IPS of this robot must contain a reactive architecture which supports learning and self-organization. It is an architecture of this kind we will discuss and study in the chapters to follow.

In the next section we will look at different aspects of architectures and response generation.
2.3.2 Response generation
The goal of the IPS is to generate actions so that the system functions properly in the environment. What does this mean, and what implications does it have for the design of the IPS and the robotic system?

A common and intuitive approach used in classic AI was discussed in section 2.2.2. In this approach the environment is interpreted, classified and labeled into symbolic objects. The system then typically updates its internal world-model using these object-descriptions and reasons about the world and what actions to make. This is a very natural approach, because we humans consciously reason in terms of objects and their relations. We must, however, be careful with this introspective method, because there are many results indicating that our conscious self and our conscious experiences of the world reside at the very end of the neural processing in the brain [15]. This conscious part handles what psychologists and AI-researchers call declarative knowledge, i.e., symbolic facts and events that we can reason about consciously, describe and communicate verbally. The processing that precedes the conscious part handles, among other things, motoric skills with a strong contextual dependence. This procedural knowledge is rich in information and may be difficult to express in language. A typical example is to describe how to ride a bicycle. Since it is these response generating motoric skills we are interested in (we are not striving to attain language, logic and consciousness), there is no reason to assume that the technical system we strive to obtain must handle objects and object-representations in forms that are similar to our conscious experiences.
Something these classic solutions have in common is that they in the perception module postulate that we must know what and where an object is in the environment before proper responses can be generated. Granlund [15, 17] argues that this conception is fundamentally wrong and that we in this area are fooled by the luxury of our own consciousness. There are many different understandings on this topic, but Granlund argues that many misunderstandings arise from the fact that people do not realize that different approaches are needed depending on whether the goal is generation of responses or object related tasks such as communication or measurements. Therefore, Granlund postulates that there are two different domains in Spatial-Cognitive Information Processing where completely different rules apply. Table 2.1 shows the differences between the two domains.

If we use the classical approach and try to identify objects and label the scenario before the generation of responses, we can get into trouble. In the process of labeling we deliberately cut the contextual links from the objects in order to get them as invariant as possible (for template matching for instance), and we do not include information in the object descriptions about what responses were involved in order to gain the description in the first place. But this information is of vital importance when we want to generate responses which are highly contextual in their nature. Classical systems solve this by trying, after identification and estimation of features, to see how objects relate to each other; i.e., they try to recreate the context after they have thrown it away!
The system we are interested in and are going to investigate further in this thesis is the reactive domain in Spatial-Cognitive Information Processing. In such a system, we do not strive to attain consciousness or some kind of understanding of the environment. We are only trying to generate effective responses by associating percepts with responses on different levels of abstraction. In this domain we do not try to find objects and identify these. In fact, the representations used in this domain refer more to situations than to something we usually recognize as an object; a book is for example something completely different from the same book rotated 90 degrees, because the responses needed to deal with it are different. This processing is in addition nonsymbolic and continuous, in contrast to our conscious symbolic reasoning.
Table 2.1: The two domains in Spatial-Cognitive Information Processing.

Reactive - Associative                                         | Deliberative - Symbolic
Concatenated local maps                                        | Globality of spaces
View-centered representation                                   | Object-centered representation
Limited invariance of objects with reference to observer state | Highly invariant object representation
Context specific                                               | Contextually invariant
Low resolution linkage structures using feedback and servoing  | High resolution implementations of precise mathematical models
Associative learning                                           | Prescribed models
Distributed systems                                            | Centralized systems
Operations and data mixed. Neural networks                     | Separation between operations and data. Conventional computers
Interacting with environment                                   | One-way projective
Information is acquired by system on its own terms             | Information is input by designer
Motor functions are part of percept organization structure     | Output is arranged by designer
Semantic representation. Use of experienced coincidences       | Symbolic representation. Use of prescribed rules
Use of dynamics for acquisition and redundant operation        | Prescribed modes of interpretation
Self-organizing structure                                      | Functionality surrounded by more intelligent shell
Procedural                                                     | Declarative
Action                                                         | Communication/Language
2.4 Learning and self-organization
2.4.1 Why learning?
What the architectures we have seen in section 2.2 have in common is that no learning or adaptivity has taken place. The cost for this, as discussed earlier, is that the models of the environment and/or proper behaviors have to be built into the system from the start. If we have the situation that we know the models will change (but not how), or the models are unknown or very hard to establish, learning and adaptivity of the system are needed.

But how much and what should the system learn? If we specify too little and give the system too many degrees of freedom to learn, the learning time will be huge, if the learning converges at all. If we specify too much, the learning will probably be fast, but the resulting system may be too restricted to solve the problem.

This problem is known as the Bias-Variance dilemma [28]. What we want to do is to find just the amount of bias needed for the system to learn the problem-domain in consideration in a fast and reliable manner. Finding this bias is in general not easy.

Specifying what a system should learn is another problem. For a given problem one can almost always find an infinite number of solution approaches and solution structures.
2.4.2 What should we learn?
First we must decide on what time-scale our system is operating. This scale decides what we can afford to learn. Learning, for instance, sensor-types, algorithms, hardware wiring and other fundamental structural requirements for a solution operates on an evolutionary time-scale. On this scale whole systems are evaluated against each other and the fittest systems are allowed to evolve. Learning methods on this time-scale are genetic algorithms (GAs) [6, 24], including genetic programming methods (GPs) [27].

The time-scale we are interested in here is the scale of the individual system. On this scale we want the system to explore its environment (the problem-domain) and learn from its experiences. Given that the system is restricted with some well chosen bias, we want the learning to be fast. With fast we mean that the system should learn to produce the right response or behaviour after just a couple of tries or examples. This is sometimes called instance-based learning.

What the system should learn depends on what bias the system is restricted or initiated with. For some setup, the system could for example learn to associate percepts with responses in an associative memory, or learn to define and combine local models for generation of responses. Different methods for learning on this time-scale are supervised, unsupervised and reinforcement learning (see section 2.4.3).
According to Ballard [6], the system should first learn to react. The fastest way to calculate something is to look up the answer in a look-up table or memory. A proper response pattern can be generated as fast as the implementation allows. We can however not store everything; new associative couplings must be sufficiently different from the existing memories to be worth storing, but at the same time similar enough to the existing memories so they can be related.

Ballard also suggests that larger behaviours or programs can be formed by using sequences of these associative memories. These programs use hidden Markov models (HMMs) to define the problem state space and operators that express the transitions between the states. Reinforcement learning is then used to find paths through the HMMs that tend to give high rewards.
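The look-up idea, together with the storage criterion above (sufficiently different to be worth storing, yet similar enough to be relatable), can be sketched as a small associative memory. The Euclidean distance measure and both thresholds are arbitrary assumptions for the illustration.

```python
# Sketch of an associative percept-response memory with a novelty criterion:
# store a new coupling only if it is sufficiently different from all stored
# percepts (worth storing) but close enough to at least one (relatable).
# Thresholds and the distance measure are arbitrary illustrations.

def distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

class AssociativeMemory:
    def __init__(self, novelty=0.1, relatedness=1.0):
        self.pairs = []            # list of (percept, response) couplings
        self.novelty = novelty
        self.relatedness = relatedness

    def recall(self, percept):
        """Look up the response of the nearest stored percept."""
        if not self.pairs:
            return None
        return min(self.pairs, key=lambda pr: distance(pr[0], percept))[1]

    def store(self, percept, response):
        dists = [distance(p, percept) for p, _ in self.pairs]
        if not dists:                                        # first memory
            self.pairs.append((percept, response))
        elif self.novelty < min(dists) < self.relatedness:   # novel yet relatable
            self.pairs.append((percept, response))

mem = AssociativeMemory()
mem.store((0.0, 0.0), "rest")
mem.store((0.5, 0.0), "reach")      # novel but relatable: stored
mem.store((0.5, 0.001), "reach2")   # too close to an existing memory: ignored
assert mem.recall((0.6, 0.1)) == "reach"
```

Recall is a pure look-up, so the response pattern is generated as fast as the nearest-neighbour search allows.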
2.4.3 Learning paradigms

Supervised Learning

[Figure: block diagrams of a System and a Trainer exchanging input x and output y, with an error signal ε (top) or a reward r (bottom)]

Figure 2.2: Supervised learning (top), reinforcement learning (bottom)
In supervised learning (see figure 2.2), the teacher shows input data (x) to the system in training, which returns its corresponding output data (y). The teacher knows the correct answers and provides the system with an error signal (ε), showing the system how it should change in order to decrease the errors in its output. We may think of the teacher as having complete knowledge of the problem domain. Eventually, the system will emulate the teacher. This kind of learning is very common in the field of Artificial Neural Networks (ANNs), see also section 4.2.3.
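As a minimal illustration of the error signal ε at work, consider a single linear unit trained with a delta-rule update; the learning rate, the data and the target function are assumptions for this example only, not the ANN implementation of chapter 4.

```python
# Minimal sketch of supervised learning with an error signal: a linear
# unit y = w*x is nudged by the teacher's error eps = t - y (delta rule).
# The target function, learning rate and data are arbitrary illustrations.

def train(samples, lr=0.1, epochs=50):
    w = 0.0
    for _ in range(epochs):
        for x, t in samples:
            y = w * x                 # system output
            eps = t - y               # teacher's error signal
            w += lr * eps * x         # change the system to reduce the error
    return w

# The teacher knows the correct answers, here t = 2*x.
samples = [(1.0, 2.0), (2.0, 4.0), (-1.0, -2.0)]
w = train(samples)
assert abs(w - 2.0) < 1e-3            # the system has emulated the teacher
```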
Unsupervised Learning

Unsupervised learning can be thought of as a special case of supervised learning where the teacher is built into the system. In this case, one usually wants the system to learn a representation of the input. A task independent measure of quality of this representation replaces the teacher. Auto-association and clustering are typical examples.
Reinforcement Learning

In reinforcement learning (see figure 2.2), the learning system has a critic as a teacher. The difference between reinforcement learning and supervised learning is that the critic does not know the answers or how to react in a certain state. The critic can only give a measure, a reward (r), of how well or how badly the system has performed. This is a more general method than supervised learning, because it is much easier to generate a reward signal than to generate the right answers or reactions in a certain state. However, the learning is more difficult, because the teacher cannot provide any guidance on how the system should change in order to improve its performance. In addition, the reward signal may be the result of actions the system took in its past, i.e., the reward may be delayed. This implies that a system using reinforcement learning must solve the temporal credit assignment problem. See ref. [8, 28] for discussions on reinforcement learning.
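A standard way to handle the delayed reward is temporal-difference learning. The tabular Q-learning sketch below, on a toy four-state chain where only the last transition is rewarded, is a generic illustration (environment and parameters are assumptions), not a method proposed in this thesis.

```python
# Sketch of reinforcement learning with a delayed reward: tabular
# Q-learning on a 4-state chain where only reaching the last state pays.
# Discounted bootstrapping propagates credit back to earlier actions
# (temporal credit assignment). All parameters are illustrations.

import random

N_STATES, ACTIONS = 4, (0, 1)          # action 1 moves right, 0 stays
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma = 0.5, 0.9

def step(state, action):
    next_state = min(state + action, N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0   # delayed reward
    return next_state, reward

random.seed(0)
for _ in range(500):                                      # training episodes
    s = 0
    while s < N_STATES - 1:
        a = random.choice(ACTIONS)                        # explore at random
        s2, r = step(s, a)
        best_next = max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# Moving right is preferred in every non-terminal state, even though the
# critic only rewarded the very last step of each episode.
assert all(Q[(s, 1)] > Q[(s, 0)] for s in range(N_STATES - 1))
```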
2.5 Fuzzy matching of visual cues
2.5.1 Introduction
Autonomous mobile vehicles need to use computer vision capabilities in order to perceive the physical world and to act intelligently in it. These systems also need the ability to perform high-level, abstract reasoning in order to operate reliably in a dynamic and uncertain world without the need for human assistance. For example, a mail delivery robot faced with a closed door should decide whether it is better to plan an alternative way to achieve its goal, or to reschedule its activities and try this delivery again later on. In general, autonomous vehicles need to incorporate a decision-making system that uses perceptual data to guide the activity of the vehicle toward the achievement of the intended task, and also guides the activity of the perceptual subsystem according to the priorities of this task.
An important aspect in integrating the decision-making and the computer vision systems is the connection between the abstract representations used by the symbolic decision-making system to denote a specific physical object, and the data in the computer vision system that correspond to that object. Following [34], we call anchoring the process of establishing this connection. We assume that the decision-making process associates each object to a set of properties that (non-univocally) describe that object. Anchoring this object then means to use the vision apparatus to find an object whose observed features match the properties in this description. For example, suppose that the symbolic system has an object named `car-3' with the description `small red Mercedes on Road-61.' Anchoring this object means to: (i) find an object in the image that matches this description; and (ii) update the description of `car-3' by using the observed features, so that the same object can later be re-identified.
One of the difficulties in the anchoring problem is that the data provided by the vision system are inherently affected by a large amount of uncertainty. This makes it difficult to reliably match observed data against the high-level description of an intended object. In order to improve the reliability of the anchoring process, this uncertainty has to be taken into account in the proper way. A solution is to use techniques based on fuzzy logic to define a degree of matching between a perceptual signature and an object description. The possibility to distinguish between objects that match a given description at different degrees is pivotal to the ability to discriminate perceptually similar objects under poor observation conditions. Moreover, these degrees allow us to consider several possible anchors, ranked by their degree of matching. Finally, these degrees can be used to reason about the quality of an anchor in the decision-making process; for example, we can decide to engage in some active perception in order to get a better view of a candidate anchor.
In the sections to come, we will deal with the anchoring problem in the context of an architecture for unmanned airborne vehicles. This architecture, outlined in the next section, integrates several subsystems, including a vision system and an autonomous decision-making system. In section 2.5.3, we discuss how we represent the inexact data provided by the vision system with fuzzy sets. In section 2.5.4, we show how we compute the degrees of matching between these data and the intended descriptions. Section 2.5.5 illustrates the use of these degrees by going through a couple of examples, run in simulation. Finally, section 2.5.6 discusses the results and traces future directions.
2.5.2 The WITAS project
Figure 2.3: A scene from the WITAS project.
The WITAS project, or The Wallenberg laboratory for research on Information Technology and Autonomous Systems, is a research laboratory within Linköping University, Sweden. As the name indicates, the goal of the project is to do research on information technology and autonomous systems, unmanned aerial vehicles (UAVs) in particular. In more concrete terms this currently means constructing a small, unmanned helicopter with sensors and computers, able to autonomously fly and survey automobile traffic, see figure 2.3.
The general architecture of the system is a standard three-layered agent architecture consisting of a deliberative, a reactive, and a process layer:

- The deliberative layer generates at run-time probabilistic high-level predictions of the behaviors of agents in their environment, and uses these predictions to generate conditional plans.

- The reactive layer performs situation-driven task execution, including tasks relating to the plans generated by the deliberative layer. The reactive layer has access to a library of task and behavior descriptions, which can be executed by the reactive executor.

- The process layer contains image processing and flight control, and can be reconfigured from the reactive layer by means of switching on and off groups of processes.
Besides vision, the sensors and knowledge sources of the system include: a global positioning system (GPS) that gives the position of the vehicle, a geographical information system (GIS) covering the relevant area of operation, and standard sensors for speed, heading and altitude.

The system is fully implemented in its current version. Because of the nature of the work, most of the testing is being made using simulated UAVs in simulated environments, even though real image data has been used to test the vision module. In a second phase of the project, however, the testing will be made using real UAVs. More information about the project can be found at the WITAS web page [38].
Of particular interest for this presentation is the interaction between the reactive layer and the image processing in the process layer. This is done by means of a specialized component for task-specific sensor control and interpretation, called the Scene Information Manager (SIM). This system, illustrated in figure 2.4, is part of the reactive layer and it manages sensor resources: it reconfigures the vision module on the basis of the requests of information coming from the reactive executor, it anchors symbolic identifiers to image elements (points, regions), and it handles simple vision failures, in particular temporary occlusion and errors in car re-identification.

Two of the main aspects of anchoring implemented in the SIM are identification of objects on the basis of a visual signature expressed in terms of concepts, and re-identification of objects that have been previously seen, but have then been out of the image or occluded for a short period.
For identification and re-identification the SIM uses the visual signature of the object, typically color and geometrical description, and the expected positions of the object. For instance, if the SIM has the task to look for a red, small Mercedes near a specified crossing, it provides the vision module with the coordinates of the crossing, the Hue, Saturation and Value defining red, and the length, width
Figure 2.4: Overview of the Scene Information Manager and its interaction with the Vision module and the Reactive Executor.
and area of the car model. These prototypical values each have a degree of inaccuracy, and the SIM also provides the vision module with the intervals inside which the measurement of each of the features is acceptable. The size of the interval depends on how discriminating one wants to be in the selection of the objects and also, in the case of re-identification of an object, on how accurate previous measurements of the object were.
The vision module receives the position where to look for an object and the visual signature of the object, and it is then responsible for performing the processing required to find the objects in the image whose measures are in the acceptability range, and for reporting the information about the objects to the SIM. The vision module moves the camera toward the requested position and, for each object in the image and for each requested feature of the object, it calculates an interval containing the real value. If the generated interval intersects with the interval of acceptability provided in the visual signature for the feature, the feature is considered to be in the acceptability range. The vision module reports information about color, shape, position, and velocity of each object whose features are all in the acceptability range to the SIM.
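This acceptance test can be sketched as follows (a minimal illustration of the interval-intersection rule just described; the feature names and data layout are assumptions, not the thesis implementation):

```python
def intervals_intersect(measured, acceptable):
    """True if the measured interval overlaps the acceptability interval."""
    (m_lo, m_hi), (a_lo, a_hi) = measured, acceptable
    return m_lo <= a_hi and a_lo <= m_hi

def object_accepted(measured_features, signature):
    """An object is reported to the SIM only if every feature requested
    in the signature falls in its acceptability range."""
    return all(intervals_intersect(measured_features[f], signature[f])
               for f in signature)

# Hypothetical example: length and hue intervals for one candidate car
signature = {"length": (3.5, 5.5), "hue": (-30.0, 30.0)}
candidate = {"length": (4.05, 5.83), "hue": (-5.0, 10.0)}
```

Note that a candidate passes as soon as its intervals merely touch the acceptability ranges, which is exactly why several similar objects may be reported back, as discussed next.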
Intersection of intervals is a simple, but not very discriminating, method to identify an object. As a consequence, several objects that are somehow similar to the intended one can be sent back by the vision module to the SIM. The SIM then needs to apply some criteria in order to perform a further selection of the best matching object among those reported by the vision module. The selection of the best matching object should depend on how well the objects match the different aspects of the signature, but also on the accuracy of the measurements performed by the vision and their reliability. In what follows, we show how we address this problem.
2.5.3 Fuzzy-set representation of visual cues
Cues obtained from the vision system, e.g., color, shape, position and velocity, are affected by uncertainty and imprecision in several ways. In this work, we propose to explicitly represent the inexactness which is inherent in these data, and to take this inexactness into account when performing signature matching. In order to justify our representation, we need to analyze the way in which we extract the needed parameters from the image.
Consider the measurement of the shape parameters (length, width and area) of an observed car. Roughly, the measurement starts with a segmented and labeled binary image containing our candidate cars. This binary image is created by combining and thresholding the feature images produced by the different feature channels available, e.g., orientation, color, IR and velocity (currently, we only use the color channels). For each object in the labeled image, we then compute the moment-of-inertia matrix. From this 2×2 matrix, we calculate the two eigenvalues, which correspond to the largest and smallest moment of inertia, respectively, and convert them into the length and width of the object under the assumption that our objects (cars) are rectangular. We also measure the area by counting the pixels that belong to the same object. The length, width and area measures are then converted to metric measures through multiplication by a scale factor describing the meter-per-pixel ratio. This ratio is computed from the field-of-view angle and from the position and angles of the camera.
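The eigenvalue step can be sketched as follows (a sketch under the stated rectangle assumption; it works in pixel coordinates and leaves out the metric scale factor):

```python
import numpy as np

def rectangle_dimensions(mask):
    """Estimate length and width (in pixels) of a labeled object from the
    eigenvalues of its 2x2 moment-of-inertia matrix, assuming the object
    is rectangular. For a uniform rectangle, the variance along the
    length axis is l^2/12, so l = sqrt(12 * lambda_max)."""
    ys, xs = np.nonzero(mask)
    area = xs.size
    coords = np.stack([xs - xs.mean(), ys - ys.mean()])
    inertia = coords @ coords.T / area          # 2x2 covariance matrix
    lam = np.sort(np.linalg.eigvalsh(inertia))  # ascending eigenvalues
    length = np.sqrt(12.0 * lam[1])
    width = np.sqrt(12.0 * lam[0])
    return length, width, area

# A synthetic 30x10-pixel rectangular "car"
mask = np.zeros((50, 60), dtype=bool)
mask[20:30, 15:45] = True
l, w, a = rectangle_dimensions(mask)
```

Multiplying l and w by the meter-per-pixel scale factor (and a by its square) then yields the metric measures used in the text.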
There are a number of factors that influence the correctness of the values measured by the above procedure. First, in the segmentation phase, the discretization of the image limits the precision of the measure. Second, continuing the segmentation phase, we apply some binary operations (e.g., fill and close) on the binary image in order to try to connect and bind segmented pixels into objects. These operations slightly alter the shape, thus limiting the precision. The above two factors together produce a segmentation error, denoted by ε_s. Third, the measurement model may be inaccurate, thus introducing an error, the model error, denoted by ε_m; for example, the above assumption that cars are rectangular is almost never completely true. Note that the impact of the ε_s and ε_m errors on the quality of the measurements depends on the size of the car in the image, which in turn depends on its distance from the camera and on the focal length of the camera. A fourth factor that affects the measurement is the perspective distortion due to the angle α between the normal of the car plane and the optical axis: if the car plane is not perpendicular to the optical axis, the projection of the 3D car on the image plane will be shorter. We denote this perspective error by ε_α. Finally, all the geometric parameters needed to compute the length may themselves be affected by errors and imprecision. For example, the distance from the camera depends on the relative position of the helicopter and the car; and the angle α depends on the slope of the road; both these values may be difficult to evaluate. We summarize the impact of these factors on our measurement in a geometric error term, denoted by ε_g.³
³ There are more sources of errors in this process. For example, when α increases, the car's
The above discussion reveals that there is a great amount of uncertainty that affects the measured value, for example, the length of an object; and that this uncertainty is very difficult to precisely quantify; in other words, we do not have a model of the uncertainty that affects our measures. Similar observations can be made for other features measured by the vision system: for example, the measurement of the color of an object is influenced by the spectral characteristics of the light that illuminates that object. Given this difficult nature of the uncertainty in the data coming from the vision system, we have chosen to represent these data using fuzzy sets [40]. Fuzzy sets offer a convenient way to represent inexact data whose uncertainty cannot be characterized by a precise, stochastic model, but for which we have some heuristic knowledge. For example, Fig 2.5 (left) shows the fuzzy set that represents a given length measurement. For each value x, the value of this fuzzy set at x is a number in the [0, 1] interval that can be read as the degree by which x can be the actual length of the object given our measurement. (See [41] for this possibilistic reading of a fuzzy set.)
Figure 2.5: Fuzzy sets for the measured length (left) and hue (right).
In our work, we use trapezoidal fuzzy sets, both for computational reasons and for ease of construction. The possibilistic semantics give us some simple guidelines on how to build a trapezoidal fuzzy set to represent an inexact measurement. The flat top part of the fuzzy set (its core) identifies those values x that can be fully regarded as the actual length value given our measurement. In the example in Fig 2.5 (left), these values are spread over an interval rather than concentrated in a point because of the segmentation effect: our measurement cannot tell us more than what is allowed by the pixel size. The base of the fuzzy set (its support) identifies those values x that can possibly be regarded as the actual length value: given the errors that may affect our measurement, the actual length may be anywhere in the support interval, but under no circumstances can it be outside this interval. Put differently, the support constitutes a sort of worst-case estimate: however big the error is, the actual value must lie somewhere in this interval. The core, on the other hand, constitutes a best-case estimate: even when there is no error in our measurement, we cannot be more precise than this.
Let us now discuss in detail how we have built the fuzzy set in Fig 2.5.
measured value can be totally invalid if there has been an error in the segmentation and/or labeling phases; for instance, if the car has been merged with its shadow, or with another car in the image.
The vision system has calculated the length to 29.9 pixels, which corresponds to l = 4.23 meters. The segmentation error ε_s is estimated to a constant 1 pixel, which with a scale factor of s = 0.14 meter/pixel gives us ε_s = 0.14 meter. This segmentation error is inherent to our measurement process, no matter how good our models and computations are, and it thus defines the core of the trapezoid in the picture, given by the interval [l − ε_s, l + ε_s] = [4.09, 4.37].
Our estimates for the other errors are all collected in the support of the trapezoid. The model error ε_m is estimated in a coarse but simple way by comparing the measured area a_m with the computed area a_c = w·l (where w is the calculated width). The difference between these areas defines ε_m such that a_m will lie in the interval [(w − ε_m)(l − ε_m), (w + ε_m)(l + ε_m)]. If, for example, a_m is greater than a_c, ε_m becomes (as a simplification we have assumed that ε_m is the same for both the width and the length):

    a_m − (w + ε_m)(l + ε_m) = 0  ⟹  ε_m = −(w + l)/2 + √((w + l)²/4 + (a_m − a_c))    (2.1)

which in our case gives us ε_m = 0.04 m. As for the perspective error ε_α, in our case we have α = 40.3°. If we assume that we measure the projected length as l·cos α, then the worst-case error due to α becomes ε_α = l_MAX(1 − cos α), where l_MAX is the estimation of the maximum object length. If we set l_MAX = 6 m we get ε_α = 1.42 m. Since the support of our fuzzy set must include all the values which are possible in a worst-case error situation, we include all the above errors in it.⁴ This gives us the interval [l − ε_s − ε_m, l + ε_s + ε_m + ε_α] = [4.05, 5.83] for the base of our trapezoid. Note that ε_α only affects the upper bound of the interval, i.e., the car may seem smaller in the image when α increases. The correct length in our example was 4.42 m.
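The whole construction can be collected in a short sketch (illustrative code, not from the thesis; the width w = 1.7 m and measured area a_m = 7.431 m² are hypothetical values chosen so that eq. (2.1) reproduces the ε_m ≈ 0.04 m of the running example):

```python
import math

def length_trapezoid(l, eps_s, w, a_m, alpha_deg, l_max):
    """Core and support intervals of the trapezoidal fuzzy set for a
    length measurement, following the error model in the text
    (the a_m >= a_c case of eq. (2.1))."""
    a_c = w * l
    # model error, eq. (2.1)
    eps_m = -(w + l) / 2 + math.sqrt((w + l) ** 2 / 4 + (a_m - a_c))
    # worst-case perspective error
    eps_a = l_max * (1 - math.cos(math.radians(alpha_deg)))
    core = (l - eps_s, l + eps_s)
    support = (l - eps_s - eps_m, l + eps_s + eps_m + eps_a)
    return core, support

core, support = length_trapezoid(4.23, 0.14, 1.7, 7.431, 40.3, 6.0)
```

With the example numbers this reproduces the core [4.09, 4.37] and support [4.05, 5.83] derived above.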
The construction of the fuzzy sets for the other features follows similar guidelines. For example, Fig 2.5 (right) shows the fuzzy set that represents the observed Hue value. (At the current stage of development, however, we have mainly focused on the shape parameters.) Although the definitions of these fuzzy sets are mostly heuristic, they have resulted in good performance in our experiments.
2.5.4 Fuzzy signature matching
We now focus on the problem of anchoring a high-level description coming from the symbolic system (reactive executor) to the data coming from the vision module. As an example, consider the case in which a task needs to refer to `a small red Mercedes.' The SIM system has to link two types of data: on one side, the description containing the symbols `red,' `small' and `Mercedes' received from the symbolic system; and on the other side, the measurable parameters of the observed objects which are sent by the vision system. Anchoring implies converting these representations to a common frame, and finding the car that best matches the description.⁴ In our case, we have chosen to convert symbols to the universe of the measurable parameters.

⁴ In our current experiments in the simulated environment, we have ε_g = 0 since the helicopter position is exactly known.
In general, symbolic descriptions contain linguistic terms like `red' and `small' that do not denote a unique numerical value. Sticking to a common practice [26, 40], we have chosen to map each linguistic term of this kind to a fuzzy set over the relevant frame. For example, we associate the term `red' with the fuzzy set shown in Fig 2.6 (left): for each possible value h, the value of red(h) measures, on a [0, 1] scale, how much h can be regarded as `red.'⁵ As a second example, Fig 2.6 (right) shows how we represent the length associated to the linguistic term `small-Mercedes' by a fuzzy set over the space of possible lengths. In our system, we use a database that associates each car type to its typical length, size, and area, represented by fuzzy sets. Cars of unknown types are associated with generic fuzzy sets, like the `small' (car) set in the picture. Once again, we only use trapezoidal fuzzy sets for computational reasons.
Figure 2.6: Fuzzy sets associated to the symbols `small-Mercedes' and `red.'
Once we have represented both the desired description and the observed data by fuzzy sets, we can compute their degree of matching using fuzzy set operations. This choice is justified in our case since fuzzy sets can be given a semantic characterization in terms of degrees of similarity [32]. Consider two fuzzy sets A and B over a common domain X which respectively represent the observed data and the target description. The degree of matching of A to B, denoted by match(A, B), is the degree by which the observed value A can be one of those that satisfy our criterion B. In the experiments presented in this note, we use the following measure:

    match(A, B) = ( ∫_{x∈X} min{A(x), B(x)} dx ) / ( ∫_{x∈X} A(x) dx )    (2.2)

Intuitively, (2.2) measures the degree by which A is a (fuzzy) subset of B by looking at how much of A is contained in B. (See, e.g., [12] for different measures.) The behavior of this measure is graphically illustrated in Fig. 2.7. To ensure an efficient computation, we approximate (2.2) by the ratio between the area of the inner trapezoidal envelope of A∩B and the area of A. These areas can be computed very easily when A and B are trapezoidal fuzzy sets.
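A direct numerical evaluation of (2.2) for trapezoidal sets can be sketched as follows (a straightforward discretization for illustration; the thesis instead uses the faster trapezoidal-envelope approximation mentioned above, and the `small' description set here is hypothetical):

```python
import numpy as np

def trapezoid(a, b, c, d):
    """Membership function of a trapezoidal fuzzy set with
    support [a, d] and core [b, c]."""
    def mu(x):
        x = np.asarray(x, dtype=float)
        return np.clip(np.minimum((x - a) / (b - a), (d - x) / (d - c)),
                       0.0, 1.0)
    return mu

def match(A, B, lo, hi, n=20001):
    """Degree by which observation A is a fuzzy subset of description B,
    eq. (2.2), by Riemann summation over [lo, hi]."""
    x = np.linspace(lo, hi, n)
    dx = (hi - lo) / (n - 1)
    return (np.minimum(A(x), B(x)).sum() * dx) / (A(x).sum() * dx)

# Observed length (core [4.09, 4.37], support [4.05, 5.83]) against a
# hypothetical description of a `small' car
A = trapezoid(4.05, 4.09, 4.37, 5.83)
B = trapezoid(3.0, 3.5, 5.0, 6.0)
degree = match(A, B, 0.0, 10.0)
```

Note the asymmetry of (2.2): match(A, B) asks how much of the observation lies inside the description, so swapping the arguments generally yields a different degree.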
Figure 2.7: Three examples of partial matching: (a) match(A, B) = 0.4; (b) match(A, B) = 0.8; (c) match(A, B) = 1.0.
Once we have computed a degree of matching for each individual feature, we need to combine all these degrees together in order to obtain an overall degree of matching between the intended description and a given percept. In our case, we need to combine the degrees of matching of the length, width, area, hue, saturation, and value criteria into one summarized degree of matching. The simplest way to combine our degrees is by using a conjunctive type of combination, where we require that each one of the features matches the corresponding part in the description. Conjunctive combination is typically done in fuzzy set theory by T-norm operators [26, 37], whose most used instances are min, product, and the Łukasiewicz T-norm max(x + y − 1, 0). In our experiments, we have noticed that the latter operator provides the best results. (See [7] for an overview of the use of alternative operators with applications to image processing.)
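Applied pairwise over the feature degrees, the Łukasiewicz combination looks as follows (a small sketch; note that with a color degree of 1.0 it simply passes the shape degree through, which is why the overall column of table 2.2 equals the shape column):

```python
from functools import reduce

def lukasiewicz(x, y):
    """Lukasiewicz T-norm: conjunctive combination of two degrees."""
    return max(x + y - 1.0, 0.0)

def overall_degree(feature_degrees):
    """Fold the per-feature matching degrees into one overall degree."""
    return reduce(lukasiewicz, feature_degrees, 1.0)
```

Replacing `lukasiewicz` with `min` or with multiplication gives the other two T-norms mentioned in the text.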
The overall degree of matching is used by the SIM to select the best anchor among the candidate objects provided by the vision module. For each candidate, the SIM first computes its degree of matching to the intended description, then it ranks these candidates by their degree, and finally returns the full ordered list to the reactive executor. Having a list of candidates is convenient if the currently best one later turns out not to be the one we wanted. Also, it is useful to know how much better the best matching candidate is than the other ones: if the two top candidates have similar degrees of matching, we may decide to engage in further exploratory actions in order to disambiguate the situation before committing to one of them; for instance, we may give the vision system the task to zoom in on each candidate in turn in the hope of getting more precise data.
2.5.5 Fuzzy signature matching at work
We illustrate the use of the fuzzy signature matching by two examples on a scenario taken from the WITAS project. In this scenario, the deliberative system is interested in a red car of a specified model in the vicinity of a given crossing. Four cars are situated around that crossing, moving in different directions. The cars are all red, but of different models: a small van, a big Mercedes, a small Mercedes, and a Lotus. In the first example the helicopter is above the cars. In the second example, discriminating between the cars is made more difficult by the fact that the helicopter views the crossing at an inclination of about 30 degrees (see Fig. 2.8).
Figure 2.8: The simulated scenario for our examples.
In our first example, the deliberative system decides to follow `Van-B', which is described as a red van. The SIM sends the prototype signature of a red van to the vision module. Since all the four cars in the image are red, and they have fairly similar shapes, the vision module returns the observed signatures of all the four cars to the SIM. These signatures are then matched against the desired signature by our routines, resulting in the degrees of matching shown in table 2.2.

    ID   Color   Shape   Overall
    66   1.0     0.58    0.58
    67   1.0     0.38    0.38
    68   1.0     1.0     1.0
    69   1.0     0.0     0.0

    Table 2.2: Degrees of matching, first example: Van-B

The ID is a label assigned by the vision system to each car found in the image. The degree of matching for the color is obtained by combining the individual degrees of Hue, Saturation, and Value; in our case, this will be 1.0 for all the cars as they are all red. The degree of matching for the shape is the combination of the individual degrees of matching of length, width, and area. The overall degree is the Łukasiewicz combination of the color and shape degrees. In this case, car 68 is correctly⁶ identified as the best candidate, and an anchor to that car is thus returned to the deliberation system.
⁶ This verification was done manually off-line by analyzing some additional information, like the extraction of geometrical features.
In the second example, the deliberative system is interested in `Car-D', a red small Mercedes. The SIM sends the corresponding prototypical signature to the vision module, and again gets the signatures of all the four cars in the image as an answer. In this case, however, the helicopter is at a long distance from the crossing and it views the crossing at an inclination of about 30 degrees. By applying our fuzzy signature matching routine, we obtain the degrees presented in table 2.3.

    ID   Color   Shape   Overall
    66   1.0     0.65    0.65
    67   1.0     0.84    0.84
    68   1.0     0.0     0.0
    69   1.0     0.97    0.97

    Table 2.3: Degrees of matching, second example: Car-D

Cars 66, 67 and 69 match the desired description to some degree, while car 68 can safely be excluded. The SIM decides that these degrees are too close to allow a safe discrimination, and it tries to improve the quality of the data by asking the vision module to zoom in on each one of cars 66, 67, and 69 in turn. Using the observed signatures after zooming, the SIM then obtains the new degrees of matching, shown in table 2.4.

    ID   Color   Shape   Overall
    66   1.0     0.30    0.30
    67   1.0     0.70    0.70
    69   1.0     0.21    0.21

    Table 2.4: Degrees of matching, second example after zoom: Car-D
The closer view results in a smaller segmentation error, since the scale factor is smaller, and hence in narrower fuzzy sets. As a consequence, all the degrees of matching have decreased with respect to the previous observation. What matters here, however, is the relative magnitude of the degrees obtained from comparable observations, that is, those which are collected in the above table. These degrees allow the SIM to select car 67 as the best candidate.

The SIM now also has the option to try to further improve its choice by commanding the helicopter to fly over car 67 and take another measurement from above the car, the best observation conditions for the vision system. If we do this, we finally obtain a degree of matching of 1.00 for car 67. Note that this degree could as well have dropped, thus indicating that car 67 was not really the car that we wanted. In this case, the SIM might have used the partial match information and gone back to cars 66 and 69 to get more accurate views.
2.5.6 Conclusions
Anchoring symbols to the physical objects they are meant to denote requires the integration of symbolic and perceptual representations. The past sections considered an instance of the anchoring problem in which we link the car identifiers used at the decision-making level to the perceptual data provided by a vision system. Our experimental results indicate that our technique is adequate to handle the ambiguities that arise when integrating uncertain perceptual data and symbolic representations. In particular, fuzzy signature matching improves our ability to discriminate among perceptually similar objects in difficult situations (e.g., perspective distortion). Moreover, degrees of matching allow us to exclude unlikely candidates, and to rank the likely ones by their similarity to the intended description. Finally, these degrees can help in decision making; for example, if these degrees indicate a large amount of anchoring ambiguity, the system may decide to engage in active information gathering, such as zooming or getting closer to the object in order to obtain better information.
The work reported here is still in progress, and many aspects need to be further developed. First, the treatment of perspective distortions presented here is rather primitive; in our next experiments, we shall use different models of each car viewed from different observation angles. Second, we need to account for still more sources of errors, including the possibility that the detected object is not a car. Third, we need to study more sophisticated forms of aggregation of the individual degrees of matching of different features into an overall degree. For example, in some situations some of the features are more critical than others, and we would like their degree of matching to have a stronger impact on the overall degree. Fourth, we plan to include features of a different nature in the matching process, like the observed position and velocity of the cars. Finally, until now we have only performed experiments in simulation. At the current stage of development of the WITAS project, the vision system takes the video frames produced by a 3D simulator as input. Although this configuration results in some amount of noise and uncertainty in the extracted features, we are aware that a real validation of our technique will only be possible when we have access to the real data from an actual UAV.
The proposed reactive architecture
3.1 Introduction
In this chapter we will discuss the nuts and bolts of a proposed reactive architecture which shows some of the properties discussed in the Infant Robot scenario in section 2.3. First, we will present the representation we have chosen for the input/output signals. Then, we discuss the design of the architecture. The chapter ends with a presentation of the different phases involved in training the architecture, e.g., how to evaluate and create models.
3.2 Structure description
3.2.1 The channel representation
The signals arriving from the sensors to the ISP of the robot can be seen as projections of some variables in the environment, accessible to the robot only through its sensors. The effectors of the robot can be seen as a way of affecting some other variables in the environment. We will onwards denote the variables controllable by the robot response variables, and the variables only viewable via sensors percept variables. How the sensors and effectors represent these variables is very important to know when we design our reactive architecture. We will in the following assume that the percept and response variables are represented using the channel representation [8, 13, 14, 18, 30].

In this representation we let a set of channels represent a variable. Each channel is sensitive to a small part of the variable domain and can be viewed as a band-pass filter. The function we are using for the envelope of the channels is cos². Figure 3.1 shows an example of a channel set representing the scalar variable v in the interval v ∈ [15, 65]. We can also see the particular configuration of channels for v ≈ 40; only two channels are significantly activated. In the example the channels are distributed regularly with 60° overlap and have the same bandwidth, but this is not a requirement in general. With 60° overlap we mean that the closest neighbours to a channel are phase-shifted 60° in relation to that channel. Notice that the channels centered in 5 and 75, respectively, are needed in order to represent the interval [15, 65] in a regular manner.
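A minimal sketch of such an encoder (illustrative parameters: channels centered every 10 units, each a cos² kernel that is zero outside ±15 units, which gives the 60° phase shift between neighbours):

```python
import numpy as np

def channel_encode(v, centers, width=15.0):
    """cos^2 channel responses for a scalar v: channel k responds with
    cos(pi*(v - centers[k]) / (2*width))**2 inside +-width, else 0."""
    d = np.abs(np.asarray(centers, dtype=float) - v)
    return np.where(d < width, np.cos(np.pi * d / (2.0 * width)) ** 2, 0.0)

centers = np.arange(5.0, 76.0, 10.0)  # channels centered at 5, 15, ..., 75
c = channel_encode(40.0, centers)     # only the channels at 35 and 45 respond
```

With this spacing the active channel responses always sum to 1.5 for v in [15, 65], so the sum of the channels stays constant over the represented interval.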
Figure 3.1: The variable v represented with the channel representation.
Using the channel representation for the variables has a number of advantages:

- We can customize the resolution of the variable. In some parts of the variable domain we may want to have high resolution (many channels) and in other parts we may want low resolution (few channels) or perhaps no resolution at all.

- We can represent many values or events at the same time. When we use the channel representation, we usually extend the dimensionality from the original number of dimensions of the variable to the number of channels. However, typically only a small number of the channels (three for instance) are active at the same time, depending on how the channels overlap. The channels that are not active can be used to detect another event or represent another value. In figure 3.2 we can see an example of this where two events are sharing the same channel representation. This is mostly relevant only
- We can use simple processing strategies. Due to the local nature of the channels, it turns out that operations on the variable, such as calculating a function of it, often can be performed using only linear operators. Example: let x be the scalar variable that is represented by a set of channels c. If we want to calculate y* = f(x) we can often do this as y = wᵀc, where w is a vector of weights or parameters. The quality of this approximation, i.e., how similar y is to y*, depends on the rate of change of the function f(x) in relation to the resolution of the channel representation of x.
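This linear read-out can be sketched as follows (an illustrative experiment, not from the thesis: the weights are fitted by least squares, and the target function is deliberately chosen inside the span of the cos² channel basis so that y = wᵀc can reproduce it almost exactly):

```python
import numpy as np

def encode(v, centers, width=15.0):
    # cos^2 channels, zero outside +-width of each center
    d = np.abs(centers - v)
    return np.where(d < width, np.cos(np.pi * d / (2.0 * width)) ** 2, 0.0)

centers = np.arange(5.0, 76.0, 10.0)
xs = np.linspace(15.0, 65.0, 201)
C = np.vstack([encode(v, centers) for v in xs])  # one channel vector per sample
y_star = 0.5 + 0.3 * np.cos(np.pi * xs / 15.0)   # target y* = f(x)
w, *_ = np.linalg.lstsq(C, y_star, rcond=None)   # fit the weight vector
y = C @ w                                        # y = w^T c for every sample
max_err = float(np.max(np.abs(y - y_star)))
```

For a target whose variation is fast compared to the channel spacing, the residual of the same fit grows, which is exactly the resolution trade-off described above.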
Figure 3.2: An example of two events (x ∈ {15, 65}) represented using the same channel set.
3.2.2 The mapping functions
The task of the architecture is, as discussed previously, to learn on-line how the percepts change when the robot is performing some action in the environment. We will in the following denote the configuration, or pattern, of response channels the response pattern, and similarly for the percept channels.

If we represent the input and output variables in the channel representation, how do we, given the previous system state, associate the incoming percept pattern with the current response pattern (the current system state)? We can reformulate this as: given the previous system state, we want to approximate the current activation of a response channel with some function of the activations of the percept channels.
We want to put some demands on this function:

- The function must be robust against variations in the configuration of the percept channels: high resolution, low resolution, mixed scales, different degrees of overlap, etc.; the function must behave reasonably in these cases. This implies that the function should be robust against, or perhaps independent of, the norm of the percept channel set.

- The function must be robust against noise on the percept channels. Their mutual relations must be more important than the absolute activation of a single channel; the information that a percept channel is activated at all says a great deal, due to the locality of the channel representation. The absolute activation of a channel should be less interesting.

- In much the same way, the function must not depend on the shape of the percept channel function (for instance cos²).

- The function should be real-valued and continuous.

- The function must be simple enough to be implemented with local operations using nodes and links of an ANN.

- The function should use as few parameters as possible but be expressive enough to model the response channel.

- If we do not have any activation as input, we do not want any output, i.e., y(0) = 0.
There is of course an (infinite) number of functions that more or less fulfill these demands. Some functions we have looked at are:

1. y_j = wᵀc = ‖w‖‖c‖cos(θ), where w is the parameter (weight) vector, c is the percept channel vector and θ is the angle between the vectors.

2. y_j = wᵀc/‖c‖ = ‖w‖cos(θ)

3. y_j = wᵀc / Σᵢ cᵢ

4. z_k = σ(y_j), where σ(·) is the activation function of an ANN node.
We begin by considering case 1 under the assumption that we have a distribution of channels according to figures 3.3 and 3.4. When we increase x, the different channels will go up and down, but using a channel function of cos² and an overlap of 60 degrees, the sum of the channels, Σᵢ cᵢ