
Dynamic Abstraction for Interleaved Task Planning and Execution


Academic year: 2021


Linköping Studies in Science and Technology

Thesis No. 1363

Dynamic Abstraction for Interleaved Task Planning and Execution

by

Per Nyblom

Submitted to Linköping Institute of Technology at Linköping University in partial fulfilment of the requirements for degree of Licentiate of Engineering

Department of Computer and Information Science

Linköpings universitet

SE-581 83 Linköping, Sweden


Dynamic Abstraction for Interleaved Task Planning and Execution

by

Per Nyblom

April 2008

ISBN 978-91-7393-905-8

Linköping Studies in Science and Technology

Thesis No. 1363

ISSN 0280-7971

LiU-Tek-Lic-2008:21

ABSTRACT

It is often beneficial for an autonomous agent that operates in a complex environment to make use of different types of mathematical models to keep track of unobservable parts of the world or to perform prediction, planning and other types of reasoning. Since a model is always a simplification of something else, there always exists a tradeoff between the model's accuracy and feasibility when it is used within a certain application due to the limited available computational resources. Currently, this tradeoff is to a large extent balanced by humans for model construction in general and for autonomous agents in particular. This thesis investigates different solutions where such agents are more responsible for balancing the tradeoff for models themselves in the context of interleaved task planning and plan execution. The necessary components for an autonomous agent that performs its abstractions and constructs planning models dynamically during task planning and execution are investigated and a method called DARE is developed that is a template for handling the possible situations that can occur such as the rise of unsuitable abstractions and need for dynamic construction of abstraction levels. Implementations of DARE are presented in two case studies where both a fully and partially observable stochastic domain are used, motivated by research with Unmanned Aircraft Systems. The case studies also demonstrate possible ways to perform dynamic abstraction and problem model construction in practice.

This work has been supported by the Swedish Aeronautics Research Council (NFFP4-S4203), the Swedish National Graduate School in Computer Science (CUGS), the Swedish Research Council (50405001) and the Wallenberg Foundation (WITAS Project).


Acknowledgements

I would like to thank my advisor Patrick Doherty who has given me more or less free hands to investigate this fascinating field of Artificial Intelligence. It has truly been some of the most interesting years of my life and I apologize for always picking subjects that you are less familiar with.

During my time at the Artificial Intelligence and Integrated Computer Systems division (AIICS), I have received valuable input from many people. Special thanks to Martin Magnusson, Fredrik Heintz, Per-Magnus Olsson, David Landén, Piotr Rudol and Gianpaolo Conte for commenting drafts of this thesis and related papers at various (perhaps dynamically generated) levels of abstraction.

Thanks to Martin Magnusson for providing the fire for many interesting and sometimes endless discussions which really make me grow as a person. Also thanks to Patrik Haslum for your endless wisdom and for supporting me during my early development. Thanks to Fredrik Heintz for your sense of detail and perfection and Jonas Kvarnström for your incredible problem solving capabilities (and will to share them). Thanks to Tommy Person and Björn Wingman for your help with all the implementation issues and your insights into the UAS Tech system.

Finally, I thank my parents Kurt and Gunilla, my girlfriend Anna and my daughter Anneli for love and support.


Contents

1 Introduction
1.1 Models and Tradeoffs
1.2 Task Environments and Models
1.3 The UAS Tech System
1.4 Abstractions
1.5 Planning Model Types
1.6 Constructing Planning Models
1.7 Focus of Attention
1.8 Dynamic Abstraction
1.9 Dynamic Abstraction for Planning and Execution
1.9.1 Example
1.9.2 DARE
1.10 Related Work
1.11 Contributions
1.12 Outline

2 Preliminaries
2.1 Probability Theory
2.1.1 Basic Assumptions
2.1.2 Stochastic Variables
2.1.3 Distributions and Density Functions
2.1.4 Joint Distributions
2.1.5 Conditional Distributions
2.1.6 Bayes Rule
2.1.7 Expectation
2.2 Bayesian Networks
2.2.1 Definition
2.2.2 Hybrid Models
2.2.3 Inference
2.2.4 Implicit Models
2.2.5 Model Estimation
2.3 Dynamic Bayesian Networks
2.4 Optimization
2.5 Execution Systems
2.5.1 Modular Task Architecture
2.5.2 Other Architectures
2.5.3 Definition of Skills

3 Dynamic Decision Networks
3.1 Example
3.2 Local Reward, Global Utility
3.3 Solution Techniques
3.4 Special Case: Markov Decision Processes
3.4.1 Policy
3.4.2 Solutions and Solver Methods
3.4.3 Value Iteration
3.4.4 Reinforcement Learning
3.4.5 RL with Model Building

4 The DARE Method
4.1 Tasks and Beliefs
4.2 Overview of DARE
4.3 Execution Assumptions
4.4 Refinement Assumptions
4.5 Hierarchical Solution Nodes
4.6 Subscription VS Poll
4.7 The Method
4.7.1 Main
4.7.2 DynabsSolve
4.7.3 CreateSubProblems
4.7.4 ReplanIfNecessary
4.8 Discussion

5 Case Study I
5.1 Task Environment Class
5.2 Task Environment Model
5.2.1 Danger Rewards
5.2.2 Observation Target Rewards
5.3 Skills
5.4 DARE Implementation
5.4.1 Problem Models
5.4.2 Dynamic Abstraction
5.4.3 Solution Method
5.4.4 Subproblem Generation
5.4.5 Replanning Conditions
5.5 Experiments
5.5.1 Setup
5.6 Comments

6 Case Study II
6.1 Task Environment Class
6.2 Task Environment Model
6.3 Skills
6.4 Belief State and Filtering
6.5 DARE Implementation
6.5.1 Planning Model Generation
6.5.2 Solution Method
6.5.3 Camera Movement
6.5.4 DynabsSolve Implementation
6.5.5 Replanning
6.6 Experiments
6.6.1 Setup
6.6.2 Results
6.7 Discussion

7 Conclusion
7.1 Future Work
7.1.1 Extensions to the Case Studies
7.1.2 Dynamic Task Environment Models


Chapter 1

Introduction

It is difficult to overestimate the importance of mathematical models in our modern society because of their common use in e.g. natural sciences and engineering disciplines for many different purposes. Many types of models exist today that can be used for a variety of tasks such as predicting weather, simulating vehicle dynamics, monitoring nuclear power reactors and verifying computer programs.

Models have also been used within the area of Artificial Intelligence (AI) to develop autonomous agents. It is widely considered that such agents should have models of their environments (and themselves) to make it possible to operate more successfully. The models can for example be used to keep track of the unobservable parts of the world, perform prediction [34], task planning [31] and other types of reasoning [10].

1.1 Models and Tradeoffs

One common trait for mathematical models used in practical applications is that it is not always beneficial (or even possible) to model every aspect of a system of study down to the smallest detail in order to make the model as accurate as possible. The problem is that there is always a tradeoff between accuracy and feasibility of a model that should be used for a certain application on a given architecture. There might be a demand for timely response of a system that prohibits long deliberation time, which in turn can make a highly detailed but computationally demanding model inappropriate for use in that particular domain. Although the computational resources that can be made available for different applications have been increasing exponentially since the dawn of electronic computers, there will always be a limit when a particular system is being developed and deployed. This means that one will always have to trade a model's accuracy for feasibility to get a reasonable performance in any future system, which is a fact that is often mentioned in the literature about practical mathematical modelling [44][24].
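The tradeoff can be illustrated with a minimal sketch. It assumes that each candidate model comes with a rough accuracy estimate and an estimated deliberation time; the class `CandidateModel`, the function `most_accurate_feasible` and all numbers are hypothetical and only serve as an illustration, they are not taken from any system described in this thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateModel:
    """A hypothetical model variant with rough accuracy and cost estimates."""
    name: str
    accuracy: float           # assumed score in [0, 1], higher is better
    deliberation_time: float  # assumed seconds needed to use the model


def most_accurate_feasible(models, time_budget):
    """Return the most accurate model that fits the time budget, or None."""
    feasible = [m for m in models if m.deliberation_time <= time_budget]
    if not feasible:
        return None
    return max(feasible, key=lambda m: m.accuracy)


# Illustrative values only.
MODELS = [
    CandidateModel("coarse", accuracy=0.60, deliberation_time=0.1),
    CandidateModel("medium", accuracy=0.80, deliberation_time=2.0),
    CandidateModel("detailed", accuracy=0.95, deliberation_time=60.0),
]

# With a demand for a response within 5 seconds, the detailed model is
# infeasible even though it is the most accurate one.
chosen = most_accurate_feasible(MODELS, time_budget=5.0)
```

Under these assumptions the detailed model is only selected when the time budget is relaxed to at least 60 seconds, which mirrors how a demand for timely response can rule out an otherwise preferable model.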


1.2 Task Environments and Models

When a model is to be constructed for an autonomous agent, it is important to consider the task environment [63] in which the agent will operate. The complexity of the task environment can give significant hints about the different types of models that can be used for whatever the purpose of the model is.

A task environment, which can be either real or simulated, specifies:

• what the agent can do to the environment with its actuators,

• what information it can receive from its sensors,

• how the environment works and what it contains, and

• what is considered "good or bad" with the help of a performance measure.

A task environment for an autonomous ground robot can e.g. specify that the actuators consist of a propulsion system and possibly a manipulator arm. Such agents are also typically equipped with sensors such as laser range scanners, cameras and sometimes collision sensors. The environment may consist of tables, chairs, walls, stairs etc., and its performance measure may be defined in terms of power consumption and the time to complete an assigned task (such as delivering a package).
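The four parts of a task environment, instantiated for the ground robot example, can be sketched as a small data structure. This is only an illustration: the names `TaskEnvironment`, `delivery_performance` and `GROUND_ROBOT` are hypothetical, and the equal weighting of power consumption and time in the performance measure is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class TaskEnvironment:
    """Actuators, sensors, contents and performance measure (illustrative)."""
    actuators: Tuple[str, ...]
    sensors: Tuple[str, ...]
    contents: Tuple[str, ...]
    performance_measure: Callable[[float, float], float]


def delivery_performance(power_used_wh: float, seconds_elapsed: float) -> float:
    """Assumed measure: less power and less time give a higher score."""
    return -(power_used_wh + seconds_elapsed)


GROUND_ROBOT = TaskEnvironment(
    actuators=("propulsion system", "manipulator arm"),
    sensors=("laser range scanner", "camera", "collision sensor"),
    contents=("tables", "chairs", "walls", "stairs"),
    performance_measure=delivery_performance,
)

# A delivery that uses 10 Wh and takes 30 s scores better than one that
# uses 10 Wh and takes 60 s.
fast = GROUND_ROBOT.performance_measure(10.0, 30.0)
slow = GROUND_ROBOT.performance_measure(10.0, 60.0)
```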

A model that an agent uses should be closely connected to the task environment that the agent operates within. For example, if a model is going to be used for predicting the state of an autonomous agent's task environment depending on what actions it performs, it should include specifications of how the actuators, sensors and the surroundings work in order to be useful. Such a task environment model cannot be too detailed due to the tradeoff between accuracy and feasibility.

A task environment or a model thereof can be classified according to some commonly used dimensions [63] which to a large extent determine how difficult it is to handle.

• Fully Observable or Partially Observable: If the agent's sensors can give access to all the relevant information in the environment it is called a fully observable task environment; otherwise the task environment is called partially observable.

• Deterministic or Stochastic: If the next state is completely determined by the current state and the action executed by the agent, the task environment is called deterministic. If there are several possible outcomes of an action it is called a stochastic environment. The term non-deterministic is often used when outcomes do not have probabilities.

• Episodic or Sequential: In an episodic environment, the agent's current decision does not influence the performance of any future episode. All environments considered in this thesis will be sequential, which means that the agent's current decision might influence the performance of the agent in future states.

• Static or Dynamic: A task environment which may change while the agent deliberates is called a dynamic environment; otherwise it is called static.

• Discrete or Continuous: A continuous environment contains elements that are more accurately described with continuous models involving real values instead of an enumerable set of values. Task environments that do not have any continuous elements are called discrete.

• Single Agent or Multiagent: A task environment where other external agents, besides the main agent itself, try to reach goals or maximize their utilities is called multiagent. If the external agents are better described without decision capabilities, or if no external agents exist, the environment can be considered single agent.
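The six dimensions can be recorded as a simple classification record, sketched below. The class `Classification` and the example instance are hypothetical; each dimension is reduced to a boolean for brevity, which is a simplification (for instance, it does not distinguish stochastic from non-deterministic environments).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    """One boolean per dimension; False selects the second alternative."""
    fully_observable: bool
    deterministic: bool
    episodic: bool
    static: bool
    discrete: bool
    single_agent: bool

    def describe(self) -> str:
        """Spell out the classification using the terms of the list above."""
        labels = [
            "fully observable" if self.fully_observable else "partially observable",
            "deterministic" if self.deterministic else "stochastic",
            "episodic" if self.episodic else "sequential",
            "static" if self.static else "dynamic",
            "discrete" if self.discrete else "continuous",
            "single agent" if self.single_agent else "multiagent",
        ]
        return ", ".join(labels)


# A made-up classification, used only to exercise the record.
EXAMPLE = Classification(
    fully_observable=False, deterministic=False, episodic=False,
    static=False, discrete=False, single_agent=True,
)
summary = EXAMPLE.describe()
```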

In this thesis, these dimensions are used to classify the intrinsic properties of a task environment. They are not assumptions that e.g. a designer of an agent can make. On the other hand, a designer can make assumptions that are reflected in the agent's task environment models that it is supposed to use. Constructed models that represent parts of a task environment must often be a simplification of the real thing and the different dimensions are then used to classify the model construction assumptions that are not already a property of the task environment. This will be discussed more in Section 1.4.

It is assumed that task environment models can be simulated. This means that different actions can be tested with the model which may result in one or several possible outcomes depending on whether the model is deterministic or not. Stochastic models can be simulated by pseudorandom number generators.
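As a minimal sketch of such a simulation, the following assumes a toy stochastic model in which a "forward" action succeeds with probability 0.8 and otherwise leaves the state unchanged. The model, the function names and the probability are hypothetical; only the use of a seeded pseudorandom number generator reflects the text above.

```python
import random


def simulate_step(state: int, action: str, rng: random.Random) -> int:
    """Assumed stochastic model: 'forward' succeeds with probability 0.8."""
    if action == "forward":
        return state + 1 if rng.random() < 0.8 else state
    return state  # any other action leaves the state unchanged


def sample_outcomes(state: int, action: str, n: int, seed: int) -> dict:
    """Test an action n times and count how often each outcome occurs."""
    rng = random.Random(seed)  # seeded generator gives repeatable runs
    counts = {}
    for _ in range(n):
        nxt = simulate_step(state, action, rng)
        counts[nxt] = counts.get(nxt, 0) + 1
    return counts


# Several outcomes are possible for "forward"; "wait" is deterministic here.
forward_counts = sample_outcomes(0, "forward", n=1000, seed=42)
wait_counts = sample_outcomes(0, "wait", n=10, seed=42)
```

Because the generator is seeded, repeating a run with the same seed reproduces the same outcome counts, which is convenient when different actions are to be compared against the same model.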

A task environment class or environment class is a set of task environments with similar properties. An agent is often designed to operate in instances of a particular task environment class where e.g. the environment can contain a different number of objects and agents but most of the other properties or assumptions stay the same. In this thesis, the task environment instances in a particular environment class are assumed to have the same classification according to the previously mentioned dimensions and to have similarly modelled actuators and sensors. Within a particular environment class, the types of the objects in the environment also stay the same but the number and initial conditions may vary in task environment
