Strategy Synthesis for Real-world Problems Modeled as Single-player Games of Imperfect Information


DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS

STOCKHOLM, SWEDEN 2020


OLIVER ANTEROS LINNARSSON

SEBASTIAN LORENZO

KTH ROYAL INSTITUTE OF TECHNOLOGY


Degree Programme in Computer Science and Engineering

Date: June 8, 2020

Supervisor: Dilian Gurov

Examiner: Pawel Herman

School of Electrical Engineering and Computer Science

Swedish title: Strategisyntes för verkliga problem modellerade som enspelarspel av ofullständig information


Abstract

When most people play games, they attempt to come up with a winning strategy intuitively. However, algorithmic ways of synthesizing winning strategies for games do exist. These methods usually require an abstract way of representing the game. One way of constructing such an abstract representation is known as knowledge-based subset construction (KBSC). This thesis explores the possibility of modeling real-world problems as games of imperfect information, as well as the possibility and feasibility of synthesizing knowledge-based strategies from the model with the help of KBSC. This is done by first synthesizing a strategy for a simple real-world problem through intuitive reasoning, then modeling the same problem as a game of imperfect information, applying KBSC to it, and synthesizing a knowledge-based strategy from the result. The problem chosen for modeling was that of an automatic vacuum cleaning robot with a broken navigation system, tasked with vacuum cleaning two rooms. It was found that it is indeed possible to model real-world problems as games of imperfect information and to synthesize knowledge-based strategies for them with the help of KBSC. The knowledge-based strategy was found to be 20% more effective than the one synthesized through intuitive reasoning. However, this was a rather simple problem in and of itself, and it was also limited to only a single agent. Furthermore, the sample size of people attempting to synthesize strategies through intuitive reasoning was extremely limited. Thus, further study is needed to verify the findings presented in this thesis.


Sammanfattning

Oftast när en person spelar ett spel så försöker han eller hon intuitivt komma fram till vinnande strategier. Men det finns även algoritmiska sätt att syntetisera vinnande strategier för spel. Dessa metoder kräver vanligtvis ett abstrakt sätt att representera spelet. Ett sätt att konstruera en sådan abstrakt representation är KBSC, eller "knowledge-based subset construction". Detta kandidatarbete undersöker möjligheten att modellera verkliga problem som spel av ofullständig information, samt möjligheten och genomförbarheten av att syntetisera kunskapsbaserade strategier från modellen med hjälp av KBSC. Detta genomförs genom att först syntetisera en strategi för ett enkelt verkligt problem genom ett intuitivt resonemang. Därefter modelleras samma problem som ett spel av ofullständig information, sedan tillämpas KBSC på modellen och en kunskapsbaserad strategi syntetiseras utifrån det. Problemet som valdes för modellering var det av en automatisk robotdammsugare med ett trasigt navigationssystem, med uppgiften att dammsuga två rum. Det visade sig att det faktiskt var möjligt att modellera verkliga problem som spel av ofullständig information och att syntetisera kunskapsbaserade strategier för dem med hjälp av KBSC. Den kunskapsbaserade strategin visade sig vara 20% mer effektiv än den som syntetiserades genom intuitivt resonemang. Detta var dock ett ganska enkelt problem och det var också begränsat till endast en enda agent. Dessutom var antalet personer som försökte syntetisera strategier genom intuitivt resonemang extremt begränsat. På grund av detta behövs ytterligare undersökning för att verifiera resultaten som presenteras i detta kandidatarbete.


Contents

1 Introduction
    1.1 Aim
    1.2 Research question
    1.3 Scope
    1.4 Approach
    1.5 Thesis outline
2 Background
    2.1 Games
    2.2 Modeling
    2.3 Objectives
    2.4 Strategy construction in single-player games
    2.5 KBSC
    2.6 Knowledge
    2.7 Knowledge-based strategy
3 Method
    3.1 Synthesizing a strategy for a real-world problem through intuitive reasoning
    3.2 Modeling the problem as a single-player game of imperfect information
    3.3 Applying KBSC on the modeled single-player game of imperfect information
    3.4 Synthesizing a knowledge-based strategy from the knowledge-based model
    3.5 Comparing the two different methods of synthesizing strategies on real-world problems
4 Results
    4.1 Synthesizing a strategy for the Roomba problem through intuitive reasoning
    4.2 Modeling the Roomba problem as a single-player game of imperfect information
    4.3 Applying KBSC on the modeled Roomba problem
    4.4 Synthesizing a knowledge-based strategy using the knowledge-based model of the Roomba problem
    4.5 Comparison of the intuitive and algorithmic approach
5 Discussion
    5.1 Efficiency of knowledge-based strategy synthesis
    5.2 Difficulty and complexity of knowledge-based strategy synthesis
    5.3 Difficulty of applying KBSC on a simple model
6 Conclusion
7 Further Study
8 Acknowledgements
Bibliography


Chapter 1 Introduction

Playing games is a fun pastime for many people, and many games require strategic thinking in order to win. When most people play games, they attempt to come up with winning strategies intuitively. However, algorithmic ways of synthesizing winning strategies for games do exist. These methods usually require an abstract way of thinking about the problem, as well as an abstract way of representing it. Since strategy construction is such a common task, there are multiple known algorithms for constructing these abstract representations of problems. One of them is known as knowledge-based subset construction (KBSC), which uses the concept of knowledge to reduce a problem with imperfect or missing information to a knowledge-based representation of the problem with perfect information. This abstract knowledge-based representation can then be used to synthesize winning strategies for the initial problem.

Many problems and challenges that we face in the real world require strategic thinking, much in the same way as leisure games do. The question is then: is it possible to model these real-world problems as games, much in the same way as leisure games are modeled? And if so, is it possible to synthesize strategies for these real-world problems using an abstract knowledge-based representation of them?

1.1 Aim

The aim of this thesis is to aid in the understanding of the real-world applicability of abstract game modeling, and to encourage the use of these methods when designing solutions for real-world problems.


1.2 Research question

This thesis investigates the feasibility of applying knowledge-based subset construction on problems in the real world. More specifically:

• Is it possible to model real-world problems as games of imperfect information?

• Is it possible to synthesize knowledge-based strategies for real-world problems, when modeled as games of imperfect information, using knowledge-based subset construction?

• How useful is this entire approach compared to a naive strategy synthesis based on intuitive reasoning about the same problem?

1.3 Scope

The scope of this thesis is limited to modeling a simple real-world problem consisting of one agent, and extracting a knowledge-based strategy from that model. There are many more complex real-world problems, including problems with multiple agents, which might be feasible to model as games of imperfect information. For the sake of time and simplicity, however, this thesis is limited to a single problem with one agent.

1.4 Approach

We will start off by attempting to intuitively construct a strategy for a real-world problem. When this is done, we will attempt to model the problem as a single-player game of imperfect information. After modeling the game we will attempt to synthesize a knowledge-based strategy by hand. Lastly we will compare these two different strategies by looking at their effectiveness in solving the problem and the difficulty in synthesizing them.

1.5 Thesis outline

This thesis is divided into eight chapters, the first one being the introduction, which provides a broad overview of the subject as well as the goals and methodology of the thesis. The following five chapters cover the necessary background knowledge, the methodology used in the associated experiments, the results of those experiments, as well as a discussion and a conclusion in relation to those results. Finally, there is a chapter providing a starting point for further research building upon this thesis, as well as an acknowledgements chapter.


Chapter 2 Background

When searching for strategies, one can approach it in a multitude of ways. The strategy synthesis process usually depends on the availability of information and the quality of that information.

If information is scarce and of poor quality, one might improvise a strategy while simultaneously executing it, altering the strategy as new information is presented. This kind of strategy is usually employed in situations where the stakes are low, and the intelligence of the agent is high. For example, a human being walking; the agent is intelligent and the stakes are low, because a small mistake can easily be corrected before any harm can come to the agent.

If information is in abundance and the quality is pristine, one might develop a heuristic strategy. A heuristic strategy follows a set of rules, where each rule maps some observation to some action, and where each action has been handcrafted to fit the pre-existing information [1]. Heuristic strategies are often well suited for artificial intelligence in computer games, since information is available in abundance, and the quality of the information is equivalent to the quality of the world where the agent will live.

If information is in abundance but the quality is poor, one might want to employ a strategy where deep analysis of the information ensures a strategy that works no matter how the information is interpreted. When considering most real-world problems, it becomes obvious that while information might be in abundance, the quality of the information is likely poor. This is partly due to the natural ambiguity of making choices in the real world, and partly due to the wide array of information needed to accurately model the real world. This leads us to create abstractions and models that simplify this complex behavior.


2.1 Games

In the context of this thesis, a game refers to a structured activity with one or more agents. An agent is something or someone that has its own predefined set of possible states, and actions that it can take within the game.

Sometimes a game is described as being played "against nature". In these cases nature is the combination of everything in the world that is not an agent but can affect the outcome of any given action taken by an agent. When a game has a non-deterministic nature it is called a "game of imperfect information". Conversely when a game does not have a non-deterministic nature it is called a "game of perfect information". [2]

2.2 Modeling

One possible way of determining strategies for a real-world problem is to model the problem as a single-player game of imperfect information against nature. While you might want to throw a rock 200 meters, you are unlikely to manage it due to gravity and air resistance; these make up the nature in our model. Even if you know everything about yourself, and everything about rocks, that is not going to stop gravity from affecting the rock and altering its course. In order to determine a strategy for a situation such as this one, it helps to model it as a game of imperfect information. However, in order to understand what that means, we must first look at what a game of perfect information might look like.

We will start by introducing the model of a single-player game of perfect information [2]:

Definition 1. A single-player game of perfect information is defined as a tuple $G_a = \langle L, l_0, \Sigma, \Delta \rangle$ where:

(i) $L$ is the finite set of all states in the game

(ii) $l_0$ is the initial state of the game, where $l_0 \in L$

(iii) $\Sigma$ is the finite set of all actions available in the game, also known as the alphabet of the game

(iv) $\Delta$ is the set of all transitions available in the game

(v) $\Delta \subseteq \{(l, \sigma, l') \mid l, l' \in L,\ \sigma \in \Sigma\}$


Most leisure games are in fact games of perfect information, for example chess. In chess, both players can at any time during the game see the entire state of the board. Every game piece is visible for both players, and every game piece has a known predefined function and behavior.

However, in the real world, we are rarely in a situation where perfect information is realistic. Actions taken in the real world usually have ambiguous results, so when modeling them we have to account for the multiple potential outcomes of a single action. Let us make a modified version of chess. In this version, any piece may have swapped its behavior with any other piece. In other words, a pawn might be able to move like a queen, but then the queen must be limited to the movement of a pawn. In such a game a player would be able to observe the position of every piece, but would not be able to find a winning strategy at once, since the behavior of any one piece could theoretically have been swapped. In order to find a solution for such a game, a player would need to memorize the movement of each piece and, by pattern matching that movement against the known behavior of other pieces, figure out which pieces have swapped behavior. Once the player has determined that a piece has had its behavior swapped, the player categorizes that piece as the piece with which it swapped behavior. Once all pieces have been categorized, an exhaustive search can be used to find the best possible strategy.

In order to model games of imperfect information, we have to account for the ambiguity of actions mentioned above. This is done by introducing a set of observations, where at least one observation is ambiguous (if no observation is ambiguous, then $G_b$ is the same as $G_a$) [2]. In our chess example we would have many observations describing the different movement behaviors and how they are mapped to each piece.

Definition 2. A single-player game of imperfect information is defined as a tuple $G_b = \langle L, l_0, \Sigma, \Delta, O \rangle$, where:

(i) $L$, $l_0$, $\Sigma$, $\Delta$ are all the same as in $G_a$.

(ii) $O$ is the finite set of all states grouped into observations.
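As an illustration of definitions 1 and 2, the tuple can be written down as a small data structure. The following Python sketch is not taken from the cited formalism: the class name `Game`, its field names and the three-state toy game are assumptions made for the example, and the check that the observations form a partition of $L$ is one possible reading of "all states grouped into observations".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Game:
    """G_b = <L, l0, Sigma, Delta, O> (field names are illustrative)."""
    states: frozenset        # L
    initial: str             # l0
    actions: frozenset       # Sigma
    transitions: frozenset   # Delta, a set of (l, sigma, l') triples
    observations: frozenset  # O, a set of frozensets of states

    def is_well_formed(self) -> bool:
        # Every transition must stay inside L and use an action from Sigma.
        delta_ok = all(
            l in self.states and a in self.actions and l2 in self.states
            for (l, a, l2) in self.transitions
        )
        # Assumption: the observations group every state exactly once.
        covered = [l for o in self.observations for l in o]
        partition_ok = sorted(covered) == sorted(self.states)
        return self.initial in self.states and delta_ok and partition_ok

    def has_perfect_information(self) -> bool:
        # If every observation is a single state, nothing is ambiguous.
        return all(len(o) == 1 for o in self.observations)


# Hypothetical toy game: action "a" in l0 leads to l1 or l2,
# and the player cannot tell l1 and l2 apart.
toy = Game(
    states=frozenset({"l0", "l1", "l2"}),
    initial="l0",
    actions=frozenset({"a", "b"}),
    transitions=frozenset({("l0", "a", "l1"), ("l0", "a", "l2"), ("l1", "b", "l0")}),
    observations=frozenset({frozenset({"l0"}), frozenset({"l1", "l2"})}),
)
```

In this toy game `toy.has_perfect_information()` is false because the observation containing both `l1` and `l2` is ambiguous; replacing it with two singleton observations would make the game coincide with a game of perfect information.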

Single-player games of imperfect information can be converted into games of perfect information by applying established methods such as KBSC, which is covered in section 2.5.


2.3 Objectives

When modeling games of imperfect information we need to be able to determine the winning condition by using an objective. There are five different kinds of objectives that are usually used when determining the winning condition of a model [3]: the "Reachability Objective" $\mathrm{Reach}(\tau)$ requires that a location in $\tau$ is visited at least once; the "Safety Objective" $\mathrm{Safe}(\tau)$ requires that only locations in $\tau$ are visited; the "Büchi Objective" $\mathrm{Buchi}(\tau)$ requires that at least one location in $\tau$ is visited infinitely often; the "CoBüchi Objective" $\mathrm{coBuchi}(\tau)$ requires that only locations in $\tau$ are visited infinitely often; and the "Parity Objective" $\mathrm{Parity}(f_{pr})$ requires that the minimal priority occurring infinitely often is even, where the priority is determined by a priority function $f_{pr}$ mapping each location to a non-negative integer priority.

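Definition 3. The five different kinds of objectives for modeling games of imperfect information are defined as follows, where $\mathrm{Inf}(\pi)$ denotes the set of locations that are visited infinitely often in the play $\pi$:

(i) $\mathrm{Reach}(\tau) = \{l_0 l_1 \ldots \mid \exists k \geq 0,\ l_k \in \tau\}$

(ii) $\mathrm{Safe}(\tau) = \{l_0 l_1 \ldots \mid \forall k \geq 0,\ l_k \in \tau\}$

(iii) $\mathrm{Buchi}(\tau) = \{\pi \mid \mathrm{Inf}(\pi) \cap \tau \neq \emptyset\}$

(iv) $\mathrm{coBuchi}(\tau) = \{\pi \mid \mathrm{Inf}(\pi) \subseteq \tau\}$

(v) $\mathrm{Parity}(f_{pr}) = \{\pi \mid \min\{f_{pr}(l) \mid l \in \mathrm{Inf}(\pi)\} \text{ is even}\}$, where $f_{pr} : L \to \{0, 1, \ldots, d\}$, $d \in \mathbb{N}$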
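To make the five objectives concrete, they can be evaluated on plays that eventually repeat a cycle forever. The sketch below assumes such an ultimately periodic play, given as a finite `prefix` followed by a non-empty `cycle`; this representation, the function names and the example play are assumptions made for illustration, and arbitrary infinite plays are not covered.

```python
def inf(cycle):
    """Locations visited infinitely often in the play prefix . cycle^omega."""
    return set(cycle)


def reach(prefix, cycle, tau):
    # Some visited location lies in tau.
    return any(l in tau for l in prefix + cycle)


def safe(prefix, cycle, tau):
    # Every visited location lies in tau.
    return all(l in tau for l in prefix + cycle)


def buchi(prefix, cycle, tau):
    # At least one location in tau is visited infinitely often.
    return bool(inf(cycle) & tau)


def co_buchi(prefix, cycle, tau):
    # Only locations in tau are visited infinitely often.
    return inf(cycle) <= tau


def parity(prefix, cycle, priority):
    # The minimal priority occurring infinitely often is even.
    return min(priority[l] for l in inf(cycle)) % 2 == 0


# Hypothetical play l0 l1 (l2 l3)^omega
prefix, cycle = ["l0", "l1"], ["l2", "l3"]
tau = {"l2"}
priority = {"l0": 1, "l1": 1, "l2": 2, "l3": 3}
```

For this play the reachability and Büchi objectives hold for `tau`, while the safety and CoBüchi objectives do not, since `l3` is visited infinitely often but is not in `tau`.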

2.4 Strategy construction in single-player games

Suppose we have a model of a game and have decided on an objective in the context of this game. A strategy is then a mapping from the observations that can be made during the game to the actions that should be taken when each of these observations is made. If we observe X and X has an entry in our strategy mapping, we look up the corresponding action and execute it. This results in a new observation Y, which we then look up in the strategy, executing its corresponding action. This repeats until either our objective has been fulfilled, in which case the strategy was a winning strategy, or an observation is made that does not have an entry in our strategy mapping, in which case the strategy was a failing strategy.
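The look-up loop described above can be sketched in a few lines of Python. The function `run_strategy`, the step bound `max_steps` and the three-position corridor used to exercise it are hypothetical, and for simplicity the toy environment is deterministic, which a game against nature need not be.

```python
def run_strategy(strategy, observe, step, start, is_winning, max_steps=100):
    """Follow an observation -> action mapping until it wins or fails."""
    state = start
    for taken in range(max_steps):
        if is_winning(state):
            return "winning", taken
        observation = observe(state)
        if observation not in strategy:
            # No entry for this observation: the strategy fails here.
            return "failing", taken
        state = step(state, strategy[observation])
    # Step bound reached; report the outcome at that point.
    return ("winning", max_steps) if is_winning(state) else ("undecided", max_steps)


# Hypothetical corridor with positions 0, 1, 2; the goal is position 2.
def observe(position):
    return "wall" if position == 0 else "open"


def step(position, action):
    return min(position + 1, 2) if action == "right" else max(position - 1, 0)


def is_winning(position):
    return position == 2


complete = {"wall": "right", "open": "right"}
partial = {"wall": "right"}  # has no entry for the observation "open"

result_complete = run_strategy(complete, observe, step, 0, is_winning)
result_partial = run_strategy(partial, observe, step, 0, is_winning)
```

Here `complete` reaches the goal after two actions, while `partial` fails after one action because the observation `"open"` has no entry in the mapping.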

Nylén & Jacobsson, in their thesis "Investigation of a Knowledge-based Subset Construction for Multi-player Games of Imperfect Information" [2], start off by defining an objective $\varphi$ as a set of plays that is to be considered winning. They then define a reachability objective $\tau$ as a subset of all observations that, when observed, result in a winning state.

Definition 4. $\varphi$ is a set of plays that is to be considered winning, where:

(i) $\pi \in \varphi$ is an infinite sequence of states $l_0 l_1 l_2 \ldots$

(ii) $\tau \subseteq O$ is a reachability objective, where every $o \in \tau$ must have been observed at least once during a play $\pi \in \varphi$.

Nylén & Jacobsson define a strategy as winning if and only if every observation in $\tau$ ends up being observed at least once during the execution of the strategy.

The problem of generating winning strategies in single-player games against nature was formulated and solved by John Reif in his paper "The complexity of two-player games of incomplete information" [4]. Reif's solution was a form of subset construction, in which he eliminated the imperfect information by constructing games of perfect information from the original imperfect-information games.

2.5 KBSC

When searching for strategies in games of imperfect information, it helps to have an objective view of the knowledge base and of how that knowledge base changes depending on each action that might be taken at each state of the game. In order to model this changing knowledge base we will use a concept known as knowledge-based subset construction, or KBSC. Nylén & Jacobsson [2] describe KBSC as a useful tool in the synthesis of strategies for single-player games of imperfect information. The idea of KBSC is to create a new game $G^K$ where the states represent the player's knowledge of the original game $G$, and where knowledge is defined as the set of states that the original game $G$ can be in at the given time.

Definition 5. The knowledge-based single-player game is defined as a tuple $G^K = \langle \mathcal{L}, \{l_0\}, \Sigma, \Delta^K \rangle$, where:

(i) $\mathcal{L}$ is the set of all possible combinations of locations defined in the original game: $\mathcal{L} = 2^L \setminus \{\emptyset\}$

(ii) $\{l_0\}$ is the set of possible initial locations that the original game could start in.

(iii) $\Sigma$ is the finite set of actions available to the agent (player).

(iv) $\Delta^K$ is the set of all transitions between sets $s \in \mathcal{L}$:

$\Delta^K = \{(s_1, \sigma, s_2) \mid s_2 = \mathrm{Post}^G_\sigma(s_1) \cap o \neq \emptyset,\ o \in O\}$

$\mathrm{Post}^G_\sigma(s) = \{l' \in L \mid l \in s,\ (l, \sigma, l') \in \Delta\}$

(v) $O$ is the set of observations of the original game.

Let us call the perfect-information game constructed using KBSC $G^K$, and the original imperfect-information game $G$. The states in $G^K$ are then subsets of the states in $G$. All plays in $G$ that are indistinguishable to the player correspond to one play in $G^K$. The subset of states reached in $G^K$ when one of those indistinguishable plays is made is equal to the set of states reachable in $G$ given these indistinguishable plays. The states of $G^K$ thus represent enough of the player's knowledge about the state of play in $G$ to be able to transfer strategies from $G^K$ back to $G$ in a way that preserves winning strategies.
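The construction in definition 5 can be sketched as a short worklist algorithm. The Python code below is an illustrative reading of the definition and not the implementation used in this thesis: the function names `post` and `kbsc` and the four-state toy game are assumptions, and instead of enumerating all of $2^L \setminus \{\emptyset\}$ it only explores the knowledge sets that are reachable from the initial one.

```python
def post(delta, s, action):
    """Post_sigma(s): states reachable from some state in s by one action."""
    return frozenset(l2 for (l, a, l2) in delta if l in s and a == action)


def kbsc(delta, actions, observations, initial):
    """Build the knowledge sets and transitions reachable from `initial`."""
    start = frozenset(initial)
    states, transitions, frontier = {start}, set(), [start]
    while frontier:
        s1 = frontier.pop()
        for action in actions:
            successors = post(delta, s1, action)
            for o in observations:
                s2 = successors & o  # split by what the player observes
                if s2:
                    transitions.add((s1, action, s2))
                    if s2 not in states:
                        states.add(s2)
                        frontier.append(s2)
    return states, transitions


# Hypothetical toy game: "a" in l0 leads to l1 or l2, which look the same.
DELTA = {("l0", "a", "l1"), ("l0", "a", "l2"), ("l1", "b", "l3"), ("l2", "b", "l2")}
ACTIONS = ["a", "b"]
OBSERVATIONS = [frozenset({"l0"}), frozenset({"l1", "l2"}), frozenset({"l3"})]

knowledge_states, knowledge_transitions = kbsc(DELTA, ACTIONS, OBSERVATIONS, {"l0"})
```

In the toy game the player knows only that the game is in `{l1, l2}` after playing `a`. Playing `b` then resolves the ambiguity, because the next observation reveals whether the game moved to `l3` or stayed in `l2`.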

2.6 Knowledge

It has been argued that the notion of knowledge provides a natural level of abstraction for reasoning about problems. The paper "Common Knowledge and Update in Finite Environments" by Ron van der Meyden [5] asks and investigates two questions related to this. The first question is: given a description of a distributed system, how do we efficiently compute the answer to a query about the knowledge of the agents (processors) in a given state? The second question concerns update: how should an agent maintain its knowledge of the world, and its knowledge of other agents' knowledge, so that it may efficiently determine whether it knows a particular fact or not?

A very simple abstract framework is used: suppose that some fixed number of agents inhabit an environment with a finite number of possible states. In each state of the environment every agent makes an observation, which will in general be insufficient to determine that state. The evolution of the state is constrained by a transition diagram, which determines the traces of the system, i.e. the possible sequences of states. We suppose that both the transition diagram and the relationship between states and the agents' observations are common knowledge to all agents. Though simple, this framework is sufficient to represent many interesting systems, among them games of imperfect information.

Agents have only incomplete information about the current state in general, since observations do not determine the state. However, an agent potentially knows more than it learns from its most recent observation by using its knowledge of the possible state transitions. An agent can derive additional information about other agents' knowledge from its knowledge about prior states of the world, as well as gain additional information about the current state.

To formalize this sort of reasoning, a precise state of knowledge needs to be ascribed to each agent. There are many ways to do this. Following the usual definitions of knowledge in distributed systems [6], in each global state of the system we are required to assign each agent a local state. An agent can be said to know a fact at a global state just when the fact is true in all global states where the agent is in the same local state. An agent's local state thus contains the information it has about the global state.

In the framework proposed in van der Meyden's paper, an agent's local state is determined by its observations. In order to take advantage of prior knowledge in reasoning about the current global state, an agent needs to recall information obtained from its previous observations. The assumption is that the agents have the strongest possible model of memory, namely perfect recall of all their observations.

The perfect recall assumption, however, still leaves two possible ascriptions of local state, depending on whether the agents operate synchronously (i.e., with a global clock visible to all agents) or asynchronously. The paper studies both cases. In the synchronous case, the sequence of its observations in all previous states is assigned to an agent. In the asynchronous case, agents are not assumed to know the length of time for which they were making an observation, so the sequence of observations with consecutive repetitions deleted is assigned to the agent.

These local states form an $S5_n$ Kripke structure. The propositional modal language considered in van der Meyden's paper contains an operator for the knowledge of each agent, as well as an operator for the common knowledge of each group of agents. The basic propositions describe the current state of the world. Thus the sentences of this language refer only to the current state and to the mutual knowledge that agents have about the current state.


2.7 Knowledge-based strategy

Knowledge-based models can be used in order to construct knowledge-based strategies. [2]

Definition 6. A knowledge-based strategy is a mapping from a knowledge set to an action. These strategies are often presented as tables with at least three columns:

Column 1: Contains the knowledge state that the agent must be in to take the given action.

Column 2n+2: Contains the action that the agent should take, given that n actions have already been taken in the current knowledge state.

Column 2n+3: Contains the knowledge state that the agent will be in after taking the action, given that n actions have already been taken in the current knowledge state.

Knowledge state   Action 1   Update 1   Action 2   Update 2
{l0}              A          {l0}       B          {l1}
{l1}              B          {l2}       A          {l2}
{l2}              A          {l1}       B          {l3}
{l3}              B          {l4}       –          –

For example, executing the strategy above, starting in knowledge state {l0}, and defining a function strategy(knowledge) => Action & NewKnowledge, yields:

(i) strategy({l0}) => A & {l0}

(ii) strategy({l0}) => B & {l1}

(iii) strategy({l1}) => B & {l2}

(iv) strategy({l2}) => A & {l1}

(v) strategy({l1}) => A & {l2}

(vi) strategy({l2}) => B & {l3}
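The way the table is read, with the n-th visit to a knowledge state selecting the n-th action and update pair, can be sketched in Python. The dictionary layout, the function name `execute` and the use of plain strings such as `"l0"` for the knowledge state {l0} are assumptions made for this illustration.

```python
# Each knowledge state maps to its (action, update) pairs, in column order.
table = {
    "l0": [("A", "l0"), ("B", "l1")],
    "l1": [("B", "l2"), ("A", "l2")],
    "l2": [("A", "l1"), ("B", "l3")],
    "l3": [("B", "l4")],
}


def execute(table, start, steps):
    """Follow the table for at most `steps` actions and return the trace."""
    taken = {}  # n: number of actions already taken in each knowledge state
    knowledge, trace = start, []
    for _ in range(steps):
        n = taken.get(knowledge, 0)
        if knowledge not in table or n >= len(table[knowledge]):
            break  # no entry left for this knowledge state
        action, update = table[knowledge][n]
        taken[knowledge] = n + 1
        trace.append((knowledge, action, update))
        knowledge = update
    return trace


trace = execute(table, "l0", 6)
```

The six entries of `trace` correspond to steps (i) through (vi) above. Allowing more steps would add one further entry for {l3} and then stop, since {l4} has no row in the table.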


Chapter 3 Method

We begin by synthesizing a strategy for a real-world problem through intuitive reasoning alone, without using any modeling. We then model this real-world problem as a single-player game of imperfect information. KBSC is then applied to this model in order to synthesize a strategy, and finally, the effectiveness of the two strategy synthesis methods is compared.

3.1 Synthesizing a strategy for a real-world problem through intuitive reasoning

We begin by attempting to synthesize a strategy for a real-world problem through intuitive reasoning alone. This is done in order to have an alternative, naive method of synthesizing a strategy to compare with the more algorithmic approach of modeling the problem as a single-player game of imperfect information and applying KBSC to it. The problem we will try to synthesize strategies for is that of a robotic vacuum cleaner, a Roomba, that has been given the job of vacuum cleaning a house with two rooms, and that has no way of knowing which room it is currently in, thus simulating a faulty navigation system.

3.2 Modeling the problem as a single-player game of imperfect information

After a strategy for the problem has been synthesized through intuitive reasoning, the problem is modeled as a single-player game of imperfect information. This is done so that KBSC can be applied to it.


3.3 Applying KBSC on the modeled single-player game of imperfect information

When the problem has been modeled as a single-player game of imperfect information, KBSC is applied on that model to synthesize a knowledge-based strategy.

3.4 Synthesizing a knowledge-based strategy from the knowledge-based model

After the application of KBSC, a strategy is synthesized from the newly acquired knowledge-based model. The correctness of this strategy is then verified by applying it to each possible starting state of the original game, making sure that the final state fulfills the winning objective of both the original game and the knowledge-based game.

3.5 Comparing the two different methods of synthesizing strategies on real-world problems

Finally, these two methods of synthesizing strategies for the Roomba problem are compared with regard to their effectiveness. The metric used to measure their effectiveness is the number of actions needed to reach a winning state, where fewer actions are better.


Chapter 4 Results

When formulating our research problem we decided on a simplified version of a logic AI for an automatic vacuum cleaning robot, a Roomba. The Roomba knows that it is alone in its task, so it is a true single-player game. It knows that there are two dirty rooms in need of cleaning, one further to the west and the other further to the east. It also knows that it has a faulty navigation system and can therefore not identify whether it is currently in the western or the eastern room. The Roomba knows that both rooms are initially dirty and that it will be randomly placed in one of them at the beginning. The house has the ability to keep the Roomba informed of how many rooms still need to be cleaned, and the Roomba itself has the ability to move around and to vacuum clean.

4.1 Synthesizing a strategy for the Roomba problem through intuitive reasoning

In order to establish a baseline strategy to compare our knowledge-based strategy with, we needed to look at the problem from a naive and straightforward perspective. What are we attempting to accomplish? We want to vacuum clean two rooms. We know that we are in one of those rooms when we start. We decided that the most straightforward approach would be to start cleaning the room we start in, since we know that this room is bound to be dirty and in need of cleaning. Having cleaned straight away, we now know that the number of dirty rooms has decreased by one. We also know that the room we are currently in is clean, and that the other room is dirty. This led us to consider two different cases.


Case 1: We started in the eastern room so we need to move west and clean that room as well.

Case 2: We started in the western room so we need to move east and clean that room as well.

Seeing as the Roomba cannot distinguish which room it is currently in, we had to act as if both case 1 and case 2 were true at the same time, even though this is physically impossible.

We ended up with a strategy that starts off by cleaning the current room, then moves to the east and cleans that room, then moves to the west and cleans that room. This makes sure that no matter which room the Roomba started in, both rooms will eventually end up cleaned.

4.2 Modeling the Roomba problem as a single-player game of imperfect information

In order to be able to synthesize a knowledge-based strategy for our Roomba problem, we had to formally model the problem as a single-player game of imperfect information, see figure 4.1. Let us call our game G and define it as a tuple where:

(i) $G = \langle L, l_{start}, \Sigma, \Delta, O \rangle$

(ii) $L := \{l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7\}$

(iii) $l_{start} := l_0$ or $l_1$

(iv) $\Sigma := \{W \text{ (Move West)}, E \text{ (Move East)}, V \text{ (Vacuum Clean)}\}$

(v) $\Delta := \{(l_0, E, l_1), (l_0, W, l_0), (l_0, V, l_2), (l_1, E, l_1), (l_1, W, l_0), (l_1, V, l_3), (l_2, E, l_4), (l_2, W, l_2), (l_2, V, l_2), (l_3, E, l_3), (l_3, W, l_5), (l_3, V, l_3), (l_4, E, l_4), (l_4, W, l_2), (l_4, V, l_6), (l_5, E, l_3), (l_5, W, l_5), (l_5, V, l_7), (l_6, E, l_6), (l_6, W, l_7), (l_6, V, l_6), (l_7, E, l_6), (l_7, W, l_7), (l_7, V, l_7)\}$

(vi) $O := \{\{l_0, l_1\}, \{l_2, l_3, l_4, l_5\}, \{l_6, l_7\}\}$

(vii) Objective $:= \mathrm{Reach}(\tau)$, $\tau \subseteq O$, $\tau = \{\{l_6, l_7\}\}$
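For experimentation, the model above can be transcribed into a few Python constants. The names `DELTA`, `OBSERVATIONS`, `INITIAL`, `TARGET` and `observe` are our own choices for this sketch. Since each state has exactly one successor per action in the listed $\Delta$, a nested dictionary is sufficient. Reading the three observations as two, one and zero rooms left to clean is our interpretation of the problem description, not part of the formal model.

```python
# state -> action -> successor, transcribed from Delta above.
DELTA = {
    "l0": {"E": "l1", "W": "l0", "V": "l2"},
    "l1": {"E": "l1", "W": "l0", "V": "l3"},
    "l2": {"E": "l4", "W": "l2", "V": "l2"},
    "l3": {"E": "l3", "W": "l5", "V": "l3"},
    "l4": {"E": "l4", "W": "l2", "V": "l6"},
    "l5": {"E": "l3", "W": "l5", "V": "l7"},
    "l6": {"E": "l6", "W": "l7", "V": "l6"},
    "l7": {"E": "l6", "W": "l7", "V": "l7"},
}
OBSERVATIONS = [{"l0", "l1"}, {"l2", "l3", "l4", "l5"}, {"l6", "l7"}]
INITIAL = {"l0", "l1"}   # the possible values of l_start
TARGET = {"l6", "l7"}    # the single observation in tau


def observe(state):
    """Return the index of the observation that contains the state."""
    return next(i for i, o in enumerate(OBSERVATIONS) if state in o)


transition_count = sum(len(moves) for moves in DELTA.values())
```

A quick sanity check is that `transition_count` equals the 24 tuples listed in (v), and that every state belongs to exactly one observation.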


Figure 4.1: Roomba model

4.3 Applying KBSC on the modeled Roomba problem

Once we had our single-player game modeled, we needed to construct a knowledge-based game from it, see figure 4.2. This was done through KBSC and yielded the knowledge-based model GK where:

(i) $G^K = \langle \mathcal{L}, \{l_0, l_1\}, \Sigma, \Delta^K \rangle$

(ii) $\mathcal{L} := \{\{l_0\}, \{l_1\}, \{l_2\}, \{l_3\}, \{l_4\}, \{l_5\}, \{l_6\}, \{l_7\}, \{l_0, l_1\}, \{l_2, l_3\}, \{l_2, l_5\}, \{l_2, l_7\}, \{l_4, l_7\}, \{l_3, l_4\}, \{l_3, l_6\}, \{l_5, l_6\}, \{l_6, l_7\}\}$

(iii) $\Sigma := \{W \text{ (Move West)}, E \text{ (Move East)}, V \text{ (Vacuum Clean)}\}$

(iv) $\Delta^K := \{(\{l_0\}, V, \{l_2\}), (\{l_2\}, E, \{l_4\}), (\{l_4\}, V, \{l_6\}), (\{l_1\}, V, \{l_3\}), (\{l_3\}, W, \{l_5\}), (\{l_5\}, V, \{l_7\}), (\{l_0, l_1\}, W, \{l_0\}), (\{l_0, l_1\}, E, \{l_1\}), (\{l_0, l_1\}, V, \{l_2, l_3\}), (\{l_2, l_3\}, W, \{l_2, l_5\}), (\{l_2, l_3\}, E, \{l_3, l_4\}), (\{l_2, l_5\}, V, \{l_2, l_7\}), (\{l_3, l_4\}, V, \{l_3, l_6\}), (\{l_2, l_7\}, E, \{l_4, l_7\}), (\{l_3, l_6\}, W, \{l_5, l_6\}), (\{l_4, l_7\}, V, \{l_6, l_7\}), (\{l_5, l_6\}, V, \{l_6, l_7\})\}$

$\Delta^K$ does not include tuples where action $\alpha$ on state $S$ would result in $(S, \alpha, S)$.

(v) Objective $:= \mathrm{Reach}(\tau^K)$, $\tau^K \subseteq \mathcal{L}$, $\tau^K = \{\{l_6\}, \{l_7\}, \{l_6, l_7\}\}$


Figure 4.2: Roomba KBSC model

4.4 Synthesizing a knowledge-based strategy using the knowledge-based model of the Roomba problem

Using the knowledge-based model yielded by KBSC, we could easily construct a knowledge-based strategy. This was done by starting at $\{l_0, l_1\}$, and then locating the most efficient path through the graph using exhaustive search. This yielded a knowledge-based strategy of efficiency four, meaning that it requires four actions to reach a state that is to be considered winning.

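| Knowledge state | Action | Update |
|---|---|---|
| $\{l_0, l_1\}$ | Move West | $\{l_0\}$ |
| $\{l_0\}$ | Vacuum Clean | $\{l_2\}$ |
| $\{l_2\}$ | Move East | $\{l_4\}$ |
| $\{l_4\}$ | Vacuum Clean | $\{l_6\}$ |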
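The search for a shortest winning path can be sketched in code. The thesis only states that an exhaustive search was used; the sketch below substitutes a breadth-first search, which is one way to find a path with the fewest actions. The encoding of knowledge states as `frozenset` values, the helper `K`, and the function name `shortest_strategy` are assumptions made for illustration. The transition list is copied from item (iv) of section 4.3 as printed, including the omission of self-loops.

```python
from collections import deque


def K(*locs):
    """Build a knowledge state from location indices (l0..l7 -> 0..7)."""
    return frozenset(locs)


# Delta^K as listed in item (iv) of section 4.3.
DELTA_K = [
    (K(0), "V", K(2)), (K(2), "E", K(4)), (K(4), "V", K(6)),
    (K(1), "V", K(3)), (K(3), "W", K(5)), (K(5), "V", K(7)),
    (K(0, 1), "W", K(0)), (K(0, 1), "E", K(1)), (K(0, 1), "V", K(2, 3)),
    (K(2, 3), "W", K(2, 5)), (K(2, 3), "E", K(3, 4)),
    (K(2, 5), "V", K(2, 7)), (K(3, 4), "V", K(3, 6)),
    (K(2, 7), "E", K(4, 7)), (K(3, 6), "W", K(5, 6)),
    (K(4, 7), "V", K(6, 7)), (K(5, 6), "V", K(6, 7)),
]
START = K(0, 1)
TARGET_K = {K(6), K(7), K(6, 7)}


def shortest_strategy(start, target, delta):
    """Breadth-first search; returns a list of (state, action, next_state)."""
    successors = {}
    for state, action, nxt in delta:
        successors.setdefault(state, []).append((action, nxt))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state in target:
            return path
        for action, nxt in successors.get(state, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(state, action, nxt)]))
    return None  # no winning path exists


strategy = shortest_strategy(START, TARGET_K, DELTA_K)

if __name__ == "__main__":
    for state, action, nxt in strategy:
        print(sorted(state), action, sorted(nxt))
```

With the transitions in the order given, the search returns the four-action path through $\{l_0\}$, $\{l_2\}$ and $\{l_4\}$. The symmetric path that starts with Move East has the same length, so which of the two is returned depends only on the listing order.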


We then verified the strategy by applying it once on $G$ for each of its possible initial states, and asserting that they all fulfilled the reachability objective of both $G^K$ and $G$.

$l_{start} = l_0$: $l_0 \to l_0 \to l_2 \to l_4 \to l_6$, with $l_6 \in \{l_6\} \in \tau^K$ and $l_6 \in \{l_6, l_7\} \in \tau$

$l_{start} = l_1$: $l_1 \to l_0 \to l_2 \to l_4 \to l_6$, with $l_6 \in \{l_6\} \in \tau^K$ and $l_6 \in \{l_6, l_7\} \in \tau$
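The verification step above can be reproduced mechanically. The sketch below is an illustration under stated assumptions: locations l0 to l7 are encoded as integers 0 to 7, the successor table `SUCC` restates the transition relation of section 4.2 in the order (W, E, V), and the names `play` and `verify` are hypothetical helpers.

```python
# Successors per location in G, in the order (W, E, V); l0..l7 -> 0..7.
SUCC = {
    0: (0, 1, 2), 1: (0, 1, 3), 2: (2, 4, 2), 3: (5, 3, 3),
    4: (2, 4, 6), 5: (5, 3, 7), 6: (7, 6, 6), 7: (7, 6, 7),
}
ACTION_INDEX = {"W": 0, "E": 1, "V": 2}

TARGET = {6, 7}  # locations in the target observation of G

# Knowledge-based strategy: Move West, Vacuum Clean, Move East, Vacuum Clean.
STRATEGY = ["W", "V", "E", "V"]


def play(start, actions):
    """Return the locations visited when playing `actions` from `start` in G."""
    trace = [start]
    for action in actions:
        trace.append(SUCC[trace[-1]][ACTION_INDEX[action]])
    return trace


def verify(starts, actions):
    """True if every start location ends in a target location after `actions`."""
    return all(play(start, actions)[-1] in TARGET for start in starts)


if __name__ == "__main__":
    for start in (0, 1):
        print(f"l{start}:", " -> ".join(f"l{s}" for s in play(start, STRATEGY)))
    print("strategy wins from both starts:", verify((0, 1), STRATEGY))
```

Both traces end in l6 after four actions, which agrees with the two runs listed above.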

4.5 Comparison of the intuitive and algorithmic approach

The strategy that we synthesized through intuitive reasoning turned out to be equivalent to the middle path of the knowledge-based model that we constructed from the single-player game of imperfect information. Thus, the intuitive-reasoning strategy has an efficiency of five, meaning that it takes five actions to reach a winning state. However, the knowledge-based strategy that we synthesized using KBSC ended up with an efficiency of four. This means that in this case the knowledge-based approach, requiring four actions instead of five, ended up being 20% more efficient than the naive approach.
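One way to read the 20% figure, assuming efficiency is measured as the number of actions and the reduction is taken relative to the intuitive strategy, is the following short calculation. The symbols $n_{\text{int}}$ and $n_{\text{kb}}$ are introduced here for illustration and do not appear elsewhere in the thesis.

```latex
\[
\frac{n_{\text{int}} - n_{\text{kb}}}{n_{\text{int}}}
  = \frac{5 - 4}{5}
  = 0.20
  = 20\%
\]
```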


Chapter 5 Discussion

5.1 Efficiency of knowledge-based strategy synthesis

We synthesized the naive strategy through intuition and straightforward thinking, much in the same way a regular person would have tackled the problem had they been given it without any mention of strategy synthesis. We thought about how a regular human would have completed the task of vacuum cleaning two rooms, and then we tried to retroactively fit that strategy to our more restrictive problem. After having initially completed the naive strategy, we actually thought we had quite a good, if not the best, strategy for the problem. However, after modeling the problem and applying KBSC we found that our strategy was actually the worst possible strategy that did not needlessly repeat itself. Our first strategy solved the problem gradually: it first vacuum cleaned, then moved East and then West, vacuum cleaning after each move. This resulted in a strategy with five actions. However, if you were to simply move to the East or the West at the start, that would immediately collapse the uncertainty of the system. We would know for certain, after one action, exactly which room the Roomba was in, no matter where it started. That way we could achieve a strategy with four actions instead of five.

However, it is important to note that the sample size of participants used to synthesize strategies through intuitive reasoning was extremely limited. Since it was limited to only ourselves, the resulting strategy is less of a general-case baseline and more of an anecdotal one.
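The collapse of uncertainty described above can be illustrated by computing where the Roomba may be after a single action from the initial knowledge state. The sketch below is an illustration only: it restates just the transitions out of l0 and l1 from section 4.2, and the function name `update` is a hypothetical helper. It omits the refinement by observation that a full knowledge update would include; for these three cases all successors share one observation, so no further splitting is needed.

```python
# Transitions out of the two possible start locations in G (section 4.2).
SUCC = {
    ("l0", "W"): "l0", ("l0", "E"): "l1", ("l0", "V"): "l2",
    ("l1", "W"): "l0", ("l1", "E"): "l1", ("l1", "V"): "l3",
}


def update(knowledge, action):
    """Set of locations the Roomba may be in after `action` from `knowledge`."""
    return {SUCC[(loc, action)] for loc in knowledge}


initial = {"l0", "l1"}

if __name__ == "__main__":
    for action in ("W", "E", "V"):
        print(action, sorted(update(initial, action)))
```

Moving West or East yields a single possible location, whereas vacuum cleaning first leaves two possibilities, which is why the intuitive strategy carries its uncertainty along for longer.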


5.2 Difficulty and complexity of knowledge-based strategy synthesis

Modeling a simple real-world problem as a single-player game of imperfect information should not be hard. If modeling even the simplest and smallest of problems turned out to be hard, then the realistic usability of games modeling would be nonexistent. A similar logic applies to the overall complexity of models; however, this is something that, by necessity, has to be able to grow with the problem without becoming unwieldy. Otherwise the applicability of games modeling as a problem-solving approach would never get beyond the theoretical stage.

5.3 Difficulty of applying KBSC on a simple model

When we started working on the application of KBSC on our Roomba model, we had already researched the topic of games modeling and knowledge-based subset construction for about three months. Even though we had spent a considerable amount of time researching the topic, it still took a couple of tries to get the model right when applying KBSC. We needed a model that was correct, readable, and contained enough information to be useful in the synthesis of knowledge-based strategies. It was found that limiting the actions in the model to only those that change the state of the model drastically improved the readability, and by proxy the approachability, of the model. However, this discovery was made as a tangent to our own research and needs further study in order to verify its viability. It is also important to consider that this was a very simple problem, and the complexity of modeling and applying KBSC will likely increase drastically, if not exponentially, as the complexity of the problem grows.


Chapter 6 Conclusion

It was shown that it is indeed possible to synthesize strategies for some real-world problems with one agent by modeling them as single-player games of imperfect information and then applying KBSC on them. It was also shown that this approach can be used to synthesize strategies that can be more effective than a strategy synthesized simply through intuitive reasoning, while still being fairly simple to implement. Thus it can be a useful alternative to synthesizing strategies through intuitive reasoning in some contexts. The caveat is that the problem was a fairly simple one, and so it might not be indicative of more complex problems. Also, the sample size for synthesizing strategies through intuitive reasoning was extremely limited and might not be representative of the general public.


Chapter 7 Further Study

For any paper moving forward from this thesis, we suggest looking at the synthesis of knowledge-based strategies for more complex real-world problems through games modeling and the application of KBSC, as well as comparing its effectiveness with strategy synthesis through intuitive reasoning. This would preferably be done by surveying people and asking them to synthesize strategies through intuitive reasoning for the problem in question, in order to acquire a bigger sample size than we did. We also suggest attempting to conduct the same study on real-world problems that include multiple agents, modeling them as multiplayer games of imperfect information and applying multiplayer knowledge-based subset construction (MKBSC) on them in order to synthesize strategies for them. We have started looking into the modeling of a simplified version of the alternating bit protocol as a multiplayer game of imperfect information, with the goal of eventually finding a deterministic knowledge-based strategy for conflict resolution in distributed data protocols, see appendix A.

Furthermore, we also see a need for further study into the subject of model fidelity. During our research we realized that the fidelity of models is in general a lot higher than we found to be necessary. We found that reducing the fidelity of the model improved readability and therefore improved approachability. We think this warrants further study.


Chapter 8 Acknowledgements

Thank you to our supervisor Dilian Gurov¹ for his expertise and guidance. Thank you to Hrafndís Brá Heimisdóttir², Adnan Jamil Ahsan³, Olivia Herber⁴, Henrik Kultala⁵, and Stefano Markidis⁶ for peer-reviewing our thesis. Thank you to our friends and family for proofreading and providing feedback; thank you Niclas, Malin, Anna, Anders and Jonathan.

¹ Dilian Gurov (dilian@kth.se), KTH Royal Institute of Technology

² Hrafndís Brá Heimisdóttir (hbhe@kth.se), KTH Royal Institute of Technology

³ Adnan Jamil Ahsan (adnanja@kth.se), KTH Royal Institute of Technology

⁴ Olivia Herber (oherber@kth.se), KTH Royal Institute of Technology

⁵ Henrik Kultala (kultala@kth.se), KTH Royal Institute of Technology

⁶ Stefano Markidis (markidis@kth.se), KTH Royal Institute of Technology


Bibliography

[1] Judea Pearl. Heuristics: Intelligent Search Strategies for Computer Problem Solving. USA: Addison-Wesley Longman Publishing Co., Inc., 1984. isbn: 0201055945.

[2] Helmer Nylén and August Jacobsson. “Investigation of a Knowledge-based Subset Construction for Multi-player Games of Imperfect Information”. KTH Royal Institute of Technology, School of Engineering Sciences, 2018.

[3] Laurent Doyen and Jean-François Raskin. “Games with Imperfect Information: Theory and Algorithms”. In: Lectures in Game Theory for Computer Scientists. Ed. by Krzysztof R. Apt and Erich Grädel. Cambridge University Press, 2011, pp. 185–212. doi: 10.1017/CBO9780511973468.007.

[4] John H. Reif. “The complexity of two-player games of incomplete information”. In: Journal of Computer and System Sciences 29.2 (1984), pp. 274–301. issn: 0022-0000. doi: 10.1016/0022-0000(84)90034-5.

[5] Ron van der Meyden. “Common Knowledge and Update in Finite Environments”. In: Information and Computation 140.2 (1998), pp. 115–157. issn: 0890-5401. doi: 10.1006/inco.1997.2679.

[6] Ronald Fagin et al. “Knowledge-Based Programs”. In: Distributed Computing 10 (July 1997), pp. 199–225. doi: 10.1007/s004460050038.


Appendix A

Alternating bit protocol

We divided the protocol into three distinct parts: the Sender (actor), the Receiver (actor), and the Medium (nature). Dividing the protocol this way will allow us to model each actor part as its own single-player game of imperfect information $G_i$, and the Medium as a lossless channel, which lets us ignore its effect on the two actor models.

Sender $G_{sender} = \langle L, l_{start}, \Sigma, \Delta, O \rangle$, where:

(i) $L := \{l_{s=0}, l_{s=1}\}$

(ii) $l_{start} := l_{s=0}$

(iii) $\Sigma := \{!msg_0, !msg_1\}$

(iv) $\Delta := \{(l_{s=0}, !msg_0, l_{s=0}), (l_{s=0}, !msg_0, l_{s=1}), (l_{s=0}, !msg_1, l_{s=0}), (l_{s=1}, !msg_1, l_{s=1}), (l_{s=1}, !msg_1, l_{s=0}), (l_{s=1}, !msg_0, l_{s=1})\}$

(v) $O := \{\{l_{s=0}\}, \{l_{s=1}\}\}$

Receiver $G_{receiver} = \langle L, l_{start}, \Sigma, \Delta, O \rangle$, where:

(i) $L := \{l_{r=0}, l_{r=1}\}$

(ii) $l_{start} := l_{r=0}$

(iii) $\Sigma := \{!ack_0, !ack_1\}$

(iv) $\Delta := \{(l_{r=0}, !ack_0, l_{r=0}), (l_{r=0}, !ack_0, l_{r=1}), (l_{r=0}, !ack_1, l_{r=0}), (l_{r=1}, !ack_1, l_{r=1}), (l_{r=1}, !ack_1, l_{r=0}), (l_{r=1}, !ack_0, l_{r=1})\}$

(v) $O := \{\{l_{r=0}\}, \{l_{r=1}\}\}$


Figure A.1: Sender & Receiver model

As you can see both the sender and receiver have two different states, two different actions and can make two different observations. The sender and receiver also each have a set of six different transitions.
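The counts stated above can be checked by writing the Sender model down as data. The sketch below is an illustration under assumptions: the state names `s0` and `s1` stand for the sender's two locations, and the helpers `successors` and `nondeterministic_choices` are hypothetical names. The Receiver model has the same shape with the acknowledgement actions in place of the message actions. The sketch also lists the state-action pairs that have more than one possible successor.

```python
# Sender model from appendix A; "s0"/"s1" stand for the sender's two locations.
SENDER_STATES = {"s0", "s1"}
SENDER_ACTIONS = {"!msg0", "!msg1"}
SENDER_DELTA = {
    ("s0", "!msg0", "s0"), ("s0", "!msg0", "s1"), ("s0", "!msg1", "s0"),
    ("s1", "!msg1", "s1"), ("s1", "!msg1", "s0"), ("s1", "!msg0", "s1"),
}
SENDER_OBSERVATIONS = [{"s0"}, {"s1"}]


def successors(delta, state, action):
    """All locations reachable from `state` by `action`."""
    return {t for (s, a, t) in delta if s == state and a == action}


def nondeterministic_choices(delta):
    """State-action pairs with more than one possible successor, sorted."""
    pairs = {(s, a) for (s, a, _) in delta}
    return sorted(p for p in pairs if len(successors(delta, *p)) > 1)


if __name__ == "__main__":
    print("states:", len(SENDER_STATES))
    print("actions:", len(SENDER_ACTIONS))
    print("observations:", len(SENDER_OBSERVATIONS))
    print("transitions:", len(SENDER_DELTA))
    print("nondeterministic:", nondeterministic_choices(SENDER_DELTA))
```

Unlike the Roomba game, the transition relation here is not a function: sending the message that matches the current state can either keep the state or flip it.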
