A Method for Finding Strategies in Pursuit-Evasion Games


DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS

STOCKHOLM, SWEDEN 2020


OLOF GREN

DENNIS MAGNUSSON

KTH

SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE


Bachelor in Computer Science

Date: June 8, 2020

Supervisor: Dilian Gurov

Examiner: Pawel Herman

School of Electrical Engineering and Computer Science

Swedish title: Ett Sätt att Upptäcka Strategier i Jaktflyktspel


Abstract

Many real-world situations can be described as games over finite graphs, consisting of a set of agents performing joint actions that affect the state of the game. One class of games over finite graphs is the so-called pursuit-evasion games, where a set of pursuers try to capture an evader on a finite map. In some pursuit-evasion games where the position of the evader is unknown, finding an optimal strategy that ensures victory for the pursuers can be difficult. One way to simplify this process is to use the multiplayer knowledge-based subset construction (MKBSC) to transform the game graph into an expanded graph that includes the pursuers’ knowledge. In this report we investigate the usefulness of MKBSC for finding knowledge-based strategies for pursuit-evasion games by analyzing the generated graph by hand and extracting useful information from it. We found that it is in general difficult to find the best knowledge-based strategies for pursuit-evasion games by hand with a non-symbolic representation of the game, mainly because the expanded graphs tend to be very large. It is possible that MKBSC can be useful for finding knowledge-based strategies for pursuit-evasion games when combined with symbolic representations of the game or with algorithms that search the generated graphs for strategies.


Summary (translation of the Swedish Sammanfattning)

Many situations can be described as games on finite graphs consisting of a set of agents performing joint actions that affect the state of the game. One class of such games is the so-called pursuit-evasion games, where a set of pursuers try to capture an evader on a finite map. In some pursuit-evasion games where the evader’s position is unknown to the pursuers, it can be difficult to find a strategy that ensures victory for the pursuers. One way to simplify this is to use the multiplayer knowledge-based subset construction (MKBSC) to expand the game graph into an expanded graph that contains the pursuers’ knowledge. In this report we investigate the usefulness of MKBSC for finding knowledge-based strategies for pursuit-evasion games by analyzing the expanded graphs by hand and extracting useful information from them. The result was that it is in general difficult to find useful knowledge-based strategies for pursuit-evasion games by analyzing the expanded graph by hand with a non-symbolic representation of the game. This is mainly because the size of the expanded game tends to be very large. It is possible that MKBSC can be useful for finding knowledge-based strategies for pursuit-evasion games by using a symbolic representation of the game or by searching the expanded graph with the help of algorithms.


Chapter 1 Introduction

1.1 Universal strategies

Building a distributed system and keeping it running is very difficult. It involves many computers running many, possibly quite different, programs that need to work together to perform a task of value. On top of that, we have to deal with challenges such as limited bandwidth, failing hardware, uneven workloads, and possibly hackers with malicious intent.

All this sounds quite similar to some kind of large cooperative board game, where the servers need to work together by coming up with a good strategy beforehand, while only being able to perform limited communication. Thus finding strategies for games can have great value even in fields unrelated to board games.

One such system where we would be interested in strategies is a search and rescue operation carried out by robots. Robots used for search and rescue missions may have to cope with quite limited access to communication, and probably need to perform well even without a good wi-fi connection. Then there is the problem of actually finding a (possibly moving) target. So what kind of strategy would they need in order to find the person who needs to be rescued?

Sadly, finding good strategies for games is considered hard. Finding a winning strategy for poker is difficult, and for rock paper scissors it seems to be impossible if the opponent acts unpredictably. Despite there being a whole field of game theory, we see little progress on how to find solutions for these real games. In this report we turn to mathematics.


1.2 Kinds of games and removing imperfect information

In some games all actors are aware of the state, whereas in other games the state may be hidden. To illustrate this, compare checkers with poker: in poker a player does not know the cards in the other players’ hands, and there is no analog of this in checkers. The search and rescue mission also has imperfect information; if we knew where to look, it would not be a search and rescue mission. A game with invisible states is a game of imperfect information.

The robots will not be able to think and the way they act is the responsibility of their creator. Thus a plan is needed to determine how they will act. This is what we mean by "strategy". Or to say it mathematically:

A strategy is a function that maps some memory of how the game has progressed to an action that can be taken by that actor in that state.

Coming up with strategies can be difficult in some cases, especially when the game lacks perfect information, so methods for more easily coming up with efficient strategies have been created. One of them is knowledge-based subset construction[1], or KBSC, where a game of imperfect information can be converted to a game of perfect information, which makes finding strategies easier.

However, KBSC is only able to convert single-player games of imperfect information to single-player games of perfect information. For multiplayer games (like our robots) an extension of KBSC has been created [2], called the multi-player knowledge-based subset construction, or MKBSC. MKBSC creates a new multi-player game of imperfect information, where the knowledge of the other players is taken into consideration. A limitation of MKBSC is that it is unable to convert a game of imperfect information to a game of perfect information; however, it is still a useful tool for simplifying games and creating strategies for them.

1.3 Pursuit-evasion games

One specific category of games that is prevalent in real-world applications is pursuit-evasion games. Pursuit-evasion games are games where a set of pursuers try to locate and capture one or more evaders, much like in the playground game of tag. An example of a pursuit-evasion game is shown in Figure 1.1. A pursuit-evasion game can have imperfect information if an evader is in a location where the pursuers cannot see it. This introduces additional complexity to the problem of creating a good strategy. When the evaders are captured the game ends. The search and rescue robots are one possible application of what we learn about strategies for pursuit-evasion games. So what can we learn about the pursuit-evasion game, now that we are (intellectually) armed with the MKBSC?

Figure 1.1: An example of a pursuit-evasion game map.

1.4 Research Question

This study explores using MKBSC to create strategies for pursuit-evasion games, and aims to understand the advantages and limitations of doing so. This is done by creating some simple pursuit-evasion games and using MKBSC to find paths that lead to a winning state.

The scope of this research project is limited to analyzing pursuit-evasion games with only one evader on relatively small maps, due to limitations in processing power. We will not focus on using MKBSC as a tool for simplifying automatic synthesis of strategies.

The research question for this project is

How can MKBSC be used to aid in the creation of strategies in pursuit-evasion games with imperfect information?


Chapter 2 Background

In this chapter we present formal specifications for games over finite graphs and strategies, and describe which kinds of games have imperfect information. We also present games with multiple cooperating players, as well as a different type of game altogether: the pursuit-evasion game, and how we adapt it for our purposes.

We also present our method, the Knowledge-Based Subset Construction, as well as a variant for multi-player games, the Multi-Player Knowledge-Based Subset Construction.

Finally we describe how this thesis differs from other works describing the MKBSC as well as works describing methods for finding strategies in pursuit-evasion games.

2.1 Games over finite graphs

Following the notation of Doyen and Raskin [3], a game against nature can be described by a tuple $G = (L, l_i, \Sigma, \Delta)$, where $L$ is a set of states, $l_i$ is the initial state, $\Sigma$ is the alphabet of actions and $\Delta$ is a set of transitions between the states. In every state $l \in L$ the player can take some action $\sigma \in \Sigma$, after which the resulting state will be some state $l'$ such that $(l, \sigma, l') \in \Delta$. If multiple tuples in $\Delta$ start with $(l, \sigma)$, the resulting state $l'$ depends on the action taken by nature. Actions taken by nature are actions not taken by the player. Nature can for example be an opposing player or some environment affecting the state.

Let us consider a game of rock paper scissors where we know that the opposing player is not going to throw rock. This game is illustrated in Figure 2.1. In this game the states are $L = \{start, win, lose, tie\}$, the initial state is $l_i = start$ and the actions are $\Sigma = \{rock, paper, scissors\}$.

Figure 2.1: A game of rock paper scissors

If we throw rock, the game can only end in a victory or loss; if we throw paper, the game can only end in a tie or loss; and if we throw scissors, the game can only end in a tie or win. All of these possible transitions are in the set $\Delta = \{(start, rock, win), (start, rock, lose), (start, paper, tie), (start, paper, lose), (start, scissors, win), (start, scissors, tie)\}$. While the resulting state depends partially on our choice of action, it also depends on the choice of nature, represented by our opponent in this case.
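The tuple for this example can be written down directly in code. The following is a minimal Python sketch and not part of the thesis implementation; the names `STATES`, `INITIAL`, `SIGMA`, `DELTA` and `post` are chosen here for illustration only. The helper `post` returns the states that nature can choose between after a given action.

```python
# Illustrative encoding of the rock paper scissors game against nature.
STATES = {"start", "win", "lose", "tie"}
INITIAL = "start"
SIGMA = {"rock", "paper", "scissors"}
DELTA = {
    ("start", "rock", "win"), ("start", "rock", "lose"),
    ("start", "paper", "tie"), ("start", "paper", "lose"),
    ("start", "scissors", "win"), ("start", "scissors", "tie"),
}


def post(state, action, delta=DELTA):
    """Return every state nature may pick after `action` is taken in `state`."""
    return {dst for (src, act, dst) in delta if src == state and act == action}


for action in sorted(SIGMA):
    print(action, sorted(post(INITIAL, action)))
```

Each action has two possible outcomes, which is exactly the nondeterminism that the text attributes to nature.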

2.1.1 Strategies

A game can be played with a strategy. A strategy can be described with a function α that maps the history of states to an action. A strategy is memoryless if only the most recent state is taken into consideration when choosing an action.

Similarly, nature can also follow a strategy, defined in much the same way.

If a player’s strategy wins no matter what the actions of nature are, the strategy is a surely-winning strategy. If a surely-winning strategy exists in a game with perfect information, there must also exist a memoryless surely-winning strategy for that game [3].
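For a perfect-information game with a reachability goal, a memoryless surely-winning strategy can be computed by a backward fixpoint: a state is winning once some action leads only to states already known to be winning. The sketch below illustrates this standard technique; it is not taken from the thesis, and the function name `surely_winning` and the toy game are assumptions made for the example.

```python
def surely_winning(states, actions, delta, goal):
    """Return (winning states, memoryless strategy) for reaching `goal`.

    A state is added once some enabled action leads only to states that are
    already known to be winning, whatever nature chooses.
    """
    winning = set(goal)
    strategy = {}
    changed = True
    while changed:
        changed = False
        for state in sorted(states - winning):
            for action in sorted(actions):
                outcomes = {d for (s, a, d) in delta if s == state and a == action}
                if outcomes and outcomes <= winning:
                    winning.add(state)
                    strategy[state] = action
                    changed = True
                    break
    return winning, strategy


# Rock paper scissors from Section 2.1: every throw has a non-winning outcome.
RPS_DELTA = {
    ("start", "rock", "win"), ("start", "rock", "lose"),
    ("start", "paper", "tie"), ("start", "paper", "lose"),
    ("start", "scissors", "win"), ("start", "scissors", "tie"),
}
rps_winning, rps_strategy = surely_winning(
    {"start", "win", "lose", "tie"}, {"rock", "paper", "scissors"}, RPS_DELTA, {"win"})
print(rps_winning, rps_strategy)

# Hypothetical toy game: "risky" may loop forever, "safe" always makes progress.
TOY_DELTA = {
    ("a", "safe", "b"), ("a", "risky", "win"), ("a", "risky", "a"),
    ("b", "safe", "win"),
}
toy_winning, toy_strategy = surely_winning(
    {"a", "b", "win"}, {"safe", "risky"}, TOY_DELTA, {"win"})
print(toy_winning, toy_strategy)
```

In the rock paper scissors game no strategy is found, since nature can always avoid the win state, while in the toy game the strategy maps both states to the safe action.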

A knowledge-based strategy is a strategy that uses some representation of the agent’s knowledge to determine the next action [2]. A knowledge-based strategy is memoryless if only the agent’s knowledge at the time of the decision is considered, and memory-based if the agent’s previous knowledge is also used.

Figure 2.2: A game of rock paper scissors expressed as a game of imperfect information. The dashed red line indicates that the connected states are indistinguishable to the agent.

2.1.2 Games of imperfect information

The definition of a game given above is only sufficient when all states are distinguishable to the agent. Otherwise an additional element $O$ is added to the tuple to denote the set of sets of states that are indistinguishable to the agent. The observation set $O$ is a partition of the states in $L$; states in the same group are indistinguishable to the player. Each group in $O$ is called an observation. Thus the agent needs a strategy that depends on its memory of observations, rather than its memory of states.

Games where the player is unable to distinguish between some states are called games with imperfect information. Thus, a game against nature with imperfect information is defined as $G = (L, l_i, \Sigma, \Delta, O)$.

The game of rock paper scissors described above can also be expressed as a game of imperfect information. In this case two states can represent the move planned by the opponent. The player is unaware of this information, so the states paper and scissors are indistinguishable to the player. The game graph of this game is shown in Figure 2.2.

2.1.3 Multi-player Games

The game notation used in the previous section can be extended to multi-player games against nature, where two or more players are cooperating against nature. In this thesis we will use the notation used in a bachelor thesis by Jacobsson and Nylén [4].

In multi-player games every player chooses some action from their alphabet of actions, which results in some new state, based on the actions of all of the players and nature.

There are two main differences between the definitions of single-player and multi-player games.

Firstly, Σ is no longer just the alphabet of actions, but the cross product of the action alphabets of all players. Thus Σ is an alphabet of joint actions.

Secondly, O is no longer a single partition, but a list of partitions, one for each player.

An example of a game with multiple players against nature is the game of robots lifting a bucket in Jacobsson and Nylén’s thesis [4]. Two robots need to cooperate on lifting a bucket. They can squeeze their grip or attempt to lift the bucket. Both robots need to have a good grip to be able to lift the bucket.

It is a multi-player game against nature in which the two robots need a joint strategy but cannot communicate. Imperfect information can be added to this game by letting one of the robots have a broken sensor. This robot cannot determine whether it has a good grip on the bucket, which is vital since the robots cannot lift the bucket unless both have a good grip. Thus two states are indistinguishable for one of the robots, meaning that this can be represented as a multi-player game of imperfect information.

2.1.4 Pursuit-Evasion games

Pursuit-evasion games are games where one or more pursuers try to reach one or more evaders. The game ends when the distances between the evaders and pursuers are less than some value $d_{min}$. The goal for the pursuers is to reach the evaders and thus end the game, while the goal for the evaders is to postpone the end of the game [5].

For our purposes we will use the definition that pursuit-evasion games take place on a map $M = (V, E_m, E_v)$ where $V$ is a set of vertices, $E_m$ is an adjacency matrix describing how the players can move and $E_v$ is the visibility set containing a set of tuples $(u, v)$. If $(u, v) \in E_v$, a player at the location $u$ can see whether a player is located at the location $v$ [6].

Huang et al. [6] describe pursuit-evasion games as a kind of multiplayer game where a set of pursuers have the goal of capturing a set of evaders. Examples of real-world parallels to this type of game given by the authors include air and naval combat and car collision avoidance systems. Formally, the game is defined as the tuple $G = (M, A_p, A_e, I, sched)$, where $M$ is a map, $A_p$ is a set of pursuers, $A_e$ is a set of evaders, $I$ is the initial game state and $sched$ is a mapping from each finite sequence of states to the set of players making the next move. In most examples, $sched$ will either alternate between all of the pursuers and all of the evaders, or allow all agents to move at every time step. In the former case the game is called a turn-based game and in the latter a synchronous game.

Turn-based pursuit-evasion games on a grid always have the possibility of ending within a finite amount of time, while the same is not true for synchronous games. One such example is a game with a 2x2 map with one pursuer and one evader where visibility is limited to orthogonally adjacent nodes. If the starting position of the pursuer is (0, 0) and the starting position of the evader is (1, 1) it is possible for the pursuer to never observe the evader if both players move simultaneously. However if the game is turn-based the pursuer can always observe the evader after the second turn.
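The 2 × 2 example can be checked by enumeration. The sketch below is an illustration under two assumptions that the text does not state explicitly: every player must move to an orthogonally adjacent node on each step, and a pursuer sees its own node as well as the adjacent ones. All function names are chosen for this example. Since the players move simultaneously in the synchronous case, the check only shows that a suitable evader move exists for each pursuer move, which matches the claim that it is possible for the pursuer never to observe the evader.

```python
from itertools import product

NODES = list(product(range(2), range(2)))  # the four nodes of the 2 x 2 grid


def adjacent(a, b):
    """Orthogonal adjacency: Manhattan distance exactly one."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1


def moves(node):
    """Nodes a player can step to (assumption: staying in place is not allowed)."""
    return [n for n in NODES if adjacent(node, n)]


def sees(pursuer, evader):
    """Assumption: a pursuer sees its own node and orthogonally adjacent nodes."""
    return pursuer == evader or adjacent(pursuer, evader)


def evader_can_stay_hidden(pursuer, evader):
    """Synchronous step: for each pursuer move, some evader move remains unseen."""
    return all(
        any(not sees(p_next, e_next) for e_next in moves(evader))
        for p_next in moves(pursuer)
    )


def pursuer_sees_after_own_move(pursuer, evader):
    """Turn-based step: the pursuer moves alone and then looks."""
    return all(sees(p_next, evader) for p_next in moves(pursuer))


hidden = [(p, e) for p in NODES for e in NODES if not sees(p, e)]
print(all(evader_can_stay_hidden(p, e) for p, e in hidden))
print(pursuer_sees_after_own_move((0, 0), (1, 1)))
```

Both checks succeed: from every hidden (diagonal) configuration the evader has a move that keeps it hidden in the synchronous game, while in the turn-based game any single pursuer move from (0, 0) brings the evader at (1, 1) into view.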

The game state of a pursuit-evasion game at a given time is defined as a tuple $(A_P, A_E, posn, turn)$ where $A_P$ is the set of pursuers, $A_E$ is the set of evaders, $posn$ is a function that returns the location of a pursuer or evader given a node and $turn$ is the set of players scheduled to move in the current state.

For every player $a$ in the game there is an observation function $O_a$ defined as $O_a((A_P, A_E, posn, turn)) = (A_P, A_E, posn_a, turn)$ where $posn_a$ is a function that returns the location of all players visible to $a$.

A pursuit-evasion game in the notation described by Huang et al. [6] can be converted to a game in the notation described in Section 2.1.

2.2 Knowledge-Based Subset Construction

It is possible to convert a single-player game with imperfect information to an equivalent single-player game with perfect information by using a method known as Knowledge-Based Subset Construction, or KBSC [1]. This construction works as follows: every subset of every observation in $O$, except $\emptyset$, becomes a node in the new game graph. The new edges from each node are all the edges from each of the states in the node individually. Then any components not connected with the starting point are discarded. Finally, the node containing all initial states is made the starting state.

Figure 2.3: A game of rock paper scissors

More formally: $G^K = (L, l_i, \Sigma, \Delta^K)$ where $L = \left(\bigcup_{o \in O} 2^o\right) \setminus \{\emptyset\}$, $2^o$ is the power set of $o$ and $\Delta^K = L \times \Sigma \times L$.

This construction preserves strategies.

If KBSC is applied to the game in Figure 2.1 the resulting game closely resembles the game in Figure 2.2, which can be seen in Figure 2.3.
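The construction can be sketched in a few lines of Python. Note that this sketch uses the common formulation that builds only the knowledge sets reachable from the initial state, instead of first creating every subset and pruning afterwards; that choice, the function name `kbsc`, and the particular encoding of the rock paper scissors game (with a hypothetical `wait` action and `opp_*` states) are assumptions made for illustration and may differ from Figure 2.2.

```python
def kbsc(initial, delta, observations):
    """Reachable knowledge-based subset construction for a single player."""
    start = frozenset([initial])
    states, transitions, frontier = {start}, set(), [start]
    actions = {act for (_, act, _) in delta}
    while frontier:
        knowledge = frontier.pop()
        for act in sorted(actions):
            # Every state nature could move to from a state considered possible.
            reached = {d for (s, a, d) in delta if s in knowledge and a == act}
            for obs in observations:
                # The next observation narrows the knowledge to reached & obs.
                nxt = frozenset(reached & obs)
                if not nxt:
                    continue
                transitions.add((knowledge, act, nxt))
                if nxt not in states:
                    states.add(nxt)
                    frontier.append(nxt)
    return states, transitions


# Hypothetical encoding: the opponent secretly commits to paper or scissors.
DELTA = {
    ("start", "wait", "opp_paper"), ("start", "wait", "opp_scissors"),
    ("opp_paper", "rock", "lose"), ("opp_paper", "paper", "tie"),
    ("opp_paper", "scissors", "win"),
    ("opp_scissors", "rock", "win"), ("opp_scissors", "paper", "lose"),
    ("opp_scissors", "scissors", "tie"),
}
OBSERVATIONS = [
    {"start"}, {"opp_paper", "opp_scissors"}, {"win"}, {"lose"}, {"tie"},
]
states, transitions = kbsc("start", DELTA, OBSERVATIONS)
for knowledge in sorted(states, key=sorted):
    print(sorted(knowledge))
```

The two indistinguishable states are merged into a single knowledge state, and each throw from it branches into the two outcomes that remain possible.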

2.2.1 Multi-Player Knowledge-Based Subset Construction

The Multi-Player Knowledge-Based Subset Construction (MKBSC) is generated by first projecting the game onto each player. Projection here means turning the game into a single-player game by letting nature choose the moves of the other players. Then the imperfect information is removed by KBSC. These games are then combined by taking their synchronous product, that is, we take all the generated games and play them concurrently. At this point the observations need to be added back in. Two states belong to the same observation iff the player has the same knowledge in them. This results in yet another game with imperfect information for many players.

To summarize, the steps of the MKBSC are:

1. Project the game onto each player.

2. Construct a game without hidden information by KBSC.

3. Use these games to make a multiplayer game.

4. Add back the missing observations.
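Step 1 of the list above can be illustrated with a short sketch. It covers only the projection of the transitions and is not the implementation by Nylén and Jacobsson; the function name `project` and the bucket-lifting fragment are assumptions made for this example, and joint actions are assumed to be tuples with one entry per player.

```python
def project(delta, player):
    """Keep only `player`'s component of each joint action.

    The other players' choices disappear from the label, so they show up as
    nondeterminism that nature resolves in the single-player game.
    """
    return {(src, joint[player], dst) for (src, joint, dst) in delta}


# Hypothetical two-player fragment: the bucket is lifted only if both lift.
JOINT_DELTA = {
    ("ready", ("lift", "lift"), "lifted"),
    ("ready", ("lift", "wait"), "ready"),
    ("ready", ("wait", "lift"), "ready"),
    ("ready", ("wait", "wait"), "ready"),
}
print(sorted(project(JOINT_DELTA, 0)))
```

For player 0 the action `lift` now has two possible outcomes, because whether the other robot also lifts has become a choice of nature.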

After this step a new game $G^K$ has been created where every player has gained knowledge of the other players’ knowledge. We can now apply the knowledge-based subset construction again, which yields a new game $G^{2K}$ that can give better strategies. $G^{2K}$ can be used to create a game $G^{3K}$ and so on.

A game $G^{NK}$ is isomorphic to the game $G^{(N+1)K}$ iff the game stabilizes after $N$ iterations of MKBSC. In some cases the number of states in a game diverges when MKBSC is applied. The reasons why some games stabilize and others do not are mostly unknown [4].

A Python implementation of MKBSC was developed by Nylén and Jacobsson [4]. This implementation features visualization using the graph visualization library Graphviz [7]. The implementation relies on a nonstandard input format similar to the definition of a multiplayer game with imperfect information described by Doyen and Raskin [3].

2.3 Related work

Very little work has been done exploring the capabilities of MKBSC. Nylén and Jacobsson [4] studied the behaviour of MKBSC on different graphs and found that the construction in general increased in size with repeated iterations, although in some cases the graphs stabilized. The reason for stabilization is speculated to be that the players’ knowledge cannot be expanded further, which yields an isomorphic game graph. The authors acknowledge that they have been unable to prove that the diverging games do not stabilize or decrease in size; however, due to the computational cost of MKBSC, diverging games are unrealistic to use in a practical scenario. Ultimately, the authors conclude that they are unsure whether MKBSC is useful for real-world applications, due to its exponential increase in complexity as the size of the game increases.

Huang et al. [6] researched model checking with temporal logic for pursuit-evasion games. Using model checking software they were able to check for the existence of winning strategies for the pursuers, and in the cases where such a strategy did not exist the software provided a counterexample. Using three different model checkers the authors showed that the time required to verify a temporal logic formula for a pursuit-evasion game increased exponentially as the map size increased. It was nevertheless possible to verify temporal logic formulas on grid-based pursuit-evasion games with grid sizes of up to 50 × 50 by using a symbolic representation in which the states are defined by a set of constraints. Ultimately the authors conclude that model checking can be useful for verification of large pursuit-evasion games, but that tools are required to convert the games into a format suitable for the model checking software that was used.

Ramaithitima et al. [8] created an algorithm for finding strategies for worst-case pursuit-evasion games where the goal was to eliminate points where the evader could be located. For example, if the map is a line and two pursuers start in the same location and move in opposite directions, the evader cannot be located between the pursuers. The authors provided some examples of maps with sizes of up to a few hundred vertices and showed that a solution for the pursuit-evasion game with two pursuers can be found within a reasonable amount of time; however, as the number of pursuers increases the computation time increases exponentially. For reference, the authors compared their algorithm to a brute-force algorithm, which could only find solutions for maps with up to 20 vertices with two pursuers.

Other attempts at finding strategies for pursuit-evasion games have been made [5, 9], where the results show that it is difficult to find an optimal surely-winning strategy.


Chapter 3 Methods

Initially a script was developed to convert a pursuit-evasion game to a format similar to the nonstandard format used by the MKBSC implementation. The script is limited to discrete, synchronous pursuit-evasion games with two pursuers and one evader where the goal is for the pursuers to reach the same space as the evader. The input to the script is an adjacency map, a visibility map and the initial positions of the pursuers and the evader. The output of the script is a game where the states are represented by all possible positions for all the players. The actions of the evader are determined by nature.

Using the MKBSC library developed by Nylén and Jacobsson, we took some simple pursuit-evasion games generated by the script as input and applied MKBSC. We analyzed the number of states in the generated game, the number of states in the game after a few iterations of MKBSC and whether or not the game stabilized. The graphs generated by the MKBSC library were analyzed in order to try to find memoryless strategies resulting in a win state in as few moves as possible. In the cases where the number of states in the game generated by MKBSC was too large, we pruned states not in the vicinity of the goal.

We primarily focused on pursuit-evasion games with relatively small sizes due to limitations in terms of computational resources.

The strategies found were represented in a table format in natural language as a mapping from some condition to an action.

In summary, the steps performed in this project are:

• Create a formal definition for a class of pursuit-evasion games.

• Define a translation from these pursuit-evasion games to games on finite graphs.

• Use KBSC / MKBSC to expand these games.

• Attempt to find memoryless strategies in these new games.

• Translate back any resulting strategies to the original game in a table.

3.1 Alternative approaches

The main problem when trying to evaluate the MKBSC is that interesting games of the right format tend to be unacceptably large and would be difficult to analyze. Thus, while many other real situations can be modeled by games, we want something of the right size: not so small that it is strategically uninteresting, but not so large that applying the MKBSC makes finding strategies in the resulting graphs impossible. The size of pursuit-evasion games can be changed by increasing or decreasing the size of the map, so they are a good use case for exploring MKBSC. There may be other games that fulfill these criteria; however, pursuit-evasion games have the benefit of being easily understandable and flexible.

In this report we focus primarily on relatively small pursuit-evasion games with fewer than 10 vertices on the map. In general the computational requirements increase as the size of the map increases, which may render it infeasible to apply MKBSC to larger games. Another reason to focus primarily on small games is that they can be used to model scenarios that may occur in larger pursuit-evasion games.

The games we analyze are synchronous, because that is easier to model as a multiplayer game. If the games are not synchronous they can in many cases be modelled as single-player games, which makes MKBSC unnecessary. Another reason for the game to be synchronous is because it more closely resembles real-world applications of pursuit-evasion games.

Automatically finding strategies algorithmically could help, however it would increase the scope of this project to a level that is unfeasible to complete in a reasonable amount of time. In general, we try to limit our research to games that stabilize with MKBSC, and the sizes of these games are expected to be relatively small, which largely eliminates the need for automatic strategy synthesis.


Chapter 4 Results

In this chapter our conversion from a pursuit-evasion game to a game over a finite graph is shown in mathematical notation. We also show some examples of pursuit-evasion games and show how strategies can be found in the games by applying iterated MKBSC.

4.1 Conversion from Pursuit-evasion games to Multi-Player Games over finite graphs

Given a pursuit-evasion game $G = (M, A_p, A_e, I, sched_{sync})$, where $M = (V, E_m, E_v)$ is the map, $A_p = \{p_1, p_2\}$ is the set of two pursuers, $A_e = \{e\}$ is the evader, $I$ is the initial state and $sched_{sync}$ is a synchronous scheduler, $G$ can be converted to a game $G = (L, l_i, \Sigma, \Delta, O)$ by first defining the states $L$ as all possible combinations of locations for the players and an additional win state, or $(V \times V \times V) \cup \{win\}$. The win state represents all moments in time after the pursuers have captured the evader. $L$ can include unreachable states.

The initial state $l_i$ is defined by $(posn(x_{p_1}), posn(x_{p_2}), posn(x_e))$, where $x_i$ is the starting position of the player $i$. $\Sigma$ can be defined as all possible movements for the pursuers in all possible states, or $\{p_1, p_2\} \times V$, where the action $(p_i, v)$ represents the pursuer $p_i$ moving to the node $v \in V$ from some other node. This action is only valid in states where $p_i \in posn(u)$ for a node $u$ with $(u, v) \in E_m$. $\Delta$ contains all possible $((v_{p_1}, v_{p_2}, v_e), ((p_1, u_{p_1}), (p_2, u_{p_2})), (u_{p_1}, u_{p_2}, u_e))$ where $(v_{p_1}, u_{p_1}) \in E_m$, $(v_{p_2}, u_{p_2}) \in E_m$ and $(v_e, u_e) \in E_m$. Since the pursuers are always aware of each other’s position, $O$ can be partitioned into sets consisting of the pursuers’ positions and the evader’s position in the cases where the evader is not visible to any of the pursuers. Thus, the observation set for pursuer $p_1$ is defined as

$$O_{p_1} = \Big\{ \bigcup_{(u,v) \in V \times V} \{(u, v, w) \mid \forall w, (u, w) \notin E_v\} \Big\} \cup \bigcup_{(u,v) \in V \times V} \{(u, v, w) \mid \forall w, (u, w) \in E_v\}$$

Similarly, the observation set for pursuer $p_2$ is defined as

$$O_{p_2} = \Big\{ \bigcup_{(u,v) \in V \times V} \{(u, v, w) \mid \forall w, (v, w) \notin E_v\} \Big\} \cup \bigcup_{(u,v) \in V \times V} \{(u, v, w) \mid \forall w, (v, w) \in E_v\}$$

A full Python implementation of this conversion can be found in Appendix A.
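As a rough illustration of the state and transition part of this conversion, the sketch below builds $L$ and $\Delta$ for a cycle-shaped map. It is not the Appendix A implementation: the function names `cycle_edges` and `convert`, the string labels `"p1"`, `"p2"` and `"win"`, and the rule that every move from a captured position leads to the win state are assumptions made for this example. The observation sets are left out.

```python
from itertools import product


def cycle_edges(n):
    """Movement edges of an n-cycle, in both directions (no self-loops)."""
    return {(i, (i + step) % n) for i in range(n) for step in (1, -1)}


def convert(vertices, move_edges, start):
    """Sketch of the conversion for two pursuers p1, p2 and one evader."""
    successors = {v: [u for u in vertices if (v, u) in move_edges] for v in vertices}
    states = set(product(vertices, repeat=3)) | {"win"}
    delta = set()
    for v1, v2, ve in product(vertices, repeat=3):
        for u1, u2, ue in product(successors[v1], successors[v2], successors[ve]):
            joint = (("p1", u1), ("p2", u2))
            # Assumption: once a pursuer shares the evader's node,
            # every joint move leads to the win state.
            target = "win" if ve in (v1, v2) else (u1, u2, ue)
            delta.add(((v1, v2, ve), joint, target))
    return states, start, delta


octagon = list(range(8))
states, initial, delta = convert(octagon, cycle_edges(8), (0, 2, 4))
print(len(states))  # 8 * 8 * 8 position triples plus the win state = 513
```

For an eight-node map the sketch yields $8^3 + 1 = 513$ states, which agrees with the state count reported for the octagonal game in Section 4.2.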

4.2 MKBSC

As a proof of concept, we made the following simple game for one player against nature. On a map that is a cycle graph of four nodes named n, w, s, and e, there is a pursuer and an evader starting on two adjacent nodes. The map of the game is shown in Figure 4.1. The game was made into a graph. The states of the game are all possible combinations of positions of the players. This results in a fairly large game, as seen in Figure 4.2.

Applying the KBSC to the game results in the game seen in Figure 4.3.

Notably, the number of states in the graph has decreased dramatically. It can also be seen that there is no possibility of ever capturing the evader, because the ’win’ knowledge state has been pruned away from the graph. This happens because if both the pursuer and the evader move while the distance between them is one, the distance will remain one. Thus there cannot exist any surely-winning (memoryless or otherwise) strategy for this game.
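The distance argument can be confirmed by exhaustively exploring the positions reachable on the four-cycle. The sketch below assumes that both players must move to a neighbouring node on every step and numbers the nodes 0 to 3 around the cycle instead of naming them n, w, s and e; the function name is chosen for this example.

```python
def reachable_positions(n, pursuer, evader):
    """All (pursuer, evader) positions reachable on an n-cycle when both must move."""
    seen = {(pursuer, evader)}
    frontier = [(pursuer, evader)]
    while frontier:
        p, e = frontier.pop()
        for dp in (1, -1):
            for de in (1, -1):
                nxt = ((p + dp) % n, (e + de) % n)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return seen


positions = reachable_positions(4, 0, 1)
print(any(p == e for p, e in positions))  # False: the players never share a node
```

Starting from adjacent nodes, the difference between the two positions changes by 0 or 2 on every step, so it stays odd and the players can never meet.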

Figure 4.1: The four cycle graph pursuit game map.

Figure 4.2: The four cycle graph pursuit game graph. The states are connected by dashed red lines if they are indistinguishable to the agent.

Figure 4.3: The four cycle graph pursuit after applying the KBSC.

Now let us consider a similar game with an octagonal map and two pursuers starting in positions 0 and 2, and the evader starting in position 4. The map of the game is shown in Figure 4.4. The initial game contains 513 states representing all possible combinations of positions of the players and the win state. However, only 129 of these states are reachable. After one iteration of MKBSC the size of the game is increased to 170 states. The game stabilizes after one iteration. Finding a strategy from this graph is infeasible due to its size; however, it can be proven that the maximum length of this game played with an optimal strategy is three turns. Thus, if all states not on some path of length three between the initial state and the win state are removed, the graph becomes more readable, with only 15 states. This graph is shown in Figure 4.5. From this graph it is possible to find some functioning strategies, but it is also possible to find some strategies that are not surely-winning. The optimal surely-winning strategy for this game would be for the pursuers to move in opposite directions until one of them catches the evader, represented by the actions (p1_7, p2_3) followed by (p1_6, p2_4). The strategy for this game is shown in Table 4.1.

Condition | Action_p1 | Action_p2
--- | --- | ---
p1 is located at 0 and p2 is located at 2 | Go to 7 | Go to 3
p1 is located at 7 and p2 is located at 3 | Go to 6 | Go to 4

Table 4.1: Strategies for pursuers in the octagonal game
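The strategy in Table 4.1 can be checked against every possible evader play with a short recursive search. The sketch assumes that all players move simultaneously, that the evader must step to a neighbouring node on each turn, and that capture means sharing a node after a move; these rules, the function name `captured_within` and the dictionary encoding of the table are assumptions made for this example.

```python
def captured_within(n, strategy, pursuers, evader, steps):
    """True if every evader play is caught within `steps` joint moves on an n-cycle."""
    if evader in pursuers:
        return True
    if steps == 0 or pursuers not in strategy:
        return False
    next_pursuers = strategy[pursuers]
    return all(
        captured_within(n, strategy, next_pursuers, (evader + d) % n, steps - 1)
        for d in (1, -1)
    )


# Table 4.1 as a mapping from pursuer positions to the next pursuer positions.
STRATEGY = {(0, 2): (7, 3), (7, 3): (6, 4)}
print(captured_within(8, STRATEGY, (0, 2), 4, 2))  # True under these assumptions
```

Under these assumptions the two tabulated joint moves are enough: the evader is either caught at node 3 after the first move or trapped between the pursuers at nodes 4 and 6 after the second.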

One example of a game where the players are required to utilize the other players’ information is the game shown in Figure 4.6. One player is able to see the entire field and is thus acting as an observer, while the other player is chasing the evader. The map splits in two parts where the chaser is unable to know which path the evader took. In this case the chaser needs to use the information of the observer to know which path to find. The original game has 513 states, where only 23 are reachable. After one iteration of MKBSC the game stabilizes and the number of states in the construction is 74. Removing all nodes not in paths shorter than 3 resulted in the graph shown in Figure


Figure 4.4: The octagonal game map.

Figure 4.5: Simplified graph of MKBSC applied to the octagonal game.


Figure 4.6: The observer-chaser game. The orange arrows represent the visibility graph.

4.7. This is an example of a game that cannot be solved without allowing one player to access the other players’ knowledge. The strategy for this game is shown in Table 4.2.

Condition                                        Action_p2
p2 is located at 3                               Go to 2
p2 is located at 2                               Go to 1
p2 is located at 1 and p1 knows that e is at 5   Go to 4
p2 is located at 1 and p1 knows that e is at 7   Go to 6

Table 4.2: Strategies for pursuers in the observer-chaser game
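A knowledge-based strategy such as the one in Table 4.2 can be written down as a lookup keyed on the chaser's position and, where needed, on what the observer knows. The sketch below only encodes the four rows of the table. The names `CHASER_RULES` and `chaser_move` are hypothetical, and the representation is an illustration rather than the format produced by the MKBSC tool.

```python
from typing import Optional

# Table 4.2 as data:
# (chaser position, evader position known to the observer) -> chaser's move.
# None means the rule does not depend on the observer's knowledge.
CHASER_RULES = {
    (3, None): 2,
    (2, None): 1,
    (1, 5): 4,
    (1, 7): 6,
}


def chaser_move(chaser_pos: int, known_evader_pos: Optional[int]) -> int:
    """Return the chaser's next node according to Table 4.2."""
    if (chaser_pos, known_evader_pos) in CHASER_RULES:
        return CHASER_RULES[(chaser_pos, known_evader_pos)]
    # Fall back to the rule that ignores the observer's knowledge.
    # At the fork (position 1) no such rule exists, so a KeyError is raised.
    return CHASER_RULES[(chaser_pos, None)]


print(chaser_move(3, 5))  # in the corridor the knowledge is not needed yet
print(chaser_move(1, 7))  # at the fork the observer's knowledge decides
```

The lookup makes the dependency explicit: the first two rules ignore the observer, while the last two cannot be evaluated from the chaser's own position alone.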

Finally, let us consider a randomly generated pursuit-evasion game with a map of 6 nodes and 12 edges, and with 12 randomly generated edges in the visibility graph. The map of the game is shown in Figure 4.8. The initial game has 216 states. After one iteration the number of states increases to 888, and after a second iteration to 5688. Beyond this point the computation time became too long, so the program was halted.
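The growth per iteration can be read off directly from the state counts reported for this game. The short calculation below is only arithmetic on those three numbers; it does not rerun the construction.

```python
# State counts reported for the randomly generated game,
# indexed by the number of MKBSC iterations applied (0, 1, 2).
state_counts = [216, 888, 5688]

growth_factors = [
    after / before for before, after in zip(state_counts, state_counts[1:])
]
for iteration, factor in enumerate(growth_factors, start=1):
    print(f"iteration {iteration}: x{factor:.2f}")
```

The factor itself grows from the first iteration to the second, which is consistent with the computation becoming impractical after two iterations.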


Figure 4.7: Simplified graph of MKBSC applied to the observer-chaser game.

Figure 4.8: The randomly generated game. The orange arrows represent the visibility graph.


Chapter 5 Discussion

5.1 Conversion from Pursuit-evasion games to Multi-Player Games over finite graphs

In general it seems that using MKBSC to find strategies for pursuit-evasion games without any kind of automatic strategy-finding algorithm is infeasible due to the size of the output graph. The graphing software is one of the main obstacles: drawing huge graphs requires a lot of computational resources, and the resulting graphs are almost impossible to analyze by hand. There are, however, some optimizations that can be made in the conversion between the original pursuit-evasion game format and the extensive-form game that can reduce the effective number of states. For example, if the two pursuers swap positions, the resulting state is effectively the same. There are also states that are isomorphic, for example under rotations and flips of the map. The number of states can be reduced by introducing a symbolic representation in which a state is given by the positions of the players relative to each other. Even so, reducing the number of states with a symbolic representation is likely still not sufficient to yield a graph small enough to easily analyze by hand.
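The symmetry argument can be made concrete for a cycle-shaped map such as the octagonal game. The sketch below maps every state to a canonical representative under rotations and reflections of the cycle and under swapping the pursuers. It assumes that the two pursuers are interchangeable and that the visibility graph has the same symmetries as the map, which is not true for every game (it fails for the observer-chaser game, for instance). The function `canonical` is a hypothetical helper, not part of the conversion script.

```python
from itertools import product

N = 8  # assumption: the map is a cycle with N nodes, as in the octagonal game


def canonical(state, n=N):
    """Smallest representative of `state` under rotations and reflections
    of the cycle and under swapping the two (interchangeable) pursuers."""
    p1, p2, e = state
    candidates = []
    for a, b in ((p1, p2), (p2, p1)):      # pursuer swap
        for sign in (1, -1):               # identity or reflection
            for shift in range(n):         # rotation
                candidates.append(
                    tuple((sign * x + shift) % n for x in (a, b, e))
                )
    return min(candidates)


all_states = list(product(range(N), repeat=3))
classes = {canonical(s) for s in all_states}
print(len(all_states), "states fall into", len(classes), "symmetry classes")
```

Working with one representative per class shrinks the state space by a constant factor at most, so it does not change the overall growth under repeated MKBSC iterations.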

5.2 Stabilization of Iterated MKBSC

A game stabilizing after a certain number of iterations of MKBSC indicates that the players cannot gain more information by reasoning about the knowledge of the other players. Conversely, if a game has not yet stabilized, the players can still gain information that may be useful for a better strategy, so an optimal strategy might not be found in a non-stabilized iteration. Most of the games we analyzed stabilized after one iteration, since this was all of the information that the players could gain from knowing the other player’s knowledge. The randomly generated game, however, did not appear to stabilize. This indicates that it might not always be possible for games to stabilize within a reasonable amount of time, which makes the usefulness of MKBSC for larger games unclear and also makes it less feasible for use when modelling real-world situations. While it may not be likely that the visibility graph is much different from the map graph in real-world scenarios, the randomly generated game does illustrate that even with the use of algorithmic strategy searching, MKBSC may not always be able to find an optimal strategy for pursuit-evasion games. It might, however, be possible for games that do not stabilize under our pursuit-evasion game conversion to stabilize in equivalent games where a more compact symbolic representation is used, although creating such a representation may be difficult for maps similar to the randomly generated one.
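The idea of iterating a construction until it stabilizes can be illustrated with a much simpler relative of MKBSC. The sketch below implements a single-agent knowledge-based subset construction on a hypothetical four-state toy game and repeats it until the number of states stops changing. It is not the multiplayer construction used in this report, and comparing state counts is only a cheap stand-in for checking that two consecutive games are isomorphic.

```python
from collections import deque


def kbsc(initial, transitions, observation_of):
    """Single-agent knowledge-based subset construction (simplified sketch).

    initial        -- initial state of the game
    transitions    -- dict: (state, action) -> set of successor states
    observation_of -- dict: state -> observation seen in that state
    Returns (new_initial, new_transitions) over knowledge sets (frozensets).
    """
    actions = {action for (_, action) in transitions}
    start = frozenset([initial])
    new_transitions = {}
    seen, queue = {start}, deque([start])
    while queue:
        knowledge = queue.popleft()
        for action in actions:
            # Every state the game could be in after playing `action`.
            successors = set()
            for state in knowledge:
                successors |= transitions.get((state, action), set())
            # The next observation splits them into new knowledge sets.
            groups = {}
            for state in successors:
                groups.setdefault(observation_of[state], set()).add(state)
            targets = {frozenset(group) for group in groups.values()}
            if targets:
                new_transitions[(knowledge, action)] = targets
            for target in targets - seen:
                seen.add(target)
                queue.append(target)
    return start, new_transitions


def states_of(initial, transitions):
    """All states mentioned by a game."""
    states = {initial}
    for (source, _), targets in transitions.items():
        states.add(source)
        states |= targets
    return states


def iterate_until_stable(initial, transitions, observation_of, max_iterations=5):
    """Repeat the construction until the state count stops changing."""
    sizes = [len(states_of(initial, transitions))]
    for _ in range(max_iterations):
        initial, transitions = kbsc(initial, transitions, observation_of)
        states = states_of(initial, transitions)
        sizes.append(len(states))
        if sizes[-1] == sizes[-2]:
            break
        # The constructed game has perfect information:
        # each knowledge set is its own observation.
        observation_of = {state: state for state in states}
    return sizes


# Hypothetical toy game: states 1 and 2 look the same ("fog") to the player.
toy_transitions = {(0, "a"): {1, 2}, (1, "a"): {3}, (2, "a"): {3}, (3, "a"): {3}}
toy_observations = {0: "start", 1: "fog", 2: "fog", 3: "goal"}
sizes = iterate_until_stable(0, toy_transitions, toy_observations)
print(sizes)
```

In the toy game the second application adds nothing new, which mirrors the behaviour of the games above that stabilized after one iteration. The `max_iterations` bound plays the role of the manual cut-off used for the randomly generated game.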


Chapter 6 Conclusion

6.1 Summary

We have shown one way of converting a pursuit-evasion game to a game over a finite graph. Using the iterated MKBSC it is possible to find knowledge-based strategies for some of these games. However, while most of the games studied here stabilized, the resulting graphs tend to be too large for strategies to be feasibly extracted by hand.

6.2 Conclusions

The use of MKBSC for finding strategies for pursuit-evasion games by hand seems to be limited to very small games. Without the use of automatic strategy synthesis, finding strategies for larger games is difficult. In general the number of nodes increases drastically with iterated MKBSC. Most simple pursuit-evasion games stabilize after one iteration; however, there are examples of pursuit-evasion games that did not stabilize within the iterations we were able to compute. To be able to get any kind of useful result, it is important to use small, simplified models of real scenarios, and to use the resulting strategies as a baseline for further work.

6.3 Future research

In terms of future research the most important thing to investigate is using some sort of automatic strategy synthesis algorithm to generate strategies using MKBSC for pursuit-evasion games. This would make it possible to analyze larger games, which is more relevant for understanding the limitations of real-world applications of MKBSC.

Some optimization in terms of reducing the space and computation required for MKBSC needs to be done to make it easier to find strategies. This can be done by specifying a goal state and pruning the states that are not relevant for reaching the goal state within an optimal amount of time in some way.
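One simple form of such pruning is a distance filter: keep a state only if it lies on some path from the initial state to the goal state that is no longer than a chosen bound. The sketch below does this with one forward and one backward breadth-first search on a hypothetical toy graph. The function names are assumptions made for illustration, and the filter only considers reachability, so it is not a substitute for strategy synthesis.

```python
from collections import deque


def bfs_distances(start, edges):
    """Shortest number of moves from `start` to every reachable state."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in edges.get(state, ()):
            if nxt not in distance:
                distance[nxt] = distance[state] + 1
                queue.append(nxt)
    return distance


def prune(edges, initial, goal, max_length):
    """Keep only states on some path of at most `max_length` moves
    from `initial` to `goal`."""
    reverse = {}
    for source, targets in edges.items():
        for target in targets:
            reverse.setdefault(target, set()).add(source)
    from_initial = bfs_distances(initial, edges)
    to_goal = bfs_distances(goal, reverse)
    return {
        state
        for state in from_initial
        if state in to_goal
        and from_initial[state] + to_goal[state] <= max_length
    }


# Hypothetical toy graph: "d" starts a detour that is too long, "x" is a dead end.
edges = {
    "init": {"a", "d", "x"},
    "a": {"win"},
    "d": {"e"},
    "e": {"f"},
    "f": {"win"},
}
kept = prune(edges, "init", "win", max_length=2)
print(sorted(kept))
```

This is the same kind of reduction that was applied by hand to the octagonal game, where only states on paths of length three between the initial state and the win state were kept.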

A more compact symbolic representation of pursuit-evasion games could be useful to more efficiently compute strategies.


Bibliography

[1] D. Berwanger, Ł. Kaiser, and B. Puchala. “A perfect-information construction for coordination in games”. In: vol. 13. 2011, pp. 387–398.

isbn: 9783939897347.

[2] D. Gurov and V. Goranko. “Knowledge-based strategy synthesis for multi- agent teams playing against nature”. In: KTH Technical Report, March 2020. KTH Royal Institute of Technology. 2020.

[3] Laurent Doyen and Jean-Fran¸cois Raskin. “Games with Imperfect Infor- mation: Theory and Algorithms”. Article. LSV, ENS Cachan and CNRS, France, Universite Libre de Bruxelles (ULB), Belgium.

[4] Helmer Nylén and August Jacobsson. Investigation of a Knowledge-Based Subset Construction for Multi-Player Games of Imperfect Information.

eng ; swe. KTH, Skolan för teknikvetenskap (SCI), 2018.

[5] Game Theory Models for Pursuit Evasion Games. University of British Columbia, Department of computer science.

[6] X. Huang, P. Maupin, and R. van der Mayden. Model Checking Knowl- edge in Pursuit Evasion Games. University of New South Wales, Com- puter Science and Engineering, 2013.

[7] John Ellson et al. “Graphviz — open source graph drawing tools”. In:

Lecture Notes in Computer Science. Springer-Verlag, 2001, pp. 483–484.

[8] Rattanachai Ramaithitima et al. “Hierarchical Strategy Synthesis for Pursuit- Evasion Problems”. In: ECAI. 2016.

[9] A. Antoniades, H.J. Kim, and S. Sastry. “Pursuit-Evasion Strategies for Teams of Multiple Agents with Incomplete Information”. In: vol. 1. 2003, pp. 756–761.


Appendix A

Source code for translating pursuit-evasion games to games over a finite graph

The authors give permission to anyone that wishes to use, modify, distribute or copy the following source code to do so.

# Makes a game for the mkbsc.
# The game will be a pursuit-evasion game with two pursuers and an evader.
# The game will take place on an undirected graph, with some information on
# visibility. Two tiles can be visible from one another or not. A pursuer
# can see the evader if there is visibility between the two nodes.
# Moves are done simultaneously for all players.
#
# Usage
# A number of prompts are given:
# "Input size of graph"
#     Input the number of nodes in the game arena.
# "Input number of edges"
#     First input the number of edges you will give, and hit enter. Then, on
#     each line, input two node indices (zero-indexed).
# "Input number of edges in visibility graph"
#     Done in much the same way as in the previous prompt. Remember however:
#     visibility between tiles is NOT TRANSITIVE! (It is however symmetric.)
#         vis(a, b) /\ vis(b, c) -/> vis(a, c)
#         vis(a, b) -> vis(b, a)
# "Enter starting position of players separated by space"
#     Input the start nodes of pursuer 1, pursuer 2 and the evader.
# "Input output file"
#     Give a name to the file you want the finished game to be in.
#     The file should not exist already. Please add the .game file suffix.

# Returns the index of a state
def get_state(p1, p2, p3):
    return p1 * n_nodes**2 + p2 * n_nodes + p3

n_nodes = int(input('Input size of graph: '))

# arena contains at position x the neighbors of node x in the arena
arena = []
# vis contains at position x all places visible at node x in the arena.
vis = []
for i in range(n_nodes):
    arena.append(set())
    vis.append(set())

n_edges = int(input('Input number of edges'))
for i in range(n_edges):
    x, y = [int(x) for x in input().split(' ')]
    # Here edges in the arena are added (undirected, so both directions)
    arena[x].add(y)
    arena[y].add(x)

n_edges_vis = int(input('Input number of edges in visibility graph'))
for i in range(n_edges_vis):
    x, y = [int(x) for x in input().split(' ')]
    # Here visibility edges are added (symmetric, so both directions)
    vis[x].add(y)
    vis[y].add(x)


# The win state gets the first index after all position combinations
win_state = n_nodes**3

# One start position for each of the three players (p1, p2, evader)
startpos = [int(x) for x in input('Enter starting position of players separated by space: ').split()]

fil = open('../mkbsc/games/' + input('Input output file: '), 'w+')

# One action per target node and pursuer
fil.write("Alphabet:" + '\n')
fil.write(','.join(["'p1_" + str(i) + "'" for i in range(n_nodes)]) + '\n')
fil.write(','.join(["'p2_" + str(i) + "'" for i in range(n_nodes)]) + '\n')
fil.write('' + '\n')

# Make states: one per combination of positions, plus the win state
fil.write('Base States:' + '\n')
for p1 in range(n_nodes):
    for p2 in range(n_nodes):
        for p3 in range(n_nodes):
            index = str(p1) + 'X' + str(p2) + 'x' + str(p3)
            fil.write(str(get_state(p1, p2, p3)) + "='" + index + "'" + '\n')
fil.write(str(win_state) + "='win'" + '\n')
fil.write('' + '\n')

fil.write('Knowledge States:' + '\n')
fil.write('' + '\n')

fil.write('Initial State: ' + str(get_state(startpos[0], startpos[1], startpos[2])) + '\n')
fil.write('' + '\n')

fil.write('Observations:' + '\n')

# Knowledge states
# NOTE it is assumed that the pursuers know each others' location
def write_observations(bool_true_if_player_1_false_if_player_2):
    obs = ''
    # Observations where the position of the evader is known
    unique_observations = []
    for p1 in range(n_nodes):
        for p2 in range(n_nodes):
            # States with these pursuer positions where the evader is unseen;
            # they are indistinguishable to the observing player
            st = []
            for p3 in range(n_nodes):
                if bool_true_if_player_1_false_if_player_2:
                    observing_player = p1
                else:
                    observing_player = p2
                if p3 in vis[observing_player]:
                    unique_observations.append(str(get_state(p1, p2, p3)))
                    continue
                st.append(str(get_state(p1, p2, p3)))
            if len(st) != 0:
                obs += ','.join(st) + '|'
    if len(unique_observations) != 0:
        obs += '|'.join(unique_observations)
        obs += '|' + str(win_state)
    else:
        obs += str(win_state)
    fil.write(obs + '\n')

write_observations(True)
write_observations(False)
fil.write('' + '\n')

fil.write('Transitions:' + '\n')
for p1 in range(n_nodes):
    for p2 in range(n_nodes):
        for p3 in range(n_nodes):
            currstate = str(get_state(p1, p2, p3))
            # Victory: a pursuer shares a node with the evader, so every
            # joint action leads to the win state
            if p1 == p3 or p2 == p3:
                for i in range(n_nodes):
                    for u in range(n_nodes):
                        p1a = i
                        p2a = u
                        fil.write(currstate + ' ' + str(p1a) + ',' + str(p2a + n_nodes) + ' ' + str(win_state) + '\n')
                continue
            # Otherwise every player moves to a neighboring node
            for p1a in arena[p1]:
                for p2a in arena[p2]:
                    for p3a in arena[p3]:
                        fil.write(currstate + ' ' + str(p1a) + ',' + str(p2a + n_nodes) + ' ' + str(get_state(p1a, p2a, p3a)) + '\n')

# Make win state trap players
for i in range(n_nodes):
    for u in range(n_nodes):
        p1a = i
        p2a = u
        fil.write(str(win_state) + ' ' + str(p1a) + ',' + str(p2a + n_nodes) + ' ' + str(win_state) + '\n')
fil.write('' + '\n')

# Layout attributes for drawing the game graph
fil.write('Attributes: {"nodesep": 0.5, "concentrate": false, "splines": "True", "ranksep": 0.5}' + '\n')
fil.close()
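When reading the generated game file it helps to translate state indices back into player positions. The sketch below inverts the arithmetic of `get_state`; the helper names `encode_state` and `decode_state` are hypothetical and are not part of the script above.

```python
def encode_state(p1, p2, e, n_nodes):
    """Same arithmetic as get_state in the script above."""
    return p1 * n_nodes**2 + p2 * n_nodes + e


def decode_state(index, n_nodes):
    """Inverse of encode_state; returns 'win' for the extra win state."""
    if index == n_nodes**3:
        return "win"
    rest, e = divmod(index, n_nodes)
    p1, p2 = divmod(rest, n_nodes)
    return p1, p2, e


# Initial state of the octagonal game: pursuers at 0 and 2, evader at 4.
start = encode_state(0, 2, 4, n_nodes=8)
print(start, decode_state(start, n_nodes=8))
```

The index is simply the three positions read as a base-`n_nodes` number, which is why the win state can safely use the first index after the largest three-digit value.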


TRITA TRITA-EECS-EX-2020:393

www.kth.se
