A Game Theoretic Approach to Multi-Agent Cooperation with Application to Economic Systems


Toshihiro Suzuki

Master of Science Thesis


Master's Thesis in Optimization and Systems Theory (30 ECTS credits). Degree Programme in Engineering Physics (120 credits). Royal Institute of Technology, 2014.

Supervisor at KTH: Xiaoming Hu. Examiner: Xiaoming Hu.

TRITA-MAT-E 2014:23 ISRN-KTH/MAT/E--14/23--SE

Royal Institute of Technology School of Engineering Sciences KTH SCI SE-100 44 Stockholm, Sweden URL: www.kth.se/sci


Abstract

Consensus problems in multi-agent systems have attracted researchers in various areas. The difficulty typically arises when the information available to each agent is too limited to reach consensus easily. Moreover, an agent cannot always observe the full state of the others; in applications, an output is often the only measurement available to each agent. The graph Laplacian is a useful tool for handling this situation.

While every agent makes decisions to achieve its individual goal of minimizing its own cost functional, the agents as a team can in some cases improve further by cooperating, which leads to a cooperative game theoretic approach. The main goal of this master's thesis is to combine optimal control theory and cooperative game theory in order to solve the output consensus problem with limited network connectivity. This combination makes it necessary to consider bargaining problems as well.


1 Introduction

Recently there has been considerable interest in multi-agent networked systems, in which agents interact autonomously with each other to perform tasks. Applications include unmanned autonomous vehicles and aircraft [7] [8], wireless sensor networks, mobile robotics, smart grids [22], and economics [23].

In this thesis we consider, as an example, a monetary and fiscal stabilization policy problem in European countries [20]. European countries play important roles in the world economy, and many of them share a common currency, the euro. Following the successful example of the European Economic and Monetary Union (EMU), other regions such as Southeast Asia [21] are also attempting a common currency. How much a country benefits from joining a monetary union and sharing a common currency is therefore a serious issue, and it motivates the analysis of the consensus problem.

Consensus problems have a long history, and consensus problems for multi-agent systems have recently received considerable attention. The word "consensus" means that an agent (player) reaches agreement with its neighbors on the network. The neighbors of an agent are the members of its neighboring set, with which it can communicate and exchange information. It is reasonable to take this limited availability of information into account, since it reflects the real world. To describe the links of such a network mathematically, the graph Laplacian matrix plays an important role and contributes significantly to the convergence analysis of consensus.

Each agent acts on information about its neighbors, and every decision in turn affects the other agents in the neighboring set. How the agents make decisions is therefore important, and in this thesis we focus on the cooperative game, in which all agents participate willingly and each agent can benefit from cooperation. As a result, it is possible to reach a group goal while simultaneously achieving the individual goal of each agent. Multi-agent systems with cooperative agents have been analyzed in a number of papers [5] [9]. In contrast, the analysis becomes more complicated if agents make decisions in a distributed (decentralized) manner, i.e., in a non-cooperative game [3] [4].

Pareto optimality is the key concept in consensus seeking under a cooperative game. A cooperative outcome is one for which there exists no alternative strategy that improves all the agents at once, which is exactly the property of Pareto optimality. Introducing weighting coefficients for the cost functions makes it possible both to construct the cooperative game framework and to produce the Pareto frontier. The next step is to select a unique Pareto optimal solution, which is where the bargaining problem comes in. We consider two bargaining solutions: the Nash Bargaining Solution (NBS) and the Raiffa Bargaining Solution (RBS) [12] [13]. Both guarantee Pareto optimality. The comparison of these two solutions has recently attracted attention in a variety of areas such as resource allocation [14].

As described above, each agent makes decisions according to a certain goal. The LQ (linear system, quadratic cost) control problem is one setting in which the goal can be defined as consensus in a cooperative game [1]. The LQ control problem is an important class in optimal control, both in theory and in applications. It is solved via the algebraic Riccati equation (ARE), which is probably the simplest quadratic matrix equation: the existence of a solution to the corresponding ARE is both necessary and sufficient for the optimal control of the LQ problem, and the optimal control is computed from that solution and all the states. However, it is not always possible to measure all the states in applications, and it is therefore desirable to design the control based on the outputs of the system.

A theoretical framework for solving the consensus problem with local output feedback control and without any cost function has been proposed in [7]. In the absence of a cost function, conditions on output feedback have been studied for the consensus problem of multi-agent systems [10] [15]. Nevertheless, how to design the local output feedback control in a practical and concrete way is still an open question. For the LQ control problem, on the other hand, it is well known that the existence of an output feedback must in general be verified numerically, and output feedback control has been analyzed in [19].

This thesis is organized as follows. Section 2 introduces the mathematical preliminaries used throughout the thesis. Section 3 formulates the non-cooperative game theoretic approach for comparison. Section 4 develops the cooperative game theoretic approach, covering Pareto optimality, the output feedback control law, and bargaining solutions. Section 5 gives background on the EMU, which serves as the economic case study. Section 6 presents simulation results for both bargaining solutions, the Nash Bargaining Solution (NBS) and the Raiffa Bargaining Solution (RBS). Finally, Section 7 gives the conclusion and future work.


2 Problem Formulation

2.1 Graph Laplacian

The interaction topology of a network is represented by a graph $G = (V, E)$, where $V$ and $E$ are the sets of nodes and edges, respectively. The nodes correspond to the agents, and the edges denote the links of the communication network. With these definitions, the neighboring set of agent $i$ is $N_i = \{ j \in V \mid (i, j) \in E \}$.

In this thesis, the graph is assumed to be both undirected and unweighted. The graph Laplacian matrix that describes the graph is then defined as

$$
L(i, j) =
\begin{cases}
|N_i|, & \text{for } j = i \\
-1, & \text{for } j \in N_i \\
0, & \text{otherwise}
\end{cases}
\tag{2.1}
$$

where $|N_i|$ is the number of neighbors of agent $i$. For example, for four agents that communicate over the undirected graph shown in Fig. 2.1, the corresponding graph Laplacian is

$$
L =
\begin{bmatrix}
2 & -1 & -1 & 0 \\
-1 & 2 & -1 & 0 \\
-1 & -1 & 3 & -1 \\
0 & 0 & -1 & 1
\end{bmatrix}
\tag{2.2}
$$

Figure 2.1: Example of undirected graph.


Note that for undirected graphs $L$ is a symmetric matrix, so its eigenvalues are real and its eigenvectors can be chosen to be mutually orthogonal. We assume throughout this thesis that $|N_i| \neq 0$ for all $i$ and that the graph is strongly connected, which means that there exists at least one path between any two nodes in $V$.

Assumption 1. The graph is assumed to be undirected, unweighted, and strongly connected.

We note two important properties of the graph Laplacian matrix $L$. First, by definition every row sum of $L$ is zero, so $L$ always has the eigenvalue $\lambda_1 = 0$ with corresponding eigenvector $\mathbf{1} = [1, 1, \cdots, 1]^T$, i.e., $L\mathbf{1} = 0$. Second, the second smallest eigenvalue of the graph Laplacian is called the algebraic connectivity and measures the speed of convergence of consensus algorithms.
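As a minimal sketch (not part of the original derivation), the definition (2.1) can be checked on the four-agent example. The edge list below is an assumption read off from the matrix (2.2), and the helper name `graph_laplacian` is hypothetical.

```python
def graph_laplacian(num_nodes, edges):
    """Return the Laplacian (2.1) of an undirected, unweighted graph.

    `edges` holds 1-based node pairs (i, j); each pair links both directions.
    """
    lap = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        a, b = i - 1, j - 1
        lap[a][b] = lap[b][a] = -1      # off-diagonal entry for neighbours
    for i in range(num_nodes):
        lap[i][i] = -sum(lap[i])        # degree |N_i| on the diagonal
    return lap


# Edge list inferred from the matrix (2.2); an assumption about Fig. 2.1.
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4)]
L = graph_laplacian(4, EDGES)
row_sums = [sum(row) for row in L]      # all zero, i.e. L applied to the ones vector
```

The zero row sums are the first of the two properties stated above.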

2.2 Multi-Agent Systems

Let us consider a group of $N$ agents whose dynamics are given, for $i = 1, \cdots, N$, by

$$
\begin{aligned}
\dot{x}_i &= A x_i + B u_i \\
y_i &= C x_i \\
z_{ij} &= C(x_i - x_j), \quad j \in N_i
\end{aligned}
\tag{2.3}
$$

where $x_i \in \mathbb{R}^n$ is the state, $y_i \in \mathbb{R}^m$ is the internal measurement of the agent's own state, $z_{ij} \in \mathbb{R}^m$ is the external (relative) measurement between the agent and its neighbor $j$, and $u_i \in \mathbb{R}^r$ is the control input. Since a single agent cannot drive all the $z_{ij}$ to zero simultaneously, we combine the relative measurements into a single term:

$$
z_i = \sum_{j \in N_i} z_{ij} = |N_i| y_i - \sum_{j \in N_i} y_j
\tag{2.4}
$$

The combined measurement (2.4) is what each agent actually receives as its relative output. Note that (2.4) weights all the relative measurements equally; the choice of weights, however, does not affect the result as long as the sum of the weights is $|N_i|$.
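The two forms of (2.4) can be compared numerically. The following sketch assumes scalar outputs, hypothetical output values, and the neighbor sets of the four-agent example (0-based indices); none of these numbers come from the thesis.

```python
def relative_output(i, neighbours, y):
    """Aggregate relative measurement z_i of (2.4) for scalar outputs."""
    n_i = neighbours[i]
    return len(n_i) * y[i] - sum(y[j] for j in n_i)


# Neighbour sets of the four-agent example, with 0-based indices (assumption).
NEIGHBOURS = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
y = [1.0, 2.0, 4.0, 8.0]                       # hypothetical outputs

z = [relative_output(i, NEIGHBOURS, y) for i in range(4)]
# Same quantity written as the sum of pairwise differences z_ij.
z_pairwise = [sum(y[i] - y[j] for j in NEIGHBOURS[i]) for i in range(4)]
```

For scalar outputs the stacked vector of all $z_i$ is the Laplacian applied to the stacked outputs, so its entries sum to zero on an undirected graph.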

The control input is split into two terms, a local and a global control law, where the local control law is built from the outputs of the neighbors:

$$
u_i = u_i^l + u_i^g
\tag{2.5}
$$

For the cooperative game theory discussed later, this decomposition of the control law is in fact necessary to build the team system from the individual systems; without it, the cooperative game is difficult to formulate. The two control laws are explained in detail in later sections.

Next, we consider that under the above linear dynamical system (2.3) each agent owns the following cost functional:

$$
J_i = \int_0^T \Big[ \sum_{j \in N_i} (y_i - y_j)^T Q_i (y_i - y_j) + (u_i^g)^T R_i u_i^g \Big] \, dt
\tag{2.6}
$$

where $Q_i \in \mathbb{R}^{m \times m}$ and $R_i \in \mathbb{R}^{r \times r}$ are symmetric and positive definite matrices. The cost functional (2.6) indicates that the objective of each agent is to reach consensus on $y_i$ within its neighboring set.

2.3 Cooperative Game Theory

In this subsection, a general description of cooperative game theory is given for a team of $N$ agents. In a cooperative game, the state of the entire team is governed by the system

$$
\dot{x} = A^c x + \sum_{i=1}^{N} B_i^c u_i
\tag{2.7}
$$

As seen in (2.7), a cooperative game requires that the control law of each agent affects not only its own dynamics but also those of its neighbors. As mentioned in the previous subsection, the decomposition of the control law makes this cooperative setting possible: we can define the team system by assigning suitable roles to the local and global control laws.

The cost functional of each agent is likewise defined in terms of the state of the entire team:

$$
J_i^c = \int_0^T \big[ x^T Q_i^c x + u_i^T R_i^c u_i \big] \, dt
\tag{2.8}
$$

From (2.7) and (2.8), we derive a general formulation of cooperative game theory.

Definition 1. In a cooperative game, each agent $i$ solves the following LQ control problem:

$$
\min_{u_i} \; J_i^c = \int_0^T \big[ x^T Q_i^c x + u_i^T R_i^c u_i \big] \, dt
\quad \text{s.t.} \quad
\dot{x} = A^c x + \sum_{j=1}^{N} B_j^c u_j
\tag{2.9}
$$

The significant advantage of this formulation is convexity, which plays an important role in cooperative game theory, as noted later. Note that this is only the general formulation of cooperative game theory and that the vectors and matrices in (2.7) and (2.8) are generic; they are not used in the rest of this thesis. What needs to be done next is to realize the cooperative situation (2.9) using the individual systems (2.3) and costs (2.6): as shown in (2.9), all the states can affect both each individual dynamical system and each cost functional, whereas the control inputs of all agents enter each individual system but only an agent's own input enters its cost functional.


3 Non-Cooperative Game

Before turning to the cooperative game theoretic approach, we consider the non-cooperative case in this section: each agent focuses only on minimizing its own cost functional, subject to the possible influence of the other agents. No rule forces the agents to reach consensus, and every agent has information about its neighbors. In the non-cooperative game, the local information available to each agent enters through the local control law

$$
u_i^l = \sum_{j \in N_i} F_i y_j
\tag{3.1}
$$

Each local control law acts to minimize the agent's own cost functional, and the global control law is likewise chosen to minimize it. In particular, the global control law works in the same way as in the cooperative game theoretic approach described later.

Two things are worth noting. First, we assume a restricted class of systems, since the non-cooperative approach is used only for comparison in this thesis. Second, the optimal cost values obtained in the non-cooperative manner are needed in the cooperative game theoretic approach. In contrast, the cooperative game theoretic approach considered in this thesis, as described later, works for arbitrary linear agent dynamics.

The restriction is that the matrix $B$ in (2.3) is the identity matrix. Under this assumption on $B$, the optimal control law can be obtained in a non-cooperative (decentralized) way. Note that the following theorem is an extension of Lemma 5 in [1].

Assumption 2. In the non-cooperative game, the matrix $B$ in (2.3) is assumed to be the identity matrix.

Theorem 3.1. Consider the system (2.3) and the cost functional (2.6). If Assumption 2 holds, then the non-cooperative optimal control law is

$$
u_i^* = 2 S_i^{-1} C^T Q_i (|N_i| y_i - z_i) - \frac{1}{2} R_i^{-1} S_i x_i
\tag{3.2}
$$

where $S_i$ satisfies the following differential Riccati equation (DRE):

$$
-\dot{S}_i = 2 |N_i| C^T Q_i C - \frac{1}{2} S_i R_i^{-1} S_i + (A^T S_i + S_i A),
\qquad S_i(T) = 0
\tag{3.3}
$$

Proof. Consider the following value functional:

$$
V_i(x_i, t) = \frac{1}{2} x_i^T S_i(t) x_i + \gamma_i(t)
\tag{3.4}
$$

where $S_i$ and $\gamma_i$ are time-varying parameters. The corresponding Hamilton-Jacobi-Bellman equation (HJBE) of this system is

$$
-\frac{\partial V_i}{\partial t}
= \sum_{j \in N_i} (y_i - y_j)^T Q_i (y_i - y_j)
+ (u_i^{g*})^T R_i u_i^{g*}
+ \frac{\partial V_i}{\partial x_i} \Big( A x_i + \sum_{j \in N_i} F_i y_j + u_i^{g*} \Big)
\tag{3.5}
$$

The left-hand side of (3.5) is

$$
-\frac{1}{2} x_i^T \dot{S}_i x_i - \dot{\gamma}_i
\tag{3.6}
$$

and the right-hand side is

$$
\begin{aligned}
& x_i^T \Big[ |N_i| C^T Q_i C - \frac{1}{4} S_i R_i^{-1} S_i + \frac{1}{2} (A^T S_i + S_i A) \Big] x_i \\
& + x_i^T \Big[ -2 C^T Q_i \sum_{j \in N_i} C x_j + 2 S_i \sum_{j \in N_i} F_i y_j \Big] \\
& + (u_i^{g*})^T R_i u_i^{g*} + 2 x_i^T S_i u_i^{g*} + \sum_{j \in N_i} x_j^T C^T Q_i C x_j.
\end{aligned}
\tag{3.7}
$$

Comparing both sides, we obtain the following equations in terms of $u_i^{g*}$, $x_i^T x_i$, and $x_i^T$, respectively:

$$
\begin{aligned}
0 &= (u_i^{g*})^T R_i u_i^{g*} + 2 x_i^T S_i u_i^{g*} \\
-\frac{1}{2} \dot{S}_i &= |N_i| C^T Q_i C - \frac{1}{4} S_i R_i^{-1} S_i + \frac{1}{2} (A^T S_i + S_i A) \\
0 &= -2 C^T Q_i \sum_{j \in N_i} C x_j + 2 S_i \sum_{j \in N_i} F_i y_j,
\end{aligned}
\tag{3.8}
$$

which lead exactly to the DRE (3.3) and the following optimality conditions:

$$
F_i = 2 S_i(t)^{-1} C^T Q_i, \qquad
u_i^{g*} = -\frac{1}{2} R_i^{-1} S_i(t) x_i
\tag{3.9}
$$

This yields the optimal control law (3.2). Note that all the remaining terms can be collected in $\gamma_i$, namely

$$
-\dot{\gamma}_i = \sum_{j \in N_i} x_j^T C^T Q_i C x_j.
\tag{3.10}
$$
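To illustrate how the DRE (3.3) behaves, the following sketch integrates it backwards from $S_i(T) = 0$ for scalar data. The scalar parameters, the RK4 integrator, and the function names are assumptions chosen for illustration, not the numerical method used in the thesis.

```python
import math


def dre_rhs(s, a, c, q, r, n_i):
    """Right-hand side of (3.3) for scalar data; it equals -dS/dt."""
    return 2 * n_i * c * c * q - s * s / (2 * r) + 2 * a * s


def solve_dre(horizon, a, c, q, r, n_i, steps=2000):
    """Integrate (3.3) from S(T) = 0 back to t = 0 with classical RK4."""
    h = horizon / steps
    s = 0.0                                   # terminal condition S(T) = 0
    for _ in range(steps):
        k1 = dre_rhs(s, a, c, q, r, n_i)
        k2 = dre_rhs(s + 0.5 * h * k1, a, c, q, r, n_i)
        k3 = dre_rhs(s + 0.5 * h * k2, a, c, q, r, n_i)
        k4 = dre_rhs(s + h * k3, a, c, q, r, n_i)
        s += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return s


# Hypothetical scalar agent with two neighbours.
a, c, q, r, n_i = -1.0, 1.0, 1.0, 1.0, 2
s0 = solve_dre(10.0, a, c, q, r, n_i)
# Positive root of the stationary equation obtained by setting dS/dt = 0.
s_inf = 2 * a * r + 2 * math.sqrt(a * a * r * r + r * n_i * c * c * q)
```

For a long horizon the value at $t = 0$ approaches the stationary root, which is the quantity used in the infinite horizon case.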


Building on this theorem, we also consider the infinite horizon case, which is used in the simulation section later.

Corollary 3.2. Consider the same system, cost functional, and assumption as in Theorem 3.1. Then in the infinite horizon case, i.e., $T \to \infty$, the non-cooperative optimal control law (3.2) can be written as

$$
u_i^* = 2 S_i^{-1} C^T Q_i (|N_i| y_i - z_i) - 2 (|N_i| S_i^{-1} C^T Q_i C + A) x_i
\tag{3.11}
$$

Proof. Since $S_i A = \frac{1}{2} (A^T S_i + S_i A)$ and the infinite horizon case implies $\dot{S}_i = 0$, the DRE (3.3) becomes an algebraic equation, which after left-multiplication by $S_i^{-1}$ gives

$$
-\frac{1}{2} R_i^{-1} S_i
= -2 S_i^{-1} \Big[ |N_i| C^T Q_i C + \frac{1}{2} (A^T S_i + S_i A) \Big]
= -2 (|N_i| S_i^{-1} C^T Q_i C + A).
\tag{3.12}
$$

Substituting this into (3.2) gives the desired result.
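A scalar example makes the equivalence of the state gains in (3.2) and (3.11) concrete. The numbers below are hypothetical; in the scalar case the symmetry condition used in the proof holds automatically.

```python
import math


def stationary_s(a, c, q, r, n_i):
    """Positive root of 0 = 2|N_i| c^2 q - s^2 / (2 r) + 2 a s (scalar form of (3.3))."""
    return 2 * a * r + 2 * math.sqrt(a * a * r * r + r * n_i * c * c * q)


# Hypothetical scalar agent with two neighbours.
a, c, q, r, n_i = -1.0, 1.0, 1.0, 1.0, 2
s = stationary_s(a, c, q, r, n_i)

gain_finite_form = -s / (2 * r)                       # state gain in (3.2)
gain_infinite_form = -2 * (n_i * c * c * q / s + a)   # state gain in (3.11)
```

Both expressions give the same feedback gain on $x_i$, as the corollary states.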

It is worth emphasizing that the non-cooperative approach formulated in Theorem 3.1 and Corollary 3.2 is more general than the simple double integrator model introduced in [1]. In [1] the individual dynamics are driven only by the control input, without a matrix $A$, and the states coincide with the outputs, which implies that all the agents can measure the states of their neighbors completely. In contrast, Assumption 2 only requires the matrix $B$ to be the identity.


4 Cooperative Game

4.1 Pareto Optimality

Cooperation can be an efficient way to minimize the cost functionals, since each agent, as seen in (2.5), can affect the systems of its neighbors through its output. For example, consider a team of $N$ agents that each try to minimize their own cost function. The individual decisions together form a team strategy under limited information, meaning that each agent only knows its neighbors' information. If, among the set of strategies, there is one that decreases the cost functionals of all the agents in the team, then the agents will select it: no agent has a reason to keep the old strategy when the new one is better for everyone.

However, the decision becomes difficult when no improvement is possible without sacrificing some of the agents. In other words, if there exists no alternative strategy that improves all the cost functionals simultaneously, then the choice of a team strategy is no longer easy. This is precisely the property of Pareto optimality.

Definition 2. Let $u = [u_1, \cdots, u_N]^T$ be a set of strategies and let $J_i$ be the cost functional of agent $i$, determined by all the strategies. Then $u^*$ is called Pareto optimal if there exists no set of strategies $u$ such that

$$
J_i(u) \le J_i(u^*) \quad \forall i
\tag{4.1}
$$

with at least one strict inequality.
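Definition 2 can be applied directly to a finite list of candidate strategies. The sketch below uses hypothetical cost vectors $(J_1, J_2)$ and hypothetical helper names; it only illustrates the dominance test, not the continuous problem treated in the thesis.

```python
def dominates(costs_a, costs_b):
    """True if a is no worse than b for every agent and strictly better for one."""
    no_worse = all(x <= y for x, y in zip(costs_a, costs_b))
    strictly_better = any(x < y for x, y in zip(costs_a, costs_b))
    return no_worse and strictly_better


def pareto_optimal(candidates):
    """Return the cost vectors that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical cost vectors (J_1, J_2) of five team strategies.
costs = [(4.0, 1.0), (3.0, 2.0), (3.5, 2.5), (1.0, 4.0), (4.0, 4.0)]
front = pareto_optimal(costs)
```

Three of the five candidates survive, which shows why a further selection criterion is needed.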

In general, a Pareto optimal strategy is not unique, and a criterion for choosing among them is therefore needed. Before deriving this criterion, we first consider the LQ control problem (2.9) and compute the set of Pareto optimal solutions, i.e., the Pareto frontier. Because the LQ control problem is convex, it provides a candidate Pareto optimal solution for any suitable criterion. Its convexity also helps in deciding how to select a strategy from the point of view of cooperative game theory. In this way, the cooperative game theoretic approach compensates for the lack of uniqueness of Pareto optimal solutions.

4.2 Control Decomposition

We now turn to how to realize the cooperative game (2.9) using the individual systems (2.3) and cost functionals (2.6). First, recall the roles of the local and global control laws in (2.5): in the non-cooperative case, both parts of the control law work to minimize the individual cost functional. In the cooperative game, by contrast, the two parts play different roles:

Definition 3. In a cooperative game, the control law of each agent is decomposed into a local and a global part with the following roles:

• The local control $u_i^l$ is designed so that the dynamical system of the team is stabilizable.

• The global control $u_i^g$ minimizes the individual cost functional.

Note that in the cooperative case, the presence of the local control law makes it possible to consider only the global control law when computing the cost functional. For ease of exposition, we first discuss the global control law and then introduce the local control law.

4.2.1 Global Control Law

The objective of the global control law is not only to minimize the agent's own cost functional but also to provide the information needed to choose a strategy. These are exactly the properties of Pareto optimality. As stated in [1], the solutions to (2.9) are given by the following set of strategies:

$$
u^{g*}(\alpha) = \arg\min_{u^g} \sum_{i=1}^{N} \alpha_i J_i(u^g)
\tag{4.2}
$$

where $\alpha = [\alpha_1, \cdots, \alpha_N] \in \Delta(N)$ and $\Delta(N)$ denotes the simplex in $\mathbb{R}^N$, i.e., $\alpha_i \ge 0$ for all $i$ and $\sum_{i=1}^{N} \alpha_i = 1$. Since the optimal global control law depends on $\alpha$, it is not unique; the solutions for all $\alpha$ form the Pareto frontier, which is the boundary of the set of attainable costs.
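The weighted-sum construction (4.2) can be illustrated on a toy problem with one scalar decision and two agents. The quadratic costs below are hypothetical and stand in for the LQ costs of the thesis; for them the minimizer of the weighted sum is available in closed form.

```python
def costs(u):
    """Hypothetical costs: agent 1 prefers u = 1, agent 2 prefers u = -1."""
    return ((u - 1.0) ** 2, (u + 1.0) ** 2)


def weighted_sum_minimizer(alpha):
    """Minimizer of alpha * J_1(u) + (1 - alpha) * J_2(u), from the first-order condition."""
    return 2.0 * alpha - 1.0


# Sweep the weight over the simplex {(alpha, 1 - alpha)} to trace the frontier.
alphas = [k / 10 for k in range(11)]
frontier = [costs(weighted_sum_minimizer(a)) for a in alphas]
```

Along the sweep the first cost falls while the second rises, so no point on the list improves both agents relative to another.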

We define $\sum_{i=1}^{N} \alpha_i J_i(u^g)$ as the team cost functional. The global control law minimizes the team cost functional, and since the team cost functional is a linear combination of the individual cost functionals, the resulting individual costs are Pareto optimal for the given $\alpha$. Written as a single LQ control problem, the team cost functional with coefficient $\alpha = [\alpha_1, \cdots, \alpha_N]$ is

$$
J^t(\alpha) = \sum_{i=1}^{N} \alpha_i J_i(u^g)
= \int_0^T \big\{ y^T Q y + (u^g)^T R u^g \big\} \, dt
\tag{4.3}
$$

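where $R \in \mathbb{R}^{Nr \times Nr}$ and $Q \in \mathbb{R}^{Nm \times Nm}$ are defined as follows:

$$
R = \mathrm{diag}\{ \alpha_1 R_1, \cdots, \alpha_N R_N \}
$$

$$
Q(i, k) =
\begin{cases}
\alpha_i |N_i| Q_i + \sum_{j \in N_i} \alpha_j Q_j, & \text{for } k = i \\
-\alpha_i Q_i - \alpha_k Q_k, & \text{for } k \in N_i \\
0, & \text{otherwise}
\end{cases}
\tag{4.4}
$$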
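The block definition (4.4) can be checked against the weighted sum of the individual disagreement costs. The sketch assumes scalar outputs (m = 1), the neighbor sets of the four-agent example, and hypothetical weights and outputs; the function names are illustrative.

```python
# Neighbour sets of the four-agent example, 0-based indices (assumption).
NEIGHBOURS = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}


def team_q(alpha, q, neighbours):
    """Assemble Q of (4.4) for scalar outputs (m = 1)."""
    n = len(alpha)
    big_q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        big_q[i][i] = alpha[i] * len(neighbours[i]) * q[i] + sum(
            alpha[j] * q[j] for j in neighbours[i])
        for k in neighbours[i]:
            big_q[i][k] = -alpha[i] * q[i] - alpha[k] * q[k]
    return big_q


def weighted_disagreement(alpha, q, neighbours, y):
    """sum_i alpha_i sum_{j in N_i} q_i (y_i - y_j)^2, the output part of (4.3)."""
    return sum(alpha[i] * q[i] * (y[i] - y[j]) ** 2
               for i in neighbours for j in neighbours[i])


alpha = [0.1, 0.2, 0.3, 0.4]        # hypothetical weights on the simplex
q = [1.0, 2.0, 3.0, 4.0]            # hypothetical scalar Q_i
y = [1.0, -2.0, 0.5, 3.0]           # hypothetical outputs

Q = team_q(alpha, q, NEIGHBOURS)
quad_form = sum(y[i] * Q[i][k] * y[k] for i in range(4) for k in range(4))
direct = weighted_disagreement(alpha, q, NEIGHBOURS, y)
```

The quadratic form in the stacked output reproduces the weighted sum of the individual costs, which is the identity behind (4.3).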

These definitions give the correspondence of notation between the general description (2.9) and our cooperative game theoretic approach, as listed in Table 4.1. Note that the weighting effect of the coefficient $\alpha$ is the next focus for improvement.

General formulation (2.9)    Cooperative approach
$J_i^c$                      $\alpha_i J_i$
$u_i$                        $u_i^g$
$Q_i^c$                      $C^T \big( \sum_k Q(i, k) \big) C$
$R_i^c$                      $\alpha_i R_i$

Table 4.1: Correspondence relationship of notations.

4.2.2 Local Control Law

The next step might seem to be to determine an optimal value of $\alpha$ and select a unique solution on the Pareto frontier. However, beyond the team cost functional (4.3) we must also take the dynamical system of the team into account, and this is where the local control law comes in.

In this section the local control law is not used to minimize the cost functional (2.6) but to define the team system for cooperation. To realize the cooperative game theoretic approach (2.9), we propose the following output control law for each agent:

$$
u_i^l = K z_i = K \sum_{j \in N_i} C (x_i - x_j)
\tag{4.5}
$$

With this local control law, the dynamics of the team of $N$ agents can be written using the Kronecker product:

$$
\begin{aligned}
\dot{x} &= (I_N \otimes A) x + (I_N \otimes BK)(L \otimes C) x \\
&= (I_N \otimes A + L \otimes BKC) x
\end{aligned}
\tag{4.6}
$$
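The two lines of (4.6) can be compared numerically. The sketch below uses small integer matrices for a hypothetical two-state agent and the example Laplacian (2.2); the helper functions are written out so that the block needs no external library.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def matmul(a, b):
    """Matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def add(a, b):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


LAPLACIAN = [[2, -1, -1, 0], [-1, 2, -1, 0], [-1, -1, 3, -1], [0, 0, -1, 1]]
# Hypothetical agent data: two states, one input, one output.
A = [[0, 1], [-2, -3]]
B = [[0], [1]]
K = [[-1]]
C = [[1, 0]]

BK = matmul(B, K)
BKC = matmul(BK, C)
I4 = identity(4)

first_line = add(kron(I4, A), matmul(kron(I4, BK), kron(LAPLACIAN, C)))
team = add(kron(I4, A), kron(LAPLACIAN, BKC))      # second line of (4.6)
```

Each diagonal block of `team` is $A + |N_i| BKC$, and each off-diagonal block for a neighbor is $-BKC$.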


In [1], the idea of an interaction matrix was proposed to realize the game theoretic approach: the dynamical system of the team needs to include the impact of the neighbors on each agent. However, [1] gives no particular algorithm for determining the interaction matrix, which is also used to couple the agents' dynamics. In this thesis we suggest a general framework based on the stability result stated in [7], focusing on the eigenvalues of the Laplacian matrix (2.1).

Theorem 4.1. The local control law (4.5) stabilizes the team system (4.6) if and only if it simultaneously stabilizes the following system for every $i = 1, \cdots, N$:

$$
\begin{aligned}
\dot{x}_i &= A x_i + B u_i \\
y_i &= C x_i \\
z_i &= \lambda_i C x_i
\end{aligned}
\tag{4.7}
$$

where $\lambda_i$ is the $i$-th eigenvalue of the graph Laplacian matrix (2.1).

Proof. First, consider the orthogonal transformation

$$
\tilde{x} = (T \otimes I_n) x
\tag{4.8}
$$

where $T L T^T = D = \mathrm{diag}\{ \lambda_1, \cdots, \lambda_N \}$. Under the transformation (4.8), the team dynamics (4.6) become

$$
\begin{aligned}
(T \otimes I_n)(T^T \otimes I_n) \dot{\tilde{x}} &= (T \otimes I_n)(I_N \otimes A + L \otimes BKC)(T^T \otimes I_n) \tilde{x} \\
\dot{\tilde{x}} &= (I_N \otimes A + D \otimes BKC) \tilde{x}
\end{aligned}
\tag{4.9}
$$

Taking the diagonal blocks of (4.9), the individual subsystems are

$$
\dot{\tilde{x}}_i = (A + \lambda_i BKC) \tilde{x}_i
\tag{4.10}
$$

where $\lambda_i$ is an eigenvalue of the Laplacian matrix (2.1). Note that the stability of the transformed system (4.9) is equivalent to the stability of the $N$ subsystems, since (4.9) is block-diagonal by construction.

The motivation behind this approach is the following: the team system must be stabilizable in order to solve the algebraic Riccati equation (ARE), i.e., all unstable modes must be controllable. The stability result of [7] is powerful in the sense that the stability of the entire team system can be shown by analyzing a single dynamical system.
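The mode-by-mode test behind Theorem 4.1 can be sketched for a small example. The two-state agent, the scalar gain `k`, and the helper names are assumptions for illustration; the values 0, 1, 3, 4 are the eigenvalues of the example Laplacian (2.2).

```python
def is_hurwitz_2x2(m):
    """A real 2x2 matrix is Hurwitz iff its trace is negative and its determinant positive."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return trace < 0 and det > 0


def subsystem(a, bkc, lam):
    """Closed-loop matrix A + lambda_i * BKC of the i-th mode (4.10)."""
    return [[a[r][c] + lam * bkc[r][c] for c in range(2)] for r in range(2)]


A = [[0.0, 1.0], [-2.0, -3.0]]            # hypothetical stable agent matrix
LAPLACIAN_EIGS = [0.0, 1.0, 3.0, 4.0]     # spectrum of the example Laplacian (2.2)


def stabilizes_team(k):
    """Check every mode for B = [0, 1]^T, C = [1, 0] and the scalar gain k."""
    bkc = [[0.0, 0.0], [k, 0.0]]
    return all(is_hurwitz_2x2(subsystem(A, bkc, lam)) for lam in LAPLACIAN_EIGS)
```

In this example `stabilizes_team(-1.0)` holds while `stabilizes_team(1.0)` fails at the larger eigenvalues, and the mode with zero eigenvalue is $A$ itself whatever the gain.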

One further point concerns the stabilizability of the team system. Because the graph Laplacian matrix (2.1) always has a zero eigenvalue, the matrix $A$ must itself be stable when the gain $K$ of the local control law is determined: for $\lambda_1 = 0$ the subsystem (4.10) reduces to $\dot{\tilde{x}}_1 = A \tilde{x}_1$ regardless of $K$, so otherwise the local control law could not yield a stabilizable team system.

Assumption 3. Matrix A in system (2.3) is stable.

4.3 Team Dynamical System

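Eventually, the set of Pareto optimal solutions, parametrized by the coefficient $\alpha$, is defined by the following minimization problem:

$$
u^{g*}(\alpha) = \arg\min_{u^g} J^t(\alpha)
\quad \text{s.t.} \quad
\dot{x} = A^t x + (I_N \otimes B) u^g
\tag{4.11}
$$

where

$$
A^t =
\begin{bmatrix}
A + |N_1| BKC & \cdots & -BKC & \cdots \\
\vdots & \ddots & & \vdots \\
\cdots & -BKC & \cdots & A + |N_N| BKC
\end{bmatrix}
\tag{4.12}
$$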

Note that each block row $i$ of $A^t$ has the term $-BKC$ in the block columns corresponding to its neighboring set $N_i$.

It is well known that (4.11) is a standard linear quadratic regulator (LQR) problem, and in the infinite horizon case ($T \to \infty$) its solution is

$$
u^{g*}(\alpha, x) = -R^{-1} (I_N \otimes B)^T P x
\tag{4.13}
$$

where $P \in \mathbb{R}^{Nn \times Nn}$ satisfies the following algebraic Riccati equation (ARE):

$$
(I_N \otimes C^T) Q (I_N \otimes C) - P (I_N \otimes B) R^{-1} (I_N \otimes B)^T P + P A^t + (A^t)^T P = 0
\tag{4.14}
$$

As stated in the previous section in connection with (2.9), the LQ problem (4.11) is convex. Convexity guarantees that (4.11) has a unique solution for each $\alpha$, which yields a unique set of Pareto optimal solutions (the Pareto frontier) parametrized by $\alpha$.

4.4 Bargaining Problem

Having obtained the unique Pareto frontier, the final step is to choose the coefficient $\alpha$: we need a criterion for selecting a strategy on the Pareto frontier by updating $\alpha$. In terms of cooperative game theory, the coefficient $\alpha$ can be seen as a bargaining solution, and the cooperative game theoretic approach uses it to decrease the team cost functional (4.3). In short, we select the unique Pareto optimal solution on the Pareto frontier for which the team cost functional is minimized.

Under the assumption that the agents are motivated to cooperate, the cost at the disagreement point, denoted by $d_i$ and called the "threat-point" in [1], must never be smaller than the cooperative cost:

$$
d_i - J_i(\alpha, u^{g*}(\alpha)) \ge 0 \quad \forall i
\tag{4.15}
$$

In this sense, $d_i$ should be computed individually as a non-cooperative cost value. In this thesis we therefore take $d_i$ to be the optimal cost value obtained with the non-cooperative control law (3.2).

In general, a bargaining problem describes how agents negotiate to divide a unit among themselves. Let $\mathcal{A} = \{1, \cdots, N\}$ be a finite set of players. A bargaining set for $\mathcal{A}$ is a closed, convex, comprehensive, and positively bounded nonempty subset $\Omega$ of $\mathbb{R}^N$ whose boundary points are Pareto optimal in $\Omega$. Given the feasible set of allocations $\Theta \subset \mathbb{R}^N$ and each agent's utility functional $f_i : [0, 1] \to \mathbb{R}$, a bargaining set is defined as

$$
\Omega = \{ \omega \mid \exists \alpha \in \Theta \ \text{s.t.}\ \omega_i = f_i(\alpha_i), \ \forall i \in \mathcal{A} \}
\tag{4.16}
$$

Finally, by introducing the disagreement point $d \in \mathbb{R}^N$, we obtain the following definitions, which are important for the bargaining problem:

• A bargaining problem for $\mathcal{A}$ is a pair $(\Omega, d)$.

• A bargaining solution $\Phi$ maps a bargaining problem $(\Omega, d)$ to a point of the bargaining set $\Omega$, i.e., $\Phi(\Omega, d) \in \Omega$.

• The individual points of a bargaining problem $(\Omega, d)$ are the points in $\Omega_d = \{ \alpha \mid \alpha \in \Omega, \ \alpha \ge d \}$.

Many papers explain the theoretical framework of bargaining solutions. In this thesis we focus on two axiomatic approaches: the Nash bargaining solution (NBS) and the Raiffa bargaining solution (RBS). Their concrete formulations follow [14].

4.4.1 Nash Bargaining Solution (NBS)

It is well known that the NBS $\Phi_N$ satisfies the following four axioms:

A1: Pareto optimality

A2: Invariance to affine transformations (independence of linear transformations)

A3: Symmetry

A4: Independence of irrelevant alternatives

These four axioms are important in considering a bargaining solution: A1 describes the situation in which no further improvement for all the agents at once is possible; A2 means that $\Phi_N$ is not affected by affine scaling; A3 assumes that all agents have equal bargaining skill; A4 states that $\Phi_N$ is not affected by enlarging the domain.

Considering (4.15), the NBS $\Phi_N$ is mathematically defined as

$$
\alpha^* = \arg\max_{\alpha \in \Delta(N)} \prod_{i=1}^{N} \big( d_i - J_i(\alpha, u^{g*}) \big)
\tag{4.17}
$$

where $u^{g*}$ is the Pareto optimal control set. The optimal bargaining solution is known to satisfy

$$
\alpha_1^* \big( d_1 - J_1^*(\alpha^*, u^{g*}) \big) = \cdots = \alpha_N^* \big( d_N - J_N^*(\alpha^*, u^{g*}) \big)
\tag{4.18}
$$

Equivalently, the NBS can be computed as

$$
\alpha_i^* = \frac{\prod_{j \ne i} \big( d_j - J_j^*(\alpha^*, u^{g*}) \big)}{\sum_{j=1}^{N} \prod_{k \ne j} \big( d_k - J_k^*(\alpha^*, u^{g*}) \big)}
\tag{4.19}
$$

4.4.2 Raiffa Bargaining Solution (RBS)

Besides the NBS, there is another desirable bargaining solution that satisfies Pareto optimality: the Raiffa Bargaining Solution (RBS). Unlike the NBS, the RBS drops A4 while keeping A1, A2, and A3. In its place, the RBS satisfies the following axiom:

A4′: Monotonicity

For the two-player bargaining problem, a non-cooperative foundation of the sequential RBS is given in [13], which helps in understanding how the RBS works. Introducing the minimum cost $q_i$ of each agent $i$, the RBS is computed from the following problem:

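$$
\alpha^* = \arg\max_{\alpha \in \Delta(N)} \prod_{i=1}^{N} \Big[ d_i - J_i(\alpha, u^{g*}) + \frac{1}{N-1} \sum_{j \in N_i} \big( J_j(\alpha, u^{g*}) - q_j \big) \Big]
\tag{4.20}
$$

Like the NBS, the optimal RBS satisfies

$$
\alpha_1^* \big[ d_1 - J_1^*(\alpha^*, u^{g*}) + W_1(\alpha^*) \big] = \cdots = \alpha_N^* \big[ d_N - J_N^*(\alpha^*, u^{g*}) + W_N(\alpha^*) \big]
\tag{4.21}
$$

where $W_i(\alpha) = \frac{1}{N-1} \sum_{j \in N_i} \big( J_j(\alpha, u^{g*}) - q_j \big)$. Hence, the RBS of each agent is computed as

$$
\alpha_i^* = \frac{\prod_{j \ne i} \big[ d_j - J_j^*(\alpha^*, u^{g*}) + W_j(\alpha^*) \big]}{\sum_{j=1}^{N} \prod_{k \ne j} \big[ d_k - J_k^*(\alpha^*, u^{g*}) + W_k(\alpha^*) \big]}
\tag{4.22}
$$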
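For fixed cost values, the closed-form allocations (4.19) and (4.22) can be evaluated directly. The sketch below uses hypothetical costs for three fully connected agents and hypothetical function names; in the thesis the costs depend on $\alpha$ themselves, so this corresponds to a single evaluation rather than the fixed point.

```python
from math import prod


def bargaining_weights(gains):
    """alpha_i = prod_{j != i} gains_j / sum_j prod_{k != j} gains_k, as in (4.19) and (4.22)."""
    nums = [prod(g for j, g in enumerate(gains) if j != i)
            for i in range(len(gains))]
    total = sum(nums)
    return [x / total for x in nums]


def raiffa_gains(d, costs, q_min, neighbours):
    """Bracketed terms of (4.20): d_i - J_i + W_i, with W_i scaled by 1 / (N - 1)."""
    n = len(d)
    return [d[i] - costs[i]
            + sum(costs[j] - q_min[j] for j in neighbours[i]) / (n - 1)
            for i in range(n)]


# Hypothetical numbers for three fully connected agents.
d = [10.0, 8.0, 6.0]          # disagreement (non-cooperative) costs
J = [6.0, 6.0, 5.0]           # cooperative costs for the current alpha
q = [4.0, 5.0, 4.0]           # minimum costs q_i used by the RBS
NEIGHBOURS = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

nbs_gains = [di - ji for di, ji in zip(d, J)]
alpha_nbs = bargaining_weights(nbs_gains)
rbs_gains = raiffa_gains(d, J, q, NEIGHBOURS)
alpha_rbs = bargaining_weights(rbs_gains)
```

In both cases the products of weight and gain are equal across agents, which is the balance condition (4.18) or (4.21).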


4.5 Algorithm for Bargaining Solutions

To conclude the cooperative game theoretic approach, we introduce the concrete algorithm used in the simulation section: it combines the linear quadratic regulator control problem (4.11) with the bargaining solution (4.19) or (4.22). The algorithm in this thesis follows [1].

Step 1: Start with the initial allocation $\alpha = [\frac{1}{N}, \cdots, \frac{1}{N}]$.

Step 2: Solve the linear quadratic regulator control problem (4.11) with $\alpha$.

Step 3: Check whether $J_i(u^{g*}(\alpha)) \le d_i$, where $d_i$ is the non-cooperative cost value. If this condition is not satisfied for agent $k$, update $\alpha$ as

$$
\alpha_i =
\begin{cases}
\alpha_i + 0.01, & \text{if } i = k \\
\alpha_i - \frac{0.01}{N-1}, & \text{if } i \ne k
\end{cases}
\tag{4.23}
$$

and return to Step 2 until the condition is satisfied for all the agents.

Step 4: Compute the optimal allocation $\hat{\alpha}$ according to the bargaining solution: NBS (4.19) or RBS (4.22).

Step 5: Apply the update rule

$$
\alpha_i := 0.9 \alpha_i + 0.1 \hat{\alpha}_i
\tag{4.24}
$$

If $|\hat{\alpha}_i - \alpha_i| < 0.01$ for all $i$, terminate the algorithm and set $\alpha = \hat{\alpha}$; otherwise return to Step 2.
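The five steps can be sketched as a loop. This is an illustration under stated assumptions, not the implementation used in the thesis: `solve_costs` is a hypothetical callable standing in for the LQR solve of Step 2, `toy_costs` is an invented two-agent cost model, only the NBS is used in Step 4, and the termination test is applied after the update as the steps are worded.

```python
from math import prod


def nbs_weights(gains):
    """NBS allocation (4.19) for fixed gains d_i - J_i (assumed positive)."""
    nums = [prod(g for j, g in enumerate(gains) if j != i)
            for i in range(len(gains))]
    total = sum(nums)
    return [x / total for x in nums]


def bargain(solve_costs, d, step=0.01, tol=0.01, max_iter=10_000):
    """Steps 1-5 with the NBS; solve_costs(alpha) replaces the LQR solve."""
    n = len(d)
    alpha = [1.0 / n] * n                                    # Step 1
    for _ in range(max_iter):
        costs = solve_costs(alpha)                           # Step 2
        violated = [k for k in range(n) if costs[k] > d[k]]
        if violated:                                         # Step 3, update (4.23)
            k = violated[0]
            alpha = [a + step if i == k else a - step / (n - 1)
                     for i, a in enumerate(alpha)]
            continue
        target = nbs_weights([d[i] - costs[i] for i in range(n)])    # Step 4
        alpha = [0.9 * a + 0.1 * t for a, t in zip(alpha, target)]   # Step 5, (4.24)
        if all(abs(t - a) < tol for t, a in zip(target, alpha)):
            return target
    raise RuntimeError("bargaining iteration did not converge")


def toy_costs(alpha):
    """Hypothetical stand-in for (4.11): a larger weight lowers the agent's cost."""
    base = [1.0, 2.0]
    return [b * (1.5 - a) for b, a in zip(base, alpha)]


alpha_star = bargain(toy_costs, d=[1.2, 2.4])
```

Swapping `nbs_weights` for the RBS gains of (4.22) would give the corresponding RBS loop.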


5 Case Study: EMU

5.1 Monetary and Fiscal Policies

In modern society, it becomes even more difficult for each nation to pursue inde- pendent monetary policy. It is said that the central banks of Germany and Japan are the only monetary authorities that can independently select their own interest rates in the long run.

As for the European countries, the European Economic and Monetary Union (EMU) changes the macroeconomic policy of the participating countries: the monetary and exchange rate policies are no longer set by the individual countries but by the European Central Bank (ECB), which is an example of delegating authority at the national level. The monetary policy aims at achieving economic growth and stability by means of a stable price level and low unemployment. Participation in EMU comes at the price of losing monetary authority and exchange rate adjustment. In other words, some European countries gave up monetary policy as a domestic instrument and transferred the determination of exchange rates and interest rates to ECB. As a result, the interest rates in European countries became completely insensitive to US interest rates and, instead, quite sensitive to German interest rates. It is worth noting, however, that EMU does not seem to have introduced major structural change.

The European countries must not only follow the common monetary policy but also conform to fiscal restrictions such as the Stability and Growth Pact (SGP). Fiscal policy refers to the government budget for economic activities; it changes aggregate demand and income distribution. The main goal of SGP is to coordinate the fiscal policies of the participating countries in the euro zone and to reduce macroeconomic spillovers or externalities. SGP consists of a political agreement reached during the Amsterdam European Council, and according to SGP, the participating countries need to keep the budget deficit within the limit of 3 % of GDP.

5.2 Implementation of EMU Simulation

In general, inflation means an increase in the price of commodities, which reflects a reduction in the real value of the currency. Inflation did not appear to be related to the output gap, but after the Maastricht Treaty the impact of the output gap became large in order to meet the Maastricht convergence criteria. In order to deal with the output gap, money supply is the main instrument of the participating countries; yet money supply does not affect price directly. Empirical results by many authors indicate that the main channel through which price affects money in the long term is portfolio balance, whereas the balance does not impact directly on price. In this sense, money supply can be seen as an inflationary pressure.

In practice, we also need to highlight the speed of the changes brought about by introducing the euro. Under the assumption of participation in the common monetary policy, there is a strong need for structural reform due to the lack of flexibility that results from having no domestic monetary policy. However, the need is not necessarily urgent because the participating countries bear no exchange rate risk with respect to each other.

Considering the above analyses, ECB executes a common monetary policy and circulates the euro in place of the conventional national currencies. The euro can now be seen as a well-established currency. Through the common currency, ECB tries to stabilize prices in the EU and to maintain the external value of the euro. Hence, a reliable fiscal stabilization policy is awaited in anticipation of recessions in the participating countries. Additionally, the cost of monetary policy coordination is highlighted as an important issue.

In implementing simulations of EMU, the cost functional to be minimized needs to include the following:

• Cost of output gap and inflation, i.e., fraction of the output gap.

• Loss of international competitiveness due to revaluation of the currency.

• Control cost of money supply.

Eventually, the analysis of fiscal stabilization policy will result in noteworthy implications of EMU: for example, non-participating countries such as UK, Sweden, and Switzerland would have benefited from the euro; on the other hand, participating countries such as France, Spain, and Italy would perhaps have been better off without the common currency.


6 Simulation Study

6.1 Problem Setting

In this thesis we consider a simplified macroeconomic model of domestic stabilization policy whose framework is a linear quadratic (LQ) differential game in continuous time; see, e.g., [20] [21]. In [20] [21], an open-loop system is assumed: all the agents can completely grasp what the others do. In this thesis, on the other hand, we assume that the information available to each agent is limited, as seen in the cost functional (2.6).

Let us define the following state and input control for each country i:

\[
x_i = \begin{bmatrix} e_i \\ p_i \end{bmatrix}, \qquad u_i = \begin{bmatrix} i_i \\ i_i + f_i \end{bmatrix} \tag{6.1}
\]

where $e_i$ is the exchange rate, $p_i$ is the price level, $i_i$ is the nominal interest rate, and $f_i$ is the domestic real fiscal deficit. Note that the variables are expressed as deviations from their long-run steady states, i.e., they are normalized to zero. This definition indicates that the exchange rate is controlled only by the interest rate, whereas the price is governed by both the interest rate and the fiscal deficit.

In many cases it is reasonable to assume uncovered interest-rate parity (UIP) hypothesis, which implies that the exchange rate involves only the interest rates in its system. However, at the expense of Assumption 3, our simulation is based on the following form of dynamical system

\[
A = \begin{bmatrix} -a & v \\ b & -c \end{bmatrix} \tag{6.2}
\]

where $a, b, c > 0$ and $v \ge 0$. This form reflects the empirical fact that the exchange rate is often stable and in general does not change dramatically.
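As a short worked check, assuming the row-wise reading $A = \begin{bmatrix} -a & v \\ b & -c \end{bmatrix}$ of (6.2), the stability of the uncontrolled dynamics can be read off the characteristic polynomial. The derivation below is an added illustration and is not taken from the thesis.

```latex
\det(\lambda I - A) = \lambda^2 + (a + c)\,\lambda + (ac - vb),
\qquad
\lambda_{1,2} = \frac{-(a + c) \pm \sqrt{(a - c)^2 + 4vb}}{2}.
% Since v b >= 0, both eigenvalues are real.
% With a + c > 0, both are negative if and only if ac > vb.
% For the values used later in (6.4), a = 2, v = 0, b = 1, c = 1:
\lambda_{1,2} = \frac{-3 \pm 1}{2} \in \{-1,\, -2\}.
```

Under this reading, a small coupling $v$ (in the sense $vb < ac$) keeps the open-loop exchange rate and price dynamics asymptotically stable.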

Accordingly, we must minimize the following three elements: the first is the cost of the output gap, denoted by $y_i$, from its natural level and of inflation $\dot{p}_i$; the second is the loss of international competitiveness, i.e., the deviation between prices; and the third is the control cost of money supply, i.e., interest rate and fiscal deficit. These goals are pursued by minimizing the deviations of the output gap between neighbors and the global control law, as seen in (2.6).

From the point of view of the non-cooperative game, each country decides how to minimize its own cost functional using only its own information: $x_i$, $y_i$, and $z_i$. In contrast, the cooperative game can be viewed as the following two steps: first, all the countries try to obtain a common stabilizing controller by sharing information from each output $y_i$; and second, the European Central Bank (ECB) takes the optimal strategy to minimize the entire cost functional by taking into consideration $x_i$, which involves the exchange rate $e_i$ and the price level $p_i$, from all the countries.

6.2 Simulation Results

Let us consider a monetary and fiscal stabilization policy problem of six countries: France, Germany, Spain, Italy, Sweden, and UK. This situation might indicate the case where Sweden and UK would participate in the euro. In this simulation, we assume the neighbors of each country according to geographic connection.

Taking into consideration Assumption 1, we set the following graph Laplacian matrix according to import and export statistics between these six countries:

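\[
L = \begin{bmatrix}
4 & -1 & -1 & -1 & 0 & -1 \\
-1 & 5 & -1 & -1 & -1 & -1 \\
-1 & -1 & 3 & -1 & 0 & 0 \\
-1 & -1 & -1 & 3 & 0 & 0 \\
0 & -1 & 0 & 0 & 2 & -1 \\
-1 & -1 & 0 & 0 & -1 & 3
\end{bmatrix} \tag{6.3}
\]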

Fig. 6.1 shows the schematic figure of connection of the countries.

Figure 6.1: Communication network of countries defined by import and export statistics. The indices in circles represent the index of each country in graph Laplacian matrix.
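The structure encoded in (6.3) can be checked numerically. The sketch below, an added illustration, verifies that the matrix is a symmetric graph Laplacian with zero row sums, reads the neighbor sets off the off-diagonal entries, and uses the second-smallest eigenvalue as a connectivity test. Indices are 0-based, so index $k-1$ in the code corresponds to country $k$ in Fig. 6.1; the variable names are assumptions made here.

```python
import numpy as np

# Graph Laplacian (6.3); rows and columns follow the country indices 1..6.
L = np.array([
    [ 4, -1, -1, -1,  0, -1],
    [-1,  5, -1, -1, -1, -1],
    [-1, -1,  3, -1,  0,  0],
    [-1, -1, -1,  3,  0,  0],
    [ 0, -1,  0,  0,  2, -1],
    [-1, -1,  0,  0, -1,  3],
], dtype=float)

# Basic Laplacian properties of an undirected graph.
is_symmetric = np.allclose(L, L.T)
has_zero_row_sums = np.allclose(L.sum(axis=1), 0.0)

# Neighbor set of agent i: positions of the negative off-diagonal entries in row i.
neighbors = {i: [int(j) for j in np.flatnonzero(L[i] < 0)] for i in range(6)}

# Eigenvalues in ascending order; the graph is connected iff the second one is positive.
eigenvalues = np.linalg.eigvalsh(L)
is_connected = bool(eigenvalues[1] > 1e-9)
```

Because the second country is connected to all the others, the largest Laplacian eigenvalue equals the number of agents, which is 6 here.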


Assumptions 2 and 3 determine a common dynamical system of the countries as follows:

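\[
\dot{x}_i = \begin{bmatrix} -2 & 0 \\ 1 & -1 \end{bmatrix} x_i + \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} u_i, \qquad y_i = \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix} x_i \tag{6.4}
\]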

As for the cost functionals, the six countries share a common weighting matrix $Q_i$ and have different weighting matrices $R_i$ depending on each country's economic situation, such as gross domestic product (GDP): large countries like France, Germany, and UK need more control effort of money supply than the other countries. The nominal GDP data are given in Table 6.1.

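\[
\begin{aligned}
J_1 &= \int_0^T \Big\{ \sum_{j \in N_1} (y_1 - y_j)^T \begin{bmatrix} 8 & 3 \\ 3 & 4 \end{bmatrix} (y_1 - y_j) + (u^g_1)^T \begin{bmatrix} 4 & 0 \\ 0 & 4 \end{bmatrix} u^g_1 \Big\}\, dt \\
J_2 &= \int_0^T \Big\{ \sum_{j \in N_2} (y_2 - y_j)^T \begin{bmatrix} 8 & 3 \\ 3 & 4 \end{bmatrix} (y_2 - y_j) + (u^g_2)^T \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix} u^g_2 \Big\}\, dt \\
J_3 &= \int_0^T \Big\{ \sum_{j \in N_3} (y_3 - y_j)^T \begin{bmatrix} 8 & 3 \\ 3 & 4 \end{bmatrix} (y_3 - y_j) + (u^g_3)^T \begin{bmatrix} 3 & 0 \\ 0 & 3 \end{bmatrix} u^g_3 \Big\}\, dt \\
J_4 &= \int_0^T \Big\{ \sum_{j \in N_4} (y_4 - y_j)^T \begin{bmatrix} 8 & 3 \\ 3 & 4 \end{bmatrix} (y_4 - y_j) + (u^g_4)^T \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} u^g_4 \Big\}\, dt \\
J_5 &= \int_0^T \Big\{ \sum_{j \in N_5} (y_5 - y_j)^T \begin{bmatrix} 8 & 3 \\ 3 & 4 \end{bmatrix} (y_5 - y_j) + (u^g_5)^T \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} u^g_5 \Big\}\, dt \\
J_6 &= \int_0^T \Big\{ \sum_{j \in N_6} (y_6 - y_j)^T \begin{bmatrix} 8 & 3 \\ 3 & 4 \end{bmatrix} (y_6 - y_j) + (u^g_6)^T \begin{bmatrix} 4 & 0 \\ 0 & 4 \end{bmatrix} u^g_6 \Big\}\, dt
\end{aligned} \tag{6.5}
\]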
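To make the structure of (6.5) concrete, the following sketch evaluates the integrand of one cost functional at a single time instant. It is an added illustration: the neighbor sets are read off the Laplacian (6.3), the scalars in `R_SCALES` are the diagonal entries of the matrices $R_i$ in (6.5), and the function name, the 0-based indexing, and the test values for outputs and controls are assumptions made here.

```python
import numpy as np

# Common output weight from (6.5); it is symmetric positive definite.
Q = np.array([[8.0, 3.0], [3.0, 4.0]])
# R_i = R_SCALES[i] * I for countries 1..6, as listed in (6.5).
R_SCALES = [4.0, 5.0, 3.0, 2.0, 2.0, 4.0]
# Neighbor sets implied by the off-diagonal entries of (6.3), 0-based.
NEIGHBORS = {0: [1, 2, 3, 5], 1: [0, 2, 3, 4, 5], 2: [0, 1, 3],
             3: [0, 1, 2], 4: [1, 5], 5: [0, 1, 4]}

def running_cost(i, Y, u_g):
    """Integrand of the cost of agent i: output disagreement plus control effort."""
    diffs = Y[i] - Y[NEIGHBORS[i]]                      # one row per neighbor
    consensus = np.einsum("nj,jk,nk->", diffs, Q, diffs)
    control = R_SCALES[i] * float(u_g @ u_g)
    return float(consensus + control)

# Made-up instant: only agent 4 (0-based) has a non-zero output.
Y = np.zeros((6, 2))
Y[4] = [1.0, 0.0]
cost_agent_4 = running_cost(4, Y, np.array([1.0, 1.0]))
cost_agent_1 = running_cost(1, Y, np.zeros(2))
q_is_positive_definite = bool(np.all(np.linalg.eigvalsh(Q) > 0))
```

In this made-up instant, agent 4 disagrees with both of its neighbors and pays for its control, whereas agent 1 only pays for its disagreement with agent 4. Integrating such values along a trajectory would give the cost $J_i$.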

In determining the local control law (4.5) that stabilizes the team system (4.6) for cooperation, the condition on the gain is $K < 0.25$. For example, we take $K = -2$; then the matrix $A^t$ of the team dynamical system in (4.11) is


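\[
A^t = \left[\begin{array}{rrrrrrrrrrrr}
-10 & 16 & 2 & -4 & 2 & -4 & 2 & -4 & 0 & 0 & 2 & -4 \\
1 & -9 & 0 & 2 & 0 & 2 & 0 & 2 & 0 & 0 & 0 & 2 \\
2 & -4 & -12 & 20 & 2 & -4 & 2 & -4 & 2 & -4 & 2 & -4 \\
0 & 2 & 1 & -11 & 0 & 2 & 0 & 2 & 0 & 2 & 0 & 2 \\
2 & -4 & 2 & -4 & -8 & 12 & 2 & -4 & 0 & 0 & 0 & 0 \\
0 & 2 & 0 & 2 & 1 & -7 & 0 & 2 & 0 & 0 & 0 & 0 \\
2 & -4 & 2 & -4 & 2 & -4 & -8 & 12 & 0 & 0 & 0 & 0 \\
0 & 2 & 0 & 2 & 0 & 2 & 1 & -7 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & -4 & 0 & 0 & 0 & 0 & -6 & 8 & 2 & -4 \\
0 & 0 & 0 & 2 & 0 & 0 & 0 & 0 & 1 & -5 & 0 & 2 \\
2 & -4 & 2 & -4 & 0 & 0 & 0 & 0 & 2 & -4 & -8 & 12 \\
0 & 2 & 0 & 2 & 0 & 0 & 0 & 0 & 0 & 2 & 1 & -7
\end{array}\right], \tag{6.6}
\]

which is stabilizable since $A^t$ is Hurwitz, i.e., asymptotically stable. We assume the infinite-horizon case, and the initial values are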

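\[
\begin{aligned}
x_1(0) &= \begin{bmatrix} -0.4016 \\ 0.2336 \end{bmatrix}, &
x_2(0) &= \begin{bmatrix} -0.3876 \\ 0.0404 \end{bmatrix}, &
x_3(0) &= \begin{bmatrix} -0.3622 \\ -0.4966 \end{bmatrix}, \\
x_4(0) &= \begin{bmatrix} -0.3674 \\ 0.0298 \end{bmatrix}, &
x_5(0) &= \begin{bmatrix} 0.3024 \\ 0.5066 \end{bmatrix}, &
x_6(0) &= \begin{bmatrix} -0.7516 \\ 0.2886 \end{bmatrix}.
\end{aligned} \tag{6.7}
\]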

The following two facts support the above initial values.

First, the participating countries are France, Germany, Spain, and Italy, and their exchange rates are naturally close to each other due to the common currency (euro). On the other hand, Sweden and UK have their own currencies: the Swedish krona and the pound, respectively. In our simulation it is assumed that the exchange rate of the Swedish krona is higher than that of the euro and that the exchange rate of the pound is lower than that of the euro.

Second, we follow the comparative price levels of final consumption by private households, including indirect taxes, with the EU average set to 100. Comparative price levels are the ratio between Purchasing Power Parities (PPPs) and the market exchange rate for each country. PPPs are currency conversion rates that convert economic indicators expressed in national currencies to a common currency, called Purchasing Power Standard (PPS), which equalizes the purchasing power of different national currencies. In that sense, meaningful comparison is possible. Fig. 6.2 shows the comparative price levels of the six countries.
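The team matrix (6.6) can be rebuilt from the Laplacian (6.3) and the agent model (6.4), which also gives a numerical check of the Hurwitz property and of the gain condition $K < 0.25$. The Kronecker form $A^t = I \otimes A + K\,(L \otimes BC)$ used below is inferred by matching the entries of (6.6) block by block; it is an assumption of this sketch and not a formula quoted from (4.5) or (4.6). The function names are likewise hypothetical.

```python
import numpy as np

# Agent model (6.4).
A = np.array([[-2.0, 0.0], [1.0, -1.0]])
B = np.eye(2)
C = np.array([[1.0, -2.0], [0.0, 1.0]])

# Graph Laplacian (6.3).
L = np.array([
    [ 4, -1, -1, -1,  0, -1],
    [-1,  5, -1, -1, -1, -1],
    [-1, -1,  3, -1,  0,  0],
    [-1, -1, -1,  3,  0,  0],
    [ 0, -1,  0,  0,  2, -1],
    [-1, -1,  0,  0, -1,  3],
], dtype=float)

def team_matrix(K):
    """Assumed closed-loop form: block i,i is A + K*L_ii*B*C, block i,j is K*L_ij*B*C."""
    return np.kron(np.eye(L.shape[0]), A) + K * np.kron(L, B @ C)

def is_hurwitz(M):
    """True if every eigenvalue of M has a strictly negative real part."""
    return bool(np.max(np.linalg.eigvals(M).real) < 0)

A_t = team_matrix(-2.0)                 # gain used in the thesis
stable_at_minus_2 = is_hurwitz(A_t)
stable_below_bound = is_hurwitz(team_matrix(0.2))
stable_above_bound = is_hurwitz(team_matrix(0.3))
```

Under this assumed form, the matrix for $K = -2$ reproduces the blocks of (6.6), and the gains 0.2 and 0.3 fall on the stable and unstable side of the bound 0.25, respectively.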

