
Linnaeus University Dissertations

No 260/2016

Rani Basna

Mean Field Games for Jump Non-Linear Markov Process

linnaeus university press

Lnu.se

isbn: 978-91-88357-30-4


No 260/2016

MEAN FIELD GAMES FOR JUMP NON-LINEAR MARKOV PROCESS

RANI BASNA

Mean Field Games for Jump Non-Linear Markov Process

Doctoral dissertation, Department of Mathematics, Linnaeus University, Växjö, Sweden, 2016

ISBN: 978-91-88357-30-4

Published by: Linnaeus University Press, 351 95 Växjö, Sweden
Printed by: Elanders Sverige AB, 2016


Basna, Rani (2016). Mean Field Games for Jump Non-Linear Markov Process, Linnaeus University Dissertation No 260/2016, ISBN: 978-91-88357-30-4. Written in English.

Mean-field game theory is the study of strategic decision making in very large populations of weakly interacting individuals. Mean-field games have been an active area of research in the last decade due to their increased significance in many scientific fields. The foundations of mean-field theory go back to statistical and quantum physics. One may describe mean-field games as a type of stochastic differential game in which the interaction between the players is of mean-field type, i.e. the players are coupled via their empirical measure. The theory was proposed by Lasry and Lions and independently by Huang, Malhamé, and Caines. Since then, mean-field games have become a rapidly growing area of research and have been studied by many researchers. However, most of these studies were dedicated to diffusion-type games. The main purpose of this thesis is to extend the theory of mean-field games to the jump case, in both discrete and continuous state spaces. Jump processes are a very important tool in many areas of application, specifically when modeling abrupt events appearing in real life: for instance, financial modeling (option pricing and risk management), networks (electricity grids and banks) and statistics (modeling and analyzing spatial data). The thesis consists of two papers and one technical report which will be submitted soon.

In the first publication, we study the mean-field game in a finite state space where the dynamics of the indistinguishable agents are governed by a controlled continuous time Markov chain. We study the control problem for a representative agent in the linear quadratic setting. A dynamic programming approach is used to derive the Hamilton-Jacobi-Bellman equation, and consequently the optimal strategy is obtained. The main result is to show that the individual optimal strategies for the mean-field game system represent a 1/N-Nash equilibrium for the approximating system of N agents.

In the second article, we generalize the previous results to agents driven by non-linear pure jump Markov processes in Euclidean space. Mathematically, this means working with linear operators in Banach spaces adapted to the integro-differential operators of jump type, and with non-linear partial differential equations, instead of working with linear transformations in Euclidean spaces as in the first work. As a by-product, a generalization of the Koopman operator is presented. In this setting, we study the control problem in a more general sense, i.e. the cost function is not necessarily of linear quadratic form. We show that the resulting unique optimal control is of Lipschitz type. Furthermore, a fixed point argument is presented in order to construct the approximate Nash equilibrium. In addition, we show that the rate of convergence is of a specific order as a result of utilizing a non-linear pure jump Markov process.

In a third paper, we develop our approach to treat a more realistic case from a modelling perspective. In this step, we assume that all players are subject to an additional common noise of Brownian type. We study in particular the well-posedness and the regularity of a jump version of the stochastic kinetic equation. Finally, we show that the solution of the master equation, which is a type of second order partial differential equation in the space of probability measures, provides an approximate Nash equilibrium. This paper has not been completely finished and is still in preprint form; hence, we have decided not to enclose it in the thesis. However, an outlook on the paper is included.

Keywords: Mean-field games, Dynamic Programming, Non-linear continuous time Markov chains, Non-linear Markov pure jump processes, Koopman Dynamics, McKean-Vlasov equation, Epsilon-Nash equilibrium.


Acknowledgments

First, I would like to express my sincere gratitude to my supervisor, Docent Astrid Hilbert, for her continuous support, understanding, and many interesting discussions during my PhD. Her guidance helped me throughout the research and the writing of this thesis.

I also want to thank my second supervisor, Professor Vassili Kolokoltsov, for his assistance and valuable discussions throughout my PhD.

I had the pleasure of interacting with a wonderful group at the Department of Mathematics at Linnaeus University. I want to send special thanks to Lars Gustafsson, Marcus Nilsson, Patrik Wahlberg, Yuanyuan Chen and Haidar Al-Talibi for their assistance and support. Special thanks are due also to Roger Pettersson, who was always willing to take the time to help me out. I have approached Roger with a wide range of questions, and he was always happy to share his thoughts.

A sincere thank you to my friends near and far for providing the support and friendship that I needed. In particular, I would like to thank my friends Martin, Mattias, Marie, Anders, Caroline, Eva and Birgit.

To my family in Sweden, Magnus, Anna, Erik, Axel, and Viggo: thank you very much for everything you have done.

I am amazingly lucky to have such a wonderful family, my dad, my mom, my sister Raneem and my brother Rafeef who constantly remind me of what is important in life. They continue to shape me today, and I am so grateful for their unwavering support and encouragement. I wish they were with me.

The best thing about my last six years is definitely that I have spent them beside my soul mate and best friend Hiba. I married the best person for me. There are no words to convey how much I love her. Hiba has been a true and great supporter and has unconditionally loved me during my good and bad times. She had faith in me even when I did not have faith in myself. I would not have been able to obtain this degree without her beside me. Helena, my little angel, I love you so much. Thank you for bringing so much light into my life with your precious smile and your beautiful songs.

Växjö, September 2016 Rani


Preface

This thesis consists of an introduction, two papers (Papers I–II), and a technical report about the third paper. The introduction provides the mathematical definitions and tools which are used in the thesis and introduces the mean-field game theory on which Papers I–III are based. It ends with a short summary of the results of the included papers.

Papers included in the thesis

I. Rani Basna, Astrid Hilbert, Vassili Kolokoltsov. "An Epsilon-Nash equilibrium for non-linear Markov games of mean-field-type on finite spaces". Communications on Stochastic Analysis (2014), 449–468.

II. Rani Basna, Astrid Hilbert, Vassili Kolokoltsov. "An Approximate Nash Equilibrium for Pure Jump Markov Games of Mean-field-type on Continuous State Space". Submitted to Stochastics: An International Journal of Probability and Stochastic Processes (2016).

III. Outlook of the paper "Jump Mean-Field Games disturbed with common Noise". 2016 (preprint).


Contents

1 Introduction
1.1 Continuous Time Markov Processes
1.2 Initial Value Problems for Ordinary Differential Equations
1.3 Optimal Control
1.4 Differential Game Theory
1.5 Mean-Field Games
References

2 Summary of included papers

3 Included Papers
I An Epsilon-Nash equilibrium for non-linear Markov games of mean-field-type on finite spaces.
II An Approximate Nash Equilibrium for Pure Jump Markov Games of Mean-field-type on Continuous State Space.
III Outlook for the paper Jump Mean-Field Games disturbed with common Noise.


1 Introduction

In this section, we present the main mathematical tools which are used in the thesis. Section 1.1 presents the theory of Markov processes. Section 1.2 is dedicated to some results from the theory of ordinary differential equations. Optimal control theory is introduced in Section 1.3, first in the diffusion case and then in the jump case. In Section 1.4 we give an introduction to differential game theory. Finally, Section 1.5 is an introduction to mean-field game theory.

1.1 Continuous Time Markov Processes

Consider a probability space (Ω, F, P), where Ω is called the sample space, F is the σ-algebra, and P is a probability measure on F.

1.1.1 Markov Chains

Let (X, β(X)) be a measurable space, p_n(x, dy) transition probabilities on (X, β(X)), and F_n, n = 0, 1, 2, ..., a filtration on (Ω, F, P).

Definition 1.1. A stochastic process (X_n)_{n=0,1,2,...} on (Ω, F, P) is called an (F_n)-Markov chain with transition probabilities p_n if and only if

1. X_n is F_n-measurable for all n ≥ 0,

2. P[X_{n+1} ∈ B | F_n] = p_{n+1}(X_n, B) P-a.s. for all n ≥ 0 and B ∈ β(X).

Theorem 1.2. Given a measurable space (X, β(X)), an initial distribution μ, and transition probabilities p_n, there exists a Markov chain on this space with these transition probabilities, and its distribution P_μ is unique; see [22].
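Concretely, the pair (μ, p_n) determines the chain by sequential sampling: draw X_0 ∼ μ, then X_{n+1} ∼ p_{n+1}(X_n, ·). The following minimal Python sketch illustrates this for a finite state space with a time-homogeneous transition matrix; the function name and the example matrix are ours, purely for illustration.

```python
import numpy as np

def simulate_chain(p, mu0, n_steps, rng=None):
    """Sample a path of a finite-state Markov chain.

    p   : (S, S) array, p[x, y] = P[X_{n+1} = y | X_n = x]
    mu0 : length-S array, the initial distribution of X_0
    """
    rng = np.random.default_rng() if rng is None else rng
    states = np.arange(len(mu0))
    path = [rng.choice(states, p=mu0)]                     # X_0 ~ mu0
    for _ in range(n_steps):
        path.append(rng.choice(states, p=p[path[-1]]))     # X_{n+1} ~ p(X_n, .)
    return np.array(path)

# Example: a two-state chain started in state 0
p = np.array([[0.9, 0.1],
              [0.5, 0.5]])
print(simulate_chain(p, mu0=np.array([1.0, 0.0]), n_steps=10))
```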

1.1.2 Weak Convergence

Let X be a Polish space and let

\[ C_b(X) := \{ f : X \to \mathbb{R} \mid f \text{ bounded and continuous} \} \]

be the space of bounded continuous real-valued functions on X. We equip C_b(X) with the supremum norm

\[ \|f\| := \sup_{x \in X} |f(x)|. \]

With this norm, C_b(X) is a Banach space. Moreover, let

\[ \mathcal{P}(X) := \{ \mu : \mu \text{ a probability measure on } (X, \beta(X)) \} \]

be the space of all probability measures on X. We equip P(X) with the topology of weak convergence. We say that a sequence of measures μ_n ∈ P(X) converges weakly to a limit μ ∈ P(X), denoted μ_n ⇒ μ, if

\[ \int f \, d\mu_n \to \int f \, d\mu \qquad \text{for all } f \in C_b(X). \]

Proposition 1.3 (Prokhorov). A subset K of P(X) is relatively compact in P(X) if and only if it is tight, i.e.,

\[ \forall \varepsilon > 0 \ \exists X_\varepsilon \subset X \text{ compact with } \mu(X \setminus X_\varepsilon) \le \varepsilon \quad \forall \mu \in K. \]

For the proof of this proposition and for more details see [10].

There are several ways to metrize the topology of weak convergence, at least on some subsets of P(X). Let us denote by d the distance on X and, for p ∈ [0, ∞), by P_p(X) the set of probability measures μ such that

\[ \int_X d^p(x, y) \, d\mu(y) < \infty \qquad \text{for all } x \in X. \]

The Monge-Kantorovich distance on P_p(X) is given by

\[ d_p(\mu, \xi) = \inf_{\gamma \in \Pi(\mu,\xi)} \left( \int_{X^2} d(x, y)^p \, d\gamma(x, y) \right)^{1/p}, \tag{1.1} \]

where Π(μ, ξ) is the set of Borel probability measures γ on X² such that γ(A × X) = μ(A) and γ(X × A) = ξ(A) for any Borel set A ⊂ X. For the proof that d_p constitutes a metric and for the existence of an optimal measure in (1.1) we refer to [6].

Theorem 1.4 (Kantorovich-Rubinstein Theorem). For any μ, ξ ∈ P_1(X),

\[ d_1(\mu, \xi) = \sup_{f \in C_{Lip}(X)} \left( \int_X f(x)\, d\mu(x) - \int_X f(x)\, d\xi(x) \right), \]

where C_{Lip}(X) is the set of Lipschitz continuous functions with Lipschitz constant at most 1.

For a proof of this Theorem see [6].
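To make the distance d_1 concrete: on X = R with d(x, y) = |x − y|, the one-dimensional Wasserstein distance implemented in SciPy computes exactly the quantity (1.1) with p = 1. A minimal sketch, assuming SciPy is available; the two discrete measures are illustrative:

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Two discrete probability measures on the real line:
# mu puts mass (0.5, 0.5) on {0, 1}; xi puts mass (0.5, 0.5) on {0, 2}.
mu_support, mu_weights = np.array([0.0, 1.0]), np.array([0.5, 0.5])
xi_support, xi_weights = np.array([0.0, 2.0]), np.array([0.5, 0.5])

# The optimal coupling matches the atoms at 0 and moves mass 0.5
# from 1 to 2, so d_1(mu, xi) = 0.5 * |1 - 2| = 0.5.
print(wasserstein_distance(mu_support, xi_support, mu_weights, xi_weights))
```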

1.1.3 Markov Processes

Let t ∈ R₊, X be a Polish space (a complete, separable metric space), β(X) the Borel σ-algebra, p_{s,t}(x, dy) transition probabilities (Markov kernels) on (X, β(X)), 0 ≤ s ≤ t ≤ ∞, and (F_t)_{t≥0} a filtration on (Ω, F, P).

Definition 1.5. A stochastic process (X_t)_{t≥0} on the state space X is called an (F_t)-Markov process with transition probabilities p_{s,t} if and only if

1. X_t is F_t-measurable for all t ≥ 0,

2. P[X_t ∈ B | F_s] = p_{s,t}(X_s, B) P-a.s. for all 0 ≤ s ≤ t, B ∈ β(X).

Definition 1.6. Let I be a finite set. A Q-matrix on I is a matrix Q = (q_{ij} : i, j ∈ I) satisfying the following conditions:

• 0 ≤ −q_{ii} < ∞ for all i,

• q_{ij} ≥ 0 for all i ≠ j,

• ∑_{j∈I} q_{ij} = 0 for all i.

Thus in each row of Q we can choose the off-diagonal entries to be any non-negative real numbers, subject only to the constraint that the off-diagonal row sum is finite:

\[ q_i := \sum_{j \ne i} q_{ij} < \infty. \]

The diagonal entry q_{ii} is then −q_i, making the total row sum zero.

For more details see [19].

Definition 1.7. PC(R₊, X) := {X : [0, ∞) → X | ∀t ≥ 0 ∃ε > 0 : X(s) is constant on the interval [t, t + ε)}.

Definition 1.8. A Markov process (X_t)_{t≥0} on (Ω, F, P) is called a pure jump process or continuous time Markov chain if and only if

(t → X_t) ∈ PC(R₊, X) P-a.s.

1.1.4 The construction of time-inhomogeneous jump processes

Let q_t : X × β(X) → [0, ∞) be a kernel of positive measures, i.e. x → q_t(x, A) is measurable and A → q_t(x, A) is a positive measure. Our aim in this section is to construct a pure jump process with instantaneous jump rates q_t(x, dy). Let λ_t(x) := q_t(x, X \ {x}) be the total rate of jumping away from x, assume that

λ_t(x) < ∞ for all x ∈ X (no instantaneous jumps),

and set

\[ \pi_t(x, A) := \frac{q_t(x, A)}{\lambda_t(x)}. \]

Now suppose that (Y_n, J_n, h_n)_{n∈N} is a Markov chain with jump times J_n and holding times h_n, so that J_n = ∑_{i=1}^n h_i ∈ [0, ∞], with jump times {J_n : n ∈ N}. Suppose that, with respect to P_{(t_0,μ)},

\[ J_0 := t_0, \qquad Y_0 \sim \mu, \qquad P_{(t_0,\mu)}[J_1 > t \mid Y_0] := e^{-\int_{t_0}^{t \vee t_0} \lambda_s(Y_0)\, ds} \]

for all t ≥ t_0. Then (Y_{n−1}, J_n)_{n∈N} is a time-homogeneous Markov chain on X × [0, ∞) with transition law

\[ P_{(t_0,\mu)}[Y_n \in dy,\ J_{n+1} > t \mid Y_0, J_1, \dots, Y_{n-1}, J_n] = \pi_{J_n}(Y_{n-1}, dy) \cdot e^{-\int_{J_n}^{t \vee J_n} \lambda_s(y)\, ds}, \]

i.e.

\[ P_{(t_0,\mu)}[Y_n \in A,\ J_{n+1} > t \mid Y_0, J_1, \dots, Y_{n-1}, J_n] = \int_A \pi_{J_n}(Y_{n-1}, dy) \cdot e^{-\int_{J_n}^{t \vee J_n} \lambda_s(y)\, ds}. \]

For Y_n ∈ X and strictly increasing t_n ∈ [0, ∞), define X := Φ((t_n, Y_n)_{n=0,1,2,...}) ∈ PC([t_0, ∞), X ∪ {Δ}) by

\[ X_t := \begin{cases} Y_n, & t_n \le t < t_{n+1},\ n \ge 0, \\ \Delta, & t \ge \sup_n t_n. \end{cases} \]

Let

\[ (X_t)_{t \ge t_0} := \Phi((J_n, Y_n)_{n \ge 0}), \qquad \mathcal{F}^X_t := \sigma(X_s \mid s \in [t_0, t]), \quad t \ge t_0. \]

Theorem 1.9. Under P_{(t_0,μ)}, (X_t)_{t≥t_0} is a Markov jump process with initial distribution X_{t_0} ∼ μ.
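The construction above can be sampled directly. The sketch below does this for a finite state space by thinning: proposed jump times arrive at a constant rate λ̄ bounding λ_t(x) from above, a proposal at time t is accepted with probability λ_t(x)/λ̄, and on acceptance the new state is drawn from π_t(x, ·). The function and the oscillating rates are our own illustrative assumptions.

```python
import numpy as np

def simulate_jump_process(q, lam_bar, x0, t0, t_end, rng=None):
    """Sample a time-inhomogeneous pure jump process on a finite state
    space by thinning, following the jump-chain/holding-time construction.

    q(t)    : callable returning the rate matrix q_t(x, y) (zero diagonal)
    lam_bar : constant upper bound on the total rates lambda_t(x)
    """
    rng = np.random.default_rng() if rng is None else rng
    t, x, path = t0, x0, [(t0, x0)]
    while True:
        t += rng.exponential(1.0 / lam_bar)        # proposed jump time
        if t >= t_end:
            return path
        rates = q(t)[x]
        lam = rates.sum()                          # lambda_t(x)
        if rng.random() < lam / lam_bar:           # accept w.p. lambda_t(x)/lam_bar
            x = rng.choice(len(rates), p=rates / lam)   # jump law pi_t(x, .)
            path.append((t, x))

# Example: two states, with one rate oscillating in time
q = lambda t: np.array([[0.0, 1.0 + 0.5 * np.sin(t)],
                        [2.0, 0.0]])
print(simulate_jump_process(q, lam_bar=3.0, x0=0, t0=0.0, t_end=5.0))
```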

Theorem 1.10.

1. The transition probabilities

\[ p_{s,t}(x, B) = P_{(s,x)}[X_t \in B] \qquad (0 \le s \le t,\ x \in X,\ B \in \beta(X)) \]

satisfy the Chapman-Kolmogorov equations: for each 0 ≤ r ≤ t ≤ s < ∞, x ∈ X, A ∈ β(X),

\[ p_{r,s}(x, A) = \int_X p_{t,s}(y, A)\, p_{r,t}(x, dy). \tag{1.2} \]

2. If t → λ_t(x) is continuous for all x ∈ X, then

\[ (p_{s,s+h} f)(x) = (1 - \lambda_s(x) \cdot h) f(x) + h \cdot (q_s f)(x) + o(h) \tag{1.3} \]

holds for all s ≥ 0, x ∈ X and bounded functions f : X → R such that t → (q_t f)(x) is continuous.

For a proof of the previous two theorems see [9].

We showed the existence of a Markov process first in Theorem 1.9 and then obtained the Chapman-Kolmogorov equations for the transition probabilities in Theorem 1.10. There is a partial converse to this, which we will now develop. First we need a definition. Let {p_{t,s} ; 0 ≤ t ≤ s < ∞} be a family of mappings from X × β(X) to [0, 1]. We say that they are a normal transition family if, for each 0 ≤ t ≤ s < ∞:

1. the maps x → p_{t,s}(x, A) are measurable for each A ∈ β(X);

2. p_{t,s}(x, ·) is a probability measure on β(X) for each x ∈ X;

3. the Chapman-Kolmogorov equations (1.2) are satisfied.


Proposition 1.11. If {p_{t,r}(x, A) := (T_{t,r} χ_A)(x) := ∫_A p(t, x, r, dy), 0 ≤ t ≤ r < ∞} is a normal transition family and μ is a fixed probability measure on the measurable space (X, β(X)), then there exists a probability space (Ω, F, P_μ), a filtration (F_t, t ≥ 0) and a Markov process (X_t, t ≥ 0) on that space such that:

• P[X(r) ∈ A | X(t) = x] = p_{t,r}(x, A) for each 0 ≤ t ≤ r, x ∈ X, A ∈ β(X);

• X(0) has law μ.

For the proof see [2].

Proposition 1.12. Let X_t be a pure Markov jump process. Then

\[ \tau_1 = \inf\{ t \ge 0 : X_t \ne X_0 \} \]

is an F_t stopping time.

Theorem 1.13. Under P_x, τ_1 and X_{τ_1} are independent, and there is a β(X)-measurable function λ(x) on X, different from the λ above, such that

\[ P_x[\tau_1 > t] = e^{-\lambda(x) t}. \]

For a proof of the above Proposition and Theorem see [22].

1.1.5 Forward and Backward Equations

Definition 1.14. The infinitesimal generator A_t of a Markov jump process at time t is the rate of change of the average of a function f : X → R of the process:

\[ A_t f(x) = \lim_{h \to 0} \frac{E_x f(X_h) - f(x)}{h}, \qquad f \in D_{A_t}. \]

We compute

\[ E_x f(X_h) = E_x[f(X_h) \mid \tau_1 > h]\, P_x[\tau_1 > h] + E_x[f(X_h) \mid \tau_1 \le h]\, P_x[\tau_1 \le h] \]
\[ = f(x)\, e^{-\lambda(x)h} + (1 - e^{-\lambda(x)h}) \int_X f(y)\, q_t(x, dy) + o(h), \]

so that

\[ E_x f(X_h) - f(x) = (1 - e^{-\lambda(x)h}) \left( \int_X f(y)\, q_t(x, dy) - f(x) \right) + o(h). \]

Thus

\[ A_t f(x) = \lambda(x) \int_X (f(y) - f(x))\, q_t(x, dy). \]

In the case of a countable state space, the infinitesimal generator (or intensity matrix, kernel) has the following form:

\[ E_x f(X_h) = \sum_{y \ne x} f(y)\, q_t(x, y)\, h + f(x)\Big( 1 - \sum_{y \ne x} q_t(x, y)\, h \Big) + o(h), \]
\[ E_x f(X_h) - f(x) = \sum_{y \ne x} (f(y) - f(x))\, q_t(x, y)\, h + o(h), \]
\[ A_t f(x) = \sum_{y \in X} (f(y) - f(x))\, q_t(x, y). \]
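On a finite state space the generator is just the intensity matrix acting on functions viewed as vectors: since each row of Q sums to zero, A_t f(x) = ∑_y (f(y) − f(x)) q_t(x, y) reduces to the matrix-vector product Qf. A small numerical sketch; the matrix and the function are illustrative:

```python
import numpy as np

# Intensity matrix on a 3-point state space: off-diagonal entries are
# the jump rates q(x, y); each row sums to zero.
Q = np.array([[-1.0,  0.5,  0.5],
              [ 2.0, -3.0,  1.0],
              [ 0.0,  4.0, -4.0]])

f = np.array([1.0, 0.0, 2.0])   # a function on the state space

# A f(x) = sum_y (f(y) - f(x)) q(x, y) = (Q f)(x), because the diagonal
# entry -sum_{y != x} q(x, y) supplies the -f(x) term.
print(Q @ f)                    # rate of change of E_x[f(X_t)] at t = 0
```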

Theorem 1.15 (Kolmogorov's backward equation). If t → q_t(x, ·) is continuous in total variation norm for all x ∈ X, then the transition kernels p_{s,t} of the Markov jump process constructed above are the minimal solutions of the backward equation

\[ -\frac{\partial}{\partial s}(p_{s,t} f)(x) = (A_s p_{s,t} f)(x) \tag{1.4} \]

for all bounded functions f : X → R, 0 ≤ s ≤ t, with terminal condition

\[ (p_{t,t} f)(x) = f(x). \]

Remark 1.16.

1. A_t is a linear operator on functions f : X → R.

2. (1.4) describes the backward evolution of the expectation values E_{(s,x)}[f(X_t)], respectively the probabilities P_{(s,x)}[X_t ∈ B], when varying the starting time s.

3. In a discrete state space, (1.4) reduces to

\[ -\frac{\partial}{\partial s}\, p_{s,t}(x, z) = \sum_{y \in X} A_s(x, y)\, p_{s,t}(y, z), \qquad p_{t,t}(x, z) = \delta_{xz}, \]

a system of ordinary differential equations. For X finite,

\[ p_{s,t} = \exp\left( \int_s^t A_r \, dr \right) = \sum_{n=0}^{\infty} \frac{1}{n!} \left( \int_s^t A_r \, dr \right)^n \]

is the unique solution. If X is infinite, the solution is not necessarily unique (hence the process is not unique).
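In the time-homogeneous case A_r ≡ Q, the integral collapses to (t − s)Q and the series above is the matrix exponential, which can be evaluated numerically. A quick check, assuming SciPy is available (the generator is illustrative):

```python
import numpy as np
from scipy.linalg import expm

Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])     # constant generator, A_r = Q for all r

s, t = 0.0, 1.5
P = expm((t - s) * Q)            # p_{s,t} = exp((t - s) Q)

print(P)                         # transition matrix
print(P.sum(axis=1))             # each row sums to 1
```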

Theorem 1.17 (Kolmogorov's forward equation). The forward equation

\[ \frac{d}{dt}(p_{s,t} f)(x) = (p_{s,t} A_t f)(x), \qquad (p_{s,s} f)(x) = f(x), \tag{1.5} \]

holds for all 0 ≤ s ≤ t, x ∈ X and all bounded functions f : X → R such that t → (q_t f)(x) and t → λ_t(x) are continuous for all x.

Corollary 1.18 (Fokker-Planck equation). Under the assumptions of the Theorem,

\[ \frac{d}{dt}(\mu_t, f) = (\mu_t, A_t f) \tag{1.6} \]

for all t ≥ s and bounded functions f : X → R such that t → λ_t is pointwise continuous. One sometimes writes

\[ \frac{d}{dt}\mu_t = A_t^* \mu_t. \]


1.1.6 Markov Evolution and Propagators

We start with the definition of propagators. For a set S, a family of mappings U^{t,r} from S to itself, parametrized by pairs of numbers r ≤ t (respectively t ≤ r) from a given finite interval, is called a (forward) propagator (respectively a backward propagator) in S if U^{t,t} is the identity operator in S for all t and the following chain rule, or propagator equation, holds for r ≤ s ≤ t (respectively for t ≤ s ≤ r):

\[ U^{t,s} U^{s,r} = U^{t,r}. \]

A backward propagator U^{t,r} of bounded linear operators on a Banach space B is called strongly continuous if the operators U^{t,r} depend strongly continuously on t and r.

Suppose U^{t,r} is a strongly continuous backward propagator of bounded linear operators on a Banach space B with a common invariant domain D. Let L_t, t ≥ 0, be a family of bounded operators D → B depending continuously on t. We say that the family L_t generates U^{t,r} on D if, for any f ∈ D, the equations

\[ \frac{d}{ds} U^{t,s} f = U^{t,s} L_s f, \qquad \frac{d}{ds} U^{s,r} f = -L_s U^{s,r} f, \qquad t \le s \le r, \]

hold, where the derivatives exist in the Banach topology of B.

One needs to estimate the difference between two propagators when the difference between their generators is available, which leads to the following result.

Proposition 1.19. Let D and B, D ⊂ B, be two Banach spaces equipped with a continuous inclusion, and let L_{i,t}, i = 1, 2, t ≥ 0, be two families of bounded linear operators D → B, continuous in time t. Assume moreover that U_i^{t,r} are two propagators in B generated by L_i, i = 1, 2, respectively, i.e. satisfying

\[ \frac{d}{ds} U_i^{t,s} f = U_i^{t,s} L_{i,s} f, \qquad \frac{d}{ds} U_i^{s,r} f = -L_{i,s} U_i^{s,r} f, \qquad t \le s \le r, \tag{1.7} \]

for any f ∈ D, and which satisfy ‖U_i^{t,r}‖_B ≤ c_1, i = 1, 2. Moreover, let D be invariant under U_1^{t,s} with ‖U_1^{t,s}‖_D ≤ c_2. Then we have

i) \[ U_2^{t,r} - U_1^{t,r} = \int_t^r U_2^{t,s} (L_{2,s} - L_{1,s}) U_1^{s,r} \, ds, \tag{1.8} \]

ii) \[ \| U_2^{t,r} - U_1^{t,r} \|_{D \to B} \le c_1 c_2 (r - t) \sup_{t \le s \le r} \| L_{2,s} - L_{1,s} \|_{D \to B}. \]

Proof. By (1.7), we have

\[ U_2^{t,r} - U_1^{t,r} = \Big[ U_2^{t,s} U_1^{s,r} \Big]_{s=t}^{s=r} = \int_t^r \frac{d}{ds} \big( U_2^{t,s} U_1^{s,r} \big)\, ds = \int_t^r \big( U_2^{t,s} L_{2,s} U_1^{s,r} - U_2^{t,s} L_{1,s} U_1^{s,r} \big)\, ds = \int_t^r U_2^{t,s} (L_{2,s} - L_{1,s}) U_1^{s,r} \, ds, \]

which proves i); the estimate ii) follows by bounding the integrand.

With each Markov process X_t we associate a family of operators (T_{t,s}, 0 ≤ t ≤ s < ∞) on L^∞(X) by the prescription

\[ (T_{t,s} f)(x) = E(f(X_s) \mid X_t = x) \]

for each f ∈ L^∞(X), x ∈ X. We recall that I is the identity operator, If = f for each f ∈ L^∞(X).

Theorem 1.20.

1. T_{t,s} is a linear operator on L^∞(X) for each 0 ≤ t ≤ s < ∞.

2. T_{s,s} = I for each s ≥ 0.

3. T_{r,t} T_{t,s} = T_{r,s} whenever 0 ≤ r ≤ t ≤ s < ∞.

4. f ≥ 0 ⇒ T_{t,s} f ≥ 0 for all 0 ≤ t ≤ s < ∞, f ∈ L^∞(X).

5. T_{t,s} is a contraction, i.e. ‖T_{t,s}‖ ≤ 1 for each 0 ≤ t ≤ s < ∞.

6. T_{t,s}(1) = 1 for all 0 ≤ t ≤ s < ∞.

Any family satisfying (1) to (6) of Theorem 1.20 is called a Markov evolution or Markov propagator. It is obvious from the above that the following holds:

\[ p_{t,s}(x, A) = (T_{t,s} \chi_A)(x) = P[X_s \in A \mid X_t = x]. \]

By the properties of conditional probability, each p_{t,s}(x, ·) is a probability measure, where p_{t,s}(x, ·) are the transition probabilities defined above, and we have the following:

\[ (T_{t,s} f)(x) = \int_X f(y)\, p_{t,s}(x, dy). \tag{1.9} \]

Definition 1.21. A Markov propagator is said to be strongly continuous if, for each 0 ≤ t ≤ s < ∞ and f ∈ L^∞(X),

\[ \lim_{t \to s} \| T_{t,s} f - f \| = 0. \tag{1.10} \]

Now let us define

\[ D_A = \left\{ \psi \in B : \exists \varphi_\psi \in B \text{ such that } \lim_{t \to 0} \left\| \frac{T_{s,s+t}\, \psi - \psi}{t} - \varphi_\psi \right\| = 0 \right\}, \]

and let us define the operator A in B by the prescription

\[ A\psi = \varphi_\psi. \]

Then A is the generator of the propagator T_{s,t}.

Theorem 1.22. The generator A has the following properties:

1. D_A is dense in B.

2. T_t D_A ⊆ D_A for each t ≥ 0.

3. T_t A ψ = A T_t ψ for each t ≥ 0, ψ ∈ D_A.

Theorem 1.23. A is closed.

Definition 1.24. The following are the basic definitions related to the generators of Markov processes. One says that an operator A in C(R^d), defined on a domain D_A,

• is conditionally positive if Af(x) ≥ 0 for any f ∈ D_A such that f(x) = 0 = min_y f(y);

• satisfies the positive maximum principle (PMP) if Af(x) ≤ 0 for any f ∈ D_A such that f(x) = max_y f(y) ≥ 0;

• is dissipative if ‖(λ − A)f‖ ≥ λ‖f‖ for λ > 0, f ∈ D_A.

Theorem 1.25. Let A be the generator of a Feller semigroup Φ_t. Then

• A is conditionally positive,

• A satisfies the PMP on D_A.

If moreover A is local and D_A contains C_c^∞, then it is locally conditionally positive and satisfies the local PMP on C_c^∞, where C_c is the space of continuous functions with compact support.

Corollary 1.26. Let B be a Banach space and let A : B → B be a bounded dissipative linear operator. Then A generates a strongly continuous contraction semigroup (T_t)_{t≥0} on B, which is given by

\[ T_t f = e^{At} f := \sum_{n=0}^{\infty} \frac{1}{n!} (At)^n f \qquad (t \ge 0). \tag{1.11} \]

Proposition 1.27. Let X be a locally compact metric space and A a bounded conditionally positive operator from C(X) to B(X). Then there exists a bounded transition kernel ν(x, dy) in X with ν(x, {x}) = 0 for all x, and a function a(x) ∈ B(X), such that

\[ Af(x) = \int_X f(z)\, \nu(x, dz) - a(x) f(x). \tag{1.12} \]

Conversely, if A is of this form, then it is a bounded conditionally positive operator C(X) → B(X). Here C(X) is the space of continuous functions that vanish at infinity.


Theorem 1.28. Let ν(x, dy) be a weakly continuous, uniformly bounded transition kernel in a complete metric space X such that ν(x, {x}) = 0, and let a ∈ C(X). Then the operator (1.12) has C(X) as its domain and generates a strongly continuous semigroup T_t in C(X) that preserves positivity and is given by transition kernels p_t(x, dy):

\[ T_t f(x) = \int p_t(x, dy)\, f(y). \tag{1.13} \]

In particular, if a(x) = ν(x, X), then T_t 1 = 1 and T_t is the semigroup of a Markov process that we shall call a pure jump or jump-type Markov process.

Theorem 1.29. Let ν(x, dy) be a weakly continuous, uniformly bounded transition kernel in a metric space X such that ν(x, {x}) = 0. Let a(x) = ν(x, X). Define the following process X_t^x: starting at a point x, the process remains there for a random a(x)-exponential time τ, i.e. this time is distributed according to P(τ > t) = exp[−t a(x)], and then jumps to a point y ∈ X distributed according to the probability law ν(x, ·)/a(x). Then the procedure is repeated.

Remark 1.30. If X in Theorem 1.28 is locally compact and a bounded ν (depending weakly continuously on x) is such that lim_{x→∞} ∫_K ν(x, dy) = 0 for any compact set K, then the operator of the form (1.12) preserves the space C(X) and hence generates a Feller semigroup.

In order to deal with one of the most important classes of Markov processes, we make the following definition.

Definition 1.31. A strongly continuous propagator of positive linear contractions on C(X) is called a Feller propagator. A Markov process in a locally compact metric space X is called a Feller process if its Markov propagator reduced to C(X) is a Feller propagator, i.e. it preserves C(X) and is strongly continuous there.

Theorem 1.32. Every Lévy process is a Feller process.

Theorem 1.33. If Φ is a Feller propagator, then the dual propagator Φ* on M(X) is a positivity-preserving propagator of contractions depending continuously on t and s, where M(X) is the set of bounded signed Borel measures on X.

The proofs of the theorems and propositions above can be found in [2, 10, 14] and [15].

1.1.7 Law of Large Numbers for Empirical Measures

Let P(X) denote the space of probability vectors on X. We are given a family of jump rate matrices, or pure jump Markov operators, A_t(x, y, p) indexed by p ∈ P(X), and assume for simplicity that the map p → A_t(p) is Lipschitz continuous. Given the states X_1^N(t), ..., X_N^N(t) of N particles at any time t, the empirical measure has the form

\[ \mu_t^N := \frac{1}{N} \sum_{i=1}^N \delta_{X_i^N(t)}, \qquad t \ge 0. \tag{1.14} \]

Theorem 1.34. Suppose that A_t(x, y, ·) is Lipschitz continuous for all x, y ∈ X, and assume that μ_0^N converges in probability to q ∈ P(X) as N tends to infinity. Then (μ_t^N)_{N∈N} converges, uniformly on compact time intervals and in probability, to μ_t, where μ_t is the unique solution of equation (1.6) with μ_0 = q.

For more details see [8] and [20].
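A quick simulation illustrates the statement in the simplest special case where the rates do not depend on the empirical measure, so that the limit equation is linear and its solution is the exact law e^{TQ}. All data below is illustrative:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])
T, N = 1.0, 10_000

def sample_state(x, t_end):
    """One continuous-time chain with generator Q, run from state x to t_end."""
    t = 0.0
    while True:
        t += rng.exponential(1.0 / -Q[x, x])    # holding time in state x
        if t >= t_end:
            return x
        p = Q[x].clip(min=0.0)                  # off-diagonal rates
        x = rng.choice(len(p), p=p / p.sum())   # jump distribution

states = np.array([sample_state(0, T) for _ in range(N)])
mu_N = np.bincount(states, minlength=2) / N     # empirical measure mu_T^N
mu = expm(T * Q)[0]                             # exact law of X_T given X_0 = 0
print(mu_N, mu)                                 # close for large N
```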

1.2 Initial Value Problems for Ordinary Differential Equations

Let E be a complete normed space (Banach space) and U an open subset of E. Assume that J ⊂ R is an open interval containing 0 (for the sake of simplification), and let

f : J × U → E.

By an integral curve with initial condition x_0 we mean a mapping

X : J → U

such that X is differentiable, X(0) = x_0, and

\[ \dot{X}(t) = f(t, X(t)) \]

for all t ∈ J. We define a local flow for f at x_0 to be a mapping

X : J × U_0 → U,

where U_0 is an open subset of U containing x_0, such that for each x ∈ U_0 the map

t → X_x(t) = X(t, x)

is an integral curve for f with initial condition x.

Theorem 1.35. Let J be an open interval containing 0, let U be open in E, and let x_0 ∈ U. Let 0 < a < 1 be such that the closed ball B̄_{2a}(x_0) is contained in U. Let

f : J × U → E

be a continuous map, bounded by a constant C > 0 and satisfying a Lipschitz condition on U with Lipschitz constant K uniformly with respect to J. If b < a/C and b < 1/K, then there exists a unique flow

X : J_b × B_a(x_0) → U.

If f is of class C^p, then so is each integral curve X.
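Numerically, integral curves are what initial value problem solvers produce. A minimal sketch, assuming SciPy is available, for a right-hand side that is Lipschitz in x:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Integral curve of X' = f(t, X) with f(t, x) = -x + sin(t),
# which is globally Lipschitz in x with constant K = 1.
f = lambda t, x: -x + np.sin(t)

sol = solve_ivp(f, t_span=(0.0, 10.0), y0=[1.0], dense_output=True)
print(sol.y[0, -1])      # X(10) with X(0) = 1
print(sol.sol(5.0))      # the flow evaluated at t = 5
```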


By the definition of the integral curve, and with the new notation D_1 := ∂/∂t, we have

\[ D_1 X(t, x) = f(t, X(t, x)). \]

We now investigate regularity of the flow in the initial value.

Theorem 1.36. Let J be an open interval containing 0, and let U be open in E. Let

f : J × U → E

be a C^p map with p ≥ 1 (possibly p = ∞), and let x_0 ∈ U. There exists a unique local flow for f at x_0. We can select an open subinterval J_0 of J containing 0 such that the unique local flow

X : J_0 × U_0 → U

is of class C^p, and such that D_2 X(t, x) := (∂/∂x) X(t, x) satisfies the differential equation

\[ D_1 D_2 X(t, x) = D_2 f(t, X(t, x))\, D_2 X(t, x) \]

on J_0 × U_0 with initial condition D_2 X(0, x) = id.

Theorem 1.37. Let J be an open interval containing 0, and let U be open in E. Let

f : J × U → E

be continuous in t and C¹ in x on U, and let x_0 ∈ U. Then there exists a unique local flow for f at x_0 such that the integral curve X(t, x) is of class C¹ in both t and x.

The proofs of the above theorems can be found in [18].

1.2.1 Continuous Dependence on Parameter and Differentiability

Let, as before, J denote an open interval in R and U an open subset of the Banach space E. Moreover, let Λ denote a metric space. We are going to study the parameter-dependent initial value problem

\[ \dot{X} = f(t, x, \lambda), \qquad X(\tau) = \xi. \]

Theorem 1.38. Assume that Λ is as above, that M := U × Λ, and that f is Lipschitz continuous in x and in the parameter λ. Then there exists a solution flow which is Lipschitz continuous uniformly in all variables.

Let us now recall results about the differentiability of the solution flow. We now assume that the function f is differentiable in x and in the parameter λ. If we differentiate the equation

\[ \dot{X} = f(t, x, \lambda) \]

with respect to the initial data ξ, we get the following linear equation:

\[ \dot{z} = D_2 f(t, X(t, \tau, \xi, \lambda))\, z. \]

Theorem 1.39. Let the function f be of type C¹ in the variable x and the parameter λ. Then the solution flow X(τ, t, ξ, λ) is of type C¹ in all variables, and ∂X/∂ξ is a solution of the linearized initial value problem

\[ \dot{z} = D_2 f(t, X(t, \tau, \xi, \lambda))\, z, \qquad z(\tau) = I. \]

Theorem 1.40. Let the function f depend continuously on t and be of type C^m in x ∈ U and λ ∈ Λ. Then the solution flow X(τ, t, ξ, λ) is of type C^m in τ, ξ and the parameter λ.

The proofs of the above theorems can be found in [18] and [1].

1.2.2 Linearization of Ordinary Differential Equations

Definition 1.41. Let β(t, s), 0 ≤ t ≤ s ≤ T, be a family of non-singular transformations in a Banach space X with Borel σ-algebra F and measure μ. Recall that β(t, s) is non-singular if and only if for every A ∈ F such that μ(A) = 0 we have μ(β⁻¹(t, s)A) = 0.

Further, let F ∈ L^∞(X). Then the family of operators Φ_{t,s} : L^∞(X) → L^∞(X) defined by

\[ (\Phi_{t,s} F)(x) = F(\beta(s, t)(x)), \qquad 0 \le t \le s \le T, \tag{1.15} \]

is called the Koopman propagator with respect to β(t, s). Traditionally the Koopman propagator is called the Koopman operator.

It can be shown, see [17], that the Koopman operator has the following properties:

• Φ_{t,s} is a linear operator.

• Φ_{t,s} is a contraction on L^∞(X), i.e. ‖Φ_{t,s} F‖_{L^∞(X)} ≤ ‖F‖_{L^∞(X)} for all F ∈ L^∞(X).

Being a contraction, the Koopman operator is bounded. It is also straightforward to show that Φ_{t,s} is a time-inhomogeneous propagator, see [17].
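As a toy illustration of (1.15), take for β the explicit solution flow of the linear ODE ẋ = −x, so that β(s, t)(x) = x e^{−(s−t)}; the Koopman operator then acts on observables simply by composition with the flow. The sketch below is ours and purely illustrative; note that composition cannot increase the supremum norm, which is the contraction property:

```python
import numpy as np

def koopman(F, t, s):
    """Phi_{t,s} F for the flow beta(s, t)(x) = x * exp(-(s - t))
    of the ODE x' = -x: the observable x -> F(beta(s, t)(x))."""
    return lambda x: F(x * np.exp(-(s - t)))

F = lambda x: np.sin(x)           # a bounded observable
PhiF = koopman(F, t=0.0, s=1.0)   # Phi_{0,1} F

# By construction (Phi F)(x) = F(beta(1, 0)(x)):
print(PhiF(2.0), F(2.0 * np.exp(-1.0)))
```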

1.3 Optimal Control

Throughout this section we suppose that (Ω, F, P) is a probability space with a filtration {F_t, t ≥ 0} satisfying the usual conditions.

Definition 1.42 (Control Processes). Given a subset U of R^m, we denote by U_0 the set of all measurable processes u = {u_t, t ≥ 0} valued in U. The elements of U_0 are called control processes.

In most cases it is natural to require that the control process u is adapted to the process X. We then define the control process u by

u_t = u(t, X_t).


1.3.1 Optimal Control for Diffusions

Definition 1.43 (Controlled Diffusion Processes). Let

b : (t, x, u) ∈ R₊ × X × U → b(t, x, u) ∈ X

and

σ : (t, x, u) ∈ R₊ × X × U → σ(t, x, u) ∈ X^d

be two functions. For a given point x ∈ X we consider the following controlled stochastic differential equation:

\[ dX_s = b(s, X_s, u(s, X_s))\, ds + \sigma(s, X_s, u(s, X_s))\, dB_s, \qquad X_t = x, \tag{1.16} \]

where B = {B_t, t ≥ 0} is a Brownian motion valued in X^d defined on (Ω, F, {F_t}, P).

In most concrete cases we also have to satisfy some control constraints; we model this by taking as given a fixed subset U ⊆ R^m and requiring that u_t ∈ U for each t. We can now define the class of admissible control laws.

Definition 1.44. A control law u is called admissible if

• u(t, x) ∈ U for all t ∈ R₊ and all x ∈ X;

• for any given initial point (t, x) the SDE (1.16) has a unique solution.

The class of admissible control laws is denoted by U ⊂ U_0.

Definition 1.45. The control problem is defined as the problem to maximize

\[ E_{t,x}\left[ \int_t^T J(s, X_s^u, u_s)\, ds + G(X_T^u) \right], \tag{1.17} \]

given the dynamics (1.16) and the constraints

\[ u(s, y) \in U, \qquad \forall (s, y) \in [t, T] \times X. \tag{1.18} \]

Definition 1.46. The value function

V : [0, T] × X → R

is defined by

\[ V(t, x) = \sup_{u \in \mathcal{U}} E_{t,x}\left[ \int_t^T J(s, X_s^u, u_s)\, ds + G(X_T^u) \right], \]

given the dynamics (1.16).

We now make the following assumptions:

1. There exists an optimal control law û.

2. The optimal value function V is regular in the sense that V ∈ C^{1,2}.

3. A number of limiting procedures in the following arguments can be justified.

Theorem 1.47 (Hamilton-Jacobi-Bellman equation). Under the assumptions above, the following hold:

1. V satisfies the Hamilton-Jacobi-Bellman equation

\[ \begin{cases} \dfrac{\partial V}{\partial t}(t, x) + \sup_{u \in U} \{ J(t, x, u) + \mathcal{A}^u V(t, x) \} = 0, & \forall (t, x) \in (0, T) \times X, \\ V(T, x) = G(x), & \forall x \in X. \end{cases} \]

2. For each (t, x) ∈ [0, T] × X the supremum in the HJB equation above is attained at u = û(t, x).

2. For each(t, x) ∈ [0, T ]×X the supremum for the HJB equation above is attained at u= ˆu(t, x).

Theorem 1.48 (Verification theorem). Suppose that we have two functions H(t, x) and g(t, x) such that

• H is sufficiently integrable and solves the HJB equation

\[ \begin{cases} \dfrac{\partial H}{\partial t}(t, x) + \sup_{u \in U} \{ J(t, x, u) + \mathcal{A}^u H(t, x) \} = 0, & \forall (t, x) \in (0, T) \times X, \\ H(T, x) = G(x), & \forall x \in X. \end{cases} \]

• The function g is an admissible control law.

• For each fixed (t, x), the supremum in the expression

\[ \sup_{u \in U} \{ J(t, x, u) + \mathcal{A}^u H(t, x) \} \]

is attained by the choice u = g(t, x).

Then the following hold:

1. The optimal value function V of the control problem is given by V(t, x) = H(t, x).

2. There exists an optimal control law û, and in fact û(t, x) = g(t, x).


1.3.2 Optimal Control for Markov Jump Processes

In this subsection we describe the principle of dynamic programming for a finite state Markov jump process X_t, t ∈ [0, T], and the corresponding HJB equation for finding the optimal control strategy. We start by recalling the pure jump Markov process on a locally compact space X specified by the integral generator of the form

\[ A_t f(x) = \int_X (f(y) - f(x))\, \nu(t, x, dy) \tag{1.19} \]

with bounded kernel ν.

Definition 1.49 (Controlled Jump Processes). Assuming that we can control the jumps of this process, i.e. that the measure ν depends on the control u_t, we consider the following controlled dynamics:

\[ \frac{d}{dt}(\mu_t, f) = (\mu_t, A[t, u_t] f). \tag{1.20} \]

Suppose that the value function V : [0, T] × X × P(R^k) → R, starting at time t and position x, is defined as

\[ V(t, x, \mu_s) := \sup_{u_\cdot \in U} E_x\left[ \int_t^T J(s, X_s, \mu_s, u_s)\, ds + V_T(X_T, \mu_T) \right], \]

where μ ∈ P(R^k), the set of probability measures.

In this subsection, for simplicity, we fix all parameters but u and drop them.

Definition. Let V(t, x) be a real-valued function on [0, T] × X. We define a linear operator L by

\[ \mathcal{L}V(t, x) = \lim_{h \to 0^+} \frac{E_{t,x}\, V(t + h, X(t + h)) - V(t, x)}{h}, \tag{1.21} \]

provided the limit exists for each x ∈ X and each t.

Let D(L) be the domain of the operator L, and assume moreover that the following hold for each V ∈ D(L):

• V, ∂V/∂t and LV are continuous on [0, T] × X;

• E_{t,x}|V(s, X(s))| < ∞ and E_{t,x} ∫_t^s |LV(r, X(r))| dr < ∞ for all t ≤ s ∈ [0, T];

• (Dynkin formula) For t ≤ s,

\[ E_{t,x}\, V(s, X(s)) - V(t, x) = E_{t,x} \int_t^s \mathcal{L} V(r, X(r)) \, dr. \]

Proposition 1.50. Let V(t, x) be of class C¹([0, T] × X). Then V(t, x) ∈ D(L) and

\[ \mathcal{L}V(t, x) = \frac{\partial V}{\partial t}(t, x) + A[t, u] V(t, x), \]

where A[t, u] is the generator of the jump non-linear time-inhomogeneous Markov process.

A proof of the above Proposition can be found in [12].

The dynamic programming equation is derived by the following procedure. If we take a constant control u(s) = u for s ∈ [t, t + h], followed by optimal behavior, then

\[ V(t, x) \ge E_x\left[ \int_t^{t+h} J(s, X_s, u)\, ds + V(t + h, X_{t+h}) \right]. \]

If we deduct V(t, x) from both sides, divide by h, let h → 0 and use Dynkin's formula, we get

\[ \lim_{h \to 0} h^{-1} E_x\left[ \int_t^{t+h} J(s, X_s, u)\, ds \right] = J(t, x, u), \]
\[ \lim_{h \to 0} h^{-1} E_x\left[ V(t + h, X_{t+h}) - V(t, x) \right] = \mathcal{L}[t, u] V(t, x). \]

Therefore, for all u ∈ U we have

\[ 0 \ge \mathcal{L}[t, u] V(t, x) + J(t, x, u). \]

If û is the optimal control strategy, we have

\[ V(t, x) = E_x\left[ \int_t^T J(s, X_s, \hat{u}_s)\, ds + V_T(X_T) \right], \qquad 0 = \mathcal{L}[t, \hat{u}] V(t, x) + J(t, x, \hat{u}), \]

which leads to the dynamic programming equation

\[ 0 = \sup_{u \in U} \big[ \mathcal{L}[t, u] V(t, x) + J(t, x, u) \big]. \]

If x_t is a controlled Markov chain with finite state space and jump rates ν(t, x, y, u), then, using the above Proposition, the HJB equation becomes the following system of ordinary differential equations, see [12]:

\[ \frac{\partial V}{\partial t} + \max_u \big[ J(t, x, u) + A[t, u] V \big] = 0. \tag{1.22} \]
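Since (1.22) is a system of ordinary differential equations indexed by the finite state space, it can be integrated backward from the terminal condition by any ODE scheme, performing the pointwise maximization over u at every step. A minimal explicit-Euler sketch for a hypothetical two-state model; all model data is illustrative:

```python
import numpy as np

# Backward integration of dV/dt + max_u [ J(t,x,u) + (A[u]V)(x) ] = 0,
# terminal condition V(T, .) = G, on a two-state space.
controls = np.linspace(0.0, 1.0, 11)
G = np.array([0.0, 1.0])
T, n_steps = 1.0, 1000
dt = T / n_steps

def A(u):
    """Controlled generator: the control scales the 0 -> 1 jump rate."""
    return np.array([[-(1.0 + u), 1.0 + u],
                     [2.0, -2.0]])

def J(x, u):
    return -0.5 * u**2 + (1.0 if x == 0 else 0.0)   # running reward

V = G.copy()
for _ in range(n_steps):
    # Hamiltonian: pointwise maximization over the control set
    H = np.array([max(J(x, u) + A(u)[x] @ V for u in controls)
                  for x in range(2)])
    V = V + dt * H          # Euler step backward in time: dV/dt = -H
print(V)                    # approximation of V(0, .)
```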

Next we present a well-posedness result and a verification theorem for the HJB equation in the space of bounded continuous functions C_b(X) for the case of a non-homogeneous pure jump Markov process. Let us first formulate the assumptions that we need:

1. The set U is compact.

3. The cost function J(t, x, u) ∈ C_b(X).

4. The terminal value function G(X_T) ∈ C_b(X).

Theorem 1.51. Under the above assumptions there exists a unique solution v to the HJB equation (1.22).

Theorem 1.52. Under the above assumptions the unique solution v to the HJB equation (1.22) coincides with the value function V(t, x). Moreover, there exists an optimal control u, given by any function satisfying

\[ A[t, u(t, x)]\, v(t, x) + J(t, x, u(t, x)) = \sup_{u \in U} \big( A[t, u]\, v(t, x) + J(t, x, u) \big). \]

For the proof of the above Theorems see [3].

The optimal control u* in the above Theorem may be discontinuous. If the maximum is achieved at only one point for every (t, x), then it follows from the compactness of U that the optimal control u* is continuous. Sufficient additional conditions for such a unique maximum are the strict concavity in u of the function Θ(t, x, u) = A[t, u]V(t, x) + J(t, x, u) and the convexity of U. Under somewhat stronger assumptions we show next the Lipschitz continuity of the optimal control u*.

Lemma 1.53. Let the assumptions above be fulfilled, and let the function Θ(t, x, ·) satisfy the following:

1. Θ(t, x, ·) is C² for each (t, x) ∈ [0, T] × X.

2. Θ_u(t, ·, u) satisfies on X a Lipschitz condition, uniformly with respect to t, u.

3. The absolute values of the eigenvalues of the matrices Θ_uu are bounded from below by some γ > 0.

Then the optimal control u*(t, ·) is Lipschitz continuous on X, uniformly with respect to t.

Proof. It is clear that the assumptions above on the function Θ(t, x, u) are inherited by the cost function J(t, x, u) and the jump coefficient ν(t, x, u). We divide the proof into two steps.

Assumption 3 means that the cost function J(t, x, ·) is a strictly concave function, hence it has a unique maximum on the compact set U. Given t ∈ [0, T] and x_1, x_2 ∈ X, let

u_1 = u*(t, x_1), u_2 = u*(t, x_2).

From 1, 3 and the Taylor formula we have

\[ J(t, x_1, u_2) - J(t, x_1, u_1) \le J_u(t, x_1, u_1)(u_2 - u_1) - \frac{\gamma}{2}|u_2 - u_1|^2. \]

Due to the fact that the cost function reaches its maximum at u_1 on the convex set U, the first term on the right-hand side of the above inequality is non-positive. Applying the integral form of the mean value theorem to the left-hand side, we obtain

\[ \int_0^1 J_u(P_1(\lambda))(u_1 - u_2)\, d\lambda \ge \frac{\gamma}{2}|u_2 - u_1|^2, \tag{1.23} \]

where P_1(λ) = (t, x_1, u_1 + λ(u_2 − u_1)). Likewise, if we exchange x_1 by x_2, we get

\[ \int_0^1 J_u(P_2(\lambda))(u_2 - u_1)\, d\lambda \ge \frac{\gamma}{2}|u_2 - u_1|^2, \tag{1.24} \]

where P_2(λ) = (t, x_2, u_1 + λ(u_2 − u_1)). Adding (1.23) and (1.24) together we get

\[ \int_0^1 \big[ J_u(P_2(\lambda)) - J_u(P_1(\lambda)) \big](u_2 - u_1)\, d\lambda \ge \gamma |u_2 - u_1|^2. \]

By Cauchy's inequality,

\[ \left| \int_0^1 \big[ J_u(P_2(\lambda)) - J_u(P_1(\lambda)) \big]\, d\lambda \right| \ge \gamma |u_2 - u_1|. \]

From assumption 2 and |P_1(λ) − P_2(λ)| = |x_1 − x_2|,

\[ |J_u(P_1(\lambda)) - J_u(P_2(\lambda))| \le C |x_1 - x_2|, \]

where C is the Lipschitz constant of J_u(t, ·, u). Hence

\[ C|x_2 - x_1| \ge \gamma |u_2 - u_1|, \qquad \text{i.e.} \qquad |u^*(t, x_2) - u^*(t, x_1)| \le \frac{C}{\gamma}|x_2 - x_1|. \]

Repeating the same line of argument for the jump coefficient ν(t, x, u) guarantees the claimed regularity property.

For more details and discussions on the continuity, existence and uniqueness of the optimal control see [11].

1.4 Differential Game Theory

A non-cooperative game with an arbitrary finite number of players A, B, C, ... in normal form can be described by the sets S_A, S_B, S_C, ... of possible strategies of these players and by their payoff functions

\[ \Pi_A(s_A; s_B; s_C; \dots), \quad \Pi_B(s_A; s_B; s_C; \dots), \quad \Pi_C(s_A; s_B; s_C; \dots), \quad \dots \]

These functions specify the payoffs of the players A, B, C, ... for an arbitrary profile (s_A; s_B; s_C; ...), where a profile (or a situation) is any collection of strategies s_A from S_A, s_B from S_B, s_C from S_C, etc.

Strictly and weakly dominated strategies, dominant strategies and Nash equilibria are defined in the same way as for two players. In particular, a situation (s*_A, s*_B, s*_C, ...) is called a Nash equilibrium if none of the players can win by deviating from this situation, or, in other words, if the strategy s*_A is the best reply to the collection of strategies s*_B, s*_C, ..., the strategy s*_B is the best reply to the collection of strategies s*_A, s*_C, ..., etc. In formal language this means that

\[ \Pi_A(s^*_A; s^*_B; s^*_C; \dots) \ge \Pi_A(s_A; s^*_B; s^*_C; \dots) \]

for all s_A from S_A,

\[ \Pi_B(s^*_A; s^*_B; s^*_C; \dots) \ge \Pi_B(s^*_A; s_B; s^*_C; \dots) \]

for all s_B from S_B, etc.

Differential games, or continuous-time infinite dynamic games, study a class of decision problems in which the evolution of the state is described by a differential equation and the players act throughout a time interval.

In particular, in the general n-person differential game, Player i seeks to

\[ \max_{u_i} \int_{t_0}^T J^i[s, x(s), u_1(s), u_2(s), \dots, u_n(s)]\, ds + G^i(x(T)), \qquad \text{for } i \in N = \{1, 2, \dots, n\}, \]

subject to the dynamics

\[ \dot{x}(s) = f[s, x(s), u_1(s), u_2(s), \dots, u_n(s)], \qquad x(t_0) = x_0, \]

where x(s) ∈ X ⊂ R^m denotes the state variables of the game and u_i ∈ U_i is the control of Player i, for i ∈ N. The functions f[s, x, u_1, ..., u_n], J^i[s, ·, u_1, ..., u_n] and G^i(·), for i ∈ N and s ∈ [t_0, T], are differentiable functions. A set-valued function η^i(·) is defined for each i ∈ N as

\[ \eta^i(s) = \{ x(t),\ t_0 \le t \le \varepsilon^i_s \}, \qquad t_0 \le \varepsilon^i_s \le s, \]

where ε^i_s is nondecreasing in s, and η^i(s) determines the state information gained and recalled by Player i at time s ∈ [t_0, T]. Specification of η^i(·) (in fact, of ε^i_s in this formulation) characterizes the information structure of Player i, and the collection (over i ∈ N) of these information structures is the information structure of the game.

Definition 1.54. A set of strategies {v_1*(s), v_2*(s), ..., v_n*(s)} is said to constitute a non-cooperative Nash equilibrium solution for the n-person differential game if the following inequalities are satisfied for all v_i(s) ∈ U_i, i ∈ N:

\[ \int_{t_0}^T J^i[s, x^*(s), v_1^*(s), \dots, v_{i-1}^*(s), v_i^*(s), v_{i+1}^*(s), \dots, v_n^*(s)]\, ds + G^i(x^*(T)) \ge \int_{t_0}^T J^i[s, x^{[i]}(s), v_1^*(s), \dots, v_{i-1}^*(s), v_i(s), v_{i+1}^*(s), \dots, v_n^*(s)]\, ds + G^i(x^{[i]}(T)), \]

where on the time interval [t_0, T]

\[ \dot{x}^{[i]}(s) = f[s, x^{[i]}(s), v_1^*(s), \dots, v_{i-1}^*(s), v_i(s), v_{i+1}^*(s), \dots, v_n^*(s)], \qquad x^{[i]}(t_0) = x_0, \]

and x*(·) denotes the state trajectory under the equilibrium strategies. The set of strategies {v_1*(s), v_2*(s), ..., v_n*(s)} is known as a Nash equilibrium of the game.

Definition 1.55 (Open-loop Nash Equilibria). If the players choose to commit to their strategies from the outset, the players' information structure can be seen as an open-loop pattern in which η^i(s) = x_0, s ∈ [t_0, T]. Their strategies become functions of the initial state x_0 and time s, and can be expressed as {u_i(s) = ϑ_i(s, x_0), for i ∈ N}.

Definition 1.56 (Closed-loop Nash Equilibria). Under memoryless perfect state information, the players' information structures follow the pattern η^i(s) = {x_0, x(s)}, s ∈ [t_0, T]. The players' strategies become functions of the initial state x_0, the current state x(s) and the current time s, and can be expressed as {u_i(s) = ϑ_i(s, x, x_0), for i ∈ N}.

Definition 1.57 (Feedback Nash Equilibria). To eliminate information nonuniqueness in the derivation of Nash equilibria, one can constrain the Nash solution further by requiring it to satisfy the feedback Nash equilibrium property. In particular, the players' information structures follow either a closed-loop perfect state (CLPS) pattern, in which η^i(s) = {x(t), t_0 ≤ t ≤ s}, or a memoryless perfect state (MPS) pattern, in which η^i(s) = {x_0, x(s)}. Moreover, we require the following feedback Nash equilibrium condition to be satisfied.

Definition 1.58. For the n-person differential game with MPS or CLPS information, an n-tuple of strategies {u_i*(s) = φ_i*(s, x) ∈ U_i, for i ∈ N} constitutes a feedback Nash equilibrium solution if there exist functionals V^i(t, x) defined on [t_0, T] × R^m and satisfying the following relations for each i ∈ N:

\[ V^i(T, x) = G^i(x), \]
\[ V^i(t, x) = \int_t^T J^i[s, x^*(s), \varphi_1^*(s, \eta_s), \varphi_2^*(s, \eta_s), \dots, \varphi_n^*(s, \eta_s)]\, ds + G^i(x^*(T)) \ge \int_t^T J^i[s, x^{[i]}(s), \varphi_1^*(s, \eta_s), \dots, \varphi_{i-1}^*(s, \eta_s), \varphi_i(s, \eta_s), \varphi_{i+1}^*(s, \eta_s), \dots, \varphi_n^*(s, \eta_s)]\, ds + G^i(x^{[i]}(T)), \qquad \forall \varphi_i(\cdot, \cdot) \in \Gamma^i,\ x \in \mathbb{R}^m, \]

where on the interval [t, T]

\[ \dot{x}^{[i]}(s) = f[s, x^{[i]}(s), \varphi_1^*(s, \eta_s), \dots, \varphi_{i-1}^*(s, \eta_s), \varphi_i(s, \eta_s), \varphi_{i+1}^*(s, \eta_s), \dots, \varphi_n^*(s, \eta_s)], \qquad x^{[i]}(t) = x, \]
\[ \dot{x}^*(s) = f[s, x^*(s), \varphi_1^*(s, \eta_s), \varphi_2^*(s, \eta_s), \dots, \varphi_n^*(s, \eta_s)], \qquad x^*(t) = x, \]

and η_s stands for either the data set {x(s), x_0} or {x(τ), τ ≤ s}, depending on whether the information pattern is MPS or CLPS.

Theorem 1.59. An n-tuple of strategies {u_i*(s) = φ_i*(t, x) ∈ U_i, for i ∈ N} provides a feedback Nash equilibrium solution to the game if there exist continuously differentiable functions V^i(t, x) : [t_0, T] × R^m → R, i ∈ N, satisfying the following set of partial differential equations:

\[ -V^i_t(t, x) = \max_{u_i} \big\{ J^i[t, x, \varphi_1^*(t, x), \dots, \varphi_{i-1}^*(t, x), u_i(t, x), \varphi_{i+1}^*(t, x), \dots, \varphi_n^*(t, x)] + V^i_x(t, x)\, f[t, x, \varphi_1^*(t, x), \dots, \varphi_{i-1}^*(t, x), u_i(t, x), \varphi_{i+1}^*(t, x), \dots, \varphi_n^*(t, x)] \big\} \]
\[ = J^i[t, x, \varphi_1^*(t, x), \dots, \varphi_n^*(t, x)] + V^i_x(t, x)\, f[t, x, \varphi_1^*(t, x), \dots, \varphi_n^*(t, x)], \]
\[ V^i(T, x) = G^i(x), \qquad i \in N. \]

The proof may be found in [7] or [16].

1.5 Mean-Field Games

The aim of this section is to present, in a simplified framework, some of the ideas developed in the Mean Field Games area. The Mean Field Game theory in the diffusion case is well studied in the literature, see [6], [13] and [4]; we leave the jump case to the papers following the introduction. It is not our intention to give a full picture of this fast growing area, but we will try to provide an approach as self-contained as possible. The typical model for Mean Field Games (MFG) is the following system:

\[ \begin{cases} \text{i)} \ \ -\partial_t V - \nu \Delta V + H(x, \mu, DV) = 0, \\ \text{ii)} \ \ \partial_t \mu - \nu \Delta \mu - \operatorname{div}(D_p H(x, \mu, DV)\, \mu) = 0, \\ \ \ \ \mu(0) = \mu_0, \qquad V(x, T) = G(x, \mu(T)), \end{cases} \tag{1.25} \]

where ν is a nonnegative parameter, V : [0, T] × X → R is the value function, μ is the distribution function, and H is the Hamiltonian. The first equation has to be understood backward in time and the second one forward in time. There are two crucial structure conditions for this system. The first one is the convexity of H = H(x, μ, DV) with respect to the last variable; this condition implies that the first equation (a Hamilton-Jacobi-Bellman equation) is associated with an optimal control problem, and its solution is the value function associated with a typical small player. The second structure condition is that μ_0 (and therefore μ(t)) is (the density of) a probability measure.

The heuristic interpretation of this system is the following. An average agent controls the stochastic differential equation

\[ dX_t = a_t\, dt + \sqrt{2\nu}\, dB_t, \tag{1.26} \]

where (B_t) is a standard Brownian motion. She aims at minimizing the quantity

\[ E\left[ \int_0^T J(s, X_s, \mu(s), a_s)\, ds + G(X_T, \mu(T)) \right]. \tag{1.27} \]

Note that in this cost the evolution of the measure μ_s enters as a parameter. The value function of our average player is then given by (1.25(i)). Her optimal control is, at least heuristically, given in feedback form by a*(x, t) = −D_p H(x, μ, DV). Now, if all agents argue in this way, their repartition will move with a velocity which is due, on the one hand, to the diffusion, and, on the other hand, to the drift term −D_p H(x, μ, DV). This leads to the Kolmogorov equation (1.25(ii)).

The Mean Field Game theory developed so far has been focused on two main issues: first, to investigate equations of the form (1.25) and give an interpretation (in economics, for instance) of such systems; second, to analyze differential games with a finite but large number of players and to link their limiting behavior, as the number of players goes to infinity, to equation (1.25).

1.5.1 Symmetric functions of many variables

Let X be a compact metric space and v_N : X^N → R a symmetric function. Let us assume the following:

1. There is some C > 0 such that

\[ \| v_N \|_{L^\infty(X^N)} \le C. \]

2. There is a modulus of continuity w independent of N such that

\[ |v_N(X) - v_N(Y)| \le w(d_1(\mu_X^N, \mu_Y^N)) \qquad \forall X, Y \in X^N, \]

where μ_X^N = (1/N) ∑_{i=1}^N δ_{x_i} and μ_Y^N = (1/N) ∑_{i=1}^N δ_{y_i}.

Theorem 1.60. If the v_N are symmetric and satisfy the assumptions (1) and (2) above, then there is a subsequence (v_{n_k}) of (v_N) and a continuous map U : P(X) → R such that

\[ \lim_{k \to \infty} \sup_{X \in X^{n_k}} \big| v_{n_k}(X) - U(\mu_X^{n_k}) \big| = 0. \]

For a proof of the above Theorem see [6].

1.5.2 Mean-Field Equation

\[ \begin{cases} \text{i)} \ \ -\partial_t V + \tfrac{1}{2} |DV(x, t)|^2 = F(x, \mu(t)), \\ \text{ii)} \ \ \partial_t \mu - \operatorname{div}(DV(x, t)\, \mu(x, t)) = 0, \\ \ \ \ \mu(0) = \mu_0, \qquad V(x, T) = G(x, \mu(T)). \end{cases} \tag{1.28} \]

Our aim is to prove the existence of classical solutions for this system and to give an interpretation in terms of games with finitely many players. Let us briefly recall the heuristic interpretation of this system: the map V is the value function of a typical agent who controls her velocity a(t) and has to minimize her cost

\[ \int_0^T \left( \frac{1}{2}|a(t)|^2 + F(x(t), \mu(t)) \right) dt + G(x(T), \mu(T)), \]

where x(t) = x_0 + ∫_0^t a(s) ds. Her only knowledge of the overall world is the distribution of the other agents, represented by the density μ(t) of some probability measure. Then her feedback strategy, i.e., the way she ideally controls her velocity at each time and at each point, is given by a(x, t) = −DV(x, t). Now if all agents argue in this way, the density μ(x, t) of their distribution μ(t) over the space will evolve in time according to the conservation law (1.28 (ii)).

We then have to prove the existence and uniqueness of a fixed point. We start from a given initial anticipation μ of the overall players' dynamics. Given this anticipation, each player uses a backward reasoning, through the Hamilton-Jacobi-Bellman equation, to obtain his optimal strategy u_i. These optimal strategies can be plugged into the forward Kolmogorov equation to obtain the actual dynamics of the overall community implied by the individual behavior, which is the distribution μ*. Finally, the rational expectation hypothesis requires coherence between the anticipated distribution μ and the resulting one μ*. This forward/backward procedure is the essence of the Mean Field Game theory in continuous time.
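The following sketch mimics this forward/backward fixed-point procedure in the simplest possible setting: a discrete-time, two-state mean-field game in which the backward step is a finite HJB recursion and the forward step pushes the distribution through the resulting optimal policy. The model data (costs, transition law, damping) is entirely hypothetical and serves only to illustrate the loop:

```python
import numpy as np

S, T_steps = 2, 20
controls = np.linspace(0.0, 1.0, 11)
F = lambda x, mu: mu[x]                    # congestion-type running cost
G = lambda x, mu: 0.0                      # terminal cost

def transition(u):
    """Controlled transition matrix; u is the switching probability."""
    return np.array([[1 - u, u],
                     [u, 1 - u]])

mu = np.tile([1.0, 0.0], (T_steps + 1, 1))         # anticipated flow of measures
for _ in range(200):
    # Backward HJB: V[t,x] = min_u { u^2/2 + F(x, mu_t) + E[V[t+1, .]] }
    V = np.zeros((T_steps + 1, S))
    policy = np.zeros((T_steps, S))
    V[-1] = [G(x, mu[-1]) for x in range(S)]
    for t in range(T_steps - 1, -1, -1):
        for x in range(S):
            costs = [0.5 * u**2 + F(x, mu[t]) + transition(u)[x] @ V[t + 1]
                     for u in controls]
            policy[t, x] = controls[int(np.argmin(costs))]
            V[t, x] = min(costs)
    # Forward Kolmogorov: push the distribution through the optimal policy
    new_mu = mu.copy()
    for t in range(T_steps):
        P = np.array([transition(policy[t, x])[x] for x in range(S)])
        new_mu[t + 1] = new_mu[t] @ P
    if np.max(np.abs(new_mu - mu)) < 1e-10:        # rational expectations: fixed point
        break
    mu = 0.5 * mu + 0.5 * new_mu                   # damped update
print(mu[-1])                                      # equilibrium distribution at time T
```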

Let us assume that all measures have a finite first order moment, and let P_1(R^d) be the set of Borel probability measures μ on R^d such that ∫_{R^d} |x| dμ(x) < ∞. The set P_1(R^d) can be endowed with the distance

\[ d_1(\mu, \xi) = \inf_{\gamma \in \Pi(\mu,\xi)} \int |x - y| \, d\gamma(x, y), \]

where Π(μ, ξ) is the set of Borel probability measures on R^{2d} such that γ(A × R^d) = μ(A) and γ(R^d × A) = ξ(A) for any Borel set A ⊂ R^d. We assume that F : R^d × P_1(R^d) → R and G : R^d × P_1(R^d) → R satisfy the following assumptions:

1. F and G are uniformly bounded by C_0 over R^d × P_1(R^d).

2. F and G are Lipschitz continuous, i.e. for all (x_1, μ_1), (x_2, μ_2) ∈ R^d × P_1(R^d) we have

\[ |F(x_1, \mu_1) - F(x_2, \mu_2)| \le C_0 \big[ |x_1 - x_2| + d_1(\mu_1, \mu_2) \big] \]

and

\[ |G(x_1, \mu_1) - G(x_2, \mu_2)| \le C_0 \big[ |x_1 - x_2| + d_1(\mu_1, \mu_2) \big]. \]

3. Finally, we suppose that μ_0 is absolutely continuous, with a density still denoted μ_0, which is Hölder continuous and satisfies ∫_{R^d} |x|^2 μ_0(x) dx < ∞.

A pair (V, μ) is a classical solution to (1.28) if V, μ : R^d × [0, T] → R are continuous, of class C² in space and C¹ in time, and (V, μ) satisfies (1.28) in the classical sense. The main result of this section is the following existence result.

Theorem 1.61. Under the above assumptions, there is at least one classical solution to the mean field equation (1.28).

Let us assume that, besides the assumptions given at the beginning of the section, the following conditions hold:

\[ \int_X (F(x, \mu_1) - F(x, \mu_2))\, d(\mu_1 - \mu_2)(x) > 0 \qquad \forall \mu_1, \mu_2 \in P_1(\mathbb{R}^d),\ \mu_1 \ne \mu_2, \]

and

\[ \int_X (G(x, \mu_1) - G(x, \mu_2))\, d(\mu_1 - \mu_2)(x) \ge 0 \qquad \forall \mu_1, \mu_2 \in P_1(\mathbb{R}^d). \]

Theorem 1.62. Under the above conditions, there is a unique classical solution to the mean field equation (1.28).

Remark 1.63. The case that we treated above, ν = 0 in equation (1.26), is called the first order mean field equation. For ν > 0 a second order equation appears, and solutions in the viscosity and distributional sense to the Mean Field Game arise; for more details see [6].

1.5.3 Application to games with finitely many players

Let us assume that (V, μ) is a solution of the mean field equation (1.28), and let us investigate the optimal strategy for a representative player who takes the density μ of the other players as given. She faces the following minimization problem:

\[ \inf_a J(a), \qquad J(a) = E\left[ \int_0^T \left( \frac{1}{2}|a_s|^2 + F(X_s, \mu_s) \right) ds + G(X_T, \mu_T) \right], \]

where X_t = X_0 + ∫_0^t a_s ds + √2 B_t, X_0 is a fixed random initial condition with law μ_0, and the control a is adapted to some filtration (F_t). We assume that B_t is a d-dimensional Brownian motion adapted to the filtration (F_t) and that X_0 and (B_t) are independent. We claim that the feedback strategy ā(t, x) := −D_x V(t, x) is optimal for this stochastic control problem.

Lemma 1.64. Let (X̄_t) be the solution of the stochastic differential equation

\[ d\bar{X}_t = \bar{a}(t, \bar{X}_t)\, dt + \sqrt{2}\, dB_t, \qquad \bar{X}_0 = X_0, \]

and let â(t) = ā(t, X̄_t). Then

\[ \inf_a J(a) = J(\hat{a}) = \int_{\mathbb{R}^d} V(0, x)\, d\mu_0(x). \]

We will now look at a differential game with N players. In this game, player i (i = 1, ..., N) controls, through her control a^i, a dynamic of the form

\[ dX_t^i = a_t^i\, dt + \sqrt{2}\, dB_t^i. \]
