
Master Thesis Project

Comparison of modes of convergence in a particle system related to the Boltzmann equation

Mikael Petersson


Comparison of modes of convergence in a particle system related to the Boltzmann equation

Mikael Petersson
Matematiska institutionen, Linköpings universitet

LiTH-MAT-EX--2010/27--SE

Examensarbete: 30 hp
Level: D
Supervisor: Jörg-Uwe Löbus, Matematiska institutionen, Linköpings universitet
Examiner: Jörg-Uwe Löbus, Matematiska institutionen, Linköpings universitet
Linköping: November 2010


Matematiska institutionen, Linköpings universitet, 581 83 Linköping, Sweden. November 2010.
Serietitel och serienummer / Title of series, numbering: LiTH-MAT-EX--2010/27--SE. ISSN 0348-2960.
Språk/Language: Engelska/English. Rapporttyp/Report category: Examensarbete (D-uppsats).
URL för elektronisk version: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-61303


Abstract

The distribution of particles in a rarefied gas in a vessel can be described by the Boltzmann equation. As an approximation of the solution to this equation, Caprino, Pulvirenti and Wagner [3] constructed a random N-particle system.

In the equilibrium case, they prove in [3] that the L1-distance between the density function of k particles in the N-particle process and the k-fold product of the solution to the stationary Boltzmann equation is of order 1/N. They do this in order to show that the N-particle system converges to the system described by the stationary Boltzmann equation as the number of particles tends to infinity.

This is different from the standard approach of describing convergence of an N-particle system. Usually, convergence in distribution of random measures or weak convergence of measures over the space of probability measures is used. The purpose of the present thesis is to compare different modes of convergence of the N-particle system as N tends to infinity, assuming stationarity.

Keywords: Random measures, Stochastic particle systems, The Boltzmann equation.


Acknowledgements

I would like to thank my supervisor and examiner Jörg-Uwe Löbus for a very good cooperation during this project. I also want to thank my opponent David Lam for his valuable comments on the report. Moreover, I want to express my gratitude to Prashant Kara, who recommended me for this project. I am very grateful for the fantastic years with my friends in the mathematics program at Linköpings universitet. My thoughts also go out to my family and other friends.

Mikael Petersson


Notation

B(S)        Bounded real valued functions on S
𝓑(S)        The Borel σ-algebra over S
𝓑_b(S)      Relatively compact Borel measurable subsets of S
C(S)        Continuous real valued functions on S
Cb(S)       Bounded continuous real valued functions on S
C_K^+(S)    Continuous non-negative real valued functions on S with compact support
C           The complex numbers
∂E          The boundary of E
E(X)        The expected value of X
L1(S)       Integrable functions on S
L+(S)       Measurable functions from S to [0, ∞]
M(S)        Locally finite measures on S
N           The natural numbers, {1, 2, ...}
P{ξ ∈ E}    The probability of the event {ξ ∈ E}
𝓟(S)        Borel probability measures on S
R           The real numbers
R+          The non-negative real numbers, [0, ∞)
σ(E)        The σ-algebra generated by E
2^S         All subsets of S
∅           The empty set
A ⊂ S       A is a subset of, or equal to, S
ξ =d η      ξ and η have the same distribution
ξn →d ξ     ξn converges in distribution to ξ
ξn →p ξ     ξn converges in probability to ξ
µn →v µ     µn converges vaguely to µ
Pn →w P     Pn converges weakly to P
‖f‖L1       The L1-norm of f
‖f‖∞        The uniform norm of f


Contents

1 Introduction
2 Preliminaries
  2.1 Metric Spaces and Normed Vector Spaces
  2.2 σ-algebras and Measures
  2.3 Integration of Real Valued Functions
  2.4 Random Elements
3 Convergence of Probability Measures
  3.1 Prohorov's Metric
  3.2 Prohorov's Theorem
  3.3 Weak Convergence of Probability Measures
  3.4 Convergence Determining Sets
4 Convergence of Random Measures
  4.1 Convergence in Distribution of Random Elements
  4.2 Random Measures
  4.3 Convergence in Distribution of Random Measures
5 Convergence of Particle Systems
  5.1 The Boltzmann Equation
  5.2 The N-particle Process
6 Comparison of Modes of Convergence
  6.1 Comparison of Convergence I
  6.2 Comparison of Convergence II
  6.3 Speed of Convergence


Chapter 1

Introduction

The distribution of particles in a rarefied gas in a vessel can be described by the Boltzmann equation. The solution to this equation is a function f(x, t) that for each t ∈ [0, T] is the probability density function for a random vector X which describes the state of one particle at time t. The state of one particle is given by its position and its velocity.

To approximate the solution to the Boltzmann equation, Caprino, Pulvirenti and Wagner [3] constructed a system containing N particles. This system is described by an initial boundary value problem. Its solution is the joint density function for the state of the N particles. Usually, for an N-particle system at a fixed time t ≥ 0, the limit as the number of particles tends to infinity is expressed

(a) in terms of a weak limit of random measures or

(b) in terms of a weak limit of a sequence of measures over the space of probability measures.

The approach of [3] is different. It uses a

(c) “L1-distance between the k-particle density and the k-fold product of the solution to the stationary Boltzmann equation”.

The objective of this thesis is to clarify the relationship among the modes (a)-(c). The convergences (a) and (b) are very well established. In this text, a presentation of the convergences (a) and (b) and their relations, based on Ethier, Kurtz [4] and Kallenberg [7], is given. We describe the solution to the Boltzmann equation as the limit of an N-particle system in the manner of [3]. Furthermore, we compare the convergence (c) with (a) and (b).

The outline of the report is as follows. Chapter 2 collects definitions and theorems which will be needed later. Some basic theory about measures and how these are used to define integration is presented. At Linköpings universitet, these topics are not covered at the undergraduate level, so this part is included to make the thesis accessible to master's students at the university. All results stated in this chapter are used in the following chapters. With a few exceptions, the proofs of the results are given by references.

Chapter 3 deals with the theory of weak convergence of measures on arbitrary spaces. As a starting point, we look at the Prohorov metric on the space of probability measures over some metric space S. A characterization of the compact subsets of this space of measures is given by Prohorov's theorem through the notion of tightness. Another important result is that when S is separable, weak convergence is equivalent to convergence in the Prohorov metric. A number of ways to verify this convergence are presented. The chapter is based on Ethier, Kurtz [4].

Then in chapter 4 this theory is specialized to measures on spaces of measures. First, the theory of weak convergence is recast as the theory of convergence in distribution. Then random measures are defined and some important results are covered. The final theorem of this chapter gives two ways of verifying convergence in distribution of random measures. The theory of the chapter is built up with the aim of presenting a rigorous proof of this result. Here we use Kallenberg [7].

In chapter 5, the mathematical formulation of the system of particles is given, together with how it can be approximated by an N-particle system. We describe how [3] defines convergence of these systems. This mode of convergence does not involve the theory of chapters 3 and 4.

Finally, chapter 6 compares different modes of convergence of the N-particle systems. Two other possible convergence types that involve the theory presented in chapters 3 and 4 are described. We prove two results that give relations between the different modes of convergence. The proofs of these results use ideas from articles by Sznitman [8] and Grigorescu, Kang [6]. By using the calculations in these proofs, we establish two results regarding the speed of convergence.


Chapter 2

Preliminaries

This chapter collects some background material which will be used later in this thesis. The main purpose of the chapter is to establish the notation and to give an introduction to the reader not familiar with measure and integration theory. Section 1 introduces metric spaces, which are used to define distances between elements in abstract spaces. Section 2 contains some basic definitions from measure theory and describes how the important Lebesgue measure is constructed. Section 3 defines integration of real valued functions with respect to a measure. Finally, section 4 gives an introduction to how measure theory is used in probability theory. The material in this chapter is mainly based on Folland [5].

2.1 Metric Spaces and Normed Vector Spaces

This section collects definitions and theorems about metric spaces and normed vector spaces that will be used later.

Metric Spaces

Denote by R+ the set of all non-negative real numbers, that is R+ = [0, ∞).

For an arbitrary set S, a function d : S × S → R+ is said to be a metric on S if

the following holds.

• d(x, y) = 0 if and only if x = y.
• d(x, y) = d(y, x) for all x, y ∈ S.

• d(x, z) ≤ d(x, y) + d(y, z) for all x, y, z ∈ S.

If d is a metric on S, then (S, d), or simply S if the metric is understood, is called a metric space. In a metric space (S, d) a sequence of elements x1, x2, ... ∈ S

is said to converge to x ∈ S if for every ε > 0 there exists a positive integer N such that d(xn, x) < ε for all n ≥ N . A sequence x1, x2, ... ∈ S is called

a Cauchy sequence if for every ε > 0 there exists a positive integer N such that d(xm, xn) < ε for all m, n ≥ N. A set E ⊂ S is called complete if every


Cauchy sequence in E converges to some element that belongs to E. An open ball centered at x with radius r > 0 is defined as the set

B(x, r) = {y ∈ S : d(x, y) < r}.

A subset E of S is said to be open if for every x ∈ E there exists r > 0 such that B(x, r) ⊂ E. For example, every open ball is open, so the name is justified. The set E is closed if its complement Eᶜ is open. In this text, the inclusion symbol includes the possibility that the sets are equal. The intersection of all closed sets that contain E is the smallest closed set that contains E. It is called the closure of E and is denoted by Ē. The boundary of E is defined as

∂E = Ē ∩ (Eᶜ)‾,

where (Eᶜ)‾ denotes the closure of the complement of E, and the interior of E is defined as E° = E \ ∂E. A neighbourhood of x ∈ S is a set E ⊂ S such that x ∈ E°. If E is a subset of S such that Ē = S, it is said to be dense in S. A metric space (S, d) is separable if it contains a countable dense subset. The following theorem gives a condition for a subset of a metric space to be complete.

Theorem 2.1. If (S, d) is a complete metric space and E ⊂ S is closed, then the metric space (E, d) is complete.

Proof. See section 0.6 in [5].

Let E be a subset of S. A family of sets {Vα}α∈A, where A is some arbitrary index set, is said to be a covering of E if E ⊂ ⋃_{α∈A} Vα. If for every ε > 0 there exists a finite family of open balls with radius ε that covers E, then E is called totally bounded. A set K is said to be compact if for every family of open sets {Gα}α∈A that covers K, there exists a finite set F ⊂ A such that {Gα}α∈F covers K. If E is a set such that Ē is compact, then E is said to be relatively compact. The following theorem gives two other characterizations of a compact set in a metric space.

Theorem 2.2. If (S, d) is a metric space and K ⊂ S, then the following statements are equivalent.

(a) K is compact.

(b) K is complete and totally bounded.

(c) For every sequence of elements in K, there exists a subsequence that converges to some element that belongs to K.

Proof. See section 0.6 in [5].

The next theorem gives a condition for a set to be compact in the special case when S = Rd.

Theorem 2.3. If K is a closed and bounded subset of Rd, then K is compact.

Proof. See section 0.6 in [5].



Normed Vector Spaces

A norm on a space X is a function ‖·‖ : X → R+ that satisfies the following.

• ‖x + y‖ ≤ ‖x‖ + ‖y‖ for all x, y ∈ X.
• ‖λx‖ = |λ|‖x‖ for all x ∈ X and λ ∈ C.
• If ‖x‖ = 0, then x = 0.

A space X such that if x, y ∈ X and α, β ∈ R, then αx + βy ∈ X, is called a vector space over R. A vector space together with a norm is called a normed vector space. Let C(S) denote the space of all continuous real valued functions on S and define

‖f‖∞ = sup_{x∈S} |f(x)|, for all f ∈ C(S).

This is called the uniform norm and it can be shown that (C(S), ‖·‖∞) is a normed vector space. A subspace of a normed vector space X is a space Y ⊂ X such that if f, g ∈ Y and α, β ∈ R, then αf + βg ∈ Y. Every subspace of C(S) together with the uniform norm is also a normed vector space.

The Stone-Weierstrass Theorem

To state the next theorem, we will need the following definitions. If f and g are two functions on S, their product fg is defined as

fg(x) = f(x)g(x), for all x ∈ S.

A subspace B of a normed vector space such that if f, g ∈ B, then fg ∈ B, is called an algebra. The function 1 on a space S is defined by

1(x) = 1, for all x ∈ S,

and a constant function on S is a function of the form c·1, where c is a constant. If B ⊂ C(S) is a set such that for all x, y ∈ S with x ≠ y there exists a function h ∈ B such that h(x) ≠ h(y), then B is said to separate points. The following result is known as the Stone-Weierstrass theorem and the version presented here is the one given in Yosida [9].

Theorem 2.4 (The Stone-Weierstrass Theorem). Let K be a compact set and B a subset of C(K) that satisfies the following conditions.

• B is an algebra.

• B contains the constant functions.

• If f1, f2, ... ∈ B and f ∈ C(K) are functions such that fn → f uniformly, then f ∈ B.

Then B is dense in C(K) if and only if B separates the points of K.

Proof. See section 0.2 in [9].

In the proof of one of the results of this thesis, the following corollary of the Stone-Weierstrass theorem will be used.


Corollary 2.5. Let K be a compact set and B a subset of C(K) that satisfies the following conditions.

• B is an algebra.

• B contains the constant functions.
• B separates the points of K.

Then B is dense in C(K).

Proof. Let B̄ denote the closure of B in (C(K), ‖·‖∞) and suppose that α, β ∈ R and f, g ∈ B̄. Then there exist sequences of functions {fn} and {gn} in B such that fn → f and gn → g uniformly. Since addition and multiplication are continuous operations, it follows that

αfn + βgn → αf + βg, and fngn → fg

uniformly, so that αf + βg ∈ B̄ and fg ∈ B̄, and hence B̄ is an algebra. Since B contains the constant functions, so does B̄, and since B separates the points of K, so does B̄. Moreover, B̄ is closed under uniform limits. By applying the Stone-Weierstrass theorem to B̄ it follows that B̄ is dense in C(K), and therefore B is dense in C(K).
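The corollary is the abstract mechanism behind classical polynomial approximation: on K = [0, 1] the polynomials form an algebra that contains the constants and separates points, so they are dense in (C(K), ‖·‖∞). As a numerical illustration (not part of the thesis material; the target function, the grid, and the degrees are arbitrary choices), the following Python sketch approximates a non-differentiable continuous function by Bernstein polynomials and estimates the uniform error:

```python
from math import comb

def bernstein(f, n):
    """Degree-n Bernstein polynomial of f on [0, 1]."""
    def p(x):
        return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
                   for k in range(n + 1))
    return p

f = lambda x: abs(x - 0.5)              # continuous but not differentiable
xs = [i / 100 for i in range(101)]      # grid for estimating the uniform norm

for n in (10, 100, 400):
    err = max(abs(bernstein(f, n)(x) - f(x)) for x in xs)
    print(n, round(err, 4))             # the uniform error decreases with n
```

The Bernstein polynomials realize, in a completely constructive way, the density that the corollary only asserts abstractly.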

2.2 σ-algebras and Measures

This section presents some basic definitions and theorems about σ-algebras and measures.

σ-algebras

First, some basic definitions. An algebra over a set S is a non-empty collection A of subsets of S that satisfies the following.

• If E ∈ A, then Eᶜ ∈ A.
• If E1, ..., En ∈ A, then ⋃_{i=1}^n Ei ∈ A.

A σ-algebra over S is an algebra 𝒮 such that if E1, E2, ... ∈ 𝒮, then ⋃_{i=1}^∞ Ei ∈ 𝒮. If 𝒮 is a σ-algebra over S, then (S, 𝒮) is called a measurable space and the sets in 𝒮 are called measurable sets. Let (S, 𝒮) be a measurable space. It follows from the definition that if A1, A2, ... ∈ 𝒮, then ⋂_{i=1}^∞ Ai ∈ 𝒮, since

⋂_{i=1}^∞ Ai = (⋃_{i=1}^∞ Aiᶜ)ᶜ.

Moreover, the empty set ∅ and S always belong to the σ-algebra, since ∅ = A ∩ Aᶜ and S = A ∪ Aᶜ for any set A ⊂ S. It can also be shown that the intersection of any family of σ-algebras is again a σ-algebra. If E is a family of subsets of S, there exists a smallest σ-algebra over S that contains E, namely the intersection of all σ-algebras containing E. It is called the σ-algebra generated by E and is denoted by σ(E). The Borel σ-algebra over S is the σ-algebra generated by the family of all open subsets of S and is denoted by 𝓑(S).



The Monotone Class Theorem

For the next result, we need two more definitions. A family of subsets C of some space S is called a π-system if A, B ∈ C implies that A ∩ B ∈ C. A family of subsets D of S is called a λ-system if it satisfies the following conditions.

• S ∈ D.
• If A, B ∈ D and B ⊂ A, then A \ B ∈ D.
• If A1, A2, ... ∈ D and A1 ⊂ A2 ⊂ ..., then ⋃_{i=1}^∞ Ai ∈ D.

There exist different formulations of the following theorem. The one stated here is taken from Kallenberg [7].

Theorem 2.6 (Monotone Class Theorem). If C is a π-system and D is a λ-system in some space S such that C ⊂ D, then σ(C) ⊂ D.

Proof. See chapter 1 in [7].

Measures

A measure on (S, 𝒮), or simply on S if the σ-algebra is understood, is a function µ : 𝒮 → [0, ∞] which satisfies the following.

• µ(∅) = 0.
• If A1, A2, ... ∈ 𝒮 are disjoint, then µ(⋃_{i=1}^∞ Ai) = Σ_{i=1}^∞ µ(Ai).

If µ is a measure on (S, 𝒮), then (S, 𝒮, µ) is called a measure space. A measure µ is said to be σ-finite if there exist A1, A2, ... ∈ 𝒮 such that S = ⋃_{i=1}^∞ Ai and µ(Ai) < ∞ for all i. A measure defined on the Borel σ-algebra is called a Borel measure. If a statement about points x ∈ S, for instance convergence, holds for all x except for those in a set N ⊂ S such that µ(N) = 0, the statement is said to hold almost everywhere (a.e.) or for almost every x. A simple but important example of a measure is the Dirac measure. If (S, 𝒮) is a measurable space and x ∈ S, the Dirac measure at x is for any A ∈ 𝒮 defined as

δx(A) = 1 if x ∈ A, and δx(A) = 0 otherwise.
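Dirac measures are also the building blocks of the empirical measures associated with particle systems later in the thesis. A minimal Python sketch (an illustration only; measurable sets are encoded as indicator functions and the sample points are arbitrary choices):

```python
def dirac(x):
    """The Dirac measure at x: A ↦ 1 if x ∈ A, else 0."""
    return lambda A: 1.0 if A(x) else 0.0   # a set A is given as an indicator function

def empirical(sample):
    """Normalized sum of Dirac measures at the given sample points."""
    deltas = [dirac(p) for p in sample]
    return lambda A: sum(d(A) for d in deltas) / len(deltas)

half_line = lambda x: x > 0                 # the event (0, ∞)
mu = empirical([-1.0, 0.5, 2.0, 3.5])
print(mu(half_line))                        # the fraction of points lying in (0, ∞)
```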

Some basic properties of measures are summarized in the following theorem.

Theorem 2.7. If (S, 𝒮, µ) is a measure space, then the following statements hold.

(a) If A, B ∈ 𝒮 and A ⊂ B, then µ(A) ≤ µ(B).

(b) If A1, A2, ... ∈ 𝒮, then µ(⋃_{i=1}^∞ Ai) ≤ Σ_{i=1}^∞ µ(Ai).

(c) If A1, A2, ... ∈ 𝒮 and A1 ⊂ A2 ⊂ ..., then µ(⋃_{i=1}^∞ Ai) = lim_{i→∞} µ(Ai).

(d) If A1, A2, ... ∈ 𝒮, A1 ⊃ A2 ⊃ ... and µ(A1) < ∞, then µ(⋂_{i=1}^∞ Ai) = lim_{i→∞} µ(Ai).


From the first statement of theorem 2.7 it follows that if µ(E) = 0 and F ⊂ E, then µ(F) = 0 provided that F ∈ 𝒮. However, it need not be the case that F ∈ 𝒮. A measure such that the corresponding σ-algebra contains all subsets of sets of measure zero is called a complete measure, and it follows from the next theorem that a measure can always be extended to a complete measure.

Theorem 2.8. Let (S, 𝒮, µ) be a measure space and define N = {N ∈ 𝒮 : µ(N) = 0}. Let

𝒮̄ = {E ∪ F : E ∈ 𝒮, F ⊂ N for some N ∈ N},

and define µ̄ : 𝒮̄ → [0, ∞] by

µ̄(E ∪ F) = µ(E).

Then 𝒮̄ is a σ-algebra and µ̄ is the unique extension of µ to a complete measure on 𝒮̄.

Proof. See section 1.3 in [5].

In theorem 2.8, µ̄ is called the completion of µ and 𝒮̄ is called the completion of 𝒮 with respect to µ. The next theorem shows how measures can be constructed. For this, two more definitions are needed. If A is an algebra, then a function µ0 : A → [0, ∞] is called a premeasure if the following holds.

• µ0(∅) = 0.
• If A1, A2, ... ∈ A are disjoint and ⋃_{i=1}^∞ Ai ∈ A, then µ0(⋃_{i=1}^∞ Ai) = Σ_{i=1}^∞ µ0(Ai).

In analogy to measures, a premeasure µ0 is said to be σ-finite if there exist A1, A2, ... ∈ A such that S = ⋃_{i=1}^∞ Ai and µ0(Ai) < ∞ for all i.

Theorem 2.9. Let A be an algebra over S and µ0 a premeasure on A. For each E ⊂ S let

µ*(E) = inf { Σ_{i=1}^∞ µ0(Ai) : A1, A2, ... ∈ A, E ⊂ ⋃_{i=1}^∞ Ai },

and define

µ(E) = µ*(E), for all E ∈ σ(A).

Then µ is a measure on σ(A) and its restriction to A is µ0. Moreover, if µ0 is σ-finite, then µ is the unique extension of µ0 to a measure on σ(A).

Proof. See section 1.4 in [5].

The Lebesgue Measure on Rd

One of the most important measures in applications is the Lebesgue measure on the Euclidean space. Since it is used in this paper, the construction is given, but without any justification of the results. The complete theory can be found in, for example, Folland [5]. First, the construction of the Lebesgue measure on R is given and then this is extended to the d-dimensional case. To begin,



define an h-interval as a subset of R of the form (a, b], (a, ∞) or ∅, where −∞ ≤ a < b < ∞. Let A denote the set of all finite disjoint unions of h-intervals. If (a1, b1], ..., (an, bn] are disjoint h-intervals, define

µ0(⋃_{i=1}^n (ai, bi]) = Σ_{i=1}^n (bi − ai),

and let µ0(∅) = 0. Then it can be shown that µ0 defines a σ-finite premeasure on A. By theorem 2.9 there exists a unique extension to a measure µ on σ(A). Moreover, it can be shown that σ(A) = 𝓑(R), so µ is the unique Borel measure on R such that µ((a, b]) = b − a for all h-intervals (a, b]. The Lebesgue measure λ is the completion of µ and is defined on L, the completion of 𝓑(R) with respect to µ. The sets in L are called Lebesgue measurable sets. Moving on to the d-dimensional case, we define a rectangle in Rd as a set of the form

A1 × ... × Ad, where A1, ..., Ad ∈ 𝓑(R).

Let Ad denote the collection of finite disjoint unions of d-dimensional rectangles. If E ∈ Ad is the finite disjoint union of the rectangles A^1_1 × ··· × A^1_d, ..., A^n_1 × ··· × A^n_d, let

π0(E) = Σ_{i=1}^n λ(A^i_1) ··· λ(A^i_d),

where λ is the Lebesgue measure on R. Then it can be shown that π0 defines

a σ-finite premeasure on Ad. As in the construction of the one-dimensional Lebesgue measure, there exists a unique extension of π0 to a measure µd on σ(Ad), and it can be shown that σ(Ad) = 𝓑(Rd). This measure, constructed as in theorem 2.9, is the unique measure on 𝓑(Rd) such that

µd(A1 × ··· × Ad) = λ(A1) ··· λ(Ad), for all A1, ..., Ad ∈ 𝓑(R).

The d-dimensional Lebesgue measure λd is the completion of µd and is defined on Ld, the completion of 𝓑(Rd) with respect to µd.
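The outer-measure construction behind theorem 2.9 can be made concrete in a toy computation: the premeasure sum Σ µ0(Ai) of any countable cover of a set bounds its measure from above, and the infimum over covers recovers it. A small Python sketch (an illustration only; the target h-interval and the hand-picked finite covers are arbitrary choices):

```python
def mu0_sum(cover):
    """Sum of the lengths of the h-intervals (a, b] in a cover (not necessarily disjoint)."""
    return sum(b - a for a, b in cover)

# Candidate covers of the h-interval (0, 2]
covers = [
    [(0.0, 1.5), (1.0, 2.0)],             # overlapping cover: overestimates
    [(0.0, 1.0), (1.0, 2.0)],             # disjoint cover: exact
    [(-0.5, 2.5)],                        # one large interval: overestimates
]
print(min(mu0_sum(c) for c in covers))    # the infimum over covers recovers the length 2
```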

2.3 Integration of Real Valued Functions

This section defines integration of a real valued function with respect to a measure. For any set S, denote by 2^S the family of all subsets of S. Let S and X be two arbitrary sets. For any mapping f : S → X, the mapping f⁻¹ : 2^X → 2^S is defined by

f⁻¹(E) = {s ∈ S : f(s) ∈ E}.

If (S, 𝒮) and (X, 𝒳) are measurable spaces, a function f : S → X is said to be (𝒮, 𝒳)-measurable if

f⁻¹(E) ∈ 𝒮 for all E ∈ 𝒳.

If 𝒮 and 𝒳 are understood, then f is called just measurable, and if nothing else is said, it is assumed that the σ-algebras are the Borel σ-algebras. If f and g are measurable functions, then f + g and fg are measurable as well. It can also be shown that if f1, f2, ... is a sequence of measurable functions such that


fn → f pointwise, then f is measurable. For a measurable space (S, 𝒮), the indicator function of a set E ∈ 𝒮 is defined as

1E(x) = 1 if x ∈ E, and 1E(x) = 0 otherwise.

A simple function on S is a function φ that can be written in the form

φ = Σ_{i=1}^n ai 1Ei, a1, ..., an ∈ C, E1, ..., En ∈ 𝒮.

It can be shown that all simple functions are measurable. Let (S, 𝒮, µ) be an arbitrary measure space and define

L+(S) = {f : S → [0, ∞] : f is measurable}.

If φ = Σ_{i=1}^n ai 1Ei is a simple function in L+(S), the integral of φ with respect to the measure µ is defined as

∫ φ dµ = Σ_{i=1}^n ai µ(Ei).

This definition is extended to all functions in L+(S) as follows.

∫ f dµ = sup { ∫ φ dµ : 0 ≤ φ ≤ f, φ simple }.
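For a simple function the integral is a finite sum, so it can be computed directly. A small Python sketch (an illustration only; the measure here is the counting measure on a four-point space, with sets encoded as indicator functions):

```python
def integrate_simple(coeffs_sets, mu):
    """∫ φ dµ for φ = Σ a_i 1_{E_i}; each E_i is given as an indicator function."""
    return sum(a * mu(E) for a, E in coeffs_sets)

# The counting measure on the four-point space {0, 1, 2, 3}
space = [0, 1, 2, 3]
counting = lambda E: sum(1 for x in space if E(x))

phi = [(2.0, lambda x: x < 2),    # 2 · 1_{{0,1}}
       (5.0, lambda x: x == 3)]   # 5 · 1_{{3}}
print(integrate_simple(phi, counting))   # 2·µ({0,1}) + 5·µ({3}) = 2·2 + 5·1 = 9
```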

Theorem 2.10 (Monotone Convergence Theorem). Let f, f1, f2, ... be functions in L+(S) such that f1 ≤ f2 ≤ ... and fn → f pointwise. Then

∫ f dµ = lim_{n→∞} ∫ fn dµ.

Proof. See section 2.2 in [5].

The functions f+(x) = max{f(x), 0} and f−(x) = max{−f(x), 0} are called the positive and negative parts of f, respectively. It holds that f = f+ − f−, and it can be shown that if f is measurable, so are f+ and f−. The integral of a real valued measurable function f is defined as

∫ f dµ = ∫ f+ dµ − ∫ f− dµ.

If both terms on the right hand side are finite, then f is said to be integrable. Integration over a subset E of S is defined as

∫E f dµ = ∫ f 1E dµ.

We now introduce a class of functions that will play an important role in this thesis. Let (S, 𝒮, µ) be a measure space and for any real valued measurable function f on S let ‖f‖L1 = ∫ |f| dµ. Define

L1(S, 𝒮, µ) = {f : f is a real valued measurable function on S with ‖f‖L1 < ∞}.



This space is often abbreviated L1(S) when the σ-algebra and the measure are understood. It can be shown that (L1(S), ‖·‖L1) is a normed vector space. Two functions f and g in L1(S) are defined to be equal if f(x) = g(x) for almost every x ∈ S. The following theorem is another important result about limits of integrals.

Theorem 2.11 (Dominated Convergence Theorem). Let f1, f2, ... be a sequence in L1(S). Suppose that fn → f a.e. and that there exists g ∈ L1(S) such that |fn| ≤ g a.e. for all n. Then f ∈ L1(S) and

∫ f dµ = lim_{n→∞} ∫ fn dµ.

Proof. See section 2.3 in [5].

The next theorem is a special case of the Fubini theorem. The reference in the proof is to the more general case outlined in [5].

Theorem 2.12 (Fubini). If f ∈ L1(Rd+e), then

∫_{Rd+e} f(x, y) dλd+e(x, y) = ∫_{Rd} ( ∫_{Re} f(x, y) dλe(y) ) dλd(x) = ∫_{Re} ( ∫_{Rd} f(x, y) dλd(x) ) dλe(y).

Proof. See section 2.5 in [5].

For the remainder of this text we assume that if the measure in the integral is not written out explicitly, then the integration is carried out with respect to the Lebesgue measure, that is,

∫ f(x) dx = ∫ f(x) dλ(x).

Theorem 2.13 (H¨older’s Inequality). Let f and g be measurable functions on (S,S , µ). If 1 < p < ∞ and p−1+ q−1= 1, then Z |f g| dµ ≤ Z |f |pdµ 1/pZ |g|qdµ 1/q .

Proof. See section 6.1 in [5].

Using the following result, it is sometimes possible to write an integral with respect to an arbitrary measure as an integral with respect to the Lebesgue measure.

Theorem 2.14. If f is a measurable function on (S, 𝒮, µ) and 0 < p < ∞, then

∫ |f|^p dµ = p ∫_0^∞ t^{p−1} µ{x ∈ S : |f(x)| > t} dt.

Proof. See section 6.4 in [5].
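The identity of theorem 2.14 can be checked numerically in a simple case. In the following Python sketch (an illustration only; f(x) = x on [0, 1] with the Lebesgue measure and p = 3, both integrals approximated by midpoint Riemann sums), both sides approximate 1/(p + 1) = 1/4:

```python
p = 3.0
n = 100_000

# Left side: ∫ |f|^p dµ with f(x) = x on [0, 1] and µ the Lebesgue measure
lhs = sum(((i + 0.5) / n) ** p for i in range(n)) / n

# Right side: p ∫_0^∞ t^{p-1} µ{x : |f(x)| > t} dt; here µ{x : x > t} = 1 - t for t in [0, 1]
rhs = p * sum(((i + 0.5) / n) ** (p - 1) * (1 - (i + 0.5) / n) for i in range(n)) / n

print(round(lhs, 6), round(rhs, 6))   # both approximate 1/(p + 1) = 0.25
```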


2.4 Random Elements

This section gives a basic introduction to how measure theory is used in probability theory, based on Billingsley [1]. First, a number of definitions are given. A probability measure on a measurable space S is a measure µ such that µ(S) = 1. A probability space is a measure space (S, 𝒮, µ) where µ is a probability measure. An event space is a probability space (Ω, 𝓕, P) where the elements of Ω and 𝓕 are called outcomes and events, respectively. The probability measure P assigns probabilities to events. A random element is a measurable mapping ξ from the event space Ω to a metric space S. The distribution of ξ is a Borel probability measure on S denoted by P ∘ ξ⁻¹ such that for all E ∈ 𝓑(S),

P ∘ ξ⁻¹(E) = P(ξ⁻¹(E)) = P{ω ∈ Ω : ξ(ω) ∈ E} = P{ξ ∈ E}.

For any set E ∈ 𝓑(S) it is said that ξ ∈ E almost surely (a.s.) if P{ξ ∈ E} = 1, and a random element is called non-random if there exists ξ0 ∈ S such that ξ = ξ0 a.s. The term random element is used when S is an arbitrary metric space. In some special cases we will use different terms depending on the nature of S.

• If S = R, then ξ will be called a random variable.
• If S = Rd, then ξ will be called a random vector.
• If S is a space of measures, then ξ will be called a random measure.

A function f ∈ L+(Rd) such that ∫ f(x) dx = 1 is called a probability density function, or simply density function. The next result, which is taken from Billingsley [2], shows how the distribution of a random vector can be defined by a density function.

Theorem 2.15. Let f be a probability density function and ν an arbitrary measure. If the mapping µ is defined by µ(E) = ∫_E f dν, for all E ∈ 𝒮, then µ is a probability measure. Moreover, if g is a measurable function which is integrable with respect to µ, then

∫ g dµ = ∫ fg dν.

Proof. See section 16 in [2].

If X is a random vector, then the distribution of X can be defined by a density function f with respect to a measure ν as follows.

P{X ∈ E} = ∫_E f(x) dν(x), for all E ∈ 𝓑(Rd).

It follows from theorem 2.15 that P ∘ X⁻¹ is a Borel probability measure. In this thesis, the density function will always be defined with respect to the Lebesgue measure. If X is a random variable, the expected value of X is defined as

E(X) = ∫ X(ω) dP(ω).

Sometimes it is more convenient to express the expected value as an integral over the real numbers. To do this, the following lemma can be used.



Lemma 2.16. Using the notation above, if f : S → R is a measurable function, then

∫_Ω f(X(ω)) dP(ω) = ∫_S f(x) d(P ∘ X⁻¹)(x),

whenever either side is well defined.

Proof. See section 10.1 in [5].

Using this result, the expected value of X can be written as

E(X) = ∫_S x d(P ∘ X⁻¹)(x).

Moreover, if X is defined in terms of a density function f with respect to a measure ν, and E(X) < ∞, then it follows from theorem 2.15 that

E(X) = ∫_S x f(x) dν(x).

An expression for the expected value in the case when the random variable is non-negative is given in the following theorem.

Theorem 2.17. If X is a non-negative random variable, then

E(X) = ∫_0^∞ P{X > t} dt.

Proof. This is an immediate consequence of theorem 2.14.
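Theorem 2.17 can be illustrated by simulation. In the Python sketch below (an illustration only; X is taken to be exponentially distributed with E(X) = 1, and the tail integral is truncated at t = 15 and discretized by a midpoint rule), the direct sample mean and the tail-formula estimate agree closely:

```python
import random

random.seed(1)
n = 10_000
samples = [random.expovariate(1.0) for _ in range(n)]

# Direct estimate of E(X): the sample mean
mean_direct = sum(samples) / n

# Tail-formula estimate: E(X) = ∫_0^∞ P{X > t} dt, with the tail probability
# estimated from the same samples and the integral discretized on [0, 15]
dt = 0.1
ts = [(i + 0.5) * dt for i in range(150)]
mean_tail = sum(sum(1 for x in samples if x > t) / n for t in ts) * dt

print(round(mean_direct, 3), round(mean_tail, 3))   # both estimates lie close to 1
```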

Now we introduce a basic type of convergence in probability theory. Let ξ, ξ1, ξ2, ... be random elements defined on the probability space (Ω, 𝓕, P) and taking values in the metric space (S, d). The sequence ξ1, ξ2, ... is said to converge in probability to ξ if for every ε > 0,

lim_{n→∞} P{d(ξn, ξ) > ε} = 0.

This is denoted by ξn →p ξ.
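Convergence in probability can be observed empirically. In the following Python sketch (an illustration only; ξn is the mean of n fair coin flips, which converges in probability to 1/2 by the weak law of large numbers, and the exceedance probabilities are estimated by simulation):

```python
import random

random.seed(0)
eps, trials = 0.05, 1000

def xi_n(n):
    """ξ_n: the mean of n fair coin flips, converging in probability to 1/2."""
    return sum(random.random() < 0.5 for _ in range(n)) / n

probs = {}
for n in (10, 100, 1000):
    probs[n] = sum(abs(xi_n(n) - 0.5) > eps for _ in range(trials)) / trials
    print(n, probs[n])   # the estimate of P{|ξ_n − 1/2| > ε} shrinks toward 0
```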

Theorem 2.18 (Chebyshev's Inequality). Let X be a random variable defined on (Ω, 𝓕, P) and let p ∈ (0, ∞). If E(|X|^p) < ∞, then for any ε > 0,

P{|X| > ε} ≤ E(|X|^p) / ε^p.

Proof. If A(ε) = {ω ∈ Ω : |X(ω)| > ε}, then

E(|X|^p) = ∫ |X(ω)|^p dP(ω) ≥ ∫_{A(ε)} |X(ω)|^p dP(ω) ≥ ε^p P(A(ε)).
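A quick numerical sanity check of Chebyshev's inequality with p = 2 (an illustration only; X is taken uniform on [−1, 1], so that E(X²) = 1/3, with both the moment and the tail probabilities estimated from samples):

```python
import random

random.seed(2)
n = 50_000
samples = [random.uniform(-1, 1) for _ in range(n)]
second_moment = sum(x * x for x in samples) / n     # estimates E(X^2) = 1/3

for eps in (0.5, 0.7, 0.9):
    tail = sum(1 for x in samples if abs(x) > eps) / n
    bound = second_moment / eps ** 2                # Chebyshev bound with p = 2
    print(eps, round(tail, 3), round(bound, 3))     # the tail never exceeds the bound
```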

Theorem 2.19. If X is a random variable on (Ω, 𝓕, P) and 1 ≤ p < ∞, then

(E(|X|))^p ≤ E(|X|^p).


Proof. If p = 1 we have equality, and for 1 < p < ∞ it follows from Hölder's inequality that

E(|X|) = ∫ |X · 1| dP ≤ ( ∫ |X|^p dP )^{1/p} ( ∫ |1|^q dP )^{1/q} = (E(|X|^p))^{1/p},

where q = p/(p − 1). Raising both sides to the power p gives the result.


Chapter 3

Convergence of Probability Measures

This chapter presents a theory for convergence of sequences of Borel probability measures defined on a metric space (S, d). Section 1 introduces the Prohorov metric ρ defined on P(S), the space of all Borel probability measures on S. Section 2 gives a characterization of the compact subsets of P(S). Section 3 shows that convergence in the Prohorov metric is equivalent to weak convergence of probability measures if S is separable and gives useful tools for verifying this convergence. Finally, section 4 introduces the concept of separating and convergence determining sets, which can be used as another tool for verifying weak convergence. The chapter is based on sections 3.1-3.4 in Ethier, Kurtz [4].

3.1 Prohorov's Metric

This section shows how to construct a metric space whose elements are Borel probability measures. For any A ∈ B(S) define

A^ε = {x ∈ S : inf_{y∈A} d(x, y) < ε}.   (3.1)

Let C denote the collection of all closed subsets of S and define the function ρ : P(S) × P(S) → [0, 1] by

ρ(P, Q) = inf{ε > 0 : P(F) ≤ Q(F^ε) + ε for all F ∈ C}.   (3.2)

Then (P(S), ρ) is a metric space. To show this, the following two lemmas will be used.

Lemma 3.1. Let P, Q ∈ P(S) and α, β > 0. If

P(F) ≤ Q(F^α) + β for all F ∈ C,   (3.3)

then

Q(F) ≤ P(F^α) + β for all F ∈ C.

Proof. Let F1 ∈ C be arbitrary and set F2 = (F1^α)^c, where c denotes the complement. Then F2 ∈ C since F1^α is an open set, and it also holds that F1 ⊂ (F2^α)^c, so

P(F1^α) = 1 − P(F2) ≥ 1 − Q(F2^α) − β = Q((F2^α)^c) − β ≥ Q(F1) − β.

The first inequality follows from (3.3) and the second is true because F1 ⊂ (F2^α)^c.

Lemma 3.2. If P, Q ∈ P(S) and P(F) = Q(F) for all F ∈ C, then P = Q.

Proof. See section 1.1 in [1].

Theorem 3.3. (P(S), ρ) is a metric space.

Proof. It follows from lemma 3.1 that ρ(P, Q) = ρ(Q, P) for all P, Q ∈ P(S). If P = Q, the inequality in (3.2) holds for all ε > 0, which implies that ρ(P, Q) = 0. Conversely, if ρ(P, Q) = 0, then

P(F) ≤ Q(F^ε) + ε, for all ε > 0 and F ∈ C.

From this it follows by lemma 3.1 that

Q(F) ≤ P(F^ε) + ε, for all ε > 0 and F ∈ C.

Letting ε → 0 (the sets F^ε decrease to the closed set F), these two inequalities together imply that P(F) = Q(F) for all F ∈ C. By lemma 3.2 it follows that P = Q. To prove the triangle inequality, let P, Q, R ∈ P(S) and suppose that δ > ρ(P, Q) and ε > ρ(Q, R). Then

P(F) ≤ Q(F^δ) + δ ≤ R((F^δ)^ε) + δ + ε ≤ R(F^{δ+ε}) + δ + ε,

for all F ∈ C. From this it follows that ρ(P, R) ≤ δ + ε. Since this is true for all δ > ρ(P, Q) and ε > ρ(Q, R),

ρ(P, R) ≤ ρ(P, Q) + ρ(Q, R) for all P, Q, R ∈ P(S),

so ρ satisfies the properties of a metric and hence (P(S), ρ) is a metric space.

The following theorem gives another formula for this metric.

Theorem 3.4. Let P, Q ∈ P(S) and define

M(P, Q) = {µ ∈ P(S × S) : µ(A × S) = P(A), µ(S × A) = Q(A), A ∈ B(S)}.

Then

ρ(P, Q) = inf_{µ∈M(P,Q)} inf{ε > 0 : µ{(x, y) : d(x, y) ≥ ε} ≤ ε}.

Proof. Suppose that µ ∈ M(P, Q) and ε0 ∈ {ε > 0 : µ{(x, y) : d(x, y) ≥ ε} ≤ ε}. Then for any F ∈ C,

P(F) = µ(F × S)
     = µ((F × S) ∩ {(x, y) : d(x, y) < ε0}) + µ((F × S) ∩ {(x, y) : d(x, y) ≥ ε0})
     ≤ µ(F × F^{ε0}) + µ{(x, y) : d(x, y) ≥ ε0}
     ≤ µ(S × F^{ε0}) + ε0 = Q(F^{ε0}) + ε0,


so ε0 ∈ {ε > 0 : P(F) ≤ Q(F^ε) + ε, F ∈ C} and hence ρ(P, Q) ≤ ε0. From this it follows that

ρ(P, Q) ≤ inf_{µ∈M(P,Q)} inf{ε > 0 : µ{(x, y) : d(x, y) ≥ ε} ≤ ε}.

For the proof of the opposite inequality, see section 3.1 in [4].
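On a finite metric space every subset is closed, so the infimum in (3.2) can be evaluated by brute force over all subsets and a grid of candidate ε. The sketch below is added for illustration (the function name and grid resolution are ad hoc choices); it recovers the known value ρ(δx, δy) = min(d(x, y), 1) for two point masses.

```python
from itertools import chain, combinations

# Brute-force evaluation of the Prohorov metric (3.2) on a finite metric
# space, where every subset is closed.
def prohorov(points, d, P, Q, step=0.01):
    """Smallest eps on a grid with P(F) <= Q(F^eps) + eps for all F."""
    subsets = list(chain.from_iterable(
        combinations(points, r) for r in range(1, len(points) + 1)))
    mass = lambda mu, A: sum(mu[x] for x in A)
    eps = step
    while eps <= 1.0 + 1e-12:
        ok = True
        for F in subsets:
            # F^eps = {x : inf_{y in F} d(x, y) < eps}, cf. (3.1)
            F_eps = [x for x in points if min(d(x, y) for y in F) < eps]
            if mass(P, F) > mass(Q, F_eps) + eps + 1e-12:
                ok = False
                break
        if ok:
            return eps
        eps += step
    return 1.0

# Two point masses at distance 0.3: the Prohorov distance is min(0.3, 1).
d = lambda x, y: 0.3 * abs(x - y)
P, Q = {0: 1.0, 1: 0.0}, {0: 0.0, 1: 1.0}
rho = prohorov([0, 1], d, P, Q)  # ~ 0.30, up to the grid resolution
```

Only the one-sided condition of (3.2) is checked, which suffices by lemma 3.1.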

As an application of this theorem, a relation between convergence in the Prohorov metric and convergence in probability of random elements can be proved.

Corollary 3.5. Let (S, d) be separable and suppose that ξ, ξ1, ξ2, ... are S-valued random elements defined on the same probability space (Ω, F, P), with distributions P, P1, P2, ..., respectively. If ξn →p ξ, then lim_{n→∞} ρ(Pn, P) = 0.

Proof. If ξn →p ξ, then for every ε > 0,

lim_{n→∞} P{d(ξn, ξ) > ε} = 0.

If µn ∈ P(S × S) denotes the joint distribution of (ξn, ξ) for all n ∈ N, then

lim_{n→∞} µn{(x, y) : d(x, y) > ε} = 0.

It follows from theorem 3.4 that

ρ(Pn, P) ≤ inf{ε > 0 : µn{(x, y) : d(x, y) ≥ ε} ≤ ε}.

Since the right hand side tends to zero as n → ∞, the result follows.

The last theorem of this section will be needed for later purposes.

Theorem 3.6. If (S, d) is separable, then (P(S), ρ) is separable. If (S, d) is complete and separable, then (P(S), ρ) is complete and separable.

Proof. See section 3.1 in [4].

3.2 Prohorov's Theorem

In this section, a characterization of the compact subsets of the metric space (P(S), ρ) is given through the notion of tightness. The key result is Prohorov's theorem. Tightness is defined as follows. A probability measure P ∈ P(S) is said to be tight if for every ε > 0 there exists a compact set K such that P(K) ≥ 1 − ε. A family of probability measures M ⊂ P(S) is said to be tight if for every ε > 0 there exists a compact set K such that

inf_{P∈M} P(K) ≥ 1 − ε.
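For a concrete example of a family that is not tight (an added illustration, not from the thesis), consider M = {δn : n = 1, 2, ...} on S = R: every compact set is contained in some interval [−R, R], and δn puts no mass there once n > R.

```python
# M = {delta_n : n = 1, 2, ...} on R is not tight: for a compact set
# K ⊂ [-R, R] we have delta_n(K) = 0 as soon as n > R, so the infimum
# of P(K) over the family is 0, and P(K) >= 1 - eps fails for eps < 1.
R = 10.0
masses = [1.0 if n <= R else 0.0 for n in range(1, 21)]  # delta_n([-R, R])
```

By Prohorov's theorem this family is also not relatively compact: the sequence δ1, δ2, ... has no weakly convergent subsequence, since all the mass escapes to infinity.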

To prove Prohorov’s theorem we will use the following lemma.

Lemma 3.7. If (S, d) is complete and separable, then each P ∈ P(S) is tight.

Proof. Let ε > 0 be arbitrary. Since S is separable, it contains a countable dense subset {x1, x2, ...}. For each n = 1, 2, ..., it is possible to choose a positive integer Nn such that

P(⋃_{i=1}^{Nn} B(xi, 1/n)) ≥ 1 − ε/2^n.

Let K be the closure of

⋂_{n=1}^∞ ⋃_{i=1}^{Nn} B(xi, 1/n).

It follows from theorem 2.1 that K is complete, since (S, d) is complete and K ⊂ S is closed. Let δ > 0 be arbitrary and choose a positive integer α such that 1/α < δ. Then

⋂_{n=1}^∞ ⋃_{i=1}^{Nn} B(xi, 1/n) ⊂ ⋃_{i=1}^{Nα} B(xi, 1/α) ⊂ ⋃_{i=1}^{Nα} B(xi, δ),

so it follows that K is totally bounded. Since K is complete and totally bounded, it follows from theorem 2.2 that it is compact, and

P(K) ≥ P(⋂_{n=1}^∞ ⋃_{i=1}^{Nn} B(xi, 1/n))
     = 1 − P(⋃_{n=1}^∞ (⋃_{i=1}^{Nn} B(xi, 1/n))^c)
     ≥ 1 − Σ_{n=1}^∞ P((⋃_{i=1}^{Nn} B(xi, 1/n))^c)
     = 1 − Σ_{n=1}^∞ (1 − P(⋃_{i=1}^{Nn} B(xi, 1/n)))
     ≥ 1 − Σ_{n=1}^∞ ε/2^n = 1 − ε.

Theorem 3.8 (Prohorov's Theorem). If (S, d) is complete and separable, then the following statements are equivalent.

(a) M is tight.

(b) For every ε > 0 there exists a compact set K such that

inf_{P∈M} P(K^ε) ≥ 1 − ε,   (3.4)

where K^ε is defined as in (3.1).

(c) M is relatively compact.


Proof. (a) implies (b): Since K ⊂ K^ε and M is tight, it follows that for every ε > 0 there exists a compact set K such that

inf_{P∈M} P(K^ε) ≥ inf_{P∈M} P(K) ≥ 1 − ε.

(b) implies (c): We will first show that M is totally bounded, that is, for every δ > 0 there exists a finite set N ⊂ P(S) such that

M ⊂ ⋃_{P∈N} B(P, δ) = ⋃_{P∈N} {Q ∈ P(S) : ρ(P, Q) < δ}.

Let δ > 0 be arbitrary and for ε ∈ (0, δ/2) choose a compact set K such that (3.4) holds. Since K is compact, it is totally bounded, so there exists a finite set {x1, ..., xn} ⊂ K such that K ⊂ ⋃_{i=1}^n B(xi, ε), and from this it follows that K^ε ⊂ ⋃_{i=1}^n B(xi, 2ε). Let x0 ∈ S be arbitrary and let m be a positive integer with m ≥ n/ε. Define

N = {(1/m) Σ_{i=0}^n ki δ_{xi} : k0, ..., kn ∈ {0, ..., m}, Σ_{i=0}^n ki = m},

where δ_{xi} denotes the Dirac measure at xi. Let E1 = B(x1, 2ε) and

Ei = B(xi, 2ε) \ ⋃_{k=1}^{i−1} B(xk, 2ε), for i = 2, ..., n.

For arbitrary Q ∈ M define ki = [m Q(Ei)] for i = 1, ..., n, where [·] denotes the integer part, and let k0 = m − Σ_{i=1}^n ki. If we set P = (1/m) Σ_{i=0}^n ki δ_{xi}, then P ∈ N and for any F ∈ C,

Q(F) = Q(F ∩ K^ε) + Q(F ∩ (K^ε)^c)
     ≤ Q(⋃_{i=1}^n (F ∩ Ei)) + Q((K^ε)^c)
     ≤ Q(⋃_{F∩Ei≠∅} Ei) + ε
     = Σ_{F∩Ei≠∅} Q(Ei) + ε
     ≤ Σ_{F∩Ei≠∅} ([m Q(Ei)] + 1)/m + ε
     ≤ Σ_{F∩Ei≠∅} ki/m + 2ε,

where the last step uses that the sum has at most n terms and m ≥ n/ε. It follows from the construction of the sets E1, ..., En that

{xi : i ∈ {1, ..., n}, F ∩ Ei ≠ ∅} ⊂ F^{2ε}.

Using this,

Σ_{F∩Ei≠∅} ki/m = (1/m) Σ_{F∩Ei≠∅} ki δ_{xi}(F^{2ε}) ≤ (1/m) Σ_{i=0}^n ki δ_{xi}(F^{2ε}) = P(F^{2ε}),


so Q(F) ≤ P(F^{2ε}) + 2ε for all F ∈ C, and this proves that ρ(P, Q) ≤ 2ε < δ. It can be concluded that M is totally bounded, which implies that the closure of M is totally bounded. By theorem 3.6, (P(S), ρ) is complete since (S, d) is separable and complete, so the closure of M, being a closed subset of a complete space, is complete. Putting all this together, the closure of M is complete and totally bounded, hence compact, which means that M is relatively compact.

(c) implies (a): Let ε > 0 and Q ∈ M be arbitrary. Since M is relatively compact, it is totally bounded, so for each n = 1, 2, ... we can choose a finite set Nn ⊂ P(S) such that there exists Pn ∈ Nn satisfying ρ(Pn, Q) < ε/2^{n+1}. Moreover, it follows from lemma 3.7 that there exist compact sets K1, K2, ... such that

Pn(Kn) ≥ 1 − ε/2^{n+1}, for all n = 1, 2, ...

This inequality and ρ(Pn, Q) < ε/2^{n+1} give

Q(Kn^{ε/2^{n+1}}) ≥ Pn(Kn) − ε/2^{n+1} ≥ 1 − ε/2^n, for all n = 1, 2, ...

Let K be the closure of ⋂_{n=1}^∞ Kn^{ε/2^{n+1}}. Arguing as in the proof of lemma 3.7, it follows that K is compact and

Q(K) ≥ Q(⋂_{n=1}^∞ Kn^{ε/2^{n+1}})
     = 1 − Q(⋃_{n=1}^∞ (Kn^{ε/2^{n+1}})^c)
     ≥ 1 − Σ_{n=1}^∞ (1 − Q(Kn^{ε/2^{n+1}}))
     ≥ 1 − Σ_{n=1}^∞ ε/2^n = 1 − ε.

Since Q ∈ M was arbitrary, it holds for this K that

inf_{P∈M} P(K) ≥ 1 − ε,

so M is tight.

The following result states that relative compactness can be verified through tightness even without the assumption that (S, d) is complete and separable.

Corollary 3.9. Let (S, d) be an arbitrary metric space. If M is tight, then M is relatively compact.

Proof. See section 3.2 in [4].

3.3 Weak Convergence of Probability Measures

This section introduces the concept of weak convergence of probability measures and shows that it is equivalent to convergence in the Prohorov metric if the metric space (S, d) is separable. To define weak convergence, let Cb(S) denote the space of all real-valued bounded continuous functions on S, endowed with the uniform norm ‖f‖∞ = sup_{x∈S} |f(x)|. A sequence of probability measures P1, P2, ... ∈ P(S) is said to converge weakly to P ∈ P(S) if

lim_{n→∞} ∫ f dPn = ∫ f dP, for all f ∈ Cb(S).   (3.5)

This is denoted by Pn →w P. The next theorem gives several ways of verifying weak convergence. The following definition will be used in this result. A set A ⊂ S is said to be a P-continuity set if A ∈ B(S) and P(∂A) = 0.

Theorem 3.10. Let (S, d) be arbitrary and let P, P1, P2, ... ∈ P(S). Then, of the following statements, (b) through (f) are all equivalent and all of them are implied by (a). If (S, d) is separable, then all six statements are equivalent.

(a) lim_{n→∞} ρ(Pn, P) = 0.

(b) Pn →w P.

(c) lim_{n→∞} ∫ f dPn = ∫ f dP for all uniformly continuous f ∈ Cb(S).

(d) lim sup_{n→∞} Pn(F) ≤ P(F) for all closed sets F ⊂ S.

(e) lim inf_{n→∞} Pn(G) ≥ P(G) for all open sets G ⊂ S.

(f) lim_{n→∞} Pn(A) = P(A) for all P-continuity sets A ⊂ S.

Proof. (a) implies (b): Throughout this proof we write ‖f‖ = ‖f‖∞. Define εn = ρ(Pn, P) + 1/n and let f ≥ 0 be a function in Cb(S). Using theorem 2.14 and the boundedness of f gives

∫ f dPn = ∫_0^{‖f‖} Pn{x : f(x) > t} dt
        ≤ ∫_0^{‖f‖} (P({x : f(x) ≥ t}^{εn}) + εn) dt
        = ∫_0^{‖f‖} P({x : f(x) ≥ t}^{εn}) dt + εn ‖f‖, for all n = 1, 2, ...

By assumption (a), εn → 0, and the sets {x : f(x) ≥ t}^ε decrease to the closed set {x : f(x) ≥ t} as ε ↓ 0, so it holds that

lim sup_{n→∞} ∫ f dPn ≤ lim_{n→∞} ∫_0^{‖f‖} P({x : f(x) ≥ t}^{εn}) dt = ∫_0^{‖f‖} P{x : f(x) ≥ t} dt = ∫ f dP,   (3.6)

where the first equality follows from the dominated convergence theorem. Now let f ∈ Cb(S) be arbitrary. Using (3.6) and the fact that ‖f‖ + f and ‖f‖ − f are non-negative functions in Cb(S), it follows that

lim sup_{n→∞} ∫ (‖f‖ + f) dPn ≤ ∫ (‖f‖ + f) dP,
lim sup_{n→∞} ∫ (‖f‖ − f) dPn ≤ ∫ (‖f‖ − f) dP.

Note that ∫ ‖f‖ dPn = ‖f‖ Pn(S) = ‖f‖ for all n, so these two inequalities can be rewritten as

‖f‖ + lim sup_{n→∞} ∫ f dPn ≤ ‖f‖ + ∫ f dP,
‖f‖ − lim inf_{n→∞} ∫ f dPn ≤ ‖f‖ − ∫ f dP.

Since ‖f‖ is finite, it can be subtracted from both sides of the inequalities, which implies that

lim sup_{n→∞} ∫ f dPn ≤ ∫ f dP ≤ lim inf_{n→∞} ∫ f dPn.

From this it follows that lim_{n→∞} ∫ f dPn = ∫ f dP.

(b) implies (c): If Pn →w P, then by definition lim_{n→∞} ∫ f dPn = ∫ f dP for all f ∈ Cb(S). In particular, this holds for all uniformly continuous f ∈ Cb(S).

(c) implies (d): Let F ⊂ S be closed and for every ε > 0 define

fε(x) = max(1 − d(x, F)/ε, 0), for all x ∈ S,

where d(x, F) = inf_{y∈F} d(x, y). Then fε is uniformly continuous and belongs to Cb(S). Since fε(x) = 1 for all x ∈ F, it holds that 1_F(x) ≤ fε(x) for all x ∈ S, and hence

Pn(F) = ∫ 1_F dPn ≤ ∫ fε dPn.

The sequence {Pn(F)} is bounded, so we have that

lim sup_{n→∞} Pn(F) ≤ lim sup_{n→∞} ∫ fε dPn = ∫ fε dP,

where the equality follows from (c). Since this holds for all ε > 0,

lim sup_{n→∞} Pn(F) ≤ lim_{ε→0} ∫ fε dP = ∫ 1_F dP = P(F),

because fε decreases pointwise to 1_F as ε ↓ 0 (here F is closed), so the dominated convergence theorem applies.

(d) implies (e): If G ⊂ S is open, then

lim inf_{n→∞} Pn(G) = lim inf_{n→∞} (1 − Pn(G^c)) = 1 − lim sup_{n→∞} Pn(G^c) ≥ 1 − P(G^c) = P(G),

where the inequality follows from (d) applied to the closed set G^c.

(e) implies (f): Let A ⊂ S be a P-continuity set and write Ā for the closure and A° for the interior of A. Applying (e) to the open set (Ā)^c gives

lim sup_{n→∞} Pn(A) ≤ lim sup_{n→∞} Pn(Ā) = lim sup_{n→∞} (1 − Pn((Ā)^c)) = 1 − lim inf_{n→∞} Pn((Ā)^c) ≤ 1 − P((Ā)^c) = P(Ā) = P(A),

and applying (e) to the open set A° gives

lim inf_{n→∞} Pn(A) ≥ lim inf_{n→∞} Pn(A°) ≥ P(A°) = P(A),

where P(Ā) = P(A°) = P(A) because P(∂A) = 0. From this it can be concluded that lim_{n→∞} Pn(A) = P(A).

(f) implies (b): Let f ≥ 0 be a function in Cb(S). Then P{x : f(x) = t} = 0 for all but at most countably many t ∈ (0, ‖f‖). Since

∂{x : f(x) ≥ t} ⊂ {x : f(x) = t},

it follows that {x : f(x) ≥ t} is a P-continuity set for almost every t ∈ (0, ‖f‖). Using the dominated convergence theorem and (f) gives

lim_{n→∞} ∫ f dPn = lim_{n→∞} ∫_0^{‖f‖} Pn{x : f(x) ≥ t} dt = ∫_0^{‖f‖} P{x : f(x) ≥ t} dt = ∫ f dP,

for all non-negative functions in Cb(S). By the linearity of the integral, this holds for all f ∈ Cb(S), and hence Pn →w P.

(e) implies (a): This part of the proof is under the assumption that (S, d) is separable. Let ε > 0 be arbitrary. Since S is separable, there exist E1, E2, ... ∈ B(S) such that S = ⋃_{i=1}^∞ Ei, Ei ∩ Ej = ∅ for all i ≠ j and

sup{d(x, y) : x, y ∈ Ei} < ε/2, for all i ∈ N.   (3.7)

Let N be the smallest positive integer such that

P(⋃_{i=1}^N Ei) > 1 − ε/2.

Define G as the collection of open sets of the form (⋃_{i∈I} Ei)^{ε/2}, where I ⊂ {1, ..., N}. Since G is a finite collection, say G = {G1, ..., Gk}, it follows from condition (e) that for each i = 1, ..., k there exists a positive integer ni such that

P(Gi) ≤ Pn(Gi) + ε/2, for all n ≥ ni.

If n0 = max{n1, ..., nk}, then

P(G) ≤ Pn(G) + ε/2, for all G ∈ G and n ≥ n0.

Let F ∈ C be arbitrary and define

F0 = ⋃{Ei : 1 ≤ i ≤ N, Ei ∩ F ≠ ∅}.

Since F0^{ε/2} ∈ G, it holds that

P(F) = P(F ∩ ⋃_{i=1}^N Ei) + P(F ∩ (⋃_{i=1}^N Ei)^c)
     ≤ P(F0) + P((⋃_{i=1}^N Ei)^c)
     ≤ P(F0^{ε/2}) + ε/2
     ≤ Pn(F0^{ε/2}) + ε
     ≤ Pn(F^ε) + ε, for all n ≥ n0,

where the last inequality is true because equation (3.7) implies that F0^{ε/2} ⊂ F^ε. It follows that ρ(Pn, P) ≤ ε for all n ≥ n0. So for every ε > 0 there exists a positive integer n0 such that ρ(Pn, P) ≤ ε for all n ≥ n0, which means that

lim_{n→∞} ρ(Pn, P) = 0.
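A standard example illustrating theorem 3.10 (added here for concreteness, not from the thesis): Pn = δ_{1/n} converges weakly to P = δ_0 on R, yet Pn(A) fails to converge to P(A) for A = (−∞, 0], which is not a P-continuity set.

```python
import math

# P_n = delta_{1/n} and P = delta_0 on R. For bounded continuous f,
# ∫ f dP_n = f(1/n) -> f(0) = ∫ f dP, which is weak convergence (b).
f = math.atan  # one bounded continuous test function
gaps = [abs(f(1.0 / n) - f(0.0)) for n in (1, 10, 100, 1000)]
# A = (-inf, 0] has boundary {0} and P({0}) = 1, so A is not a
# P-continuity set; indeed P_n(A) = 0 for every n while P(A) = 1,
# showing why statement (f) is restricted to P-continuity sets.
```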

3.4 Convergence Determining Sets

To conclude weak convergence of a sequence of Borel probability measures, it is not necessary to verify the convergence in (3.5) for all f ∈ Cb(S). This section gives a characterization of sets M ⊂ Cb(S) such that if the convergence in (3.5) holds for all f ∈ M, then it also holds for all f ∈ Cb(S). Let B(S) denote the space of all bounded real-valued functions defined on S. A sequence of functions f1, f2, ... ∈ B(S) is said to converge boundedly and pointwise to f ∈ B(S) if fn converges pointwise to f and sup_n ‖fn‖ < ∞. This is denoted by

bp-lim_{n→∞} fn = f.   (3.8)

A set M ⊂ B(S) is called bp-closed if whenever f1, f2, ... ∈ M, f ∈ B(S) and (3.8) holds, then f ∈ M. The bp-closure of M is the smallest bp-closed set containing M. A set M ⊂ B(S) is said to be bp-dense in B(S) if the bp-closure of M equals B(S). Note that if M is bp-dense in B(S) and f ∈ B(S), there need not exist a sequence f1, f2, ... ∈ M such that (3.8) holds. Separating and convergence determining sets are defined as follows. A set M ⊂ Cb(S) is called separating if whenever P, Q ∈ P(S) and

∫ f dP = ∫ f dQ, for all f ∈ M,

we have P = Q. A set M ⊂ Cb(S) is called convergence determining if whenever P, P1, P2, ... ∈ P(S) and

lim_{n→∞} ∫ f dPn = ∫ f dP, for all f ∈ M,   (3.9)

we have Pn →w P. If a set is convergence determining, then it is also separating. The converse is false in general, but the following results give sufficient conditions under which it does hold.

Lemma 3.11. Let {P1, P2, ...} ⊂ P(S) be relatively compact, let P ∈ P(S) and suppose that M ⊂ Cb(S) is separating. If (3.9) holds, then Pn →w P.

Proof. See section 3.4 in [4].

Recall from section 2.1 that a set M ⊂ Cb(S) is said to separate points if for all x, y ∈ S with x ≠ y there exists a function h ∈ M such that h(x) ≠ h(y). Furthermore, a set M ⊂ Cb(S) is said to strongly separate points if for all x ∈ S and δ > 0 there exists a finite set {h1, ..., hk} ⊂ M such that

inf_{y : d(y,x) ≥ δ} max_{i∈{1,...,k}} |hi(y) − hi(x)| > 0.

The concluding theorem of this chapter relates these definitions to separating and convergence determining sets. For this theorem, recall the definition of an algebra of functions given in section 2.1.

Theorem 3.12. Let (S, d) be complete and separable. If M ⊂ Cb(S) is an algebra, then the following statements hold.

(a) If M separates points, then M is separating.

(b) If M strongly separates points, then M is convergence determining.

Proof. See section 3.4 in [4].


Chapter 4

Convergence of Random Measures

In this chapter, a theory for convergence of sequences of random measures is presented. The results will in particular hold for random probability measures, but the theory is given for more general random measures. Section 1 is devoted to the study of convergence of random elements in a metric space S. Then in section 2, this theory is specialized to the case when S is a space of measures, and some important uniqueness results are given. Finally, section 3 gives the key theorem for verifying convergence of a sequence of random measures. The chapter is based on Kallenberg [7].

4.1 Convergence in Distribution of Random Elements

In the previous chapter, convergence of probability measures was studied. Sometimes it is more convenient to consider convergence of random elements, and in this section the corresponding theory is given. A sequence of random elements ξ1, ξ2, ... is said to converge in distribution to the random element ξ if P ∘ ξn^{-1} →w P ∘ ξ^{-1}, or equivalently

lim_{n→∞} E(f(ξn)) = E(f(ξ)), for all f ∈ Cb(S).

This is denoted by ξn →d ξ. The following theorem, which is important for later purposes, gives a relation between convergence in distribution and convergence in probability.

Theorem 4.1. If ξ1, ξ2, ... are random elements in a metric space (S, d) and ξ is a non-random element of (S, d), then ξn →d ξ if and only if ξn →p ξ.

Proof. Let µn = P ∘ ξn^{-1} for n ∈ N and let µ = P ∘ ξ^{-1}. By definition, ξn →d ξ if and only if µn →w µ, which by theorem 3.10 is equivalent to µn(A) → µ(A) for all µ-continuity sets A ⊂ S. Since ξ is non-random, µ = δ_{x0} for some x0 ∈ S, so it is enough to show that ξn →p ξ if and only if

µn(A) → δ_{x0}(A), for all δ_{x0}-continuity sets A ⊂ S.   (4.1)


First assume that ξn →p ξ and let A ⊂ S be a δ_{x0}-continuity set. Then x0 ∉ ∂A, so if x0 ∈ A, there exists ε0 > 0 such that B(x0, ε0) ⊂ A and

|µn(A) − δ_{x0}(A)| = |µn(A) − 1|
                   = µn(A^c)
                   ≤ µn(B(x0, ε0)^c)
                   = P{ξn ∈ B(x0, ε0)^c}
                   = P{d(ξn, x0) ≥ ε0} → 0 as n → ∞.

If x0 ∉ A, then since x0 ∉ ∂A, there exists ε0 > 0 such that B(x0, ε0) ⊂ (Ā)^c, where Ā denotes the closure of A, and

|µn(A) − δ_{x0}(A)| = |µn(A) − 0|
                   ≤ µn(Ā)
                   = 1 − µn((Ā)^c)
                   ≤ 1 − µn(B(x0, ε0))
                   = µn(B(x0, ε0)^c)
                   = P{d(ξn, x0) ≥ ε0} → 0 as n → ∞.

Conversely, assume that (4.1) holds. Then for any ε > 0,

lim_{n→∞} P{d(ξn, x0) ≥ ε} = lim_{n→∞} µn(B(x0, ε)^c) = δ_{x0}(B(x0, ε)^c) = 0,

since B(x0, ε)^c is a δ_{x0}-continuity set for every ε > 0.
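A small simulation of theorem 4.1 (an added illustration; the choice of uniform noise is arbitrary): take ξn = x0 + Un/n with Un uniform on [−1, 1] and ξ = x0 non-random.

```python
import random

# xi_n = x0 + U_n / n with U_n uniform on [-1, 1] and xi = x0 non-random.
# Since |xi_n - x0| <= 1/n, P{d(xi_n, x0) > eps} = 0 once n > 1/eps,
# so xi_n converges in probability and, by theorem 4.1, in distribution.
random.seed(0)
x0, eps, n = 5.0, 0.1, 100
xi_n = [x0 + random.uniform(-1.0, 1.0) / n for _ in range(1_000)]
exceed = sum(abs(x - x0) > eps for x in xi_n)  # 0, since 1/n = 0.01 < eps
```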

The concept of tightness of random elements is defined as follows. A sequence of random elements ξ1, ξ2, ... in a metric space (S, d) is said to be tight if for every ε > 0 there exists a compact set K ⊂ S such that

lim inf_{n→∞} P{ξn ∈ K} ≥ 1 − ε.

In the case when S is separable and complete, the 'lim inf' can be replaced by 'inf'. The next result is Prohorov's theorem formulated for sequences of random elements. For this, the following definition is needed. A sequence of random elements ξ1, ξ2, ... is said to be relatively compact in distribution if every subsequence has a further subsequence that converges in distribution.

Theorem 4.2. If ξ1, ξ2, ... are random elements in a complete and separable metric space (S, d), then {ξn} is tight if and only if {ξn} is relatively compact in distribution.

Proof. By definition, {ξn} is tight if and only if M = {P ∘ ξ1^{-1}, P ∘ ξ2^{-1}, ...} is tight. According to theorem 3.8, this is equivalent to relative compactness of M with respect to the Prohorov metric. Since (S, d) is separable, it follows from theorem 3.10 that convergence in the Prohorov metric is equivalent to weak convergence. This means that every sequence in M has a subsequence that converges weakly, which is equivalent to {ξn} being relatively compact in distribution.

The next lemma will be used in the proof of a tightness criterion for random measures.


Lemma 4.3. Let S and T be complete and separable metric spaces and let f : S → T be a continuous mapping. If {ξn} is a sequence of random elements which is tight in S, then {f(ξn)} is tight in T.

Proof. For any function g ∈ Cb(T) it holds that g ∘ f ∈ Cb(S). So if ξn →d ξ, then for any g ∈ Cb(T),

lim_{n→∞} E((g ∘ f)(ξn)) = E((g ∘ f)(ξ)),

which is the same as f(ξn) →d f(ξ). This implies that {f(ξn)} is relatively compact in distribution if {ξn} is. By Prohorov's theorem, relative compactness in distribution is equivalent to tightness for complete and separable metric spaces, so the result follows.

The following theorem is used in the proof of the main theorem for convergence of random measures.

Theorem 4.4. Let S and T be metric spaces and let ξ, ξ1, ξ2, ... be random elements in S such that ξn →d ξ. Suppose that f, f1, f2, ... : S → T are measurable functions and that C ⊂ S is a measurable set such that ξ ∈ C a.s. Then fn(ξn) →d f(ξ) if fn(sn) → f(s) whenever sn → s ∈ C.

Proof. Let G ⊂ T be an open set and suppose that s ∈ f^{-1}(G) ∩ C. Since G is open, it follows from the assumptions of the theorem that there exist a neighbourhood N of s and a number m ∈ N such that fk(s') ∈ G for all k ≥ m and s' ∈ N. This implies that N ⊂ ⋂_{k=m}^∞ fk^{-1}(G), and since N is a neighbourhood of s, it follows that s ∈ Tm°, where

Tm = ⋂_{k=m}^∞ fk^{-1}(G)

and Tm° denotes the interior of Tm. Such an m exists for arbitrary s ∈ f^{-1}(G) ∩ C, so we have

f^{-1}(G) ∩ C ⊂ ⋃_{m=1}^∞ Tm°.

Let µ, µ1, µ2, ... be the distributions of ξ, ξ1, ξ2, ..., respectively. Theorem 2.7 (c) and the assumption that ξ ∈ C almost surely give

µ(f^{-1}(G)) = µ(f^{-1}(G) ∩ C) ≤ µ(⋃_{m=1}^∞ Tm°) = sup_m µ(Tm°).   (4.2)

By theorem 3.10 (e) and the fact that Tm° ⊂ fn^{-1}(G) for all n ≥ m, it follows that

µ(Tm°) ≤ lim inf_{n→∞} µn(Tm°) ≤ lim inf_{n→∞} µn(fn^{-1}(G)).   (4.3)

Equations (4.2) and (4.3) together yield

µ ∘ f^{-1}(G) ≤ lim inf_{n→∞} µn ∘ fn^{-1}(G).

Using theorem 3.10 (e) again, it follows that µn ∘ fn^{-1} →w µ ∘ f^{-1}, which is equivalent to fn(ξn) →d f(ξ), since µ ∘ f^{-1}, µ1 ∘ f1^{-1}, µ2 ∘ f2^{-1}, ... are the distributions of f(ξ), f1(ξ1), f2(ξ2), ..., respectively.


4.2 Random Measures

Now the focus is turned to the case when the random elements are measure valued. In Kallenberg [7], the random measures considered are assumed to take values on a topological space S which is locally compact, second countable and Hausdorff. To avoid any discussion about topology, this section will only consider random measures on a metric space (S, d), where S is an open subset of the Euclidean space R^d. This class of spaces forms a special case of the ones satisfying the conditions mentioned above. Let S ⊂ R^d be open and denote by B̂(S) the class of all relatively compact sets in B(S). A measure µ on S is said to be locally finite if

µ(B) < ∞, for all B ∈ B̂(S).

Denote by M(S) the space of all locally finite measures defined on S. If f is a continuous function, then the support of f, written supp f, is defined as the closure of the set {x : f(x) ≠ 0}. Let C_K^+(S) denote the family of continuous functions f : S → R+ with compact support. Before stating the next theorem, the concept of vague convergence is needed. A sequence of measures µ1, µ2, ... ∈ M(S) is said to converge vaguely to µ ∈ M(S), denoted µn →v µ, if

∫ f dµn → ∫ f dµ, for all f ∈ C_K^+(S).

Note that C_K^+(S) ⊂ Cb(S), so if µn →w µ, then µn →v µ.

Theorem 4.5. Let f1, f2, ... be dense in C_K^+(S), where S ⊂ R^d is open, and define

ρ̂(µ, ν) = Σ_{k=1}^∞ 2^{-k} min(|∫ fk dµ − ∫ fk dν|, 1), for all µ, ν ∈ M(S).

Then (M(S), ρ̂) is a complete and separable metric space, and convergence in this metric is equivalent to vague convergence.

Proof. See section A2 in [7].
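The difference between vague and weak convergence can be seen in a one-line example (an added illustration, not from the thesis): on S = R, the point masses µn = δn converge vaguely to the zero measure, because any f ∈ C_K^+(S) eventually vanishes at n, while µn(R) = 1 for every n.

```python
# mu_n = delta_n on S = R, and f is continuous, non-negative, with
# supp f = [-1, 1], so f ∈ C_K^+(S) and ∫ f d(mu_n) = f(n).
def f(x):
    return max(1.0 - abs(x), 0.0)

integrals = [f(n) for n in (1, 2, 5, 10)]  # ∫ f d(mu_n) = 0 for n >= 1
```

Since ∫ f dµn → 0 = ∫ f d(zero measure) for every such f, the sequence converges vaguely; the mass escapes to infinity, which vague convergence of locally finite measures permits but weak convergence of probability measures does not.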

Now everything is set up for the following definition. A random measure is a random element in (M(S), ρ̂), that is, a measurable mapping from the event space (Ω, F, P) to (M(S), ρ̂). If µ ∈ M(S), define

B̂_µ(S) = {B ∈ B̂(S) : µ(∂B) = 0},

and if ξ is a random measure, define

B̂_ξ(S) = {B ∈ B̂(S) : ξ(∂B) = 0 a.s.}.

A set M ⊂ M(S) is called vaguely relatively compact if every sequence µ1, µ2, ... ∈ M has a subsequence that converges vaguely to some element µ ∈ M(S). From theorem 4.5 it follows that a set M is vaguely relatively compact if and only if the closure of M is compact with respect to the metric ρ̂. Two properties of vague convergence are stated in the following theorem.


Theorem 4.6. If S is an open subset of R^d, then the following statements hold.

(a) A set M ⊂ M(S) is vaguely relatively compact if and only if

sup_{µ∈M} ∫ f dµ < ∞, for all f ∈ C_K^+(S).

(b) If µn →v µ, then µn(B) → µ(B) for all B ∈ B̂_µ(S).

Proof. See section A2 in [7].

To prove an important uniqueness result for random measures, we will use the following result.

Theorem 4.7. If m ∈ M(S), the Borel σ-algebra over M(S) is generated by the sets

{{µ : µ(B) ∈ A} : A ∈ B(R+), B ∈ B̂_m(S)}.

Proof. See section A2 in [7].

The following two uniqueness results will be used in the proof of the main convergence result of this chapter.

Lemma 4.8. If ξ and η are two random measures on some open set S ⊂ R^d such that

(ξ(B1), ..., ξ(Bk)) =d (η(B1), ..., η(Bk)), for all B1, ..., Bk ∈ B̂_{ξ+η}(S), k ∈ N,

then ξ =d η.

Proof. The monotone class theorem stated in section 2.2 will be used to prove this result. Define

D = {M ∈ B(M(S)) : P ∘ ξ^{-1}(M) = P ∘ η^{-1}(M)}.

To show that this class of subsets is a λ-system, first note that M(S) ∈ D since

P ∘ ξ^{-1}(M(S)) = P ∘ η^{-1}(M(S)) = 1.

Then let M1, M2 ∈ D with M2 ⊂ M1. By basic properties of measures,

P ∘ ξ^{-1}(M1 \ M2) = P ∘ ξ^{-1}(M1) − P ∘ ξ^{-1}(M2)
                   = P ∘ η^{-1}(M1) − P ∘ η^{-1}(M2)
                   = P ∘ η^{-1}(M1 \ M2),

so M1 \ M2 ∈ D. To prove the last property of a λ-system, let M1, M2, ... be an increasing sequence of sets from D. By theorem 2.7 (c),

P ∘ ξ^{-1}(⋃_{i=1}^∞ Mi) = lim_{i→∞} P ∘ ξ^{-1}(Mi) = lim_{i→∞} P ∘ η^{-1}(Mi) = P ∘ η^{-1}(⋃_{i=1}^∞ Mi),


so ⋃_{i=1}^∞ Mi ∈ D. Then define

C = {{µ : µ(B1) ∈ A1, ..., µ(Bk) ∈ Ak} : A1, ..., Ak ∈ B(R+), B1, ..., Bk ∈ B̂_{ξ+η}(S), k ∈ N}.

To show that C is a π-system, let M1 and M2 be two arbitrary sets in C, that is,

Mi = {µ : µ(B1^i) ∈ A1^i, ..., µ(B_{ki}^i) ∈ A_{ki}^i}, i = 1, 2.

Then the intersection of M1 and M2 is

{µ : µ(B1^1) ∈ A1^1, ..., µ(B_{k1}^1) ∈ A_{k1}^1, µ(B1^2) ∈ A1^2, ..., µ(B_{k2}^2) ∈ A_{k2}^2}.

This set belongs to C, so it follows that C is a π-system. If M ∈ C, then by the assumption in the lemma,

P ∘ ξ^{-1}(M) = P{ξ ∈ {µ : µ(B1) ∈ A1, ..., µ(Bk) ∈ Ak}}
             = P{ξ(B1) ∈ A1, ..., ξ(Bk) ∈ Ak}
             = P{η(B1) ∈ A1, ..., η(Bk) ∈ Ak}
             = P{η ∈ {µ : µ(B1) ∈ A1, ..., µ(Bk) ∈ Ak}}
             = P ∘ η^{-1}(M),

so M ∈ D and hence C ⊂ D. It follows from the monotone class theorem that σ(C) ⊂ D and from theorem 4.7 that σ(C) = B(M(S)), so it can be concluded that ξ =d η.

Lemma 4.9. If ξ and η are two random measures on some open set S ⊂ R^d such that

∫ f dξ =d ∫ f dη, for all f ∈ C_K^+(S),

then ξ =d η.

Proof. First define the σ-algebras F and G as follows:

F = σ({µ : ∫ f dµ ∈ A} : f ∈ C_K^+(S), A ∈ B(R+)),
G = σ({µ : µ(B) ∈ A} : B ∈ B̂(S), A ∈ B(R+)).

Almost by following the proof of lemma 4.8 line by line, it can be shown that

P ∘ ξ^{-1}(M) = P ∘ η^{-1}(M), for all M ∈ F.

By theorem 4.7, the Borel σ-algebra over M(S) is generated by a subset of G, so it is enough to show that G ⊂ F. For any f ∈ C_K^+(S), define the mapping πf by

πf(µ) = ∫ f dµ, for all µ ∈ M(S),

and for any B ∈ B(S), define the mapping πB by

πB(µ) = µ(B), for all µ ∈ M(S).


To prove that G ⊂ F, it is enough to show that πB is F-measurable for any B ∈ B̂(S). First let K ⊂ S be an arbitrary compact set and choose a sequence of functions f1, f2, ... ∈ C_K^+(S) such that fn ↓ 1_K. Then πf1, πf2, ... are F-measurable functions that converge pointwise to πK on M(S), so it follows that

πK is F-measurable for any compact K ⊂ S.   (4.4)

Now let K ⊂ S be a fixed compact set. The monotone class theorem will be used to prove that πB is F-measurable for any Borel set B ⊂ K. Define

D = {B ∈ B(K) : πB is F-measurable}.

To prove that this is a λ-system, first note that it follows from (4.4) that K ∈ D. Then let B1, B2, ... be a sequence of increasing sets in D and let B' = ⋃_{i=1}^∞ Bi. By theorem 2.7 (c),

lim_{i→∞} πBi(µ) = lim_{i→∞} µ(Bi) = µ(B') = πB'(µ),

for any µ ∈ M(S), so πB' is F-measurable and hence B' ∈ D. To prove the third property of a λ-system, let A and B be Borel subsets of K such that B ⊂ A, and note that

πA\B(µ) = µ(A \ B) = µ(A) − µ(B) = πA(µ) − πB(µ),

for any µ ∈ M(S). Since the difference of two measurable functions is again measurable, it follows that A \ B ∈ D. Now define

C = {B ∈ B(K) : B is compact}.

This family of subsets is a π-system since the intersection of any two compact sets is again a compact set, and (4.4) implies that C ⊂ D. By the monotone class theorem, σ(C) ⊂ D, and since σ(C) = B(K), it follows that πB is F-measurable for any Borel subset B of K. In particular, this holds for any B ∈ B̂(K). Since K was arbitrary, this proves the lemma.

4.3 Convergence in Distribution of Random Measures

The proof of the key theorem for convergence of random measures uses the following tightness criterion.

Lemma 4.10. If ξ1, ξ2, ... are random measures on some open set S ⊂ R^d, then the following three statements are equivalent.

(a) {ξn} is relatively compact in distribution.

(b) {∫ f dξn} is tight in R+ for every f ∈ C_K^+(S).

(c) {ξn(B)} is tight in R+ for every B ∈ B̂(S).


Proof. (a) implies (b): Assume that {ξn} is relatively compact in distribution. Since (M(S), ρ̂) is complete and separable, it follows from theorem 4.2 that relative compactness in distribution is equivalent to tightness for {ξn}. Let f ∈ C_K^+(S) be arbitrary and define the mapping πf : M(S) → R+ by πf(µ) = ∫ f dµ. Since this mapping is continuous, it follows from lemma 4.3 that {πf(ξn)} is tight in R+.

(b) implies (c): Let B ∈ B̂(S) be arbitrary and let B̄ denote its closure. Choose open relatively compact sets G1, G2, ... ⊂ S such that B̄ ⊂ ⋃_{i=1}^∞ Gi. Since B̄ is compact, there exist i1, ..., im such that C = Gi1 ∪ ... ∪ Gim covers B̄. Here C is relatively compact because a finite union of relatively compact sets is again relatively compact. Then we can choose f ∈ C_K^+(S) such that f = 1 on B̄ and supp f ⊂ C̄. Note that ξn(B) ≤ ∫ f dξn for any n. From this and the assumption, it follows that for every ε > 0 there exists a positive real number r satisfying

inf_n P{ξn(B) ≤ r} ≥ inf_n P{∫ f dξn ≤ r} ≥ 1 − ε,

so {ξn(B)} is tight.

(c) implies (a): Let G1, G2, ... ∈ B̂(S) be open sets such that S = ⋃_{k=1}^∞ Gk. Then for each ε > 0 and k ∈ N, there exists a positive real number rk such that

inf_n P{ξn(Gk) ≤ rk} ≥ 1 − ε 2^{-k}.

Define A = ⋂_{k=1}^∞ {µ : µ(Gk) ≤ rk} and let f ∈ C_K^+(S) be arbitrary. Since f has compact support, there exist finitely many indices i1, ..., im such that supp f ⊂ Gi1 ∪ ... ∪ Gim. From this it follows that

sup_{µ∈A} ∫ f dµ ≤ sup_{µ∈A} Σ_{k=1}^m ∫_{Gik} f dµ ≤ sup_{µ∈A} Σ_{k=1}^m ‖f‖ µ(Gik) ≤ ‖f‖ Σ_{k=1}^m rik < ∞,

so the set A is relatively compact by the first statement of theorem 4.6. This implies that Ā, the closure of A, is compact and

inf_n P{ξn ∈ Ā} ≥ inf_n P{ξn ∈ A}
             = 1 − sup_n P{ξn ∈ A^c}
             = 1 − sup_n P{ξn ∈ ⋃_{k=1}^∞ {µ : µ(Gk) > rk}}
             ≥ 1 − sup_n Σ_{k=1}^∞ P{ξn(Gk) > rk}
             ≥ 1 − Σ_{k=1}^∞ sup_n P{ξn(Gk) > rk}
             = 1 − Σ_{k=1}^∞ (1 − inf_n P{ξn(Gk) ≤ rk})
             ≥ 1 − Σ_{k=1}^∞ ε 2^{-k} = 1 − ε,

so {ξn} is tight and hence, by theorem 4.2, relatively compact in distribution.