Spectral Theory for Perron-Frobenius operators


Examensarbete i matematik, 30 hp Handledare: Michael Benedicks Examinator: Denis Gaidashev September 2019

Department of Mathematics Uppsala University


Wouter Slegers


Introduction

1 Estimates for the dynamical system

1.1 Some definitions and preparatory statements

1.1.1 On The Strong Law of Large Numbers and Central Limit Theorem

1.1.2 The Perron-Frobenius operator

1.1.3 A suitable subspace of L¹(m)

1.2 A Lasota-Yorke inequality

1.2.1 Distortion estimates

1.2.2 Proof of the Lasota-Yorke inequality

1.3 Approximation of P by a finite rank operator

2 Spectral information

2.1 Existence of a unique absolutely continuous f-invariant measure

2.2 The spectrum of the operator P

2.2.1 A spectral gap

2.2.2 Eigenvalues on the unit circle and the Koopman operator

2.3 Exponential decay of correlations and the Central Limit Theorem

Bibliography


About this thesis

In this thesis we discuss the spectral theory of Perron-Frobenius operators for expanding piecewise C² functions. We use it to conclude that such dynamical systems have exponential decay of correlations and that the Strong Law of Large Numbers and the Central Limit Theorem hold for them. It is a comprehensible, self-contained exposition of perhaps the simplest case discussed in [9].

Introduction

We will be investigating statistical properties of certain dynamical systems. Our interest lies in dynamical systems f : [0, 1] → [0, 1], where f is surjective and piecewise continuous, that exhibit “chaotic” behaviour. To be able to study the behaviour of f, we call a measurable function h : [0, 1] → C an “observable”. We will consider $(h \circ f^i)_{i \geq 0}$ as a sequence of random variables, where h ◦ fⁱ is the observation made by h at time i.

Quite a lot is known about sequences of independent, identically distributed random variables. Most famous among the results are possibly the Strong Law of Large Numbers and the Central Limit Theorem.

In the classic example of such a sequence, the coin flip, the former theorem tells us that the fraction of heads will converge to 1/2 with probability 1. The latter tells us that, with appropriate normalization, the probability distribution of the number of heads that will appear will converge to a normal distribution.

Our sequence h ◦ fⁱ is quite far from being pairwise independent. Generally, for small i, the dependence between h and h ◦ fⁱ is significant. With that in mind, we turn our attention to the behaviour for large i. We expect that, with sufficient “chaotic” properties imposed on f, the correlation between h and h ◦ fⁱ decreases as i increases. For this reason it is worth investigating whether results such as the Strong Law of Large Numbers and the Central Limit Theorem hold for our sequence, as they both depend on letting i go to infinity.

Our goal is to show that, indeed, for functions f with certain “chaotic” behaviour, we have exponential decay of correlations for our sequence h ◦ fⁱ and that, for this sequence, the Strong Law of Large Numbers and the Central Limit Theorem hold. Our approach will make use of the Perron-Frobenius operator, also called the transfer operator, an operator directly associated to f. In Chapter 1 we establish various estimates regarding this operator. In Chapter 2 we use these results to investigate the spectral properties of this operator and prove the presence of a spectral gap, and then complete the discussion. In this thesis, appropriate definitions and notions will be introduced and the steps will be worked out in detail so as to make this a self-contained treatment of the subject.

Acknowledgements

I wish to thank my supervisor Michael Benedicks for his indispensable help, explaining the different concepts and approaches.


Estimates for the dynamical system

We begin by introducing most of the necessary definitions and concepts in the first section of this chapter.

In the last two sections we do most of the work, in the form of important estimates, which we will use in Chapter 2.

1.1 Some definitions and preparatory statements

We equip the unit interval with the Borel algebra. We consider a dynamical system f : [0, 1] → [0, 1].

Let h : [0, 1] → C measurable. Before we can investigate the (in)dependence of our sequence h ◦ fⁱ, we want the sequence to be identically distributed. Firstly, to see [0, 1] as a probability space and h ◦ fⁱ as a sequence of random variables, we must have a probability measure on [0, 1].

Definition 1.1

A measure µ on [0, 1] is called invariant for f : [0, 1] → [0, 1] if for every measurable A ⊂ [0, 1] we have µ(A) = µ(f⁻¹(A)).

For an f-invariant measure µ the probability of event A at time i, $\mu(\{f^i x \in A\}) = \mu(f^{-i}\{x \in A\})$, is the same as the probability of event A at time 0, $\mu(\{x \in A\})$. As a result, the sequence h ◦ fⁱ is then identically distributed in the probability space ([0, 1], µ).

Note that $\int h \circ f^i \, d\mu$ is the mean of the random variable h ◦ fⁱ. Since the sequence h ◦ fⁱ is identically distributed, we have $\int h \, d\mu = \int h \circ f^i \, d\mu$ for any i. Actually, an equivalent definition for the f-invariance of a measure µ is that

\[ \int h \, d\mu = \int h \circ f \, d\mu \]

holds for any h ∈ L¹(µ). One can prove the equivalence of the definitions using that A is a measurable set if and only if $\mathbb{1}_A \in L^1(\mu)$.
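As a numerical illustration of this integral characterisation, the following minimal sketch checks it for one concrete case. It assumes the tent map f(x) = 1 − 2|x − 1/2| (the map shown later in Figure 1.2), takes µ to be the Lebesgue measure m, which is known to be invariant for this map, and uses a hypothetical observable h(x) = x². The function names and the midpoint-rule quadrature are illustrative choices, not part of the thesis.

```python
def tent(x):
    """Tent map f(x) = 1 - 2|x - 1/2| (assumed example map)."""
    return 1.0 - 2.0 * abs(x - 0.5)


def integrate(func, n=1024):
    """Midpoint-rule approximation of the Lebesgue integral of func over [0, 1]."""
    return sum(func((k + 0.5) / n) for k in range(n)) / n


def observable(x):
    """Hypothetical observable h(x) = x^2."""
    return x * x


# Approximates the integral of h with respect to m.
mean_h = integrate(observable)
# Approximates the integral of h o f with respect to m.
mean_h_after_f = integrate(lambda x: observable(tent(x)))

print(mean_h, mean_h_after_f)  # both are close to 1/3
```

The two numbers agree up to quadrature error, as invariance of m under the tent map predicts.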

We shall write m for the Lebesgue measure. To find an f-invariant measure we will utilize the Lasota-Yorke Theorem. For the purpose of this theorem we remind the reader of the notion of absolute continuity.

Definition 1.2

A measure µ is absolutely continuous w.r.t m if for every measurable set A we have that m(A) = 0 implies µ(A) = 0.

Remark 1.3: A measure µ is absolutely continuous w.r.t. m if and only if there exists an h ∈ L¹(m) such that µ = h · m, i.e.

\[ \mu(A) = \int_A h \, dm \]

for all measurable A. One calls h the Radon-Nikodym derivative, denoted h = dµ/dm.

Henceforth, when we write “absolutely continuous” we mean with respect to the Lebesgue measure m. The Lasota-Yorke Theorem relies on the notion we have, until now, referred to as “certain chaotic behaviour for f ”. We define it here.

Figure 1.1: An expanding piecewise C² function (the plot labels the branch f₂, the interval I₂ and its image J₂).

Definition 1.4

We define f : [0, 1] → [0, 1] to be an expanding piecewise C² function if it satisfies the following requirements.

• There exists a finite set $0 = x_1 < x_2 < \dots < x_n < x_{n+1} = 1$.

• For every 1 ≤ i ≤ n, f is C² on $(x_i, x_{i+1})$ and can be extended to a C² function on $[x_i, x_{i+1}]$.

• For every 1 ≤ i ≤ n we have $|Df|\big|_{(x_i, x_{i+1})} \geq \lambda > 1$.

See Figure 1.1 for an example of an expanding piecewise C² function.

Henceforth, when we talk of an expanding piecewise C² function f, we write $(x_1, \dots, x_{n+1})$ for the smallest set satisfying the definition and λ for the constant used. For each 1 ≤ i ≤ n we write $I_i := (x_i, x_{i+1})$ for the intervals and $J_i := f(I_i)$ for their images. Lastly we write $f_i := f|_{(x_i, x_{i+1})}$ for the separate continuous parts of f and $\varphi_i := f_i^{-1}$ for their inverses.

Remark 1.5: Since, for an expanding piecewise C² function f, the separate differentiable parts $f_i$ can be extended to a C² function on $[x_i, x_{i+1}]$, D²f is bounded on $(x_i, x_{i+1})$ for any 1 ≤ i ≤ n. Because there are only finitely many such intervals we may use that there exists a constant K such that

\[ |D^2 f| \leq K \quad \text{wherever } D^2 f \text{ is defined.} \]

Definition 1.6

An expanding function f is called full-branched if for every 1 ≤ i ≤ n the interval (xi, xi+1) is mapped to all of (0, 1), i.e. Ji= (0, 1) for all 1 ≤ i ≤ n.

Theorem 1.7 (Lasota-Yorke [1973])

If f is an expanding piecewise C²function then there exists an absolutely continuous f -invariant measure.

We give a proof of this theorem later, in Chapter 2. First we will have to establish a few more facts.

1.1.1 On The Strong Law of Large Numbers and Central Limit Theorem

Conditions that ensure that the Strong Law of Large Numbers holds for a sequence h ◦ fⁱ turn out to be fairly simple.

Definition 1.8

An ergodic system f is a dynamical system on a probability space ([0, 1], µ), where, for any invariant set, i.e. a measurable set A such that A = f⁻¹A, we have either µ(A) = 0 or µ(A) = 1.

When we have an ergodic system f we can use Birkhoff’s Ergodic Theorem, see e.g. [8]. This tells us that for h ∈ L¹(µ) we have, for µ-almost every x ∈ [0, 1],

\[ \frac{1}{N} \sum_{i=0}^{N-1} h \circ f^i(x) \to \int h \, d\mu, \tag{1.1} \]

as N → ∞. In other words, the average of the sequence h ◦ fⁱ converges to its mean with probability 1. This is exactly the Strong Law of Large Numbers.

Now it is the validity of the Central Limit Theorem that we are interested in investigating further. For this we will have to look at a property stronger than ergodicity, called mixing. For two events A, B ⊂ X, if we had complete time-independence, we would have the property

µ(A ∩ f⁻ⁱB) = µ(A)µ(B)

for any i. In our case of short-term correlation, a more realistic property would be that µ(A ∩ f⁻ⁱB) → µ(A)µ(B)

as i → ∞. This is the (strong) mixing property of a system f on ([0, 1], µ). This property, like f-invariance of a measure, has an equivalent definition in terms of functions, namely

\[ \int_X h \, (g \circ f^i) \, d\mu \to \int_X h \, d\mu \int_X g \, d\mu \]

for any g, h ∈ L²(m). The equivalence of the definitions is proved with similar methods as those for invariant measures, see e.g. [8] on ergodicity and strong mixing properties. In our case we will find that for two functions g, h ∈ L²(m) the limit will be approached slowly. It is therefore that we will later restrict to a space of “nicer” functions that will give us a stronger property: exponential convergence.

Definition 1.9

Let f be a dynamical system on a probability space ([0, 1], µ) and F a class of functions on [0, 1]. The system is said to have exponential decay of correlations if, for some constants C and r < 1,

\[ \left| \int_X h \, (g \circ f^i) \, d\mu - \int_X h \, d\mu \int_X g \, d\mu \right| < C r^i \]

holds for all g, h ∈ F.
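To make the definition concrete, here is a minimal numerical sketch under stated assumptions: the tent map f(x) = 1 − 2|x − 1/2| with µ the Lebesgue measure, and the hypothetical observables h(x) = x² and g(x) = x. For this particular choice a short hand computation gives a correlation of exactly −4⁻ⁱ/12 for i ≥ 1, so the decay is exponential with r = 1/4. The names and the quadrature are illustrative only.

```python
def tent(x):
    """Tent map f(x) = 1 - 2|x - 1/2| (assumed example map)."""
    return 1.0 - 2.0 * abs(x - 0.5)


def iterate(x, i):
    """Return f^i(x)."""
    for _ in range(i):
        x = tent(x)
    return x


def integrate(func, n=4096):
    """Midpoint rule on [0, 1]; the dyadic grid keeps the tent iterates exact."""
    return sum(func((k + 0.5) / n) for k in range(n)) / n


def correlation(h, g, i):
    """Integral of h * (g o f^i) minus the product of the two means."""
    joint = integrate(lambda x: h(x) * g(iterate(x, i)))
    return joint - integrate(h) * integrate(g)


def h(x):
    return x * x


def g(x):
    return x


correlations = [correlation(h, g, i) for i in range(6)]
for i, value in enumerate(correlations):
    print(i, value)
```

Only small i are used here, since floating-point orbits of the tent map degenerate after a few dozen iterations.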

Various theorems exist that prove a form of the Central Limit Theorem for sequences that satisfy certain mixing properties. Along the lines of most such theorems, we wish to prove that, if $Eh = \int h \, d\mu = 0$, we get

\[ \frac{1}{\sqrt{N}} \sum_{i=0}^{N} h \circ f^i \to \mathcal{N}(0, \sigma), \]

where σ can be determined as a limit. We shall consider a specific system, that of an expanding piecewise C²function, and then prove that the Central Limit Theorem holds for the sequence h ◦ fⁱ, for h chosen in the right space. That the sequence, under these conditions, satisfies a rather strong mixing property, is to be expected. It will, in the end, turn out to be fairly easy to prove exponential decay of correlations once we have everything we need for proving the Central Limit Theorem for our sequence.

1.1.2 The Perron-Frobenius operator

Central in this thesis will be the study of the Perron-Frobenius operator associated to the dynamical system f . It is the study of the spectrum of this operator, worked out in Chapter 2, that will yield the results we wish to prove.

Definition 1.10

Let f be a dynamical system. We define the Perron-Frobenius operator as the linear operator $P_f : L^1(m) \to L^1(m)$ given by

\[ \int_E P_f(h) \, dm = \int_{f^{-1}(E)} h \, dm, \]

for any integrable function h and measurable set E. By using the push-forward of measures under f, i.e. $f_*(\mu)(E) = \mu(f^{-1}(E))$, we can write the definition more concisely as $P_f(h) \cdot m = f_*(h \cdot m)$.

From now on we will write P instead of P_f when no confusion occurs.


Remark 1.11: It is clear from the definition, f∗(µ)(E) = µ(f⁻¹(E)), that µ is a fixed point for f∗ if and only if µ is an f -invariant measure. And we see by definition, P (h) · m = f∗(h · m), that if h is a fixed point for P , then h · m is a fixed point for f_∗. Hence, we will obtain an f -invariant measure h · m, once we find a fixed point h for P .

Proposition 1.12

Let f satisfy Definition 1.4. For h ∈ L¹(m) we can use the following explicit expressions for P h,

\[ P h(x) = \sum_{y \in f^{-1}(x)} \frac{h(y)}{|Df(y)|} = \sum_{i=1}^{n} \frac{h(\varphi_i(x))}{|Df(\varphi_i(x))|} \cdot \mathbb{1}_{J_i}(x). \]

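Proof. That the two expressions are equal can easily be seen by the following,

\[ \sum_{y \in f^{-1}(x)} \frac{h(y)}{|Df(y)|} = \sum_{\substack{1 \leq i \leq n \\ f_i^{-1}(x) \text{ exists}}} \frac{h(f_i^{-1}(x))}{|Df(f_i^{-1}(x))|} = \sum_{i=1}^{n} \frac{h(\varphi_i(x))}{|Df(\varphi_i(x))|} \cdot \mathbb{1}_{J_i}(x). \]

We shall prove the statement by proving that Definition 1.10 is satisfied, i.e. that

\[ \int_E \sum_{i=1}^{n} \frac{h(\varphi_i(x))}{|Df(\varphi_i(x))|} \cdot \mathbb{1}_{J_i}(x) \, dx = \int_{f^{-1}(E)} h(x) \, dx \]

holds for all Lebesgue measurable sets E. We will prove our claim for closed intervals E; the result can then be extended to all Lebesgue measurable sets. We find

\[\begin{aligned}
\int_E \sum_{i=1}^{n} \frac{h(\varphi_i(x))}{|Df(\varphi_i(x))|} \cdot \mathbb{1}_{J_i}(x) \, dx
&= \sum_{i=1}^{n} \int_{E \cap J_i} \frac{h(\varphi_i(x))}{|Df(\varphi_i(x))|} \, dx \\
&\overset{(*)}{=} \sum_{i=1}^{n} \int_{f^{-1}(E) \cap I_i} h(x) \, dx \\
&= \int_{f^{-1}(E)} h(x) \, dx,
\end{aligned}\]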

where (∗) holds by a change of variables, but let us be more precise. Let 1 ≤ i ≤ n and take a, b such that $E \cap J_i = [a, b]$. Since f is expanding we have either $Df(x) \geq \lambda$ for all $x \in I_i$ or $Df(x) \leq -\lambda$ for all $x \in I_i$. In the latter case we have $Df_i(x) = -|Df_i(x)|$ and $f_i^{-1}(E) = [f_i^{-1}(b), f_i^{-1}(a)]$, so a change of variables gives

\[\begin{aligned}
\int_{E \cap J_i} \frac{h(\varphi_i(x))}{|Df(\varphi_i(x))|} \, dx
&= \int_a^b \frac{h(\varphi_i(x))}{|Df(\varphi_i(x))|} \, dx \\
&= \int_{f_i^{-1}(a)}^{f_i^{-1}(b)} \frac{h(\varphi_i(f_i(x)))}{|Df(\varphi_i(f_i(x)))|} \cdot Df_i(x) \, dx \\
&= - \int_{f_i^{-1}(b)}^{f_i^{-1}(a)} \frac{h(x)}{|Df(x)|} \cdot \left( -|Df_i(x)| \right) dx \\
&= \int_{f_i^{-1}(E)} h(x) \, dx.
\end{aligned}\]

In the former case, however, we have $Df_i(x) = |Df_i(x)|$ and $f_i^{-1}(E) = [f_i^{-1}(a), f_i^{-1}(b)] \subset I_i$. Therefore we can follow the same steps, but replacing $Df_i$ by $|Df_i|$ will not give a minus sign, and since $f_i^{-1}(a) \leq f_i^{-1}(b)$ there is no need to swap the limits of integration before the last step, which then also does not give a minus sign. Therefore the result is the same.
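The defining identity of Definition 1.10 can be checked numerically against the explicit formula. The sketch below assumes the tent map f(x) = 1 − 2|x − 1/2|, whose inverse branches are φ₁(x) = x/2 and φ₂(x) = 1 − x/2 with |Df| = 2, a hypothetical density h(x) = x², and the interval E = [0.2, 0.7]. All names are illustrative.

```python
# Inverse branches of the tent map (assumed example) and its constant slope.
INVERSE_BRANCHES = [lambda x: x / 2.0, lambda x: 1.0 - x / 2.0]
SLOPE = 2.0  # |Df| for the tent map


def transfer(h):
    """Return P h, where P h(x) is the sum over branches of h(phi_i(x)) / |Df(phi_i(x))|."""
    return lambda x: sum(h(phi(x)) / SLOPE for phi in INVERSE_BRANCHES)


def integrate(func, a, b, n=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / n
    return sum(func(a + (k + 0.5) * width) for k in range(n)) * width


def h(x):
    return x * x


a, b = 0.2, 0.7
# Left-hand side: integral of P h over E = [a, b].
lhs = integrate(transfer(h), a, b)
# Right-hand side: integral of h over f^{-1}(E) = [a/2, b/2] U [1 - b/2, 1 - a/2].
rhs = integrate(h, a / 2.0, b / 2.0) + integrate(h, 1.0 - b / 2.0, 1.0 - a / 2.0)

print(lhs, rhs)
```

For this example both sides can also be computed by hand and equal 397/2400.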

Proposition 1.13

Let f satisfy Definition 1.4. For h ∈ L¹(m) we have $\|P h\|_1 \leq \|h\|_1$, and if h is non-negative

\[ \|P h\|_1 = \|h\|_1. \]

Proof. By Proposition 1.12 we find

\[ |P h|(x) = \left| \sum_{y \in f^{-1}(x)} \frac{h(y)}{|Df(y)|} \right| \leq \sum_{y \in f^{-1}(x)} \frac{|h(y)|}{|Df(y)|} = P|h|(x). \]

Therefore, using Definition 1.10, we have

\[ \|P h\|_1 = \int_{[0,1]} |P h| \, dm \leq \int_{[0,1]} P|h| \, dm = \int_{f^{-1}[0,1]} |h| \, dm = \int_{[0,1]} |h| \, dm = \|h\|_1, \]

with equalities everywhere when h is non-negative.
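A small sketch illustrating both cases of Proposition 1.13, again assuming the tent map as example. For the sign-changing function h(x) = x − 1/2 the two pre-image contributions cancel and P h vanishes, so the inequality is strict; for the non-negative h(x) = x² the norms agree. The function names are hypothetical.

```python
def transfer(h):
    """Perron-Frobenius operator of the tent map (assumed example):
    P h(x) = (h(x/2) + h(1 - x/2)) / 2."""
    return lambda x: (h(x / 2.0) + h(1.0 - x / 2.0)) / 2.0


def l1_norm(func, n=2000):
    """Midpoint-rule approximation of the L^1(m) norm on [0, 1]."""
    return sum(abs(func((k + 0.5) / n)) for k in range(n)) / n


def signed(x):
    """Sign-changing example, h(x) = x - 1/2."""
    return x - 0.5


def positive(x):
    """Non-negative example, h(x) = x^2."""
    return x * x


norm_signed = l1_norm(signed)                  # equals 1/4
norm_p_signed = l1_norm(transfer(signed))      # P h vanishes: strict inequality
norm_positive = l1_norm(positive)              # close to 1/3
norm_p_positive = l1_norm(transfer(positive))  # close to 1/3 as well

print(norm_p_signed, "<=", norm_signed)
print(norm_p_positive, "~", norm_positive)
```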

For an expanding piecewise C² function f the function P h ∈ L¹(m) inherits some of the regularity one imposes upon h, with discontinuities on the edges of the intervals $J_i$. One can take two points x, x′ close together, such that, for some i, x is within $J_i$ and x′ is not. Then x and x′ may have a different number of pre-images, causing the explicit forms of P h(x) and P h(x′) to differ in the number of terms in the sum, probably resulting in a discontinuity.

Depending on the kind of functions one is working with, this is often of no great concern. However, when we want certain regularities of h to carry over to P h, what we can do is require f to be full-branched.

In that case Ji = (0, 1) for all 1 ≤ i ≤ n, which removes those discontinuities. This ensures that P is invariant on the space of continuous functions. We will use this in later chapters. Since the points 0 and 1 are still unaccounted for and the definition of an expanding piecewise C² function imposes no conditions on f for the points x_i for 1 ≤ i ≤ n + 1, the function may still have some “bad” parts. For our discussion, however, we simply assume f to be defined everywhere.

We are interested in the long-term behaviour of our dynamical system, i.e. iterations of f. The following proposition gives us an explicit way to work with the iterations of P, which, as we will shortly prove in Proposition 1.15, corresponds to iterating f.

Proposition 1.14

Let f satisfy Definition 1.4. For h ∈ L¹(m) and N ∈ N we can write

\[ P^N h(z) = \sum_{y \in f^{-N}(z)} h(y) \, g_N(y), \]

where

\[ g_N(y) = \prod_{j=0}^{N-1} \frac{1}{|Df(f^j(y))|}. \]

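Proof. Using Proposition 1.12 we see that repeated application of P gives

\[\begin{aligned}
P^N h(z) &= P^{N-1} \left( \sum_{y \in f^{-1}(z)} \frac{h(y)}{|Df(y)|} \right) \\
&= P^{N-2} \left( \sum_{s \in f^{-1}(z)} \frac{1}{|Df(s)|} \sum_{y \in f^{-1}(s)} \frac{h(y)}{|Df(y)|} \right) \\
&= P^{N-2} \left( \sum_{y \in f^{-2}(z)} \frac{h(y)}{|Df(f(y))| \, |Df(y)|} \right) \\
&\;\;\vdots \\
&= \sum_{y \in f^{-N}(z)} \frac{h(y)}{\prod_{j=0}^{N-1} |Df(f^j(y))|} \\
&= \sum_{y \in f^{-N}(z)} h(y) \, g_N(y).
\end{aligned}\]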

Proposition 1.15

Let f satisfy Definition 1.4. For N ∈ N we have

\[ P_f^N = P_{f^N}. \]

Proof. Note that, by the chain rule, we have

\[ g_N(y) = \prod_{j=0}^{N-1} \frac{1}{|Df(f^j(y))|} = \frac{1}{|Df^N(y)|}. \]

This, in view of Proposition 1.14, gives

\[ P_f^N h(z) = \sum_{y \in f^{-N}(z)} \frac{h(y)}{\prod_{j=0}^{N-1} |Df(f^j(y))|} = \sum_{y \in f^{-N}(z)} \frac{h(y)}{|Df^N(y)|} = P_{f^N} h(z). \]
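The two ways of computing $P^N h(z)$, iterating P and summing over the pre-images under $f^N$ with weights $g_N$, can be compared numerically. The sketch below uses a hypothetical full-branched example map, f(x) = 1.5x + x² on [0, 1/2) and f(x) = 2x − 1 on [1/2, 1], which satisfies Definitions 1.4 and 1.6 with λ = 1.5. The map, the test function h and all names are assumptions made for illustration.

```python
import math


def f(x):
    """Hypothetical full-branched expanding map with two branches."""
    return 1.5 * x + x * x if x < 0.5 else 2.0 * x - 1.0


def df(x):
    """Derivative of f on each branch."""
    return 1.5 + 2.0 * x if x < 0.5 else 2.0


# Inverse branches phi_1 and phi_2 of f.
INVERSE_BRANCHES = [
    lambda z: (-1.5 + math.sqrt(2.25 + 4.0 * z)) / 2.0,
    lambda z: (z + 1.0) / 2.0,
]


def preimages(z, N):
    """All points y with f^N(y) = z."""
    points = [z]
    for _ in range(N):
        points = [phi(p) for p in points for phi in INVERSE_BRANCHES]
    return points


def g(y, N):
    """g_N(y): product over j < N of 1 / |Df(f^j(y))|."""
    product = 1.0
    for _ in range(N):
        product /= abs(df(y))
        y = f(y)
    return product


def power_by_preimages(h, z, N):
    """P^N h(z) as a sum over the pre-images of z under f^N."""
    return sum(h(y) * g(y, N) for y in preimages(z, N))


def transfer(h):
    """One application of P via the explicit branch formula."""
    return lambda x: sum(h(phi(x)) / abs(df(phi(x))) for phi in INVERSE_BRANCHES)


def power_by_iteration(h, z, N):
    """P^N h(z) obtained by applying P repeatedly."""
    for _ in range(N):
        h = transfer(h)
    return h(z)


def h(x):
    return x * x + 1.0


value_preimages = power_by_preimages(h, 0.3, 4)
value_iteration = power_by_iteration(h, 0.3, 4)
print(value_preimages, value_iteration)
```

Both values agree up to rounding, and the pre-image sum has 2⁴ = 16 terms.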

To work with $P^N$ it will be helpful to outline the structure of $f^N$ a little better. We assume f to be full-branched. As we see in the definition of an expanding piecewise C² function, f “divides” [0, 1] into n open intervals $f_i^{-1}[(0, 1)] = (x_i, x_{i+1})$, where 1 ≤ i ≤ n, in such a way that $f : (x_i, x_{i+1}) \to (0, 1)$ is a bijective C² function for each i. Now if we consider f², this again is a full-branched expanding piecewise C² function, but it splits (0, 1) into n² intervals, namely $f_{i_2}^{-1}[f_{i_1}^{-1}[(0, 1)]] = f_{i_2}^{-1}[(x_{i_1}, x_{i_1+1})]$ for pairs $1 \leq i_1, i_2 \leq n$. In this case it is each $f^2 : f_{i_2}^{-1}[f_{i_1}^{-1}[(0, 1)]] \to (0, 1)$ that is bijective and C². In more generality we have the following.

Remark 1.16: There exist open intervals $\Delta_{i_1,\dots,i_N}$, where N ∈ N and $1 \leq i_j \leq n$ for each 1 ≤ j ≤ N, such that

• $\Delta_{i_1,\dots,i_N} \cap \Delta_{i'_1,\dots,i'_N} = \emptyset$ whenever $(i_1, \dots, i_N) \neq (i'_1, \dots, i'_N)$;

• $m\left( \bigcup \Delta_{i_1,\dots,i_N} \right) = 1$ for any N ∈ N, where the union is taken over all $(i_1, \dots, i_N)$ with $1 \leq i_j \leq n$ for 1 ≤ j ≤ N;

• $f_{i_{N+1}}^{-1}(\Delta_{i_1,\dots,i_N}) = \Delta_{i_1,\dots,i_N,i_{N+1}}$;

• $f^{-1}(\Delta_{i_1,\dots,i_N}) = \bigcup_{i_{N+1}=1}^{n} \Delta_{i_1,\dots,i_N,i_{N+1}}$;

• $f : \Delta_{i_1,\dots,i_{N+1}} \to \Delta_{i_1,\dots,i_N}$ is C² and a bijection;

• $f^N : \Delta_{i_1,\dots,i_N} \to (0, 1)$ is C² and a bijection;

• $f^N$ is C² on $\Delta_{i_1,\dots,i_N}$ and can be extended to a C² function on the closure of $\Delta_{i_1,\dots,i_N}$.

If f is a full-branched expanding piecewise C² function, then, for any N ∈ N, $f^N$ is again a full-branched expanding piecewise C² function and it divides (0, 1) into the $n^N$ intervals denoted by $\Delta_{i_1,\dots,i_N}$. Let z ∈ [0, 1]. By the chain rule we have

\[ |Df^N(z)| = \prod_{0 \leq j \leq N-1} |Df(f^j(z))| \geq \lambda^N, \]

so the accompanying constant, from the definition of an expanding piecewise C² function, Definition 1.4, can be taken to be $\lambda^N$. A visualization of this for an easy example of a full-branched expanding function is given in Figure 1.2.

Proposition 1.17

Let f satisfy Definition 1.4 and be full-branched. Let $y, y' \in \Delta_{i_1,\dots,i_N}$ for some $1 \leq i_j \leq n$. We have the following inequalities,

\[ |y - y'| \leq \frac{1}{\lambda^N} \left| f^N(y) - f^N(y') \right| \quad \text{and} \quad |\Delta_{i_1,\dots,i_N}| \leq \frac{1}{\lambda^N}. \]

Proof. Using that f is continuously differentiable on $\Delta_{i_1,\dots,i_N}$, by the Mean Value Theorem we obtain

\[ |y - y'| \, |Df(x)| = |f(y) - f(y')| \]

for some x in between y and y′. By definition |Df(x)| ≥ λ, hence we get that

\[ |y - y'| \leq \frac{1}{\lambda} |f(y) - f(y')| \tag{1.2} \]

holds. Note that since $y, y' \in \Delta_{i_1,\dots,i_N}$ we have $f(y), f(y') \in \Delta_{i_1,\dots,i_{N-1}}$, so with repeated use of (1.2) we obtain the first inequality. Also obtained through repeated use of (1.2) is

\[ |\Delta_{i_1,\dots,i_N}| = \sup_{y, y' \in \Delta_{i_1,\dots,i_N}} |y - y'| \leq \frac{1}{\lambda^N} \sup_{y, y' \in \Delta_{i_1,\dots,i_N}} \left| f^N(y) - f^N(y') \right| \leq \frac{1}{\lambda^N}, \]

where the last step holds since $f^N(y), f^N(y') \in [0, 1]$.

Figure 1.2: Iterates f, f² and f³ of the full-branched expanding piecewise C² function f given by f(x) = 1 − 2|x − 1/2|.
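The intervals of Remark 1.16 and the length bound of Proposition 1.17 can be computed explicitly for a concrete map. The sketch below assumes the hypothetical full-branched map f(x) = 1.5x + x² on [0, 1/2), f(x) = 2x − 1 on [1/2, 1], for which λ = 1.5 and both inverse branches are increasing. It builds the intervals by pulling back (0, 1) one inverse branch at a time; the names are illustrative.

```python
import math

LAMBDA = 1.5  # expansion constant of the assumed example map

# Increasing inverse branches of the assumed map.
INVERSE_BRANCHES = [
    lambda z: (-1.5 + math.sqrt(2.25 + 4.0 * z)) / 2.0,
    lambda z: (z + 1.0) / 2.0,
]


def cylinders(N):
    """Endpoints (left, right) of the n^N intervals Delta_{i_1,...,i_N}."""
    intervals = [(0.0, 1.0)]
    for _ in range(N):
        # Both branches are increasing, so endpoints map to endpoints in order.
        intervals = [(phi(a), phi(b)) for (a, b) in intervals for phi in INVERSE_BRANCHES]
    return intervals


N = 5
lengths = [right - left for (left, right) in cylinders(N)]

print(len(lengths), "intervals")
print("longest:", max(lengths), "bound:", LAMBDA ** (-N))
print("total length:", sum(lengths))
```

The total length is 1 up to rounding, in line with the second item of Remark 1.16.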

As we will often end up working with a fixed N ∈ N and since the order of the intervals might not be relevant, for simplicity, we will denote the aforementioned intervals by $\Delta_{N,\nu}$, where $1 \leq \nu \leq n^N$. So for every tuple $(i_1, \dots, i_N)$, where $1 \leq i_j \leq n$ for each 1 ≤ j ≤ N, there exists a unique $1 \leq \nu \leq n^N$ such that $\Delta_{i_1,\dots,i_N} = \Delta_{N,\nu}$. In other words, we write $\{\Delta_{N,\nu}\}_{1 \leq \nu \leq n^N}$ for the partition corresponding to the pre-image $f^{-N}([0, 1])$.

Let z ∈ (0, 1). For any $1 \leq \nu \leq n^N$, we write $y_\nu$ for the unique point $y_\nu \in \Delta_{N,\nu}$ with $f^N(y_\nu) = z$. This, due to Proposition 1.14, allows us to use the following notation

\[ P^N h(z) = \sum_{1 \leq \nu \leq n^N} h(y_\nu) \, g_N(y_\nu), \tag{1.3} \]

when f is a full-branched expanding piecewise C² function.

1.1.3 A suitable subspace of L¹(m)

We consider the sequence

\[ \phi_N(h) := \frac{1}{N} \sum_{i=0}^{N-1} P^i h, \]

for some h ∈ L¹(m), because, if this sequence of averages converges, then the limit is P-invariant. This is how we will find an f-invariant measure. However, convergence in L¹(m) is not quite good enough.

The unit ball in L¹(m) is not sequentially compact. Take for example the sequence $f_N : [0, 1] \to \mathbb{R}$ where $f_N(x) = N$ for 0 ≤ x ≤ 1/N and 0 otherwise. Clearly $\|f_N\|_1 = 1$ for any N ∈ N, but $f_N$ converges pointwise to 0 on (0, 1], so no subsequence can converge in L¹(m). To avoid this problem we will consider a “stronger” subspace of L¹(m).

We will consider the operator P on this subspace instead, which requires us to prove the chosen subspace is P -invariant. Then, for any h in this subspace, ϕN(h) should be contained in this subspace. Picking the right subspace, we will be able to prove that the sequence ϕN(h) has a uniformly convergent subsequence.

We will use the space of Hölder continuous functions.

Definition 1.18

A map h : [0, 1] → C is called Hölder continuous if there exist constants 0 < α < 1 and C such that

\[ |h(x) - h(x')| \leq C |x - x'|^\alpha \quad \text{for all } x, x' \in [0, 1]. \]

We fix 0 < α < 1 and define $H_\alpha$ as the set of all α-Hölder continuous functions. For an α-Hölder continuous function, the Hölder constant is then defined to be

\[ |h|_\alpha := \sup_{\substack{x, x' \in [0,1] \\ x \neq x'}} \frac{|h(x) - h(x')|}{|x - x'|^\alpha}. \]

Note that $|\cdot|_\alpha$ does not give a norm since $|h|_\alpha = 0$ holds for any constant h. However, using the semi-norm $|\cdot|_\alpha$ we can define the norm

\[ \|h\|_\alpha := |h|_\alpha + |h(0)|. \]

We will now establish a few facts about $H_\alpha$ and this Hölder norm. Firstly, the Hölder norm dominates the L¹(m) norm.

Proposition 1.19

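For $h \in H_\alpha$ we have $\|h\|_1 \leq \|h\|_\alpha$.

Proof. For any x ∈ [0, 1] we have

\[\begin{aligned}
|h(x)| &\leq |h(0)| + \sup_{y, y' \in [0,x]} |h(y) - h(y')| \\
&\leq |h(0)| + \sup_{y, y' \in [0,x]} |y - y'|^\alpha \sup_{y, y' \in [0,x]} \frac{|h(y) - h(y')|}{|y - y'|^\alpha} \\
&\leq |h(0)| + |[0, x]|^\alpha \, |h|_\alpha.
\end{aligned}\]

Therefore

\[ \|h\|_1 = \int_{[0,1]} |h(x)| \leq \int_{[0,1]} |h(0)| + |h|_\alpha \int_{[0,1]} |x|^\alpha \leq |h(0)| + |h|_\alpha = \|h\|_\alpha. \]

Remark 1.20: Note that, for any $h \in H_\alpha$, we also have

\[ \|h\|_\infty \leq |h(0)| + \sup_{x, y \in [0,1]} |h(x) - h(y)| \leq |h(0)| + |h|_\alpha \sup_{x, y \in [0,1]} |x - y|^\alpha = \|h\|_\alpha. \]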
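The three norms compared in Proposition 1.19 and Remark 1.20 can be approximated on a grid. The sketch below assumes α = 1/2 and the hypothetical example h(x) = √x, for which $|h|_\alpha = 1$ and h(0) = 0, so that $\|h\|_\alpha = 1$, while $\|h\|_1 = 2/3$ and $\|h\|_\infty = 1$. A grid supremum is only a lower bound for the true Hölder constant in general; here it is attained at the grid point 0.

```python
import math


def holder_seminorm(h, alpha, n=400):
    """Grid approximation of sup |h(x) - h(y)| / |x - y|^alpha over x > y."""
    grid = [k / n for k in range(n + 1)]
    return max(
        abs(h(x) - h(y)) / (x - y) ** alpha
        for i, x in enumerate(grid)
        for y in grid[:i]
    )


def l1_norm(h, n=2000):
    """Midpoint-rule approximation of the L^1(m) norm on [0, 1]."""
    return sum(abs(h((k + 0.5) / n)) for k in range(n)) / n


def sup_norm(h, n=400):
    """Maximum of |h| over a uniform grid on [0, 1]."""
    return max(abs(h(k / n)) for k in range(n + 1))


alpha = 0.5
h = math.sqrt  # hypothetical example function

seminorm = holder_seminorm(h, alpha)
holder_norm = seminorm + abs(h(0.0))

print("Hoelder norm:", holder_norm)
print("L1 norm:", l1_norm(h))
print("sup norm:", sup_norm(h))
```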

Henceforth we shall assume f to be full-branched. As discussed, P h then retains some of the regularity of h. In particular, P h is Hölder continuous if h is. This is implicit in Lemma 1.31.

As mentioned at the start of this section, for the space $H_\alpha$ to be suitable for our purposes, we need the sequence $\phi_N(h)$ to have a convergent subsequence. This is the case in a space where any sequence contained in the unit ball has a convergent subsequence. We use the Arzelà-Ascoli Theorem to prove that this is the case for $H_\alpha$, see e.g. [7, Ascoli’s Theorem]. The theorem uses the notion of equicontinuity. A sequence $\{f_N\}_{N=1}^{\infty}$ of continuous functions $f_N : [0, 1] \to \mathbb{C}$ is equicontinuous if, for any ε > 0, there exists a δ > 0 such that for all N ∈ N and x, y ∈ [0, 1] we have

\[ |f_N(x) - f_N(y)| < \varepsilon \quad \text{when } |x - y| < \delta. \]

The statement of the Arzelà-Ascoli Theorem is then that, if the sequence $\{f_N\}_{N=1}^{\infty}$ is uniformly bounded and equicontinuous, then there exists a uniformly convergent subsequence. In particular, the next lemma holds.

Lemma 1.21 (Arzelà-Ascoli Theorem)

Let $\{f_N\}_{N=1}^{\infty}$ be a sequence of Hölder continuous functions $f_N$. If there exists a constant C such that $\|f_N\|_\alpha \leq C$ for all N ∈ N, then $f_N$ has a uniformly convergent subsequence with the limit in $H_\alpha$.

Proof. The proof consists simply of checking the conditions of the Arzelà-Ascoli Theorem. Since Hölder continuity implies continuity, any $f_N$ is continuous. By Remark 1.20 we know that, for any N ∈ N, we have

\[ \|f_N\|_\infty \leq \|f_N\|_\alpha \leq C. \]

This gives both a uniform bound and equicontinuity. The latter property follows because for any ε > 0, we can take $\delta = (\varepsilon / C)^{1/\alpha}$, which gives

\[ |f_N(x) - f_N(y)| \leq |x - y|^\alpha \, |f_N|_\alpha < \frac{\varepsilon}{C} |f_N|_\alpha \leq \varepsilon, \]

for any N ∈ N and x, y ∈ [0, 1] with |x − y| < δ.

Lastly, let $f_{N'}$ be the uniformly convergent subsequence and g its limit; we prove $g \in H_\alpha$. Let ε > 0 and x, y ∈ [0, 1] with x ≠ y. Let M ∈ N be such that $\|f_{N'} - g\|_\infty < \frac{\varepsilon}{2} |x - y|^\alpha$ for all N′ ≥ M. Then

\[\begin{aligned}
\frac{|g(x) - g(y)|}{|x - y|^\alpha}
&\leq \frac{|g(x) - f_{N'}(x)|}{|x - y|^\alpha} + \frac{|f_{N'}(x) - f_{N'}(y)|}{|x - y|^\alpha} + \frac{|f_{N'}(y) - g(y)|}{|x - y|^\alpha} \\
&< \frac{\varepsilon |x - y|^\alpha}{|x - y|^\alpha} + |f_{N'}|_\alpha \leq C + \varepsilon.
\end{aligned}\]

Hence g is Hölder continuous and $|g|_\alpha \leq C$.

We define the total variation of a map h on an interval [a, b] ⊂ R to be

\[ \bigvee_a^b h := \sup_{P \in \mathcal{P}} \sum_{i=1}^{m_P} |h(x_{i+1}) - h(x_i)|, \]

where $\mathcal{P} := \{ P = (x_1, \dots, x_{m_P+1}) : a = x_1 < \dots < x_{m_P+1} = b \}$ is the set of partitions of [a, b]. We define the function h to be of bounded variation if $\bigvee_0^1 h < \infty$. If h is differentiable we have

\[ \bigvee_0^1 h = \int_{[0,1]} |Dh|. \tag{1.4} \]

Note that $\bigvee_0^1 h$, similarly to $|h|_\alpha$, is only a semi-norm. One could use the space of functions of bounded variation as an alternative to $H_\alpha$. Part of this is done in, for example, [2]. We, however, shall stick with Hölder continuous functions here, but the definition of bounded variation remains useful.
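Identity (1.4) is easy to check numerically on an example. The sketch below assumes the hypothetical function h(x) = sin(2πx), whose total variation on [0, 1] is 4, and compares the partition sum over a fine uniform partition with a midpoint-rule approximation of the integral of |Dh|. A sum over one partition is only a lower bound for the supremum in general; for this h it already reaches the supremum because the partition contains the turning points 1/4 and 3/4.

```python
import math


def variation(h, partition):
    """Sum of |h(x_{i+1}) - h(x_i)| over consecutive points of one partition."""
    return sum(abs(h(b) - h(a)) for a, b in zip(partition, partition[1:]))


def integrate(func, n=4000):
    """Midpoint-rule approximation of the integral over [0, 1]."""
    return sum(func((k + 0.5) / n) for k in range(n)) / n


def h(x):
    """Hypothetical example function."""
    return math.sin(2.0 * math.pi * x)


def abs_dh(x):
    """|Dh(x)| for the example function."""
    return abs(2.0 * math.pi * math.cos(2.0 * math.pi * x))


partition = [k / 400 for k in range(401)]  # contains 1/4 and 3/4
total_variation = variation(h, partition)
integral_of_abs_derivative = integrate(abs_dh)

print(total_variation, integral_of_abs_derivative)
```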

1.2 A Lasota-Yorke inequality

The rest of this chapter will consist of establishing the estimates required for proving the spectral properties that we will discuss in Chapter 2. In this section we prove the following Lasota-Yorke inequality.

Theorem 1.22

For a full-branched expanding piecewise C² function f and $h \in H_\alpha$ we have, for any N ∈ N,

\[ \|P^N h\|_\alpha \leq C r^N \|h\|_\alpha + C' \|h\|_1, \]

for C, C′ and r < 1 constants.

For the remainder of this thesis, we let f denote a full-branched expanding piecewise C² function, i.e. f shall henceforth satisfy Definitions 1.4 and 1.6. Note also that from here on we will most commonly use the notation introduced in (1.3).

1.2.1 Distortion estimates

Firstly we will obtain a distortion estimate. Distortion estimates are often used, in one form or another, when working with expanding maps. They will prove very important in our approach to proving the Lasota-Yorke inequality. The essence of each result we prove in this section is the following statement: “The variation of the function $Df^N$ on an interval $\Delta_{N,\nu}$ has a bound that does not depend on N”. This follows mainly from D²f being bounded, but that does not immediately give us a bound on $D^2 f^N$ that is independent of N. Note that “the variation of $Df^N$ being small on $\Delta_{N,\nu}$” corresponds to “the value

\[ \frac{|Df^N(y_\nu)|}{|Df^N(y'_\nu)|} \]

being close to 1 for $y_\nu, y'_\nu \in \Delta_{N,\nu}$”. This is, broadly, the result of both Lemma 1.23 and Corollary 1.25.

It is high time to make precise the estimates that we have.


Lemma 1.23

Let N ∈ N and $1 \leq \nu \leq n^N$. For any pair $y_\nu, y'_\nu \in \Delta_{N,\nu}$ we have

\[ \left| \log \frac{|Df^N(y_\nu)|}{|Df^N(y'_\nu)|} \right| \leq K_0 |z - z'| \leq K_0, \]

where $K_0$ is a constant and $z := f^N(y_\nu)$ and $z' := f^N(y'_\nu)$.

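Proof. Note that $Df^N(y_\nu) / Df^N(y'_\nu) > 0$ since $Df^N$ is either only positive or only negative on $\Delta_{N,\nu}$ by definition. We write $y_{\nu,j} = f^j(y_\nu)$ and $y'_{\nu,j} = f^j(y'_\nu)$ for any 0 ≤ j ≤ N (so $y_{\nu,N} = z$ and $y'_{\nu,N} = z'$). This, in view of the chain rule, allows us to write $Df^N(y_\nu) = \prod_{j=0}^{N-1} Df(y_{\nu,j})$. We now calculate, using the Mean Value Theorem and Proposition 1.17,

\[\begin{aligned}
\left| \log \frac{Df^N(y_\nu)}{Df^N(y'_\nu)} \right|
&= \left| \log \left( \frac{\prod_{j=0}^{N-1} |Df(y_{\nu,j})|}{\prod_{j=0}^{N-1} |Df(y'_{\nu,j})|} \right) \right| \\
&= \left| \sum_{j=0}^{N-1} \left( \log |Df(y_{\nu,j})| - \log |Df(y'_{\nu,j})| \right) \right| \\
&= \left| \sum_{j=0}^{N-1} D \log |Df|(x_{\nu,j}) \, (y_{\nu,j} - y'_{\nu,j}) \right| \\
&\leq \sum_{j=0}^{N-1} \frac{|D^2 f(x_{\nu,j})|}{|Df(x_{\nu,j})|} \, |y_{\nu,j} - y'_{\nu,j}| \\
&\leq \frac{K}{\lambda} \sum_{j=0}^{N-1} \frac{1}{\lambda^{N-j}} |z - z'| \\
&= \frac{K}{\lambda} |z - z'| \sum_{j=0}^{N-1} \left( \frac{1}{\lambda} \right)^{N-j} \\
&\leq \frac{K}{\lambda} |z - z'| \frac{1}{1 - 1/\lambda} \\
&\leq K_0 |z - z'| \leq K_0,
\end{aligned}\]

for $K_0 = K / (\lambda - 1)$. The third step holds for some $x_{\nu,j}$ between $y_{\nu,j}$ and $y'_{\nu,j}$, which is obtained using the Mean Value Theorem. We also used the inequalities |Df| ≥ λ and |D²f| ≤ K from Definition 1.4 and Remark 1.5.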
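The distortion bound can be tested on a concrete map. The sketch below assumes the hypothetical full-branched map f(x) = 1.5x + x² on [0, 1/2), f(x) = 2x − 1 on [1/2, 1], for which λ = 1.5, K = 2 and hence $K_0 = K/(\lambda - 1) = 4$. For every branch word it pulls back two points z and z′ into the same interval $\Delta_{N,\nu}$ and compares the logarithms of $|Df^N|$. The names and the chosen points are illustrative.

```python
import itertools
import math

K0 = 2.0 / (1.5 - 1.0)  # K / (lambda - 1) for the assumed example map

# (inverse branch, derivative of f on that branch) for the assumed map.
BRANCHES = [
    (lambda z: (-1.5 + math.sqrt(2.25 + 4.0 * z)) / 2.0, lambda y: 1.5 + 2.0 * y),
    (lambda z: (z + 1.0) / 2.0, lambda y: 2.0),
]


def log_derivative(word, z):
    """log |Df^N(y)| for the pre-image y of z in the interval labelled by word."""
    total, point = 0.0, z
    for i in word:
        phi, dfi = BRANCHES[i]
        point = phi(point)              # step one level back along the orbit
        total += math.log(dfi(point))   # chain rule: add log |Df| at that point
    return total


def worst_ratio(N, z, z_prime):
    """Largest value of |log(Df^N(y) / Df^N(y'))| / |z - z'| over all words."""
    worst = 0.0
    for word in itertools.product(range(len(BRANCHES)), repeat=N):
        gap = abs(log_derivative(word, z) - log_derivative(word, z_prime))
        worst = max(worst, gap / abs(z - z_prime))
    return worst


ratios = [worst_ratio(N, 0.2, 0.9) for N in range(1, 9)]
for N, ratio in enumerate(ratios, start=1):
    print(N, ratio, "<=", K0)
```

The printed ratios stay below $K_0$ for every N, which is the N-independence that the lemma asserts.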

Corollary 1.24

Using that the final estimate is independent of the choice of our pair $y_\nu, y'_\nu \in \Delta_{N,\nu}$, we obtain

\[ \frac{|Df^N(y_\nu)|}{|Df^N(y'_\nu)|} \leq \frac{\sup_{x \in \Delta_{N,\nu}} |Df^N(x)|}{\inf_{x' \in \Delta_{N,\nu}} |Df^N(x')|} \leq e^{K_0}. \]

Corollary 1.25

For any $y_\nu, y'_\nu \in \Delta_{N,\nu}$ we have

\[ \left| \frac{Df^N(y_\nu)}{Df^N(y'_\nu)} - 1 \right| \leq K_1 |z - z'|, \]

where $K_1$ is a constant.

Proof. For now we write $A := Df^N(y_\nu) / Df^N(y'_\nu)$. From the Taylor series of $e^x$ we know that for any x the inequality $|e^x - 1| \leq e^{|x|} \cdot |x|$ holds. Hence we get

\[ |A - 1| = \left| e^{\log(A)} - 1 \right| \leq e^{|\log(A)|} \cdot |\log(A)| \leq e^{K_0} \cdot K_0 |z - z'| = K_1 |z - z'|, \]

where $K_1 = e^{K_0} K_0$.

If the value $|Df^N(x)|$ was constant on $\Delta_{N,\nu}$, we would obtain

\[ |Df^N(x)| = \frac{1}{|\Delta_{N,\nu}|}. \]

In most cases, however, it is not. Instead, having proved the variation of $|Df^N(x)|$ to be bounded on $\Delta_{N,\nu}$, we do obtain the following lemma and corollaries.

Lemma 1.26

For any $y_\nu \in \Delta_{N,\nu}$ we have

\[ K_2^{-1} \frac{1}{|\Delta_{N,\nu}|} \leq |Df^N(y_\nu)| \leq K_2 \frac{1}{|\Delta_{N,\nu}|}, \]

where $K_2$ is a constant.

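Proof. In view of Corollary 1.24, we get

\[\begin{aligned}
|Df^N(y_\nu)| \, |\Delta_{N,\nu}|
&\geq \inf_{x \in \Delta_{N,\nu}} |Df^N(x)| \, |\Delta_{N,\nu}| \\
&= \frac{\inf_{x \in \Delta_{N,\nu}} |Df^N(x)|}{\sup_{x' \in \Delta_{N,\nu}} |Df^N(x')|} \sup_{x' \in \Delta_{N,\nu}} |Df^N(x')| \, |\Delta_{N,\nu}| \\
&\geq \frac{1}{e^{K_0}} \sup_{x' \in \Delta_{N,\nu}} |Df^N(x')| \, |\Delta_{N,\nu}| \\
&\geq \frac{1}{e^{K_0}} \int_{\Delta_{N,\nu}} |Df^N| \\
&= \frac{1}{e^{K_0}} \bigvee_{\Delta_{N,\nu}} f^N = \frac{1}{e^{K_0}},
\end{aligned}\]

where we used (1.4). The last equality holds because $f^N$ is full-branched. This gives the first inequality, with $K_2 = e^{K_0}$. The second inequality is proved symmetrically, starting with the observation

\[ |Df^N(y_\nu)| \, |\Delta_{N,\nu}| \leq \sup_{x \in \Delta_{N,\nu}} |Df^N(x)| \, |\Delta_{N,\nu}|, \]

again using Corollary 1.24 and (1.4).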
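As a sanity check of Lemma 1.26, the product $|Df^N(y_\nu)| \, |\Delta_{N,\nu}|$ can be computed for every interval of a concrete map and compared with $K_2^{\pm 1}$. The sketch below again assumes the hypothetical map f(x) = 1.5x + x² on [0, 1/2), f(x) = 2x − 1 on [1/2, 1], so that $K_0 = 4$ and $K_2 = e^4$. In this example the products stay much closer to 1 than the bound requires. The names are illustrative.

```python
import itertools
import math

K2 = math.exp(4.0)  # e^{K_0} with K_0 = 4 for the assumed example map

# (inverse branch, derivative of f on that branch) for the assumed map.
BRANCHES = [
    (lambda z: (-1.5 + math.sqrt(2.25 + 4.0 * z)) / 2.0, lambda y: 1.5 + 2.0 * y),
    (lambda z: (z + 1.0) / 2.0, lambda y: 2.0),
]


def pull_back(word, z):
    """Return (y, |Df^N(y)|) for the pre-image y of z in the interval labelled by word."""
    point, derivative = z, 1.0
    for i in word:
        phi, dfi = BRANCHES[i]
        point = phi(point)
        derivative *= dfi(point)
    return point, derivative


def derivative_times_length(N, z):
    """|Df^N(y_nu)| * |Delta_{N,nu}| for every interval Delta_{N,nu}."""
    values = []
    for word in itertools.product(range(len(BRANCHES)), repeat=N):
        left, _ = pull_back(word, 0.0)   # endpoints are the pull-backs of 0 and 1
        right, _ = pull_back(word, 1.0)
        _, derivative = pull_back(word, z)
        values.append(derivative * (right - left))
    return values


products = derivative_times_length(6, 0.3)
print(min(products), max(products), "bounds:", 1.0 / K2, K2)
```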

Corollary 1.27

For any $y_\nu \in \Delta_{N,\nu}$ we have

\[ K_2^{-1} |\Delta_{N,\nu}| \leq \frac{1}{|Df^N(y_\nu)|} \leq K_2 |\Delta_{N,\nu}|. \]

The following corollary tells us that the change in the ratio between two intervals under f is also bounded.

Corollary 1.28

Let I, J ⊂ [0, 1] be two intervals such that $I, J \subset \Delta_{N,\nu}$ for some N ∈ N and $1 \leq \nu \leq n^N$. Then

\[ K_3^{-1} \frac{|f^N(I)|}{|f^N(J)|} \leq \frac{|I|}{|J|} \leq K_3 \frac{|f^N(I)|}{|f^N(J)|}, \]

where $K_3$ is a constant.

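Proof. Since $f^N$ is continuously differentiable on $\Delta_{N,\nu}$ we have, using Lemma 1.26, that

\[ |f^N(I)| \leq \sup_{x \in \Delta_{N,\nu}} |Df^N(x)| \, |I| \leq K_2 \frac{1}{|\Delta_{N,\nu}|} |I| \]

and

\[ |f^N(J)| \geq \inf_{x \in \Delta_{N,\nu}} |Df^N(x)| \, |J| \geq K_2^{-1} \frac{1}{|\Delta_{N,\nu}|} |J| \]

hold. Therefore

\[ \frac{|f^N(I)|}{|f^N(J)|} \leq \frac{K_2 |I| / |\Delta_{N,\nu}|}{K_2^{-1} |J| / |\Delta_{N,\nu}|} = K_2^2 \frac{|I|}{|J|}, \]

giving us the first inequality, with $K_3 = K_2^2$. The second inequality is acquired symmetrically.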

Note that, for the results of this section, the exact values of K₀, K₁, K₂ and K₃ are irrelevant. What is important is that K₀, K₁, K₂ and K₃ only depend on f and not on N.

Remark 1.29: We use that f is C² on each interval only to conclude that D²f is bounded, which ensures that the results in this section hold. This requirement is quite strong and we can relax it a little.

Fixing 0 < α < 1 and our space $H_\alpha$ first, we can, instead of taking f to be piecewise C², replace the second part of Definition 1.4 by

• Fixing α < β < 1 we assume that, for every 1 ≤ i ≤ n, f is $C^{1+\beta}$ on $(x_i, x_{i+1})$.

Here $f \in C^{1+\beta}$ means that f is differentiable and Df is β-Hölder continuous, i.e. $f \in C^1$ and $|Df|_\beta < \infty$.

With this adaptation we can still obtain a very similar result. We use the following. For x > λ we have D log(x) < 1/λ. Hence, using the Mean Value Theorem, $|\log(x) - \log(x')| \leq \frac{1}{\lambda} |x - x'|$ holds for x, x′ > λ. Therefore, using notation from the proof of Lemma 1.23, we get

\[ \left| \log |Df(y_{\nu,j})| - \log |Df(y'_{\nu,j})| \right| \leq \frac{1}{\lambda} \left| |Df(y_{\nu,j})| - |Df(y'_{\nu,j})| \right| \leq \frac{|Df|_\beta}{\lambda} \left| y_{\nu,j} - y'_{\nu,j} \right|^\beta. \]

We can use this to replace the part in the proof of Lemma 1.23 where this term appears. The final result becomes

\[ \left| \log \frac{|Df^N(y_\nu)|}{|Df^N(y'_\nu)|} \right| \leq K_0 |z - z'|^\beta \leq K_0, \]

where $K_0 = |Df|_\beta / (\lambda - \lambda^{1-\beta})$. The constant $K_0$ is again independent of N, and the fact that |z − z′| is replaced by $|z - z'|^\beta$ only affects Corollary 1.25. Since β > α, this does not make a difference when this corollary is used later in the proof of Lemma 1.31. Therefore all results in this thesis still hold with this relaxed condition on f.

1.2.2 Proof of the Lasota-Yorke inequality

Our next step towards proving Theorem 1.22, will be to prove two lemmas, which will require some working out. Before we get to the heart of it, we establish a useful inequality.

Proposition 1.30

If I ⊂ [0, 1] is an interval and $h \in H_\alpha$ we have

\[ \|h \mathbb{1}_I\|_\infty \leq |I|^\alpha |h|_\alpha + \frac{1}{|I|} \int_I |h|. \]

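Proof. By taking out the constant $\frac{1}{|I|} \int_I h$, representing the average of h on I, we find

\[\begin{aligned}
\|h \mathbb{1}_I\|_\infty
&\leq \left\| \left( h - \frac{1}{|I|} \int_I h \right) \mathbb{1}_I \right\|_\infty + \left| \frac{1}{|I|} \int_I h \right| \\
&\leq \sup_{x, x' \in I} |h(x) - h(x')| + \frac{1}{|I|} \int_I |h| \\
&\leq \sup_{x, x' \in I} |x - x'|^\alpha \sup_{x, x' \in I} \frac{|h(x) - h(x')|}{|x - x'|^\alpha} + \frac{1}{|I|} \int_I |h| \\
&\leq |I|^\alpha |h|_\alpha + \frac{1}{|I|} \int_I |h|.
\end{aligned}\]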

Now, we prove the following lemma, which will form the main part of the proof of Theorem 1.22.

Lemma 1.31

Let $h \in H_\alpha$ and N ∈ N. We have

\[ |P^N h|_\alpha \leq C r^N |h|_\alpha + C' \|h\|_1 \]

for C, C′ and r < 1 constants.

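Proof. The proof shall consist of manipulating the terms present in $|P^N h|_\alpha$ and then applying the distortion estimates proven in the previous section. Let z, z′ ∈ [0, 1]. For any $1 \leq \nu \leq n^N$, let $(y_\nu, y'_\nu) \in f^{-N}(z) \times f^{-N}(z')$ be the pair such that $y_\nu, y'_\nu \in \Delta_{N,\nu}$. Now, using expression (1.3), we write

\[\begin{aligned}
\frac{1}{|z - z'|^\alpha} & \left| P^N h(z) - P^N h(z') \right| \\
&= \frac{1}{|z - z'|^\alpha} \left| \sum_{1 \leq \nu \leq n^N} h(y_\nu) g_N(y_\nu) - \sum_{1 \leq \nu \leq n^N} h(y'_\nu) g_N(y'_\nu) \right| \\
&= \frac{1}{|z - z'|^\alpha} \left| \sum_{1 \leq \nu \leq n^N} \left( h(y_\nu) g_N(y_\nu) - h(y'_\nu) g_N(y'_\nu) \right) \right| \\
&\leq \frac{1}{|z - z'|^\alpha} \sum_{1 \leq \nu \leq n^N} \left( |h(y_\nu) g_N(y_\nu) - h(y'_\nu) g_N(y_\nu)| + |h(y'_\nu) g_N(y_\nu) - h(y'_\nu) g_N(y'_\nu)| \right) \\
&\leq \frac{1}{|z - z'|^\alpha} \sum_{1 \leq \nu \leq n^N} |h(y_\nu) - h(y'_\nu)| \, g_N(y_\nu) + \frac{1}{|z - z'|^\alpha} \sum_{1 \leq \nu \leq n^N} |h(y'_\nu)| \, |g_N(y_\nu) - g_N(y'_\nu)|.
\end{aligned} \tag{1.5}\]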

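We shall consider these two sums separately and call them T₁ and T₂ respectively. In view of Proposition 1.17 and Corollary 1.27 we get

\[\begin{aligned}
T_1 &= \frac{1}{|z - z'|^\alpha} \sum_{1 \leq \nu \leq n^N} |y_\nu - y'_\nu|^\alpha \frac{|h(y_\nu) - h(y'_\nu)|}{|y_\nu - y'_\nu|^\alpha} g_N(y_\nu) \\
&\leq \frac{1}{|z - z'|^\alpha} \sum_{1 \leq \nu \leq n^N} \left( \frac{1}{\lambda^N} |z - z'| \right)^\alpha |h|_\alpha \, g_N(y_\nu) \\
&\leq \left( \frac{1}{\lambda^\alpha} \right)^N |h|_\alpha \sum_{1 \leq \nu \leq n^N} \frac{1}{|Df^N(y_\nu)|} \\
&\leq \left( \frac{1}{\lambda^\alpha} \right)^N |h|_\alpha \sum_{1 \leq \nu \leq n^N} K_2 |\Delta_{N,\nu}| \\
&= \left( \frac{1}{\lambda^\alpha} \right)^N K_2 |h|_\alpha.
\end{aligned} \tag{1.6}\]

In the last step we use that $\bigsqcup_{1 \leq \nu \leq n^N} \Delta_{N,\nu}$ is a disjoint union that makes up all of [0, 1], except for finitely many points. Now we consider the second sum; we use Corollary 1.25 and Proposition 1.30 to obtain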

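\[\begin{aligned}
T_2 &= \frac{1}{|z - z'|^\alpha} \sum_{1 \leq \nu \leq n^N} |h(y'_\nu)| \left| \frac{1}{|Df^N(y_\nu)|} - \frac{1}{|Df^N(y'_\nu)|} \right| \\
&\leq \frac{1}{|z - z'|^\alpha} \sum_{1 \leq \nu \leq n^N} \|h \mathbb{1}_{\Delta_{N,\nu}}\|_\infty \frac{1}{|Df^N(y_\nu)|} \left| \frac{Df^N(y_\nu)}{Df^N(y'_\nu)} - 1 \right| \\
&\leq K_1 \frac{|z - z'|}{|z - z'|^\alpha} \sum_{1 \leq \nu \leq n^N} \|h \mathbb{1}_{\Delta_{N,\nu}}\|_\infty \frac{1}{|Df^N(y_\nu)|} \\
&\leq K_1 \sum_{1 \leq \nu \leq n^N} \|h \mathbb{1}_{\Delta_{N,\nu}}\|_\infty \frac{1}{|Df^N(y_\nu)|} \\
&\leq K_1 \sum_{1 \leq \nu \leq n^N} \left( |\Delta_{N,\nu}|^\alpha |h|_\alpha + \frac{1}{|\Delta_{N,\nu}|} \int_{\Delta_{N,\nu}} |h| \right) \frac{1}{|Df^N(y_\nu)|}.
\end{aligned} \tag{1.7}\]

What we have done is split the estimate of T₂ into two more sums T₃ and T₄ respectively, corresponding to the two parts that Proposition 1.30 splits $\|h \mathbb{1}_{\Delta_{N,\nu}}\|_\infty$ into. For the T₃ part we use Proposition 1.17 and Corollary 1.27 to obtain

\[\begin{aligned}
T_3 &= K_1 \sum_{1 \leq \nu \leq n^N} |\Delta_{N,\nu}|^\alpha |h|_\alpha \frac{1}{|Df^N(y_\nu)|} \\
&\leq K_1 |h|_\alpha \sum_{1 \leq \nu \leq n^N} \left( \frac{1}{\lambda^N} \right)^\alpha K_2 |\Delta_{N,\nu}| \\
&= K_1 |h|_\alpha \left( \frac{1}{\lambda^\alpha} \right)^N \sum_{1 \leq \nu \leq n^N} K_2 |\Delta_{N,\nu}| \\
&= K_1 K_2 \left( \frac{1}{\lambda^\alpha} \right)^N |h|_\alpha.
\end{aligned} \tag{1.8}\]

Now, for the last sum, we use Corollary 1.27 to find

\[\begin{aligned}
T_4 &= K_1 \sum_{1 \leq \nu \leq n^N} \left( \frac{1}{|\Delta_{N,\nu}|} \int_{\Delta_{N,\nu}} |h| \right) \frac{1}{|Df^N(y_\nu)|} \\
&\leq K_1 \sum_{1 \leq \nu \leq n^N} \left( \frac{K_2 |\Delta_{N,\nu}|}{|\Delta_{N,\nu}|} \int_{\Delta_{N,\nu}} |h| \right) \\
&\leq K_1 K_2 \sum_{1 \leq \nu \leq n^N} \left( \int_{\Delta_{N,\nu}} |h| \right) \\
&= K_1 K_2 \|h\|_1.
\end{aligned}\]

Using the estimates on T₁, T₂, T₃ and T₄, we conclude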

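\[\begin{aligned}
|P^N h|_\alpha &= \sup_{z, z' \in [0,1]} \frac{\left| P^N h(z) - P^N h(z') \right|}{|z - z'|^\alpha} \\
&\leq \left( \left( \frac{1}{\lambda^\alpha} \right)^N K_2 + \left( \frac{1}{\lambda^\alpha} \right)^N K_1 K_2 \right) |h|_\alpha + K_1 K_2 \|h\|_1 \\
&\leq C r^N |h|_\alpha + C' \|h\|_1
\end{aligned}\]

for $C = K_2 + K_1 K_2$, $C' = K_1 K_2$ and $r = \lambda^{-\alpha}$, proving the lemma.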

In order to prove Theorem 1.22 we need not only an estimate on $|P^N h|_\alpha$, but also one on the other part that makes up $\|P^N h\|_\alpha$, namely $|P^N h(0)|$. This we obtain through the following lemma.

Lemma 1.32

For $h \in H_\alpha$ we have

\[ \|P^N h\|_\infty \leq C r^N |h|_\alpha + C' \|h\|_1 \]

for some constants C, C′ and r, where r < 1.

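Proof. This proof is now fairly straightforward as we use very similar techniques to those used in the proof of Lemma 1.31. We use Proposition 1.30, Proposition 1.17 and Corollary 1.27 to find

\[\begin{aligned}
\|P^N h\|_\infty
&\leq \sum_{1 \leq \nu \leq n^N} \|h \mathbb{1}_{\Delta_{N,\nu}}\|_\infty \left\| \frac{1}{|Df^N|} \mathbb{1}_{\Delta_{N,\nu}} \right\|_\infty \\
&\leq \sum_{1 \leq \nu \leq n^N} |\Delta_{N,\nu}|^\alpha |h|_\alpha \left\| \frac{1}{|Df^N|} \mathbb{1}_{\Delta_{N,\nu}} \right\|_\infty + \sum_{1 \leq \nu \leq n^N} \left( \frac{1}{|\Delta_{N,\nu}|} \int_{\Delta_{N,\nu}} |h| \right) \left\| \frac{1}{|Df^N|} \mathbb{1}_{\Delta_{N,\nu}} \right\|_\infty \\
&\leq \sum_{1 \leq \nu \leq n^N} \left( \frac{1}{\lambda^N} \right)^\alpha |h|_\alpha K_2 |\Delta_{N,\nu}| + \sum_{1 \leq \nu \leq n^N} K_2 \frac{|\Delta_{N,\nu}|}{|\Delta_{N,\nu}|} \int_{\Delta_{N,\nu}} |h| \\
&\leq K_2 \left( \frac{1}{\lambda^\alpha} \right)^N |h|_\alpha + K_2 \|h\|_1.
\end{aligned}\]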

All ingredients for proving Theorem 1.22 are now in place.