
Examensarbete

Total positivity and oscillatory kernels: an overview, and applications to the spectral theory of the cubic string

Marcus Kardell


Total positivity and oscillatory kernels: an overview, and applications to the spectral theory of the cubic string

Marcus Kardell
Department of Mathematics, Linköpings Universitet

LiTH-MAT-EX–2010/18–SE

Examensarbete (degree project): 30 hp
Level: Advanced

Supervisor: Hans Lundmark, Department of Mathematics, Linköpings Universitet
Examiner: Hans Lundmark, Department of Mathematics, Linköpings Universitet

Linköping: August 10, 2010


Abstract

In the study of the Degasperis–Procesi differential equation, an eigenvalue problem called the cubic string occurs. This is a third-order generalization of the second-order problem describing the eigenmodes of a vibrating string. In this thesis we study the eigenfunctions of the cubic string for discrete and continuous mass distributions, using the theory of total positivity, via a combinatorial approach with planar networks.

Keywords: Cubic string, total positivity, oscillatory kernels, planar networks.
URL: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-58005


Acknowledgements

First I would like to thank my supervisor, Hans Lundmark, for providing me with reading material, pedagogical explanations of difficult subjects, and for catching errors in my proofs and the early drafts. Thanks also to my opponent Anton Höghäll, for reading a daunting report. Finally, thanks to family and friends, who supported me by attending my final presentation or otherwise encouraged me to do a good job.

Thank you.


Contents

1 Introduction

2 Oscillatory matrices and their eigenvalue properties
2.1 Preliminaries
2.2 The Cauchy–Binet Theorem
2.3 Oscillatory matrices
2.4 Associated matrices
2.5 Perron's Theorem
2.6 Eigenvalues of oscillatory matrices

3 Integral eigenvalue problems with oscillatory kernels
3.1 Oscillatory kernels
3.2 Associated kernels
3.3 An integral analogue of Perron's Theorem
3.4 Eigenvalues and eigenfunctions of oscillatory kernels

4 Planar networks
4.1 Preliminaries
4.2 Lindström's Lemma
4.3 Constructing a planar network with a given weight matrix

5 Summary and conclusion


Chapter 1

Introduction

In the theory of small oscillations, the vibrations $u(y,t)$ of a string with mass density $g(y)$ can be described by the second-order partial differential equation $u_{yy} = g(y)u_{tt}$. Separation of variables $u(y,t) = \Phi(y)\tau(t)$, combined with the condition that the ends are fixed at $y = \pm 1$, gives the following second-order eigenvalue problem
\[
\Phi_{yy}(y) = z g(y)\Phi(y), \quad y \in (-1,1), \qquad \Phi(\pm 1) = 0.
\]

The object of interest in this thesis is a third-order generalization of this problem, called the cubic string, which in its basic form is the differential equation
\[
-\Phi_{yyy}(y) = z g(y)\Phi(y), \quad y \in (-1,1).
\]
When $g(y)$ is a discrete measure – a linear combination of Dirac distributions – the ordinary string is piecewise linear, since the second derivative is zero except at the mass points. The cubic string is instead piecewise a quadratic polynomial. When the mass distribution is continuous, things get more complicated.

The main difference between the problems is that the cubic string uses three boundary conditions. In [1], the Dirichlet-like boundary conditions
\[
\Phi(-1) = \Phi_y(-1) = \Phi(1) = 0
\]

were studied, and this is the case we will consider here. Note that this is not a self-adjoint problem. For Neumann-like boundary conditions, see [2].

There are no known physical applications of the cubic string so far. The historical motivation for the cubic string comes from the Degasperis–Procesi (DP) and Camassa–Holm (CH) shallow water equations. These are the PDEs
\[
u_t - u_{xxt} + (b+1)u u_x = b u_x u_{xx} + u u_{xxx}, \quad (x,t) \in \mathbb{R},
\]
where $b = 2$ is CH and $b = 3$ is DP. The DP equation was studied in [3], where the wave equation
\[
\psi_x(x) - \psi_{xxx}(x) = z m \psi(x) \tag{1.1}
\]
indirectly arose. (See [3] for more details.) Here we just note that this wave equation is equivalent to the cubic string problem via the transformation
\[
y = \tanh\frac{x}{2}, \qquad \psi(x) = \frac{2\Phi(y)}{1 - y^2}, \qquad m(x) = \left(\frac{1 - y^2}{2}\right)^3 g(y).
\]


Note that the variable transformation is invertible, and that the mass distribution $m(x)$ is still positive if $g(y)$ is. We can use the wave equation (1.1) to derive a system of linear equations, which will tell us a lot about how the eigenfunctions of the cubic string behave.

We start by taking the Fourier transform of each side. We get
\[
i\omega(1+\omega^2)\hat\psi(\omega) = \widehat{zm\psi} =: \hat f
\implies \hat\psi(\omega) = \frac{1}{i\omega(1+\omega^2)}\,\hat f
\implies \hat\psi(\omega) = \sqrt{\frac{\pi}{2}}\;\widehat{\int_{-\infty}^{x} e^{-|x|}\,dx}\cdot\hat f.
\]
The integral can be calculated as
\[
\int_{-\infty}^{x} e^{-|x|}\,dx =
\begin{cases}
e^x, & x \le 0, \\
2 - e^{-x}, & x \ge 0,
\end{cases}
\;=: g(x).
\]
Transforming back, we get the integral equation
\[
\hat\psi(\omega) = \tfrac{1}{2}\sqrt{2\pi}\,\hat f\cdot\hat g
\implies
\psi(x) = \tfrac{1}{2}(f * g)(x) = \frac{z}{2}\int_{-\infty}^{\infty} g(x-s)\,m(s)\,\psi(s)\,ds.
\]
When $m(s)$ is a discrete measure,

\[
m(s) = 2\sum_{j=1}^{n} m_j\,\delta(s - x_j), \qquad x_1 < x_2 < \dots < x_n,
\]
this integral equation breaks down to a system of $n$ equations:
\[
\psi(x) = z\sum_{j=1}^{n} g(x - x_j)\, m_j\psi(x_j)
\implies
\begin{pmatrix} \psi(x_1) \\ \psi(x_2) \\ \vdots \\ \psi(x_n) \end{pmatrix}
= z\,G \cdot
\begin{pmatrix} m_1\psi(x_1) \\ m_2\psi(x_2) \\ \vdots \\ m_n\psi(x_n) \end{pmatrix},
\]
where
\[
G = \begin{pmatrix}
g(x_1 - x_1) & g(x_1 - x_2) & \dots & g(x_1 - x_n) \\
g(x_2 - x_1) & g(x_2 - x_2) & \dots & g(x_2 - x_n) \\
\vdots & \vdots & \ddots & \vdots \\
g(x_n - x_1) & g(x_n - x_2) & \dots & g(x_n - x_n)
\end{pmatrix}.
\]
We know that $g(x_i - x_i) = g(0) = 1$. When $i < j$, $g(x_i - x_j) = e^{x_i - x_j}$, and otherwise $g(x_i - x_j) = 2 - e^{x_j - x_i}$. Defining $E_{ij} = e^{x_i - x_j}$ we see that $G$ takes the form
\[
G_n = \begin{pmatrix}
1 & E_{12} & E_{13} & \dots & E_{1n} \\
2 - E_{12} & 1 & E_{23} & \dots & E_{2n} \\
2 - E_{13} & 2 - E_{23} & 1 & \dots & E_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
2 - E_{1n} & 2 - E_{2n} & 2 - E_{3n} & \dots & 1
\end{pmatrix}
\]


for each $n$. To study the eigenvalues of this matrix, we are going to need the theory of oscillatory matrices and kernels, which is presented in Chapters 2 and 3 respectively, relying heavily on the original work [4]. An important stepping stone is Perron's Theorem, which is proven in detail. In Chapter 4 we show that $G$ is oscillatory for each $n$, by constructing a planar network which has $G$ as its weight matrix.

Note that $G$ was already known to be oscillatory, by the more general theory in [5], but here we have chosen to take a more combinatorial approach to this problem. The matrix $G$ appeared in the original article [3] about the DP equation, but no one seems to have studied this particular planar network before.
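As a quick sanity check of what Chapters 2–4 will prove, one can sample $G_n$ for random mass points and verify numerically that all of its minors are non-negative. The following Python sketch is our own illustration (the helper names `build_G` and `all_minors_nonneg` are hypothetical, not from the thesis):

    import itertools
    import numpy as np

    def build_G(x):
        """Build G_n for mass points x_1 < ... < x_n, with entries
        g(x_i - x_j) as defined in the text."""
        n = len(x)
        G = np.empty((n, n))
        for i in range(n):
            for j in range(n):
                if i <= j:
                    G[i, j] = np.exp(x[i] - x[j])      # E_ij (1 on the diagonal)
                else:
                    G[i, j] = 2 - np.exp(x[j] - x[i])  # 2 - E_ji
        return G

    def all_minors_nonneg(A, tol=1e-12):
        """Brute-force check of every minor of A, of every size."""
        n = A.shape[0]
        for p in range(1, n + 1):
            for rows in itertools.combinations(range(n), p):
                for cols in itertools.combinations(range(n), p):
                    if np.linalg.det(A[np.ix_(rows, cols)]) < -tol:
                        return False
        return True

    rng = np.random.default_rng(0)
    x = np.sort(rng.uniform(-3, 3, size=5))   # random ordered mass points
    print(all_minors_nonneg(build_G(x)))      # expected: True (G is TNN)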


Chapter 2

Oscillatory matrices and their eigenvalue properties

In this chapter we study the simpler case of oscillatory matrices and their eigenvalue properties; in the next chapter we move on to the more general case of oscillatory kernels.

2.1 Preliminaries

We first remind the reader of some facts from linear algebra. Let $A$ be a square matrix
\[
A = \begin{pmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \dots & a_{nn}
\end{pmatrix}.
\]

We are going to represent such a matrix as $\|a_{ik}\|_1^n$. The transpose of $A$ is written as $A^t = \|a_{ki}\|_1^n$. When $A^t = A$, the matrix is called symmetric. The notation for the determinant of $A$ will be $\det A$, $|A|$, or $|a_{ik}|_1^n$. The rank of a matrix (the maximal number of linearly independent rows or columns) is written $\operatorname{rank} A$. The eigenvalues of $A$ are calculated as roots of the characteristic equation $|A - \lambda I| = 0$, where $I$ is the identity matrix. We can write $I = \|\delta_{ik}\|_1^n$ using the Kronecker delta $\delta_{ik}$ ($1$ if $i = k$, otherwise $0$). Since transposing a matrix doesn't change its determinant, we see that $|A - \lambda I| = |(A - \lambda I)^t| = |A^t - \lambda I|$, which implies that $A$ and $A^t$ have the same eigenvalues.

Minors (subdeterminants) of $A$ will be notated
\[
A\begin{pmatrix} s_1 & s_2 & \dots & s_p \\ t_1 & t_2 & \dots & t_p \end{pmatrix},
\]
where $(s_1, s_2, \dots, s_p)$ and $(t_1, t_2, \dots, t_p)$ are the indices determining which rows and columns, respectively, are kept from $A$. In connection with this we let $M_p^n = \{(i_1, i_2, \dots, i_p) : 1 \le i_1 < i_2 < \dots < i_p \le n\}$ be the set of all tuples of $p$ distinct ordered indices, chosen from the integers $1$ to $n$. A minor where all $s_j = t_j$ is called principal.


Closely related to minors are cofactors, written
\[
A_{ik} = (-1)^{i+k}\, A\begin{pmatrix} 1 & \dots & i-1 & i+1 & \dots & n \\ 1 & \dots & k-1 & k+1 & \dots & n \end{pmatrix},
\]
which are used when expanding a determinant with respect to a row or column; for example
\[
|A| = \sum_{k=1}^{n} a_{ik} A_{ik},
\]
where we have expanded along row $i$.

Differentiating a determinant is possible with the use of the product rule of differentiation. The determinant is a sum of products of elements. Differentiating with respect to each element gives us
\[
\frac{d}{dx}|a_{ik}|_1^n = \sum_{i=1}^{n}\sum_{k=1}^{n} A_{ik}\,\frac{d}{dx}a_{ik}.
\]

Finally, we are also going to need a few facts about permutations, i.e. bijections from $\{1, 2, \dots, n\}$ to itself. There are exactly $n!$ such functions on $n$ elements. One interpretation of a permutation $\sigma$ is that it changes the order of the elements $(1, 2, \dots, n)$ to $(\sigma(1), \sigma(2), \dots, \sigma(n))$. Multiplication of two permutations $\sigma$ and $\tau$ is defined as the composition of the two functions, $\sigma(\tau(x))$. Permutations that only swap two elements are called transpositions. All permutations can be expressed as a product of transpositions in many different ways, but one can show that the number of factors for a given permutation is always odd or always even. One can then define the sign of a permutation, $\operatorname{sgn}(\sigma) = (-1)^k$, where $k$ is the number of transpositions in such a factorization of $\sigma$.

We are going to use permutations to write determinants in a very succinct way.

Definition 1. Let $A$ be a square matrix $\|a_{ik}\|_1^n$. Then
\[
|A| = \sum_{\sigma} \operatorname{sgn}(\sigma) \prod_{i=1}^{n} a_{i\sigma(i)},
\]
where the sum is over all permutations $\sigma$ on $\{1, 2, \dots, n\}$.
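For concreteness, Definition 1 translates directly into code. The sketch below is our own illustration (not from the thesis); it computes a determinant as a sum over permutations and compares it with a library routine:

    import itertools
    import numpy as np

    def perm_sign(p):
        """Sign of a permutation (as a tuple), via counting inversions."""
        inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
                         if p[i] > p[j])
        return -1 if inversions % 2 else 1

    def det_by_definition(A):
        """Determinant as in Definition 1: a sum over all permutations."""
        n = A.shape[0]
        return sum(perm_sign(p) * np.prod([A[i, p[i]] for i in range(n)])
                   for p in itertools.permutations(range(n)))

    A = np.array([[2.0, 1, 3], [4, 1, 2], [1, 2, 3]])
    print(det_by_definition(A), np.linalg.det(A))   # both give the same value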

2.2 The Cauchy–Binet Theorem

In linear algebra the basic relation $|AB| = |A||B|$ for square matrices is studied. The Cauchy–Binet Theorem is a generalization, handling both square and rectangular matrices.

Theorem 1. Let $A$ be a $p \times n$ matrix, and $B$ an $n \times p$ matrix. Let $C = AB$. Then
\[
|C| = \sum_{k \in M_p^n} A\begin{pmatrix} 1 & 2 & \dots & p \\ k_1 & k_2 & \dots & k_p \end{pmatrix} B\begin{pmatrix} k_1 & k_2 & \dots & k_p \\ 1 & 2 & \dots & p \end{pmatrix}.
\]


Proof. The sum is over the $\binom{n}{p}$ ways of choosing indices $k_1$ to $k_p$. Note that if $p > n$, then the sum is empty. This is consistent with the ranks of the matrices, since $\operatorname{rank} C$ is at most equal to the smallest rank of its factors, and both $A$ and $B$ have rank less than or equal to $n$. Consequently $C$, which is a $p \times p$ matrix, is rank deficient, which means that $|C| = 0$.

If $p = n$, the sum has only one term. The relation is then the well-known $|C| = |A||B|$.

Let us now look at the case where $p < n$ (note that the following argument is valid also for $p = n$). Looking closer at $C$, we have
\[
|C| = \begin{vmatrix}
c_{11} & c_{12} & \dots & c_{1p} \\
c_{21} & c_{22} & \dots & c_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
c_{p1} & c_{p2} & \dots & c_{pp}
\end{vmatrix},
\]
which according to the rules of matrix multiplication is equal to
\[
\begin{vmatrix}
\sum_{k_1=1}^{n} a_{1k_1}b_{k_11} & \sum_{k_1=1}^{n} a_{1k_1}b_{k_12} & \dots & \sum_{k_1=1}^{n} a_{1k_1}b_{k_1p} \\
\sum_{k_2=1}^{n} a_{2k_2}b_{k_21} & \sum_{k_2=1}^{n} a_{2k_2}b_{k_22} & \dots & \sum_{k_2=1}^{n} a_{2k_2}b_{k_2p} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{k_p=1}^{n} a_{pk_p}b_{k_p1} & \sum_{k_p=1}^{n} a_{pk_p}b_{k_p2} & \dots & \sum_{k_p=1}^{n} a_{pk_p}b_{k_pp}
\end{vmatrix}.
\]
Thanks to the multilinearity of determinants, we can write this as
\[
\sum_{k_1=1}^{n}\sum_{k_2=1}^{n}\dots\sum_{k_p=1}^{n} a_{1k_1}a_{2k_2}\dots a_{pk_p}
\begin{vmatrix}
b_{k_11} & b_{k_12} & \dots & b_{k_1p} \\
b_{k_21} & b_{k_22} & \dots & b_{k_2p} \\
\vdots & \vdots & \ddots & \vdots \\
b_{k_p1} & b_{k_p2} & \dots & b_{k_pp}
\end{vmatrix}.
\]

Each term that has $k_i = k_j$ for some $i$ and $j$ will be zero, because the corresponding rows in the determinant are equal, and thus linearly dependent. In the remaining terms, all $k_i$'s are different, which means we can introduce (for each term in the sum) a function $\sigma$, from $\{1, 2, \dots, p\}$ onto $\{k_1, k_2, \dots, k_p\}$, via the relation $\sigma(i) = k_i$. We get
\[
|C| = \sum_{\sigma} \prod_{i=1}^{p} a_{i\sigma(i)}
\begin{vmatrix}
b_{\sigma(1)1} & b_{\sigma(1)2} & \dots & b_{\sigma(1)p} \\
b_{\sigma(2)1} & b_{\sigma(2)2} & \dots & b_{\sigma(2)p} \\
\vdots & \vdots & \ddots & \vdots \\
b_{\sigma(p)1} & b_{\sigma(p)2} & \dots & b_{\sigma(p)p}
\end{vmatrix}. \tag{2.1}
\]
Note that $\sigma$ is not really a permutation in a strict sense, since it is not onto itself, but it still behaves just like a normal permutation. (It is isomorphic to one such.)

Let us transpose the rows in each minor, in a way such that the indices $\sigma(i)$ appear ordered from lowest to highest, i.e. $k_{\min} = \min\{k_i\}$ ends up on the first row, and so on. Each transposition changes the sign of the determinant, so for each term this process yields a sign factor, equal to the sign of $\sigma$:
\[
|C| = \sum_{\sigma} \operatorname{sgn}(\sigma) \prod_{i=1}^{p} a_{i\sigma(i)}
\begin{vmatrix}
b_{k_{\min}1} & \dots & b_{k_{\min}p} \\
\vdots & \ddots & \vdots \\
b_{k_{\max}1} & \dots & b_{k_{\max}p}
\end{vmatrix}.
\]


The extra sign factor is exactly what we need to convert this sum of products to minors of $A$. We can group the terms according to the range of $\sigma$, i.e. the values $\{k_{\min}, \dots, k_{\max}\}$. Then, using Definition 1 on each such group, we get
\[
|C| = \sum_{k \in M_p^n} A\begin{pmatrix} 1 & \dots & p \\ k_{\min} & \dots & k_{\max} \end{pmatrix} B\begin{pmatrix} k_{\min} & \dots & k_{\max} \\ 1 & \dots & p \end{pmatrix},
\]
which is nothing but the desired relation.

The Cauchy–Binet Theorem also gives us a way to deal with minors of a product.

Corollary 1. An arbitrary minor of $C = AB$ can be computed as
\[
C\begin{pmatrix} s_1 & \dots & s_k \\ t_1 & \dots & t_k \end{pmatrix}
= \sum_{r} A\begin{pmatrix} s_1 & \dots & s_k \\ r_1 & \dots & r_k \end{pmatrix} B\begin{pmatrix} r_1 & \dots & r_k \\ t_1 & \dots & t_k \end{pmatrix}.
\]
Proof. The matrix behind the minor
\[
C\begin{pmatrix} s_1 & \dots & s_k \\ t_1 & \dots & t_k \end{pmatrix}
\]
is the product of
\[
\begin{pmatrix}
a_{s_11} & \dots & a_{s_1n} \\
\vdots & \ddots & \vdots \\
a_{s_k1} & \dots & a_{s_kn}
\end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix}
b_{1t_1} & \dots & b_{1t_k} \\
\vdots & \ddots & \vdots \\
b_{nt_1} & \dots & b_{nt_k}
\end{pmatrix}.
\]
Applying the Cauchy–Binet Theorem to this product gives us the corollary.
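Corollary 1 is easy to test numerically. The sketch below is illustrative (the helper `minor` is our own); it compares a minor of a product against the Cauchy–Binet sum for random matrices:

    import itertools
    import numpy as np

    def minor(M, rows, cols):
        """The minor of M keeping the given (0-based) rows and columns."""
        return np.linalg.det(M[np.ix_(rows, cols)])

    rng = np.random.default_rng(1)
    n, k = 5, 2
    A = rng.standard_normal((n, n))
    B = rng.standard_normal((n, n))
    C = A @ B

    s, t = (0, 3), (1, 4)          # chosen rows and columns of C
    lhs = minor(C, s, t)
    rhs = sum(minor(A, s, r) * minor(B, r, t)
              for r in itertools.combinations(range(n), k))
    print(np.isclose(lhs, rhs))    # expected: True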

2.3 Oscillatory matrices

Definition 2. A matrix $A = \|a_{ik}\|_1^n$ is called totally non-negative (TNN) if all minors of $A$ are non-negative. $A$ is called totally positive (TP) if all minors are positive. If $A$ is totally non-negative, and some power of it, $A^\kappa$, is totally positive, $A$ is called oscillatory. The smallest such $\kappa$ is then called the exponent of $A$.

Theorem 2. $\{\text{TP matrices}\} \subset \{\text{Oscillatory matrices}\} \subset \{\text{TNN matrices}\}$.

Proof. The inclusions follow immediately from the definition. That the inclusions are strict follows from simple examples. Let
\[
A = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}, \qquad
B = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{pmatrix}.
\]
$A$ is TNN, but not TP. $A$ is also not oscillatory since $A^2 = A$, which means that no power of $A$ can be TP.

$B$ is also TNN, since all minors are found to be non-negative, but it is not TP because of the zero entries. We can calculate
\[
B^2 = \begin{pmatrix} 2 & 3 & 1 \\ 3 & 6 & 4 \\ 1 & 4 & 5 \end{pmatrix}
\]
and verify that all minors are positive, which means that $B$ is oscillatory (but not TP).
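These small examples can be verified mechanically. The sketch below is our own illustration, reusing the brute-force minor enumeration idea from Chapter 1:

    import itertools
    import numpy as np

    def minors(A):
        """Yield every minor of the square matrix A."""
        n = A.shape[0]
        for p in range(1, n + 1):
            for rows in itertools.combinations(range(n), p):
                for cols in itertools.combinations(range(n), p):
                    yield np.linalg.det(A[np.ix_(rows, cols)])

    def is_TNN(A, tol=1e-12):
        return all(m > -tol for m in minors(A))

    def is_TP(A, tol=1e-12):
        return all(m > tol for m in minors(A))

    B = np.array([[1.0, 1, 0], [1, 2, 1], [0, 1, 2]])
    print(is_TNN(B), is_TP(B))   # True False
    print(is_TP(B @ B))          # True, so B is oscillatory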


Using the Cauchy–Binet Theorem, we can find more properties of these three classes of matrices.

Theorem 3. The product of two TNN (TP) matrices is TNN (TP).

Proof. From Cauchy–Binet, each minor of the product is a sum of products of minors from the factors. When the factors are TNN, the sum must be greater than or equal to zero. In the TP case, the sum is positive.

Theorem 4. The product of a TNN matrix $A$ and a TP matrix is TP if and only if $A$ is non-singular.

Proof. The necessity is trivial. The sufficiency follows from expanding the product using the Cauchy–Binet Theorem. In fact, if $A$ is non-singular, at least one minor in each such sum is positive, making the right-hand side positive.

Theorem 5. The product of an oscillatory matrix $A$ and a TP matrix is TP. Thus, if an oscillatory matrix $A$ has exponent $\kappa$, then $A^n$ is TP for all $n \ge \kappa$.

Proof. This follows from the previous theorem, together with the fact that the relation $0 < \det(A^\kappa) = (\det A)^\kappa$ implies that an oscillatory matrix must be non-singular.

2.4 Associated matrices

Before we can say anything about the eigenvalues of an oscillatory matrix we need to lay some groundwork. The definitions and theorems below will then have their analogues when we move on to oscillatory kernels.

An important tool for studying the eigenvalues of oscillatory matrices will be associated matrices. Let $A$ be an (arbitrary) $n \times n$ matrix $\|a_{ik}\|_1^n$. For $1 \le p \le n$, let $M_p^n = \{(i_1, i_2, \dots, i_p) : 1 \le i_1 < i_2 < \dots < i_p \le n\}$ be the set of all tuples of $p$ distinct, ordered indices as before. Note that the elements in $M_p^n$ can be ordered lexicographically¹, and numbered from $1$ to $N = \binom{n}{p}$. Let $s, t$ represent two such elements $(s_1, s_2, \dots, s_p)$ and $(t_1, t_2, \dots, t_p)$ respectively. We can then define $a_{s,t}$ as the minor
\[
A\begin{pmatrix} s_1 & s_2 & \dots & s_p \\ t_1 & t_2 & \dots & t_p \end{pmatrix}.
\]

Definition 3. $A_p = \|a_{s,t}\|_1^N$ is the $p$-th associated matrix of $A$.

Let
\[
A = \begin{pmatrix} 2 & 1 & 3 \\ 4 & 1 & 2 \\ 1 & 2 & 3 \end{pmatrix}.
\]
$A_1$ is of course equal to $A$ itself. $A_3$ is the matrix consisting of only one element, namely $\det A$. To calculate $A_2$ we need to compute the $\binom{3}{2}^2 = 9$ minors of size $2$, with row and column indices chosen from $\{(1,2), (1,3), (2,3)\}$, corresponding to $s, t \in \{1, 2, 3\}$. For example,
\[
a_{1,2} = A\begin{pmatrix} 1 & 2 \\ 1 & 3 \end{pmatrix} = \begin{vmatrix} 2 & 3 \\ 4 & 2 \end{vmatrix} = -8.
\]

¹Given two tuples $s = (s_1, s_2, \dots, s_p)$ and $t = (t_1, t_2, \dots, t_p)$, $s$ precedes $t$ lexicographically if $s_r < t_r$ at the first position $r$ where the tuples differ.
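Computing an associated matrix is a direct loop over $M_p^n$. The following sketch (our own illustration, not from the thesis) builds $A_p$ for the example above:

    import itertools
    import numpy as np

    def associated_matrix(A, p):
        """The p-th associated matrix: minors indexed by lexicographically
        ordered p-tuples of rows and columns."""
        n = A.shape[0]
        tuples = list(itertools.combinations(range(n), p))  # lexicographic order
        N = len(tuples)
        Ap = np.empty((N, N))
        for i, s in enumerate(tuples):
            for j, t in enumerate(tuples):
                Ap[i, j] = np.linalg.det(A[np.ix_(s, t)])
        return Ap

    A = np.array([[2.0, 1, 3], [4, 1, 2], [1, 2, 3]])
    print(associated_matrix(A, 2).round(6))
    # the entry with s = (1,2), t = (1,3) is the minor computed above: -8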


Note that the associated matrix of a totally positive matrix has positive elements. We can also note the following two properties.

Theorem 6. Let $A$, $B$ and $C$ be matrices with their respective associated matrices $A_p$, $B_p$ and $C_p$.

i) If $C = AB$ then $C_p = A_p B_p$.

ii) If $A = B^{-1}$ then $A_p = B_p^{-1}$.

Proof. Property i) follows immediately from the Cauchy–Binet Theorem. In fact, each element $c_{s,t}$ of $C_p$ is a minor of $C$, from which it follows that
\[
c_{s,t} = \sum_{r=1}^{N} a_{s,r}\, b_{r,t}.
\]
If we apply property i) to the relation $AB = I$ we get $A_p B_p = I_p$. But $I_p$ is nothing but another unit matrix, $I_p = \|\delta_{s,t}\|_1^N$. (If $s \ne t$, then at least one row of the minor $e_{s,t}$ consists only of zeros.) Consequently, $A_p B_p = I$, and property ii) follows.

The main theorem regarding associated matrices is the Kronecker Theorem.

Theorem 7 (Kronecker). Let $A$ be an $n \times n$ matrix, with the complete system of eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$. Then the eigenvalues of $A_p$ consist of all possible products of $p$ distinct $\lambda_k$.

Proof. Each matrix $A$ is similar to a triangular matrix $T$ with the same eigenvalues as $A$, i.e., for some matrix $P$,
\[
A = P T P^{-1} = P \begin{pmatrix}
\lambda_1 & * & * & * \\
0 & \lambda_2 & * & * \\
0 & 0 & \ddots & * \\
0 & 0 & 0 & \lambda_n
\end{pmatrix} P^{-1}.
\]
Applying properties i) and ii) from Theorem 6, we get $A_p = P_p T_p P_p^{-1}$. Let us study the elements of $T_p$,
\[
t_{s,t} = T\begin{pmatrix} s_1 & s_2 & \dots & s_p \\ t_1 & t_2 & \dots & t_p \end{pmatrix}.
\]
For $s = t$, it is easy to see that $t_{s,s} = \lambda_{s_1}\lambda_{s_2}\dots\lambda_{s_p} =: \Lambda_s$. For $s > t$, there exists a number $r$ such that $s_r > t_r$, which means that
\[
T\begin{pmatrix} s_1 & s_2 & \dots & s_p \\ t_1 & t_2 & \dots & t_p \end{pmatrix} =
\begin{vmatrix}
\lambda_{s_1} & * & * & * & * & * \\
0 & \ddots & * & * & * & * \\
0 & 0 & \lambda_{s_{r-1}} & * & * & * \\
0 & 0 & 0 & 0 & * & * \\
\vdots & & & \vdots & \ddots & * \\
0 & 0 & 0 & 0 & 0 & *
\end{vmatrix} = 0,
\]
since the first $r$ columns have non-zero entries only in the first $r - 1$ rows.


The conclusion is that $T_p$ is also a triangular matrix,
\[
T_p = \begin{pmatrix}
\Lambda_1 & * & * & * \\
0 & \Lambda_2 & * & * \\
0 & 0 & \ddots & * \\
0 & 0 & 0 & \Lambda_N
\end{pmatrix},
\]
with the same eigenvalues $\Lambda_s = \lambda_{s_1}\lambda_{s_2}\dots\lambda_{s_p}$ as the matrix $A_p$. This completes the proof.
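Kronecker's Theorem can be checked numerically with the associated-matrix construction sketched earlier (again our own illustration): the eigenvalues of $A_p$ should match all $p$-fold products of eigenvalues of $A$.

    import itertools
    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((4, 4))
    p = 2

    # Eigenvalues of the associated matrix A_p ...
    tuples = list(itertools.combinations(range(4), p))
    Ap = np.array([[np.linalg.det(A[np.ix_(s, t)]) for t in tuples]
                   for s in tuples])
    eigs_Ap = np.sort_complex(np.linalg.eigvals(Ap))

    # ... versus all products of p distinct eigenvalues of A.
    eigs_A = np.linalg.eigvals(A)
    products = np.sort_complex([np.prod(eigs_A[list(c)]) for c in tuples])

    print(np.allclose(eigs_Ap, products))   # expected: True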

2.5 Perron's Theorem

An important theorem of this chapter is Perron's Theorem. It concerns any matrix with positive elements, and has been subject to many different proofs. The proof here follows [4]. For an interesting survey of alternative proofs, see [6].

Theorem 8 (Perron). If all the elements of a matrix $A = \|a_{ik}\|_1^n$ are positive, then $A$ has a positive, simple eigenvalue $\rho$ whose absolute value is larger than that of all other eigenvalues of $A$. To this eigenvalue corresponds an eigenvector with positive coordinates.

Proof. When $A$ is a $1 \times 1$ matrix, the theorem is true. We are going to do a proof by induction, so let us assume that the theorem is true for matrices of size $(n-1) \times (n-1)$ and smaller. Let
\[
D_m(\lambda) = |\lambda\delta_{ik} - a_{ik}|_1^m \qquad (m = 1, \dots, n)
\]
be determinants of different size. Expanding the largest one, $D_n(\lambda)$, along the last column we get
\[
D_n(\lambda) = (\lambda - a_{nn}) D_{n-1}(\lambda) + \sum_{i=1}^{n-1} (-a_{in}) A_{in}(\lambda),
\]
where
\[
A_{in}(\lambda) = (-1)^{i+n}\, [\lambda I - A]\begin{pmatrix} 1 & \dots & i-1 & i+1 & \dots & n \\ 1 & \dots & i-1 & i & \dots & n-1 \end{pmatrix} \tag{2.2}
\]
is the cofactor of $\lambda\delta_{in} - a_{in}$. Since $i < n$, all of the minors in (2.2) have the same last row, which means that we can expand each of them with respect to this row. Note that this is now the $(n-1)$st row. We get
\[
A_{in}(\lambda) = (-1)^{i+n} \sum_{k=1}^{n-1} (-1)^{n-1+k}(-a_{nk})\, [\lambda I - A]\begin{pmatrix} 1 & \dots & i-1 & i+1 & \dots & n-1 \\ 1 & \dots & k-1 & k+1 & \dots & n-1 \end{pmatrix}.
\]
From this, by moving some sign factors around, we obtain
\[
A_{in}(\lambda) = \sum_{k=1}^{n-1} a_{nk}\, A^{(n-1)}_{ik}(\lambda),
\]


where $A^{(n-1)}_{ik}$ is the cofactor of $\lambda\delta_{ik} - a_{ik}$ in $D_{n-1}(\lambda)$. Putting this together we have
\[
D_n(\lambda) = (\lambda - a_{nn}) D_{n-1}(\lambda) - \sum_{i,k=1}^{n-1} a_{in} a_{nk}\, A^{(n-1)}_{ik}(\lambda).
\]
According to the induction hypothesis, the truncated matrix $\|a_{ik}\|_1^{n-1}$ has a maximal, positive eigenvalue which we can denote by $\rho_{n-1}$. Using this value, which is a root of $D_{n-1}(\lambda) = 0$, we get
\[
D_n(\rho_{n-1}) = -\sum_{i,k=1}^{n-1} a_{in} a_{nk}\, A^{(n-1)}_{ik}(\rho_{n-1}).
\]
Here we are going to need a second induction hypothesis: that at step $n-1$ we have $A^{(n-1)}_{ik}(\lambda) > 0$ when $\lambda \ge \rho_{n-1}$. It is not meaningful to talk about an eigenvalue $\rho_0$, but we consider the second hypothesis true for $n = 1$.

Using this, $D_n(\rho_{n-1})$ turns out to be negative, since we know all elements of the matrix $A$ to be positive, in particular the elements $a_{in}$ and $a_{nk}$. On the other hand,
\[
\lim_{\lambda\to\infty} D_n(\lambda) = +\infty.
\]
$D_n(\lambda)$ is a continuous function, so the equation $D_n(\lambda) = 0$ must have at least one positive root larger than $\rho_{n-1}$. We denote the largest of these roots by $\rho_n$.

Let us study the cofactors and principal minors $A_{ii}(\lambda)$. Note that we could have begun by expanding $D_n(\lambda)$ along any row $i$ and column $i$, leaving us with the first term $(\lambda - a_{ii}) A_{ii}(\lambda)$ instead. $A_{ii}(\lambda)$ would play the role of $D_{n-1}(\lambda)$, and we would find that $\rho_n$ is larger than the largest eigenvalue of each of these principal minors. So for $\lambda \ge \rho_n$, the cofactors $A_{ii}(\lambda)$ cannot change sign. Expanding them in terms of $\lambda$, the first term turns out to be $\lambda^{n-1}$, so
\[
\lim_{\lambda\to\infty} A_{ii}(\lambda) = +\infty.
\]
The conclusion is that all cofactors $A_{ii}(\lambda) > 0$ when $\lambda \ge \rho_n$. This gives us another part of the theorem. Indeed, by the product rule of differentiation,
\[
D_n'(\rho_n) = \sum_{i,k=1}^{n} \delta_{ik} A_{ik}(\rho_n) = \sum_{i=1}^{n} A_{ii}(\rho_n) > 0,
\]
thus $\rho_n$ is a simple root of $D_n(\lambda) = 0$.

Now we need to deal with the other cofactors $A_{ik}(\lambda)$, where $i \ne k$. Expanding them, with respect to row $k$ and column $i$, similarly to before and keeping close track of all the sign factors, we get
\[
A_{ik}(\lambda) = a_{ki}\, C(\lambda) + \sum_{p,q} C_{pq}(\lambda)\, a_{pi} a_{kq} \qquad (p, q \ne i, k),
\]
where $C(\lambda)$ is the principal minor where rows and columns $i, k$ are deleted, and $C_{pq}(\lambda)$ is the cofactor of $\lambda\delta_{pq} - a_{pq}$ in $C(\lambda)$. Note especially that the first term has no minus sign, which is because the element $(-a_{ki})$ in this cofactor is situated at row $k-1$ or column $i-1$, depending on which of the indices is largest. This compensates the minus sign from $(-a_{ki})$ when expanding the cofactor.


Note also that there seems to be a minor error in [4], where they have reversed the order of the indices k and q in akq.

By the induction hypotheses, C(λ) and Cpq(λ) are positive when λ ≥ ρn, so

we conclude that Aik(λ) > 0 for λ ≥ ρn. This implies that Aik(λ) is positive

also when i 6= k, which completes the induction step for hypothesis two. It is now possible to construct an eigenvector with positive coordinates. Let u be a vector with coordinates uk equal to the cofactors A1k(ρn). Expanding

the first row of the determinant in the characteristic equation, we get

|δikρn− aik|n1 = 0 =⇒ n

X

k=1

(δ1kρn− a1k)uk= 0.

Replacing the first row of that determinant with a copy of another row $i \in \{2, \dots, n\}$, another determinant equal to zero appears. For each $i$, we can again expand with respect to the first row. Since all other rows are as before, this results in the same cofactors, but with other coefficients multiplying them. Thus
\[
\sum_{k=1}^{n} (\delta_{ik}\rho_n - a_{ik})\, u_k = 0
\]
holds for each $i \in \{1, \dots, n\}$. This is nothing but the vector equation
\[
\|\delta_{ik}\rho_n - a_{ik}\|_1^n \cdot u = 0 \implies \rho_n u = A u.
\]

We also need to compare $\rho_n$ to the absolute values of the other eigenvalues of $A$. Let $(v, \lambda)$ be an eigenpair of the transposed matrix $A^t$, with $\rho_n \ne \lambda$. Note that $\lambda$ then also is an eigenvalue of $A$. We have the equations
\[
A u = \rho_n u \implies \sum_{k=1}^{n} a_{ik} u_k = \rho_n u_i \quad (i = 1, 2, \dots, n), \tag{2.3}
\]
\[
A^t v = \lambda v \implies \sum_{i=1}^{n} a_{ik} v_i = \lambda v_k \quad (k = 1, 2, \dots, n). \tag{2.4}
\]
Applying the triangle inequality to (2.4),
\[
|\lambda||v_k| = \Bigl|\sum_{i=1}^{n} a_{ik} v_i\Bigr| \le \sum_{i=1}^{n} a_{ik} |v_i| \quad (k = 1, 2, \dots, n), \tag{2.5}
\]
with equality if and only if all the complex numbers $v_i$ have the same argument. Combining (2.3) and (2.5) into
\[
|\lambda| \sum_{k=1}^{n} u_k |v_k| \le \sum_{k=1}^{n} u_k \sum_{i=1}^{n} a_{ik} |v_i| = \sum_{i=1}^{n} |v_i| \sum_{k=1}^{n} a_{ik} u_k = \rho_n \sum_{i=1}^{n} |v_i| u_i,
\]
we see that $\rho_n = |\lambda|$ if and only if we have equality in (2.5). But equality there would make $\lambda > 0$ by (2.4), implying $\rho_n = \lambda$, which contradicts the assumption.

Thus, $\rho_n$ must be strictly larger than all other $|\lambda|$. This completes the induction step, and the proof.
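Perron's Theorem is easy to observe numerically: for a random matrix with positive entries, the spectral radius is attained by a simple positive eigenvalue with a positive eigenvector. A small illustrative check (our own, not from the thesis):

    import numpy as np

    rng = np.random.default_rng(3)
    A = rng.uniform(0.1, 1.0, size=(6, 6))     # all entries positive

    eigvals, eigvecs = np.linalg.eig(A)
    i = np.argmax(np.abs(eigvals))
    rho, v = eigvals[i], eigvecs[:, i]

    print(np.isreal(rho) and rho.real > 0)                     # positive
    print(np.all(np.abs(np.delete(eigvals, i)) < rho.real))    # strictly dominant
    print(np.all((v / v[0]).real > 0))                         # positive eigenvector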

2.6 Eigenvalues of oscillatory matrices

We are ready to prove the main theorem of this section.

Theorem 9. The eigenvalues of an oscillatory matrix $A = \|a_{ik}\|_1^n$ are simple and positive:
\[
\lambda_1 > \lambda_2 > \dots > \lambda_n > 0.
\]

Proof. First, consider the special case that $A$ is totally positive. Then, for each $q \le n$, the associated matrix $A_q$ has positive elements. As the reader might have guessed, we are going to apply Perron's Theorem to this matrix.

Numbering the eigenvalues of $A$ according to absolute value, $|\lambda_1| \ge |\lambda_2| \ge \dots \ge |\lambda_n|$, Kronecker's Theorem combined with Perron's Theorem gives us that $A_q$ has an eigenvalue $\lambda_1\lambda_2\dots\lambda_q > 0$, which is larger than the absolute value of all other products of $q$ distinct $\lambda_k$'s. Specifically, $\lambda_1\lambda_2\dots\lambda_q > |\lambda_1\lambda_2\dots\lambda_{q-1}\lambda_{q+1}|$. The first inequality implies that all $\lambda_q > 0$, and from the second we get $\lambda_q > \lambda_{q+1}$. These combine to $\lambda_1 > \lambda_2 > \dots > \lambda_n > 0$.

If, instead, $A$ is oscillatory, there is some exponent $\kappa$ such that $A^\kappa$ is totally positive. Since $A$ is oscillatory, $A^{\kappa+1} = A \cdot A^\kappa$ is also totally positive, according to Theorem 5. From this, we get two chains of inequalities, $\lambda_1^\kappa > \lambda_2^\kappa > \dots > \lambda_n^\kappa > 0$ and $\lambda_1^{\kappa+1} > \lambda_2^{\kappa+1} > \dots > \lambda_n^{\kappa+1} > 0$.

From $\lambda_i^\kappa > 0$ and $\lambda_i^{\kappa+1} > 0$ we get that each $\lambda_i > 0$, which we can use on the inequalities $\lambda_i^\kappa > \lambda_{i+1}^\kappa$ to get that, for each $i \in \{1, \dots, n-1\}$, $\lambda_i > \lambda_{i+1}$. This completes the proof.
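For the oscillatory example $B$ from Theorem 2, one can confirm the conclusion directly (illustrative check):

    import numpy as np

    B = np.array([[1.0, 1, 0], [1, 2, 1], [0, 1, 2]])
    print(np.sort(np.linalg.eigvals(B).real)[::-1])
    # three distinct positive values, as Theorem 9 predicts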

Chapter 3

Integral eigenvalue problems with oscillatory kernels

In this chapter we deal with the integral generalization of oscillatory matrices, namely oscillatory kernels. We show that their eigenvalues behave similarly to the matrix case, and that their eigenfunctions have certain oscillatory properties.

3.1 Oscillatory kernels

We are now going to study an integral eigenvalue problem of the form
\[
\varphi(x) = \lambda \int_a^b K(x, s)\varphi(s)\, d\sigma(s).
\]
This is sometimes called a Fredholm equation of the second kind [8]. Here $\sigma(s)$ is a strictly increasing function on $(a, b)$, i.e. $d\sigma(s) > 0$. This corresponds to the mass distribution $m(x)$ from the introduction being positive. $K(x, s)$ is a continuous, oscillatory kernel, which is defined as follows.

Definition 4. $K(x, s)$ is oscillatory if each sampling of it is oscillatory, i.e. if the matrix $\|K(x_i, x_k)\|_1^n$ is oscillatory for each choice of $a < x_1 < \dots < x_n < b$.

We also introduce the following notation for determinants of samplings of $K(x, s)$:
\[
K\begin{pmatrix} x_1 & x_2 & \dots & x_n \\ s_1 & s_2 & \dots & s_n \end{pmatrix}
= \begin{vmatrix}
K(x_1, s_1) & K(x_1, s_2) & \dots & K(x_1, s_n) \\
K(x_2, s_1) & K(x_2, s_2) & \dots & K(x_2, s_n) \\
\vdots & \vdots & \ddots & \vdots \\
K(x_n, s_1) & K(x_n, s_2) & \dots & K(x_n, s_n)
\end{vmatrix}.
\]
The oscillatory property of $K$ implies three important determinantal properties which we will often refer to.

Theorem 10. If $K(x, s)$ is oscillatory, the following holds:¹

i) $K(x, s) > 0$ for $a < x, s < b$;

ii) $K\begin{pmatrix} x_1 & x_2 & \dots & x_n \\ s_1 & s_2 & \dots & s_n \end{pmatrix} \ge 0$ for $a < x_1 < x_2 < \dots < x_n < b$ and $a < s_1 < s_2 < \dots < s_n < b$;

iii) $K\begin{pmatrix} x_1 & x_2 & \dots & x_n \\ x_1 & x_2 & \dots & x_n \end{pmatrix} > 0$ for $a < x_1 < x_2 < \dots < x_n < b$.

¹These three properties are actually necessary and sufficient, but we will not use this fact.

Proof. The determinant in property ii) is a minor of an oscillatory matrix. By definition, oscillatory matrices must be totally non-negative, so the minor is greater than or equal to zero.

Property iii) contains the determinant of an oscillatory matrix. This must always be greater than zero, since otherwise no power of the matrix could be totally positive.

Property i) follows from looking at a $2 \times 2$ matrix
\[
\begin{pmatrix}
K(x_1, x_1) & K(x_1, x_2) \\
K(x_2, x_1) & K(x_2, x_2)
\end{pmatrix}.
\]
For this matrix to be oscillatory, $K(x_1, x_2)$ and $K(x_2, x_1)$ must be non-zero (positive). If they were not, the matrix would be triangular, and so would all powers of it. Thus no power would be totally positive. Since $x_1 < x_2$ were chosen arbitrarily, property i) must be valid for $x \ne s$. The case $x = s$ is covered by property iii).

3.2 Associated kernels

In analogy with the matrix case, we call the determinants
\[
K_n(X, S) = K\begin{pmatrix} x_1 & x_2 & \dots & x_n \\ s_1 & s_2 & \dots & s_n \end{pmatrix}
\]
the $n$-th associated kernel of $K(x, s)$. Here $X = (x_1, x_2, \dots, x_n)$ and $S = (s_1, s_2, \dots, s_n)$ fulfill the conditions
\[
a < x_1 < x_2 < \dots < x_n < b, \qquad a < s_1 < s_2 < \dots < s_n < b.
\]
Let us denote the set of such $n$-tuples by $M_n$. The associated kernels have the following important property:

Theorem 11. Let $K(x, s)$, $L(x, s)$ and $N(x, s)$ be kernels related by
\[
K(x, s) = \int_a^b L(x, t)\, N(t, s)\, d\sigma(t).
\]
Then for their respective associated kernels $K_n(X, S)$, $L_n(X, S)$ and $N_n(X, S)$,
\[
K_n(X, S) = \int_{M_n} L_n(X, T)\, N_n(T, S)\, d\sigma(T)
\]
holds, where $X, S, T \in M_n$ and $d\sigma(T) = d\sigma(t_1) \cdots d\sigma(t_n)$.


Proof. This can be shown by the following calculations. The associated kernel $K_n(X, S)$ is equal to
\[
K_n(X, S) = |K(x_i, s_k)|_1^n = \Bigl| \int_a^b L(x_i, t)\, N(t, s_k)\, d\sigma(t) \Bigr|_1^n.
\]
On each row $i$, to be able to break out the integral sign, we index the integration variable $t$ with any permutation $\pi(i)$, i.e.,
\[
K_n(X, S) = \Bigl| \int_a^b L(x_i, t_{\pi(i)})\, N(t_{\pi(i)}, s_k)\, d\sigma(t_{\pi(i)}) \Bigr|_1^n
= \int_a^b \!\!\dots\! \int_a^b \bigl| L(x_i, t_{\pi(i)})\, N(t_{\pi(i)}, s_k) \bigr|_1^n\, d\sigma(T).
\]
From each row in the determinant, we can also take out a factor $L(x_i, t_{\pi(i)})$:
\[
\int_a^b \!\!\dots\! \int_a^b L(x_1, t_{\pi(1)}) \dots L(x_n, t_{\pi(n)})\, \bigl| N(t_{\pi(i)}, s_k) \bigr|_1^n\, d\sigma(T).
\]
Rearranging the rows of what is left in the determinant, we can get the indices in increasing order. This gives us a determinant of the form $N_n(T, S)$, but to do this we must compensate with a sign factor depending on the permutation $\pi$. The expression then becomes
\[
\int_a^b \!\!\dots\! \int_a^b \operatorname{sgn}(\pi) \prod_{i=1}^{n} L(x_i, t_{\pi(i)})\, N_n(T, S)\, d\sigma(T).
\]
Note that the value of this integral does not depend on $\pi$. Therefore, we can form the average over all permutations $\pi$ and still get the same value. This means that we have
\[
K_n(X, S) = \frac{1}{n!} \int_a^b \!\!\dots\! \int_a^b \sum_{\pi} \operatorname{sgn}(\pi) \prod_{i=1}^{n} L(x_i, t_{\pi(i)})\, N_n(T, S)\, d\sigma(T)
= \frac{1}{n!} \int_a^b \!\!\dots\! \int_a^b L_n(X, T)\, N_n(T, S)\, d\sigma(T).
\]
The $n$-dimensional cube $[a, b]^n$ that we are integrating over can be seen as the union of $n!$ wedges
\[
\{t_1 \le t_2 \le \dots \le t_n\} \cup \{t_2 \le t_1 \le \dots \le t_n\} \cup \dots \cup \{t_n \le \dots \le t_2 \le t_1\}.
\]
Each of these gives the same contribution to the integral over the whole cube, since each sign change in $L_n(X, \Pi(T))$ is cancelled by the same sign change in $N_n(\Pi(T), S)$. This means that we can choose just one of these wedges to integrate over $n!$ times, thus cancelling the factor $\frac{1}{n!}$. We choose the wedge $M_n$, of course, and the result is the desired relation
\[
K_n(X, S) = \int_{M_n} L_n(X, T)\, N_n(T, S)\, d\sigma(T).
\]


With help from this, we can show the following theorem, which is the integral analogue of Kronecker's Theorem.

Theorem 12. If the integral equation
\[
\varphi(x) = \lambda \int_a^b K(x, s)\varphi(s)\, d\sigma(s)
\]
has a complete system of orthonormal eigenfunctions $\varphi_0(x), \varphi_1(x), \dots$ with corresponding eigenvalues $\lambda_0, \lambda_1, \dots$, then the integral equation for the associated kernel $K_n(X, S)$,
\[
\Phi(X) = \Lambda \int_{M_n} K_n(X, S)\,\Phi(S)\, d\sigma(S),
\]
has a complete system of orthonormal eigenfunctions consisting of determinants
\[
\Phi_p(X) = |\varphi_{p_i}(x_k)|_{i,k=1}^{n}, \qquad X \in M_n,
\]
with corresponding eigenvalues $\Lambda_p = \lambda_{p_1}\lambda_{p_2}\dots\lambda_{p_n}$, where $0 \le p_1 < p_2 < \dots < p_n$ runs over all possible products of $\lambda$'s.

Proof. We have
\[
|\varphi_{p_i}(x_k)|_1^n = \Bigl| \lambda_{p_i} \int_a^b K(x_k, s)\,\varphi_{p_i}(s)\, d\sigma(s) \Bigr|_1^n,
\]
which by the calculations of the previous theorem is equal to
\[
\lambda_{p_1}\dots\lambda_{p_n} \int_{M_n} K_n(X, S)\, |\varphi_{p_i}(s_k)|_1^n\, d\sigma(S).
\]
Thus, we know the eigenfunctions and eigenvalues to be correct. What about the orthonormality? We have
\[
(\Phi_p, \Phi_q) = \int_{M_n} |\varphi_{p_i}(s_k)|_1^n\, |\varphi_{q_i}(s_k)|_1^n\, d\sigma(S)
= \Bigl| \int_a^b \varphi_{p_i}(s)\,\varphi_{q_k}(s)\, d\sigma(s) \Bigr|_1^n
= \sum_{\tau} \operatorname{sgn}(\tau) \prod_{i=1}^{n} \delta_{p_i \tau(q_i)}.
\]
The products in the expression above are $1$ if the sequences $p_i$ and $\tau(q_i)$ are equal for each entry. Since $p_i$ and $q_i$ are strictly increasing, all permutations except the identity permutation will give a zero contribution to the sum. The orthonormality follows.

3.3 An integral analogue of Perron's Theorem

To prove the eigenvalue properties of our integral eigenvalue problem, we need a theorem which corresponds to Perron's Theorem in the matrix case. Note that the following theorem can also be applied to the associated kernels $K_n(X, S)$.

Theorem 13. If the integral equation
\[
\varphi(x) = \lambda \int_a^b K(x, s)\varphi(s)\, d\sigma(s)
\]
has a continuous kernel $K(x, s)$ satisfying $K(x, s) \ge 0$ and $K(x, x) > 0$, then the absolutely smallest eigenvalue $\lambda_0$ is positive, simple, and has a strictly smaller absolute value than the other eigenvalues. The corresponding eigenfunction $\varphi_0(x)$ has no zeros in $(a, b)$.

We will not present a full proof of this theorem here. In [4], a proof is given for the case when $K(x, s)$ is symmetric, i.e. $K(x, s) = K(s, x)$, though that is not the case for our kernel. Indeed, in the introduction we exhibited a sampling $G_n$ which clearly is non-symmetric.

The general theory of these integral equations dates back to Fredholm [7], where a generalization of determinants, called the Fredholm determinant, is defined and studied. It turns out that the Fredholm determinant, which is a function of the parameter $\lambda$, is zero if and only if $\lambda$ is an eigenvalue of the kernel. This is used to prove several analogues of theorems from linear algebra, for example that $K(x, s)$ and $K(s, x)$ have the same eigenvalues. For a nice exposition of the theory, see [8]. A complete proof of Theorem 13 in the non-symmetric case is given in [9].

3.4 Eigenvalues and eigenfunctions of oscillatory kernels

Now we just have to combine the previous theorems to obtain the desired properties for the eigenvalues of an oscillatory kernel.

Theorem 14. If $K(x, s)$ is a continuous, oscillatory kernel, its eigenvalues are positive and distinct, i.e. $0 < \lambda_0 < \lambda_1 < \dots < \infty$.

Proof. Let the eigenvalues of $K$ be numbered according to their absolute values, $|\lambda_0| \le |\lambda_1| \le \dots$, with corresponding orthonormal eigenfunctions $\varphi_0(x), \varphi_1(x), \dots$

According to Theorem 12, the eigenvalues of $K_n(X, S)$ are all possible products $\lambda_{p_1}\dots\lambda_{p_n}$. The smallest eigenvalue must then be $\lambda_0\lambda_1\dots\lambda_{n-1}$, with the corresponding eigenfunction $|\varphi_{i-1}(x_k)|_1^n$.

Theorem 13 gives that the $n$-th associated kernel's eigenvalue $\lambda_0\lambda_1\dots\lambda_{n-1}$ is positive, and smaller in absolute value than $|\lambda_0\lambda_1\dots\lambda_{n-2}\lambda_n|$, for each $n$. From this, the eigenvalue condition $0 < \lambda_0 < \lambda_1 < \dots$ follows.

We can also say something about the eigenfunctions of $K(x, s)$. First, a definition.

Definition 5. The functions $\varphi_0(x), \varphi_1(x), \dots, \varphi_{n-1}(x)$ are called a Chebyshev system if each non-trivial linear combination of them has at most $n - 1$ zeros.


There is a useful criterion for determining when a number of functions form a Chebyshev system.

Theorem 15 (Chebyshev criterion). The functions $\varphi_0(x), \varphi_1(x), \dots, \varphi_{n-1}(x)$ form a Chebyshev system if and only if the determinant $D = |\varphi_i(x_k)|_{i,k=0}^{n-1}$ is different from zero (of constant sign) for all samplings $a \le x_0 < \dots < x_{n-1} \le b$.

Proof. Assume that there is a sampling $(x_0, \dots, x_{n-1})$ such that $D = 0$. Then, for some coefficients $c_i$, not all equal to zero,
\[
D = |\varphi_i(x_k)|_{i,k=0}^{n-1} = 0
\iff D^t = |\varphi_k(x_i)|_{i,k=0}^{n-1} = 0
\iff
\begin{pmatrix}
\varphi_0(x_0) & \varphi_1(x_0) & \dots & \varphi_{n-1}(x_0) \\
\vdots & \vdots & \ddots & \vdots \\
\varphi_0(x_{n-1}) & \varphi_1(x_{n-1}) & \dots & \varphi_{n-1}(x_{n-1})
\end{pmatrix}
\begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_{n-1} \end{pmatrix}
= \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}
\iff \sum_{i=0}^{n-1} c_i \varphi_i(x_k) = 0, \quad k = 0, 1, \dots, n-1.
\]
Clearly, when $D$ is zero, there exists a non-trivial linear combination with $n$ zeros $x_0$ through $x_{n-1}$, which is equivalent to $\varphi_0(x), \dots, \varphi_{n-1}(x)$ not being a Chebyshev system. When no such linear combination exists, $D$ is non-zero. Since it is a continuous function on a connected set, $D$ cannot change sign.

The following theorem shows one of the main properties of Chebyshev systems. There are also other oscillatory properties of these systems, but we will not go into them here. Again, see [4].

Theorem 16. If a sequence of functions $\varphi_0(x), \varphi_1(x), \varphi_2(x), \dots$ is orthonormal, and the first $n$ functions form a Chebyshev system for each $n$, then the function $\varphi_j(x)$ has exactly $j$ zeros in $(a, b)$ for any $j$.

Proof. By the definition of a Chebyshev system, $\varphi_j(x)$ can have at most $j$ zeros. Assume it has $p \le j$ zeros, $\alpha_1 < \dots < \alpha_p$. Let $\alpha_{p+1}$ be an arbitrary point in the interval $(a, b)$ and study the determinant $D(\alpha_{p+1}) = |\varphi_i(\alpha_{k+1})|_0^p$. $D(\alpha_{p+1})$ thus defines a function $D(x)$ which is a linear combination of $\varphi_0(x), \dots, \varphi_p(x)$. From Theorem 15 we know that $D$ is different from zero when $\alpha_{p+1} \ne \alpha_1, \dots, \alpha_p$. (When $\alpha_{p+1}$ is equal to any other $\alpha_i$, $D$ is of course zero, since two columns are equal.) Letting $\alpha_{p+1}$ slide from $b$ to $a$, we can swap two columns each time it passes through a zero, to keep the ordering of the points $\alpha$. This means that we can still apply Theorem 15 in each interval, though the column switching gives us a sign change between each interval. It follows that $D(\alpha_{p+1})$ has the same $p$ zeros as $\varphi_j(x)$. Both functions change sign at each zero, which means that $(D, \varphi_j) \ne 0$. Thus, $D$ has a component in the $\varphi_j$-direction. Consequently, $p \ge j$, which combined with the Chebyshev criterion gives us that $\varphi_j$ has exactly $j$ zeros.

Next, we show that the eigenfunctions of an oscillatory kernel form a Chebyshev system.

Theorem 17. Given an oscillatory kernel $K(x, s)$, the first $n$ functions in the sequence of its eigenfunctions $\varphi_0(x), \varphi_1(x), \dots$ form a Chebyshev system for each $n$.

Proof. Apply Theorems 12 and 13 (the analogues of Kronecker and Perron) to the associated kernel $K_n(X, S)$. The eigenfunction corresponding to its smallest eigenvalue is $|\varphi_{i-1}(x_k)|_1^n$, and has no zeros. The Chebyshev criterion then implies that $\varphi_0(x), \varphi_1(x), \dots, \varphi_{n-1}(x)$ form a Chebyshev system for each $n$.

Now we have all the parts needed for the theorem we set out to prove.

Theorem 18. If $K(x, s)$ is a continuous oscillatory kernel, and $\sigma(s)$ is a strictly increasing function, then the integral equation
\[
\varphi(x) = \lambda \int_a^b K(x, s)\varphi(s)\, d\sigma(s)
\]
has the following three properties:

i) All eigenvalues are positive and simple: $0 < \lambda_0 < \lambda_1 < \lambda_2 < \dots$

ii) The eigenfunction $\varphi_0(x)$ corresponding to the smallest eigenvalue $\lambda_0$ has no zeros in $(a, b)$.

iii) The eigenfunction $\varphi_j(x)$ corresponding to the eigenvalue $\lambda_j$ has exactly $j$ zeros in $(a, b)$ and changes sign at each.

Proof. Property i) is Theorem 14. Theorem 17 shows that the eigenfunctions form a Chebyshev system, and thus have properties ii) and iii).

Let us just note that the theorem could be generalized to an arbitrary non-decreasing $\sigma(x)$. The completely discrete case (where $d\sigma$ is a sum of Dirac deltas) reduces to the case of oscillatory matrices, handled in Chapter 2. When $d\sigma$ is a mix of these cases, the proof gets a bit more technically complicated, but can still be followed. The interested reader is referred to Section 4.4 in [4].
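Theorem 18 can be illustrated numerically with the classical oscillatory kernel $K(x, s) = \min(x, s)\,(1 - \max(x, s))$, the Green's function of the ordinary string with fixed ends on $(0, 1)$ (cf. [4]). The rough discretization below is our own sketch; the kernel choice and quadrature are not from the thesis:

    import numpy as np

    def K(x, s):
        """Green's function of -u'' = f on (0,1) with u(0) = u(1) = 0."""
        return np.minimum(x, s) * (1 - np.maximum(x, s))

    n = 400
    x = (np.arange(n) + 0.5) / n            # midpoint grid on (0,1)
    M = K(x[:, None], x[None, :]) / n       # discretized integral operator

    mu, V = np.linalg.eig(M)                # phi = lambda * M phi, so lambda = 1/mu
    order = np.argsort(-mu.real)            # largest mu = smallest lambda first
    lam = 1 / mu.real[order]

    print(lam[:4])                          # positive, simple, increasing (~ (k*pi)^2)
    for j in range(4):                      # eigenfunction j has exactly j sign changes
        v = V[:, order[j]].real
        print(j, np.sum(v[1:] * v[:-1] < 0))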


Chapter 4

Planar networks

In this chapter we show the connection between TNN/oscillatory/TP matrices and planar networks, and show that $G$ is totally non-negative.

4.1 Preliminaries

A planar network is a graph consisting of vertices and edges. Edges are directed connections from one vertex to another, and may not cross other edges; also, no loops are allowed. This means that there must be vertices without any incoming edges, which we call sources. Vertices without outgoing edges are called sinks.

In this chapter, we assume that each network has $n$ sources and $n$ sinks, which we place to the left and right respectively, numbered bottom to top. Other vertices will be placed between them in such a way that all edges are directed left to right. A series of edges connecting a source and a sink is called a path.

To each edge is assigned a real number, called its weight. The weight of a path is defined as the product of the weights of its edges. The sum of the weights of all paths connecting source $i$ and sink $k$ is denoted $w_{ik}$. The weight matrix of a network is the matrix $W = \|w_{ik}\|_1^n$.

Example: Let us study the following simple network.

[Figure: a planar network with two sources and two sinks; the essential edges are labelled $a$, $b$, $c$, $d$ and all remaining edges have weight $1$.]

Here $n = 2$. The entries $w_{11}$, $w_{12}$ and $w_{21}$ each correspond to only one possible path, and will consist of only one term. For $w_{22}$ we can choose two paths, $abc$ or $d$. The weight matrix is
\[
\begin{pmatrix} a & ac \\ ab & abc + d \end{pmatrix}.
\]

A collection of paths, connecting a set of sources to a set of sinks, can also be assigned a weight. This is defined as the product of the weights of the paths. We will want to look at vertex-disjoint paths, i.e., paths that are non-intersecting and non-touching. In the example above, the only collection of vertex-disjoint paths connecting sources {1, 2} and sinks {1, 2} has weight ad.
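The weight matrix of a small network can be computed by brute-force path enumeration. The sketch below is our own illustration (the graph encoding of the example network is one hypothetical choice that realizes the weight matrix above), using sympy for symbolic weights:

    import sympy as sp

    a, b, c, d = sp.symbols('a b c d')

    # Edges as (tail, head) -> weight; s1,s2 sources, t1,t2 sinks.
    edges = {('s1', 'bl'): 1, ('s2', 'bl'): b, ('bl', 'm'): a,
             ('m', 't1'): 1, ('m', 'u'): c, ('u', 't2'): 1, ('s2', 'u'): d}

    def paths(src, snk, node=None, acc=1):
        """Yield the total weight of every directed path from src to snk."""
        node = src if node is None else node
        if node == snk:
            yield acc
            return
        for (tail, head), w in edges.items():
            if tail == node:
                yield from paths(src, snk, head, acc * w)

    W = sp.Matrix(2, 2, lambda i, k: sum(paths(f's{i+1}', f't{k+1}')))
    print(W)    # Matrix([[a, a*c], [a*b, a*b*c + d]])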


We will mostly deal with totally connected networks. That is, for each set of $p$ sources and $p$ sinks, there exists at least one collection of vertex-disjoint paths connecting those sources and sinks. The previous example showed a totally connected $2 \times 2$ network. By inspection, the following $4 \times 4$ network also turns out to be totally connected. From now on we assume that all edges go from left to right.

[Figure: the totally connected $4 \times 4$ network, with sources $1$–$4$ on the left and sinks $1$–$4$ on the right, built from diagonal edges and horizontal middle edges.]

The same construction will be used throughout this chapter. It is not hard to imagine what the network looks like for larger $n$. In this construction, an edge will be called essential if it is diagonal, or if it is one of the horizontal edges in the middle of the network. Note that there are exactly $n^2$ such edges. These are the weights that we want to manipulate later on. All other weights will be set to $1$, and will not be displayed in the graphs.

4.2 Lindström's Lemma

We want to show that the matrix $G$ from the introduction is oscillatory. The first part will be to show that it is TNN. Calculating all minors of arbitrary size is unfeasible, since there are $\binom{2n}{n} - 1$ minors of an $n \times n$ matrix. Luckily, this number can be reduced quite a lot. Lindström's Lemma links the minors of a weight matrix to path collections of a planar network.

Theorem 19 (Lindström's Lemma). A minor
\[
W\begin{pmatrix} i_1 & i_2 & \dots & i_p \\ k_1 & k_2 & \dots & k_p \end{pmatrix}
\]
of the weight matrix of a planar network is equal to the sum of the weights of all collections of vertex-disjoint paths that connect the sources $\{i_1, i_2, \dots, i_p\}$ to the sinks $\{k_1, k_2, \dots, k_p\}$.

Proof. It is enough to prove the lemma for the determinant of the entire weight matrix, which corresponds to choosing all sources and sinks. Minors can then be handled in exactly the same way.

The determinant is equal to
\[
|W|_1^n = \sum_{\sigma} \operatorname{sgn}(\sigma) \prod_{i=1}^{n} w_{i\sigma(i)},
\]
where the sum is over all permutations on $\{1, 2, \dots, n\}$. The product $\prod_{i=1}^{n} w_{i\sigma(i)}$ is the sum of the weights of all collections of paths in which each path connects source $i$ to sink $\sigma(i)$. Contributions from collections of vertex-disjoint paths come from the identity permutation, which has sign $+1$. Our task is to show that all other terms cancel.

To do this we deform the network a bit, if necessary, to guarantee that no vertices lie on the same vertical line. This means that we can number the vertices from left to right in a well-defined way. We can then define an involution on the non-vertex-disjoint collections of paths, which takes the leftmost vertex where two paths coincide and switches the parts of those paths that lie to the left of that vertex. This preserves the weight of the collection, but changes the sign of the permutation $\sigma$ associated to the term. Pairing terms in this way gives us the desired cancellation of all non-vertex-disjoint collections.

The following corollaries are straightforward applications of Lindström's Lemma.

Corollary 2. If a planar network has non-negative weights, then its weight matrix is TNN.

Corollary 3. If a totally connected network has positive weights, then its weight matrix is TP.

Note that in [10] it is shown that our construction of totally connected networks generates all TP matrices, and each TP matrix is the weight matrix of such a network with positive weights on each essential edge. Since we do not expect our matrix to be TP, we will probably get a network with zeros in some weights, but it turns out that we can still construct a network such that its weight matrix equals G.

4.3 Constructing a planar network with a given weight matrix

From the introduction, we have for each $n$ the matrix
\[
G_n = \begin{pmatrix}
1 & E_{12} & E_{13} & \dots & E_{1n} \\
2 - E_{12} & 1 & E_{23} & \dots & E_{2n} \\
2 - E_{13} & 2 - E_{23} & 1 & \dots & E_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
2 - E_{1n} & 2 - E_{2n} & 2 - E_{3n} & \dots & 1
\end{pmatrix},
\]
where $E_{ij} = e^{x_i - x_j}$. We are going to show that this matrix is the weight matrix of a planar network with non-negative weights. With a change of variables, where $E_{ij} = 1 - Q_{ij}$, we can write our matrix as
\[
G_n = \begin{pmatrix}
1 & 1 - Q_{12} & 1 - Q_{13} & \dots & 1 - Q_{1n} \\
1 + Q_{12} & 1 & 1 - Q_{23} & \dots & 1 - Q_{2n} \\
1 + Q_{13} & 1 + Q_{23} & 1 & \dots & 1 - Q_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 + Q_{1n} & 1 + Q_{2n} & 1 + Q_{3n} & \dots & 1
\end{pmatrix}.
\]


Remember that $x_1 < \dots < x_n$, which means that both $E_{ij}$ and $Q_{ij}$ are between $0$ and $1$ (for $i < j$). We are going to switch between these forms depending on which one gives the cleanest calculations. We start with the case $n = 2$.

Theorem 20. $G_2$ is totally positive.

Proof. We need to match the network

[Figure: the $2 \times 2$ network with essential edges $a$, $b$, $c$, $d$.]

with the matrix
\[
G_2 = \begin{pmatrix} 1 & 1 - Q_{12} \\ 1 + Q_{12} & 1 \end{pmatrix}.
\]
Moving from source $1$ to sink $1$, we have only one choice of path, namely $a$. This path must be equal to the element $g_{11}$, so $a = 1$. The same reasoning gives us $ab = 1 + Q_{12}$ and $ac = 1 - Q_{12}$, from which we get $b$ and $c$ immediately. To find $d$, we note that the weight of the collection of vertex-disjoint paths from $\{1, 2\}$ to $\{1, 2\}$ is equal to $ad = 1 \cdot d = d$. We calculate the determinant of $G_2$ to be
\[
\det G_2 = 1 - (1 + Q_{12})(1 - Q_{12}) = Q_{12}^2.
\]
Lindström's Lemma tells us that $d$ must be equal to this value. We draw the conclusion that $G_2$ is the weight matrix of the network

[Figure: the $2 \times 2$ network with weights $a = 1$, $b = 1 + Q_{12}$, $c = 1 - Q_{12}$, $d = Q_{12}^2$.]

which is a totally connected planar network with positive weights. $G_2$ is thus a totally positive matrix.
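A quick symbolic check of Theorem 20 (our own illustrative code): with the weights found above, the path sums of the network reproduce $G_2$, and $\det G_2 = Q_{12}^2 > 0$ for $0 < Q_{12} < 1$.

    import sympy as sp

    Q = sp.symbols('Q', positive=True)          # Q stands for Q_12, 0 < Q < 1
    a, b, c, d = 1, 1 + Q, 1 - Q, Q**2

    # Weight matrix: w11 = a, w12 = ac, w21 = ab, w22 = abc + d
    W = sp.Matrix([[a, a*c], [a*b, a*b*c + d]])
    G2 = sp.Matrix([[1, 1 - Q], [1 + Q, 1]])
    print(sp.simplify(W - G2))                  # zero matrix
    print(sp.factor(G2.det()))                  # Q**2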

We could of course have found this result just by looking at the matrix elements and the determinant, but the network will be needed later on, as we will see that things are not this simple for $G_3$ and higher. Let us denote the weights of each essential edge up to $n = 5$ by a letter:

[Figure: the $5 \times 5$ network with its $25$ essential edges labelled with the letters $a$ through $z$ (the letter $w$ is not used); the already-computed weights are $a = 1$, $b = 1 + Q_{12}$, $c = 1 - Q_{12}$, $d = Q_{12}^2$.]


We will keep referring back to this picture several times, sometimes referring to the edges themselves by these same letters. With help from Lindström's Lemma, we will try to find the weights of all these edges, and hope to see a pattern which extends to arbitrary size. We start with the rightmost diagonal (consisting of $c$, $h$, $o$, and $y$), corresponding to the first row of $G$.

Theorem 21. The weights of the rightmost diagonal ($c$, $h$, $o$, etc.) are positive.

Proof. We get the following system of equations,
\[
\begin{cases}
a = 1, \\
ac = 1 - Q_{12}, \\
ach = 1 - Q_{13}, \\
acho = 1 - Q_{14}, \\
\;\;\vdots
\end{cases}
\]
which has the solution
\[
a = 1, \quad c = 1 - Q_{12}, \quad h = \frac{1 - Q_{13}}{1 - Q_{12}}, \quad o = \frac{1 - Q_{14}}{1 - Q_{13}}, \quad \dots
\]
Note that we can simplify $h$, $o$, and the following weights in their diagonal, since
\[
h = \frac{1 - Q_{13}}{1 - Q_{12}} = \frac{E_{13}}{E_{12}} = \frac{e^{x_1 - x_3}}{e^{x_1 - x_2}} = e^{x_2 - x_3} = E_{23} = 1 - Q_{23}.
\]
In the same way, we get $o = 1 - Q_{34}$ and so on. We see that all weights on this diagonal are positive.

In the same way, we can study the leftmost diagonal.

Theorem 22. The weights of the leftmost diagonal ($b$, $e$, $j$, etc.) are positive.

Proof. Here we get the system
\[
\begin{cases}
a = 1, \\
ab = 1 + Q_{12}, \\
abe = 1 + Q_{13}, \\
abej = 1 + Q_{14}, \\
\;\;\vdots
\end{cases}
\]
The solution is
\[
a = 1, \quad b = 1 + Q_{12}, \quad e = \frac{1 + Q_{13}}{1 + Q_{12}}, \quad j = \frac{1 + Q_{14}}{1 + Q_{13}}, \quad \dots
\]
which unfortunately cannot be simplified further, but we do see that these weights are positive.

On the right half of the network, things remain relatively simple thanks to the fact that we could simplify the weights on the outer diagonal. Indeed, the following theorem holds.


Theorem 23. The remaining weights in the right half of the network are zero.

Proof. We start with the edge $g$. To calculate its weight we use that $E_{ii} = e^{x_i - x_i} = 1$, and see that the minor
\[
G\begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}
= \begin{vmatrix} E_{12} & E_{13} \\ E_{22} & E_{23} \end{vmatrix}
= \begin{vmatrix} E_{12} & E_{12}E_{23} \\ E_{22} & E_{22}E_{23} \end{vmatrix}
= E_{23}\begin{vmatrix} E_{12} & E_{12} \\ E_{22} & E_{22} \end{vmatrix}
\]
is equal to zero. The only collection of vertex-disjoint paths from $\{1, 2\}$ to $\{2, 3\}$ uses the following links:

[Figure: the path collection in the $3 \times 3$ network, using edges $a$, $c$, $d$, $g$.]

Here $g$ is the only unknown weight. We draw the conclusion that $g$ must be zero. In the same way, all $2 \times 2$ minors from $\{1, 2\}$ to $\{i, i+1\}$ will turn out to be zero, but that will not help us find more weights along the diagonal of $g$. (We get equations like $0 = 0 \cdot n$, etc.) Instead, we need a more clever argument. Let us look at all possible paths from $3$ to $4$. We know that moving from source $3$ to sink $3$, or more generally, from source $i$ to sink $i$, will add up to a total weight of $1$. Thus, moving first from $3$ to $3$, and then up to $4$ along edge $o$,

[Figure: paths from source $3$ to sink $4$ in the $4 \times 4$ network, via edges $m$, $n$, or $o$.]

gives the total weight $1 \cdot o = o$, which we calculated to be $1 - Q_{34}$. This is exactly the weight we want in total from source $3$ to sink $4$, so the contributions from paths using edges $m$ and $n$ should add up to zero. The only way to do this, while using non-negative weights, is to let the weights $m$ and $n$ both be zero.

With the same argument we can show, by induction, that if we put zeros on all diagonal edges in the right half of the network (except for the rightmost diagonal, which we already calculated), the entries above the main diagonal in $G$ match the possible paths in the network. Thus $u$, $v$, and $x$ should be put to zero, and so on. An arbitrary source $i$ will then have only one possible path to sink $k$, provided that $i < k$. The weight of this path will be
\[
(1 - Q_{i,i+1})(1 - Q_{i+1,i+2}) \dots (1 - Q_{k-1,k}) = E_{i,i+1} \cdots E_{k-1,k} = E_{ik} = 1 - Q_{ik},
\]
as desired.

Now we are going to deal with the horizontal edges in the middle of the network. We already calculated $a = 1$ and $d = Q_{12}^2$, and might guess that the pattern continues as follows.

Theorem 24. The $n$-th essential horizontal edge has positive weight $Q_{n-1,n}^2$, for all $n \ge 2$.

Proof. We show that the guess for $p$ is correct, using an argument easily extended to an arbitrary such edge.

Thanks to the zeros we only get two contributions when moving from source $4$ to sink $4$. The first is $p$, and the second is $o$ times the total weight moving from source $4$ to sink $3$,

[Figure: the two contributions from source $4$ to sink $4$ in the $4 \times 4$ network, via the edge $p$ or via the edge $o$.]

which is $1 + Q_{34}$. This gives us $p$ by the following simple calculation:
\[
p + o \cdot (1 + Q_{34}) = 1 \implies p = 1 - (1 - Q_{34})(1 + Q_{34}) = Q_{34}^2.
\]

The reader can probably see how this calculation works for arbitrary $n$. So far it has been relatively simple to calculate the weights, and we have found simple patterns which show that certain weights are positive. On the left half of the network, things get a bit more complicated. We will need to do $f$ first.

Theorem 25. The weight $f$ is positive.

Proof. To get an equation determining $f$, we look at path collections from $\{2, 3\}$ to $\{1, 2\}$. The only vertex-disjoint one is $adbf$.

[Figure: the vertex-disjoint path collection from sources $\{2, 3\}$ to sinks $\{1, 2\}$, using edges $a$, $b$, $d$, $f$.]

By Lindström's Lemma, this value is equal to the minor
\[
G\begin{pmatrix} 2 & 3 \\ 1 & 2 \end{pmatrix}
= \begin{vmatrix} 1 + Q_{12} & 1 \\ 1 + Q_{13} & 1 + Q_{23} \end{vmatrix}
= (1 + Q_{12})(1 + Q_{23}) - (1 + Q_{13}).
\]
Rewriting in terms of $E$, and using $E_{13} = E_{12}E_{23}$, this equals
\[
(2 - E_{12})(2 - E_{23}) - (2 - E_{13}) = 2 - 2E_{12} - 2E_{23} + 2E_{12}E_{23} = 2(1 - E_{12})(1 - E_{23}) = 2Q_{12}Q_{23}.
\]
Thus, we get
\[
f = \frac{2Q_{12}Q_{23}}{abd} = \frac{2Q_{12}Q_{23}}{(1 + Q_{12})Q_{12}^2} = \frac{2Q_{23}}{(1 + Q_{12})Q_{12}},
\]
which is positive.


Above $f$ in the second diagonal, we can calculate a sequence of positive weights in the following way.

Theorem 26. The weights of the second leftmost diagonal ($k$, $r$, etc.) are positive.

Proof. Moving up along the second diagonal, we need to look at path collections from $\{i, i+1\}$ to $\{1, 2\}$, for $i \ge 3$. These minors are equal to
\[
G\begin{pmatrix} i & i+1 \\ 1 & 2 \end{pmatrix}
= \begin{vmatrix} 1 + Q_{1,i} & 1 + Q_{2,i} \\ 1 + Q_{1,i+1} & 1 + Q_{2,i+1} \end{vmatrix}
= 2Q_{12}Q_{i,i+1}(1 - Q_{2,i}),
\]
by a calculation similar to that in the previous proof. We then calculate $k$ and $r$:
\[
k = \frac{2Q_{12}Q_{34}(1 - Q_{23})}{abdef} = \dots = \frac{Q_{34}(1 + Q_{12})(1 - Q_{23})}{Q_{23}(1 + Q_{13})(1 - Q_{22})},
\qquad
r = \frac{2Q_{12}Q_{45}(1 - Q_{24})}{abdefjk} = \dots = \frac{Q_{45}(1 + Q_{13})(1 - Q_{24})}{Q_{34}(1 + Q_{14})(1 - Q_{23})}.
\]
Now the pattern for the second diagonal should be apparent. Since the weights form a telescoping product, it follows by an induction argument that this pattern holds higher up along the diagonal.

An even trickier task is to prove the pattern for edges $l$, $t$ and upwards (those diagonal edges in the left half of the network that are closest to the middle, to be precise). To find a path collection that includes edge $l$ we choose sources $\{2, 3, 4\}$ and sinks $\{1, 2, 3\}$. The only collection of vertex-disjoint paths has weight $adibfl$:

[Figure: the vertex-disjoint path collection from sources $\{2, 3, 4\}$ to sinks $\{1, 2, 3\}$, using edges $a$, $b$, $d$, $f$, $i$, $l$.]

Then, to get an equation for $t$, we choose sources $\{2, 3, 4, 5\}$ and sinks $\{1, 2, 3, 4\}$, and so on. Apparently, we need to be able to calculate the minors
\[
G_n\begin{pmatrix} 2 & 3 & \dots & n \\ 1 & 2 & \dots & n-1 \end{pmatrix}.
\]
Calculating these minors with Mathematica up to $n = 6$ showed a promising pattern, namely $2Q_{12}Q_{23}^2 Q_{34}^2 \dots Q_{n-2,n-1}^2 Q_{n-1,n}$. We are going to prove this now.

Theorem 27. The diagonal edges in the left half of the network closest to the middle have positive weights.


Proof. It is actually possible to calculate the desired minor directly. Let $e^{x_i} = z_i$. With this definition,
\[
1 - Q_{ij} = E_{ij} = \frac{z_i}{z_j}, \qquad 1 + Q_{ij} = 2 - E_{ij} = 2 - \frac{z_i}{z_j} = \frac{2z_j - z_i}{z_j}.
\]
We calculate the minor for $n = 6$, but exactly the same row and column operations could be used for arbitrarily large $n$. We have
\[
G\begin{pmatrix} 2 & \dots & 6 \\ 1 & \dots & 5 \end{pmatrix} =
\begin{vmatrix}
1 + Q_{12} & 1 & 1 - Q_{23} & 1 - Q_{24} & 1 - Q_{25} \\
1 + Q_{13} & 1 + Q_{23} & 1 & 1 - Q_{34} & 1 - Q_{35} \\
1 + Q_{14} & 1 + Q_{24} & 1 + Q_{34} & 1 & 1 - Q_{45} \\
1 + Q_{15} & 1 + Q_{25} & 1 + Q_{35} & 1 + Q_{45} & 1 \\
1 + Q_{16} & 1 + Q_{26} & 1 + Q_{36} & 1 + Q_{46} & 1 + Q_{56}
\end{vmatrix}
=
\begin{vmatrix}
\frac{2z_2 - z_1}{z_2} & 1 & \frac{z_2}{z_3} & \frac{z_2}{z_4} & \frac{z_2}{z_5} \\
\frac{2z_3 - z_1}{z_3} & \frac{2z_3 - z_2}{z_3} & 1 & \frac{z_3}{z_4} & \frac{z_3}{z_5} \\
\frac{2z_4 - z_1}{z_4} & \frac{2z_4 - z_2}{z_4} & \frac{2z_4 - z_3}{z_4} & 1 & \frac{z_4}{z_5} \\
\frac{2z_5 - z_1}{z_5} & \frac{2z_5 - z_2}{z_5} & \frac{2z_5 - z_3}{z_5} & \frac{2z_5 - z_4}{z_5} & 1 \\
\frac{2z_6 - z_1}{z_6} & \frac{2z_6 - z_2}{z_6} & \frac{2z_6 - z_3}{z_6} & \frac{2z_6 - z_4}{z_6} & \frac{2z_6 - z_5}{z_6}
\end{vmatrix}.
\]

Factoring out the denominators $z_2$ to $z_6$ from the rows, and multiplying columns $3$ through $5$ by $z_3$, $z_4$ and $z_5$ respectively, gives us the following determinant:
\[
\frac{1}{z_2 z_3^2 z_4^2 z_5^2 z_6}
\begin{vmatrix}
2z_2 - z_1 & z_2 & z_2^2 & z_2^2 & z_2^2 \\
2z_3 - z_1 & 2z_3 - z_2 & z_3^2 & z_3^2 & z_3^2 \\
2z_4 - z_1 & 2z_4 - z_2 & (2z_4 - z_3)z_3 & z_4^2 & z_4^2 \\
2z_5 - z_1 & 2z_5 - z_2 & (2z_5 - z_3)z_3 & (2z_5 - z_4)z_4 & z_5^2 \\
2z_6 - z_1 & 2z_6 - z_2 & (2z_6 - z_3)z_3 & (2z_6 - z_4)z_4 & (2z_6 - z_5)z_5
\end{vmatrix}.
\]
We now subtract column $2$ from column $1$, and see that we get $z_2 - z_1$ in each position in the first column. Factoring this out, we get
\[
\frac{z_2 - z_1}{z_2 z_3^2 z_4^2 z_5^2 z_6}
\begin{vmatrix}
1 & z_2 & z_2^2 & z_2^2 & z_2^2 \\
1 & 2z_3 - z_2 & z_3^2 & z_3^2 & z_3^2 \\
1 & 2z_4 - z_2 & (2z_4 - z_3)z_3 & z_4^2 & z_4^2 \\
1 & 2z_5 - z_2 & (2z_5 - z_3)z_3 & (2z_5 - z_4)z_4 & z_5^2 \\
1 & 2z_6 - z_2 & (2z_6 - z_3)z_3 & (2z_6 - z_4)z_4 & (2z_6 - z_5)z_5
\end{vmatrix}.
\]


We then subtract from each row the row above it, i.e. the first row from the second, the second from the third, and so on. This yields
\[
\frac{z_2 - z_1}{z_2 z_3^2 z_4^2 z_5^2 z_6}
\begin{vmatrix}
1 & z_2 & z_2^2 & z_2^2 & z_2^2 \\
0 & 2(z_3 - z_2) & z_3^2 - z_2^2 & z_3^2 - z_2^2 & z_3^2 - z_2^2 \\
0 & 2(z_4 - z_3) & 2(z_4 - z_3)z_3 & z_4^2 - z_3^2 & z_4^2 - z_3^2 \\
0 & 2(z_5 - z_4) & 2(z_5 - z_4)z_3 & 2(z_5 - z_4)z_4 & z_5^2 - z_4^2 \\
0 & 2(z_6 - z_5) & 2(z_6 - z_5)z_3 & 2(z_6 - z_5)z_4 & 2(z_6 - z_5)z_5
\end{vmatrix}.
\]
Expanding along the first column gives only one term. From the remaining cofactor we can take out a factor $2$ from the first column, and a factor of the form $(z_{i+1} - z_i)$ from each row. We have now reduced our determinant to the following expression:
\[
\frac{2\prod_{i=1}^{5}(z_{i+1} - z_i)}{z_2 z_3^2 z_4^2 z_5^2 z_6}
\begin{vmatrix}
1 & z_3 + z_2 & z_3 + z_2 & z_3 + z_2 \\
1 & 2z_3 & z_4 + z_3 & z_4 + z_3 \\
1 & 2z_3 & 2z_4 & z_5 + z_4 \\
1 & 2z_3 & 2z_4 & 2z_5
\end{vmatrix}.
\]
Once again, we subtract each row from the one below it, to get the final step
\[
G\begin{pmatrix} 2 & \dots & 6 \\ 1 & \dots & 5 \end{pmatrix}
= \frac{2\prod_{i=1}^{5}(z_{i+1} - z_i)}{z_2 z_3^2 z_4^2 z_5^2 z_6}
\begin{vmatrix}
1 & z_3 + z_2 & z_3 + z_2 & z_3 + z_2 \\
0 & z_3 - z_2 & z_4 - z_2 & z_4 - z_2 \\
0 & 0 & z_4 - z_3 & z_5 - z_3 \\
0 & 0 & 0 & z_5 - z_4
\end{vmatrix}
= \frac{2(z_2 - z_1)(z_3 - z_2)^2(z_4 - z_3)^2(z_5 - z_4)^2(z_6 - z_5)}{z_2 z_3^2 z_4^2 z_5^2 z_6}
= 2Q_{12}Q_{23}^2 Q_{34}^2 Q_{45}^2 Q_{56}.
\]

We have proven our claim about the minors from $\{2, \dots, n\}$ to $\{1, \dots, n-1\}$. Using this result, by an induction argument similar to the one regarding the second left diagonal, we get
\[
l = \frac{Q_{34}}{Q_{23}}, \qquad t = \frac{Q_{45}}{Q_{34}}, \qquad \dots
\]
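The minor formula at the heart of Theorem 27 is easy to verify symbolically for small $n$ (an illustrative check of our own, in the $z$-variables used above):

    import sympy as sp

    n = 6
    z = sp.symbols(f'z1:{n+1}', positive=True)    # z1, ..., z6

    def G_entry(i, k):                             # 1-based indices
        return z[i-1] / z[k-1] if i <= k else 2 - z[k-1] / z[i-1]

    # The minor with rows 2..n and columns 1..n-1 of G_n.
    M = sp.Matrix(n - 1, n - 1, lambda r, c: G_entry(r + 2, c + 1))
    Q = lambda i, j: 1 - z[i-1] / z[j-1]

    expected = 2 * Q(1, 2) * Q(n-1, n) * sp.prod(Q(i, i+1)**2
                                                 for i in range(2, n - 1))
    print(sp.simplify(M.det() - expected))         # 0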

Fortunately, the remaining edges will turn out to be zero, but proving this requires a little finesse.

Theorem 28. The remaining weights in the left half of the network are zero.

Proof. We start with the edge $s$. This edge is included in the only vertex-disjoint path collection from $\{3, 4, 5\}$ to $\{1, 2, 3\}$.

[Figure: the vertex-disjoint path collection from sources $\{3, 4, 5\}$ to sinks $\{1, 2, 3\}$ in the $5 \times 5$ network, using edges $a$, $b$, $d$, $e$, $f$, $i$, $k$, $l$, $s$.]

This corresponds to the minor

G  3 4 5 1 2 3 ‹ = 2 − E13 2 − E23 2 − E33 2 − E14 2 − E24 2 − E34 2 − E15 2 − E25 2 − E35 , which is the determinant of a sum of two matrices, each with rank 1:

$$\begin{pmatrix} 2 & 2 & 2 \\ 2 & 2 & 2 \\ 2 & 2 & 2 \end{pmatrix} +
\begin{pmatrix}
-E_{13} & -E_{23} & -E_{33} \\
-E_{14} & -E_{24} & -E_{34} \\
-E_{15} & -E_{25} & -E_{35}
\end{pmatrix}.$$

Indeed, the first matrix obviously has all columns linearly dependent. In the second matrix, all columns are multiples of the first one, because of how $E_{ij}$ is defined. The rank of a sum of two matrices can never exceed the sum of the ranks of the terms, so the matrix of our minor has rank at most two. Thus it cannot have full rank, and the minor must be zero. Because of this, one of the links in the path collection must have weight zero, and the only unknown link at this point is s. We conclude that s is zero.
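The rank argument is easy to illustrate numerically; here is a short sketch of our own (the sample values $z_i$ are arbitrary):

```python
import numpy as np

z = np.array([1.2, 1.7, 2.3, 3.0, 3.8])          # arbitrary 0 < z_1 < ... < z_5
E = lambda i, j: z[i - 1] / z[j - 1]             # E_ij = z_i / z_j

# Matrix of the minor G(3 4 5 ; 1 2 3): entry for source s, sink c is 2 - E_cs.
M = np.array([[2 - E(c, s) for c in (1, 2, 3)] for s in (3, 4, 5)])
print(np.linalg.matrix_rank(M), np.linalg.det(M))  # rank 2, determinant ~ 0
```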

Let us take a closer look at the left half of the network. We now have a "gap" in the structure.

[Figure: the left half of the network for n = 6, with the gap left by the vanishing edge; the two remaining unknown edges are labelled $x_1$ and $x_2$.]

Note that there is then no way to find a path collection that distinguishes $x_1$ from $x_2$. Thus, we can let $x_1$ be zero, and try to calculate $x_2$. (By the same reasoning, we could just as well have let $x_2$ be zero instead.) To isolate $x_2$ in a unique path collection, we need to use sources $\{2, 4, 5, 6\}$ and sinks $\{1, 2, 3, 4\}$.

[Figure: the network for n = 6, with the unique vertex-disjoint path collection from $\{2, 4, 5, 6\}$ to $\{1, 2, 3, 4\}$ passing through the edge $x_2$.]

Provided that the corresponding minor is zero, $x_2$ must be zero. For n = 7 we need to look at the minor from {2, 3, 5, 6, 7} to {1, 2, 3, 4, 5}, then for n = 8 from {2, 3, 4, 6, 7, 8} to {1, 2, 3, 4, 5, 6}, and so on. We draw the network for n = 8 to convince ourselves that the pattern continues.

[Figure: the network for n = 8, with the corresponding path collection through the gap.]

We see that below the three highest sources, exactly one source is excluded from the pattern, then all sources down to number 2 are included. So what we need to show is that the minors

$$G\begin{pmatrix} 2 & \dots & n-4 & n-2 & n-1 & n \\ 1 & & \dots & & & n-2 \end{pmatrix}$$

are zero. There is actually a very simple argument for this. Expanding the minor along the last three rows (Laplace expansion), we see from the structure of the matrix that all 3 × 3 minors taken from these rows are zero, by the same rank argument as before. Summing these minors times their cofactors then of course gives a zero sum, and the proof is complete.
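The vanishing of these minors can be confirmed numerically as well; the following sketch is our own (the helper G_matrix is repeated from the earlier sketch):

```python
import numpy as np

def G_matrix(z):
    # Same helper as in the earlier sketch.
    n = len(z)
    Q = lambda i, j: 1 - z[i] / z[j]
    return np.array([[1 + Q(j, i) if i > j else
                      1 - Q(i, j) if i < j else 1.0
                      for j in range(n)] for i in range(n)])

for n in (7, 8, 9):
    z = np.sort(1 + np.random.rand(n))
    G = G_matrix(z)
    sources = list(range(2, n - 3)) + [n - 2, n - 1, n]   # {2,...,n-4, n-2, n-1, n}
    sinks = list(range(1, n - 1))                         # {1,...,n-2}
    minor = np.linalg.det(G[np.ix_([s - 1 for s in sources],
                                   [c - 1 for c in sinks])])
    print(n, minor)                                       # ~ 0 every time
```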

Concluding this section, we see that all links of the network we have constructed are non-negative. G is the weight matrix of this network, so we have presented a complete proof of the main result of this chapter.

Theorem 29. The matrix G is totally non-negative.

Below is a table of all the calculated non-zero weights, for easy reference.

[Figure: the network for n = 5, with its non-zero edges labelled a, b, c, d, e, f, h, i, j, k, l, o, p, q, r, t, y.]

$$\begin{aligned}
a &= 1, & b &= 1 + Q_{12}, & c &= 1 - Q_{12}, \\
d &= Q_{12}^2, & e &= \frac{1+Q_{13}}{1+Q_{12}}, & f &= \frac{2Q_{23}}{(1+Q_{12})Q_{12}}, \\
h &= 1 - Q_{23}, & i &= Q_{23}^2, & j &= \frac{1+Q_{14}}{1+Q_{13}}, \\
k &= \frac{Q_{34}(1+Q_{12})(1-Q_{23})}{Q_{23}(1+Q_{13})(1-Q_{22})}, & l &= \frac{Q_{34}}{Q_{23}}, & o &= 1 - Q_{34}, \\
p &= Q_{34}^2, & q &= \frac{1+Q_{15}}{1+Q_{14}}, & r &= \frac{Q_{45}(1+Q_{13})(1-Q_{24})}{Q_{34}(1+Q_{14})(1-Q_{23})}, \\
t &= \frac{Q_{45}}{Q_{34}}, & y &= 1 - Q_{45}, &&
\end{aligned}$$

with the pattern continuing in the same way for larger n. (Note that $Q_{22} = 0$, so the factor $1 - Q_{22}$ in k equals 1; it is kept only to exhibit the general pattern, visible in r.)


Chapter 5

Summary and conclusion

To show that our totally non-negative matrix is oscillatory, we need only a short additional argument. We note that multiplying the weight matrices of two networks is the same as first concatenating the networks and then calculating the resulting weight matrix; this is nothing but the Cauchy–Binet Theorem in disguise. By concatenating a large enough number of networks of the kind we just constructed, there will be enough room in the network to find a collection of vertex-disjoint paths with non-zero weight for any choice of sources and sinks. The concatenated network then has a totally positive weight matrix, which is a power of G. We have proven the following.

Theorem 30. G is an oscillatory matrix.

There is a more formal criterion, proven in [4], which tells us that a totally non-negative matrix A is oscillatory if (and only if) it is invertible and has positive elements directly above and below the main diagonal. This makes sense when we think of A as the weight matrix of a planar network, since those elements guarantee that one can move at least one step up or down from each node, which is necessary if the concatenated network is to be totally positive. For more criteria for oscillatory matrices, see [10].
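Both Theorem 30 and this criterion can be probed numerically for small n. The following brute-force sketch is our own (G_matrix is the helper from the earlier sketches; we also use the fact, from [4], that the (n − 1)-st power of an oscillatory n × n matrix is already totally positive):

```python
import numpy as np
from itertools import combinations

def G_matrix(z):
    # Same helper as in the earlier sketches.
    n = len(z)
    Q = lambda i, j: 1 - z[i] / z[j]
    return np.array([[1 + Q(j, i) if i > j else
                      1 - Q(i, j) if i < j else 1.0
                      for j in range(n)] for i in range(n)])

def totally_positive(A, tol=1e-12):
    # Brute force: every square minor of A must be strictly positive.
    n = A.shape[0]
    return all(np.linalg.det(A[np.ix_(r, c)]) > tol
               for k in range(1, n + 1)
               for r in combinations(range(n), k)
               for c in combinations(range(n), k))

n = 5
z = np.sort(1 + np.random.rand(n))
G = G_matrix(z)

print(totally_positive(G))                         # False: some minors vanish
print(np.diag(G, 1).min() > 0, np.diag(G, -1).min() > 0)   # criterion: True, True
print(totally_positive(np.linalg.matrix_power(G, n - 1)))  # True: G is oscillatory
```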

Since G is oscillatory for arbitrary samplings $x_1 < x_2 < \dots < x_n$, the kernel $g(s-x)$ that gave rise to it must be oscillatory. This means that the eigenfunctions $\psi(x)$ of the integral equation
$$\psi(x) = \frac{z}{2}\int_{-\infty}^{\infty} g(s-x)\, m(s)\, \psi(s)\, ds$$
have the properties proven in Chapter 3.
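For a discrete measure $m = \sum_k m_k\,\delta_{x_k}$, the integral equation collapses to the matrix eigenvalue problem $\psi(x_i) = \frac{z}{2}\sum_k g(x_k - x_i)\,m_k\,\psi(x_k)$. The following closing sketch is our own illustration: the positions and masses are hypothetical, and we assume, consistently with the entries of G above, that the sampled kernel satisfies $g(x_k - x_i) = G_{ik}$ with $z_i = e^{x_i}$:

```python
import numpy as np

def G_matrix(z):
    # Same helper as in the earlier sketches.
    n = len(z)
    Q = lambda i, j: 1 - z[i] / z[j]
    return np.array([[1 + Q(j, i) if i > j else
                      1 - Q(i, j) if i < j else 1.0
                      for j in range(n)] for i in range(n)])

x = np.array([-1.0, -0.3, 0.4, 1.1, 1.9])       # hypothetical mass positions
m = np.array([0.7, 1.2, 0.5, 0.9, 1.4])         # hypothetical masses m_k > 0
G = G_matrix(np.exp(x))                         # G_ik = g(x_k - x_i)

M = 0.5 * G * m                                 # M_ik = (1/2) g(x_k - x_i) m_k
eigs = np.linalg.eigvals(M)                     # eigenvalues are 1/z
print(np.sort(eigs.real))   # real, positive and simple, as Chapter 3 predicts
```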


Bibliography

[1] H. Lundmark, J. Szmigielski, Degasperis–Procesi peakons and the discrete cubic string, International Mathematics Research Papers (2005), no. 2, 53–116.

[2] J. Kohlenberg, H. Lundmark, J. Szmigielski, The inverse spectral problem for the discrete cubic string, Inverse Problems 23 (2007), 99–121.

[3] A. Degasperis, D. D. Holm, A. N. W. Hone, A new integrable equation with peakon solutions, Theoret. and Math. Phys. 133 (2002), no. 2, 1463–1474.

[4] F. R. Gantmacher, M. G. Krein, Oscillation Matrices and Kernels and Small Vibrations of Mechanical Systems, revised ed., American Mathematical Society, Providence, Rhode Island, 2002.

[5] M. G. Krein, Sur les fonctions de Green non-symétriques oscillatoires des opérateurs différentiels ordinaires, C. R. (Doklady) Acad. Sci. URSS (N.S.) 25 (1939), 643–646.

[6] C. R. MacCluer, The many proofs and applications of Perron's Theorem, SIAM Rev. 42 (2000), no. 3, 487–498.

[7] I. Fredholm, Sur une classe d'équations fonctionnelles, Acta Math. 27 (1903), 365–390.

[8] F. Smithies, Integral Equations, Cambridge University Press, New York, 1958.

[9] R. Jentzsch, Über Integralgleichungen mit positivem Kern, J. Reine Angew. Math. 141 (1912), 235–244.

[10] S. Fomin, A. Zelevinsky, Total positivity: tests and parametrizations, Math. Intelligencer 22 (2000), no. 1, 23–33.


Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a period of 25 years from the date of publication barring exceptional circumstances. The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for your own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility. According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/


© 2010, Marcus Kardell
