
U.U.D.M. Project Report 2020:29

Degree project in mathematics, 15 credits. Supervisor: Jörgen Östensson

Examiner: Veronica Crispin Quinonez. June 2020

Department of Mathematics, Uppsala University

Orthogonal polynomials and special functions

Elsa Graneland


Abstract

This thesis is a brief introduction to the theory of orthogonal polynomials which is based on the theory of inner product spaces. We will also consider some special functions, especially the Bessel function.

We present definitions of orthonormal and monic orthogonal polynomials, and discuss the three-term recurrence relation, the Jacobi matrix and also a result concerning the zeros of the orthogonal polynomials.

Furthermore, the Sturm-Liouville problem is presented, and in particular the Bessel function. Many polynomials and special functions are solutions to differential equations describing phenomena in nature. Lastly, some applications to physics, e.g. quantum mechanics, are presented.


Contents

1 Introduction
2 Inner product and inner product space
3 Gram-Schmidt process
4 Orthogonal polynomials
   4.1 Orthonormal and monic orthogonal polynomials
   4.2 Three-term recurrence relation
   4.3 The Jacobi matrix
   4.4 Zeros of orthogonal polynomials
5 Classical orthogonal polynomials
   5.1 Legendre polynomials
   5.2 Hermite polynomials
   5.3 Laguerre polynomials
6 Sturm-Liouville problems
7 Applications
   7.1 Eigenvalues of the Laplace operator
       7.1.1 Eigenvalue problem in a disk
       7.1.2 Eigenvalue problem in a ball
   7.2 Schrödinger equation
       7.2.1 Harmonic oscillator
       7.2.2 Hydrogen atom
8 References


1 Introduction

The theory of orthogonal polynomials and special functions is of intrinsic interest to many parts of mathematics. Moreover, it can be used to explain many physical and chemical phenomena. For example, the vibrations of a drum head can be explained in terms of special functions known as Bessel functions, and the solutions of the Schrödinger equation for a harmonic oscillator can be described using orthogonal polynomials known as Hermite polynomials. Furthermore, the eigenfunctions for the Schrödinger operator associated with the hydrogen atom are described in terms of orthogonal polynomials known as Laguerre polynomials.

In Section 2 the definitions of inner product and inner product space are presented, and in Section 3 the Gram-Schmidt algorithm is described. In Section 4 the orthonormal and monic orthogonal polynomials are defined. This is followed by a discussion of the three-term recurrence relation, the Jacobi matrix and the zeros of orthogonal polynomials. This thesis considers three of the classical orthogonal polynomials, which are presented in Section 5. In Section 6 we discuss Sturm-Liouville problems. The last section, Section 7, presents the applications mentioned above.

2 Inner product and inner product space

The standard scalar products on $\mathbb{R}^n$ and $\mathbb{C}^n$ satisfy certain calculation rules. These are taken as axioms in a general inner product space.

Definition 2.1. An inner product $\langle \cdot\,, \cdot \rangle$ on a vector space $X$ is a mapping of $X \times X$ into the scalar field $K$ ($= \mathbb{R}$ or $\mathbb{C}$) satisfying the following. For all vectors $x$, $y$ and $z$ and scalars $\alpha$ we have:

1. $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$.

2. $\langle \alpha x, y \rangle = \alpha \langle x, y \rangle$.

3. $\langle x, y \rangle = \overline{\langle y, x \rangle}$.

4. $\langle x, x \rangle \geq 0$, with equality if and only if $x = 0$.

The inner product on $X$ defines a norm on $X$ given by $\|x\| = \sqrt{\langle x, x \rangle}$ and a distance function, or metric, on $X$ given by
$$d(x, y) = \|x - y\| = \sqrt{\langle x - y, x - y \rangle}.$$

When an inner product space is complete, the space is called a Hilbert space and is usually denoted by H.

Remark. It follows from the definition above that
$$\langle \alpha x + \beta y, z \rangle = \alpha \langle x, z \rangle + \beta \langle y, z \rangle,$$
$$\langle x, \alpha y + \beta z \rangle = \overline{\alpha} \langle x, y \rangle + \overline{\beta} \langle x, z \rangle.$$
Due to the conjugation in the second variable, one says that the inner product is sesquilinear, meaning "1½ times linear".


Definition 2.2. Two elements $x$ and $y$ in an inner product space $X$ are said to be orthogonal if $\langle x, y \rangle = 0$. A set of vectors is called an orthonormal set if these vectors are pairwise orthogonal and of norm 1.

Example 1. Euclidean space $\mathbb{R}^n$.

Given vectors $x = (x_1, x_2, ..., x_n)$ and $y = (y_1, y_2, ..., y_n)$ in $\mathbb{R}^n$, an inner product is defined by
$$\langle x, y \rangle = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n = \sum_{j=1}^{n} x_j y_j. \tag{2.1}$$

This makes $\mathbb{R}^n$ into a Hilbert space.

Example 2. Unitary space $\mathbb{C}^n$.

The standard inner product on $\mathbb{C}^n$ is given by
$$\langle x, y \rangle = \sum_{j=1}^{n} x_j \overline{y_j}. \tag{2.2}$$

This makes $\mathbb{C}^n$ into a Hilbert space.

Example 3. The space C[a, b].

Let $C[a, b]$ denote the space of complex-valued continuous functions defined on the interval $[a, b]$. An inner product on $C[a, b]$ is given by
$$\langle f, g \rangle = \int_a^b f(x)\, \overline{g(x)}\, dx. \tag{2.3}$$

This space is not complete in the metric induced by the scalar product.

Example 4. The space $L^2(\mathbb{R}, d\mu)$.

Given a positive Borel measure $d\mu$ on $\mathbb{R}$, we let $L^2(\mathbb{R}, d\mu)$ denote the space of equivalence classes of all Borel measurable functions $f$ such that
$$\int_{\mathbb{R}} |f|^2\, d\mu < \infty.$$

Two functions are considered equivalent if they agree $\mu$-almost everywhere. An inner product on $L^2(\mathbb{R}, d\mu)$ is given by
$$\langle f, g \rangle = \int_{\mathbb{R}} f(x)\, \overline{g(x)}\, d\mu. \tag{2.4}$$

This makes $L^2(\mathbb{R}, d\mu)$ into a Hilbert space.

For the classical orthogonal polynomials it holds that $d\mu = \omega(x)\, dx$. Moreover, the so-called weight function $\omega$ usually vanishes outside some interval $[a, b]$. In this case we write $L^2([a, b], \omega(x)dx)$, and the scalar product is given by
$$\langle f, g \rangle = \int_a^b f(x)\, \overline{g(x)}\, \omega(x)\, dx.$$

In case $\langle f, g \rangle = 0$ we say that $f$ and $g$ are orthogonal with respect to the weight $\omega$ on $[a, b]$.

For more about the theory of inner product spaces and Hilbert spaces, see [3].


3 Gram-Schmidt process

The Gram-Schmidt process is an algorithm used to turn any linearly independent set of vectors into an orthonormal set of vectors.

We denote the linearly independent vectors by $x_j$ and the resulting orthonormal vectors by $e_j$. The steps of the algorithm are as follows:

• First step: The first element of the orthonormal sequence, $e_1$, is obtained from
$$e_1 = \frac{1}{\|x_1\|}\, x_1.$$

• Second step: All the following steps include two parts: first create a vector orthogonal to the previous vector(s), then normalize it. We create $v_2$ as
$$v_2 = x_2 - \langle x_2, e_1 \rangle e_1,$$
and then normalize it:
$$e_2 = \frac{1}{\|v_2\|}\, v_2.$$

• Third step: We create $v_3$ as
$$v_3 = x_3 - \langle x_3, e_1 \rangle e_1 - \langle x_3, e_2 \rangle e_2,$$
and then normalize:
$$e_3 = \frac{1}{\|v_3\|}\, v_3.$$
The algorithm proceeds by induction.

• $n$th step: Suppose that $\{e_1, ..., e_{n-1}\}$ is an orthonormal set such that $\mathrm{span}\{e_1, ..., e_{n-1}\} = \mathrm{span}\{x_1, ..., x_{n-1}\}$. The vector $v_n$ is defined by
$$v_n = x_n - \sum_{k=1}^{n-1} \langle x_n, e_k \rangle e_k.$$
Note that $v_n$ is a non-zero vector; otherwise $x_n$ would belong to the span of $\{x_1, ..., x_{n-1}\}$ and the set $\{x_1, ..., x_n\}$ would be linearly dependent. Note also that $v_n$ is orthogonal to all the vectors $e_1, ..., e_{n-1}$. Normalizing $v_n$,
$$e_n = \frac{1}{\|v_n\|}\, v_n,$$
we therefore obtain an orthonormal set $\{e_1, ..., e_n\}$ with $\mathrm{span}\{e_1, ..., e_n\} = \mathrm{span}\{x_1, ..., x_n\}$.
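As a concrete illustration of the algorithm (a minimal numerical sketch, not part of the original text; the helper name gram_schmidt is hypothetical), the process is easy to implement for vectors in $\mathbb{R}^n$ with the standard scalar product:

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn linearly independent vectors into an orthonormal set."""
    basis = []
    for x in vectors:
        # Subtract the projections onto the previously constructed vectors.
        v = x - sum(np.dot(x, e) * e for e in basis)
        if np.linalg.norm(v) < 1e-12:
            raise ValueError("vectors are linearly dependent")
        basis.append(v / np.linalg.norm(v))  # normalize v_n to get e_n
    return basis

# The vectors from Example 5 below:
e1, e2 = gram_schmidt([np.array([1.0, 1.0, 0.0]), np.array([1.0, 2.0, 1.0])])
print(e1)  # (1, 1, 0)/sqrt(2)
print(e2)  # (-1, 1, 2)/sqrt(6)
```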

Example 5. Gram-Schmidt procedure on vectors in $\mathbb{R}^3$.

Consider the two vectors $x_1 = (1, 1, 0)$ and $x_2 = (1, 2, 1)$. The Gram-Schmidt procedure can be used to obtain a set $\{e_1, e_2\}$ that is orthonormal with respect to the standard scalar product in $\mathbb{R}^3$.

• First step: The vector $e_1$ is obtained by normalizing $x_1$:
$$e_1 = \frac{1}{\|x_1\|}\, x_1 = \frac{1}{\sqrt{2}}\, (1, 1, 0).$$

• Second step: We create $v_2$ as
$$v_2 = x_2 - \langle x_2, e_1 \rangle e_1 = (1, 2, 1) - \frac{3}{2}(1, 1, 0) = \frac{1}{2}(-1, 1, 2),$$
and now we normalize:
$$e_2 = \frac{1}{\|v_2\|}\, v_2 = \frac{1}{\sqrt{6}}(-1, 1, 2).$$

Example 6. Gram-Schmidt process on polynomials.

Consider the set $u = \{1, x, x^2\}$, and let $u_1 = 1$, $u_2 = x$ and $u_3 = x^2$. The Gram-Schmidt process can be used to obtain a set $\{e_1, e_2, e_3\}$ that is orthonormal with respect to the inner product
$$\langle f, g \rangle = \int_{-1}^{1} f(x)\, g(x)\, dx.$$

• First step: The first element of the orthonormal sequence, $e_1$, is obtained from
$$e_1 = \frac{1}{\|u_1\|}\, u_1 = \frac{1}{\sqrt{2}}.$$

• Second step: We create $v_2$ as
$$v_2 = u_2 - \langle u_2, e_1 \rangle e_1 = x - \left( \int_{-1}^{1} x \cdot \frac{1}{\sqrt{2}}\, dx \right) \frac{1}{\sqrt{2}} = x,$$
and then normalize it:
$$e_2 = \frac{1}{\|v_2\|}\, v_2 = \frac{x}{\sqrt{2/3}} = \sqrt{\frac{3}{2}}\, x.$$

• Third step: We create $v_3$ as
$$v_3 = u_3 - \langle u_3, e_1 \rangle e_1 - \langle u_3, e_2 \rangle e_2 = x^2 - \left( \int_{-1}^{1} x^2 \cdot \frac{1}{\sqrt{2}}\, dx \right) \frac{1}{\sqrt{2}} - \left( \int_{-1}^{1} x^2 \cdot \sqrt{\frac{3}{2}}\, x\, dx \right) \sqrt{\frac{3}{2}}\, x = x^2 - \frac{1}{2} \cdot \frac{2}{3} - 0 = x^2 - \frac{1}{3}.$$
Note that
$$\|v_3\|^2 = \int_{-1}^{1} \left( x^2 - \frac{1}{3} \right)^2 dx = \frac{8}{45},$$
and therefore
$$e_3 = \frac{1}{\|v_3\|}\, v_3 = \frac{3\sqrt{5}}{2\sqrt{2}} \left( x^2 - \frac{1}{3} \right).$$

These are, up to a multiplicative factor, the first three so-called Legendre polynomials.
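As a quick sanity check (an illustration, not part of the original text), the orthonormality of $e_1$, $e_2$, $e_3$ from Example 6 can be verified numerically:

```python
import numpy as np
from scipy.integrate import quad

# The three polynomials produced by Gram-Schmidt in Example 6.
e = [lambda x: 1/np.sqrt(2),
     lambda x: np.sqrt(3/2)*x,
     lambda x: 3*np.sqrt(5)/(2*np.sqrt(2))*(x**2 - 1/3)]

for i in range(3):
    for j in range(3):
        val, _ = quad(lambda x: e[i](x)*e[j](x), -1, 1)
        print(i, j, round(val, 10))  # 1.0 when i == j, 0.0 otherwise
```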


4 Orthogonal polynomials

4.1 Orthonormal and monic orthogonal polynomials

Let $d\mu$ be a positive Borel measure on $\mathbb{R}$ having finite moments, that is,
$$\int_{\mathbb{R}} |x|^m\, d\mu < \infty \tag{4.1}$$
for all integers $m \geq 0$. Furthermore, assume that $\mathrm{supp}\, \mu$ contains infinitely many points, i.e. $\#(\mathrm{supp}\, \mu) = \infty$. Then the set $\{1, x, x^2, ...\}$ is linearly independent. The orthonormal polynomials $p_n(x)$ satisfy
$$\int p_n(x)\, p_m(x)\, d\mu = \delta_{nm}, \tag{4.2}$$
where
$$\delta_{nm} = \begin{cases} 0, & n \neq m, \\ 1, & n = m. \end{cases}$$

We assume that the coefficient of the highest-order term is positive:
$$p_n(x) = \gamma_n x^n + \cdots, \qquad \gamma_n > 0. \tag{4.3}$$
This makes the $p_n$'s unique. They can be obtained by applying the Gram-Schmidt process to the set $\{1, x, x^2, ...\}$.

The orthogonal polynomials having leading coefficient 1 are called monic orthogonal polynomials. We denote them by $\pi_n(x)$, i.e. $\pi_n(x) = p_n(x)/\gamma_n$.

Let $M = M_n$ be the $(n + 1) \times (n + 1)$ matrix whose entries are the moments,
$$M = (M_{ij}) = \left( \int x^{i+j}\, d\mu(x) \right)_{0 \leq i,j \leq n},$$
and let $D_n$ denote its determinant:
$$D_n = \det M = \begin{vmatrix} \int s^0\, d\mu & \int s\, d\mu & \cdots & \int s^n\, d\mu \\ \int s\, d\mu & \int s^2\, d\mu & \cdots & \int s^{n+1}\, d\mu \\ \vdots & \vdots & \ddots & \vdots \\ \int s^{n-1}\, d\mu & \int s^n\, d\mu & \cdots & \int s^{2n-1}\, d\mu \\ \int s^n\, d\mu & \int s^{n+1}\, d\mu & \cdots & \int s^{2n}\, d\mu \end{vmatrix}.$$

Note that
$$0 \leq \left\| \sum_{i=0}^{n} t_i x^i \right\|_{L^2(\mathbb{R}, d\mu)}^2 = \left\langle \sum_{i=0}^{n} t_i x^i,\ \sum_{j=0}^{n} t_j x^j \right\rangle_{L^2(\mathbb{R}, d\mu)} = \sum_{i,j=0}^{n} t_i t_j \int_{\mathbb{R}} x^{i+j}\, d\mu = \sum_{i,j=0}^{n} t_i t_j M_{ij} = \langle M t, t \rangle_{\mathbb{R}^{n+1}}, \tag{4.4}$$
where $t = (t_0, ..., t_n)^T \in \mathbb{R}^{n+1}$. Since $\#(\mathrm{supp}\, \mu) = \infty$ it follows that the matrix $M$ is strictly positive definite, and therefore
$$D_n > 0.$$

(10)

For notational convenience we set $D_{-1} = 1$. Now set $D_0(x) = 1$ and, for $n \geq 1$,
$$D_n(x) = \begin{vmatrix} \int s^0\, d\mu & \int s\, d\mu & \cdots & \int s^n\, d\mu \\ \int s\, d\mu & \int s^2\, d\mu & \cdots & \int s^{n+1}\, d\mu \\ \vdots & \vdots & \ddots & \vdots \\ \int s^{n-1}\, d\mu & \int s^n\, d\mu & \cdots & \int s^{2n-1}\, d\mu \\ 1 & x & \cdots & x^n \end{vmatrix}.$$

There is an explicit formula for the orthonormal polynomials, due to Heine.

Theorem 1. Let $p_n(x)$, $n \geq 0$, be the sequence of polynomials orthonormal with respect to $d\mu$. Then
$$p_n(x) = \frac{1}{\sqrt{D_{n-1} D_n}}\, D_n(x).$$

Proof. To prove that
$$p_n(x) = \frac{1}{\sqrt{D_{n-1} D_n}}\, D_n(x) \tag{4.5}$$
we write $\tilde{p}_n(x) = \frac{1}{\sqrt{D_{n-1} D_n}}\, D_n(x)$. For all $j < n$, multiplying by $x^j$ and integrating we get the following expression:
$$\sqrt{D_n D_{n-1}} \int x^j \tilde{p}_n(x)\, d\mu = \begin{vmatrix} \int s^0\, d\mu & \int s\, d\mu & \cdots & \int s^n\, d\mu \\ \int s\, d\mu & \int s^2\, d\mu & \cdots & \int s^{n+1}\, d\mu \\ \vdots & \vdots & \ddots & \vdots \\ \int s^{n-1}\, d\mu & \int s^n\, d\mu & \cdots & \int s^{2n-1}\, d\mu \\ \int x^j\, d\mu & \int x^{j+1}\, d\mu & \cdots & \int x^{j+n}\, d\mu \end{vmatrix}.$$
Note that the last row in the determinant above is identical to one of the rows above it, since $j < n$. When two rows in a matrix are identical, the determinant is 0, as seen e.g. using row operations. That gives us
$$\int x^j \tilde{p}_n(x)\, d\mu = 0,$$
which also implies that $\tilde{p}_n(x) \perp \tilde{p}_m(x)$ for all $m < n$. Furthermore we want to show that $\|\tilde{p}_n\|^2 = 1$. Because $\tilde{p}_n(x) = \frac{1}{\sqrt{D_{n-1} D_n}}\, D_n(x)$, we have
$$\|\tilde{p}_n\|^2 = \langle \tilde{p}_n, \tilde{p}_n \rangle = \frac{1}{D_{n-1} D_n} \langle D_n(x), D_n(x) \rangle.$$
The right-hand side of this equation can be expressed using the definition of the inner product:
$$\|\tilde{p}_n\|^2 = \frac{1}{D_{n-1} D_n} \int D_n(x)\, D_n(x)\, d\mu.$$


Expanding the second factor $D_n(x)$ along the last row, it follows that this can be rewritten as
$$\|\tilde{p}_n\|^2 = \frac{1}{D_{n-1} D_n} \int D_n(x) \left( D_{n-1} x^n + \cdots \right) d\mu,$$
where the dots represent terms of lower order. Note that all the terms of lower order are orthogonal to $D_n(x)$, and therefore these integrals are zero. What remains is the following expression:
$$\|\tilde{p}_n\|^2 = \frac{1}{D_n} \int D_n(x)\, x^n\, d\mu.$$
Using expansion along the last row, one realizes that this can be rewritten as
$$\|\tilde{p}_n\|^2 = \frac{1}{D_n} \begin{vmatrix} \int x^0\, d\mu & \int x\, d\mu & \cdots & \int x^n\, d\mu \\ \int x\, d\mu & \int x^2\, d\mu & \cdots & \int x^{n+1}\, d\mu \\ \vdots & \vdots & \ddots & \vdots \\ \int x^{n-1}\, d\mu & \int x^n\, d\mu & \cdots & \int x^{2n-1}\, d\mu \\ \int x^n\, d\mu & \int x^{n+1}\, d\mu & \cdots & \int x^{2n}\, d\mu \end{vmatrix},$$
where the determinant is exactly $D_n$. Thus
$$\|\tilde{p}_n\|^2 = \frac{1}{D_n}\, D_n = 1.$$
Finally, expanding $D_n(x)$ along the last row one sees that the coefficient of the highest-order term in $\tilde{p}_n$ is given by
$$\frac{1}{\sqrt{D_{n-1} D_n}} \cdot D_{n-1} = \sqrt{\frac{D_{n-1}}{D_n}} > 0. \tag{4.6}$$
Hence $\tilde{p}_n(x) = p_n(x)$.
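To make Heine's formula concrete, here is a small numerical check (a sketch, not from the original text) for the Lebesgue measure $d\mu = dx$ on $[-1, 1]$, whose moments are $\int_{-1}^{1} x^k\, dx = 2/(k+1)$ for even $k$ and $0$ for odd $k$; by Section 5.1 the result should equal the orthonormal Legendre polynomial $\sqrt{(2n+1)/2}\, P_n(x)$:

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

def moment(k):
    # Moments of Lebesgue measure on [-1, 1].
    return 2.0/(k + 1) if k % 2 == 0 else 0.0

def D(n):
    # Determinant of the (n+1) x (n+1) moment matrix; D(-1) = 1 by convention.
    if n < 0:
        return 1.0
    return np.linalg.det([[moment(i + j) for j in range(n + 1)] for i in range(n + 1)])

def heine_p(n, x):
    # D_n(x): the moment matrix with last row (1, x, ..., x^n).
    rows = [[moment(i + j) for j in range(n + 1)] for i in range(n)]
    rows.append([x**j for j in range(n + 1)])
    return np.linalg.det(np.array(rows)) / np.sqrt(D(n - 1) * D(n))

n, x = 3, 0.4
print(heine_p(n, x), np.sqrt((2*n + 1)/2) * Legendre.basis(n)(x))  # equal
```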

4.2 Three-term recurrence relation

Since $x\, p_k(x)$ is a polynomial of degree $k + 1$, it can be written as a sum
$$x\, p_k(x) = \sum_{j=0}^{k+1} c_j\, p_j(x) \tag{4.7}$$
for some constants $c_j$. Suppose that $k \geq 1$ and that $l < k - 1$. Then, due to orthogonality, we have the following expression:
$$\int x\, p_k(x)\, p_l(x)\, d\mu = \int p_k(x)\, x\, p_l(x)\, d\mu = 0, \tag{4.8}$$


since $x\, p_l(x)$ is a polynomial of degree $l + 1 < k$. On the other hand, by (4.7) we have
$$\int x\, p_k(x)\, p_l(x)\, d\mu = \sum_{j=0}^{k+1} c_j \int p_j(x)\, p_l(x)\, d\mu = c_l. \tag{4.9}$$

By comparing equation (4.8) and equation (4.9), we see that $c_l = 0$ for all $l < k - 1$. That leaves only three terms in (4.7): those with $j = k - 1$, $j = k$ and $j = k + 1$. Therefore, the polynomial $x\, p_k(x)$ can be expressed as
$$x\, p_k(x) = c_{k+1} p_{k+1}(x) + c_k p_k(x) + c_{k-1} p_{k-1}(x), \qquad k \geq 1. \tag{4.10}$$
This is the famous three-term recurrence formula. Obviously, we also have
$$x\, p_0(x) = c_1 p_1(x) + c_0 p_0(x), \qquad k = 0. \tag{4.11}$$
Set
$$a_k = \int x\, p_k^2(x)\, d\mu, \qquad b_k = \int x\, p_{k+1}(x)\, p_k(x)\, d\mu$$

for $k \geq 0$. Multiplying equation (4.10) by $p_k(x)$ and integrating with respect to $d\mu$, we get
$$a_k = \int x\, p_k^2(x)\, d\mu = 0 + c_k \int p_k^2(x)\, d\mu + 0 = c_k. \tag{4.12}$$
Similarly, multiplying equation (4.10) by $p_{k+1}(x)$ and integrating with respect to $d\mu$, we get
$$b_k = \int x\, p_{k+1}(x)\, p_k(x)\, d\mu = c_{k+1} \int p_{k+1}^2\, d\mu + 0 + 0 = c_{k+1}. \tag{4.13}$$
Finally, multiplying equation (4.10) by $p_{k-1}(x)$ and integrating with respect to $d\mu$, we get
$$b_{k-1} = \int x\, p_k(x)\, p_{k-1}(x)\, d\mu = 0 + 0 + c_{k-1} \int p_{k-1}^2\, d\mu = c_{k-1}. \tag{4.14}$$
Doing a similar calculation for the case $k = 0$, we get
$$x\, p_k(x) = b_k p_{k+1}(x) + a_k p_k(x) + b_{k-1} p_{k-1}(x), \qquad k \geq 1, \tag{4.15}$$
$$x\, p_0(x) = b_0 p_1(x) + a_0 p_0(x), \qquad k = 0. \tag{4.16}$$
The leading coefficient of the left-hand side of equation (4.15) is $\gamma_k$, and on the right-hand side it is $b_k \gamma_{k+1}$. Therefore $\gamma_k = b_k \gamma_{k+1}$, so that
$$b_k = \frac{\gamma_k}{\gamma_{k+1}} > 0. \tag{4.17}$$

If we set $T$ to be the tridiagonal matrix
$$T = \begin{pmatrix} a_0 & b_0 & 0 & \cdots \\ b_0 & a_1 & b_1 & \ddots \\ 0 & b_1 & a_2 & \ddots \\ \vdots & \ddots & \ddots & \ddots \end{pmatrix},$$
the equations (4.15) and (4.16) can be rewritten as
$$T \begin{pmatrix} p_0(x) \\ p_1(x) \\ \vdots \\ p_k(x) \\ \vdots \end{pmatrix} = x \begin{pmatrix} p_0(x) \\ p_1(x) \\ \vdots \\ p_k(x) \\ \vdots \end{pmatrix}.$$

4.3 The Jacobi matrix

Definition 4.1. The Jacobi matrix is defined as the symmetric matrix $T$ such that
$$T = \begin{pmatrix} a_0 & b_0 & 0 & \cdots & 0 \\ b_0 & a_1 & b_1 & \ddots & \vdots \\ 0 & b_1 & a_2 & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & b_{n-2} \\ 0 & \cdots & 0 & b_{n-2} & a_{n-1} \end{pmatrix}, \qquad b_j > 0, \tag{4.18}$$
in the finite-dimensional case, and
$$T = \begin{pmatrix} a_0 & b_0 & 0 & \cdots \\ b_0 & a_1 & b_1 & \ddots \\ 0 & b_1 & a_2 & \ddots \\ \vdots & \ddots & \ddots & \ddots \end{pmatrix}, \qquad b_j > 0, \tag{4.19}$$
in the infinite-dimensional case.

Theorem 2. Let $T$ be a finite $n \times n$ Jacobi matrix. Then the following holds:

1. The spectrum of $T$ is simple. Thus, there exist $n$ distinct eigenvalues $\lambda_1, \lambda_2, ..., \lambda_n$.

2. If $f = (f(1), f(2), ..., f(n))^T \neq 0$ is an eigenvector of $T$, i.e. $Tf = \lambda f$, then $f(1) \neq 0$ and $f(n) \neq 0$.

Proof. We begin by proving the second statement. Assume that
$$(T - \lambda I) f = 0, \tag{4.20}$$
where $f = (f(1), f(2), ..., f(n))^T$. The first row of equation (4.20) reads
$$(a_0 - \lambda) f(1) + b_0 f(2) = 0. \tag{4.21}$$
We argue by contradiction. Suppose that $f(1) = 0$. Since $b_0 > 0$ it follows from equation (4.21) that $f(2) = 0$. The equations in the middle rows of (4.20) can be written as
$$b_{i-2} f(i-1) + (a_{i-1} - \lambda) f(i) + b_{i-1} f(i+1) = 0 \tag{4.22}$$
for $2 \leq i < n$. From equation (4.22) with $i = 2$ we get the following expression:
$$b_0 f(1) + (a_1 - \lambda) f(2) + b_1 f(3) = 0. \tag{4.23}$$
Since $f(1) = 0$, $f(2) = 0$ and $b_1 > 0$ it follows that $f(3) = 0$ too. Continuing in this manner, setting $i = 3, ..., n - 1$, it follows that $f(1) = f(2) = \cdots = f(n) = 0$. This means that if $f(1) = 0$, then $f = 0$. Therefore $f(1) \neq 0$. That $f(n) \neq 0$ follows from similar reasoning, starting from the last row.

For the proof of the first statement, suppose that $f$ and $\tilde{f}$ are two (non-zero) eigenvectors with the same eigenvalue $\lambda$. We can find a pair $(a, \tilde{a}) \neq (0, 0)$ such that $g = a f + \tilde{a} \tilde{f}$ has $g(1) = 0$; for example, we may take $a = 1$ and $\tilde{a} = -f(1)/\tilde{f}(1)$. Since $g$ lies in the same eigenspace it follows, from the proof of the second statement, that $g = 0$. This means that $f$ and $\tilde{f}$ are linearly dependent, so each eigenspace is one-dimensional. Since the matrix $T$ is symmetric, it follows that it has $n$ distinct real eigenvalues.

4.4 Zeros of orthogonal polynomials

When considering the zeros of orthogonal polynomials, there are some properties that characterise them. We will focus on two:

Theorem 3. Let $d\mu$ be a Borel measure with finite moments and infinite support, and let $p_n(x)$ be the associated orthonormal polynomials. Then:

1. The $n$ zeros of $p_n(x)$ are real and simple.

2. The zeros of $p_{n+1}(x)$ and $p_n(x)$ interlace; that is, between any two zeros of $p_{n+1}(x)$ lies exactly one zero of $p_n(x)$, and vice versa.

Proof of 1. When proving the first property we may assume, without loss of generality, that $\int d\mu = 1$. Let $T$ be the Jacobi matrix
$$T = \begin{pmatrix} a_0 & b_0 & 0 & \cdots \\ b_0 & a_1 & b_1 & \ddots \\ 0 & b_1 & a_2 & \ddots \\ \vdots & \ddots & \ddots & \ddots \end{pmatrix}.$$
We know that
$$0 = (T - \lambda) \begin{pmatrix} p_0(\lambda) \\ p_1(\lambda) \\ \vdots \end{pmatrix} = \begin{pmatrix} a_0 - \lambda & b_0 & 0 & \cdots \\ b_0 & a_1 - \lambda & b_1 & \ddots \\ 0 & b_1 & a_2 - \lambda & \ddots \\ \vdots & \ddots & \ddots & \ddots \end{pmatrix} \begin{pmatrix} p_0(\lambda) \\ p_1(\lambda) \\ \vdots \end{pmatrix}.$$
Hence, $p_n(\lambda) = 0$ if and only if
$$(p_0(\lambda), ..., p_{n-1}(\lambda)) = (1, p_1(\lambda), ..., p_{n-1}(\lambda)) \neq 0$$
is an eigenvector of the $n \times n$ Jacobi matrix
$$\begin{pmatrix} a_0 & b_0 & 0 & \cdots & 0 \\ b_0 & a_1 & b_1 & \ddots & \vdots \\ 0 & b_1 & a_2 & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & b_{n-2} \\ 0 & \cdots & 0 & b_{n-2} & a_{n-1} \end{pmatrix} \tag{4.24}$$
with eigenvalue $\lambda$. Therefore, the zeros of $p_n$ are the same as the eigenvalues of the matrix in equation (4.24), and due to Theorem 2 this matrix has $n$ real and simple eigenvalues.
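This connection between zeros and eigenvalues is the basis of the Golub-Welsch algorithm for computing Gaussian quadrature nodes. As a hedged sketch (not from the original text): for the Legendre case of Section 5.1 the recurrence coefficients are $a_k = 0$ and $b_k = (k+1)/\sqrt{(2k+1)(2k+3)}$, and the zeros of $p_n$ come out as the eigenvalues of the truncated Jacobi matrix:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

n = 5
# Recurrence coefficients for the orthonormal Legendre polynomials:
# a_k = 0 and b_k = (k+1)/sqrt((2k+1)(2k+3)).
b = [(k + 1)/np.sqrt((2*k + 1)*(2*k + 3)) for k in range(n - 1)]
T = np.diag(b, 1) + np.diag(b, -1)  # n x n Jacobi matrix (all a_k = 0)

zeros = np.linalg.eigvalsh(T)              # eigenvalues = zeros of p_n
print(np.allclose(zeros, leggauss(n)[0]))  # True: the Gauss-Legendre nodes
```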

Proof of 2. The roots of $p_n(x)$ and $p_{n+1}(x)$ are the eigenvalues of the two matrices $T_{n-1}$ and $T_n$, where
$$T_{n-1} = \begin{pmatrix} a_0 & b_0 & 0 & \cdots & 0 \\ b_0 & a_1 & b_1 & \ddots & \vdots \\ 0 & b_1 & a_2 & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & b_{n-2} \\ 0 & \cdots & 0 & b_{n-2} & a_{n-1} \end{pmatrix}, \qquad T_n = \begin{pmatrix} a_0 & b_0 & 0 & \cdots & 0 \\ b_0 & a_1 & b_1 & \ddots & \vdots \\ 0 & b_1 & a_2 & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & b_{n-1} \\ 0 & \cdots & 0 & b_{n-1} & a_n \end{pmatrix}.$$

Consider
$$\langle e_n, (T_n - \lambda I)^{-1} e_n \rangle,$$
the $(n, n)$-entry of the resolvent $(T_n - \lambda I)^{-1}$. Observe that we here use the notation $e_0, ..., e_n$ for the standard basis vectors in $\mathbb{R}^{n+1}$, and index the rows and columns of the $(n+1) \times (n+1)$ matrix $T_n$ from 0 to $n$. Now recall the formula
$$A^{-1} = \frac{1}{\det A}\, \operatorname{adj}(A)$$
for the inverse of a matrix $A$ in terms of the so-called adjoint of $A$:
$$\operatorname{adj}(A) = (\operatorname{cof}(A))^T.$$
Using this gives us the following:
$$\langle e_n, (T_n - \lambda I)^{-1} e_n \rangle = \frac{\det(T_{n-1} - \lambda I)}{\det(T_n - \lambda I)} = -\frac{\pi_n(\lambda)}{\pi_{n+1}(\lambda)}, \tag{4.25}$$
where the $\pi$'s are the monic orthogonal polynomials. The spectral theorem gives us
$$T_n = U_n \Lambda\, U_n^T, \tag{4.26}$$
where $U_n$ is the orthogonal matrix whose columns are the eigenvectors of $T_n$ and $\Lambda = \operatorname{diag}(\lambda_0, ..., \lambda_n)$ is the diagonal matrix of eigenvalues. Write $U_n = (f_j(i))$. Using the spectral theorem,
$$T_n - \lambda I = U_n \Lambda\, U_n^T - \lambda I = U_n (\Lambda - \lambda I)\, U_n^T,$$
so that
$$(T_n - \lambda I)^{-1} = U_n (\Lambda - \lambda I)^{-1} U_n^T.$$
From equation (4.25) we get
$$-\frac{\pi_n(\lambda)}{\pi_{n+1}(\lambda)} = \langle e_n, (T_n - \lambda I)^{-1} e_n \rangle = \langle e_n, U_n (\Lambda - \lambda I)^{-1} U_n^T e_n \rangle = \langle U_n^T e_n, (\Lambda - \lambda I)^{-1} U_n^T e_n \rangle = \sum_{j=0}^{n} \frac{f_j^2(n)}{\lambda_j - \lambda}. \tag{4.27}$$

Therefore,
$$\sum_{j=0}^{n} \frac{f_j^2(n)}{\lambda - \lambda_j} = \frac{\pi_n(\lambda)}{\pi_{n+1}(\lambda)}. \tag{4.28}$$

Looking at the left-hand side of this equation, we see that when $\lambda$ is close to $\lambda_j$ the sum takes large positive or large negative values, depending on whether $\lambda$ approaches $\lambda_j$ from the left or from the right. The derivative
$$\frac{d}{d\lambda} \sum_j \frac{f_j^2(n)}{\lambda - \lambda_j} = -\sum_j \frac{f_j^2(n)}{(\lambda - \lambda_j)^2}$$
is negative, so the function on the left-hand side of equation (4.28) is decreasing everywhere between its poles.

The graph of the function on the left-hand side of equation (4.28) can be visualized as in Figure 1 below.

Figure 1: Interlacing zeros of orthogonal polynomials (the function has poles at $\lambda_0, \lambda_1, ..., \lambda_n$).


Thus, the left-hand side of equation (4.28) must have a zero between any two zeros of πn+1(λ). But this must be a zero of πn(λ).
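Numerically, the interlacing is easy to observe (a sketch under the same Legendre setup as above, not from the original text):

```python
import numpy as np

def jacobi_legendre(n):
    # Truncated n x n Jacobi matrix for the orthonormal Legendre polynomials.
    b = [(k + 1)/np.sqrt((2*k + 1)*(2*k + 3)) for k in range(n - 1)]
    return np.diag(b, 1) + np.diag(b, -1)

n = 5
zeros_n = np.linalg.eigvalsh(jacobi_legendre(n))       # zeros of p_n
zeros_n1 = np.linalg.eigvalsh(jacobi_legendre(n + 1))  # zeros of p_{n+1}
# Between consecutive zeros of p_{n+1} lies exactly one zero of p_n:
print(all(zeros_n1[i] < zeros_n[i] < zeros_n1[i + 1] for i in range(n)))  # True
```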

For further theory about orthogonal polynomials, see [1].

5 Classical orthogonal polynomials

Many orthogonal polynomials are named after those who discovered them. This thesis will focus on three of these:

• Legendre polynomials

• Hermite polynomials

• Laguerre polynomials

The polynomials mentioned above are sometimes called classical orthogonal polynomials.

They are polynomials orthogonal with respect to different weight functions. It turns out that they are also solutions to differential equations.

5.1 Legendre polynomials

Consider the polynomials which are orthogonal on the interval $[-1, 1]$ with respect to the inner product
$$\langle f, g \rangle = \int_{-1}^{1} f(x)\, g(x)\, dx,$$
i.e. w.r.t. the weight function $\omega(x) = 1$. The Legendre polynomials $P_n(x)$ are defined by
$$p_n(x) = \sqrt{\frac{2n + 1}{2}}\, P_n(x), \tag{5.1}$$
where $p_n(x)$ denotes the orthonormal polynomial. It turns out that the Legendre polynomials are given by the formula
$$P_n(x) = \frac{1}{2^n n!} \frac{d^n}{dx^n} \left[ (x^2 - 1)^n \right]. \tag{5.2}$$

Equation (5.2) is called Rodrigues' formula. Using the binomial theorem on $(x^2 - 1)^n$ and differentiating the result $n$ times, it follows that
$$P_n(x) = \sum_{j=0}^{N} (-1)^j \frac{(2n - 2j)!}{2^n j! (n - j)! (n - 2j)!}\, x^{n - 2j}, \tag{5.3}$$
where $N = n/2$ if $n$ is even and $N = (n - 1)/2$ if $n$ is odd. With the help of equation (5.3), the first six Legendre polynomials are:
$$P_0(x) = 1, \qquad P_1(x) = x,$$
$$P_2(x) = \tfrac{1}{2}(3x^2 - 1), \qquad P_3(x) = \tfrac{1}{2}(5x^3 - 3x),$$
$$P_4(x) = \tfrac{1}{8}(35x^4 - 30x^2 + 3), \qquad P_5(x) = \tfrac{1}{8}(63x^5 - 70x^3 + 15x).$$
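This table is easy to reproduce (an illustration, not from the original text) by evaluating Rodrigues' formula (5.2) symbolically:

```python
from sympy import symbols, diff, factorial, expand

x = symbols('x')

def legendre_rodrigues(n):
    # Rodrigues' formula (5.2): P_n(x) = 1/(2^n n!) d^n/dx^n (x^2 - 1)^n.
    return expand(diff((x**2 - 1)**n, x, n) / (2**n * factorial(n)))

for n in range(6):
    print(n, legendre_rodrigues(n))
# 1, x, 3*x**2/2 - 1/2, 5*x**3/2 - 3*x/2, ...
```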


Theorem 4. The polynomials $p_n(x)$ defined by equations (5.1) and (5.2) are orthonormal, i.e. the Legendre polynomials satisfy
$$\langle P_n, P_m \rangle = \begin{cases} 0 & \text{if } m \neq n, \\ \dfrac{2}{2n+1} & \text{if } m = n. \end{cases}$$

Proof. For $m = n$, we must show that
$$\|P_n\| = \left[ \int_{-1}^{1} P_n^2(x)\, dx \right]^{1/2} = \sqrt{\frac{2}{2n + 1}}. \tag{5.4}$$
Let $u = x^2 - 1$. The function $u^n$ and its derivatives $(u^n)', \ldots, (u^n)^{(n-1)}$ are zero at $x = \pm 1$, and $(u^n)^{(2n)} = (2n)!$. Integrating by parts $n$ times, using equation (5.2), we get
$$(2^n n!)^2 \|P_n\|^2 = \int_{-1}^{1} (u^n)^{(n)} (u^n)^{(n)}\, dx = \left[ (u^n)^{(n-1)} (u^n)^{(n)} \right]_{-1}^{1} - \int_{-1}^{1} (u^n)^{(n-1)} (u^n)^{(n+1)}\, dx = \cdots = (-1)^n (2n)! \int_{-1}^{1} u^n\, dx = 2(2n)! \int_{0}^{1} (1 - x^2)^n\, dx.$$
Doing the substitution $x = \sin \tau$, we get
$$\int_{0}^{1} (1 - x^2)^n\, dx = \int_{0}^{\pi/2} \cos^{2n+1} \tau\, d\tau = \int_{0}^{\pi/2} \cos \tau \cdot \cos^{2n} \tau\, d\tau.$$
Let $I_n = \int_{0}^{\pi/2} \cos \tau \cdot \cos^{2n} \tau\, d\tau$. Clearly $I_0 = 1$. Doing integration by parts, we get
$$I_n = \left[ \sin \tau \cdot \cos^{2n} \tau \right]_{0}^{\pi/2} - \int_{0}^{\pi/2} \sin \tau \cdot 2n \cos^{2n-1} \tau\, (-\sin \tau)\, d\tau = 2n \int_{0}^{\pi/2} \cos^{2n-1} \tau\, (1 - \cos^2 \tau)\, d\tau = 2n \int_{0}^{\pi/2} \left( \cos^{2n-1} \tau - \cos^{2n+1} \tau \right) d\tau = 2n (I_{n-1} - I_n).$$
That is, $I_n = 2n (I_{n-1} - I_n)$, which gives
$$I_n = \frac{2n}{2n + 1}\, I_{n-1}. \tag{5.5}$$


Repeated use of equation (5.5) gives
$$I_n = \frac{2n}{2n+1} \cdot \frac{2(n-1)}{2(n-1)+1}\, I_{n-2} = \cdots = \frac{2n \cdot 2(n-1) \cdots 2}{(2n+1)(2n-1) \cdots 3}\, I_0.$$
Extending both numerator and denominator with the even factors $2n, 2n-2, \ldots, 2$, it follows that
$$(2^n n!)^2 \|P_n\|^2 = 2(2n)! \cdot \frac{2^{2n}\, n^2 (n-1)^2 \cdots 1^2}{(2n+1)(2n)!} = \frac{2^{2n+1} (n!)^2}{2n+1}.$$
Now dividing both sides by $(2^n n!)^2$ gives
$$\|P_n\|^2 = \frac{2}{2n + 1}.$$
To complete the proof it is enough to check that $\langle x^m, P_n \rangle = 0$ for $m < n$. But for $m < n$,

$$2^n n!\, \langle x^m, P_n \rangle = \int_{-1}^{1} x^m (u^n)^{(n)}\, dx = \left[ x^m (u^n)^{(n-1)} \right]_{-1}^{1} - m \int_{-1}^{1} x^{m-1} (u^n)^{(n-1)}\, dx = \cdots = (-1)^m m! \int_{-1}^{1} (u^n)^{(n-m)}\, dx = (-1)^m m! \left[ (u^n)^{(n-m-1)} \right]_{-1}^{1} = 0,$$
since $n - m - 1$ is an integer $k$ such that $0 \leq k \leq n - 1$.

The Legendre polynomials are known to be solutions of the Legendre differential equation
$$(1 - x^2) P_n'' - 2x P_n' + n(n + 1) P_n = 0. \tag{5.6}$$

5.2 Hermite polynomials

The second class of polynomials are the Hermite polynomials. They are orthogonal on $\mathbb{R}$ with respect to the inner product
$$\langle f, g \rangle = \int_{-\infty}^{\infty} f(x)\, g(x)\, e^{-x^2} dx, \tag{5.7}$$
i.e. with the weight function $\omega(x) = e^{-x^2}$.


The Hermite polynomials $H_n(x)$ are defined by
$$p_n(x) = \frac{1}{(2^n n! \sqrt{\pi})^{1/2}}\, H_n(x), \tag{5.8}$$
where $p_n(x)$ is the sequence of orthonormal polynomials. It can be shown that
$$H_n(x) = (-1)^n e^{x^2} \frac{d^n}{dx^n} \left( e^{-x^2} \right). \tag{5.9}$$
Carrying out the differentiations in equation (5.9), we get
$$H_n(x) = n! \sum_{j=0}^{N} \frac{(-1)^j\, 2^{n-2j}}{j!\, (n - 2j)!}\, x^{n-2j}, \tag{5.10}$$
where $N = n/2$ if $n$ is even and $N = (n - 1)/2$ if $n$ is odd. With the help of equation (5.10), the first six polynomials can be computed:
$$H_0(x) = 1, \qquad H_1(x) = 2x,$$
$$H_2(x) = 4x^2 - 2, \qquad H_3(x) = 8x^3 - 12x,$$
$$H_4(x) = 16x^4 - 48x^2 + 12, \qquad H_5(x) = 32x^5 - 160x^3 + 120x.$$

Theorem 5. The polynomials $p_n(x)$ defined by equations (5.8) and (5.9) are orthonormal, i.e. the Hermite polynomials satisfy
$$\int_{-\infty}^{\infty} e^{-x^2} H_m(x) H_n(x)\, dx = \begin{cases} 0 & \text{if } m \neq n, \\ 2^n n! \sqrt{\pi} & \text{if } m = n. \end{cases}$$

Proof. Equation (5.10) can also be written as
$$H_n(x) = \sum_{j=0}^{N} \frac{(-1)^j}{j!}\, n(n-1) \cdots (n - 2j + 1)\, (2x)^{n-2j} \tag{5.11}$$
for $n = 2, 3, 4, ....$ For $n \geq 1$, differentiating gives
$$H_n'(x) = 2n \sum_{j=0}^{M} \frac{(-1)^j}{j!}\, (n-1)(n-2) \cdots (n - 2j)\, (2x)^{n-1-2j} = 2n\, H_{n-1}(x), \tag{5.12}$$
where $M = (n-2)/2$ if $n$ is even and $M = (n-1)/2$ if $n$ is odd. To prove the theorem we apply equation (5.12) to $H_m$, and we assume that $m \leq n$. Note that, since $\omega(x) = e^{-x^2}$, it follows from equation (5.9) that
$$(-1)^n e^{-x^2} H_n(x) = \omega^{(n)}(x).$$


Integrating by parts $m$ times, we get
$$(-1)^n \int_{-\infty}^{\infty} e^{-x^2} H_m(x) H_n(x)\, dx = \int_{-\infty}^{\infty} H_m(x)\, \omega^{(n)}(x)\, dx = \left[ H_m(x)\, \omega^{(n-1)}(x) \right]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} 2m H_{m-1}(x)\, \omega^{(n-1)}(x)\, dx = -2m \int_{-\infty}^{\infty} H_{m-1}(x)\, \omega^{(n-1)}(x)\, dx = \cdots = (-1)^m 2^m m! \int_{-\infty}^{\infty} H_0(x)\, \omega^{(n-m)}(x)\, dx.$$
In the last row, $H_0(x) = 1$. If $m < n$, integrating once more, we obtain 0 since $\omega$ and its derivatives approach 0 as $x \to \pm\infty$. If $m = n$, the integral in the last row equals
$$\int_{-\infty}^{\infty} e^{-x^2} dx = \sqrt{\pi}.$$

There is a differential equation that the Hermite polynomials satisfy, called the Hermite differential equation:
$$H_n'' - 2x H_n' + 2n H_n = 0. \tag{5.13}$$
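As a numerical illustration (not part of the original text), the orthogonality relation in Theorem 5 can be checked with Gauss-Hermite quadrature, which integrates polynomials against the weight $e^{-x^2}$ exactly up to high degree:

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss, hermval
from math import factorial, sqrt, pi

xs, ws = hermgauss(30)  # nodes and weights for the weight e^{-x^2}

def H(n, x):
    # Physicists' Hermite polynomial H_n, evaluated from its coefficient vector.
    return hermval(x, [0]*n + [1])

for m, n in [(2, 3), (4, 4), (5, 5)]:
    integral = np.sum(ws * H(m, xs) * H(n, xs))
    expected = 0.0 if m != n else 2**n * factorial(n) * sqrt(pi)
    print(m, n, integral, expected)  # the two columns agree
```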

5.3 Laguerre polynomials

Consider the polynomials which are orthogonal on $[0, \infty)$ with respect to the inner product
$$\langle f, g \rangle = \int_{0}^{\infty} f(x)\, g(x)\, e^{-x} dx, \tag{5.14}$$
i.e. with the weight function $\omega(x) = e^{-x}$. The Laguerre polynomials $L_n(x)$ satisfy
$$p_n(x) = e^{-x/2} L_n(x), \tag{5.15}$$
where the functions $p_n(x)$ form an orthonormal sequence in $L^2([0, \infty), dx)$; equivalently, the polynomials $L_n$ are orthonormal with respect to the weight $\omega(x) = e^{-x}$. It can be shown that
$$L_n(x) = \frac{e^x}{n!} \frac{d^n}{dx^n} \left( x^n e^{-x} \right), \qquad n = 0, 1, 2, .... \tag{5.16}$$
Equation (5.16) can be expressed without the differentiation:
$$L_n(x) = \sum_{j=0}^{n} \frac{(-1)^j}{j!} \binom{n}{j} x^j. \tag{5.17}$$
Using the formula from equation (5.17), the first five polynomials can be calculated:
$$L_0(x) = 1, \qquad L_1(x) = -x + 1,$$
$$L_2(x) = \tfrac{1}{2}(x^2 - 4x + 2), \qquad L_3(x) = \tfrac{1}{6}(-x^3 + 9x^2 - 18x + 6),$$
$$L_4(x) = \tfrac{1}{24}(x^4 - 16x^3 + 72x^2 - 96x + 24).$$
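A quick numerical check of the orthonormality relation (an illustration, not from the original text):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import eval_laguerre

# Check that the integral of L_m(x) L_n(x) e^{-x} over [0, inf) is delta_{mn}.
for m in range(4):
    for n in range(4):
        val, _ = quad(lambda x: eval_laguerre(m, x)*eval_laguerre(n, x)*np.exp(-x),
                      0, np.inf)
        print(m, n, round(val, 8))
```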


The Laguerre polynomials are solutions of a differential equation, called the Laguerre differential equation:
$$x L_n'' + (1 - x) L_n' + n L_n = 0.$$

The above results concerning classical orthogonal polynomials can be found in [3].

6 Sturm-Liouville problems

Many mathematical and physical problems lead to a differential equation of the form
$$(p(x) v')' + q(x) v + \lambda r(x) v = 0, \qquad a < x < b. \tag{6.1}$$
In addition one has boundary conditions at the end points of the interval. The differential equation (6.1) together with boundary conditions is called a Sturm-Liouville eigenvalue problem, and a solution is called an eigenfunction with eigenvalue $\lambda$.

Example 7. Consider the Sturm-Liouville problem
$$\frac{d^2 v}{dx^2} + \lambda v = 0, \qquad 0 < x < L, \tag{6.2}$$
$$v(0) = v(L) = 0, \tag{6.3}$$
where $p = r = 1$ and $q = 0$. This boundary condition is called a Dirichlet boundary condition. The eigenfunctions and eigenvalues for this problem are
$$v_n(x) = \sin \frac{n\pi x}{L}, \qquad \lambda_n = \left( \frac{n\pi}{L} \right)^2, \qquad n = 1, 2, 3, ....$$

Example 8. Consider the Sturm-Liouville problem
$$\frac{d^2 v}{dx^2} + \lambda v = 0, \qquad 0 < x < L, \tag{6.4}$$
$$v'(0) = v'(L) = 0, \tag{6.5}$$
where $p = r = 1$ and $q = 0$. This boundary condition is called a Neumann boundary condition. The eigenfunctions and eigenvalues for this problem are
$$v_n(x) = \cos \frac{n\pi x}{L}, \qquad \lambda_n = \left( \frac{n\pi}{L} \right)^2, \qquad n = 0, 1, 2, ....$$

Example 9. Consider the Sturm-Liouville problem
$$\frac{d^2 v}{dx^2} + \lambda v = 0, \qquad 0 < x < L, \tag{6.6}$$
$$v(0) = v(L), \qquad v'(0) = v'(L), \tag{6.7}$$
where $p = r = 1$ and $q = 0$. This kind of boundary condition is called a periodic boundary condition. The eigenfunctions and eigenvalues for this problem are
$$v_n(x) = \alpha_n \cos\left( \frac{2n\pi x}{L} \right) + \beta_n \sin\left( \frac{2n\pi x}{L} \right), \qquad \lambda_n = \left( \frac{2n\pi}{L} \right)^2, \qquad n = 1, 2, 3, ....$$

The results above can all be shown by elementary computations using the fact that the eigenvalues must be real.
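For intuition (a numerical sketch, not from the original text), the Dirichlet eigenvalues of Example 7 can be approximated by the standard second-order finite-difference discretization of $-d^2/dx^2$ on $(0, L)$:

```python
import numpy as np

L, N = 1.0, 500  # interval length and number of interior grid points
h = L/(N + 1)
# Tridiagonal matrix for -d^2/dx^2 with v(0) = v(L) = 0.
A = (np.diag(2.0*np.ones(N)) - np.diag(np.ones(N - 1), 1)
     - np.diag(np.ones(N - 1), -1)) / h**2

print(np.linalg.eigvalsh(A)[:4])             # smallest four eigenvalues
print([(n*np.pi/L)**2 for n in range(1, 5)]) # exact values (n*pi/L)^2
```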


Example 10. The Bessel equation.

For $\nu \in [0, \infty)$, consider the differential equation
$$r^2 w''(r) + r w'(r) + (r^2 - \nu^2) w(r) = 0, \qquad r > 0. \tag{6.8}$$
The equation above is called the Bessel equation of order $\nu$.

Divide (6.8) by $r$ and put $v(x) = w(r)$, where $x = r/\sqrt{\lambda}$. Then
$$(x v'(x))' + \left( \lambda x - \frac{\nu^2}{x} \right) v(x) = 0. \tag{6.9}$$
This equation has the form of the Sturm-Liouville equation (6.1) with $p(x) = r(x) = x$ and $q(x) = -\nu^2/x$. The bounded solution of equation (6.8), denoted $J_\nu$, is called the Bessel function (of the first kind) of order $\nu$.
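The function $J_\nu$ is available in scipy, and one can verify numerically (a sanity check, not from the original text) that it satisfies equation (6.8), using scipy.special.jvp for the derivatives:

```python
import numpy as np
from scipy.special import jv, jvp

nu = 2.5
r = np.linspace(0.5, 10, 5)
# Residual of r^2 w'' + r w' + (r^2 - nu^2) w = 0 with w = J_nu.
residual = r**2*jvp(nu, r, 2) + r*jvp(nu, r, 1) + (r**2 - nu**2)*jv(nu, r)
print(np.max(np.abs(residual)))  # of the order of machine precision
```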

Theorem 6. Consider the Sturm-Liouville equation
$$(p(x) v')' + q(x) v + \lambda r(x) v = 0, \qquad a < x < b, \tag{6.10}$$
together with Dirichlet, Neumann or periodic boundary conditions. Then the eigenfunctions corresponding to different eigenvalues are orthogonal with respect to the inner product
$$\langle u, v \rangle_r = \int_a^b u(x)\, v(x)\, r(x)\, dx.$$

Proof. Let us write
$$Lv := (p v')' + q v. \tag{6.11}$$
Using the product rule, we get
$$u L v - v L u = u (p v')' + u q v - v (p u')' - v q u = (u p v')' - u' p v' - (v p u')' + u' p v'. \tag{6.12}$$
Equation (6.12) can be simplified to
$$u L v - v L u = \left[ p (u v' - v u') \right]'. \tag{6.13}$$
Equation (6.13) is called the Lagrange identity. Integrating equation (6.13) over the interval $[a, b]$, we get the following equation, which is called Green's formula:
$$\int_a^b (u L v - v L u)\, dx = \left[ p (u v' - v u') \right]_a^b. \tag{6.14}$$
Assuming that $u$ and $v$ satisfy the boundary conditions, then
$$\left[ p (u v' - v u') \right]_a^b = 0, \tag{6.15}$$
which leads to
$$\int_a^b (u L v - v L u)\, dx = 0. \tag{6.16}$$


Now assume that $v_n$ and $v_m$ are two eigenfunctions with two different eigenvalues, $\lambda_n$ and $\lambda_m$. Then
$$-L v_n = \lambda_n r v_n, \tag{6.17}$$
$$-L v_m = \lambda_m r v_m. \tag{6.18}$$
If we multiply equation (6.17) by $v_m$ and equation (6.18) by $v_n$, integrate both over the interval $[a, b]$ and take the difference, we obtain the following expression:
$$-\int_a^b (v_m L v_n - v_n L v_m)\, dx = (\lambda_n - \lambda_m) \int_a^b v_n v_m r\, dx. \tag{6.19}$$
Applying Green's formula, we get that
$$(\lambda_n - \lambda_m) \int_a^b v_n v_m r\, dx = 0. \tag{6.20}$$
Since $\lambda_n \neq \lambda_m$ it follows that
$$\langle v_n, v_m \rangle_r = \int_a^b v_n v_m r\, dx = 0, \tag{6.21}$$
proving orthogonality.

For further theory about Sturm-Liouville problems, see [4].

The three classical orthogonal polynomials were all solutions to differential equations:

• The Legendre differential equation
$$(1 - x^2) v'' - 2x v' + \lambda v = 0. \tag{6.22}$$

• The Hermite differential equation
$$v'' - 2x v' + \lambda v = 0. \tag{6.23}$$

• The Laguerre differential equation
$$x v'' + (1 - x) v' + \lambda v = 0. \tag{6.24}$$

They can all be written in the form of a Sturm-Liouville equation for different choices of $p$, $q$ and $r$.

The Legendre differential equation can, clearly, be written in the form (6.1), with $p = 1 - x^2$, $q = 0$ and $r = 1$. Here $[a, b] = [-1, 1]$.

Now consider the Hermite differential equation. After multiplying both sides of equation (6.23) by $e^{-x^2}$ we see that
$$e^{-x^2} v'' - 2x e^{-x^2} v' + \lambda e^{-x^2} v = 0,$$
or
$$(e^{-x^2} v')' + \lambda e^{-x^2} v = 0.$$
This gives us $p = e^{-x^2}$, $q = 0$ and $r = e^{-x^2}$. Here $(a, b) = (-\infty, \infty)$.

Finally, consider the Laguerre differential equation. Multiplying both sides of (6.24) by $e^{-x}$, we find
$$e^{-x} x\, v'' + e^{-x} (1 - x) v' + e^{-x} \lambda v = 0. \tag{6.25}$$
It is easy to see that this can be rewritten as
$$(e^{-x} x\, v')' + e^{-x} \lambda v = 0, \tag{6.26}$$
that is, $p = x e^{-x}$, $q = 0$ and $r = e^{-x}$. Here $[a, b) = [0, \infty)$.

It is interesting to note that the orthogonality of the classical orthogonal polynomials with respect to the weights indicated above can also be seen from the proof of Theorem 6. Indeed, formula (6.15) still holds true, even in the absence of boundary conditions: simply note that $p$ vanishes at the ends of the intervals (and, in case the intervals are unbounded, $p$ decays faster than the polynomials grow).

For more about the theory of Sturm-Liouville problems, see [4].

7 Applications

7.1 Eigenvalues of the Laplace operator

Let $\Omega$ be a bounded open subset of $\mathbb{R}^n$. The eigenvalue problem for the Dirichlet Laplace operator
$$\Delta = \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_n^2}$$
is then given by the equation
$$-\Delta u = \lambda u, \tag{7.1}$$
where $u$ is a function in $\Omega$ such that $u = 0$ on $\partial\Omega$.

One can show that the Dirichlet Laplace operator has discrete eigenvalues $(\lambda_j)_{j=1}^{\infty}$,
$$0 < \lambda_1 \leq \lambda_2 \leq \lambda_3 \leq ...,$$
accumulating at infinity. They are important in many applications. For example, consider the wave equation
$$\frac{\partial^2 v}{\partial t^2} - c^2 \Delta v = 0$$
for the amplitude $v(x, t)$ of a wave at the position $x = (x_1, ..., x_n)$ at time $t$. Separating variables by writing $v(x, t) = u(x) T(t)$, we find that
$$u\, T'' - c^2 \Delta u\, T = 0.$$
Dividing both sides by $c^2 u T$, it follows that
$$\frac{\Delta u}{u} = \frac{T''}{c^2 T}.$$
Both sides must be constant, i.e. independent of $(x, t)$, and denoting the constant by $-\lambda$ we find that $u$ must satisfy equation (7.1), whereas
$$T'' + \lambda c^2 T = 0. \tag{7.2}$$


The solutions of equation (7.2) describe periodic motion with frequency $\omega$, where $\omega^2 = \lambda c^2$. Note that the frequencies are determined by the eigenvalues of the Laplacian.

7.1.1 Eigenvalue problem in a disk

To find the frequencies generated by a circular drum we must find the eigenvalues of the Dirichlet Laplace operator in a disk. Let $\Omega$ be the disk defined by
$$\Omega = \{ 0 \leq r < a,\ 0 \leq \theta \leq 2\pi \}.$$
The eigenvalue problem (7.1), written in polar coordinates, becomes
$$u_{rr} + \frac{1}{r} u_r + \frac{1}{r^2} u_{\theta\theta} = -\lambda u, \tag{7.3}$$
where $0 < r < a$, $0 \leq \theta \leq 2\pi$ and $u(a, \theta) = 0$. To find the solutions we separate variables: $u(r, \theta) = R(r) \Theta(\theta)$. Then we get two Sturm-Liouville problems:
$$\Theta''(\theta) + \mu\, \Theta(\theta) = 0, \tag{7.4}$$
where $\Theta(0) = \Theta(2\pi)$, and
$$R''(r) + \frac{1}{r} R'(r) + \left( \lambda - \frac{\mu}{r^2} \right) R(r) = 0. \tag{7.5}$$

The solutions of the first problem are
$$\Theta_n(\theta) = A_n \cos n\theta + B_n \sin n\theta, \tag{7.6}$$
where $\mu_n = n^2$ for $n = 0, 1, 2, ....$ Using this, the second Sturm-Liouville problem (equation (7.5)) can be written as
$$R''(r) + \frac{1}{r} R'(r) + \left( \lambda - \frac{n^2}{r^2} \right) R(r) = 0. \tag{7.7}$$

Doing the change of variables $s = \sqrt{\lambda}\, r$ and $R(r) = \Psi(s)$, we get
$$R'(r) = \frac{d}{dr} R(r) = \frac{d}{dr} \Psi(\sqrt{\lambda}\, r) = \Psi'(\sqrt{\lambda}\, r) \cdot \sqrt{\lambda} \tag{7.8}$$
and
$$R''(r) = \Psi''(s) \cdot \lambda. \tag{7.9}$$
Therefore equation (7.7) becomes
$$\lambda\, \Psi''(s) + \frac{\sqrt{\lambda}}{s}\, \Psi'(s)\, \sqrt{\lambda} + \left( \lambda - \frac{n^2 \lambda}{s^2} \right) \Psi(s) = 0. \tag{7.10}$$
Dividing both sides by $\lambda$, it follows that
$$\Psi''(s) + \frac{1}{s}\, \Psi'(s) + \left( 1 - \frac{n^2}{s^2} \right) \Psi(s) = 0. \tag{7.11}$$


After multiplying both sides of equation (7.11) by $s^2$, we see that $\Psi$ satisfies the Bessel equation of order $n$; compare (6.8). Since $\Psi$ is bounded, we have that $\Psi(s) = J_n(s)$.

The function $R(r) = J_n(s) = J_n(\sqrt{\lambda}\, r)$ must vanish when $r = a$. If we let $(\alpha_{n,m})_{m=1}^{\infty}$ denote the zeros of the function $J_n$ on the positive real line, we see that the eigenvalues $\lambda$ are given by
$$\sqrt{\lambda}\, a = \alpha_{n,m}, \qquad \text{i.e.} \qquad \lambda = \frac{\alpha_{n,m}^2}{a^2}.$$
This implies that the frequencies that a circular drum can generate are given by
$$\omega = \sqrt{\lambda}\, c = \frac{\alpha_{n,m}}{a}\, c.$$

7.1.2 Eigenvalue problem in a ball

Imagine a solid ball of radius $a$. Let us introduce the spherical coordinates
$$x = r \sin\varphi \cos\theta, \qquad y = r \sin\varphi \sin\theta, \qquad z = r \cos\varphi,$$
see Figure 2 below.

Figure 2: Notation for a spherical coordinate system.

We define $B_a$ and $S^2$ by
$$B_a = \{ 0 \leq r < a,\ 0 \leq \varphi \leq \pi,\ 0 \leq \theta \leq 2\pi \} \qquad \text{and} \qquad S^2 = \{ 0 \leq \varphi \leq \pi,\ 0 \leq \theta \leq 2\pi \}.$$

The eigenvalue problem for the Dirichlet Laplace operator in $B_a$, written in spherical coordinates, becomes
$$\frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial u}{\partial r} \right) + \frac{1}{r^2} \left( \frac{1}{\sin\varphi} \frac{\partial}{\partial\varphi} \left( \sin\varphi\, \frac{\partial u}{\partial\varphi} \right) + \frac{1}{\sin^2\varphi} \frac{\partial^2 u}{\partial\theta^2} \right) = -\lambda u, \qquad (r, \varphi, \theta) \in B_a, \tag{7.12}$$
$$u(a, \varphi, \theta) = 0, \qquad (\varphi, \theta) \in S^2. \tag{7.13}$$


The function $u(r, \varphi, \theta)$ can be expressed, using the method of separation of variables, as $u(r, \varphi, \theta) = R(r) Y(\varphi, \theta)$. We obtain two eigenvalue problems: one over the unit sphere $S^2$ and one for the radial function $R(r)$. The eigenvalue problem over the unit sphere takes the form
$$\frac{1}{\sin\varphi} \frac{\partial}{\partial\varphi} \left( \sin\varphi\, \frac{\partial Y}{\partial\varphi} \right) + \frac{1}{\sin^2\varphi} \frac{\partial^2 Y}{\partial\theta^2} = -\mu Y, \qquad (\varphi, \theta) \in S^2, \tag{7.14}$$
with the boundary condition
$$Y(\varphi, 0) = Y(\varphi, 2\pi), \qquad Y_\theta(\varphi, 0) = Y_\theta(\varphi, 2\pi). \tag{7.15}$$
The second eigenvalue problem takes the form
$$\frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{dR}{dr} \right) = \left( \frac{\mu}{r^2} - \lambda \right) R, \qquad 0 < r < a, \tag{7.16}$$
with boundary condition $R(a) = 0$. The function $Y$ can be expressed, using the method of separation of variables, as
$$Y(\varphi, \theta) = \Phi(\varphi) \Theta(\theta).$$
Substituting this form of $Y$ into equation (7.14), we obtain
$$\Theta''(\theta) + \nu \Theta(\theta) = 0, \tag{7.17}$$
where $0 < \theta < 2\pi$, and
$$\sin\varphi\, \frac{d}{d\varphi} \left( \sin\varphi\, \frac{d\Phi(\varphi)}{d\varphi} \right) + (\mu \sin^2\varphi - \nu)\, \Phi(\varphi) = 0, \tag{7.18}$$
where $0 < \varphi < \pi$. The periodic boundary condition from equation (7.15) gives us the eigenvalues $\nu_m = m^2$ and eigenfunctions
$$\Theta_m(\theta) = A_m \cos m\theta + B_m \sin m\theta$$
for $m = 0, 1, 2, ....$ Substituting $\nu = m^2$ into equation (7.18) for $\Phi(\varphi)$ and doing the change of variables $x = \cos\varphi$, $P(x) = \Phi(\varphi(x))$, we get

$$(1 - x^2)\, \frac{d}{dx} \left( (1 - x^2)\, \frac{dP}{dx} \right) + \left( (1 - x^2)\mu - m^2 \right) P = 0, \tag{7.19}$$
where $-1 < x < 1$ and $m = 0, 1, 2, ....$ For $m = 0$ we obtain
$$\frac{d}{dx} \left( (1 - x^2)\, \frac{dP}{dx} \right) + \mu P = 0, \qquad -1 < x < 1, \tag{7.20}$$
i.e. the Legendre equation. It turns out that equation (7.20) has bounded solutions if and only if $\mu = n(n + 1)$ where $n = 0, 1, 2, ...$; they are the Legendre polynomials $P_n$. For $m > 0$, equation (7.19) is called the associated Legendre equation of order $m$. This equation has bounded solutions when $\mu = n(n + 1)$, given by
$$P_n^m(x) = (1 - x^2)^{m/2}\, \frac{d^m P_n}{dx^m}. \tag{7.21}$$
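The functions $P_n^m$ are implemented in scipy as scipy.special.lpmv, which uses the Condon-Shortley sign convention (an extra factor $(-1)^m$ compared to (7.21)). A quick check (an illustration, not from the original text) for $n = 2$, $m = 1$, where (7.21) gives $P_2^1(x) = 3x\sqrt{1 - x^2}$:

```python
import numpy as np
from scipy.special import lpmv

x = np.linspace(-0.9, 0.9, 5)
# Formula (7.21) with P_2(x) = (3x^2 - 1)/2 gives P_2^1(x) = 3x*sqrt(1 - x^2).
ours = 3*x*np.sqrt(1 - x**2)
# scipy's lpmv(m, n, x) includes the Condon-Shortley phase (-1)^m:
print(np.allclose(lpmv(1, 2, x), -ours))  # True
```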

References
