The Discrete Wavelet Transform


School of Education, Culture and Communication

Division of Mathematics and Physics

MASTER’S DEGREE PROJECT IN MATHEMATICS

The Discrete Wavelet Transform

by

Anton Wirén

MAA515 - Master’s Degree Project in Mathematics

DIVISION OF MATHEMATICS AND PHYSICS

MÄLARDALEN UNIVERSITY SE-721 23 VÄSTERÅS, SWEDEN


MAA515 — Master’s Degree Project in Mathematics

Project name:

The Discrete Wavelet Transform

Author: Anton Wirén

Version: 22nd June 2021

Supervisor: Sergei Silvestrov

Reviewer: Doghonay Arjmand

Examiner: Lars Hellström

Comprising: 30 ECTS credits


Abstract

In this thesis we explore the theory behind wavelets. The main focus is on the discrete wavelet transform; to reach this goal we also introduce the discrete Fourier transform, as it allows us to derive important properties related to wavelet theory, such as the multiresolution analysis. Based on the multiresolution analysis, we show how the discrete wavelet transform can be formulated and how it can be expressed in terms of a matrix. In later chapters we will see how the discrete wavelet transform can be generalized to two dimensions, and discover how it can be used in image processing.


Chapter 1 Introduction

1.1 Background

The word wavelet comes from the French word ondelette, which translates to ‘small wave’ and refers to the visual appearance of a wavelet function. More specifically, a wavelet is a function that is active only on a finite interval, that is, a function with compact support, and that can be used as the basis element in the wavelet transform. The most well-known example of a wavelet is the Haar wavelet, but in theory there are infinitely many wavelet functions, as the definition is very loose.

The existence of wavelet functions has been known for a long time; however, most of the development of wavelet theory took off in the 1980s, when some key results were discovered. One of the pioneers of wavelet theory was Stéphane Mallat. He developed the idea of a multiresolution analysis (MRA), which is the foundation of the discrete wavelet transform (DWT), and published his discoveries in the paper [1] in 1989. In this paper he also explained the theory behind wavelets and how it can be applied to image processing. Mallat’s paper [1] was one of the inspirations for this thesis, and I highly recommend it to anyone interested in this topic.

1.2 Approach

The aim of this thesis is to understand mathematically what the discrete wavelet transform is, where it comes from, and how it works. To do this we will go through wavelet theory and multiresolution analysis, as it is the foundation of the DWT, and also cover some Fourier transform theory and show how it is connected to the DWT. Once the theory behind the DWT is derived, we will see how easily the two-dimensional DWT is obtained and what happens when the transform is applied to an image. This thesis is in parts very theoretical, and it is assumed that the reader has background knowledge in calculus and linear algebra at university level.

In chapter 2 we define the discrete Fourier transform (DFT) and look at some of its properties. The main goal of this chapter is to understand how the DFT can be calculated, how it can be represented as a matrix using the Fourier matrix, and how it can be calculated quickly using the so-called fast Fourier transform.

In chapter 3, the main focus is on the theory of multiresolution analysis (MRA) and its implications. Here we introduce the scaling and wavelet functions, which are the fundamental building blocks of wavelet theory. We discuss the idea of a scaling filter and how it is related to the wavelet filter by the so-called quadrature mirror filter. In addition, we go through the general formula for constructing the scaling and wavelet filters.

In chapter 4 we introduce the discrete wavelet transform (DWT). Based on the results from previous chapters we derive the DWT and show how this transform can be described in terms of the so-called wavelet matrix. The wavelet matrix needs to satisfy some properties, which we will show. In addition, we discover how the wavelet matrix can be related to the Fourier matrix.

In chapter 5 we’ll see how the two dimensional discrete wavelet transform can be performed by using a separable basis. Here we also define the matrix formulation for the two dimensional DWT, which is a simple task to do once we know the one dimensional DWT formulation. In this chapter we will also discover how the DWT can be applied to images, and how it can be used as a tool for image processing.


Chapter 2 The Discrete Fourier Transform

The discrete Fourier transform (DFT) is a commonly used tool in signal processing that allows us to find the spectral decomposition of a finite discrete signal or, in other words, the frequency content of the signal. In this chapter we will see how the DFT can be derived and show some interesting properties. The DFT will also be used in later chapters to derive some properties connected to the discrete wavelet transform.

2.1 Basic Properties

The discrete signal will be denoted by the column vector $z$, where $z(n)$ denotes the $n$-th component of $z$. The length of $z$ is $N$, where $n \in \{0, 1, \dots, N-1\}$, thus

$$z = (z(0), z(1), \dots, z(N-1))^{\top}.$$

The discrete data in $z$ is assumed to be periodic, and thus repeated infinitely. What we mean is that

$$z(n + jN) = z(n), \quad \forall n, j \in \mathbb{Z}.$$

Basically, adding an integer multiple of $N$ to the index gives the same value. We also require that the squared sum of the elements of $z$ is finite so that the signal is well defined; for a vector of finite length $N$ this holds automatically. A signal with infinite squared sum cannot be represented by a DFT, so this is a necessary condition. Now define the space [2]

$$\ell^2(\mathbb{Z}_N) = \left\{ z = (z(0), z(1), \dots, z(N-1))^{\top} : z(n) \in \mathbb{C},\ 0 \le n \le N-1 \right\}$$

so that $\ell^2(\mathbb{Z}_N)$ can be viewed as an $N$-dimensional vector space over $\mathbb{C}$. The inner product of two vectors $z, w \in \ell^2(\mathbb{Z}_N)$ is given by [2]

$$\langle z, w \rangle = \sum_{n=0}^{N-1} z(n)\,\overline{w(n)}$$

where $z$ is orthogonal to $w$ if and only if $\langle z, w \rangle = 0$. The norm of a vector $z$ is given by [2]

$$\|z\| = \left( \sum_{n=0}^{N-1} |z(n)|^2 \right)^{1/2}.$$

Lemma 2.1.1. If we define [2]

$$\tilde{f}_k(n) = \frac{1}{\sqrt{N}}\, e^{2\pi i k n/N}, \quad 0 \le k, n \le N-1$$

then the family of functions $\{\tilde{f}_0, \tilde{f}_1, \dots, \tilde{f}_{N-1}\}$ forms an orthonormal basis for $\ell^2(\mathbb{Z}_N)$.

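Proof. If $j, k \in \{0, 1, \dots, N-1\}$ then [2]

$$\begin{aligned}
\langle \tilde{f}_j, \tilde{f}_k \rangle &= \sum_{n=0}^{N-1} \tilde{f}_j(n)\,\overline{\tilde{f}_k(n)} = \sum_{n=0}^{N-1} \frac{1}{\sqrt{N}} e^{2\pi i j n/N}\, \overline{\frac{1}{\sqrt{N}} e^{2\pi i k n/N}} \\
&= \frac{1}{N} \sum_{n=0}^{N-1} e^{2\pi i j n/N} e^{-2\pi i k n/N} = \frac{1}{N} \sum_{n=0}^{N-1} e^{2\pi i (j-k) n/N} = \frac{1}{N} \sum_{n=0}^{N-1} \left( e^{2\pi i (j-k)/N} \right)^n .
\end{aligned}$$

Here we note that if $j = k$ then $e^{2\pi i (j-k)/N} = 1$, so that $\frac{1}{N}\sum_{n=0}^{N-1} 1 = 1$, which proves that $\|\tilde{f}_j\|^2 = 1$ and thus that $\tilde{f}_j$ is normalized. To prove the orthogonality condition $\langle \tilde{f}_j, \tilde{f}_k \rangle = 0$ for $j \ne k$, we note that since $-N+1 \le j-k \le N-1$ and $j - k \ne 0$, we have $e^{2\pi i (j-k)/N} \ne 1$, hence the sum is a finite geometric series [2]

$$\frac{1}{N} \sum_{n=0}^{N-1} \left( e^{2\pi i (j-k)/N} \right)^n = \frac{1}{N} \cdot \frac{1 - \left( e^{2\pi i (j-k)/N} \right)^N}{1 - e^{2\pi i (j-k)/N}}$$

and since $\left( e^{2\pi i (j-k)/N} \right)^N = e^{2\pi i (j-k)} = 1$ while the denominator is non-zero, the above expression must be zero, proving that $\tilde{f}_j$ is orthogonal to $\tilde{f}_k$ for $j \ne k$. We conclude that $\{\tilde{f}_0, \tilde{f}_1, \dots, \tilde{f}_{N-1}\}$ is an orthonormal set, and since it consists of $N$ vectors in the $N$-dimensional space $\ell^2(\mathbb{Z}_N)$, it is an orthonormal basis.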

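Because $\{\tilde{f}_0, \tilde{f}_1, \dots, \tilde{f}_{N-1}\}$ is an orthonormal basis for $\ell^2(\mathbb{Z}_N)$ we can write any vector $z \in \ell^2(\mathbb{Z}_N)$ as [2]

$$z = \sum_{k=0}^{N-1} \langle z, \tilde{f}_k \rangle \tilde{f}_k \tag{2.1.1}$$

where

$$\langle z, \tilde{f}_k \rangle = \sum_{n=0}^{N-1} z(n)\, \overline{\frac{1}{\sqrt{N}} e^{2\pi i k n/N}} = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} z(n) e^{-2\pi i k n/N}$$

by definition of the inner product. To follow [2] we define $\hat{z}(k) = \sqrt{N} \langle z, \tilde{f}_k \rangle$ and note that for any $k \in \mathbb{Z}$ we have

$$\hat{z}(k + N) = \sum_{n=0}^{N-1} z(n) e^{-2\pi i (k+N) n/N} = \sum_{n=0}^{N-1} z(n) e^{-2\pi i k n/N} e^{-2\pi i n} = \hat{z}(k)$$

where we used Euler’s identity in the last step, i.e. $e^{-2\pi i n} = 1$ for all $n \in \mathbb{Z}$. This proves that $\hat{z}$ is $N$-periodic just like the original vector $z$, hence it also belongs to $\ell^2(\mathbb{Z}_N)$.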

Definition 2.1.1. The Discrete Fourier Transform (DFT) is the map

$$\hat{z}(k) = \sum_{n=0}^{N-1} z(n) e^{-2\pi i k n/N}, \quad k = 0, 1, \dots, N-1.$$

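Theorem 2.1.1. Suppose $z, w \in \ell^2(\mathbb{Z}_N)$; then the following holds.

(i) Fourier inversion:

$$z(n) = \frac{1}{N} \sum_{k=0}^{N-1} \hat{z}(k) e^{2\pi i k n/N}, \quad n = 0, 1, \dots, N-1 \tag{2.1.2}$$

(ii) Parseval’s relation: $\langle z, w \rangle = \frac{1}{N} \langle \hat{z}, \hat{w} \rangle$

(iii) Plancherel’s formula: $\|z\|^2 = \frac{1}{N} \|\hat{z}\|^2$.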

Proof. The inversion formula can be confirmed by noting that

$$z(n) = \sum_{k=0}^{N-1} \langle z, \tilde{f}_k \rangle \tilde{f}_k(n) = \sum_{k=0}^{N-1} \frac{1}{\sqrt{N}} \hat{z}(k) \frac{1}{\sqrt{N}} e^{2\pi i k n/N} = \frac{1}{N} \sum_{k=0}^{N-1} \hat{z}(k) e^{2\pi i k n/N}.$$

Parseval’s relation and Plancherel’s formula are proved in a similar way.

The inverse Fourier transform can be obtained by simply applying the inversion formula for every $n$. To get a better understanding of how the inversion formula can be interpreted we define [2]

$$f_k(n) = \frac{1}{\sqrt{N}} \tilde{f}_k(n) = \frac{1}{N} e^{2\pi i k n/N}, \quad k, n = 0, 1, \dots, N-1.$$

This is just a scaled version of $\tilde{f}_k$ and thus $f_k \in \ell^2(\mathbb{Z}_N)$. Note that $\|f_k\|^2 = 1/N$, hence the basis elements $f_k$ are not normalized. However, the orthogonality property still holds, so that $\{f_0, f_1, \dots, f_{N-1}\}$ is an orthogonal basis for $\ell^2(\mathbb{Z}_N)$. We will refer to this as the Fourier basis. From (2.1.1) we see that

$$z = \sum_{k=0}^{N-1} \langle z, \tilde{f}_k \rangle \tilde{f}_k = \sum_{k=0}^{N-1} \sqrt{N} \langle z, \tilde{f}_k \rangle f_k = \sum_{k=0}^{N-1} \hat{z}(k) f_k$$

which shows that the coefficient of $f_k$ is $\hat{z}(k)$ when expanding $z$ in terms of the Fourier basis.
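The definitions and identities of this section can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `dft`, `idft` and `inner` are chosen here for illustration and do not come from [2]. It evaluates the DFT directly from its definition with the kernel $e^{-2\pi i k n/N}$, and then tests Fourier inversion, Parseval’s relation and Plancherel’s formula on random vectors.

```python
import numpy as np

def dft(z):
    """DFT from the definition: z_hat(k) = sum_n z(n) * exp(-2*pi*i*k*n/N)."""
    z = np.asarray(z, dtype=complex)
    N = len(z)
    n = np.arange(N)
    k = n.reshape(-1, 1)
    return np.exp(-2j * np.pi * k * n / N) @ z

def idft(z_hat):
    """Fourier inversion (2.1.2): z(n) = (1/N) * sum_k z_hat(k) * exp(2*pi*i*k*n/N)."""
    z_hat = np.asarray(z_hat, dtype=complex)
    N = len(z_hat)
    k = np.arange(N)
    n = k.reshape(-1, 1)
    return np.exp(2j * np.pi * n * k / N) @ z_hat / N

def inner(z, w):
    """Inner product on l^2(Z_N): sum_n z(n) * conj(w(n))."""
    return np.sum(np.asarray(z) * np.conj(np.asarray(w)))

# Random complex test vectors (hypothetical data, only used for the check).
rng = np.random.default_rng(0)
N = 8
z = rng.normal(size=N) + 1j * rng.normal(size=N)
w = rng.normal(size=N) + 1j * rng.normal(size=N)
z_hat, w_hat = dft(z), dft(w)

print(np.allclose(idft(z_hat), z))                         # Fourier inversion
print(np.isclose(inner(z, w), inner(z_hat, w_hat) / N))    # Parseval's relation
print(np.isclose(inner(z, z), inner(z_hat, z_hat) / N))    # Plancherel's formula
```

The direct evaluation costs on the order of $N^2$ operations, so it is only meant as a check of the formulas, not as an efficient implementation.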


2.2 Matrix Formulation

The matrix formulation of the discrete Fourier transform allows us to obtain the Fourier matrix. This is one of the most important matrices used in engineering mathematics, and later we will see how it is connected to wavelet theory. To derive the matrix formulation we first define

$$\omega^{kn} = e^{-2\pi i k n/N}$$

so that for any $k \in \{0, 1, \dots, N-1\}$ [2]

$$\hat{z}(k) = \sum_{n=0}^{N-1} z(n)\, \omega^{kn} = \begin{pmatrix} 1 & \omega^{k} & \omega^{2k} & \omega^{3k} & \cdots & \omega^{k(N-1)} \end{pmatrix} \begin{pmatrix} z(0) \\ z(1) \\ \vdots \\ z(N-1) \end{pmatrix}$$

by definition of the product of a row vector and a column vector. We can generalize the above equation to all $k \in \{0, 1, \dots, N-1\}$ by defining the $N \times N$ matrix

$$F_N = \begin{pmatrix}
1 & 1 & 1 & 1 & \cdots & 1 \\
1 & \omega & \omega^2 & \omega^3 & \cdots & \omega^{N-1} \\
1 & \omega^2 & \omega^4 & \omega^6 & \cdots & \omega^{2(N-1)} \\
1 & \omega^3 & \omega^6 & \omega^9 & \cdots & \omega^{3(N-1)} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
1 & \omega^{N-1} & \omega^{2(N-1)} & \omega^{3(N-1)} & \cdots & \omega^{(N-1)(N-1)}
\end{pmatrix}$$

so that the coefficients $\hat{z} = (\hat{z}(0), \hat{z}(1), \dots, \hat{z}(N-1))^{\top}$ can be obtained by calculating

$$\hat{z} = F_N z.$$

The matrix $F_N$ will be referred to as the Fourier matrix. Observe that each element of $F_N$ is a root of unity and that the size of this matrix depends on the number of elements in $z$. In the case when $N = 4$,

$$F_4 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -i & -1 & i \\ 1 & -1 & 1 & -1 \\ 1 & i & -1 & -i \end{pmatrix}.$$

Example 2.2.1. Given $z = (4, -7, 0, 1)^{\top} \in \ell^2(\mathbb{Z}_4)$ we find $\hat{z}$ by calculating

$$\hat{z} = F_4 z = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -i & -1 & i \\ 1 & -1 & 1 & -1 \\ 1 & i & -1 & -i \end{pmatrix} \begin{pmatrix} 4 \\ -7 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -2 \\ 4 + 8i \\ 10 \\ 4 - 8i \end{pmatrix}.$$
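The calculation in Example 2.2.1 can be reproduced in a few lines. This is a small sketch, assuming NumPy; the function name `fourier_matrix` is an illustrative choice. It builds $F_N$ entry by entry from $\omega = e^{-2\pi i/N}$ and applies it to the vector of the example.

```python
import numpy as np

def fourier_matrix(N):
    """Fourier matrix F_N whose (k, n) entry is omega**(k*n), omega = exp(-2*pi*i/N)."""
    omega = np.exp(-2j * np.pi / N)
    k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    return omega ** (k * n)

F4 = fourier_matrix(4)
z = np.array([4, -7, 0, 1])
z_hat = F4 @ z

# Rounding only removes floating point noise of the order 1e-16.
print(np.round(F4, 12))
print(np.round(z_hat, 12))
```

Up to rounding, the printed vector agrees with $(-2,\ 4+8i,\ 10,\ 4-8i)^{\top}$ from the example.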


We see that the DFT is a linear transformation, and by the Fourier inversion formula in Theorem 2.1.1 the matrix $F_N$ has an inverse, so that the Fourier transform is invertible. We will now take a moment to focus on the inverse of this transform and its properties.

Definition 2.2.1. The inverse discrete Fourier transform (IDFT) is given by

$$\check{z}(n) = \frac{1}{N} \sum_{k=0}^{N-1} z(k) e^{2\pi i k n/N}, \quad n = 0, 1, \dots, N-1.$$

In terms of $\omega$ the inverse is given by [2]

$$\check{z}(n) = \sum_{k=0}^{N-1} z(k) \frac{1}{N} \omega^{-kn} = \sum_{k=0}^{N-1} \frac{1}{N}\, \overline{\omega^{kn}}\, z(k).$$

In this form we see that the $(n, k)$-th element of the inverse Fourier matrix $F_N^{-1}$ is equal to $\omega^{-kn}/N$. Basically, to find the inverse Fourier matrix we simply conjugate each element of $F_N$ and multiply by $1/N$. In matrix form we may write this as

$$F_N^{-1} = \frac{1}{N}\, \overline{F_N}.$$

In the case when $N = 4$ we obtain the inverse Fourier matrix

$$F_4^{-1} = \frac{1}{4} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & i & -1 & -i \\ 1 & -1 & 1 & -1 \\ 1 & -i & -1 & i \end{pmatrix}$$

and it can be confirmed that $F_4^{-1} F_4 = I$, so that this is indeed the inverse.
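The identity $F_N^{-1} = \frac{1}{N}\overline{F_N}$ can also be confirmed numerically for several sizes. The sketch below assumes NumPy; `fourier_matrix` and `inverse_fourier_matrix` are illustrative names, and the comparison with `np.linalg.inv` is only a cross-check.

```python
import numpy as np

def fourier_matrix(N):
    """Fourier matrix F_N with entries omega**(k*n), omega = exp(-2*pi*i/N)."""
    omega = np.exp(-2j * np.pi / N)
    k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    return omega ** (k * n)

def inverse_fourier_matrix(N):
    """Inverse Fourier matrix: conjugate every entry of F_N and divide by N."""
    return np.conj(fourier_matrix(N)) / N

for N in (4, 8, 16):
    F = fourier_matrix(N)
    F_inv = inverse_fourier_matrix(N)
    # F_inv @ F should be the identity, and F_inv should match a general matrix inverse.
    print(N, np.allclose(F_inv @ F, np.eye(N)), np.allclose(F_inv, np.linalg.inv(F)))
```

The conjugate formula needs no linear solve at all, which is the practical advantage of knowing the inverse in closed form.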

Example 2.2.2. Given $u = (-2,\ 4 + 8i,\ 10,\ 4 - 8i)^{\top} \in \ell^2(\mathbb{Z}_4)$ we find $\check{u}$ by calculating

$$\check{u} = F_4^{-1} u = \frac{1}{4} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & i & -1 & -i \\ 1 & -1 & 1 & -1 \\ 1 & -i & -1 & i \end{pmatrix} \begin{pmatrix} -2 \\ 4 + 8i \\ 10 \\ 4 - 8i \end{pmatrix} = \begin{pmatrix} 4 \\ -7 \\ 0 \\ 1 \end{pmatrix}.$$

Here we see that this result agrees with Example 2.2.1, so that we indeed obtained the inverse.

We will now take a moment to explore the relation between the Fourier coefficients $\hat{z}(k)$ and $\hat{z}(k + N/2)$. To see how these are related we begin by assuming that $N$ is divisible by two. This means that for any $k = 0, 1, \dots, N-1$ we can split $\hat{z}(k)$ into even and odd parts, more specifically [2]

$$\begin{aligned}
\hat{z}(k) &= \sum_{n=0}^{N-1} z(n) e^{-2\pi i k n/N} \\
&= \sum_{n=0}^{N/2-1} z(2n) e^{-2\pi i k (2n)/N} + \sum_{n=0}^{N/2-1} z(2n+1) e^{-2\pi i k (2n+1)/N} \\
&= \sum_{n=0}^{N/2-1} z(2n) e^{-2\pi i k n/(N/2)} + e^{-2\pi i k/N} \sum_{n=0}^{N/2-1} z(2n+1) e^{-2\pi i k n/(N/2)}
\end{aligned} \tag{2.2.1}$$

and if we do the same thing with $\hat{z}(k + N/2)$ and use that $e^{-2\pi i k n/(N/2)}$ is $N/2$-periodic in $k$, then

$$\begin{aligned}
\hat{z}\!\left(k + \tfrac{N}{2}\right) &= \sum_{n=0}^{N/2-1} z(2n) e^{-2\pi i (k+N/2) n/(N/2)} + e^{-2\pi i (k+N/2)/N} \sum_{n=0}^{N/2-1} z(2n+1) e^{-2\pi i (k+N/2) n/(N/2)} \\
&= \sum_{n=0}^{N/2-1} z(2n) e^{-2\pi i k n/(N/2)} + e^{-2\pi i k/N} e^{-2\pi i (N/2)/N} \sum_{n=0}^{N/2-1} z(2n+1) e^{-2\pi i k n/(N/2)} \\
&= \sum_{n=0}^{N/2-1} z(2n) e^{-2\pi i k n/(N/2)} - e^{-2\pi i k/N} \sum_{n=0}^{N/2-1} z(2n+1) e^{-2\pi i k n/(N/2)}
\end{aligned} \tag{2.2.2}$$

where we used $e^{-\pi i} = -1$ in the last equality. From these formulas it is clear that the only difference between the calculation of $\hat{z}(k)$ and $\hat{z}(k + N/2)$ is the sign of the sum over the odd indices. Basically, once we have calculated $\hat{z}(k)$ it is simple to calculate $\hat{z}(k + N/2)$. This means that we only have to evaluate the two half-length sums for $k = 0, 1, \dots, N/2 - 1$ to obtain every Fourier coefficient. This is the method used in the fast Fourier transform.

Example 2.2.3. Given $z = (4, -7, 0, 1)^{\top}$, thus $N = 4$, we use (2.2.1) to obtain

$$\hat{z}(1) = \underbrace{4 \cdot 1 + 0 \cdot (-1)}_{\text{even } n} + \underbrace{(-7) \cdot (-i) + 1 \cdot i}_{\text{odd } n} = 4 + 8i.$$

In this case $N/2 = 2$ and by (2.2.2) we obtain $\hat{z}(3)$ by changing the sign of the odd-$n$ terms in the calculation of $\hat{z}(1)$, which gives $\hat{z}(3) = 4 - 8i$. Note that this agrees with the result obtained in Example 2.2.1.
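Applying the even/odd split of (2.2.1) and (2.2.2) recursively gives a radix-2 fast Fourier transform. The following is a minimal recursive sketch, assuming NumPy and a signal whose length is a power of two; the name `fft_recursive` is chosen here for illustration, and production code would normally call an optimized library routine instead.

```python
import numpy as np

def fft_recursive(z):
    """Radix-2 FFT built on (2.2.1) and (2.2.2); len(z) must be a power of two."""
    z = np.asarray(z, dtype=complex)
    N = len(z)
    if N == 1:
        return z.copy()
    if N % 2:
        raise ValueError("signal length must be a power of two")
    even = fft_recursive(z[0::2])   # half-length DFT of z(0), z(2), ...
    odd = fft_recursive(z[1::2])    # half-length DFT of z(1), z(3), ...
    twiddle = np.exp(-2j * np.pi * np.arange(N // 2) / N)   # factor e^{-2 pi i k / N}
    # (2.2.1) gives the first half of the coefficients, (2.2.2) the second half.
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

z = np.array([4, -7, 0, 1])
print(np.round(fft_recursive(z), 12))
```

Each level of the recursion halves the problem size, which is where the familiar $N \log_2 N$ operation count comes from, compared with $N^2$ for the matrix product $F_N z$.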


Chapter 3 Multiresolution Analysis

The multiresolution analysis is the structure on which the discrete wavelet transform is built. In this chapter we describe what the multiresolution analysis is and how it is derived. From this we will also be able to prove the properties of the so-called scaling and wavelet functions, which are the building blocks of the discrete wavelet transform.

3.1 Translation & Dilation Operators

We begin by defining the unit translation operator as

$$T f(t) = f(t - 1).$$

This operator translates the function $f$ by one unit to the right and is a linear operator. We see that the operator $T$ satisfies

$$\langle T f, T g \rangle = \int_{-\infty}^{\infty} f(t-1)\, \overline{g(t-1)}\, dt = \int_{-\infty}^{\infty} f(\xi)\, \overline{g(\xi)}\, d\xi = \langle f, g \rangle$$

when using the substitution $t - 1 = \xi$. This shows that the unit translation operator $T$ preserves the inner product. In fact this operator is also unitary, which means that $T^{-1} = T^{*}$.

Now let $\phi(t)$ be some function that satisfies [3]

$$\langle T^n \phi, T^m \phi \rangle = \langle T^{n-m} \phi, \phi \rangle = \delta_{n,m}, \quad \text{where } \delta_{n,m} = \begin{cases} 1 & \text{if } n = m, \\ 0 & \text{otherwise,} \end{cases}$$

so that the integer translates of $\phi$ are orthonormal [3], thus

$$\{\phi(t - n) : n \in \mathbb{Z}\} \tag{3.1.1}$$

forms an orthonormal set. Functions of this type will be referred to as translation functions. The simplest example of such a function is the Haar scaling function, which is defined as

$$\phi(t) = \begin{cases} 0, & t \le 0 \\ 1, & 0 < t \le 1 \\ 0, & t > 1. \end{cases} \tag{3.1.2}$$

A straightforward calculation shows that (3.1.1) is satisfied, so that the integer translates of the Haar scaling function form an orthonormal set. The general idea of a scaling function is that its integer translates (3.1.1) span a subspace of $L^2(\mathbb{R})$. This subspace will be referred to as $\mathcal{V}_0$, where $\mathcal{V}_0 \subset L^2(\mathbb{R})$, and is formally defined as [4]

$$\mathcal{V}_0 = \left\{ f(t) = \sum_{n=-\infty}^{\infty} c_n T^n \phi(t) : \|f\|^2 = \sum_{n=-\infty}^{\infty} |c_n|^2 < \infty \right\}.$$

If we use the Haar scaling function we see that $\mathcal{V}_0$ is the space of square integrable functions that are piecewise constant on each unit interval [3]. We observe that there exist infinitely many functions belonging to the $\mathcal{V}_0$ space; however, compared to $L^2(\mathbb{R})$ it is very small.

In combination with the translation operator $T$, we also want to define a dilation operator $D : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ as

$$D f(t) = \sqrt{2}\, f(2t).$$

For example, if the dilation operator is applied to the Haar scaling function defined in (3.1.2), we obtain

$$D \phi(t) = \begin{cases} 0, & t \le 0 \\ \sqrt{2}, & 0 < t \le \tfrac{1}{2} \\ 0, & t > \tfrac{1}{2} \end{cases}$$

so that the support of the function is halved and the constant is enlarged by a factor of $\sqrt{2}$. It can be confirmed that the dilation operator preserves the norm of the Haar scaling function. In general we can confirm this by calculating

$$\langle D f, D g \rangle = \int_{-\infty}^{\infty} 2 f(2t)\, \overline{g(2t)}\, dt = \int_{-\infty}^{\infty} f(\xi)\, \overline{g(\xi)}\, d\xi = \langle f, g \rangle$$

proving that the dilation operator preserves the inner product. Note that we used the substitution $\xi = 2t$. This is also a unitary operator, so that $D^{*} = D^{-1}$. It is also possible to combine $n$ translations with $m$ dilations to obtain the function

$$D^m T^n \phi(t) = 2^{m/2} \phi(2^m t - n) \tag{3.1.3}$$

where $D^m T^n$ is also a unitary operator since both the translation and dilation operators are unitary. Sometimes we will use the more compact notation

$$D^m T^n \phi(t) = \phi_{m,n}(t)$$

which is frequently used in the literature. However, in most cases we will use the operator notation in this thesis, with some exceptions when space is an issue. Here we note that for $t \in \mathbb{R}$ and $n_1, n_2 \in \mathbb{Z}$,

$$\langle D^m \phi(t - n_1), D^m \phi(t - n_2) \rangle = \langle \phi(t - n_1), D^{-m} D^{m} \phi(t - n_2) \rangle = \langle \phi(t - n_1), \phi(t - n_2) \rangle = \delta_{n_1, n_2}$$

which shows that the functions $D^m T^n \phi(t)$ in (3.1.3) are orthonormal for any fixed $m$, hence

$$\{D^m T^n \phi(t) : n \in \mathbb{Z}\} \tag{3.1.4}$$

forms an orthonormal basis of its span. Let $\mathcal{V}_m$ denote the subspace of $L^2(\mathbb{R})$ spanned by $\{D^m T^n \phi(t) : n \in \mathbb{Z}\}$, or more formally [4]

$$\mathcal{V}_m = \left\{ f(t) = 2^{m/2} \sum_{n=-\infty}^{\infty} c_n \phi(2^m t - n) : \|f\|^2 = \sum_{n=-\infty}^{\infty} |c_n|^2 < \infty \right\}. \tag{3.1.5}$$
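The orthonormality of the functions $2^{m/2}\phi(2^m t - n)$ can be illustrated numerically for the Haar scaling function. The sketch below assumes NumPy; the names `haar_phi`, `phi_mn`, `inner` and `gram`, the integration window $[-8, 8]$ and the midpoint rule are choices made here for illustration. Because the Haar functions are piecewise constant with breakpoints on the integration grid, the midpoint rule reproduces the integrals up to rounding.

```python
import numpy as np

def haar_phi(t):
    """Haar scaling function (3.1.2): 1 on (0, 1], 0 elsewhere."""
    t = np.asarray(t, dtype=float)
    return np.where((t > 0) & (t <= 1), 1.0, 0.0)

def phi_mn(t, m, n):
    """m dilations and n translations of phi: 2^(m/2) * phi(2^m * t - n)."""
    return 2.0 ** (m / 2) * haar_phi(2.0 ** m * t - n)

def inner(f, g, left=-8.0, right=8.0, cells=2 ** 16):
    """Real L^2 inner product on [left, right], approximated by the midpoint rule."""
    dt = (right - left) / cells
    t = left + (np.arange(cells) + 0.5) * dt
    return np.sum(f(t) * g(t)) * dt

def gram(m, shifts):
    """Matrix of inner products between phi_{m,n1} and phi_{m,n2} for n1, n2 in shifts."""
    return np.array([[inner(lambda t: phi_mn(t, m, n1), lambda t: phi_mn(t, m, n2))
                      for n2 in shifts] for n1 in shifts])

for m in (-1, 0, 2):
    G = gram(m, range(-3, 4))
    print(m, np.allclose(G, np.eye(7)))   # the Gram matrix is the identity
```

The factor $2^{m/2}$ is exactly what compensates for the support shrinking to length $2^{-m}$, so each function keeps unit norm at every scale.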

Basically, with the Haar scaling function, $\mathcal{V}_m$ is the space of square integrable functions that are piecewise constant on each interval of length $2^{-m}$. We also note that the number of such intervals in each unit interval is the inverse of their length, i.e. $1/2^{-m} = 2^m$, which is usually referred to as the resolution. This is why $\mathcal{V}_m$ can be described as the space of resolution $2^m$. Note that $\mathcal{V}_0$ describes the space of unit resolution, which agrees with the definition that we made earlier. The relation between $\mathcal{V}_0$ and $\mathcal{V}_1$ is given by $D\mathcal{V}_0 = \mathcal{V}_1$, where $D$ is the dilation operator acting on the $\mathcal{V}_0$ space. Basically, $D$ halves the length of the intervals or, equivalently, doubles the resolution. From this it is clear that the relation between $\mathcal{V}_0$ and $\mathcal{V}_m$ is given by

$$D^m \mathcal{V}_0 = \mathcal{V}_m, \quad m \in \mathbb{Z}$$

so that the $\mathcal{V}_m$ space can be described as $m$ dilations of the $\mathcal{V}_0$ space. We also have that

$$\cdots \subset \mathcal{V}_{-2} \subset \mathcal{V}_{-1} \subset \mathcal{V}_0 \subset \mathcal{V}_1 \subset \mathcal{V}_2 \subset \cdots \tag{3.1.6}$$

since lower resolution spaces are always subsets of higher resolution spaces. If we let $m \to \infty$, then $\mathcal{V}_m$ represents the infinite detail space, hence it is equal to $L^2(\mathbb{R})$. When $m$ is negative, the constant intervals are longer than one, and in the limit $m \to -\infty$ the space $\mathcal{V}_m$ represents the ‘no-detail space’. It is actually equal to $\{0\}$ due to the condition $\sum_n |c_n|^2 < \infty$. We can summarize our findings so far by defining the concept of a multiresolution analysis.

Definition 3.1.1. A Multiresolution Analysis (MRA) is a sequence of closed subspaces $\{\mathcal{V}_m : m \in \mathbb{Z}\}$ of $L^2(\mathbb{R})$ satisfying the following properties:

(i) There exists a scaling function $\phi(t)$ such that $\mathcal{V}_0 = \operatorname{span}\{\phi(t - n) : n \in \mathbb{Z}\}$.

(ii) $\mathcal{V}_m \subset \mathcal{V}_{m+k}$, $k = 1, 2, \dots$

(iii) $f(t) \in \mathcal{V}_m \iff T^n f(t) \in \mathcal{V}_m$ for all $n \in \mathbb{Z}$

(iv) $f(t) \in \mathcal{V}_m \iff D^k f(t) \in \mathcal{V}_{m+k}$, $k = 1, 2, \dots$

(v) $\lim_{m \to \infty} \mathcal{V}_m = L^2(\mathbb{R})$

(vi) $\lim_{m \to -\infty} \mathcal{V}_m = \{0\}$

We always need (i) in order to build the $\mathcal{V}_0$ space. Once we have $\mathcal{V}_0$ we can build the remaining spaces by using the dilation operator to create any space $\mathcal{V}_m$, as we have seen earlier. Note that (ii) follows from (3.1.6). Property (iii) means that $\mathcal{V}_m$ is invariant under the integer translations $T^n$, usually called self-similarity in time. Property (iv) describes the self-similarity in scale, where each application of $D$ dilates by a factor of 2.
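To make the nested spaces concrete, the following sketch projects a sampled function onto the Haar spaces $\mathcal{V}_m$. It assumes NumPy and restricts attention to the interval $[0, 1)$ with $2^M$ samples, which is a simplification made here for illustration; the name `project_haar` and the test function are hypothetical. For the Haar scaling function, the orthogonal projection onto $\mathcal{V}_m$ replaces the function by its mean on each interval of length $2^{-m}$.

```python
import numpy as np

def project_haar(samples, m):
    """Project samples on 2^M cells of [0, 1) onto V_m (Haar): block means, 0 <= m <= M."""
    samples = np.asarray(samples, dtype=float)
    M = len(samples).bit_length() - 1
    if 2 ** M != len(samples) or not 0 <= m <= M:
        raise ValueError("need 2^M samples and a level m with 0 <= m <= M")
    block = 2 ** (M - m)                       # fine cells per interval of length 2^-m
    means = samples.reshape(-1, block).mean(axis=1)
    return np.repeat(means, block)             # piecewise constant approximation

M = 10
t = (np.arange(2 ** M) + 0.5) / 2 ** M        # cell midpoints in [0, 1)
f = np.sin(2 * np.pi * t) + t                  # illustrative test function

# Root mean square error of the approximation at each resolution 2^m.
errors = [np.sqrt(np.mean((f - project_haar(f, m)) ** 2)) for m in range(M + 1)]
print(np.round(errors, 6))
```

The errors decrease as $m$ grows, reflecting the inclusions (3.1.6), and projecting first onto $\mathcal{V}_{m+1}$ and then onto $\mathcal{V}_m$ gives the same result as projecting onto $\mathcal{V}_m$ directly.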

3.2 Scaling Function Properties

The scaling function always needs to satisfy the MRA property, which is a very strong restriction. This implies that it must be of finite energy, i.e. belong to the $L^2$ space. These are the only required conditions; however, in most cases we want it to satisfy further restrictions, as we will explain here. First of all, it should be localized in time, which means that it should rapidly decay to zero as $|t| \to \infty$ [7]. Usually we also want the scaling function to be compactly supported, which means that the function is zero outside a bounded interval. For practical use the scaling function should be normalized in $L^1(\mathbb{R})$, thus

$$\int_{-\infty}^{\infty} \phi(t)\, dt = 1,$$

so that it can be used to build an orthonormal basis. For example, the Haar scaling function is clearly normalized and is supported on the interval $[0, 1]$, hence it is both compactly supported and localized in time. In the remaining part of this section we want to prove the elementary properties of the scaling function.

The MRA property of the scaling function means that it can be expressed in terms of a higher resolution space. To see this we first note that $D^m \phi(t) \in \mathcal{V}_m$, and since $\mathcal{V}_m \subset \mathcal{V}_{m+1}$ it follows that $D^m \phi(t) \in \mathcal{V}_{m+1}$. This means that $D^m \phi(t)$ can be described in terms of the $\mathcal{V}_{m+1}$ space, and by using (3.1.5) with the substitution $f(t) = D^m \phi(t)$ we conclude that

$$D^m \phi(t) = \sum_{n=-\infty}^{\infty} h(n)\, D^{m+1} T^n \phi(t) \tag{3.2.1}$$

for some specific coefficients $h(n)$. This equation shows that we can use a combination of basis elements, or scaling functions, in $\mathcal{V}_{m+1}$ to construct the basis element in the lower resolution space $\mathcal{V}_m$, where $m$ is an integer [3]. The set of non-zero coefficients $h(n)$ is called the scaling filter. The term filter is frequently used in image processing, as a filter can be applied to an image to cause a wide range of effects, for example edge detection. Each scaling function is completely determined by its scaling filter, which also implies that the filter is unique for each scaling function. The simplest example of such a filter is the Haar scaling filter, which is given by $h(0) = h(1) = 1/\sqrt{2}$, with all other $h(n)$ set to zero. Note that (3.2.1) can be solved for $\phi(t)$ on the left-hand side by applying the dilation operator $D^{-m}$ to both sides. This produces the so-called scaling equation [3]

$$\phi(t) = \sum_{n=-\infty}^{\infty} h(n)\, D T^n \phi(t) = \sqrt{2} \sum_{n=-\infty}^{\infty} h(n) \phi(2t - n) \tag{3.2.2}$$

which is the fundamental property used in wavelet theory. This equation will allow us to prove the properties related to the scaling function and wavelets, as we will see. The formula given in (3.2.2) is valid for any scaling function that satisfies an MRA. The scaling equation can be applied to the Haar scaling function to obtain

$$\phi(t) = \frac{1}{\sqrt{2}} D \phi(t) + \frac{1}{\sqrt{2}} D T \phi(t) = \phi(2t) + \phi(2t - 1).$$

Basically, we use two Haar scaling functions from the higher resolution space $\mathcal{V}_1$ and combine them to construct the scaling function of the lower resolution space $\mathcal{V}_0$, which gives the Haar scaling function. Because we want our scaling functions to form an orthonormal basis of $\mathcal{V}_0$ we need that

$$\int_{-\infty}^{\infty} \phi(t) \phi(t - k)\, dt = \delta_k \tag{3.2.3}$$

where $\delta_k = 1$ if $k = 0$ and $\delta_k = 0$ for any integer $k \ne 0$. This relation can be written in terms of the scaling equation (3.2.2), which gives that [5]

$$\phi(t) \phi(t - k) = \sqrt{2} \sum_n h(n) \phi(2t - n)\, \phi(t - k) = 2 \sum_n h(n) \phi(2t - n) \sum_j h(j) \phi(2(t - k) - j)$$

and if we integrate both sides we obtain [5]

$$\begin{aligned}
\int_{-\infty}^{\infty} \phi(t) \phi(t - k)\, dt &= 2 \sum_n h(n) \left[ \sum_j h(j)\, \frac{1}{2} \int_{-\infty}^{\infty} \phi(2t - n) \phi(2t - 2k - j)\, d(2t) \right] \\
&= \sum_n \sum_j h(n) h(j)\, \delta_{n, 2k + j} \\
&= \sum_n h(n) h(n - 2k)
\end{aligned}$$

where we made the substitution $n = 2k + j$ in the last equality. Basically, the orthonormality of the scaling function given in (3.2.3) can be equivalently expressed as

$$\sum_n h(n) h(n - 2k) = \delta_k. \tag{3.2.4}$$

This property plays an important role in the construction of the so-called scaling filters, as we will see in Section 3.4. Note that in the special case $k = 0$ this reduces to

$$\sum_n h(n)^2 = 1$$

which is the normalization property. Another thing to observe is that we can use the scaling equation to derive that [5]

$$\begin{aligned}
\int_{-\infty}^{\infty} \phi(t)\, dt &= \sqrt{2} \sum_{n=-\infty}^{\infty} h(n) \int_{-\infty}^{\infty} \phi(2t - n)\, dt && (3.2.5) \\
&= \sqrt{2} \sum_{n=-\infty}^{\infty} \frac{1}{2} h(n) \int_{-\infty}^{\infty} \phi(2t - n)\, d(2t - n) && (3.2.6) \\
&= \frac{\sqrt{2}}{2} \sum_{n=-\infty}^{\infty} h(n) \int_{-\infty}^{\infty} \phi(t)\, dt && (3.2.7)
\end{aligned}$$

and since the integral of the scaling function over the real numbers is equal to one, i.e. $\int \phi(t)\, dt = 1$, it is necessary that

$$\sum_{n=-\infty}^{\infty} h(n) = \sqrt{2}.$$

This result shows that the sum of the scaling filter $h(n)$ must be equal to $\sqrt{2}$ under the assumption that the scaling equation is satisfied. Another property we can show by using the scaling equation in (3.2.2) is how the continuous Fourier transform of $\phi(t)$ can be expressed.
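The filter conditions derived so far are easy to test for a concrete filter. The sketch below assumes NumPy; the function names are illustrative, and the four-tap Daubechies filter `db4` is not taken from the text above but is the standard filter usually called D4, included here as an assumed second example. The code checks the Haar scaling equation $\phi(t) = \phi(2t) + \phi(2t-1)$ on a grid, the orthonormality condition (3.2.4), and the sum condition $\sum_n h(n) = \sqrt{2}$.

```python
import numpy as np

def haar_phi(t):
    """Haar scaling function (3.1.2): 1 on (0, 1], 0 elsewhere."""
    t = np.asarray(t, dtype=float)
    return np.where((t > 0) & (t <= 1), 1.0, 0.0)

def double_shift_products(h):
    """Return {k: sum_n h(n) h(n - 2k)} for a finite real filter supported on 0..N-1."""
    h = np.asarray(h, dtype=float)
    N = len(h)
    return {k: sum(h[n] * h[n - 2 * k] for n in range(N) if 0 <= n - 2 * k < N)
            for k in range(-(N // 2), N // 2 + 1)}

def satisfies_filter_conditions(h, tol=1e-12):
    """Check (3.2.4) and the condition sum_n h(n) = sqrt(2)."""
    products = double_shift_products(h)
    orthonormal = all(abs(v - (1.0 if k == 0 else 0.0)) < tol for k, v in products.items())
    return orthonormal and abs(np.sum(h) - np.sqrt(2)) < tol

haar = np.array([1.0, 1.0]) / np.sqrt(2)
s3 = np.sqrt(3)
db4 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2))  # assumed D4 filter

# Scaling equation (3.2.2) for Haar, evaluated on a grid.
t = np.linspace(-1, 2, 3001)
rhs = np.sqrt(2) * (haar[0] * haar_phi(2 * t) + haar[1] * haar_phi(2 * t - 1))
print(np.allclose(haar_phi(t), rhs))

print(satisfies_filter_conditions(haar), satisfies_filter_conditions(db4))
```

A filter such as $(1, 2, 1)\sqrt{2}/4$ sums to $\sqrt{2}$ but fails (3.2.4), which shows that the two conditions are not interchangeable.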

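Definition 3.2.1. The continuous Fourier transform of the function $f(t) \in L^2(\mathbb{R})$ is given by

$$\hat{f}(k) = \int_{-\infty}^{\infty} f(t) e^{-2\pi i k t}\, dt, \quad k, t \in \mathbb{R}.$$

The continuous Fourier transform of the scaling function $\phi(t)$ can be derived by using the scaling equation given in (3.2.2), thus we have that

$$\hat{\phi}(k) = \int_{-\infty}^{\infty} \phi(t) e^{-2\pi i k t}\, dt = \int_{-\infty}^{\infty} \sum_{n=-\infty}^{\infty} \sqrt{2}\, h(n) \phi(2t - n) e^{-2\pi i k t}\, dt = \sqrt{2} \sum_{n=-\infty}^{\infty} h(n) \int_{-\infty}^{\infty} \phi(2t - n) e^{-2\pi i k t}\, dt.$$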

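By performing the variable change $t \to 2t - n/N$, the above equation can be written as

$$\hat{\phi}(k) = \frac{1}{\sqrt{2}} \sum_{n=-\infty}^{\infty} h(n) \int_{-\infty}^{\infty} \phi(t) e^{-2\pi i (k/2)(t + n/N)}\, dt = \frac{1}{\sqrt{2}} \sum_{n=-\infty}^{\infty} h(n) e^{-2\pi i (k/2) n/N} \int_{-\infty}^{\infty} \phi(t) e^{-2\pi i (k/2) t}\, dt$$

thus

$$\hat{\phi}(k) = \frac{1}{\sqrt{2}}\, \hat{h}\!\left(\frac{k}{2}\right) \hat{\phi}\!\left(\frac{k}{2}\right) \tag{3.2.8}$$

where

$$\hat{h}(k) = \sum_{n=-\infty}^{\infty} h(n) e^{-2\pi i k n/N} \tag{3.2.9}$$

is the discrete Fourier transform of the coefficients $h(n)$. The relation (3.2.8) implies that for any $j \in \mathbb{Z}$ we have

$$\hat{\phi}\!\left(\frac{k}{2^{j-1}}\right) = \frac{1}{\sqrt{2}}\, \hat{h}\!\left(\frac{k}{2^j}\right) \hat{\phi}\!\left(\frac{k}{2^j}\right) \tag{3.2.10}$$

so that we can iterate $\hat{\phi}(k)$ as

$$\hat{\phi}(k) = \frac{1}{\sqrt{2}}\, \hat{h}\!\left(\frac{k}{2}\right) \hat{\phi}\!\left(\frac{k}{2}\right) = \frac{1}{\sqrt{2}}\, \hat{h}\!\left(\frac{k}{2}\right) \frac{1}{\sqrt{2}}\, \hat{h}\!\left(\frac{k}{2^2}\right) \hat{\phi}\!\left(\frac{k}{2^2}\right) = \dots = \hat{\phi}\!\left(\frac{k}{2^p}\right) \prod_{j=1}^{p} \frac{1}{\sqrt{2}}\, \hat{h}\!\left(\frac{k}{2^j}\right). \tag{3.2.11}$$

Observe that this was derived by repeatedly expanding the last factor using (3.2.10). Equation (3.2.11) holds for any integer $p \ge 1$ under the assumption that the Fourier transform of the scaling function exists. In addition, if $\hat{\phi}(k)$ is continuous in a neighborhood of $k = 0$ and if the limit exists as $p \to \infty$, then

$$\hat{\phi}(k) = \lim_{p \to \infty} \hat{\phi}\!\left(\frac{k}{2^p}\right) \prod_{j=1}^{p} \frac{1}{\sqrt{2}}\, \hat{h}\!\left(\frac{k}{2^j}\right) = \hat{\phi}(0) \prod_{j=1}^{\infty} \frac{1}{\sqrt{2}}\, \hat{h}\!\left(\frac{k}{2^j}\right). \tag{3.2.12}$$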

The constant $\hat{\phi}(0)$ depends on the normalization of the scaling function. If the scaling function is normalized in $L^1(\mathbb{R})$ then $\hat{\phi}(0) = 1$, since this is equivalent to

$$\int_{-\infty}^{\infty} \phi(t)\, dt = 1.$$

By normalizing $\phi(t)$ we ensure that the infinite product converges to the scaling function independently of the choice of initial function [3]. This means that once the normalization is done, the limit only depends on the coefficients $h(n)$, also known as the scaling filter.

Example 3.2.1. We now want to see how the Fourier transform of the Haar scaling function can be expanded using (3.2.11). The Haar scaling filter is given by $h(0) = h(1) = 1/\sqrt{2}$, so that the length of the filter is $N = 2$. Any other coefficient $h(n)$ is zero, thus by definition of $\hat{h}(k)$ in (3.2.9) we have that

$$\hat{h}(k) = \sum_{n=0}^{1} h(n) e^{-\pi i k n} = \frac{1}{\sqrt{2}} \left( 1 + e^{-\pi i k} \right)$$

and since the Haar scaling function is normalized it follows that $\hat{\phi}(0) = 1$, so that by (3.2.12) we see that

$$\hat{\phi}(k) = \prod_{j=1}^{\infty} \frac{1}{2} \left( 1 + e^{-\pi i k/2^j} \right), \quad k \in \mathbb{R}$$

is an alternative expression for the Fourier transform of the Haar scaling function.

In general, any function belonging to the $L^2$ space can be represented in the time or frequency domain, and since the scaling function also belongs to the $L^2$ space, the same holds for the scaling function. For our purposes we will only focus on the frequency domain. Let $f(t) \in L^2(\mathbb{R})$ be a function that is orthogonal to its integer translates, that is

$$\delta_n \|f(t)\|^2 = \langle f(t), f(t - n) \rangle$$

where $\delta_n$ is equal to one if $n = 0$, and zero otherwise. Parseval’s identity tells us that the above equation is equivalent to

$$\delta_n \|f(t)\|^2 = \frac{1}{N} \left\langle \hat{f}(k), \hat{f}(k) e^{-2\pi i k n/N} \right\rangle$$

where we also used the translation identity of the Fourier transform. If $n = 0$, the above expression is equal to the squared norm of the function, and for any other $n$ it is equal to zero. If we express this in terms of an integral it is clear that

$$\delta_n \|f(t)\|^2 = \frac{1}{N} \int_{-\infty}^{\infty} \hat{f}(k)\, \overline{\hat{f}(k)}\, e^{2\pi i k n/N}\, dk = \frac{1}{N} \int_0^N \sum_{j=-\infty}^{\infty} |\hat{f}(k + jN)|^2 e^{2\pi i k n/N}\, dk.$$

By using the $N$-periodicity of the exponential function $e^{2\pi i k/N}$, it is clear that the equation holds if and only if

$$\sum_{j=-\infty}^{\infty} |\hat{f}(k + jN)|^2 = \|f(t)\|^2, \quad \forall k \in \mathbb{R}. \tag{3.2.13}$$

This is an important result that we will use in the next section when proving some interesting properties of the scaling and wavelet functions, and since both of these functions are assumed to be orthonormal, the right-hand side of (3.2.13) is actually equal to one.

Theorem 3.2.1. The set $\{D^m T^n \phi(t) : n \in \mathbb{Z}\}$ is an orthonormal basis of $\mathcal{V}_m$ if

$$|\hat{h}(k)|^2 + \left|\hat{h}\!\left(k + \tfrac{N}{2}\right)\right|^2 = 2. \tag{3.2.14}$$

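Proof. As a consequence of (3.2.13), evaluated at the point $2k$, we have that

$$\sum_{j=-\infty}^{\infty} |\hat{\phi}(2k + jN)|^2 = 1 \tag{3.2.15}$$

assuming orthonormality of $\phi(t)$. Recall from (3.2.8) that

$$\hat{\phi}(k) = \frac{1}{\sqrt{2}}\, \hat{h}\!\left(\frac{k}{2}\right) \hat{\phi}\!\left(\frac{k}{2}\right)$$

which means that if we substitute this expression into (3.2.15) then

$$\sum_{j=-\infty}^{\infty} \left|\frac{1}{\sqrt{2}}\right|^2 \left|\hat{h}\!\left(k + \tfrac{jN}{2}\right)\right|^2 \left|\hat{\phi}\!\left(k + \tfrac{jN}{2}\right)\right|^2 = 1, \quad \text{that is,} \quad \sum_{j=-\infty}^{\infty} \left|\hat{h}\!\left(k + \tfrac{jN}{2}\right)\right|^2 \left|\hat{\phi}\!\left(k + \tfrac{jN}{2}\right)\right|^2 = 2.$$

Now if we separate the above sum into even and odd integers $j$, then

$$\begin{aligned}
\sum_{j=-\infty}^{\infty} |\hat{h}(k + jN)|^2 |\hat{\phi}(k + jN)|^2 + \sum_{j=-\infty}^{\infty} \left|\hat{h}\!\left(k + jN + \tfrac{N}{2}\right)\right|^2 \left|\hat{\phi}\!\left(k + jN + \tfrac{N}{2}\right)\right|^2 &= 2 \\
|\hat{h}(k)|^2 \sum_{j=-\infty}^{\infty} |\hat{\phi}(k + jN)|^2 + \left|\hat{h}\!\left(k + \tfrac{N}{2}\right)\right|^2 \sum_{j=-\infty}^{\infty} \left|\hat{\phi}\!\left(k + jN + \tfrac{N}{2}\right)\right|^2 &= 2
\end{aligned}$$

where in the last equality we used that $\hat{h}(k)$ is an $N$-periodic function. Note that by (3.2.13) both sums are equal to one, hence the above equation reduces to

$$|\hat{h}(k)|^2 + \left|\hat{h}\!\left(k + \tfrac{N}{2}\right)\right|^2 = 2.$$

This means that the scaling filter $h(n)$ of length $N$ satisfies the above property in the Fourier domain under the assumption that the scaling function is orthonormal. If it is not normalized, then of course the right-hand side will not be equal to 2. However, since we will only consider orthonormal scaling functions, this result is enough.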
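Condition (3.2.14) can be evaluated numerically for a given filter. The sketch below assumes NumPy and follows the convention of (3.2.9), in which $N$ is the filter length; the names `h_hat` and `qmf_sum` are illustrative, and the four-tap Daubechies filter is again an assumed example that does not appear in the text above. Shifting $k$ by $N/2$ multiplies the $n$-th term of (3.2.9) by $(-1)^n$, which is why the two squared moduli complement each other.

```python
import numpy as np

def h_hat(h, k):
    """Evaluate (3.2.9): h_hat(k) = sum_n h(n) exp(-2*pi*i*k*n/N), with N = len(h)."""
    h = np.asarray(h, dtype=float)
    N = len(h)
    k = np.atleast_1d(np.asarray(k, dtype=float))
    return np.exp(-2j * np.pi * np.outer(k, np.arange(N)) / N) @ h

def qmf_sum(h, k):
    """Left-hand side of (3.2.14): |h_hat(k)|^2 + |h_hat(k + N/2)|^2."""
    k = np.atleast_1d(np.asarray(k, dtype=float))
    N = len(h)
    return np.abs(h_hat(h, k)) ** 2 + np.abs(h_hat(h, k + N / 2)) ** 2

haar = np.array([1.0, 1.0]) / np.sqrt(2)
s3 = np.sqrt(3)
db4 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2))  # assumed D4 filter

k = np.linspace(-3, 3, 61)   # real-valued frequencies, not only integers
print(np.allclose(qmf_sum(haar, k), 2.0), np.allclose(qmf_sum(db4, k), 2.0))

# A filter that sums to sqrt(2) but is not orthonormal violates (3.2.14).
smooth = np.array([1.0, 2.0, 1.0]) * np.sqrt(2) / 4
print(np.allclose(qmf_sum(smooth, k), 2.0))
```

At integer $k$ the values of `h_hat` coincide with the DFT of the filter from Chapter 2, so the same check could be run on the DFT coefficients alone when $N$ is even.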


3.3 Wavelet Function Properties

A wavelet function is a function that can be constructed from a given scaling function. In this section we explain how this can be done and also show some properties of the wavelet function. Many of these properties are very similar to those of the scaling function, as we will see. This is because the wavelet function is created from the scaling function; in many cases one could say that the properties of the wavelet function are inherited from the scaling function.

To obtain the properties of the wavelet function we begin by defining $\mathcal{W}_m$ as the subspace of $\mathcal{V}_{m+1}$ that is orthogonal to $\mathcal{V}_m$. Mathematically we write these conditions as

$$\mathcal{W}_m \subset \mathcal{V}_{m+1}, \quad \mathcal{W}_m \perp \mathcal{V}_m.$$

Functions in the $\mathcal{W}_m$ space will be referred to as wavelet functions and we will use $\psi(t)$ to denote such functions. If $\psi(t)$ is a normalized function then

$$\{D^m T^n \psi(t) : n \in \mathbb{Z}\}$$

is an orthonormal basis of $\mathcal{W}_m$. Basically, if we apply the dilation operator $D^m$ and the translation operator $T^n$, where we let $n \in \mathbb{Z}$, then $\psi(t)$ can be used to generate an orthonormal basis of $\mathcal{W}_m$. This is similar to what we showed for the scaling function in previous sections. By definition of an orthonormal basis we may write the $\mathcal{W}_m$ space more formally as

$$\mathcal{W}_m = \left\{ f(t) = 2^{m/2} \sum_{n=-\infty}^{\infty} d_n \psi(2^m t - n) : \|f\|^2 = \sum_{n=-\infty}^{\infty} |d_n|^2 < \infty \right\}.$$

Because $\mathcal{W}_m$ is a subset of $\mathcal{V}_{m+1}$ we know that any wavelet can be expressed in terms of the scaling function in $\mathcal{V}_{m+1}$. In analogy with (3.2.2) this is done by setting

$$\psi(t) = \sum_{n=-\infty}^{\infty} g(n)\, D T^n \phi(t) \tag{3.3.1}$$

for some coefficients $g(n)$. The specific coefficients $g(n)$ are called the wavelet filter. Here the property $\mathcal{W}_m \perp \mathcal{V}_m$ means that the wavelet and scaling functions should be orthogonal for all integer translations $n$. By using the condition $\mathcal{V}_m \subset \mathcal{V}_{m+1}$ we can express

$$\mathcal{V}_{m+1} = \mathcal{V}_m \oplus \mathcal{W}_m, \quad m \in \mathbb{Z}.$$

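Basically, this means that $\mathcal{W}_m$ can be viewed as the difference between $\mathcal{V}_{m+1}$ and $\mathcal{V}_m$. This is why $\mathcal{W}_m$ is usually referred to as the detail space. Since the above equation is true for any integer $m$ we can, without loss of generality, write it as $\mathcal{V}_m = \mathcal{V}_{m-1} \oplus \mathcal{W}_{m-1}$. This shows that $\mathcal{V}_m$ splits into $\mathcal{V}_{m-1}$ and the detail space $\mathcal{W}_{m-1}$, which is orthogonal to $\mathcal{V}_{m-1}$. Note that we can keep decomposing the lower resolution space so that

$$\begin{aligned}
\mathcal{V}_m &= \mathcal{V}_{m-1} \oplus \mathcal{W}_{m-1} \\
&= \mathcal{V}_{m-2} \oplus \mathcal{W}_{m-2} \oplus \mathcal{W}_{m-1} \\
&= \mathcal{V}_{m-3} \oplus \mathcal{W}_{m-3} \oplus \mathcal{W}_{m-2} \oplus \mathcal{W}_{m-1} \\
&\;\;\vdots \\
&= \mathcal{V}_{m-k} \oplus \bigoplus_{i=m-k}^{m-1} \mathcal{W}_i
\end{aligned} \tag{3.3.2}$$

for any integers $m$ and $k \ge 0$. By Definition 3.1.1 we know that the sequence $\{\mathcal{V}_m : m \in \mathbb{Z}\}$ forms an MRA, which means that $\lim_{m \to \infty} \mathcal{V}_m = L^2(\mathbb{R})$ and $\lim_{m \to -\infty} \mathcal{V}_m = \{0\}$ by definition. Thus if we let $m \to \infty$ while simultaneously letting $m - k \to -\infty$, then (3.3.2) becomes

$$L^2(\mathbb{R}) = \bigoplus_{i=-\infty}^{\infty} \mathcal{W}_i$$

which shows that the union of the orthonormal bases of all detail spaces $\mathcal{W}_m$ is an orthonormal basis of the $L^2(\mathbb{R})$ space.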

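We will now see how the Fourier transform of the wavelet function can be expressed. If we use the relation given by (3.3.1), then

$$\hat{\psi}(k) = \int_{-\infty}^{\infty} \psi(t) e^{-2\pi i k t}\, dt = \int_{-\infty}^{\infty} \sum_{n=-\infty}^{\infty} \sqrt{2}\, g(n) \phi(2t - n) e^{-2\pi i k t}\, dt = \sqrt{2} \sum_{n=-\infty}^{\infty} g(n) \int_{-\infty}^{\infty} \phi(2t - n) e^{-2\pi i k t}\, dt,$$

and if we make the variable change $t \to 2t - n/N$, then

$$\hat{\psi}(k) = \frac{1}{\sqrt{2}} \sum_{n=-\infty}^{\infty} g(n) \int_{-\infty}^{\infty} \phi(t) e^{-2\pi i (k/2)(t + n/N)}\, dt = \frac{1}{\sqrt{2}} \sum_{n=-\infty}^{\infty} g(n) e^{-2\pi i (k/2) n/N} \int_{-\infty}^{\infty} \phi(t) e^{-2\pi i (k/2) t}\, dt,$$

hence

$$\hat{\psi}(k) = \frac{1}{\sqrt{2}}\, \hat{g}\!\left(\frac{k}{2}\right) \hat{\phi}\!\left(\frac{k}{2}\right) \tag{3.3.3}$$

where

$$\hat{g}(k) = \sum_{n=-\infty}^{\infty} g(n) e^{-2\pi i k n/N} \tag{3.3.4}$$

is the discrete Fourier transform of $g(n)$. Note that (3.3.3) is almost identical to what we obtained for the scaling function; the only difference is that $\hat{g}$ takes the place of $\hat{h}$. Since the right-hand side contains $\hat{\phi}$, the scaling function relation can be iterated on it, similarly to what we did in (3.2.11), to obtain

$$\hat{\psi}(k) = \frac{1}{\sqrt{2}}\, \hat{g}\!\left(\frac{k}{2}\right) \hat{\phi}\!\left(\frac{k}{2^p}\right) \prod_{j=2}^{p} \frac{1}{\sqrt{2}}\, \hat{h}\!\left(\frac{k}{2^j}\right). \tag{3.3.5}$$

Now if we let $p \to \infty$ in the above expression, then

$$\hat{\psi}(k) = \hat{\phi}(0)\, \frac{1}{\sqrt{2}}\, \hat{g}\!\left(\frac{k}{2}\right) \prod_{j=2}^{\infty} \frac{1}{\sqrt{2}}\, \hat{h}\!\left(\frac{k}{2^j}\right). \tag{3.3.6}$$

This shows that the Fourier transform of the wavelet function can be expressed as an infinite product. Let us illustrate this with the Haar wavelet function.

Example 3.3.1. The Haar wavelet filter is given by $g(0) = 1/\sqrt{2}$ and $g(1) = -1/\sqrt{2}$, thus the total length of the filter is $N = 2$. By the definition of $\hat{g}(k)$ in (3.3.4) we have that

$$\hat{g}(k) = \sum_{n=0}^{1} g(n) e^{-\pi i k n} = \frac{1}{\sqrt{2}} \left(1 - e^{-\pi i k}\right).$$

In the same way the Haar scaling filter $h(0) = h(1) = 1/\sqrt{2}$ gives $\hat{h}(k) = \frac{1}{\sqrt{2}}(1 + e^{-\pi i k})$. The Haar scaling function is already normalized, so it follows that $\hat{\phi}(0) = 1$, hence by (3.3.6) we see that

$$\hat{\psi}(k) = \frac{1}{2}\left(1 - e^{-\pi i k/2}\right) \prod_{j=2}^{\infty} \frac{1}{2}\left(1 + e^{-\pi i k/2^j}\right), \qquad k \in \mathbb{R}$$

is another way of representing the Fourier transform of the Haar wavelet function.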

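Theorem 3.3.1. The set $\{D^m T^n \psi(t) : n \in \mathbb{Z}\}$ is an orthonormal basis of $W_m$ if

$$|\hat{g}(k)|^2 + \left|\hat{g}\!\left(k + \tfrac{N}{2}\right)\right|^2 = 2. \tag{3.3.7}$$

Proof. By using the result from (3.2.13) we see that the set $\{\psi(t - n) : n \in \mathbb{Z}\}$ is orthonormal if

$$\sum_{j=-\infty}^{\infty} |\hat{\psi}(2k + jN)|^2 = 1$$

and from (3.3.3) we know that

$$\hat{\psi}(k) = \frac{1}{\sqrt{2}}\, \hat{g}\!\left(\frac{k}{2}\right) \hat{\phi}\!\left(\frac{k}{2}\right),$$

which means that if we substitute this into the above expression we obtain

$$\sum_{j=-\infty}^{\infty} \left|\frac{1}{\sqrt{2}}\right|^2 \left|\hat{g}\!\left(k + \tfrac{jN}{2}\right)\right|^2 \left|\hat{\phi}\!\left(k + \tfrac{jN}{2}\right)\right|^2 = 1,$$

and if we multiply both sides by two and separate the sum into even and odd integers $j$, as we did in the proof for (3.2.1), then

$$|\hat{g}(k)|^2 \sum_{j=-\infty}^{\infty} |\hat{\phi}(k + jN)|^2 + \left|\hat{g}\!\left(k + \tfrac{N}{2}\right)\right|^2 \sum_{j=-\infty}^{\infty} \left|\hat{\phi}\!\left(k + jN + \tfrac{N}{2}\right)\right|^2 = 2$$

where again we used that $\hat{g}(k)$ is an $N$-periodic function. The orthonormality property of the scaling function $\phi(t)$ means that both sums are equal to one, so the condition reduces to

$$|\hat{g}(k)|^2 + \left|\hat{g}\!\left(k + \tfrac{N}{2}\right)\right|^2 = 2$$

so that (3.3.7) is proved.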
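Condition (3.3.7) is easy to test numerically for a concrete filter. The following Python sketch is only an illustration and not part of the proof: it evaluates $\hat{g}(k)$ as defined in (3.3.4) for the Haar wavelet filter and for the Daubechies $N = 4$ wavelet filter from section 3.4, and compares them with a filter that does not satisfy the condition. The names `filter_hat`, `orthonormality_sum` and the counterexample filter are assumptions introduced here.

```python
import cmath
import math


def filter_hat(coeffs, k):
    """Evaluate sum_n c(n) * exp(-2*pi*i*k*n/N) as in (3.3.4), with N = len(coeffs)."""
    N = len(coeffs)
    return sum(c * cmath.exp(-2j * math.pi * k * n / N) for n, c in enumerate(coeffs))


def orthonormality_sum(g, k):
    """Left-hand side of (3.3.7): |g_hat(k)|^2 + |g_hat(k + N/2)|^2."""
    N = len(g)
    return abs(filter_hat(g, k)) ** 2 + abs(filter_hat(g, k + N / 2)) ** 2


s2, s3 = math.sqrt(2), math.sqrt(3)
haar_g = [1 / s2, -1 / s2]
db4_g = [(1 - s3) / (4 * s2), -(3 - s3) / (4 * s2),
         (3 + s3) / (4 * s2), -(1 + s3) / (4 * s2)]
not_orthonormal = [0.5, 0.5]  # hypothetical filter, used only as a counterexample

for k in (0.0, 0.3, 1.7):  # arbitrary sample points
    print(k,
          orthonormality_sum(haar_g, k),
          orthonormality_sum(db4_g, k),
          orthonormality_sum(not_orthonormal, k))
```

For the Haar and Daubechies filters the sum equals 2 up to rounding error at every sample point, whereas the averaging filter gives 1 and therefore fails (3.3.7).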

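Theorem 3.3.2. If $\phi(t)$ and $\psi(t)$ are both normalized functions and $\langle \phi, \psi \rangle = 0$, then

$$\hat{g}(k)\, \overline{\hat{h}(k)} + \hat{g}\!\left(k + \tfrac{N}{2}\right) \overline{\hat{h}\!\left(k + \tfrac{N}{2}\right)} = 0. \tag{3.3.8}$$

Proof. The space $W_0$ is orthogonal to $V_0$ if and only if the set $\{\psi(t - n) : n \in \mathbb{Z}\}$ is orthogonal to the set $\{\phi(t - n) : n \in \mathbb{Z}\}$. Mathematically this means that for any integer $n$ we have that

$$\int_{-\infty}^{\infty} \psi(t)\, \overline{\phi(t - n)}\, dt = 0.$$

Note that we can rewrite the left-hand side using Parseval's relation as

$$\int_{-\infty}^{\infty} \psi(t)\, \overline{\phi(t - n)}\, dt = \frac{1}{N} \int_{-\infty}^{\infty} \hat{\psi}(k)\, \overline{\hat{\phi}(k)}\, e^{2\pi i k n/N}\, dk = \frac{1}{N} \int_{0}^{N} \sum_{j=-\infty}^{\infty} \hat{\psi}(k + jN)\, \overline{\hat{\phi}(k + jN)}\, e^{2\pi i k n/N}\, dk$$

using the $N$-periodicity of $e^{2\pi i k n/N}$. The above expression is zero for every integer $n$ if and only if

$$\sum_{j=-\infty}^{\infty} \hat{\psi}(k + jN)\, \overline{\hat{\phi}(k + jN)} = 0,$$

or equivalently

$$\sum_{j=-\infty}^{\infty} \hat{\psi}(2k + jN)\, \overline{\hat{\phi}(2k + jN)} = 0. \tag{3.3.9}$$

From here we can use the equalities

$$\hat{\phi}(k) = \frac{1}{\sqrt{2}}\, \hat{h}\!\left(\frac{k}{2}\right) \hat{\phi}\!\left(\frac{k}{2}\right) \quad \text{and} \quad \hat{\psi}(k) = \frac{1}{\sqrt{2}}\, \hat{g}\!\left(\frac{k}{2}\right) \hat{\phi}\!\left(\frac{k}{2}\right)$$

so that (3.3.9) can be written as

$$\sum_{j=-\infty}^{\infty} \hat{\psi}(2k + jN)\, \overline{\hat{\phi}(2k + jN)} = \sum_{j=-\infty}^{\infty} \frac{1}{2}\, \hat{g}\!\left(k + \tfrac{jN}{2}\right) \overline{\hat{h}\!\left(k + \tfrac{jN}{2}\right)} \left|\hat{\phi}\!\left(k + \tfrac{jN}{2}\right)\right|^2 = 0,$$

and if we split this sum into even and odd integers $j$ and use the $N$-periodicity of $\hat{h}(k)$ and $\hat{g}(k)$, then

$$\frac{1}{2}\, \hat{g}(k)\, \overline{\hat{h}(k)} \sum_{j=-\infty}^{\infty} |\hat{\phi}(k + jN)|^2 + \frac{1}{2}\, \hat{g}\!\left(k + \tfrac{N}{2}\right) \overline{\hat{h}\!\left(k + \tfrac{N}{2}\right)} \sum_{j=-\infty}^{\infty} \left|\hat{\phi}\!\left(k + jN + \tfrac{N}{2}\right)\right|^2 = 0.$$

Note that both sums are equal to one by the orthonormality of the scaling function, thus (3.3.9) is equivalently written as

$$\hat{g}(k)\, \overline{\hat{h}(k)} + \hat{g}\!\left(k + \tfrac{N}{2}\right) \overline{\hat{h}\!\left(k + \tfrac{N}{2}\right)} = 0. \tag{3.3.10}$$

A consequence of this theorem is that we can now relate $\hat{g}(k)$ to $\hat{h}(k)$. To see this we first note that $\hat{h}(k)$ and $\hat{h}(k + N/2)$ cannot vanish at the same time because of (3.2.14). This means that there exists a function $\lambda(k)$ so that

$$\hat{g}(k) = \lambda(k)\, \overline{\hat{h}\!\left(k + \tfrac{N}{2}\right)} \quad \text{and} \quad \hat{g}\!\left(k + \tfrac{N}{2}\right) = -\lambda(k)\, \overline{\hat{h}(k)}. \tag{3.3.11}$$

By substituting these expressions into (3.3.8) we confirm that the equality still holds. Since both $\hat{h}(k)$ and $\hat{g}(k)$ are $N$-periodic functions it follows that $\lambda(k)$ is an $N$-periodic function. By using $k + N/2$ instead of $k$ in (3.3.11) together with the $N$-periodicity of $\hat{h}(k)$, it is clear that we have the relation

$$\lambda(k) = -\lambda\!\left(k + \tfrac{N}{2}\right)$$

which equivalently may be written as

$$\lambda(k) = e^{\pm 2\pi i k/N} f(k)$$

for some function $f$ with period $N/2$. However, in order to have an orthonormal wavelet we also need $|\lambda(k)|^2 = 1$. This is a consequence of (3.2.14) and (3.3.7). To summarize, $\lambda(k)$ needs to be chosen so that [5]

(i) $\lambda(k)$ is an $N$-periodic function.

(ii) $\lambda(k) = -\lambda(k + \tfrac{N}{2})$.

(iii) $|\lambda(k)|^2 = 1$.

The most common choices for $\lambda(k)$ are $-e^{-2\pi i k/N}$, $e^{-2\pi i k/N}$ or $e^{2\pi i k/N}$, although any other function satisfying the given properties for $\lambda(k)$ can be used to generate a valid $\hat{g}(k)$ [5]. Different choices of $\lambda(k)$ will of course lead to different relations between $\hat{g}(k)$ and $\hat{h}(k)$. This means that the construction of wavelets is not unique, and it is possible to generate multiple wavelets satisfying (3.3.8) from a single scaling function. To follow the theory from [5] we will use $\lambda(k) = -e^{-2\pi i k/N}$ so that

$$\hat{g}(k) = -e^{-2\pi i k/N}\, \overline{\hat{h}\!\left(k + \tfrac{N}{2}\right)} \tag{3.3.12}$$

is a valid equality, which can also be used to get a connection between the scaling and wavelet filters $h(n)$ and $g(n)$ as we will see.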

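Example 3.3.2. In the case of the Haar wavelet we know that

$$\hat{h}(k) = \frac{1}{\sqrt{2}} \left(1 + e^{-\pi i k}\right).$$

Here we note that the orthogonality property $|\hat{h}(k)|^2 + |\hat{h}(k + N/2)|^2 = 2$ is satisfied, which can be verified by elementary calculations. Using (3.3.12) with $N = 2$ we find that

$$\hat{g}(k) = -e^{-2\pi i k/N}\, \overline{\hat{h}\!\left(k + \tfrac{N}{2}\right)} = -e^{-2\pi i k/N} \frac{1}{\sqrt{2}} \left(1 + e^{\pi i (k + N/2)}\right) = \frac{1}{\sqrt{2}} \left(1 - e^{-2\pi i k/N}\right)$$

so that the Fourier transform of the Haar wavelet is given by

$$\hat{\psi}(k) = \frac{1}{\sqrt{2}}\, \hat{g}\!\left(\frac{k}{2}\right) \hat{\phi}\!\left(\frac{k}{2}\right) = \frac{1}{2} \left(1 - e^{-2\pi i k/(2N)}\right) \hat{\phi}\!\left(\frac{k}{2}\right) = \frac{1}{2}\, \hat{\phi}\!\left(\frac{k}{2}\right) - \frac{1}{2}\, \hat{\phi}\!\left(\frac{k}{2}\right) e^{-2\pi i k/(2N)}.$$

From here we can apply the inverse Fourier transform on both sides to confirm that

$$\psi(t) = \phi(2t) - \phi(2t - 1).$$

Here we managed to build the Haar wavelet from the Haar scaling function by using the theory we have built up so far. In general a wavelet is determined by its scaling function once $\lambda(k)$ has been chosen, and these calculations highlight that point.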
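The Haar computation above can also be checked numerically. The sketch below is an illustration under the conventions of this section: it evaluates $\hat{h}$ and $\hat{g}$ from the Haar filters, builds the right-hand side of (3.3.12) with $\hat{h}$ complex conjugated (which is what the positive exponent in the computation above amounts to), and also evaluates the left-hand side of (3.3.8). The function names are assumptions made for this example.

```python
import cmath
import math


def filter_hat(coeffs, k):
    """Evaluate sum_n c(n) * exp(-2*pi*i*k*n/N), with N = len(coeffs)."""
    N = len(coeffs)
    return sum(c * cmath.exp(-2j * math.pi * k * n / N) for n, c in enumerate(coeffs))


s2 = math.sqrt(2)
haar_h = [1 / s2, 1 / s2]
haar_g = [1 / s2, -1 / s2]
N = len(haar_h)


def g_hat_from_h(k):
    """Right-hand side of (3.3.12), built from the scaling filter only."""
    return -cmath.exp(-2j * math.pi * k / N) * filter_hat(haar_h, k + N / 2).conjugate()


def cross_term(k):
    """Left-hand side of (3.3.8), with h_hat complex conjugated."""
    return (filter_hat(haar_g, k) * filter_hat(haar_h, k).conjugate()
            + filter_hat(haar_g, k + N / 2) * filter_hat(haar_h, k + N / 2).conjugate())


for k in (0.0, 0.25, 1.3):  # arbitrary sample points
    print(k, abs(filter_hat(haar_g, k) - g_hat_from_h(k)), abs(cross_term(k)))
```

Both printed differences vanish up to rounding error, which agrees with (3.3.12) and (3.3.8) for the Haar filters.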


Because of the relation between $\hat{h}(k)$ and $\hat{g}(k)$ in (3.3.12) we expect the coefficients $h(n)$ and $g(n)$ to be related by a formula. This formula can be proven by noting that [5]

$$\begin{aligned}
\hat{g}(k) = \sum_{n=-\infty}^{\infty} g(n) e^{-2\pi i k n/N} &= -e^{-2\pi i k/N} \sum_{n=-\infty}^{\infty} h(n) e^{2\pi i (k + N/2) n/N} \\
&= \sum_{n=-\infty}^{\infty} h(n) \left(-e^{-2\pi i k/N + 2\pi i (k + N/2) n/N}\right) \\
&= \sum_{n=-\infty}^{\infty} \left(-e^{\pi i n}\right) h(n) e^{-2\pi i k (1 - n)/N} \\
&= \sum_{n=-\infty}^{\infty} (-1)^{1-n} h(n) e^{-2\pi i k (1 - n)/N}
\end{aligned}$$

and if we let $j = 1 - n$ we find that the coefficients $g(j)$ and $h(j)$ are related by the formula

$$g(j) = (-1)^j h(1 - j). \tag{3.3.13}$$

If the scaling function coefficients are related to the wavelet coefficients in this way, we know that the scaling and wavelet functions are orthogonal to each other. The formula (3.3.13) is usually referred to as the quadrature mirror relation (QMR) and allows us to easily construct the wavelet from the scaling function coefficients $h(n)$. Recall that the relation (3.3.13) was derived using $\lambda(k) = -e^{-2\pi i k/N}$. By choosing another valid $\lambda(k)$ we will, in general, obtain a different QMR. Another thing to consider is that the wavelet is orthogonal to the scaling function for any translation $n$, hence without loss of generality we can define [5]

$$g(j) = (-1)^{j + N - 2} h(1 - j + N - 2) = (-1)^j h(N - 1 - j), \tag{3.3.14}$$

assuming $N$ is an even number, which is always the case. With this notation we translate the wavelet function by $N - 2$, which is always possible since translating the wavelet function does not invalidate the orthogonality condition with the scaling function. Written out, (3.3.14) gives the substitutions

$$g(0) = h(N - 1), \quad g(1) = -h(N - 2), \quad g(2) = h(N - 3), \quad \dots, \quad g(N - 3) = -h(2), \quad g(N - 2) = h(1), \quad g(N - 1) = -h(0).$$

Example 3.3.3. The Haar scaling filter is given by $h(0) = h(1) = 1/\sqrt{2}$, and if we use (3.3.13) we see that $g(0) = h(1)$ and $g(1) = -h(0)$, hence $g(0) = 1/\sqrt{2}$ and $g(1) = -1/\sqrt{2}$ are the wavelet filter coefficients. By using (3.3.1) we see that

$$\psi(t) = g(0) \sqrt{2}\, \phi(2t) + g(1) \sqrt{2}\, \phi(2t - 1) = \phi(2t) - \phi(2t - 1)$$

which is the Haar wavelet by definition.
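The quadrature mirror relation (3.3.14) translates directly into code. The following Python sketch, with the hypothetical function name `qmr`, builds the wavelet filter from a scaling filter of even length and reproduces the Haar coefficients of the example above.

```python
import math


def qmr(h):
    """Wavelet filter from a scaling filter via g(j) = (-1)^j * h(N - 1 - j), see (3.3.14)."""
    N = len(h)
    if N % 2 != 0:
        raise ValueError("the filter length N must be even")
    return [(-1) ** j * h[N - 1 - j] for j in range(N)]


haar_h = [1 / math.sqrt(2), 1 / math.sqrt(2)]
haar_g = qmr(haar_h)  # [h(1), -h(0)]
print(haar_g)
```

The filter is reversed and every other coefficient changes sign, so a generic input such as `[1, 2, 3, 4]` is mapped to `[4, -3, 2, -1]`.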

3.4 Construction of the Scaling Filter

So far we have been using the Haar scaling filter $h(0) = h(1) = 1/\sqrt{2}$ without really knowing where it comes from. In this section we will derive the Haar scaling filter properly and also see how we can create longer scaling filters $h(0), h(1), \dots, h(N - 1)$ by using Daubechies' method, which produces the Daubechies scaling function for even numbers $N$. First we note that the Fourier series of the general scaling filter is given by

$$\hat{h}(k) = \sum_{n=0}^{N-1} h(n) e^{-2\pi i k n/N}$$

so that if we let $k = 0$ then

$$\hat{h}(0) = \sum_{n=0}^{N-1} h(n).$$

By using the result from (3.2.5) we see that $\hat{h}(0) = \sqrt{2}$ under the assumption that the coefficients $h(n)$ are zero outside the interval $0 \leq n \leq N - 1$. If we instead let $k = N/2$ then

$$\hat{h}\!\left(\frac{N}{2}\right) = \sum_{n=0}^{N-1} h(n) e^{-\pi i n} = \sum_{n=0}^{N-1} h(n) (-1)^n \tag{3.4.1}$$

since $e^{-\pi i n} = (-1)^n$ for any integer $n$. This means that the sign is changed on every other $h(n)$. Now if we use (3.2.14) for $k = 0$ we see that

$$|\hat{h}(0)|^2 + \left|\hat{h}\!\left(\tfrac{N}{2}\right)\right|^2 = 2.$$

Since $\hat{h}(0) = \sqrt{2}$, this equation can only be satisfied if $\hat{h}(N/2) = 0$, hence setting (3.4.1) equal to zero is one constraint. The scaling filter is used to build a basis, and in order for this basis to be orthonormal we need the orthonormality property of the scaling function (3.2.4) to be satisfied. So for any scaling filter of length $N$, the normalization constraint is given by

$$\sum_{n=0}^{N-1} h(n)^2 = 1$$

and the orthogonality constraint is

$$\sum_{n=0}^{N-1} h(n) h(n - 2k) = 0, \qquad k = 1, 2, \dots, \frac{N}{2} - 1.$$

Together they ensure that the scaling filter indeed gives an orthonormal basis. These properties can be described more intuitively in matrix form, as we will see later in section 4.2. The properties listed so far are the only necessary conditions for the scaling filter, and if we summarize our findings we see that

$$\begin{cases}
\displaystyle \sum_{n=0}^{N-1} h(n) = \sqrt{2} \\[2ex]
\displaystyle \sum_{n=0}^{N-1} h(n) (-1)^n = 0 \\[2ex]
\displaystyle \sum_{n=0}^{N-1} h(n)^2 = 1 \\[2ex]
\displaystyle \sum_{n=0}^{N-1} h(n) h(n - 2k) = 0, \qquad k = 1, 2, \dots, \frac{N}{2} - 1
\end{cases} \tag{3.4.2}$$

is the requirement for the scaling filter. The first property tells us that the sum of the scaling filter is equal to $\sqrt{2}$, which is another way of saying that $\hat{h}(0) = \sqrt{2}$. The second property means that if we put a minus sign on every other element in the scaling filter, the sum will be zero, which is equivalent to saying that $\hat{h}(N/2) = 0$. The third is the normalization constraint and the fourth is the orthogonality property, so that the scaling function forms an orthonormal basis as mentioned before. For example, if we want to find the scaling filter when $N = 2$, this leads to the system of equations

$$\begin{cases}
h(0) + h(1) = \sqrt{2} \\
h(0) - h(1) = 0 \\
h(0)^2 + h(1)^2 = 1.
\end{cases}$$

By solving this system we find that the unique solution is $h(0) = h(1) = 1/\sqrt{2}$, which we recognize as the Haar scaling filter. Note that the orthogonality equation is not needed when $N = 2$. However, for any longer scaling filter with $N > 2$ the system of equations given in (3.4.2) has infinitely many solutions, which means that there are infinitely many scaling filters when $N > 2$. The definition of a scaling filter is therefore very loose. We could basically 'invent' our own scaling filter by finding some solution to the given system for some $N$. However, if we want a finite number of solutions we need to add more constraints. One approach is to assume that the wavelet function has $N/2$ vanishing moments. This is equivalent to requiring that

$$\int_{-\infty}^{\infty} t^k \psi(t)\, dt = 0, \qquad k = 0, 1, \dots, \frac{N}{2} - 1. \tag{3.4.3}$$

A wavelet function has $N/2$ vanishing moments if and only if the scaling function can be used to express polynomials up to degree $N/2 - 1$. Hence by increasing the number of vanishing moments we also increase the complexity of the space $V_0$. It turns out that (3.4.3) is also equivalent to saying that the $j$-th derivative of the Fourier transform of the scaling filter, with length $N$, is zero at $k = N/2$ for $j = 0, 1, \dots, N/2 - 1$. Mathematically we can express this as

$$\hat{h}^{(j)}\!\left(\frac{N}{2}\right) = 0, \qquad j = 0, 1, \dots, \frac{N}{2} - 1.$$

Note that this can be viewed as a generalization of the constraint given in (3.4.1). One consequence of this requirement is that the function $|\hat{h}(k)|^2$ flattens out when approaching $k = N/2$ and has a horizontal tangent at $k = N/2$. This approach was proposed by Ingrid Daubechies [9], which is why these kinds of scaling filters produce the so-called Daubechies scaling function. To derive the general formula for the constraint $\hat{h}^{(j)}(N/2) = 0$ we first look at the derivative of $\hat{h}(k)$. If we differentiate $\hat{h}(k)$ with respect to $k$ then

$$\hat{h}'(k) = \sum_{n=0}^{N-1} \frac{-2\pi i n}{N}\, h(n) e^{-2\pi i k n/N}$$

so that

$$\hat{h}'\!\left(\frac{N}{2}\right) = \sum_{n=0}^{N-1} \frac{-2\pi i n}{N}\, h(n) (-1)^n$$

where we again used that $e^{-\pi i n} = (-1)^n$ for integers $n$, similar to what we did in (3.4.1). Because $\hat{h}'(N/2) = 0$ we can multiply both sides by the constant $N/(2\pi i)$ to obtain

$$\sum_{n=0}^{N-1} n\, h(n) (-1)^{n+1} = 0.$$

This is the constraint $\hat{h}^{(j)}(N/2) = 0$ for $j = 1$. The same process can of course be carried out for any $j = 0, 1, \dots, N/2 - 1$, where each differentiation contributes another factor proportional to $n$. From the above expression we see that the general $j$-th derivative constraint can be written in the form

$$\sum_{n=0}^{N-1} n^j h(n) (-1)^{n+1} = 0.$$

Now if we add this constraint to (3.4.2) we obtain the new system of equations

$$\begin{cases}
\displaystyle \sum_{n=0}^{N-1} h(n) = \sqrt{2} \\[2ex]
\displaystyle \sum_{n=0}^{N-1} h(n) (-1)^n = 0 \\[2ex]
\displaystyle \sum_{n=0}^{N-1} h(n)^2 = 1 \\[2ex]
\displaystyle \sum_{n=0}^{N-1} h(n) h(n - 2j) = 0, \qquad j = 1, 2, \dots, \frac{N}{2} - 1 \\[2ex]
\displaystyle \sum_{n=0}^{N-1} n^j h(n) (-1)^{n+1} = 0, \qquad j = 0, 1, \dots, \frac{N}{2} - 1
\end{cases} \tag{3.4.4}$$

The solution to this system gives us the Daubechies scaling filter of length $N$. The system always has $2^{N/2 - 1}$ distinct solutions [12]. It does not matter which solution is picked, as any solution gives us the desired properties. Now suppose we want to create the Daubechies $N = 4$ scaling filter. Then (3.4.4) gives the system

$$\begin{cases}
h(0) + h(1) + h(2) + h(3) = \sqrt{2} \\
h(0) - h(1) + h(2) - h(3) = 0 \\
h(0)^2 + h(1)^2 + h(2)^2 + h(3)^2 = 1 \\
h(0) h(2) + h(1) h(3) = 0 \\
h(1) - 2h(2) + 3h(3) = 0
\end{cases} \tag{3.4.5}$$

which has one solution [12]

$$h(0) = \frac{1 + \sqrt{3}}{4\sqrt{2}}, \quad h(1) = \frac{3 + \sqrt{3}}{4\sqrt{2}}, \quad h(2) = \frac{3 - \sqrt{3}}{4\sqrt{2}}, \quad h(3) = \frac{1 - \sqrt{3}}{4\sqrt{2}} \tag{3.4.6}$$

and the other

$$h(0) = \frac{1 - \sqrt{3}}{4\sqrt{2}}, \quad h(1) = \frac{3 - \sqrt{3}}{4\sqrt{2}}, \quad h(2) = \frac{3 + \sqrt{3}}{4\sqrt{2}}, \quad h(3) = \frac{1 + \sqrt{3}}{4\sqrt{2}} \tag{3.4.7}$$

hence the two solutions are reflections of each other. The first solution is generally considered the default choice, although both are valid. By using the scaling filter (3.4.6) in combination with the scaling equation given in (3.2.2) we can see what the Daubechies scaling function looks like. This is a simple iterative process and the exact details can be found in [5] (p. 25). The plot of the Daubechies $N = 4$ scaling function can be seen in the left panel of Figure 3.1. By using $N = 4$ we obtain a very complicated function compared to $N = 2$, which gives the Haar scaling function. With the Haar scaling function we can express constant functions, while with the Daubechies $N = 4$ scaling function we can also express lines, that is, polynomials of degree one. The reason is that the translations of the Daubechies $N = 4$ scaling function fit together like a puzzle, which in turn creates straight lines. In general the Daubechies scaling function for a filter of length $N$ allows us to express polynomials up to degree $N/2 - 1$. This means that by using a longer filter we are able to express more complicated functions when using the scaling function as the basis element, or equivalently that $V_m$ is a more complex space for longer scaling filters.

[Figure 3.1: Plot of the Daubechies $N = 4$ scaling and wavelet functions. (a) Daubechies $N = 4$ scaling function $\phi(t)$. (b) Daubechies $N = 4$ wavelet function $\psi(t)$.]
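The two Daubechies $N = 4$ solutions can be checked against the system (3.4.5) numerically. The short Python sketch below is only a verification aid, with the hypothetical helper name `residuals`: it returns, for each of the five equations, the left-hand side minus the right-hand side.

```python
import math

s2, s3 = math.sqrt(2), math.sqrt(3)

# Scaling filter (3.4.6); the reversed list is the reflected solution (3.4.7).
db4_h = [(1 + s3) / (4 * s2), (3 + s3) / (4 * s2),
         (3 - s3) / (4 * s2), (1 - s3) / (4 * s2)]
db4_h_reflected = db4_h[::-1]


def residuals(h):
    """Left-hand side minus right-hand side for each equation of (3.4.5)."""
    return [
        sum(h) - s2,                       # sum of the filter
        h[0] - h[1] + h[2] - h[3],         # alternating sum
        sum(x * x for x in h) - 1,         # normalization
        h[0] * h[2] + h[1] * h[3],         # orthogonality to the double shift
        h[1] - 2 * h[2] + 3 * h[3],        # first moment constraint
    ]


print(residuals(db4_h))
print(residuals(db4_h_reflected))
```

All residuals are zero up to rounding error for both filters, in line with the statement that either solution can be used.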

Once we have our scaling filter $h(n)$ we can define the wavelet filter $g(n)$ by using the quadrature mirror relation (QMR) as defined in (3.3.14). Recall that the QMR tells us how the $g(n)$ coefficients are constructed from the $h(n)$ coefficients. If we use the scaling filter given by (3.4.6) in combination with our QMR, then the wavelet coefficients are given by

$$g(0) = \frac{1 - \sqrt{3}}{4\sqrt{2}}, \quad g(1) = -\frac{3 - \sqrt{3}}{4\sqrt{2}}, \quad g(2) = \frac{3 + \sqrt{3}}{4\sqrt{2}}, \quad g(3) = -\frac{1 + \sqrt{3}}{4\sqrt{2}}$$

which is the Daubechies $N = 4$ wavelet filter. Note that this is almost the same as the reflected solution given in (3.4.7); the only difference is the minus sign on the second and fourth coefficients. By using this filter in combination with the wavelet equation (3.3.1), the Daubechies $N = 4$ wavelet function can be obtained. This function is shown in the right panel of Figure 3.1. The process of finding the scaling and wavelet filter coefficients can of course be carried out for any given even number $N$. First we solve (3.4.4) and pick one of its solutions as the scaling filter. In the next step we find the wavelet filter by applying the QMR to the scaling filter. Just for completeness we list the conditions of the wavelet filter, thus

$$\begin{cases}
\displaystyle \sum_{n=0}^{N-1} g(n) = 0 \\[2ex]
\displaystyle \sum_{n=0}^{N-1} g(n) (-1)^n = \sqrt{2} \\[2ex]
\displaystyle \sum_{n=0}^{N-1} g(n)^2 = 1 \\[2ex]
\displaystyle \sum_{n=0}^{N-1} g(n) g(n - 2j) = 0, \qquad j = 1, 2, \dots, \frac{N}{2} - 1 \\[2ex]
\displaystyle \sum_{n=0}^{N-1} n^j g(n) = 0, \qquad j = 0, 1, \dots, \frac{N}{2} - 1
\end{cases} \tag{3.4.8}$$

which is easily obtained from (3.4.4), the QMR formula (3.3.14) and previous results. Note that the normalization and orthogonality constraints are unchanged. The difference is that the first two sums swap roles: if we sum the wavelet filter the result should be zero, and if we put a minus sign on every other wavelet filter coefficient the sum should be equal to $\sqrt{2}$. Correspondingly, the alternating sign disappears from the moment constraint.
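As a sanity check, the conditions on the wavelet filter can be evaluated for the Daubechies $N = 4$ case. The Python sketch below builds $g(n)$ from the scaling filter (3.4.6) with the QMR (3.3.14) and computes the sum, the alternating sum, the norm, the double-shift product and the first moment $\sum_n n\, g(n)$ of the wavelet filter, together with the inner product $\sum_n h(n) g(n)$. The dictionary keys are labels chosen for this illustration.

```python
import math

s2, s3 = math.sqrt(2), math.sqrt(3)
h = [(1 + s3) / (4 * s2), (3 + s3) / (4 * s2),
     (3 - s3) / (4 * s2), (1 - s3) / (4 * s2)]   # scaling filter (3.4.6)
N = len(h)
g = [(-1) ** j * h[N - 1 - j] for j in range(N)]  # QMR (3.3.14)

# Each entry is "left-hand side minus expected value" and should vanish.
checks = {
    "sum": sum(g),
    "alternating sum": sum((-1) ** n * g[n] for n in range(N)) - s2,
    "normalization": sum(x * x for x in g) - 1,
    "double shift": g[2] * g[0] + g[3] * g[1],
    "first moment": sum(n * g[n] for n in range(N)),
    "h against g": sum(h[n] * g[n] for n in range(N)),
}

for name, value in checks.items():
    print(f"{name:>16}: {value:+.2e}")
```

Every value is zero up to rounding error, so the filter obtained from the QMR satisfies the listed conditions and is orthogonal to the scaling filter.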


Chapter 4 The Discrete Wavelet Transform

The wavelet transform can be used to capture both the frequency and the time content of a signal by splitting it up into different resolutions. This is an advantage over the DFT, which only captures the frequency content. In this chapter we will explore the DWT and try to understand it from a mathematical point of view. Some results from the previous chapters will be used, such as the multiresolution analysis concept. Once we use the MRA property, it is just a matter of finding expressions for the coefficients, or inner products, as we will see.

4.1 Derivation of the Discrete Wavelet Transform

In this section we will see how the DWT can be derived using wavelet theory. First we note that the MRA property $V_m = V_{m-1} \oplus W_{m-1}$ means that any function $f(t) \in V_m$ can be written as [5]

$$f(t) = \sum_{n=-\infty}^{\infty} c_m(n)\, D^m T^n \phi(t) \tag{4.1.1}$$

$$\phantom{f(t)} = \sum_{n=-\infty}^{\infty} c_{m-1}(n)\, D^{m-1} T^n \phi(t) + \sum_{n=-\infty}^{\infty} d_{m-1}(n)\, D^{m-1} T^n \psi(t) \tag{4.1.2}$$

for some coefficients $c_{m-1}(n)$ and $d_{m-1}(n)$ given $c_m(n)$. This decomposition is always possible under the assumption that the scaling function generates a multiresolution analysis. The goal is to find an explicit form of the coefficients $c_{m-1}(n)$ and $d_{m-1}(n)$ in terms of $c_m(n)$. To derive this we first note that the translation and dilation operators satisfy the relation [3]

$$T D \phi(t) = T \sqrt{2}\, \phi(2t) = \sqrt{2}\, \phi(2t - 2) = D T^2 \phi(t).$$

In other words, the dilation and translation operators satisfy $T D = D T^2$. By using this result in combination with the scaling equation given in (3.2.2), for a scaling filter of length $N$, we see that [3]

$$\begin{aligned}
D^m T^n \phi(t) &= D^m T^n \sum_{l=0}^{N-1} h(l)\, D T^l \phi(t) \\
&= \sum_{l=0}^{N-1} h(l)\, D^m T^n D T^l \phi(t) \\
&= \sum_{l=0}^{N-1} h(l)\, D^m D T^{2n} T^l \phi(t) \\
&= \sum_{l=0}^{N-1} h(l)\, D^{m+1} T^{2n+l} \phi(t).
\end{aligned} \tag{4.1.3}$$

Due to the orthogonality between $\phi(t)$ and $\psi(t)$, and the property $V_m = V_{m-1} \oplus W_{m-1}$, we can write

$$D^m T^k \phi(t) = \sum_n \langle D^{m-1} T^n \phi, D^m T^k \phi \rangle\, D^{m-1} T^n \phi(t) + \sum_n \langle D^{m-1} T^n \psi, D^m T^k \phi \rangle\, D^{m-1} T^n \psi(t),$$

hence we split up the scaling function in $V_m$ by using the scaling function and the wavelet at the lower resolution as a basis. Next we want to see how the inner products can be expressed. We begin with the first one,

$$\begin{aligned}
\langle D^{m-1} T^n \phi, D^m T^k \phi \rangle &= \Big\langle \sum_{l=0}^{N-1} h(l)\, D^m T^{2n+l} \phi, D^m T^k \phi \Big\rangle \\
&= \sum_{l=0}^{N-1} h(l)\, \langle D^m T^{2n+l} \phi, D^m T^k \phi \rangle \\
&= \sum_{l=0}^{N-1} h(l)\, \langle T^{2n+l} \phi, T^k \phi \rangle \\
&= \sum_{l=0}^{N-1} h(l)\, \delta(2n + l, k) = h(k - 2n).
\end{aligned}$$

By the same argument, now with the wavelet equation (3.3.1) in place of the scaling equation in (4.1.3), we get that $\langle D^{m-1} T^n \psi, D^m T^k \phi \rangle = g(k - 2n)$. Now if we put everything together we have

$$D^m T^k \phi(t) = \sum_n h(k - 2n)\, D^{m-1} T^n \phi(t) + \sum_n g(k - 2n)\, D^{m-1} T^n \psi(t)$$

so that for any $f(t) \in V_m$

$$\begin{aligned}
f(t) &= \sum_{k=0}^{2^m - 1} c_m(k)\, D^m T^k \phi(t) \\
&= \sum_{k=0}^{2^m - 1} c_m(k) \sum_n h(k - 2n)\, D^{m-1} T^n \phi(t) + \sum_{k=0}^{2^m - 1} c_m(k) \sum_n g(k - 2n)\, D^{m-1} T^n \psi(t) \\
&= \sum_{n=0}^{2^{m-1} - 1} c_{m-1}(n)\, D^{m-1} T^n \phi(t) + \sum_{n=0}^{2^{m-1} - 1} d_{m-1}(n)\, D^{m-1} T^n \psi(t)
\end{aligned}$$

where we defined [3]

$$c_{m-1}(n) = \sum_{k=2n}^{2n + N - 1} h(k - 2n)\, c_m(k) \tag{4.1.4}$$

and

$$d_{m-1}(n) = \sum_{k=2n}^{2n + N - 1} g(k - 2n)\, c_m(k). \tag{4.1.5}$$

From these equations it is clear that, given the vector $c_m$ and the scaling and wavelet filters, the coefficient vectors $c_{m-1}$ and $d_{m-1}$ can be calculated from the formulas above. The result can then be inserted back into (4.1.2) to obtain the $V_{m-1} \oplus W_{m-1}$ decomposed version of the function. This is what we call the discrete wavelet transform of the function. Note that this process can be iterated, since we can use the vector $c_{m-1}$ to obtain the coefficients $c_{m-2}$ and $d_{m-2}$.
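One decomposition step based on (4.1.4) and (4.1.5) can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes a finite coefficient vector of even length whose indices wrap around periodically (an assumption made here so that the index $k$ never leaves the vector), and it lets $k - 2n$ run over the support of the filter. The function name `dwt_step` and the sample vector are hypothetical.

```python
import math


def dwt_step(c, h, g):
    """Apply (4.1.4) and (4.1.5) once; indices of c are taken modulo len(c)."""
    M = len(c)
    if M % 2 != 0:
        raise ValueError("the coefficient vector must have even length")
    approx = [sum(h[l] * c[(2 * n + l) % M] for l in range(len(h)))
              for n in range(M // 2)]
    detail = [sum(g[l] * c[(2 * n + l) % M] for l in range(len(g)))
              for n in range(M // 2)]
    return approx, detail


s2 = math.sqrt(2)
haar_h = [1 / s2, 1 / s2]
haar_g = [1 / s2, -1 / s2]

c2 = [4.0, 2.0, 5.0, 5.0]              # hypothetical c_m with m = 2
c1, d1 = dwt_step(c2, haar_h, haar_g)  # c_{m-1} and d_{m-1}
c0, d0 = dwt_step(c1, haar_h, haar_g)  # iterate on the approximation part
print(c1, d1)
print(c0, d0)
```

With the Haar filters each entry of `c1` is a scaled pairwise sum and each entry of `d1` a scaled pairwise difference of the input, and the sum of squares of the input equals the combined sum of squares of `c1` and `d1`.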

4.2 Matrix Formulation

The relations (4.1.4) and (4.1.5) allow us to find the coefficients that split the signal into two components in the lower resolution space. This can also be written in matrix form, which is the main focus of this section. The matrix formulation of the wavelet transform allows us to understand it from a linear algebra point of view. In addition, it lets us see its connection to the Fourier matrix, as we will explain in the next section.

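Suppose the scaling and wavelet filters are of length $N$, which is always divisible by two. Then the matrix formulation of the coefficient equation in (4.1.4) is given by the linear equation [10]

$$\begin{pmatrix}
h(0) & h(1) & \cdots & h(N-1) & & & & & \\
 & & h(0) & h(1) & \cdots & h(N-1) & & & \\
 & & & & h(0) & h(1) & \cdots & h(N-1) & \\
 & & & & & & \ddots & & \\
h(2) & h(3) & \cdots & h(N-1) & & & & h(0) & h(1)
\end{pmatrix}
\begin{pmatrix}
c_m(0) \\ c_m(1) \\ \vdots \\ c_m(2^m - 1)
\end{pmatrix}
=
\begin{pmatrix}
c_{m-1}(0) \\ c_{m-1}(1) \\ \vdots \\ c_{m-1}(2^{m-1} - 1)
\end{pmatrix}$$

Each row contains the scaling filter shifted two columns to the right relative to the row above, with zeros elsewhere, and in the last rows the filter wraps around to the first columns.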
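The structure of this matrix can be made concrete with a small Python sketch. It is an illustration under the wrap-around layout shown in the matrix above: row $n$ holds the filter starting at column $2n$, with column indices taken modulo the signal length. The helper names `scaling_matrix` and `matvec`, and the sample vector, are assumptions introduced here.

```python
import math


def scaling_matrix(h, M):
    """(M/2) x M matrix whose row n holds h shifted 2n columns, wrapping around."""
    rows = [[0.0] * M for _ in range(M // 2)]
    for n in range(M // 2):
        for l, coeff in enumerate(h):
            rows[n][(2 * n + l) % M] += coeff
    return rows


def matvec(A, x):
    """Plain matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


s2, s3 = math.sqrt(2), math.sqrt(3)
db4_h = [(1 + s3) / (4 * s2), (3 + s3) / (4 * s2),
         (3 - s3) / (4 * s2), (1 - s3) / (4 * s2)]

H = scaling_matrix(db4_h, 8)
c3 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # hypothetical c_m with m = 3
c2 = matvec(H, c3)                              # c_{m-1} as in (4.1.4)

# Gram matrix of the rows; orthonormal rows give the identity matrix.
gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in H] for r1 in H]
print(c2)
print(gram)
```

The Gram matrix of the rows equals the identity up to rounding error, which is the normalization and orthogonality constraints of (3.4.2) expressed in matrix form.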


Contents

Chapter 1 Introduction
1.1 Background
1.2 Approach

Chapter 2 The Discrete Fourier Transform
2.1 Basic Properties
2.2 Matrix Formulation

Chapter 3 Multiresolution Analysis
3.1 Translation & Dilation Operators
3.2 Scaling Function Properties
3.3 Wavelet Function Properties
3.4 Construction of the Scaling Filter

Chapter 4 The Discrete Wavelet Transform
4.1 Derivation of the Discrete Wavelet Transform
4.2 Matrix Formulation