Restoration of clipped sound signals - a weighted Fourier series and AR approach



A bachelor thesis on Optimization

Anders Gidmark (gidmark@kth.se), Helena Olofson (holofson@kth.se)

Supervisor: Per Enqvist

SA104X Degree Project in Engineering Physics, First Level

Department of Mathematics, Optimization and Systems Theory

Royal Institute of Technology (KTH)

Stockholm, Sweden


Abstract

Sound signals can be distorted in many different ways; one of them is called clipping. A clipped sound signal differs from a non-clipped signal in that the amplitudes of the sound wave that exceed a certain amplitude threshold have been partially lowered, or lowered completely to the threshold. The latter is called hard clipping. Since data is lost when a signal is clipped, there is an interest in restoring the signal. For a hard clipped signal, it is often impossible to restore the signal perfectly.

In this thesis two different methods for partially restoring a symmetrically hard clipped signal are suggested. The two methods considered are a weighted Fourier series (WFS) fit and an autoregressive (AR) model approach. Both methods attempt to restore the signal by solving optimization problems designed to minimize the errors of the respective model.

Evaluation and comparison of the two methods showed that the AR method typically performed better than the WFS method. The AR method was effective at restoring the signal, while the WFS method stuck close to the clipped signal, which might be due to differences in their optimization problems.


1 Introduction

In this bachelor's thesis we develop and examine different approaches to the audio distortion known as clipping, which can occur due to hardware limitations when transferring or recording audio. In most cases it is best to prevent clipping in the first place. However, if clipping has occurred, one might want to find a way to restore the original signal from the clipped signal. Two different methods to partially restore the signal and update the clipped values are suggested. We consider an autoregressive (AR) model approach and a weighted Fourier series least squares fit. These two methods were in part inspired by articles we read, among others "Missing Data Recovery Via a Nonparametric Iterative Adaptive Approach" [9], which deals with the problem of recovering missing data from the available data.

In the report, we first introduce the concept of clipping and how it can arise. The theory behind the Fourier series approach and the autoregressive model approach is presented, and an insight into the theory of optimization is given. In the next section, the two approaches are explained in more detail. In the simulation section the developed algorithm is explained and the results obtained using the Fourier series approach and the AR approach are presented. The results are discussed in the section that follows.

2 Theory

Clipping is a form of distortion of a sound wave and is often caused by hardware limitations, for example when a speaker or microphone has been overloaded. There are different types of clipping, for example symmetrical and asymmetrical clipping. In symmetrical clipping the wave is clipped by the same amount for positive and negative output values, while in asymmetrical clipping the wave is clipped by different amounts for positive and negative output values.

Clipping can also be hard or soft. In hard clipping the wave is cut off at a certain level and stays at the amplitude threshold for a certain time. The clipped output signal cannot assume values above the amplitude threshold for positive output values or below the negative amplitude threshold for negative output values. In soft clipping the top and bottom of the output signal are slightly rounded and there is no evident amplitude threshold as in hard clipping [1].

Figure 1 illustrates a signal that has been subject to symmetrical soft and hard clipping. The hard clipped signal is red and the soft clipped signal is green in the figure. In this report we restrict ourselves to symmetrical hard clipping and real-valued signals.
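As a minimal sketch of symmetrical hard clipping as described above, the following Python function limits every sample to the range [-threshold, threshold]. The function name `hard_clip` and the example values are our own illustrative choices and are not taken from the thesis code.

```python
def hard_clip(signal, threshold):
    """Symmetrical hard clipping: limit every sample to [-threshold, threshold]."""
    return [max(-threshold, min(threshold, value)) for value in signal]


# Samples inside the range are unchanged, samples outside are set to the threshold.
example = hard_clip([0.2, 0.9, -1.5, -0.3], 0.5)
```

Samples that end up exactly at the threshold are the ones that a restoration method has to treat as unknown.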


Figure 1: An example of a symmetrical soft clipped, symmetrical hard clipped and an unclipped signal

The different types of clipping manifest themselves differently. When a musical instrument is played, particles in the air start to vibrate and build up sound waves. A sound wave can consist of different harmonics, depending on the shape of the signal. If the sound wave is a sine wave it has all its energy at the fundamental frequency f, which is the frequency of the fundamental tone (first harmonic). The first overtone (second harmonic) has a frequency of 2f and the second overtone (third harmonic) a frequency of 3f. The odd-order harmonics have a frequency equal to an odd number times the fundamental frequency, while the even-order harmonics have a frequency equal to an even number times the fundamental frequency. The high-order harmonics thus have a higher frequency than the low-order harmonics [3].

Even and odd harmonics, as well as low- and high-order harmonics, sound audibly different. The even and low-order harmonics sound more musical, whereas the odd and high-order harmonics sound harsher. In symmetrical clipping the odd-order harmonics are more evident, whereas both even and odd harmonics are evident in asymmetrical clipping. The more asymmetrical the clipping, the more evident the even-order harmonics. In hard clipping the high-order harmonics are more prominent, whereas in soft clipping the low-order harmonics are more evident [4].
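The statement about symmetrical clipping and odd-order harmonics can be checked numerically for the simplest case, a pure sine wave. The sketch below clips one period of a sine symmetrically and evaluates single bins of the discrete Fourier transform. The helper `harmonic_magnitude`, the period length of 64 samples and the threshold 0.5 are assumptions made for this illustration only.

```python
import cmath
import math


def harmonic_magnitude(samples, harmonic):
    """Amplitude of one harmonic of a signal that spans exactly one period."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * harmonic * i / n)
              for i, x in enumerate(samples))
    return 2.0 * abs(acc) / n


N = 64
sine = [math.sin(2.0 * math.pi * i / N) for i in range(N)]
# Symmetrical hard clipping at half the amplitude of the sine.
clipped = [max(-0.5, min(0.5, value)) for value in sine]

second = harmonic_magnitude(clipped, 2)  # even-order harmonic
third = harmonic_magnitude(clipped, 3)   # odd-order harmonic
```

For this example the second harmonic vanishes up to rounding errors while the third harmonic is clearly present, in line with the description above.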

Clipping is often caused by hardware limitations, which can occur, for example, in a loudspeaker system.

2.1 Clipping in a loudspeaker system

A loudspeaker system contains a crossover network, a tweeter and a woofer. The tweeter handles high frequency (HF) signals and the woofer low frequency (LF) signals. A common problem with loudspeaker systems is that the tweeter is blown if too much power is supplied [2]. A tweeter is more sensitive than a woofer and can only handle a fraction of the power supplied by an amplifier that is rated for use with the loudspeaker. In music LF signals are more common than HF signals, and therefore the tweeter is designed with less robust components. The HF signals have lower amplitude than the LF signals, so the LF signals are clipped first. It has been argued that when the LF signals clip at the power limit of the amplifier, additional HF content is generated, which eventually destroys the tweeter. Clipping can be frequency dependent, though in this thesis we only study frequency independent clipping.


However, in recent years it has become evident that this is not the whole truth. Today many musicians experiment with clipping in order to obtain new sounds. Amplifiers have also improved in recent years, with a larger dynamic range, and it is claimed that the sound becomes better when clipped on these. It is therefore quite common to overdrive the amplifier. When this happens the LF signal clips while the HF signal continues to increase, because its amplitude is much smaller. The result is a destroyed tweeter. Another reason a loudspeaker system can be damaged is that a clipped signal has a greater area underneath the curve, and therefore more power, than an unclipped signal [6].

2.2 Optimization

In optimization one usually wants to find the best solution from a set, given a certain condition. This can usually be reduced to minimizing or maximizing an objective function f subject to some known constraints.

$$
\begin{aligned}
\min_{x} \quad & f(x) \\
\text{s.t.} \quad & g_i(x) \le 0, \quad i = 1, \dots, m
\end{aligned}
\tag{1}
$$

The solutions to this general form must lie in the feasible set, which is given by $F = \{x \in \mathbb{R}^n : g_i(x) \le 0,\ i = 1, \dots, m\}$.

2.3 Quadratic optimization

As will be seen in Section 3, quadratic optimization is very useful when one wants to minimize the error in a linear equation or the difference between two functions. A general quadratic optimization problem can be stated as below.



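$$
\begin{aligned}
\min_{x} \quad & \tfrac{1}{2} x^T H x + c^T x, \quad H \in \mathbb{R}^{n \times n},\ c \in \mathbb{R}^n \\
\text{s.t.} \quad & Ax \ge b, \quad A \in \mathbb{R}^{m \times n},\ b \in \mathbb{R}^m
\end{aligned}
\tag{2}
$$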

This is a special case of problem (1). If c is set to zero, only the quadratic term is left. We restrict ourselves to the case where the objective function is convex. The convexity of f is guaranteed if H is positive semi-definite. When no constraint is active, every minimizing solution then satisfies the equation Hx = −c.

If H is positive definite, then f is strictly convex. This implies that the optimal solution to (2) is unique; when no constraint is active it is given by the equation Hx̂ = −c.
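The condition Hx = −c follows from setting the gradient of the objective function to zero. The short derivation below ignores the constraints Ax ≥ b, and the numerical values of H and c are an example of our own, assuming a symmetric H.

```latex
f(x) = \tfrac{1}{2} x^T H x + c^T x
\quad\Longrightarrow\quad
\nabla f(x) = Hx + c = 0
\quad\Longleftrightarrow\quad
Hx = -c .

% Example with a positive definite H:
H = \begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix}, \quad
c = \begin{pmatrix} -2 \\ -4 \end{pmatrix}
\quad\Longrightarrow\quad
\hat{x} = -H^{-1} c = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.
```

If this point violates a constraint, the constrained optimum lies on the boundary of the feasible set instead, which is the situation the active-set algorithm is designed to handle.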

Different solution algorithms can be used to solve quadratic optimization problems. One algorithm, the active-set method, is of special importance here. It is an iterative search algorithm and is used when solving our optimization problems (11) and (12), defined in Section 3. In the active-set algorithm the new search direction d_k is determined while staying on the boundaries of the active constraints. This search direction is a feasible descent direction for the objective function. The algorithm looks for solutions that lie in the defined feasible sets. The difficult part with the


2.4 Least-squares problem

A special type of a quadratic optimization problem is a least-squares problem which can be stated as below.





$$
\min_{x} \ \frac{1}{2} \sum_{k=1}^{N} \| A_k x - b_k \|^2, \quad A \in \mathbb{R}^{N \times M},\ x \in \mathbb{R}^{M \times 1},\ b \in \mathbb{R}^{N \times 1}
\tag{3}
$$

Here A_k is the k-th row vector of the matrix A. When the objective function is expanded and compared to the general objective function of a quadratic optimization problem, it can be concluded that in this case H = A^T A and c = −A^T b [8].
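The expansion referred to above can be written out as follows. It uses only the definitions in (2) and (3); the constant term does not affect the minimizer.

```latex
\frac{1}{2} \sum_{k=1}^{N} (A_k x - b_k)^2
= \frac{1}{2} \| Ax - b \|^2
= \frac{1}{2} x^T A^T A \, x - (A^T b)^T x + \frac{1}{2} b^T b ,

\text{so that } H = A^T A \text{ and } c = -A^T b .
\text{ Without constraints, } Hx = -c \text{ becomes the normal equations } A^T A \, x = A^T b .
```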

2.5 Literature

During the project several articles were studied. The article "Real-Time Perception-Based Clipping of Audio Signals Using Convex Optimization" [1] presents three optimization methods that keep the clipped output signal as close as possible to the input signal, while restricting the signal to a given amplitude range. In this publication a psychoacoustic model was used, which takes into account that the perceptibility of distortion depends on the frequency considered.

In the article "Missing Data Recovery Via a Nonparametric Iterative Adaptive Approach" [9], the authors present a method to recover missing data samples from the available samples. In this method an iterative algorithm is used to construct an effective weight matrix for fitting a complex-valued signal to a Fourier series. In their implementation they split the signal into two vectors, one holding the available samples and the other the missing samples. This way of thinking was used to keep track of the clipped values in our Fourier series method and in our AR method. The Fourier series method also keeps track of the Fourier series values corresponding to the clipped values.

In "Introduction to Digital Speech Processing" [6] the theory behind an autoregressive approach is presented. This approach is used in the article "A Simple Algorithm for the Restoration of Clipped Speech Signal" [10] for predicting the clipped y-values. There the prediction coefficients are obtained both by a least squares solution and by an algorithm based on Kalman filtering. Our AR method also determines the prediction coefficients by a least squares solution.

There are several audio software packages that correct clipping problems. Most of them are aimed at the professional market; Sony Sound Forge and iZotope, for example, offer professional software for the audio market. We have, however, not been able to find any such software that is publicly available.

2.6 Fourier series approach

During a short time interval, a sound signal can be regarded as periodic and thus approximated by a Fourier series with constant coefficients. The series is composed of even (cosine) terms, odd (sine) terms and a constant term for k = 0. The order of the Fourier series below is p and it has 2p + 1 terms.

$$
f(t) = \frac{a_0}{2} + \sum_{k=1}^{p} \left( a_k \cos(kt) + b_k \sin(kt) \right)
$$

By approximating the clipped signal with a Fourier series with a suitable choice of coefficients, an approximation of the original signal is possible. The Fourier series coefficients are chosen such that the quadratic error is minimized for given weights, where the error is the absolute value of the difference between the clipped signal and the approximating Fourier series.
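To make the truncated Fourier series concrete, the following sketch evaluates it at a single time t for given coefficients. The function name `fourier_series` and the way the coefficients are stored in two lists are illustrative assumptions, not the layout used in the thesis implementation.

```python
import math


def fourier_series(t, a, b):
    """Evaluate a_0/2 + sum_{k=1}^{p} (a_k cos(kt) + b_k sin(kt)).

    a = [a_0, a_1, ..., a_p] holds the cosine coefficients,
    b = [b_1, ..., b_p] holds the sine coefficients.
    """
    p = len(b)
    value = a[0] / 2.0
    for k in range(1, p + 1):
        value += a[k] * math.cos(k * t) + b[k - 1] * math.sin(k * t)
    return value


# Order p = 1: a_0/2 + a_1 cos(t) + b_1 sin(t) evaluated at t = 0.
at_zero = fourier_series(0.0, [2.0, 1.0], [5.0])
```

Because the series is linear in the coefficients, fitting them to sampled data is a linear least squares problem of the kind described in Section 2.4.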

2.7 Autoregressive

In the autoregressive (AR) model approach we assume that there is a relationship between the output values of the signal. A general formula for the actual series is

$$
y_k = \sum_{n=1}^{p} a_n y_{k-n} + G e_k. \tag{4}
$$

Here p is the order of the model, a_n are the filter coefficients, which can be assumed to be constant during a short time frame, G is the gain parameter and e_k is assumed to be Gaussian white noise. White noise has equal power at all frequencies, that is, a flat power spectrum. The filter function of an AR model has a polynomial in the denominator and the so-called gain in the numerator. The general formula for the filter function is

$$
H(z) = \frac{G}{1 - \sum_{n=1}^{p} \alpha_n z^{-n}}, \tag{5}
$$

which is derived from the prediction error y_k − ỹ_k. Here ỹ_k is the predicted output value,

$$
\tilde{y}_k = \sum_{n=1}^{p} \alpha_n y_{k-n} \tag{6}
$$

[6], where α_n are the prediction coefficients. From here on in the thesis, we denote the model error G e_k by u_k ≡ G e_k. Formula (4) can now be written as

$$
\sum_{n=0}^{p} \alpha_n y_{k-n} = u_k, \tag{7}
$$

α_0 is here defined to be one and a minus sign is absorbed in the constants α_n. The filter described by formula (5) is an infinite impulse response (IIR) filter: because of the feedback from earlier output values, its impulse response in general does not vanish after a finite number of samples. The filter function grows without bound when the denominator goes toward zero, that is, near the poles of the filter. In linear prediction analysis, the prediction error filter is a finite impulse response (FIR) filter. A discrete-time IIR filter is only stable if all its poles lie inside the unit circle in the z-plane, whereas an FIR filter is always stable. However, an IIR filter can typically achieve a given frequency response with a lower order, and thus has higher computational efficiency.

There are several methods that can be used to determine the AR coefficients. Two of them are the Yule-Walker equations and the covariance method, both of which are based on least squares [7].
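As a hedged illustration of a least squares estimate of the prediction coefficients in (6), the sketch below fits an AR model of order 2 by solving the 2x2 normal equations directly. It is not the Yule-Walker routine used later in the thesis; the function name `fit_ar2_least_squares` and the noise-free sinusoidal test signal are our own assumptions. A sampled sinusoid satisfies y_k = 2cos(ω) y_{k-1} − y_{k-2} exactly, so the expected coefficients are known.

```python
import math


def fit_ar2_least_squares(y):
    """Least squares estimate of (alpha_1, alpha_2) in
    y_k ~ alpha_1 * y_{k-1} + alpha_2 * y_{k-2}."""
    # Accumulate the entries of the 2x2 normal equations R * alpha = r.
    r11 = r12 = r22 = b1 = b2 = 0.0
    for k in range(2, len(y)):
        r11 += y[k - 1] * y[k - 1]
        r12 += y[k - 1] * y[k - 2]
        r22 += y[k - 2] * y[k - 2]
        b1 += y[k] * y[k - 1]
        b2 += y[k] * y[k - 2]
    # Solve the 2x2 system with Cramer's rule.
    det = r11 * r22 - r12 * r12
    alpha_1 = (b1 * r22 - b2 * r12) / det
    alpha_2 = (b2 * r11 - b1 * r12) / det
    return alpha_1, alpha_2


omega = 0.3
signal = [math.sin(omega * k) for k in range(200)]
alpha_1, alpha_2 = fit_ar2_least_squares(signal)
```

For real audio the order is much higher and the signal is noisy, so a general linear solver or a dedicated routine would be used instead of Cramer's rule.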


3 Method & model

Assume a symmetrically hard clipped real-valued signal y with N data points, such that the absolute value of any single data point cannot exceed yclip. Take all data points y_k (k = 1, 2, ..., N) where |y_k| = yclip and let them form the set yc, the set of all clipped data points in y. Let the indices of the clipped data points be the set c = {c_1, c_2, ..., c_{N_c}}. All unclipped data points and their indices then form the sets yg and g = {g_1, g_2, ..., g_{N_g}}, respectively. Let N_g and N_c be the number of data points in yg and yc respectively, such that N = N_g + N_c. Figure 2 illustrates how the data points belong to yg or yc depending on their amplitude.

Figure 2: A symmetrically hard clipped signal, where all clipped data points are in yc and all non-clipped data points are in yg. The index k of a clipped data point y_k is in the set c.
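The index sets c and g can be found directly from the signal, as sketched below. The function name `split_indices`, the tolerance `tol` and the use of 0-based indices are assumptions of this example. The sketch also assumes that the signal is in fact clipped, so that the largest magnitude equals yclip.

```python
def split_indices(y, tol=1e-12):
    """Return (y_clip, c, g) for a symmetrically hard clipped signal y.

    c holds the 0-based indices with |y_k| = y_clip, g holds all other indices.
    """
    y_clip = max(abs(value) for value in y)
    c = [k for k, value in enumerate(y) if abs(abs(value) - y_clip) <= tol]
    g = [k for k, value in enumerate(y) if abs(abs(value) - y_clip) > tol]
    return y_clip, c, g


y_clip, c, g = split_indices([0.1, 0.5, 0.5, -0.2, -0.5, 0.3])
```

Note that for a signal that is not clipped at all, this sketch would still mark its largest sample as clipped.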

Since it is known that no elements in yg are clipped, only the elements in yc are subject to change. For both approaches below, the restored elements in yc are subject to the constraints

$$
y_{clip} \le |y_i| \le y_{limit}, \quad \forall\, i \in c \tag{8}
$$

Here ylimit is introduced as an upper limit for |y_i|. For simplicity, ylimit is set to a multiple of yclip. For example, ylimit = 100 yclip means that the signal cannot be more than 99% clipped relative to the original signal, while ylimit = 10 yclip corresponds to a signal that cannot be more than 90% clipped. The limitation is introduced mostly as a physical restriction; the signal cannot be unbounded. There is also the matter that the more a signal is clipped, the more information is lost. 99% clipping means that a large part of the signal is lost, unless there were just a few outliers.
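The percentages above follow from comparing the threshold to the largest magnitude that the constraint (8) allows. The notation "clipped fraction" is introduced here only for this worked example.

```latex
\text{clipped fraction} = 1 - \frac{y_{clip}}{y_{limit}}, \qquad
y_{limit} = 100\, y_{clip} \Rightarrow 1 - \tfrac{1}{100} = 99\%, \qquad
y_{limit} = 10\, y_{clip} \Rightarrow 1 - \tfrac{1}{10} = 90\% .
```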

3.1 Fourier series approach

Let p be the maximum index of the Fourier series, such that the total number of terms is 2p + 1 (2p from a_1 to a_p and b_1 to b_p, and 1 from a_0/2).

$$
f(a, b, t_k) = \frac{a_0}{2} + \sum_{i=1}^{p} \left( a_i \cos(\omega_i t_k) + b_i \sin(\omega_i t_k) \right) \tag{9}
$$


Here the ω_i are chosen in a frequency window of interest, with higher frequencies for a higher index i. First consider a least squares fit under the constraint (8).

$$
\text{(FSP)} \quad
\begin{aligned}
\min_{y_c, a, b} \quad & \sum_{k=0}^{N} |y_k - f(a, b, t_k)|^2, \quad y \in \mathbb{R}^N,\ y_c \in \mathbb{R}^{N_c},\ a, b \in \mathbb{R}^p \\
\text{where} \quad & f(a, b, t_k) = \frac{a_0}{2} + \sum_{i=1}^{p} \left( a_i \cos(\omega_i t_k) + b_i \sin(\omega_i t_k) \right) \\
\text{s.t.} \quad & y_{clip} \le |f(a, b, t_i)| \le y_{limit}, \quad \forall\, i \in c
\end{aligned}
\tag{10}
$$

In this model, all data points are weighted equally, no matter whether they belong to yc or yg. To enable the model to place less importance on the elements belonging to yc, we introduce the weight vector w ∈ R^N, with 0 ≤ w_k ≤ 1, k = 1, 2, ..., N. The weights can be chosen essentially arbitrarily. For a simple approach, let all w_k where k ∈ c take a single value w_1 and all w_k where k ∈ g be set to 1.

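$$
\text{(WFSP)} \quad
\begin{aligned}
\min_{y_c, a, b} \quad & \sum_{k=0}^{N} w_k |y_k - f(a, b, t_k)|^2, \quad y, w \in \mathbb{R}^N,\ y_c \in \mathbb{R}^{N_c},\ a, b \in \mathbb{R}^p \\
\text{where} \quad & f(a, b, t_k) = \frac{a_0}{2} + \sum_{i=1}^{p} \left( a_i \cos(\omega_i t_k) + b_i \sin(\omega_i t_k) \right) \\
\text{s.t.} \quad & y_{clip} \le |f(a, b, t_i)| \le y_{limit}, \quad \forall\, i \in c
\end{aligned}
\tag{11}
$$

Solving this linear least squares problem lets us update yc.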
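The weighted objective can be evaluated for a given fit as in the sketch below, using the simple weighting scheme in which all clipped samples share one weight and all unclipped samples have weight 1. The function name `weighted_squared_error` and its argument names are illustrative assumptions; the sketch only evaluates the objective and does not solve the constrained problem.

```python
def weighted_squared_error(y, fitted, clipped_indices, w_clip=0.3, w_unclip=1.0):
    """Sum of w_k * (y_k - f_k)^2 with one weight for clipped samples
    and another for unclipped samples."""
    clipped = set(clipped_indices)
    total = 0.0
    for k, (y_k, f_k) in enumerate(zip(y, fitted)):
        w_k = w_clip if k in clipped else w_unclip
        total += w_k * (y_k - f_k) ** 2
    return total


# Sample 1 is clipped: 0.3 * 0.25 from it, 1.0 * 1.0 from sample 2.
objective = weighted_squared_error([1.0, 0.5, 0.0], [1.0, 1.0, 1.0], [1])
```

A small clipped weight lets the fitted series move away from the clipped values at little cost, while deviations at unclipped samples stay expensive.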

3.2 Autoregressive approach

Let p be the order of the AR model. After calculating the AR coefficients using the whole signal y, we can attempt to update the elements in yc. As per the AR model, the signal is under the constraint

$$
\sum_{n=0}^{p} \alpha_n y_{k-n} = u_k, \quad k = p, p+1, \dots, N.
$$

Consider the model error uk. Choosing the elements in yc in a way that the sum of the model error squared is minimized implies that the new data points in yc would follow the calculated AR coecients optimally. We still apply the constraint (8).

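$$
\text{(ARP)} \quad
\begin{aligned}
\min_{y_c, u} \quad & \sum_{k=p}^{N} u_k^2, \quad y \in \mathbb{R}^N,\ y_c \in \mathbb{R}^{N_c},\ u \in \mathbb{R}^{N-p},\ \alpha \in \mathbb{R}^p \\
\text{s.t.} \quad & \sum_{n=0}^{p} \alpha_n y_{k-n} = u_k, \quad k = p, p+1, \dots, N \\
& y_{clip} \le |y_i| \le y_{limit}, \quad \forall\, i \in c
\end{aligned}
\tag{12}
$$

When the signal has been updated, it is possible to calculate new AR coefficients and once again solve the optimization problem (12) to update the signal. The autoregressive approach can thus be implemented in an iterative manner.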
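To illustrate the idea behind (12), the sketch below treats the simplest possible case: a single clipped sample and known AR coefficients. With one unknown the sum of squared model errors is a one-dimensional convex quadratic, so the constrained minimizer is the unconstrained one limited to the range allowed by (8). This is a simplification of our own and not the quadratic programming solver used in the simulations; the function name `restore_single_sample`, the 0-based indexing and the test signal are assumptions.

```python
import math


def restore_single_sample(y, j, alpha, y_clip, y_limit):
    """Minimise sum_k u_k^2 over the single unknown y[j], subject to (8).

    alpha = [1, alpha_1, ..., alpha_p] as in (7), u_k = sum_n alpha[n] * y[k-n].
    The sign of the restored value is taken from the clipped value.
    """
    p = len(alpha) - 1
    sign = 1.0 if y[j] >= 0 else -1.0
    num = 0.0  # sum of alpha[k-j] * r_k
    den = 0.0  # sum of alpha[k-j]^2
    # Only the model errors u_j, ..., u_{j+p} depend on y[j].
    for k in range(max(j, p), min(j + p, len(y) - 1) + 1):
        coeff = alpha[k - j]
        # r_k is the part of u_k that does not involve y[j].
        r_k = sum(alpha[n] * y[k - n] for n in range(p + 1) if k - n != j)
        num += coeff * r_k
        den += coeff * coeff
    y_free = -num / den  # unconstrained minimiser
    magnitude = min(max(sign * y_free, y_clip), y_limit)
    return sign * magnitude


omega = 0.3
alpha = [1.0, -2.0 * math.cos(omega), 1.0]  # exact model for a sampled sinusoid
true_signal = [math.sin(omega * k) for k in range(20)]
clipped = list(true_signal)
clipped[5] = 0.9  # pretend that only this sample was clipped
restored = restore_single_sample(clipped, 5, alpha, y_clip=0.9, y_limit=9.0)
```

With several clipped samples in a row the unknowns are coupled, which is why the full problem is solved as a quadratic program.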


4 Simulation

The simulations in this section were done in MATLAB. The sound is open source and taken from [11]. The algorithm uses several parameters, which are explained below. The parameters were changed, one at a time, during the simulations to investigate the effects they had on the restoration of the signal. The signal is normalized before the algorithm is applied.

The parameter yclip is a measure of the amplitude threshold compared to the original signal; a lower yclip means that the amplitude threshold is lower. Thus yclip = 30% means that the amplitude threshold is 30% of the maximum amplitude of the signal before the clipping. The algorithm does not require knowledge of yclip in any way; it is only a measure of how much the signal is artificially clipped.

How the algorithm utilises overlapping intervals is illustrated in Figure 3.

Parameter    | Explanation                                                                    | Starting value
yclip        | The amplitude threshold of the signal compared to the signal before clipping. | 50%
Samplelength | This integer is the length of the sample intervals taken from the signal.     | 500
Iter         | How many times we perform the algorithm and update the clipped values.        | 1
Order        | Determines the order of the models used to update the signal.                 | 30 (AR), 200 (FS)
Overlap      | The number of overlapping intervals.                                           | 3
Weight       | The weights for the clipped and unclipped samples.                             | w_clip = 0.3, w_unclip = 1

Table 1: Brief explanation of the parameters that were changed during the simulations. The starting values correspond to parameter values that empirically gave a decent restored signal. In the figures of the restored signals in the subsections below, the parameters have their starting values unless stated otherwise.

We now explain how the algorithm works. The initial phase of the algorithm is to input the necessary parameters and choose one of the approaches. The next step is to find the maximum amplitude of the signal and use it to determine the index sets c and g. When this is completed, the restoration can begin. The data points in the signal y are divided into overlapping sample intervals. For all intervals that contain any clipped data points, the clipped data points are updated using the chosen approach. For the Fourier series method, this means solving the optimization problem (WFSP). For the AR method, the AR coefficients have to be calculated before the optimization problem (ARP) is solved. The restoration process is repeated for all the other overlapping intervals. Once this is completed, the average is taken for each data point over all the restored overlapping intervals and returned as the restored signal. Below is a more compact description of the algorithm.


Algorithm 1 Signal restoration algorithm

Choose either the Fourier series or the AR approach.
Find indices c and g.
Split y into sample intervals of size Samplelength; let the last interval be larger if needed.
for all intervals do
    if there is at least one index ∈ c in the current interval then
        repeat
            update the interval using the chosen approach
        until the chosen number of iterations has been performed
    end if
end for
Join together the split intervals.
Repeat the above for-all statement for each set of overlapping intervals.
The restored signal is now the average of all the separate overlapping intervals.

Figure 3: An illustration of 3 overlapping intervals.

Figure 3 describes how the overlapping intervals are used in the algorithm. In the figure, the white rectangles represent the sample intervals. Let y = [y_1, y_2, ..., y_N] and S ≡ Samplelength. In 'Overlap 1', the first interval contains y_1 to y_S, the second interval runs from index (S + 1) to (2S), and so on. In 'Overlap 2', the first interval has starting index (1 + S/3). For 'Overlap 3', the starting index is (1 + 2S/3). The grey rectangles are thus intervals ignored in Overlap 2 and 3. All intervals have length S, except the end intervals, which are allowed to be longer. All fractions are rounded to the nearest integer before being used as an index. In general, for n overlapping intervals the starting index of the first interval in 'Overlap 2' is (1 + S/n).
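The interval layout described above can be sketched as follows. The function names `overlap_offsets` and `split_intervals`, the 0-based half-open intervals and the handling of signals shorter than one interval are assumptions of this example rather than details taken from the thesis code.

```python
def overlap_offsets(sample_length, n_overlaps):
    """0-based start offset of the first interval in each overlap pass."""
    # int(x + 0.5) rounds non-negative fractions to the nearest integer.
    return [int(j * sample_length / n_overlaps + 0.5) for j in range(n_overlaps)]


def split_intervals(n_samples, sample_length, offset=0):
    """Half-open (start, end) intervals of length sample_length from offset on."""
    intervals = []
    start = offset
    while start + sample_length <= n_samples:
        intervals.append((start, start + sample_length))
        start += sample_length
    if intervals:
        # Let the last interval absorb the remaining samples.
        intervals[-1] = (intervals[-1][0], n_samples)
    elif offset < n_samples:
        # Assumption: a signal shorter than one interval forms a single interval.
        intervals.append((offset, n_samples))
    return intervals


offsets = overlap_offsets(500, 3)
first_pass = split_intervals(1100, 500, offsets[0])
second_pass = split_intervals(1100, 500, offsets[1])
```

With S = 500 and three overlaps, the 0-based offsets 0, 167 and 333 correspond to the 1-based starting indices 1, 1 + S/3 and 1 + 2S/3 after rounding. Samples before an offset are skipped in that pass, like the grey rectangles in Figure 3.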

Now on to the actual simulations. In the following part of this section, not all figures are shown due to the sheer number of figures. There is at least one figure per parameter; more figures can be found in the appendix.


Figure 4: The original signal used for the simulations.

Figure 4 shows the signal used in these simulations before the artificial clipping. The signal is 1.125 seconds long and features a woman speaking. Longer signals have not been used, in order to reduce the computing time needed, since the code is not optimized for fast runtime.

Figure 5: The signal, artificially clipped to 50% of its maximum magnitude. The lower graph shows the difference between the original signal and the clipped signal. The norm of the difference signal shown in the lower graph is 4.51.

From here on, for all the restorations, the norm of the difference between the restored signal and the original signal is denoted ydn, and is given for all figures in the simulation section. The norm of the difference between the clipped signal and the original signal is called ycn. As such, ycn = 4.51 for the clipped signal above.
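The error measures ydn and ycn can be computed as in the sketch below. The text does not state which norm is used; the Euclidean norm is assumed here, and the function name `difference_norm` is our own.

```python
import math


def difference_norm(x, y):
    """Euclidean norm of the difference between two equally long signals."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# ydn would be difference_norm(restored, original),
# ycn would be difference_norm(clipped, original).
example_norm = difference_norm([3.0, 0.0], [0.0, 4.0])
```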


4.1 Simulations using the AR algorithm

As mentioned above, the optimization problem (ARP) was implemented in an iterative manner. When updating the signal using the AR approach, the Yule-Walker AR coefficients were calculated with the built-in function aryule. These coefficients were then used as the AR coefficients in the optimization problem (ARP). The optimization problem was solved in MATLAB with the function quadprog, using the active-set algorithm. Figure 6 below illustrates the results obtained by using the starting values defined in Table 1. ydn = 1.93 for this restoration.

Figure 6: Restoration of the clipped signal, using the starting values for the parameters. The norm of the signal in the lower graph, the difference between the restored signal and the original signal, is ydn = 1.93.

As can be seen in Figure 6, the restored signal is closer to the original signal than the clipped signal. There are some spikes in the signal around index 6000.

Next, variations in the parameters are considered.


4.1.1 Overlapping intervals

Here, the effect of different numbers of overlapping intervals is simulated. Figure 7 below shows the result of using ten overlapping intervals. ydn = 1.76 for this restoration.

Figure 7: Restoration of the clipped signal, using 10 overlapping intervals. ydn = 1.76

Figure 7 shows a smoother signal than Figure 6. When more overlapping intervals are used, the restored signal typically becomes smoother, because more values are used in the average. Depending on the index at which the intervals begin, there may be some 'spikes' in the updated signals, which can be seen in Figure 6. These spikes are reduced when the number of overlapping intervals increases.


4.1.2 Iterations

Here, the number of iterations of the algorithm is varied. Figure 8 below shows the result of using two iterations. ydn = 8.17 for this restoration.

Figure 8: Restoration of the clipped signal, using two iterations. ydn = 8.17

Figure 8 illustrates that another iteration of the algorithm gives more spikes and larger deviations from the original. Several other simulations with a higher number of iterations indicate that the more iterations, the worse the result.


4.1.3 Sample length

Here, different lengths for the sample intervals are considered. Below is Figure 9, showing the result of using sample length 1500. ydn = 1.57 for this restoration.

Figure 9: Restoration of the clipped signal, using sample length 1500. ydn = 1.57.

Increasing the sample length appears to make the restored signal smoother, with fewer spikes. The maximum sample length tried was 3000 (see appendix, Figure 25), which gave a result similar to that for sample length 1500. When the sample length was reduced below 500, the restored signal became noticeably worse (Figure 24). The time needed for the computation increases as the sample length increases.


4.1.4 The order of the model

Here, the order of the model is varied. Figure 10 below shows the result of using order 200. ydn = 2.56 for this restoration.

Figure 10: Restoration of the clipped signal, using an AR model of order 200. ydn = 2.56

Figure 10 illustrates how the restoration of the signal can be worsened by choosing an AR model of higher order. Choosing a model of lower order can give a worse result as well, though not as bad as in the figure above.


4.1.5 Different values for yclip

Here, yclip is varied, so that the performance of the AR approach for different amplitude thresholds can be considered. Below is a figure with yclip = 0.8, for which ycn = 0.71.

Figure 11: The signal, artificially clipped to 80% of its maximum magnitude. The norm of the difference signal shown in the lower graph is ycn = 0.71.

Figure 12 shows the result of restoring the clipped signal in Figure 11 using the parameters in Table 1. ydn = 0.26 for this restoration.


Figure 12: Restoration of the clipped signal, where yclip = 0.8. ydn = 0.26

Figures 11 and 12 show that the less the signal is clipped, the simpler it is to restore. Being less clipped means that the amplitude threshold is higher, so less information is lost.

4.1.6 Multiple iterations for a less clipped signal

In this subsection, multiple iterations are performed on the clipped signal with yclip = 0.8, as seen in Figure 11. The sample length used is 1500. Below are two figures; the first uses two iterations of the algorithm and the second uses 12 iterations. ydn = 0.11 for the former and ydn = 0.19 for the latter.


Figure 13: Restoration of the clipped signal where yclip = 0.8, sample length 1500. Two iterations are performed. ydn= 0.11

Figure 14: Restoration of the clipped signal where yclip = 0.8, sample length 1500. 12 iterations are performed. ydn= 0.19

While the second iteration is near perfect and actually deviates less than the first iteration, the third iteration deviates a bit more. All iterations after the tenth give very similar results. Figure 31, showing 10 iterations, can be seen in the appendix. Thirteen iterations (Figure 32) give the same result as the simulation with 12 iterations seen in Figure 14.

4.2 Simulations using the Fourier series algorithm

When updating the signal using the Fourier series approach, the optimization problem (11) was solved. It was solved in MATLAB with lsqlin, also using the active-set algorithm.

Figure 15 below shows the restoration of the clipped signal using the standard parameters, here with the weights for the clipped values, wclip, and the weights for the unclipped values, wunclip, both set to one. The norm of the difference between the restored signal and the original signal, ydn, was 4.22.

Figure 15: Restoration of the clipped signal, using the standard parameters with wclip = 1 and wunclip = 1. ydn = 4.22

In order to determine the best weights, wclip was varied. wunclip was kept at one because the unclipped values are the same as in the original signal. The best restoration of the signal was achieved with wclip = 0.3 and wunclip = 1. This is illustrated in Figure 16 below, where ydn was 3.99.


Figure 16: Restoration of the clipped signal, using the standard parameters with wclip = 0.3. ydn = 3.99

All the simulations below are done with the weights wclip = 0.3 and wunclip = 1.


4.2.1 Overlapping intervals

Here, the effect of different numbers of overlapping intervals is simulated. Figure 17 shows the result of using ten overlapping intervals. ydn = 3.55 for this restoration.

Figure 17: Restoration of the clipped signal, using 10 overlapping intervals. ydn = 3.55

Figure 17 shows that the restoration of the signal is better using 10 overlapping intervals. The more overlapping intervals are used, the more time-consuming the calculations, but the result is better, with a smoother signal.


4.2.2 Iterations

Here, the number of iterations of the algorithm is varied. Figure 18 below shows the result of using two iterations. ydn = 4.46 for this restoration.

Figure 18: Restoration of the clipped signal, using two iterations. ydn = 4.46

Figure 18 shows that the restoration of the signal is worse using two iterations. The deviations from the original are more apparent, with greater spikes. This was also the case for the AR approach when two iterations were used, as can be seen in Figure 26. Figure 33 illustrates that the result becomes worse the more iterations are used.


4.2.3 Sample length

Here, different lengths for the sample intervals are considered. Figure 19 shows the result of using sample length 2000. ydn = 4.34 for this restoration.

Figure 19: Restoration of the clipped signal, using sample length 2000. ydn = 4.34.

Figure 19 shows that there are more spikes when using a sample length greater than 500. When using sample length 401 the restored signal is exactly the same as the clipped signal, which can be seen in Figure 34.


4.2.4 Order of the model

Here, the order of the model is varied. Figure 20 below shows the result of using order 235. ydn = 3.96 for this restoration.

Figure 20: Restoration of the clipped signal, using a Fourier series of order 235. ydn = 3.96

Figure 20 illustrates that the spikes in the restored signal are less pronounced using order 235 than with the standard parameters with order 200. ydn is also smaller. Order 249 gives less pronounced spikes than order 235, but ydn is greater than for order 235, as can be seen in Figure 35. The lower the order, the larger the deviations from the original signal. This is illustrated in Figure 36 using order 50.

4.2.5 Different values for yclip

Here, yclip is varied so that the performance of the Fourier series approach for different amplitude thresholds can be considered. Figure 11 in Section 4.1.5 shows the signal artificially clipped to 80% of its maximum magnitude. Below is a figure showing the restored signal when yclip = 0.8, for which ydn = 0.66.

Figure 21: Restoration of the clipped signal, where yclip= 0.8. ydn= 0.66

Figure 21 shows that ydn is lower when less information is lost. Restoring the signal with multiple iterations gave no notable improvement over a single iteration. This is illustrated in Figure 38 and Figure 39.

5 Discussion

The AR approach restores the signal well: for certain choices of parameters the restored signal is closer to the original signal than the clipped signal is. For example, consider the clipped signal with yclip = 0.5 in Figure 5, where ycn = 4.51. Figure 9, showing the restored signal with sample length 1500, has the difference norm ydn = 1.57. The ratio of the two norms is 1.57/4.51 = 0.348, i.e. 34.8%. Since the percentage is below 50%, the restored signal in this case actually deviates less from the original signal than the clipped signal. It is noteworthy that ydn, the norm of the difference between the restored signal and the original signal, does not tell everything about the signal restoration; it gives no information on whether there are many small deviations or a single large deviation.
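The ratio above, and the remark about what a single norm value hides, can be checked with a few lines of Python. This is only a sketch: it assumes that ydn and ycn are Euclidean norms of the sample-wise difference (the norm is not restated in this section), and the helper name `diff_norm` and the two toy error patterns are hypothetical.

```python
import math

def diff_norm(signal, reference):
    """Euclidean norm of the sample-wise difference (assumed definition of ydn and ycn)."""
    return math.sqrt(sum((s - r) ** 2 for s, r in zip(signal, reference)))

# Norms quoted in the text: clipped signal (Figure 5) and restored signal (Figure 9).
ycn, ydn = 4.51, 1.57
ratio = ydn / ycn
print(f"ydn / ycn = {ratio:.3f} ({100 * ratio:.1f}%)")

# Two hypothetical error patterns measured against the same reference signal.
reference = [0.0] * 100
many_small = [0.1] * 100             # 100 deviations of size 0.1
single_large = [1.0] + [0.0] * 99    # one deviation of size 1.0
print(diff_norm(many_small, reference), diff_norm(single_large, reference))
```

Both toy patterns give essentially the same norm, which is why a low ydn alone does not say whether the restored signal has a few audible spikes or many small deviations.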

AR approach. For some choices of parameters the restoration of the signal is closer to the original than that obtained using the starting value parameters. This is the case, for example, when using a larger number of overlapping intervals (Figure 17). The AR approach also performed better with a larger number of overlapping intervals. However, increasing the number of overlapping intervals lengthens the calculation time linearly, as twice the overlapping intervals typically means twice the computation time. Therefore a trade-off between performance and calculation time has to be considered when deciding the number of overlapping intervals.

The Fourier series approach is better suited to a large number of terms, but not so many that the number of terms equals or exceeds the sample length; if this happens, the fit is an exact depiction of the clipped signal (Figure 34). An exact depiction is possible because, according to the optimization problem (WFSP), the number of known y-values is then matched by the number of terms. Increasing the number of terms further gives more unknown coefficients than data points, i.e. an underdetermined system, and raises the computation time. The order, p, must therefore be chosen such that the number of terms, 2p + 1, stays below the sample length. When the order exceeds the sample length the calculations become heavy and therefore take a lot of time.
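The exact depiction of the clipped signal when 2p + 1 reaches the sample length can be reproduced on a toy signal. The sketch below is not the thesis implementation: it assumes a Fourier basis whose fundamental period equals the interval length, weight 1 on unclipped samples and wclip = 0.3 on clipped samples, and a plain weighted least-squares fit. The function names and the test signal are hypothetical.

```python
import numpy as np

def fourier_design(n_samples, order):
    """Columns: 1, cos(2*pi*k*t/N), sin(2*pi*k*t/N) for k = 1..order (2*order + 1 terms)."""
    t = np.arange(n_samples)
    cols = [np.ones(n_samples)]
    for k in range(1, order + 1):
        cols.append(np.cos(2 * np.pi * k * t / n_samples))
        cols.append(np.sin(2 * np.pi * k * t / n_samples))
    return np.column_stack(cols)

def weighted_fourier_fit(y, weights, order):
    """Weighted least-squares Fourier fit; returns the fitted sample values."""
    F = fourier_design(len(y), order)
    w = np.sqrt(weights)  # scaling rows by sqrt(weight) gives weighted least squares
    coef, *_ = np.linalg.lstsq(F * w[:, None], y * w, rcond=None)
    return F @ coef

# Toy signal with sample length 41, hard clipped at half its maximum magnitude.
n = 41
t = np.arange(n)
original = np.sin(2 * np.pi * 2 * t / n) + 0.5 * np.sin(2 * np.pi * 5 * t / n)
y_clip = 0.5 * np.max(np.abs(original))
clipped = np.clip(original, -y_clip, y_clip)
weights = np.where(np.abs(clipped) >= y_clip, 0.3, 1.0)  # assumed wclip = 0.3

for order in (5, 10, 20):
    fit = weighted_fourier_fit(clipped, weights, order)
    print(order, 2 * order + 1, np.max(np.abs(fit - clipped)))
```

For order 20 the number of terms equals the sample length, and the fit passes through every clipped sample regardless of the weights, which mirrors the behaviour reported for Figure 34. For lower orders the fit deviates from the clipped samples, which is the only regime in which a restoration can happen.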

For the AR model, choosing the order of the model is not as simple as with the Fourier series model. Figure 10 shows that restoring the signal with an AR model of order 200, nearly an order of magnitude higher than the starting value, significantly worsens the restoration. On the other hand, when using an AR model of order 3 (see Figure 22), an order of magnitude lower than the starting value, the restored signal was still not as bad as the one in Figure 10. These results imply that while the order of the AR model might need some tuning, it is typically safer to choose a lower order than a higher one.

The chosen sample length affected the two approaches differently. Since one of the initial assumptions of the AR model was that the AR coefficients are constant during a short time frame, worsened performance should be expected for a large enough sample length. In the simulations, however, the AR approach appeared to perform better with a longer sample length (see Figures 9 and 25), though the computation time increased as well due to the increased complexity of the calculation of the AR coefficients. Because of this increased computation time, sample lengths larger than 3000 data points were not tested for the AR model; it cannot be concluded whether a large enough sample length causes the AR approach to start performing worse.

For the Fourier series approach, the performance did noticeably worsen as the sample length increased (Figure 19). Considering the results when changing the order of the Fourier series model, the implication is that for the Fourier series approach, the order of the model relative to the sample length is what matters for the performance of the restoration of the signal. The importance appears to lie in choosing an order p such that 2p + 1 is close to, but does not equal or exceed the sample length.

Neither the Fourier series approach nor the AR approach performed well with an increasing number of iterations. A notable exception might be the signal shown in Figure 13; during the iteration process the signal was restored better after two iterations, which never occurred in the simulations where the signal was more clipped. While not conclusive, Figures 14 and 32 suggest that a signal restored by the AR approach can converge after an increased number of iterations, though it does not necessarily converge towards the original signal. This result suggests that the sound signal used for these simulations cannot be perfectly described by the AR model. Thus, it cannot be expected that the restored signal would converge towards the original signal. A possible reason why convergence depends on how much the signal was clipped is the estimation of the AR coefficients. The AR coefficients were calculated using the Yule-Walker equations. The extent to which the estimated AR coefficients are affected by how much the signal is clipped has not been thoroughly analyzed in this thesis. However, it is expected that the AR coefficients deviate more the more the signal is clipped, since there is less data available, which might determine whether the restored signal converges or not. When the signal is less clipped, the AR coefficients should be closer to what they would be if they were calculated for the original signal.
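How strongly clipping perturbs the Yule-Walker estimates can be explored on synthetic data. The following is a minimal sketch, not the code used for the simulations: it assumes the sign convention x[t] = a1 x[t-1] + ... + ap x[t-p] + u[t], uses biased sample autocovariances, and runs on a simulated AR(2) process with hypothetical coefficients rather than on the speech signal.

```python
import numpy as np

def yule_walker(x, order):
    """Estimate a[0..order-1] in x[t] = sum_k a[k-1] * x[t-k] + u[t] (assumed convention)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased sample autocovariances r[0..order].
    r = np.array([x[: n - k] @ x[k:] / n for k in range(order + 1)])
    # Yule-Walker equations: R a = r[1:], with Toeplitz matrix R[i, j] = r[|i - j|].
    idx = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    return np.linalg.solve(r[idx], r[1:])

# Simulated stable AR(2) process (hypothetical coefficients, not from the thesis).
rng = np.random.default_rng(0)
true_a = np.array([0.75, -0.5])
n = 20000
u = rng.standard_normal(n)
x = np.zeros(n)
for t in range(2, n):
    x[t] = true_a[0] * x[t - 1] + true_a[1] * x[t - 2] + u[t]

a_orig = yule_walker(x, 2)
print("unclipped:", a_orig)
for frac in (0.8, 0.5):
    y_clip = frac * np.max(np.abs(x))
    a_clip = yule_walker(np.clip(x, -y_clip, y_clip), 2)
    # Distance between coefficients estimated from the clipped and the original signal.
    print(frac, a_clip, np.linalg.norm(a_clip - a_orig))
```

Comparing the printed distances for the two thresholds gives a rough indication of how sensitive the estimated coefficients are to the amount of clipping; a systematic study on real speech would be needed to confirm the expectation stated above.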

The difference in performance between the AR approach and the Fourier series approach can be explained by the difference in their optimization problems, (ARP) (12) and (WFSP) (11). The objective of (ARP) is to minimize the sum of the squares of the model error u, while the objective of (WFSP) is to fit a Fourier series to the known data points, including the clipped ones. Since the objective function still tries to keep the fitted Fourier series as close as possible to the clipped data points as long as wclip > 0, it is not surprising that the restored signal stays close to the clipped signal. The fact that (ARP) does not rely on the clipped data points yc for its objective function lets it deviate from the clipped data points when attempting to restore the original signal.
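The freedom that an error-minimizing objective gives the clipped samples can be illustrated with an idealized example. The sketch below is not the (ARP) formulation itself: it treats the clipped samples as free unknowns, ignores any constraints (ARP) may place on them, assumes the AR coefficients are known exactly, and uses a noise-free sinusoid, which satisfies a second-order recursion with zero model error. The function name `ar_restore` is hypothetical.

```python
import numpy as np

def ar_restore(y, unknown, a):
    """Choose the unknown samples to minimize sum(u**2), u[t] = y[t] - sum_k a[k-1] * y[t-k]."""
    y = np.asarray(y, dtype=float).copy()
    unknown = np.asarray(unknown, dtype=bool)
    p, n = len(a), len(y)
    # Row (t - p) of A computes the model error u[t] for t = p..n-1, so u = A @ y.
    A = np.zeros((n - p, n))
    for row, t in enumerate(range(p, n)):
        A[row, t] = 1.0
        A[row, t - p:t] = -np.asarray(a)[::-1]  # lags p..1 sit at positions t-p..t-1
    # Split u = A_known @ y_known + A_unknown @ y_unknown and solve for y_unknown.
    rhs = -A[:, ~unknown] @ y[~unknown]
    sol, *_ = np.linalg.lstsq(A[:, unknown], rhs, rcond=None)
    y[unknown] = sol
    return y

# A sinusoid obeys y[t] = 2*cos(omega)*y[t-1] - y[t-2] exactly (zero model error).
omega = 2 * np.pi / 50
original = np.sin(omega * np.arange(200))
a = [2 * np.cos(omega), -1.0]

y_clip = 0.5
clipped = np.clip(original, -y_clip, y_clip)
unknown = np.abs(clipped) >= y_clip  # clipped samples are treated as unknown

restored = ar_restore(clipped, unknown, a)
print("max error of clipped signal: ", np.max(np.abs(clipped - original)))
print("max error of restored signal:", np.max(np.abs(restored - original)))
```

In this idealized setting the clipped values never enter the objective, so the solution is free to move away from them and the sinusoid is recovered. For a real speech signal the model error is not zero and the coefficients are themselves estimated from the clipped data, so such an exact recovery should not be expected.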

A disadvantage of the algorithm developed for the two methods is that it requires tuning of several parameters. When we tested the two methods we changed one parameter at a time to see the effect on the restoration of the signal. It was, however, quite difficult to test all combinations of parameters in a systematic way.

The code could also have been optimized for efficiency, to make the simulations faster. It was also sometimes hard to see any difference when comparing graphs obtained with different parameters. For the Fourier series approach, the graph obtained using the standard parameters with wclip = 0.25 (Figure 37) does not differ much from the graph using the standard parameters with wclip = 0.3 (Figure 16). Another source of error is how carefully the parameters were chosen. We performed a lot of simulations, but we limited ourselves to approximately seven simulations per parameter.

It should also be noted that all the simulations presented in this thesis were performed using only one sound source, speech. The methods were also tested on another type of sound, a train whistle. The Fourier series approach behaved essentially the same as for the speech sound. The AR approach performed a bit worse than for speech, though it still performed better than the Fourier series approach. It is possible that the Fourier series approach and the AR approach perform differently on other types of sound as well, for example musical instruments.

6 Conclusion

In this thesis, sound signals were modeled using both an AR model and a Fourier series model. The models were used to present two possible approaches for partially restoring a clipped sound signal: the AR approach and the Fourier series approach. These approaches were used to restore an artificially clipped speech sound signal. The results obtained using the two methods were evaluated and compared to each other. The AR approach was found to perform better than the Fourier series approach, which can be explained by the difference in their respective optimization problems. The objective function of the optimization problem for the Fourier series approach (WFSP) fits a Fourier series to the known data points, including the clipped ones. This is the reason the restored signal remains close to the clipped signal for the Fourier series approach. This is not the case for the optimization problem for the AR approach (ARP), where the magnitude of the model error is minimized; the clipped data points only affect the objective function indirectly through the calculation of the AR coefficients. This indirect effect is small if the AR coefficients estimated from the clipped signal are good.

Both methods produce a smoother restored signal when using more overlapping intervals. The Fourier series approach performs better when using a high order of the model relative to the sample length, while the AR approach performs better with larger sample lengths and a low order of the AR model. While the theory predicted that the AR model would perform worse at a large enough sample length, this could not be confirmed by the simulations, because the computation time grew quickly as the sample length increased. Neither method performed better with more iterations of the algorithm; it appeared not to converge. Further study revealed that the AR method might converge, though it will not necessarily converge towards the original signal.

In summary, the algorithm using the AR approach suggested in this thesis could effectively partially restore a clipped sound signal. The weakness of the algorithm is that it requires tuning of several parameters, though some of them were found to increase the performance of the restoration algorithm at the cost of longer computation time.

7 References

[1] Bruno Defraene, Toon van Waterschoot, Hans Joachim Ferreau, Moritz Diehl and Marc Moonen, "Real-Time Perception-Based Clipping Of Audio Signals Using Convex Optimization". Katholieke Universiteit Leuven, 2012.

[2] Jim Lesurf, "Clipping", www.st-andrews.ac.uk/~www_pa/Scots_Guide/audio/clipping/page1.html, retrieved 2013-04-13.

[3] Reginald Bain, "The Harmonic Series", http://teachersites.schoolworld.com/webpages/EAbalos/files/hs.pdf, retrieved 2013-04-13.

[4] R.G. Keen, "Effects Descriptions", http://www.geofex.com/effxfaq/fxdescr.htm, retrieved 2013-04-13.

[5] Monty Ross, Rane Corporation, "Power Amplifier Clipping And Its Effects On Loudspeaker", http://www.adx.co.nz/techinfo/audio/note128.pdf, retrieved 2013-04-13.

[6] L. R. Rabiner and R. W. Schafer, "Introduction to Digital Speech Processing", now, Vol. 1, pp. 75-96, 2007.

[7] Gidon Eshel, "The Yule Walker Equations for the AR Coefficients", http://www-stat.wharton.upenn.edu/~steele/Courses/956/ResourceDetails/YWSourceFiles/YW-Eshel.pdfl, retrieved 2013-04-13.

[8] A. Sasane and K. Svanberg, "Optimization", Department of Mathematics, Royal Institute of Technology, Stockholm, 2012.

[9] Petre Stoica, Jian Li, Jun Ling and Yubo Cheng, "Missing Data Recovery Via a Nonparametric Iterative Adaptive Approach", Uppsala University, University of Florida, 2009.

[10] Abdelhakim Dahimene, Mohamed Noureddine and Aarab Azrar, "A Simple Algorithm for the Restoration of Clipped Speech Signal", Boumerdes University, Boumerdes, Algeria, 2007.

[11] evincent, "Matlab reference software and evaluation software for under-determined speech and music data", http://sisec2008.wiki.irisa.fr/tiki-file_galleries.php, retrieved 2013-03-22.

A Graphs from the simulations

This section contains graphs from simulations of both the AR approach and the Fourier series approach that were only described in words in the simulation section. The simulations use the same starting values and clipped/original signal unless otherwise stated.

A.1 AR simulations

Figure 22: Restoration of the clipped signal, order 3. ydn = 2.05

Figure 23: Restoration of the clipped signal, using no overlaps. ydn = 1.94

Figure 24: Restoration of the clipped signal, using sample length 200. ydn = 6.36

Figure 26: Restoration of the clipped signal, using two iterations. ydn > 20

A.1.1 Multiple iterations for a less clipped signal

Figure 27: Restoration of the clipped signal where yclip = 0.8, sample length 1500. A single iteration is performed. ydn = 0.18

Figure 28: Restoration of the clipped signal where yclip = 0.8, sample length 1500. Three iterations are performed. ydn = 0.16

Figure 29: Restoration of the clipped signal where yclip = 0.8, sample length 1500. Four iterations are performed. ydn = 0.22

Figure 30: Restoration of the clipped signal where yclip = 0.8, sample length 1500. Eight iterations are performed. ydn = 0.28

Figure 31: Restoration of the clipped signal where yclip = 0.8, sample length 1500. Ten iterations are performed. ydn = 0.19

Figure 32: Restoration of the clipped signal where yclip = 0.8, sample length 1500. 13 iterations are performed. ydn = 0.19

A.2 Fourier series simulations

Figure 33: Restoration of the clipped signal, using 10 iterations. ydn = 5.92

ydn = 4.34

ydn = 7.47

Figure 37: Restoration of the clipped signal, where wclip = 0.25. ydn = 4.00

A.2.1 Multiple iterations for a less clipped signal

Figure 38: Restoration of the clipped signal where yclip = 0.8. Two iterations are performed. ydn = 0.66

Figure 39: Restoration of the clipped signal where yclip = 0.8. Ten iterations are performed. ydn = 0.65