Digital compensation of distortion in audio systems


Institutionen för systemteknik

Department of Electrical Engineering

Master's thesis

Digital compensation of distortion in audio systems

Master's thesis in Electronics Systems carried out at Linköping Institute of Technology

by

Fredrik Bengtsson, Rikard Berglund

LiTH-ISY-EX--10/4367--SE

Linköping 2010

Department of Electrical Engineering, Linköpings tekniska högskola, Linköpings universitet, SE-581 83 Linköping, Sweden


Supervisors: Kent Palmkvist, ISY, Linköpings universitet

Pär Gunnars Risberg, Actiwave AB

Examiner: Kent Palmkvist, ISY, Linköpings universitet


Division, Department: Electronics Systems, Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden

Date: 2010-05-06

Language: English

Report category: Examensarbete (master's thesis)

URL for electronic version: http://www.es.isy.liu.se, http://www.ep.liu.se

ISBN: -

ISRN: LiTH-ISY-EX--10/4367--SE

Title of series, numbering: -

ISSN: -

Title: Digital kompensering av distorsion i ljudsystem (Digital compensation of distortion in audio systems)

Authors: Fredrik Bengtsson, Rikard Berglund

Abstract

The advancements of computational power in low cost FPGAs are giving the opportunity to implement real-time compensation of loudspeakers and audio systems. The need for expensive commercial audio systems is reduced when the fidelity of much cheaper audio systems easily can be improved by real-time compensation. The topic of this thesis is to investigate and evaluate methods for digital compensation of distortion in audio systems. More specifically, a VHDL module is implemented to, when necessary, alleviate the problem of drastically deteriorating fidelity of the bass appearing when the input power is too high.

Keywords: Distortion, Digital Compensation, Signal Processing, Digital Filters, Audio


Acknowledgments

We would like to thank:

Kent Palmkvist
Pär Gunnars Risberg
Everyone at Actiwave AB
Sebastian Abrahamsson
Markus Råbe

for their help in this thesis.

Fredrik Bengtsson and Rikard Berglund


Contents

2.3.1 Amplifier
2.3.2 Loudspeakers
2.4 Filters
2.4.1 Analog filters
2.4.2 Digital filters
2.4.3 IIR filters
2.4.4 FIR filters
2.5 Class-D amplifiers
2.6 Audio compressor
2.6.1 Functionality
2.6.2 Limiter

3 Test equipment
3.1 Loudspeaker A
3.2 Amplifiers
3.2.1 Amplifier A
3.2.2 Amplifier B
3.2.3 Amplifier C
3.3 Sound card
3.4 Microphone
3.5 Digital oscilloscope

4 Investigation
4.1 Analysis of audio sequences
4.2 Identifying the cause of distortion
4.3 Model of clipping
4.3.1 One tone test
4.3.2 Two tone test
4.3.3 Conclusion
4.4 Saturation of amplifiers
4.4.1 Voltage saturation
4.4.2 Over-current protection
4.5 Voltage saturation/frequency dependency
4.5.1 Bass/midrange driver test
4.5.2 Full range loudspeaker test
4.6 Conclusion

5 The proposed solution
5.1 Basic functionality of the model
5.2 Essential structure of the limiter
5.2.1 LP prefiltering
5.2.2 LP postfiltering
5.2.3 Preserving a full frequency range signal
5.3 Amplitude reduction of the bass channel
5.3.1 Time multiplexing
5.3.2 Scaling
5.3.3 FFT and notch filter
5.3.4 Discussion and conclusions
5.4 Choice of filters
5.4.1 Crossover
5.5 The decision making block
5.5.1 The maximum block
5.5.2 Direct steering
5.6 Optional functionality
5.7 The volume control
5.8 The complete limiter
5.9 Limiter simulations
5.9.1 Limitation of a sinusoid with constant amplitude
5.9.2 Limitation of a ramping sinusoid
5.9.3 Limitation of music

6 VHDL implementation
6.1 FPGA
6.2 Clock domains
6.3 Implementation and optimization
6.3.1 Biquads
6.3.2 Decision making block
6.3.3 Miscellaneous
6.4 Volume control

7 THD measurements
7.1 Introduction to THD measurements
7.2 THD measurement
7.2.1 Data collection
7.3 Results

8 Conclusions and discussion
8.1 Audible improvement
8.2 Regarding the model
8.2.1 Functionality of the model
8.2.2 Future work and discussion
8.3 Supplementary conclusion


Chapter 1

Background

1.1 Introduction

The advancements of computational power in low cost FPGAs are giving the opportunity to implement real-time compensation of loudspeakers and audio systems. The need for expensive commercial audio systems is reduced when the fidelity of much cheaper audio systems easily can be improved by real-time compensation. Today it is possible to implement hardware that digitally compensates for e.g. phase delays, loudspeaker characteristics and distortion.

The topic of this master thesis is to investigate and evaluate methods for digital compensation of distortion in audio systems. It is well known that a lot more energy is needed to produce a loud bass sound than a loud high-pitched sound, i.e., most of the sound energy in music comes from the kick drum and the bass guitar or bass synthesizer [16]. However, the limitations of amplifiers and loudspeakers will result in drastically deteriorating fidelity when playing too loud, which is the problem to be solved. Therefore, a hardware module, described in VHDL, will be implemented in order to, when necessary, reduce the problem of arising distortion.

The area of interest is digital compensation of audio systems. The main questions are: Is it possible to identify the cause of audible distortion due to too high volume in the audio system and digitally compensate for this with a real-time compensating module in hardware? If yes, is this a reliable and flexible implementation that is applicable in different audio systems?

The answers to these questions, among others, will be presented in this master thesis.

1.2 Purpose

The purpose of this thesis is to gain further knowledge of the distortion rendered by too high input volume and how to compensate for this. The work was done at Actiwave AB. There were four main tasks:

1. Investigate how a too high input signal rendering audible distortion is digitally detected in an audio system.

2. Determine how the distortion can be compensated and develop a model solving the problem stated above.

3. Implement a VHDL module, similar to the earlier developed model, according to hardware limitations and processing time in a real-time hardware system.

4. Present data comparing the distortion in the audio system when compensation is on and off.

1.3 Goals

The main goals are to develop a fully working model in Scilab or Simulink and implement the same model in hardware via VHDL. The model should compensate for a too high input volume by performing calculations on a sound sequence. The signal should only be altered when necessary, i.e., when it is identified that the bass will give rise to distortion at a given volume. This allows the user to play music at a higher volume without getting the feeling that the audio system loses its high fidelity.

Further, the VHDL module might have differences compared to the previously developed model because of hardware restrictions that may arise. The VHDL module should be applicable in any Actiwave audio system. It has to be fairly easy to adapt to systems with different amplifiers and loudspeakers.

If different methods for any task are viable to implement in the module, they should be investigated as far as possible in the given time frame. A performance comparison should be performed after implementation and a discussion should clearly motivate the choice of a specific model.

1.4 Outline of the report

Chapter 1 is an introduction to the thesis where background, purpose, goals, outline of the report and used denotations and definitions are stated.

Chapter 2 briefly presents related research that the reader should be well acquainted with in order to fully understand this thesis.

Chapter 3 lists all the test equipment used in this thesis and states their most significant properties.

Chapter 4 covers an investigation of how the audible distortion can be detected and what its effects are.

Chapter 5 covers the development of the proposed model and briefly discusses different methods of solving the problem.

Chapter 6 briefly covers the implementation of the VHDL described hardware module on an FPGA.

Chapter 7 presents the THD measurements and their results.

Chapter 8 presents the conclusions of both the hardware and software model and discusses further development of the proposed solution.

1.5 Denotations and definitions

Software, abbreviations and acronyms used in this thesis are explained in this section.

Scilab®

Scilab is an interactive platform for numerical computation providing a powerful computing environment for engineering and scientific applications [4].

Simulink®

Simulink is an environment for multidomain simulation and Model-Based Design for dynamic and embedded systems. It provides an interactive graphical environment and a customizable set of block libraries that let you design, simulate, implement, and test a variety of time-varying systems, including communications, controls, signal processing, video processing, and image processing [23].

ISE® WebPACK™

ISE WebPACK design software is a fully featured front-to-back FPGA design solution. ISE WebPACK is a tool for FPGA and CPLD design offering HDL synthesis and simulation, implementation, device fitting, and JTAG programming [25].

ARTA

A program for impulse response measurement, real-time spectrum analysis and frequency response measurements [8].


Abbreviations and acronyms

Abbreviations and acronyms used in this report are stated below.

Acronym   Explanation
clk       Clock
DC        Direct current
DSP       Digital signal processor
f         Frequency
FFT       Fast Fourier transform
FIR       Finite impulse response
FPGA      Field programmable gate array
f0        Bandwidth
fc        3 dB cutoff frequency
fs        Sampling frequency
FSM       Finite-state machine
HDL       Hardware description language
HP        High pass
IC        Integrated circuit
ICP       Injection-molded co-polymer polypropylene
IIR       Infinite impulse response
IMD       Intermodulation distortion
IP        Intellectual Property
LC        Inductor and capacitor
LP        Low pass
LR        Linkwitz-Riley
LSI       Linear and shift-invariant
LSB       Least significant bit
NS        Noise shaping
OP-amp    Operational amplifier
PCB       Printed circuit board
PWM       Pulse-width modulation
Q         Quantization
RMS       Root mean square
S/PDIF    Sony/Philips Digital Interconnect Format
SR        Slew rate
T         Period
THD       Total harmonic distortion
THD+N     THD and noise
TOSLink   Toshiba link
VHDL      VHSIC HDL
VHSIC     Very High Speed IC


Chapter 2

Related theory and research

Related theory and research necessary to understand the thesis will be presented in this chapter. However, the reader is expected to have relevant knowledge in mathematics and electronics.

2.1 Energy and power

Relevant energy and power definitions are stated below.

2.1.1 Average pseudo power

The average pseudo power for a non-periodic sequence x(n), with sample indexes n = −N, ..., N, is dimensionless and is defined in equation 2.1 [7]. Pseudo will be left out when talking about the power.

$$P = \frac{1}{2N+1} \sum_{n=-N}^{N} |x(n)|^2 \tag{2.1}$$

The power P of a time continuous and periodic signal with the period T is defined in equation 2.2 [21].

$$P = \frac{1}{T} \int_{T} |x(t)|^2 \, dt \tag{2.2}$$

2.1.2 Root mean square

The root mean square (RMS) is defined as in equation 2.3 if x(n) is a periodic sequence with period N [21].

$$V_{RMS} = \sqrt{\frac{1}{N} \sum_{n=0}^{N-1} |x(n)|^2} \tag{2.3}$$
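As a small numerical illustration of the power and RMS definitions, the sketch below evaluates them for one period of a sampled sinusoid. The function names and the choice of 64 samples per period are assumptions made for this example only.

```python
import math

def average_power(x):
    """Average (pseudo) power: the mean of |x(n)|^2 over the given samples."""
    return sum(abs(v) ** 2 for v in x) / len(x)

def rms(x):
    """Root mean square of one period of x(n), i.e. the square root of the power."""
    return math.sqrt(average_power(x))

# One period of a unit-amplitude sinusoid, 64 samples per period (assumed values).
N = 64
x = [math.sin(2 * math.pi * n / N) for n in range(N)]

print(average_power(x))  # close to 0.5
print(rms(x))            # close to 1/sqrt(2)
```

For a sinusoid with amplitude A the power is A²/2 and the RMS value is A/√2, which is what the two printed values approximate for A = 1.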


2.2 Distortion

This section describes the relevant types of distortion and why they occur.

2.2.1 Undersampling

Undersampling distortion occurs when the sampling frequency isn’t high enough to ensure that the reconstructed signal is not too far from the original one. To ensure this, the sampling theorem has to be fulfilled [7].

Sampling theorem (The Nyquist Sampling Theorem)

If a continuous time signal $x_a(t)$ is band limited to $\omega = \omega_0$ ($f = f_0$), i.e.,

$$X_a(j\omega) = 0, \quad |\omega| > \omega_0 = 2\pi f_0$$

then $x_a(t)$ can be recovered from the samples $x(n) = x_a(nT)$ provided that $f_s = 1/T > 2 f_0$.

However, the input signal to the audio systems used in this thesis is already digital with a given sampling frequency of 44.1 kHz. Distortion caused by undersampling will therefore not be further discussed in this thesis.

2.2.2 Total harmonic distortion

Consider the fundamental sinusoid, $X_1 \sin(\omega_1 t + \varphi_1)$. The harmonics are defined as sinusoids with an arbitrary amplitude $X_k$ and phase $\varphi_k$, where the frequency is $\omega_k = k\omega_1$, where k = 2, 3, ... [21]. The harmonics are simply a multiple of a frequency existing in the input signal.

Total harmonic distortion (THD) is a measure of the amount of harmonics in a non-sinusoidal signal. THD is the ratio between the RMS of the harmonics and the RMS of the complete signal. The DC component is assumed to be zero. The definition follows in equation 2.4, where k is the number of the harmonic, k = 1 is the fundamental and the subscript e denotes the RMS of the sinusoids [21].

$$THD = \frac{\sqrt{\sum_{k=2}^{\infty} X_{ke}^2}}{\sqrt{\sum_{k=1}^{\infty} X_{ke}^2}} \tag{2.4}$$

Equation 2.5 gives an alternative definition of THD.

$$THD = \frac{\sqrt{\sum_{k=2}^{\infty} X_{ke}^2}}{X_{1e}} \tag{2.5}$$
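The two THD definitions can be compared numerically. The following is a minimal sketch, not the measurement procedure used later in the thesis: it estimates the RMS of each harmonic by correlating one period of the signal with a sine and a cosine, and it truncates the infinite sums to 20 harmonics. The function names, the truncation and the clipping level of 0.5 are assumptions for illustration.

```python
import math

def harmonic_rms(x, k):
    """RMS value of the k-th harmonic of x, where x holds exactly one fundamental period."""
    N = len(x)
    re = sum(x[n] * math.cos(2 * math.pi * k * n / N) for n in range(N))
    im = sum(x[n] * math.sin(2 * math.pi * k * n / N) for n in range(N))
    peak = 2.0 * math.hypot(re, im) / N
    return peak / math.sqrt(2.0)

def thd(x, n_harmonics=20):
    """Return (THD relative to the complete signal, THD relative to the fundamental).

    The infinite sums are truncated to n_harmonics terms (an assumption)."""
    X = [harmonic_rms(x, k) for k in range(1, n_harmonics + 1)]
    harmonics = math.sqrt(sum(v * v for v in X[1:]))
    total = math.sqrt(sum(v * v for v in X))
    return harmonics / total, harmonics / X[0]

N = 256
sine = [math.sin(2 * math.pi * n / N) for n in range(N)]
clipped = [max(-0.5, min(0.5, v)) for v in sine]  # hard clipping at +/-0.5

print(thd(sine))     # both values close to zero
print(thd(clipped))  # both values clearly above zero
```

Since the denominator of the first definition also contains the harmonics, it never exceeds the value given by the second definition.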

2.2.3 Modulation distortion

Modulation distortion, sometimes called intermodulation distortion (IMD), is all frequencies in the loudspeaker's output that are not harmonically related to the input signal. Noise is however not IMD [10]. Consider two frequencies, f1 and f2, where f2 is the highest, in the input signal. The non-linearities will create the differences and sums of the input frequencies, hence, f1 + f2 and f2 − f1. Unfortunately, since there are harmonics, even more IMD frequencies will appear from all possible combinations of frequencies [13].
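To make the combinations concrete, the sketch below lists the frequencies |m·f1 ± n·f2| with m, n ≥ 1 up to a chosen order m + n. The function name, the order limit and the two example tones are hypothetical and only serve as illustration.

```python
def imd_products(f1, f2, max_order=3):
    """Frequencies |m*f1 +/- n*f2| with m, n >= 1 and m + n <= max_order."""
    products = set()
    for m in range(1, max_order):
        for n in range(1, max_order - m + 1):
            products.add(m * f1 + n * f2)
            products.add(abs(m * f1 - n * f2))
    return sorted(products)

# Hypothetical two-tone input: 100 Hz and 1000 Hz.
print(imd_products(100, 1000))
# → [800, 900, 1100, 1200, 1900, 2100]
```

The second-order products are f2 − f1 = 900 Hz and f1 + f2 = 1100 Hz; the remaining four frequencies are third-order products involving a second harmonic of one of the tones.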

2.2.4 IMD vs. THD

IMD is almost always greater in magnitude than THD. Most consider IMD far more offensive because, unlike THD, the frequencies that appear are not harmonically related to the fundamental and are therefore more likely to be perceived as irritating by the listener [10].

2.3 Non-linearities in audio systems

This section will describe the most important non-linearities occurring in audio systems.

2.3.1 Amplifier

A few of the most common causes of a distorted waveform are specified below. In general, electronics produce far less distortion than the loudspeaker itself [19].

Saturation

The output voltage is limited to a minimum and maximum value close to the power supply voltages. Saturation occurs when the amplifier's voltage gain would produce an output that is greater than the maximum or less than the minimum voltage [24]. A signal is referred to as clipped when it is saturated at the maximum or minimum voltage.

Slew rate

The amplifier has a maximum rate of voltage change which is referred to as slew rate (usually specified in volts/µs). The output waveform will be distorted when the slew rate is reached.
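A discrete-time sketch of slew-rate limiting is given below, assuming that the limit can be modelled as a maximum change per sample, max_step = SR / fs. The function name and all numeric values are hypothetical and do not describe any of the amplifiers used in this thesis.

```python
import math

def slew_limit(x, max_step):
    """Limit the change between consecutive output samples to +/- max_step."""
    y = [x[0]]
    for target in x[1:]:
        step = max(-max_step, min(max_step, target - y[-1]))
        y.append(y[-1] + step)
    return y

fs = 44100.0          # sampling frequency in Hz
slew_rate = 20000.0   # hypothetical slew rate in volts per second
max_step = slew_rate / fs

# A 10 V, 1 kHz sinusoid has a maximum slope of 2*pi*1000*10 V/s, above the limit.
fast = [10.0 * math.sin(2 * math.pi * 1000.0 * n / fs) for n in range(441)]
# A 1 V, 50 Hz sinusoid has a maximum slope of 2*pi*50*1 V/s, well below the limit.
slow = [1.0 * math.sin(2 * math.pi * 50.0 * n / fs) for n in range(441)]

fast_out = slew_limit(fast, max_step)  # distorted towards a triangular shape
slow_out = slew_limit(slow, max_step)  # passes essentially unchanged
```

A sinusoid with amplitude A and frequency f has a maximum slope of 2πfA, so the limit is reached first for signals that are both loud and high in frequency.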

Non-linear transfer function

No electrical components are ideal. Therefore some non-linearity will always be introduced in amplifiers, causing a non-linear transfer function. The introduced THD and noise are most often specified in the data sheet of an amplifier. However, IMD is not always stated in data sheets.

2.3.2 Loudspeakers

Ideally a loudspeaker would produce acoustic waves which are a linear transformation of the electrical input signal [17]. However, non-linearities exist and some of them, depending on loudspeaker type, are produced by:

The voice coil

In order to have a bass reproduction with high enough power, a large voice coil excursion is required. This increases the already inherent non-linear distortion [17]. The non-linearities in a coil have their origin in the fact that coils are not ideal components, i.e., they have both resistance and capacitance, but foremost in the fact that the inductance of a coil can vary largely with the current [20].

The diaphragm

First of all, the diaphragm does not work as one single unit. The acousto-mechanical impedance of the diaphragm varies over its area, ranging from being clamped at the edges to relatively free to move in the middle. Low frequency displacement limits are generally set by the excursion capability of the diaphragm relative to the fixed electrodes [9]. The frequency response is clearly affected by the diaphragm's properties, causing a non-linear behavior.

The enclosure

The air spring provided by the sealed enclosure causes some non-linearity. For high air volume displacements the restoring force of the enclosure can become non-linear; normally, for volume changes no greater than about ±5 %, this non-linearity may be neglected [9].

2.4 Filters

The term filter can be explained as a mapping of an input signal to an output signal. The mapping can often be described with a mathematical expression. In this thesis, a filter will be defined as a frequency selective filter working with electrical signals. A frequency selective filter has the property of rejecting specified frequencies and letting others pass [11].

2.4.1 Analog filters

The input signal to an analog filter is often time continuous and the filter can either be active or passive. The passive filter consists of components such as resistors, inductors and capacitors. Passive filters are important since they are often used as reference when designing more advanced filters [21].

Active filters generally consist of resistors, capacitors and operational amplifiers (OP-amps) [21]. Active filters were created to alleviate the unwanted properties of inductors such as large physical size, weight and non-linearities. An active filter can, as opposed to a passive filter, generate signal energy, i.e. amplify the signal, and if wrongly constructed it can therefore be unstable [11].

2.4.2 Digital filters

Digital filters work with discrete-time signals and have become more and more important in general, but also because it is more common to implement digital signal processing today. Discrete-time filters are often designed with passive time continuous filters as reference. Some characteristic properties of digital filters are stated below [21].

Sensitivity: Digital systems are, compared to analog systems, independent of component properties such as manufacturing precision, aging and temperature sensitivity. Digital systems are therefore less sensitive to component variations and give a higher reliability.

Physical properties: Using VLSI technology shrinks both the physical size and the power consumption.

Flexibility: It is relatively easy to design a general DSP that can be used for many different purposes.

Quantization: All signals are quantized in digital systems. This may lead to problems of different kinds.

2.4.3 IIR filters

An IIR filter has an impulse response of infinite length and can only be implemented recursively [21]. The advantage of IIR filters is that they can be implemented with a lower order than FIR filters and still meet the same specifications [7]. However, IIR filters have a non-linear phase response, although it can be made arbitrarily close to a linear-phase response at increasing cost [12].

Digital bi-quadratic filters

The filter is often abbreviated biquad since the transfer function is a ratio of two quadratic functions in the z-domain as seen in equation 2.6. The filter has a cutoff slope of 12 dB/octave, but a higher slope can be achieved by cascading filters, which is preferred instead of using a single 4th order design, since higher orders result in higher coefficient sensitivity [5].

$$H(z) = \frac{a_0 + a_1 z^{-1} + a_2 z^{-2}}{1 + b_1 z^{-1} + b_2 z^{-2}} \tag{2.6}$$

The most straightforward implementation is the direct form I, seen in difference equation 2.7 and figure 2.1 [5].

$$y(n) = a_0 x(n) + a_1 x(n-1) + a_2 x(n-2) - b_1 y(n-1) - b_2 y(n-2) \tag{2.7}$$
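A floating-point sketch of a direct form I biquad follows. The feedback terms are subtracted so that the difference equation corresponds to the denominator 1 + b1 z^-1 + b2 z^-2 of the transfer function. The low-pass coefficient formulas are taken from the widely used Audio EQ Cookbook and are an assumption here, not necessarily the design used in this thesis; the function names and the 100 Hz cutoff are likewise illustrative.

```python
import math

def lowpass_coefficients(fc, fs, q=1.0 / math.sqrt(2.0)):
    """Second-order low-pass coefficients (Audio EQ Cookbook formulas, assumed),
    normalized so that the denominator of H(z) is 1 + b1*z^-1 + b2*z^-2."""
    w0 = 2.0 * math.pi * fc / fs
    alpha = math.sin(w0) / (2.0 * q)
    norm = 1.0 + alpha
    a0 = (1.0 - math.cos(w0)) / 2.0 / norm
    a1 = (1.0 - math.cos(w0)) / norm
    a2 = a0
    b1 = -2.0 * math.cos(w0) / norm
    b2 = (1.0 - alpha) / norm
    return (a0, a1, a2), (b1, b2)

def biquad_df1(x, a, b):
    """Direct form I: two delayed inputs and two delayed outputs are stored."""
    a0, a1, a2 = a
    b1, b2 = b
    x1 = x2 = y1 = y2 = 0.0
    y = []
    for xn in x:
        yn = a0 * xn + a1 * x1 + a2 * x2 - b1 * y1 - b2 * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

a, b = lowpass_coefficients(fc=100.0, fs=44100.0)
step_response = biquad_df1([1.0] * 5000, a, b)
print(step_response[-1])  # settles close to 1, the DC gain of the low-pass filter
```

Cascading two such sections, i.e. feeding the output of one call into a second call, gives the steeper 24 dB/octave slope mentioned above.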

At low frequencies, biquads are more susceptible to quantization error, mainly from the feedback coefficients b1 and b2, but also from the limited amount of bits stored in the delay memory. Lack of resolution in the coefficients makes a precise positioning of the poles difficult and the delay memory problem is inherited from the limited amount of bits that can be stored. One way of alleviating these issues is to add the quantization error to the next sample calculation. This technique is called noise shaping (NS) and the 1st-order of NS is shown in figure 2.2 where both the output of the summation and the quantized output of the summation are fed back (compare with figure 2.1) [5].


Figure 2.1. Direct form I.


Figure 2.2. Direct form I with 1st-order noise shaping.
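The error-feedback idea behind 1st-order noise shaping can be shown in isolation, without the biquad around it. In the sketch below the quantization error of one sample is added to the next sample before it is quantized. The coarse step size of 1.0 and the constant input are deliberately exaggerated, hypothetical values.

```python
import math

def quantize(v, step):
    """Round v to the nearest multiple of step."""
    return math.floor(v / step + 0.5) * step

def requantize(x, step, noise_shaping):
    """Quantize x sample by sample, optionally feeding the previous error back."""
    err = 0.0
    y = []
    for xn in x:
        acc = xn + (err if noise_shaping else 0.0)
        yn = quantize(acc, step)
        err = acc - yn  # quantization error, reused for the next sample
        y.append(yn)
    return y

x = [0.3] * 1000
plain = requantize(x, 1.0, noise_shaping=False)
shaped = requantize(x, 1.0, noise_shaping=True)

print(sum(plain) / len(plain))    # 0.0, the low-level signal is lost
print(sum(shaped) / len(shaped))  # close to 0.3, preserved on average
```

Without feedback every sample is rounded to zero. With feedback the error is pushed towards higher frequencies, so the low-frequency content (here the mean value) survives the coarse quantization.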

2.4.4 FIR filters

Unlike that of an IIR filter, the impulse response of an FIR filter is finite. The advantage of FIR filters is that they can be implemented to have a linear phase response, which might be important depending on application. The disadvantage is that the order of the filter may become very high for stringent requirements on the magnitude response. A large filter order leads to a large amount of performed multiplications and additions. A linear-phase FIR filter has a symmetric or anti-symmetric impulse response according to equations 2.8 and 2.9 respectively [7].

h(n) = h(N − n), n = 0, 1, ..., N (2.8)

h(n) = −h(N − n), n = 0, 1, ..., N (2.9)

Equation 2.8 has symmetry around n = N/2 and equation 2.9 has anti-symmetry around n = N/2. The order of both filters is N [7].
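The symmetry conditions are easy to check on a coefficient list. The sketch below tests an impulse response h(0), ..., h(N) against both conditions and applies an FIR filter by direct convolution. The function names and the two short example filters are chosen for illustration only.

```python
def symmetry(h, tol=1e-12):
    """Classify h(0..N) as symmetric, anti-symmetric or neither."""
    N = len(h) - 1
    if all(abs(h[n] - h[N - n]) <= tol for n in range(N + 1)):
        return "symmetric"
    if all(abs(h[n] + h[N - n]) <= tol for n in range(N + 1)):
        return "anti-symmetric"
    return "none"

def fir_filter(x, h):
    """y(n) = sum over k of h(k) * x(n - k), with x(n) = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y

smoothing = [0.25, 0.5, 0.25]   # order N = 2, symmetric
difference = [1.0, 0.0, -1.0]   # order N = 2, anti-symmetric

print(symmetry(smoothing), symmetry(difference))
print(fir_filter([1.0, 0.0, 0.0, 0.0], smoothing))  # the impulse response itself
```

Each output sample costs N + 1 multiplications in this direct form, which is the computational burden referred to above when the order becomes large.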


2.5 Class-D amplifiers

Class-D amplifiers work with pulses, i.e., a voltage that switches between high and low levels. By LP filtering the pulses in an LC filter before they reach the loudspeaker, an analog voltage can be achieved. The advantage of class-D amplifiers is their high efficiency. The basic principle of class-D amplifiers can be seen in figure 2.3, where a discrete-time sequence is first converted to PWM, then amplified and finally LP filtered [1].

Figure 2.3. The principle of a class-D amplifier.

The duty cycle of both the PWM and the PWM amplifier stage is limited to a factor less than 100 %. If the input's duty cycle is larger than what the PWM is capable of handling, the output's duty cycle is saturated to a given factor, i.e., a modulation limit (stated in data sheets of PWM stages). This is called over modulation and yields clipping of the output signal.
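Over modulation can be sketched with a simple model, under the assumptions that a sample in the range −1 to 1 maps linearly to a duty cycle between 0 and 1 and that the limit applies symmetrically at both ends. The default of 97.7 % is the modulation limit quoted for amplifier B in chapter 3; the function names are hypothetical.

```python
def duty_cycle(sample, modulation_limit=0.977):
    """Map a sample in [-1, 1] to a PWM duty cycle, saturated at the modulation limit.

    The linear mapping and the symmetric lower limit are assumptions."""
    ideal = (1.0 + sample) / 2.0
    lowest = 1.0 - modulation_limit
    return max(lowest, min(modulation_limit, ideal))

def output_level(duty):
    """Normalized average output after ideal LP filtering of the PWM signal."""
    return 2.0 * duty - 1.0

for s in (0.0, 0.9, 1.0):
    print(s, output_level(duty_cycle(s)))
```

In this model every sample above 0.954 of full scale produces the same output level, which is the clipping described above.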

2.6 Audio compressor

An audio compressor reduces the dynamic range of a signal. It is used extensively in audio recording, production work, noise reduction, and live performance applications [18].

2.6.1 Functionality

A compressor is a variable gain device, where the amount of gain used depends on the level of the input. The gain will be reduced when the signal level is high which makes louder sequences softer, reducing the dynamic range. A basic overview of a compressor is shown in figure 2.4.


A compressor’s input-output relation is often described by a graph as in figure 2.5. The horizontal axis corresponds to the input signal level, and the vertical axis is the output level. The gain is one (1) before the threshold and an input level above the threshold will be reduced by the compressor. The height of the line defines the dynamic range of the output, and the slope of the line is the same as the compressor’s gain [18].

[Figure: output level (dB) versus input level (dB), with the threshold marked and curves labelled No compression, Compression and Limiting.]

Figure 2.5. Input-output characteristics of a compressor.
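The static input-output characteristic can be written as a small function of the input level in dB. This is a sketch of the common threshold-and-ratio formulation, where a ratio of R:1 means that R dB of input increase above the threshold gives 1 dB of output increase. The function name and the example threshold and ratio are assumptions.

```python
def compressor_output_db(input_db, threshold_db, ratio):
    """Static compressor curve: unity gain below the threshold, slope 1/ratio above it."""
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / ratio

# Hypothetical settings: threshold at -20 dB, compression ratio 4:1.
print(compressor_output_db(-30.0, -20.0, 4.0))          # → -30.0
print(compressor_output_db(-8.0, -20.0, 4.0))           # → -17.0
print(compressor_output_db(-8.0, -20.0, float("inf")))  # → -20.0
```

With an infinite ratio the output never exceeds the threshold, which corresponds to the curve labelled limiting in the figure.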

A generic compressor will take a little time to adjust to a new input level. This time is called attack time, while the time taken for a compressor to readjust to an input-output gain of one (1) is called release time. An overview of these times can be seen in figure 2.6 [18].
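One common way to realize attack and release times is to smooth the wanted gain with a one-pole filter whose time constant depends on whether the gain is falling or rising. The sketch below assumes this approach; it is not necessarily how the compressor in [18] or the limiter in this thesis behaves, and the function name and time constants are hypothetical.

```python
import math

def smooth_gain(target_gains, fs, attack_s, release_s):
    """Follow the target gain with separate time constants for attack and release."""
    attack_coeff = math.exp(-1.0 / (attack_s * fs))
    release_coeff = math.exp(-1.0 / (release_s * fs))
    gain = 1.0
    smoothed = []
    for target in target_gains:
        # A falling gain means the compressor is attacking, a rising gain means release.
        coeff = attack_coeff if target < gain else release_coeff
        gain = coeff * gain + (1.0 - coeff) * target
        smoothed.append(gain)
    return smoothed

# The target gain drops to 0.5 for a while and then returns to 1.
targets = [1.0] * 100 + [0.5] * 2000 + [1.0] * 20000
gains = smooth_gain(targets, fs=44100.0, attack_s=0.005, release_s=0.1)

print(gains[99], gains[2099], gains[-1])
```

With these values the gain reaches its reduced level within a few milliseconds but needs several tenths of a second to return to one.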


2.6.2 Limiter

A limiter is a compressor where the compression of the signal above a certain threshold is very high, i.e., the output is limited to the specified threshold independent of the input signal level above the threshold. A limiter can also be defined as in figure 2.5, i.e., an input-output ratio of at least 10:1 [18].


Chapter 3

Test equipment

All equipment used in this thesis is briefly described in this chapter. An overview of the test system can be seen in figure 3.1.

[Figure: block diagram with the blocks Source, Sound card, Amplifier and Oscilloscope.]

Figure 3.1. The complete test system setup.

3.1 Loudspeaker A

The loudspeaker used in this thesis is produced by Paradigm and the model is Cinema 110L/R v.3. It is mainly designed to serve as a front speaker in a home cinema audio system, but can be used in a 2-channel system as well. The speaker has two midrange drivers and one tweeter. The relevant technical data is stated below and the loudspeaker can be seen in figure 3.2 [15].

• Tweeter: 1” tweeter
• Bass: 2x4.5” bass/midrange with ICP cones
• Frequency response: ± 2 dB from 120 Hz to 20 kHz
• Impedance: ≈ 8 Ω
• Crossover frequency: 2 kHz
• Enclosure: 2-way closed cabinet

The tweeter was disconnected in most test cases. If not, it will be stated.

Figure 3.2. Loudspeaker A - Paradigm Cinema 110 L/R v.3.

3.2 Amplifiers

3.2.1 Amplifier A

One of the amplifiers used was a Yamaha AV Receiver, model RX-V630RDS. The relevant specifications for the main channel (the only channel used in this thesis) are stated below [26]. The amplifier can be seen in figure 3.3.

• Minimum RMS output power: 75 W
• Frequency response: -3 dB, 10 Hz to 100 kHz
• Impedance: 8 Ω
• Distortion: 0.06 % THD at 45 W

3.2.2 Amplifier B

The second amplifier used is an Actiwave designed class-D amplifier PCB with the 20 W Stereo Digital Amplifier Power Stage TAS5602 from Texas Instruments. The PCBs are normally mounted inside the loudspeakers, but in the test setup used in this thesis the PCB is placed outside the loudspeaker. The relevant specifications of the output stage are stated below and the PCB can be seen in figure 3.4 [22].

• Continuous output power: 19 W at 24 V and 8 Ω
• Frequency response: -3 dB, 10 Hz to 100 kHz
• Impedance: 8 Ω
• Maximum output swing: < 24 V since Vdd = 24 V (not stated in data sheet)
• Distortion: 0.08 % THD at 10 W, 24 V, 1 kHz and 8 Ω
• Modulation limit: 97.7 %
• Sample frequency: 44.1 kHz

Figure 3.4. Amplifier B.

Figure 3.5 shows the data flow between digital sample and output. The output from the DSP is modulated to PWM and then amplified by the power stage before being subject to LP filtering.

[Figure 3.5: data flow through the blocks DSP, PWM modulator, Amplifier stage and LC filter.]

3.2.3 Amplifier C

The third amplifier used is an Actiwave design as well. The specifications are the same as for amplifier B, but the data flow is different, as can be seen in figure 3.7. The DSP and the digital-to-PWM class-D controller have been replaced with an FPGA. The PCB can be seen in figure 3.6.

Figure 3.6. Amplifier C.

[Figure: data flow through the blocks FPGA, Amplifier stage and LC filter.]

Figure 3.7. Data flow of amplifier C.

3.3 Sound card

The sound card used was an M-AUDIO Firewire 410 and can be seen in figure 3.8. The technical data relevant to what has been used in this thesis is stated below [14].

• Input: Analog (used for microphone)
• Output: S/PDIF on TOSLink optical connector
• Frequency response: 20 Hz to 40 kHz ± 1 dB

Figure 3.8. M-Audio Firewire 410 Sound card.

3.4 Microphone

The microphone used was a Behringer ECM8000. Further information regarding the microphone is left out since it is not considered necessary.

3.5 Digital oscilloscope

The digital oscilloscope used was a UNI-T UT3062C. Further information regarding the digital oscilloscope is left out since it is not considered necessary.


Chapter 4

Investigation

This chapter will present different investigations performed in order to identify the cause of audible distortion due to too high signal swing.

4.1 Analysis of audio sequences

Several audio sequences were played, recorded and analyzed with loudspeaker A and amplifier A. One of these will be shown to illustrate the difficulty of recognizing signs of distortion in an audio sequence. Figure 4.1(a) shows a short test sequence, referred to as sequence A, with two bass beats from the song Crazy by Lumidee.

Sequence A was played at a volume where distortion was clearly audible. The recorded response can be seen in figure 4.1(b). The recording was made in a small and echoic room, so it is affected by the room's own impulse response. Studying the differences between the played and the recorded signals reveals a few things.

First of all, the recorded signal will always be different since both the room's and the audio equipment's impulse responses will affect the waveform.

Further, the envelopes of the signals are similar, but the distortion adds higher frequencies that can be seen as the faster swings up and down below the signal's envelope. The origin of these higher frequencies will be explained in section 4.3.

Except for the differences stated above, it is difficult to gain any useful information by comparing played and recorded signals with clearly audible distortion.

The spectra of the played and the recorded sequence A are not shown since they did not reveal anything of interest. In general, differences not seen in the time domain could possibly be revealed in the frequency domain.

LP filtering sequence A reveals that most of the signal's power is located in the lower frequency band. This can be seen in figure 4.2. A visual comparison of figure 4.1(a) and 4.2 shows the strong resemblance even though higher frequencies are lacking. This result shows in what frequency band the signal's energy is located, giving a starting point for the implementation of the compensating model.
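The kind of comparison made above can be sketched on a synthetic signal. In the example below a strong 60 Hz tone and a weak 5 kHz tone stand in for a bass-heavy audio sequence, and a one-pole low-pass filter stands in for the LP filter. The test signal, the filter type, the 200 Hz cutoff and the function names are all assumptions and are unrelated to sequence A.

```python
import math

def one_pole_lowpass(x, fc, fs):
    """Simple first-order low-pass filter with cutoff frequency fc."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * fc / fs)
    state = 0.0
    y = []
    for v in x:
        state += alpha * (v - state)
        y.append(state)
    return y

def average_power(x):
    return sum(v * v for v in x) / len(x)

fs = 44100.0
n_samples = 22050  # half a second
x = [0.8 * math.sin(2 * math.pi * 60.0 * n / fs)
     + 0.2 * math.sin(2 * math.pi * 5000.0 * n / fs)
     for n in range(n_samples)]

low = one_pole_lowpass(x, 200.0, fs)
fraction = average_power(low) / average_power(x)
print(fraction)  # most of the power remains after low-pass filtering
```

A fraction close to one indicates that the power of the signal is concentrated below the cutoff frequency.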

(a) Sequence A played.

(b) Sequence A recorded.

Figure 4.1. Comparison of the played and recorded sequence A.

4.2 Identifying the cause of distortion

Loudspeaker A and amplifier A were used for playback. The oscilloscope was connected at the input of the loudspeaker. By raising the volume, it was revealed that the audible distortion coincided with the amplifier's output voltage saturation, i.e., the signal was clipped. The distortion can be heard when the output swing is a few volts under maximum (before the signal is clipped) since the closer the swing gets to the maximum, the more distortion was observed on the top and on the bottom of the sinusoid.

Figure 4.2. The LP filtered sequence A.

From this test the conclusion is drawn that, in this case, the audible distortion is not primarily caused by the loudspeaker's inability to reproduce sound of high power, but by the amplifier's inability to deliver correct high power signals to the loudspeaker.

Amplifier A was replaced with amplifier B since the system to be used in the final product uses Actiwave's own amplifier PCB (amplifier B or a similar PCB). One significant difference is that the Actiwave amplifier PCBs are class-D while amplifier A is analog.

The same test, i.e., raising the volume and observing the oscilloscope, was performed with the same result for amplifier B as well.

4.3 Model of clipping

A Simulink model was created to analyze how the saturation of sinusoids affects the spectrum. The effects of a clipped signal will be described in this section.

4.3.1 One tone test

Three different kinds of saturation were investigated and they can, together with their spectra, be seen in figures 4.3, 4.4 and 4.5. How an input signal is saturated, e.g., whether only the top, only the bottom, or both top and bottom are clipped, and whether the clipping is symmetric or not, affects the magnitude of the distortion at the output and at what frequencies the distortion mainly arises.
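The dependence on the kind of saturation can be reproduced outside Simulink with a small sketch. The following is only an illustration, not the model used in the thesis: the buffer size, the clip level of 0.6 and all function names are assumptions. It clips a sinusoid symmetrically and on the top only, and evaluates the first five harmonics with a direct DFT. Symmetric clipping is expected to leave only odd harmonics, while clipping one side also creates even harmonics.

```python
import math

def clip(x, lo, hi):
    """Hard-limit x to the interval [lo, hi]."""
    return max(lo, min(hi, x))

def harmonic_magnitudes(signal, cycles, count):
    """Amplitudes of harmonics 1..count, using a direct DFT at each harmonic bin.

    The signal must hold exactly `cycles` periods of the fundamental.
    """
    n = len(signal)
    result = []
    for h in range(1, count + 1):
        k = h * cycles  # DFT bin index of harmonic h
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(signal))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(signal))
        result.append(2 * math.hypot(re, im) / n)
    return result

# Hypothetical test buffer: 5 periods of a unit sinusoid in 1000 samples.
N, CYCLES = 1000, 5
sine = [math.sin(2 * math.pi * CYCLES * i / N) for i in range(N)]
symmetric = [clip(s, -0.6, 0.6) for s in sine]  # top and bottom clipped equally
top_only = [clip(s, -1.0, 0.6) for s in sine]   # only the top is clipped

sym = harmonic_magnitudes(symmetric, CYCLES, 5)
asym = harmonic_magnitudes(top_only, CYCLES, 5)
for h in range(5):
    print(f"harmonic {h + 1}: symmetric {sym[h]:.4f}, top only {asym[h]:.4f}")
```

In this sketch the second harmonic of the symmetrically clipped signal vanishes up to rounding error, whereas the top-only case shows a clear second harmonic.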

Figure 4.3. (a) A symmetrically clipped sinusoid. (b) Spectrum of a symmetrically clipped sinusoid.

Figure 4.4. (a) Asymmetrically clipped sinusoid. (b) Spectrum of an asymmetrically clipped sinusoid.

Figure 4.5. (a) Asymmetrically clipped sinusoid. (b) Spectrum of an asymmetrically clipped sinusoid.

4.3.2 Two tone test

The superposition of two frequencies is shown in figure 4.6(a) and its spectrum in figure 4.6(b). As seen in the figures, several input frequencies increase the complexity of the resulting distortion, both in the number of distortion components and in their magnitudes.
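A two tone case can be sketched in the same spirit. The tone frequencies (90 Hz and 120 Hz), the amplitudes, the clip level and the sample rate below are assumptions chosen so that every component falls exactly on a 1 Hz DFT bin. They are not the values used for figure 4.6. With symmetric clipping, third-order intermodulation products are expected at 2·f1 − f2 and 2·f2 − f1, i.e., both below and above the input tones.

```python
import math

def bin_amplitude(signal, k):
    """Amplitude of DFT bin k of a real signal (direct evaluation)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n

FS = 2000         # hypothetical sample rate; one second of signal gives 1 Hz bins
F1, F2 = 90, 120  # hypothetical input tones in Hz

two_tone = [0.5 * math.sin(2 * math.pi * F1 * i / FS)
            + 0.5 * math.sin(2 * math.pi * F2 * i / FS) for i in range(FS)]
clipped = [max(-0.6, min(0.6, s)) for s in two_tone]  # symmetric hard clipping

for f in (2 * F1 - F2, F1, F2, 2 * F2 - F1):
    print(f"{f:4d} Hz: clean {bin_amplitude(two_tone, f):.4f}, "
          f"clipped {bin_amplitude(clipped, f):.4f}")
```

The unclipped signal has no energy at 60 Hz or 150 Hz, so any amplitude found there after clipping is intermodulation distortion.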

Figure 4.6. (a) Clipped signal. (b) Spectrum of clipped signal.

4.3.3 Conclusion

As seen in figure 4.6(b), clipping produces THD and IMD at both higher and lower frequencies than the input frequencies; however, these components have far less power than the input frequencies. THD and IMD are also created by other non-linearities in the complete system.

The human ear has a dynamic range that goes from 0 dB at the threshold of hearing to 120 dB at the threshold of pain (depending on frequency) [2]. Looking at the simulation results, the magnitudes of the unwanted frequencies are at least 20 dB lower than the fundamental. There is apparently enough power in the distortion for the human ear to hear it, provided that the fundamental tone is played at a high volume, which it must be in order to cause clipping in the amplifier.

Double-blind subjective tests show that 3 % THD is audible on different types of sounds. With carefully selected material (such as a flute solo) detecting THD down to 2 % or 1 % might be possible. A THD of 1 % with sine waves is audible [6].
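To relate these percentages to clipping, the THD of a clipped sinusoid can be estimated numerically. The sketch below assumes the common definition of THD as the root of the summed squared harmonic amplitudes divided by the fundamental amplitude, and it only includes harmonics 2 to 10. The clip levels, buffer size and function names are illustrative assumptions.

```python
import math

def bin_amplitude(signal, k):
    """Amplitude of DFT bin k of a real signal (direct evaluation)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n

def thd(signal, cycles, harmonics=10):
    """THD as a ratio, using harmonics 2..harmonics of a signal with `cycles` periods."""
    fundamental = bin_amplitude(signal, cycles)
    distortion = math.sqrt(sum(bin_amplitude(signal, h * cycles) ** 2
                               for h in range(2, harmonics + 1)))
    return distortion / fundamental

def clipped_sine(level, n=1000, cycles=5):
    """Unit sinusoid hard-clipped symmetrically at +/- level."""
    return [max(-level, min(level, math.sin(2 * math.pi * cycles * i / n)))
            for i in range(n)]

for level in (1.0, 0.9, 0.6):
    print(f"clip level {level}: THD = {100 * thd(clipped_sine(level), 5):.2f} %")
```

Under these assumptions, a sinusoid clipped at 60 % of its peak value lies well above the 3 % figure quoted above.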

The conclusion is that looking at a music sequence will not easily reveal any of the effects of clipping, since the number of input frequencies is too large and distinguishing small changes in the magnitude of these frequencies is very difficult. Distortion is however quite easily detected by the human ear, and distortion, caused by clipping or not, should be kept to an absolute minimum to maintain high fidelity.

4.4 Saturation of amplifiers

The output of the amplifier can saturate either because the maximum output swing isn’t high enough to describe the signal or because the over-current protection is activated.

4.4.1 Voltage saturation

The cause of saturation differs with the structure of the amplifier stage. A class-A amplifier is limited by the maximum output voltage while a class-D amplifier can also be limited by the modulation limit that clips the signal.

4.4.2 Over-current protection

The other case of saturation can be caused by the current limitations of the amplifier. If the power of the output signal is too large, the maximum current $I_{max}$ is reached. The output voltage will then drop according to $U = R I_{max}$, since the current $I$ is limited. Drawing too much current from the amplifier could also result in a protective shutdown of the amplifier circuit.
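As a small worked example of the relation U = R·Imax, the sketch below caps the voltage over a purely resistive load. The load resistance of 4 Ω and the current limit of 5 A are hypothetical values chosen for illustration. They are not measured properties of the amplifiers in this thesis.

```python
def current_limited_voltage(u_requested, r_load, i_max):
    """Voltage delivered to a resistive load when the current is capped at i_max."""
    u_ceiling = r_load * i_max  # U = R * Imax
    return max(-u_ceiling, min(u_ceiling, u_requested))

# Hypothetical values: a 4 ohm load and a 5 A current limit give a 20 V ceiling.
print(current_limited_voltage(22.0, r_load=4.0, i_max=5.0))
print(current_limited_voltage(10.0, r_load=4.0, i_max=5.0))
```

A requested 22 V swing is reduced to the 20 V ceiling, while a 10 V swing passes unchanged.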

4.5 Voltage saturation/frequency dependency

The dependency between voltage saturation and frequency will be investigated and described in this section.


4.5.1 Bass/midrange driver test

Loudspeaker A and amplifier B were used for playback. By applying sine waves at different frequencies and different volumes, the maximum voltage for non-audible distortion (V1) and the minimum voltage for audible distortion (V2) were noted with the oscilloscope and are stated in table 4.1. Multitone tests were performed as well. Note that this data was measured while the tweeter was disconnected.

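| f (Hz) | V1 (V) | V2 (V) |
|---|---|---|
| 50 | 21.5 | 22.8 |
| 90 | 22 | 22.8 |
| 100 | 22 | 22.8 |
| 110 | 20.8 | 23 |
| 120 | 22 | 23 |
| 50, 90 | 22.4 | 22.8 |
| 90, 100 | 22.4 | 23 |
| 50, 120 | 22.4 | 23 |
| 100, 120 | 21.6 | 22.8 |
| 50, 90, 120 | 21 | 22.8 |
| 90, 100, 120 | 22 | 22.8 |
| 20, 30, 35 | 21.6 | 22.4 |
| 20, 50, 120 | 22.4 | 22.8 |
| 35, 50, 65 | 22 | 22.8 |
| 25, 35, 40 | 22 | 22.4 |
| 25, 30, 35, 40, 45 | 22 | 22.4 |
| 20, 50, 60, 90, 120 | 22 | 22.8 |
| 30, 50, 100, 110, 120 | 21 | 22.8 |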

Table 4.1. Voltages for output signal with and without distortion on bass/midrange driver in loudspeaker A.

The accuracy of the oscilloscope is an error source, as well as the inability to adjust the volume with enough granularity, but judging by the data given in table 4.1, there is an approximate maximum output voltage of 22 V in the frequency band investigated. Ideally, if the granularity of the volume adjustment was better,


4.5.2 Full range loudspeaker test

After connecting the tweeter, the same test was performed again with the result given in table 4.2.

| f (Hz) | V1 (V) | V2 (V) |
|---|---|---|
| 50 | 14 | 15.6 |
| 90 | 16.4 | 18 |
| 100 | 18 | 20 |
| 110 | 18.8 | 20.8 |
| 120 | 20 | 21.2 |
| 50, 90 | 14 | 16 |
| 90, 100 | 15.6 | 17.6 |
| 50, 120 | 14.4 | 16.8 |
| 100, 120 | 16.8 | 18.8 |
| 50, 90, 120 | 14 | 15.6 |
| 90, 100, 120 | 17.2 | 18.8 |
| 20, 30, 35 | 13.6 | 15.6 |
| 20, 50, 120 | 13.2 | 14.8 |
| 35, 50, 65 | 12.8 | 14.8 |
| 25, 35, 40 | 13.6 | 15.6 |
| 25, 30, 35, 40, 45 | 13.6 | 15.6 |
| 20, 50, 60, 90, 120 | 12.8 | 14 |
| 30, 50, 100, 110, 120 | 12.8 | 14.8 |

Table 4.2. Voltages for output signal with and without distortion on full range loudspeaker A.

This time the results are harder to interpret. Listening to the distortion did however reveal that the tweeter was the source, and the data shows that a lower input frequency limits the output voltage. Another series of measurements, shown in table 4.3, reveals the relation between voltage and the audible distortion for the tweeter.

Judging by this data, the tweeter gives audible distortion at a lower voltage for lower frequencies. Analyzing the waveform of the sinusoids while the voltage was increased shows a larger amount of distortion at the top and bottom of the waveform for lower frequencies than for high frequencies. A probable reason for this is that the current supply isn't stable enough when providing the amount of current needed (a large area under the waveform gives high power and high current consumption).

The non-linearities caused by the current supply cause more distortion, and the distortion increases for lower frequencies. The distortion is most likely in the frequency range over 2000 Hz, rendering it inaudible when the tweeter was disconnected.

| f (Hz) | V1 (V) | V2 (V) |
|---|---|---|
| 20 | 11.6 | 13.2 |
| 25 | 12.4 | 14 |
| 30 | 14 | 15.6 |
| 35 | 13.6 | 15.6 |
| 40 | 13.6 | 15.2 |
| 45 | 14 | 15.6 |
| 50 | 14 | 15.6 |
| 55 | 13.2 | 15.2 |
| 60 | 14.4 | 16.4 |
| 65 | 15.2 | 17.6 |
| 70 | 14.4 | 16.4 |
| 75 | 14.4 | 16.4 |
| 80 | 15.6 | 17.6 |
| 85 | 15.2 | 17.2 |
| 90 | 16.4 | 18 |
| 100 | 18 | 20 |
| 110 | 18.8 | 20.8 |
| 120 | 20 | 21.2 |

Table 4.3. Voltages for output signal with and without distortion on full range loudspeaker A.

4.6 Conclusion

If there is no tweeter connected to the output channel and if the bass/midrange driver is incapable of reproducing high frequencies, the solution for eliminating distortion is to limit the outgoing voltage to 22 V in order to ensure no clipping of the signal, i.e., limit the swing of the bass.

If the bass/midrange driver is capable of playing higher frequencies, it is also capable of reproducing high frequency distortion. A detection of high power consumption (high RMS) could then be used to limit the bass. Another solution would be to simply lower the bass in the full range channels, i.e., the channels with elements capable of reproducing high frequencies, and allow a higher output voltage in any other bass channels.

The work in this thesis focuses primarily on eliminating the distortion caused by the bass and occurring in the bass/midrange drivers. The distortion in the tweeter will not be further investigated.


Chapter 5

The proposed solution

The proposed solution for ensuring that the output does not get clipped will be explained and discussed in this chapter. Several models have been developed in software in order to choose the model with the best functionality and the least hardware requirements. The modeling started in Scilab, but Simulink was used later in order to reduce the translation effort from model code to VHDL.

Chapter 6 will describe the hardware implementation of the developed model. The choice of the model is made with the hardware implementation in mind in order to create a VHDL module suitable for implementation in Actiwave’s existing system and its inherited requirements.

5.1 Basic functionality of the model

According to section 4.6 a maximum output voltage exists in order to not end up with a clipped signal. This voltage has to be matched to a maximum digital sample value that never should be exceeded. The proposed solution is to use a limiter (audio compressor), as explained in section 2.6. The limiter will limit the digital sample values, which makes it impossible to end up with a clipped analog signal.

As earlier mentioned, most of the signal’s power is in the bass frequencies and this thesis is focused on reducing distortion in the bass/midrange driver, hence only the swing of lower frequencies has to be decreased in order to reduce the total amplitude sufficiently. The swing of the bass should only be altered where it, according to real-time measurements, is estimated to affect the fidelity of the output, e.g., at loud bass beats. This will leave the bass intact during parts of sound sequences where the bass gives the listener a perception of a full and natural sound. By only applying the limiter at a low frequency band the higher frequencies in a sound sequence will be unaffected. This is important since the total experience of the sound should not be altered, but only improved when necessary.


5.2 Essential structure of the limiter

Parts of the structure of the limiter are more or less necessary in order to achieve high performance. These parts will be explained in this section. Dashed blocks without content in the figures are to be defined later in the chapter.

5.2.1 LP prefiltering

Since only the lower frequencies are of interest when deciding whether to limit the bass or not, an LP filter should be placed first in the audio chain according to figure 5.1.

Figure 5.1. Limiter overview with an LP prefilter.

5.2.2 LP postfiltering

When changing the amplitude of a signal there will be audible clipping sounds from the discontinuities. In order to be able to change the amplitude by an arbitrarily large factor, the signal has to be LP filtered after the amplitude change. This reduces the clipping sound and is illustrated in figure 5.2.

The structure of the limiter is now as seen in figure 5.3.

5.2.3 Preserving a full frequency range signal

In order to preserve a full range signal, the HP part has to be added to the LP part. The delay introduced by the LP filters has to be of the same order as the delay introduced by the HP filter; therefore two HP filters are added, as seen in figure 5.4.

5.3 Amplitude reduction of the bass channel

Disregarding how the decision is made whether to decrease bass or not, there are a few straightforward ways of implementing a change of amplitude for the bass and they will be described in this section.

Figure 5.2. Illustration of the effects of changing the amplitude and how LP filtering alleviates the clipping sound: (a) amplitude change on a sinusoid, (b) LP filtering of the same sinusoid where the amplitude has been changed.

Figure 5.3. Limiter overview with an LP postfilter.

5.3.1 Time multiplexing

By using several HP filters with different cutoff frequencies between $f_{c1}$ and $f_{cn}$, where $f_{c1} < f_{cn}$, the appropriate amount of attenuation can continuously be chosen by the control signal s. This is illustrated in figure 5.5, where the appropriate filter is selected, and figure 5.6 illustrates how the frequency response changes when time multiplexing HP filters.

Figure 5.4. Limiter overview when a full frequency range signal is established. (Block diagram labels: x, HP, LP, x_hp, x_lp, y.)

Figure 5.5. Time multiplexing several filters with different cutoff frequencies. s is the multiplexer's control signal. (Block diagram labels: HP, n, x_lp, s.)

5.3.2 Scaling

Adjustment of the bass amplitude can be done by scaling the LP part. The chosen cutoff frequency of the LP prefilter will determine what frequency band of the bass that is affected by the limiter.

Figure 5.7 illustrates how the scaling is performed by multiplying the LP prefiltered signal with an attenuation factor A. Figure 5.8 illustrates how the amplitude of the frequency response changes with the attenuation factor.

Figure 5.6. Real-time adjustment of the filter cutoff. $f_1 \in [f_{c1}, f_{cn}]$ is the time-varying cutoff set by the HP filter chosen with the multiplexer, and $f_2$ is the cutoff set by the LP prefilter.

Figure 5.7. Scaling of the prefiltered LP part by multiplication with the attenuation factor A. (Block diagram labels: A, x_lp, x.)

Figure 5.8. Real-time adjustable attenuation of magnitude. $f_2$ is the cutoff set by the LP prefilter.

5.3.3 FFT and notch filter

Performing an FFT on the signal after the LP prefilter would reveal what frequency is too strong. By applying a notch filter on top of this frequency the bass could be attenuated. The way of doing this would be according to the following procedure:

1. Create a buffer of the number of samples needed to perform an FFT.

2. Find the bass frequency with maximum magnitude.

3. Create a notch filter at this frequency.

4. Apply the notch filter.
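The four steps above can be sketched as follows. This is a minimal illustration under several assumptions: the sample rate of 1000 Hz, the test tones at 50 Hz and 120 Hz, the notch quality factor and all function names are hypothetical, a direct DFT stands in for the FFT, and the notch uses the widely published second-order (biquad) notch formulas rather than any filter design from this thesis.

```python
import cmath
import math

def dominant_frequency(buffer, fs):
    """Steps 1 and 2: frequency (Hz) of the strongest DFT bin, DC excluded."""
    n = len(buffer)
    best_k, best_mag = 1, 0.0
    for k in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(buffer))
        if abs(acc) > best_mag:
            best_k, best_mag = k, abs(acc)
    return best_k * fs / n

def notch_biquad(f0, fs, q):
    """Step 3: notch coefficients (b0, b1, b2, a1, a2), normalised so that a0 = 1."""
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    return (1 / a0, -2 * math.cos(w0) / a0, 1 / a0,
            -2 * math.cos(w0) / a0, (1 - alpha) / a0)

def apply_biquad(coeffs, samples):
    """Step 4: run the samples through a direct form I biquad."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

FS = 1000  # hypothetical sample rate for the illustration
signal = [0.9 * math.sin(2 * math.pi * 50 * i / FS)
          + 0.2 * math.sin(2 * math.pi * 120 * i / FS) for i in range(2000)]

peak_hz = dominant_frequency(signal[:200], FS)
filtered = apply_biquad(notch_biquad(peak_hz, FS, q=5.0), signal)
print(peak_hz, max(abs(v) for v in filtered[-200:]))
```

Once the filter has settled, the strong 50 Hz tone is removed and essentially only the weaker 120 Hz tone remains. The sketch also makes the cost visible: a full buffer has to be collected before the notch frequency is known.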

5.3.4 Discussion and conclusions

All three methods listed above have their advantages and disadvantages which will be briefly discussed here.

• Time multiplexing:

– Advantages:

1. Only the deepest bass is reduced to start with.

– Disadvantages:

1. A feedback system has to be implemented in order to know how much the bass is attenuated.

2. There is a delay between making the decision of attenuation and applying the attenuation, especially if the bass to be reduced is of high frequency and the selection of filters is switching from the lowest cutoff frequency to the highest.

3. The transparency of the sound within the bass band is altered when attenuating certain frequencies.

• Scaling:

– Advantages:

1. Direct steering can be applied since the magnitude of the bass directly shows how much it is over the allowed limit, hence how much it has to be scaled down.

2. The delay is short between making the decision of attenuating and applying the attenuation.

3. The transparency of the sound within the bass band is preserved since only the amplitude of the bass is affected, not the frequencies.

– Disadvantages:

1. All frequencies in the bass are attenuated whether it is needed or not.

• FFT and notch:

– Advantages:

1. Only the frequencies causing distortion will be attenuated and the rest of the bass is intact.

– Disadvantages:

1. A feedback system has to be implemented in order to know how much the bass is attenuated.

2. A longer delay is introduced when performing an FFT.

3. There might be several frequencies that are too strong, and a decision has to be made on which ones to attenuate and how much.

4. The transparency of the sound within the bass band is altered when attenuating certain frequencies.

After testing the different methods and evaluating how suitable they are for hardware implementation, the method of scaling the bass was chosen. The structure of the limiter is now as in figure 5.9.

Figure 5.9. Overview of limiter with the chosen attenuation method.

5.4 Choice of filters

The choice of filters will be discussed in this section.

5.4.1 Crossover

The part of the signal that is subject to processing is first separated, and thereafter signal processing is performed. The signals are then mixed together again. The summed output should ideally be unchanged in frequency, phase and relative levels of amplitude compared to the original signal, i.e., a perfect crossover would be ideal.

A commonly used method of implementing active audio crossovers is a design with in-phase outputs and steep 24 dB/octave slopes. Such crossovers offer the following characteristics:

1. Absolutely flat amplitude response throughout the passband with a steep 24 dB/octave roll-off rate after the crossover point.

2. The acoustic sum of the two driver responses is unity at crossover. (The amplitude response of each is -6 dB at crossover, i.e., there is no peaking in the summed acoustic output.)

3. Zero phase difference between drivers at crossover.

4. The low pass and high pass outputs are always in phase.

The two drivers mentioned should here be thought of as the channels that are added together in the last step of the limiter, adding up the signal y seen in figure 5.9.

The crossover is however of non-linear phase. Research on the audible impact of slowly changing non-linear phase response shows that the audible results are so minimal as to be nonexistent; especially in comparison with all the other system non-linearities. With real world music sources, it is not audible at all [3].

With the facts stated above, the choice of a crossover like this is reasonable. Each of the LP and HP filters in the limiter (see figure 5.9) follows the characteristics stated above, and together they create a crossover with flat amplitude.

The filters are realized with biquads, each requiring five coefficients. NS is used in the biquad structure according to figure 2.2.
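The listed characteristics match what is usually called a fourth-order Linkwitz-Riley crossover, which can be built from two cascaded second-order Butterworth biquads per branch. This section does not name the alignment, so treating it as such is an assumption, as are the crossover frequency of 200 Hz, the function names and the use of the widely published bilinear-transform biquad formulas. The sketch evaluates the frequency responses and checks the -6 dB point and the flat summed magnitude.

```python
import cmath
import math

def butterworth_biquad(kind, fc, fs):
    """Five coefficients (b0, b1, b2, a1, a2) of a 2nd-order Butterworth section.

    a0 is normalised to 1; kind is "lp" or "hp".
    """
    w0 = 2 * math.pi * fc / fs
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / math.sqrt(2)  # sin(w0) / (2Q) with Q = 1/sqrt(2)
    if kind == "lp":
        b = ((1 - cos_w0) / 2, 1 - cos_w0, (1 - cos_w0) / 2)
    elif kind == "hp":
        b = ((1 + cos_w0) / 2, -(1 + cos_w0), (1 + cos_w0) / 2)
    else:
        raise ValueError("kind must be 'lp' or 'hp'")
    a0 = 1 + alpha
    return (b[0] / a0, b[1] / a0, b[2] / a0, -2 * cos_w0 / a0, (1 - alpha) / a0)

def response(coeffs, f, fs):
    """Complex frequency response of one biquad at frequency f."""
    b0, b1, b2, a1, a2 = coeffs
    z1 = cmath.exp(-2j * math.pi * f / fs)  # z^-1 on the unit circle
    return (b0 + b1 * z1 + b2 * z1 * z1) / (1 + a1 * z1 + a2 * z1 * z1)

FS, FC = 44100, 200.0  # FC is a hypothetical crossover frequency
lp = butterworth_biquad("lp", FC, FS)
hp = butterworth_biquad("hp", FC, FS)

def branch_sum(f):
    """Sum of the LP and HP branches, each being two identical biquads in cascade."""
    return response(lp, f, FS) ** 2 + response(hp, f, FS) ** 2

lp_at_fc = abs(response(lp, FC, FS) ** 2)
print(f"LP branch at crossover: {20 * math.log10(lp_at_fc):.2f} dB")
for f in (20.0, 100.0, 200.0, 1000.0, 10000.0):
    print(f"{f:7.0f} Hz: |LP + HP| = {abs(branch_sum(f)):.6f}")
```

Each branch costs two biquads, i.e., ten coefficients, in this sketch. The summed response has unit magnitude at every frequency but a frequency-dependent phase, which is the non-linear phase behaviour discussed above.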

There are several reasons for choosing IIR filters over FIR filters. First of all, IIR is, according to the discussion above, good enough for this purpose. Secondly, an FIR filter gives a longer delay, which is not desirable if preventable. A major reason for keeping the delay to a minimum is synchronization between video motion and sound if the system is used as TV loudspeakers. A last reason for using IIR filters and biquads is that the hardware structure still has to be implemented in Actiwave's FPGA system for reasons not concerning this thesis. Hence, it is a good choice to utilize a preexisting structure due to hardware limitations.

5.5 The decision making block

Inside the decision making block is where the decision to limit the bass or not is taken. A brief overview of the block is shown in figure 5.10.

A definition of variables used in the decision making block is stated in table 5.1.

Figure 5.10. Overview of the decision making block. (Block diagram labels: max, FSM, x_lp, A.)

| Variable | Definition |
|---|---|
| w | A window of samples |
| $l_w$ | Length of window |
| A(w) | Attenuation in w |
| max(w) | Maximum value in w |
| threshold | Threshold set for limiting samples |
| m | Number of old maximum values stored |

Table 5.1. Definition of variables used in the decision making block.

5.5.1 The maximum block

The maximum block continuously identifies and stores the maximum absolute value of the incoming samples. The internally stored maximum sample value, max, is reset every $l_w$ samples, and a new maximum value is then available on the output port. Every window w has its own max ∈ [0, 1[. Equation 5.1 defines how max is calculated.

$$\max(w) = \operatorname{maximum}\{|x(n)|, |x(n+1)|, \ldots, |x(n + (l_w - 1))|\} \qquad (5.1)$$

The choice of letting the limiter react to peak values is motivated by the fast attack time required. Other options are to react to an average value or an RMS value in a window w. This would however give a slower response, and peaks could easily slip through and cause distortion without being detected by the limiter.
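A software sketch of the maximum block, assuming non-overlapping windows and a function name chosen for illustration, could look as follows. It keeps a running maximum of |x| and emits it every lw samples, in line with equation 5.1. Samples of an incomplete trailing window are ignored in this sketch.

```python
def window_maxima(samples, lw):
    """Peak |x| of each complete, non-overlapping window of lw samples."""
    maxima = []
    current = 0.0  # running maximum of the window in progress
    for i, x in enumerate(samples, start=1):
        current = max(current, abs(x))
        if i % lw == 0:  # window complete: publish the maximum and reset
            maxima.append(current)
            current = 0.0
    return maxima

print(window_maxima([0.1, -0.5, 0.3, 0.2, -0.1, 0.05, 0.9], lw=3))
```

With lw = 3 the first two windows give the maxima 0.5 and 0.2, and the last sample (0.9) is not reported until its window is complete.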

5.5.2 Direct steering

A direct steering is applied when calculating the attenuation factor A. The attack and release of A are described in this section.

Attack

Any sample value over the threshold is counteracted by a decrease of the attenuation factor A, i.e., an attack, according to equation 5.2.

$$A = \begin{cases} 1, & \text{if } \max \leq \text{threshold} \\ \dfrac{\text{threshold}}{\max}, & \text{if } \max > \text{threshold} \end{cases} \qquad (5.2)$$

The factor $A_n$, in a given sample window $w_n$, is to be multiplied with the sample values in the same sample window $w_n$. This requires a delay of the signal of length $l_w$ to guarantee that no sample value leaving the limiter is larger than the threshold. The delay is however left out, and $A_n$ is applied to window $w_{n+1}$ (one window after $w_n$). This could result in sample values over the threshold, but it would only be a problem if the window w is wide. The window length $l_w$ should be chosen short enough to achieve a fast response time. The attenuation delay is, depending on the tuning of the limiter (the value of $l_w$), in the order of what is given in equation 5.3.

$$\text{Attenuation delay} \leq \frac{l_w}{f_s} = \frac{N}{44100\ \text{Hz}} \qquad (5.3)$$
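The attack behaviour can be sketched in software as shown below. This is only an illustration of equations 5.2 and 5.3 with the delay left out, so that the factor from one window is applied to the following window. The function names, the threshold of 0.42 (the value appearing in the simulation figures) and the test data are assumptions for the example.

```python
def attenuation(window_max, threshold):
    """Equation 5.2: unity gain below the threshold, threshold/max above it."""
    if window_max <= threshold:
        return 1.0
    return threshold / window_max

def limit(samples, lw, threshold):
    """Scale each window with the factor derived from the previous window."""
    out = []
    gain = 1.0        # A applied to the window in progress
    window_max = 0.0  # running max(|x|) of the window in progress
    for i, x in enumerate(samples, start=1):
        out.append(gain * x)
        window_max = max(window_max, abs(x))
        if i % lw == 0:  # window complete: its factor takes effect in the next one
            gain = attenuation(window_max, threshold)
            window_max = 0.0
    return out

def attenuation_delay_seconds(lw, fs=44100):
    """Upper bound on the reaction time according to equation 5.3."""
    return lw / fs

limited = limit([0.84, 0.84, 0.84, 0.84], lw=2, threshold=0.42)
print(limited, attenuation_delay_seconds(441))
```

The first window passes unattenuated, which is the overshoot described above, and the second window is scaled down to the threshold.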

Release

There are however restrictions in the direct steering when it comes to the release. A decrease of A (an attack) is always allowed since the system has to be able to limit any sample values that are too large. But an increase of A (a release) cannot always be allowed since it might cause unnecessary switching of A.

A few of the latest max-values are taken into consideration if a release is implied by $\max_n$ in $w_n$. The following equation is valid for switching A in a release.

$$A_n = \frac{\text{threshold}}{\operatorname{maximum}\{\max_{n-m}, \ldots, \max_n\}}, \quad \text{where } m > 1$$

Figure 5.11 shows a sinusoid of constant amplitude and illustrates how the last 5 maximum values (m = 5) are taken into consideration.

Equation 5.4 shows the relation between $l_w$, m and the lowest frequency, $f_l$, to be played by the system.

$$l_w = \left\lceil \frac{f_s / f_l}{2(m + 1)} \right\rceil \qquad (5.4)$$

The question of how many old maximums should be taken into consideration is a matter of tuning.
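One possible way to combine attack and release in software is sketched below. It assumes that both rules can be expressed as a single division by the largest of the last m + 1 window maxima, clamped to unity gain. It also reads equation 5.4 as lw = ceil((fs / fl) / (2(m + 1))), i.e., the stored windows together span about half a period of the lowest frequency. Both readings, as well as the function names and test values, are assumptions.

```python
import math
from collections import deque

def window_length(fs, f_low, m):
    """Window length in samples under the assumed reading of equation 5.4."""
    return math.ceil(fs / f_low / (2 * (m + 1)))

def steering_factors(window_maxima, threshold, m):
    """Attenuation factor per window, holding A while a recent maximum is large."""
    history = deque(maxlen=m + 1)  # max_{n-m} ... max_n
    factors = []
    for peak in window_maxima:
        history.append(peak)
        held = max(history)  # an attack wins at once, a release waits m windows
        factors.append(1.0 if held <= threshold else threshold / held)
    return factors

print(window_length(44100, 50, 5))
print(steering_factors([0.2, 0.84, 0.1, 0.1, 0.1, 0.1], threshold=0.42, m=2))
```

In the second example the factor drops to 0.5 as soon as the peak of 0.84 appears, and it is released only after that peak has left the history of m + 1 values.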

5.6 Optional functionality

It is illustrated in section 5.2.2 and figure 5.2(a) how a change of amplitude causes discontinuities in the waveform. One way of mitigating this is to change the amplitude at the zero crossings of the signal. There is however no guarantee that a zero crossing is available within a specific short period of time after the change of A is implied. Thus, a change of A might be forced before a window w of time has passed.

Figure 5.11. A constant amplitude sinusoid with a number of windows w and their max-values. m = 5 in this illustration, and the A factor is held between the peaks since one of the latest five (m) maximums is one (1) when a release is implied in between the peaks of the sinusoid.

5.7 The volume control

First of all, the position of the limiter should be before the volume control. A generic sample is in the interval [−1, 1[ and the volume is adjusted by multiplying the sample value with a factor smaller than one (1). In order not to reduce the resolution of the sample values, the multiplication is the last operation one wants to perform before leaving the digital domain.

The limiter will be placed before the volume multiplication in the audio chain. Therefore, the volume has to be taken into consideration inside the limiter. This is done by simply multiplying the max-value with the volume factor. However, this feature has been left out of the Simulink models since the limiter isn't working in real-time and the simulations work with a fixed volume level.
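The volume compensation can be sketched as a small change to the attenuation calculation: the window maximum is multiplied with the volume factor before it is compared with the threshold. The function name and the numbers below are assumptions for illustration.

```python
def attenuation_with_volume(window_max, volume, threshold):
    """Attenuation factor when the threshold must hold after the volume control."""
    scaled_max = window_max * volume  # peak as it will appear after the volume
    if scaled_max <= threshold:
        return 1.0
    return threshold / scaled_max

peak, threshold = 0.84, 0.42
for volume in (1.0, 0.75, 0.25):
    a = attenuation_with_volume(peak, volume, threshold)
    print(f"volume {volume}: A = {a:.3f}, output peak = {peak * a * volume:.3f}")
```

At a low volume setting the bass is left untouched, since the signal after the volume multiplication stays below the threshold anyway.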


5.8 The complete limiter

An overview of the complete limiter derived in this chapter is seen in figure 5.12.

Figure 5.12. An overview of the complete limiter.

5.9 Limiter simulations

A few simulation cases will in this section illustrate the functionality of the limiter developed in Simulink. As earlier mentioned, the Simulink model has no volume control, i.e., the simulations are performed with constant volume. The chosen


5.9.1 Limitation of a sinusoid with constant amplitude

The limitation of a sinusoid with constant amplitude is seen in figure 5.13.

Figure 5.13. Simulations of the limiter when the input is a sinusoid with constant amplitude: (a) the input signal x = sin(2π50t), (b) the attenuation factor A, (c) the limited output signal y ≈ 0.42sin(2π50t).

In figure 5.13(b), notice how A is held constant and that the amplitude of the sinusoid is approximately at the threshold.


5.9.2 Limitation of a ramping sinusoid

The limitation of a sinusoid with a smooth ramping up and down is seen in figure 5.14.

Figure 5.14. Simulations of the limiter when the input is a ramped sinusoid: (a) the input signal x = sin(2πt)sin(2π50t), (b) the attenuation factor A, (c) the limited output signal y ≈ 0.42sin(2π50t) where x(n) > threshold.

In figure 5.14(b), notice how the amplitude of the output y is approximately at the threshold. The release is slower than the attack and this can be seen by the asymmetry in figure 5.14(c) where the amplitude is lower than the limit of 0.42 after t = 0.25.


5.9.3 Limitation of music

The limitation of a sequence of music, the song Knobbers with Crookers, is seen in figure 5.15. The input x and the output y have been LP filtered in order to simplify analysis. The bass beats can be seen in figure 5.15(b) and the resulting bass in figure 5.15(d).

Notice how the bass frequencies in figure 5.15(d) are approximately held below the threshold. A few peaks are over the threshold; however, extremely short periods of distortion are most likely not heard by the human ear.

5.10 Conclusion

The results in this chapter show that it is possible to implement a relatively simple and straightforward model compensating for the distortion in the bass. The hardware complexity, the audible improvements and measurements supporting the improvements of fidelity are yet to be investigated after the VHDL implementation.

Figure 5.15. (a) The input signal. (b) The input signal LP filtered. (c) The attenuation factor A. (d) The limited output signal LP filtered. (e) The limited output signal.

Chapter 6

VHDL implementation

This chapter will briefly describe the implementation of the limiter in VHDL. The module was interconnected with already existing VHDL modules, and some properties such as word lengths are inherited from the existing system. The complete audio chain is basically given, even if changes had to be made to instantiate the limiter. An overview of the complete system is seen in figure 6.1.

Figure 6.1. An overview of the existing VHDL environment where the limiter is instantiated. The dashed blocks are generally in the system, but are not used in this thesis. (Block labels in the diagram: SPDIF RX, Signal processing, Volume control, Limiter, PWM, In, Out.)

6.1 FPGA

The FPGA used during development was a Xilinx Spartan 3 XC3S250E, but the limiter will later be implemented on a larger Spartan 6. Changes to the VHDL code may have to be performed in order to optimize for the new hardware environment.

6.2 Clock domains

The system operates on a 50 MHz clock and the sample clock. Typically the sample rate is 44100 Hz. In that case, there are about 1133 system clock periods in every sample clock period. This leaves time for a lot of processing before the next sample arrives. Thus, there is no particular timing constraint to adapt to in this system when processing data.
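The cycle budget can be checked with a few lines of arithmetic. The constants are taken from the text above. The helper for the window clock and its example window length of 441 samples are hypothetical and serve only to show how a window clock rate would follow from dividing the sample clock.

```python
SYSTEM_CLOCK_HZ = 50_000_000
SAMPLE_RATE_HZ = 44_100

# Whole system clock periods available between two consecutive samples.
cycles_per_sample = SYSTEM_CLOCK_HZ // SAMPLE_RATE_HZ

def window_clock_hz(lw):
    """Rate of a clock obtained by dividing the sample clock by lw samples."""
    return SAMPLE_RATE_HZ / lw

print(cycles_per_sample)
print(window_clock_hz(441))
```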

A third clock is created by dividing the sample clock and thus creating the window w used for continuously finding maximum values in the input sequence.
