
Study of Interferer Canceling Systems in a Software Defined Radio Receiver


Department of Electrical Engineering

Master's thesis

Master's thesis carried out in Radio Electronics at the Institute of Technology, Linköping University

by

Oskar Holstensson

LiTH-ISY-EX--12/4650--SE

Linköping 2013

Department of Electrical Engineering, Linköpings tekniska högskola, Linköpings universitet


Supervisor: Nicolas Regimbal, Atlantic Innovation Electronic Solutions

Examiner: Ted Johansson, isy, Linköpings universitet

Linköping, 22 May 2013

Division, Department: Electronic Devices, Department of Electrical Engineering, SE-581 83 Linköping

Date: 2013-05-22

URL for electronic version: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-92757

ISRN: LiTH-ISY-EX--12/4650--SE

Title (Swedish): Studie av Störsignalsneutraliserande System i en Mjukvarudefinierad Radiomottagare

Title (English): Study of Interferer Canceling Systems in a Software Defined Radio Receiver

Author: Oskar Holstensson

Abstract

This thesis describes the work related to an interferer rejection system employing frequency analysis and cancellation through phase-opposed signal injection. The first device in the frequency analysis chain, an analog fast Fourier transform application-specific integrated circuit (asic), was improved upon. The second device, a chained fast Fourier transform followed by a frequency analysis module employing cross-correlation for signal detection, was specified, designed and implemented in vhdl.


Acknowledgments

I want to express my gratitude to my supervisor Nicolas Regimbal for his helpful guidance during the thesis.

Many thanks go to my examiner Ted Johansson for his support and never-ending source of wisdom and patience.

I would also like to express my gratitude to my friends at the university, most notably my very good friend Christoffer Peters.

Finally I thank my loving family for always supporting me in whatever endeavor I submit to at the time.


Contents

Notation

I Background

1 Introduction
  1.1 Leon and the interferer rejection system
  1.2 Goals
  1.3 Document outline

2 Software radio
  2.1 Traditional radio receivers
  2.2 Software defined radio
  2.3 Cognitive radio
  2.4 This work

II Results

3 The sasp
  3.1 Introduction
  3.2 Analog processing modules
    3.2.1 Analog adders
    3.2.2 Delay line
    3.2.3 Sample and hold
    3.2.4 Weighting unit
    3.2.5 Matrix unit
    3.2.6 Sample selector
  3.3 Control modules
    3.3.1 Flip-flop
    3.3.2 Address generator
    3.3.3 Coefficient control
  3.4 Results
  3.5 Conclusions
    3.5.1 Future work

4 Frequency analysis and the dsp
  4.1 Theory
  4.2 Feasibility test
    4.2.1 Algorithm
    4.2.2 Results
  4.3 rtl implementation
    4.3.1 Buffer
    4.3.2 fft
    4.3.3 Frequency analyzer
    4.3.4 Compensator
    4.3.5 Results
  4.4 Conclusions
    4.4.1 Future work
    4.4.2 Multiple signal detection
    4.4.3 Solving ambiguities

5 Summary
  5.1 Conclusions
  5.2 Future work
  5.3 Final words

Bibliography

A Signal detection with cross-correlation

B Fast Fourier transform
  B.1 Decimation-in-time radix-4 FFT


Notation

Abbreviations

Abbreviation   Meaning

adc            Analog to Digital Converter
aft            Analog Fourier Transform
asic           Application-Specific Integrated Circuit
cmos           Complementary Metal Oxide Semiconductor
dac            Digital to Analog Converter
dft            Discrete Fourier Transform
dsp            Digital Signal Processor
fft            Fast Fourier Transform
fpga           Field Programmable Gate Array
fsm            Finite State Machine
mac            Multiply And Accumulate
rom            Read-Only Memory
rtl            Register-Transfer Level
sasp           Sampled Analog Signal Processor
sdr            Software Defined Radio
sr             Software Radio
vhdl           vhsic Hardware Description Language
vhf            Very High Frequency
vhsic          Very-High-Speed Integrated Circuit

Part I

Background

1

Introduction

This thesis was carried out at Atlantic Innovation Electronic Solutions in Bordeaux with the aim of studying and implementing the interferer cancellation system proposed in the Leon project, discussed later in this chapter. It was done within the scope of a 20-week master's thesis project in the spring of 2012.

The present work was carried out within the framework of the ITP SIMCLAIRS competed program. France, the United Kingdom and Sweden have mandated the European Defence Agency (EDA) to contract the Project with a Consortium composed of THALES SYSTEMES AEROPORTES France, acting as the Consortium Leader, SELEX Galileo Ltd, THALES UK Ltd and SAAB AB.

1.1

Leon and the interferer rejection system

Leon is a project supervised by Atlantic Innovation Electronic Solutions. The project aims at creating an interferer rejection system using the sampled analog signal processor, or sasp, described below. The goal is to be able to cancel out any wideband signal from the vhf to Ku bands, or 30 MHz to 18 GHz.

The system makes use of frequency analysis of the input signal to distinguish powerful interferers, and superimposes a phase-opposed signal on top of the input signal, delayed through a delay line. An overview of the system is depicted in figure 1.1.

Figure 1.1: Proposed system

For frequency analysis, the input signal is processed using an analog Fourier transform. The algorithm is the popular radix-4 fast Fourier transform (fft) algorithm by Cooley and Tukey (1965).

To enhance the frequency resolution, the dsp itself performs another fft. A digital signal processing unit then processes the output of the second transform to determine the frequency, amplitude and phase of any interfering signals. If the frequency resolution following the analog Fourier transform is sufficient, then the cascaded transform can be omitted, but then the aft must supply a spectrum excerpt for analysis.

The purpose of the analog delay line is to delay the signal while its spectral contents are being determined and phase-opposed rejection signals are being created. When the signal exits the delay line, the rejection signals are added to cancel out the interferers.
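The cancellation principle can be illustrated numerically. The following is a minimal sketch and not the Leon implementation: the sample rate, tone frequencies, amplitudes and function names are all hypothetical values chosen for illustration. It adds a phase-opposed copy of an interferer to the received signal, and also shows how a small error in the estimated phase leaves a residual.

```python
import math


def tone(amplitude, freq_hz, phase_rad, fs_hz, n_samples):
    """Sampled cosine with the given amplitude, frequency and phase."""
    return [amplitude * math.cos(2 * math.pi * freq_hz * k / fs_hz + phase_rad)
            for k in range(n_samples)]


def cancel(delayed_input, amplitude, freq_hz, phase_rad, fs_hz):
    """Add a rejection signal: same amplitude and frequency, phase shifted by pi."""
    rejection = tone(amplitude, freq_hz, phase_rad + math.pi, fs_hz,
                     len(delayed_input))
    return [x + r for x, r in zip(delayed_input, rejection)]


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Hypothetical scenario: weak wanted signal plus a strong interferer.
FS = 640e6
N = 256
wanted = tone(0.01, 50e6, 0.3, FS, N)
interferer = tone(1.0, 120e6, 1.1, FS, N)
received = [w + i for w, i in zip(wanted, interferer)]

# Perfect knowledge of the interferer parameters removes it entirely.
cleaned = cancel(received, 1.0, 120e6, 1.1, FS)
residual = rms([c - w for c, w in zip(cleaned, wanted)])

# A phase estimate that is off by one degree leaves a visible residual.
imperfect = cancel(received, 1.0, 120e6, 1.1 + math.radians(1.0), FS)
residual_with_error = rms([c - w for c, w in zip(imperfect, wanted)])

print(f"residual, exact estimate:   {residual:.3e}")
print(f"residual, 1 degree error:   {residual_with_error:.3e}")
```

In this sketch a one-degree phase error already leaves a residual comparable in size to the wanted signal, which illustrates why the accuracy of the spectral analysis matters for the cancellation.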

The sampled analog signal processor, or sasp, was created at lab IMS in Bordeaux by Dr. Rivet (2009) for his doctoral thesis. The signal processor performs a Fourier transform on the incoming radio signal, allowing for isolation of individual frequency components, and simplifying subsequent signal processing. A proof-of-concept sasp was created, capable of continuously sustaining 64-point Fourier transforms at a sampling frequency of 640 MHz.

Analyzing the incoming spectrum with the help of the sasp, interfering signals are detected and have their phase-opposed equivalents superimposed over them. With accurate enough analysis of the incoming spectrum, this will effectively cancel the interferers. However, the limitation to 64-point transforms proves problematic. The possibility of performing a chained Fourier transform after that of the sasp has been examined, with a proof-of-concept Matlab model. This improves the precision of the analysis, but also comes with a new set of issues.

1.2

Goals

This work is clearly divided into two tasks, and consequently its goals come in two parts.

• First, the requirements of the digital signal processing will be analyzed and implemented in vhdl.

• Second, the work on the sasp will be resumed and advanced towards its completion.

The first goal is open for different approaches and architectures. The second goal, in that it involves a project resumed at a late stage of development, is more restricted and has the following requirements.

• Maximum chip area: 1.44 mm²

• Maximum power consumption: 100 mW

• Minimum speed: 2 GHz

The two tasks stated above are in the chronological order in which the goals were identified and performed. However, for the sake of narrative, this report reorders them in the order in which they appear in the system.

1.3

Document outline

Chapter 2 discusses modern radio receiver challenges and presents the concept of software defined radio.

In chapter 3, the work on improving the sasp is presented. The different building blocks, analog and digital, are presented and the work done on them during this thesis is highlighted.

In chapter 4, the subject of chaining Fourier transforms is discussed. The issues with inaccuracy involved in this method of spectral analysis are presented, and solutions to these problems are discussed. The chapter presents a comprehensive workflow from mathematical concept to a Matlab model and ending in a synthesizable vhdl model.


2

Software radio

This chapter briefly brings up the topic of traditional radio and moves towards the concept of the software radio, setting the background of the work.

2.1

Traditional radio receivers

Traditional radio receivers work by tuning in to a certain channel in the wanted band. The radio signal from the antenna is amplified through a low-noise amplifier. Signals in other channels and even other bands need to be filtered out, which is often done at an intermediate frequency.

However, an interferer of sufficient power risks saturating the low-noise amplifier and might even damage it. Employing a tuned antenna or RF filters attenuates out-of-band interferers, and in-band interferers are assumed to behave according to the corresponding standard.

These steps to minimize the damage caused by interferers greatly reduce the tunability of the circuit. Highly configurable RF filters of sufficient quality are very difficult to construct. To keep the cost and power consumption at a minimum, radio receivers are highly tuned and the entire signal path is optimized for the target specification.

2.2

Software defined radio

The high specialization of radio receiver circuits prohibits them from sharing more than a fraction of their signal paths. This leads to device complexity growing with the number of communication standards accommodated.

Any change in an already existing standard, such as allocating a new band, calls for a redesign of the radio hardware. To be able to accommodate an additional standard, a device needs to be upgraded with an entirely new transceiver.

The aim of software defined radio (sdr) is to create a transceiver architecture able to accommodate any number of wireless standards simultaneously, while maintaining a low power consumption. When current standards are modified or new standards are introduced, the sdr unit is compliant right after reprogramming. The concept of the software radio was proposed by Mitola (1995).

The ideal software defined radio involves digitizing the incoming radio signal at the antenna. With a sufficiently fast and accurate analog-to-digital converter (adc) followed by a powerful enough digital signal processing unit, any wireless standard can be accommodated. Such a device, consisting almost entirely of digital components, is sometimes referred to as a software radio (sr). However, the requirements this puts on the speed and accuracy of the adc push the power consumption to impractical levels using modern technology. The concept of the all-digital software radio remains a utopian one.

More practical sdr architectures make use of both analog and digital components, sometimes with multiple signal paths to accommodate a wide frequency range. Deval (2010) discusses the problems, advantages and disadvantages of software radio compared to multi-radio approaches, and presents practical design solutions and circuit examples.

2.3

Cognitive radio

A natural extension of the software defined radio is the cognitive radio. The term was first proposed by Mitola and Maguire (1999).

Traditionally, the frequency spectrum is divided into bands and is licensed per geographical area. This regulation of the frequency spectrum is necessary to avoid overlapping bands, and consequently avoids interference between services. However, the spectrum is fully utilized only when all channels in all bands are allocated. More likely is the situation where one band is overutilized while another band on a different service is underutilized. This is a common situation with cellular communication (overutilization) versus television broadcasting (underutilization).

A cognitive radio detects free, unused channels and adapts its transmission and reception parameters to better utilize the wireless spectrum. With accurate enough spectrum sensing the cognitive radio can use the full potential of the radio environment without causing interference to other devices. A cognitive radio is basically composed of a software defined radio with spectrum sensing capabilities.

The field of cognitive radio is an active research topic. Razavi (2010) introduces a low-noise amplifier for a cognitive radio receiver for the range of 50 MHz to 10 GHz. Kitsunezuka et al. (2012) present a cognitive radio receiver capable of receiving signals between 30 MHz and 2.4 GHz. It is also able to sense spectral energy to determine band availability.

2.4

This work

The ultimate goal in the Leon project is to cancel powerful interferer signals from the vhf to Ku bands, or 30 MHz to 18 GHz. At this bandwidth a single filter is not feasible, and accommodating all possible blocker profiles is even less so.

The Leon project is designed to accommodate any radio receiver operation, and targets no specific application or radio standard. The project is in other words effectively a flexible filter, directly appropriate for use in an sdr or cognitive radio receiver.

Part II

Results

3

The SASP

3.1

Introduction

The sampled analog signal processor, or the sasp, is a device that is capable of performing a Fourier transform with analog samples, or analog Fourier transform (aft). It was created at lab IMS in Bordeaux by Dr. Rivet (2009) for his doctoral thesis.

In Leon, the sasp performs the first of two cascaded Fourier transforms. It is located at the input (figure 3.1).

Figure 3.1: The aft in the proposed system

The sasp performs a sample-and-hold operation on the input signal, and then utilizes a series of delay lines and analog arithmetic units to perform the decimation-in-time radix-4 fast Fourier transform by Cooley and Tukey (1965), derived in section B.1. One frequency bin of the operation is selected and its complex analog value is output each time the transform has completed.

The chosen architecture has the advantage of being able to operate continuously. It inputs one sample and outputs one frequency bin per clock cycle.

A structural overview of the sasp is presented in figure 3.2.

Figure 3.2: Overview of the sasp

The sasp was previously realized in a demonstrator chip by Dr. Rivet in the 65 nm cmos technology from ST Microelectronics. The demonstrator operates at frequencies up to 640 MHz with 64 samples. The power consumption of the demonstrator is 450 mW.

The improvement work aims at elevating the operating frequency of the sasp to at least 2 GHz at a power consumption of less than 100 mW.

3.2

Analog processing modules

The principal function of the sasp is to process sampled analog signals, true to its name. In this section the different elements to achieve this function are described.

3.2.1

Analog adders

The analog adders perform addition with differential analog voltage samples. This is accomplished by adding currents; the inputs are connected to transistors that act as voltage controlled current sources (figure 3.3). The current through the common resistor exhibits an increase proportional to the sum of the input voltages, and the sum can be sensed as the increase in voltage across it.

3.2.2

Delay line

The delay lines of the sasp make temporary storage for the samples as each stage of the fft requires the samples to arrive in a specific order. The first butterfly of the first stage requires the samples with indices 0, 16, 32 and 48; the stage thus needs to store samples 0-47 before the first butterfly can be processed.
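The index pattern above can be generated programmatically. The sketch below assumes that all butterflies of the first stage follow the same stride of N/4 as the first one (the text only states the indices of the first butterfly); the function name is illustrative.

```python
N = 64      # transform length of the sasp
RADIX = 4   # radix of the fft


def first_stage_butterflies(n_points=N, radix=RADIX):
    """Sample indices consumed by each butterfly of the first stage.

    Assumption: butterfly b takes samples b, b + N/4, b + N/2 and b + 3N/4.
    """
    stride = n_points // radix
    return [[b + i * stride for i in range(radix)] for b in range(stride)]


butterflies = first_stage_butterflies()

# The last input of the first butterfly is sample 48, so samples 0 to 47
# have to be held in the delay line until it arrives.
samples_to_store = butterflies[0][-1]

print(butterflies[0])       # indices of the first butterfly
print(samples_to_store)     # number of samples stored before it can run
```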

Figure 3.3: Analog two-input adder (terminals: Vdd, bias, inputs apos, aneg, bpos, bneg, outputs outpos, outneg)

Furthermore, the delay lines play a role in the deserialization and serialization of the samples. At the input of each stage, one sample arrives per clock cycle, but the matrix unit processes four samples at a time. The output delay line then serializes the samples so that they are again sent to the next stage at a rate of one sample per clock cycle.

3.2.3

Sample and hold

At the input of the sasp, the sample and hold circuit converts the continuous-time input signal to a discrete-time one suitable for processing.

3.2.4

Weighting unit

Both windowing and the fft algorithm require the input samples to be multiplied with certain coefficients. For the sasp, this is accomplished in the weighting unit. It is based on the work by Abiven (2011). The device effectively multiplies an analog sample with a digital value.

The architecture was improved in this thesis. The previous architecture utilized a base-10 approach, providing 100 possible digital values with eight control lines. This approach is called binary coded decimal and was chosen as it is clear and intuitive when programming by hand.

As the coefficients were to be provided by a read-only memory (rom) structure that can be programmed automatically, a pure binary approach was chosen instead. This increased the number of values to 256.

The multiplication is accomplished by scaling the input by factors of $2^{-k}$, $k = 0, 1, 2, \ldots, 7$, and then adding a subset of these together. The subset is determined by the bits in the digital factor.

\[ c = \sum_{n=0}^{7} 2^{-n} \tag{3.1} \]

The largest possible coefficient is $2 - 2^{-7}$, and its use is considered as scaling the input by unity.
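The binary weighting can be sketched in a few lines. This is an illustrative model only: the mapping of bits to scale factors (most significant bit selecting $2^{0}$, least significant bit selecting $2^{-7}$) and the function names are assumptions, since the text does not state the bit order.

```python
def weight(word):
    """Sum of the scale factors 2**-k selected by an 8-bit control word.

    Assumption: bit 7 (MSB) selects 2**0 and bit 0 (LSB) selects 2**-7.
    """
    if not 0 <= word <= 0xFF:
        raise ValueError("control word must fit in 8 bits")
    return sum(2.0 ** -k for k in range(8) if (word >> (7 - k)) & 1)


# All bits set gives the largest coefficient, 2 - 2**-7.
FULL_SCALE = weight(0xFF)


def normalized_weight(word):
    """Coefficient relative to full scale, which is treated as unity gain."""
    return weight(word) / FULL_SCALE


print(FULL_SCALE)
print(normalized_weight(0x80))
```

With pure binary coding every one of the 256 control words yields a distinct coefficient, in contrast to the 100 values of the earlier binary coded decimal scheme.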

Multiplying complex numbers is accomplished by using four real-valued weighting units and two two-input analog adders as shown in equation 3.2.

\[ \begin{aligned} \Re\{out\} &= \Re\{a\}\,\Re\{b\} - \Im\{a\}\,\Im\{b\} \\ \Im\{out\} &= \Im\{a\}\,\Re\{b\} + \Re\{a\}\,\Im\{b\} \end{aligned} \tag{3.2} \]
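Equation 3.2 maps directly onto four real multiplications and two additions. The short sketch below mirrors that structure and checks it against Python's built-in complex arithmetic; the function name is illustrative.

```python
def complex_multiply(a_re, a_im, b_re, b_im):
    """Complex product using four real multiplications and two additions."""
    out_re = a_re * b_re - a_im * b_im   # first adder (one input inverted)
    out_im = a_im * b_re + a_re * b_im   # second adder
    return out_re, out_im


product = complex_multiply(1.0, 2.0, 3.0, 4.0)
reference = complex(1.0, 2.0) * complex(3.0, 4.0)
print(product, reference)
```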

3.2.5

Matrix unit

The matrix unit implements the addition matrix derived in section B.1. Equation B.6 is included here for clarity.

\[
\begin{pmatrix} X(k) \\ X(k + \tfrac{N}{4}) \\ X(k + \tfrac{N}{2}) \\ X(k + \tfrac{3N}{4}) \end{pmatrix}
=
\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -j & -1 & j \\ 1 & -1 & 1 & -1 \\ 1 & j & -1 & -j \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & W_N^{k} & 0 & 0 \\ 0 & 0 & W_N^{2k} & 0 \\ 0 & 0 & 0 & W_N^{3k} \end{pmatrix}
\begin{pmatrix} F_0(k) \\ F_1(k) \\ F_2(k) \\ F_3(k) \end{pmatrix}
\tag{B.6}
\]

The trivial multiplications by factors −j, −1 and j are performed by simply rewiring the differential analog signals at the input of the analog adders.
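As a numerical check of the butterfly, the following sketch implements a recursive decimation-in-time radix-4 fft in software and compares it with a direct dft. It is a behavioral illustration, not a model of the analog circuit; it assumes the twiddle factor convention $W_N = e^{-j 2\pi / N}$, and the function names are illustrative.

```python
import cmath

# Addition matrix of the radix-4 butterfly: only factors 1, -1, j and -j.
BUTTERFLY = (
    (1, 1, 1, 1),
    (1, -1j, -1, 1j),
    (1, -1, 1, -1),
    (1, 1j, -1, -1j),
)


def fft_radix4(x):
    """Decimation-in-time radix-4 fft; len(x) must be a power of four."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 4:
        raise ValueError("length must be a power of four")
    quarter = n // 4
    # F_i(k): transforms of the four interleaved sub-sequences x[4m + i].
    f = [fft_radix4(x[i::4]) for i in range(4)]
    out = [0j] * n
    for k in range(quarter):
        # Twiddle factors W_N^(i*k) applied before the addition matrix.
        weighted = [cmath.exp(-2j * cmath.pi * i * k / n) * f[i][k]
                    for i in range(4)]
        for m, row in enumerate(BUTTERFLY):
            out[k + m * quarter] = sum(c * w for c, w in zip(row, weighted))
    return out


def dft(x):
    """Direct evaluation of the dft definition, used as a reference."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]


samples = [complex((7 * m) % 11 - 5, (3 * m) % 5 - 2) for m in range(64)]
max_error = max(abs(a - b) for a, b in zip(fft_radix4(samples), dft(samples)))
print(max_error)
```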

3.2.6

Sample selector

The sample selector waits at the end of the pipeline, and grabs one specific frequency bin every time it appears at the output. The frequency bin to be selected is programmable by specifying the corresponding binary number at 6 input pins.

After the sample selector is a set of buffers to drive the output pins of the chip. These buffers have an extended output swing to facilitate chip measurement.


3.3

Control modules

To control the workflow and provide coefficients for the analog weighting units, a set of control modules are required. This section presents the principal modules and their function.

The modules described here were all designed and implemented during this thesis.

3.3.1

Flip-flop

Digital flip-flops are used to store the digital values used for controlling the sasp. The selected architecture is that of Yuan and Svensson (1989). It was selected for its simplicity; it does not require complementary clock phases.

The architecture is dynamic and will lose its data unless a minimum clock speed is maintained. This is the digital equivalent of the delay line cell. The schematic and layout of the flip-flop are depicted in figure 3.4 and figure 3.5 respectively.

Figure 3.4: Flip-flop schematic (terminals: Vdd, in, out, clk)

Figure 3.5: Flip-flop layout

3.3.2

Address generator

To control the sasp, an address generator unit is used. It contains a 6-bit counter to provide a global state followed by adders to compensate for phase differences in the different stages of processing.

Since the contents of the rom modules are easily manipulated, their contents can be shifted to obtain a virtual phase shift. Moreover, the phase difference of the sample selector can be compensated off-chip. By utilizing these techniques, the Hamming window unit, stage 1 and the sample selector all run on the base address of the address generator, saving space and power. Stages 2 and 3 run on addresses with their own phase adjustments.

The counter and the adders both use the Kogge-Stone architecture (Kogge and Stone, 1973). Stages 2 and 3 have hard-coded offsets to provide the required phase difference. The Kogge-Stone architecture was selected because of the speed requirements; the adders need to operate reliably at 2 GHz. Simple carry-chain adders proved to be too slow for the application, even for as few as 6 bits.

Adding two binary numbers the pen-and-paper way, one begins the addition at the rightmost bits. If both bits are 1, a value of one carries over to the left. The algorithm then proceeds to the next bit position, adding the bits at that position together with any carried bit. This is repeated for all bits.

The problem with this algorithm is that it forms a chain of carried bits, and the final outcome will have to wait until this long chain is fully traversed. To speed up this process, carry-lookahead is performed.

For each pair of bits of the inputs, $A_n$ and $B_n$, $n = 0, 1, \ldots, N - 1$, two properties are derived.

• If $A_n$ and $B_n$ are both 1, then a bit will be carried to the left regardless of whether a carry arrives from the right. This property is called generate, or $G_n$.

• If only one of $A_n$ or $B_n$ is 1, then a bit will be carried if and only if a carry arrives from the right. This property is called propagate, or $P_n$.

These two properties are calculated as a first step. Secondly, these properties are calculated for all pairs of two consecutive positions of the inputs. For positions $n$ and $n - 1$, this group of bits is set to generate if bit $n$ already generates ($G_n = 1$), or if bit $n - 1$ is set to generate while bit $n$ is set to propagate that bit. The entire group of bits is set to propagate if both bits of the sequence propagate.

Extending these definitions for any group of bits running from $n$ to $m$, their cumulative generate and propagate properties are called $G_{n \leftrightarrow m}$ and $P_{n \leftrightarrow m}$ respectively.

The cumulative properties can be extended one bit at a time to see if the extended group is set to generate or propagate. However, a lot of redundant processing is avoided by instead taking the generate and propagate properties of a group, and combining them with those of the largest adjoining group that has already been resolved.

Ultimately this will determine $G_{n \leftrightarrow 0}$ and $P_{n \leftrightarrow 0}$ for all bit positions $n$. Since each processing step can effectively double the group length, a total of $\log_2 N$ processing steps, rounded up to the nearest integer, are required.

When a group has the propagate property, it will propagate to the left any carried bit from the right. However, when the cumulative properties are known all the way to the least significant bit (bit 0), there is no possibility of a carry bit arriving from the right. The generate property then unambiguously determines whether the group has generated a carry bit or not. Each complete group now generates a carry bit to the left if and only if it has the generate property, that is, $C_n = G_{n-1 \leftrightarrow 0}$.

To arrive at the sum, the three bits $A_n$, $B_n$ and $C_n$ are added to form the sum bit $S_n = A_n \oplus B_n \oplus C_n$. This calculation can reuse the $P_n$ property; since $P_n = A_n \oplus B_n$ it is possible to reduce the sum calculation to $S_n = P_n \oplus C_n$.
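The carry-lookahead procedure described above can be summarized in a behavioral sketch. This is a software model written for illustration, not the circuit implemented in the thesis; the function name and the returned tuple are assumptions. It computes the generate and propagate properties per bit, doubles the resolved group length in each step, and forms the sum from the propagate bits and the carries.

```python
def kogge_stone_add(a, b, width=6):
    """Add two unsigned integers with a Kogge-Stone prefix network.

    Returns (sum modulo 2**width, carry out, number of prefix steps).
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    # Per-bit properties: generate (both bits set), propagate (exactly one set).
    g = [(a >> n) & (b >> n) & 1 for n in range(width)]
    p = [((a >> n) ^ (b >> n)) & 1 for n in range(width)]

    # Cumulative properties; each step doubles the length of resolved groups.
    G, P = g[:], p[:]
    span = 1
    steps = 0
    while span < width:
        new_G, new_P = G[:], P[:]
        for n in range(span, width):
            # Group generates if its upper part generates, or if the upper
            # part propagates a carry generated by the adjoining lower part.
            new_G[n] = G[n] | (P[n] & G[n - span])
            new_P[n] = P[n] & P[n - span]
        G, P = new_G, new_P
        span *= 2
        steps += 1

    # C_n = G_(n-1 <-> 0); no carry enters bit 0.
    carries = [0] + G[:width - 1]

    total = 0
    for n in range(width):
        total |= (p[n] ^ carries[n]) << n   # S_n = P_n xor C_n
    return total, G[width - 1], steps


print(kogge_stone_add(45, 27))
```

For the 6-bit width used in the address generator, three prefix steps resolve all carries, whereas a carry chain would have to ripple through up to six positions.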

The phase-compensated addresses are converted to base-4, decoding each pair of binary bits into four lines. This goes well with the radix-4 design of the fft implementation, and also reduces the complexity of the coefficient rom architecture.
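A small sketch of this decoding follows. It assumes that exactly one of the four lines is active per digit (one-hot) and that the least significant bit pair forms the first digit; neither detail is stated in the text, and the function name is illustrative.

```python
def decode_base4(address, width=6):
    """Decode each pair of address bits into four one-hot lines.

    Returns one tuple of four lines per base-4 digit, least significant
    digit first (an assumption about the ordering).
    """
    digits = []
    for pair in range(width // 2):
        digit = (address >> (2 * pair)) & 0b11
        digits.append(tuple(1 if line == digit else 0 for line in range(4)))
    return digits


# Address 0b100111 has the base-4 digits 3, 1, 2 (least significant first).
print(decode_base4(0b100111))
```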

Calculate P and calculate G circuits were created and laid out. The schematic and layout for calculate P are shown in figure 3.6 and figure 3.7 respectively. The schematic and layout for calculate G are shown in figure 3.8 and figure 3.9 respectively.

Figure 3.6: Calculate P schematic (terminals: Vdd, p, pprev, pout)

Figure 3.7: Calculate P layout

Figure 3.8: Calculate G schematic (terminals: Vdd, g, gprev, p, gout)

Figure 3.9: Calculate G layout

Using the above structures, a matrix performing all of the reductions was created and laid out. The final layout for the structure calculating the cumulative generate property is shown in figure 3.10. The input to the circuit is at the top terminals, and the output is routed from the bottom.

3.3.3

Coefficient control

The Hamming window unit, stage 2 and stage 3 all require digital coefficients for their weighting units. These coefficients are supplied by use of a NOR rom structure, as seen in figure 3.11.

Figure 3.11: rom principle (precharge line, word lines 0 to 2, bit lines 0 to 2, each bit line pulled up to Vdd)

Each word line is controlled by a small logic unit. It activates during one half of the clock cycle when the right address is supplied. During the other half of the clock cycle the bit lines are charged to $V_{DD}$ by pull-up transistors.

The contents of the rom are indicated by the presence or absence of a pull-down transistor. When a word line is activated, the presence of a transistor at the junction between said word line and a bit line will pull the corresponding bit line towards ground, signifying a logical zero at this address. The bit lines without transistors in the active junction will remain at $V_{DD}$, signifying a logical one. The contents of the 3x3 example rom depicted in figure 3.11 are 010, 101 and 111 at addresses 0, 1 and 2 respectively.
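The read cycle can be modeled behaviorally as follows. The transistor placement in this sketch is derived from the stated contents (a transistor wherever a zero is stored) rather than read from the figure, and the bit ordering within each word is an assumption; the names are illustrative.

```python
# True marks a pull-down transistor at the junction of a word line (row)
# and a bit line (column). Placement inferred from the stated contents.
TRANSISTORS = [
    [True, False, True],     # address 0 stores 010
    [False, True, False],    # address 1 stores 101
    [False, False, False],   # address 2 stores 111
]


def read(address):
    """Model one read cycle: precharge the bit lines, then evaluate."""
    bit_lines = [1] * len(TRANSISTORS[address])   # precharge to logical one
    for column, present in enumerate(TRANSISTORS[address]):
        if present:
            bit_lines[column] = 0                 # transistor discharges the line
    return "".join(str(bit) for bit in bit_lines)


for address in range(3):
    print(address, read(address))
```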

The bit lines are heavily exposed to parasitic capacitance, and are therefore very slow. To accurately read the value of the bit lines at each cycle a clocked sensor approach is used.

When a bit line voltage drops below a reference voltage, nominally 200 mV below $V_{DD}$, an internal node is quickly discharged. This signifies a logical zero. If the bit line voltage is kept high, the internal node is kept undischarged and a logical one is implied. At the end of the discharge phase, the logical value is forwarded to the output of the sensor.

The schematic and layout of the sensor are depicted in figure 3.12 and figure 3.13 respectively.

The layout of the coefficient rom for the last stage of the sasp is depicted in figure 3.14. The address is input from the right and the data is output to the left.

Figure 3.12: rom sensor schematic (terminals: Vdd, ref, bit line, clk, out)

3.4

Results

The address generator was validated by post-layout simulation at 3 GHz to guarantee robustness at the nominal frequency of operation, 2 GHz.

The post-layout simulation puts the average current of the address generator at 1.43 mA. Figure 3.15 shows the state of the internal six-bit address counter.

Figure 3.16 shows the base-4 address of the Hamming window unit and stage 1. It is a delayed version of the internal address with each pair of bits decoded into the equivalent base-4 digit.

Figure 3.17 shows the base-4 address of stage 2. This address has a phase offset in addition to being delayed and decoded into three base-4 digits.

Figure 3.15: Address counter (clock and counter bits 1 to 6 versus time, 0 to 20 ns)

All of the rom structures were verified by post-layout simulation. The largest rom, that of the last stage and the one depicted in figure 3.14, has an average current consumption of 4.25 mA.

The results show that the address generator is able to reliably provide addresses to all the blocks without discrepancies at up to 3 GHz.

Figure 3.16: Hamming window unit and stage 1 address (base-4 digits 1 to 3 versus time, 0 to 20 ns)

Figure 3.17: Stage 2 address (base-4 digits 1 to 3 versus time, 0 to 20 ns)

To test the finished modules, the entire Hamming window unit with coefficient rom was tested along with the address generator. The simulation includes the effects of resistive and capacitive parasitics after layout. Figure 3.18 depicts the output of the Hamming window unit running at 3 GHz. The input to the weighting unit in this simulation is a constant voltage signal.

The output shows that the weighting unit is able to faithfully reconstruct the raised cosine that is the Hamming window. The address generator and the coefficient rom supply the required control signals for it to work.

Figure 3.18 (axes: time in ns, 0 to 20; potential in mV, 0 to 150)

3.5

Conclusions

At the end of the thesis, the work on the improved sasp had come a long way. The blocks that were worked on in this thesis (the address generator, the coefficient rom blocks and the weighting unit) were finished. However, much work is still required to arrive at a completed circuit.

The constructed digital units include the rom architecture and the address generator. These perform well under post-layout simulations up to 3 GHz, while the target speed is 2 GHz. This speed margin leaves room for some amount of parasitic capacitance when routing the address signals across the chip, as well as some margin for fabrication. These blocks comprise the high-level control of all the stages.

The weighting unit, arguably one of the most critical blocks in the system, works reliably in simulations. It has an improved architecture that paves the way for more linear behavior of the circuit.

As mentioned in section 1.2 in the introduction, the chip has power and area requirements in addition to the 2 GHz speed requirement. Since the sasp was not fully finished, there was not yet any substantial area estimation, but an approximation put the occupied area well below the limit. The power consumed by the created blocks was moderate enough not to count substantially towards the maximum.

3.5.1 Future work

The delay lines of all the stages had their analog parts laid out even before the work of the thesis. However, the decoding circuitry, which uses the incoming address to input or output samples in the correct order, is still missing. The tight coupling between the delay lines and their control circuitry will inevitably lead to some amount of redesign of the analog architecture.

The input sample block as well as the sample selector (output) block need to be realized and verified up to 2 GHz.

The area and power requirements need to be properly assessed; as the chip nears completion a power and area budget needs to be drawn up and followed through. As a final task, the system will need to be assembled and verified by post-layout simulation including the entire chip die to guarantee satisfactory performance up to the desired operating frequency.

4 Frequency analysis and the DSP

In the Leon topology, the digital signal processor (dsp) takes care of the frequency analysis and forwards detected interferers to the signal generator (figure 4.1). The purpose is to increase the fidelity of the signal detection by further processing the sasp output to produce a more fine-grained spectrum followed by an improved signal detection algorithm.

This chapter looks into the theory and implementation of the frequency analysis and further highlights the improvements in the signal detection algorithm.

Figure 4.1: The dsp in the proposed system

4.1 Theory

The sasp performs a discrete Fourier transform (dft) using the fft algorithm for efficient computation.

Before performing the dft, windowing is required to minimize leakage. The sasp uses the Hamming window (figure 4.2) for this purpose.

Figure 4.2: Hamming window (axes: normalized time, amplitude)

The highest side lobes of the Hamming window are at -46 dB, and leakage beyond a distance of two frequency bins is attenuated at least by this amount (figure 4.3). However, the leakage to neighboring frequency bins is significant. This leads to ambiguity in the actual frequency of the blockers.

In addition to attenuation, the windowing exhibits phase distortion (figure 4.4) when the sinusoidal source is of a frequency that is not a multiple of the sampling frequency.

The dsp performs a cascaded fft with the same windowing procedure. The spectral leakage introduced by the two windowing functions yields a significant source of error. For instance, the phase error can be as large as π/2, which prevents the information from being useful as the phase error must be kept small. A signal detection algorithm to counter the effects of the two instances of windowing is derived in appendix A. This algorithm was used in the subsequent feasibility test and vhdl implementation.

4.2 Feasibility test

Figure 4.3: Magnitude of the Hamming window (normalized) (axes: frequency bin, magnitude in dB)

4.2.1 Algorithm

For an aft of N points and a succeeding dft of M points:

1. Provide NM input samples

2. Perform M Fourier transforms of input samples 0 … N − 1, N … 2N − 1, …, N(M − 1) … NM − 1

3. Select one bin and gather its corresponding samples from the successive Fourier transforms

4. Perform a second Fourier transform on these M samples

5. Sweep the spectrum, and for each peak above a certain threshold found, obtain its precise frequency, amplitude and phase information using the algorithm derived in appendix A

6. Generate phase-opposed signals and add them to the input
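Steps 1 to 4 above can be sketched in a few lines of Python with NumPy. This is only an illustrative model, not the Matlab model used in the thesis: the function name `cascaded_spectrum`, the use of `np.hamming` for both windows, and the example tone are assumptions made here. Steps 5 and 6 (fine detection and signal generation) are left out.

```python
import numpy as np

def cascaded_spectrum(x, n_points, m_points, bin_index):
    """Steps 1-4: M windowed N-point transforms, then an M-point transform of one bin."""
    x = np.asarray(x, dtype=float)[: n_points * m_points]      # step 1: N*M samples
    blocks = x.reshape(m_points, n_points)                      # one row per first-stage run
    first = np.fft.fft(blocks * np.hamming(n_points), axis=1)   # step 2: M transforms
    track = first[:, bin_index]                                 # step 3: one bin over time
    return np.fft.fft(track * np.hamming(m_points))             # step 4: second transform

# Example (assumed values): 52.5 MHz tone, 640 MHz sampling, N = M = 64, first-stage bin 5
fs, n_points, m_points = 640e6, 64, 64
t = np.arange(n_points * m_points) / fs
spectrum = cascaded_spectrum(np.cos(2 * np.pi * 52.5e6 * t), n_points, m_points, 5)
coarse_bin = int(np.argmax(np.abs(spectrum)))
```

With these example values the tone lies 2.5 MHz above the start of first-stage bin 5, and the second transform has a resolution of 10 MHz / 64, so the coarse peak lands in bin 16 of the second spectrum.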

4.2.2 Results

The test consists of a sweep of one input signal from 50 MHz to 60 MHz, with an amplitude of one and a phase of zero. The sampling frequency of the first 64-point aft is 640 MHz, and the frequency appears in bin 5. The successive output samples of this bin are fed to an fft of 64 points and then analyzed using coarse detection and correlated detection.
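As a quick check of the numbers above, using only the stated parameters and assuming that successive first-stage transforms operate on non-overlapping blocks (as in step 2 of the algorithm), the bin spacing of the first transform and the resolution of the second follow from:

$$\Delta f_1 = \frac{640\ \text{MHz}}{64} = 10\ \text{MHz}, \qquad 5\,\Delta f_1 = 50\ \text{MHz}, \qquad \Delta f_2 = \frac{\Delta f_1}{64} \approx 156\ \text{kHz}$$

The outputs of one bin arrive once per first-stage transform, i.e. at 10 MHz, so the 64-point second transform divides the swept 50 MHz to 60 MHz range into steps of roughly 156 kHz.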

Figure 4.4: Phase of the Hamming window (axes: frequency bin, phase in radians)

Figure 4.5: Detected amplitude (panels: Standard, Correlated; axes: frequency in MHz, amplitude)

The Matlab model shows improved precision of the detected amplitude when the frequency is not a multiple of the sampling frequency (figure 4.5).

error from −π/2 to π/2, which is not useful when the goal is to generate a phase-opposed signal (figure 4.6). This method of peak detection is here called standard.

Figure 4.6: Detected phase (panels: Standard, Correlated; axes: frequency in MHz, phase in radians)

Selecting the frequency bin with the greatest magnitude naturally leads to quantization of the detected frequency. Using correlation, the frequency can be determined with greater fidelity (figure 4.7).

In both cases, for input frequencies very close to 60 MHz, the algorithms detect a frequency around 50 MHz. This is due to the ambiguity introduced by sampling; for this frequency bin, 50 MHz is equivalent to DC, and 60 MHz is equivalent to the sampling frequency.

[Figure 4.7: Frequency error. Panels: Standard, Correlated; axes: frequency in MHz, frequency error in percent.]


After detection, a phase-opposed signal is generated and superimposed over the input signal. The attenuation is then measured as the total spectral power after compensation as compared to the total spectral power before compensation. When using correlation, the attenuation shows a more regular behavior (figure 4.8). This is partially due to the sensitivity to phase error when generating the phase-opposed signal. When the phase error is large, as is the case when not correlating, the compensation signal will not fully cancel the interferer.
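The sensitivity to phase error can be illustrated with a small time-domain sketch. It is not taken from the thesis model: the function name, the 52.5 MHz test tone and the block length are assumptions, and the power ratio is computed in the time domain, which by Parseval's theorem equals the ratio of total spectral powers.

```python
import numpy as np

def attenuation_db(phase_error, amplitude_error=0.0, freq=52.5e6, fs=640e6, n=4096):
    """Power before cancellation divided by power after cancellation, in dB."""
    t = np.arange(n) / fs
    interferer = np.cos(2 * np.pi * freq * t)
    # Phase-opposed compensation signal built from an imperfect estimate (assumed errors)
    compensation = -(1.0 + amplitude_error) * np.cos(2 * np.pi * freq * t + phase_error)
    residual = interferer + compensation
    return 10 * np.log10(np.sum(interferer ** 2) / np.sum(residual ** 2))

small_error = attenuation_db(0.01)       # about 40 dB
large_error = attenuation_db(np.pi / 2)  # about -3 dB: the "compensation" adds power
```

For a perfect amplitude estimate the residual has amplitude 2|sin(δ/2)| for a phase error δ, so a phase error of 0.01 rad already limits the attenuation to about 40 dB, while an error of π/2 increases the total power by about 3 dB.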

Figure 4.8: Attenuation (panels: Standard, Correlated; axes: frequency in MHz, attenuation in dB)

4.3 RTL implementation

A vhdl register-transfer level (rtl) implementation was created, chiefly consisting of an fft unit followed by a frequency analyzer unit (figure 4.9). The frequency analyzer unit scans the spectrum coming from the fft, and when it detects peaks using simple threshold calculations, a peak matcher unit is dispatched to find the exact frequency of the peak.

Due to the inelasticity of the input timing, a buffer unit precedes the fft unit so that samples are saved while the fft is performed.

A compensator unit is placed after the frequency analyzer unit to compensate for the effects of windowing and cross-correlation, discussed above.

The number of points for the rtl implementation is adjustable by a single parameter.

Figure 4.9: rtl implementation layout (blocks: input buffer, radix-4 FFT, frequency analyzer, peak matcher, compensator)

4.3.1 Buffer

Before the fft a small buffer is placed that stores samples while the fft is performing its calculations. The buffer unit places incoming samples in a small queue and provides them to the fft unit when it is ready to accept them. The length of the buffer unit depends on the size of the fft and the width of the peak detection. Resizing of the buffer might be needed if these parameters change.

In implementations where the input sampling is governed by another clock domain, the buffer unit will also serve to transfer the data into the clock domain of the fft and the rest of the system.

4.3.2 FFT

The fft unit consists of one radix-4 butterfly performing the transform in-place, using four complex sample memories for intermediate sample storage. The implementation uses a decimation-in-frequency decomposition, as derived in section B.2. The fft unit is depicted in figure 4.10.

Since the data is manipulated in-place and kept in four separate memories, the algorithm is careful to always place the samples so that the four samples for each butterfly operation reside on separate memories. This is accomplished by shifting the samples a certain number of steps clockwise when reading from and writing to the memories.
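One common way to obtain such a conflict-free placement is to choose the memory from the sum of the base-4 digits of the sample address, modulo 4. The sketch below illustrates that idea only; it is an assumption that this matches the rotation used in the vhdl implementation, and the function names are hypothetical.

```python
def bank_of(address, stages):
    """Memory (0-3) for a sample address: sum of its base-4 digits, modulo 4."""
    return sum((address >> (2 * s)) & 3 for s in range(stages)) % 4

def butterfly_addresses(base, stage):
    """The four addresses of one butterfly; they differ only in base-4 digit `stage`.

    `base` is assumed to have a zero in that digit.
    """
    return [base + d * 4 ** stage for d in range(4)]

# Example: 64-point transform (three radix-4 stages), butterfly at stage 1, base address 3
banks = [bank_of(a, 3) for a in butterfly_addresses(3, 1)]
```

Because the four addresses of a butterfly differ only in one base-4 digit, which takes the values 0 to 3, their digit sums are distinct modulo 4 and the four samples always fall in four different memories.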

Inevitably, data hazards would occur as one stage of the fft ends and the next one begins. Wait states are inserted to avoid this.

Equation B.6 is included here for clarity. The butterfly computes the complex operation, for $n = 0, 1, 2, \ldots, N/4 - 1$:

$$\begin{bmatrix} y_0(n) \\ y_1(n) \\ y_2(n) \\ y_3(n) \end{bmatrix} = \operatorname{diag}\left(1,\ W_N^{n},\ W_N^{2n},\ W_N^{3n}\right) \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & -j \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & j \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 & 0 \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & -1 \end{bmatrix} \begin{bmatrix} x(n) \\ x(n + N/4) \\ x(n + N/2) \\ x(n + 3N/4) \end{bmatrix}$$

The butterfly uses eight complex additions per operation. In the implementation the calculation is pipelined in three stages; two for the additions and one for the complex rotation. The twiddle factors $W_N^m$, $m = 0, 1, 2, \ldots, N - 1$, are pre-calculated and stored in a look-up table.
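The butterfly can be checked numerically with a short recursive model. This is a floating-point sketch, not the pipelined in-place vhdl design: it assumes the conventional definition $W_N = e^{-j2\pi/N}$, a transform length that is a power of four, and the function names are chosen here for illustration.

```python
import numpy as np

def radix4_dif_stage(x):
    """One decimation-in-frequency stage: two layers of additions, then the rotation."""
    n_points = len(x)
    q = n_points // 4
    w = np.exp(-2j * np.pi * np.arange(q) / n_points)   # W_N^n (assumed definition)
    x0, x1, x2, x3 = x[:q], x[q:2 * q], x[2 * q:3 * q], x[3 * q:]
    t0, t1, t2, t3 = x0 + x2, x0 - x2, x1 + x3, x1 - x3  # first four additions
    y0 = t0 + t2                                          # second four additions
    y1 = (t1 - 1j * t3) * w
    y2 = (t0 - t2) * w ** 2
    y3 = (t1 + 1j * t3) * w ** 3
    return y0, y1, y2, y3

def fft_radix4_dif(x):
    """Recursive radix-4 transform; the length must be a power of four."""
    x = np.asarray(x, dtype=complex)
    if len(x) == 1:
        return x
    out = np.empty(len(x), dtype=complex)
    for i, y in enumerate(radix4_dif_stage(x)):
        out[i::4] = fft_radix4_dif(y)   # y_i yields output samples 4k + i
    return out
```

Comparing `fft_radix4_dif` with `np.fft.fft` on a 64-point random input is a simple way to confirm that the two addition layers and the rotation are wired as in the matrix equation.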

When inputting samples, the pipeline is shorted right before the first butterfly multiplier, and it is used to apply the Hamming window. The Hamming window values are stored in a separate look-up table.

To minimize latency, the last stage of the transform calculates samples in the order that the frequency analyzer unit expects them and outputs them as they become available.

Figure 4.10: fft (blocks: pipeline control, butterfly, coefficient tables, sample shifters, sample memory read and write ports)

4.3.3 Frequency analyzer

The frequency analyzer is a higher order unit that effectively detects and matches the peaks that appear in the input spectrum. It consists of a peak detector, a peak matcher arbiter and one or more peak matchers. The principal flow of the frequency analyzer is shown in figure 4.11.

[Figure 4.11: Principal flow of the frequency analyzer. Blocks: peak detector, peak matcher arbiter, peak matcher.]

Peak detector

The peak detector serves to detect energy peaks in the spectrum, signalling to the peak matcher arbiter when a falling edge from a sample of sufficient magnitude is detected. It takes data directly from the fft and delays it, so as to be able to supply the peak matcher with a full spectrum excerpt. The peak detector is depicted in figure 4.12.

After detecting a peak, the frequency analyzer temporarily inhibits the detection, so as not to trigger multiple times on the same peak. This limits the minimum distance between two adjacent signals. Subsequent signals within this minimum distance will be ignored.
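The detection rule described above can be modeled behaviorally as follows. This is a sketch under assumptions: the function name, the threshold on the squared magnitude and the inhibit counter measured in samples are illustrative choices, not the vhdl interface.

```python
def detect_peaks(spectrum, threshold, inhibit):
    """Return indices where the squared magnitude falls after exceeding the threshold."""
    peaks = []
    hold = 0          # remaining samples during which detection is inhibited
    previous = 0.0    # squared magnitude of the previous sample
    for index, value in enumerate(spectrum):
        magnitude_sq = abs(value) ** 2
        if hold > 0:
            hold -= 1
        elif previous >= threshold and magnitude_sq < previous:
            peaks.append(index - 1)   # the previous sample was the local maximum
            hold = inhibit
        previous = magnitude_sq
    return peaks

# Example with made-up data: two peaks, at indices 2 and 7
example = detect_peaks([0, 1, 5, 2, 0, 0, 4, 9, 3, 0], threshold=4, inhibit=2)
```

With a longer inhibit period (for instance 6 samples in this example) the second peak falls inside the inhibited window and is ignored, which is the minimum-distance limitation mentioned above.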

[Figure 4.12: Peak detector. Blocks: squared magnitude (Abssq), compare, control, inhibit.]

Peak matcher arbiter

The peak matcher arbiter’s main role is to distribute the workload among the free peak matchers. The principal schematic is depicted in figure 4.13.

The peak matchers that follow the arbiter can only receive one spectrum excerpt at a time, which prevents peaks with overlapping spectra from being detected by the same peak matcher. The peak matcher arbiter serves the spectrum excerpt only to a peak matcher that is free to receive more input.

Each peak matcher comes with a multiply-and-accumulate pipeline, and increasing the number of peak matchers improves performance. This effectively decreases the latency of the algorithm when encountering multiple peaks.

Figure 4.13: Peak matcher arbiter (blocks: N-to-M encoder, peak matchers 1 to N)

Peak matcher

Each peak matcher unit consists of a complex multiply-and-accumulate (mac) pipeline that computes the correlation between the samples and the Hamming window in the frequency domain at a certain offset. The peak matcher is depicted in figure 4.14.

At the end of the mac pipeline is a magnitude unit that calculates the square of the magnitude, $|z|^2 = \Re\{z\}^2 + \Im\{z\}^2$. In this implementation the square of the magnitude is used instead of just the magnitude since taking the square root is expensive in hardware, and not needed since the maximum will be found just as well using this metric.

The pipeline is controlled by a number of state machines implementing heuristics to maximize the magnitude squared $|z|^2$, i.e. finding the peak.

To maximize pipeline utilization the unit can process more than one peak simultaneously, time-sharing the mac unit between them.
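The correlation search can be sketched in floating point as below. Several things here are assumptions rather than the thesis design: the exhaustive grid search stands in for the state-machine heuristics, the score is normalized by the energy of the reference signature (a matched-filter variant that keeps the truncated five-bin excerpt unbiased), and the names `window_response` and `match_peak` are hypothetical.

```python
import numpy as np

def window_response(offsets, window):
    """Spectrum of the window evaluated at fractional bin offsets."""
    k = np.arange(len(window))
    offsets = np.atleast_1d(offsets)
    return np.exp(-2j * np.pi * np.outer(offsets, k) / len(window)) @ window

def match_peak(spectrum, coarse_bin, window, half_width=2, steps=64):
    """Search around coarse_bin for the offset whose window signature fits best."""
    bins = np.arange(coarse_bin - half_width, coarse_bin + half_width + 1)
    excerpt = spectrum[bins % len(spectrum)]
    best = (-1.0, None, None)
    for f in coarse_bin + np.linspace(-0.5, 0.5, steps + 1):
        ref = window_response(bins - f, window)
        corr = np.vdot(ref, excerpt)             # multiply-and-accumulate with conj(ref)
        energy = np.sum(np.abs(ref) ** 2)
        score = np.abs(corr) ** 2 / energy       # squared magnitude, no square root needed
        if score > best[0]:
            best = (score, f, corr / energy)     # corr / energy estimates A * exp(i * phase)
    _, frequency_bin, complex_amplitude = best
    return frequency_bin, complex_amplitude

# Example: complex tone at 16.25 bins, amplitude 0.8, phase 0.3 rad (assumed test values)
m_points = 64
window = np.hamming(m_points)
k = np.arange(m_points)
tone = 0.8 * np.exp(1j * (0.3 + 2 * np.pi * 16.25 * k / m_points))
spectrum = np.fft.fft(window * tone)
found_bin, found_amp = match_peak(spectrum, int(np.argmax(np.abs(spectrum))), window)
```

The example uses a single complex tone, so the recovered values match the inputs closely; for a real input signal the positive-frequency component carries half the amplitude, which is why appendix A multiplies by two at the end.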

[Figure 4.14: Peak matcher. Blocks: peak arbiter, correlator search, dispatcher, mac, sample memory, window table, record keeper, magnitude, frequency calculation, sample shifter.]

4.3.4 Compensator

Following the peak matcher is the compensator unit that compensates for the effects of the first windowing and the autocorrelation.

It contains two look-up tables containing the normalized reciprocals of the two effects, and one complex multiplier taking its factor from one of the tables. Every peak passes through its pipeline twice to compensate for both effects.

Since the two tables are normalized, the compensation also scales the peak signature by a coefficient that needs to be compensated for later. This scaling is assumed to be performed more efficiently in a later stage, where the magnitude has been obtained and a complex multiplication is no longer required.

4.3.5 Results

The vhdl model was simulated in ModelSim, sweeping frequencies over one frequency bin. The results are shown in figure 4.15, with the two plots depicting the detected amplitude and detected phase respectively.

Figure 4.15: Detected amplitude and phase (axes: frequency in MHz versus amplitude; frequency in MHz versus phase in radians)

The output shows that the amplitude is determined accurately, but more so at lower frequencies. It shows similar characteristics to the results of the Matlab model in figure 4.5.

The phase is detected accurately as well, similar to that detected with the Matlab model in figure 4.6.

The vhdl model achieves satisfactory precision in detecting the important characteristics of the interferer.

4.4 Conclusions

Using cross-correlation for the signal detection provides higher frequency, amplitude and phase precision. By using a pipelined architecture, multiple signals can be detected with low latency.

With the acquired results, the dsp fulfills the goals of this work block. Being able to reliably detect interferers, even those not a multiple of the sampling frequency, is a key feature of the Leon interferer cancellation loop.

This method in tandem with the cascaded fft architecture proves effective in detecting close-in interferers. Using a delay line, these interferers can theoretically be attenuated by up to 30 dB.

4.4.1 Future work

Since the twiddle factor look-up table and the window look-up table never operate simultaneously, they can be merged into one, saving space or complexity depending on the target hardware.

The radix-4 fft uses a simple finite state machine (fsm) for control, and delays are manually inserted between the fft stages to prevent data hazards. A thorough investigation of the possibilities of reordering the butterfly operations could minimize the required delays between the stages of the fft.

Peak matching is currently done assuming an odd number of samples, and that the window function in the frequency domain is truncated outside of $(N - 1)/2$ frequency bins from the center. This makes sense for N = 5, where only the spectral contents of the main lobe are considered, but for N < 5 precision could be increased by not truncating the main lobe contents. This is particularly severe for N = 3, where one of the three samples is currently ignored, losing valuable information.

4.4.2 Multiple signal detection

Since the signal power outside the main lobe of the Hamming window is low, multiple signals can be distinguished if their main lobes do not overlap. In this case, simply discarding the spectral contents of the window outside the main lobe when correlating still yields good results, and signals can be distinguished if they are at least five frequency bins apart.

4.4.3 Solving ambiguities

Because of the leakage in the first dft, a blocker detected at a frequency offset in a specific bin can originate at any frequency $f_{blocker} = f_{offset} + n f_s$, $n \in \mathbb{Z}$, but then with altered amplitude and phase.

By introducing diversity, either by observing the frequency contents in a different bin or by using a different sampling frequency, a system of linear congruences appears. With enough observations with orthogonal parameters, all blockers can be distinguished.

Traditional algorithms for solving systems of linear congruences will not suffice since the detected values are not well-known integers. Instead, a system following the fuzzy math discipline is more likely to succeed, implementing an algorithm for solving a system of fuzzy linear congruences. This is beyond the scope of this document.

5 Summary

The task of this thesis was divided into two parts. The first task was to investigate the signal processing aspects of the Leon loop, develop a Matlab model and finally write an rtl model in vhdl. The second task was to continue the development of the sasp and advance it towards its tape-out.

Development of the sasp yielded new control and data structures, increasing the reliability and speed of the circuit. The weighting unit, responsible for performing the multiplications in the fft algorithm, was improved and its linearity issues were alleviated. The gains were verified via simulations with extracted parasitics.

The dsp was modeled in Matlab and an algorithm for fine peak detection was developed. This method, using correlation, was implemented in vhdl along with a radix-4 fft implementation.

5.1 Conclusions

More detailed conclusions can be found in the respective sections of the two work items; section 3.5 for the sasp and section 4.4 for the dsp.

At the end of the thesis, the work on the sasp had come a long way. Improving the linearity, power consumption and speed has been a priority as it is essential for the overall functionality of the feedback system proposed in the Leon project. The blocks that were worked on, including the address generator, rom structures and weighting unit, are shown to work well in simulations, and reach their stipulated design goals.

The work on the dsp ended with a full vhdl model, including a full fft implementation constructed with synthesis on an fpga in mind. It uses cross-correlation to increase the precision of the signal detection, a method that works well in simulations.

Using the original premise of the Leon project, a cascaded fft configuration together with the improved signal detection yields good results, and interferers can theoretically be attenuated by up to 30 dB.

5.2 Future work

More detailed discussions on future work can be found in the respective sections of the two work items; section 3.5.1 for the sasp and section 4.4.1 for the dsp.

Work still remained on the sasp at the end of this thesis. The work is to be resumed and finished, and the chip will then finally be sent to fabrication.

As for the dsp, it needs to be implemented in a specific fpga architecture. Some modern logic synthesizers are capable of inferring hardware blocks such as memories and multipliers, but some degree of architecture-specific optimization is inevitable. After porting, the true performance of the dsp will show.

The sasp and the dsp will need to be tested together to see the open-loop performance of the interferer detection algorithm with real signals.

Finally, the entire closed loop of project Leon needs to be simulated. The signal generator, the delay line and the signal combiner need to be present at this stage. When this is done the true capabilities of the project will show.

5.3 Final words

The work on the sasp and the dsp carried widely different requirements and approaches.

The sasp had a clear goal from the start and its iterative process had already begun when the work was resumed. The work done in this thesis on the sasp moved it towards its completion.

The work on the dsp was more open and encouraged innovation. This allowed time for reflection and research, and allowed for the discovery of using cross-correlation. This method overcame the principal limitation of the dsp, namely the ability to accurately identify interferers at frequencies other than multiples of the sampling frequency.

The dsp finally had a viable rtl model for synthesis on an fpga. Simulations show promising results.

This thesis has enabled me to explore two widely different domains of engineering, and I have learned a great deal on how to solve engineering problems in the two.

I am confident that project Leon will play a significant role in the future of software radio as it elegantly solves one of its fundamental problems with interferers.

Bibliography

Y. Abiven. A low-power 2 GHz discrete time weighting system dedicated to sampled analog signal processing. ICECS, pages 57–60, 2011. Cited on page 29.

J. W. Cooley and J. W. Tukey. An algorithm for the machine calculation of complex Fourier series. Math. Comput., 19:297–301, 1965. Cited on pages 17 and 27.

Y. Deval. Low cost mobile RF terminal paradigms: From multi-radio to software radio. In IEEE International Conference on Solid-State and Integrated Circuit Technology (ICSICT), pages 627–630, 2010. Cited on page 22.

M. Kitsunezuka, K. Kunihiro, and M. Fukaishi. Efficient use of the spectrum. IEEE Microwave Magazine, 13(1):55–63, Jan/Feb 2012. Cited on page 22.

P. Kogge and H. Stone. A parallel algorithm for the efficient solution of a general class of recurrence equations. IEEE Transactions on Computers, pages 783–791, 1973. Cited on page 31.

J. Mitola III. The software radio architecture. IEEE Transactions on Computers, 33:26–38, May 1995. Cited on page 22.

J. Mitola III and G. Q. Maguire, Jr. Cognitive radio: Making software radios more personal. IEEE Pers. Commun., 6(4):13–18, Aug 1999. Cited on page 22.

B. Razavi. Cognitive radio design challenges and techniques. IEEE J. Solid-State Circuits, 45(8):1542–1553, Aug 2010. Cited on page 22.

François Rivet. Design of a Radio Frequency Front-End Receiver dedicated to Software-Radio for Mobile Terminals. PhD thesis, University of Bordeaux 1, 2009. Cited on pages 18 and 27.

J. Yuan and C. Svensson. High-speed CMOS circuit technique. IEEE J. Solid-State Circuits, 24(1):62–70, Feb 1989. Cited on page 31.

A Signal detection with cross-correlation

To properly identify the frequency, magnitude and phase of sinusoidal blockers, the distortion introduced by windowing needs to be countered.

A real sinusoidal input signal can be rewritten as two complex ones.

$$S_{in}(t) = A_{in}\cos(\varphi_{in} + 2\pi f_{in} t) = A_{in}\,\frac{\exp(i(\varphi_{in} + 2\pi f_{in} t)) + \exp(-i(\varphi_{in} + 2\pi f_{in} t))}{2} \tag{A.1}$$

Observing only the positive frequency of the real signal, it can be rewritten as

$$S_{in,pos}(t) = \frac{A_{in}}{2}\exp(i(\varphi_{in} + 2\pi f_{in} t)) \tag{A.2}$$

The windowing of the input signal yields a convolution with said window in the frequency domain. Since the main lobe of the Hamming window extends over several frequency bins, leakage into adjacent frequency bins is significant. For this reason, when observing spectral components in a bin, the actual frequency of the originating signal is ambiguous.

Furthermore, the Hamming window contains frequency components extending towards infinity, and this in combination with sampling leads to aliasing. This is usually not a problem, unless the signal frequency is in the first or last frequency bins, where leakage of the main lobe of the signal's negative frequency counterpart creates an alias of equal magnitude in the very same bin.

Treating the effects of aliasing separately, the contribution from $S_{in,pos}$ to a frequency bin $n$ in a dft started at a time $kT_{DFT}$, $k \in \mathbb{Z}$, can be regarded as

$$B_{in,pos}[n] = A_{Hamming}(f_{in} - n f_s)\,\frac{A_{in}}{2}\exp(i(\varphi_{in} + 2\pi f_{in} k T_{DFT})) \tag{A.3}$$

Introducing the variable $f_{offset} = f_{in} - n f_s$, $B_{in,pos}[n]$ can be rewritten as

$$\begin{aligned} B_{in,pos}[n] &= A_{Hamming}(f_{offset})\,\frac{A_{in}}{2}\exp(i(\varphi_{in} + 2\pi(f_{offset} + n f_s) k T_{DFT})) \\ &= A_{Hamming}(f_{offset})\,\frac{A_{in}}{2}\exp(i(\varphi_{in} + 2\pi f_{offset} k T_{DFT}))\exp(i(2\pi n f_s k T_{DFT})) \end{aligned} \tag{A.4}$$

Observing that the time between successive runs of the first dft of $N$ samples is $T_{DFT} = N/f_s$, the last factor resolves to unity:

$$\exp(i(2\pi n f_s k T_{DFT})) = \exp\left(i\left(2\pi n f_s k \frac{N}{f_s}\right)\right) = \exp(i(2\pi n k N)) = 1 \tag{A.5}$$

Sampling the output of bin $n$ over successive runs of the first dft, with $k = 0, 1, 2, 3, \ldots, K - 1$, the signal will appear as a complex sinusoid with frequency $f_{offset}$. Performing a second dft on this sequence of samples will yield the contribution to frequency bin $m$:

$$B_{in,pos}[n, m] = A_{Hamming}(m f_{DFT} - f_{offset})\,A_{Hamming}(f_{offset})\,\frac{A_{in}}{2}\exp(i\varphi_{in}) \tag{A.6}$$

With sufficiently large $K$, the frequency offset within the bin of the first dft, $f_{offset}$, can be acquired with some precision, and the effects of the first Hamming window can be countered. However, the effects of the second Hamming window still cause problems when determining the amplitude and phase of the blocker, since even a small offset in frequency will cause distortion as per figures 4.3 and 4.4.

This problem can be alleviated by using cross-correlation on the output of the second dft with a fine-grained Hamming window in the frequency domain. The peak of the complex cross-correlation will yield the most likely frequency of the blocker.

$$(A_{Hamming} \star B_{in,pos})(f) = \sum_m \left[ \overline{A_{Hamming}(m f_{DFT} + f)}\,A_{Hamming}(m f_{DFT} - f_{offset})\,A_{Hamming}(f_{offset})\,\frac{A_{in}}{2}\exp(i\varphi_{in}) \right] \tag{A.7}$$

Figure A.1: Precise signal frequency implied by correlating with the window (axes: frequency bin, magnitude)

Relating to previous discussions, a sinusoidal signal at the input of the second dft will appear in the frequency spectrum as a Hamming window.

The peak of the cross-correlation between the dft output and the Hamming window appears where the input likely originated.

Since the Hamming window is heavily attenuated outside of the main lobe, peripheral frequencies can be ignored, trading the accuracy loss for computational efficiency. When limiting the frequency band, and also assuming that all spectral components are spaced by at least the same frequency offset, the cross-correlation can be regarded as a shifted version of the Hamming window's autocorrelation function.

This operation becomes an instance of autocorrelation. A property of autocorrelation is that its maximum value is found at a lag of zero; in this case at $f = -f_{offset}$. Detecting this peak yields $f_{offset}$. The value of the autocorrelation is then

$$ACorr(f) = \sum_m \left[ \overline{A_{Hamming}(m f_{DFT} - f)}\,A_{Hamming}(m f_{DFT} - f) \right] = \sum_m |A_{Hamming}(m f_{DFT} - f)|^2 \tag{A.8}$$

This function is periodic with $f_{DFT}$, and its effects can be countered when $f_{offset}$ is known, as is the case after the peak detection.

Having detected a peak at $f$ with $(A_{Hamming} \star B_{in,pos})(f)$, and removed windowing artifacts by dividing by $ACorr(f)$ and $A_{Hamming}(f)$, and then multiplying by two, the original signal amplitude and phase are obtained.

B Fast Fourier transform

B.1 Decimation-in-time radix-4 FFT

The sasp implements a decimation-in-time radix-4 FFT. The dft is defined as

$$X(k) = \sum_{n=0}^{N-1} x(n) W_N^{kn} \tag{B.1}$$

The first step is dividing the summation into four interleaved sub-summations.

$$\begin{aligned} X(k) &= \sum_{n=0}^{N/4-1} x(4n) W_N^{4kn} + \sum_{n=0}^{N/4-1} x(4n+1) W_N^{k(4n+1)} + \sum_{n=0}^{N/4-1} x(4n+2) W_N^{k(4n+2)} + \sum_{n=0}^{N/4-1} x(4n+3) W_N^{k(4n+3)} \\ &= \sum_{n=0}^{N/4-1} x(4n) W_N^{4kn} + W_N^{k} \sum_{n=0}^{N/4-1} x(4n+1) W_N^{4kn} + W_N^{2k} \sum_{n=0}^{N/4-1} x(4n+2) W_N^{4kn} + W_N^{3k} \sum_{n=0}^{N/4-1} x(4n+3) W_N^{4kn} \end{aligned} \tag{B.2}$$

Now the recursive nature of the decomposition starts to show, as the four summations are in themselves the very definitions of smaller dft instances. The four summations are defined as follows, for $i = 0, 1, 2, 3$.

$$F_i(k) = \sum_{n=0}^{N/4-1} x(4n+i) W_N^{4kn} = \sum_{n=0}^{N/4-1} x(4n+i) W_{N/4}^{kn} \tag{B.3}$$

An interesting property of this definition is that $W_N^{4kn}$ is cyclic in $k$ with a period of $N/4$. This means that $F_i(k) = F_i(k + N/4) = F_i(k + N/2) = F_i(k + 3N/4)$, i.e. four output samples share the same recursive dft. Aligning these four output samples illustrates the benefit of this.

$$\begin{aligned} X(k) &= F_0(k) + W_N^{k} F_1(k) + W_N^{2k} F_2(k) + W_N^{3k} F_3(k) \\ X\left(k + \tfrac{N}{4}\right) &= F_0(k + N/4) + W_N^{k+N/4} F_1(k + N/4) + W_N^{2(k+N/4)} F_2(k + N/4) + W_N^{3(k+N/4)} F_3(k + N/4) \\ X\left(k + \tfrac{N}{2}\right) &= F_0(k + N/2) + W_N^{k+N/2} F_1(k + N/2) + W_N^{2(k+N/2)} F_2(k + N/2) + W_N^{3(k+N/2)} F_3(k + N/2) \\ X\left(k + \tfrac{3N}{4}\right) &= F_0(k + 3N/4) + W_N^{k+3N/4} F_1(k + 3N/4) + W_N^{2(k+3N/4)} F_2(k + 3N/4) + W_N^{3(k+3N/4)} F_3(k + 3N/4) \end{aligned} \tag{B.4}$$
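The recursion in equations B.2 to B.4 can be verified numerically with a few lines of Python. This is a floating-point sketch for checking the derivation, not a model of the analog sasp: it assumes $W_N = e^{-j2\pi/N}$ and a length that is a power of four, and the function name is chosen here.

```python
import numpy as np

def fft_radix4_dit(x):
    """Decimation-in-time radix-4 transform; the length must be a power of four."""
    x = np.asarray(x, dtype=complex)
    n_points = len(x)
    if n_points == 1:
        return x
    w = np.exp(-2j * np.pi * np.arange(n_points) / n_points)   # W_N^k (assumed definition)
    out = np.zeros(n_points, dtype=complex)
    for i in range(4):
        f_i = fft_radix4_dit(x[i::4])        # F_i(k): transform of every fourth sample
        out += w ** i * np.tile(f_i, 4)      # W_N^{ik} F_i(k), using the period N/4 of F_i
    return out
```

The call to `np.tile` expresses the periodicity $F_i(k) = F_i(k + N/4)$ directly: each quarter-length transform is computed once and reused for four output samples.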
