Hardware evaluation platform based on GNU Radio and the USRP


Department of Electrical Engineering (Institutionen för systemteknik)

Degree project (Examensarbete)

Hardware evaluation platform based on GNU Radio and the USRP

Thesis carried out in Electronics Systems at Linköping Institute of Technology (Tekniska högskolan i Linköping)

by

Carl Ingemarsson

LiTH-ISY-EX--09/4246--SE

Linköping 2009

Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden


Supervisor: Oscar Gustafsson, ISY, Linköpings universitet

Examiner: Oscar Gustafsson, ISY, Linköpings universitet

Division, Department: Division of Electronics Systems, Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden

Date: 2009-05-21

Language: English

Report category: Examensarbete (degree project)

URL for electronic version: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-18367

ISRN: LiTH-ISY-EX--09/4246--SE

Title: Hardware evaluation platform based on GNU Radio and the USRP (Swedish title: Hårdvaruutvärderingsplattform baserad på GNU Radio och USRP)

Author: Carl Ingemarsson

Abstract

GNU Radio is a software framework allowing easy creation of digital signal processing applications on a regular PC. The Universal Software Radio Peripheral (USRP) is a hardware component that can be used as a radio front-end and that is connected to a PC using USB. GNU Radio and the USRP together form a system for software-defined radio.

The purpose of this thesis project has been to insert a large programmable logic circuit into the system that GNU Radio and the USRP together form. The goal is to make it possible to move parts of the signal processing away from GNU Radio and instead implement these parts in hardware. Possibilities for doing this have been analyzed, and one of the possible systems has been designed.


Acknowledgments

First of all I would like to thank my supervisor and examiner Oscar Gustafsson for all the support given during this thesis project. I would also like to express my gratitude to Kent Palmkvist for all the help he has given me with various issues that arose during the hardware implementation part of the project. I also owe thanks to my friend Victoria for the distraction that she has offered over numerous lunches and coffee breaks.

Halfway into the project I started sharing an office with Saima Athar. I would like to thank her for good company, tips on Parsi dishes and advice on LaTeX typesetting!

My parents have always supported me. Thanks for all the patience!

Last but certainly not least I would like to thank Helena for all the support she has given me during this somewhat chaotic time in both of our lives and for giving me the most beautiful gift I could ever imagine, our first child and daughter, Ellen.


Contents

1 Introduction

2 GNU Radio
  2.1 Background
  2.2 Using GNU Radio
  2.3 The GNU Radio Companion

3 USRP
  3.1 Top level architecture
  3.2 USRP Signal flow
  3.3 USRP USB communications
    3.3.1 USB protocol
    3.3.2 USRP configuration

4 Different alternatives
  4.1 Initial idea
  4.2 Early implementation alternatives
  4.3 USRPs follow-up: USRP2
    4.3.1 Specifications of the USRP2
    4.3.2 How much larger is the FPGA on the USRP2?
    4.3.3 Utilization of the USRP2
    4.3.4 Conclusions regarding the USRP2

5 Implementation
  5.1 Design decision
  5.2 DE3
  5.3 The ISP1761
    5.3.1 ISP1761 Peripheral controller
    5.3.2 ISP1761 Host controller
    5.3.3 ISP1761 DMA
  5.4 Implementation architecture
    5.4.1 Asynchronous bus interface timing of the ISP1761
    5.4.2 Timing for a PIO write
    5.4.3 Timing for a PIO read
    5.4.4 Timing for a peripheral DMA transfer
    5.4.5 Timing for a host DMA transfer

6 Results
  6.1 IP-block operation
    6.1.1 IP-block programming interface
  6.2 Firmware
  6.3 Conclusions
  6.4 Future work

Nomenclature

ADC     Analog-to-digital converter
ALM     Adaptive Logic Module
ALUT    Adaptive look-up table
BRAM    Block random access memory
CIC     Cascaded integrator-comb
CLB     Configurable logic block
DAC     Digital-to-analog converter
DC      Direct Current
DCM     Digital clock management
DMA     Direct memory access
EEPROM  Electrically erasable programmable read only memory
FFT     Fast Fourier transform
FIFO    First In First Out
FPGA    Field Programmable Gate Array
LAB     Logic Array Block
LUT     Look-up table
MIMO    Multi input multi output
PIO     Programmed IO
RF      Radio Frequency
SWIG    Simplified wrapper interface generator
USB     Universal Serial Bus

Chapter 1

Introduction

A software-defined radio system is a radio communications system where major parts of the signal processing are done in software rather than in application-specific hardware. Such a system will typically consist of some kind of computer, for instance a PC or an embedded computer system, a radio frequency front-end, and an analog-to-digital converter (ADC) and/or digital-to-analog converter (DAC). The idea of software-defined radio is to make it possible to easily implement several different radio communication schemes simply by changing the software running on the computer. The flexibility obtained in such a system would ultimately allow the implementation of new communications standards on-line. Ad hoc cognitive radio mesh networks have been proposed in which mobile radio communication devices would be able to sense available frequency bands and schedule high-bandwidth communication without the complex infrastructure required by, for instance, the cellular phone networks of today. Software-defined radio is central to these proposals.

In an ideal software-defined radio the antenna would be directly connected to the analog-to-digital converter (for reception), but since the bandwidth and resolution of such converters are limited, a radio front-end is needed. The front-end will typically amplify the radio signal and mix it to a frequency band within reach of the ADC. The basic architecture of a software-defined radio system is illustrated in Figure 1.1.

GNU Radio is a free-software (open source) software-defined radio framework allowing easy creation of signal processing software on a regular PC. The software is created in the Python programming language by connecting predefined signal processing blocks in a graph-like manner.


Figure 1.1. The basic architecture of a software-defined radio system

The Universal Software Radio Peripheral (USRP) is an integrated circuit board that is connected to a PC by a USB 2.0 link. It is designed especially for use with GNU Radio. When combined with various available daughter boards it forms the hardware (front-end and analog-to-digital conversion) required to make a complete software-defined radio system together with GNU Radio. The USRP houses two dual-channel analog-to-digital conversion codecs, a USB controller and a field programmable gate array (FPGA). The combined system of GNU Radio and the USRP is depicted in Figure 1.2.

The purpose of this thesis project has been to investigate the possibility of modifying the system consisting of a USRP and GNU Radio so that it can be used as an evaluation platform for hardware implementations of digital signal processing algorithms. This is done by introducing a second or larger programmable logic device to the USRP. During this thesis project different options have been analyzed and one such modified system has been designed.


Chapter 2

GNU Radio

2.1 Background

The GNU Radio software project was started in 1998 by Eric Blossom. His original motivation was to design an entirely software-based HDTV receiver, as a reaction to the restrictions on hardware receivers in the broadcast flag legislation that was upcoming in the USA at the time [1]. The philanthropist John Gilmore was also involved in starting the GNU Radio project [2].

GNU Radio consists of a library of performance-critical signal processing blocks written in C++ and a glue of Python code that binds these blocks together [3]. The user of GNU Radio writes a Python program where signal processing blocks are instantiated, configured and connected together. The blocks are connected to form a directed graph. There are three types of blocks.

• Typical data processing blocks that have both inputs and outputs, for instance filters and (de)modulation blocks.

• Source blocks that only have outputs. Examples of these are a block providing the incoming data from a USRP and a sinusoidal generator.

• Sink blocks that only have inputs. Such a block could for instance be a graphical FFT analyzer or a block providing access to the sound card output to the speakers.

If the predefined blocks do not fulfill the needs of a user application, it is possible to write a new signal processing block. The block is implemented as a C++ class that is made accessible from Python using the Simplified wrapper interface generator (SWIG) [4].
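The three block types can be mimicked in a few lines of plain Python. The sketch below is a toy model and not the GNU Radio API: the class names Source, Gain and Sink and the helper run are hypothetical and only illustrate how samples travel from a source, through a processing block, into a sink. A simple chain is used, which is the simplest case of a directed graph.

```python
import math


class Source:
    """Block with outputs only: produces samples of a sine wave."""

    def __init__(self, freq, sampling_freq, ampl):
        self.freq, self.sampling_freq, self.ampl = freq, sampling_freq, ampl

    def work(self, n):
        step = 2 * math.pi * self.freq / self.sampling_freq
        return [self.ampl * math.sin(step * i) for i in range(n)]


class Gain:
    """Block with inputs and outputs: scales every sample."""

    def __init__(self, k):
        self.k = k

    def work(self, samples):
        return [self.k * s for s in samples]


class Sink:
    """Block with inputs only: stores what it receives."""

    def __init__(self):
        self.received = []

    def work(self, samples):
        self.received.extend(samples)


def run(source, processing_blocks, sink, n):
    """Push n samples along the chain source -> blocks -> sink."""
    samples = source.work(n)
    for block in processing_blocks:
        samples = block.work(samples)
    sink.work(samples)


src = Source(freq=440, sampling_freq=48000, ampl=0.1)
dst = Sink()
run(src, [Gain(2.0)], dst, n=480)
```

In GNU Radio itself the scheduler moves the samples between blocks; the explicit loop in run only stands in for that mechanism.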

2.2 Using GNU Radio

To illustrate the workings of GNU Radio a small code example is presented below. The code is taken from [5]. It instantiates two sinusoidal signal generators and connects them to the left and right audio channels, respectively.

    #!/usr/bin/env python

    from gnuradio import gr
    from gnuradio import audio

    def build_graph():
        sampling_freq = 48000
        ampl = 0.1

        fg = gr.flow_graph()  # get empty flow graph

        # instantiate source and sink blocks
        src0 = gr.sig_source_f(sampling_freq, gr.GR_SIN_WAVE, 350, ampl)
        src1 = gr.sig_source_f(sampling_freq, gr.GR_SIN_WAVE, 440, ampl)
        dst = audio.sink(sampling_freq)

        # connect the blocks
        fg.connect((src0, 0), (dst, 0))
        fg.connect((src1, 0), (dst, 1))

        return fg

    if __name__ == '__main__':
        fg = build_graph()
        fg.start()
        raw_input('Press Enter to quit: ')
        fg.stop()

2.3 The GNU Radio Companion

A graphical tool for creating GNU Radio flow graphs is currently under development. It is written by Josh Blum and is called the GNU Radio Companion [6]. A screenshot from the GNU Radio Companion is shown in Figure 2.1.


Chapter 3

USRP

3.1 Top level architecture

As described above, the USRP is an integrated circuit board that together with daughter boards can operate as a radio front-end and ADC/DAC for GNU Radio. In this chapter a more thorough description of the USRP is given.

The USRP has been developed by a team led by Matt Ettus at Ettus Research [7]. There are other hardware solutions for interfacing GNU Radio with the electromagnetic spectrum, but the USRP has become the standard one. The USRP is open source hardware, as stated on the homepage of Ettus Research [8]:

The entire USRP design is open source, including schematics, firmware, drivers, and even the FPGA and daughter board designs.

A simple sketch of the USRP can be found in Figure 3.1. The board includes the following components:

• The Cypress Semiconductor EZ-USB FX2 CY7C68013 USB 2.0 peripheral controller, commonly referred to as the FX2.

• An EP1C12 Cyclone FPGA from Altera.

• Two AD9862 ADC/DAC codecs from Analog Devices.

• Four slots for daughter board connection, two for reception and two for transmission.

• An EEPROM storing identification data for the USB controller.

• An ADP3336 voltage regulator IC.

• An AD9513 clock distribution IC.

• A 64 MHz crystal oscillator.

Figure 3.1. The Universal Software Radio Peripheral

3.2 USRP Signal flow

The received analog radio signal enters the system via an antenna that is connected to a receive daughter board. The daughter board is essentially the RF front-end, and several receive and transmit daughter boards are available. The different daughter boards target different RF bands, which makes the USRP more flexible. At the moment, daughter boards covering different bands in the range from DC to 5.9 GHz are available. For a complete list, see [8].

On the daughter board the signal is typically amplified and fed into a mixer in order to move the desired frequency band down to DC or to some intermediate frequency that the ADC can accept. This mixing process is also called down conversion. The analog signal is then sampled by the ADC and the digital samples are fed into the FPGA [9]. The ADCs sample at a rate of 64 Msamples per second. The data width of the samples is 12 bits [10]. There is one dual-channel ADC available for each of the two receive daughter boards. The two channels can be used as two separate real channels, but are typically used in parallel for complex sampling. Inside the FPGA further down conversion is performed (if the signal is not already at DC). Quadrature down conversion is performed in the case of complex sampling [11]. After this the signal is decimated in a programmable decimation unit. The decimation is made in two steps. The first step uses a programmable decimator consisting of a multi-rate four-stage cascaded integrator-comb (CIC) filter. The second step decimates by two and uses a half-band FIR filter as decimation filter [12]. Once the signal is decimated it is sent to the USB controller (via some buffers) and on to the host computer.
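The first decimation step can be modeled in software. The following is a minimal sketch of a four-stage CIC decimator in Python, assuming a differential delay of 1 in the comb stages and ignoring the bit growth that a hardware implementation has to handle. The function names and the decimation factor 8 used in the rate calculation are illustrative assumptions and are not taken from the USRP FPGA design.

```python
def cic_decimate(samples, rate, stages=4):
    """Decimate integer samples by `rate` with a CIC filter.

    Integrators run at the input rate, combs at the output rate.
    The differential delay of the combs is assumed to be 1.
    """
    integrators = [0] * stages
    combs = [0] * stages          # previous input of each comb stage
    out = []
    for n, x in enumerate(samples):
        # Integrator cascade: each stage accumulates the stage before it.
        for i in range(stages):
            integrators[i] += x
            x = integrators[i]
        # Keep every `rate`-th integrator output.
        if n % rate == rate - 1:
            # Comb cascade: each stage subtracts its previous input.
            for i in range(stages):
                x, combs[i] = x - combs[i], x
            out.append(x)
    return out


def output_rate(adc_rate, cic_decimation):
    """Sample rate after the CIC and the decimate-by-two half-band filter."""
    return adc_rate / (cic_decimation * 2)


settled = cic_decimate([1] * 64, rate=4)
rate_hz = output_rate(64e6, cic_decimation=8)
```

For a constant input the output settles at the DC gain of the filter, which is the decimation factor raised to the number of stages. This gain is the reason a hardware CIC needs wide internal registers.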

The samples to be transmitted are similarly transferred over the USB bus and then fed into the FPGA (through buffers in both the FX2 and the FPGA). In the FPGA the signal is interpolated in a CIC interpolation unit. This is done to raise the signal rate to 32 Msamples per second, which is the required input rate of the DAC. The DAC in the AD9862 is configured for complex sampling. It is therefore not possible to use four real transmit channels simultaneously. It would however be possible to change the configuration of the USRP in GNU Radio to make this possible [13]. The bus between the FPGA and the two ADC/DAC codecs runs at 64 MHz, and on it the two 32 Msamples per second streams of real and imaginary samples are multiplexed. The signal is then further interpolated by four in the AD9862. The analog output value of the AD9862 is therefore updated at a rate of 128 Msamples per second. In the transmit case no up conversion is done in the FPGA. Instead it is done in the AD9862 [14].
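The rates in the transmit path follow from simple multiplications, which the short Python sketch below spells out. The constant and function names are made up for illustration, and the 4 Msamples per second host rate in the usage note is an assumed example value.

```python
FPGA_OUTPUT_RATE = 32e6      # complex samples per second leaving the FPGA
COMPONENTS_PER_SAMPLE = 2    # real and imaginary part share the bus
AD9862_INTERPOLATION = 4     # interpolation inside the codec


def bus_rate(sample_rate, components):
    """Word rate on the FPGA-to-codec bus when the components are multiplexed."""
    return sample_rate * components


def dac_update_rate(sample_rate, interpolation):
    """Rate at which the analog output is updated after interpolation."""
    return sample_rate * interpolation


def cic_interpolation(host_rate, fpga_output_rate=FPGA_OUTPUT_RATE):
    """Interpolation factor the FPGA must apply to a stream from the host."""
    return fpga_output_rate / host_rate
```

With these definitions, bus_rate(FPGA_OUTPUT_RATE, COMPONENTS_PER_SAMPLE) gives the 64 MHz bus rate and dac_update_rate(FPGA_OUTPUT_RATE, AD9862_INTERPOLATION) gives the 128 Msamples per second update rate stated above. A host stream of 4 Msamples per second would need a CIC interpolation factor of 8.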

The samples going to the DAC are 14 bits wide and, as stated above, the samples from the ADC are 12 bits wide. Samples sent over the USB bus are usually 16 bits wide but can be configured to be 8 bits wide. The samples are either padded or truncated to achieve this. Data sent to and from the USRP is packed in 512-byte USB packets. Data from the different down converters (or to the up converters) are interleaved in the packet [15]. Since samples typically are 16 bits wide, 256 real samples will fit into a 512-byte packet. If complex sampling is used, only half as many complex samples will fit.
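The packet arithmetic can be checked with a small Python sketch. The helper names are hypothetical, and two details are assumptions made for illustration only: that the I and Q values of a complex sample are stored next to each other, and that the 16-bit values are little-endian.

```python
import struct

PACKET_BYTES = 512


def samples_per_packet(bits_per_sample=16, complex_sampling=False):
    """Number of (real or complex) samples that fit into one USB packet."""
    bytes_per_sample = bits_per_sample // 8
    if complex_sampling:
        bytes_per_sample *= 2          # one I and one Q value per sample
    return PACKET_BYTES // bytes_per_sample


def pack_complex(samples):
    """Pack (i, q) pairs of 16-bit integers into one 512-byte packet.

    I and Q are interleaved; little-endian byte order is an assumption.
    """
    if len(samples) != samples_per_packet(16, complex_sampling=True):
        raise ValueError("expected exactly one packet worth of samples")
    flat = [value for pair in samples for value in pair]
    return struct.pack("<%dh" % len(flat), *flat)


packet = pack_complex([(i, -i) for i in range(128)])
```

samples_per_packet() returns 256 for real 16-bit samples and 128 for complex ones, matching the figures in the text.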

3.3 USRP USB communications

The FX2 contains an 8051 microcontroller that implements the USB communication with the host as described in chapter 9 of the USB 2.0 specification [16]. A simple FIFO buffer interface is provided between the FX2 and the FPGA. This interface consists of one FIFO buffer in each direction. The size of the buffers is 2048 bytes [17].

3.3.1 USB protocol

In order to describe how the USRP and the host controller communicate over the USB bus it is necessary to give some brief information about parts of the USB protocol.

In version 1.1 of the USB specification only two speed modes were available: Low speed, providing a transfer speed of 1.5 Mbps, and Full speed, with a transfer speed of 12 Mbps. With the addition of High speed in version 2.0 of the specification, USB can also support transfers at 480 Mbps. It should be noted that in practice part of this bandwidth is consumed by protocol overhead.

In a USB device several so-called endpoints can be present. There is always one control endpoint, which is assigned endpoint number 0. This is the only endpoint that supports transfers in both directions. The other endpoints are defined to have a specific direction and are referred to as IN endpoints if data are transferred from them to the host, or as OUT endpoints if data are transferred to them from the host. The non-control endpoints are simply ports that the host can transfer data packets to or from. The USRP FX2 has two non-control endpoints. Data going to the USRP from the host is transferred to endpoint 2. Data going from the USRP to the host is transferred from endpoint 6 [18]. The control endpoint, endpoint 0, is used for reception of so-called control requests and for the data reception and transmission associated with these.

Every transfer over the USB bus starts with the USB host sending a so-called token to the device. This token can be either a SETUP, IN, OUT or SOF token. The IN token represents a request for data from an IN endpoint of the device. The OUT token precedes a data transfer to an OUT endpoint. A SETUP token precedes a so-called control request from the host to the device. The use of the SOF token is beyond the scope of this report.

Control requests are made up of different stages. They always start with a SETUP token being sent from the host to the device. After that an 8-byte setup packet is sent to endpoint 0. This packet identifies which control request is being made and carries its arguments, among them the number of bytes to be sent in the data stage that comes next and the direction of that stage. The control request concludes with handshaking in a status stage. A set of control requests called the standard control requests has to be implemented in the device and responded to in a certain manner according to the USB specification. Other control requests are associated with the device belonging to a certain USB device class. For instance, there is a USB class for computer mice. When designing a USB peripheral it is also possible to define custom, so-called vendor requests.
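The layout of the 8-byte setup packet can be made concrete with a short Python sketch. The field names (bmRequestType, bRequest, wValue, wIndex, wLength) and the bit positions come from chapter 9 of the USB 2.0 specification and are not spelled out in this report. The helper function names are hypothetical.

```python
import struct


def setup_packet(bm_request_type, b_request, w_value, w_index, w_length):
    """Build the 8-byte setup packet that opens a control request."""
    return struct.pack("<BBHHH", bm_request_type, b_request,
                       w_value, w_index, w_length)


def data_stage_is_device_to_host(packet):
    """Bit 7 of bmRequestType gives the direction of the data stage."""
    return bool(packet[0] & 0x80)


def request_kind(packet):
    """Bits 6..5 of bmRequestType: standard, class or vendor request."""
    return ("standard", "class", "vendor", "reserved")[(packet[0] >> 5) & 0x3]


# Standard GET_DESCRIPTOR request for the 18-byte device descriptor.
get_device_descriptor = setup_packet(0x80, 0x06, 0x0100, 0, 18)
```

The last field, wLength, is the number of bytes in the data stage, and the top bit of the first byte is its direction, which corresponds to the arguments mentioned in the paragraph above.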

When a USB device is powered and connected to a USB host a process called enumeration starts. This involves sending description data from the device to the host that tells the host what capabilities the device has. During the enumeration the device is also assigned an address (when not enumerated it has the default address 0). The different steps involved in the enumeration process consist of control requests being sent from the host to the device [19].

There are three different transfer types for non-control endpoints: interrupt, isochronous and bulk. The two endpoints used in the FX2 on the USRP are defined as bulk endpoints [18].

3.3.2 USRP configuration

As stated above, the USB controller of the USRP is a Cypress FX2. It supports high-speed USB 2.0. The USRP and GNU Radio by default do not support USB 1.1 or below. There are however patches available to allow the use of USB 1.1 [20]. When the USRP is powered, the FX2 is in a default state as provided by the manufacturer. When enumerated by the USB host it identifies itself with vendor and product IDs and a revision number that tell the host that it is a USRP but that it is not configured. These IDs are stored in an EEPROM on the USRP. When a USRP connection is opened in a GNU Radio application, the USRP is configured if it is in the unconfigured state.

The firmware for the FX2 8051 microcontroller, which is essential for making the USB controller function in the correct manner, is not stored in the USB controller at power up. Neither is it stored elsewhere on the USRP. The configuration of the USRP therefore starts with sending the 8051 firmware over the USB bus. This is accomplished by a vendor request for firmware download that the FX2 supports both in its power-up state and when configured with firmware. During the firmware download the processor is kept in reset (through writing a reset command to a position in the microcontroller memory using the firmware download vendor request). When the device is reset the host loses contact with it, and when it is woken from reset the device is enumerated once more. This time it identifies itself as a configured USRP.

After this is done the GNU Radio software also configures the FPGA. This is done by sending the configuration bit stream to the FX2 using a USRP-specific vendor request. The 8051 then makes a serial configuration of the FPGA through one of its PIO ports. The configuration of the USRP concludes with some more USRP-specific control requests being sent. These, for instance, set control registers in the FPGA and the AD9862s.
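The firmware download step can be sketched as a sequence of vendor requests. The Python sketch below only builds the list of requests and does not talk to any hardware. The request code 0xA0 and the register address 0xE600 are assumptions based on the Cypress FX2 documentation and are not given in this report. The function name, the chunk size of 64 bytes and the single contiguous firmware image are likewise simplifications chosen for illustration.

```python
VENDOR_OUT = 0x40          # bmRequestType: vendor request, host to device
FIRMWARE_LOAD = 0xA0       # assumed: FX2 firmware download request code
CPUCS_ADDRESS = 0xE600     # assumed: FX2 register holding the 8051 reset bit


def firmware_download_requests(firmware, load_address=0x0000, chunk_size=64):
    """Yield (bmRequestType, bRequest, wValue, data) for each control request.

    wValue carries the target address in the 8051 memory.
    """
    yield (VENDOR_OUT, FIRMWARE_LOAD, CPUCS_ADDRESS, b"\x01")   # hold in reset
    for offset in range(0, len(firmware), chunk_size):
        chunk = firmware[offset:offset + chunk_size]
        yield (VENDOR_OUT, FIRMWARE_LOAD, load_address + offset, chunk)
    yield (VENDOR_OUT, FIRMWARE_LOAD, CPUCS_ADDRESS, b"\x00")   # release reset


requests = list(firmware_download_requests(bytes(range(200))))
```

The first and last requests write to the reset position in the microcontroller memory, which corresponds to keeping the processor in reset during the download and waking it afterwards.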


Chapter 4

Different alternatives

4.1 Initial idea

The initial idea behind this project was that if a larger FPGA could somehow replace the one on the USRP, then functionality could be moved from the software side to the hardware side of the system by putting it in the unused space of the FPGA. This would make it possible to use the system as a hardware implementation evaluation platform. The resource utilization of the FPGA on the USRP is presented in Table 4.1.

Resource                         Used    Available   Utilization
Number of Logic Elements         11138   12060       92%
  Combinatorial w/o register     4349
  Register only                  1302
  Combinatorial with register    5487
Number of LABs                   1182    1206        98%
Number of M4Ks                   42      52          81%
Number of IO pins                173     173         100%
  Clock pins                     2       2           100%
Number of Global clocks          6       8           75%
Number of PLLs                   0       2           0%
Number of CRC blocks             0       1           0%
Number of ASMI Blocks            0       1           0%

Table 4.1. Resources used in the standard USRP FPGA configuration

A Logic Array Block or LAB in the Cyclone FPGA family is an array of 10 Logic Elements (LEs). Each LE consists of a 4-input look-up table (LUT) and a register. The M4Ks are hard memory blocks in the FPGA, each of which can contain 128 36-bit words. For later comparison, let us keep in mind that this FPGA has 12060 4-input LUTs and 239616 bits of dedicated memory [21].
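The totals quoted above can be reproduced from the block counts in Table 4.1. The following Python sketch does the arithmetic; the function names are chosen for illustration only.

```python
LES_PER_LAB = 10        # Logic Elements in one Logic Array Block
WORDS_PER_M4K = 128     # words in one M4K memory block
BITS_PER_WORD = 36      # width of one M4K word


def logic_elements(labs):
    """Total number of LEs (and thus 4-input LUTs) for a number of LABs."""
    return labs * LES_PER_LAB


def memory_bits(m4k_blocks):
    """Total dedicated memory in bits for a number of M4K blocks."""
    return m4k_blocks * WORDS_PER_M4K * BITS_PER_WORD


def utilization_percent(used, available):
    """Utilization rounded to a whole percent."""
    return round(100.0 * used / available)
```

logic_elements(1206) gives the 12060 LUTs and memory_bits(52) gives the 239616 memory bits mentioned in the text.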

A larger FPGA would be needed first of all because the logic utilization of the FPGA on the USRP, as can be seen in Table 4.1, is fairly high. It might be possible to free some resources in the FPGA, but that would still not give much since the FPGA in itself is quite small.

One could ask: Is it not the idea of SDR that as much as possible of the radio system should be implemented in software? Why then make it possible to move functionality 'back' into the hardware part?

This would of course be relevant if the goal was to use this system for SDR. Here that is not really the case. The plan is to use the system designed in this thesis project for research in the area of implementation of signal processing hardware. For instance, if new communication schemes are to be implemented, they can perhaps to some extent be verified in an unmodified GNU Radio - USRP setup. At a later stage, if the system is to be verified in hardware, it would be good to have a simple framework for moving functionality from software to hardware.

If the signal processing hardware to be implemented is part of a radio communication system, it is a benefit that the USRP with daughter boards provides a very flexible radio front-end. But even if the signal processing hardware is not part of a radio communications system this setup could be of use. The USRP is essentially a data acquisition device, and GNU Radio would in that case provide a handy software interface with many predefined signal processing blocks if there is a need to do some signal processing on the host computer.

4.2 Early implementation alternatives

If a larger FPGA is to be introduced into the system the question is: How to do this?

The first alternative, conceived early in this thesis project, was to replace the USRP with some other hardware board including a larger FPGA. It was however not an option to design an entire custom circuit board from scratch to replace the USRP. In any case it was thought that it would be good if the daughter boards for the USRP could still be used in the final solution.

It would also be good if the changes made to the hardware part of the system did not make it necessary to make large changes in the GNU Radio software. Of course, given the open source nature of the software, it would be possible to make changes, but it would be better if this could be avoided. Ideally we would want an FPGA board with a fairly large FPGA and the FX2 USB controller. Then it would not be necessary to change the software at all. GNU Radio could believe that it was still communicating with the USRP when in reality it was communicating with our custom hardware. In this case we could also reuse large parts of the FX2 firmware. The firmware would only need to be changed to account for how the pins of the FX2 are connected on our board. Maybe it would also be possible to keep the same interface between the FX2 and the new FPGA. These things would make the design of the new system a whole lot easier. There are also examples of people who have done similar things [22].

If it were not possible to find a solution using an FX2 to interface with the computer, it would still be attractive to stay with a USB 2.0 link. Even if another USB controller is used, it would still be possible to make GNU Radio believe that it is communicating with a USRP, and that way no modification to the software would be required at all.

Quite a lot of time and energy was devoted to looking at different FPGA boards and the different add-on boards available for these. Various FPGA boards with reasonably large FPGAs were also available at the department. If the daughter boards for the USRP were still to be used, it would be necessary to create a custom add-on board and connect the daughter boards to it.

There was no problem finding an FPGA board with a reasonably large FPGA and a USB 2.0 controller (even an FX2, see the above reference for an example). It was harder, though, to find an FPGA board with both a USB 2.0 controller and high-speed ADC/DAC. Given this, it would be necessary to put the ADC/DAC on the custom add-on board for the daughter boards. This would imply that all signals from the FPGA to the ADC/DAC and the daughter boards would have to pass between the FPGA board and the add-on board. Most probably the expansion header of the FPGA board would not allow so many signals, which would mean that the signals would have to be multiplexed over the expansion header and then demultiplexed on the add-on board. This demultiplexing would most easily be done in a small programmable logic device, a CPLD or an FPGA.

At this point the conceived custom add-on board has become quite complex. It includes a small FPGA, ADC/DAC and connectors for the daughter boards. This is not that different from the USRP itself. Simply add the FX2 and we get something that basically is the USRP. At this point this alternative was abandoned with the emergence of a new one. Why not still use the USRP but put an FPGA board between it and the computer? This concept is illustrated in Figure 4.1.


Figure 4.1. Possible implementation alternative

4.3 USRPs follow-up: USRP2

Some time into the project it came to our attention that the USRP a few months earlier had got a follow-up. A beta version of the USRP2 had been released. An obvious question thus needed an answer: Could this board somehow be a solution for our needs?

4.3.1 Specifications of the USRP2

The specifications of the USRP2 are the following:

• Gigabit Ethernet interface

• 25 MHz of instantaneous RF bandwidth

• Xilinx Spartan 3-2000 FPGA

• Dual 100 MHz 14-bit ADCs

• Dual 400 MHz 16-bit DACs

• 1 MB of high-speed SRAM

• Locking to an external 10 MHz reference

• 1 PPS (pulse per second) input

• Standalone operation

• The ability to lock multiple systems together for MIMO

• Compatibility with all the same daughter boards as the original USRP

As can be seen, the USRP2 has improved compared with the original USRP in various aspects. Most important for the needs that this thesis project is addressing, the USRP2 is compatible with the daughter boards for the original USRP and has a larger FPGA. The USRP2 was released in a beta version that is no longer available at the time of writing. The author has not found any information on when it will become available again.

4.3.2 How much larger is the FPGA on the USRP2?

The FPGA on the USRP2 is a Spartan 3-2000 from Xilinx. This FPGA has 5120 of what Xilinx calls Configurable logic blocks (CLBs). Each of these blocks contains four slices, and one slice contains two 4-input LUTs [23]. The calculated number of 4-input LUTs in this FPGA is therefore 5120 × 8 = 40960. This is 3.4 times more than in the Cyclone FPGA on the USRP. The number of dedicated memory bits in the USRP2 FPGA is 737280. This is 3 times more than the dedicated memory in the Cyclone FPGA on the USRP.

Comparing the size of an Altera FPGA and a Xilinx FPGA in a good way is, however, generally not as easy as this. In the referenced data sheet Xilinx gives a number for "Logic Cells" that is 46080. This is what you get if you consider each CLB to house nine Logic Cells. Besides the eight 4-input LUTs there is some additional logic that permits a larger design to fit into the CLB (and that is what Xilinx counts as "the ninth LUT"). The low-level component that Altera denotes Logic Element (LE) also contains some additional logic besides the LUT, but Altera does not use a normalized measure like "Logic Cell" to indicate this.
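The size comparison can be reproduced with a few lines of Python. All numbers are taken from the text above; the variable names are chosen for illustration.

```python
LUTS_PER_SLICE = 2
SLICES_PER_CLB = 4

CYCLONE_LUTS = 12060            # EP1C12 on the USRP
CYCLONE_MEMORY_BITS = 239616

SPARTAN_CLBS = 5120             # Spartan 3-2000 on the USRP2
SPARTAN_MEMORY_BITS = 737280

spartan_luts = SPARTAN_CLBS * SLICES_PER_CLB * LUTS_PER_SLICE
spartan_logic_cells = SPARTAN_CLBS * 9      # Xilinx counts nine cells per CLB

lut_ratio = spartan_luts / CYCLONE_LUTS
memory_ratio = SPARTAN_MEMORY_BITS / CYCLONE_MEMORY_BITS
```

The LUT ratio evaluates to roughly 3.4 and the memory ratio to roughly 3.1, in line with the rounded figures given above.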

For these device families (Cyclone and Spartan 3) the internal architecture is quite similar. For later generations of Altera and Xilinx FPGAs it is even harder to make good comparisons, because the architectures differ more between the device families [24]. All modern FPGAs also contain hard blocks like multipliers and digital clock management units (DCMs). If a comparison were made for a given design it would be easier. At worst you could simply implement the design on both FPGAs and look at the resulting utilization. In our case we only want a measure of how much space there is in the FPGAs for research and experimentation. For this purpose the comparison above with LUTs and dedicated memory bits is sufficient to give a rough estimate of the size of the FPGAs. For a given application a more detailed analysis could provide a better comparison.


Resource                          Used     Available   Utilization
Number of Slices                  11104    20480       54%
Number of Slice Flip Flops        12657    40960       30%
Number of 4 input LUTs            20212    40960       49%
  Number used as logic            18130
  Number used as Shift registers  1058
  Number used as RAMs             1024
Number of IOs                     309
Number of bonded IOBs             301      333         90%
  IOB Flip Flops                  261
Number of BRAMs                   34       40          85%
Number of MULT18X18s              16       40          40%
Number of GCLKs                   6        8           75%
Number of DCMs                    1        4           25%

Table 4.2. Resources used in the standard USRP2 FPGA configuration [25]

4.3.3 Utilization of the USRP2

In Table 4.2 the resource use of the current version of the USRP2 configuration is listed. As can be seen, 85% of the BRAMs are utilized. BRAM stands for Block RAM and is Xilinx's name for the dedicated memory blocks in their FPGAs. 40% of the dedicated 18x18-bit multiplier blocks (MULT18X18) and 54% of the slices are used. The majority of the slices are used as logic and a smaller number are used as RAMs and shift registers. 6 out of 8 of the global clock paths are used.

Since the USRP2 has so far only been released in a beta version in which not all functionality has been implemented, it is very likely that the amount of free resources will shrink in future releases of the standard configuration for the USRP2. This is also indicated in [25].

4.3.4 Conclusions regarding the USRP2

In this section, some different aspects of the USRP2, but mainly its FPGA, have been discussed. Here, some conclusions regarding the USRP2 will be drawn.

First of all, even though the FPGA on the USRP2 is larger than the one on the USRP, it is not more than 3-4 times larger. Like the FPGA on the USRP, it is also utilized to a large degree, especially memory-wise, with a BRAM utilization of 85% as of today. This thesis project aims at constructing a hardware evaluation framework useful for research oriented towards hardware implementation. It is really hard to estimate the long-term resource requirements in terms of programmable logic that these research activities will have, i.e. how large the involved FPGA(s) will have to be. Since the long-term requirements on FPGA size are unknown, it is desirable to choose as large an FPGA as other constraints (cost, feasibility, etc.) permit.

From this perspective the solution described in Figure 4.1 could be quite good. It should be possible to find an FPGA board with a fairly large FPGA and two USB 2.0 controllers. That solution would have the further benefit that all functionality of the USRP is kept while a virtually empty FPGA is still left for the research activities.

Virtually empty should not be taken as completely empty. Some resources of this FPGA will certainly be needed simply to establish the link between the USRP and the computer, but probably not an especially large amount.

Given this, and to some degree also the fact that the USRP was available at our department and that the beta version of the USRP2 was discontinued in December last year [26], no further considerations involving the USRP2 were made in this thesis project.


Chapter 5

Implementation

5.1 Design decision

With the considerations described in the last chapter in mind, the idea depicted in Figure 4.1 was chosen for implementation. After some time spent on finding a suitable FPGA board, it was decided to use a development board from Terasic Technologies called DE3. On this board sits a Stratix III FPGA. The board comes in three variants with differently sized FPGAs, and the board used in this project has an EP3SL150 FPGA. An aspect crucial for this implementation was that the FPGA board should be able to function as both a USB host and a USB peripheral simultaneously. The DE3 includes an ISP1761 USB controller from NXP. This is a dual role USB controller with three ports. Essentially it is a combination of three USB controllers in one chip: a USB host controller, a USB peripheral controller and a USB On-The-Go (OTG) controller. This makes it possible to use the chip as needed in this project, which was one of the reasons this FPGA board was chosen in the first place.

5.2 DE3

Some of the features of the DE3 are as follows [27]:

• An Altera EP3SL150 Stratix III FPGA
• An ISP1761 dual role host/device USB On-The-Go controller
• Many expansion headers
• DDR2 SO-DIMM socket
• SD Card reader
• LEDs, Seg7 display, and buttons.

Resource                      Available
Number of ALMs                56800
Number of ALUTs               113600
Number of Logic Elements      142500
Number of LABs                5680
Number of M9Ks                355
Number of M144Ks              16
Number of IO pins             744
Number of Global clocks       16
Number of 18x18 multipliers   384
Number of PLLs                8

Table 5.1. Resources available in the DE3 FPGA

Although the DDR2 memory socket and the SD Card reader are not crucial for this design, they might come in handy in future use of the platform designed here. The available resources in the DE3 FPGA are listed in Table 5.1.

The Stratix III FPGA family has an architecture that is somewhat more advanced than that of the Cyclone FPGA family. The programmable logic is built up of a low-level block called an Adaptive Logic Module (ALM). As in the Cyclone family, a Logic Array Block (LAB) consists of 10 low-level logic modules, but in this case the low-level module is an ALM instead of an LE. Every ALM consists of two adaptive look-up tables (ALUTs), two registers, adder circuitry and some additional logic. The two ALUTs can be used as two separate four-input LUTs or as one five-input and one three-input LUT. If the two functions that are to be implemented in these LUTs share inputs, they can have as many as six inputs each.

In Table 5.1 an LE count is included, but this is only a relative estimate given by Altera. In reality there are no Logic Elements in a Stratix III FPGA. If we use this estimate in a comparison with the Cyclone FPGA on the USRP, this FPGA can hold roughly 12 times more logic.

There are two types of dedicated memory blocks in the DE3 FPGA, M144Ks and M9Ks. The M144Ks can store 2048 x 72 bits and the M9Ks can store 256 x 36 bits of data. Since there are 355 M9Ks and 16 M144Ks, the total amount of dedicated memory is 5630976 bits. This is 23.5 times more dedicated memory than in the USRP FPGA.
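As a check of the memory total, the calculation can be written out as follows. The block counts and sizes are those given above; the Cyclone memory size of 239616 bits is an assumption, since it is not stated in this section.

```python
# Block counts and geometries as given in the text.
M9K_COUNT, M9K_BITS = 355, 256 * 36
M144K_COUNT, M144K_BITS = 16, 2048 * 72

# Assumed dedicated memory of the Cyclone FPGA on the USRP (not stated in this section).
CYCLONE_MEMORY_BITS = 239616

total_bits = M9K_COUNT * M9K_BITS + M144K_COUNT * M144K_BITS
ratio = total_bits / CYCLONE_MEMORY_BITS

print(f"Total dedicated memory: {total_bits} bits")
print(f"Ratio DE3/USRP: {ratio:.1f}")
```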

5.3 The ISP1761

Perhaps the most time-consuming part of this thesis project was to read up on and understand how the USB protocol in general, and the USB controller on the DE3 in particular, work. Some aspects of the USB protocol were described earlier in subsection 3.3.1. In this section some, but certainly not all, information on how the ISP1761 USB controller works will be presented.

As said above, the ISP1761 is essentially a combination of three USB controllers: one host controller, one peripheral controller and one USB On-The-Go (OTG) controller. The functionality of the OTG controller has not been used in this project and its workings are therefore not described further. The ISP1761 has three USB ports. Two of them are for host-only operation. The third can be configured as a peripheral or host port. In this project it has been configured as a peripheral port, that is, it is connected directly to the peripheral controller. The USB ports that are configured as host ports are connected to an internal root hub. From the host controller's perspective this is an ordinary USB hub that needs to be enumerated like any other USB device. All communication between the USB host controller and USB devices connected to the host ports goes through the root hub.

The ISP1761 has a memory interface that is intended to be connected to a generic processor bus. This interface is shared by the different controllers in a memory-mapped fashion. The interface is asynchronous and will be further described below. It would of course be possible to write a custom FSM and connect it to the generic processor interface, but this would be very tedious and impractical. The protocol stack for the host controller in particular would be very hard to implement without a processor. For this reason a soft processor has been used inside the FPGA in this project.

A somewhat simplified view of the internal architecture of the ISP1761 can be found in Figure 5.1. Note that the root hub is included in the OTG controller block.

5.3.1 ISP1761 Peripheral controller

Figure 5.1. Internal architecture of the ISP1761 [28]

The peripheral controller of the ISP1761 is, with some small modifications, functionally identical to the ISP1582 USB peripheral controller [29]. It is possible to configure it to support up to 14 endpoints, 7 in each direction. This is in addition to the default endpoint, which is present in all USB devices. For each configured endpoint there is a FIFO buffer for either incoming or outgoing data. The FIFO buffers can be configured to be double buffered as long as no more than a total of 8 kB is used, since this is the amount of available memory. The USB protocol handling defined in chapter 9 of the USB specification [16] is performed by external firmware running on the processor connected to the generic processor bus interface. The peripheral controller signals actions on the USB bus to the processor via interrupts. The processor reads and writes data from/to the endpoint buffers and controls the peripheral controller by writing appropriate values to different control registers and reading status from different status registers.
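The endpoint and memory limits described above can be expressed as a small configuration check. This is a simplified sketch and not taken from the controller documentation: it assumes that each endpoint occupies its maximum packet size once (or twice when double buffered), it ignores the default endpoint, and the names `fifo_bytes` and `configuration_fits` are hypothetical.

```python
FIFO_MEMORY_BYTES = 8 * 1024      # total FIFO memory stated in the text
MAX_CONFIGURABLE_ENDPOINTS = 14   # 7 per direction, excluding the default endpoint


def fifo_bytes(endpoints):
    """Sum the FIFO memory needed by (max_packet_size, double_buffered) pairs."""
    return sum(size * (2 if double_buffered else 1)
               for size, double_buffered in endpoints)


def configuration_fits(endpoints):
    """Check both the endpoint count and the total FIFO memory budget."""
    return (len(endpoints) <= MAX_CONFIGURABLE_ENDPOINTS
            and fifo_bytes(endpoints) <= FIFO_MEMORY_BYTES)


# Hypothetical configurations with 512-byte packets.
bulk_pair = [(512, True), (512, True)]   # one endpoint per direction, double buffered
all_double = [(512, True)] * 14          # every endpoint double buffered

print(fifo_bytes(bulk_pair), configuration_fits(bulk_pair))
print(fifo_bytes(all_double), configuration_fits(all_double))
```

Under these assumptions a pair of double-buffered 512-byte endpoints fits comfortably, whereas double buffering all 14 endpoints at that packet size would exceed the 8 kB budget.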


5.3.2 ISP1761 Host controller

As stated above, the host controller is connected to the internal root hub. The controller must therefore start its operation by enumerating the hub. This is done in the same way as for any other USB device. Later, when a USB device is connected to one of the host ports, the root hub notifies the host controller of the status change. This triggers firmware operations that perform enumeration of the device. All transfers to or from a USB device connected to the host controller, whether control transfers or data transfers, are started by the firmware running on the generic processor. The firmware prepares a data structure called a PTD (Philips Transfer Descriptor), a modified version of the data structure defined in the Enhanced Host Controller Interface (EHCI) specification [30]. This data structure defines the type, destination and other aspects of the transfer. All transfers start with the submission of a PTD to a PTD memory area of the host controller. If the transfer contains a data stage in which data is to be transferred to the device, the data is written to the payload memory of the host controller by the firmware. When the transfer is done the host controller notifies the firmware. If a data payload has been read from the device, it is transferred from the payload memory of the host controller to the generic processor by the firmware. Transfers that do not contain a data stage are typically control requests. The size of the host controller memory is 64 kBytes.

5.3.3 ISP1761 DMA

Payload data to or from both the peripheral and the host controller can be transferred by either programmed IO (PIO) or direct memory access (DMA). Reads and writes to the various control registers of the ISP1761 can be performed only by PIO. The generic processor interface provides an ordinary asynchronous memory interface with the signals described below. In addition to the ordinary memory interface signals there are two interrupt requests and four handshaking signals for DMA transfers.

• A 17-bit wide address bus.
• A bidirectional 32-bit data bus.
• An active low chip select signal.
• An active low read enable signal.
• An active low write enable signal.
• An interrupt request signal for the peripheral controller.
• An interrupt request signal for the host controller.
• A DMA request signal for the peripheral (device) controller (DC_DREQ).
• A DMA acknowledgment signal for the peripheral controller (DC_DACK).
• A DMA request signal for the host controller (HC_DREQ).
• A DMA acknowledgment signal for the host controller (HC_DACK).

A DMA transfer is started by the firmware. It configures the ISP1761 for a DMA transfer either to or from the peripheral controller or to or from the host controller. When the ISP1761 is ready for the transfer it asserts the DMA request signal: DC_DREQ if the transfer involves the peripheral controller, HC_DREQ if it involves the host controller. The ISP1761 is ready for DMA when a DMA operation is configured and there is either data available or, in the peripheral case, an empty FIFO buffer.

The assertion of the DREQ signal triggers a DMA transfer by some other part of the system, typically a DMA controller. When the asserted DREQ is seen, the DMA acknowledgment signal (DC_DACK or HC_DACK) is asserted by the DMA controller and a DMA read or write burst is conducted. When all words of a burst have been transferred the ISP1761 de-asserts the DREQ signal, the DACK signal is de-asserted in response and no more words are transferred in this burst. If more bursts are required this procedure is repeated. The source or destination address for a DMA transfer and the block size are written to registers when the DMA transfer is configured, and thus the address bus is not needed during the DMA burst transfers.
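The handshake order described above can be illustrated with a purely behavioural model. The sketch below is not the hardware implementation; the function names are hypothetical, and the burst length of 16 words is an assumption taken from the description of the IP-block later in this chapter.

```python
WORDS_PER_BURST = 16  # assumed burst length (see the IP-block description)


def dma_burst(words):
    """Return the ordered handshake events for a single DMA burst."""
    events = ["DREQ asserted (ISP1761 ready)",
              "DACK asserted (DMA controller responds)"]
    events += [f"word {word:#010x} transferred" for word in words]
    events += ["DREQ de-asserted (burst complete)",
               "DACK de-asserted (in response)"]
    return events


def dma_transfer(words, words_per_burst=WORDS_PER_BURST):
    """Split a transfer into bursts and concatenate their handshake events."""
    events = []
    for start in range(0, len(words), words_per_burst):
        events += dma_burst(words[start:start + words_per_burst])
    return events


block = list(range(128))  # 128 words of 32 bits, i.e. one 512-byte block
log = dma_transfer(block)
bursts = sum(event.startswith("DREQ asserted") for event in log)
print(f"{bursts} bursts, {len(log)} events")
```

Note that no address appears anywhere in the event list, which mirrors the fact that the address bus is not used during the bursts.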

5.4 Implementation architecture

In order to use the DE3 as intended here, the FPGA needs to be configured in such a way that the USB traffic between the computer and the USRP can pass through the DE3. Here are some key requirements on the system:

• The USB peripheral should be programmed so that the computer sees a USRP. This means that no changes need to be made to the GNU Radio software.
• There should be some easy-to-use interface for custom signal processing logic to be inserted into the FPGA when this system is later used as a hardware evaluation platform.
• The USB host needs to be programmed in such a way that it can configure the USRP if this is needed.

Because the ISP1761 needs a processor with firmware implementing parts of the USB protocol stack to function in a useful manner, a Nios II soft processor from Altera was instantiated in the FPGA. A custom IP-block was inserted between the ISP1761 and the processor. The processor is connected to the IP-block using the Avalon switch fabric provided by Altera. The IP-block in turn is connected to the asynchronous bus interface of the ISP1761. The IP-block also provides an interface consisting of FIFO buffers for custom signal processing logic to be inserted into the FPGA. The setup is illustrated in Figure 5.2.

Figure 5.2. DE3 system architecture

The role of the IP-block is firstly to pass on PIO reads and writes of the processor core to the USB controller. Secondly, it can be programmed to act as a DMA controller for the ISP1761 and perform DMA transfers to/from either the endpoint FIFOs in the peripheral controller or the memory in the host controller. The data read or stored in these DMA transfers is written to or read from the FIFO buffers of the custom logic interface. DMA transfers are conducted in read or write bursts of sixteen 32-bit words each. All 512-byte blocks of data going to or from the USRP are transferred only by DMA. These blocks therefore never have to be transferred over the Avalon switch fabric to the Nios II processor. All communication between the ISP1761 and the Nios II processor is done using PIO.

Figure 5.3. PIO write timing requirements

Another reason to have the IP-block between the Avalon switch fabric and the asynchronous bus of the ISP1761 is that the timing requirements for PIO and DMA transfers are different. Handling these differences inside the IP-block makes the interfacing with the Avalon switch fabric simpler. The timing requirements are presented in the next section.

The Nios II processor and the Avalon switch fabric run at a speed of 100 MHz.

5.4.1 Asynchronous bus interface timing of the ISP1761

The ISP1761 has an asynchronous bus interface. As stated above, payload transfers can be done using both DMA and PIO, whereas register access is only possible using PIO. It was also stated earlier that the timing requirements of the asynchronous bus interface differ depending on the type of transfer to be conducted. In this section these differences will be described.

5.4.2 Timing for a PIO write

The timing requirements for a PIO write are shown in Figure 5.3. There are setup time requirements on address, chip select and data relative to the de-assertion of the write enable signal. These are however not critical, since the write pulse must be more than 17 ns wide. Address, chip select and write enable can therefore all be asserted at the same time. Since these signals are produced by a synchronous system clocked at 100 MHz, the write pulse has to be 20 ns wide (two 10 ns clock periods). There are also hold requirements on address, chip select and data relative to the de-assertion of the write enable signal. Even though these are significantly shorter than the clock period of 10 ns, they mean that these signals have to be asserted for one clock cycle more than the write enable signal. A PIO write cycle therefore takes 3 clock periods to complete [28].

Figure 5.4. PIO read timing requirements
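The reasoning above amounts to rounding the minimum pulse width up to whole clock periods and adding one period for the hold requirement. A minimal sketch of that calculation, using only the figures quoted in the text:

```python
import math

CLOCK_PERIOD_NS = 10      # 100 MHz system clock
MIN_WRITE_PULSE_NS = 17   # minimum write pulse width quoted in the text
HOLD_CYCLES = 1           # address, chip select and data held one extra cycle

# Round the minimum pulse width up to a whole number of clock periods.
pulse_cycles = math.ceil(MIN_WRITE_PULSE_NS / CLOCK_PERIOD_NS)
pulse_ns = pulse_cycles * CLOCK_PERIOD_NS
write_cycle_periods = pulse_cycles + HOLD_CYCLES

print(f"Write pulse: {pulse_cycles} periods ({pulse_ns} ns)")
print(f"Complete PIO write: {write_cycle_periods} periods")
```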

5.4.3 Timing for a PIO read

The timing requirements for a PIO read are presented in Figure 5.4. Note that the measures relative to the data signal in this case are not requirements but rather an upper bound on the delay of the data arriving from the USB controller and a maximum time for which the data remains available after the read enable signal is de-asserted. The setup time for address and chip select relative to the assertion of the read enable signal is 0. That is to say, address, chip select and the read enable signal can be asserted simultaneously. The read pulse needs to be at least 22 ns wide [28]. Given this it is possible, as in the case of a PIO write, to complete a PIO read operation in three clock cycles (provided that the time delays are not too large).

With a PIO read cycle time of three clock periods there were sometimes timing issues when the system was placed and routed by the tool. Most of the time there were no issues with the memory read if the place and route was run a second time. This indicates that the time delays were close to the limit of what is allowed to meet the timing requirement of this setup, and sometimes became too large for the read cycle to complete in three clock periods. With the writes there were probably no errors (although this is hard to verify if you cannot read back what you have written). The reason behind this assumption is that in the write case time delays should not be critical at all (at least if all signals have roughly equal delay), since all relevant signals in the write case go in the same direction. The read cycle is more critical, however, since the data bus in this case drives data in the opposite direction. The timing information for a PIO read states that the data should be available no later than 22 ns after the instant at which the read enable signal is asserted. There is of course some delay from when RD_N is asserted in the FPGA until the ISP1761 sees it, and this delay has to be added to the 22 ns. The delay for the data to travel from the USB controller to the Nios II processor also has to be added. Apparently the total delay sometimes makes the data arrive too late, so that it is not yet available when the CPU data master samples the bus on the rising clock edge that ends the third clock period. Since this problem usually disappeared with a second run of the place and route tool, it was not examined further.
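The timing budget discussed above can be written as a simple slack calculation. The sketch below is only illustrative: the two path delays are hypothetical values (they were not measured in this project), and the setup time of the sampling register is ignored.

```python
CLOCK_PERIOD_NS = 10
READ_CYCLES = 3
MAX_DATA_DELAY_NS = 22  # data valid at most 22 ns after RD_N assertion (from the text)


def read_slack_ns(fpga_to_chip_ns, chip_to_cpu_ns):
    """Time left before the sampling edge; a negative value means the data is late."""
    budget = READ_CYCLES * CLOCK_PERIOD_NS
    return budget - (fpga_to_chip_ns + MAX_DATA_DELAY_NS + chip_to_cpu_ns)


# Hypothetical routing delays from two different place-and-route runs.
print(read_slack_ns(3, 4))   # positive slack: the read succeeds
print(read_slack_ns(5, 4))   # negative slack: the data arrives too late
```

With only 8 ns left for both path delays, a difference of a couple of nanoseconds between place-and-route runs is enough to move the design from passing to failing, which is consistent with the behaviour observed.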

5.4.4 Timing for a peripheral DMA transfer

To continue, the timing requirements for both read and write DMA bursts to/from the peripheral controller are given in Figure 5.5. As above, the measures on the read data are not requirements but rather information. After the DC_DREQ signal has been asserted, the DC_DACK signal is asserted. The polarity of these two signals is programmable; in the figure DC_DREQ is assumed to be active high and DC_DACK active low. The DC_DACK signal cannot be asserted directly after the DC_DREQ signal is asserted. This is however not really an extra requirement, since DC_DREQ is an asynchronous output signal from the ISP1761. It needs to be synchronized anyway before it is sampled by any synchronous system, to avoid metastability, and this synchronization guarantees at least the required setup time. The reads or writes are then performed in a burst fashion. A write or read pulse needs to be at least 39 ns wide, which means four 10 ns clock periods. There is a recovery time of 36 ns before the next pulse can arrive and a cycle time requirement of 75 ns. A DMA write or read to the peripheral controller can therefore be performed in eight clock periods.

Figure 5.6. Host DMA write timing requirements
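The step from the nanosecond requirements to eight clock periods can be made explicit as follows. This is a sketch using only the figures quoted above; the helper name `to_cycles` is hypothetical.

```python
import math

CLOCK_PERIOD_NS = 10  # 100 MHz system clock


def to_cycles(nanoseconds):
    """Round a timing requirement up to whole clock periods."""
    return math.ceil(nanoseconds / CLOCK_PERIOD_NS)


pulse_cycles = to_cycles(39)      # minimum read/write pulse width
recovery_cycles = to_cycles(36)   # minimum recovery time between pulses
# The word cycle must satisfy both the 75 ns cycle time and pulse + recovery.
word_cycles = max(to_cycles(75), pulse_cycles + recovery_cycles)

print(pulse_cycles, recovery_cycles, word_cycles)
```

Both conditions give the same result here, so quantizing to the 10 ns clock costs only 5 ns per word compared with the 75 ns minimum.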

5.4.5 Timing for a host DMA transfer

To conclude this section, let us look at the timing requirements for DMA transfers to or from the host controller. The requirements for a DMA write burst can be found in Figure 5.6 and the requirements for a DMA read burst can be found in Figure 5.7. The requirements are somewhat similar to the peripheral DMA transfer requirements, but the transfers to and from the host controller can be done faster. The cycle time for a host DMA write is 51 ns, which corresponds to six clock periods of 10 ns, and the cycle time for a host DMA read is 38 ns, which corresponds to four clock periods of 10 ns.
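The same rounding applies to the host controller figures. The short sketch below (dictionary keys and names are illustrative only) tabulates the resulting time per 32-bit word on the 100 MHz clock:

```python
import math

CLOCK_PERIOD_NS = 10

# Minimum cycle times quoted in the text, in nanoseconds.
HOST_CYCLE_TIME_NS = {"host DMA write": 51, "host DMA read": 38}

host_cycles = {name: math.ceil(ns / CLOCK_PERIOD_NS)
               for name, ns in HOST_CYCLE_TIME_NS.items()}
host_word_time_ns = {name: cycles * CLOCK_PERIOD_NS
                     for name, cycles in host_cycles.items()}

for name in HOST_CYCLE_TIME_NS:
    print(f"{name}: {host_cycles[name]} periods = {host_word_time_ns[name]} ns per word")
```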


Chapter 6

Results

The implementation process has essentially had two parts: implementation of the IP-block and firmware programming of the Nios II processor.

6.1 IP-block operation

The IP-block was first of all implemented to pass PIO reads and writes from the processor core to the ISP1761. This functionality was successfully implemented. In Figure 6.1 on-chip measurements of one PIO write and one PIO read cycle are shown. Here we clearly do not have any of the read timing issues described earlier in the text. As can be seen in the figure, the read data has arrived already after 20 ns. The Avalon read and write cycles are extended to three clock periods by asserting the wait-request signal during the first two cycles.

Figure 6.1. PIO read and write from the Nios 2 CPU through the USB interface IP-block


6.1.1 IP-block programming interface

The address bus from the Avalon switch fabric to the IP-block is one bit wider than the address bus from the IP-block to the ISP1761. This allows a control register and a status register to be memory mapped outside the address space of the ISP1761. The control register is located at address 0x40000 and the status register at address 0x40004. Table 6.1 gives a bitwise description of the control register and Table 6.2 describes the status register.
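The register layout given in Tables 6.1 and 6.2 could be captured in software roughly as follows. This is only an illustrative sketch in Python rather than the actual firmware; the constant and function names are hypothetical, and the polarity of WRITE_READ_N (1 = write, 0 = read) is an assumption based on the signal name.

```python
CONTROL_REG_ADDR = 0x40000
STATUS_REG_ADDR = 0x40004

# Control register bits (Table 6.1).
CTRL_RUN = 1 << 0
CTRL_WRITE_READ_N = 1 << 1   # assumed: 1 = write, 0 = read
CTRL_BUSY = 1 << 2
CTRL_INT_EN = 1 << 4

# Status register bits (Table 6.2).
STAT_BUSY = 1 << 0
STAT_INT_REASON = 1 << 6
STAT_IRQ_BIT = 1 << 7


def control_word(run, write, busy, int_en):
    """Assemble a control register value from individual flags."""
    value = 0
    value |= CTRL_RUN if run else 0
    value |= CTRL_WRITE_READ_N if write else 0
    value |= CTRL_BUSY if busy else 0
    value |= CTRL_INT_EN if int_en else 0
    return value


def decode_status(value):
    """Split a status register value into its documented fields."""
    return {
        "busy": bool(value & STAT_BUSY),
        "irq": bool(value & STAT_IRQ_BIT),
        "buffer": "peripheral" if value & STAT_INT_REASON else "host",
    }


print(hex(control_word(run=True, write=True, busy=False, int_en=True)))
print(decode_status(0xC1))
```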

Bit     Symbol         Description
7 to 5  -              Reserved
4       INT_EN         Interrupt enable for the IP-block. When a buffer is less than half full and both the RUN bit and this bit are set, an interrupt will trigger the firmware to start a DMA transfer to the buffer.
3       -              Reserved
2       BUSY           Written by firmware before a DMA transfer is configured. Cleared when the DMA transfer is done. Prevents generation of interrupts and sets the busy bit in the status register.
1       WRITE_READ_N   Controls the direction in which an upcoming DMA transfer is conducted.
0       RUN            If this bit is not set, the IP-block will ignore DREQ signals.

Table 6.1. Bitwise description of the IP-block control register (address 0x40000)

Bit     Symbol       Description
7       IRQ_BIT      Interrupt bit. The interrupt is cleared by writing one to this bit.
6       INT_REASON   If zero, the interrupt was generated because the in buffer for the host controller needs data. If one, the interrupt was generated because the in buffer for the peripheral controller needs data.
5 to 1  -            Reserved
0       BUSY         IP-block is busy.

Table 6.2. Bitwise description of the IP-block status register (address 0x40004)

After the functionality for passing through PIO transfers had been implemented, support for DMA transfers from the host controller and to the peripheral controller was added. A DMA write burst of 16 words to the peripheral controller that has been measured on-chip can be seen in Figure 6.2. It should be noted that the OTG_DC_DREQ signal typically does not arrive at a clock edge, but since the data in the plot has been sampled with a 100 MHz clock, we cannot see a new value until the rising edge of the clock. The signal is fed through a flip-flop to prevent it from causing metastability in the FSM. The fin_rdreq signal is a synchronous read enable to the IN FIFO for data from the USRP. This signal is asserted combinatorially when the delayed OTG_DC_DREQ is seen. In this cycle the next_state signal is updated, and in the next cycle the FSM leaves the IDLE state. When the state changes, OTG_DC_DACK is asserted. The write pulses are 40 ns wide and the cycle time is 80 ns. As can be seen, one such burst takes almost 128 clock cycles to complete. To transfer a whole 512-byte block, 8 such bursts have to be transmitted. An on-chip measurement of this can be seen in Figure 6.3.

Figure 6.3. Complete 8 burst peripheral DMA write transfer

On-chip measurements of a DMA read burst from the host controller can be seen in Figure 6.4. Here the cycle time is 40 ns and the whole burst takes about 64 clock periods to complete. Also in this case 8 such bursts are needed for a complete 512 byte transfer.
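The burst counts and durations quoted above follow directly from the burst length and the per-word cycle times. A small sketch of the arithmetic, using the 16-word bursts and the 80 ns and 40 ns cycle times from the measurements:

```python
CLOCK_PERIOD_NS = 10
BLOCK_BYTES = 512
WORDS_PER_BURST = 16
BYTES_PER_WORD = 4          # 32-bit data bus

PERIPHERAL_WORD_NS = 80     # measured cycle time, peripheral DMA write
HOST_READ_WORD_NS = 40      # measured cycle time, host DMA read

bursts_per_block = BLOCK_BYTES // (WORDS_PER_BURST * BYTES_PER_WORD)
peripheral_burst_cycles = WORDS_PER_BURST * PERIPHERAL_WORD_NS // CLOCK_PERIOD_NS
host_read_burst_cycles = WORDS_PER_BURST * HOST_READ_WORD_NS // CLOCK_PERIOD_NS

print(f"{bursts_per_block} bursts per 512-byte block")
print(f"Peripheral write burst: {peripheral_burst_cycles} clock periods")
print(f"Host read burst: {host_read_burst_cycles} clock periods")
```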

Figure 6.4. DMA read burst from the ISP1761 host controller

DMA read bursts from the peripheral controller and write bursts to the host controller have been partly implemented but not fully tested, because support for them is lacking in the most recent version of the firmware.

6.2 Firmware

Two demonstration designs were delivered with the DE3: one USB host and one USB device application. The firmware construction was started from the code of these two demonstration designs. They were first integrated, and at a later point the firmware for the peripheral controller was rewritten from scratch and the USB device demonstration code was removed. The host code has been reworked to some extent but is still not fully operational; it is a lot more complex than the device code. The demonstration code did not use interrupts at all; this has to a large extent been changed. The code did not work at high speed; this has been fixed.

6.3 Conclusions

An intermediate goal was to make it possible to tunnel control requests through the DE3 board. This has been achieved, and it is possible to program the USRP through the DE3.

A second intermediate goal was to make it possible to receive data from the USRP. DMA transfers to the IP-block from the host controller and to the peripheral controller from the IP-block have been implemented and tested. This functionality was shown earlier in this chapter. The second intermediate goal was however not met, because the firmware was not completed, simply due to lack of time.

The logic usage of the current version of the implementation can be seen in Table 6.3. As can be seen, not many resources are used for logic, but about 40% of the memory bits are used. A large part of the memory usage is due to the relatively large on-chip memory for the Nios II processor. In the current version of the firmware about half of the on-chip memory is used. If the firmware is further reworked it might be possible to free larger portions of the memory resources.

Resource                             Used      Available   Utilization
Number of ALUTs                      2732      113600      2%
Dedicated logic registers            3015      113600      2%
ALMs completely or partially used    2510      56800       4%
Dedicated memory bits used           2345408   5630976     42%
Number of IO pins                    110       744         15%
DSP block 18-bit elements            4         384         1%
Number of PLLs                       1         8           13%

Table 6.3. Resources used in the DE3 FPGA

A block of 512 bytes is transferred to or from the peripheral controller in about 1152 clock periods. This is equivalent to a bandwidth of about 44.4 MBps. A transfer from the host controller is conducted in roughly half the time and hence the data rate is doubled. A DMA transfer to the host controller should take roughly 5/8 of the time of a peripheral transfer. Given the relatively high speed of these transfers, contention on the bus to the ISP1761 should not become the bottleneck of the system.
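The bandwidth figure can be verified from the block size, the number of clock periods and the 100 MHz clock. The short calculation below also shows how much of the 1152 periods is spent outside the eight 128-period bursts:

```python
CLOCK_HZ = 100e6
BLOCK_BYTES = 512
BLOCK_CLOCK_PERIODS = 1152      # measured duration of one peripheral block transfer
BURST_CLOCK_PERIODS = 8 * 128   # eight bursts of 128 clock periods each

transfer_time_s = BLOCK_CLOCK_PERIODS / CLOCK_HZ
bandwidth_mbps = BLOCK_BYTES / transfer_time_s / 1e6
overhead_periods = BLOCK_CLOCK_PERIODS - BURST_CLOCK_PERIODS

print(f"Peripheral transfer: {bandwidth_mbps:.1f} MBps")
print(f"Host read at half the time: {2 * bandwidth_mbps:.1f} MBps")
print(f"Clock periods spent between bursts: {overhead_periods}")
```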

6.4 Future work

First of all the implementation needs to be finalized. This includes completion of the firmware and final corrections and testing of the IP-block for the two DMA transfer cases that so far have only been partly implemented and not tested at all in the FPGA.

Quite a lot of resources are used for the Nios II processor. There is really no need for such an advanced processor in this application, so some resources of the DE3 FPGA could be freed by changing the processor. At the moment the most resource-consuming of the three available Nios II variants is used. This version consumes about 1500 LEs, whereas the most lightweight version of the Nios II consumes only 600-700 LEs.

If the firmware is reworked, more memory resources would probably become available. A total rewrite of the firmware with the goal of minimizing the memory needed would probably save large amounts of memory. Even without any changes to the firmware, the memory usage could at least be halved if the size of the on-chip memory is decreased.

It could perhaps also be a good idea to replace the Avalon switch fabric with some simpler bus to reduce the resources needed for the interconnect.


Bibliography

[1] Norton Quinn. Wired: GNU Radio Opens an Unseen World. http://www.wired.com/science/discoveries/news/2006/06/70933?currentPage=2, June 2006.

[2] John Gilmore. John Gilmore's home page. http://www.toad.com/gnu/.

[3] GNU Radio. http://gnuradio.org/trac.

[4] Simplified Wrapper and Interface Generator. http://www.swig.org.

[5] Eric Blossom. Exploring GNU Radio. http://www.gnu.org/software/gnuradio/doc/exploring-gnuradio.html, November 2004.

[6] Josh Blum. GNU Radio Companion. http://www.joshknows.com/.py/grc, April 2009.

[7] Ettus Research LLC. http://www.ettus.com.

[8] Ettus Research: Order page. http://www.ettus.com/orderpage.html, 2009.

[9] Eric Blossom. Re: [Discuss-gnuradio] multi_file usrp tuning options. http://lists.gnu.org/archive/html/discuss-gnuradio/2007-06/msg00004.html, 2007.

[10] P. Balister and J. Reed. USRP Hardware and Software Description. http://www.ece.vt.edu/swe/chamrad/crdocs/CRTM09_060727_USRP.pdf, June 2006.

[11] Ettus Research LLC Matt Ettus. USRP User’s and Developer’s Guide. http://www.joshknows.com/.py/grc.

[12] Matt Ettus. Re: [Discuss-gnuradio] USRP RX Decimation Rate. http://lists.gnu.org/archive/html/discuss-gnuradio/2007-04/msg00265.html, 2007.


[13] Eric Blossom. Re: [Discuss-gnuradio] Trouble understanding multiple independent signal. http://lists.gnu.org/archive/html/discuss-gnuradio/2005-10/msg00142.html, 2005.

[14] Matt Ettus. Re: [Discuss-GnuRadio]: tx_chain.v module in usrp_std.v. http://lists.gnu.org/archive/html/discuss-gnuradio/2006-07/msg00068.html, 2006.

[15] Matt Ettus. Re: [Discuss-gnuradio] bits per sample. http://lists.gnu.org/archive/html/discuss-gnuradio/2006-07/msg00035.html, 2006.

[16] Universal Serial Bus Specification Revision 2.0. http://www.usb.org/developers/docs/usb_20_122208.zip, 2000.

[17] Eric Blossom. Re: [Discuss-gnuradio] Using USRP from multiple applications simultaneou. http://lists.gnu.org/archive/html/discuss-gnuradio/2005-05/msg00011.html, 2005.

[18] Eric Blossom. Re: [Discuss-gnuradio] Frontend Hardware which is not USRP. http://lists.gnu.org/archive/html/discuss-gnuradio/2007-12/msg00105.html, 2007.

[19] Jan Axelson. USB Complete: Everything You Need to Develop Custom USB Peripherals with Cdrom. Lakeview Research, 1999.

[20] S.M. Shajedul Hasan and P. Balister. Prototyping a Software Defined Radio Receiver Based on USRP and OSSIE. http://www.ece.vt.edu/swe/chamrad/crdocs/CRTM01_051214_USRP.pdf, Dec 2005.

[21] Altera Corporation. Cyclone FPGA Family Data Sheet. http://www.altera.com/literature/hb/cyc/cyc_c5v1_01.pdf, May 2008.

[22] Larry Doolittle. Usrp code on a avnet board. http://lists.gnu.org/archive/html/discuss-gnuradio/2005-05/msg00115.html, May 2005.

[23] Xilinx. Spartan-3 1.2V FPGA Family: Complete Data Sheet. http://www.digchip.com/datasheets/download_datasheet.php?id=1060503&part-number=XC3S2000, October 2003.

[24] 1-CORE Technologies. FPGA Logic Cells Comparison. http://www.1-core.com/library/digital/fpga-logic-cells/.

[25] Johnathan Corgan. Re: [Discuss-gnuradio] Logic utilization of standard USRP2 configuration. http://lists.gnu.org/archive/html/discuss-gnuradio/2009-04/msg00370.html, April 2009.

[26] Matt Ettus. [Discuss-gnuradio] USRP News – Dec 2008. http://lists.gnu.org/archive/html/discuss-gnuradio/2008-12/msg00304.html, December 2008.

[27] Terasic Technologies. DE3 Development and Education Board User Manual. http://www.terasic.com.tw/attachment/archive/260/DE3_User_manual_v1.2.5.pdf, 2009.

[28] NXP Semiconductor. High Speed Universal Serial Bus On-The-Go controller Product data sheet. http://download.siliconexpert.com/pdfs/2007/03/20/semi_ap/3/phi/interface%20and%20control/isp1761_4.pdf, 2007.

[29] ST-NXP Wireless. ISP1582 Hi-Speed USB peripheral controller. http://www.mouser.com/catalog/specsheets/ISP1582_6.pdf, 2008.

[30] Intel Corporation. Enhanced Host Controller Interface Specification for the Universal Serial Bus. http://www.intel.com/technology/usb/download/ehci-r10.pdf, 2002.

Upphovsrätt

Detta dokument hålls tillgängligt på Internet — eller dess framtida ersättare — under 25 år från publiceringsdatum under förutsättning att inga extraordinära omständigheter uppstår.

Tillgång till dokumentet innebär tillstånd för var och en att läsa, ladda ner, skriva ut enstaka kopior för enskilt bruk och att använda det oförändrat för icke-kommersiell forskning och för undervisning. Överföring av upphovsrätten vid en senare tidpunkt kan inte upphäva detta tillstånd. All annan användning av dokumentet kräver upphovsmannens medgivande. För att garantera äktheten, säkerheten och tillgängligheten finns det lösningar av teknisk och administrativ art.

Upphovsmannens ideella rätt innefattar rätt att bli nämnd som upphovsman i den omfattning som god sed kräver vid användning av dokumentet på ovan beskrivna sätt samt skydd mot att dokumentet ändras eller presenteras i sådan form eller i sådant sammanhang som är kränkande för upphovsmannens litterära eller konstnärliga anseende eller egenart.

För ytterligare information om Linköping University Electronic Press se förlagets hemsida http://www.ep.liu.se/

Copyright

The publishers will keep this document online on the Internet, or its possible replacement, for a period of 25 years from the date of publication barring exceptional circumstances.

The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for his/her own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/

© Carl Ingemarsson
