
Synchronization of flywheel position between autonomous devices


Department of Electrical Engineering

Master's thesis

Master's thesis carried out in Computer Engineering at the Institute of Technology at Linköping University

by

Tobias Pettersson

LiTH-ISY-EX--12/4602--SE

Linköping 2012

Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden


Supervisors: Andreas Ehliar, ISY, Linköpings universitet

Rasmus Backman, Scania CV AB

Joakim Jäderberg, Scania CV AB

Examiner: Olle Seger, ISY, Linköpings universitet


Division, Department: Division of Automatic Control, Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden

Date: 2012-06-13

URL for electronic version: http://www.control.isy.liu.se, http://www.ep.liu.se

ISRN: LiTH-ISY-EX--12/4602--SE

Title (Swedish): Synkronisering av svänghjulsposition mellan autonoma enheter

Title: Synchronization of flywheel position between autonomous devices

Author: Tobias Pettersson


Abstract

More computing power will be required in Scania’s future engine control units. Calculations therefore need to be performed on new hardware such as an FPGA. One problem that arises is synchronization of flywheel position. This master thesis examines the opportunities that existing Scania hardware offers for synchronizing flywheel position. Different concepts for synchronization have been developed and compared with each other. One of the concepts has been implemented, which was made possible with a PCB-adapter. The results show that synchronization is possible within the given real-time requirements. Finally, an analysis with respect to series production has been made. It shows the challenges that an FPGA will face when integrated into a future engine control unit.



Acknowledgments

I would particularly like to thank my supervisors at Scania for their support during this master thesis. Rasmus Backman gave me great support when designing the PCB-adapter and when creating the FPGA design. Joakim Jäderberg taught me a great deal about how a TPU works and was also a great help when integrating S8 with the FPGA. I also want to thank Hans Svensson for contributing his knowledge about crank angle logic and Christoffer Markusson for his knowledge about the CPU configuration in S8. Thanks also to Per Olsson, who provided me with the necessary tools during the master thesis. Thanks to my supervisor at the university, Andreas Ehliar, who supported me throughout the entire master thesis process with discussions about the report and the implementation.

Finally, I want to thank my parents for their support during my entire time at university.


Contents

1 Introduction
1.1 Background
1.2 Purpose
1.3 Problem Definition
1.4 Delimitations
1.5 Overview S8 and FPGA
1.6 Report Outline

2 Background Concepts
2.1 Diesel Engine Basics
2.1.1 Four-stroke process
2.1.2 Flywheel Position Sensor
2.1.3 CAD - Crank Angle Degree
2.2 Scania S8
2.2.1 ECU, EMS and S8
2.2.2 S8 Architecture
2.2.3 S8 CPU
2.2.4 TPU - Time Processing Unit
2.3 FPGA
2.3.1 Overview FPGA
2.3.2 FPGA Technologies

3 Analysis of Methods to Synchronize Flywheel Position
3.1 Implementation Factors
3.1.1 Timing Requirements
3.1.2 Degradation of CPU and TPU
3.1.3 Test and Verification
3.1.4 Flexible Design
3.1.5 Hardware Cost
3.2 Engine Speed Estimation
3.3 Communication
3.3.1 Required Messages
3.3.2 CPU and TPU Communication
3.3.3 CPU and FPGA Communication
3.3.4 TPU and FPGA communication
3.4 Implementation Alternatives
3.4.1 Synchronization Method
3.4.2 Concept 1 - UART Interface
3.4.3 Concept 2 - Memory Mapped Interface
3.4.4 Concept 3 - Dual Handshake Interface
3.5 Concept Selection

4 System Design and Implementation
4.1 PCB Design
4.1.1 PCB-Adapter
4.1.2 PCB Voltage Divider
4.2 Synchronization of CAD
4.3 TPU Design
4.3.1 Pin and Channel Selection
4.3.2 Functional Description
4.4 CPU Design
4.5 FPGA Design
4.5.1 Overview
4.5.2 Message Decoder
4.5.3 BRAM Controller
4.5.4 CAD Module
4.5.5 TPU Interface

5 Test and Verification
5.1 TPU Design
5.2 CPU Design
5.3 FPGA Design
5.3.1 VHDL Test Packages
5.3.2 CAD Module
5.3.3 BRAM Controller
5.3.4 Message Decoder and TPU Interface Module
5.3.5 Area and Timing Report FPGA
5.3.6 PC to FPGA
5.4 System Test
5.4.1 Test Setup
5.5 Truck Test

6 Using an FPGA in a Series Produced Unit
6.1 FPGA in Engine Control Units
6.2 Requirements on the FPGA
6.2.1 Environmental Requirements
6.2.2 Startup Requirements
6.2.3 Security
6.3 FPGA Effects on Assembly Line
6.3.1 Assembly Line
6.3.2 Firmware Updates
6.4 Using an FPGA for Future Engine Control Units

7 Conclusion
7.1 Summary
7.2 Future Work
7.2.1 CAD Synchronization on a Separate TPU Engine
7.2.2 Implementation of Engine Position Logic in an FPGA
7.2.3 Logging
7.2.4 Using FPGA Platform for Other Purposes
7.2.5 Towards a New Engine Control Unit

List of Figures

1.1 Block diagram of S8 and FPGA with connected flywheel position sensors
2.1 Cross section of a cylinder in a combustion engine. Figure from [10], used with permission.
2.2 Flywheel teeth pattern and sensor positions for a 6-cylinder engine
2.3 Transformation of flywheel position sensor signal to a pulse train signal
2.4 Block diagram of S8 architecture
2.5 Basic structure of an FPGA
3.1 Timing requirement for flywheel position between S8 and FPGA
3.2 Simulation when sending 7 data bits with a handshake protocol
3.3 Simulation when sending the byte 0xAA via UART
3.4 Stress simulation when sending the byte 0xAA via UART
3.5 Architecture of concept 1 with UART interface
3.6 Architecture of concept 2 with memory mapped interface
3.7 Architecture of concept 3 with a dual handshake interface
4.1 Top and bottom view of the PCB-adapter
4.2 The S8 and the FPGA connected via the PCB-adapter
4.3 Layout and picture of the designed voltage divider
4.4 Sequence diagram of CAD synchronization
4.5 TPU channel accesses of Share Code Memory
4.6 FSM for synchronization of CAD value in CPU
4.7 Opal Kelly XEM3010 with a Spartan 3 FPGA
4.8 Block diagram of FPGA implementation
4.9 Block diagram of BRAM controller design
4.10 FSM for BRAM port A
4.11 Block diagram of CAD module
4.12 Implementation of CAD Controller
4.13 Flowchart of CAD estimation
4.14 FSM for TPU interface
5.1 Simulation of handshake protocol between the TPU and the FPGA with a TPU simulator
5.2 Oscilloscope measurement of ce and we signals during three write operations
5.3 Simulation of CAD synchronization
5.4 Plot of CAD estimation
5.5 Simulation of an FPGA write and a CPU read
5.6 Simulation of simultaneous CPU read and FPGA write to the same memory address
5.7 User interface to control and observe FPGA signals
5.8 Block diagram of system test setup
5.9 Physical test setup during system test
5.10 Maximum difference between the S8 and the FPGA timestamps when the RPM is constant

List of Tables

4.1 FPGA signal interface
5.1 Area and Timing Report for FPGA design

List of Examples

2.1 Code structure of a TPU channel
4.1 Read/Write in CPU to FPGA Memory

Abbreviations

ASIC Application Specific Integrated Circuit

BDC Bottom Dead Center, position of the piston when it is at the point closest to the crankshaft

BRAM Block Random Access Memory

CAD Crank Angle Degree

CAI Crank Angle Interrupt. Software interrupt in S8

CPU Central Processing Unit

EPS Engine Position Sensor, synonym for flywheel position sensor

ECU Engine Control Unit

FPGA Field Programmable Gate Array

FSM Finite State Machine

GPIO General Purpose Input/Output

JTAG Joint Test Action Group

PCB Printed Circuit Board

S8 Name of Scania engine control unit

SCI Serial Communications Interface

SPI Serial Peripheral Interface Bus

TDC Top Dead Center, position of the piston when it is at the point furthest from the crankshaft

TPU Time Processing Unit

UART Universal Asynchronous Receiver/Transmitter

VHDL VHSIC (Very High Speed Integrated Circuit) Hardware Description Language

Chapter 1

Introduction

This master thesis has been performed at Scania CV AB in Södertälje. It was performed at the department Powertrain Control Systems, which is responsible for low level software for powertrain embedded systems at Scania.

1.1 Background

When new computation power is needed for closed-loop combustion control in Scania engine control units, new hardware such as an FPGA must be integrated in an effective way. One problem that arises when calculations are distributed between different hardware components is synchronization of real-time data against flywheel position. A solution for synchronizing flywheel position, aimed at lab work, is needed, and some of the concepts that are investigated will probably be integrated in future engine control units.

1.2 Purpose

The starting point of this master thesis is to continue the work of a previous master thesis by Henrik Bohlin, which added extra hardware to the engine management system called S8 [3]. In this master thesis, different concepts to synchronize flywheel position shall be proposed and analyzed. The goal is to solve the synchronization problem with existing hardware and also to propose a solution for future control units. The synchronized flywheel position is intended to be used as an input to a closed-loop combustion controller whose control variables are a function of the flywheel position.

1.3 Problem Definition

The problem has been broken down into the following questions and tasks.

1. Analysis of Methods to Synchronize Flywheel Position

a. Which factors affect the choice of method to synchronize flywheel position?

b. Analyze different synchronization concepts. Which results are expected for each concept? How will it affect the current system?

c. Which concept will be implemented to solve the synchronization problem? Why should this concept be implemented?

2. Implementation CPU, TPU and FPGA

a. Create hardware to enable communication between S8 and an FPGA.

b. Which changes are needed in each unit to integrate the chosen concept?

c. Implement the software that is needed.

3. Test and Verification

a. Create methods to test, simulate and verify the functionality of the implementation.

b. Which results have been obtained from the chosen concept?

c. If a second concept is implemented: how do the results from the different concepts compare with respect to accuracy and fault tolerance?

4. Using an FPGA in a Series Produced Unit

a. Has an FPGA been used before with an engine control unit? Into which types of vehicles has it been integrated?

b. Which factors affect the choice of an FPGA in a series produced control unit (environmental resistance requirements, performance etc.)?

c. How is the FPGA going to change the assembly line of trucks? How should it be configured before delivery? How should firmware updates be handled?

1.4 Delimitations

The following points describe the delimitations of this master thesis.

1. S8 functionality shall not be degraded. This means that no timing deadlines shall be missed due to the integration of the FPGA.

2. Changes shall not affect the S8 compatibility with existing software.

1.5 Overview S8 and FPGA

A block diagram of the initial platform with S8 and the FPGA can be seen in Figure 1.1. The flywheel position sensors are connected to S8 and not to the FPGA. The dotted lines indicate that no communication between the units was possible at the start of this master thesis.

Figure 1.1: Block diagram of S8 and FPGA with connected flywheel position sensors. (Diagram labels: S8 containing the CPU and the TPU; FPGA; flywheel with 58 teeth and 2 gaps; two flywheel position sensors.)

1.6 Report Outline

The report outline is described in the list below.

• Background Concepts: this chapter explains main concepts used in this master thesis.

• Analysis of Methods to Synchronize Flywheel Position: possibilities and limitations to synchronize flywheel position are discussed. Different concepts are proposed and compared to each other. In the last section a concept is chosen for implementation.

• System Design and Implementation: the design of the system to synchronize flywheel position is presented. Implementation of hardware and software is described.

• Test and Verification: methodology to verify the implementation is described. Test results are presented and discussed.

• Using an FPGA in a Series Produced Unit: which challenges will an FPGA face if it is used in an engine control unit where high temperatures, temperature variation and vibrations are common? This chapter tries to answer this question. It also describes the considerations that have to be made when integrating an FPGA in an engine control unit.

• Conclusion: a summary of the results. Possible future work is also suggested.

Chapter 2

Background Concepts

The purpose of this chapter is to make the reader familiar with the main concepts of this master thesis. After reading this chapter the reader should understand the basic concepts of a diesel engine and know what kind of embedded technology is integrated in modern engine control units. Closed-loop combustion control is a central concept in this master thesis. Further reading on this topic can be found in a master thesis written by Jonian Grazhdani [10].

2.1 Diesel Engine Basics

2.1.1 Four-stroke process

A four-stroke engine is an internal combustion engine in which a piston completes four different strokes. A cross section of an engine cylinder can be seen in Figure 2.1. The strokes of the four-stroke operating cycle are:

1. Intake: at the start of this stroke the piston moves from the top to the bottom of the cylinder, which reduces the pressure in the cylinder. Air is forced into the cylinder through the open intake port. The stroke ends when the intake valve closes.

2. Compression: the piston travels from BDC (Bottom Dead Center) to TDC (Top Dead Center) with both the intake and the exhaust valve closed. The cylinder pressure and temperature increase when the air is compressed.

3. Expansion/power: when the piston is close to TDC, fuel is injected into the combustion chamber (applies to direct injection engines), where it mixes with air. The fuel starts to evaporate and self-ignites, which forces the piston away from TDC. When the piston travels from TDC to BDC and the combustion has ended, the volume of the cylinder increases, which leads to lower temperature and pressure.

4. Exhaust: as the piston approaches BDC the exhaust valve is opened. When the piston moves from BDC to TDC the exhaust gases are pushed out through the exhaust valve.

Figure 2.1: Cross section of a cylinder in a combustion engine. Figure from [10], used with permission.

2.1.2 Flywheel Position Sensor

The position and rotational speed of the flywheel are measured by two flywheel position sensors mounted close to the flywheel (the placement depends on the number of cylinders). An example of the placement can be seen in Figure 2.2. The figure also shows that the flywheel has 58 teeth and 2 gaps. The teeth are used to measure the rotational speed. The gaps provide information about the flywheel position. The wave signal generated by the flywheel sensors can be seen in Figure 2.3. This signal is transformed into a pulse train before it is connected to the processor in S8.

Figure 2.2: Flywheel teeth pattern and sensor positions for a 6-cylinder engine

Figure 2.3: Transformation of flywheel position sensor signal to a pulse train signal (the wave from the flywheel position sensor is converted to a pulse signal to the CPU)

Two flywheel position sensors are used in parallel, but only one of them is considered the master sensor (its value is the reference position). The other acts as a backup sensor if the master sensor breaks down. The master sensor is determined at startup: the sensor that can first decide its position becomes the master. A synonym that is often used for a flywheel position sensor is EPS (Engine Position Sensor). Both terms will be used in this master thesis.

2.1.3 CAD - Crank Angle Degree

It can be observed from the description in section 2.1.1 that the crankshaft performs two revolutions for every operating cycle. The duration of a cycle can be expressed in Crank Angle Degrees (CAD). CAD angle 0 is defined as the position where the piston is at TDC between the compression and expansion strokes. The operating cycle of a four-stroke engine takes 720 CAD, because of the two revolutions per operating cycle. CAD is used as a reference for several control parameters in an engine, for example ignition and fuel injection timing. If the perception of the CAD is inaccurate it could lead to misfiring, motor vibration or backfire. When the CAD has reached specific angles a software interrupt is generated in the TPU. These interrupts are triggered so that fuel injection timing can be calculated for every cylinder. They are called Crank Angle Interrupts (CAI), and this term will be used frequently in this master thesis.
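The 720 CAD cycle described above can be illustrated with a minimal sketch. It is not taken from the S8 software. It assumes that angles are stored as integers in units of 0.1 CAD and that each stroke spans an idealized 180 CAD; the names `cad_wrap_tenths`, `cad_stroke` and `stroke_t` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: angles are stored as integers in units of 0.1 CAD. */
#define CAD_TENTHS_PER_CYCLE  7200 /* 720 CAD = two crankshaft revolutions */
#define CAD_TENTHS_PER_STROKE 1800 /* idealized: each stroke spans 180 CAD */

/* CAD 0 is TDC between compression and expansion, so expansion comes first. */
typedef enum {
    STROKE_EXPANSION,
    STROKE_EXHAUST,
    STROKE_INTAKE,
    STROKE_COMPRESSION
} stroke_t;

/* Wrap any angle, including negative ones, into the range [0, 7200). */
int32_t cad_wrap_tenths(int32_t cad_tenths)
{
    int32_t r = cad_tenths % CAD_TENTHS_PER_CYCLE; /* sign follows the dividend */
    if (r < 0) {
        r += CAD_TENTHS_PER_CYCLE;
    }
    return r;
}

/* Return the (idealized) stroke that a given crank angle falls into. */
stroke_t cad_stroke(int32_t cad_tenths)
{
    int32_t wrapped = cad_wrap_tenths(cad_tenths);
    assert(wrapped >= 0 && wrapped < CAD_TENTHS_PER_CYCLE);
    return (stroke_t)(wrapped / CAD_TENTHS_PER_STROKE);
}
```

For example, `cad_stroke(3600)` (360 CAD) falls in the intake stroke under these assumptions, while an angle just below 0 CAD wraps to the end of the compression stroke.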

2.2 Scania S8

2.2.1 ECU, EMS and S8

An Electronic Control Unit (ECU) is an embedded system which controls one or more electrical systems. An ECU controlling the engine in a vehicle is called an EMS (Engine Management System). The most sophisticated EMS in current Scania trucks is called S8. It uses input from sensors and calculates control and timing parameters, for example when fuel injection should take place. The terms ECU and EMS are used synonymously with S8 in this master thesis.

2.2.2 S8 Architecture

S8 consists of three major building blocks, shown in Figure 2.4. The central part is the CPU, which controls the behavior of S8. The TPU (Time Processing Unit) is a co-processor placed on the same chip as the CPU. It handles complex timing, actuators and sensor inputs. An ASIC is also integrated in S8 and is responsible for handling the fuel injectors. The ASIC is not used in this master thesis.

2.2.3 S8 CPU

The CPU used in S8 belongs to a family of CPU cores built on the Power Architecture™ embedded category. Its instructions are compatible with the PowerPC™ user instruction set architecture (UISA). It is clocked at 128 MHz and has a 32 KB unified level 1 cache, which is 4- or 8-way set associative. The cache can be configured with either a write-back or a write-through policy.

Figure 2.4: Block diagram of S8 architecture

2.2.4 TPU - Time Processing Unit

A TPU is a semi-autonomous co-processor designed for timing control. It can execute instructions, react to inputs, perform PWM generation and access memories without host (CPU) intervention. The TPU has two execution engines, each consisting of 32 independent timer channels. Each engine has two 24-bit free-running counters, which provide a reference for capture and match events. A TPU channel consists of an input and an output signal pair. Channels can be programmed to react (generate interrupts) to four events: two capture events and two match events. A capture event reacts to either a rising or a falling edge on the input pin. A match event reacts when the free-running counter has reached a certain value.

Communication between TPU and CPU can be performed in four different ways: HostServiceRequest, ChannelInterruptRequest, function parameters and Share Code Memory. A HostServiceRequest is issued when the CPU requests an interrupt in the TPU. It also passes function parameters, which are readable and writable from both the CPU and the TPU. When the TPU wants to request an interrupt in the CPU it creates a ChannelInterruptRequest, which generates an interrupt request to the CPU. A Share Code Memory is available to store global variables. These are readable and writable by both the CPU and the TPU. Communication between channels is performed by setting up links. These links can be compared to general software interrupts.

The source code for a TPU channel consists of a single function, which consists of several if-statements. These if-statements represent different threads that are executed when an event has occurred. A thread cannot be preempted during its execution. An example of the basic code structure of a TPU program can be seen in Example 2.1.

Example 2.1: Code structure of a TPU channel

    /* Every channel has a function associated with it.
       arg1 and arg2 are function parameters. */
    void ChannelXFunction(int24 arg1, uint8_t arg2)
    {
        /* HostServiceRequest from CPU->TPU */
        if (IsHostServiceRequestEvent(REQUEST))
        {
            // Execute code
        }
        /* Event thread, match or transition has occurred */
        else if (IsMatchAOrTransitionBEvent())
        {
            // Execute code
        }
        /* Link thread, another channel has created a link to this channel */
        else if (IsLinkServiceRequestEvent())
        {
            // Execute code
        }
    }
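Because the free-running counters mentioned above are only 24 bits wide, time differences between capture events have to tolerate counter wraparound. The following is a minimal sketch in plain C of how such arithmetic can be done on a host. It is not TPU code and does not describe the actual TPU match hardware; the names `counter_elapsed` and `counter_has_reached` and the half-range comparison are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A 24-bit free-running counter wraps from 0xFFFFFF back to 0. */
#define COUNTER_MASK 0xFFFFFFu
#define COUNTER_HALF 0x800000u /* half of the 24-bit range */

/* Ticks elapsed from 'earlier' to 'later', correct across one wraparound. */
uint32_t counter_elapsed(uint32_t earlier, uint32_t later)
{
    assert(earlier <= COUNTER_MASK && later <= COUNTER_MASK);
    /* Unsigned subtraction wraps modulo 2^32; masking reduces it modulo 2^24. */
    return (later - earlier) & COUNTER_MASK;
}

/* Assumed convention: 'target' counts as reached if it lies at most half a
   counter range behind 'now'. Real hardware may use a different rule. */
bool counter_has_reached(uint32_t now, uint32_t target)
{
    return counter_elapsed(target, now) < COUNTER_HALF;
}
```

With two captured values `0xFFFFF0` and `0x000010`, `counter_elapsed` returns `0x20` even though the raw counter value decreased between the two captures.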

2.3 FPGA

2.3.1 Overview FPGA

An FPGA (Field Programmable Gate Array) is, in short, an integrated circuit with reconfigurable hardware. It can be seen as an array of programmable logic blocks that can be connected to each other to form digital circuits. This structure makes it possible to use the FPGA as a parallel machine, whereas a general microprocessor does not offer the same level of parallelism. Figure 2.5 shows the basic structure of an FPGA. Storage of data and multiplications are generally expensive in terms of logic block area. That is why modern FPGAs are provided with Block RAMs (BRAM) and hardware multipliers, which have dedicated area on the FPGA chip for these operations.

Figure 2.5: Basic structure of an FPGA

2.3.2 FPGA Technologies

FPGAs can be divided into different process technology types. The main difference between them is the reconfigurability and if they are volatile (need power to keep memory content) or non-volatile (no power needed to keep memory content). The most common technologies are listed below.

• SRAM: a volatile technology, which has to be programmed at startup by an external device. This is the technology with the smallest transistor size and the highest possible performance. The technology is flexible: it is both reprogrammable and in-system programmable.

• Flash: a non-volatile technology that is live at startup, which means that it does not need to be programmed at startup. It is reprogrammable and in some devices it is in-system programmable.

• Antifuse: a one-time programmable technology. It is live at startup and does not need any external configuration. It has the benefit of better radiation tolerance, low power consumption and high reliability compared to the other technology types.

Chapter 3

Analysis of Methods to Synchronize Flywheel Position

This chapter describes the process of selecting an implementation method to synchronize flywheel position. System requirements and available software and hardware resources are presented. Different implementation concepts are proposed, analyzed and compared to each other. The last part describes and motivates the selected implementation.

3.1 Implementation Factors

To determine important factors for synchronizing flywheel position, a system requirements specification was created by the author and the industrial supervisors. The following factors were identified as important when deciding which concept to use:

• Timing Requirements

• Degradation of CPU and TPU

• Test and Verification

• Flexible Design

• Hardware Cost

3.1.1 Timing Requirements

A requirement from Scania was that the flywheel position should deviate by at most 0.1 CAD between S8 and the FPGA. This requirement is illustrated in Figure 3.1.

Figure 3.1: Timing requirement for flywheel position between S8 and FPGA

The time period between two teeth can be calculated. A flywheel has 58 teeth and two tooth gaps, which gives a total of 60 positions. RPM is the number of revolutions per minute and RPS is the number of revolutions per second. The period for one revolution, $T_{rev}$, and the period for one tooth, $T_{tooth}$, follow as

\[
RPS = \frac{RPM}{60} \;\Rightarrow\; T_{rev} = \frac{1}{RPM/60} = \frac{60}{RPM} \;\Rightarrow\; T_{tooth} = \frac{60/RPM}{60} = \frac{1}{RPM} \tag{3.1}
\]

There are 6 CAD between two adjacent teeth, so 0.1 CAD corresponds to 1/60 of a tooth period. Combined with (3.1), this gives the time requirement $t_{req}$ as

\[
t_{req} < \frac{T_{tooth}}{60} = \frac{1}{60 \cdot RPM} \tag{3.2}
\]

The requirement applies when the engine is running at 500-2500 RPM, but there are cases when the engine speed reaches 3000 RPM. As equation 3.2 shows, the requirement becomes harder to fulfill as the RPM increases. A hard timing requirement $t_{max}$ can be calculated as

\[
t_{max} < \frac{1}{60 \cdot 3000}\,\mathrm{s} \approx 5.556\,\mu\mathrm{s} \tag{3.3}
\]

For simplicity, $t_{max}$ is set to 5 µs.
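The timing requirement derived above can be evaluated for any engine speed with a minimal sketch. It only restates the arithmetic of this section (60 tooth positions, 6 CAD per tooth, a maximum deviation of 0.1 CAD); the function and macro names are hypothetical.

```c
#include <assert.h>

#define TOOTH_POSITIONS   60.0 /* 58 teeth + 2 gaps */
#define CAD_PER_TOOTH     6.0  /* 360 CAD / 60 positions */
#define MAX_CAD_DEVIATION 0.1  /* requirement from Scania */

/* Period between two tooth positions in seconds, i.e. 1/RPM. */
double tooth_period_s(double rpm)
{
    assert(rpm > 0.0);
    return 60.0 / (rpm * TOOTH_POSITIONS);
}

/* Upper bound on the synchronization delay in seconds, i.e. 1/(60*RPM). */
double sync_deadline_s(double rpm)
{
    return tooth_period_s(rpm) * (MAX_CAD_DEVIATION / CAD_PER_TOOTH);
}
```

Calling `sync_deadline_s(3000.0)` gives roughly 5.56 µs, and `sync_deadline_s(500.0)` gives roughly 33.3 µs, which shows how much the requirement tightens over the operating range.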


3.1.2 Degradation of CPU and TPU

A hard requirement in this master thesis is that functionality in S8 shall not be degraded. In the CPU, the added overhead must be small enough that time-critical tasks do not miss their deadlines. In the TPU, it has to be ensured that the utilization of new functions is low, otherwise they will block threads which have critical timing requirements.

3.1.3 Test and Verification

It shall be possible and easy to debug, test and verify the implementation. This means that signals connected between S8 and the FPGA shall be possible to measure on an oscilloscope. This has to be considered when designing the hardware. A modular design approach is desired to make it easier to detect where a fault is present.

3.1.4 Flexible Design

Not all possibilities that the FPGA platform will provide are known yet. Therefore the design has to be flexible, so that messages and modules can be configured easily. It must also be possible to exchange information between S8 and the FPGA in both directions.

3.1.5 Hardware Cost

The FPGA platform is going to be used for lab purposes. That is why there are no restrictions on how many IOs are used, provided that they are available. According to the industrial supervisors, using more external IOs would not increase the cost if an FPGA were integrated in a new engine control unit. However, there is a high cost if IOs are added after a new engine control unit has been developed. The logic area used in the FPGA for synchronizing flywheel position should be kept low, because area has to be available for closed-loop combustion control, which is expensive in terms of logic area.

3.2 Engine Speed Estimation

Knowledge of how engine speed is determined is vital for developing a well-designed synchronization concept. In today’s engine control units the functionality is divided between the TPU and the CPU. The TPU is responsible for sampling the flywheel position sensors, calculating the current engine position and determining the flywheel direction. The actual engine speed calculation is performed in the CPU, and the result is used for calculating the timing of fuel injections. It is desirable to place all time-critical communication with the FPGA in the TPU, because it takes a longer and nondeterministic time for the CPU to obtain up-to-date sensor values.

3.3 Communication

This section describes which messages have to be sent between S8 and the FPGA. It also describes how communication between the CPU, the TPU and the FPGA can be performed.

3.3.1 Required Messages

To design a synchronization concept you have to know which types of in-formation that has to be sent between S8 and the FPGA. The system requirements specification that was developed with the industrial supervi-sors states that following types of messages and information must be able to send between S8 and FPGA.

• Master flywheel sensor (S8→FPGA) • A phase shift of 360◦ (S8→FPGA)

• Engine type: 5, 6 or 8 cylinders (S8→FPGA) • Reset of the FPGA (S8→FPGA)

• Synchronization of the CAD value

• Inform each other that synchronization has been lost

The only time-critical message is synchronization of the CAD value. The requirements for this message are given in section 3.1.1.

3.3.2 CPU and TPU Communication

A detailed description of CPU and TPU communication can be read in section 2.2.4. It is important to note that communication between CPU and TPU is not instant. When a HostServiceRequest is sent from CPU to TPU it takes some time to start a thread, depending on the utilization of the TPU. When the TPU invokes a ChannelInterruptRequest, an interrupt request is sent to the CPU. There is no guarantee that the interrupt will be served instantly. It could take more than 100 µs to start the interrupt according to Scania personnel.


3.3.3 CPU and FPGA Communication

Several available methods to implement a communication protocol between the CPU and FPGA exist. Possible interfaces can be seen in the list below:

• GPIOs (General Purpose Input/Output)

• SPI (Serial Peripheral Interface)

• SCI (Serial Communications Interface)

• Memory Mapping

There are many available GPIOs on the CPU, which makes it possible to use a parallel interface with the FPGA. However, the time to read and write GPIOs is not deterministic, so GPIOs cannot be used to send time-critical synchronization messages. The CPU supports four SPI channels, but all of them are either occupied or unavailable in S8 and can therefore not be used. Another serial interface is SCI, which provides a UART (Universal Asynchronous Receiver/Transmitter) mode. According to the datasheet [20] for the CPU, 8 or 9 bits can be sent in each transmission. The baud rate can be calculated in the following way [20]:

$$\text{baud rate} = \frac{f_{sys}}{16 \cdot BR} \tag{3.4}$$

BR is part of a control register and can be set between 1 and 8191. $f_{sys}$ is the system clock frequency in the CPU. With this formula the maximum baud rate can be calculated:

$$\text{baud rate} = \frac{128 \cdot 10^{6}}{16 \cdot 1} = 8 \cdot 10^{6}\ \text{bits/s} \tag{3.5}$$

The possible baud rate is impressive, but the synchronization time requirements are based on latency, not throughput. It takes time for the CPU to set up the data transfer, so this interface cannot be used for synchronizing the CAD value. It can, however, be used to send messages that are not time critical.
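The distinction between throughput and latency can be illustrated with a short calculation. The sketch below evaluates equation 3.4 and the time one frame spends on the wire. The function names and the 11-bit frame (1 start, 8 data, 1 parity and 1 stop bit) are assumptions made for illustration and are not taken from the datasheet.

```c
#include <stdint.h>

/* Equation 3.4: baud rate = f_sys / (16 * BR), with BR in 1..8191.
 * Returns 0 for an out-of-range BR (illustrative error handling). */
uint32_t sci_baud_rate(uint32_t f_sys_hz, uint32_t br)
{
    if (br < 1u || br > 8191u) {
        return 0u;
    }
    return f_sys_hz / (16u * br);
}

/* Time in nanoseconds needed to shift out one frame of frame_bits bits. */
uint32_t sci_frame_time_ns(uint32_t baud, uint32_t frame_bits)
{
    if (baud == 0u) {
        return 0u;
    }
    return (uint32_t)(((uint64_t)frame_bits * 1000000000ull) / baud);
}
```

With a 128 MHz system clock and BR = 1, an assumed 11-bit frame occupies the line for roughly 1.4 µs. The nondeterministic part is the CPU setup time before the transfer starts, which this calculation does not capture.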

Another method to interface the CPU and the FPGA is to use a memory mapped interface. This was suggested in a prior master thesis [3]. The concept is that information is written from the CPU to a memory in the FPGA, through which the CPU and the FPGA can exchange data via a shared memory structure. This method suffers from the same problem as the GPIOs and the UART: it is not able to synchronize the CAD value. There could, for example, be an interrupt in the CPU that blocks the memory access for longer than tmax.


3.3.4 TPU and FPGA Communication

There is only hardware support for GPIOs between the TPU and the FPGA, but the manufacturer [22] of the TPU has provided software support for UART and SPI communication. Some GPIOs and a few outputs from the TPU are available. This makes it possible to have a parallel interface with the FPGA. A simulation was performed in a TPU simulator provided by AshWare [25] to determine whether GPIOs could handle the synchronization requirements of the CAD value. A handshake protocol [4] was used to send information to and from the FPGA, where the response from the FPGA was approximated. The simulation tested how long it takes to set seven output pins. Figure 3.2 shows that there is a delay when sending data from the TPU. This delay was measured to be slightly less than 5 µs.

Figure 3.2: Simulation when sending 7 data bits with a handshake protocol

Note that the simulation was performed when no other channels were active. This is the best-case scenario. The delay would increase in a real environment where several channels could be active at the same time. Figure 3.2 also shows that it takes longer to send more bits. Simulations have shown that the time is proportional to the number of output pins.

One feature of the channels in the TPU is that they do not capture an input event when a thread is active. Instead, the event is detected instantly and a new thread is created and scheduled for execution. Timestamps of input events are saved in registers and can be accessed when the thread is active. This makes it possible to save the timestamp when the TPU detects a positive edge (tooth found) from a flywheel position sensor, and also the time when the FPGA sends information that it has detected a tooth. These timestamps can be compared to see if the units are synchronized.

The UART communication between the TPU and the FPGA has also been simulated. The simulations tested how long it takes to transmit a message with 8 bits and 1 parity bit. Two start bits are used in the implementation provided by the manufacturer. The results can be seen in Figure 3.3. It was possible to achieve a period length of 1 µs.


Figure 3.3: Simulation when sending the byte 0xAA via UART

The same simulation was also run while other threads were working in parallel (Figure 3.4). The result shows an irregular period time between pulses. This means that messages could be wrongly interpreted.

Figure 3.4: Stress simulation when sending the byte 0xAA via UART

The achievable baud rate is low and depends on TPU utilization. It will not be possible to send timing messages via UART from the TPU. Another disadvantage of this implementation is that it utilizes the TPU substantially and would degrade the performance of the existing system. An SPI implementation has not been tested, because it has similar performance to the UART, as described in [9] and [18]. A benefit of using SPI compared to the UART should be that it will not send false messages. The reason for this is that the transmission of data is clocked by the TPU, so the FPGA will always receive correct data.

3.4 Implementation Alternatives

In this section three different concepts to synchronize flywheel position are presented and analyzed. More concepts were developed, but those were either similar to the ones presented in this section or did not meet the synchronization timing requirements. Note that the proposed FPGA design in each concept is only a template of the functionality necessary to synchronize flywheel position. The partitioning of functionality may change during the implementation.

3.4.1 Synchronization Method

Synchronization is performed the same way in all presented concepts. The solution compares timestamps when the TPU detects a positive edge and when the TPU detects a synchronization message sent from the FPGA. If these timestamps are within tmax, S8 and the FPGA are synchronized.

This algorithm is described in more detail in the list below:

1. Detect Positive Edge: The TPU and the FPGA detect a positive edge on the flywheel position sensor (it is assumed that they detect it simultaneously; the possible time difference is negligible). The TPU saves the timestamp when the positive edge occurs. It saves it in a register if there is a synchronization point.

2. Send Timestamp: The FPGA checks if the current tooth is a synchronization point. If that is the case, it sends information to the TPU.

3. Detect Message: The TPU detects a message that has been sent from the FPGA. It saves a timestamp of the event in a register.

4. Compare Timestamps: The timestamp registers are compared. The units are synchronized if the difference between the timestamps is less than tmax.
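The comparison in step 4 can be sketched as follows. This is a minimal illustration, not the implemented code: the 24-bit width of the free-running counter, the function names and the tick unit of tmax are assumptions. The masked subtraction keeps the result correct when the counter wraps between the two timestamps.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: both timestamps come from the same free-running 24-bit counter. */
#define TIMER_MASK 0x00FFFFFFu

/* Ticks elapsed from the edge timestamp to the message timestamp,
 * correct across a single counter wrap. */
uint32_t ticks_between(uint32_t edge, uint32_t msg)
{
    return (msg - edge) & TIMER_MASK;
}

/* Step 4: the units are synchronized if the difference is less than tmax. */
bool is_synchronized(uint32_t edge, uint32_t msg, uint32_t tmax_ticks)
{
    return ticks_between(edge, msg) < tmax_ticks;
}
```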

A handshake protocol will be used between the TPU and the FPGA to exchange information, which is a standard way to communicate between heterogeneous systems. The TPU and FPGA interface can be seen in Figure 3.5. When the FPGA has a message to send, it asserts the req signal and the information on the data lines is sampled by the TPU. When the TPU has received the message, it asserts the ack signal. The CAD value is going to be synchronized at every crank angle interrupt. The reason for this is to keep utilization in the TPU low (a maximum of four interrupts per revolution). The main benefit of this synchronization method is that it does not matter how high the TPU utilization is; it will always be possible to detect whether the units are synchronized.
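The req/ack exchange described above can be modelled in a few lines of C. This is a behavioural sketch only. The structure, the function names, the four-bit data width and the rule that the FPGA releases req once ack is seen are assumptions; the text only states that the FPGA asserts req and that the TPU answers with ack.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool    req;   /* driven by the FPGA */
    bool    ack;   /* driven by the TPU */
    uint8_t data;  /* data lines, driven by the FPGA */
} link_t;

/* FPGA side: put the message on the data lines, then assert req. */
void fpga_send(link_t *l, uint8_t msg)
{
    l->data = msg & 0x0Fu;
    l->req  = true;
}

/* FPGA side: release req once the TPU has acknowledged (assumed rule). */
void fpga_step(link_t *l)
{
    if (l->req && l->ack) {
        l->req = false;
    }
}

/* TPU side: ack follows req. Returns true when a new message was sampled. */
bool tpu_step(link_t *l, uint8_t *msg_out)
{
    if (l->req && !l->ack) {
        *msg_out = l->data;
        l->ack = true;
        return true;
    }
    if (!l->req && l->ack) {
        l->ack = false;
    }
    return false;
}
```

Setting the data lines before req in `fpga_send` mirrors the ordering constraint that the data must be stable when the receiver samples it.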

An easier way to solve the synchronization problem would be to use a parallel interface with just a few pins to send the CAD value from the TPU to the FPGA. The reason why this was not chosen is that the transfer time cannot be guaranteed. Even if just one pin is toggled when a new tooth has been detected, it cannot be guaranteed that it is sent from the TPU to the FPGA within tmax. The reason for this is that the utilization could be high and block the event for longer than tmax.

3.4.2 Concept 1 - UART Interface

The architecture of the UART concept can be seen in Figure 3.5. A tx line is a serial line where messages are sent from TPU to FPGA. Messages originate from the CPU where all synchronization logic is placed and the TPU is only used for forwarding messages. The flywheel position sensors


Figure 3.5: Architecture of concept 1 with UART interface

The FPGA will consist of four modules: a UART Receive, a Message Decoder, a TPU Interface and a CAD module. The UART Receive module detects new messages and forwards them to the Message Decoder module, which interprets the message. The CAD module handles the synchronization logic in the FPGA. It receives interpreted messages from the Message Decoder module and reacts to them, for example by setting the master sensor. It also informs the TPU Interface module when it shall send a message that a synchronization point has occurred and when the CAD module is synchronized.

The benefits of this concept are the low CPU utilization combined with the flexibility to create larger data packets if more messages are added. The startup sequence and synchronization for the FPGA are performed in the CPU. This is preferred because it is hard to write complex control logic in the TPU due to its event-based structure.

A drawback of this concept is that the UART software degrades the performance of the TPU significantly; a higher baud rate leads to higher utilization. Startup synchronization will also be complex because of the nondeterministic behavior of messages sent from the CPU. It is not possible to send a message stating which tooth is active, but information about the phase of the flywheel can be sent. An alternative could be to use the UART in the CPU to communicate with the FPGA, especially if TPU functionality is degraded.

3.4.3 Concept 2 - Memory Mapped Interface

The following concept uses some of Bohlin's suggestions [3] on how the S8 and the FPGA should be connected. This concept uses a memory mapped approach to communicate between S8 and the FPGA. The FPGA consists of five modules. A Message Decoder module snoops on the address and control bus to determine when a message is received. A Memory Controller handles arbitration of the memory so that reads and writes are performed correctly. A CAD module handles the synchronization in the FPGA. It receives decoded messages from the Message Decoder module and reacts to them. The CAD module informs the TPU Interface when it is time to send synchronization messages to the TPU.

Figure 3.6: Architecture of concept 2 with memory mapped interface

There are several advantages with this concept. It is possible to send several different types of messages and data; a limiting factor is the size of the memory in the FPGA. It is also easier to put synchronization logic on the CPU, because programming is easier than on a TPU. On the negative side, this concept also suffers from complex startup synchronization, as described for concept 1 in section 3.4.2. Another negative aspect is that configuring the memory controller in the CPU could be time-consuming work.

3.4.4 Concept 3 - Dual Handshake Interface

The dual handshake concept differs from concepts 1 and 2. The architecture, shown in Figure 3.7, uses only the TPU and the FPGA to synchronize the CAD value. It uses a dual handshake protocol to communicate between the TPU and the FPGA. The FPGA consists of four modules: a TPU Interface and a Receive module, which handle communication to and from the FPGA; a Message Decoder module, which decodes received messages; and a CAD module, which handles synchronization in the FPGA. A main difference compared to the other concepts is that the CPU is not involved in the synchronization process. This concept also allows an easier startup synchronization process, because it would be possible to synchronize at every tooth on the flywheel due to the lower latency of messages sent from the TPU to the FPGA. This method would also be the simplest to interface with the FPGA.

Figure 3.7: Architecture of concept 3 with a dual handshake interface

There are two main drawbacks with this method. The first is increased utilization in the TPU, where there is a great risk that other channels will be blocked. The second is that the TPU has only a few available channels, which limits the number of messages that can be sent. This can be extended by sending data multiple times, but that will also increase the utilization in the TPU. Control logic is also hard to program in a TPU because of its event-based structure.

3.5 Concept Selection

If the concepts are analyzed according to the implementation factors described in section 3.1, the choice is rather simple. Concept 2 fulfills the latency requirements of the CAD value and will not degrade the performance of the S8 significantly. It is also by far the most flexible of the concepts, because it can communicate from the FPGA either by memory mapping or by the handshake communication protocol. One drawback compared to the other concepts is that it is harder to test a memory mapped interface between the CPU and the FPGA. A software simulator cannot be used as in concepts 1 and 3, where the TPU and FPGA communication can be simulated. Another drawback is that estimation of the flywheel position will be harder compared to concept 3. This would increase the complexity and area of the FPGA design.


Chapter 4

System Design and Implementation

This chapter describes how the design selected in chapter 3 was implemented. It describes the process of constructing the hardware required to connect S8 with an FPGA and how the functionality in the TPU, the CPU and the FPGA was implemented. There has been no time to implement a second concept, as described in section 1.3.

4.1 PCB Design

4.1.1 PCB-Adapter

To enable communication between the S8 and the FPGA, a PCB-adapter was designed to connect a subset of the S8 pins to the FPGA. A pin connector was included in the design to make it possible to monitor and debug communication between the units. The PCB-adapter was designed in KiCAD [14] and produced by an external company.

Three revisions were required to get a working prototype. Bad soldering technique combined with unconnected signals in the design made it necessary to cancel the first revision. In the second revision it was detected that some signals connected to the FPGA were using a 5V logic level, and the FPGA is not able to handle those voltage levels. A voltage divider was therefore manufactured at Scania to handle this issue (see section 4.1.2). When the PCB-adapter had been soldered, it was detected that one of the connectors was rotated in the layout. A consequence of this was that some of the data bus signals became unavailable. This meant that a third and final PCB-adapter had to be designed. The final PCB-adapter can be seen in Figure 4.1. Figure 4.2 shows the S8 connected with the FPGA via the PCB-adapter.

Figure 4.1: Top and bottom view of the PCB-adapter


4.1.2 PCB Voltage Divider

During manufacturing of the second revision of the PCB-adapter it was detected that the TPU IOs were using a 5V logic level. This is problematic because the FPGA cannot handle those voltage levels [26]. To handle this issue an additional PCB was manufactured at Scania, which divided the voltage from 5V to 3.3V. This is a voltage level that the FPGA can operate within. A picture of the voltage divider can be seen in Figure 4.3. This design was later placed on the PCB-adapter in the third revision to integrate the entire design into just one PCB.
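The required division ratio follows from the standard resistive divider relation. The text does not state how the divider was built, so the two-resistor topology and the resistor values below are assumptions used only for illustration.

```latex
\[
V_{out} = V_{in}\,\frac{R_2}{R_1 + R_2},
\qquad
\frac{R_2}{R_1 + R_2} = \frac{3.3\ \mathrm{V}}{5\ \mathrm{V}} = 0.66
\]
\[
R_1 = 1.7\ \mathrm{k\Omega},\; R_2 = 3.3\ \mathrm{k\Omega}
\;\Rightarrow\;
V_{out} = 5\ \mathrm{V}\cdot\frac{3.3}{1.7 + 3.3} = 3.3\ \mathrm{V}
\]
```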

Figure 4.3: Layout and picture of the designed voltage divider

4.2 Synchronization of CAD

To determine what functionality should be placed in each unit and to get an overview of the communication during synchronization, a sequence diagram was made, shown in Figure 4.4. The first synchronization step is the startup phase, where the CPU sends information to the FPGA about engine type, master sensor and current revolution. The timing of when to send the current revolution is important, so that this does not take place at the end of a revolution. When the FPGA has found synch, it informs the CPU that synchronization at every crank angle interrupt is possible. When this phase is over, the CPU checks at every crank angle interrupt that the timestamps of S8 and the FPGA differ by at most tmax. If the time

[Sequence diagram between :CPU, :TPU and :FPGA. Messages shown: Start/Soft Reset, Send Engine Parameters, Send Master Sensor, Send Current Revolution, Synch Found, FPGA Synch Found, Send Position, Store TPU Timestamp, Store FPGA Timestamp.]

Figure 4.4: Sequence diagram of CAD synchronization

4.3 TPU Design

4.3.1 Pin and Channel Selection

Messages sent from the FPGA are handled by a request channel. To be able to confirm that a message has been received, an acknowledgement channel was implemented. There are also data channels, each of which is responsible for one bit of the messages sent from the FPGA to the TPU. Four data channels were chosen to handle messages from the FPGA. This makes it possible to decode 16 types of messages. Eight messages are required for sending which crank angle interrupt has occurred (the maximum number of cylinders is 8). Two messages are also necessary to report that the CAD value for a flywheel position sensor on the FPGA has lost synch. If more than 16 messages are required, the number could be extended to 256 by using more of the available data channels. All implemented channels were placed on TPU engine B. The reason for this is that all GPIO channels on engine A are occupied.
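How four data bits cover the ten required messages can be sketched as below. The text does not give the actual code assignment, so the mapping used here (codes 0 to 7 for the crank angle interrupt index, 8 and 9 for lost synch on sensor 1 and 2, the rest reserved) and all type and function names are hypothetical.

```c
#include <stdint.h>

typedef enum {
    MSG_CAI,        /* crank angle interrupt, index in 'value' */
    MSG_SYNC_LOST,  /* lost synch, sensor number (1 or 2) in 'value' */
    MSG_RESERVED    /* unused code, raw code in 'value' */
} msg_kind_t;

typedef struct {
    msg_kind_t kind;
    uint8_t    value;
} msg_t;

/* Pack the four sampled data pins (d0 = data channel 0) into a 4-bit code. */
uint8_t pins_to_code(int d3, int d2, int d1, int d0)
{
    return (uint8_t)(((d3 & 1) << 3) | ((d2 & 1) << 2) |
                     ((d1 & 1) << 1) | (d0 & 1));
}

/* Hypothetical code assignment: 0..7 = CAI index, 8..9 = lost synch. */
msg_t decode_code(uint8_t code)
{
    msg_t m;
    code &= 0x0Fu;
    if (code < 8u) {
        m.kind = MSG_CAI;
        m.value = code;
    } else if (code < 10u) {
        m.kind = MSG_SYNC_LOST;
        m.value = (uint8_t)(code - 7u);
    } else {
        m.kind = MSG_RESERVED;
        m.value = code;
    }
    return m;
}
```

With this assignment six of the sixteen codes remain free for future messages.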


4.3.2 Functional Description

The intent when designing the TPU code was to make the TPU a unit that only forwards messages. The reason for this is that analysis of the CAD value is to be made in the CPU, where it is easier to control the synchronization logic. The functionality that has been added to the TPU can be seen in Figure 4.5. CHANNEL_REQUEST handles the request signal from the FPGA and copies the PinState variable to the LastPinState variable when req has been asserted. When req is asserted the LastTimeStampFPGA register is also written. The four data channels react to changes on the data wires and write these to the PinState variable. The behavior of CHANNEL_ACK depends on the WaitState variable, which describes which state CHANNEL_REQUEST is in. If WaitState indicates that the req signal is high, ack is asserted, and if it is low, ack is deasserted.

CHANNEL_EPS1 and CHANNEL_EPS2 are activated when a positive edge has been detected on the eps signals. These channels are placed on TPU engine A, while the implemented channels are placed on TPU engine B. This was an issue because the TPU engines have different counters, which makes it impossible to compare timestamps. This was solved by adding two additional channels, CHANNEL_EPS_SAVE1 and CHANNEL_EPS_SAVE2. These were placed on TPU engine B, and a link thread is created from CHANNEL_EPS1 or CHANNEL_EPS2 whenever a positive edge on eps1 or eps2 is detected. This made it possible to sample timestamps and write them to the LastTimeStampEPS1 and LastTimeStampEPS2 registers. The additional linking time will increase the time difference between timestamps, and there is also no guarantee of when links will be executed, due to the event-based structure of the TPU. But if the difference between the timestamps is less than tmax, it is still guaranteed that the flywheel position is synchronized.

One thing that has to be considered is that the request signal has to be asserted after the data channels have handled a new message. If this is not the case, the data channels will be sampled before all of them have been updated. One solution to this could be to set the data signals before the request signal is set. To minimize the wait time for the request signal, Gray coding could be used, where only one data bit changes for every successive value. This would lead to fewer events generated in the TPU compared with a counter-based approach, where several bits can change when incrementing the counter. However, there is no guarantee that an event will be served within tmax. The solution that has been implemented prepares the output [...] every crank angle interrupt is at minimum a few milliseconds, which gives the data channels enough time to sample the signals before the crank angle interrupt occurs.

Figure 4.5: TPU channel accesses of Share Code Memory
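The effect of Gray coding on the number of toggling data lines can be checked with a small sketch. The conversion is the standard reflected binary Gray code; the helper names and the four-line width are illustrative assumptions, and this is not the implemented solution.

```c
#include <stdint.h>

/* Convert a binary value to its reflected Gray code. */
uint8_t to_gray(uint8_t n)
{
    return (uint8_t)(n ^ (n >> 1));
}

/* Number of the four data lines that toggle when going from code a to b. */
int lines_toggled(uint8_t a, uint8_t b)
{
    uint8_t diff = (uint8_t)((a ^ b) & 0x0Fu);
    int count = 0;
    while (diff != 0u) {
        count += diff & 1u;
        diff >>= 1;
    }
    return count;
}
```

Stepping a plain binary counter from 7 to 8 toggles all four lines, and each toggle generates an event on its data channel. The Gray-coded equivalents differ in a single line.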

4.4 CPU Design

The CPU memory controller supports communication with several peripheral units. The FPGA memory is used as such a peripheral unit to enable communication between the CPU and the FPGA. A separate address space was assigned to the FPGA memory in the CPU. Normally, when a RAM memory is set up, the placement of variables is nondeterministic. However, when using a shared memory in the FPGA, all variables need to have a designated address. Without this, shared memory communication between the CPU and the FPGA would be impossible. When reading from or writing to the FPGA, the CPU compiler could interpret those operations as redundant and therefore remove them during compilation. To ensure that every read and write to the FPGA is performed, the variables are declared as volatile to prevent the compiler from optimizing the accesses away. Example 4.1 shows an example of a read and a write in the CPU to the FPGA memory.

Example 4.1: Read/Write in CPU to FPGA Memory

    /* Declare variable, setting pointer to FPGA memory.
       Volatile is used to protect from compiler optimization */
    static volatile unsigned int *fpga_mem_variable_1 =
        ((unsigned int *) ADDRESS_MEMORY_POS_1);
    static volatile unsigned int *fpga_mem_variable_2 =
        ((unsigned int *) ADDRESS_MEMORY_POS_2);

    void exampleMethod(void)
    {
        /* Declare read variable */
        static volatile unsigned int read_variable;

        /* Write the value 1 to FPGA memory */
        *fpga_mem_variable_1 = (unsigned int) 0x00000001;

        /* Read from FPGA memory */
        read_variable = *fpga_mem_variable_2; // Written from FPGA module

        /* Use read value */
        if (read_variable == 42)
        {
            /* Do something */
        }
    }

Every variable is saved with 32 bits. This was chosen for simplicity, so that one variable is mapped to one address. There is, however, hardware support (signals from the CPU) for saving 8, 16, 24 and 32 bit variables. Space will be left unused if variables that only require 8 bits are saved as 32 bit variables, but this is not an issue at present because no significant amount of data will be sent. Note that if 8, 16, 24 and 32 bit variables are implemented, the byte order has to be considered. A big-endian configuration is currently used, which means that data is saved with the most significant byte first. It is possible to change the configuration to little-endian if that is preferred.
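The byte order matters as soon as a 32 bit word is accessed in smaller pieces. The sketch below shows, independently of the host CPU, how a value is laid out most significant byte first; the function names are illustrative and not part of the implementation.

```c
#include <stdint.h>

/* Store a 32-bit value most significant byte first (big-endian). */
void store_be32(uint8_t out[4], uint32_t value)
{
    out[0] = (uint8_t)(value >> 24);
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)value;
}

/* Reassemble a 32-bit value from four big-endian bytes. */
uint32_t load_be32(const uint8_t in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

An 8 bit variable stored as a 32 bit word therefore ends up in the last of the four bytes, which is the byte a narrower access on the FPGA side would have to select.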

The CPU in the S8 supports a cache memory, which can be configured as write-back or write-through. The cache has been deactivated for the addresses designated to the FPGA memory to ensure that the variables are always up to date. If large amounts of data are to be read from or written to the FPGA, the duration of memory accesses has to be considered. This could otherwise become a bottleneck and degrade the performance of the CPU, because CPU execution is stalled while a memory operation is performed. The current implementation uses a time window of 188 ns when performing a write or a read operation. This can be set as low as 31 ns, which is significantly lower than in the current implementation. There is, however, no guarantee that the FPGA would be able to support this speed, due to switching and propagation delay. There has not been enough time to analyze this problem, but it should be considered if large amounts of data are going to be exchanged between the CPU and the FPGA in the future.

The CPU logic has been implemented as an FSM (Finite State Machine) to enable structured control of the synchronization, as seen in Figure 4.6. It starts by sending a start message to the FPGA. Then it sends information to the FPGA about engine type, master sensor and current revolution. When this phase is over, it awaits a synchronization message from the FPGA. If the message is not sent within a specific time, the synchronization procedure is restarted. When the FPGA has sent a synchronization message, the CPU starts to compare timestamps at every crank angle interrupt. If the timestamps are within tmax, the units are considered synchronized; if the timestamps differ too much, the synchronization process is restarted.
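The control flow described above can be sketched as a transition function. The state and input names are hypothetical and the states of Figure 4.6 may be partitioned differently; the sketch only follows the order of events given in the text.

```c
#include <stdbool.h>

typedef enum {
    ST_SEND_START,   /* send start message to the FPGA */
    ST_SEND_PARAMS,  /* engine type, master sensor, current revolution */
    ST_WAIT_SYNC,    /* await synchronization message from the FPGA */
    ST_VERIFY        /* compare timestamps at every crank angle interrupt */
} cpu_state_t;

typedef struct {
    bool fpga_sync_found;  /* FPGA has reported that it found synch */
    bool timeout;          /* no synch message within the allowed time */
    bool within_tmax;      /* latest timestamp pair is within tmax */
} cpu_inputs_t;

/* One transition of the state machine, evaluated once per step. */
cpu_state_t cpu_fsm_step(cpu_state_t s, const cpu_inputs_t *in)
{
    switch (s) {
    case ST_SEND_START:
        return ST_SEND_PARAMS;
    case ST_SEND_PARAMS:
        return ST_WAIT_SYNC;
    case ST_WAIT_SYNC:
        if (in->fpga_sync_found) return ST_VERIFY;
        if (in->timeout)         return ST_SEND_START;
        return ST_WAIT_SYNC;
    case ST_VERIFY:
        return in->within_tmax ? ST_VERIFY : ST_SEND_START;
    }
    return ST_SEND_START;
}
```

Keeping the transitions in a single pure function makes each restart path (timeout, or timestamps too far apart) visible in one place and easy to test without hardware.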


4.5 FPGA Design

The FPGA software was developed using the Xilinx ISE WebPACK edition [11] and the code was written in VHDL. The language was a requirement from Scania and the development tool was used in a previous master thesis. The author also has previous experience with the tool, which made it a natural choice. The FPGA implementation uses a module-based design approach. This gave a bottom-up implementation strategy, which makes it easier to test small modules early and thereby identify design errors and bugs. A modular approach also makes reuse possible. The design was also made generic, so that changes can be implemented with minimum effort. A schematic of the hardware was drawn before implementing the VHDL code. This made the implementation of the VHDL code a simpler task.

The FPGA board used in this master thesis is an Opal Kelly XEM3010 [13] (Figure 4.7) with a Xilinx Spartan 3 XC3S1500-4FG320 FPGA. This is the same FPGA that has been used in Bohlin’s master thesis [3].

Figure 4.7: Opal Kelly XEM3010 with a Spartan 3 FPGA

4.5.1 Overview

To enable a modular design, different functionality was placed in separate modules. An overview of the FPGA design can be seen in Figure 4.8. The Message Decoder module and the TPU Interface module can be classified as communication modules; they handle communication with the CPU and the TPU. The BRAM Controller is responsible for control of the shared memory. It handles reads and writes to the memory. Tracking of flywheel position is performed by the CAD module using the sensor input signals eps1 and eps2. The Add-On Unit (which could, for example, be a closed-loop combustion control module) is a calculation unit that will use the CAD value and also communicate with the CPU via the shared memory. This unit is not implemented in this master thesis, but it is easy to interface one or several calculation units to the design. The signal interface to the FPGA can be seen in Table 4.1.

Figure 4.8: Block diagram of FPGA implementation

4.5.2 Message Decoder

The Message Decoder module is responsible for detecting new messages sent from the CPU. It snoops on the address bus, and if a certain address is set and the signals ce and we are active, a valid new message has been sent. The messages currently available to send from the CPU to the FPGA are:

• Start: start, restart and stop synchronization with the FPGA.

• Master Sensor: defines which flywheel position sensor is master.

• Revolution: specifies the current revolution of the flywheel. The flywheel position is either in 0-359.9° or in 360-719.9°.

• Engine Type: number of cylinders, which can be 5, 6 or 8.
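The detection rule can be expressed as a small behavioural model. The actual module is written in VHDL; the C model below only illustrates the condition, and the message addresses are hypothetical since the text does not list them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical addresses for the four CPU-to-FPGA messages. */
enum {
    ADDR_START         = 0x00,
    ADDR_MASTER_SENSOR = 0x01,
    ADDR_REVOLUTION    = 0x02,
    ADDR_ENGINE_TYPE   = 0x03
};

/* True when the bus cycle is a write to one of the message addresses.
 * ce and we are passed as logical "active" flags, whatever their polarity. */
bool is_new_message(uint8_t address, bool ce_active, bool we_active)
{
    return ce_active && we_active && address <= ADDR_ENGINE_TYPE;
}

/* The engine type message is valid for 5, 6 or 8 cylinders only. */
bool is_valid_engine_type(uint32_t cylinders)
{
    return cylinders == 5u || cylinders == 6u || cylinders == 8u;
}
```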

Signal name      Type           Description
eps1             input          Engine position sensor 1
eps2             input          Engine position sensor 2
ce               input          Chip enable for block RAM. Makes reads and writes to memory possible.
oe               input          Output enable for block RAM. Makes read operations from the block RAM possible.
we               input          Write enable for block RAM. Makes write operations possible.
address(7:0)     input          Address to block RAM
data_cpu(31:0)   input/output   Bidirectional data line
ack              input          Message received (acknowledged) signal
req              output         Request to send message to the TPU
data_tpu(3:0)    output         Data sent to TPU

Table 4.1: FPGA signal interface

4.5.3 BRAM Controller

The purpose of the BRAM controller (Figure 4.9) is to ensure secure accesses to the shared memory in the FPGA. A BRAM is used to store data that is shared between the CPU and the FPGA. There were two main reasons for choosing a BRAM: it was a request from Scania, and the previous master thesis had used a similar design. The BRAM is configured as a dual port memory to ensure that both the CPU and the FPGA can access the shared memory simultaneously.

One issue with keeping information in a BRAM is that only one module in the FPGA can read from one memory location. If data is going to be shared between different modules, the memory content has to be saved in a register before it is accessed by both modules. Other memory techniques should therefore be considered for handling internal shared variables in the FPGA. A suitable choice could be memory mapped registers. This gives lower FPGA utilization, but at the cost of a more complex memory controller.

When using a shared memory structure with a dual port memory, memory collisions can occur. There are two possible cases: both ports writing to the same memory address, or a write operation on one port and a read operation on the other port to the same memory address. Xilinx Core Generator System [23] is used to instantiate the BRAM. If the memory is configured in READ_FIRST mode, it supports a simultaneous read and write to the same memory address. The value read is the previous value, and the written value is available one clock cycle later. After discussions with the Scania supervisors, it was decided that a simultaneous write to the same address is considered a programming error.
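The read-first behaviour described above can be illustrated with a behavioural model of one clock edge. This is a sketch of the behaviour as the text describes it, not of the generated core; the type and function names are hypothetical, and the 256-word depth is assumed from the 8-bit address in Table 4.1.

```c
#include <stdint.h>

#define MEM_WORDS 256u  /* assumed depth, matching an 8-bit address */

typedef struct {
    uint32_t word[MEM_WORDS];
} bram_t;

/* One clock edge with a write on one port and a read on the other.
 * Read-first: the read port returns the value stored before the write. */
uint32_t bram_clock_read_first(bram_t *m,
                               uint8_t wr_addr, uint32_t wr_data,
                               uint8_t rd_addr)
{
    uint32_t rd_data = m->word[rd_addr];  /* sample the old content first */
    m->word[wr_addr] = wr_data;           /* then commit the write */
    return rd_data;
}
```

A read that collides with a write returns the old value, and the same read one clock cycle later returns the newly written value.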

Figure 4.9: Block diagram of BRAM controller design

CPU BRAM accesses are handled by an FSM, which is the method that was suggested in Bohlin's previous master thesis [3]. The FSM is necessary due to clock domain crossing and bus arbitration. Control signals from the CPU have to be synchronized to the clock domain of the FPGA to ensure that no timing violations occur within the clock-to-clock setup [28]. Because the data bus is shared between the CPU and the FPGA, arbitration has to be considered. It must be possible to put the data bus in a high impedance state when the CPU is not reading from the BRAM. This is possible if memory accesses are synchronized.

The FSM can be seen in Figure 4.10. A read is performed one clock cycle after ce and oe from the CPU are asserted. The read operation is finished when ce is deasserted or when oe or we changes. The latter should not happen, however, because the CPU deasserts ce before oe and we in a read or write cycle. Writing is performed in a similar way, with we asserted instead of oe.
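The read and write handling described above can be sketched as a three-state machine. The state names IDLE, READ and WRITE, the functions next_state, drives_bus and run, and the convention that True means "asserted" are assumptions made for this illustration; the real signals may be active low, and the actual state set is the one in Figure 4.10. The strobes are assumed to be already synchronized to the FPGA clock.

```python
IDLE, READ, WRITE = "IDLE", "READ", "WRITE"


def next_state(state, ce, oe, we):
    """Return the state after one FPGA clock edge.

    ce, oe and we are the synchronized CPU strobes (True = asserted).
    """
    reading = bool(ce and oe and not we)
    writing = bool(ce and we and not oe)
    if state == IDLE:
        if reading:
            return READ      # the read starts one clock after ce and oe
        if writing:
            return WRITE
        return IDLE
    if state == READ:
        # The read ends when ce is deasserted or oe/we changes.
        return READ if reading else IDLE
    # state == WRITE
    return WRITE if writing else IDLE


def drives_bus(state):
    """The FPGA drives the shared data bus only while a read is in progress;
    in every other state the bus is left in high impedance."""
    return state == READ


def run(strobes, state=IDLE):
    """Apply a sequence of (ce, oe, we) samples and return the visited states."""
    trace = []
    for ce, oe, we in strobes:
        state = next_state(state, ce, oe, we)
        trace.append(state)
    return trace
```

Feeding the model a read cycle in which ce and oe are held for two clocks and ce is then released before oe gives the state sequence READ, READ, IDLE, so the bus is released as soon as ce is deasserted.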

Figure 4.10: FSM for BRAM port A

4.5.4 CAD Module

The CAD module shown in Figure 4.11 was divided into four smaller modules, which are listed and described in the following bullet points:

• CAI Logic: handles detection of crank angle interrupts. Consists of combinational logic.

• CAD Output: selects which sensor's CAD information is sent out, depending on which flywheel sensor is master.

• EPS Logic: handles signals from the flywheel position sensors and tries to provide a synchronized CAD value.

• CAD Controller: an FSM controlling the state of the CAD module.

An easy and structured way to control the logic in the CAD module was to use an FSM-based design approach (Figure 4.12). When the module receives a start message, it performs startup synchronization. When all configuration parameters have been set and a flywheel position sensor has been synchronized, a synchronization message is sent to the TPU. After the synchronization message has been sent, the CAD Controller sends a new message to the TPU whenever a crank angle interrupt occurs.
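The controller sequence described above can be sketched as follows. The state names, the input flags of step(), and the message labels "SYNC" and "CAD" are hypothetical; the actual states are those of Figure 4.12, and the real messages carry CAD data rather than a label.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()            # waiting for a start message
    STARTUP_SYNC = auto()    # waiting for configuration and sensor sync
    SEND_SYNC = auto()       # synchronization message is sent to the TPU
    RUNNING = auto()         # one message per crank angle interrupt


class CadController:
    def __init__(self):
        self.state = State.IDLE
        self.sent = []       # messages handed to the TPU, in order

    def step(self, start=False, configured=False, sensor_synced=False, cai=False):
        """Evaluate the inputs for one clock cycle and return the new state."""
        if self.state is State.IDLE:
            if start:
                self.state = State.STARTUP_SYNC
        elif self.state is State.STARTUP_SYNC:
            # Both conditions must hold before the TPU is informed.
            if configured and sensor_synced:
                self.state = State.SEND_SYNC
        elif self.state is State.SEND_SYNC:
            self.sent.append("SYNC")
            self.state = State.RUNNING
        elif self.state is State.RUNNING:
            if cai:
                self.sent.append("CAD")
        return self.state
```

In this sketch crank angle interrupts that arrive before the synchronization message has been sent are ignored, which matches the order of events in the text but is otherwise an assumption.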

Figure 4.11: Block diagram of CAD module

EPS Logic consists of two submodules that estimate the CAD value for each flywheel position sensor. The modules work independently and there is no interaction between them. A flowchart was created to give an overview of how the CAD value should be estimated. This is illustrated in Figure 4.13. An initial guess is first made. This value is updated until a gap has been detected. If the guess was correct, the FPGA is synchronized; otherwise it adjusts its tooth value to the value of the gap and awaits the next gap. The period time between two successive teeth may deviate by at most 30% (Scania threshold); otherwise it is assumed that an error has occurred. The reason for such a deviation is either acceleration or deceleration of the engine. If this occurs in the EPS Logic module, the entire estimation process is restarted. When the flowchart had been created, an FSM was designed to mimic its behavior. This was found to be an easy task due to the event-based structure of the flowchart.

Figure 4.13: Flowchart of CAD estimation
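The estimation flow can be sketched as an event-driven model that is called once per detected tooth. Only the 30% threshold comes from the text. The tooth numbering (teeth 0 to 57, with tooth 0 directly after the gap), the gap detection rule GAP_RATIO, the value GAP_PITCHES, and the class ToothEstimator are assumptions made for this example.

```python
TEETH_PER_REV = 60     # 58 teeth plus a gap covering two tooth positions
LAST_TOOTH = 57        # assumed index of the last tooth before the gap
MAX_DEVIATION = 0.30   # maximum period deviation between successive teeth
GAP_RATIO = 2.0        # assumption: a period above 2x the reference is a gap
GAP_PITCHES = 3        # assumption: the gap period spans three tooth pitches


class ToothEstimator:
    def __init__(self, initial_guess=0):
        self.initial_guess = initial_guess
        self.restart()

    def restart(self):
        """Start the whole estimation process from scratch."""
        self.tooth = self.initial_guess
        self.ref_period = None
        self.synced = False

    def on_tooth(self, period):
        """Handle one tooth edge; period is the time since the previous edge.

        Returns True while the estimator considers itself synchronized.
        """
        if self.ref_period is None:
            # First measurement: nothing to compare with yet.
            self.ref_period = period
            self.tooth = (self.tooth + 1) % TEETH_PER_REV
            return self.synced

        ratio = period / self.ref_period
        if ratio > GAP_RATIO:
            # Gap detected: the guess was correct only if the estimator
            # was standing on the last tooth. Otherwise adopt the gap
            # value and wait for the next gap.
            self.synced = self.tooth == LAST_TOOTH
            self.tooth = 0
            self.ref_period = period / GAP_PITCHES
        elif abs(ratio - 1.0) > MAX_DEVIATION:
            # Too large a change between two successive teeth: restart.
            self.restart()
        else:
            self.tooth = (self.tooth + 1) % TEETH_PER_REV
            self.ref_period = period
        return self.synced
```

With a wrong initial guess the first gap only corrects the tooth value, and synchronization is reported at the second gap, one revolution later. With a correct guess it is reported at the first gap.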

A flywheel consists of 58 teeth and two gaps. To be able to approximate a CAD value within 0.1 CAD, the distance between two teeth has to be divided into internal positions ($pos_{internal}$). 256 was selected as the total number of internal positions. The reason for this is that one internal position then corresponds to $\frac{360}{60 \cdot 256} \approx 0.023$ CAD, which is a higher precision than required. With this representation the flywheel position can be
