Department of Electrical Engineering
Master's thesis
Synchronization of flywheel position between autonomous devices
Master's thesis in Computer Engineering carried out at the Institute of Technology at Linköping University
by
Tobias Pettersson
LiTH-ISY-EX--12/4602--SE
Linköping 2012
Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden
Supervisors: Andreas Ehliar, ISY, Linköpings universitet
Rasmus Backman, Scania CV AB
Joakim Jäderberg, Scania CV AB
Examiner: Olle Seger, ISY, Linköpings universitet
Division, Department: Division of Automatic Control, Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden
Date: 2012-06-13
URL for electronic version: http://www.control.isy.liu.se, http://www.ep.liu.se
ISRN: LiTH-ISY-EX--12/4602--SE
Title: Synchronization of flywheel position between autonomous devices (Swedish title: Synkronisering av svänghjulsposition mellan autonoma enheter)
Author: Tobias Pettersson
Abstract
More computing power will be required in Scania’s future engine control units. Calculations therefore need to be performed on new hardware such as an FPGA. One problem that arises is synchronization of flywheel position. This master thesis examines the opportunities that existing Scania hardware offers for synchronizing flywheel position. Different concepts for synchronization have been developed and compared with each other. One of the concepts has been implemented, made possible by a PCB-adapter. The results show that synchronization is possible within the given real-time requirements. Finally, an analysis with respect to series production has been made. It shows the challenges that an FPGA will face when integrated into a future engine control unit.
Acknowledgments
I would particularly like to thank my supervisors at Scania for their support during this master thesis. Rasmus Backman gave me great support when designing the PCB-adapter and creating the FPGA design. Joakim Jäderberg taught me a great deal about how a TPU works and was also a great help when integrating S8 with the FPGA. I also want to thank Hans Svensson for contributing his knowledge about crank angle logic and Christoffer Markusson for his knowledge about the CPU configuration in S8. Thanks also to Per Olsson, who provided me with the necessary tools during the master thesis. Thanks to my supervisor at the university, Andreas Ehliar, who supported me throughout the entire master thesis process with discussions about the report and the implementation.
Finally, I want to thank my parents for their support during my entire time at university.
Contents
1 Introduction
1.1 Background
1.2 Purpose
1.3 Problem Definition
1.4 Delimitations
1.5 Overview S8 and FPGA
1.6 Report Outline
2 Background Concepts
2.1 Diesel Engine Basics
2.1.1 Four-stroke process
2.1.2 Flywheel Position Sensor
2.1.3 CAD - Crank Angle Degree
2.2 Scania S8
2.2.1 ECU, EMS and S8
2.2.2 S8 Architecture
2.2.3 S8 CPU
2.2.4 TPU - Time Processing Unit
2.3 FPGA
2.3.1 Overview FPGA
2.3.2 FPGA Technologies
3 Analysis of Methods to Synchronize Flywheel Position
3.1 Implementation Factors
3.1.1 Timing Requirements
3.1.2 Degradation of CPU and TPU
3.1.3 Test and Verification
3.1.4 Flexible Design
3.1.5 Hardware Cost
3.2 Engine Speed Estimation
3.3 Communication
3.3.1 Required Messages
3.3.2 CPU and TPU Communication
3.3.3 CPU and FPGA Communication
3.3.4 TPU and FPGA Communication
3.4 Implementation Alternatives
3.4.1 Synchronization Method
3.4.2 Concept 1 - UART Interface
3.4.3 Concept 2 - Memory Mapped Interface
3.4.4 Concept 3 - Dual Handshake Interface
3.5 Concept Selection
4 System Design and Implementation
4.1 PCB Design
4.1.1 PCB-Adapter
4.1.2 PCB Voltage Divider
4.2 Synchronization of CAD
4.3 TPU Design
4.3.1 Pin and Channel Selection
4.3.2 Functional Description
4.4 CPU Design
4.5 FPGA Design
4.5.1 Overview
4.5.2 Message Decoder
4.5.3 BRAM Controller
4.5.4 CAD Module
4.5.5 TPU Interface
5 Test and Verification
5.1 TPU Design
5.2 CPU Design
5.3 FPGA Design
5.3.1 VHDL Test Packages
5.3.2 CAD Module
5.3.3 BRAM Controller
5.3.4 Message Decoder and TPU Interface Module
5.3.5 Area and Timing Report FPGA
5.3.6 PC to FPGA
5.4 System Test
5.4.1 Test Setup
5.5 Truck Test
6 Using an FPGA in a Series Produced Unit
6.1 FPGA in Engine Control Units
6.2 Requirements on the FPGA
6.2.1 Environmental Requirements
6.2.2 Startup Requirements
6.2.3 Security
6.3 FPGA Effects on Assembly Line
6.3.1 Assembly Line
6.3.2 Firmware Updates
6.4 Using an FPGA for Future Engine Control Units
7 Conclusion
7.1 Summary
7.2 Future Work
7.2.1 CAD Synchronization on a Separate TPU Engine
7.2.2 Implementation of Engine Position Logic in an FPGA
7.2.3 Logging
7.2.4 Using FPGA Platform for Other Purposes
7.2.5 Towards a New Engine Control Unit
List of Figures
1.1 Block diagram of S8 and FPGA with connected flywheel position sensors
2.1 Cross section of a cylinder in a combustion engine. Figure from [10], used with permission.
2.2 Flywheel teeth pattern and sensor positions for a 6-cylinder engine
2.3 Transformation of flywheel position sensor signal to a pulse train signal
2.4 Block diagram of S8 architecture
2.5 Basic structure of an FPGA
3.1 Timing requirement for flywheel position between S8 and FPGA
3.2 Simulation when sending 7 data bits with a handshake protocol
3.3 Simulation when sending the byte 0xAA via UART
3.4 Stress simulation when sending the byte 0xAA via UART
3.5 Architecture of concept 1 with UART interface
3.6 Architecture of concept 2 with memory mapped interface
3.7 Architecture of concept 3 with a dual handshake interface
4.1 Top and bottom view of the PCB-adapter
4.2 The S8 and the FPGA connected via the PCB-adapter
4.3 Layout and picture of the designed voltage divider
4.4 Sequence diagram of CAD synchronization
4.5 TPU channel accesses of Share Code Memory
4.6 FSM for synchronization of CAD value in CPU
4.7 Opal Kelly XEM3010 with a Spartan 3 FPGA
4.8 Block diagram of FPGA implementation
4.9 Block diagram of BRAM controller design
4.10 FSM for BRAM port A
4.11 Block diagram of CAD module
4.12 Implementation of CAD Controller
4.13 Flowchart of CAD estimation
4.14 FSM for TPU interface
5.1 Simulation of handshake protocol between the TPU and the FPGA with a TPU simulator
5.2 Oscilloscope measurement of ce and we signals during three write operations
5.3 Simulation of CAD synchronization
5.4 Plot of CAD estimation
5.5 Simulation of an FPGA write and a CPU read
5.6 Simulation of simultaneous CPU read and FPGA write to the same memory address
5.7 User interface to control and observe FPGA signals
5.8 Block diagram of system test setup
5.9 Physical test setup during system test
5.10 Maximum difference between the S8 and the FPGA timestamps when the RPM is constant
List of Tables
4.1 FPGA signal interface
5.1 Area and Timing Report for FPGA design
List of Examples
2.1 Code structure of a TPU channel
4.1 Read/Write in CPU to FPGA Memory
Abbreviations
ASIC Application Specific Integrated Circuit
BDC Bottom Dead Center, position of the piston when it is at the point closest to the crankshaft
BRAM Block Random Access Memory
CAD Crank Angle Degree
CAI Crank Angle Interrupt. Software interrupt in S8
CPU Central Processing Unit
EPS Engine Position Sensor, synonym for flywheel position sensor
ECU Engine Control Unit
FPGA Field Programmable Gate Array
FSM Finite State Machine
GPIO General Purpose Input/Output
JTAG Joint Test Action Group
PCB Printed Circuit Board
S8 Name of Scania engine control unit
SCI Serial Communications Interface
SPI Serial Peripheral Interface Bus
TDC Top Dead Center, position of the piston when it is at the point furthest from the crankshaft
TPU Time Processing Unit
UART Universal Asynchronous Receiver/Transmitter
VHDL VHSIC (Very High Speed Integrated Circuit) Hardware Description Language
Chapter 1
Introduction
This master thesis has been performed at Scania CV AB in Södertälje. It was performed at the department Powertrain Control Systems, which is responsible for low level software for powertrain embedded systems at Scania.
1.1 Background
When more computation power is needed for closed-loop combustion control in Scania engine control units, new hardware such as an FPGA must be integrated in an effective way. One problem that arises when calculations are distributed between different hardware components is the synchronization of real-time data against flywheel position. A solution for synchronizing flywheel position, aimed at lab work, is needed, and some of the concepts that are investigated will probably be integrated in future engine control units.
1.2 Purpose
The starting point of this master thesis is to continue the work of a previous master thesis by Henrik Bohlin on extra hardware in the engine management system called S8 [3]. In this master thesis, different concepts to synchronize flywheel position shall be proposed and analyzed. The goal is to solve the synchronization problem with existing hardware and also to propose a solution for future control units. The synchronized flywheel position is intended as an input to a closed-loop combustion controller in which the control variables are a function of the flywheel position.
1.3 Problem Definition
To solve the problem, it has been broken down into the following problems and tasks.
1. Analysis of Methods to Synchronize Flywheel Position
   a Which factors affect the choice of method to synchronize flywheel position?
   b Analyze different synchronization concepts. Which results are expected for each concept? How will it affect the current system?
   c Which concept will be implemented to solve the synchronization problem? Why should this concept be implemented?
2. Implementation CPU, TPU and FPGA
   a Create hardware to enable communication between S8 and an FPGA.
   b Which changes are needed in each unit to integrate the chosen concept?
   c Implement the software that is needed.
3. Test and Verification
   a Create methods to test, simulate and verify the functionality of the implementation.
   b Which results have been obtained from the chosen concept?
   c If a second concept is implemented: how do the results from the different concepts compare with respect to accuracy and fault tolerance?
4. Using an FPGA in a Series Produced Unit
   a Has an FPGA been used before with an engine control unit? In which types of vehicles has it been integrated?
   b Which factors affect the choice of an FPGA in a series produced control unit (environmental resistance requirements, performance etc.)?
   c How is the FPGA going to change the assembly line of trucks? How should it be configured before delivery? How should firmware updates be handled?
1.4 Delimitations
The following points describe the delimitations of this master thesis.
1. S8 functionality shall not be degraded. This means that no timing deadlines shall be missed due to the integration of the FPGA.
2. Changes shall not affect the S8 compatibility with existing software.
1.5 Overview S8 and FPGA
A block diagram of the initial platform with S8 and the FPGA can be seen in Figure 1.1. Flywheel position sensors are connected to S8 and not to the FPGA. The dotted lines indicate that no communication between the units was possible at the start of this master thesis.
Figure 1.1: Block diagram of S8 and FPGA with connected flywheel position sensors. The blocks shown are the flywheel (58 teeth, 2 gaps), two flywheel position sensors, S8 with CPU and TPU, and the FPGA.
1.6 Report Outline
The report outline is described in the list below.
• Background Concepts: this chapter explains the main concepts used in this master thesis.
• Analysis of Methods to Synchronize Flywheel Position: possibilities and limitations to synchronize flywheel position are discussed. Different concepts are proposed and compared to each other. In the last section a concept is chosen for implementation.
• System Design and Implementation: the design of the system to synchronize flywheel position is presented. Implementation of hardware and software is described.
• Test and Verification: the methodology to verify the implementation is described. Test results are presented and discussed.
• Using an FPGA in a Series Produced Unit: which challenges will an FPGA face if it is used in an engine control unit where high temperatures, temperature variation and vibrations are common? This chapter tries to answer this question. It also describes important considerations when integrating an FPGA in an engine control unit.
• Conclusions: a summary of the results. Possible future work is also suggested.
Chapter 2
Background Concepts
The purpose of this chapter is to make the reader familiar with the main concepts of this master thesis. After reading this chapter the reader should understand the basic concepts of a diesel engine and know what kind of embedded technology is integrated in modern engine control units. Closed-loop combustion control is a central concept in this master thesis. Further reading on this topic can be found in a master thesis written by Jonian Grazhdani [10].
2.1 Diesel Engine Basics
2.1.1 Four-stroke process
A four-stroke engine is an internal combustion engine in which a piston completes four different strokes. A cross section of a cylinder in a combustion engine can be seen in Figure 2.1. The different strokes of the four-stroke operating cycle are:
1. Intake: at the start of this stroke the piston goes from the top to the bottom of the cylinder, which leads to a reduced pressure in the cylinder. Air is forced into the cylinder through the open intake port. The stroke ends when the intake valve closes.
2. Compression: the piston travels from BDC (Bottom Dead Center) to TDC (Top Dead Center) with both intake and exhaust valves closed. The cylinder pressure and temperature increase when the air is compressed.
3. Expansion/power: when the piston is close to TDC, fuel is injected into the combustion chamber (applies to direct injection engines) where it mixes with air. The fuel starts to evaporate and self-ignite and forces the piston to move away from TDC. When the piston travels from TDC to BDC and the combustion has ended, the volume of the cylinder increases, which leads to lower temperature and pressure.
4. Exhaust: as the piston approaches BDC the exhaust valve is opened. When the piston moves from BDC to TDC the exhaust gases are pushed out through the exhaust valve.
Figure 2.1: Cross section of a cylinder in a combustion engine. Figure from [10], used with permission.
2.1.2 Flywheel Position Sensor
The position and rotational speed of the flywheel are measured by two flywheel position sensors mounted close to the flywheel (placement depends on the number of cylinders). An example of the placement can be seen in Figure 2.2. It can also be observed in the figure that the flywheel has 58 teeth and 2 gaps. The teeth are used to measure the rotational speed. The gaps provide information about the flywheel position. The wave signal generated by the flywheel sensors can be seen in Figure 2.3. This signal is transformed into a pulse train before it is connected to the processor in S8.
Figure 2.2: Flywheel teeth pattern and sensor positions for a 6-cylinder engine
Figure 2.3: Transformation of flywheel position sensor signal to a pulse train signal. The figure shows the wave from the flywheel position sensor and the resulting pulse to the CPU.
Two flywheel position sensors are used in parallel, but only one of the sensors is considered the master sensor (its value is the reference position). The other acts as a backup sensor if the master sensor breaks down. The master sensor is determined at startup: the sensor that can first decide its position becomes the master. A synonym that is often used for a flywheel position sensor is EPS (Engine Position Sensor). Both terms will be used in this master thesis.
2.1.3 CAD - Crank Angle Degree
It can be observed from the description in section 2.1.1 that the crankshaft performs two revolutions for every operating cycle. The duration of a cycle can be expressed in Crank Angle Degrees (CAD). CAD angle 0 is defined as the position when the piston is at TDC between the compression and expansion strokes. The operating cycle of a four-stroke engine takes 720 CAD, because of the two revolutions per operating cycle. CAD is used as a reference for several control parameters in an engine, for example ignition and fuel injection timing. If the perception of the CAD is inaccurate, it could lead to misfiring, motor vibration or backfire. When the CAD reaches specific angles, a software interrupt is generated in the TPU. These interrupts are triggered so that fuel injection timing can be calculated for every cylinder. They are called Crank Angle Interrupts (CAI), and this term will be used frequently in this master thesis.
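As an illustration of how a CAD value relates to the flywheel, the following C sketch maps a flywheel position count to a CAD value and wraps it into the 720 CAD operating cycle. It assumes 60 evenly spaced flywheel positions per revolution (58 teeth and 2 gaps, section 2.1.2), which gives 6 CAD per position. The function names and the `offset_cad` parameter, which stands for the angle between position 0 and CAD 0, are hypothetical and are not taken from the S8 software.

```c
#define CAD_PER_CYCLE     720  /* two crankshaft revolutions */
#define POSITIONS_PER_REV 60   /* 58 teeth + 2 gaps (assumed evenly spaced) */
#define CAD_PER_POSITION  (360 / POSITIONS_PER_REV)

/* Wrap an arbitrary angle into the range [0, 720). */
int cad_wrap(int cad)
{
    int r = cad % CAD_PER_CYCLE;
    return (r < 0) ? r + CAD_PER_CYCLE : r;
}

/* Hypothetical helper: CAD for a position counted from the start of the
   operating cycle. offset_cad is the assumed angle of position 0. */
int cad_from_position(int position, int offset_cad)
{
    return cad_wrap(position * CAD_PER_POSITION + offset_cad);
}
```

With these definitions, position 120 maps back to CAD 0, which reflects that the operating cycle repeats after two revolutions.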
2.2 Scania S8
2.2.1 ECU, EMS and S8
An Electronic Control Unit (ECU) is an embedded system which controls one or more electrical systems. An ECU controlling the engine in a vehicle is called an EMS (Engine Management System). The most sophisticated EMS in current Scania trucks is called S8. It uses input from sensors and calculates control and timing parameters, for example when fuel injection should take place. The terms ECU and EMS will be synonymous with S8 in this master thesis.
2.2.2 S8 Architecture
S8 consists of three major building blocks, shown in Figure 2.4. The central part is the CPU, which controls the behavior of S8. The TPU (Time Processing Unit) is a co-processor placed on the same chip as the CPU. It handles complex timing, actuators and sensor inputs. An ASIC is also integrated in S8 and is responsible for handling fuel injectors. The ASIC is not used in this master thesis.
Figure 2.4: Block diagram of S8 architecture
2.2.3 S8 CPU
The CPU used in S8 is part of a family of CPU cores built on the Power Architecture™ embedded category. Instructions are compatible with the PowerPC™ user instruction set architecture (UISA). The CPU is clocked at 128 MHz and has a 32 KB unified level 1 cache, which is 4- or 8-way set associative. The cache can be configured with either a write-back or a write-through policy.
2.2.4 TPU - Time Processing Unit
A TPU is a semi-autonomous co-processor designed for timing control. It can execute instructions, react to inputs, perform PWM generation and access memories without host (CPU) intervention. The TPU has two execution engines, each consisting of 32 independent timer channels. Each engine has two 24-bit free running counters, which provide a reference for capture and match events. A TPU channel consists of an input and an output signal pair. Channels can be programmed to react (generate interrupts) to four events: two capture events and two match events. A capture event reacts to either a rising or a falling edge on the input pin. A match event reacts when the free running counter has reached a certain value.
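Because the free running counters are 24 bits wide, the time between two events has to be computed with wrap-around in mind. The sketch below shows one common way to do this in C. The function name and the use of a 32-bit type to hold the counter samples are assumptions made for illustration; this is not code from the TPU.

```c
#include <stdint.h>

#define TCR_MASK 0x00FFFFFFu  /* 24-bit counter range */

/* Ticks elapsed from 'earlier' to 'later' on a 24-bit free running counter.
   Correct as long as fewer than 2^24 ticks separate the two samples. */
uint32_t tcr_elapsed(uint32_t earlier, uint32_t later)
{
    return (later - earlier) & TCR_MASK;
}
```

The mask discards the borrow that appears when the counter has wrapped between the two samples, so a capture at 0xFFFFF0 followed by one at 0x000010 yields 0x20 ticks.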
Communication between TPU and CPU can be performed in four different ways: HostServiceRequest, ChannelInterruptRequest, function parameters and Share Code Memory. A HostServiceRequest is issued when the CPU requests an interrupt in the TPU. It also passes function parameters, which are readable and writable from both the CPU and the TPU. When the TPU wants to request an interrupt in the CPU it creates a ChannelInterruptRequest, which generates an interrupt request to the CPU. A Share Code Memory is available to store global variables. These are readable and writable by both the CPU and the TPU. Communication between channels is performed by setting up links. These links can be compared to general software interrupts. The source code for a TPU channel consists of a single function, which contains several if-statements. These if-statements represent different threads that are executed when an event has occurred. A thread cannot be preempted during its execution. An example of the basic code structure of a TPU program can be seen in Example 2.1.
Example 2.1: Code structure of a TPU channel
/* Every channel has a function associated with it.
   arg1 and arg2 are function parameters. */
void ChannelXFunction(int24 arg1, uint8_t arg2)
{
    /* HostServiceRequest from CPU->TPU */
    if (IsHostServiceRequestEvent(REQUEST))
    {
        // Execute code
    }
    /* Event thread, match or transition has occurred */
    else if (IsMatchAOrTransitionBEvent())
    {
        // Execute code
    }
    /* Link thread, another channel has created a link to this channel */
    else if (IsLinkServiceRequestEvent())
    {
        // Execute code
    }
}
2.3 FPGA
2.3.1 Overview FPGA
An FPGA (Field Programmable Gate Array) is, in short, an integrated circuit with reconfigurable hardware. It can be seen as an array of programmable logic blocks that can be connected to each other to form digital circuits. This structure makes it possible to use the FPGA as a parallel machine, whereas a general-purpose microprocessor does not offer the same level of parallelism. Figure 2.5 shows the basic structure of an FPGA. Storage of data and multiplications are generally expensive in terms of area when built from logic blocks. That is why modern FPGAs are provided with Block RAMs (BRAM) and hardware multipliers, which have dedicated area on the FPGA chip for these operations.
Figure 2.5: Basic structure of an FPGA
2.3.2 FPGA Technologies
FPGAs can be divided into different process technology types. The main differences between them are reconfigurability and whether they are volatile (need power to keep memory content) or non-volatile (no power needed to keep memory content). The most common technologies are listed below.
• SRAM: a volatile technology, which has to be programmed at startup by an external device. This is the technology with the smallest transistor size and the highest possible performance. The technology is flexible: it is both reprogrammable and in-system programmable.
• Flash: a non-volatile technology. It is live at startup, which means that it does not need to be programmed at startup. It is reprogrammable and in some devices it is in-system programmable.
• Antifuse: a one-time programmable technology. It is live at startup and does not need any external configuration. It has the benefit of better radiation tolerance, low power consumption and high reliability compared to the other technology types.
Chapter 3
Analysis of Methods to Synchronize Flywheel Position
This chapter describes the process of selecting an implementation method to synchronize flywheel position. System requirements and available software and hardware resources are presented. Different implementation concepts are proposed, analyzed and compared to each other. The last part describes and motivates the selected implementation.
3.1 Implementation Factors
To determine important factors for synchronizing flywheel position, a system requirements specification was created by the author and the industrial supervisors. The following factors were identified as important when deciding which concept to use:
• Timing Requirements
• Degradation of CPU and TPU
• Test and Verification
• Flexible Design
• Hardware Cost
3.1.1 Timing Requirements
A requirement from Scania was that the flywheel position should differ by at most 0.1 CAD between S8 and the FPGA. This requirement is illustrated in Figure 3.1.
Figure 3.1: Timing requirement for flywheel position between S8 and FPGA
The time period between two teeth can be calculated. A flywheel has 58 teeth and two tooth gaps, which gives a total of 60 positions. RPM is the number of revolutions per minute and RPS is the number of revolutions per second. The period of one revolution, $T_{rev}$, and the period of one tooth, $T_{tooth}$, both in seconds, are
\[
RPS = \frac{RPM}{60} \;\Rightarrow\; T_{rev} = \frac{1}{RPM/60} = \frac{60}{RPM} \;\Rightarrow\; T_{tooth} = \frac{60/RPM}{60} = \frac{1}{RPM} \tag{3.1}
\]
There are 6 CAD between each tooth. This means that 0.1 CAD is $\frac{1}{60}$ of a tooth. This fact combined with (3.1) gives the time requirement $t_{req}$ as
\[
t_{req} < \frac{T_{tooth}}{60} = \frac{1}{60 \cdot RPM} \tag{3.2}
\]
The requirement applies when an engine is running at 500-2500 RPM, but there are cases when the engine speed reaches 3000 RPM. As equation 3.2 shows, the requirement becomes harder to fulfill as RPM increases. A hard timing requirement $t_{max}$ can be calculated as
\[
t_{max} < \frac{1}{60 \cdot 3000}\,\mathrm{s} \approx 5.556\,\mu\mathrm{s} \tag{3.3}
\]
For simplicity, $t_{max}$ will be set to 5 µs.
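The arithmetic behind equations 3.1 to 3.3 can be checked with a few lines of C. The following is a minimal sketch; the function and constant names are hypothetical, and the only inputs taken from the text are the 60 flywheel positions, the 6 CAD between teeth and the 0.1 CAD tolerance.

```c
#define FLYWHEEL_POSITIONS 60.0  /* 58 teeth + 2 gaps */
#define CAD_PER_TOOTH      6.0   /* 360 CAD / 60 positions */

/* Tooth period in microseconds at a given engine speed (equation 3.1). */
double tooth_period_us(double rpm)
{
    double t_rev_s = 60.0 / rpm;                /* one revolution, seconds */
    return t_rev_s / FLYWHEEL_POSITIONS * 1e6;  /* one tooth, microseconds */
}

/* Time in microseconds that corresponds to a CAD tolerance at a given
   engine speed (equation 3.2 for a tolerance of 0.1 CAD). */
double sync_budget_us(double rpm, double tolerance_cad)
{
    return tooth_period_us(rpm) * tolerance_cad / CAD_PER_TOOTH;
}
```

Calling `sync_budget_us(3000.0, 0.1)` gives roughly 5.56 µs, in agreement with equation 3.3, while at 500 RPM the same tolerance corresponds to about 33 µs.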
3.1.2 Degradation of CPU and TPU
A hard requirement in this master thesis is that functionality in S8 shall not be degraded. In the CPU, the added overhead must not be so large that time critical tasks miss their deadlines. In the TPU, it has to be ensured that the utilization of new functions is low, otherwise they will block threads which have critical timing requirements.
3.1.3 Test and Verification
It shall be possible and easy to debug, test and verify the implementation. This means that signals connected between S8 and the FPGA shall be possible to measure on an oscilloscope. This has to be considered when designing the hardware. A modular design approach is desired to make it easier to detect where a fault is present.
3.1.4 Flexible Design
To date, not all possibilities that the FPGA platform will provide are known. The design must therefore be flexible, so that messages and modules can be configured easily. It must also be possible to exchange information between S8 and the FPGA in both directions.
3.1.5 Hardware Cost
The FPGA platform is going to be used for lab purposes. That is why there are no restrictions on how many IOs are used, provided that they are available. According to the industrial supervisors, using more external IOs would not increase the cost if an FPGA were integrated in a new engine control unit. However, there is a high cost if IOs are added after a new engine control unit has been developed. The logic area used in the FPGA for synchronizing flywheel position should be kept low, because area has to be available for closed-loop combustion control, which is expensive in terms of logic area.
3.2 Engine Speed Estimation
Knowledge of how engine speed is determined is vital for developing a well-designed synchronization concept. In today’s engine control units the functionality is divided between TPU and CPU. The TPU is responsible for sampling the flywheel position sensors, calculating the current engine position and determining the flywheel direction. The actual engine speed calculation is performed in the CPU and is used for calculating the timing of fuel injections. It is desirable to place all time-critical communication with the FPGA in the TPU, because it takes a longer, nondeterministic time for the CPU to obtain up-to-date sensor values.
3.3 Communication
This section describes which messages need to be sent between S8 and the FPGA. It also describes how the CPU, TPU and FPGA communication can be performed.
3.3.1 Required Messages
To design a synchronization concept, it must be known which types of information have to be sent between S8 and the FPGA. The system requirements specification that was developed with the industrial supervisors states that it must be possible to send the following types of messages and information between S8 and the FPGA.
• Master flywheel sensor (S8→FPGA)
• A phase shift of 360° (S8→FPGA)
• Engine type: 5, 6 or 8 cylinders (S8→FPGA)
• Reset of the FPGA (S8→FPGA)
• Synchronization of the CAD value
• Inform each other that synchronization has been lost
The only time critical message is synchronization of the CAD value. Requirements for this message can be found in section 3.1.1.
3.3.2 CPU and TPU Communication
A detailed description of CPU and TPU communication can be found in section 2.2.4. It is important to note that communication between CPU and TPU is not instant. When a HostServiceRequest is sent from CPU to TPU, it takes some time to start a thread, depending on the utilization of the TPU. When the TPU invokes a ChannelInterruptRequest, an interrupt request is sent to the CPU. There is no guarantee that the interrupt will be served instantly. According to Scania personnel, it could take more than 100 µs to start the interrupt.
3.3.3 CPU and FPGA Communication
Several available methods to implement a communication protocol between the CPU and FPGA exist. Possible interfaces can be seen in the list below:
• GPIOs (General Purpose Input/Output)
• SPI (Serial Peripheral Interface)
• SCI (Serial Communications Interface)
• Memory Mapping
There are many available GPIOs on the CPU, which makes it possible to use a parallel interface with the FPGA. The time to read and write GPIOs is however not deterministic, which means that GPIOs cannot be used to send time-critical synchronization messages. The CPU supports four SPI channels. However, all SPI channels are either occupied or unavailable in the S8 and can therefore not be used. Another serial interface is SCI, which provides a UART (Universal Asynchronous Receiver/Transmitter) mode. According to the datasheet [20] for the CPU, 8 or 9 bits can be sent in each transmission. The baud rate can be calculated in the following way [20]:
\[ \text{baud rate} = \frac{f_{sys}}{16 \cdot BR} \tag{3.4} \]
BR is part of a control register and can be set between 1 and 8191. $f_{sys}$ is the system clock frequency in the CPU. With this formula the maximum baud rate can be calculated:
\[ \text{baud rate} = \frac{128 \cdot 10^{6}}{16 \cdot 1} = 8 \cdot 10^{6} \text{ bits/s} \tag{3.5} \]
The possible baud rate is impressive, but the synchronization time requirements are based on latency, not throughput. It takes time for the CPU to set up the data transfer, so this interface cannot be used for synchronizing the CAD value. It can however be used to send messages that are not time-critical.
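The baud rate formula can be checked with a few lines of C. The following is a minimal sketch, not code from the S8: the function names are hypothetical, and the frame length of 10 bits (one start bit, eight data bits, one stop bit) is an assumption made only to illustrate the time a frame occupies the line.

```c
#include <assert.h>
#include <stdint.h>

/* Baud rate according to the SCI formula: f_sys / (16 * BR).
 * BR is the control register value, valid range 1..8191. */
uint32_t sci_baud_rate(uint32_t f_sys_hz, uint32_t br)
{
    assert(br >= 1u && br <= 8191u);
    return f_sys_hz / (16u * br);
}

/* Time one frame occupies the line, in nanoseconds (hypothetical helper).
 * bits_per_frame is an assumption of the caller, e.g. 10 for 8N1. */
uint32_t sci_frame_time_ns(uint32_t baud, uint32_t bits_per_frame)
{
    assert(baud > 0u);
    return (uint32_t)(((uint64_t)bits_per_frame * 1000000000u) / baud);
}
```

With a system clock of 128 MHz and BR = 1, `sci_baud_rate` returns 8 000 000 bits/s, and an assumed 10-bit frame then occupies the line for 1250 ns. This figure covers only the transfer itself; it says nothing about the time the CPU needs to set up the transfer, which is the limiting factor discussed above.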
Another method to interface the CPU and the FPGA is to use a memory mapped interface. This was suggested in a prior master thesis [3]. The concept is that information is written from the CPU to a memory in the FPGA, so that the CPU and the FPGA can exchange data via a shared memory structure. This method suffers from the same problem as the GPIOs and the UART: it cannot be used to synchronize the CAD value. There could for example be an interrupt in the CPU, which could block the memory access for longer than tmax.
3.3.4 TPU and FPGA Communication
There is only hardware support for GPIOs between the TPU and the FPGA, but the manufacturer [22] of the TPU has provided software support for UART and SPI communication. Some GPIOs and a few outputs from the TPU are available, which makes a parallel interface with the FPGA possible. A simulation was performed in a TPU simulator provided by AshWare [25] to determine if GPIOs could handle the synchronization requirements of the CAD value. A handshake protocol [4] was used to send information to and from the FPGA, where the response from the FPGA was approximated. The simulation tested how long it takes to set seven output pins. Figure 3.2 shows that there is a delay when sending data from the TPU. This delay was measured to slightly less than 5 µs.
Figure 3.2: Simulation when sending 7 data bits with a handshake protocol
Note that the simulation was performed when no other channels were active, which is the best-case scenario. The delay would increase in a real environment where several channels could be active at the same time. It can also be noted in Figure 3.2 that it takes longer to send more bits. Simulations have shown that the time is proportional to the number of output pins. One feature of the channels in the TPU is that they do not need an active thread to capture an input event. The channel detects the event instantly and creates a new thread that is scheduled for execution. Timestamps of input events are saved in registers and can be accessed when the thread is active. This makes it possible to save the timestamp when the TPU detects a positive edge (tooth found) from a flywheel position sensor and also the time when the FPGA signals that it has detected a tooth. These timestamps can be compared to see if the units are synchronized.
The UART communication between the TPU and the FPGA has also been simulated. The simulations tested how long it takes to transmit a message with 8 data bits and 1 parity bit. Two start bits are used in the implementation provided by the manufacturer. The results can be seen in Figure 3.3. It was possible to achieve a period length of 1 µs.
Figure 3.3: Simulation when sending the byte 0xAA via UART
The same simulation was also run while other threads were working in parallel (Figure 3.4). The result shows an irregular period time between pulses, which means that messages could be wrongly interpreted.
Figure 3.4: Stress simulation when sending the byte 0xAA via UART
The achievable baud rate is low and depends on TPU utilization. It will therefore not be possible to send timing messages via UART from the TPU. Another disadvantage of this implementation is that it utilizes the TPU substantially and would degrade the performance of the existing system. An SPI implementation has not been tested because it has similar performance to the UART, as described in [9] and [18]. A benefit of SPI compared to the UART should be that it will not send false messages, because the transmission of data is clocked by the TPU so the FPGA will always receive correct data.
3.4 Implementation Alternatives
In this section three different concepts to synchronize flywheel position are presented and analyzed. More concepts were developed but those were either similar to the ones presented in this section or did not meet the synchronization timing requirements. Note that the proposed FPGA design in each concept is only a template of the necessary functionality to synchronize flywheel position. Partitioning of functionality may change during the implementation.
3.4.1 Synchronization Method
Synchronization is performed the same way in all presented concepts. The solution compares timestamps when the TPU detects a positive edge and when the TPU detects a synchronization message sent from the FPGA. If these timestamps are within tmax, S8 and the FPGA are synchronized.
This algorithm is described in more detail in the list below:
1. Detect Positive Edge: The TPU and the FPGA detect a positive edge on the flywheel position sensor (it is assumed that they detect it simultaneously; the possible time difference is negligible). The TPU saves the timestamp when the positive edge occurs. It stores the timestamp in a register if the edge is a synchronization point.
2. Send Timestamp: The FPGA checks if the current tooth is a synchronization point. If that is the case it sends information to the TPU.
3. Detect Message: The TPU detects a message that has been sent from the FPGA. It saves a timestamp of the event in a register.
4. Compare Timestamps: The timestamp registers are compared. The units are synchronized if the difference between the timestamps is less than tmax.
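The comparison in step 4 can be sketched in C as follows. This is an illustration only and not the implemented code: the function names are hypothetical, and the 24-bit width of the free-running TPU counter is an assumption that has to be checked against the actual timer configuration. The sketch handles the case where the counter wraps around between the two timestamps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TIMER_BITS 24u                          /* assumed counter width */
#define TIMER_MASK ((1u << TIMER_BITS) - 1u)

/* Shortest distance between two timestamps on a wrapping counter. */
uint32_t timestamp_distance(uint32_t a, uint32_t b)
{
    uint32_t forward  = (b - a) & TIMER_MASK;   /* ticks from a to b */
    uint32_t backward = (a - b) & TIMER_MASK;   /* ticks from b to a */
    return forward < backward ? forward : backward;
}

/* True if the edge timestamp and the FPGA message timestamp differ
 * by less than t_max (given in timer ticks). */
bool is_synchronized(uint32_t ts_edge, uint32_t ts_fpga_msg,
                     uint32_t t_max_ticks)
{
    /* t_max must be well below half the counter range to be unambiguous. */
    assert(t_max_ticks <= TIMER_MASK / 2u);
    return timestamp_distance(ts_edge, ts_fpga_msg) < t_max_ticks;
}
```

Taking the shortest distance in both directions means the check does not depend on which of the two events is serviced first.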
A handshake protocol will be used between the TPU and the FPGA to exchange information, which is a standard way to communicate between heterogeneous systems. The TPU and FPGA interface can be seen in Figure 3.5. When the FPGA has a message to send, it asserts the req signal and the information on the data lines is sampled by the TPU. When the TPU has received the message it asserts the ack signal. The CAD value is going to be synchronized at every crank angle interrupt. A reason for this is to keep utilization in the TPU low (a maximum of four interrupts per revolution). The main benefit of this synchronization method is that it does not matter how high the TPU utilization is: it will always be possible to detect if the units are synchronized.
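The req/ack exchange can be modelled in software to make the order of events explicit. The sketch below is a simplified model under stated assumptions: the struct, the function names and the four-bit data width are hypothetical, and the release phase (req and ack returning to low) is assumed to follow the usual four-phase handshake.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the wires between FPGA and TPU (names are hypothetical). */
typedef struct {
    bool req;        /* driven by the FPGA */
    bool ack;        /* driven by the TPU */
    uint8_t data;    /* data lines driven by the FPGA, 4 bits assumed */
} handshake_wires;

/* FPGA side: place the data first, then assert req. */
void fpga_send(handshake_wires *w, uint8_t message)
{
    assert(!w->req && !w->ack);   /* previous transfer must be finished */
    w->data = message & 0x0Fu;
    w->req = true;
}

/* TPU side: sample the data when req is seen, then assert ack.
 * Returns true when a new message has been received. */
bool tpu_service(handshake_wires *w, uint8_t *received)
{
    if (w->req && !w->ack) {
        *received = w->data;
        w->ack = true;
        return true;
    }
    if (!w->req && w->ack) {
        w->ack = false;           /* req released, so release ack */
    }
    return false;
}

/* FPGA side: release req once the TPU has acknowledged. */
void fpga_complete(handshake_wires *w)
{
    if (w->req && w->ack) {
        w->req = false;
    }
}
```

The important property is that the data lines are stable before req is asserted, so the TPU never samples a half-updated message.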
An easier way to solve the synchronization problem would be to use a parallel interface with just a few pins to send the CAD value from the TPU to the FPGA. This was not chosen because the transfer time cannot be guaranteed. Even if just one pin is toggled when a new tooth has been detected, it cannot be guaranteed that it is sent from the TPU to the FPGA within tmax, because the utilization could be high and block the event for longer than tmax.
3.4.2 Concept 1 - UART Interface
The architecture of the UART concept can be seen in Figure 3.5. A tx line is a serial line where messages are sent from TPU to FPGA. Messages originate from the CPU where all synchronization logic is placed and the TPU is only used for forwarding messages. The flywheel position sensors
Figure 3.5: Architecture of concept 1 with UART interface
The FPGA will consist of four modules: a UART Receive, a Message Decoder, a TPU Interface and a CAD module. The UART Receive module detects new messages and forwards them to the Message Decoder module, which interprets the message. The CAD module handles the synchronization logic in the FPGA. It receives interpreted messages from the Message Decoder module and reacts to them, for example by setting the master sensor. It also informs the TPU Interface module when it shall send a message that a synchronization point has occurred and when the CAD module is synchronized.
The benefits of this concept are the low CPU utilization combined with the flexibility to create larger data packets if more messages are added. The startup sequence and synchronization for the FPGA are performed in the CPU. This is preferred because it is hard to write complex control logic in the TPU due to its event based structure.
A drawback of this concept is that the UART software degrades the performance of the TPU significantly; a higher baud rate leads to higher utilization. Startup synchronization will also be complex because of the nondeterministic behavior of messages sent from the CPU. It is not possible to send a message that informs which tooth is active, but information about the phase of the flywheel can be sent. An alternative could be to use the UART in the CPU to communicate with the FPGA, especially if TPU functionality is degraded.
3.4.3 Concept 2 - Memory Mapped Interface
The following concept uses some of Bohlin's suggestions [3] on how the S8 and the FPGA should be connected. It uses a memory mapped approach to communicate between the S8 and the FPGA. The FPGA consists of five modules. A Message Decoder module snoops on the address and control buses to determine when a message is received. A Memory Controller handles arbitration of the memory so that reads and writes are performed correctly. A CAD module handles the synchronization in the FPGA. It receives decoded messages from the Message Decoder module and reacts to them. The CAD module informs the TPU Interface when it is time to send synchronization messages to the TPU.
Figure 3.6: Architecture of concept 2 with memory mapped interface
This concept has several advantages. It is possible to send several different types of messages and data; a limiting factor is the size of the memory in the FPGA. It is also easier to put synchronization logic on the CPU, because programming is easier than on a TPU. On the other hand, this concept also suffers from complex startup synchronization, as described for concept 1 in section 3.4.2. Another negative aspect is that configuring the memory controller in the CPU could be time consuming.
3.4.4 Concept 3 - Dual Handshake Interface
The dual handshake concept differs from concepts 1 and 2. This architecture, shown in Figure 3.7, uses only the TPU and the FPGA to synchronize the CAD value. It uses a dual handshake protocol to communicate between the TPU and the FPGA. The FPGA consists of four modules: a TPU Interface and a Receive module, which handle communication to and from the FPGA, a Message Decoder module that decodes received messages, and a CAD module that handles synchronization in the FPGA. A main difference compared to the other methods is that the CPU is not involved in the synchronization process. This concept also allows an easier startup synchronization process, because it would be possible to synchronize at every tooth on the flywheel due to the lower latency of messages sent from the TPU to the FPGA. This method would also be the simplest to interface with the FPGA.
Figure 3.7: Architecture of concept 3 with a dual handshake interface
There are two main drawbacks with this method. The first is increased utilization in the TPU, with a great risk that other channels will be blocked. The second is that the TPU has only a few available channels, which limits the number of messages that can be sent. This can be extended by sending data multiple times, but that also increases the utilization in the TPU. Control logic is also hard to program in a TPU because of its event based structure.
3.5 Concept Selection
If the different concepts are analyzed according to the implementation factors described in section 3.1, the choice is rather simple. Concept 2 fulfills the latency requirements of the CAD value and will not degrade the performance of the S8 significantly. It is also by far the most flexible of the concepts, because it can communicate from the FPGA either by memory mapping or by a handshake communication protocol. One drawback compared to the other concepts is that it is harder to test a memory mapped interface between the CPU and the FPGA. A software simulator cannot be used as in concepts 1 and 3, where the TPU and FPGA communication can be simulated. Another drawback is that estimation of flywheel position will be harder compared to concept 3, which would increase the complexity and area of the FPGA.
Chapter 4
System Design and Implementation
This chapter describes how the design selected in chapter 3 was implemented. It describes the process of constructing the hardware required to connect the S8 with an FPGA and how the functionality in the TPU, the CPU and the FPGA was implemented. No time has been available to implement a second concept as described in section 1.3.
4.1 PCB Design
4.1.1 PCB-Adapter
To enable communication between the S8 and the FPGA a PCB-adapter was designed to connect a subset of the S8 pins to the FPGA. A pin connector was included in the design to make it possible to monitor and debug communication between the units. The PCB-adapter was designed in KiCAD [14] and produced by an external company.
Three revisions were required to get a working prototype. Bad soldering technique combined with unconnected signals in the design made it necessary to cancel the first revision. In the second revision it was detected that some signals connected to the FPGA were using a 5 V logic level, which the FPGA is not able to handle. A voltage divider was therefore manufactured at Scania to handle this issue (see section 4.1.2). When the PCB-adapter had been soldered, it was detected that one of the connectors was rotated in the layout. A consequence of this was that some of the data bus signals became unavailable. This meant that a third and final PCB-adapter had to be designed. The final PCB-adapter can be seen in Figure 4.1. Figure 4.2 shows the S8 connected with the FPGA via the PCB-adapter.
Figure 4.1: Top and bottom view of the PCB-adapter
4.1.2 PCB Voltage Divider
During manufacturing of the second revision of the PCB-adapter it was detected that the TPU IOs were using a 5 V logic level. This is problematic because the FPGA cannot handle those voltage levels [26]. To handle this issue an additional PCB was manufactured at Scania, which divides the voltage from 5 V to 3.3 V, a level that the FPGA can operate at. A picture of the voltage divider can be seen in Figure 4.3. This design was placed on the PCB-adapter in the third revision to integrate the entire design into a single PCB.
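As a worked example of the ratio such a divider has to provide, assume a plain two-resistor divider with a series resistor $R_1$ and a shunt resistor $R_2$. The actual circuit and component values of the Scania PCB are not stated here, so both symbols are assumptions used only for illustration:

\[ V_{out} = V_{in} \cdot \frac{R_2}{R_1 + R_2}, \qquad \frac{R_2}{R_1 + R_2} = \frac{3.3}{5} = 0.66 \quad\Rightarrow\quad \frac{R_1}{R_2} = \frac{5 - 3.3}{3.3} \approx 0.52 \]

Any resistor pair with this ratio gives 3.3 V from a 5 V input under no load; the absolute values would in practice be chosen with regard to input current and signal edge speed.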
Figure 4.3: Layout and picture of the designed voltage divider
4.2 Synchronization of CAD
To know what functionality should be placed in each unit and to get an overview of the communication during synchronization, a sequence diagram was made, shown in Figure 4.4. The first synchronization step is the startup phase, where the CPU sends information to the FPGA about engine type, master sensor and current revolution. The timing of when to send the current revolution is important, so that it does not take place at the end of a revolution. When the FPGA has found synch, it informs the CPU that synchronization at every crank angle interrupt is possible. When this phase is over, the CPU checks at every crank angle interrupt that the timestamps of the S8 and the FPGA differ by at most tmax. If the time difference is larger, the synchronization process is restarted.
[Sequence diagram with the participants :CPU, :TPU and :FPGA. Messages shown: Start/Soft Reset, Send Engine Parameters, Send Master Sensor, Send Current Revolution, Synch Found, Send Position, Store TPU Timestamp and Store FPGA Timestamp.]
Figure 4.4: Sequence diagram of CAD synchronization
4.3 TPU Design
4.3.1 Pin and Channel Selection
Messages sent from the FPGA are handled by a request channel. To be able to confirm that a message has been received, an acknowledgement channel was implemented. There are also data channels, each responsible for one bit of the messages sent from the FPGA to the TPU. Four data channels were chosen to handle messages from the FPGA, which makes it possible to decode 16 types of messages. Eight messages are required to tell which crank angle interrupt has occurred (the maximum number of cylinders is 8). Two messages are also necessary to inform that the CAD value for a flywheel position sensor on the FPGA has lost synch. If more than 16 messages are required, the number could be extended to 256 by using more of the available data channels. All implemented channels were placed on TPU engine B, because all GPIO channels on engine A are occupied.
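The relation between the number of data channels and the number of message types can be illustrated with a short C sketch. The names and the bit order (data channel 0 as the least significant bit) are assumptions made for the example, not taken from the TPU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_DATA_CHANNELS 4u

/* Number of distinct messages that n one-bit data channels can encode. */
uint32_t max_message_types(uint32_t num_channels)
{
    assert(num_channels < 32u);
    return 1u << num_channels;
}

/* Combine the pin states of the data channels into one message value.
 * pins[0] is assumed to be the least significant bit. */
uint8_t pack_pin_state(const bool pins[NUM_DATA_CHANNELS])
{
    uint8_t state = 0u;
    for (uint32_t i = 0u; i < NUM_DATA_CHANNELS; i++) {
        if (pins[i]) {
            state |= (uint8_t)(1u << i);
        }
    }
    return state;
}
```

Four channels give 16 message types, which covers the ten messages listed above, and eight channels would give the 256 message types mentioned as a possible extension.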
4.3.2 Functional Description
The intent when designing the TPU code was to make the TPU a unit that only forwards messages. A reason for this is that analysis of the CAD value shall be made in the CPU, where it is easier to control synchronization logic. The functionality that has been added to the TPU can be seen in Figure 4.5. CHANNEL_REQUEST handles the request signal from the FPGA and copies the PinState variable to the LastPinState variable when req has been asserted. When req is asserted, the LastTimeStampFPGA register is also written. The four data channels react to changes on the data wires and write them to the PinState variable. The behavior of CHANNEL_ACK depends on the WaitState variable, which describes which state CHANNEL_REQUEST is in. If WaitState indicates that the req signal is high, ack is asserted, and if it is low, ack is deasserted.
CHANNEL_EPS1 and CHANNEL_EPS2 are activated when a positive edge has been detected on the eps signals. These channels are placed on TPU engine A, whereas the implemented channels are placed on TPU engine B. This was an issue because the TPU engines have different counters, which makes it impossible to compare timestamps. This was solved by adding two additional channels, CHANNEL_EPS_SAVE1 and CHANNEL_EPS_SAVE2. These were placed on TPU engine B, and a link thread is created from CHANNEL_EPS1 or CHANNEL_EPS2 whenever a positive edge on eps1 or eps2 is detected. This made it possible to sample timestamps and write them to the LastTimeStampEPS1 and LastTimeStampEPS2 registers. The additional linking time will increase the time difference between timestamps, and there is also no guarantee of when links will be executed, due to the TPU's event based structure. But if the difference between timestamps is less than tmax, it is still guaranteed that the flywheel position is synchronized.
One thing that has to be considered is that the request signal has to be asserted after the data channels have handled a new message. Otherwise the data channels will be sampled before all of them have been updated. One solution could be to set the data signals before the request signal is set. To minimize the wait time for the request signal, Gray coding could be used, where only one data bit is changed for every successive value. This would lead to fewer events generated in the TPU compared with a counter based approach, where several bits can change when the counter is incremented. However, there is no guarantee that an event will be served within tmax. The solution that has been implemented prepares the output ahead of each crank angle interrupt. The time between every crank angle interrupt is at minimum a few milliseconds, which gives the data channels enough time to sample the signals before the crank angle interrupt occurs.
Figure 4.5: TPU channel accesses of Shared Code Memory
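The difference between Gray coding and a plain binary counter can be made concrete with a small C sketch. The function names are hypothetical and the four-bit message width is taken from the four data channels described in section 4.3.1; the conversion itself is the standard reflected binary Gray code.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a 4-bit binary value to reflected binary Gray code. */
uint8_t binary_to_gray(uint8_t value)
{
    assert(value < 16u);
    return (uint8_t)(value ^ (value >> 1));
}

/* Convert a Gray coded value back to binary. */
uint8_t gray_to_binary(uint8_t gray)
{
    uint8_t value = gray;
    for (uint8_t shift = 1u; shift < 8u; shift <<= 1) {
        value ^= (uint8_t)(value >> shift);
    }
    return value;
}

/* Number of data lines that toggle when going from a to b. */
unsigned bits_changed(uint8_t a, uint8_t b)
{
    uint8_t diff = (uint8_t)(a ^ b);
    unsigned count = 0u;
    while (diff != 0u) {
        count += diff & 1u;
        diff >>= 1;
    }
    return count;
}
```

Going from 7 to 8 toggles all four lines in plain binary, and therefore generates four TPU events, but only one line when both values are Gray coded.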
4.4 CPU Design
The CPU memory controller supports communication with several peripheral units. The FPGA memory is used as such a peripheral unit to enable communication between the CPU and the FPGA. A separate address space was assigned to the FPGA memory in the CPU. Normally, when a RAM memory is set up, the placement of variables is nondeterministic. However, when using a shared memory in the FPGA, all variables need to have a designated address; otherwise shared memory communication between the CPU and the FPGA would be impossible. When reading from or writing to the FPGA, the CPU compiler could interpret those operations as redundant and remove them during compilation. To ensure that every read and write to the FPGA is performed, variables are declared as volatile to prevent optimization by the compiler. Example 4.1 shows an example of reading from and writing to the FPGA memory.
Example 4.1: Read/Write in CPU to FPGA Memory
/* Declare variables, setting pointers to FPGA memory. volatile is used
   to protect the accesses from compiler optimization.
   ADDRESS_MEMORY_POS_x are the designated FPGA memory addresses. */
static volatile unsigned int *fpga_mem_variable_1 =
    ((unsigned int *) ADDRESS_MEMORY_POS_1);
static volatile unsigned int *fpga_mem_variable_2 =
    ((unsigned int *) ADDRESS_MEMORY_POS_2);

void exampleMethod(void)
{
    /* Declare read variable */
    static volatile unsigned int read_variable;

    /* Write the value 1 to FPGA memory */
    *fpga_mem_variable_1 = (unsigned int) 0x00000001;

    /* Read from FPGA memory (written from FPGA module) */
    read_variable = *fpga_mem_variable_2;

    /* Use read value */
    if (read_variable == 42) {
        /* Do something */
    }
}
Every variable is saved with 32 bits. This was chosen for simplicity: one variable is mapped to one address. There is however hardware support (signals from the CPU) for saving 8, 16, 24 and 32 bit variables. Space is left unused if variables that only require 8 bits are saved as 32 bit variables, but this is not an issue at present because no significant amount of data will be sent. Note that if 8, 16, 24 and 32 bit variables are implemented, the byte order has to be considered. A big-endian configuration is currently used, which means that data is saved with the most significant byte first. It is possible to change the configuration to little-endian if that is preferred.
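What big-endian byte order means for a 32 bit variable can be shown with a short, portable C sketch. The helper names are hypothetical; the code only illustrates the byte layout and is not part of the S8 software.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32 bit value with the most significant byte first. */
void store_be32(uint8_t out[4], uint32_t value)
{
    assert(out != 0);
    out[0] = (uint8_t)(value >> 24);   /* most significant byte */
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)(value);         /* least significant byte */
}

/* Read back a 32 bit value stored with the most significant byte first. */
uint32_t load_be32(const uint8_t in[4])
{
    assert(in != 0);
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

For the value 0x12345678 the byte at the lowest address is 0x12. An 8 bit variable placed in the same 32 bit word would therefore have to agree with the FPGA side on which of the four byte positions it occupies.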
The CPU in the S8 supports a cache memory which can be configured as write-back or write-through memory. The cache memory has been disabled for addresses designated to the FPGA memory to ensure that the variables are always up-to-date. If large amounts of data are to be read from or written to the FPGA, the time that memory accesses take has to be considered. This could otherwise be a bottleneck and degrade the performance of the CPU, because CPU execution is stalled while a memory operation is performed. The current implementation uses a time window of 188 ns for a write or a read operation. This can be set to as low as 31 ns, which is significantly lower than in the current implementation. But there is no guarantee that the FPGA would be able to support this speed, due to switching and propagation delay. There has not been enough time to analyze this problem, but it should be considered if large amounts of data are going to be exchanged between the CPU and the FPGA in the future.
The CPU logic has been implemented as an FSM (Finite State Machine) to enable structured control of the synchronization, as seen in Figure 4.6. It starts by sending a start message to the FPGA. Then it sends information to the FPGA about engine type, master sensor and current revolution. When this phase is over it awaits a synchronization message from the FPGA. If it is not sent within a specific time the synchronization procedure is restarted. When the FPGA has sent a synchronization message, the CPU starts to compare timestamps at every crank angle interrupt. If the timestamps are within tmax they are considered synchronized, and if the timestamps differ too much the synchronization process is restarted.
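The state machine described above can be sketched as a pure transition function in C. This is a minimal illustration and not the implemented S8 code: the state names, the input struct and the function name are hypothetical, and the actual sending of messages to the FPGA is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state names following the description in the text. */
typedef enum {
    STATE_START,             /* send start message to the FPGA */
    STATE_SEND_PARAMETERS,   /* engine type, master sensor, revolution */
    STATE_WAIT_FOR_SYNC,     /* await synchronization message */
    STATE_VERIFY             /* compare timestamps at every CAI */
} sync_state;

typedef struct {
    bool fpga_sync_found;    /* synchronization message received */
    bool wait_timed_out;     /* no message within the allowed time */
    bool cai_detected;       /* crank angle interrupt occurred */
    uint32_t timestamp_diff; /* difference between the timestamps */
    uint32_t t_max;          /* allowed difference, same unit */
} sync_inputs;

/* Return the next state for the given state and inputs. */
sync_state sync_fsm_step(sync_state state, const sync_inputs *in)
{
    assert(in != 0);
    switch (state) {
    case STATE_START:
        return STATE_SEND_PARAMETERS;
    case STATE_SEND_PARAMETERS:
        return STATE_WAIT_FOR_SYNC;
    case STATE_WAIT_FOR_SYNC:
        if (in->fpga_sync_found) return STATE_VERIFY;
        if (in->wait_timed_out)  return STATE_START;
        return STATE_WAIT_FOR_SYNC;
    case STATE_VERIFY:
        /* Restart when the timestamps are not within t_max. */
        if (in->cai_detected && in->timestamp_diff >= in->t_max) {
            return STATE_START;
        }
        return STATE_VERIFY;
    }
    return STATE_START;      /* unknown state: restart */
}
```

Keeping the transition logic free of side effects makes it possible to test the restart behavior on a host computer without the S8 or the FPGA.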
4.5 FPGA Design
Development of FPGA software has been done using Xilinx ISE WebPACK edition [11] and the code has been written in VHDL. The language was a requirement from Scania and the development tool was used in a previous master thesis. The author also has previous experience with the tool, which made it a natural choice. A module based approach was used to design the FPGA implementation. This provided a bottom-up implementation strategy, which makes it easier to test small modules early and thereby identify design errors and bugs. A modular approach also provides the possibility of reuse. The design was also made generic, so that changes can be implemented with minimum effort. A schematic of the hardware was made before implementing the VHDL code, which made the implementation of the VHDL code a simpler task.
The FPGA board used in this master thesis is an Opal Kelly XEM3010 [13] (Figure 4.7) with a Xilinx Spartan 3 XC3S1500-4FG320 FPGA. This is the same FPGA that has been used in Bohlin’s master thesis [3].
Figure 4.7: Opal Kelly XEM3010 with a Spartan 3 FPGA
4.5.1 Overview
To enable a modular design, different functionality was placed in separate modules. An overview of the FPGA design can be seen in Figure 4.8. The Message Decoder module and the TPU Interface module can be classified as communication modules; they handle communication with the CPU and the TPU. The BRAM Controller is responsible for control of the shared memory. It handles reads and writes to the memory. Tracking of flywheel position is performed by the CAD module with the sensor input signals eps1 and eps2. The Add-On Unit (which could for example be a closed-loop combustion control module) is a calculation unit that will use the CAD value and also communicate with the CPU via the shared memory. This unit is not implemented in this master thesis, but it is easy to interface one or several calculation units to the design. The signal interface to the FPGA can be seen in Table 4.1.
Figure 4.8: Block diagram of FPGA implementation
4.5.2 Message Decoder
The Message Decoder module is responsible for detecting new messages that are sent from the CPU. It snoops on the address bus, and if a certain address is set and the signals ce and we are active, a valid new message has been sent. The messages that can currently be sent from the CPU to the FPGA are:
• Start: start, restart and stop synchronization with the FPGA.
• Master Sensor: defines which flywheel position sensor is master.
• Revolution: specifies the current revolution of the flywheel. The flywheel position is either within 0-359.9° or within 360-719.9°.
• Engine Type: number of cylinders. The number of cylinders can be 5, 6 or 8.
| Signal name | Type | Description |
| eps1 | input | Engine position sensor 1 |
| eps2 | input | Engine position sensor 2 |
| ce | input | Chip enable for block RAM. Makes reads and writes to memory possible. |
| oe | input | Output enable for block RAM. Makes read operations from the block RAM possible. |
| we | input | Write enable for block RAM. Makes write operations possible. |
| address(7:0) | input | Address to block RAM |
| data_cpu(31:0) | input/output | Bidirectional data line |
| ack | input | Message received (acknowledged) signal |
| req | output | Request to send message to the TPU |
| data_tpu(3:0) | output | Data sent to TPU |
Table 4.1: FPGA signal interface
4.5.3 BRAM Controller
The purpose of the BRAM controller (Figure 4.9) is to ensure safe accesses to the shared memory in the FPGA. A BRAM is used to store data that is shared between the CPU and the FPGA. There were two main reasons for choosing a BRAM: it was a request from Scania, and the previous master thesis had used a similar design. The BRAM is configured as a dual port memory to ensure that both the CPU and the FPGA can access the shared memory simultaneously.
One issue with keeping information in a BRAM is that only one module in the FPGA can read from one memory location. If data is going to be shared between different modules, the memory content has to be saved in a register before it is accessed by both modules. Therefore one should consider using other memory techniques to handle internal shared variables in the FPGA. A suitable choice could be memory mapped registers. This provides less FPGA utilization but at the cost of a more complex memory controller.
When using a shared memory structure with a dual port memory there could be memory collisions. There are two cases in which this could occur: writes to the same memory address on both ports, or a write operation on one port and a read operation on the other port to the same memory address. Xilinx Core Generator System [23] is used to instantiate the BRAM. If the memory is configured in READ_FIRST mode it supports simultaneous read and write to the same memory address. The read value is the previous value and the written value is available one clock cycle later. After discussions with the Scania supervisors, it was decided that simultaneous writes to the same address are considered a programming error.
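The READ_FIRST behavior described above can be illustrated with a small software model. The sketch below is a strong simplification under stated assumptions: both ports are assumed to be clocked by the same edge, the memory depth of 256 words follows from the 8 bit address in Table 4.1, and the type and function names are hypothetical. It is not a substitute for the behavior specified by Xilinx.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 256u   /* 8 bit address, 32 bit words */

typedef struct {
    uint32_t word[MEM_WORDS];
} bram_model;

/* Model one clock edge where one port writes and the other port reads.
 * In READ_FIRST mode the old content is sampled before the write is
 * applied, so a read of the written address returns the previous value. */
uint32_t bram_clock_read_first(bram_model *m,
                               uint8_t write_addr, uint32_t write_data,
                               uint8_t read_addr)
{
    assert(m != 0);
    uint32_t read_value = m->word[read_addr];  /* old content first */
    m->word[write_addr] = write_data;          /* then apply the write */
    return read_value;
}
```

A read on the same edge as a write to the same address returns the previous value, and a read on the following edge returns the newly written value, which matches the one clock cycle delay mentioned above.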
Figure 4.9: Block diagram of BRAM controller design
CPU BRAM accesses are handled by an FSM, which is the method that was suggested in Bohlin's previous master thesis [3]. The FSM is necessary due to clock domain crossing and bus arbitration. Control signals from the CPU have to be synchronized to the FPGA's clock domain to ensure that no timing violations occur within the clock-to-clock setup [28]. Due to the shared data bus between the CPU and the FPGA, arbitration has to be considered. It must be possible to put the data bus in a high impedance state when the CPU is not reading from the BRAM. This is possible if memory accesses are synchronized.
The FSM can be seen in Figure 4.10. A read is performed one clock cycle after ce and oe from the CPU are asserted. The read operation is finished when ce is deasserted or if oe or we changes. The latter should however not happen, because ce is deasserted before oe and we in a read or write cycle in the CPU. Writing is performed in a similar way, where we is asserted instead of oe.
4.5.4 CAD Module
Figure 4.10: FSM for BRAM port A
The CAD module shown in Figure 4.11 was divided into four smaller modules, which are listed and described in the following bullet points:
• CAI Logic: handles detection of crank angle interrupts. Consists of combinational logic.
• CAD Output: selects which sensor's CAD information is sent out, depending on which flywheel sensor is master.
• EPS Logic: handles signals from the flywheel position sensors and tries to provide a synchronized CAD value.
• CAD Controller: an FSM controlling the state of the CAD module.
An easy and structured way to control the logic in the CAD module was to use an FSM based design approach (Figure 4.12). When the module receives a start message it performs startup synchronization. When all configuration parameters have been set and a flywheel position sensor has been synchronized, a synchronization message is sent to the TPU. After the synchronization message has been sent, the CAD Controller continuously sends new messages to the TPU whenever a crank angle interrupt has occurred.
Figure 4.11: Block diagram of CAD module
EPS Logic consists of two submodules that estimate the CAD value for each flywheel position sensor. The modules work independently and there is no interaction between them. A flowchart was created to get a view of how the CAD value should be estimated. This is illustrated in Figure 4.13. An initial guess is first made. This value is updated until a gap has been detected. If the guess was correct the FPGA is synchronized; otherwise it adjusts its tooth value to the value of the gap and awaits the next gap. The period time between two successive teeth may deviate by at most 30% (Scania threshold); otherwise it is assumed that an error has occurred. Such deviation is caused by acceleration or deceleration of the engine. If an error occurs in the EPS Logic module, the entire estimation process is restarted. When the flowchart had been created, an FSM was designed to mimic the behavior of the flowchart. This was found to be an easy task due to the event based structure of the flowchart.
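The 30% plausibility check can be sketched in C. The actual check is implemented in VHDL in the FPGA; the function below is only an illustration, with a hypothetical name and the assumption that the deviation is measured relative to the previous tooth period. Gap detection is a separate step and is not covered by the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_DEVIATION_PERCENT 30u   /* Scania threshold from the text */

/* True if the current tooth period deviates at most 30% from the
 * previous one. Periods are given in clock ticks. */
bool tooth_period_plausible(uint32_t previous_period,
                            uint32_t current_period)
{
    assert(previous_period > 0u);
    uint32_t diff = current_period > previous_period
                        ? current_period - previous_period
                        : previous_period - current_period;
    /* Compare diff / previous <= 30 / 100 without using division. */
    return (uint64_t)diff * 100u <=
           (uint64_t)previous_period * MAX_DEVIATION_PERCENT;
}
```

Comparing cross-multiplied values avoids a division, which is also the form that maps most directly to hardware.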
Figure 4.13: Flowchart of CAD estimation
A flywheel consists of 58 teeth and two gaps. To approximate a CAD value within 0.1 CAD, the distance between two teeth has to be divided into internal positions ($pos_{internal}$). 256 positions were selected as the total number of internal positions. The reason for this is that one internal position corresponds to $\frac{360}{60 \cdot 256} \approx 0.023$ CAD, which is a higher precision than required. With this representation the flywheel position can be