1. Micro Architecture and Finite Length
Olle Seger (olle.seger@liu.se) Andreas Ehliar (ehliar@isy.liu.se)
Dake Liu, Rizwan Azhgar
Outline
Introduction
Some Administrative Information
Basic Components
Finite Length, Overflow, 2’s complement, rounding, saturation
About Lab-1, the Senior Processor …..
Administrative Information
Labs
In groups of two students
No written report
Be prepared to answer questions
(both of you) about how your design works
Mandatory
FAQ:
Q: What if you miss a lab?
A: Do the lab yourself and show it at a later date (ssh to ixtab.edu.isy.liu.se)
2’s Complement Number Representation
4-bit example, bit weights -1 ½ ¼ 1/8, range [-1,1-1/8]:
  1 0 1 0 = -1 + ¼ = -¾
8-bit format, bit weights -4 2 1 ½ ¼ 1/8 1/16 1/32, range [-4,4-1/32]
It’s easy to increase the number of bits: duplicate the sign bit on the left, concatenate zeros on the right.
It’s still the same number.
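The widening step can be sketched in Python. This is a minimal illustration, assuming the 4-bit format (MSB weight -1) and the 8-bit format (MSB weight -4) from the slide; the helper names `value` and `widen` are hypothetical and not part of any course tool.

```python
def value(bits: str, msb_weight: float) -> float:
    """Value of a two's complement bit string whose MSB has weight -msb_weight."""
    total = -msb_weight * int(bits[0])
    weight = msb_weight / 2
    for b in bits[1:]:
        total += weight * int(b)
        weight /= 2
    return total


def widen(bits: str, sign_copies: int, zeros: int) -> str:
    """Duplicate the sign bit on the left and concatenate zeros on the right."""
    return bits[0] * sign_copies + bits + "0" * zeros


narrow = "1010"                               # weights -1 1/2 1/4 1/8
wide = widen(narrow, sign_copies=2, zeros=2)  # weights -4 2 1 1/2 ... 1/32

print(narrow, value(narrow, 1.0))   # 1010 -0.75
print(wide, value(wide, 4.0))       # 11101000 -0.75
```

Both strings evaluate to the same number, which is the point of the slide: only the range and the resolution change.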
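2’s Complement Number Representation
Figure: reducing an 8-bit value (bit weights down to 1/32), via a 6-bit intermediate (bit weights down to 1/8), to the 4-bit format with bit weights -1 ½ ¼ 1/8. Bit positions in the figure are labelled x1, x2, y1 and y2.
rounding / truncate: remove the extra low-order bits
saturate: on overflow the result is replaced by MAX (0 1 1 1) or MIN (1 0 0 0), depending on x1 and x2 (the cases x1=0, x1x2=10 and x1x2=11 are marked in the figure)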
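The three operations can be sketched on raw integers. This is a minimal sketch, assuming the wide value is stored as an integer scaled by 32 (LSB weight 1/32) and the 4-bit result is scaled by 8 (LSB weight 1/8); rounding is assumed to be "add half an output LSB, then drop bits", and the function names are hypothetical.

```python
def truncate(raw: int, drop: int) -> int:
    """Drop the `drop` lowest bits (arithmetic shift, rounds toward minus infinity)."""
    return raw >> drop


def round_half_up(raw: int, drop: int) -> int:
    """Add half an output LSB, then drop the `drop` lowest bits."""
    return (raw + (1 << (drop - 1))) >> drop


def saturate(raw: int, bits: int) -> int:
    """Clamp to the range of a `bits`-wide two's complement number."""
    max_val = (1 << (bits - 1)) - 1    # 0111 for bits=4
    min_val = -(1 << (bits - 1))       # 1000 for bits=4
    return max(min_val, min(max_val, raw))


x = 27                                      # 27/32 = 0.84375
print(truncate(x, 2))                       # 6  -> 6/8 = 0.75
print(round_half_up(x, 2))                  # 7  -> 7/8 = 0.875
print(saturate(round_half_up(31, 2), 4))    # 7  (rounding overflowed, clamp to MAX)
print(saturate(round_half_up(-128, 2), 4))  # -8 (-4.0 does not fit, clamp to MIN)
```

Rounding before saturating matters: 31/32 rounds up to 1.0, which is outside [-1, 1-1/8] and is only caught because saturation runs last.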
Adder(signed/unsigned)
Implicitly: integer, two’s complement
{c_o, O[15:0]} <= A[15:0] + B[15:0] + {15'b0, c_i}
Alternatively
{c_o, O[15:0], x} <= {A[15], A[15:0], 1'b1} + {B[15], B[15:0], c_i}
Input operands : N bit ; Output : N+1 bit
Subtraction : O
Figure: adder block (+) with inputs A and B, carry-in c_i and carry-out c_o.
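A small Python model of the adder, as a sketch only. It works on unsigned 16-bit patterns, so the second function shows just the LSB trick from the alternative form (appending 1 and c_i) and leaves out the sign-extension bit. The subtraction helper assumes the usual two's complement identity A - B = A + ~B + 1; all function names are hypothetical.

```python
MASK16 = 0xFFFF


def add16(a: int, b: int, c_i: int):
    """Return (c_o, O) for O = A + B + c_i on 16-bit patterns."""
    total = (a & MASK16) + (b & MASK16) + (c_i & 1)
    return total >> 16, total & MASK16


def add16_lsb_trick(a: int, b: int, c_i: int):
    """Same result via {A,1} + {B,c_i}: the extra LSB column carries exactly when c_i is 1."""
    total = (((a & MASK16) << 1) | 1) + (((b & MASK16) << 1) | (c_i & 1))
    return total >> 17, (total >> 1) & MASK16


def sub16(a: int, b: int):
    """A - B computed as A + ~B + 1 (invert B, carry-in = 1)."""
    return add16(a, ~b & MASK16, 1)


print(add16(0xFFFF, 1, 0))   # (1, 0)
print(sub16(7, 5))           # (1, 2)
print(sub16(5, 7))           # (0, 65534), i.e. -2 as a 16-bit pattern
```

The trick lets a plain two-input adder absorb the carry-in without a third operand.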
Multiplier(signed)
O[7:0] <= A[3:0] * B[3:0]
Example: Integer or Fractional Multiplication
0111 * 0111 = 00110001 or
0.111 * 0.111 = 00.110001 = 0.1100010
Figure: MULS block with inputs OMA[15:0] and OMB[15:0] and a 32-bit output Mul_Output[31:0].
Input operands : N bit ; Output : 2N bit
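The integer and fractional readings of the same product can be checked with a few lines of Python. This is a sketch for the 4-bit example only; the helper names are hypothetical, and the final left shift by one (which removes the duplicated sign bit of the fractional product) is written out explicitly.

```python
def to_signed(pattern: int, bits: int) -> int:
    """Interpret a bit pattern as a two's complement integer."""
    return pattern - (1 << bits) if pattern & (1 << (bits - 1)) else pattern


def mul_signed(a: int, b: int, n: int = 4) -> int:
    """N-bit by N-bit signed multiply, returned as a 2N-bit pattern."""
    product = to_signed(a, n) * to_signed(b, n)
    return product & ((1 << (2 * n)) - 1)


p = mul_signed(0b0111, 0b0111)
print(format(p, "08b"))            # 00110001: integer 49, or 00.110001 = 49/64

# Fractional view: operands are 0.111 = 7/8; the product has two bits left of the point.
q = (p << 1) & 0xFF                # shift out the redundant sign bit
print(format(q, "08b"))            # 01100010: 0.1100010 = 98/128
print(q / 128 == (7 / 8) ** 2)     # True
```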
Signed multiplication
paper&pencil algorithm

      0111     7
    * 0111     7
  00000111     7
  00001110    14
  00011100    28
  00000000     0
  00110001    49

      1001    -7
    * 0111     7
  11111001    -7
  11110010   -14
  11100100   -28
  00000000     0
  11001111   -49

      0111     7
    * 1001    -7
  00000111     7
  00000000     0
  00000000     0
  11001000   -56
  11001111   -49

      1001    -7
    * 1001    -7
  11111001    -7
  00000000     0
  00000000     0
  00111000    56
  00110001    49
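Figure: MAC datapath. Blocks: Register File, DM0 and DM1 (addressed by ar), multiplier (X), ALU, accumulator, scale, round, sat.
Format labels on the signals: (1,15) operands, (2,30) product, sign extend to (10,30), (10,30) accumulator, (10,15), (1,15).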
A Rounding Example
    A7 A6 A5 A4
  +  0  0  0 A3
  = B7 B6 B5 B4
or
    A7 A6 A5 A4 A3
  +  0  0  0  0  1
  = B7 B6 B5 B4  X
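Both forms of the rounding example give the same four result bits, which a brute-force Python check can confirm. This sketch assumes an 8-bit input A7..A0 and models only the addition; the result wraps modulo 16 because saturation is a separate step. The function names are hypothetical.

```python
def round_add_a3(a: int) -> int:
    """B7..B4 = A7..A4 + A3."""
    high = (a >> 4) & 0xF
    a3 = (a >> 3) & 1
    return (high + a3) & 0xF


def round_add_one(a: int) -> int:
    """B7..B4 X = A7..A3 + 1, then the lowest bit X is discarded."""
    top5 = (a >> 3) & 0x1F
    return ((top5 + 1) >> 1) & 0xF


print(round_add_a3(0b01011000))   # 6: A3 = 1, so 0101 rounds up to 0110
print(round_add_a3(0b01010111))   # 5: A3 = 0, so 0101 is kept
print(all(round_add_a3(a) == round_add_one(a) for a in range(256)))   # True
```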
Senior Assembler & Simulator
Assembly Code Includes:
Assembly Instructions: LD, ST, ADD, CMP, …
Symbolic name for memory locations: labels
Assembler directives: .skip 31, .df 0.125, …
Senior Assembler: Translates assembly code into an executable binary code (Hex Format).
Senior Simulator: Takes the hex file and provides a debugging environment.
Flow: Assembly Code (ex.asm) -> Assembler (srasm) -> ex.hex -> Simulator (srsim) -> Debugging + Output text file
Senior
Senior: DSP with lots of bells and whistles
• 32 16-bit general regs (r0-r31)
• 32 16-bit special purpose regs
• 4 32-bit accumulator regs + 8 guard bits (acr0-acr3)
About Senior
Special purpose registers
About Senior
Memory
Where is the data? rom0
Where are the coefficients? rom0
But you need them at the same time. So?
How to save the output to a text file
out 0x11, r31
Important instructions
convxx
repeat vs cmp & jump
set, clr
move, ld, st
Hint : check the cycles required for data to be ready and use NOP
accordingly.
Figure: block diagram with the labels PM, CP, RF, DP, dm0, dm1, rom0, ram0 and ram1.
About Senior
move, load and store instructions
  move r7,r14
  move.eq r22, rnd mul2 acr3
  set r21,711
  ld0 r1,(ar1,r9)   ; r1 <- M0(ar1+r9)
  ld1 r1,(ar0++%)   ; r1 <- M1(ar0)
                    ; ar0 = (ar0==top0)?bot0:ar0+step0
  st1 (ar2++),r5    ; M1(ar2)<-r5,ar2++
About Senior
Short arithmetic, logic, shift instructions
  add r7,r14,r15
  add.ne r7,r12
Long instructions
  addl.meq acr2,acr1,acr0
  addl acr1,acr3,r2:0
  convss acr0,(ar0++%),(ar1++%)
  ;acr0 += M0(ar0)*M1(ar1) , ar0 , ar1
About Senior
How to use “repeat”
Hardware loop!
……
  repeat label_end, 32
  set r4,0xfa72
  move r1,sr3
  mac acr0, r0, r1
label_end
move r17,sr31
……
These 3 instructions are repeated. No (visible) loop counter. No test. No jump.
About Senior
How to use conditional branch “jump”
  set r0,32             ; set loop counter
label_start
  …
  dec r0                ; decrement loop counter
  jump.ne label_start   ; no delay slots
  xxx                   ; branch delayed
  yyy                   ; 3 cycles
  zzz                   ;
About Senior
jump instruction – Another Example
……
  jump.ne ds2,label4
  move r1,sr3   ; this will always execute
  set r2,7      ; so will this
  move r12,r3   ; but not this
label4
set r7,3
……
  set r0,32     ; set loop counter
label_start
  …
  dec r0        ; decrement loop counter
  jump.ne ds3 label_start
  xxx
  yyy
  zzz
About srsim
How to debug in simulator (srsim)
h: help menu
r<n>: execute ‘n’ instructions
l: list the instructions around the pc
p: print the values in the registers
Special registers: which are ar0 and ar1?
Accumulation registers: which is acr0?
g: run the whole program
Exercise
Convolution
Figure: FIR filter. A chain of registers (reg) holds the present sample x(n) and the previous samples x(n-1), x(n-2), x(n-3), x(n-4). Each sample is multiplied by its coefficient h(0) … h(4), the products are added, and the sum passes through Round and Saturation to give y(n).
$$y(n) = \sum_{k=0}^{4} h(k)\,x(n-k) = h(0)x(n) + h(1)x(n-1) + h(2)x(n-2) + h(3)x(n-3) + h(4)x(n-4)$$
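As a reference model for checking simulator output, the convolution sum can be written directly in Python. This is a floating-point sketch under the assumption that samples before the start of the signal are zero; it does not model the rounding and saturation stages of the figure, and the function name `fir` is hypothetical.

```python
def fir(h, x):
    """y(n) = sum over k of h(k) * x(n-k), with x(n) = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(len(h)):
            if n - k >= 0:              # samples before the start count as zero
                acc += h[k] * x[n - k]
        y.append(acc)
    return y


# An impulse at the input reproduces the coefficients at the output.
print(fir([1, 2, 3, 4, 5], [1, 0, 0, 0, 0, 0]))   # [1.0, 2.0, 3.0, 4.0, 5.0, 0.0]
```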
Exercise 1.2
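Figure: memory layout. ar1 points into the coefficients h(0), h(1), … h(31), which lie between bot1 and top1. ar0 points into the signal x(0), x(1), … x(999), which is preceded by zeros (0 0 …).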
;; coeffs copied rom0 -> ram1
fir_filter
  set r3,signal
  set r1,1000           ; loop counter
  set ar1,coeffs        ; ar1->coeffs
  set ar0,zeros         ; ar0->signals
  set step1,1
  set bot1,coeffs
  set top1,coeffs_end
;;
loop
  inc r3
  move ar0,r3
  repeat falt,32
  convss acr0,(--ar0),(ar1++%)
falt
  dec r1
  jump.ne ds3 loop
  move r31,rnd div2 acr0
  clr acr0              ; clear accu
  out 0x11,r31
;;
;; end of code
  out 0x13,r0
  .rom0
  .scale 2.0
signal
  .df 0.0000
  .df 0.588059
coeffs
Figure labels: ram1, rom0
$$y(n) = \sum_{k=0}^{31} h(k)\,x(n-k), \quad 0 \le n < 1000$$
Exercise 1.2 with ringbuffer
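Figure: memory layout. coeffs h(0), h(1), … h(31) in rom0, addressed by ar0 between bot0 and top0. ringbuffer x(0), x(1), … in ram1, addressed by ar1 between bot1 and top1. ekg signal x(0), x(1), … x(999), addressed by ar2.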
; ekg copied rom0->ram1
; zeros in ringbuffer
; pointers fixed
;;
  set r1,1000           ; loop counter
;;
loop
  ld1 r0,(ar2++)        ; read signal
  dec r1                ; dec loop cnt
  st1 (ar1),r0          ; write r.b.
  repeat falt,31
  convss acr0,(ar0++%),(ar1++%)
falt
  move r2,ar1
  convss acr0,(ar0++%),(ar1++%)
  move ar1,r2
  jump.ne ds3 loop
  move r31,rnd div2 acr0
  clr acr0              ; clear accu
  out 0x11,r31
;; end of code
  out 0x13,r0
x = x0 + sin
$$y(n) = \sum_{k=0}^{31} h(k)\,x(n-k)$$
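The ring buffer idea can be modelled in Python to check results. This is a sketch, not a translation of the assembly: it assumes a buffer as long as the filter, initially zero, in which each new sample overwrites the oldest one, and it walks backwards from the newest sample, whereas the assembly steps its pointers forward with `++%`. The class name `RingFir` is hypothetical.

```python
class RingFir:
    """FIR filter that keeps the last len(h) samples in a circular buffer."""

    def __init__(self, h):
        self.h = list(h)
        self.buf = [0.0] * len(self.h)   # zeros in ringbuffer
        self.pos = 0                     # where the next sample is written

    def step(self, sample):
        n = len(self.h)
        self.buf[self.pos] = sample      # newest sample overwrites the oldest
        acc = 0.0
        idx = self.pos
        for k in range(n):
            acc += self.h[k] * self.buf[idx]   # h(k) * x(n-k)
            idx = (idx - 1) % n                # wrap around to older samples
        self.pos = (self.pos + 1) % n
        return acc


f = RingFir([1, 2, 3])
print([f.step(s) for s in [1, 0, 0, 0, 5]])   # [1.0, 2.0, 3.0, 0.0, 5.0]
```

No samples are ever moved; only the write position advances, which is what the modulo addressing with bot and top provides in hardware.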
Frequency domain
Exercise 1.3
  in r0,0x10
  clr acr0
  macss acr0,r0,r5
  macss acr0,r1,r6
  macss acr0,r2,r7
  macss acr0,r3,r8
  macss acr0,r4,r9
  move r10,sat rnd acr0
  nop
  out 0x11,r10
Figure: registers r0, r1, r2, r3, r4 hold 0, 0, 0, 0, 0; registers r5, r6, r7, r8, r9 hold h0, h1, h2, h3, h4.
  in r4,0x10
  clr acr0
  macss acr0,r4,r5
  macss acr0,r0,r6
  macss acr0,r1,r7
  macss acr0,r2,r8
  macss acr0,r3,r9
  move r10,sat rnd acr0
  nop
  out 0x11,r10
Figure labels: ringbuffer, coeffs
…
Unroll the loop 5 times!
Step h,x forward. Fill in x backward.