Exam in Optimising Compilers (DAT230/EDA230)

October 22, 2009, 8.00-13.00

Examiner: Jonas Skeppstedt

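[Figure: control flow graph with vertices a, b, c, d, e, f, g and h, drawn in five rows (a / b c / d e / f g / h); the edges are not recoverable from the extracted text.]

Figure 1: Control flow graph.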

1. (10p) Explain how the Lengauer-Tarjan algorithm (the O(N^2)-version) finds the dominator tree in the control flow graph in Figure 1. For each vertex, your solution should explain:

• when is the vertex put in a bucket?

• in which bucket?

• when is it deleted from the bucket?

• when does the algorithm find the immediate dominator for the vertex?
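
The bucket operations asked about above can be traced with a minimal sketch of the algorithm. It uses a naive `eval` without path compression, an adjacency matrix, and a hypothetical five-vertex graph in the usage note rather than the graph in Figure 1. All names (`reset`, `dominators`, `bucket`, `MAXN`) are assumptions made for this illustration, and the book's presentation may differ in details.

```c
#include <assert.h>
#include <string.h>

#define MAXN 16

int n;                       /* number of vertices; vertex 0 is the entry */
int edge[MAXN][MAXN];        /* edge[u][v] != 0 means there is an edge u -> v */
int idom[MAXN];              /* result: immediate dominator, -1 for the entry */

static int dfnum[MAXN], vertex[MAXN], parent[MAXN];
static int semi[MAXN], ancestor[MAXN];
static int bucket[MAXN][MAXN];   /* bucket[u][v] != 0 means v is in bucket(u) */
static int count;

/* Start a new graph with nv vertices and no edges. */
void reset(int nv)
{
    assert(nv > 0 && nv <= MAXN);
    n = nv;
    memset(edge, 0, sizeof edge);
}

/* Depth-first numbering; semi[] initially holds each vertex's own number. */
static void dfs(int u)
{
    dfnum[u] = semi[u] = count;
    vertex[count++] = u;
    for (int v = 0; v < n; v++)
        if (edge[u][v] && dfnum[v] < 0) {
            parent[v] = u;
            dfs(v);
        }
}

/* Naive eval: the vertex with the smallest semidominator number on the
   forest path from v up to, but excluding, the root (v itself if v is a root). */
static int eval(int v)
{
    int best = v;
    for (int u = v; ancestor[u] >= 0; u = ancestor[u])
        if (semi[u] < semi[best])
            best = u;
    return best;
}

void dominators(void)
{
    count = 0;
    memset(bucket, 0, sizeof bucket);
    for (int v = 0; v < n; v++)
        dfnum[v] = ancestor[v] = idom[v] = parent[v] = -1;
    dfs(0);
    for (int i = count - 1; i >= 1; i--) {    /* reverse depth-first order */
        int w = vertex[i], p = parent[w];
        for (int v = 0; v < n; v++)           /* reachable predecessors of w */
            if (edge[v][w] && dfnum[v] >= 0) {
                int u = eval(v);
                if (semi[u] < semi[w])
                    semi[w] = semi[u];
            }
        bucket[vertex[semi[w]]][w] = 1;       /* w enters its semidominator's bucket */
        ancestor[w] = p;                      /* link(p, w) */
        for (int v = 0; v < n; v++)
            if (bucket[p][v]) {
                bucket[p][v] = 0;             /* v is deleted from bucket(p) */
                int u = eval(v);
                idom[v] = semi[u] < semi[v] ? u : p;   /* u means: deferred */
            }
    }
    for (int i = 1; i < count; i++) {         /* resolve the deferred cases */
        int w = vertex[i];
        if (idom[w] != vertex[semi[w]])
            idom[w] = idom[idom[w]];
    }
}
```

For example, after `reset(5)`, setting the edges 0→1, 0→2, 1→3, 2→3 and 3→4, and calling `dominators()`, vertices 1, 2 and 3 get immediate dominator 0 and vertex 4 gets 3. In this sketch a vertex enters the bucket of its semidominator when it is processed in reverse depth-first order, and it leaves that bucket when the bucket's owner is next seen as the parent of a processed vertex.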

2. (5p) What is the dominance frontier of a vertex?

Answer: see the book.

3. (5p) What is control dependence and how is it computed?

Answer: see the book.

4. (10p) Consider again the control flow graph in Figure 1. Suppose there is a use of variable x in each vertex and an assignment to x in vertices a, c and e. In vertices a and c the definition is before the use and in vertex e the definition is after the use.

Translate the program to SSA form. Show the contents of the rename stack and when the stack is pushed and popped. You do not have to show how you compute the dominance frontiers.

Answer: see the book.

5. (10p) List scheduling is often inferior to software pipelining. Explain why. To get full points you must show an example of when software pipelining produces better code than list scheduling.

Answer: List scheduling can only hide the latency of instructions within a single loop iteration, and there may not be any independent instructions to execute between a particular producer and its consumer if only instructions from the same iteration are considered. With software pipelining, instructions from other loop iterations can be used to hide the latency. Consider:

float a[100];
float b[100];
float c[100];
int i;

/* ... */

for (i = 0; i < 100; ++i)
    a[i] = b[i] + c[i];

List scheduling will not be able to hide more than perhaps one or two clock cycles, whereas software pipelining can fully hide the latency of the floating-point add and, assuming L1 cache hits, of the array accesses.
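
To make the overlap concrete, the loop can be pipelined by hand at the source level. The sketch below is only an illustration of the idea, not what a compiler would emit: the function names and the three-stage split (load, add, store) are assumptions, and the arrays are assumed not to overlap.

```c
#include <assert.h>

#define N 100

/* The loop from the answer: load, add and store all belong to iteration i. */
void add_plain(float *a, const float *b, const float *c)
{
    for (int i = 0; i < N; ++i)
        a[i] = b[i] + c[i];
}

/* Hand-pipelined version: in the kernel, the store of iteration i, the add
   of iteration i + 1 and the loads of iteration i + 2 are independent of
   each other, so their latencies can overlap. */
void add_pipelined(float *a, const float *b, const float *c)
{
    float x, y, sum;

    assert(N >= 2);

    /* Prologue: fill the pipeline. */
    x = b[0];
    y = c[0];                    /* loads for iteration 0 */
    sum = x + y;                 /* add for iteration 0 */
    x = b[1];
    y = c[1];                    /* loads for iteration 1 */

    /* Kernel: one stage from each of three different iterations. */
    for (int i = 0; i < N - 2; ++i) {
        a[i] = sum;              /* store for iteration i */
        sum = x + y;             /* add for iteration i + 1 */
        x = b[i + 2];
        y = c[i + 2];            /* loads for iteration i + 2 */
    }

    /* Epilogue: drain the pipeline. */
    a[N - 2] = sum;
    sum = x + y;
    a[N - 1] = sum;
}
```

Both functions compute the same result. The difference is that no statement in the kernel depends on a value produced earlier in the same kernel iteration, which is what gives the scheduler independent work to place between a producer and its consumer.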

int f(int* a, int n, int c)
{
    int i, s;

    s = 0;
    for (i = 0; i < n; i++)
        s += a[c * i];
    return s;
}

Figure 2: C function for question on operator strength reduction.

6. (10p) Explain in principle how operator strength reduction on SSA form optimises the loop in Figure 2. Your description should be based on the SSA-graph of the code, but you don't have to explain every detail of the algorithm.

Answer:

s0 ← 0
i0 ← 0

i1 ← φ(i0, i2)
s1 ← φ(s0, s2)
i1 < n0 ?

t0 ← c0 × i1
t1 ← t0 × 4
t2 ← a0 + t1
t3 ← M[t2]
s2 ← s1 + t3
i2 ← i1 + 1

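The SSA-graph becomes (one node per definition, with an edge between each operation and the definitions of its operands):

s0: 0
s1: φ(s0, s2)
s2: s1 + t3
t3: M[t2]
t2: a0 + t1
t1: t0 × 4
t0: c0 × i1
i0: 0
i1: φ(i0, i2)
i2: i1 + 1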

During the execution of Tarjan's algorithm, i is classified as an induction variable, so its strongly connected component (SCC) is copied and modified for t0 as follows (t00, t01 and t02 denote versions 0, 1 and 2 of the copy):

t00: c0 × i0
t01: φ(t00, t02)
t02: t01 + c0

The use of t0 is changed to use t01 instead. The computation of t1 is now also a multiplication of an induction variable and a region constant, so the SCC of t0 is copied and modified for t1:

t10: t00 × 4
t11: φ(t10, t12)
t12: t11 + c0 × 4

Then the use of t1 is changed to use t11 instead. The computation of t2 is now the sum of an induction variable and a region constant, so the SCC of t1 is copied and modified for t2:

t20: a0 + t10
t21: φ(t20, t22)
t22: t21 + c0 × 4

The multiplication c0 × 4 is performed before the loop and saved in a new temporary variable, t4. After dead code elimination (DCE), the resulting program looks as follows:

s0 ← 0
t4 ← 4 × c0
t20 ← a0 + 0 × t4
t5 ← a0 + n0 × t4

t21 ← φ(t20, t22)
s1 ← φ(s0, s2)
t21 < t5 ?

t3 ← M[t21]
s2 ← s1 + t3
t22 ← t21 + t4
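
At the C level, the effect on the function in Figure 2 can be sketched roughly as follows. This is a hand-written illustration, not compiler output: `f_reduced`, `idx` and `end` are hypothetical names, and the sketch works on element indices rather than byte addresses, so the factor 4 (the assumed size of an int) stays implicit. Replacing the loop test with a comparison against a precomputed bound matches the original loop only when c is positive, which the sketch asserts.

```c
#include <assert.h>

/* The function from Figure 2: one multiplication c * i per iteration. */
int f(int *a, int n, int c)
{
    int i, s;

    s = 0;
    for (i = 0; i < n; i++)
        s += a[c * i];
    return s;
}

/* Hand-written sketch of the strength-reduced loop (hypothetical names). */
int f_reduced(int *a, int n, int c)
{
    long idx, end;
    int s;

    assert(c > 0);            /* assumption: the rewritten test needs c > 0 */
    s = 0;
    end = (long)n * c;        /* computed once before the loop, like t5 */
    for (idx = 0; idx < end; idx += c)   /* an addition replaces c * i */
        s += a[idx];
    return s;
}
```

The counter i has disappeared from `f_reduced` in the same way that i0, i1 and i2 disappear from the SSA program after dead code elimination: once the loop test uses the new induction variable, nothing reads i any more.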

7. (10p) Why should a loop transformation matrix be invertible?

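Answer: The inverse is needed when the new loop bounds are computed: the original loop indices are expressed in terms of the new ones and substituted into the original bounds.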
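
A small sketch of this, assuming a loop skewing transformation chosen only for illustration: with T = [[1, 0], [1, 1]] the new indices are u = i and v = i + j, and the inverse T^-1 = [[1, 0], [-1, 1]] gives i = u and j = v - u. Substituting these into the original bounds 0 <= i < NI and 0 <= j < NJ yields the bounds of the new nest. The function names and the sizes NI and NJ are hypothetical.

```c
#include <assert.h>

#define NI 4
#define NJ 3

/* Original nest: 0 <= i < NI, 0 <= j < NJ. Counts visits per iteration. */
void visit_original(int seen[NI][NJ])
{
    for (int i = 0; i < NI; i++)
        for (int j = 0; j < NJ; j++)
            seen[i][j]++;
}

/* Skewed nest. The bounds come from substituting i = u and j = v - u
   into the original bounds: 0 <= u < NI and u <= v < u + NJ. */
void visit_skewed(int seen[NI][NJ])
{
    for (int u = 0; u < NI; u++)
        for (int v = u; v < u + NJ; v++) {
            int i = u;          /* first row of the inverse matrix */
            int j = v - u;      /* second row of the inverse matrix */
            assert(0 <= j && j < NJ);
            seen[i][j]++;
        }
}
```

Both nests visit every original iteration exactly once. Without an inverse there would be no way to recover i and j from u and v, and therefore neither the new bounds nor the rewritten array subscripts could be derived.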
