Verifying Deadlock-Freedom for Advanced Interconnect Architectures

Linköpings universitet

Linköping University | Department of Computer and Information Science

Master’s thesis, 30 ECTS | Computer Architecture

2020 | LIU-IDA/LiTH-EX-A–20/072–SE

Meng Wang

Supervisor: Zeinab Ganjei
Examiner: Ahmed Rezine

Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a period of 25 years starting from the date of publication barring exceptional circumstances.

The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purposes. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/.

Abstract

Modern advanced interconnects, such as those governed by the ARM AMBA AXI protocol, can suffer fatal deadlocks in the connections between masters and slaves if transactions are not properly arranged. Existing research describes the deadlock problems of on-chip bus systems and proposes methods to avoid them. This project aims to verify both the situations that can cause deadlocks and the countermeasures against them. In this thesis, the ARM AMBA AXI protocol and several countermeasures are modelled in NuSMV. Based on these models, we verify that non-trivial cycles of transactions can cause deadlocks and that certain bus techniques mitigate the deadlock problem efficiently. The results from model checking several instances of the protocol and the corresponding countermeasures show that the techniques do indeed avoid deadlocks.

Contents

Abstract
Acknowledgments
Contents
List of Figures
List of Tables
1 Introduction
  1.1 Motivations
  1.2 Aim
  1.3 Research Questions
2 Background
  2.1 AMBA AXI Protocol
  2.2 Model Checking
  2.3 Kripke Structure
  2.4 Computation Tree Logic
3 Literature Review
  3.1 Bus Status Graph
  3.2 Existing Techniques to Mitigate Deadlock
4 Method
  4.1 Formal Modeling of The AMBA AXI Protocol
  4.2 Formal Verification of The AMBA AXI protocol
5 Results
6 Conclusion

List of Figures

2.1 Interface and Interconnect
2.2 Transactions and AXI ID
2.3 Three handshake processes
2.4 Deadlock Model
2.5 Computation Tree
2.6 Four Basic CTL Operators
3.1 Bus Status Graph: a prime edge represents the earliest request by the ID
3.2 BSG example for unsafe states
3.3 BSG example for safe states
4.1 An abstraction of our model for one ID
4.2 An abstraction of slave in our model

List of Tables

Chapter 1

Introduction

1.1 Motivations

Nowadays, system-on-chip (SoC) designs contain more and more intellectual property (IP) cores as the demands on modern electronic systems grow. In a modern SoC design, a single chip can integrate hundreds of IP cores. This massively increases the communication needed for data exchange and synchronization between IP cores, resulting in severe traffic congestion in shared-bus interconnects. The communication architecture therefore plays a significant role in determining system performance, and more advanced interconnect protocols such as the Advanced eXtensible Interface (AXI) [2] have been proposed. These protocols connect several masters and slaves while aiming for maximum bandwidth with minimum latency. Furthermore, they support parallel access by allowing outstanding and out-of-order transactions. Hence, a master can issue multiple requests to slaves without waiting for earlier requests to finish, and a slave is allowed to return the data of outstanding requests out of order.

However, components cannot handle an infinite number of outstanding transactions, and a bus system that supports out-of-order transactions also needs to support transactions that have to be executed in order. Suppose a master is waiting for a slave while this slave is waiting for an acknowledgement from another master. If every master and slave involved is waiting for an acknowledgement in this way, so that the waiting relations form a cycle, a deadlock occurs because nobody can get a reply. Model checking is a method to verify whether all possible infinite behaviors of a finite-state model of a system meet given specifications. In this work, it is therefore used to check whether the AXI protocol can deadlock.

1.2 Aim

In this thesis, we focus on the deadlock problems of the AMBA AXI protocol. First, we identify examples of topologies and scenarios that can lead to deadlocks, as well as current countermeasures to avoid them. Based on such deadlock examples, several techniques have been proposed that counter deadlocks by stalling requests or by changing the rules for tagging transactions [4][6]. After that, the topologies and countermeasures are modeled in the NuSMV [8] model checker. The rest of the thesis presents the verification results for the deadlock topologies with and without countermeasures, and discusses the experiments and results.

1.3 Research Questions

• What are the counter-measures for deadlocks in the AXI protocol?

Chapter 2

Background

In this chapter, the AMBA AXI protocol, NuSMV, Kripke structures and computation tree logic are introduced.

2.1 AMBA AXI Protocol

The AXI protocol defines the interfaces between a master and the interconnect, between a slave and the interconnect, and between a master and a slave. Figure 2.1 shows an instance of an AXI system. It consists of several masters and slaves connected through the interconnect.

Figure 2.1: Interface and Interconnect (Master 1 to Master 3 and Slave 1 to Slave 4 are each attached to the interconnect through an interface)

Transaction Model

AXI ID identifiers are defined in the AXI protocol. The IDs are used to tag transactions when they are created. All transactions with the same AXI ID value must be completed in order. For transactions with different ID values, no ordering requirements exist, so one transaction may be completed without waiting for earlier transactions. There are also no restrictions on the order of transactions from different masters; those transactions can complete in any order. In addition, the AXI protocol requires that a slave respond to requests with the same ID value in order. [2]

As Figure 2.2 shows, transactions T1 and T3 are sent to slave 1 and slave 2 respectively with ID value 1. T2 and T4 are sent to slave 2 and slave 1 respectively with ID value 2. According to the restrictions above, T1 must be completed before T3. So even if T3 is returned before T1, the master will still wait until T1 is returned and complete it first.
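The ordering rule can be illustrated with a small Python sketch. The function name `completable` and the data layout are illustrative assumptions and not part of the AXI specification; the sketch only decides which returned responses a master may complete, given that transactions with the same ID must complete in issue order.

```python
def completable(issued, returned):
    """Return the returned transactions a master may complete right now.

    issued:   list of (name, axi_id) pairs in issue order
    returned: set of transaction names whose responses have arrived
    A transaction may complete only if every earlier transaction with the
    same AXI ID has already arrived and can complete before it.
    """
    blocked_ids = set()   # IDs whose oldest pending response is still missing
    result = []
    for name, axi_id in issued:
        if axi_id in blocked_ids:
            continue                  # an older transaction with this ID is pending
        if name in returned:
            result.append(name)
        else:
            blocked_ids.add(axi_id)   # everything younger with this ID must wait
    return result


# Transactions of Figure 2.2: T1 and T3 use ID 1, T2 and T4 use ID 2.
issued = [("T1", 1), ("T2", 2), ("T3", 1), ("T4", 2)]
print(completable(issued, {"T3"}))        # T3 arrived early but must wait for T1
print(completable(issued, {"T1", "T3"}))  # both ID 1 transactions complete in order
```

With only T3 returned nothing can be completed, which is exactly the waiting behaviour described for Figure 2.2.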

Requests: T1   T2   T3   T4
ID:       1    2    1    2
Targets:  S1   S2   S2   S1

Figure 2.2: Transactions and AXI ID

Handshake Mechanism

The handshake mechanism in the AXI protocol lets both master and slave control the rate at which information is transferred. Either a master or a slave can be the source or the destination. When a source wants to transfer information to a destination, the source drives the VALID signal high to indicate that the address, data, or control information is available. The destination drives the READY signal high to show that it is able to accept the incoming information. The transfer occurs only when both signals, VALID and READY, are high. There are three kinds of handshake processes, as Figure 2.3 shows.

In Figure 2.3 (a), the source provides the address, data or control information after cycle 0 and asserts the VALID signal; the source must hold the information stable until the transfer happens. The READY signal is then asserted by the destination after cycle 1, and the transfer finishes in cycle 2. Additionally, the source is not allowed to wait for the READY signal before asserting the VALID signal. Once the VALID signal is asserted, it has to remain high until the handshake finishes.

In Figure 2.3 (b), the destination asserts the READY signal after cycle 5, indicating that it is ready to accept the information while the address, data or control information is not yet prepared. The VALID signal is then asserted after cycle 6 and the transfer occurs in cycle 7. Moreover, a destination is allowed to wait for the VALID signal before asserting the corresponding READY signal, and if READY is already asserted, it is also allowed to deassert READY before the VALID signal is asserted.

In Figure 2.3 (c), both source and destination assert VALID and READY respectively after cycle 10 to show that they can transfer information. In this case, the transfer happens at the rising edge of the clock of cycle 11, when both signals can be recognized, and it finishes in a single cycle. [2]

Figure 2.3: Three handshake processes: (a) VALID before READY, (b) READY before VALID, (c) VALID with READY (timing diagram of ACLK, INFORMATION, VALID and READY over cycles 0 to 12)
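The handshake rule can be sketched in a few lines of Python. This is a simplified illustration under the assumption that VALID and READY are sampled once per rising clock edge; the function names are hypothetical and the sample traces are not taken from the AXI specification.

```python
def transfer_cycles(valid, ready):
    """Return the sample indices at which a transfer occurs.

    valid, ready: lists of 0/1 samples, one per rising clock edge.
    A transfer happens exactly when VALID and READY are both high.
    """
    return [t for t, (v, r) in enumerate(zip(valid, ready)) if v and r]


def valid_is_stable(valid, ready):
    """Check the source rule: once VALID is high it stays high until the transfer."""
    for t in range(len(valid) - 1):
        if valid[t] and not ready[t] and not valid[t + 1]:
            return False   # VALID was withdrawn before the handshake completed
    return True


# (a) VALID before READY, (b) READY before VALID, (c) VALID with READY
print(transfer_cycles([0, 1, 1, 0], [0, 0, 1, 0]))
print(transfer_cycles([0, 0, 1, 0], [0, 1, 1, 0]))
print(transfer_cycles([0, 1, 0], [0, 1, 0]))
```

In each of the three traces there is a single transfer, at the first sample where both signals are high.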

Deadlocks

Deadlocks may occur in the AXI protocol because of how AXI supports out-of-order transactions. Consider the example in Figure 2.2 again: slaves 1 and 2 are both assigned tasks via IDs 1 and 2, T1 must be serviced before T3, and T2 must be serviced before T4. Now assume slave 1 finishes T4 first and slave 2 finishes T3 first. Because of the rules mentioned above, the master is not allowed to complete T3 and T4 before T1 and T2 have been completed. For ID 2, the master is waiting for the result of T2 from S2, so S1 sends the result of T4 back to the master but gets no acknowledgement. The same happens for ID 1 and S2: S2 sends back the result of T3, but the master is waiting for the result of T1 from S1. As shown in Figure 2.4, every component is waiting and the system cannot move. In other words, a deadlock occurs.

Figure 2.4: Deadlock Model
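The scenario can be replayed with a short Python simulation. It is a sketch under simplifying assumptions: each slave offers its responses in a fixed order and blocks until the offered response is acknowledged, and the master acknowledges a response only if it belongs to the oldest outstanding transaction of its ID. The function `run` and its arguments are illustrative names.

```python
def run(issue_order, slave_return_order):
    """Simulate the blocking rules until no further progress is possible.

    issue_order:        list of (name, axi_id, slave) in issue order
    slave_return_order: dict slave -> list of names in the order it returns them
    Returns the list of completed transactions.
    """
    pending = list(issue_order)
    queues = {s: list(names) for s, names in slave_return_order.items()}
    completed = []
    progress = True
    while progress:
        progress = False
        for slave, queue in queues.items():
            if not queue:
                continue
            head = queue[0]   # response the slave is currently offering
            axi_id = next(i for n, i, _ in pending if n == head)
            oldest = next(n for n, i, _ in pending if i == axi_id)
            if head == oldest:   # master acknowledges only the oldest per ID
                queue.pop(0)
                pending = [p for p in pending if p[0] != head]
                completed.append(head)
                progress = True
    return completed


issue = [("T1", 1, "S1"), ("T2", 2, "S2"), ("T3", 1, "S2"), ("T4", 2, "S1")]
print(run(issue, {"S1": ["T4", "T1"], "S2": ["T3", "T2"]}))  # nothing completes
print(run(issue, {"S1": ["T1", "T4"], "S2": ["T2", "T3"]}))  # all four complete
```

With the return order described above (T4 first on S1, T3 first on S2) no transaction ever completes, while the in-order return completes all four.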

There are some intuitive solutions to this deadlock problem, such as ensuring that masters do not issue requests which may bring the system into an unsafe state, or restricting the order in which slaves return results. Many methods can achieve this, but they may over-constrain the system and thus degrade performance.

2.2 Model Checking

Model checking is a method to check if all possible infinite behaviors of a finite state model of a considered system satisfy some specifications or properties.

NuSMV

NuSMV is a symbolic model checker. It is a redesign, reimplementation and extension of CMU SMV, a BDD-based model checker originally developed at CMU. The main features of NuSMV are the following [8]:

• Functionalities. NuSMV supports both synchronous and asynchronous finite-state systems, and specifications expressed in both computation tree logic (CTL) and linear temporal logic (LTL).

• Architecture. To reduce the work required to modify and extend NuSMV, the different components and functions are separated into modules with interfaces between them.

• Quality of the implementation. NuSMV is written in ANSI C and is POSIX compliant.

2.3 Kripke Structure

In order to verify the correctness of a system, we first need to identify the properties of the system. Then a formal model of the system should be established, which retains the system properties used to verify correctness and abstracts away details that have no effect on correctness but increase the difficulty of verification. [5]

In the AXI bus system, three notions need to be captured: state, transition and computation. A state is a snapshot of the values of the system variables at a particular instant of time. A transition represents the change in the system caused by some action. It is given as a pair of states, one before the change and one after it. A computation is an infinite sequence of states in which each state is obtained from the previous state through a transition.

We use a Kripke structure to capture the behavior of the AXI protocol. A Kripke structure contains a set of states, a set of transitions and a labeling function that maps each state to the set of properties that hold in that state. AP is a set of atomic propositions, i.e. boolean expressions over variables, constants and predicate symbols. Paths in the Kripke structure model computations of the system. These models are expressive enough to capture the aspects of temporal behavior that matter for reasoning about the system.

A Kripke structure $M$ over $AP$ is a four-tuple $M = (S, S_0, R, L)$ with the following elements [5]:

1) $S$ is a finite set of states.

2) $S_0 \subseteq S$ is a set of initial states.

3) $R \subseteq S \times S$ is a transition relation that must be total, which means that for each state $s \in S$ there exists a state $s' \in S$ such that $R(s, s')$. If some state $s$ has no successor, $R(s, s)$ holds.

4) $L : S \to 2^{AP}$ is a labeling function that labels each state with the set of atomic propositions which are true in that state.
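The definition translates directly into a small data structure. The following Python sketch is only an illustration: the class `Kripke`, the helper `make_total` and the three state names are assumptions made for the example and do not come from the thesis model.

```python
from dataclasses import dataclass


@dataclass
class Kripke:
    states: set        # S
    initial: set       # S0, a subset of S
    transitions: set   # R, a set of (s, s') pairs
    labels: dict       # L, maps each state to its set of atomic propositions

    def successors(self, s):
        return {t for (u, t) in self.transitions if u == s}

    def is_valid(self):
        """Check that S0 is a subset of S, R is a subset of S x S, and R is total."""
        return (self.initial <= self.states
                and all(u in self.states and t in self.states
                        for (u, t) in self.transitions)
                and all(self.successors(s) for s in self.states))


def make_total(states, transitions):
    """Add a self-loop R(s, s) to every state that has no successor."""
    has_successor = {u for (u, _) in transitions}
    return set(transitions) | {(s, s) for s in states if s not in has_successor}


# Hypothetical three-state example: "deadlock" has no outgoing transition.
states = {"initial", "waiting", "deadlock"}
trans = {("initial", "waiting"), ("waiting", "initial"), ("waiting", "deadlock")}
labels = {"initial": {"req=0"}, "waiting": set(), "deadlock": set()}

print(Kripke(states, {"initial"}, trans, labels).is_valid())
print(Kripke(states, {"initial"}, make_total(states, trans), labels).is_valid())
```

The first structure is rejected because the state without a successor makes the relation non-total; adding the self-loop required by item 3 repairs it.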

2.4 Computation Tree Logic

Computation Tree Logic (CTL) describes properties of computation trees. A computation tree is a rooted tree with vertices and edges, formed by unwinding the Kripke structure into an infinite tree with an initial state of the structure as the root, as illustrated in Figure 2.5. Each vertex represents a single state and each edge represents a transition from one state to another. The computation tree shows all possible transitions and states.

We use φ to denote a specification. CTL formulas contain path quantifiers and temporal operators.

Path quantifiers:

• A φ - All: φ must hold on all paths starting from the current state.

• E φ - Exists: φ holds on at least one path starting from the current state.

Temporal operators:

• X φ - Next: φ holds in the next state of the path.

• F φ - Finally: φ will eventually hold at some state of the path.

• G φ - Globally: φ has to hold at every state on the path.

Figure 2.5: Computation Tree (a Kripke structure over the states S1, S2 and S3, and its unwinding into a computation tree)

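• φ U ψ - Until: ψ holds at some state of the path, and φ holds at every state before it.

• φ R ψ - Release: ψ has to hold along the path up to and including the first state where φ holds; φ is not required to hold eventually, in which case ψ holds forever.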

CTL is a restricted subset of CTL*, in which operators must always occur in pairs: a path quantifier immediately followed by a temporal operator. The model of time in CTL is tree-like: the future is not determined, there are many possible paths into the future, and any one of them might be the one that is actually realized. There are ten basic CTL operators: AX and EX, AF and EF, AG and EG, AU and EU, AR and ER. Each of these ten operators can be expressed using only the three operators EX, EG and EU:

• $AX\,\varphi \equiv \neg EX(\neg\varphi)$
• $EF\,\varphi \equiv E[\mathit{True}\ U\ \varphi]$
• $AG\,\varphi \equiv \neg EF(\neg\varphi)$
• $AF\,\varphi \equiv \neg EG(\neg\varphi)$
• $A[\varphi\ U\ \psi] \equiv \neg E[\neg\psi\ U\ (\neg\varphi \wedge \neg\psi)] \wedge \neg EG\,\neg\psi$
• $A[\varphi\ R\ \psi] \equiv \neg E[\neg\varphi\ U\ \neg\psi]$
• $E[\varphi\ R\ \psi] \equiv \neg A[\neg\varphi\ U\ \neg\psi]$

The four operators EF, AF, EG and AG are the most widely used, as shown in Figure 2.6. The notation M, s ⊨ φ means that φ holds at state s in the Kripke structure M. [5]

Figure 2.6: Four Basic CTL Operators (panels, left to right: M, s0 ⊨ EF φ; M, s0 ⊨ AF φ; M, s0 ⊨ EG φ; M, s0 ⊨ AG φ)
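The reduction to EX, EU and EG can be made concrete with an explicit-state sketch in Python. This is not how NuSMV works internally (NuSMV is symbolic and BDD-based); it is a minimal illustration on a hypothetical three-state structure, where `succ` maps each state to its set of successors and every formula is represented by the set of states that satisfy it.

```python
def ex(succ, target):
    """EX: states with at least one successor in target."""
    return {s for s in succ if succ[s] & target}


def eu(succ, a, b):
    """E[a U b]: least fixpoint, grow from b through a-states."""
    result = set(b)
    while True:
        new = result | (a & ex(succ, result))
        if new == result:
            return result
        result = new


def eg(succ, a):
    """EG a: greatest fixpoint, shrink a until every state keeps a successor inside."""
    result = set(a)
    while True:
        new = result & ex(succ, result)
        if new == result:
            return result
        result = new


def ef(succ, a):
    return eu(succ, set(succ), a)                # EF a = E[True U a]


def ag(succ, a):
    return set(succ) - ef(succ, set(succ) - a)   # AG a = not EF not a


def af(succ, a):
    return set(succ) - eg(succ, set(succ) - a)   # AF a = not EG not a


# Hypothetical structure: s2 is a sink with only a self-loop.
succ = {"s0": {"s1", "s2"}, "s1": {"s0"}, "s2": {"s2"}}
phi = {"s0"}
print(ef(succ, phi))            # states from which s0 is reachable
print(ag(succ, ef(succ, phi)))  # states satisfying AG EF phi
```

Because the sink s2 can be reached from every state and cannot return to s0, no state satisfies AG EF φ in this example.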

Chapter 3

Literature Review

Some techniques to mitigate the deadlock problem already exist. Three cyclic dependency schemes were proposed in [1], and deadlock avoidance by least stalling (DALS) and a novel ID assignment mechanism were proposed by Chin-Yao Chang. They are introduced in this chapter.

3.1 Bus Status Graph

Figure 3.1: Bus Status Graph: a prime edge represents the earliest request by the ID (the prime edge runs from IDn to Sj; non-prime edges run from slaves such as Si to IDn)

In [4], the BSG (Bus Status Graph) model was proposed. A BSG consists of slave vertices, ID vertices, prime edges and non-prime edges. Each vertex stands for a slave or an ID value. Each edge in a BSG represents an uncompleted transaction request which has been accepted by the slave. An edge from an ID to a slave is called a prime edge, and an edge from a slave to an ID is called a non-prime edge, as shown in Figure 3.1. The prime edge from IDn to Sj shows that the corresponding transaction is being processed by Sj and is the earliest one issued with IDn, which means the master has to complete this request before all other requests with IDn. A non-prime edge from vertex Si to vertex IDn indicates that the request has been accepted by Si but is not the highest-priority request of IDn, which means its result cannot be returned until the higher-priority requests have been completed. Under these definitions, at most one prime edge for a specific ID can exist in a BSG. When the request corresponding to a prime edge is completed, this prime edge disappears and one of the non-prime edges of that ID becomes the new prime edge. The choice is not random: it follows the issue order kept in a queue in the master, which is not shown in the BSG.

Now we take Figure 2.2 as the example again. In this model, there are two IDs, ID1 and ID2, and two slaves, S1 and S2. When the master issues request T1, a prime edge is created from ID1 to S1, as shown in Figure 3.2 (a). Then the master sends T2 to S2 with ID2, and a prime edge appears from ID2 to S2 once S2 has accepted the request, as shown in Figure 3.2 (b). When the master sends T3 to S2 with ID1, a non-prime edge is created from S2 to ID1, as shown in Figure 3.2 (c), because T1 is not completed yet; T3 can be returned only after T1 has been completed. Similarly, a non-prime edge from S1 to ID2 appears when the master sends T4 and S1 accepts it. Now there is a nontrivial cycle in Figure 3.2 (d), and the system is in an unsafe state, as described in Section 2.1.

Figure 3.2: BSG example for unsafe states. (a) T1: ID1 → S1; (b) adds T2: ID2 → S2; (c) adds T3: S2 → ID1; (d) adds T4: S1 → ID2

In Figure 3.3 (a) and (b), T1 and T4 are requested while no outstanding requests tagged with ID1 and ID2 exist, so T1 and T4 are prime edges. Then we assume T2 and T3 are requested and accepted while T1 and T4 are not yet completed. Two non-prime edges appear, as Figure 3.3 (c) and (d) show. In this situation there are no cycles in the BSG, and no matter in which order the slaves return those requests, the system will not deadlock. Hence, the system is in a safe state.

Figure 3.3: BSG example for safe states. (a) prime edge T1; (b) adds prime edge T4; (c) adds non-prime edge T3; (d) adds non-prime edge T2
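The two examples can be checked mechanically. The Python sketch below builds a BSG from a list of accepted requests and searches it for a cycle. It assumes that a trivial cycle is the two-vertex cycle between one ID and one slave (an ID with several requests to the same slave), which is skipped; the function names and the string labels for vertices are illustrative.

```python
def bsg_edges(requests):
    """Build BSG edges from accepted, uncompleted requests in issue order.

    requests: list of (axi_id, slave) pairs, e.g. ("ID1", "S1").
    Returns (prime, non_prime): prime maps each ID to the slave of its
    earliest request; non_prime is a list of (slave, axi_id) edges.
    """
    prime, non_prime = {}, []
    for axi_id, slave in requests:
        if axi_id in prime:
            non_prime.append((slave, axi_id))   # later request: slave -> ID
        else:
            prime[axi_id] = slave               # earliest request: ID -> slave
    return prime, non_prime


def has_nontrivial_cycle(requests):
    """Return True if the BSG of `requests` contains a nontrivial cycle."""
    prime, non_prime = bsg_edges(requests)
    graph = {}
    for axi_id, slave in prime.items():
        graph.setdefault(axi_id, set()).add(slave)
    for slave, axi_id in non_prime:
        if prime[axi_id] != slave:              # skip trivial ID <-> slave cycles
            graph.setdefault(slave, set()).add(axi_id)

    visiting, done = set(), set()

    def visit(v):                               # depth-first search for a back edge
        visiting.add(v)
        for w in graph.get(v, ()):
            if w in visiting or (w not in done and visit(w)):
                return True
        visiting.discard(v)
        done.add(v)
        return False

    return any(v not in done and visit(v) for v in list(graph))


unsafe = [("ID1", "S1"), ("ID2", "S2"), ("ID1", "S2"), ("ID2", "S1")]  # Figure 3.2
safe = [("ID1", "S1"), ("ID2", "S1"), ("ID1", "S2"), ("ID2", "S2")]    # Figure 3.3
print(has_nontrivial_cycle(unsafe), has_nontrivial_cycle(safe))
```

The request order of Figure 3.2 yields the cycle ID1, S1, ID2, S2, ID1, while the order of Figure 3.3 leaves S1 without outgoing edges and therefore without any cycle.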

3.2 Existing Techniques to Mitigate Deadlock

In [1], three cyclic dependency schemes for slave interface are proposed to avoid deadlock by allowing slaves to accept or stall new transactions. These are the single slave scheme, the unique ID scheme and the hybrid scheme respectively.

1. The single slave scheme has the following two rules:

• A master can start a transaction to any slave, if this master has no uncompleted transactions.

• If this master does have uncompleted transactions, it can only send requests to the same slave that its other uncompleted transactions were sent to.

2. The unique ID scheme has the following rules:

• A master can start a transaction to any slave with any ID value if it has no uncompleted transactions.

• If an ID has been tagged to an uncompleted transaction, this ID cannot be tagged to other transactions until that transaction is finished.

3. The hybrid scheme has the following rules:

• If a master does have uncompleted transactions, it can start a new transaction with any ID only to the slave that is involved in the current uncompleted transactions, or it can start a transaction to any slave with one of the unused IDs.
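The single slave scheme and the unique ID scheme can be read as admission checks on a new transaction. The Python sketch below is one possible reading of the rules listed above, with hypothetical function names; the hybrid scheme is left out because only part of its rule set is given here.

```python
def single_slave_allows(outstanding, slave):
    """Single slave scheme: all uncompleted transactions must target `slave`.

    outstanding: list of (axi_id, slave) pairs of one master's uncompleted
    transactions. With no uncompleted transactions any slave is allowed.
    """
    return all(s == slave for (_, s) in outstanding)


def unique_id_allows(outstanding, axi_id):
    """Unique ID scheme: `axi_id` must not tag any uncompleted transaction."""
    return all(i != axi_id for (i, _) in outstanding)


outstanding = [(0, "S1")]
print(single_slave_allows(outstanding, "S1"), single_slave_allows(outstanding, "S2"))
print(unique_id_allows(outstanding, 0), unique_id_allows(outstanding, 1))
```

Both checks accept every request when the master has nothing outstanding, which matches the first rule of each scheme.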

In general, a natural way to mitigate deadlock in this context is to drop the critical transactions. The countermeasure called randomly drop is developed from this idea; its rule is:

• Drop the prime request randomly.

In [4], a deadlock avoidance approach, deadlock avoidance by least stalling (DALS), was proposed. The only rule of this approach is:

• If a nontrivial cycle in BSG will be formed by starting a request, this request will be stalled.
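The DALS rule amounts to a reachability test on the BSG before a request is started. The following Python sketch illustrates this under two assumptions that are not taken from [4]: the BSG is stored as a dictionary of prime edges plus a set of non-prime edges, and a trivial cycle is the two-vertex cycle between one ID and one slave. The names `creates_cycle` and `dals_accept` are hypothetical.

```python
def creates_cycle(edges, new_edge):
    """True if adding new_edge = (src, dst) closes a directed cycle,
    i.e. if src is already reachable from dst."""
    src, dst = new_edge
    stack, seen = [dst], set()
    while stack:
        v = stack.pop()
        if v == src:
            return True
        if v in seen:
            continue
        seen.add(v)
        stack.extend(w for (u, w) in edges if u == v)
    return False


def dals_accept(prime, non_prime, axi_id, slave):
    """Decide whether the request (axi_id, slave) may start under DALS.

    prime:     dict axi_id -> slave of the prime edge
    non_prime: set of (slave, axi_id) non-prime edges
    Returns False if the request must be stalled.
    """
    if axi_id not in prime or prime[axi_id] == slave:
        return True   # new prime edge, or at most a trivial cycle
    edges = list(prime.items()) + [(s, i) for (s, i) in non_prime
                                   if prime.get(i) != s]
    return not creates_cycle(edges, (slave, axi_id))


# State of Figure 3.2 (c): T1, T2 and T3 accepted, T4 = (ID2, S1) requested.
print(dals_accept({"ID1": "S1", "ID2": "S2"}, {("S2", "ID1")}, "ID2", "S1"))
# State of Figure 3.3 (c): T1, T4 and T3 accepted, T2 = (ID2, S2) requested.
print(dals_accept({"ID1": "S1", "ID2": "S1"}, {("S2", "ID1")}, "ID2", "S2"))
```

In the first state the request T4 would close the cycle of Figure 3.2 (d) and is stalled, while in the second state T2 is accepted.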

A novel ID assignment mechanism was proposed in [6]. It ensures that the issued transactions will not bring the system into an unsafe state, and it considerably reduces the number of stalled transactions. This design is implemented in each master. Two rules are proposed in this design:

• Exclusive Rule: Slaves are mutually exclusive to each other. The ID assigned to a new transaction to one slave is not allowed to be the same as an ID already tagged to outstanding transactions to other slaves.

• Priority Rule: if Slave Si has higher priority than Sj, the new transaction to Si can be

Chapter 4

Method

This chapter describes how we carried out the work, from building the FSMs to implementing the protocol and the countermeasures in NuSMV.

4.1 Formal Modeling of The AMBA AXI Protocol

We used the NuSMV symbolic model checker [8] to formally verify an instance of the AMBA AXI protocol. The aim of this exercise is to propose a model that can capture deadlocks and to check whether different countermeasures can address them.

The SMV language allows us to describe every module in the protocol as a finite state machine (FSM). In particular, the initial state of every module and the transitions between states can be specified by the user. SMV builds a global state transition graph of the whole model from the descriptions of all the modules. The transition relations and sets of states are treated as boolean functions, which are represented efficiently through a compact data structure called Binary Decision Diagrams (BDDs). [3][9]

This section explains our model of the AMBA AXI protocol. Because the notion of time used in the protocol is discrete (bus cycles), the protocol can be considered a discrete event system. Several models of computation can represent discrete event systems; well-known examples include finite state machines, Petri nets, and data-flow networks. The finite state machine (FSM) is a suitable model of computation for describing the AMBA AXI protocol as a discrete event system. We picked finite state machines primarily because they are supported by model checkers. [7]

Figure 4.1: An abstraction of our model for one ID (flowchart: states Initial and Waiting; actions Send Request, Get Response, Send ack to Slave and Complete Request; decisions "From target slave?" and "Any uncompleted requests?")

First of all, we created finite state machines for both master and slave. For the master, we focus on a single ID; the other IDs follow the same logic. Figure 4.1 shows an abstraction of our model for one ID. In the beginning, the master is in the initial state, which means there are no uncompleted requests with this ID. When it sends a request to a slave, it goes to the waiting state. In the waiting state, the master waits for a response to the request with the highest priority, which is the one issued earliest. The master can continue sending requests while in the waiting state. If the master gets the response from the slave that holds the prime request, it sends back an acknowledgement and completes this request; otherwise, it keeps waiting. After that, if there are still uncompleted requests, the master goes back to the waiting state and waits for the new highest-priority request. If not, it returns to the initial state.
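The state machine for one ID can be prototyped in Python before it is written in SMV. The class below is an illustrative sketch and not the NuSMV model: the class name `TagFsm` and its methods are assumptions, and the queue of target slaves plays the role of the request order kept by the master.

```python
from collections import deque


class TagFsm:
    """One AXI ID of a master: 'initial' when nothing is outstanding, else 'waiting'."""

    def __init__(self):
        self.queue = deque()   # target slaves in issue order; queue[0] is the prime request

    @property
    def state(self):
        return "waiting" if self.queue else "initial"

    def send_request(self, slave):
        """Requests can be sent in both the initial and the waiting state."""
        self.queue.append(slave)

    def on_response(self, slave):
        """Handle a response from `slave`; return True if an acknowledgement is sent."""
        if self.queue and self.queue[0] == slave:
            self.queue.popleft()   # acknowledge and complete the prime request
            return True
        return False               # not from the target slave: keep waiting


fsm = TagFsm()
fsm.send_request("S1")
fsm.send_request("S2")
print(fsm.on_response("S2"), fsm.state)   # response for a non-prime request is not acknowledged
print(fsm.on_response("S1"), fsm.state)   # prime request completes, one request remains
print(fsm.on_response("S2"), fsm.state)   # last request completes
```

The machine only returns to the initial state after the responses have been acknowledged in issue order.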

Figure 4.2: An abstraction of slave in our model (flowchart: states Initial and Processing; actions Get Request, Return Results and Complete Request; decisions "Get ack from Master?" and "Any uncompleted requests?")

The abstraction of a slave is shown in Figure 4.2. It starts in the initial state; when a slave gets a request from a master, it moves to the processing state. In the processing state, it can still accept requests from masters. As soon as one of these requests has been processed, the slave starts to return the result to the corresponding master. When it gets the acknowledgement from the master, it continues to process the other requests if any remain; otherwise, the slave goes back to the initial state. If the slave gets no acknowledgement from the master, it keeps waiting and blocks the other uncompleted requests until the acknowledgement arrives.

Listing 4.1 shows how we instantiate the master of the AMBA AXI protocol in NuSMV. Module Tag describes the behavior of a master on one ID. It has two inputs, ack and n: ack is the result returned by the slaves, and n is the ID value. The variable req is an array of requested slaves; its values indicate which slaves are requested. req[0] is the highest-priority request; when it has been responded to, the next request becomes the highest-priority one, as shown in the case branches of the TRANS section. The input ack is an array that captures whether each slave has responded and, if so, to which ID. So ack[req[0]] is the ID to which the slave holding the highest-priority request is responding; if it equals n, this request is completed. To capture the delays that exist in reality, the variable delay is non-deterministic.

MODULE Tag(ack, n)
VAR
  -- req[0] is the prime (oldest) request, req[1] the next one.
  -- Declaration restored from the identical one in Listing 4.4.
  req : array 0..1 of {1, 2, 0};
  delay : {0, 1};  -- non-deterministic delay
TRANS
  case
    -- no prime request: the next request moves up
    req[0] = 0 & delay = 0 : next(req[0]) = req[1];
    -- the target slave answers this ID: complete the prime request
    req[0] != 0 & ack[req[0]] = n & delay = 0 :
      next(req[0]) = req[1];
    -- otherwise keep waiting
    TRUE : next(req) = req;
  esac;

Listing 4.1: NuSMV Model for an AMBA AXI Master

Listing 4.2 shows how we modelled a generic AMBA AXI slave in NuSMV. It has two inputs: req_array, which collects all requests sent by the master with the different ID values, and n, the slave number. The array buf and the variable index are used to capture out-of-order processing in slaves. If a slave gets a request with an ID, it processes it by setting the position of buf that corresponds to this ID to the ID value; otherwise, the value in that position is null. The variable index has a non-deterministic initial value; it stays on a position while that position holds a processed request and otherwise moves to the other position. So buf[index] yields one of the processed requests, which is assigned to ack and returned to the master.

MODULE Slave(req_array, n)
VAR
  ack : {0, 1, null};
  buf : array 0..1 of {0, 1, null};
  index : 0..1;
  delay : {0, 1};
ASSIGN
  init(buf[0]) := null;
  init(buf[1]) := null;
ASSIGN
  -- the response offered to the master is the buffered ID selected by index
  ack := buf[index];
TRANS
  case
    -- a request with ID 0 targets this slave
    req_array[0][0] = n | req_array[0][1] = n & delay = 0 :
      next(buf[0]) = 0;
    TRUE : next(buf[0]) = null;
  esac;
TRANS
  case
    -- a request with ID 1 targets this slave
    req_array[1][0] = n | req_array[1][1] = n & delay = 0 :
      next(buf[1]) = 1;
    TRUE : next(buf[1]) = null;
  esac;
TRANS
  case
    -- stay on a position that holds a processed request, otherwise move on
    buf[index] != null : next(index) = index;
    TRUE : next(index) = (index + 1) mod 2;
  esac;

Listing 4.2: NuSMV Model for an AMBA AXI Slave

The interconnect between masters and slaves is modeled in Listing 4.3. The requests with different IDs are assigned to the req_array sent to the slaves. The acknowledgements from different slaves are collected in the ack_array sent to the masters.

MODULE main
VAR
  Tag0 : process Tag(ack_array, 0);
  Tag1 : process Tag(ack_array, 1);
  slave1 : process Slave(req_array, 1);
  slave2 : process Slave(req_array, 2);

VAR
  ack_array : array 1..2 of {0, 1, null};
  req_array : array 0..1 of array 0..1 of {1, 2, 0};

ASSIGN
  ack_array[1] := slave1.ack;
  ack_array[2] := slave2.ack;
  -- req is the request array declared in module Tag
  req_array[0][0] := Tag0.req[0];
  req_array[0][1] := Tag0.req[1];
  req_array[1][0] := Tag1.req[0];
  req_array[1][1] := Tag1.req[1];

Listing 4.3: NuSMV Model for an AMBA AXI Bus

4.2 Formal Verification of The AMBA AXI protocol

Computation tree logic (CTL) [5] allows us to specify the properties of the instance. In concurrent systems, the term deadlock describes a situation where some components freeze because they block each other. In our model, a deadlock happens when there is a nontrivial cycle in the BSG, as shown in Figure 3.2. If there are no transactions with an ID, this ID cannot be blocked, so a deadlock can only happen while there are transactions in the system. Therefore, if the master can always finish all the transactions with an ID, this ID will not deadlock. In other words, the system will not deadlock if every ID can always reach the state with no transactions. We can check that the undesirable situation does not occur by requiring that the master can always go back to the initial state. So the CTL property is simply:

AG EF (req of each Tag = 0)

This formula means that from any reachable state, it is possible to get to the initial state. If the model fails this check, then deadlocks exist in the model.
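The meaning of the AG EF property can be illustrated on an explicit state graph. The Python sketch below is not the NuSMV algorithm: it combines forward reachability from the initial states with backward reachability from the target states, and reports the reachable states from which the target can no longer be reached. The function names and the three-state graph are hypothetical.

```python
def reachable(succ, sources):
    """All states reachable from `sources` via succ (dict state -> set of states)."""
    seen, stack = set(sources), list(sources)
    while stack:
        s = stack.pop()
        for t in succ.get(s, ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def ag_ef(succ, init, target):
    """Check AG EF target from the initial states.

    Returns (holds, stuck), where stuck is the set of reachable states
    from which no target state can be reached any more.
    """
    pred = {}
    for s, ts in succ.items():
        for t in ts:
            pred.setdefault(t, set()).add(s)
    can_reach_target = reachable(pred, target)   # backward reachability
    stuck = reachable(succ, init) - can_reach_target
    return (not stuck, stuck)


# Hypothetical abstraction: "cycle" stands for a state with a BSG cycle.
with_deadlock = {"idle": {"busy"}, "busy": {"idle", "cycle"}, "cycle": {"cycle"}}
without_deadlock = {"idle": {"busy"}, "busy": {"idle"}}
print(ag_ef(with_deadlock, {"idle"}, {"idle"}))
print(ag_ef(without_deadlock, {"idle"}, {"idle"}))
```

The states returned in `stuck` play the role of the counterexample that a model checker reports when the property fails.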

In this experiment, we use the simplest model, with two IDs and two slaves. The result from NuSMV is that the specification AG EF Tagx.req = 0 is false. The counterexample that refutes the specification is: req_array[0][0] = 1, req_array[0][1] = 2, req_array[1][0] = 2, req_array[1][1] = 1. As shown in Figure 4.3, the prime request with ID 0 goes to slave 1 and the prime request with ID 1 goes to slave 2. Meanwhile, each of these two IDs has a non-prime edge from the other slave. There is thus a cycle in the model, and a deadlock happens in this situation.

Counter measures

After that, we implemented four different countermeasures: randomly drop, DALS, single slave and unique ID. They are introduced in Section 3.2.

Listing 4.4 shows how randomly drop is implemented. The input Dindex is a non-deterministic value given by the bus. It decides which prime transaction will be dropped: when Dindex equals the ID value n, the prime request is replaced by the next request.

Figure 4.3: Counterexample from NuSMV (req_array[0][0]: ID0 → S1; req_array[0][1]: S2 → ID0; req_array[1][0]: ID1 → S2; req_array[1][1]: S1 → ID1)

    MODULE Tag(ack, n, Dindex)
    VAR
      req : array 0..1 of {1, 2, 0};  -- req[0]: prime request, req[1]: next request
      delay : {0, 1};
    TRANS
      case
        -- the bus selected this ID: drop the prime request
        Dindex = n : next(req[0]) = req[1];
        req[0] = 0 & delay = 0 : next(req[0]) = req[1];
        -- the target slave acknowledged this ID
        req[0] != 0 & ack[req[0]] = n & delay = 0 :
          next(req[0]) = req[1];
        TRUE : next(req) = req;
      esac;

Listing 4.4: Master model with randomly drop
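The order of the case branches in Listing 4.4 can be traced with a small Python sketch. It is a hand translation made for illustration only, not part of the verified model: the function name `next_prime` and the representation of `ack` as a dictionary from slave number to acknowledged ID are assumptions.

```python
def next_prime(req, ack, n, dindex, delay):
    """Return the next value of req[0] for the tag with ID `n`.

    req    -- [prime, pending] target slaves, where 0 means "no request"
    ack    -- dict mapping a slave number to the ID it currently acknowledges
    dindex -- ID chosen by the bus for dropping
    delay  -- 0 when the master is ready to move on
    """
    prime, pending = req
    if dindex == n:                      # the bus drops this ID's prime request
        return pending
    if prime == 0 and delay == 0:        # nothing outstanding: issue the pending one
        return pending
    if prime != 0 and ack.get(prime) == n and delay == 0:
        return pending                   # the prime request was acknowledged
    return prime                         # otherwise keep waiting


# Counterexample situation: ID 0 waits for slave 1, which acknowledges ID 1,
# while slave 2 acknowledges ID 0.
ack = {1: 1, 2: 0}
print(next_prime([1, 2], ack, n=0, dindex=1, delay=0))  # 1: ID 0 stays blocked
print(next_prime([1, 2], ack, n=0, dindex=0, delay=0))  # 2: prime request dropped
```

Once the bus picks ID 0, its prime request becomes the one addressed to slave 2, which is already acknowledging ID 0, so the waiting cycle is broken.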

The master module with DALS is shown in Listing 4.5. The input Dindex identifies the ID whose request may be dropped, and the input Dstat indicates whether that request is actually dropped. A new variable waiting_for records the ID that this ID is waiting for. For example, if the master sends requests with ID 0 and ID 1 to slave 1 and slave 1 chooses to reply to ID 1 first, then ID 0 is waiting for ID 1, so waiting_for in ID 0 has the value 1. If there is no prime request with this ID, or the prime request has been answered, waiting_for is set to null, meaning that this ID is not waiting for any other ID. If there is a prime request with this ID and the target slave replies to another ID, waiting_for is set to the ID that received the reply.

    MODULE Tag(ack, n, Dindex, Dstat)
    VAR
      req : array 0..1 of {1, 2, 0};
      waiting_for : {0, 1, null};
    ASSIGN
      init(waiting_for) := null;
    TRANS
      case
        Dindex = n & Dstat = TRUE : next(req[0]) = req[1];
        req[0] = 0 : next(req[0]) = req[1] &
          next(waiting_for) = null;
        req[0] != 0 & ack[req[0]] = n : next(req[0]) = req[1] &
          next(waiting_for) = null;
        req[0] != 0 & ack[req[0]] != n : next(req) = req &
          ...
      esac;

Listing 4.5: Master model with DALS

In addition, a mechanism has been added to the main module, which corresponds to the bus in the interconnect, as shown in Listing 4.6. The waiting_for values of the different IDs form the waiting_list, which represents the waiting relation between IDs. Dindex is a non-deterministic value that is either 0 or 1. We then check whether the waiting relation between this ID and the other IDs contains a cycle. The example has only two IDs, so there is a cycle exactly when the ID that Dindex is waiting for is itself waiting for Dindex. When a cycle is detected, we keep the value of Dindex and set Dstat to true, which drops the prime request of this ID and breaks the cycle. Otherwise, Dstat is set to false, which means that no request is dropped.

    MODULE main
    VAR
      waiting_list : array 0..1 of {0, 1, null};
      Dindex : {0, 1};  -- ID whose prime request may be dropped
      Dstat : boolean;  -- whether the request is dropped
    ASSIGN
      waiting_list[0] := Tag0.waiting_for;
      waiting_list[1] := Tag1.waiting_for;
    TRANS
      case
        waiting_list[Dindex] != null &
        waiting_list[waiting_list[Dindex]] = Dindex :
          next(Dstat) = TRUE & next(Dindex) = Dindex;
        TRUE : next(Dstat) = FALSE;
      esac;

Listing 4.6: Main model with DALS
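With two IDs the cycle test is a single comparison. For more IDs the same idea requires following the waiting_for links. The Python sketch below shows one way this could be done. It is an illustration under our own assumptions (the function name `find_waiting_cycle` and the use of `None` for null are hypothetical) and is not the encoding used in the NuSMV models.

```python
def find_waiting_cycle(waiting_for, start):
    """Follow waiting_for links from `start`.

    waiting_for[i] is the ID that ID i waits for, or None.
    Returns the cycle through `start` as a list of IDs, or None if
    `start` is not on a cycle.
    """
    path = [start]
    current = waiting_for[start]
    # Each ID has at most one outgoing link, so a walk of at most
    # len(waiting_for) steps ends, returns to start, or loops elsewhere.
    for _ in range(len(waiting_for)):
        if current is None:
            return None
        if current == start:
            return path
        path.append(current)
        current = waiting_for[current]
    return None


print(find_waiting_cycle([1, 0], 0))        # [0, 1]: the two-ID cycle
print(find_waiting_cycle([1, None], 0))     # None: ID 1 is not waiting
print(find_waiting_cycle([1, 2, 0], 0))     # [0, 1, 2]: a three-ID cycle
print(find_waiting_cycle([1, 2, 1], 0))     # None: ID 0 only leads into a cycle
```

In terms of Listing 4.6, Dstat would be true exactly when `find_waiting_cycle(waiting_list, Dindex)` returns a cycle. The walk grows with the number of IDs, which hints at why cycle recognition adds work to the DALS models.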

The main module with single slave is shown in Listing 4.7.

    MODULE main
    VAR
      Tag0 : process Tag(ack_array, 0);
      Tag1 : process Tag(ack_array, 1);
      slave1 : process Slave(req_array, 1);

    VAR
      ack_array : {0, 1, null};
      req_array : array 0..1 of array 0..1 of {1, 2, 0};

    ASSIGN
      ack_array := slave1.ack;
      req_array[0][0] := Tag0.ID[0];
      req_array[0][1] := Tag0.ID[1];
      req_array[1][0] := Tag1.ID[0];
      req_array[1][1] := Tag1.ID[1];

Listing 4.7: Main model with single slave

Listing 4.8 shows how we implemented unique ID. As mentioned in Section 3.2, each ID can tag at most one outstanding transaction at a time.

    MODULE Tag(ack, n)
    VAR
      ID : {1, 2, 0};
      delay : {0, 1, 2, 3, 4, 5, 6};
    TRANS
      case
        ID != 0 & ack[ID] != n : next(ID) = ID;
        ID != 0 & ack[ID] = n & delay = 0 : next(ID) = 0;
        TRUE : next(ID) = next(ID);
      esac;
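The unique ID rule can also be pictured as bookkeeping on the master side. The following Python sketch is only an illustration of the rule that an ID carries at most one outstanding transaction; the class `UniqueIdMaster` and its methods are hypothetical names and do not appear in the NuSMV model.

```python
class UniqueIdMaster:
    """Master that allows at most one outstanding transaction per ID."""

    def __init__(self, num_ids):
        # target[i] is the slave addressed by ID i, or 0 if ID i is free.
        self.target = [0] * num_ids

    def issue(self, tag, slave):
        """Try to send a transaction to `slave` with ID `tag`."""
        if self.target[tag] != 0:
            return False          # ID still in use: the request must wait
        self.target[tag] = slave
        return True

    def acknowledge(self, tag, slave):
        """Record that `slave` has answered the transaction with ID `tag`."""
        if slave == 0 or self.target[tag] != slave:
            return False          # no matching outstanding transaction
        self.target[tag] = 0      # the ID becomes free again
        return True


master = UniqueIdMaster(num_ids=2)
print(master.issue(0, 1))        # True: ID 0 was free
print(master.issue(0, 2))        # False: ID 0 already has a transaction
print(master.acknowledge(0, 1))  # True: slave 1 answers, ID 0 is released
print(master.issue(0, 2))        # True: ID 0 can be reused
```

Because an ID never has a second request pending at another slave, the non-prime edges that closed the cycle in the counterexample cannot arise.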


Chapter 5

Results

To verify instances of the AXI protocol, we modelled three systems consisting of one master and two, three, or four slaves. The master acts as a traffic generator: it randomly assigns transactions to slaves and tags them with one of two, three, or four IDs. The slaves support out-of-order execution. At most two transactions can be outstanding with the same ID. The countermeasures shown in Section 4.2 are captured in the models and model checked. Every countermeasure instance whose verification finished within the time limit is deadlock-free.

| Number of IDs | Number of slaves | Countermeasure | Verification time | Output |
|---|---|---|---|---|
| 2 | 2 | no countermeasure | 0.1s | False |
| 2 | 2 | randomly drop | 1.6s | True |
| 2 | 2 | DALS | 5m7s | True |
| 2 | 2 | single slave | 0.083s | True |
| 2 | 2 | unique ID | 0.083s | True |
| 3 | 3 | no countermeasure | 6m48s | False |
| 3 | 3 | randomly drop | 3m42s | True |
| 3 | 3 | DALS | more than 20 hrs | No result |
| 3 | 3 | single slave | 0.13s | True |
| 3 | 3 | unique ID | 5.75s | True |
| 4 | 4 | no countermeasure | more than 20 hrs | No result |
| 4 | 4 | randomly drop | more than 20 hrs | No result |
| 4 | 4 | DALS | more than 20 hrs | No result |
| 4 | 4 | single slave | 0.42s | True |
| 4 | 4 | unique ID | 49m45s | True |

Table 5.1: Verification Time for Different Instances

Table 5.1 lists the verification time for different numbers of IDs and slaves. We implemented the four countermeasures described in Section 4.2: randomly drop, DALS, single slave, and unique ID. We started from the simplest case, 2 IDs and 2 slaves, and made the instance progressively larger. With 4 IDs and 4 slaves, verification takes a long time to finish. We set a timeout of 20 hours for each verification run.

As the table shows, the size 2 instances take at most 1.6 seconds to model check, except the one with DALS, which takes around 5 minutes. With 3 IDs, 3 slaves, and no countermeasure, it takes about 7 minutes to obtain the result, which is false as expected: without a countermeasure the instance can deadlock. With randomly drop, the size 3 instance takes about 3.7 minutes to model check, and the property holds, so no deadlock can happen. The size 3 instances with single slave and unique ID finish in 0.13 seconds and 5.75 seconds respectively, and the result is true in both cases, which means that a deadlock is impossible. With DALS, the size 3 instance takes more than 20 hours, which exceeds the timeout we set. The size 4 instance also exceeds the timeout with no countermeasure, with randomly drop, and with DALS. Only the instances with single slave or unique ID finish within the timeout. The results for these two instances are still true, so they mitigate the deadlock problem successfully. Verification with single slave still takes less than 1 second, but with unique ID it takes around 50 minutes.

The results above show that the models with single slave or unique ID take much less verification time. The reason is that these two countermeasures remove many behaviors of the original model, which leaves fewer states to explore. The instances with DALS take much longer than those with the other countermeasures, which is caused by the computation needed to recognize cycles in DALS.
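A rough count of the possible request configurations gives a feeling for how strongly single slave and unique ID restrict the model. The Python sketch below is our own back-of-the-envelope estimate, not a measurement from the experiments. It assumes that each request slot holds either a slave number or 0, counts only the request variables (ignoring delay, ack, and waiting_for), and uses the hypothetical helper `request_valuations`. Symbolic model checking time is not simply proportional to this number.

```python
def request_valuations(num_ids, num_slaves, slots_per_id):
    """Number of ways to fill the request slots of all IDs.

    Each slot holds a slave number 1..num_slaves or 0 (empty).
    """
    return (num_slaves + 1) ** (num_ids * slots_per_id)


for size in (2, 3, 4):
    original = request_valuations(size, size, 2)   # two slots per ID
    single_slave = request_valuations(size, 1, 2)  # only one slave to address
    unique_id = request_valuations(size, size, 1)  # one slot per ID
    print(size, original, single_slave, unique_id)
# 2 81 16 9
# 3 4096 64 64
# 4 390625 256 625
```

Under these assumptions the unrestricted size 4 model has several hundred times more request configurations than either restricted variant.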


Chapter 6

Conclusion

This work studies the deadlock problem in the AMBA AXI protocol. A deadlock in the AMBA AXI protocol arises when some IDs and slaves block each other and cannot finish the handshake because of a non-trivial waiting cycle. Because transactions can complete out of order, the system may enter an unsafe state that can lead to deadlock.

We verified that several countermeasures mitigate the deadlock problem on the instances we considered. First, we built models of some instances of an AMBA AXI system as finite state machines. The abstractions of the master and the slave are shown in Section 4.1. We then captured the instances in NuSMV and checked for deadlock via model checking. After that, we captured several countermeasures on those instances, as shown in Section 4.2. The deadlock and the countermeasures on these instances are expressed abstractly in the NuSMV models. Extending the system to arbitrarily many masters and slaves and verifying deadlock-freedom for it is left for future work.

Exploring other tools is one way to verify more configurations. SPIN, for example, is better suited than NuSMV to asynchronous models. It would also be useful to integrate the technique into a framework in which a designer can choose specific countermeasures for different components or check whether a model has deadlocks.


Bibliography

[1] Technical Reference Manual of PrimeCell AXI Configurable Interconnect (PL300). ARM, 2010.

[2] AMBA® AXI™ and ACE™ Protocol Specification. Arm, 2019.

[3] R. E. Bryant. Graph-based algorithms for boolean function manipulation. IEEE Transactions on Computers, C-35(8):677–691, 1986.

[4] Chin-Yao Chang and Kuen-Jong Lee. On deadlock problem of on-chip buses supporting out-of-order transactions. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 22(3), 2014.

[5] Edmund M. Clarke, Orna Grumberg, and Doron A. Peled. Model Checking. MIT Press, 1999.

[6] Hsuan-Ming Chou, Yi-Chiao Chen, Keng-Hao Yang, Jean Tsao, Shih-Chieh Chang, Wen-Ben Jone, and Tien-Fu Chen. High-performance deadlock-free ID assignment for advanced interconnect protocols. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 24(3), 2016.

[7] Gabor Madl, Sudeep Pasricha, L.A. Bathen, Nikil Dutt, and Qiang Zhu. Formal performance evaluation of AMBA-based system-on-chip designs. pages 311–320, 01 2006.

[8] Roberto Cavada, Alessandro Cimatti, Charles Arthur Jochim, Gavin Keighren, Emanuele Olivetti, Marco Pistore, Marco Roveri, and Andrei Tchaltsev. NuSMV 2.6 User Manual, 2015.

[9] A. Roychoudhury, T. Mitra, and S. R. Karri. Using formal techniques to debug the AMBA system-on-chip bus protocol. In 2003 Design, Automation and Test in Europe Conference
