
DEGREE PROJECT IN ELECTRICAL ENGINEERING, SECOND CYCLE, 30 CREDITS

STOCKHOLM, SWEDEN 2018

Next Generation SDN Switches Using Programming Protocol-Independent Packet Processors

TIJO VARGHESE THAZHONE

KTH ROYAL INSTITUTE OF TECHNOLOGY


Master Thesis Report

Author: Tijo Varghese Thazhone

Examiner: Dr. Zhonghai Lu

Company supervisor: Magnus Svevar (Infinera)

Academic supervisor: Yuan Yao

A thesis submitted in fulfillment of the requirements for the degree of Master of Science in the School of Electrical Engineering and Computer Science

Stockholm, Sweden


Abstract

Over recent years, Software Defined Networking has enabled operators to control the network and realize new networking topologies. With increasing network traffic and protocol formats that aim at managing the traffic efficiently, the capabilities offered by Software Defined Networking alone are currently limited by the underlying fixed hardware infrastructure. The inflexibility involved in redesigning the hardware forces the bottom-up approach defined by switch vendors in describing the network and limits the capabilities offered to operators for further innovation. To meet the demands of ensuring a higher degree of flexibility to design, test and guarantee a faster time to market, the concept of Softly Defined Networks was introduced. The idea, in addition to offering the conventional advantages of Software Defined Networking, is based upon implementing a re-programmable data-plane. Field-Programmable Gate Arrays offered a higher degree of flexibility and capability to handle such designs. Programming Protocol-independent Packet Processors (P4) is a high-level language continuously evolving to define data-planes for various networking devices. The aim of P4 is for network operators to customize the underlying hardware with ease and minimum constraints, independent of the target. Therefore, the three major goals while defining such a language revolved around reconfigurability of hardware after being deployed, protocol independence to permit customization without constraints and target independence for users to be less concerned about the underlying hardware. Recent advances in P4, with the added support in terms of compatible targets and compilers, have made P4 a viable opportunity to realize re-programmable hardware.


Abstrakt

In recent years, Software Defined Networking has made it possible for operators to control the network and implement new network topologies. With increasing network traffic and new protocols that aim to handle the traffic efficiently, the possibilities offered by Software Defined Networking are currently limited by the underlying fixed hardware architecture. The inflexible hardware forces the "bottom-up" approach defined by switch vendors when it comes to describing the network, and limits the possibilities offered to operators to control and innovate in their networks.

To meet the demand for a higher degree of flexibility to design, test and guarantee a faster time to market, the concept of Softly Defined Networks was introduced. The idea, in addition to offering the conventional advantages of Software Defined Networking, is based on implementing a re-programmable data-plane. Field-Programmable Gate Arrays offer a higher degree of flexibility and capability to handle such designs. Programming Protocol-independent Packet Processors (P4) is a high-level language that is continuously evolving to define the data-plane for various network devices. The aim of P4 is for network operators to be able to easily customize the underlying hardware with minimal constraints, independent of the hardware vendor. The three main goals when defining such a language concerned reconfigurability of the hardware after deployment, protocol independence to permit customization without constraints, and vendor independence so that users would be less concerned about the underlying hardware. Recent advances in P4 in terms of support for compatible hardware and compilers have made P4 a viable candidate for realizing re-programmable hardware.


Acknowledgements

“Tell me and I forget, teach me and I may remember, involve me and I learn.”

- Benjamin Franklin

The contents of this report would be incomplete without acknowledging the constant guidance and support I received throughout the thesis work. First and foremost I am grateful to God for helping me with the patience and capability necessary to see the thesis through.

The opportunity to work on this topic in collaboration with Infinera was an absolute pleasure. It would have been impossible to state the findings mentioned in this report without the guidance, support and resources granted by Infinera. The open work culture and supportive colleagues helped me make new friends and enjoy my work. I would like to express my gratitude to my industrial supervisor Magnus Svevar for helping me with all the necessary resources and support to better understand the topic. Hannah Dysenius helped manage the project work and made sure there was progress in a systematic fashion. I am truly thankful to them both for their patience and understanding with regard to all the unmet deadlines. Kenth Erikson, Dr. Jue Shen and Stefanos Kyri, with their years of experience and knowledge in the field, offered valuable guidance during various stages of the project work. It was absolutely an honor to have been a part of Infinera and learn more about the topic.

This thesis work was undertaken to fulfill the requirements for the degree of Master of Science at KTH Royal Institute of Technology, Stockholm, with Dr. Zhonghai Lu as the examiner and Yuan Yao as the academic supervisor. I specially thank Dr. Zhonghai Lu for constantly reviewing the status of my thesis work and helping me refine this report.


Contents

Abstract
Abstrakt
Acknowledgements

1 Introduction
1.1 Background and Motivation
1.2 Problem Statement
1.3 Purpose
1.4 Goals
1.5 Benefits, Ethics and Sustainability
1.6 Research Methodology
1.7 Delimitation
1.8 Outline

2 Theoretical framework
2.1 Software Defined Network to Softly Defined Network
2.2 P4: Programming Protocol-independent Packet Processors
2.2.1 Architecture model
2.2.2 P4 description
2.2.2.1 Meta-data bus
2.2.2.2 Parsing of the packet
2.2.2.3 Match-Action tables
2.2.2.4 Deparser
2.2.3 Benefits of using P4
2.2.4 P4 compiler and tools
2.3 Related work
2.4 Miscellaneous
2.4.1 FPGA platform
2.4.2 Simulation and testing

3 P4-enabled switch: The proposed design
3.1 Building blocks
3.1.1 P4 description
3.1.1.1 The Architecture model
3.1.1.2 P4 data-plane description
3.1.2 UPI master
3.1.3 UPI-AXI4 Lite translator
3.1.4 10/25G Ethernet Subsystem
3.1.5 Tuple Controller

4 Results
4.1 Simulation of building blocks
4.2 System integration and packet flow
4.3 The final design
4.4 Observing the desired packet processing
4.5 Analysis

5 Conclusion and Future work
5.1 Conclusion
5.2 Future work

A Steps to incorporate P4

B Intermediate JSON file


List of Figures

1.1 OpenFlow based Software Defined Networking framework.
2.1 Softly Defined Networks
2.2 The overall network framework.
2.3 Traditional switch vs a P4-defined switch.
2.4 Programming a target using P4.
2.5 An architecture model.
2.6 Dataflow topology.
2.7 Sections within a typical P4 program.
2.8 An abstract parser state machine.
2.9 Parser state machine.
2.10 Action code and data.
2.11 Match-action unit.
2.12 P4-SDNet compilation flow.
2.13 Board level block diagram.
2.14 T-BERD/MTS-5800 hand-held network tester.
3.1 Overview of the design.
3.2 XilinxSwitch layout.
3.3 Ethernet frame.
3.4 Parser graph to extract stacked VLAN tags.
3.5 P4 defined module.
3.6 On-board bus architecture.
3.7 UPI bus operation.
3.8 Bus architecture incorporating UPI-AXI4 Lite translator.
3.9 UPI-AXI4 Lite translator.
3.10 Flow diagram for UPI-AXI4 Lite translator.
3.11 PCS-Only Core Variant.
3.12 Normal 64 Bit Frame Transfer.
3.13 10G Ethernet Subsystem.
3.14 Tuple Controller.
3.15 AXI4-Stream switch.
4.1 UPI-AXI4 Lite write operation.
4.2 UPI-AXI4 Lite read operation.
4.3 Tuple Controller simulation results.
4.4 Self-looped test setup.
4.5 Packet stream without P4 defined module.
4.6 The final block design.
4.7 Test setup to observe the desired P4 defined processing.
4.8 Lane 1 with P4 defined packet processing.
4.11 Flip-Flop variations w.r.t. the number of headers, tables and write operations.
4.12 BRAM variations w.r.t. the number of headers, tables and write operations.
4.13 Average latency w.r.t. the number of headers, tables and write operations.

List of Tables

1.1 The seven layers of the OSI Model
1.2 OpenFlow standards and the defined header fields.
2.1 Table restrictions based upon the match kind.
2.2 Capabilities of 7 Series FPGA.
3.1 Partial representation of a populated table.
4.1 Entries within the populated table.
4.2 Resource utilization for various P4 description.


List of Abbreviations

TCP/IP Transmission Control Protocol/Internet Protocol

SDN Software Defined Networking

OSI Open System Interconnection

ISO International Organization for Standardization

FE Forwarding Element

IDS Intrusion Detection System

VXLAN Virtual eXtensible Local Area Network

POF Protocol-Oblivious Forwarding

ASIC Application-specific integrated circuit

NPU Network Processing Units

FPGA Field-Programmable Gate Array

PSA Portable Switch Architecture

RTL Register-Transfer Level

P4 Programming Protocol-independent Packet Processors

MAT Match-Action Tables

HDL Hardware Description Language

SoC System on Chip

API Application Programming Interface

GPCM General-Purpose Chip-Select Machine bus

UPI UltraPath Interconnect

SFP+ Small Form-factor Pluggable transceivers

MGT Multi-Gigabit Transceivers

IEEE Institute of Electrical and Electronics Engineers

SVLAN Service-Virtual Local Area Network

PCP Priority code point

DEI Drop eligible indicator

VID VLAN identifier

CVLAN Customer-Virtual Local Area Network

AXI Advanced eXtensible Interface

PMD Physical Medium Dependent

PMA Physical Medium Attachment

PCS Physical Coding Sublayer

MDI Media Dependent Interface

MAC Media Access Control

MII Media Independent Interface

PLL Phase-locked loop

LUT LookUp Table

FF Flip-Flop


Chapter 1

Introduction

After the advent of electronic computers in the mid-twentieth century, early concepts of wide area networking were implemented in several computer science laboratories. One of the earliest recorded descriptions of the social interactions that networking could enable was a series of memos written in August 1962 by J.C.R. Licklider of the Massachusetts Institute of Technology (MIT), discussing his “Galactic Network” concept [1]. That concept, a globally interconnected set of computers through which users could transfer information, has become the paramount vision of today’s modern Internet.

On the 24th of July 1961, Leonard Kleinrock at MIT published the first paper on packet switching theory, titled “Information Flow in Large Communication Nets”. He subsequently addressed the feasibility of communications using packets instead of circuits in his first book, “Communication Nets: Stochastic Message Flow and Delay”. One of the most important hurdles at the time was enabling multiple users to interact at the same instant. This created the need for a well-connected network that allowed multiple nodes to communicate simultaneously using the same resources. In 1965, MIT researcher Lawrence G. Roberts worked with Thomas Merrill to create the first ever wide-area network by using a low-speed dial-up telephone line to connect the TX-2 computer in Massachusetts to the Q-32 in California. This experiment showed that time-shared computers could work well together, running programs and retrieving data on a remote machine. At the time, the conventional circuit-switched telephone system was totally inadequate for the application, and Kleinrock’s concept of packet switching was the best practical approach [1].

[...] within the network that are capable of conducting the desired packet processing based upon the engineered traffic rules and further provide a global view or abstraction of the entire network architecture.

According to the widely adopted Open System Interconnection (OSI) reference model, developed through the combined efforts of the International Organization for Standardization (ISO) and the Telecommunication Standardization Sector of the International Telecommunication Union (ITU-TS) in 1983, the network is partitioned into a vertical set of seven layers. The primary goal of the OSI model is to permit nodes to push packets into a physical network and ensure they travel to the destination independently [2]. Each layer is concerned with a specific set of functionalities and enhances the services offered by the layer immediately below it, as described in Table 1.1 [3].

No.  Layer         Functionality
1    Physical      Interface with the physical medium to transmit unstructured bit stream.
2    Data Link     Transmission of frames over single network connections.
3    Network       Reliable communication over one or more subnetworks.
4    Transport     Reliable and transparent transfer of data between end points.
5    Session       Management of sessions between end points.
6    Presentation  Encoding (data presentation) during transfer.
7    Application   Provision of services to end user by processing of information.

TABLE 1.1: The seven layers of the OSI Model derived from [3].

The physical, data link and network layers are responsible for the communication portion between the two systems situated at the transmitting and receiving ends. The physical layer manages the transmission of bits between nodes over a medium. It deals with interfacing the node with the transmission hardware, physical connector characteristics and the voltage levels used to encode binary values. The data link layer ensures the reliable transmission of data between adjacent nodes, adding reliability on top of the raw bit transmission over a single link. If the link between two end nodes is indirect, the transmission has to pass through multiple links, and its reliability is handled by higher layers of the OSI model. The network layer is responsible for routing and forwarding of packets. Routing determines the path a packet must traverse to reach its destination, and forwarding deals with passing packets between subnetworks. This layer also ensures that data units are segmented so as to be acceptable to the data link layer [3]. This research focuses on proposing a highly flexible software defined packet switching technique between optical links.

1.1 Background and Motivation

[...] networking for the successful interaction of user applications communicating over the Internet. Network switches that managed the network were equipped with functionalities such as access control, tunneling, and overlay formats [4]. In addition, the recent inclusion of SDN capabilities on these network switches accelerated the design of newer protocols with run-time configurable traffic engineering rules by separating the control-plane from the data-plane. This segregation enables the control-plane to have an overall view of multiple packet processing data-planes and also makes the implementation of intricate traffic engineering rules using SDN more hierarchical and meaningful.

Figure 1.1 depicts the various layers involved in an OpenFlow based SDN framework. The bottom layer comprises the physical infrastructure, which is essentially the cluster of interconnected Forwarding Elements (FE) that eventually implements the desired data-plane with adequate routing algorithms. The second tier can be described as the network control layer, which is decoupled from the underlying infrastructure and behaves as a middleman by enabling the topmost network application layer to centrally control the network infrastructure and realize much more efficient traffic management. The network application layer, which directs the traffic through the controller, consists of SDN applications that perform functions such as network monitoring, intrusion detection systems (IDS), network virtualization and flow balancing.

FIGURE 1.1: OpenFlow based Software Defined Networking framework.

[...] in terms of many more header fields and multiple stages of rule tables, as shown in the table below [5].

Version  Date      Header Fields
OF 1.0   Dec 2009  12 fields (Ethernet, TCP/IPv4)
OF 1.1   Feb 2011  15 fields (MPLS, inter-table metadata)
OF 1.2   Dec 2011  36 fields (ARP, ICMP, IPv6, etc.)
OF 1.3   Jun 2012  40 fields
OF 1.4   Oct 2013  41 fields

TABLE 1.2: OpenFlow (OF) standards and defined header fields [5].

Such forwarding devices are generally implemented on a packet switching chip with dedicated hardware. A variation in the protocol header fields as frequent as in the example described above requires an expensive redesign of the hardware that could take a few years to implement extensively. The time taken in the past to design and standardize Virtual eXtensible Local Area Network (VXLAN) is evidence of this delay [4,6]. Secondly, the functionality of a switch is currently defined by the device vendor and not by the network operator, who deploys these devices and has a better understanding of the network. Recent trends point to the need to move towards a “top-down” view commanded by the network operators instead of the traditional “bottom-up” view dictated by the switch vendor.

Therefore, to meet the demands of a market that requires an upgrade, it is necessary to research the best techniques that ensure the scalability of network switches in terms of varying matching fields, faster lookup time and increasing rule aggregation. Various packet processing languages under development, each with its pros and cons, deal with instilling the above-mentioned characteristics on various customized hardware. For example, Protocol-Oblivious Forwarding (POF) handles packet headers as tuples of offset and length. This results in a programming model that resembles an assembly language, in which the burden of parsing falls on the programmer [4, 7]. packetC, on the other hand, is a domain specific language that enables access to the packet payload. It offers more flexibility at lower performance when programming Network Processing Units (NPU), Field-Programmable Gate Arrays (FPGA) and software switches [4,8]. PX targets FPGA platforms by converting a high-level declaration to a Register-Transfer Level (RTL) description of the target substrate in a Hardware Description Language (HDL) [4,9]. PX therefore restricts itself to FPGA platforms.
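To make the offset-and-length style of header access concrete, the following is a minimal Python sketch and not POF's actual instruction set. The names `FieldRef`, `extract_field`, `DST_MAC` and `ETHER_TYPE` are hypothetical and introduced only for illustration; the field positions assume an untagged Ethernet frame.

```python
from typing import NamedTuple


class FieldRef(NamedTuple):
    """A protocol-oblivious field reference: a bit offset and a bit length."""
    offset_bits: int
    length_bits: int


def extract_field(packet: bytes, ref: FieldRef) -> int:
    """Return the unsigned integer stored in the referenced bit range."""
    end = ref.offset_bits + ref.length_bits
    if ref.offset_bits < 0 or ref.length_bits <= 0 or end > len(packet) * 8:
        raise ValueError("field lies outside the packet")
    as_int = int.from_bytes(packet, "big")
    shift = len(packet) * 8 - end  # bits to the right of the field
    return (as_int >> shift) & ((1 << ref.length_bits) - 1)


# Hypothetical names: only the programmer knows that these raw positions
# correspond to the destination MAC address and the EtherType.
DST_MAC = FieldRef(offset_bits=0, length_bits=48)
ETHER_TYPE = FieldRef(offset_bits=96, length_bits=16)

# Destination MAC, source MAC, EtherType 0x0800, then an arbitrary payload.
frame = bytes.fromhex("ffffffffffff" "0a0b0c0d0e0f" "0800") + b"payload"
```

Here `extract_field(frame, ETHER_TYPE)` reads sixteen bits starting at bit 96. Nothing in the code knows what an Ethernet header is, which is why the parsing burden rests with the programmer in such a model.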

1.2 Problem Statement

By segregating the control-plane and the data-plane, modern switches are capable of easily reengineering the traffic rules to a certain extent. SDN gives operators programmable control over the network switches, in contrast to the traditional way of deploying vendor-manufactured black-box switches with fixed functions and less optimized routing techniques. With the constant evolution in protocol formats, it has become necessary to ensure that the data-plane is not only highly flexible but also easily reconfigurable, in addition to permitting the control-plane to decide on the rules.

As discussed previously, P4 is a high-level approach that is optimized for efficiently describing packet forwarding by allowing the designer to customize the parser, match-action tables and the deparser. Therefore, the problem statement for this research project primarily focuses upon three questions. Firstly, how can the next generation of high-speed SDN switches be designed, by exploring the capabilities of P4, so that they are easily reconfigurable with minimum constraints and complement the conventional networking infrastructure? Secondly, to what extent is the "top-down" approach truly achievable in defining the network? Lastly, what is the cost of such a design in terms of parameters such as resource utilization and latency in the case of sophisticated data-plane descriptions? A hardware framework that interfaces with the external physical optical links and communicates with the analyzed P4 defined module shall be the by-product of this research.
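The three customizable stages named above can be pictured with a small software model. The sketch below is written in Python rather than P4 and is not the design proposed in this thesis. It assumes a single Ethernet header, one exact-match table keyed on the destination MAC address, and a port value of -1 as a hypothetical drop marker; all class and function names are illustrative.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Ethernet:
    dst: bytes        # 6-byte destination MAC
    src: bytes        # 6-byte source MAC
    ether_type: int   # 16-bit EtherType


# An action maps a parsed header to (possibly modified header, egress port).
Action = Callable[[Ethernet], Tuple[Ethernet, int]]


def parse(packet: bytes) -> Tuple[Ethernet, bytes]:
    """Parser stage: extract the Ethernet header, keep the rest as payload."""
    if len(packet) < 14:
        raise ValueError("packet shorter than an Ethernet header")
    hdr = Ethernet(packet[0:6], packet[6:12], int.from_bytes(packet[12:14], "big"))
    return hdr, packet[14:]


def deparse(hdr: Ethernet, payload: bytes) -> bytes:
    """Deparser stage: serialize the header back in front of the payload."""
    return hdr.dst + hdr.src + hdr.ether_type.to_bytes(2, "big") + payload


def drop(hdr: Ethernet) -> Tuple[Ethernet, int]:
    """Default action on a table miss (assumed here to be a drop)."""
    return hdr, -1


def set_src_and_forward(new_src: bytes, port: int) -> Action:
    """Build an action that rewrites the source MAC and picks an egress port."""
    def action(hdr: Ethernet) -> Tuple[Ethernet, int]:
        return replace(hdr, src=new_src), port
    return action


class ExactMatchTable:
    """Match-action stage keyed on the destination MAC address."""

    def __init__(self, default_action: Action) -> None:
        self._entries: Dict[bytes, Action] = {}
        self._default = default_action

    def add_entry(self, dst: bytes, action: Action) -> None:
        """Stand-in for a control-plane call that populates the table."""
        self._entries[dst] = action

    def apply(self, hdr: Ethernet) -> Tuple[Ethernet, int]:
        return self._entries.get(hdr.dst, self._default)(hdr)


def process(packet: bytes, table: ExactMatchTable) -> Tuple[bytes, int]:
    """Run one packet through parser, match-action table and deparser."""
    hdr, payload = parse(packet)
    hdr, port = table.apply(hdr)
    return deparse(hdr, payload), port
```

In this model the data-plane description fixes what `parse`, `ExactMatchTable` and `deparse` look like, while `add_entry` is the only part a control-plane would touch at run time.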

1.3 Purpose

The aim of this thesis report is to propose a viable solution in response to the above-mentioned problem statement by designing a version of the network switch that addresses the required characteristics. Various stages of the switch’s hardware architecture that are crucial to accept, process and eject packets shall be discussed in detail with a primary focus on the development and integration of the module defined using P4 consisting of a custom-made parser, lookup tables and deparser. The experiment shall showcase the ease of hardware reconfigurability for future innovations in protocol formats and the ability to describe traffic rules by the control-plane using Application Programming Interfaces (API). The report shall be a guide to future researchers that desire to explore further within this domain.

1.4 Goals

[...]

1.5 Benefits, Ethics and Sustainability

The results of this thesis shall cater to the design of a network switch that would be commanded by network operators by adhering to the top-down approach. The data-plane is entirely defined on an FPGA and shall be easily reconfigurable to test new protocol formats or algorithms. This would ensure that the packet switching techniques commanded by the control layer optimize the path a packet travels from source to destination. Keeping in mind the continuous upgrading of the network architecture, the proposed hardware pipeline shall be scalable to higher transfer rates with minimum effort, in turn shrinking the heap of obsolete technology that threatens the sustainability of our world.

As far as ethics is concerned, the thesis report shall grant credit to previous research that served as a reference to propose such a design. Citations shall be responsibly stated to appreciate all the previous efforts that helped complement the outcome of this research. Data and figures that originate elsewhere shall be reproduced, if necessary for the better understanding of the reader, with proper references to the source of information. To adhere to the confidentiality terms and conditions stated by Infinera, some of the contents are described in an abstracted manner. Nevertheless, the work presented in this report shall be comprehensible to the reader.

1.6 Research Methodology

To explore the options for proposing a viable switch design, the research shall primarily apply a combination of qualitative and quantitative research methods, collectively known as the “triangulation” approach, to study the phenomenon. Quantitative research methods shall incorporate experiments and continuous testing during the various stages of development or integration of new components that constitute the overall design. Qualitative research methods shall help analyse the final design and study the implications in terms of resource utilization and latency for the adopted standards, and could be a guide in developing the next generation of SDN switches. The reason for mixing these two methodologies is to get a complete view of the research area and to let the results attained from each complement the other [11]. Therefore, the final hardware prototype shall go through phases of design, implementation and testing for various cases before a conclusion is articulated.

At the beginning of the thesis, a brief study shall be conducted to determine the features of the existing proprietary board upon which the hardware is to be designed. Subsequently, it is necessary to meticulously research the modules that must be developed and integrated to build a switch that communicates with the rest of the network. Using the concepts of empirical research, knowledge must be derived from experimental evidence and tested predictions. Therefore, in accordance with the research guidelines, the next stage would be to develop, integrate and test each feature one after the other to understand the corresponding functionality and implications through actual experience [11]. This research method shall guarantee a reliable design while supporting the validity of the results declared.

[...] tested by self-looping the transceivers. The next phase would be to use the same empirical techniques to further study the hardware design after integrating the packet processor module defined in P4. In addition to the default actions in the case of a total mismatch, it is possible to program the lookup tables from the control-plane. The ease of defining new rules and reconfiguring the FPGA shall be quantitatively discussed. The above-mentioned procedure shall be initially tested on a single lane and subsequently scaled to a two-lane configuration using arbitration techniques to ensure the sharing of resources. However, the idea is to operate on multiple lanes in the future.

The research and development adopted in the thesis work shall benefit by adhering to the basic principles of an agile workflow in a minimalistic fashion. The thesis kicked off by understanding the impact of the outcome of this research and charting a rough plan that could help track the progress of the tasks. Instead of a straightforward waterfall style, the work involves iterative development with minor documentation and continuous testing, using software tools to simulate hardware functionality and eventually observing the expected behaviour on an FPGA using test instruments. However, it is worth mentioning that learning more about the subject demanded that the initial plan be modified to ensure a refined outcome. As previously discussed, the various stages of development can be classified under two broad categories: development of a P4 compatible hardware pipeline, and definition of a suitable algorithm in P4. The second category shall shape the primary feature of the network switch and must be studied to understand the limitations of this design. Qualitative results shall finally be gathered to evaluate the characteristics of the proposed next generation of SDN switches designed by the aforementioned techniques.

1.7 Delimitation

Designing modern network switches to be compatible with the “top-down” approach exposes the network architecture to attacks. The segregation of the control-plane, with overall command over multiple data-planes, makes the control-plane a sweet spot for attackers to redefine the traffic rules and create havoc remotely [12]. It is vital that vendors design such technologies keeping in mind the security risks involved. The proposed design must be improved further to mitigate such risks.

Currently, the research studies the phenomenon and the ease of reconfigurability at a scaled-down transfer rate. Since the application demands a faster version, the design must be modified to meet the high-speed requirements. However, the proposed design has taken this factor into account and ensures that this modification requires minimum effort.

[...] network switch hardware pipeline.

1.8 Outline

This section shall be a descriptive guide of the structure that has been adopted while writing this report. Readers can get a glimpse of what to expect from the chapters that follow.

Chapter 2, "Theoretical framework", discusses SDN in more depth along with the change in trends over recent years. OpenFlow and the drawbacks that limit its SDN capabilities are briefly mentioned to state the significance of P4 in describing Softly Defined Networks. Subsequently, the various characteristics of P4 and the basic structure of a P4 program are discussed to make the chapters that follow easier to comprehend. Finally, this chapter looks back at previous work and technologies that influenced the direction of this research. Some miscellaneous information regarding the hardware platform and test instruments used for this project is also included in this chapter.

Chapter 3, "P4-enabled switch: The proposed design", discusses the adopted P4 architectural model and the custom P4 described data-plane. The supporting hardware framework, upon which the cost of a P4 defined switch is studied, is described in terms of the building blocks used for the design. The algorithm, configuration and signals associated with each building block are discussed in detail.

Chapter 4, "Results", discusses the goals achieved within this thesis work. Firstly, the simulation and integration of the necessary building blocks are discussed. After the successful integration, the final block design is depicted and tested for the desired behavior using a test setup. Subsequently, to study the cost incurred, various P4 descriptions are analyzed in terms of resource utilization and latency with increasing complexity.

Chapter 2

Theoretical framework

2.1 Software Defined Network to Softly Defined Network

In order to offer remedies to the ever-evolving operator demands, system vendors realized that innovation must reduce development cycles from years to months, with the added advantage of the flexibility to realize new networking topologies. Software Defined Networking was a promising approach to tackle these demands, and the conceptualization of such a technique was driven by the constant reworking of standards and over-the-top services that could dynamically define the network. Initially, the traditional belief that software is easy to redesign while hardware is expensive and harder to redesign induced the assumption that “there is relatively dumb switching hardware for high-speed packet forwarding, and relatively intelligent software running on processors for lesser-speed packet forwarding and networking control” [13].

In addition to the limitations briefly mentioned in Chapter 1, OpenFlow clearly exemplifies the above-stated assumption. OpenFlow is one of the pioneering attempts at achieving SDN and is sometimes wrongly considered the best approach to it. In OpenFlow 1.0, the model contained a single lookup table for matching certain predefined packet fields and only allowed simple actions. Subsequent versions of OpenFlow operated on models with multiple sequential lookup tables with extended predefined packet fields and actions. Later upgrades even improved the language interface and aimed towards a higher level of abstraction. Nevertheless, OpenFlow offered a very restricted view of the underlying forwarding architecture and did not fully tap into the available degree of programmability, since it limited the end-user to working with predefined protocol formats [13].
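The restriction described above can be illustrated with a toy model of sequential lookup tables over a fixed field set. This Python sketch is not the OpenFlow API. The field names in `KNOWN_FIELDS`, the symbolic action strings and the choice to drop on a table miss are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Illustrative subset of match fields fixed by the switch, not by the operator.
KNOWN_FIELDS = ("in_port", "eth_dst", "eth_type", "vlan_id")


@dataclass
class Rule:
    match: Dict[str, int]              # field name -> required value; absent = wildcard
    actions: List[str]                 # symbolic action names, for illustration only
    goto_table: Optional[int] = None   # continue in a later table, or stop if None

    def __post_init__(self) -> None:
        # Reject any field the fixed pipeline does not already know about.
        unknown = sorted(set(self.match) - set(KNOWN_FIELDS))
        if unknown:
            raise ValueError(f"unsupported match fields: {unknown}")

    def matches(self, pkt: Dict[str, int]) -> bool:
        return all(pkt.get(name) == value for name, value in self.match.items())


def run_pipeline(tables: List[List[Rule]], pkt: Dict[str, int]) -> List[str]:
    """Walk the tables in order and collect the actions of the matching rules."""
    applied: List[str] = []
    table_id: Optional[int] = 0
    while table_id is not None:
        rule = next((r for r in tables[table_id] if r.matches(pkt)), None)
        if rule is None:
            applied.append("drop")  # assumption: a table miss drops the packet
            break
        applied.extend(rule.actions)
        if rule.goto_table is not None and rule.goto_table <= table_id:
            raise ValueError("goto_table must point to a later table")
        table_id = rule.goto_table
    return applied
```

An operator can add rules and chain tables freely, but a rule such as `Rule({"vxlan_vni": 1}, ["output:1"])` is rejected because the field is not in the predefined set. Supporting it would require changing the switch itself, which is the limitation that motivates a re-programmable data-plane.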

Over time, it became obvious that packet forwarding solutions relying on simple switching hardware alone, with complex functionality handled in software, were limited in their flexibility to deliver the requisite performance. IT architects saw that the underlying hardware had to evolve beyond these limitations and work towards implementing a set of dynamic virtual services under software control. This called into question the role of fixed-function ASICs and underlined the importance of moving towards more flexible Network Processors (NPUs).

… of networking platforms. FPGA technologies can blur the roles of hardware and software by offering scope for defining ‘soft hardware’. The term highlights highly flexible and easily programmable hardware of the kind manufactured by companies like Xilinx, which calls its next generation of programmable networking platforms, going beyond SDN, ‘Softly Defined Networking’ devices [14].

Therefore, in comparison to conventional SDN devices that operate on a fixed data-plane implementation, Softly Defined Networks have a software-defined data-plane implemented on re-programmable hardware in addition to the usual software-defined control-plane. Various development environments have recently focused on providing high-level definition capabilities so that end users can fully customize the underlying data-plane and easily program it through the control-plane with the necessary APIs. Figure 2.1 is derived from the guide of a promising development environment offered by Xilinx and gives a comprehensive pictorial description of the above comparison.

FIGURE 2.1: Software Defined Networks to Softly Defined Networks.

… that commands the rules on which decisions are made. Secondly, SDN offers an interface between the separated control and data planes. Thirdly, control-plane logic is migrated to a logically centralized controller that offers a global view of the underlying network resources, enabling applications to command and optimize policies. Together with a much more flexible underlying soft hardware, these changes radically increase the pace of network innovation and improve performance, scalability, cost, flexibility, and ease of management [15].

Figure 2.2, derived from [15], is a more descriptive version of the previously illustrated figure 1.1 and depicts the desired overall network framework, with a segregated, centralized control-plane that commands the underlying softly defined data-planes. The SDN applications that form the application layer communicate with and orchestrate the underlying infrastructure layer through the centralized control-layer services to realize finer traffic engineering rules. The physical switches that constitute the lowest layer of the network are studied in this thesis.

FIGURE 2.2: The overall network framework.

2.2 P4: Programming Protocol-independent Packet Processors

Compared to OpenFlow, which only allows a limited number of flow tables to be customized, P4 is intended to define the overall data-plane functionality of the networking device. P4 therefore exposes a wide range of data-plane parameters to the network programmer without the restrictions imposed by OpenFlow, and makes innovation more agile, which in turn helps shorten the development cycle. Many devices implement both the control plane and the data plane. P4, however, describes only the data plane and, in part, the interface through which the control-plane and the data-plane communicate.

In the case of a network switch, there are primarily two differences between a traditional switch and a P4-defined switch [10].

• Data-plane:

In a traditional switch, the vendor defines the data-plane functionality, which is not reconfigurable.

In a P4-defined switch, the data-plane functionality is not fixed in advance. Initially, the hardware has no knowledge of the protocols desired by the operator or of the header fields to extract; these are configured during initialization based on the P4 program description. This gives the end user a wider degree of customization than merely modifying the routing tables.

• Routing tables and entries:

In a traditional switch, the control-plane controls the data-plane by managing entries in a fixed number of routing tables, configuring specialized objects and processing control packets.

In a P4-defined switch, the control-plane communicates with the data-plane in the same fashion as in a traditional switch, but the set of routing tables and configurable objects in the data-plane is not fixed and is defined by the P4 program description. The P4 compiler generates the APIs that form the interface between the control-plane and the data-plane and provide access to the data-plane objects.

Figure 2.3 is a more descriptive version of the softly defined network switch shown in figure 2.1 and is adopted from the “P4-16 Language Specification” guide. It clearly illustrates the differences between a traditional and a P4-defined switch.

While proposing P4 as a high-level language aimed at defining the data-plane for network devices, the authors had 3 main goals that would eventually promise a higher degree of flexibility [5]. These goals were:

• Reconfigurability – The user must be able to easily modify the switch behavior even after it is deployed in the field. Besides providing an efficient means of testing new ideas, this ensures shorter development cycles and a faster time to market.

• Protocol independence – The switch must not be tied to any specific network protocol, so that the user can customize the packet formats without constraints.

• Target independence – As with other high-level programming languages, a network developer’s work should not depend on the underlying hardware that will realize the P4 description. While defining the data-plane for the target device in P4, users must be able to work with a target-independent description and leave the necessary target-dependent translation to the compiler.

FIGURE 2.3: Traditional switch vs P4-defined switch [10].

Before discussing the various programmable components that constitute the data-plane of a network switch, it is important to understand the typical tool workflow that enables target programming using P4. The prerequisites are a hardware or software implementation framework, a P4 architecture model definition, and a target-specific P4 compiler. Compiling a user-defined P4 description that adheres to the architecture model definition produces two outputs:

• Data-plane configuration that implements the behavior described by the input P4 program. This is loaded onto the overall switch hardware pipeline/framework that interfaces with the rest of the network. At this point the switch becomes aware of the networking protocols, the header fields to parse and the match-action tables to allocate.

• Run-time API through which the control-plane manages the data-plane objects. It enables the interaction between the two layers, in particular the vital stage in which the control-plane populates the match-action tables with the desired traffic engineering rules.

FIGURE 2.4: Work-flow in programming a target using P4 [10].

Apart from the prerequisite compiler and target, P4-16 incorporates a new capability that enables P4 on a diversity of devices. The P4 architecture model defines the P4-programmable blocks and the data-plane interfaces that carry the signals required by the user-defined P4 program. Architectures insulate programmers from the details of the underlying target framework and provide an overview of the framework that must accommodate the P4 definition. Hardware providers are responsible for defining the compatible architecture models and for implementing the compiler back-end that maps the architecture model and the user-defined P4 descriptions to the respective target-specific configuration [10]. The vital abstractions provided by the P4 language are header types, parsers, tables, actions, match-action units, control flow, extern objects, user-defined metadata and intrinsic metadata. These are explained in detail in the P4-16 Language Specification document provided by the P4 Language Consortium [10].

2.2.1 Architecture model

A single-pipeline forwarding architecture that generalizes, and closely relates to, the architecture model adopted for this thesis work primarily involves three programmable blocks: the parser, the match-action unit and the deparser. As discussed previously, this architecture model is defined by the target manufacturer to be compatible with the target-specific compiler. In the future, P4 compilers are expected to share a common front-end that understands multiple architecture models.

… defined upon different technologies such as fixed-function ASICs, NPUs, reconfigurable switches, software switches and FPGAs.

FIGURE 2.5: Architecture model and programmable blocks derived from [16].

Figure 2.5 depicts the described switch architecture in a more comprehensive manner and is derived from [10,15,16]. The colored blocks are the programmable sections that, in addition to the meta-data bus, are defined within a P4 description. As shown, the custom P4 definition that describes the parse graph determines both the parser and the deparser functionality, as they are complementary to each other. The match-action unit, comprising the ingress and egress pipelines, is defined using the control program, the desired table configuration and the set of permitted actions. The incoming packet is first parsed to extract the header fields defined by the developer. These header fields are then looked up as key fields in the defined match-action tables, which the control-plane populates with the actions to be performed for each matched key. At the output, the deparser finally reassembles the packet with the desired modifications. Refer to [10] for an example architecture model defined in P4. A standard Portable Switch Architecture (PSA) is currently being defined by the P4-16 architecture working group with 6 programmable blocks, 2 fixed blocks and functions to support its capabilities [17]. Figure 2.5 can be considered a simplified version of the PSA model and is useful for multiple applications.
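In P4-16, an architecture model of this shape is itself written in P4, as a set of block signatures followed by a package declaration. The following is a minimal sketch in the spirit of the example architecture in [10]; the names SketchParser, SketchPipe, SketchDeparser, SketchSwitch and sketch_metadata_t, as well as the port widths, are hypothetical and do not correspond to the PSA or to any vendor-supplied model.

```p4
#include <core.p4>

// Hypothetical metadata travelling next to the packet on the meta-data bus.
struct sketch_metadata_t {
    bit<4> ingress_port;
    bit<4> egress_port;
}

// Signatures of the three programmable blocks; H is the user's headers struct.
parser SketchParser<H>(packet_in pkt, out H hdr, inout sketch_metadata_t meta);
control SketchPipe<H>(inout H hdr, inout sketch_metadata_t meta);
control SketchDeparser<H>(in H hdr, packet_out pkt);

// The package names the blocks that a P4 program must supply for this target.
package SketchSwitch<H>(SketchParser<H> p, SketchPipe<H> pipe, SketchDeparser<H> d);
```

A user program would target such a model by defining one parser and two controls with matching signatures and instantiating the package once as main.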

The three architecture models currently supported by the Xilinx P4-SDNet tool are:

• XilinxSwitch

• XilinxStreamSwitch

• XilinxEngineOnly

XilinxSwitch closely resembles a simplified PSA and has been adopted for the purposes of this thesis. XilinxStreamSwitch is an experimental feature; it is similar to the XilinxSwitch architecture model and differs only in the deparser definition. XilinxEngineOnly is an architecture model used by developers to define stand-alone SDNet engines without any other interfaced engines, unlike XilinxSwitch and XilinxStreamSwitch. A detailed description of each of these models, with source code, is available in the "P4-SDNet User Guide" [18].

The adopted XilinxSwitch architecture model is described in detail in Chapter 3.

2.2.2 P4 description

As briefly discussed, the blocks of the architecture that constitute the anatomy of a basic pipeline and are available to the programmer for customization are:

• Parser

• Match-Action tables

• Deparser

• Meta-data bus

The simplified data-flow topology for the above blocks is depicted in figure 2.6. The blue arrows depict packet transmission and the green arrows depict tuple transmission.

However, apart from the meta-data bus, each component is optional within a P4 definition. Figure 2.7 depicts the sections of a typical P4 program. The first section is the data declaration, which defines the data types of the header fields and of the data passed along the data-plane pipeline on the meta-data bus. The second section is the parser definition, which describes the states that constitute the parse graph and perform the header extraction. The third section, the control block declaration, contains the match tables and actions used for packet processing. The last section defines the deparser, which stitches the packet back together with the desired modifications. Each of these components is discussed, along with sample P4 definitions, in the following subsections.

2.2.2.1 Meta-data bus

FIGURE 2.6: Programmable blocks and the data-flow topology.

Therefore, these values, also known as meta-data, carry information through the entire pipeline. The code segment taken from [18] is an example provided by Xilinx and shows how header fields are defined within the data declaration section of a P4 program. These fields are extracted by the parser for further packet processing. For example, header ethernet_h groups a 48-bit destination address, a 48-bit source address and a 16-bit type field.
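A data declaration section of this kind can be sketched as follows. This is an illustrative sketch rather than the Xilinx sample from [18]: ethernet_h follows the 48/48/16-bit layout described above, and headers_t, switch_metadata_t and switch_port_t reuse type names that appear in the control block sample of section 2.2.2.3, but the field names, the mac_addr_t alias and the 4-bit port width are assumptions.

```p4
typedef bit<48> mac_addr_t;     // hypothetical alias
typedef bit<4>  switch_port_t;  // port width is an assumption

// Ethernet header as described in the text: 48 + 48 + 16 bits.
header ethernet_h {
    mac_addr_t dst;
    mac_addr_t src;
    bit<16>    ether_type;
}

// Parsed representation handed from the parser to the later blocks.
struct headers_t {
    ethernet_h ethernet;
}

// Tuple carried on the meta-data bus next to the packet.
struct switch_metadata_t {
    switch_port_t ingress_port;
    switch_port_t egress_port;
}
```

Declaring the header as a header type rather than a struct gives it a validity bit, which the parser sets on extraction and which later blocks can test with isValid().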

FIGURE 2.7: Sections within a typical P4 program.

2.2.2.2 Parsing of the packet

Parsing is one of the first operations performed on a packet, and its output, the extracted headers, is vital to the overall functioning of an SDN-based device. As already mentioned, to support ever-evolving network protocols and multi-gigabit transfer rates, this section of the packet processor must be both fast and reconfigurable. The parser generated through the high-level, configurable P4 approach must therefore ensure low latency and high-speed packet streaming.

The P4 definition of a parser is responsible for identifying the purpose of the first N bits of an incoming packet and structuring them as a series of extracted fields with associated labels [19]. This set of extracted fields is known as the “parsed representation” of a packet. To achieve this, the P4 description of a parser can be viewed as a state machine with one start state named ‘start’ and two final states named ‘accept’ and ‘reject’. Each state extracts its defined fields and decides which path to take through the state machine. The final state ‘accept’ indicates successful parsing of a packet and ‘reject’ indicates unsuccessful parsing. Figure 2.8 is an abstract illustration of a parser state machine that separates the final states from the P4-programmable states [10]. Figure 2.9 depicts the parser state machine for the sample P4 parser code adopted from [18].

FIGURE 2.8: An abstract parser state machine [10].


FIGURE 2.9: Parser state machine for the sample parser code.
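A parser state machine of this kind can be sketched in P4-16 as follows. This is an illustrative sketch, not the Xilinx sample from [18]: the name ParserSketch, the field names and the restriction to Ethernet and IPv4 are assumptions, and the block is written against core.p4 only rather than a specific architecture model.

```p4
#include <core.p4>

header ethernet_h {
    bit<48> dst;
    bit<48> src;
    bit<16> ether_type;
}

header ipv4_h {
    bit<4>  version;
    bit<4>  ihl;
    bit<8>  diffserv;
    bit<16> total_len;
    bit<16> identification;
    bit<3>  flags;
    bit<13> frag_offset;
    bit<8>  ttl;
    bit<8>  protocol;
    bit<16> hdr_checksum;
    bit<32> src;
    bit<32> dst;
}

struct headers_t {
    ethernet_h ethernet;
    ipv4_h     ipv4;
}

parser ParserSketch(packet_in pkt, out headers_t hdr) {
    // Every parser begins in the state named 'start'.
    state start {
        pkt.extract(hdr.ethernet);
        // The extracted type field selects the next state.
        transition select(hdr.ethernet.ether_type) {
            0x0800:  parse_ipv4;
            default: accept;   // other traffic: Ethernet header only
        }
    }
    state parse_ipv4 {
        pkt.extract(hdr.ipv4);
        // A failed check moves the state machine to 'reject'.
        verify(hdr.ipv4.version == 4, error.NoMatch);
        transition accept;
    }
}
```

The states ‘accept’ and ‘reject’ are never declared by the programmer; they are the two implicit final states of the abstract state machine, and a packet that is too short for an extract call also ends in ‘reject’.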


2.2.2.3 Match-Action tables

The metadata extracted while parsing the packets is the key to classifying and manipulating the packets in the subsequent control block stage. The body of the control block primarily consists of action definitions, which contain the instructions that manipulate the metadata, and tables, which map the extracted fields to be matched to their respective actions. These match-action units must be invoked to perform any form of data transformation and are therefore an indispensable part of a packet processing device. In addition to the metadata coming from the parser, other metadata may arrive externally along with the packet if this is defined within the P4 architecture model.

Actions are responsible for modifying the metadata being processed. An action can be compared to a function in other high-level languages whose parameter values are either supplied by the control-plane or read from the data-plane. If sufficient actions are defined within the P4 program, the control-plane can command the manipulations to be made on the metadata or, in other words, define the traffic rules dynamically. Figure 2.10, derived from [10], illustrates an action definition within a P4 program. As shown in the figure, parameters reach the action code both from the data-plane and from the control-plane, as defined in P4. To make action definitions easier, a set of predefined primitive actions can be combined into complex compound actions, in addition to describing custom actions.

FIGURE 2.10: Action code, data and parameters [10].
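The two sources of action parameters can be sketched as follows. In P4-16, a parameter without a direction receives its value from the table entry installed by the control-plane, whereas a parameter with a direction (in, out or inout) is bound by the data-plane where the action is invoked. The control name ActionSketch, the action reflectPacket, the table l2_forward, the 4-bit port width and the use of 0xF as a drop port are assumptions made for illustration.

```p4
#include <core.p4>

typedef bit<4> switch_port_t;  // port width is an assumption

header ethernet_h { bit<48> dst; bit<48> src; bit<16> ether_type; }
struct headers_t { ethernet_h ethernet; }
struct switch_metadata_t { switch_port_t ingress_port; switch_port_t egress_port; }

control ActionSketch(inout headers_t hdr, inout switch_metadata_t ctrl) {
    // Directionless parameter: the value comes from the table entry that
    // the control-plane installed.
    action forwardPacket(switch_port_t value) {
        ctrl.egress_port = value;
    }

    // Directional parameter: the value is bound by the data-plane at the call site.
    action reflectPacket(in switch_port_t arrival_port) {
        ctrl.egress_port = arrival_port;
    }

    action dropPacket() {
        ctrl.egress_port = 0xF;  // assumed drop port
    }

    table l2_forward {
        key = { hdr.ethernet.dst : exact; }
        actions = { forwardPacket; dropPacket; }
        default_action = dropPacket();
    }

    apply {
        if (!hdr.ethernet.isValid()) {
            dropPacket();
        } else if (hdr.ethernet.dst == 0xFFFFFFFFFFFF) {
            // Direct invocation: every argument is supplied by the data-plane.
            reflectPacket(ctrl.ingress_port);
        } else {
            // Table invocation: the matching entry selects the action and
            // provides the value of its directionless parameter.
            l2_forward.apply();
        }
    }
}
```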

Once the necessary actions are defined, implementing different switching protocols requires that these actions are performed in an orderly fashion, based on conditions applied to the meta-data fields. To perform actions based on matches, a lookup table must therefore be implemented that lists the key fields to be matched and the corresponding predefined actions. To process a packet using such a match-action table, the following steps must be executed:

• Construction of a key field to be matched upon.

• The match step: Key lookup within the lookup table to decide upon the action to be executed.

• The action step: Execution of the selected action on the input data, resulting in modification of the data.

… table contents can be manipulated asynchronously by the target control-plane. This feature, although inherited from the recent iterations of OpenFlow, is crucial to guarantee the promised protocol independence and reconfigurability to the end-user.

FIGURE 2.11: Match-action unit [10].

The P4 core library currently defines three match kinds, and P4 programmers cannot define new ones [10]. Table 2.1 from [18] additionally lists a fourth kind, direct, which is not part of the core library and is therefore specific to the Xilinx target. The permitted match kinds are:

• Exact match

• Ternary match

• Longest prefix match

• Direct match

The keywords for the corresponding match kinds are ‘exact’, ‘ternary’, ‘lpm’ and ‘direct’. In P4, the key fields of a table can be matched using several different match kinds. However, the Xilinx tool used at the time of this thesis work allows only one match kind per table. The sample code below describes a control block consisting of a match-action unit. As mentioned previously, P4 allows multiple match-action tables, which the compiler optimizes to operate in series or in parallel. Refer to the P4 specification guide for more details on the match kinds. Table 2.1, adopted from [18], states the restrictions imposed on tables of each match kind.

control Forward(inout headers_t hdr, inout switch_metadata_t ctrl) {
    // Egress port supplied by the control-plane through the table entry.
    action forwardPacket(switch_port_t value) {
        ctrl.egress_port = value;
    }
    // Port 0xF marks the packet to be dropped.
    action dropPacket() {
        ctrl.egress_port = 0xF;
    }
    table forwardIPv4 {
        key = { hdr.ipv4.dst : ternary; }
        actions = { forwardPacket; dropPacket; }
        size = 63;
        default_action = dropPacket;
    }
    table forwardIPv6 {
        key = { hdr.ipv6.dst : exact; }
        actions = { forwardPacket; dropPacket; }
        size = 64;
        default_action = dropPacket;
    }
    apply {
        if (hdr.ipv4.isValid())
            forwardIPv4.apply();
        else if (hdr.ipv6.isValid())
            forwardIPv6.apply();
        else
            dropPacket();
    }
}

Match Kind   Key Size (bits)   Element size (bits)   Depth
exact        12,384            1,256                 1,512K
ternary      1,800             1,400                 1,4K
lpm          8,512             1,512                 7,64K
direct       1,16              1,512                 2,64K

TABLE 2.1: Table restrictions based upon the match kind [18].
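For comparison, standard P4-16 lets one table key combine several match kinds, which the Xilinx tool described above does not permit. The following is a minimal sketch of such a table written against core.p4 only; the control name ClassifySketch, the table name classify, the field names, the 4-bit port width and the drop port 0xF are assumptions.

```p4
#include <core.p4>

typedef bit<4> switch_port_t;  // port width is an assumption

header ethernet_h { bit<48> dst; bit<48> src; bit<16> ether_type; }
struct headers_t { ethernet_h ethernet; }
struct switch_metadata_t { switch_port_t ingress_port; switch_port_t egress_port; }

control ClassifySketch(inout headers_t hdr, inout switch_metadata_t ctrl) {
    action forwardPacket(switch_port_t value) { ctrl.egress_port = value; }
    action dropPacket() { ctrl.egress_port = 0xF; }

    table classify {
        key = {
            ctrl.ingress_port       : exact;    // must equal the entry value
            hdr.ethernet.ether_type : exact;
            hdr.ethernet.dst        : ternary;  // compared under a per-entry mask
        }
        actions = { forwardPacket; dropPacket; }
        size = 64;
        default_action = dropPacket();
    }

    apply {
        if (hdr.ethernet.isValid()) {
            classify.apply();
        } else {
            dropPacket();
        }
    }
}
```

On a target limited to one match kind per table, the same classification would have to be expressed with a single kind, for example by making every key field ternary.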

2.2.2.4 Deparser

The deparser is the final section of a P4 definition. Its purpose is to reassemble the modified header fields with the corresponding payload into a packet that carries the desired manipulations. The initial parsed representation of a packet may change considerably during processing, through modification of header values and through removal or addition of headers. Deparsing is therefore essential to ensure an accurate reconstruction of the desired output packet [19]. In simpler terms, a deparser performs the inverse function of a parser.

… of increasing indexes [10].

control Deparser(in headers_t hdr, packet_out pkt) {
    apply {
        pkt.emit(hdr.ethernet);
        pkt.emit(hdr.ipv4);
        pkt.emit(hdr.ipv6);
        pkt.emit(hdr.tcp);
        pkt.emit(hdr.udp);
    }
}

Finally, following the deparser declaration, the desired architecture package is instantiated with the above-mentioned programmable blocks, as shown in the Xilinx sample code [18] given below.

XilinxSwitch(Parser(), Forward(), Deparser()) main;

2.2.3 Benefits of using P4

To recapitulate, listed below are some of the major advantages of P4 described within the P4-16 Language Specification guide [10].

• Flexibility: As a high-level language, P4 gives programmers a higher degree of capability to customize the data-plane.

• Expressiveness: By defining general-purpose operations and table look-ups, P4 enables programmers to express complex packet processing algorithms.

• Resource mapping and management: The abstract data-plane description is compiled to map the defined fields to hardware resources and to manage allocation and scheduling efficiently.

• Component libraries: Hardware-specific functions can be wrapped into portable high-level P4 constructs.

• Decoupling hardware and software evolution: Low-level architectural details of the hardware can be further abstracted from high-level software processing.

2.2.4 P4 compiler and tools

For this thesis project, the switch is designed on a Kintex FPGA manufactured by Xilinx, Inc. To generate an appropriate executable, Xilinx's P4-SDNet tool framework, which is still under development, is used to ensure a compatible P4 compilation. The tool consists of a compiler with a target-independent front-end, p4c, which compiles target-independent functionality into intermediate C code. Subsequently, the back-end converts this intermediate representation into target-dependent SDNet code. p4c supports both P4-14 and P4-16, and p4c-sdnet converts the P4 description into an SDNet description of the data-plane. This description consists of engines that communicate primarily through data-flows of packets and tuples to implement a larger system behavior. SDNet [21] is a development environment provided by Xilinx that accepts this description and targets Xilinx hardware. Together, the tools compile P4 files into Verilog files that can subsequently be used to generate the desired IP cores. Figure 2.12, derived from [22], illustrates the functioning of the P4-SDNet tool and lists the output files generated by SDNet. As depicted, the essential outputs include the Verilog files with a top-level wrapper, testbenches and the APIs through which the control plane populates traffic rules.

FIGURE 2.12: P4-SDNet compilation flow [22].

The "SDNet Installation and Getting Started" guide [23] describes the general installation procedure of this tool. The instructions necessary to compile a P4 program and the output files generated are discussed in [18,21]; the commands that were essential for obtaining the desired results are discussed in appendix A.

2.3 Related work

… University, Google, Microsoft Research, Xilinx Inc., etc. are collaborating to define the best methodology for implementing the top-down approach irrespective of the target.

Anirudh Sivaraman et al. [4] explored the use of P4 in defining the forwarding plane of a data-center switch. They discuss the essential capabilities of P4 at the time with respect to implementing such a switch. To ease prototyping, a software switch was used as the target for the P4 programs. In addition to listing the advantages of a high-level language, the paper proposes improvements to the P4 version explored at the time, concerning the modularity of P4 code, explicit visibility of the flow of information from one table to another, parallel execution semantics, architecture-language separation and new primitives such as cloning and digest generation.

Fabien Geyer et al. [24] discussed the use of P4 to meet the need for higher flexibility in defining the custom network protocols generally used within the aeronautical industry. Their work explores the capabilities of P4 in addressing various application requirements and concludes with a performance analysis on different platforms: a software-based back-end using Intel DPDK, a hardware network accelerator based on an NPU, and an FPGA-based platform. The article states that, since P4 is still under development and suffers from incompatibilities between its versions, it is not yet suitable for aeronautical applications with long lifetimes. The possibility of formal analysis and the simple cost model nevertheless make P4 promising for future use.

Apart from hardware-based analysis, "PISCES: A Programmable, Protocol-Independent Software Switch" [25] discusses the use of a domain-specific language such as P4 to describe the behavior of a protocol-independent software switch, PISCES. The proposed implementation is benchmarked in terms of overall performance with increasing complexity of both the parser and the actions. Finally, the results are compared with the conventional Open vSwitch.

2.4 Miscellaneous

This section introduces the miscellaneous prerequisite theory, such as details of the FPGA-based framework, high-speed transceivers and testing instruments, that was necessary to design and test the proposed next-generation networking switch.

2.4.1 FPGA platform

Feature             Kintex-7
Logic cells         478K
Block RAM           34 Mb
DSP Slices          1,920
DSP Performance     2,845 GMAC/s
MicroBlaze CPU      438 DMIPs
Transceivers        32
Transceiver Speed   12.5 Gb/s
Serial Bandwidth    800 Gb/s
PCIe Interface      x8 Gen2
Memory Interface    1,866 Mb/s
I/O Pins            500
I/O Voltage         1.2V–3.3V
Package Options     Bare-Die Flip-Chip and High-Performance Flip-Chip

TABLE 2.2: Capabilities of 7 Series FPGA [26].

The board used for this research consists of an MPC8321 microprocessor that connects to the FPGA through an asynchronous interface, the General-Purpose Chip-Select Machine (GPCM) bus. The Kintex FPGA serves as the traffic FPGA and contains the hardware pipeline to be designed, which implements the P4-defined data-plane. Currently, the point-to-point interconnect within the FPGA uses Intel’s UltraPath Interconnect (UPI). An abstract block diagram of the board from an FPGA designer’s point of view is shown in Figure 2.13.

For the purposes of this thesis work, the traffic FPGA is customized to accommodate, and function in accordance with, the defined P4 logic. Small form-factor pluggable (SFP+) transceivers are used to interface the board with the external optical links, as shown in figure 2.13. Xilinx provides power-efficient transceivers that enable high-speed optical interfacing with the board. The maximum line rate offered by the GTH transceivers in the current 7-series is 12.5 Gb/s. The reference clock comes from an external clocking device, the AD9554-i. These transceivers are highly configurable, and Vivado offers a GT wizard that provides an easy means of instantiating them with the desired settings and connections. A brief overview of Multi-Gigabit Transceivers (MGT) can be found in [27], and [28] describes the GTH transceiver within a Kintex FPGA in detail.

2.4.2 Simulation and testing

While designing the components of the hardware pipeline that accommodates P4, it is best practice to test for the desired behavior after each stage, both at the component level and at the integrated system level. Xilinx’s Vivado has been used for component-level development in SystemVerilog and for the simulations that verify the expected behavior.

FIGURE 2.13: Board level block diagram.

Chapter 3

P4-enabled switch: The proposed design

To further explore the advantages of defining the data-plane using P4 and to study the associated resource and latency constraints, a pipeline must be configured on an FPGA such that it communicates with the external network links and provides all the properties and signals needed to interface with a P4-defined data-plane. This is achieved by designing a two-lane 10G network switch, complemented with the current capabilities of P4, on the proprietary FPGA platform discussed briefly in section 2.4.1. This chapter gives an insight into the most crucial stage of the thesis: the overall design of the FPGA-based solution that accommodates the P4-definable switch characteristics. The requisite theoretical background has been discussed in Chapter 2; component-specific theory and the related configuration are discussed where relevant.

Figure 3.1 gives a simplified overview of the design to be implemented on the board carrying the Kintex FPGA. As emphasized in the figure, the P4 switch definition is the central and most important component, whose behavior determines the network packet processing. The rest of the design revolves around this module and focuses on the necessary external interfacing of the board and on efficient run-time configuration of the components, which also enables the traffic rules to be customized using control signals.

Currently, P4 descriptions compiled by P4-SDNet are capable of extracting and modifying header and tuple fields based on the parser, match-action table and deparser definitions. For packets to be processed accurately and rerouted based on tuple field modifications, the P4 description must be accurate and the chosen architecture model must fit well within the proposed hardware design.

Also, to justify the necessity of each component of the block design, it is useful to list some of the major hurdles that had to be addressed during the implementation of this design from the viewpoint of an FPGA designer, such as:

• the translation of existing bus operation to integrate the proposed design;

• GTH transceiver configuration and clocking to interface with the physical links;

• synchronization of the input tuple fields with respect to the packet streams to avoid stalling at the P4 module;

• clock domain synchronizations;

• address mapping of components to enable the processing system to use control signals as a means of run-time configuration;

• enabling switching of packets between the two lanes as part of the engineered actions, accompanied by the necessary arbitration and congestion control to ensure minimal packet loss.

FIGURE 3.1: Overview of the design.

… using a tuple controller. It is also necessary to supply suitable clocking to capture the incoming packet streams, execute table lookups using the match fields and register the changes made by the control signals. Finally, the desired switching of packets between Tx lanes as a result of the packet processing is made physically possible by a stream switch, which determines the route by behaving as a crossbar and performs the necessary arbitration in case of collisions.

Each of the essential components is described in more detail in the subsequent sections with regard to its purpose, configuration and essential signals. This information is crucial for reproducing the findings presented within this thesis. Finally, Chapter 4 shall illustrate the overall implemented block design, including the intricate details that ensure successful packet processing.

3.1 Building blocks

3.1.1 P4 description

This section discusses the architecture model adopted and the custom data-plane definition in P4 that is used within the design.

3.1.1.1 The Architecture model

Before elaborating on the configuration of the various hardware components that communicate with the P4-SDNet generated hardware description of the data-plane, it is necessary to know the P4 architecture model adopted for this design and to understand what it expects from the rest of the supporting components. As discussed in Chapter 2, there are currently three architectural models supported by the Xilinx P4-SDNet tool to configure various Xilinx technologies. For this design the XilinxSwitch architecture model has been adopted, as it closely resembles the desired PSA model described within the P4 specification guide [10]. This model allows the programmer to customize three major containers, namely the parser, the pipeline and the deparser, as shown in Figure 3.2.

Figure 3.2 is adopted from the “P4-SDNet User Guide” [18] and depicts the internal

Figure 3.2: XilinxSwitch layout [18].


control-plane are used to configure the lookup tables within this container that are responsible for matching the specified fields and executing predefined actions. This facilitates packet processing based upon the various software-defined traffic engineering rules. More detailed information regarding the various match types and actions is given in Chapter 2. After the desired packet fields have been modified, the third container, a XilinxDeparser control block, handles the stitching of the header fields with the rest of the original packet in the desired order.

The XilinxSwitch architecture description source code for each of these containers has been derived from the “P4-SDNet User Guide” and is given below to aid understanding of the adopted model.

parser XilinxParser<H>(packet_in pkt, out H headers);
control XilinxPipeline<H,C>(inout H headers, inout C control);
control XilinxDeparser<H>(in H headers, packet_out pkt);
package XilinxSwitch<H,C>(XilinxParser<H> prsr, XilinxPipeline<H,C> pipe, XilinxDeparser<H> dprsr);

The first container, XilinxParser, accepts an input packet stream of type packet_in and delivers the header fields as a user-defined type. This allows the programmer to define, with fewer restrictions, the header field composition that needs to be extracted from the incoming packet. Secondly, the source code describes the parameters passed to the XilinxPipeline, within which the match-action units are defined. For this purpose, the previously extracted header fields are accepted and modified within this container. A tuple signal of user-defined type named ‘control’ (this is not related to the control signal that configures the lookup tables) is accepted by this block along with every incoming packet, as shown in Figure 3.2. The final container, as discussed, accepts the modified header fields of user-defined type and emits the packet of type packet_out. XilinxSwitch is the final package that assembles each of these containers to define an architecture model.
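To make the roles of the three containers concrete, the following is a minimal P4_16 sketch of how a program might fill in the model described above. It is not the data-plane used in this design: the header type ethernet_t, the tuple type switch_ctrl_t with its fields in_lane and out_lane, the table forward and all Example* names are hypothetical and chosen only for illustration. The architecture declarations are repeated so that the sketch is self-contained, and the tuple parameter is named ctrl because control is a reserved word in P4_16. The sketch is written as generic P4_16 and has not been checked against the P4-SDNet compiler, which may require additional tool-specific annotations.

```p4
#include <core.p4>

// Architecture declarations as quoted above; in practice these would come
// from the architecture file shipped with the tool. The tuple parameter is
// named `ctrl` because `control` is a reserved word in P4_16.
parser XilinxParser<H>(packet_in pkt, out H headers);
control XilinxPipeline<H, C>(inout H headers, inout C ctrl);
control XilinxDeparser<H>(in H headers, packet_out pkt);
package XilinxSwitch<H, C>(XilinxParser<H> prsr,
                           XilinxPipeline<H, C> pipe,
                           XilinxDeparser<H> dprsr);

// Hypothetical header layout (standard Ethernet fields).
header ethernet_t {
    bit<48> dst_addr;
    bit<48> src_addr;
    bit<16> ether_type;
}

// User-defined header collection, bound to type parameter H.
struct headers_t {
    ethernet_t ethernet;
}

// Hypothetical tuple accompanying every packet, bound to type parameter C.
struct switch_ctrl_t {
    bit<1> in_lane;   // lane on which the packet arrived (assumed field)
    bit<1> out_lane;  // lane selected for transmission (assumed field)
}

// Container 1: extract the header fields from the incoming packet.
parser ExampleParser(packet_in pkt, out headers_t hdr) {
    state start {
        pkt.extract(hdr.ethernet);
        transition accept;
    }
}

// Container 2: match on a header field and update the tuple.
control ExamplePipeline(inout headers_t hdr, inout switch_ctrl_t ctrl) {
    action set_out_lane(bit<1> lane) {
        ctrl.out_lane = lane;
    }
    action keep_lane() {
        ctrl.out_lane = ctrl.in_lane;
    }
    table forward {
        key = { hdr.ethernet.dst_addr : exact; }
        actions = { set_out_lane; keep_lane; }
        default_action = keep_lane();
        size = 64;
    }
    apply {
        forward.apply();
    }
}

// Container 3: re-emit the header in front of the remaining payload.
control ExampleDeparser(in headers_t hdr, packet_out pkt) {
    apply {
        pkt.emit(hdr.ethernet);
    }
}

// Assemble the three containers into the package.
XilinxSwitch(ExampleParser(), ExamplePipeline(), ExampleDeparser()) main;
```

In this sketch the table entries populated by the control-plane decide, per destination address, which lane is written into the tuple. A packet that matches no entry keeps the lane on which it arrived.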

3.1.1.2 P4 data-plane description

The custom P4 description of the desired data-plane and the final packaged block are discussed in detail within this section. Certain commands necessary for successful compilation are introduced here; however, a detailed discussion of compilation, configuration and packaging is provided in Appendix A. This information is necessary to support the findings reported within this thesis. At the time of this thesis work, Xilinx SDNet 2018.1 was used for compilation and the resulting output was packaged using Xilinx Vivado 2017.3. The steps required before the P4 description can be used with the P4-SDNet tool, which is still under development, are listed in the appendix.
