2008:115 CIV

MASTER'S THESIS

A new Cutting Algorithm for the Packet Classification Problem - UpperCuts

Josefine Åhl

May 2008

Luleå University of Technology
MSc Programmes in Engineering, Computer Science and Engineering
Department of Mathematics



Preface

This master's thesis is the final step for me to become a Master of Science in Applied Mathematics. The project has been done at the company Oricane AB in Luleå and at the Department of Mathematics at Luleå University of Technology. The project started in September 2007 and ended in May 2008. I wish to acknowledge the following people for their contributions to my work.

• Mikael Sundström, for supervision during this project. Thank you for always pushing me in the right direction and for all meaningful discussions. This project could not have been done without your help.

• Thomas Gunnarsson, for all the tips and ideas during this project. Thank you for always making me see things from another perspective.

• My mom and dad, I would not have been where I am today without your support. You will get your house in Greece some day, I promise!

• The rest of my family and friends and most of all, my husband, for putting up with me during this time. You are the greatest!


Abstract

Data sent or received over a public network such as the Internet are categorized into flows, where each flow is an ordered sequence of packets. A packet consists of a header and the data that should be transported. In the packet header, different information is stored in a number of fields: Field_1, Field_2, ..., Field_D.

The packet classification problem is to determine to which flow each packet belongs by inspecting the header fields of the packet and comparing them to a list of n rules that identify each flow, where each rule consists of D fields.

This report describes an algorithm called UpperCuts that solves the packet classification problem based on an idea called cutting. The algorithm requires Θ(lg(n)) lookup time and Ω(n^D) storage in the worst case.


Sammanfattning

Data sent or received over a public network such as the Internet is categorized into flows, where each flow is an ordered sequence of packets. A packet consists of a header and the data to be transported. In the packet header, different information is stored in a number of fields: Field_1, Field_2, ..., Field_D.

The packet classification problem consists of determining which flow each packet belongs to by inspecting the header fields of the packet and comparing them with a list of n rules that identify each flow, where each rule consists of D fields.

This report describes an algorithm called UpperCuts that solves the packet classification problem based on a method called cutting. The algorithm requires Θ(lg(n)) lookup time and Ω(n^D) storage in the worst case.


Contents

1 Introduction
  1.1 The Problem
  1.2 Objectives
  1.3 Demarcations

2 Background
  2.1 Existing Algorithms
  2.2 BioCAM
  2.3 Point Location Problem

3 Hierarchical Intelligent Cuttings (HiCuts)
  3.1 Geometrical view
  3.2 HiCuts Example
  3.3 Preprocessing Algorithm
  3.4 Search Algorithm

4 Multidimensional Cutting (HyperCuts)
  4.1 Geometrical view
  4.2 HyperCuts Example
  4.3 Preprocessing Algorithm
  4.4 Search Algorithm

5 New Cutting Algorithm (UpperCuts)
  5.1 Geometrical view
  5.2 Example UpperCuts
  5.3 Preprocessing Algorithm
  5.4 Search Algorithm

6 Evaluation of HiCuts and HyperCuts
  6.1 HiCuts Evaluation
  6.2 HyperCuts Evaluation
  6.3 HiCuts versus HyperCuts

7 Evaluation of UpperCuts
  7.1 Worst Case Lookup Time
  7.2 Worst Case Storage Requirements

8 Future Work
  8.1 Cost Function
  8.2 Analyzing Subregions
  8.3 Pointer Compression
  8.4 Implementation

9 UpperCuts versus HiCuts and HyperCuts
  9.1 Worst Case Bounds
  9.2 Cut Width
  9.3 Reducing Storage
  9.4 Implementation

A Prefixes
B Decision Tree
C Complexity Bounds
D Complexity Bounds for the Point Location Problem
E Test Results for the HiCuts Algorithm
F Test Results for the HyperCuts Algorithm


Chapter 1

Introduction

1.1 The Problem

Data sent or received over a public network such as the Internet travels as a series of packets. For example, every e-mail that a user sends leaves as a series of packets, and every web page that a user receives arrives as a series of packets.

A packet consists of a header together with the data that should be transported, and the packet header consists of a number of fields, where each field contains information such as where the packet comes from and where it should be sent. Figure 1.1 illustrates a packet header with corresponding data.

Figure 1.1: Illustration of a packet with header fields and data.


When the packets travel on the Internet they are sorted into different flows according to one or several fields in the headers. The header fields used to sort a packet into the right flow are referred to as the input key and they are denoted here by Field_1, Field_2, ..., Field_D.

In order to know to which flow a packet belongs, a router or a firewall is used. A router or a firewall is a piece of equipment that partitions the Internet into smaller subnetworks, and a packet visits a number of routers or firewalls when it travels through the Internet.

The main purpose of a router is to look at the header field containing the destination address for the packet and forward the packet to the next router on the way to the destination.

There can be reasons to block some packets, and this is done by using a firewall. The firewall maps different flows to different actions that describe how the packets should be treated. An action can be to deny or permit the packet.

Firewalls can work as routers and routers can have firewall qualities; therefore, the word router is used from now on to mean a router, a firewall, or a combination of both.

The router uses the input key to search for the corresponding flow that the packet belongs to. The search is done in a table called a classifier. A classifier consists of a list of rules, denoted here by Rule_1, Rule_2, ..., Rule_n. Each rule consists of D fields and represents a flow, where each flow can have an action associated with it. A packet matches a rule if the header fields in the input key match the corresponding fields in the rule.

Figure 1.2 illustrates a classifier with six rules of five fields each, where each rule has a flow associated with it. The first field in the classifier in Figure 1.2 is named the destination address (DA), the second field the source address (SA), the third field the destination port (DP), the fourth field the source port (SP) and the fifth field the protocol (PR). The first and second fields are represented here by prefixes (see Appendix A), the third and fourth fields are represented by numbers and the fifth field is represented by a protocol. The five fields can also be seen as the shaded regions in the packet header in Figure 1.1.

The packet classification problem is to determine the first matching rule for each incoming packet at a router.

Figure 1.2: Illustration of a classifier with six rules and five fields each.
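To make the matching semantics concrete, the sketch below implements the naive first-matching-rule lookup for a classifier whose fields are prefixes or ranges. It is only a minimal illustration of the problem statement, not one of the algorithms in this report; the rule encoding, field widths and flow names are invented for the example.

```python
# Minimal sketch of first-matching-rule lookup (linear search).
# A prefix like "00*" matches any value whose leading bits are 00;
# a range (lo, hi) matches any value in [lo, hi]. Encoding is illustrative.

def prefix_match(prefix: str, value: int, width: int = 4) -> bool:
    bits = format(value, f"0{width}b")
    return bits.startswith(prefix.rstrip("*"))

def range_match(interval: tuple, value: int) -> bool:
    lo, hi = interval
    return lo <= value <= hi

# Each rule: (address prefix, port range, flow name), in priority order.
rules = [
    ("1010", (2, 2), "Flow 1"),
    ("0*",   (0, 3), "Flow 8"),
    ("*",    (0, 15), "default"),
]

def classify(address: int, port: int) -> str:
    for addr_prefix, port_range, flow in rules:   # first match wins
        if prefix_match(addr_prefix, address) and range_match(port_range, port):
            return flow
    return "no match"

print(classify(address=0b0010, port=1))           # -> "Flow 8"
```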


1.2 Objectives

The main objective of this project was to construct an algorithm that solves the packet classification problem. The constructed algorithm, called UpperCuts, will be used in a data structure called BioCAM [10], developed by Mikael Sundström at Oricane AB in Luleå.

The UpperCuts algorithm is based on an idea called cutting that is used in two other algorithms called HiCuts [6] and HyperCuts [9].

The UpperCuts algorithm is analyzed, evaluated and compared to the HiCuts algorithm and the HyperCuts algorithm in order to measure its performance. The UpperCuts algorithm is protected by a patent application.

1.3 Demarcations

The UpperCuts algorithm is optimized for speed first in this project, meaning that no consideration is given to the amount of storage the algorithm requires.

The construction of the algorithm is a basic one, meaning that it might be possible to refine the algorithm to get better performance. No implementation of the algorithm is done in this project, so the algorithm can only be compared to the HiCuts algorithm and the HyperCuts algorithm analytically.


Chapter 2

Background

There are many algorithms solving the packet classification problem, and these algorithms can be broken down into four categories. This chapter starts by giving a short description of the four categories, then describes the main parts of the data structure developed by Mikael Sundström at Oricane AB in Luleå [10]. The chapter continues by describing the Point Location Problem [3], which provides a geometrical view of the packet classification problem.

2.1 Existing Algorithms

There are many algorithms today solving the packet classification problem. According to the taxonomy by David E. Taylor [12], the algorithms can be broken down into four categories.

Exhaustive Search. The two most common approaches in exhaustive search are linear search and parallel search. Linear search checks every rule in the classifier until a match is found. Parallel search divides the classifier into subsets containing one rule each, and then the subsets are searched in parallel. The parallel search can be done using Ternary Content Addressable Memory (TCAM), where one processor is assigned to each rule.

Decision Tree. The classifier is analyzed in order to make a number of cuts and then a decision tree (see Appendix B) is constructed from the cuts. An input key is constructed from the header fields of the packet and the decision tree is traversed until a leaf is found.

Decomposition. The multiple field searches are decomposed into instances of single field searches. Independent searches on each packet field are made and the results are combined in the end.

Tuple Space. The classifier is partitioned according to the number of specified bits in the rules. This approach is based on the assumption that the intervals constituting the rules are represented by prefixes. The partitions or a subset of the partitions are probed using exact match searches [12].

The UpperCuts algorithm, the HiCuts algorithm and the HyperCuts algorithm all fall into the decision tree category.

2.2 BioCAM

Mikael Sundström, Ph.D. in Computer Science and Electrical Engineering at Luleå University of Technology and founder of the company Oricane AB, has developed a package of efficient algorithms and data structures for the packet classification problem. One such data structure, called BioCAM, performs Multi Field Classification (MFC). Multi field classification is when several fields from the packet header are used as the input key in the packet classification problem, i.e. when multiple fields are matched simultaneously.

The BioCAM data structure operates in two steps referred to as crunching and matchmaking.

The crunching step compresses the original rules R_1, R_2, ..., R_n in the classifier to a list of crunched rules R'_1, R'_2, ..., R'_n. The result of this is that the universe for each field in the classifier is compressed to min(2n + 1, 2^{w_i}) elements, where w_i is the number of bits of the ith field in the classifier and n is the number of rules in the classifier. By using this technique, the total number of bits involved in the classification is reduced from Σ_i w_i to Σ_i min(w_i, ⌈log_2(2n + 1)⌉) bits. The input key is also crunched in the crunching step [10].
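The compression effect can be pictured with a small sketch. The following is only an illustration of the general endpoint-mapping idea, not the actual BioCAM construction from [10]; all names are invented.

```python
import bisect

# Illustrative sketch of per-field crunching: the n rule intervals on one
# axis induce at most 2n + 1 elementary intervals, so any field value can
# be replaced by the (much smaller) index of its elementary interval.

def build_cruncher(rule_intervals):
    # Every point where an elementary interval can start.
    starts = set()
    for lo, hi in rule_intervals:
        starts.add(lo)        # the rule's interval starts here
        starts.add(hi + 1)    # the region just after it starts here
    return sorted(starts)

def crunch(boundaries, value):
    # Index of the elementary interval containing value (0 = before all).
    return bisect.bisect_right(boundaries, value)

port_rules = [(2, 2), (5, 5), (0, 3)]   # one field of a 3-rule classifier
b = build_cruncher(port_rules)          # -> [0, 2, 3, 4, 5, 6]
print(crunch(b, 1), crunch(b, 2))       # different elementary intervals
print(crunch(b, 7) == crunch(b, 15))    # True: all rules treat 7..15 alike
```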

In the matchmaking step the crunched input key is compared to each dimension and each rule in the crunched rule list in parallel, and then the results from each comparison are combined to determine the first matching rule. The matchmaker is implemented using Extended TCAM memory, which is a special TCAM memory with support for interval matching directly in hardware.

Figure 2.1 shows an illustration of the data structure. In the first step an input key enters the cruncher, and this results in a new crunched input key. The crunched input key then enters the matchmaker, where it is compared to each dimension and each rule in the crunched rule list in parallel, and the first matching rule is output in the last step [10].

UpperCuts, HiCuts and HyperCuts all solve the matchmaking step in the data structure. The Extended TCAM memory that is used in the matchmaker is expensive and the idea of the UpperCuts algorithm is that it should replace the matchmaking step, thus removing the need for the Extended TCAM memory.

Figure 2.1: Illustration of a data structure solving the packet classification problem where multiple fields are used to create the input key.


2.3 Point Location Problem

There is a problem in computational geometry called the Point Location Problem [3]. It is defined as follows.

Given a query point q in a D-dimensional space, and a set of n D-dimensional non-overlapping regions, find the region that the point q belongs to.

For example, consider Figure 2.2. It has a query point q and six 2-dimensional regions. The point location problem here is to find the region that contains the point q.


Figure 2.2: Example of the point location problem; find the region that contains the point q.

Another example of a point location problem is to find the province on a map that a village belongs to. The village Abisko, for example, is located at latitude 68°20′ north and longitude 18°51′ east [1] and it can be found on a map of Sweden by using the scales on the sides of the map. It turns out that Abisko belongs to the province of Lapland. In other words, given a map and a query point q specified by its coordinates, the region of the map containing the query point q can be found.

The general packet classification problem can be viewed as a point location problem in multidimensional space [14]. This makes it possible to find complexity bounds (see Appendix C) for the lookup time and the storage requirements. The complexity bounds for the point location problem are either Ω(lg^{D−1}(n)) lookup time with O(n) storage, or O(lg(n)) lookup time with Ω(n^D) storage [3] (see Appendix D for a derivation of these complexity bounds).


Chapter 3

Hierarchical Intelligent Cuttings (HiCuts)

Pankaj Gupta and Nick McKeown solve the packet classification problem with an algorithm called Hierarchical Intelligent Cuttings (HiCuts) [6]. This chapter describes the HiCuts algorithm in more detail.

3.1 Geometrical view

In the HiCuts algorithm the packet classification problem is viewed geometrically, meaning that each rule in the classifier is viewed as a D-dimensional rectangle in D-dimensional space, where D is the number of fields in the classifier. The D-dimensional space is partitioned by a number of cuts and an input key created from a packet header becomes a point in the space. The packet classification problem then reduces to finding the D-dimensional rectangle that contains the point. A decision tree is built from the subregions created by the cuts, and searching in the decision tree is done by using the header fields of the incoming packet as the input key and traversing the decision tree until a leaf is found.

3.2 HiCuts Example

To see how the HiCuts algorithm works, consider the classifier in Table 3.1. The classifier consists of eleven rules with two fields each. The first field is called the Address-field and it contains prefixes of at most four bits. The second field is called the Port-field and it contains intervals over at most four bits.

The classifier in Table 3.1 can be viewed geometrically. Let the fields in the classifier represent dimensions in a space. In this example there will be a two-dimensional space containing eleven two-dimensional rectangles. This geometrical view can be seen in Figure 3.1 (note that the different colors of the rules are only there to make it easier to see the rules).

The HiCuts algorithm builds the decision tree by cutting the space that contains the rules. The cuts are made by hyperplanes that are parallel to the axes. HiCuts starts by choosing one of the dimensions that the cuts should be placed in and then it decides how many cuts should be placed in the chosen dimension.

Rule  Address  Port
R1    1010     2:2
R2    1100     5:5
R3    0101     8:8
R4    *        6:6
R5    111*     0:15
R6    001*     9:15
R7    00*      0:4
R8    0*       0:3
R9    0110     0:15
R10   1*       7:15
R11   0*       11:11

Table 3.1: Illustration of a classifier with eleven rules of two fields each.

Figure 3.1: Geometric representation of the classifier shown in Table 3.1.

The decision tree is built up from the subregions that the cuts generate. Figure 3.2 shows a number of cuts made in Figure 3.1, and these cuts produce the decision tree shown in Figure 3.3.

In this example the Address-dimension is chosen first and three cuts are placed in that dimension, partitioning the space into four subregions. The first subregion spans [0:3] in the Address-dimension and [0:15] in the Port-dimension. The second subregion spans [4:7] in the Address-dimension and [0:15] in the Port-dimension. The third subregion spans [8:11] in the Address-dimension and [0:15] in the Port-dimension, and the last subregion spans [12:15] in the Address-dimension and [0:15] in the Port-dimension. These four subregions are then considered one at a time and the procedure of cutting is repeated on each subregion. For example, consider the first subregion. This time the Port-dimension is chosen instead and three cuts are placed in that dimension, generating four new subregions within that subregion. In this example the cutting continues until there are two or fewer rules in each subregion.

Figure 3.2: A number of cuts placed in Figure 3.1, generating a number of subregions.

Figure 3.3: Decision tree produced by the cuts in Figure 3.2.

The HiCuts algorithm is built up of two different algorithms referred to as the Preprocessing Algorithm and the Search Algorithm.

3.3 Preprocessing Algorithm

The preprocessing algorithm builds the decision tree based on the structure of the classifier. Cuts are made at each level of the decision tree, and recursively on the child nodes of that level, until the number of rules in each node is smaller than a predetermined value called binth [6]. No cuts are made in a node with fewer than binth rules, and that node becomes a leaf of the decision tree.

With each node v in the decision tree that HiCuts generates, the following are associated.

B(v). A box that is a D-tuple of intervals; ([l_1 : r_1], [l_2 : r_2], ..., [l_D : r_D]).

C(v). A cut that is defined by a dimension and np(C), which is the number of partitions the corresponding box is cut into in the specified dimension.

R(v). A set of rules that the specified box contains.

Choosing the dimension

Choosing the dimension to cut on in a node is done in the preprocessing algorithm and it can be done in a number of different ways [6]. The first way is to try to minimize the maximum number of rules any child node will have. This results in a decrease of the depth of the decision tree. The second way is to try to pick a dimension that leads to the most uniform distribution of the rules. The third way is to try to minimize a space measure sm(C) over all dimensions, and the last way is to cut the dimension that has the largest number of distinct ranges of rules.

Choosing the number of partitions (np(C))

Choosing the number of partitions that should be made by cutting in the chosen dimension at a node is also done in the preprocessing algorithm.

A node is cut into equal-sized partitions along a dimension. Increasing the number of cuts will decrease the depth of the decision tree and results in faster lookups, but it increases the storage used by the HiCuts structure. In order to balance this trade-off, a number of parameters can be varied.

For a cut C(v) defined at a node v, a space measure is defined as

    sm(C(v)) = Σ_i NumRules(child_i) + np(C(v)),

where NumRules(child_i) is the number of rules that child i of node v contains. The value of np(C(v)) is chosen as the largest value such that

    sm(C(v)) ≤ spmf(NumRules(v)),

where

    spmf(n) = spfac · n.

The parameter spfac is called a space factor parameter and it can be changed to balance the trade-off of lookup time against storage.

As many partitions as the spmf() function allows at a node v are made. This is decided by a binary search on the number of partitions until sm(C(v)) ≤ spmf(NumRules(v)). The pseudo code for the binary search is included in [6].

The binary search starts by letting the number of partitions be equal to

    max(4, √n),

where n is the number of rules in the classifier. Since the partitions have equal size, the number of rules contained in each assumed partition can be counted, and this is added to the space measure together with the number of assumed partitions. If the space measure is less than spmf(n) the binary search starts over, but this time with twice as many partitions.

For an example, consider Figure 3.1 again. Since there are n = 11 rules and max(4, √11) = 4, the HiCuts algorithm will start by letting the number of partitions in the Address-field be four, and these partitions should be equally spaced. For the space measure sm(C(v)) the following is obtained:

    sm(C(v)) = Σ_i NumRules(child_i) + np(C(v)) = 17 + 4 = 21,

since the assumed four partitions together contain seventeen rules and four is the number of partitions. This gives the relation

    sm(C) ≤ spmf(n) = spfac · n
    ⟹ 17 + 4 ≤ spmf(11) = spfac · 11
    ⟹ 21 ≤ spfac · 11,

where the value of spfac can be varied to trade off storage against time. Depending on the binth value and the value of spfac, the number of cuts made in the Address-field can vary. If for example spfac = 2 and binth = 2, the Address-field will be partitioned into four subregions, meaning that three cuts should be placed in the Address-field (with spfac = 1 the relation 21 ≤ 11 would not hold, so four partitions would not be allowed).
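The doubling search can be sketched as follows. This is a loose reconstruction under simplifying assumptions (rules given as intervals in the chosen dimension, equal-width partitions, universe [0:15]); the exact pseudo code is in [6].

```python
import math

# Sketch of choosing np(C): start at max(4, sqrt(n)) equal-width partitions
# and keep doubling while sm(C) = (rules per child, summed) + np stays
# within the budget spmf(n) = spfac * n.

def rules_in_children(intervals, universe, parts):
    width = (universe + 1) // parts
    total = 0
    for p in range(parts):
        p_lo, p_hi = p * width, (p + 1) * width - 1
        total += sum(1 for lo, hi in intervals if lo <= p_hi and hi >= p_lo)
    return total

def choose_np(intervals, universe=15, spfac=2.0):
    n = len(intervals)
    np_c = max(4, int(math.sqrt(n)))
    while rules_in_children(intervals, universe, 2 * np_c) + 2 * np_c <= spfac * n:
        np_c *= 2          # doubling still fits the budget, so keep going
    return np_c

# Address intervals of the eleven rules in Table 3.1 (prefixes as ranges).
addr = [(10, 10), (12, 12), (5, 5), (0, 15), (14, 15), (2, 3),
        (0, 3), (0, 7), (6, 6), (8, 15), (0, 7)]
print(choose_np(addr))     # -> 4, matching the worked example (sm = 21 <= 22)
```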

Refinements

A number of refinements are made in the preprocessing algorithm in order to reduce the amount of storage that the HiCuts structure takes. The first refinement is to merge nodes that have the same set of rules associated with them. For an example of this, consider Figure 3.4, which shows how two nodes with the same rule list can be merged into one node. Before the merge the two pointers called A and B point to identical subtrees. After the merge one of the subtrees is removed and the corresponding pointer points to the other subtree instead.

The second refinement is to remove redundancies in the decision tree. Rules can become redundant if the region made by a cut is covered by a higher priority rule. For an example of this, consider Figure 3.5, which shows two rules called R1 and R2. If R1 has higher priority than R2 there is no reason to store R2 in the node; it will never be chosen since R1 covers the region of R2.

Figure 3.4: Merging two nodes with the same list of rules associated with them.

Figure 3.5: Illustration of rules overlapping.

3.4 Search Algorithm

Searching in the decision tree created by the HiCuts algorithm is performed by using the header fields of the incoming packet as the input key. Each time a packet arrives, the decision tree is traversed to find a leaf. The leaf stores a small number of rules and a linear search is done on these rules to find the rule that matches the input key.

Since each node is cut in one dimension, only the field that corresponds to the current dimension of the node needs to be considered in the input key when searching.

To see an example of a search, assume a key has an Address-value of eight and a Port-value of seven. When searching in the decision tree in Figure 3.3, the root node is visited first. Since the root node is cut in the Address-field, only the Address-value in the input key needs to be considered. From the root node, the third node from the left, which has the interval [8:11] in the Address-field, will be visited, since this is the first node that the Address-value eight belongs to. Next the second leaf from the left of that node is visited, since it contains the interval [4:7] in the Port-field, and the key is compared to rules R4 and R10 in that leaf. The result of the search will be rule R10 since this rule matches the input key.
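A compact sketch of this lookup follows, with a node layout invented for illustration: each internal node stores its cut dimension and equal-width children, and each leaf stores a short rule list that is searched linearly.

```python
# Sketch of a HiCuts lookup: walk the decision tree using only the field
# of the current node's cut dimension, then linear-search the leaf.

def lookup(node, key):
    while "rules" not in node:                    # internal node
        d = node["dim"]                           # dimension cut at this node
        lo, width = node["lo"], node["width"]     # equal-width children
        child = (key[d] - lo) // width
        node = node["children"][child]
    for rule in node["rules"]:                    # small linear search
        if all(r_lo <= k <= r_hi for k, (r_lo, r_hi) in zip(key, rule["box"])):
            return rule["name"]
    return "no match"

# Tiny one-level tree over the key (Address, Port) from the example.
leaf = {"rules": [{"name": "R4",  "box": [(0, 15), (6, 6)]},
                  {"name": "R10", "box": [(8, 15), (7, 15)]}]}
root = {"dim": 1, "lo": 0, "width": 4,
        "children": [leaf, leaf, leaf, leaf]}     # all slots share the leaf here
print(lookup(root, key=(8, 7)))                   # -> "R10"
```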


Chapter 4

Multidimensional Cutting (HyperCuts)

Sumeet Singh, Florin Baboescu, George Varghese and Jia Wang solve the packet classification problem with an algorithm called HyperCuts [9]. This chapter describes the HyperCuts algorithm in more detail.

4.1 Geometrical view

The HyperCuts algorithm is based on the techniques in the HiCuts algorithm. HyperCuts takes a geometrical view of the packet classification problem, makes a number of cuts and builds a decision tree. Unlike HiCuts, in which each node in the decision tree is cut in one dimension, each node in the HyperCuts decision tree is cut in several dimensions simultaneously.

4.2 HyperCuts Example

To see an example of how HyperCuts works compared to HiCuts when building the decision tree, consider Figure 4.1. It gives a geometrical view of a classifier with four rules. HiCuts is to the left in Figure 4.1, and if a leaf node can contain at most one rule then the HiCuts decision tree will have a height of two. HiCuts first makes a cut along the X-axis, which splits the region into two subregions containing rules R1, R2 in the left subregion and rules R3, R4 in the right subregion. After that a cut is made in each subregion along the Y-axis, and the region is now split into four subregions with one rule each.

HyperCuts is to the right in Figure 4.1 and it makes two cuts simultaneously, one along the X-axis and one along the Y-axis. This directly splits the region into four subregions with one rule in each region. The height of the HyperCuts tree is one. This means that the decision tree created by the HyperCuts algorithm will have a smaller height than the decision tree created by the HiCuts algorithm.

The HyperCuts algorithm is built up of two different algorithms referred to as the Preprocessing Algorithm and the Search Algorithm.

Figure 4.1: Illustration of the differences between the HiCuts algorithm and the HyperCuts algorithm when building the decision tree. HiCuts is on the left in the figure and HyperCuts is on the right.

4.3 Preprocessing Algorithm

The preprocessing algorithm builds the decision tree based on the structure of the classifier. The cuts are made simultaneously in the chosen dimensions, and the cuts are made at each level of the decision tree, and recursively on the child nodes of that level, until the number of rules in each node is smaller than a predetermined value called bucketSize [9]. No cuts are made in a node with fewer than bucketSize rules, and that node becomes a leaf of the decision tree.

With each internal node v in the decision tree that HyperCuts builds, the following three things are associated.

B(v). A box that is a D-tuple of intervals; ([l_1 : r_1], [l_2 : r_2], ..., [l_D : r_D]).

NC. The number of partitions made by the cuts and a corresponding array of NC pointers.

R(v). A list of rules that the specified box contains.

Choosing the dimensions

Choosing the dimensions to cut on in a node is done in the preprocessing algorithm and it can be done in a number of different ways, based on a trade-off between the depth of the decision tree that will be constructed and the amount of storage that is available.

The first way is to choose the dimensions with the largest number of unique rules. The second way is to choose the dimensions for which the number of unique rules is greater than the mean number of unique rules over all the dimensions. The third way takes the ratio of the number of unique rules to the size of the region represented by that dimension [9].

Choosing the number of partitions (NC)

Choosing the number of partitions that should be made by cutting in the chosen dimensions at a node is also done in the preprocessing algorithm. A node is cut into equal-sized partitions along a dimension. Increasing the number of cuts will decrease the depth of the decision tree and results in faster lookups, but it increases the storage used by the HyperCuts structure. In order to balance this trade-off, a number of parameters can be varied.

The number of partitions made in the ith dimension is called nc(i). The maximum number of partitions made at a node is limited by a factor of the number of rules in the node. This is defined as the function

    f(n) = spfac · n,

where n is the number of rules in the node and spfac is a space factor parameter that can be changed to trade off storage against lookup time. The total number of partitions is defined by

    NC = Π_{i∈D} nc(i).

When choosing the total number of partitions, all possible combinations of nc(i) for which NC is bounded by f(n) could be tested to determine which set of nc(i) provides the best result. This is not done since it takes too much time [9]. Instead the dimensions are treated separately. For each dimension i, the optimal number of partitions nc(i) is determined, then the best combination centered around these values is decided.

For an example, consider the rule list represented in Figure 3.1 again. The total number of partitions will be limited by

    f(n) = spfac · n  ⟹  f(11) = spfac · 11.

If four partitions are made in the Address-field and four partitions are made in the Port-field, there will be a total of 16 subregions, i.e. NC = 4 · 4 = 16, and the following relation is obtained:

    NC ≤ f(n) = spfac · n  ⟹  16 ≤ f(11) = spfac · 11.

The values of spfac and bucketSize will affect the number of partitions made at the node. If for example spfac = 2 and bucketSize = 2, no more partitions should be made and the region is split into sixteen subregions.
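The budget check itself is a one-liner; a sketch with the names taken from the formulas above:

```python
# Sketch: the total number of partitions NC = product of the nc(i) must
# stay within the budget f(n) = spfac * n.

def within_budget(nc, n, spfac):
    total = 1
    for c in nc:          # NC = nc(1) * nc(2) * ... over the cut dimensions
        total *= c
    return total <= spfac * n

print(within_budget([4, 4], n=11, spfac=2))   # NC = 16 <= 22 -> True
```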

Refinements

A number of refinements are made in the preprocessing algorithm to reduce the amount of storage that the HyperCuts structure takes. The first and second refinements are the same as in the HiCuts algorithm, i.e., merging nodes that have the same set of rules associated with them and removing redundancies in the decision tree. The third refinement works by shrinking the region covered by a node to a minimum that still covers all the rules associated with the node. For an example of this, consider Figure 4.2, which shows how a region denoted by X_min, X_max and Y_min, Y_max can be shrunk to the region X'_min, X'_max and Y'_min, Y'_max.

The last refinement pushes common rules upwards. This means that if all child nodes have a set of rules that are identical, the parent node will store this set instead. For an example of this, consider Figure 4.3, which shows how rule R1 and rule R2 are pushed upwards.

It is also suggested in [9] that empty array pointers can be eliminated by using the Luleå Algorithm [8].

Figure 4.2: Shrinking the region containing rules R1 and R2 to a minimum.

Figure 4.3: Moving rules R1 and R2 to the root node.

4.4 Search Algorithm

The search algorithm in HyperCuts can be explained by an example [9]. Figure 4.4 shows a node called A in a decision tree and an incoming packet header. The packet header is used as the key and has the values X = 215, Y = 111. Node A has been cut into sixteen regions, meaning that three cuts have been made in each dimension.

A set of registers is used to send the key to the correct child node. The registers store information about the region to which the key belongs at the current stage. To know which child node the key should be sent to, an index in each dimension is determined as follows. First,

    X_index = ⌊(215 − 200)/10⌋ = 1.

The denominator in the equation above is the size of each cut in the X-dimension, i.e.,

    (239 − 200 + 1)/4 = 10.

Similarly,

    Y_index = ⌊(111 − 80)/20⌋ = 1,

where the denominator is the size of each cut in the Y-dimension, i.e.,

    (159 − 80 + 1)/4 = 20.

The outcome is that the input key is sent to child node B, if indices start at zero, and the register values are updated with the new values describing the region covering the input key at this stage. The search ends when a leaf is reached and the input key is compared to the rules in that leaf with a linear search.
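A sketch of the index computation with the numbers from this example follows. Combining X_index and Y_index into one array slot is an assumption here (row-major order), since the storage layout is not fixed by the description above.

```python
# Sketch of HyperCuts child indexing: node A covers [200:239] x [80:159]
# and is cut into four equal-width partitions per dimension.

def child_index(value, lo, hi, parts):
    width = (hi - lo + 1) // parts     # size of each cut in this dimension
    return (value - lo) // width

x = child_index(215, 200, 239, 4)      # (215 - 200) // 10 = 1
y = child_index(111, 80, 159, 4)       # (111 - 80) // 20 = 1
slot = y * 4 + x                       # assumed row-major pointer array
print(x, y, slot)                      # -> 1 1 5
```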

Figure 4.4: Example of an incoming packet at a node called A and one of the node's children, called B.


Chapter 5

New Cutting Algorithm (UpperCuts)

This project was about constructing an algorithm based on the idea of cutting for solving the packet classification problem. This chapter describes the algorithm called UpperCuts.

5.1 Geometrical view

The UpperCuts algorithm takes a geometrical view of the packet classification problem, makes a number of cuts and builds a decision tree. It uses the idea from HyperCuts of making simultaneous cuts but, unlike both HiCuts and HyperCuts, UpperCuts restricts the cuts to one cut in each dimension. Searching in the decision tree built by the UpperCuts algorithm is done by using the header fields of the incoming packet as the input key and traversing the decision tree until a leaf is found.

5.2 Example UpperCuts

The UpperCuts algorithm simultaneously cuts the space one time in each dimension and the cuts are made independently of each other. To know where to place a cut in the current dimension, the interval starting points in that dimension are counted and the cut is placed at the median of the interval starting points. This means that the cut will distribute the rules as evenly as possible among the subregions created.

To see how the UpperCuts algorithm works when building the decision tree, consider again the classifier in Table 3.1 and its geometrical representation in Figure 3.1. Figure 3.1 has eleven interval starting points on the Address-axis, as can be seen in Figure 5.1 (note that the interval starting point between address value seven and eight must be counted twice, since rule R8 ends there and rule R10 starts there), meaning that the cut should be placed at interval starting point ⌈11/2⌉ = 6, if the median is rounded up to the nearest integer. The Port-axis also has eleven interval starting points (note that the interval starting point between port value four and five must be counted twice, since rule R7 ends there and rule R2 starts there), meaning that the cut should be placed at interval starting point ⌈11/2⌉ = 6. This generates the subregions in Figure 5.2.

The cutting is then continued recursively on the rule lists in the subregions.

Figure 5.3 represents the node constructed by the cuts in Figure 5.2 and the list of rules that each subregion contains. Each node will also have a list of pointers to the subregions associated with it.

The UpperCuts algorithm is built up of two different algorithms referred to as the Preprocessing Algorithm and the Search Algorithm.

Figure 5.1: Interval starting points in the geometric view of Table 3.1.

5.3 Preprocessing Algorithm

The preprocessing algorithm builds the UpperCuts decision tree based on the structure of the classifier. The preprocessing algorithm counts the number of interval starting points and places a cut at the median of the interval starting points in each dimension. The cuts are then made recursively in each subregion generated, until there is only one rule in each subregion.

The decision tree can be built breadth first, meaning that all subregions generated by the first cuts will be built first. This means that each level of the decision tree will be completely done before the next level is built.

If the UpperCuts algorithm is applied to the rule list in Figure 3.1, the number of partitions of the root node will be 2^D = 4, since there are two dimensions. The number of partitions of a node will always be 2^D, as long as the subregions contain more than one rule.
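A sketch of the cut placement in one dimension follows. The point counting here (rule starts plus the positions just after rule ends) follows the description above only approximately, so it illustrates the mechanism rather than reproducing Figure 5.2 exactly; all names are invented.

```python
# Sketch of UpperCuts cut placement: gather the interval starting points
# in one dimension and cut at their median, so that the rules are split
# as evenly as possible between the two sides.

def median_cut(intervals, universe=15):
    points = []
    for lo, hi in intervals:
        points.append(lo)              # an interval starts here
        if hi + 1 <= universe:
            points.append(hi + 1)      # the region after the rule starts here
    points.sort()
    return points[len(points) // 2]    # median starting point

# Address intervals of the rules in Table 3.1 (prefixes as ranges).
addr = [(10, 10), (12, 12), (5, 5), (0, 15), (14, 15), (2, 3),
        (0, 3), (0, 7), (6, 6), (8, 15), (0, 7)]
print(median_cut(addr))                # cut position on the Address axis
```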


Figure 5.2: Subregions generated by cutting at the median.

Figure 5.3: Node created by making one cut in each dimension as in Figure 5.2.


5.4 Search Algorithm

Searching in the decision tree created by the UpperCuts algorithm is performed by using the crunched header fields of the incoming packet as the input key. Each time a packet arrives, the decision tree is traversed to find a leaf that contains only one rule. To see how the search algorithm works, consider Figure 5.4.

Figure 5.4: Input key with D fields, node with 2^D regions and list of 2^D pointers.

The node in the figure consists of 2^D subregions that are created by the cuts, where D is the number of dimensions (or fields). Each region in the node consists of D crunched fields, one for each dimension, and each field consists of an interval represented by its minimum and maximum values.

The input key consists of D crunched fields. When searching is done on the input key, the fields of the key are compared to the intervals in the node to see where they belong. Each subregion in the node can be represented by a D-bit value. This value is used as an index into a list of 2^D pointers that point to the corresponding subregions.
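One search step can be sketched as follows, with an invented node layout: one comparison per dimension produces one bit, and the resulting D-bit value indexes the 2^D pointer list.

```python
# Sketch of one UpperCuts search step: a D-bit index is built from one
# comparison per dimension and used to pick one of the 2^D child pointers.

def step(node, key):
    index = 0
    for d, cut in enumerate(node["cuts"]):   # one cut per dimension
        if key[d] >= cut:
            index |= 1 << d                  # bit d: which side of cut d
    return node["children"][index]

# Two dimensions, both cut at 6, so 2^2 = 4 child pointers.
node = {"cuts": (6, 6), "children": ["child00", "child01", "child10", "child11"]}
print(step(node, key=(8, 3)))                # x >= 6, y < 6 -> "child01"
```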


Chapter 6

Evaluation of HiCuts and HyperCuts

This chapter evaluates the HiCuts algorithm and the HyperCuts algorithm and it compares the two algorithms with each other.

6.1 HiCuts Evaluation

The HiCuts algorithm is tested in [6] to see how it works on real and synthesized classifiers. The worst case lookup time and amount of storage are measured. To measure the lookup time, the depth of the decision tree built is counted.

The first test is for a classifier with two dimensions. The classifier is created by randomly taking prefixes in both dimensions from publicly available routing tables and wildcards are added at random to each dimension. For a binth value of four, a classifier with 20,000 rules consumes about 1.3 MB of storage and has a decision tree depth of four in the worst case and 2.3 in the average case.

When more than two dimensions are tested, classifiers with four dimensions taken from real ISP and enterprise networks are used. With a binth value of eight and a spfac value of four, the maximum storage is about 1 MB. The worst case decision tree depth is twelve, and this is followed by a linear search on eight rules.

The HiCuts algorithm is also tested in [9] to see how it works on core router databases (CR), edge router databases (ER) and firewall databases (FW). All classifiers have five dimensions. The worst case lookup time and the amount of storage are measured and the result of this can be seen in Table E.1 to Table E.6 in Appendix E.

6.2 HyperCuts Evaluation

The HyperCuts algorithm is tested in [9] to see how it works on, among others, core router databases (CR), edge router databases (ER) and firewall databases (FW). All classifiers have five dimensions. The worst case lookup time and amount of storage are measured. To measure the lookup time, the number of memory accesses is counted. One memory access in [9] is one word, where one word is 32 bits.

The amount of storage depends on the number and size of the nodes. A node consists of a header plus an array of pointers to child nodes, one for each cut. The header size is four bytes, each pointer takes four bytes, and the number of entries in the array is equal to the number of child nodes. A bitmap in the header is used to distinguish between types of nodes. If the refinements discussed in Chapter 4.3 are considered, it can result in an increase of two to eight bytes per dimension [9].

The result of the tests can be seen in Table F.1 to Table F.6 in Appendix F.

6.3 HiCuts versus HyperCuts

The HiCuts algorithm and the HyperCuts algorithm can be compared with each other in order to see which one has the better lookup time and storage requirements.

Decision Tree Height

The main difference between the HiCuts algorithm and the HyperCuts algorithm is in the cutting process. HiCuts chooses one dimension to cut on in each node. HyperCuts can choose more than one dimension to cut on in each node, and HyperCuts makes the cuts simultaneously in the chosen dimensions, resulting in a smaller decision tree height compared to HiCuts.

For example, the decision tree created by the HyperCuts algorithm in Figure 4.1 has a height that is one less than the decision tree created by the HiCuts algorithm in the same figure. By doing multiple cuts the lookup time for the HyperCuts search algorithm can be better than the lookup time for the HiCuts search algorithm.

Lookup Time

Cutting simultaneously in a node, as HyperCuts does, requires a list of NC pointers to the subregions the cuts generate. This can result in slower lookup time at each node because the list of pointers must be searched. This problem is solved in HyperCuts by using array indexing. To see how array indexing works, consider four equally spaced cuts in one dimension: [0:3], [4:7], [8:11] and [12:15]. Each cut has a pointer and the pointers are stored in an array of size four. To find the right pointer for an input point, the input point is divided by the cut width, which in this case is four.

Let for example the input point be the number ten. The result when ten is divided by four is two when rounded down to the nearest integer. This means that the third element of the array is indexed, if array indices start at zero.

It does not matter how many cuts are made; array indexing always costs one memory access. This means that HyperCuts can reduce the height of the decision tree without increasing the search time at each node [9].

Storage Requirement

When pointer arrays are used, the storage required for the HyperCuts structure can increase. Pointers can be removed in the same way as in HiCuts, i.e. when two pointers point to identical subtrees, one of the subtrees can be removed and the corresponding pointer is made to point to the other subtree. Moving up common rules reduces the storage further in the HyperCuts structure. It is also suggested in [9] that empty array pointers can be eliminated in the HyperCuts structure by using bitmap compression as in the Luleå algorithm [8].

Test Results

If the tables in Appendix E and Appendix F are compared to each other, it can be seen that the HyperCuts algorithm in general uses less memory than the HiCuts algorithm and that the cost for lookups is lower in the HyperCuts algorithm.

For core router databases, the total amount of storage occupied by the search structure in HyperCuts is at one point more than 25 times smaller than in HiCuts, and at no point is it larger than in HiCuts. The total number of memory accesses for a lookup in HyperCuts for core router databases is at one point more than three times smaller than in HiCuts, and at no point is it larger than in HiCuts.

For edge router databases, the total amount of storage occupied by the search structure in HyperCuts is about the same as for HiCuts. In three cases HiCuts takes slightly less storage than HyperCuts. The total number of memory accesses for a lookup in HyperCuts for edge router databases is about the same as for HiCuts; HyperCuts takes slightly fewer memory accesses than HiCuts.

For firewall databases, the total amount of storage occupied by the search structure in HyperCuts is at one point more than eight times smaller than in HiCuts. At one point HiCuts takes slightly less storage than HyperCuts. The total number of memory accesses for a lookup in HyperCuts for firewall databases is at one point more than four times smaller than in HiCuts.

Conclusion

The HyperCuts algorithm generates decision trees with equal or smaller depth than the decision trees generated by the HiCuts algorithm, without increasing the amount of storage required. The HyperCuts algorithm performs better than the HiCuts algorithm on the tests for core router databases and firewall databases. For edge router databases, the performance of HyperCuts and HiCuts is about the same. The reason for this, explained in [9], is that edge router databases only specify the two fields for IP source and IP destination, and two dimensions are not enough for HyperCuts to perform better than HiCuts.

The conclusion is that HyperCuts in general performs better than HiCuts in solving the packet classification problem.


Chapter 7

Evaluation of UpperCuts

The main objective of this project was to construct an algorithm that solves the packet classification problem, analyze and evaluate the algorithm, and compare it to two other algorithms. This chapter analyzes and evaluates the UpperCuts algorithm in order to see what lookup time and amount of storage it requires in the worst case. The cost for a lookup in n rules is the number of memory accesses, denoted T(n). The amount of storage required for n rules with D dimensions is the number of memory blocks, denoted S(n, D).

7.1 Worst Case Lookup Time

The worst case lookup time is the longest lookup time for any input of n rules in D dimensions. For the UpperCuts algorithm, the height of the decision tree corresponds to the worst case lookup time. To get the worst case lookup time, the number of memory accesses is counted. A memory access here is when the CPU has to read from memory. Assume that the memory used to store the data structure consists of memory blocks and that each memory block consists of 256 bits, which is the typical size of a cache line in a modern CPU. A node in the decision tree can be represented in one memory block, assuming that the list with pointers to subregions is not stored in the same memory block. Looking up an input key in a decision tree of height t will consume t memory accesses, since one memory block has to be accessed for each level in the decision tree, or stated another way, the number of memory accesses is the height of the decision tree.

If there are n rules, there will be a maximum of I = 2n + 1 intervals on each axis, which corresponds to a maximum of (2n + 1) + 1 interval starting points. In each dimension the interval starting points are cut in half, resulting in two subregions with at most

    ((2n + 1) + 1)/2 = n + 1

interval starting points each on every axis. After cutting in each dimension there will be 2^D subregions and each subregion will have at most n + 1 interval starting points on each axis. Since 2n + 2 interval starting points correspond to n rules, n + 1 interval starting points correspond to n/2 rules. This means that after cutting is done in each dimension, there cannot be more than n/2 rules in each subregion.

Consider now the classifier in Table 7.1 and its geometrical representation in Figure 7.1.

Table 7.1: Example rule list with two rules that have two fields each.

Figure 7.1: Geometric representation of the rule list in Table 7.1 where a cut is placed at the median of the interval starting points in Field_1.

A cut is placed in the first field in Figure 7.1 and no cut is placed in the second field, since the subregions generated by the first cut contain only one rule each. In this case at least one subregion has n/2 rules, and there cannot be more than n/2 rules in each subregion.

This corresponds to a lower bound for the UpperCuts algorithm and it is denoted by the Ω-notation (see Appendix C).

The worst case lookup time T(n) can now be described by

    T(n) = 1 + T(n/2),

since each level requires one memory access. This is a recurrence equation in one variable and it can be solved by substituting n = 2^k and using backward substitution [7], i.e.,

    T(2^k) = T(2^{k−1}) + 1    (substitute T(2^{k−1}) = T(2^{k−2}) + 1)
           = T(2^{k−2}) + 2    (substitute T(2^{k−2}) = T(2^{k−3}) + 1)
           = T(2^{k−3}) + 3
           = ...
           = T(2^{k−i}) + i
           = ...
           = T(2^{k−k}) + k
           = T(1) + k.

Returning to the original variable n = 2^k gives k = lg(n) and

    T(n) = T(1) + lg(n).

Since this corresponds to a lower bound for the UpperCuts algorithm, the result is that

    T(n) ∈ Ω(lg(n)).

7.2 Worst Case Storage Requirements

In the UpperCuts algorithm a node is created by placing a cut in each dimension. A node created by the cuts can be stored in one memory block, where a memory block typically is 256 bits, assuming that the list with pointers to subregions is not stored in the same memory block.

The worst case amount of storage S(n, D) can be described by

    S(n, D) = 1 + S(n/2, D) · 2^D,

since the cutting of nodes is done recursively and after the first cuts there will be 2^D subregions. This is a recurrence equation in two variables. Note that the recurrence equation also depends on the constant D, but since D never changes, the recurrence can be written as

    S(n) = 1 + S(n/2) · 2^D.

Solving the recurrence equation can be done by substituting n = 2^k and using backward substitution, i.e.,

    S(2^k) = 1 + 2^D · S(2^{k−1})                (substitute S(2^{k−1}) = 1 + 2^D · S(2^{k−2}))
           = 1 + 2^D · (1 + 2^D · S(2^{k−2}))    (substitute S(2^{k−2}) = 1 + 2^D · S(2^{k−3}))
           = 1 + 2^D · (1 + 2^D · (1 + 2^D · S(2^{k−3})))
           = 1 + 2^D + (2^D)^2 + (2^D)^3 · S(2^{k−3})
           = ...
           = Σ_{j=0}^{i−1} (2^D)^j + (2^D)^i · S(2^{k−i})
           = ...
           = Σ_{j=0}^{k−1} (2^D)^j + (2^D)^k · S(1).

Returning to the original variable n = 2^k gives k = lg(n) and

    S(n) = Σ_{j=0}^{lg(n)−1} (2^D)^j + (2^D)^{lg(n)} · S(1) = (n^D − 1)/(2^D − 1) + n^D · S(1).

This is a lower bound for the storage requirements in the UpperCuts algorithm and it can be written as

    S(n, D) ∈ Ω(n^D).

Recall from Chapter 2.3 that if the point location problem takes Ω(n^D) storage, it runs in O(lg(n)) time. This means that the UpperCuts algorithm will take O(lg(n)) lookup time in the worst case, which is an upper bound. Since the Ω(lg(n)) worst case lower bound matches the O(lg(n)) worst case upper bound for the lookup time, the UpperCuts algorithm is asymptotically optimal (see Appendix C) in its class, i.e. where the packet classification problem is solved by making a number of cuts. This means that the UpperCuts algorithm requires Θ(lg(n)) lookup time and Ω(n^D) storage in the worst case.

Chapter 8

Future Work

It is possible to make refinements to the UpperCuts algorithm. A number of such refinements are discussed in this chapter. An implementation of the UpperCuts algorithm is also discussed in this chapter.

8.1 Cost Function

To know where to place the cut in each dimension in the UpperCuts algorithm, the interval starting points in the current dimension are counted and the cut is placed at the median of the interval starting points in that dimension.

Another possible way to decide where to place the cut is to analyze the rule list and measure its hardness with some cost function. Hardness here means how difficult a list of rules is to represent compared to another list of rules. In this way it is possible to decide whether one partition is better than another, or stated in another way, if node X is better than node Y . The length of the rule list can be one example of a cost function. It might be possible to get an even better cost function.

8.2 Analyzing Subregions

After the first node is constructed by making a cut in each dimension, the subregions generated by the cuts can be analyzed in order to reduce the number of pointers and the amount of storage required. The subregions generated can also be analyzed in order to see whether no more cuts should be made, meaning that a linear search is done on the rules in the subregion instead.

For example, subregions can contain a number of rules that are the same and if these subregions are merged together, duplicated rules are removed and one pointer is stored for these merged subregions instead. The subregions generated can always be merged together with each other as long as the relative order between the rules is not destroyed.

Another example is a subregion that has a small number of rules in it; in that case it can be better not to cut further, leaving the rules together and performing a linear search on the rules in that subregion instead.

For an example of how to analyze the subregions, consider the fictive node with subregions in Figure 8.1. The node has eight subregions after cutting one time in each dimension. At the beginning there will be eight pointers to the subregions. If the rule lists in each subregion are analyzed, it might be possible to reduce the number of pointers, thus reducing the storage.

In (1) in Figure 8.1 there are three subregions that all contain the rule A, and it might be possible to merge these subregions together into one subregion that only requires one pointer. In that case there will only be one instance of rule A, leading to a reduction of the amount of storage. After merging the rule lists together, the recursive cutting will continue on the new subregions. In (2) the recursive cutting continues directly. In (3) the first two rule lists are the same and the third rule list only contains one rule, so the subregions might be merged together. In (4) the recursive cutting continues directly.

Figure 8.1: Illustration of analyzing rule lists.

8.3 Pointer Compression

The pointer array that each node contains can be compressed by using the Luleå algorithm [8], the XTC algorithm [10] or yet another algorithm that compresses a pointer array. The main idea of the Luleå algorithm is to identify the redundant pointers and store these implicitly while storing a minimum number of pointers explicitly. This reduces the number of redundant pointers stored and thus reduces storage.
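As a rough illustration of this idea (a strong simplification of the actual Luleå algorithm [8], which works on chunked bit vectors), the sketch below stores a bitmap marking where the pointer array changes value plus the distinct pointers, and recovers any entry by counting set bits.

```python
# Sketch of bitmap-based pointer compression: runs of identical pointers
# are stored once, with a bitmap marking where each new run begins; an
# entry is recovered by counting the set bits up to its position.

def compress(pointers):
    bitmap, distinct = [], []
    for i, p in enumerate(pointers):
        new_run = i == 0 or p != pointers[i - 1]
        bitmap.append(1 if new_run else 0)
        if new_run:
            distinct.append(p)
    return bitmap, distinct

def lookup(bitmap, distinct, i):
    return distinct[sum(bitmap[: i + 1]) - 1]   # popcount of bitmap[0..i]

ptrs = ["A", "A", "A", "B", "B", "C", "C", "C"]
bm, d = compress(ptrs)      # bm = [1,0,0,1,0,1,0,0], d = ["A", "B", "C"]
print(lookup(bm, d, 4))     # -> "B", same as ptrs[4], storing 3 pointers not 8
```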

8.4 Implementation

No implementation of the UpperCuts algorithm is done in this project, but it will be done after the project is finished. When the implementation is tested and its correctness is verified, it can be used to measure the performance of the UpperCuts algorithm to see if it coincides with the derived worst case bounds. The UpperCuts algorithm can then also be compared to the HiCuts algorithm and the HyperCuts algorithm in a better way, by running tests. The implementation of the UpperCuts algorithm will be tested with a publicly available packet classification benchmark called ClassBench [11].

A programming technique called word size parallelism will be used in the implementation of the UpperCuts algorithm. The word size parallelism technique is based on splitting the register into a number of blocks in which parallel operations are performed. By using this technique it is possible to compress the bits in a register after masking out the carry bits [5].

Since the construction of the decision tree in the UpperCuts algorithm will be breadth first, some kind of temporary data structure needs to be implemented in order to know where the pointers should point. The data structure can be a queue where different tasks are placed. For example, when the first level is constructed, there might be a pointer somewhere to level two. The construction of this pointer is stored as a task in the queue.
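A sketch of such a breadth-first construction with a work queue follows. The split policy is left as a parameter; all names are invented, and this is only an illustration of the queue idea, not the planned implementation.

```python
from collections import deque

# Sketch of breadth-first tree construction with a work queue: a child's
# pointer (dict slot) is created when its parent is built, and the task
# of filling it in is placed on the queue, so each level is finished
# before the next one is started.

def build(rules, split, is_leaf):
    root = {}
    queue = deque([(root, rules)])
    while queue:
        node, rule_list = queue.popleft()
        if is_leaf(rule_list):
            node["rules"] = rule_list          # leaf: keep the small rule list
            continue
        node["children"] = []
        for child_rules in split(rule_list):   # one sublist per subregion
            child = {}
            node["children"].append(child)     # pointer exists before child is built
            queue.append((child, child_rules))
    return root

# Example: split a list in half until at most one rule remains per leaf.
tree = build(list(range(4)),
             split=lambda rs: [rs[:len(rs) // 2], rs[len(rs) // 2:]],
             is_leaf=lambda rs: len(rs) <= 1)
print(tree["children"][0])   # -> {'children': [{'rules': [0]}, {'rules': [1]}]}
```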


Chapter 9

UpperCuts versus HiCuts and HyperCuts

The main objective of this project was to construct an algorithm that solves the packet classification problem, analyze and evaluate the algorithm, and compare the algorithm to two other algorithms. This chapter compares the constructed algorithm, called UpperCuts, with the two algorithms called HiCuts and HyperCuts.

9.1 Worst Case Bounds

The main difference between the HiCuts algorithm, the HyperCuts algorithm and the UpperCuts algorithm is in the cutting process. HiCuts chooses one dimension and makes a number of cuts in that dimension such that the total storage requirement for representing this does not exceed the predetermined amount of storage. HyperCuts chooses a number of dimensions and makes a number of cuts simultaneously in these dimensions such that the total storage requirement for representing this does not exceed the predetermined amount of storage. The different parameters used in the HiCuts algorithm and the HyperCuts algorithm can be varied to trade off storage against lookup time, and it is not described exactly how to tune these parameters. This means that no worst case complexity bounds for the lookup time or storage can be derived for the HiCuts algorithm and the HyperCuts algorithm. The performance of the two algorithms can only be measured by first implementing them and then testing them on some kind of classifiers.

The UpperCuts algorithm on the other hand always cuts in all dimensions such that all cuts can be represented in one memory block, which means that worst case complexity bounds for the lookup time and the storage can be derived. This is a significant advantage, since the performance of the UpperCuts algorithm can be guaranteed beforehand. The UpperCuts algorithm requires Θ(lg(n)) lookup time and Ω(n^D) storage in the worst case.


9.2 Cut Width

Since the HiCuts and HyperCuts algorithms always start by partitioning the current dimension into max(4, √n) subregions, there will be at least four partitions in each dimension that is cut. If this is applied to the classifier in Figure 7.1 and the value of binth is one, there will be four partitions in Field_1, where two partitions contain one rule each and two partitions are empty. This will not be the case in the UpperCuts algorithm, since UpperCuts always makes just one cut in each dimension.

The cut made in a dimension in the UpperCuts algorithm does not need to partition the dimension into equally sized subregions, meaning that the rules will be distributed more evenly than in the HiCuts algorithm and the HyperCuts algorithm.

9.3 Reducing Storage

In the HiCuts algorithm some refinements are suggested to reduce the storage occupied by the structure (see Section 3). The first refinement is to merge nodes that have the same set of rules associated with them. The second refinement is to remove redundancies in the decision tree. In the HyperCuts algorithm two more refinements are suggested to reduce the storage occupied by the structure (see Section 4). The first one shrinks the region containing the rules to a minimum and the second one pushes common rules upwards.

The refinements to the UpperCuts algorithm discussed in Section 8 suggest that the pointer list at each level in the decision tree can be compressed by using the Luleå algorithm, the XTC algorithm, or any other algorithm that compresses a pointer array. This reduces the number of redundant pointers stored and thus reduces storage. The storage can be reduced further in the UpperCuts algorithm by analyzing the subregions created by the cuts. It is also possible to use a cost function in the UpperCuts algorithm to decide where to place a cut, which in turn can result in a better partition of the universe.

9.4 Implementation

When implementing the UpperCuts algorithm, the code for lookup will be more straightforward than in the HiCuts algorithm and the HyperCuts algorithm. The lookup code will use fewer conditional branches, which in turn is better for pipelining. Pipelining is a technique used to increase the number of instructions that can be executed per unit of time, here resulting in faster lookups.
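As an illustration of why the lookup can avoid conditional branches, the following hedged C sketch (with a hypothetical two-dimensional node layout) computes the child index arithmetically from the two comparisons instead of selecting a child through nested if statements.

    #include <stdint.h>

    typedef struct Node {
        uint32_t     cut_x, cut_y;  /* cut positions in the two fields */
        struct Node *child[4];      /* one child per quadrant          */
    } Node;

    /* Each comparison yields 0 or 1; combining the results
       arithmetically selects one of the four children without a
       conditional branch, which keeps the processor pipeline full. */
    static const Node *step(const Node *n, uint32_t x, uint32_t y)
    {
        unsigned ix = (unsigned)(x >= n->cut_x);
        unsigned iy = (unsigned)(y >= n->cut_y);
        return n->child[(ix << 1) | iy];
    }

A full lookup would repeat step until a leaf holding a rule list is reached.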


Bibliography

[1] Worldatlas. Website. http://www.worldatlas.com.

[2] Wikipedia. Website, 1996. http://en.wikipedia.org/wiki/Point_location.

[3] Mark de Berg, Marc van Kreveld, Mark Overmars, and Otfried Schwarzkopf. Computational Geometry. Springer, 2000.

[4] Bernard Chazelle and Joel Friedman. Point location among hyperplanes and unidirectional ray-shooting. Computational Geometry, 4(2):53–62, 1994.

[5] Andrej Brodnik. Searching in constant time and minimum space. Technical report, University of Waterloo, 1995.

[6] Pankaj Gupta and Nick McKeown. Packet classification using hierarchical intelligent cuttings. Hot Interconnects VII, August 1999.

[7] Anany Levitin. Introduction to The Design and Analysis of Algorithms, international edition. Pearson, Addison Wesley, 2003.

[8] Mikael Degermark, Andrej Brodnik, Svante Carlsson, and Stephen Pink. Small forwarding tables for fast routing lookups. SIGCOMM, 1997.

[9] Sumeet Singh, Florin Baboescu, George Varghese, and Jia Wang. Packet classification using multidimensional cutting. SIGCOMM, August 2003.

[10] Mikael Sundström. Time and Space Efficient Algorithms for Packet Classification and Forwarding. PhD thesis, Luleå University of Technology, 2007.

[11] David E. Taylor. ClassBench: A packet classification benchmark. Website, 2004. http://www.arl.wustl.edu/~det3/ClassBench/index.htm.

[12] David E. Taylor. Survey & taxonomy of packet classification techniques. Technical report, Washington University in Saint Louis, Department of Computer Science and Engineering, 2004.

[13] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, second edition. The MIT Press, 2001.

[14] T.V. Lakshman and D. Stiliadis. High-speed policy-based packet forwarding using efficient multi-dimensional range matching. SIGCOMM, 1998.


Appendix A

Prefixes

The rules in a classifier can be constructed as prefixes. For example, a 32-bit address prefix can be written as 00∗, where ∗ is called a wildcard and denotes that any bit after 00 can be either a 1 or a 0. This means that the prefix 00∗ can be viewed as a range of addresses from 000...00 to 001...11 on the number line from 0 to 2^32, and a given address will match this prefix if the first two bits of the address are 00.
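A small C sketch of the matching test described above, assuming a prefix is stored as a 32-bit value together with its length in bits; the function name prefix_match is introduced only for this example.

    #include <stdint.h>
    #include <stdbool.h>

    /* An address matches a prefix of length len if their first len
       bits agree.  The zero-length prefix (all wildcard) matches
       every address and is handled separately, since shifting a
       32-bit value by 32 is undefined in C. */
    static bool prefix_match(uint32_t addr, uint32_t prefix, unsigned len)
    {
        if (len == 0)
            return true;
        return (addr >> (32 - len)) == (prefix >> (32 - len));
    }

For the prefix 00∗, prefix_match(addr, 0, 2) is true exactly when the two most significant bits of addr are 00.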


Appendix B

Decision Tree

A decision tree represents the different decisions that an algorithm needs to make. As an example, an algorithm for finding the minimum of three numbers can be represented by the decision tree in Figure B.1 [7]. Each internal node in the decision tree represents a comparison and each leaf in the decision tree represents a possible result. The number of comparisons made in the worst case is equal to the height of the decision tree.

Figure B.1: Decision tree for finding the minimum of three numbers.

For a binary tree with l leaves and height h, h ≥ ⌈lg(l)⌉.

The inequality above holds since the largest possible number of leaves of a tree of height h is 2^h, i.e., 2^h ≥ l ⇐⇒ lg(2^h) ≥ lg(l) ⇐⇒ h ≥ lg(l) [7]. For example, finding the minimum of three numbers has l = 3 possible results, so any decision tree for it must have height at least ⌈lg(3)⌉ = 2, i.e., at least two comparisons are needed in the worst case.
