
Design and Implementation of a Network Search Node

THANAKORN SUEVERACHAI

Master’s Degree Project Stockholm, Sweden October 2013

XR-EE-LCN 2013:015


Design and Implementation of a Network Search Node

THANAKORN SUEVERACHAI

Stockholm October 2013

Supervisor: Abu Hamed Mohammad Misbah Uddin
Examiner: Prof. Rolf Stadler

Laboratory of Communication Networks

School of Electrical Engineering

KTH Royal Institute of Technology

XR-EE-LCN 2013:015



Abstract

Networked systems, such as cloud infrastructures, are growing in size and complexity.

They hold and generate a vast amount of configuration and operational data, which is maintained in various locations and formats, and changes at various time scales.

A wide range of protocols and technologies is used to access this data for network management tasks. A concept called ‘network search’ is introduced to make all this data available in real-time through a search platform with a uniform interface, which enables location-independent access through search queries.

Network search requires a network of search nodes, where the nodes have identical capabilities and work cooperatively to process search queries in a peer-to-peer fashion.

A search node should exhibit good performance in terms of low query response times, high throughput, and low overhead costs, and it should scale to large networked systems with at least one hundred thousand nodes.

This thesis contributes in several aspects towards the design and implementation of a network search node. We designed a search node that includes three major components, namely, a real-time data sensing component, a real-time database, and a distributed query-processing component. The design takes indexing of search terms and concurrency of query processing into consideration, which accounts for fast response times and high throughput of search queries. We implemented a network search node as a software package that runs on a server that provides a cloud service, and we evaluated its performance on a cloud testbed of nine servers. The performance measurements suggest that a network search system based on our design can process queries at low latencies under a high query load, while maintaining a low overhead of computational resources.


Acknowledgements

First and foremost, I would like to thank my supervisor, Misbah Uddin, for his valuable guidance and advice. He greatly inspired and motivated me to contribute to the project, and I thank him for his tremendous support in all aspects of the thesis. Additionally, I would like to take this opportunity to thank Prof. Rolf Stadler for offering this thesis and for his support rendered over the period of the project. I would also like to thank Rerngvit Yanggratoke for sharing his knowledge and for his general help.

Last but not least, I would like to thank my family and friends, who are my sources of energy in pursuing this thesis.

Thanakorn Sueverachai October, 2013


Table of Contents

Table of Contents
List of Figures
1 Introduction
2 Background
2.1 Cloud Computing
2.2 Management and Monitoring of Clouds
2.3 NoSQL Databases
2.4 Echo: A Distributed Protocol for Network Management
2.5 Network Search
2.6 A Previous Network Search Prototype
3 Related Research
3.1 Weaver Query System
3.2 Sophia
3.3 Distributed Image Search in Camera Sensor Networks
3.4 Minerva∞
4 Design of a Search Node
4.1 An Architecture of a Search Node
4.2 A Design for Efficient Local Query Processing
5 Implementation of a Search Node
5.1 Implementation of the Sensing Module
5.1.1 Overview of the Sensing Module
5.2 The Local Databases Component
5.2.1 The Object Database
5.2.2 The Index Database
5.2.3 The Index Manager
5.3 Implementation of the Module for Distributed Query Processing
5.3.1 Software Components
5.3.2 Software Component Interactions
5.3.3 Concurrent Query Processing
5.4 Code Readability
6 Performance Evaluation of the Network Search Prototype
6.1 Testbed
6.2 Load
6.3 Setup
6.4 Metrics
6.5 Experiment Configuration
6.6 Results
6.6.1 Test 1: Global Query Latency
6.6.2 Test 2: Local Computational Overhead
6.6.3 Test 3: Effect on Concurrency
6.6.4 Test 4: Local Latencies
6.6.5 Test 5: Impact of Cluster Load on Global Latency
6.7 Estimating the Global Query Latency for a Large Datacenter
6.8 Discussion
7 Limitations of the Current Design
7.1 Future Works
8 Conclusions
8.1 Personal Experiences
Appendices
A A Complete Class Diagram of a Network Search Node
Bibliography


List of Figures

2.1 Sample object representations: (a) a document store representation, (b) a key-value store representation, and (c) a column store representation
2.2 The echo protocol executing on a network graph [51]
2.3 A sample spanning tree created by the echo protocol [54]
2.4 Aggregator for processing a query q on a node with local database D [48]
2.5 The architecture for network search [53]
2.6 Sample network search objects: (a) an object that represents a virtual machine, (b) an object that represents an IP flow
4.1 An architecture of a search node [48]
4.2 An architecture for concurrent query processing in a search node
5.1 A sample MongoDB object representing a server in JSON
5.2 A sample MongoDB query
5.3 A sample search index
5.4 A class diagram of the echo protocol component
5.5 Sample Python objects used in the function local() when the network search query (a) is invoked: (b) a sample query object and (c) index entries for the terms server and cloud-1 that belong to object id 5215d68ce2b
5.6 A class diagram showing the components and their relations in the query processing module
6.1 A topology of search nodes in the testbed
6.2 Global latencies for different query loads; box plots with markers at the 25th, 50th, 75th, and 95th percentiles; each search node runs two query processing threads
6.3 Computational overhead of a search node for different query loads; each search node runs two query processing threads
6.4 The 50th percentile of global query latencies for different query loads, for 1-4 concurrent query processing threads
6.5 Computational overhead of a search node for different query loads and for 1-4 concurrent query processing threads
6.6 Average time spent on each phase of an operation in a search node with respect to query loads; each search node runs two query processing threads
6.7 Percentage of time spent on each operation: (a) at a load of 100 queries/second and (b) at a load of 200 queries/second; each search node runs two query processing threads
6.8 The 50th percentile global latencies for different query loads when the cloud is underutilized and highly utilized


Chapter 1

Introduction

1.1 Background

Over the last decade, server clusters have grown tremendously, favoring many low-cost commodity machines over a few top-of-the-line machines. Recently, the paradigm of cloud computing has gained significant popularity. Cloud computing is a concept that allows computing resources to be virtualized via virtual machines that run on server clusters in a data center.

Networked systems, such as cloud infrastructures, often face challenges in management, since the systems are typically large, e.g., ranging from tens of thousands to hundreds of thousands of devices. They keep and produce a vast amount of configuration and operational data in configuration files, device counters, data caches, and data collection points. This data is often segmented, in the sense that it is kept in various locations and formats, and it changes at various time scales. To access this data for network management, a wide range of protocols needs to be known, and the location of the data must be provided. The challenge becomes critical when real-time access to the data is considered.

To address this problem, a concept called network search was introduced by Uddin et al. [53] [54] [48]. Network search provides a generalized search process that makes data in networks and networked systems available in real time to applications for network management. Network search can be seen as a function analogous to web search, applied to the operational data in a network. This data can be accessed by characterizing its content through simple terms, whereby neither the location nor the schema of the data needs to be known. Network search thus provides uniform, content-based, and location-independent access. It comprises real-time databases inside the network that maintain network data, and functions that realize the in-network search functionality.

1.2 Requirements

In this thesis, we focus on the design and implementation of a network search node. The (network) search node is the key component of the network search architecture [53]. A set of search nodes deployed in the network infrastructure works cooperatively to provide the functionality of network search. Each search node has identical capabilities, which include sensing data from devices, maintaining the data in a local real-time database, and performing data retrieval, matching, ranking, and aggregation to realize a distributed search.

In particular, a network search node has the following requirements:

1. The design of a search node must include a real-time sensing functionality, i.e., a means to read configuration and operational data in real time from the associated network devices, which are subject to search. Since operational data often changes fast, the sensing must support fast access to the data.

2. The design must include a database functionality that stores and maintains configuration and operational network data in a real-time database, using the information model for network search [54].

3. The design must include a querying functionality, whereby a query is a statement that characterizes information needed for network management tasks. A network search query is based on the query language provided in [54], and the query must be processed according to the matching and ranking semantics for network search.

4. The query processing must exhibit fast response times, high throughput for search queries, and low overhead costs for computational resources.

5. The design must be scalable, in the sense that a deployment of network search should exhibit the above properties for a system of at least 100,000 search nodes.

6. The above metrics cannot be jointly optimized. Therefore, their trade-offs need to be studied.

1.3 Approaches

Our approach to satisfy the above requirements makes use of the following methods and techniques:

1. To realize the real-time data sensing functionality, we place search nodes inside the network devices. The sensors in a search node make use of various proprietary protocols and command-line interfaces and read data local to the associated device.

2. To implement the real-time database functionality, we look into lightweight and flexible database management systems. In particular, we focus on so-called ‘NoSQL’ database systems that can be used off-the-shelf to realize this functionality.

3. To implement the query language, we look into query languages for keyword search and relational algebra. Our query processing function makes use of the matching and ranking semantics developed for network search, which are based on the extended Boolean model for information retrieval [49].

4. To achieve fast response times for search queries, we apply search indexes to the data in the real-time database. For high throughput, we apply a parallel processing paradigm to support concurrent query processing on multiple CPU cores. To achieve low overhead of computational resources, we look into lightweight databases.


5. To achieve scalability, we make use of wave algorithms, such as the echo protocol, which allow for distributed processing of search queries. Uddin et al. [54] [48] have developed an aggregator for the echo protocol to process network search queries, and we make use of this echo aggregator.

6. To study the trade-off of the above metrics, we evaluate a prototype implementation.
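Method 4 above relies on search indexes for fast matching. As a minimal illustrative sketch (not the thesis implementation; all names and objects below are made up), an inverted index maps every search term, attribute names as well as values, to the set of object ids containing it, so that matching a term becomes a dictionary lookup instead of a scan over all objects:

```python
# Minimal inverted-index sketch: each term (attribute name or value) maps
# to the set of object ids that contain it. Illustrative only; the index
# layout of the actual prototype may differ.

def build_index(objects):
    """objects: dict of object_id -> dict of attribute-value pairs."""
    index = {}
    for oid, attrs in objects.items():
        for attr, value in attrs.items():
            for term in (str(attr), str(value)):
                index.setdefault(term, set()).add(oid)
    return index

def lookup(index, *terms):
    """Return ids of objects matching all given terms (conjunctive match)."""
    sets = [index.get(t, set()) for t in terms]
    return set.intersection(*sets) if sets else set()

objects = {
    "obj1": {"type": "server", "name": "cloud-1"},
    "obj2": {"type": "vm", "host": "cloud-1"},
}
index = build_index(objects)
print(sorted(lookup(index, "cloud-1")))            # ['obj1', 'obj2']
print(sorted(lookup(index, "server", "cloud-1")))  # ['obj1']
```

The point of the sketch is only the cost model: with the index, matching touches one dictionary entry per query term rather than every stored object.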

1.4 Contributions of the Thesis

This thesis contributes in several aspects to the design and implementation of a network search node with respect to the requirements given in Section 1.2. We state the contributions as follows:

• We designed a network search node that includes three major components, namely, a real-time sensing component, a real-time database component, and an echo-based query processing component. The database component includes an efficient indexing module that allows fast query processing in terms of matching and ranking. We also provide an architecture that enables concurrent query processing on multiple CPU cores. Our design aims for fast query response times, high throughput, and low CPU overhead.

• We implemented a functionally complete network search node. It runs as a software package on a server that supports an IaaS cloud platform.

• We evaluated our prototype implementation on a cloud testbed that consists of nine high-performance servers running an IaaS cloud. Our evaluation shows that the prototype achieves a 95th percentile query latency below 100 milliseconds for a query load of up to 100 queries/second at 1.6% CPU overhead.

• We developed a numerical model for estimating the response times of search queries in a large system that includes at least 100,000 search nodes. Based on the analysis of our experimental results from the prototype, using the model and the properties of the echo protocol, we can state that a prototype based on our design can achieve an expected query latency below 100 milliseconds for a query load of 1 query/second.

1.5 Thesis outline

The remaining chapters of this thesis are organized as follows. First, we present background on the topics relevant to the thesis in Chapter 2, followed by related research in Chapter 3. We then discuss the design of a network search node in Chapter 4, followed by its implementation in Chapter 5. After that, we evaluate the performance of a network search node in Chapter 6. Thereafter, limitations and potential future work are discussed in Chapter 7. Lastly, we conclude the thesis in Chapter 8.


Chapter 2

Background

To design and implement a network search node, it is important to understand related concepts. In this chapter, we briefly describe the areas from which we draw inspiration and relevant concepts. We begin with a high-level summary of cloud computing, followed by a brief discussion of how one monitors and manages a cloud. Then, we discuss some database concepts and technologies, which are potential enabling technologies for this project. After that, we provide a brief description of the distributed protocol that we use to implement our distributed search plane. Finally, we provide background on the concept and framework of network search.

2.1 Cloud Computing

The term cloud computing has many interpretations, but it typically refers to the use of computing resources that are delivered as services to users through networks. It broadly describes computing concepts that involve a number of computational units and communication networks; for example, a set of servers working together via cable connections provides a web service. According to Armbrust et al. [30], a computing cloud has new characteristics from a hardware perspective: the illusion of infinite computing resources available on demand, the elimination of an up-front commitment by cloud users for an initial hardware investment, and the ability to pay per use on a short-term basis.

From a user perspective, it enables the use of services on demand without any requirement to provision them, i.e., without a plan and a prerequisite deployment of software and hardware before the services can be used. Since the services are provided by a cloud provider, the user also does not require direct access to the hardware and software to make use of them. We call a collection of hardware and software that provides a service a cloud or a cloud infrastructure.

We can distinguish computing clouds based on the level of abstraction they provide for services, namely, Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). IaaS is the most basic cloud-service model. Providers of IaaS offer virtual machines that resemble physical machines, along with other resources, such as image libraries for virtual machines, storage, load balancers, virtual networks, etc. In addition, IaaS creates an abstraction between the physical hardware and the operating system and hides the complexity of the underlying hardware components. A hypervisor, i.e., a virtual machine manager, is the key component that manages the virtual machines of the cloud users. It provides a virtual operating platform and manages the execution environment of the operating systems installed by the cloud users. Typically, any application developed to operate on a cloud needs a model for computation, a model for storage, and a model for communication. To deploy their applications, cloud users install an operating-system image and their application software on the cloud infrastructure. Some well-known IaaS providers are Amazon EC2 [2], HP Cloud [8], and Rackspace [26].

2.2 Management and Monitoring of Clouds

A cloud management system is needed to operate and maintain a cloud. It manages and aggregates heterogeneous nodes as if they were one component, in order to realize fully on-demand and elastic use of resources. In this project, we focus on an IaaS cloud, where a virtual machine instance is usually provided as the basic unit of service. We explore some well-known cloud management systems that operate at the infrastructure level, as follows.

VMware ESX [27] is a computer virtualization product operating at the infrastructure level, offered by VMware, Inc., a leader in cloud and virtualization software and services. VMware ESX is a bare-metal hypervisor: it manages virtual servers directly on the host-server hardware without an operating system (the hardware needs to satisfy some specific requirements). In addition, extra services can be added and integrated into VMware ESX to support, for example, automatic load balancing, efficient migration of virtual machines, and fault tolerance. Moreover, since VMware ESX is the leader in the server-virtualization market [34], software and hardware vendors offer a wide range of tools to integrate their products or services with it. Nonetheless, it is a proprietary product, so a deployment involves a license cost.

OpenStack [20], on the other hand, is a fast-growing, open-source software suite for cloud management. It provides a wide range of functionalities for building public or private clouds. It includes a series of interrelated projects that control pools of computing, storage, and networking resources throughout a datacenter. An advantage of being open source is that it aims to support standard generic hardware. In addition, it provides the ability to integrate with legacy systems and third-party technologies. Furthermore, OpenStack includes services such as an authenticating identity service, a graphical user interface for administrators, and a disk and server image library service, to name a few. In general, OpenStack has three layers: the hypervisor, platform virtualization manager, and OpenStack layers. First, the hypervisor layer has a hypervisor that creates and runs virtual machines. Depending on resource constraints and required technical specifications, OpenStack leaves the choice of hypervisor open; well-known available hypervisors are KVM [12], XenServer [28], Hyper-V [15], etc. Second, the platform virtualization manager layer utilizes the libvirt package [13], which provides an application programming interface (API) to the hypervisor. libvirt is an open-source API that helps manage hypervisors and provides access to manage virtual machines, virtual networks, and storage. Most of the information related to a virtual machine can be gathered in this layer. Last, the OpenStack layer has software components that interact with the hypervisors via libvirt, as well as further software components to support the aforementioned services. Additional information for cloud management can be found in this layer, for instance, a virtual disk type or an image of a virtual machine.

A monitoring system is crucial for managing a cloud; it is a key component for understanding the performance of a cloud. Since a cloud computing platform consists of computing units, typically servers, and networks, it needs to be monitored to get a holistic view of the system. A number of existing tools provide monitoring of the hardware and software pieces of a cloud; Nagios [18], Zabbix [29], and Munin [17] are such tools, to name a few. In addition, the OpenStack cloud management software suite provides basic metering and monitoring tools, namely, Ceilometer and Healthnmon [15]. They help monitor CPU, disk, and network utilization data of the virtual machines in the cloud. Nonetheless, a cloud can be scaled up, say, to 100,000 computation and network units, which raises the question whether these tools, which are centralized, can monitor a large cloud efficiently.

2.3 NoSQL Databases

Traditional Relational Database Management Systems (RDBMS), also known as Structured Query Language (SQL) databases, have been used extensively. An RDBMS traditionally relies on a vertical scaling technique [46], i.e., purchasing higher performance servers as the database load increases. Additionally, it requires a database schema beforehand: a highly structured data model describing the organization and construction of the data in a formal language. Thus, an RDBMS favors predictable and structured data, and often requires expertise in designing such a data model.

Recently, however, non-relational distributed databases, known as “NoSQL” databases, have been gaining interest as an alternative model for database management, aimed at achieving horizontal scaling, higher availability, and flexibility in data models. A NoSQL database provides a mechanism to store and retrieve data that has a looser structure than data in an RDBMS. Nonetheless, some NoSQL database systems do allow a SQL-like query language to be used; thus, some authors refer to them as “Not only SQL” database systems. They are usually designed to expand and scale transparently and horizontally, taking advantage of scaling out on commodity hardware, which makes them an inexpensive solution for large datasets. In contrast to an RDBMS, NoSQL databases do not necessarily provide full ACID (atomicity, consistency, isolation, durability) [35] guarantees; however, eventual consistency is usually guaranteed. Eventual consistency means that, “given a sufficiently long period of time over which no changes are sent, all updates can be expected to propagate eventually through the system.” [56]

There are several kinds of NoSQL databases, including key-value stores, column stores, and document stores. A key-value database stores data as a value with a specified key for lookup; it allows an application to store its data in a schema-less way. Examples of such databases are Amazon DynamoDB [1] and Riak [23]. A column-store database stores data by decomposing it into pairs of an identifier (id) and one attribute value. Then, based on the attribute name, it stores each pair in a table with two columns (an id and an attribute value); thus, there is one table per attribute name [36]. Examples of such databases are C-Store [5] and Vertica [9]. A document-store database stores data in the notion of a ‘document’, in which the data is encapsulated and encoded in some standard format. Examples of such databases are Apache CouchDB [3], Couchbase [6], and MongoDB [16].

In this thesis, we focus on document-store databases, each of which contains a set of documents.


Figure 2.1: Sample object representations: (a) a document store representation, (b) a key-value store representation, and (c) a column store representation.

A document contains a set of attribute-value pairs. Documents are not required to share the same structure, and they are encoded in a standard exchange format, such as XML [33], JSON [11], or BSON [4]; hence, interoperability can be ensured. A simple example of a document is given in Figure 2.1a: the document represents a person named ‘Thanakorn’. A document store enables retrieval of a whole document, which may correspond to the complete information of some real-world object, in terms of its attribute names or values in a straightforward way. In contrast, it requires more effort to do the same with a key-value store or a column store. Figures 2.1b and 2.1c show the person of Figure 2.1a in the key-value and column-store representations. In order to retrieve all information about the person named ‘Thanakorn’ using the key-value paradigm, one needs to retrieve each key-value pair about the person and reassemble them into a whole object. Similarly, a column store requires more operations for the same task.

2.4 Echo: A Distributed Protocol for Network Management

Network search makes use of the echo protocol for the distributed processing of search queries [51]. The echo protocol is a tree-based distributed algorithm that executes on each node in the graph of the search plane (Figure 2.5). The algorithm performs a distributed synchronization function on a graph. It defines message types and states for an execution, and relies only on local information in the form of knowledge about a node's neighbors.

The execution of the echo protocol can be seen as the expansion and contraction of a wave on the network graph, as follows. The execution can be started on any search node, which is referred to as the root node for that particular execution, once a query has been received. Then, an explorer message is disseminated to the neighbors in an expansion phase, in which a spanning tree is created. A local operation is triggered after a node receives an explorer message. When the wave contracts, the results of these local operations are collected in echo messages and aggregated incrementally along the spanning tree towards the root node. The aggregated result of the global operation becomes available at the root node when the execution ends. The protocol execution on a graph is illustrated in Figure 2.2, and pseudocode for the echo protocol is presented in [51].


Figure 2.2: The echo protocol executing on a network graph [51]

Figure 2.3: A sample spanning tree created by the echo protocol [54]

Figure 2.3 shows a sample spanning tree created in the search plane by the echo protocol on nodes n1, ..., n6, with n1 as the root node. Each node has a local database D, which contains the information objects sensed from the node, and a local state qr, which stores the local and (partially) aggregated result of a query q. The explorer (EXP) message contains a query q, while the echo (ECHO) message contains a partial aggregated result qr. Some of these messages are shown in the figure.

The definitions of the local operation, the aggregation operation on the query result, and the current local state of the query execution are modeled in an object of the echo protocol called an aggregator. Figure 2.4 contains partial pseudocode of the aggregator object for processing a query q.

1: aggregator object processQuery()
2:   var: qr: dictionary;
3:   procedure local()
4:     qr := {};
5:     foreach o ∈ M(q, D) do
6:       insert(name(o), o, R(q, o)) into qr;
7:     qr := top-k(qr);
8:   procedure aggregate(child-qr: dictionary)
9:     qr := top-k(merge(qr, child-qr));

Figure 2.4: Aggregator for processing a query q on a node with local database D [48]

In line 2, the current local state qr is defined. The local operation is defined in lines 3-7, whereby objects are retrieved by the matching operation M of the query q against the local database D. Then, each object o, along with its name and the rank score resulting from the ranking operation R, is inserted into the local state qr. Finally, a top-k function sorts the results in the local state qr by their rank scores and truncates the rank-sorted results after k objects in order to limit the size of the result set. The aggregate operation (lines 8-9), on the other hand, takes the results of a child node, merges them with the local state to produce a partial aggregated result, and applies the top-k function afterward.
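Since the prototype described later is written in Python, the aggregator of Figure 2.4 might be transcribed roughly as follows. M, R, and the limit K are placeholders for the network search matching and ranking semantics; this sketch is not the actual thesis code.

```python
# Rough Python transcription of the Figure 2.4 aggregator. The matching
# function M and ranking function R stand in for the network search
# semantics; K is an illustrative result-set limit.

K = 3

def top_k(qr, k=K):
    """Sort the local state by rank score and keep the k best entries."""
    best = sorted(qr.items(), key=lambda item: item[1][1], reverse=True)[:k]
    return dict(best)

def local(q, D, M, R):
    """Local operation: match q against the local database D, rank each
    matching object, and truncate the result set to the top k."""
    qr = {}
    for name, o in M(q, D):        # M yields matching (name, object) pairs
        qr[name] = (o, R(q, o))    # keep the object together with its score
    return top_k(qr)

def aggregate(qr, child_qr):
    """Merge a child's partial result into the local state, keeping top-k.
    (This merge overwrites duplicates by name; a fuller merge would keep
    the higher-ranked copy.)"""
    merged = dict(qr)
    merged.update(child_qr)
    return top_k(merged)

# Example with trivial matching (every object matches) and a stored score:
D = {"a": {"score": 0.9}, "b": {"score": 0.5},
     "c": {"score": 0.7}, "d": {"score": 0.2}}
qr = local("q", D, M=lambda q, D: D.items(), R=lambda q, o: o["score"])
print(sorted(qr))  # ['a', 'b', 'c']
```

Because top-k is applied both locally and at every merge, the partial result carried upward in an ECHO message stays bounded at k entries regardless of system size.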

The performance characteristics of the echo protocol help determine the performance of network search. Assuming upper bounds on the communication delays between nodes and on the processing delays for local message processing, we can characterize the performance of network search as follows [51]: (1) the execution time of a query increases linearly with the height of the spanning tree, which is created in the expansion phase of the execution; thus, it is bounded by the diameter of the graph. Moreover, it also increases linearly with the degree of the node. (2) The protocol overhead is evenly distributed over the network graph, as two messages are exchanged on each link during the execution. However, the message size depends on the specific aggregate function. (3) The number of messages that each search node needs to process is bounded by the degree of the network graph.
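Property (1) admits a simple back-of-the-envelope latency estimate. The delay values below are assumptions chosen for illustration, not measurements from the thesis: with bounded per-hop and per-node delays, query latency grows with the height of the spanning tree rather than with the total number of nodes.

```python
import math

# Back-of-the-envelope estimate of echo query latency; the delay values
# passed in below are assumed for illustration only.

def estimate_latency_ms(tree_height, hop_delay_ms, node_processing_ms):
    """The wave expands down the spanning tree and contracts back up,
    paying one hop delay in each direction plus local processing at
    every level, so latency is linear in the tree height."""
    return tree_height * (2 * hop_delay_ms + node_processing_ms)

# A balanced spanning tree over 100,000 nodes with fan-out 10 has a
# height of about 5, so the estimate stays small for very large systems:
n_nodes, fanout = 100_000, 10
height = round(math.log(n_nodes, fanout))  # ~5 levels
print(estimate_latency_ms(height, hop_delay_ms=1.0, node_processing_ms=2.0))  # 20.0
```

This is the intuition behind the scalability requirement of Section 1.2: latency depends logarithmically on system size through the tree height, not linearly on the node count.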

2.5 Network Search

The network search concept was first introduced by Uddin et al. [53]. A query language for network search was then developed and presented in [54]. Later, scalable matching and ranking functions for network search were described in [48].

Today, datacenters that host cloud services are very large, e.g., on the order of tens of thousands of commodity servers, and the management of such clouds inevitably becomes complex. Managing networked systems typically requires handling several network protocols for accessing information; examples of such protocols are SNMP, CLI, and NetFlow. Moreover, finding information is a tedious task, because one needs to know the exact location of the information, which often requires deep knowledge of and experience with the clusters. Additionally, knowledge of the schema of the data is required to fetch and interpret information properly. Furthermore, because of the transient nature of the data, information becomes obsolete very fast, in many cases in a matter of seconds. To have useful information for management at hand, we need to retrieve information periodically and immediately. Although tackling all of this is trivial for a small-scale setup, it is infeasible for a large-scale system. For large clouds, the paradigm of network search is introduced to deal with the retrieval of management information, which is primarily transient and the location of which is potentially unknown.

The network search concept can be seen in three ways. First, it can be seen as a generalization of monitoring, where data is retrieved by its content in simple terms. Second, it can be seen as "googling the network" for operational information, in analogy to "googling the web". Third, it can be seen as the capability to view the network as a giant database of configuration and operational information. Network search provides a unified interface for accessing information for network management tasks. Additionally, it is a means to explore management information, which allows for finding information inside the networked system without giving a location or knowing the detailed structure of the data.

2.5.1 Architecture

The architecture of network search is adapted from the architecture of peer-to-peer management presented in [51]. As shown in Figure 2.5, it conceptually has three layers: the management plane, the search plane, and the managed system. The management plane, which sits on top, includes the processes for network supervision and management. These processes communicate with the search plane, which realizes the functionality of network search. Each node in the search plane, referred to as a search node, has an associated execution environment, processing, and storage capacity. Furthermore, search nodes have knowledge about their neighbors and communicate with them through message exchanges. We can view the search nodes and their peer interactions as a network graph, where search nodes are vertices and neighbor relationships are edges. A distributed management protocol executed on the network graph is discussed in Section 2.4. The bottom plane in Figure 2.5 represents the physical network, which is the managed system and the subject of search. Each network device is associated with a search node, which maintains configuration and operational information sensed from the network device.

2.5.2 Information Model

We present the information model for network search, often referred to as the object model, as follows. Physical and logical entities in a networked system, such as servers, virtual machines, routers, and IP flows, are considered objects in a search space. An object is expressed as a bag of attribute-value pairs and has a globally unique name and a type. The object name is expressed as a Uniform Resource Name (URN) [38], because a URN provides a unique, location-independent, and expressive identifier. Examples of objects are shown in Figure 2.6.

A relation that links objects together is identified by the attribute-value pairs that they share. This relation allows for finding objects associated with some object in consideration. Consider objects a and b in a search space O: a is directly linked to b if a and b share an attribute-value pair. Similarly, a is linked to b if there is a chain of direct links between a and b.
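These definitions can be illustrated directly. In the sketch below, objects are modeled as Python dicts of attribute-value pairs; the helper names and the sample objects are our own illustrative assumptions, loosely styled after the objects of Figure 2.6.

```python
def shared_pairs(a, b):
    """Attribute-value pairs that objects a and b have in common."""
    return set(a.items()) & set(b.items())

def directly_linked(a, b):
    """a is directly linked to b if they share at least one attribute-value pair."""
    return bool(shared_pairs(a, b))

def linked(a, b, space):
    """a is linked to b if a chain of direct links in `space` connects them."""
    frontier, seen = [a], [id(a)]
    while frontier:
        cur = frontier.pop()
        if directly_linked(cur, b):
            return True
        for o in space:
            if id(o) not in seen and directly_linked(cur, o):
                seen.append(id(o))
                frontier.append(o)
    return False

# Illustrative objects: o1 and o2 share a server, o2 and o3 share a type.
o1 = {"object type": "virtual-machine", "server": "ns:cloud-1"}
o2 = {"object type": "ip-flow", "server": "ns:cloud-1", "bandwidth": 66.35}
o3 = {"object type": "ip-flow", "bandwidth": 66.35, "server": "ns:cloud-2"}

directly_linked(o1, o3)          # False: no shared attribute-value pair
linked(o1, o3, [o1, o2, o3])     # True: chain o1 -> o2 -> o3
```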


Figure 2.5: The architecture for network search [53]

(a)
object name : ns:instance-002d
object type : virtual-machine
IP address : 10.10.11.173
server : ns:cloud-1
MAC address : fa:16:3e:31:7b:ee
CPU-cores : 1
CPU-load : 0.112
Memory : 536870912
Memory-load : 0.912

(b)
object name : ns:10.10.11.79:7730:10.10.11.125:37756
object type : ip-flow
source IP : 10.10.11.79
source port : 7730
destination IP : 10.10.11.125
destination port : 37756
server : ns:cloud-2
bytes : 1329
packet : 3
bandwidth : 66.35

Figure 2.6: Sample network search objects: (a) an object that represents a virtual machine, (b) an object that represents an IP flow

2.5.3 Query Language

The query language for network search is described in BNF notation [31] as follows:

q → t | q ∧ q | q ∨ q (2.1)

t → a | v | a op v (2.2)

op → = | < | > (2.3)

The basic idea of the query language is the following: a token t can be an attribute, a value, or an attribute-operator-value triple, as in rule (2.2). The operators op are given in rule (2.3). Then, according to rule (2.1), a query q is made up of a token or a combination of tokens with a logical AND or a logical OR operator. In addition, link, projection, and aggregation operators are provided in the query language, which we do not discuss here.
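The grammar above lends itself to a simple recursive evaluation. The sketch below encodes basic queries as nested tuples and evaluates them as exact (boolean) matches against one object; the tuple encoding, operator table, and sample object are our own illustrative assumptions, not part of the thesis's query processor.

```python
# Operators of rule (2.3).
OPS = {"=": lambda x, y: x == y, "<": lambda x, y: x < y, ">": lambda x, y: x > y}

def matches(query, obj):
    """Exact-match evaluation of a basic query against one object (a dict)."""
    kind = query[0]
    if kind == "and":                       # q -> q AND q
        return all(matches(q, obj) for q in query[1:])
    if kind == "or":                        # q -> q OR q
        return any(matches(q, obj) for q in query[1:])
    if kind == "attr":                      # t -> a : the attribute name occurs
        return query[1] in obj
    if kind == "value":                     # t -> v : the value occurs
        return query[1] in obj.values()
    if kind == "term":                      # t -> a op v
        a, op, v = query[1:]
        return a in obj and OPS[op](obj[a], v)
    raise ValueError(f"unknown query node: {kind}")

vm = {"object type": "virtual-machine", "CPU-cores": 1, "CPU-load": 0.112}
q = ("and", ("value", "virtual-machine"), ("term", "CPU-load", "<", 0.5))
matches(q, vm)      # True
```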

2.5.4 Semantics for Matching and Ranking

Network search results may not match a given query exactly; rather, they may match approximately. The semantics for matching search queries is defined as follows. A matching function M maps a query and an object onto a matching score, a real number between 0 and 1, inclusive. If M returns 0, the object is not included in the result set; otherwise, it is included. The value of M indicates the relevance of the object to the query: the higher the score, the better the match. Since this matching function includes objects with approximate relevance to the query, it is called an 'approximate match'. An 'exact match', on the other hand, is a special case of the approximate match where the matching score is a boolean value, either 0 or 1; it includes only objects that match all tokens of a given query exactly.

The matching function M for the approximate match is defined as an adaptation of the extended boolean retrieval model [49]. M uses two basic metrics, namely term frequency (tf) and inverse document frequency (idf). In network search, the tf of an attribute name, a value, or an attribute-value pair expresses its frequency within an object. The idf of an attribute name, a value, or an attribute-value pair indicates the inverse of the number of its occurrences in the object space. For a specific object o, the matching score M for a term t is the product of the tf and the idf of the term t. The matching function M for queries that are constructed out of n terms and boolean operators is defined as follows:

M(q1 ∨ … ∨ qn) = ‖M(q1), …, M(qn)‖p / n^(1/p)    (2.4)

M(q1 ∧ … ∧ qn) = 1 − ‖(1 − M(q1)), …, (1 − M(qn))‖p / n^(1/p)    (2.5)

Equations 2.4 and 2.5 use the Lp vector norm, also known as the P-norm. Choosing P = ∞ results in M performing the exact match, while choosing P in the interval [1, ∞) results in the approximate match. The smaller the value of P, the looser the approximate match, i.e., the more objects match a given query.
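The P-norm combination can be sketched as follows. This is an illustrative implementation of the extended boolean model, written under the assumption of the n^(1/p) normalization of the standard model [49], which yields boolean (exact-match) semantics as P → ∞; the per-term tf-idf scores are assumed to be computed already.

```python
def m_or(scores, p):
    """Equation 2.4: P-norm combination of per-term scores for OR queries."""
    if p == float("inf"):
        return max(scores)                  # exact match: any term suffices
    n = len(scores)
    return (sum(s ** p for s in scores) / n) ** (1 / p)

def m_and(scores, p):
    """Equation 2.5: P-norm combination of per-term scores for AND queries."""
    if p == float("inf"):
        return min(scores)                  # exact match: all terms required
    n = len(scores)
    return 1 - (sum((1 - s) ** p for s in scores) / n) ** (1 / p)

scores = [1.0, 0.0]                 # one term matches fully, the other not at all
m_or(scores, float("inf"))          # exact match: 1.0
m_and(scores, float("inf"))         # exact match: 0.0
m_and(scores, 2)                    # approximate match: a nonzero partial score
```

With P = 2 the AND query above yields 1 − √(1/2) ≈ 0.29, so the partially matching object is still included, with a reduced score; with P = ∞ it is excluded entirely.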

Similar to web search, search results are ranked in network search. The ranking reflects the degree to which a query matches the search results produced by the matching function M, whose matching scores are computed using Equations 2.4 and 2.5 during query matching. Additionally, the matching rule is extended to support matching a substring to an object name, e.g., 'brooklyn' matches 'ns:server:brooklyn'. The contribution to the matching score is higher if the term matches an object name or an object type, since objects that match a query via name or type are considered more relevant than those matching via other attributes.

The ranking also reflects the link structure and the freshness of information. The link structure considers the neighborhood of an object in the graph of objects and their relationships: objects that have a high number of links are considered more important than objects with a low number of links. The freshness of information means that the more recent the information, the more important it is. The objects in the search results are ordered in descending order by the matching scores and the ranking metrics described above.
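A possible way to combine these signals is sketched below. The weighted-sum combination, the weights, and the freshness decay are illustrative assumptions of ours, not the thesis's ranking formula; the text only states that matching score, link count, and recency all contribute.

```python
import time

def rank_key(result, now=None, w_links=0.01, w_fresh=0.1):
    """Combine matching score, link count, and freshness into one sort key.
    `result` is (matching_score, num_links, last_update_timestamp)."""
    score, n_links, updated_at = result
    now = time.time() if now is None else now
    freshness = 1.0 / (1.0 + (now - updated_at))   # more recent -> closer to 1
    return score + w_links * n_links + w_fresh * freshness

# Three candidate results at a fixed "current" time of 110.0 seconds.
results = [(0.6, 3, 100.0), (0.6, 40, 90.0), (0.9, 1, 50.0)]
ranked = sorted(results, key=lambda r: rank_key(r, now=110.0), reverse=True)
```

Note how the heavily linked object overtakes an equally matching but weakly linked one, as the link-structure rule above prescribes.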

2.6 A Previous Network Search Prototype

A previous prototype was implemented to explore the usability of network search [50]. The prototype demonstrated the functionality of network search through two applications: one is a googling client for searching a cloud, and the other is an exploratory data analysis application for virtual resources in the cloud. However, the efficiency of the prototype, particularly in terms of low latency and high throughput of search queries, was not considered.

The prototype includes an implementation of search nodes, which cooperatively provide the functionality of network search. Each search node has three logical subsystems: a sensing subsystem, a database subsystem, and a query processing subsystem. The sensing subsystem brings in information from the cloud server and creates objects, the database subsystem stores and maintains objects, and the query processing subsystem processes and distributes search queries.

The prototype is designed to perform query processing in a sequential and synchronous manner, which keeps the design simple. A drawback of this approach, however, is that all queries queued for processing must wait until the query processing subsystem completes the job at hand. As a consequence, a single query that requires a long processing time increases the waiting time, and thus the average response time, of all queued queries, so low query latency cannot be achieved at higher query rates.

With regard to distributed query processing, a complete echo protocol was not implemented in the prototype. The prototype takes advantage of only part of the echo protocol, in which a node performs local query processing and aggregates the query results from other nodes. Neither a message exchange protocol nor a distributed algorithm for query processing is implemented; thus, the performance properties of distributed query processing cannot be derived from the echo protocol. The prototype also requires a predefined and static topology of search nodes for query distribution and result aggregation. As a consequence, the network search system may not scale, because knowledge of all search nodes and their communication paths has to be known in advance.

Similar to web search, network search should support approximate matching and ranking of search results. The prototype implements a naive approach: the result set includes all objects that match at least one token of the query. These objects are retrieved from the database subsystem, after which the query processing subsystem calculates a matching score based on the attribute-value pairs of all matched objects. The score, along with other ranking metrics, is used to rank the objects. However, typically only a small set of top-ranked results is required, so an approximate matching approach that brings in all possibly relevant objects is not efficient, since it incurs an excessive processing cost.


Chapter 3

Related Research

Network search relates to topics such as network management, distributed systems, and information search and retrieval. In this chapter, we discuss and summarize some closely related projects, and compare and contrast each project to network search.

3.1 Weaver Query System

The Weaver Query System (WQS) is a platform that allows creating global views of the traffic flowing through network devices in near real time [42]. This is done by deploying Weaver Active Nodes (WANs), which are small devices attached to routers for gathering device information. Each WAN maintains device information in a local database and processes queries against that database. A query is sent from a management station to the WQS via a single interface, using a declarative query language based on the Structured Query Language (SQL). The system takes advantage of a decentralized management paradigm that utilizes a navigation pattern, known as the echo pattern (an echo protocol), for distributing queries among WANs and aggregating data.

The design of the system suggests that a management station is less loaded than in a centralized system, owing to the system's in-network aggregation of information. The system is also shown to be robust: since each WAN performs identical functions, there is no single point of failure. The completion time of a query depends on the network diameter rather than the total number of nodes; therefore, the system can be expected to work efficiently in large networks.

WQS has an architecture similar to that of network search: there is a logical plane where distributed nodes with identical functions work cooperatively to support query processing. Moreover, the distributed algorithms that run in a WAN and in a network search node are developed using the same protocol, i.e., the echo-pattern algorithm. However, while WQS relies on a dedicated device to host a WAN, a network search system allows a search node to be hosted in other ways, including inside a device with processing and storage capability on which cloud/network services are provided. As for the information model, WQS uses a fixed schema and a structured SQL-like query language, which enables only schema-aware queries matched using an exact matching paradigm. Network search, on the other hand, uses a looser information model without the need for any schema, and a keyword-based query language that can be matched using both exact and approximate matching paradigms.

3.2 Sophia

Sophia is a distributed system that collects, stores, propagates, aggregates, and reacts to observations about the network's current conditions [57]. It can be viewed as a shared information plane with three main functionalities: collecting information about the network via sensors, evaluating statements (questions) about the network via a declarative programming environment, and reacting to the results of the evaluation. All functionalities are distributed functions that run on distributed Sophia nodes. Sophia uses a declarative logic programming language as its query language, which allows embedding a subroutine program that can be evaluated at runtime. This works in a wide-area, decentralized environment that evolves over time, where the possible states of the network are not known beforehand. The query language can also express when and where to execute a statement. Moreover, it allows partial answers to a statement, enabling a Sophia system to sacrifice completeness of answers for performance.

A Sophia node has five core components that realize the functionalities of the information plane: (1) a local database that holds the terms used for evaluating query statements, (2) a statement processing engine, (3) interfaces to sensors and reactors for accessing sensory data and controlling the behavior of the network, (4) a remote statement processor for delegating tasks to a remote Sophia node, and (5) a scheduling mechanism.

Sophia introduces an information plane to tackle the problem of dispersed information through distributed nodes that work together. Each node has identical functionalities, similar to those of a network search node, namely to collect information, to store information in a local database, and to process queries in a distributed manner. Furthermore, similar to network search, the data model and the query language of Sophia have the flexibility to capture the heterogeneity of information. Additionally, a distributed algorithm makes Sophia work well in large-scale networks, where query processing can be done on a fast time scale. However, unlike network search, Sophia lacks a search-like functionality that allows for the exploration of yet-to-be-known information in a network or networked system. In conclusion, we are inspired by the functionalities of a Sophia node, which cooperatively provides the functionalities of the information plane, yet these need to be adapted to satisfy the requirements of network search.

3.3 Distributed Image Search in Camera Sensor Networks

Yan et al. proposed the design and implementation of a distributed search engine for a wireless camera sensor network, where images from different sensors can be captured, stored, and searched [58]. A sensor network typically has limited resources in terms of energy, network bandwidth, computational power, and memory capacity. As a result, it is impractical to transmit all images for a centralized search. Instead, the proposed system uses a compact image representation in text format, a so-called visual word (visual term). This allows the design of the system to apply search concepts from the Information Retrieval (IR) paradigm.

An image query is converted to visual terms, and these terms are used as input whose results are matched and ranked using a weighted similarity measure called tf-idf, analogous to that of IR. A search is done using an architecture in which a single centralized node distributes a query to all sensor nodes and receives a top-k result from the local processing of each sensor node. It then produces a final top-k result set, and may finally request the images that correspond to the result set from specific sensors.

The system utilizes an inverted index for optimizing the matching and ranking functions and a tree data structure for maintaining visual terms.

Yan's system is relevant to this project in the sense that the data subject to search is located, sensed, and maintained locally in the node itself. It is inefficient to migrate the data out of a node for processing: in Yan's case because of energy constraints, and in our case because the fast-changing nature of the data would render it obsolete by the time it is migrated. Therefore, distributed in-node processing is required. Furthermore, Yan's system realizes a search function through the use of concepts from Information Retrieval. Information Retrieval is attractive to network search, since it provides concepts for matching and ranking objects with respect to queries and allows for the exploration of undiscovered data. Nonetheless, Yan's system has a centralized architecture, which limits scalability in terms of system size and introduces a single point of failure.

3.4 Minerva ∞

Minerva ∞ is a web search engine with a peer-to-peer architecture [47]. It has algorithms for creating overlay networks that contain data of interest, placing data on network nodes, load balancing, top-k retrieval, and data replication. It is designed for a network of a large number of peer nodes, each of which has computation, communication, and storage capabilities. Each peer node has functionalities to crawl web pages, discover documents that are subject to search, and compute scores of documents. The scoring function utilizes a weighted similarity measure (tf-idf), as in the Information Retrieval context.

The system works as follows: web pages are initially loaded and distributed into the system as a batch process. The system builds a global overlay network, which connects all peer nodes, and many term-specific overlay networks, each of which connects the nodes that maintain documents related to a term. A query may have many terms, each of which is processed in the corresponding term-specific overlay network in a distributed manner. A top-k result is returned as the end result. The system allows incomplete answers for the sake of better performance. Note that the system is built on the assumption that a document is rarely updated.

Minerva ∞ operates on a peer-to-peer architecture, similar to network search, where each node works cooperatively and has identical functionalities. Distributed web search is an interesting feature that is closely related to network search, since network search can be viewed as a search engine for operational information in networks and networked systems. Additionally, Minerva ∞ focuses on distributed algorithms that make an efficient search scalable. However, differences in requirements make Minerva ∞ less attractive: the data used in web search changes rarely compared to how often it is queried (searched), while data in networked systems is fast-changing. Minerva ∞ has data migration and data placement mechanisms to balance loads among nodes, which are impractical in network search, where data changes in sub-seconds. Nonetheless, its capability to provide an incomplete-yet-meaningful answer to a search query, through the use of ranking and top-k functions, is applicable to network search. Additionally, Minerva ∞ supports parallel processing of a search query, which inspires our capitalizing on computational resources.


Chapter 4

Design of a Search Node

A network of search nodes is formed in the search plane. The nodes cooperatively provide the services of network search. Each search node has an interface to the management plane as an access point for network search. A search node senses information in network devices, maintains the information, and performs distributed query processing. We realize a search node as a software component that runs inside devices, which in our case are servers that provide cloud services. Each search node is responsible only for the information within the server where it resides. Additionally, each search node has identical functional capabilities.

4.1 An Architecture of a Search Node

We illustrate the main components of a search node and their interactions in Figure 4.1. A search node has three interfaces, defined by their end points: an interface to the management plane, an interface to peer search nodes, and an interface to the cloud server.

A distributed query processing component, placed on top, provides the functionality of distributed query processing. It interacts with the management plane and with peer search nodes via message exchanges, through the interface to the management plane and the interfaces to peer search nodes, respectively. It also interacts with the local databases via the database API. The query processing component is based on the echo protocol, which defines the message exchange protocol as well as a distributed algorithm for query processing. Knowledge about the peers of a search node is provided by a topology manager component, which the echo protocol uses for message exchanges.

The component in the middle is the local database component, which gives access to local information for a search. It contains an object database that maintains the local objects. Additionally, it contains an index database that stores indexes of the attribute names, values, and attribute-value pairs of the objects, in order to optimize the response times of a search. We discuss the search index further in Section 4.2.

A sensing component placed at the bottom has sensors that sense information associated with the underlying managed system, i.e., a server that provides cloud services, through periodic polling. Such information includes server configurations, virtual machine utilizations, etc. The sensors organize this information as objects and store the objects in the local databases. In this thesis, we do not focus on the design of the sensing functionality, which relies on the earlier network search prototype [50].

Figure 4.1: An architecture of a search node [48]

4.2 A Design for Efficient Local Query Processing

For processing search queries in a search node, referred to as local query processing, we consider two design goals: (1) the query processing function should exhibit fast response times for search queries, to enable real-time search, and (2) it should support a high query load. In this thesis, these design goals are met for basic queries, as defined by the query language presented in Section 2.5.3.

In order to achieve the first goal, we introduce search indexes of potential search terms, which can be attribute names, values, or attribute-value pairs of objects. A search index is represented as a key-value pair, where the search term is the key and the value is a tuple that contains an object id, a matching metric, and ranking metrics. The object id is a pointer to the object in the object database, while the metrics contain the information needed by the matching and ranking functions (see Section 2.5.4).

The indexes of the search terms are created and maintained upon creation and update of objects in the object database. Having indexes of search terms enables local query processing to perform query matching without retrieving entire objects, and ranking without having to compute the ranking metrics during processing. Thus, the index enables faster query processing. In addition, if only top-k results are returned, even better performance can be achieved.
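The index structure described above can be sketched as an inverted index. This is an illustrative sketch, not the thesis's implementation: the field names, the placeholder scoring callback, and the ranking-metrics dict are our own assumptions; only the shape (term key, tuple of object id, matching metric, and ranking metrics) follows the text.

```python
import heapq
from collections import defaultdict

# term -> list of (object_id, matching_metric, ranking_metrics)
index = defaultdict(list)

def index_object(obj_id, obj, score):
    """Called after an object is created or updated in the object database.
    Indexes every attribute name, value, and attribute-value pair;
    `score` is a placeholder for the tf-idf-style matching metric."""
    for attr, value in obj.items():
        for term in (attr, str(value), f"{attr}={value}"):
            index[term].append((obj_id, score(term), {"links": 0}))

def top_k(term, k):
    """Top-k entries for a single term, served straight from the index
    without touching the object database."""
    return heapq.nlargest(k, index.get(term, []), key=lambda e: e[1])

index_object("obj-1", {"object type": "virtual-machine", "CPU-cores": 1},
             score=lambda t: 0.5)
top_k("object type=virtual-machine", 3)
```

Serving top-k directly from per-term postings is what allows matching and ranking without object retrieval, as described above.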

The indexes of the search terms reduce the processing time of local query processing. As a result, faster response times of search queries can be achieved. However, there is a cost
