
Thesis no: MEE-2015-NN

Monitoring and Analysis of CPU load relationships between Host and Guests

in a Cloud Networking Infrastructure

An Empirical Study

Krishna Varaynya Chivukula

Faculty of Computing

Blekinge Institute of Technology

SE–371 79 Karlskrona, Sweden


This thesis is submitted to the Faculty of Computing at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering. The thesis is equivalent to 20 weeks of full-time studies.

Contact Information:

Author(s):

Krishna Varaynya Chivukula
E-mail: krch13@student.bth.se

University advisor:

Prof. Dr. Kurt Tutschku

Department of Telecommunication Systems

Faculty of Computing
Blekinge Institute of Technology
SE–371 79 Karlskrona, Sweden

Internet: www.bth.se
Phone: +46 455 38 50 00
Fax: +46 455 38 50 57


Abstract

Cloud computing has been a fast-growing business of the IT sector in recent years, as it enables hardware resource sharing, reduces infrastructure maintenance costs, and promises improved resource utilization and energy efficiency to service providers and customers. Cloud Service Providers, CSP, implement load management techniques for effective allocation of resources based on need, enabling them to control costs while meeting the SLAs. Understanding the impact and behavior of variable workloads in a cloud system is essential for achieving load management. CPU load is a principal metric of computational resource demand and plays an important role in resource management.

This thesis work aims to monitor and analyze load in a cloud infrastructure by applying load collection and evaluation techniques. A further aim is to investigate the CPU load relationship between host and guest machines under varying workload conditions. Along with a cloud environment built using OpenStack, we also consider a system with a KVM hypervisor to achieve these goals.

The methodology applied is empirical, that is, purely experimental examination. The study performs measurements to assess load behavior in the system, with experiments designed to fulfill the objectives of the thesis. We also employ visual correlation analysis to understand the strength of association between host and guest CPU load.

Results of the initial experimental study include a distinction between the CPU load of an OpenStack compute device and a device with a plain KVM hypervisor. Further experimental runs are based on these observations. The succeeding results show a remarkable association between PM and VM under 100% workload conditions; other workload variations, however, do not show similar results.

CPU load results obtained from the cloud and from a standalone virtualization system differ, though not drastically. 100% workload situations showed negligible distortion in the visual correlation and usually reported linearity, while lower workloads showed distortions in the correlation analysis. More iterations can likely refine these observations, and further investigation of these relationships using other applications commonly run in the cloud is a potential direction for future work.

Keywords: Cloud, CPU load, Measurements, OpenStack, Virtualization


Dedicated to my family


Acknowledgements

I am forever indebted to my supervisor, Prof. Dr. Kurt Tutschku for his valuable guidance, patience and continuous support throughout my thesis. I could not have imagined a better advisor and mentor for my master thesis.

I sincerely thank Dr. Patrik Arlos for his constant encouragement and suggestions despite his busy schedule. I extend my heartfelt thanks to Dr. Dragos Ilie for his invaluable guidance whenever approached. Words cannot express my gratitude to Anders Carlsson, my father figure, who gave me never-ending support and opportunities to excel.

I am grateful to Svenska Institutet for embracing me as a deserving candidate for the scholarship and enabling me to fulfill my dream of a Master's education. I acknowledge City Network Hosting AB for letting us perform tests in their infrastructure.

I am thankful to God, my wonderful parents and my sister for their immense support and motivation. Without you, it would not be the same. Last but not least, a huge thanks to Vida and all my friends who made this journey worthwhile.


List of Abbreviations

NIST National Institute of Standards and Technology
IaaS Infrastructure-as-a-Service
SaaS Software-as-a-Service
PaaS Platform-as-a-Service
PM Physical Machine
VM Virtual Machine
SLA Service-Level Agreement
PCPU Physical CPU
vCPU Virtual CPU
VMM Virtual Machine Monitor
KVM Kernel-based Virtual Machine
QEMU Quick Emulator
OS Operating System(s)


List of Figures

3.1 Minimal architecture and services offered by OpenStack's Controller node (left), Networking node (center) and Compute node (right)
3.2 Experimental methodology portrayed as a spiral model
3.3 Example of a graph showing scatter plots and linear correlation as a relation between x and y attributes
3.4 Anticipated graphical plots of possibly attainable correlation between host and guest in terms of CPU load
3.5 Server, virtualization and network components that are related to causing or affecting load in the system
3.6 Example output of the "uptime" command on a terminal
3.7 Example output of the "top" command on a terminal
4.1 An OpenStack environment can have "n" compute nodes based on requirement; a controller manages the compute nodes
4.2 Abstraction of hardware, software and virtualization layers in a system; nova-compute is not present in a normal virtualization system
4.3 Visual representation of the PMs, VMs and tools used in the implementation of the experiments
4.4 Depiction of the experimental setup on the OpenStack platform
4.5 Stages of experiments performed with stress
4.6 Depiction of the on-premise device setup
4.7 Depiction of the on-premise device experimental setup for a single guest
4.8 Stress applied initially on 1 vCPU
4.9 Stress configured to load 2 vCPUs
4.10 Stress configured to load 3 vCPUs
4.11 Stress configured to load 3 or more vCPUs; the dotted lines indicate that the number of vCPUs being stressed is increased in each experimental run
4.12 Stress-ng configured to impose load on 1 or more vCPUs with 10, 20, 30, 40, 50 and 100% load; the dotted lines indicate that the number of vCPUs being stressed is increased in each experimental run
5.1 CPU load observed in the OpenStack compute node and the on-premise device in the multiple-guest scenario
5.2 Scatter plots showing the relationship between CPU load average on host and guest in different vCPU stress conditions
5.3 Scatter plots showing the CPU load relationships between host and guest in varying load conditions – uptime tool
5.4 Scatter plots showing the CPU load relationships between host and guest in varying load conditions – uptime tool
5.5 Scatter plots showing the CPU load relationships between host and guest in varying load conditions – uptime tool
5.6 Scatter plots showing the CPU load relationships between host and guest in varying load conditions – uptime tool
5.7 Scatter plots showing the CPU load relationships between host and guest in varying load conditions – top tool


List of Tables

3.1 Minimal hardware required to install a controller and a compute node
4.1 Specifications of the OpenStack compute node used for experiments
4.2 Specifications of the on-premise device used for experiments
5.1 Stress-ng and uptime results on CentOS host
5.2 Stress-ng and uptime results on Ubuntu host
5.3 Stress-ng and top results on CentOS host
5.4 Stress-ng and top results on Ubuntu host


Contents

Abstract
Acknowledgements
List of Abbreviations
List of Figures
List of Tables

1 Introduction
1.1 Background
1.2 Aims and Objectives
1.3 Research Questions
1.4 Expected Contribution

2 Related Work

3 Methodology
3.1 Introduction to Underlying Technologies
3.1.1 Virtualization
3.1.2 Hypervisors
3.1.3 Cloud Computing and OpenStack
3.2 Methodology
3.2.1 Experimental Research
3.2.2 Visual Correlation Analysis
3.3 Measurement Tools

4 General Experimental Setup
4.1 Experimental Modeling
4.2 Experimental setup
4.2.1 OpenStack Cloud Test-bed
4.2.2 On-premise Test-bed
4.2.3 Stress tests on Single Guest

5 Results and Analysis
5.1 Results from OpenStack and on-premise Test-beds
5.2 Results of Stress tests
5.3 Results of Stress-ng tests
5.3.1 Uptime tool
5.3.2 Top tool
5.4 Discussions

6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work

References

Chapter 1

Introduction

This thesis document consists of 6 chapters. Chapter 1 introduces the thesis concepts, problem statements and motivation in the background section, followed by the aims and objectives, research questions and expected contribution of the thesis. Research work related to the thesis is presented in Chapter 2. Chapter 3 exhibits the main thesis concepts by briefly discussing the underlying technologies such as virtualization and cloud computing, the standard tools used in experimentation, and the methodology adopted. The experimental modeling and setup are highlighted in Chapter 4. Chapter 5 presents the results obtained from the experimental runs with a detailed analysis and discussion. Conclusions derived from the analysis, along with intended future work, are highlighted in Chapter 6.

1.1 Background

Cloud computing is a predominant phenomenon in telecommunication that allows sharing of hardware resources with scalability and flexibility by eliminating the constraints of distance. According to NIST, the cloud model is classified into three service models based on the resources provided: Software-as-a-Service, SaaS, Platform-as-a-Service, PaaS, and Infrastructure-as-a-Service, IaaS. In the SaaS model, an application or the software itself is offered to the customer as a service, while in PaaS a platform for building or developing the customer's applications is provided as a service. IaaS, on the other hand, renders pools of computational, storage or network resources and permits consumers to provision them as needed. These cloud solutions are utilized on a pay-per-use basis, thereby saving initial investment and maintenance costs for the customers. [1,2,3]


Similar to traditional systems, monitoring system performance with respect to computational resources and applications is important in cloud computing. Performance monitoring and resource management in cloud infrastructures are more complex due to the lack of standards in these service models and because customers do not have access to the underlying hardware machines. Cloud service providers, CSP, monitor the resources to ensure quality in their services as well as to bill their customers.[4,5]

The costs faced by a CSP depend on CPU utilization, while the costs of the user are based on the lease time of the resources. Higher CPU utilization requires more electricity and cooling, which amount to around 40% of datacenter costs. Although cloud services promise improved resource utilization, it is complicated to determine the adequate amount of resources to satisfy a variable workload. Load management techniques address this issue of managing computational resources according to varying workloads, thus helping to avoid excess headroom or hotspots and to minimize costs. Our study focuses on CPU load as the metric for monitoring and analysis in cloud networking infrastructures.[6,7,8,9]

CPU load is the demand for computational resources; in other words, it is the number of processes running or waiting for the resources. CPU load is determined by adding the CPU utilization and the saturation. Utilization is the time a processor is busy, indicated as a percentage. Saturation is the number of processes that are waiting for the CPU while the CPU is 100% utilized. For instance, a single CPU that is 100% busy while two further processes wait to run corresponds to a load of 3.[10,11]

Cloud computing is built on virtualization technologies that provide a structure where multiple instances can run on one PM. Since the customers cannot access the hardware, the responsibility of monitoring the resources lies with the service provider in order to meet the SLAs. This is complex, however, since neither the service provider nor the user can readily verify whether resources are overused or underused, which constitutes a violation of SLAs [5,12]. In such a case, monitoring Virtual Machine, VM, data is a challenge, since the CSP and its customer have different perspectives on system performance. In this thesis, we aim to investigate CPU load as viewed from both the CSP and the customer perspective, to identify the relationship between CPU load on the host and that on the guests. This relationship can help the CSP bill their customers based on time as well as resource usage, and can also be applied to initiate load prediction for load management techniques.

The thesis mainly focuses on modeling a framework for CPU load generation, collection and evaluation, and on analyzing the guest and host CPU load relationships. The experiments are conducted on an OpenStack compute node and an on-premise virtualization device using a set of standard Linux tools.

This research is carried out in collaboration with a second master thesis, “An investigation of CPU utilization relationship between host and guests in cloud infrastructure”. While this thesis solely focuses on methodologies for obtaining, monitoring and analyzing CPU load relationships, the second thesis focuses on methodologies for obtaining CPU utilization. The experimental scenarios of the two theses coincide, yet the measurement tools used and the contributions made from the observation and analysis of the results differ. [13]

1.2 Aims and Objectives

The aim of the thesis is to establish a framework for CPU load characterization in a federated cloud environment. The experiments outline the behavior of load as reported by a set of standard Linux tools that assist in obtaining the CPU metrics. This is achieved by imposing known stress on the vCPUs and extracting the load values from the Physical Machine, PM, and the VMs, thereby identifying the relation between the PCPU and vCPU metrics through a visual correlation analysis. Study of the internal working mechanisms of these tools is beyond the scope of this thesis.

Cloud operators need to scrutinize the available resources regularly in order to ensure proper load management and meet the SLAs set with cloud customers.

Main objectives of the thesis include:

• Study of commercial off-the-shelf performance evaluation tools

• Study of available tools or applications for generating workload

• Modeling of an experimental platform in OpenStack

• Modeling of a standalone virtualization test platform

• Implementation and experimental runs on both platforms

• Use of standard tools for CPU load collection

• Iterations of the experiments to ensure robust results

• Analysis of the results

• Observation of correlation values

• Visual correlation analysis of the obtained host and guest CPU load


1.3 Research Questions

As discussed in section 1.2, our goal is to design a model for load collection and evaluation and to investigate host and guest CPU load relationships for better load management. The following research questions are formulated:

1. How can physical and virtual CPU load collection from a cloud infrastructure be modeled?

2. How do the physical and virtual systems react when the load is changed?

3. How can the relationship between host CPU load and guest CPU load be identified?

4. In what way is the host-guest relationship useful in load management?

1.4 Expected Contribution

The expected outcome of this thesis comprises:

• Design, modeling and implementation of cloud as well as virtualization platforms for the experiments.

• Data collection and observation of the measurements over iterations to ensure robust results.

• A method for analyzing load relationships between PM and VM in cloud and virtualization environments.

• A detailed mathematical and visual correlation analysis.

• Identification of the association between physical and virtual CPU load for better load management.


Chapter 2

Related Work

Relevant research work associated with this thesis in multiple aspects is introduced in this section. A comparative study is conducted to identify the research gaps, with proposals of other applicable methods and tools.

As mentioned in previous sections, monitoring resource usage for proper load management in the cloud is an important and ongoing topic of research. In [14], Mour et al. presented a novel approach towards load management in the cloud by migrating VMs from a burdened hardware machine to other, less loaded machines to achieve substantial improvement in performance. In their model, they considered a cloud environment comprising a diversified range of physical as well as virtual machines. The model comprises several modules handling the individual tasks of load collection, load evaluation, load prediction, load balancing and VM migration. In order to manage the load, it is necessary to collect and evaluate the existing load and make a proper prediction based on the current load as well as the load expected in the near future. Our thesis concentrates on load collection and evaluation in an OpenStack cloud and a virtualization system, aiming to find a relationship between load on the VM and the PM that can be of use in load management models.

Another paper, [15], characterized a dynamic resource management system for cloud computing with a focus on mapping VM CPUs to physical CPUs, because the PM capacity should be adequate to fulfill the resource needs of all VMs running on it. The authors tried to estimate future resource usage without looking inside the VMs, using trends in resource usage patterns. The cloud environment they used is based on the Xen hypervisor.

A few research works include empirical studies on VM performance in the cloud, i.e., deducing from practical observations rather than relying on theory [16].


One such work can be found in [17], where the authors characterized and analyzed server utilization with the diverse workloads encountered in a real cloud system. Their analysis states that workload variability across different time spans in a cloud environment is relatively high, and they intended to carry out a more fine-grained analysis of the correlation and effects of workload types on the server. In [18], an empirical study on the OpenStack cloud's scheduling functionality is portrayed. The paper aimed at evaluating the behavior of the OpenStack scheduler with respect to memory and CPU cores of PMs and VMs and their interactions. Their findings include that the number of CPU cores requested by instances is an important factor at the time of resource allocation to VMs in all types of OpenStack schedulers. Acquiring the concepts and research gaps from these papers, we set out to perform an empirical study applying a black-box approach similar to theirs, in OpenStack cloud networking infrastructures, to identify load relationships between host and guest.

The authors of [19] designed a system to measure CPU overhead in a virtual computing system, obtaining VM and physical CPU utilization to map a relationship between them. However, this does not deal with the impact on the PCPU when a VM utilizes more than the allocated resources, i.e., vCPUs, which is addressed in our experiments.

Corradi et al., in their work “VM consolidation: A real case based on OpenStack Cloud”, conclude that power consumption can be substantially reduced with VM consolidation, and they intend to investigate further in the direction of workload effects with either CPU or network, to understand VM consolidation and the role of SLAs in the service decision process. They also wished to deploy a larger OpenStack cloud for further testing [20]. This relates to our thesis work from the perspective of VM consolidation in real cloud infrastructures, which is experimented with and tested in our thesis on a real OpenStack cloud while multiple guests share resources.

In [21], the CPU utilization of a web service application running on a cloud and on a normal virtualization system is compared, which can help the user specify server capacity based on computational needs. The authors performed experiments in cloud and virtualization environments, and the results show that web service CPU utilization in the cloud is higher than that of the on-site device. Their experimental scenarios are similar to ours, but our goal is to identify host and guest load relationships under variable workloads rather than running a single application.

These excerpts from the related work show that load management and the workload impact on PCPU and vCPUs are indeed current and important topics of research. After careful review and comparison of these works and their future targets, which stand as motivation for our thesis in multiple ways, we proceed to perform our experiments aiming at comparing and identifying the PCPU relation with respect to varying workload on the vCPUs, which could be advantageous in load prediction and management techniques.


Chapter 3

Methodology

This chapter provides a brief description of the underlying technologies along with the research methodology and measurement tools used in this thesis.

3.1 Introduction to Underlying Technologies

A concise description of the fundamental technologies required to grasp the main idea of the thesis is provided in this section. The fundamentals include virtualization, hypervisors, cloud computing principles and standard measurement tools, which are detailed in the forthcoming sections.

3.1.1 Virtualization

Virtualization facilitates the abstraction of physical servers into virtual machines with their own OS, which can share the physical resources at the same time. Virtualization is efficient as it reduces the need for physical resources by hosting multiple servers on a single physical machine. The two main techniques of virtualization are OS virtualization and hardware virtualization.[10]

OS virtualization –

In OS virtualization, the operating system is partitioned into multiple instances, which behave like individual guests. These guests can be run, rebooted and administered independently of the host machine. The instances can serve as virtual servers of high performance to cloud customers and of high density to the operators. The disadvantage of this technique is that the guests cannot run different kernel versions; this is overcome by the hardware virtualization technique.[22]

Hardware Virtualization –

This technique of virtualization involves the creation of virtual machines with an entire operating system, including the kernel. This means that they can run different kernel versions, unlike OS virtualization. Hardware virtualization supplies an entire system of virtual hardware components on which an OS can be installed. It comes in the following types:

• Full virtualization – binary translation: the instructions passed to the guest kernel are translated at run time.

• Full virtualization – hardware assisted: the guest kernel instructions are not translated or modified and are handled by a hypervisor running a Virtual Machine Monitor, VMM.

• Paravirtualization: provides a virtual system with an interface for the virtual OS to use physical resources through hypercalls. This is most prevalent for network interfaces and storage controllers.[10,23]

3.1.2 Hypervisors

A hypervisor is software, hardware or firmware that creates, provisions, runs and monitors virtual machines. There are two types of hypervisors:

Type 1 –

This hypervisor runs directly on the processor, not as software within a host kernel. Supervision is taken care of by the first guest on the hypervisor, which runs in ring 0 and performs administration work such as creating and running new guests. This is also called a bare-metal hypervisor and provides scheduling for the VMs, e.g., Xen.

Type 2 –

The host OS kernel executes and supervises the hypervisor and the guests existing on it. This type does not come with a scheduler of its own but uses the host kernel scheduler, e.g., KVM.[24]

KVM, Kernel-based Virtual Machine, is a type 2 open source hypervisor used widely in cloud computing. This hypervisor, coupled with a user process called QEMU, Quick Emulator, creates hardware-assisted virtual instances [25]. It is used, for example, in the Joyent public cloud and Google Compute Engine [26]. Guests are first provisioned by allocating CPU resources as vCPUs and are then scheduled by the hypervisor. The vCPU allocation is limited by the physical CPU resources. Regarding observability, physical resource usage cannot be observed from within the virtual instances.

Hardware support is limited to 8 virtual cores per physical core, and once this maximum number of CPUs is exceeded, QEMU provides software virtualization to KVM. In the case of multiple VMs hosted by one physical machine, better performance can be attained by assigning 1 virtual core per VM.[27]

3.1.3 Cloud Computing and OpenStack

As summarized in section 1.1, cloud computing is a popular technology supporting physical resource sharing by multiple tenant servers. Among the cloud service models, IaaS provides compute, storage and network resources that consumers can provision based on their needs; PaaS allows users to run their applications on a provided platform; and SaaS allows the customer to utilize an application via a user interface.[1]

In Future Internet architectures, federation of such public and private clouds is an interesting feature. One project working towards the goal of reaching a cloud federation is FI-PPP, the Future Internet Public Private Partnership framework. This framework consists of several smaller projects, and XiFi is the project that concentrates on building and providing smart infrastructures for this cloud federation. These infrastructures facilitate the deployment of several applications and instances in a unified marketplace, where business logic is instantiated as VMs. BTH is one of the currently operational nodes across Europe.[28]

The nodes of XiFi are interconnected through networking infrastructure. Such architectures are beneficial since they are not implemented at a single site and hence are resilient: if hardware at one site crashes or runs out of storage, the virtual instances can be moved to other sites, depending on the load already existing there.

The XiFi nodes are heterogeneous clouds built on OpenStack and provide tools and services, such as Generic Enablers, for the deployment of various applications. They embody the cloud principles of on-demand self-service, resource pooling, scalability and security.[29]

Another example of such a federated platform is the infrastructure at City Network Hosting AB, a corporation that provides cloud domain services and hosting to its customers. Similar to XiFi, the services are delivered via an OpenStack user interface, where the users can create and provision their VMs and utilize the storage and other high-quality services offered by City Network. This web interface is called City Cloud and is similar to the FIWARE cloud lab of XiFi, which is built on the OpenStack dashboard. However, unlike XiFi, City Network upgrades regularly and keeps track of the latest OpenStack releases. Currently, they provide hosting from their datacenters in the UK, Stockholm and Karlskrona.[30]

City Network's architectures have a considerable number of customers utilizing their services. Identifying and comparing the host and guest CPU load will be of great value in such operational clouds for better customer service and load balancing.

Cloud Networking focuses on providing the VMs with static or dynamic IP addresses and firewalls to make them reachable from elsewhere. It provides control and security for the network functions and services delivered as a service over a global cloud computing infrastructure; the word global here refers to the federation of local clouds through network infrastructures. Cloud Networking can be of two types: Cloud Enabled Networking (CEN) and Cloud Based Networking (CBN). In CEN, the management and control characteristics are moved into the cloud, while network functions such as routing, switching and security services remain in the hardware. In CBN, the requirement for physical hardware is abolished and the network functions are transferred to the cloud; yet the networking infrastructure needed for fulfilling the networking demands of physical devices remains in hardware.[31,32]

OpenStack is open source cloud software that provides and manages IaaS. The infrastructures offered by OpenStack include compute, storage and networking resources. OpenStack comes with a combination of core and optional services that can be implemented in its cloud architecture. The minimal architecture of OpenStack consists of its core components, which can be realized in either a three-node or a two-node architecture. Figure 3.1 shows the core components and services in a three-node architecture; the two-node architecture is similar but eliminates the network node, whose services are moved to the compute node instead.[33,34]

• Controller –

OpenStack comes with a cloud controller that administers the other core and optional components or nodes of OpenStack. The controller node serves as the central management system for OpenStack deployments. The controller can be deployed on a single node or on several nodes, depending on the requirement. The main services managed by the controller include authentication and authorization services for identity management, databases, the user dashboard and image services.[35]

• Compute –

OpenStack's compute service is known by its project name, Nova. The compute node is the device that comprises the software to host virtual instances, thus providing the IaaS cloud platform. Nova does not come with virtualization software but comprises drivers to interact with the virtualization layer underneath. Related storage services are provided by the object storage component, called Swift, and the block storage component, called Cinder.[36]

• Networking –

This component aims at providing network services to the instances and enables communication between the compute nodes and the virtual instances. It is not necessary to have a separate node for networking; one of the compute nodes can be utilized for this purpose. The project name of the “Networking” component in OpenStack is “Neutron”.[34]

• Dashboard –

OpenStack provides a user interface for users to create and provision virtual instances as needed. It is formally named “Horizon” in OpenStack.[35]

• Telemetry –

As shown in figure 3.1, Telemetry is an optional service in OpenStack. Telemetry, or Ceilometer in OpenStack, monitors the OpenStack cloud environment to provide billing services. Ceilometer has an agent to collect CPU and network metrics that are used for billing, auditing and capacity planning, with meters for instance duration, CPU time used and the number of disk I/O requests. The CPU utilization reported by this agent is based on CPU ticks, not on the workload of the VMs.[37,38,39]

The hardware required for an OpenStack installation depends on the number of virtual instances needed and the types of services provided. Table 3.1 displays the minimal requirements for a small cloud platform using OpenStack.[40]

It is evident that cloud networking aims at increasing efficiency. Efficiency can be defined as the ability to do something without wasting material, time or energy. Smart load management can improve efficiency in cloud infrastructures, with focus on how best the resources can be shared. In the cloud, we place as many VMs as we can in the system, at different locations, to increase efficiency; on the other hand, we need the quality and speed required for running the VMs. The system is not efficient if it is not completely loaded, and we need to know the load on the system to characterize efficiency [14,15]. Our hypothesis is to find and understand the relation between CPU load inside the VMs and outside the VMs (on the host machine), which can be applied to smart load management techniques to ensure efficient usage of resources.

Figure 3.1: Minimal architecture and services offered by OpenStack's Controller node (left), Networking node (center) and Compute node (right).

Table 3.1: Minimal hardware required to install a controller and a compute node

             Processor  RAM   Hard Disk
Controller   1          2GB   5GB
Compute      1          2GB   10GB


3.2 Methodology

In this section, we describe the methodology applied in our thesis. To reach the aim of the thesis, our methodology comprises two subsequent stages: experimental research and visual correlation analysis. The Experimental Research subsection describes our approach to performing the experiments, while Visual Correlation Analysis is our analysis strategy for finding the relationship between host and guest CPU load.

3.2.1 Experimental Research

Prior to experimentation, we conducted a thorough study of the relevant literature and tools and modeled the experiments. The approach taken in this thesis is the experimental methodology, which is broadly used when evaluating solutions to questions. In experimental research, the researchers take measurements and then make evaluations to answer the questions.[41]

In an experimental approach, it is important to implement the following three stages, which we follow as well:

1. Record keeping – good documentation of the experiments and configuration is important in experimental work for future reference and relevance.

2. Experimental setup design – the platform for running the experiments needs to be modeled and designed. At the end of this phase, the researcher should document the hardware and software components and the experimental findings.

3. Reporting experimental results – the results obtained should be stated clearly, without distortion, and discussed.[41]

We have adopted the spiral methodology, a well-known method of software development, in performing our experimental research [42]. As depicted in figure 3.2, our spiral model consists of 4 dimensions: configuration, experimentation, observation and analysis. First, we make the necessary configurations of the PM and VMs and then perform the experiments in the configured scenario. The results are then observed and analyzed. Based on the analysis, the complexity of the configurations is increased for further experimentation and observation, justifying the spiral model approach.


Figure 3.2: Experimental methodology portrayed as a spiral model (dimensions: configuration, experimentation, observation, analysis).

3.2.2 Visual Correlation Analysis

There are various possible analysis methods, e.g., statistical analysis, but we start with a simple approach: a visual correlation analysis of the data obtained from experimentation. Visual correlation analysis shows the association between a set of x and y attributes, as depicted in figure 3.3. This visual representation for identifying any correlation between 2 sets of values is called a scatter plot [43]. Figure 3.3 is an illustration of scatter plots with a linear correlation line mapping the x- and y-axis values.

We have applied bivariate analysis to the results to determine the empirical relationship between the CPU load reported by host and guest, in the form of scatter plots and a table of correlation coefficients. Bivariate analysis is used to predict the value of a dependent variable when the value of the independent variable is known [44]. Finding the reason for the observed behavior, however, is beyond the scope of this thesis.

We set out to find a graphical representation that shows the relationship between host and guest CPU load. Since we apply a black-box approach, as discussed in chapter 2, we anticipate that the correlation obtained from the bivariate analysis can be linear, exponential or a highly varying curve, as shown in figure 3.4. Identifying the behavior of this relation is of interest here.

Figure 3.3: Example of a graph showing scatter plots and a linear correlation line as a relation between x and y attributes.

Figure 3.4: Anticipated graphical plots of possibly attainable correlation between host and guest in terms of CPU load.

In case a linear correlation is obtained, we identify the strength of the relationship from the correlation coefficient. The correlation coefficient (ρ) is calculated using the following formula:

\[ \rho_{(H,G)} = \frac{\operatorname{cov}(H,G)}{\sigma_H \, \sigma_G} \tag{3.1} \]

where:

\[ \operatorname{cov}(H,G) = \frac{\sum (H - \bar{H})(G - \bar{G})}{n} \tag{3.2} \]

H: host CPU load
G: guest CPU load
cov: covariance
σ_H: standard deviation of the host load
σ_G: standard deviation of the guest load
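For illustration, the following is a minimal Python sketch of equations (3.1) and (3.2), assuming the paired host and guest 1-minute load averages from one experimental run have been collected into two equal-length lists (the sample values shown are hypothetical):

    from math import sqrt

    def correlation(host, guest):
        """Correlation coefficient between host and guest CPU load, eq. (3.1)."""
        n = len(host)
        mean_h = sum(host) / n
        mean_g = sum(guest) / n
        # equation (3.2): covariance of the paired load samples
        cov = sum((h - mean_h) * (g - mean_g) for h, g in zip(host, guest)) / n
        # population standard deviations of the host and guest series
        sd_h = sqrt(sum((h - mean_h) ** 2 for h in host) / n)
        sd_g = sqrt(sum((g - mean_g) ** 2 for g in guest) / n)
        return cov / (sd_h * sd_g)

    # hypothetical 1-minute load averages collected during one run
    host_load = [1.95, 2.10, 2.02, 1.88, 2.05]
    guest_load = [1.90, 2.08, 1.99, 1.85, 2.01]
    print(correlation(host_load, guest_load))

A coefficient close to 1 or -1 indicates a strong linear association between the host and guest load, while values near 0 indicate a weak one.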

Having gone through the overview of our methodology, we proceed to the measurement tools used in our experiments in the next section.

3.3 Measurement Tools

Several simple, standard tools help obtain CPU load metrics on a Linux system. We have followed the “Tools” methodology of system performance analysis, which embraces the use of available performance tools, followed by checking the metrics they provide and finally interpreting the metrics obtained [10]. The various standard tools used for this experimental purpose are described in this section. The scope of this thesis is limited to the use of these tools for experiments; it does not include study of the internal mechanisms of these tools.

In a cloud environment, load can be caused or affected by the factors depicted in figure 3.5. Load exists in three dimensions: in the physical server, in the network and in the virtualization technology [45,46]. This thesis is about CPU load observation in VMs and the physical server hosting them; hence, our focus is on the hardware of the server dimension and the resource allocation of the virtualization dimension.

CPU load is the number of processes utilizing or waiting for the computational resources. It is calculated as an exponentially damped moving average and is reported over 1-, 5- and 15-minute intervals [47]. There are various CPU performance analysis tools in Linux; two such tools that provide CPU statistics are top and uptime, which fetch the values from /proc/loadavg. The load average reflects the demand for CPU and is computed by summing the number of threads running and the number waiting to run. When there are processes queued for compute resources, the CPU is said to be in saturation. The load average should stay within the number of CPU cores; if it is higher than the number of CPU cores, the CPU is in a saturation state, since processes are queued.[48,49]
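As a concrete illustration, a minimal sketch of such a load collector, which reads the three load averages from /proc/loadavg just as uptime does and flags saturation by comparing the 1-minute average against the core count (the 5-minute sampling interval mirrors our measurement scripts; the output format is an assumption):

    import os
    import time

    CORES = os.cpu_count()  # load averages are judged against the core count

    def sample_loadavg():
        # /proc/loadavg begins with the 1-, 5- and 15-minute load averages
        with open("/proc/loadavg") as f:
            one, five, fifteen = f.read().split()[:3]
        return float(one), float(five), float(fifteen)

    while True:
        one, five, fifteen = sample_loadavg()
        saturated = one > CORES  # more demand than cores implies saturation
        print(f"{time.time():.0f} {one} {five} {fifteen} saturated={saturated}")
        time.sleep(300)  # 5-minute interval, matching our scripts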

Figure 3.5: Server, virtualization and network components that are related to causing or affecting load in the system.

Description of the measurement tools used:

UPTIME – This command-line tool prints the load averages of the system for the last 1, 5 and 15 minutes. As indicated above, the load average represents the demand on the processor, and we can compare these three load averages in order to identify trends in the CPU load [50]. Figure 3.6 shows the output of the “uptime” command on a Linux terminal.[10]

Figure 3.6: Example output of the “uptime” command on a terminal:

$ uptime
9:04pm up 268 day(s), 10:16, 2 users, load average: 7.76, 8.32, 8.60

TOP – Another popular tool for monitoring the top running processes is “top”.

The top command, when issued, prints a system-wide summary and the running processes on the terminal. On Linux, the top output is updated at regular intervals. The system-wide summary contains the CPU statistics, which include the load averages and the percentage of time spent at the user, system, nice, idle, I/O wait, hardware interrupt, software interrupt and steal levels. These values are obtained by taking the mean across all CPUs. [10,51]

Top is a tool well accepted by beginners of performance analysis due to its features. In spite of this significance, it has a minor drawback: it impacts performance by adding load itself. That is, when the top command is issued on a system, the process or thread running top is visible in the %usr column, implying that top consumes CPU itself while extracting the statistics, which is not favorable in performance analysis. This is due to the open, read and close system calls, which have their own costs for extracting statistics from the /proc entries. Top has another minor drawback of missing short-lived processes: since the statistics are derived from snapshots of the /proc entries, processes that start and finish between two snapshots are not considered [10]. Figure 3.7 shows the output of the top command on a terminal.

Figure 3.7: Example output of the “top” command on a terminal:

$ top
top - 01:38:11 up 63 days, 1:17, 2 users, load average: 1.57, 1.81, 1.77
Tasks: 256 total, 2 running, 254 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.0%us, 3.6%sy, 0.0%ni, 94.2%id, 0.0%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 49548744k total, 16746572k used, 32802172k free, 182900k buffers
Swap: 100663292k total, 0k used, 100663292k free, 14925240k cached
[...]

Although top has the minor drawbacks mentioned above, this standardized tool is used in well-known performance monitoring applications such as Nagios [52]. We have chosen both the uptime and top tools for our experiments, after studying them closely, to be able to compare and validate our measurements of the system's CPU load. This helps us find out how the results of top differ from those of uptime, even though both collect data from the /proc entries [10]. Our aim, however, is not to compare these tools but to use them for our experiments.
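For completeness, a minimal sketch of collecting the same load averages through top in batch mode instead of uptime (the parsing of the first output line is an assumption based on the header format shown in figure 3.7):

    import subprocess

    def loadavg_from_top():
        # "-b" runs top in batch mode, "-n 1" prints a single iteration;
        # the first output line ends with "load average: x, y, z"
        out = subprocess.run(["top", "-b", "-n", "1"], capture_output=True,
                             text=True, check=True).stdout
        first_line = out.splitlines()[0]
        averages = first_line.split("load average:")[1].split(",")
        return [float(a) for a in averages]

    print(loadavg_from_top())  # e.g. [1.57, 1.81, 1.77]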

Stress is a popular Linux tool used to load the CPU with a defined computational workload. It comes with various options for CPU cores, memory, duration, etc. In our experiments, we stressed the vCPUs for a period of 5 minutes, and scripts for the uptime and top commands were written to run at every 5-minute interval. Of the various options of stress, we have used “-c” in this experiment. Option “-c” sets the number of CPUs to be stressed: vCPUs when used on VMs and physical CPUs when used on the host machine.[53,54]

Stress-ng is a tool similar to stress, with an additional load-customization option: with stress-ng, we can impose a desired amount of load, in percent, using the “-l” option [55].
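As an illustration of how the two load generators are invoked, a minimal sketch of one 5-minute run (the option spellings follow the stress and stress-ng manuals; treat the exact invocations as assumptions to be verified against the installed versions):

    import subprocess

    DURATION = "300s"  # 5 minutes, as in our experiments

    def run_stress(cpus):
        # stress "-c" spawns the given number of CPU-bound workers (100% load)
        subprocess.run(["stress", "-c", str(cpus), "-t", DURATION], check=True)

    def run_stress_ng(cpus, load_percent):
        # stress-ng adds "-l" to throttle each worker to a target load percentage
        subprocess.run(["stress-ng", "-c", str(cpus), "-l", str(load_percent),
                        "-t", DURATION], check=True)

    run_stress(2)         # e.g. fully load the 2 vCPUs of one guest
    run_stress_ng(2, 50)  # e.g. 50% load on 2 vCPUs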

In our experiments, we started with the well-known stress tool, widely used in system performance tests (viz. [56]), and moved on to the more capable stress-ng for a refined analysis of system behavior under varying workloads.


Chapter 4

General Experimental Setup

Now that we have given a clear direction of our methodology and the tools selected, this chapter gives a brief description of our experimental design. A clear picture of the basic concepts and our approach from the previous chapters is essential in order to connect the dots and to interpret the motivation behind our experimental model. The following sections explain our conceptual model and the different experimental structures, with figures.

4.1 Experimental Modeling

We have modeled our experimental test-bed in an OpenStack cloud to observe the load relationship between host and guest. As stated earlier, in OpenStack, nova-compute is the agent that provisions and runs the VMs (refer to section 3.1.3), and depending on the requirement there can be multiple compute nodes. Figure 4.1 illustrates the notion of multiple compute nodes managed by one controller in an OpenStack environment.

A compute node requires a hypervisor for interacting with and managing the VMs. Our OpenStack cloud environment uses the KVM hypervisor, which is a type-2 hypervisor. Type-2 hypervisors use the host's kernel scheduler to process requests from VMs and do not have schedulers of their own. Figure 4.2 is an abstraction of the layers involved in a virtualization system with and without the OpenStack compute agent. As shown in figure 4.2, nova-compute interacts with the virtualization mechanism using APIs but does not have a virtualization mechanism of its own. [36]

Figure 4.1: An OpenStack environment can have “n” compute nodes based on requirement; a controller manages the compute nodes.

Figure 4.2: Abstraction of hardware, software and virtualization layers in a system. Nova-compute is not present in a normal virtualization system.

Hence, our hypothesis is that a simple virtualization system and a system with the OpenStack compute agent will show similar load behavior. Since we apply a spiral development approach to our experimentation, we initially use a similar experimental setup on two devices, an OpenStack compute node and an on-premise device with basic configurations, to compare the results. If the observed results support our hypothesis, we increase the complexity of our experiments and continue them on our on-premise setup. The main reasons for setting up an on-site device are the difficulty of collecting information from the VMs, since a CSP is devoid of access to the VMs, the need to eliminate the impact of other services running on the operational cloud on our test environment, and the wish to avoid inconvenience for customers.


Working towards our aim of finding the relationship between host and guest CPU load, we collect the CPU load from both the PM and the VMs, as portrayed in figure 4.3. The main subject of interest lies in investigating whether we can interpolate and extrapolate from host to guest and vice versa by observing the load behavior on the PM under varying workloads on the VMs.

Figure 4.3: Visual representation of the PMs, VMs and tools used in the implementation of the experiments. The figure poses the question of what the relationship between load on the host and on the guest machines could be.

Apart from this, we performed the experiments on two different host OS, CentOS and Ubuntu, both popularly used in cloud architectures. As KVM is a type-2 hypervisor and runs as software on the host OS, the VMs on KVM are processes that can be manipulated. OS scheduling is provided by the host kernel, and our hypothesis leads us to observe the behavior of these two Linux distributions with different kernel versions; CentOS and Ubuntu are chosen to examine this hypothesis.

The tools used for load generation are stress and stress-ng, and those for CPU load collection are uptime and top. All experimental scenarios and configurations are described in detail, with figures, in the succeeding sections of this chapter. The observations of the experiments are presented in chapter 5.


4.2 Experimental setup

In the initial setup, we modeled the experimental platform on an operational cloud infrastructure and on an on-premise device, as described below.

4.2.1 OpenStack Cloud Test-bed

This experiment was executed on one of the many compute nodes of the City Cloud infrastructure owned by City Network AB [57]. The features of the compute node used for the experiments are listed in Table 4.1.

Table 4.1: Specifications of the OpenStack compute node used for experiments

Criteria     Specification
CPU          Intel® Xeon® CPU E5-2697 v2 @ 2.70GHz, 48 cores
RAM          32GB
Hard disk    2x300 GB SAS
Hypervisor   KVM
Host OS      CentOS 6.6
Guest OS     Ubuntu 14.04 LTS server

Four virtual machines were created on this compute node, with CentOS as the host machine and Ubuntu 14.04 LTS server as the OS of all 4 VMs. Each VM was provisioned with 2 vCPUs, as the KVM hypervisor uses core numbers for provisioning [27]. The next step in the process is stress-testing the VMs while simultaneously collecting the CPU load metrics from the host. No other application runs on the guests or the host while the stress tests are in progress, to make sure that the CPU is free of any load other than that produced by stress.

The setup is as presented in figure 4.4. The four virtual instances present on the host device are up and running throughout the experiment. At first, the stress command is issued to impose load on 1 VM for 5 minutes while the other VMs are idle. The stress command is customized to stress 2 vCPU cores, which means that the VM is imposed with 100% load, as each VM has 2 vCPUs assigned to it. Simultaneously, the load values are obtained from the host with the help of the command-line tool uptime.

Figure 4.4: Depiction of the experimental setup on the OpenStack platform.

The experiment is conducted in a similar manner for 2 VMs: 2 vCPUs each of 2 VMs were stressed. In other words, 100% load was imposed on each of 2 VMs, and the load metrics were collected once again from the host. The remaining 2 VMs are idle while the 2 guests are being stressed.

The next stage involves conducting similar stress tests on 3 VMs, with 1 VM idle. After collecting the load from the host machine, all 4 VMs are imposed with stress on their dedicated vCPUs, i.e., 8 vCPUs in total, again with uptime running simultaneously on the host. Figure 4.5 provides a clear picture of the 4 stages of this experimental run.
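For clarity, a minimal sketch of this staged procedure (the guest host names and the ssh-based helper are hypothetical; in the actual runs, stress was issued on the guests and uptime was sampled on the host by our scripts):

    import subprocess

    GUESTS = ["vm1", "vm2", "vm3", "vm4"]  # hypothetical guest host names

    def stress_guest(guest):
        # start 2 CPU workers for 5 minutes on a guest, without blocking
        return subprocess.Popen(["ssh", guest, "stress", "-c", "2", "-t", "300s"])

    for k in range(1, len(GUESTS) + 1):         # stages 1 to 4
        workers = [stress_guest(g) for g in GUESTS[:k]]
        subprocess.run(["uptime"], check=True)  # sample the host load meanwhile
        for w in workers:
            w.wait()                            # end of the 5-minute stage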

4.2.2 On-premise Test-bed

This test setup is installed on a standalone device with the KVM hypervisor. It helps us distinguish between the cases with and without OpenStack, in support of our hypothesis as stated in section 4.1. The test-bed is built with the attributes listed in Table 4.2.

This experimental setup imitates the City Network compute node chosen for the previous experiment, the minor difference being the number of CPU cores. The hypervisor used is KVM, as it is heavily used by popular public and private clouds, viz. Joyent and Google [10], which is also evident from City Network's infrastructure.


Figure 4.5: Stages of the experiments performed with stress: (a) stage 1 – stress imposed on 1 VM; (b) stage 2 – stress imposed on 2 VMs; (c) stage 3 – stress imposed on 3 VMs; (d) stage 4 – stress imposed on 4 VMs.

Virtual machine provisioning in KVM is based on the cores available on the PCPU [27]. For the experimental purposes, 4 guests were created and provisioned with 2 vCPUs each. That is, out of the 8 cores available on the PCPU, 2 vCPUs are allocated to each VM.
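For illustration, a guest of this shape could be provisioned with virt-install roughly as follows; this is a sketch under assumptions, and the VM name, memory size, disk path and ISO path are placeholders rather than the exact values used in the study:

    # create one KVM guest with 2 vCPUs (name, sizes and paths are hypothetical)
    virt-install --name vm1 --ram 2048 --vcpus 2 \
        --disk path=/var/lib/libvirt/images/vm1.img,size=20 \
        --cdrom /iso/ubuntu-14.04-server-amd64.iso

The --vcpus 2 option gives the guest the 2-vCPU allocation discussed above.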

Figure 4.6 provides a pictorial representation of this test-bed setup. The host OS used is CentOS 6.6 and it should be noted that the guest OS is Ubuntu 14.04 LTS server in all the cases that are considered for experimentation.


Table 4.2: Specifications of the on-premise device used for experiments

Criteria     Specification
CPU          Intel® Xeon® CPU E3-1230 v2 @ 3.30GHz, 8 cores
RAM          8 GB
Hard disk    500 GB SAS
Hypervisor   KVM
Host OS      CentOS 6.6
Guest OS     Ubuntu 14.04 LTS server

Also, similar to the previously described OpenStack test case, all the VMs are kept running throughout the experimentation.

Moving on to the performance tests, the stress tool is used for generating load on the guests. The stress command is first issued on VM 1 to produce 100% load by stressing the 2 vCPUs available on the VM, and the CPU load is collected from the CentOS host using uptime. This is then repeated for 2 VMs, stressed with 2 vCPUs each; the host CPU load behavior when the 2 vCPUs of each VM are fully stressed is observed via the uptime command.

Figure 4.6: Depiction of on-premise device setup (four VMs, each with its own guest OS and kernel, on KVM atop the host kernel, Ubuntu or CentOS host OS, and the hardware processors).


The next step is to stress test 3 VMs, while 1 VM is still idle, and observe the load characteristics of the hardware CPU through the host, with the help of uptime once again. The last run is stress testing 7 vCPUs of the 4 VMs and collecting the CPU load from the host machine.

Stress tests on these setups are performed for 5 min on each VM while CPU load values are retrieved from the respective hosts as explained. We repeated the experiments ten times to ensure robustness of the results. The data obtained is stored for later analysis and is presented in Chapter 5.
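One way to script the ten repetitions for a given stage, reusing the stress and uptime invocations from above (the guest address and log file names are hypothetical), is:

    # repeat the 5-minute run ten times, keeping one host-side log per iteration
    for run in $(seq 1 10); do
        ssh user@vm1 "stress --cpu 2 --timeout 300s" &
        for i in $(seq 1 300); do uptime >> host_load_run${run}.log; sleep 1; done
        wait
    done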

4.2.3 Stress tests on Single Guest

Following our spiral approach, the results of this first stage of experiments convinced us to continue our experimentation on the on-site virtualization test-bed with increased complexity. This second main framework of experimentation is described in this section.

This experiment is set on the standalone device with a virtualization environment but no OpenStack platform. The structure of this experiment is quite different from the previous experimental configuration, except for the hardware used, which is the same in both cases. Besides, a notable difference is the number of VMs hosted by the KVM hypervisor. The specification is as shown in Table 4.2 and is summarized below once again:

• Intel® Xeon® CPU E3-1230 v2 @ 3.30 GHz

• 8-core CPU

• KVM hypervisor

• Ubuntu 14.04 server for guest OS

• Ubuntu 14.04 LTS Desktop and CentOS 6.6 for Host OS

This is performed on two different Linux distributions, Ubuntu and CentOS, due to their wide usage in cloud infrastructures. In contrast to the former arrangement, only 1 VM is created on KVM and it is allocated only 2 vCPUs.

In other words, out of the total available 8 cores of the PCPU, only 2 vCPUs are assigned to the only existing VM. This VM is later stress tested in multiple ways for recognizing the impact on system load. The experimental setup is as depicted in Figure 4.7.


Figure 4.7: Depiction of on-premise device experimental setup for a single guest (one VM with its own guest OS and kernel on KVM, atop the host kernel, Ubuntu or CentOS host OS, and the hardware processors).

4.2.3.A. Testing with “Stress” tool

Initially, the stress tool is operated to impose load on 1 vCPU of the VM. The uptime command is run on both the host and the guest, to understand the CPU behavior of both the physical and virtual systems; load information of the guest systems was not collected earlier in the multiple (four) guest scenario. This experiment is also run for 5 min. Figure 4.8 demonstrates this initial step clearly. In our experiments, we have used the uptime command to collect the CPU load average every 1 second during this 5 min period.
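A sketch of this 1-second sampling, together with a way to extract the load-average triple from each recorded uptime line afterwards (the log file names are hypothetical), could be:

    # run on both the host and the guest during the 5-minute stress period
    for i in $(seq 1 300); do uptime >> load.log; sleep 1; done

    # afterwards, keep only the 1-, 5- and 15-minute load averages from each sample
    awk -F'load average: ' '{print $2}' load.log > load_values.csv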

Next, the load is applied to 2 vCPUs on the VM. To achieve this, stress is issued on the VM with the option of 2 (v)CPUs to be stressed. At the same time, uptime is once again run on the host and the guest to collect the load values over this 5 min period.

Likewise, stress is now given the option to load 3 vCPUs from the VM. Even though the VM is bound to run with only 2 vCPU resources, stress can be used to enforce load on more CPUs than are actually available on the compute machine [53]. It is possible to enter a number higher than the available vCPUs, which would make the OS unstable. We chose to do this experiment to observe the impact on PM load behavior when the guest utilizes more than the allocated resources.
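In this over-provisioned case, the invocation inside the guest simply asks stress for more workers than the guest has vCPUs, for example:

    # the guest has 2 vCPUs; 3 CPU workers deliberately oversubscribe them for 5 minutes
    stress --cpu 3 --timeout 300s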

Henceforth, after generating workload on 1 and 2 vCPUs, the tool
