Comparative evaluation of virtualization technologies in the cloud

Academic year: 2021

School of Innovation, Design and Engineering

Västerås, Sweden

Thesis for the Degree of Bachelor of Science in Engineering

Computer Network Engineering 15.0 credits

COMPARATIVE EVALUATION

OF VIRTUALIZATION

TECHNOLOGIES IN THE

CLOUD

Marcus Johansson

mjn12029@student.mdh.se

Lukas Olsson

lon11005@student.mdh.se

Examiner: Moris Behnam

Supervisor: Alessandro Papadopoulos


Abstract

The cloud has over the years become a staple of the IT industry, not only for storage purposes, but for services, platforms and infrastructures. A key component of the cloud is virtualization and the fluidity it makes possible, allowing resources to be utilized more efficiently and services to be relocated more easily when needed. Virtual machine technology, consisting of a hypervisor managing several guest systems, has been the established method for achieving this virtualization, but container technology, a lightweight virtualization method running directly on the host without a classic hypervisor, has been making headway in recent years. This report investigates the differences between VMs (Virtual Machines) and containers, comparing the two in relevant areas. The software chosen for this comparison is KVM as VM hypervisor and Docker as container platform, both run on Linux as the underlying host system. The work conducted for this report compares efficiency in common use areas through experimental evidence, and also evaluates differences in design through study of relevant literature. The results are then discussed and weighed to provide a conclusion. The results of this work show that Docker has the capability to potentially take over the role as the main virtualization technology in the coming years, provided some of its current shortcomings are addressed and improved upon.


Contents

1 Introduction
1.1 Thesis Outline
2 Background
2.1 Virtualization
2.2 Virtual machines
2.2.1 KVM (Kernel-based Virtual Machine)
2.3 Containers
2.3.1 Docker
3 Problem Formulation
4 Method
4.1 Technology
4.2 General setting
4.3 Verifying KVM and Docker configurations
4.4 Metrics
5 Ethical and Societal Considerations
6 Evaluation of performance
6.1 MySQL database queries
6.2 I/O performance
6.2.1 General parameters
6.2.2 Sysbench in Docker
6.2.3 Sysbench in KVM
6.3 Migration time
6.3.1 KVM Migration
6.3.2 Container Migration
7 Results
7.1 MySQL Database Queries
7.2 File I/O Tests
7.3 Migration time
8 Discussion
8.1 MySQL Database Queries
8.2 File I/O Tests
8.3 Migration Tests
8.4 Security aspects
8.5 Other factors
9 Conclusions
10 Related Work
11 Future Work
References
A Appendices
A.1 Database query scripts
A.1.1 KVM
A.1.2 Docker
A.2 I/O performance scripts
A.2.1 KVM
A.2.2 Docker


List of tables

1 Hardware specifications.
2 Performance and functionality overview.


List of figures

1 Structure of a virtual machine. [6]
2 Structure of a container. [6]
3 Box and line diagram of research method as recommended in [11]
4 Sysbench container output of 'docker ps'.
5 Size of a container with MySQL server in Virtuozzo.
6 MySQL query performance in KVM.
7 MySQL query performance in Docker.
8 MySQL query performance in KVM (100 Mbps throughput).
9 MySQL query performance in Docker (100 Mbps throughput).
10 File I/O throughput in Docker and three different KVM virtual disk formats, as mentioned in Section 6.2.3.
11 Multithread file I/O tests in KVM
12 Multithread file I/O tests in Docker
13 Live VM migration in KVM


1 Introduction

Virtualization of computer systems is used extensively in cloud computing to provide more efficiency in terms of cost, energy and resource allocation. From the average end-user's perspective, the cloud might be thought of in terms of services like iCloud, Google Cloud and Dropbox. This is cloud storage, but the cloud is also used to provide software, platforms and infrastructures as a service remotely. Computer systems, especially in the context of datacenters and the cloud, often have significant underutilization of their hardware resources. Therefore, virtualization is key in order to cut costs and provide a more efficient and fluid solution for modern cloud services. This is accomplished through the use of different virtualization technologies that allow an administrator to allocate resources based on the requirements of different services, and to migrate virtual systems to and from physical systems depending on where the resources are needed at one particular time. As virtualization has developed into a more and more predominant technology in the cloud, the performance of the different types of virtualization technologies has become crucial for response times of services such as video encoding, database applications and other processing-heavy applications. For this reason, administrators are required to have the necessary knowledge and understanding of these different types of virtualization so as to be able to ensure that a high standard of performance can be sustained in terms of low downtime and fast processing of data.

In this work we will explore two virtualization technologies, Virtual Machines (VMs) and Linux containers, through research and experimentation. We will test database performance and I/O performance, and evaluate live migration in the two technologies, along with investigating other influencing factors. After theoretical and experimental data has been collected, a comparison between the two technologies based on said data is made. Through this, a conclusion is reached, namely whether containers have the capacity to surpass VMs as a virtualization method and, if so, in what way. Through comparison of some concrete qualities of the two technologies, administrators can make an informed and up-to-date decision on which to use in their particular cloud computing scenario.

1.1 Thesis Outline

This report consists of 11 sections, excluding references and appendices, which describe the work done for this thesis. The different sections include:

• Section 1 provides an introduction to the subject and some terminology along with a brief problem formulation.

• Section 2 describes the background for this work in detail, including descriptions of technologies used.

• Section 3 contains the problem formulation along with research questions aimed to be answered through this work.

• Section 4 describes the method, metrics, technology and setting in which this work will be conducted.

• Section 5 outlines ethical and societal considerations that had to be taken into account. For this work, no specific considerations had to be made.

• Section 6 describes the performance evaluation experiments conducted for this work.

• Section 7 presents the results from the experiments along with brief summaries.

• Section 8 discusses the results in detail and attempts to provide some explanations for them.

• Section 9 contains the conclusion of this work.

• Section 10 describes related work.

• Section 11 describes future work, including other factors to take into consideration for further comparison, which were not evaluated in this work.

2 Background

Cloud computing is widely used today to allow resources to be available for users regardless of their physical location. In order for providers to make their data centers as efficient as possible, virtualization of resources is often implemented through virtual machines. This provides flexibility and scalability at a lower cost for both the provider and the customer.

Within cloud computing today there are several technologies that make extensive use of virtual machines (VMs). VMs allow for isolated execution of workloads and more efficient resource utilization. This provides the means for efficient cloud computing; however, VMs have certain limitations, such as static resource allocation, meaning the resources cannot be resized during runtime, and other characteristics such as a high degree of isolation. Although dynamic resource allocation can be achieved in some cases through the use of third-party solutions, it is not commonly supported natively. Isolation can be seen as both an advantage and a limitation, depending on the standpoint. From a security standpoint it can be seen as an advantage, as isolation corresponds to a more shielded environment. Because of its sandboxed environment, should an attacker find a window into the system through a VM, it is not likely that the host system and other virtual machines are at risk. On the other hand, from a performance standpoint, a higher degree of isolation means additional overhead, impacting performance. With containers gaining more traction and popularity in recent years throughout the industry, many commonplace applications and usages are now available in container form. This popularity has resulted in companies such as Lyft, Spotify and eBay using containers for their product solutions [1]. In recent years, some empirical studies have provided experimental evidence suggesting that containers can indeed overcome the limitations of VMs [2]. Thus, this work intends to make a fair comparison between the two.

2.1 Virtualization

Virtualization is accomplished through the use of a hypervisor, which is software that maintains the virtual machine and maps physical resources to virtual resources visible to the virtual machine [3]. The hypervisor supervises the kernel of the operating system, and this term dates back to the 1960s during the inception of virtualization technology [4]. Two different types of hypervisor can be identified, bare-metal/native hypervisors (Type-1) and hosted hypervisors (Type-2) [5]. Bare-metal runs directly on the host hardware to control and manage resources for the virtualised operating system, while hosted hypervisors run on top of a conventional operating system as a process.


2.2 Virtual machines

Virtual machines are computer software that replicates one or more instances of virtual computers on a single physical system, each including a separate operating system (OS), network interface card, CD drive and all other components you would find in a regular computer.

Figure 1: Structure of a virtual machine. [6]

This allows administrators of large datacenters and local users alike to create isolated instances of virtual machines on their physical machines. The keyword here is isolated, in the sense that the virtual machine is completely isolated from the physical machine and hence is less susceptible to being affected by potentially harmful software such as viruses. This allows the user to run any software on the virtual machine without risking the security of the physical machine or of other virtual machines on the system. Furthermore, it allows for more efficient allocation of resources within datacenters, for example, where virtual machines can be tailored to have the exact amount of desired resources, such as processing power, memory and storage, for the services intended to run on the system. This in turn enables a more energy-efficient approach to computer service management as well as a more cost-effective alternative compared to running several different physical systems in place of virtual machines [6]. IT infrastructures that are lacking in virtualization tend to underutilize the resources at their disposal severely, with storage utilization under 50% and server utilization in many cases as low as 10% [7].

2.2.1 KVM (Kernel-based Virtual Machine)

Kernel-based Virtual Machine (KVM) is an open-source hypervisor that runs on Linux (although it also exists as a stand-alone distribution). KVM blurs the line between type-1 and type-2 hypervisors, because it is packaged as a component of Linux and managed from the host operating system, but it runs directly on the x86 hardware, co-processing with the kernel [8]. KVM was implemented in the Linux kernel in 2007 and has since matured to a stable hypervisor, being widely used in open-source environments, with projects such as OpenStack using it as their default hypervisor [9]. KVM suffers from the same limitation as other hypervisors, namely statically bounded resources, limiting the VM to less than the physical machine might be able to provide at any given time (KVM supports VM resizing, but this requires a guest operating system with support for resizing). It also provides the isolation from the host and other guests that VMs in general are known for [5].

2.3 Containers

Containers and VMs are similar in almost every sense functionality-wise; the big difference lies in the underlying structure of how the two technologies operate on top of the physical machine. While a VM utilizes a hypervisor for communication between instances of VMs and the physical system, along with a separate OS for each VM, containers operate directly on top of the underlying host system without the use of a hypervisor for communication between the container and the host system. This makes containers more lightweight, enabling them to achieve very fast startup times and very efficient resource allocation, because all containers share the OS and resources with the host system [6].

Figure 2: Structure of a container. [6]

Containers are however not as secure as VMs because of the way they operate. While VMs create more isolated instances of virtual systems because of the hypervisor acting as a barrier between the host system and the VMs, containers operate directly on top of the host system, which allows them to make system calls to the kernel and in turn exposes a larger attack surface [6].

2.3.1 Docker

Docker, like its hypervisor counterpart in this report, is open-source and runs on Linux, and is based on Linux containers. It provides a consistent lightweight environment for applications to run in, by supplying only the necessary components needed. There is also support for full operating-system-level virtualization, where the entire structure of an operating system, such as Ubuntu, can be virtualized. Docker uses copy-on-write, where a base image can be built using a Dockerfile, and then several containers can be built on top of this base image, with only the changes relevant to each container being saved. Docker is to some degree community-driven, and in the Docker Hub community members can upload their own Docker images for other users to build containers from. There are also official Docker images, verified by Docker, for more commonly used applications such as MySQL and RabbitMQ. Docker has steadily been getting more popular since 2013, when it was first launched [10].

3 Problem Formulation

Today, VMs and containers are two approaches to much the same problems, but they vary in architecture and underlying structure. This work will attempt to identify differences in functionality and performance between VMs and containers, along with suitable metrics for use during the test phase. This will be accomplished through investigation of current literature, experimentation and analysis of the results. After these parameters have been identified, the thesis will draw conclusions based on these differences and attempt to provide recommendations for different areas of use and different scenarios.

The research questions that this work will be based on are:

• How do KVM and Docker differ performance-wise in common operations such as database queries, read/write operations and migration?

• How do they differ in a security and scalability context?


4 Method

Figure 3: Box and line diagram of research method as recommended in [11]

The work will consist of gathering experimental data, employing a quantitative research method, but also of studies of current literature. Experimental data will be gathered in large enough batches to provide accurate statistics of behaviour. The setting in which the experimental data will be gathered is static, to provide a fair comparison between the two technologies. This method provides more empirical evidence to support claims and is fitting for this work, where large amounts of data can be generated for analysis. This is opposed to qualitative research, such as case studies, which look in depth at one specific case.

"The term quantitative research refers to approaches to empirical inquiry that collect, analyze, and display data in numerical rather than narrative form." [12]

—Lisa M. Given

4.1 Technology

The technologies used to make the comparison between VMs and containers will be Docker and Kernel-based Virtual Machine (KVM), both of which run on Linux and are open source. I/O performance will be measured with sysbench [13], database queries will be measured with mysqlslap [14], and live migration performance will be compared using KVM's built-in migration feature as well as Virtuozzo containers.

4.2 General setting

All experimental data will be gathered under the same structure, using two HP ZBook 15 G3 laptops with Intel Core i7-6820HQ 8-core processors and 32 GB of RAM and an HP 1820 switch with factory settings, including 1 Gbps throughput. The VMs and containers will be separated from a querying device by the network device, and the hardware configurations on which they run will be the same so as to reduce the amount of variables that will have to be adjusted for. Migration, system operations and database queries are some of the methods that will be used to gauge the efficiency of the technologies. Both of our physical machines and all virtual machines will run the latest version of Ubuntu for desktop, currently Ubuntu version 16.04 LTS [15].

4.3 Verifying KVM and Docker configurations

As the tests performed in this work required different VM and container configurations for the comparison of performance, there had to be some kind of verification that these configurations were accurate.

    HP ZBook 15 G3
    Processor:         Intel Core i7-6820HQ, 8 cores @ 2.70 GHz
    RAM:               32 GB
    Video card:        NVIDIA Quadro M1000M
    Operating system:  Ubuntu 16.04 LTS
    KVM version:       2.5.0
    Docker version:    1.13.1

Table 1: Hardware specifications.

As seen in Table 1, the machines used for this work have 8 processor cores, while the tests used configurations of 4, 2 and 1 cores. In order to verify that limiting a virtual machine or container to, for example, 4 cores worked, htop was used to show processor workload. For KVM it was quite simple to verify that the limitations worked; a stress testing tool called stress was downloaded and run within the KVM machine, which resulted in only 4 of the 8 cores showing load in htop. For Docker, an image with a stress testing tool created by petarmaric [16] was downloaded from the Docker repository. This was run in a Docker container restricted to 4 processor cores, while the stress testing tool was instructed to stress all 8 cores. Again, using htop, the observation could be made that, although all 8 cores were being stressed, only 4 of the 8 cores were under load, which verified that the configuration worked.

    docker run -ti -e MAX_CPU_CORES=8 -e STRESS_SYSTEM_FOR=30s \
        --cpuset-cpus=0-3 --memory=16g petarmaric/docker.cpu-stress-test

The above command runs the Docker container with the stress testing tool, instructing it to stress all 8 cores for 30 seconds. The --cpuset-cpus=0-3 option specifies that the container should be assigned processor cores 0, 1, 2 and 3.
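For reference, the corresponding check inside the KVM guest can be sketched with the stress tool directly. The exact invocation is our assumption, since the thesis only names the tool; the flags are those of the standard stress utility.

```shell
# Run inside the 4-vCPU KVM guest: spawn 8 CPU workers for 30 seconds,
# then observe in htop on the host that only 4 physical cores show load.
stress --cpu 8 --timeout 30s
```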

4.4 Metrics

In order to compare performance, some metrics must be determined. The metrics identified for this work are database query response time, file I/O performance (throughput) and live migration time. Of course, many other metrics could be identified that have an impact on performance in each technology. The three metrics chosen for this work together cast a fairly broad net over the different operations that a virtualization tool might be expected to perform frequently. These choices are based on findings from studying other literature, namely which operations were most commonly tested, and as such we can conclude that these metrics are sufficiently relevant as grounds for a comparison. The quality of a metric is determined by how significant its impact is in a real-world setting, but its quality might be reconsidered if the difference between the two technologies is significant enough. For example, cold start-up time would perhaps not be considered a very high-value metric because of its limited effect on the performance of an infrastructure, but if one technology were several times slower than the other for this metric, it might be reconsidered as more relevant and of higher quality than it would otherwise. As a general statement it could be said that parameters with a consistent impact on performance are more relevant. Processing performance and added overhead can be considered very relevant, because their impact would be constant and felt at all times. Response time and calculations, while still relevant, are more specific and thus rank lower.

5 Ethical and Societal Considerations

There are no significant ethical or societal factors that need to be taken into consideration when executing performance tests of technologies. Of course, one could argue that since the report will most likely favor one technology over the other, it has some impact on the favorability of these technologies within the cloud community. This impact should not be significant and would, in any case, be justified by the experimental evidence.

6 Evaluation of performance

6.1 MySQL database queries

The performance evaluation for MySQL database queries was performed with mysqlslap, using 30 iterations in order to gather reliable data for use in comparing response times between different KVM and Docker setups. For both KVM and Docker, four separate tests with four different setups in regard to allocated resources were performed.

Furthermore, the tests simulated 25, 50 and 75 users simultaneously making 100 queries each to the database. A simple database was automatically generated by the program and consists of one column of integers and one column of text characters. This database does not reflect a real-world database in size or complexity; however, since this work is only aimed at making comparisons between KVM and Docker, accurately simulating real-world environments is not strictly necessary.

Tests with MySQL were performed using a script to automate the process. A sample command syntax for 25 users looks as follows:

    mysqlslap --user=mysql-user --password --host=host-addr --concurrency=25 \
        --iterations=30 --number-of-queries=2500 --auto-generate-sql

However, in order to get more exact data output, the --iterations option could not be used, since it only reports minimum, average and maximum response times for all 30 iterations combined, and not the specific response time for each iteration. Therefore, the program had to be run 30 times with 1 iteration for each user configuration, using a script that saved the results to a text file. See the appendix for reference.
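With one response time per line in the resulting text file, the summary statistics can afterwards be recovered with a short awk one-liner. The snippet below is a sketch under the assumption of that file format; the file name and sample values are illustrative, and the actual collection scripts are in the appendix.

```shell
# Write three illustrative response times (seconds), one per line,
# then compute min / average / max across them with awk.
printf '1.204\n1.187\n1.251\n' > results.txt

awk '
    NR == 1 { min = max = $1 }
    { sum += $1; if ($1 < min) min = $1; if ($1 > max) max = $1 }
    END { printf "min=%.3f avg=%.3f max=%.3f\n", min, sum/NR, max }
' results.txt
```

For the three sample values above this prints min=1.187 avg=1.214 max=1.251.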

6.2 I/O performance

File I/O performance tests were performed using Sysbench. Sysbench is a benchmarking tool for Linux, capable of testing CPU, memory, file I/O and mutex performance, as well as MySQL benchmarking. For these experiments the file I/O utility was used, where tests are performed by creating temporary test files and then executing I/O operations on them. File I/O (input/output) is the process of interacting with files, reading from or writing to them. The simplest I/O operation is simply reading from a file. In a cloud environment where files are read or written frequently, high I/O performance is paramount so as to not bottleneck operations.


6.2.1 General parameters

When running tests in sysbench, one begins by preparing for the tests: specifying the type of test to run and, in the case of I/O tests, the size of the test files.

    sysbench --test=fileio --file-total-size=100G prepare

A combined file size of 100 gigabytes was used for these tests, in order to sufficiently exceed the size of the cache. One can also specify the number of files (default: 128) and the file block size (default: 16K).

After preparation is done and the files have been produced, the actual test can be run. At minimum, the mode must be specified. The mode can be sequential read, sequential write, sequential rewrite, random read, random write or combined random read/write. Max time specifies the amount of time the test is to be run, and max requests how many requests are to be made in that time (0 = unlimited). Other parameters that can be specified are the ratio between reads and writes (default: 1.5) and the I/O mode (default: sync). More parameters exist, but their impact on the test is minor.

    sysbench --test=fileio --file-total-size=$SIZE --file-test-mode=rndrw \
        --init-rng=on --max-time=300 --max-requests=0 run

After the test has been run, a cleanup command can be run to remove the produced test files.

    sysbench --test=fileio --file-total-size=100G cleanup

In addition to these I/O tests, similar tests were run with higher thread counts. A higher thread count increases concurrency, allowing more operations to be performed simultaneously. Since the drive these tests were performed on is an SSD, this increases performance. HDDs are more limited in this sense because of the single read point that HDDs employ [17].

The tests were run in Docker and KVM with different hardware limitations. The different configurations were 1 CPU and 4GB RAM, 2 CPU and 8GB RAM, and 4 CPU and 16GB RAM.

6.2.2 Sysbench in Docker

In order to run Sysbench in a container, a container running Ubuntu was built using a Dockerfile, and Sysbench was subsequently installed and run inside this Ubuntu container. The build was performed using the following Dockerfile:

    FROM ubuntu:precise
    MAINTAINER Lukas Olsson <lukas.olsson92@gmail.com>

    ADD . /bench
    RUN apt-get update
    RUN apt-get install -y sysbench

    CMD ["/bin/echo", "Please specify a test script from /bench to run."]

The build includes the directory /bench, which contains the following script that executes the Sysbench commands mentioned above.


    #!/bin/bash

    SIZE="$1"
    TRD="$2"

    sysbench --num-threads=$TRD --test=fileio --file-total-size=$SIZE prepare

    for i in {1..30} ; do
        sysbench --num-threads=$TRD --test=fileio --file-total-size=$SIZE \
            --file-test-mode=rndrw --init-rng=on --max-time=300 --max-requests=0 run ;
    done

    sysbench --num-threads=$TRD --test=fileio --file-total-size=$SIZE cleanup

The following script starts a container and runs the included script in it. It then takes the output, isolates the throughput figure, and appends it to a text file. This is repeated 30 times in order to produce reliable data.

    for i in {1..30} ; do
        docker run --rm -i -t --cpus="1" --memory="4g" benchmark /bench/bench/io.sh 100G \
            | awk '/transferred/ {print $8}' | sed 's/[()Mb\/sec]//g' >> result.txt ;
    done
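The awk/sed stage above depends on the exact layout of sysbench's summary line. As an illustration, this is how the throughput figure can be pulled out of a sysbench 0.4-style output line; the sample line is fabricated for the example, and field positions differ between sysbench versions, which is why the field number in the original script may need adjusting.

```shell
# A sample sysbench file I/O summary line (values are illustrative).
line='read 1.5938Gb  written 1.0625Gb  total 2.6562Gb  transferred 2.6562Gb  (9.0674Mb/sec)'

# The throughput is the last field; strip the parentheses and the unit.
printf '%s\n' "$line" | awk '/transferred/ {print $NF}' | tr -d '()' | sed 's|Mb/sec||'
```

For the sample line this prints 9.0674.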

This approach works well for the base I/O tests, but for the multithread tests it is unnecessarily time consuming, because the above solution recreates the test files for every test run, which is a lengthy process. Instead, three different containers with the specified hardware configurations were created. The containers were created with the command tail -f /dev/null to ensure they stayed running and that files persisted between tests.

Figure 4: Sysbench container output of ’docker ps’.

The following script was then run, running all multithreaded tests for each container and then moving on to the next one. The time for each test was reduced from 300 seconds to 180 seconds, making the entire script run for approximately fourteen hours.

6.2.3 Sysbench in KVM

After the sysbench tests were finished in Docker, similar tests were performed in KVM. Ubuntu was installed on a VM and Sysbench was then installed on the VM. The same scripts, with some minor modifications, were then run. For these tests there was less automation, since the VM had to be restarted for reallocation of hardware resources to take place. Automation could have been achieved with three differently configured VMs, but because of the size of the test files, the virtual disk storage would have exceeded the size of the host storage.

File I/O testing, by default, tests writing to and reading from a hard drive. A VM's hard drive is contained in the virtual disk storage file. There are several different virtual disk formats usable by KVM, and they offer different advantages.


• QCOW2 (QEMU Copy On Write 2) is the default disk format in KVM, and is an upgrade from the older Qcow format. It provides inherent snapshot capabilities, compression, and encryption.

• Raw is the universal and easily exportable format, usable by all emulators. Raw is a binary image format and is considered the simplest format. The Raw format can also reduce the space used on certain file systems, where only the space actually used will be reserved.

• Qed (QEMU Enhanced Disk) resembles Qcow2 in its snapshot capabilities and compact image, but also supports full asynchronous I/O and is claimed to be faster than Qcow2. [18]

There exist several other disk formats, but these are the most commonly used ones in KVM, and they were therefore included in the testing of file I/O in KVM. Additionally, for optimal I/O performance, the cache mode was set to writeback, which enables both the host page cache and the disk write cache. This gives good read and write performance, but data is not protected in case of power failure [19].
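As a sketch of how the three disks might have been prepared (the file names are illustrative; the thesis does not show the exact commands), qemu-img creates each format, while the cache mode is selected per disk in the libvirt domain XML:

```shell
# Create a 100 GB virtual disk in each of the three formats tested
# (illustrative file names; size chosen to match the sysbench test files).
qemu-img create -f qcow2 disk.qcow2 100G
qemu-img create -f raw   disk.raw   100G
qemu-img create -f qed   disk.qed   100G

# Writeback caching is then set per disk in the libvirt domain XML, e.g.:
#   <driver name='qemu' type='qcow2' cache='writeback'/>
```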

6.3 Migration time

In the experiments, the ability to migrate in an efficient way was evaluated as a valuable metric. This carries weight in a production environment because cloud structures are rarely static; storage efficiency and administration might be cause for migrating a service from one physical storage to another [20].

6.3.1 KVM Migration

KVM comes with native support for live migration. Provided some prerequisites are fulfilled, a running VM can be migrated from one physical host to another with near-zero downtime. The VM will be initiated at the receiving host and, upon success, is shut down on the sending host.

There are some requirements for KVM to be able to perform a live migration. The KVM developers describe these requirements as:

• The VM image is accessible on both source and destination hosts (located on a shared storage, e.g. using nfs).

• It is recommended that an images directory be found on the same path on both hosts (for migrations of a copy-on-write image, i.e. an image created on top of a base image using "qemu-img create -b ...").

• The src and dst hosts must be on the same subnet (keeping guest’s network when tap is used).

• Do not use the -snapshot qemu command line option.

• For the tcp: migration protocol, the guest on the destination must be started the same way it was started on the source. [21]


Furthermore, SSH must be running to establish a connection between the hosts. The commands for migration are executed in the qemu-monitor. The qemu-monitor is used to send commands to the QEMU emulator to perform tasks on the guest system, such as memory dumps or state inspection [22]. Virtual Machine Manager, which was used to manage the VMs, does not have a qemu-monitor built into it, but instead uses the virsh tool from the command line to pass qemu-monitor commands.

The facts that the virtual disk storage must be accessible on both source and destination hosts, and that both hosts must be on the same subnet, are arguably the biggest limitations of live migration (although there are no such limitations for a regular offline migration). They do, however, greatly reduce the time theoretically needed to perform the migration, since no large amounts of data have to be transmitted. The virtual disk image, which might be hundreds of gigabytes, is readily available, and only startup instructions and parameters have to be supplied by the sending host.
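To illustrate the saving, a rough back-of-the-envelope comparison can be made. The sizes and the ~119 MiB/s effective rate assumed for a 1 Gbps link are illustrative values, not measurements from the test bed:

```shell
# Rough transfer-time estimate: migrating only memory pages vs. a full
# virtual disk over a 1 Gbps link (~119 MiB/s effective; illustrative numbers)
link_mib_per_s=119
mem_mib=$((4 * 1024))      # 4 GiB of guest memory
disk_mib=$((100 * 1024))   # 100 GiB virtual disk image

echo "memory pages only: $((mem_mib / link_mib_per_s)) s"
echo "full disk image:   $((disk_mib / link_mib_per_s)) s"
```

With shared storage, only something on the order of the first figure has to cross the network.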

The migration experiments used NFS to implement shared storage between the two host machines. The only network unit involved was a switch, placing the host machines on the same subnet. To be able to run the migration tests in an automated script, SSH was set up with key-based authentication to remove password prompts.
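The key setup can be sketched as follows. The key path is illustrative, and the receiving host's user and address are the ones used in the migration script; ssh-copy-id requires one interactive password entry:

```shell
# Sketch: one-time key-based SSH setup so the migration script can run
# without password prompts (key location is illustrative)
keydir=$(mktemp -d)
ssh-keygen -q -t rsa -N "" -f "$keydir/id_rsa"

# The public key is then installed on the receiving host, e.g.:
# ssh-copy-id -i "$keydir/id_rsa.pub" lab2@192.168.13.37
ls "$keydir"
```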

The migration was performed 30 times, sending from the same host every time. The following script, run on the sending host, was used to perform the migration tests.

#!/bin/bash

COUNT=0

while [ $COUNT -lt 30 ]; do
    virsh start slaptest
    sleep 1m

    virsh -t migrate --live slaptest qemu+ssh://lab2@192.168.13.37/system >> /home/lab1/Documents/test.txt

    echo "----------------------------------------" >> /home/lab1/Documents/test.txt

    sleep 5m
    let COUNT=COUNT+1
done

The VM is started and after one minute of sleep time, the VM is migrated. Using the -t option in virsh, the migration was timed and subsequently saved to a file. The script then waits five minutes before attempting to migrate again, giving the receiving host time to shut the VM down.


The following script was run on the receiving host.

#!/bin/bash

while :
do
    if virsh list | grep -Fq slaptest
    then
        sleep 1m
        virsh shutdown slaptest
        sleep 1m
    else
        echo Nothing yet
        sleep 30s
    fi
done

This script runs an infinite loop in which it checks for the presence of the VM every thirty seconds, using virsh list and piping the output to grep -Fq slaptest. If grep returns true, the VM is shut down after one minute of sleep. The VM is then ready to be migrated again.

6.3.2 Container Migration

In contrast to KVM, Docker does not have a built-in feature for migration in the current production release. Therefore, other solutions had to be researched for live migration, with the most common being Checkpoint/Restore In Userspace (CRIU). CRIU can only be used with Docker if experimental mode is enabled [23], and is as such not officially supported. This proved to pose problems; while the live migration did work, i.e. a container could be migrated from one host to another with accurate container configuration, the migrated container did not retain the data from the source host. In this scenario a container running MySQL was migrated with configured databases and tables containing some data. The migrated container did include the MySQL installation, but none of the databases, tables or data persisted through the migration. As such, another solution that resembled the KVM migration more closely had to be researched, in order for a fair comparison to be made. The next choice fell upon Flocker, which seemed to address these problems with live migrating Docker containers. Flocker supports migration of Docker containers through its data volume orchestrator, which creates a portable Flocker data volume. This volume enables migration of the Docker container while keeping all the data intact [24]. However, as of the time of this work, Flocker's download page is not available, and this solution could therefore not be tried and used.

The third choice fell upon Virtuozzo, which is a bare-metal hypervisor combining containers, virtual machines, storage and other functionality, including migration. Virtuozzo containers are based on OpenVZ containers [25], which support migration natively [26]. Naturally, with this decision we deviated somewhat from the goals of the report, which were to compare KVM and Docker specifically. However, seeing as no suitable solution for live migration with Docker was found, Virtuozzo would still provide a comparison between virtual machines and containers, and thus fall within the scope of this work.


Virtuozzo was run inside a virtual machine with 8192 MB of memory. Installation of Virtuozzo is fast and easy with its user interface, and once it is up and running only minor configuration is needed to set up a container. The first step is adding routes on both hosts to the other machine. This also needs to be done inside the container, using the container's virtual bridge addresses. The next step is to disable the firewall on the Virtuozzo host systems and to download and enable iptables. After these steps have been completed, the containers are able to communicate with each other. Creating a container is accomplished with the command prlctl create ContainerName --vmtype ct, and to assign an IP address and DNS server the commands prlctl set ContainerName --ipadd x.x.x.x and prlctl set ContainerName --nameserver x.x.x.x are used. The container created in Virtuozzo is roughly 2 GB in size.

Figure 5: Size of a container with MySQL server in Virtuozzo.

Following that, all that is left is entering the container and setting up the desired services. In this case, MySQL was downloaded and installed, along with a test database containing data for the migration. The containers were migrated with the following script:

#!/bin/bash

COUNT=0

while [ $COUNT -lt 15 ]; do
    sleep 1m
    if prlctl list | grep -Fq ContainerName
    then
        sleep 1m
        { time prlctl migrate ContainerName usr:pwd@host ; } 2>> liveresults.txt
        let COUNT=COUNT+1
    fi
done

When a container has been migrated with Virtuozzo, it disappears from the source host. This script uses the command prlctl list to check whether a container with the specified name is present. If one is present, the script waits one minute and then migrates the container to the destination host, timing how long the migration takes and saving the time to a text file. The script runs 15 times on each host, resulting in a total of 30 migration tests.
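The { time ... ; } 2>> file construct in the script deserves a note: in bash, time is a shell keyword whose report goes to the shell's standard error, so grouping the command lets the timing report be redirected to a file. A minimal demonstration with a stand-in command:

```shell
# Demonstrating the timing idiom used in the migration script: the { ...; }
# group lets the report from the `time` keyword (on stderr) be appended to a file.
out=$(mktemp)
{ time sleep 0.1 ; } 2>> "$out"
grep real "$out"   # the timing report contains a "real" line
```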

7 Results

In the following subsections the data collected from the tests discussed in Section 6 is presented. The data is presented in graph form, grouped by the different tests, with a brief explanation of the graphs in each section.


7.1 MySQL Database Queries

Below the data from the MySQL database query tests is presented in graph form. The first test run of MySQL queries produced near-identical results for Docker and KVM. At first glance, these results were assumed to indicate that KVM and Docker performed equally well under the configured data load. Upon further investigation, it was discovered that the network unit, a switch, was bottlenecking the transactions with its 100 Mbps interfaces. The switch was then replaced with one that had 1 Gbps interfaces, matching the interface speed of the computers. The bottlenecked data is included and presented separately below.

The graphs are presented with time in seconds on the x-axis, and groupings on the y-axis by scenario (number of queries). The different bars represent the different resource configurations: 1, 2 and 4 CPUs with 4, 8 and 16 GB of RAM, respectively.


Figure 7: MySQL query performance in Docker.


Figure 9: MySQL query performance in Docker (100 Mbps throughput).

7.2 File I/O Tests

Below the results for the file I/O tests are presented. The first graph shows the data collected in the simple, single-thread scenario, displaying data from Docker and three different KVM formats (QCOW2, RAW and QED). The y-axis shows throughput in Mb/sec, and the x-axis shows labels for the different bars. The data in this graph was collected from the 4 CPU, 16 GB memory configurations.

The second and third graphs are presented much like the MySQL query graphs. The x-axis shows throughput in Mb/sec while the y-axis shows grouping by scenario (different thread counts). The different bars represent the different resource configurations: 1, 2 and 4 CPUs with 4, 8 and 16 GB of RAM, respectively. The multi-thread tests in KVM were performed on QED format virtual disks, with writeback cache settings.


Figure 10: File I/O throughput in Docker and three different KVM virtual disk formats, as mentioned in Section 6.2.3.


Figure 12: Multithread file I/O tests in Docker

7.3 Migration time

Below the results for the live migration tests are presented, with migration times for KVM and Virtuozzo shown separately. Note the difference in the y-axis label between the two graphs: migration time for KVM is given in seconds, while the time for Virtuozzo is given in minutes, showing a rather significant difference. In the graphs, each bar represents a separate data point, for a total of thirty in each graph.


Figure 14: Live container migration in Virtuozzo

8 Discussion

The following subsections discuss the results presented in Section 7, along with possible improvements and limitations encountered during the course of this work.

8.1 MySQL Database Queries

The testing of MySQL queries was fairly extensive, spanning three different configurations in the two technologies and three different sets of concurrencies, providing 18 different batches of data for comparison. Aside from that data, there is also the data that was collected while the connection was bottlenecked by the 100 Mbps switch connection.

During these tests, Docker consistently outperformed KVM in response time, although in the 2 core, 8 GB memory configuration KVM and Docker were fairly on par, with Docker exhibiting a 2.6% slower average response time than KVM in the 50 concurrency tests. In the 25 and 75 concurrency tests, neither technology outperformed the other by more than 0.1%, which can be seen as a negligible difference. In the tests with the 1 core, 4 GB memory configuration, KVM exhibited a 23% slower average response time in both the 50 and 75 concurrency tests, suggesting that in scenarios where resources are stretched thin, Docker performs better.
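For clarity, the relative differences quoted above are computed as (slower - faster) / faster. For example, with made-up averages of 1.026 s for Docker and 1.000 s for KVM:

```shell
# Relative slowdown computation behind the percentages above
# (the two averages here are illustrative, not measured values)
docker_avg=1.026
kvm_avg=1.000
awk -v a="$docker_avg" -v b="$kvm_avg" \
    'BEGIN { printf "%.1f%% slower\n", (a - b) / b * 100 }'
```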

This might be because of the added complexity in resource scheduling that KVM suffers, where each processor job in the guest system sent to the virtual CPU (vCPU) must be handled by the hypervisor before it is subsequently processed. Docker communicates directly with the host kernel, and this difference in overhead might have a noticeable impact in lower-resource scenarios.

In the bottlenecked tests, the uniform response times across different concurrencies and configurations make it evident that there was in fact a bottleneck, and that KVM and Docker were both processing requests at line speed, well within their limits.


8.2 File I/O Tests

File I/O performance was evaluated based on average throughput in each test run. Initial testing, as mentioned in Section 6.2.1, showed Docker outperforming KVM in all cases (see Figure 10). There are a few configurations to take into consideration as far as disk format goes. Compared to the QCOW2 format in KVM, Docker performed at more than three times the rate, with a 214% higher throughput. Comparing QCOW2 against the RAW and QED formats showed RAW and QED performing significantly better than QCOW2, with a throughput rate just shy of 40% higher. QED, which has been referred to as a format with accelerated I/O, did perform 1.3% better than RAW. However, a T-test¹ comparing the two datasets shows a p-value of 0.09, so the result of this comparison is not reliable enough to be conclusive. While the difference between QCOW2 and RAW/QED certainly is an improvement, Docker still showed a throughput 50% higher than RAW/QED.
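The exact T-test procedure used is not reproduced here, but as an illustration of the idea, a Welch-style t-statistic can be computed from two samples. The numbers below are made up purely for demonstration:

```shell
# Welch t-statistic for two illustrative samples of five values each
awk 'BEGIN {
    split("10 11 9 10 12", a); split("13 14 12 15 13", b); n = 5
    for (i = 1; i <= n; i++) { sa += a[i]; sb += b[i] }
    ma = sa / n; mb = sb / n                          # sample means
    for (i = 1; i <= n; i++) { va += (a[i]-ma)^2; vb += (b[i]-mb)^2 }
    va /= n - 1; vb /= n - 1                          # sample variances
    printf "t = %.2f\n", (ma - mb) / sqrt(va/n + vb/n)
}'
```

A large-magnitude t (here far from zero) corresponds to a small p-value; for the RAW/QED data the p-value of 0.09 means the samples were not clearly distinguishable.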

When looking at the multi-thread tests, KVM fared comparatively better, with a throughput equaling Docker's at 64 threads in the 1 CPU, 4 GB memory configuration. Docker outperformed KVM in all other testing scenarios, notably at higher resource configurations, where Docker showed much less falloff in throughput and even an increase in some scenarios. Docker displayed a throughput almost double that of KVM in several scenarios. What KVM did display was more reliability in throughput. Over all multi-thread tests, Docker had an average standard deviation of 11.28, with a single standard deviation as high as 21.3, whereas KVM had an average standard deviation of 3.11. KVM also displayed a more predictable pattern in relative throughput when comparing configurations and thread counts. This suggests that while Docker exhibited a better ability to perform file I/O at higher thread counts, which would be expected, KVM is more reliable in expected throughput.

8.3 Migration Tests

KVM migration was fairly straightforward once the shared storage was set up. The fact that shared storage had to be set up for KVM migration to work also meant the migration itself would be very fast, seeing as the actual VM, along with its virtual hard drive, did not have to be transferred. The only things being migrated are the files containing the VM configuration parameters, which are very small in size. The main time-consuming operation is the transfer of memory pages and the subsequent startup of the VM on the destination host.

As mentioned previously in Section 6.3.2, Virtuozzo containers were used for container migration instead of Docker containers. The container migration was significantly slower than the VM migration, although in both cases the downtime was nevertheless very low. In the case of Virtuozzo migration, the container continued to run on the source host while the live migration was taking place, thus providing minimal downtime. With that said, since the container migration is an order of magnitude slower than its VM counterpart, a solid case could be made favoring KVM over Docker in a live migration scenario.

One possible improvement for the Virtuozzo migration that was not investigated in this work is using shared storage. One can draw the conclusion, though, that even in the case of shared storage being used, Virtuozzo would still be significantly slower than KVM. This can be concluded since the Virtuozzo container is less than 2 GB in size, as can be seen in Figure 5, and a 1 Gbps throughput switch was used for the migration, which would equate to a total transfer time of about 18 seconds according to techinternets.com.

¹ A T-test is a statistical analysis used to determine whether two datasets are likely to be from the same underlying population, and whether they are significantly different. The function returns a probability value (p-value). A p-value under 0.05, meaning a >95% certainty that the datasets are significantly different, is the generally accepted cut-off for results to be considered sufficiently different to place confidence in. However, the p-value and this seemingly arbitrary limit have been, and continue to be, a topic of controversy in the scientific community. [27, 28]

8.4 Security aspects

Security within the two technologies was not evaluated through testing, but it is an important factor, especially in the context of cloud architectures where resources are shared. A VM's gateway to the outside world is through its hypervisor, which imposes limitations on the VM. This makes VMs inherently isolated and secure by nature of their design, as stated in ”An updated performance comparison of virtual machines and linux containers”. The authors go on to state that ”[The VM] is not a panacea, since a few hypervisor privilege escalation vulnerabilities have been discovered that could allow a guest OS to ’break out’ of its VM ’sandbox’”. [2]

One would be hard pressed to find a perfectly secure system. Developers and hackers constantly play catch-up, one finding security holes and the other sealing them, so it is no surprise that VMs contain vulnerabilities. Hypervisor privilege escalation attacks specifically target VMs, while general attacks such as viruses can be evaluated inside a VM, where the potential damage an attack can cause is limited. But in accordance with the aforementioned catch-up game, with virtualization being a very common practice in cloud architectures and other IT areas, viruses are being designed to target VMs at an increasing rate. [29]

Containers don't have the same level of isolation as VMs do. An application running in a VM can only communicate with the guest kernel, whereas a container communicates directly with the host kernel. While this gives containers less overhead, it also makes the container, and subsequently the system as a whole, more vulnerable. Docker does isolate containers through process isolation, filesystem isolation, IPC isolation, network isolation, device isolation and the limiting of resources. These restrictions help mitigate attacks such as Denial-of-Service, which is prevented through Docker's use of cgroups, limiting the resources any one container can consume. The paper Analysis of Docker Security [30] noted that the default bridged network model in Docker is vulnerable to ARP spoofing and MAC flooding attacks, since no filtering is applied to the bridge. It also noted that containers run as privileged gain full permissions and have few restrictions imposed on them, making them similar to normal processes running on the host.

While both technologies have their limitations, it should be noted that many methods and best practices exist for hardening both VMs and containers. Technologies such as AppArmor [31] and best practices such as consistent patching and upgrading [29] can go a long way in protecting the virtualized environment. With all this, both VMs and containers can be considered secure, considering that human error is the cause of most breaches [32].

8.5 Other factors

Aside from the items discussed above, there are of course many factors that can play a role in choosing one technology over the other. Scalability is certainly of importance when implementing virtualization in the cloud, an area where containers do very well. The paper ”Container-based operating system virtualization: a scalable, high-performance alternative to hypervisors” [33] from 2007 investigated the differences in performance and concluded that ”Our experiments indicate that container-based systems provide up to 2x the performance of hypervisor-based systems for server-type workloads and scale further while preserving performance.”. Already ten years ago containers showed this tendency. While KVM can safely support overcommitting CPU at a rate of five virtual CPUs to each physical CPU [34], containers' use of cgroups and resource scheduling in the host kernel seems to scale well at higher densities and demands.

Another factor worth mentioning is ease of use, where VM hypervisors do well. KVM and software like it have been around for some time, and its design lends itself well to learning and understanding virtualization as a whole. It is, in its simplest case, a matter of running an operating system inside another operating system. KVM also supports graphical user interfaces, which makes managing and configuring VMs easy and gives clear oversight of available machines. Docker, on the other hand, is slightly more difficult to deploy and use, especially when it comes to creating custom images. The method of virtualization that Docker (and containers in general) employs is also less intuitive than VM hypervisors. While some graphical user interfaces exist for Docker, they are not always as readily available as for KVM, and in some cases are neither as user friendly nor offer the same functionality. The GUIs for Docker are third-party software and almost exclusively used to search for and download complete images, which means that in any case, custom images still have to be created manually by the user.

9 Conclusions

                 KVM    Docker    ”Winner”
MySQL Queries     X       √        Docker
File I/O          X       √        Docker
Live migration    √       X        KVM
Scalability       X       √        Docker
Ease of use       √       X        KVM
Security          √       X        KVM

Table 2: Performance and functionality overview.

Comparing VMs and containers, and more specifically KVM and Docker, is a comparison of a well-established technology and a relative newcomer on the market. In the context of cloud architecture, containers display some great advantages simply by nature of their lightweight design. When an important motivation behind virtualization is utilizing resources efficiently, containers are a good choice, especially for use cases such as Software-as-a-Service, where fewer resources are spent on unnecessary overhead. This is noted in Docker: a Software as a Service, Operating System-Level Virtualization Framework by John Fink, where he states that Docker is a virtualization framework focused around running applications and not around emulating hardware. He further notes that ”...operating system level virtualization is about applications, not machines”, suggesting that containers excel in such applications as previously mentioned [35]. This can be a valuable asset, but it must also be weighed against potential trade-offs. In the testing conducted for this report, containers proved to perform on par with, and even better than, virtual machines. This being in comparison with KVM, which is in itself a quite lightweight virtualization tool. The area where containers truly fell behind was live migration, which is not yet easily available or efficient in Docker or container technology in general. While testing showed that live migration of containers in Virtuozzo took several minutes, compared to under twenty seconds in KVM, the containers showed nearly no downtime during migration. The issue can also be circumvented in a production environment by scheduling downtime to allow for offline migration.

With Docker, and tools like it, containers as a virtualization solution are also becoming increasingly accessible and easy to use. As far as security goes, the slight advantage containers might hold is that they are less prominent in production, and so might be less of a target for attacks. One could speculate that there might be a shift in this regard in the coming years, as containers gain popularity. Even though Docker takes many precautions to be secure, it is by design not as contained and secure as VMs are. Nesting containers in VMs could be a tactic to isolate clusters of containers.

To conclude this report, containers would be a good choice for cloud structures. This rings especially true in Software-as-a-Service settings, where low overhead paired with high performance is key. Of course, the benefits must be assessed for each individual use case, but containers have shown themselves to be versatile enough, and to perform well enough, to fit most scenarios.

10 Related Work

Although research into this area, including research specifically comparing KVM and Docker, has been done extensively in the past, it has usually been done with access to high-performance hardware and not necessarily with lower-performance hardware. Furthermore, past research has often been focused on one or only a few areas. The aim of this work is to investigate a broader area and compare metrics on hardware more likely to be accessible to smaller companies like startups, or in other scenarios with a smaller budget, to see how viable the two technologies can be for such applications. Examples of related work are the paper referenced in this report, ”An updated performance comparison of virtual machines and Linux containers” [2], and another by a South Korean team, ”Performance Comparison Analysis of Linux Container and Virtual Machine for Building Cloud” [36]. Other examples include ”Resource Virtualization for Real-time Industrial Clouds” [37], which compared OpenVZ containers to XEN virtual machines. In that work, however, the author concluded that OpenVZ would be better suited for real-time implementations, stating that it provided less downtime and faster migrations compared to XEN. This contrasts with the results from the tests provided within this work, although this work compared Virtuozzo containers to KVM. With that said, Virtuozzo containers are nevertheless based on OpenVZ containers, but no comparison of performance between Virtuozzo containers and OpenVZ containers, or between KVM and XEN, was made. Virtuozzo was also run inside a virtual machine, which resulted in more overhead. Therefore, no conclusion regarding the accuracy of either of these results can be made on the basis of these two works.

11 Future Work

This work was focused mainly on performance metrics, in terms of e.g. how quickly a VM or container can be migrated and how quickly MySQL queries can be processed. For the migration tests, while the downtime caused by the migration was taken into some consideration, no sophisticated way of measuring it was employed. In future work, this could be a very important metric to keep track of, seeing as minimal downtime in services provided by companies like Amazon and Google has proved to have big implications when it comes to sales and profits, as mentioned in the thesis Resource Virtualization for Real-time Industrial Clouds [37]. Furthermore, an in-depth look into the security of KVM and Docker would also be a very important factor to compare when choosing the right technology. If more time were available, more comparisons would have been made between different VMs and containers as well, not limiting the scope to just KVM and (mainly) Docker as was the case for this work.


References

[1] B. Wootton. (2017, January) Who's using docker? [Online]. Available: https://www.contino.io/insights/whos-using-docker

[2] W. Felter, A. Ferreira, R. Rajamony, and J. Rubio, “An updated performance comparison of virtual machines and linux containers,” in 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), March 2015, pp. 171–172.

[3] B. Golden, Virtualization for dummies. John Wiley & Sons, 2011.

[4] R. Adair, A Virtual Machine System for the 360/40, ser. IBM Cambridge Scientific Center report. International Business Machines Corporation, Cambridge Scientific Center, 1966.

[5] G. J. Popek and R. P. Goldberg, “Formal requirements for virtualizable third generation architectures,” Commun. ACM, vol. 17, no. 7, pp. 412–421, 1974. [Online]. Available: http://doi.acm.org/10.1145/361011.361073

[6] P. Kasireddy, “A beginner-friendly introduction to containers, vms and docker.” 2016. [Online]. Available: https://medium.freecodecamp.com/a-beginner-friendly-introduction-to-containers-vms-and-docker-79a9e3e119b#.448r2l4cc

[7] F. De Silva, “Maverick* research: Peer-to-peer sharing of excess it resources puts money in the bank,” 2013. [Online]. Available: https://www.gartner.com/doc/2594717/maverick-research-peertopeer-sharing-excess

[8] M. Day, “Kvm myths - uncovering the truth about the open source hypervisor.” 2012. [Online]. Available: https://www.ibm.com/developerworks/community/blogs/ibmvirtualization/entry/kvm_myths_uncovering_the_truth_about_the_open_source_hypervisor?lang=en

[9] A. Shah, “Ten years of kvm,” 2016. [Online]. Available: https://lwn.net/Articles/705160/

[10] “About docker.” 2017. [Online]. Available: https://www.docker.com/company

[11] M. Shaw, “The coming-of-age of software architecture research,” in Proceedings of the 23rd international conference on Software engineering. IEEE Computer Society, 2001, p. 656.

[12] L. M. Given, The Sage encyclopedia of qualitative research methods. Sage Publications, 2008.

[13] A. Kopytov, “Scriptable database and system performance benchmark.” 2017. [Online]. Available: https://github.com/akopytov/sysbench

[14] “mysqlslap load emulation client.” 2017. [Online]. Available: https://dev.mysql.com/doc/refman/5.7/en/mysqlslap.html

[15] “Download ubuntu desktop.” [Online]. Available: https://www.ubuntu.com/download/desktop

[16] petarmaric, “petarmaric/docker.cpu-stress-test.” [Online]. Available: https://hub.docker.com/r/petarmaric/docker.cpu-stress-test/

[17] A. Kopytov, “Sysbench manual.” [Online]. Available: http://imysql.com/wp-content/uploads/2014/10/sysbench-manual.pdf

[18] “Qemu emulator user documentation.” [Online]. Available: http://download.qemu.org/qemu-doc.html

[19] “Best practice: Kvm guest caching modes.” [Online]. Available: https://www.ibm.com/support/knowledgecenter/en/linuxonibm/liaat/liaatbpkvmguestcache.htm

[20] Z. Xiao, W. Song, and Q. Chen, “Dynamic resource allocation using virtual machines for cloud computing environment,” IEEE transactions on parallel and distributed systems, vol. 24, no. 6, pp. 1107–1117, 2013.

[21] KVM, “Migration — kvm,” 2015. [Online]. Available: https://www.linux-kvm.org/index.php?title=Migration&oldid=173268

[22] “Qemu monitor.” [Online]. Available: http://download.qemu.org/qemu-doc.html#pcsys_005fmonitor

[23] “Docker,” March 2017. [Online]. Available: https://criu.org/Docker

[24] “What is flocker?” [Online]. Available: https://clusterhq.com/flocker/introduction/

[25] “Main page.” [Online]. Available: https://openvz.org/Main_Page

[26] “Checkpointing and live migration,” December 2015. [Online]. Available: https://openvz.org/Checkpointing_and_live_migration

[27] Y. Dodge, The concise encyclopedia of statistics. Springer Science & Business Media, 2008.

[28] R. L. Wasserstein and N. A. Lazar, “The asa's statement on p-values: context, process, and purpose,” Am Stat, vol. 70, no. 2, 2016.

[29] A. Khan, “Virtual machine security,” International Journal of Information and Computer Security, vol. 9, no. 1-2, pp. 49–84, 2017.

[30] T. Bui, “Analysis of docker security,” arXiv preprint arXiv:1501.02967, 2015.

[31] “Sdb:apparmor.” [Online]. Available: https://en.opensuse.org/SDB:AppArmor

[32] J. Scott, “Vm-aware viruses on the rise,” October 2012. [Online]. Available: http://www.computerweekly.com/news/2240169662/VM-aware-viruses-on-the-rise

[33] S. Soltesz, H. Pötzl, M. E. Fiuczynski, A. Bavier, and L. Peterson, “Container-based operating system virtualization: a scalable, high-performance alternative to hypervisors,” in ACM SIGOPS Operating Systems Review, vol. 41, no. 3. ACM, 2007, pp. 275–287.

[34] “Overcommitting virtualized cpus.” [Online]. Available: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Administration_Guide/form-Virtualization-Overcommitting_with_KVM-Overcommitting_virtualized_CPUs.html

[35] J. Fink, “Docker: a software as a service, operating system-level virtualization framework,” Code4Lib Journal, vol. 25, 2014.

[36] K.-T. Seo, H.-S. Hwang, I.-Y. Moon, O.-Y. Kwon, and B.-J. Kim, “Performance comparison analysis of linux container and virtual machine for building cloud,” Advanced Science and Technology Letters, vol. 66, no. 105-111, p. 2, 2014.

[37] S. S. Sheuly, “Resource virtualization for real-time industrial clouds,” Master's thesis.


A Appendices

A.1 Database query scripts

The following scripts were used for the database query performance tests. They are fairly simple and straightforward; each script runs 30 times for each user configuration and saves the results to a .csv file.

A.1.1 KVM

#!/usr/bin/env bash

COUNTER1=0

while [ $COUNTER1 -lt 30 ]; do
    expr $COUNTER1

    mysqlslap --user=user --password='password' \
        --host=ip-address --auto-generate-sql --concurrency=25 \
        --number-of-queries=2500 --csv=kvm.csv

    let COUNTER1=COUNTER1+1
done

echo "-------------------------------" >> kvm.csv

COUNTER2=0

while [ $COUNTER2 -lt 30 ]; do
    expr $COUNTER1 + $COUNTER2

    mysqlslap --user=user --password='password' \
        --host=ip-address --auto-generate-sql --concurrency=50 \
        --number-of-queries=5000 --csv=kvm.csv

    let COUNTER2=COUNTER2+1
done

echo "-------------------------------" >> kvm.csv

COUNTER3=0

while [ $COUNTER3 -lt 30 ]; do
    expr $COUNTER1 + $COUNTER2 + $COUNTER3

    mysqlslap --user=user --password='password' \
        --host=ip-address --auto-generate-sql --concurrency=75 \
        --number-of-queries=7500 --csv=kvm.csv

    let COUNTER3=COUNTER3+1
done

Listing 1: Mysqlslap script for KVM, written in bash. Consists of a loop that runs 30 times for each configuration of 25, 50 and 75 users, with 100 queries per user. The script uses the mysqlslap command to create a database, query it and measure the query time. The results are saved to a .csv file, with each user configuration separated by a dashed line.
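As a possible follow-up (not part of the original test scripts), the kvm.csv output could be summarized with a short helper. This is only a sketch: it assumes the average run time lands in the third comma-separated field of mysqlslap's --csv output, which should be verified against the mysqlslap version in use; the dashed separator lines fail the numeric filter and are ignored.

```shell
#!/bin/bash
# Hypothetical helper: average the per-run times recorded by mysqlslap --csv.
# Assumption: the average run time (seconds) is the third comma-separated
# field; separator lines appended by the test script are skipped.
mean_query_time() {
    awk -F',' '$3 ~ /^[0-9.]+$/ { sum += $3; n++ }
               END { if (n) printf "%d runs, mean %.3f s\n", n, sum / n }' "$1"
}

if [ -f kvm.csv ]; then
    mean_query_time kvm.csv
fi
```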


A.1.2 Docker

A.2 I/O performance scripts

sysbench --test=fileio --file-total-size="100G" prepare

Listing 2: Prepare command to create files for File I/O test in Sysbench

sysbench --test=fileio --file-total-size=$SIZE --file-test-mode=rndrw \
    --init-rng=on --max-time=300 --max-requests=0 run

Listing 3: Run command to perform File I/O tests on the previously generated files by the prepare command in Sysbench.

sysbench --test=fileio --file-total-size="100G" cleanup

Listing 4: Cleanup command to remove files used for File I/O test in Sysbench

A.2.1 KVM

#!/bin/bash

sysbench --test=fileio --file-total-size="100G" prepare

echo "1 cpu 4g mem" >> result2.txt
for i in {1..30}; do
    sysbench --num-threads=2 --test=fileio --file-total-size="100G" \
        --file-test-mode=rndrw --init-rng=0 --max-time=180 --max-requests=0 run |
        awk '/transferred/ {print $8}' | sed 's|[()Mb/sec]||g' >> result2.txt
done

echo "1 cpu 4g mem" >> result16.txt
for i in {1..30}; do
    sysbench --num-threads=16 --test=fileio --file-total-size="100G" \
        --file-test-mode=rndrw --init-rng=0 --max-time=180 --max-requests=0 run |
        awk '/transferred/ {print $8}' | sed 's|[()Mb/sec]||g' >> result16.txt
done

echo "1 cpu 4g mem" >> result32.txt
for i in {1..30}; do
    sysbench --num-threads=32 --test=fileio --file-total-size="100G" \
        --file-test-mode=rndrw --init-rng=0 --max-time=180 --max-requests=0 run |
        awk '/transferred/ {print $8}' | sed 's|[()Mb/sec]||g' >> result32.txt
done

echo "1 cpu 4g mem" >> result64.txt
for i in {1..30}; do
    sysbench --num-threads=64 --test=fileio --file-total-size="100G" \
        --file-test-mode=rndrw --init-rng=0 --max-time=180 --max-requests=0 run |
        awk '/transferred/ {print $8}' | sed 's|[()Mb/sec]||g' >> result64.txt
done

sysbench --test=fileio --file-total-size="100G" cleanup

Listing 5: Sysbench script for KVM, written in bash. Consists of four loops that each run 30 times, performing the File I/O test with 2, 16, 32 and 64 threads. Furthermore, three VM configurations with different resources allocated to them were used. The results were saved to a separate .txt file for each thread count, with an echo line marking the configuration.

A.2.2 Docker

#!/bin/bash

docker start Sys1
docker exec Sys1 sysbench --test=fileio --file-total-size="100G" prepare

for i in {1..30}; do
    docker exec Sys1 sysbench --num-threads=2 --test=fileio \
        --file-total-size="100G" --file-test-mode=rndrw --init-rng=on \
        --max-time=180 --max-requests=0 run |
        awk '/transferred/ {print $8}' | sed 's|[()Mb/sec]||g' >> result2.txt
done
echo "2 cpu 8g mem" >> result2.txt

for i in {1..30}; do
    docker exec Sys1 sysbench --num-threads=16 --test=fileio \
        --file-total-size="100G" --file-test-mode=rndrw --init-rng=on \
        --max-time=180 --max-requests=0 run |
        awk '/transferred/ {print $8}' | sed 's|[()Mb/sec]||g' >> result16.txt
done
echo "2 cpu 8g mem" >> result16.txt

for i in {1..30}; do
    docker exec Sys1 sysbench --num-threads=32 --test=fileio \
        --file-total-size="100G" --file-test-mode=rndrw --init-rng=on \
        --max-time=180 --max-requests=0 run |
        awk '/transferred/ {print $8}' | sed 's|[()Mb/sec]||g' >> result32.txt
done
echo "2 cpu 8g mem" >> result32.txt

for i in {1..30}; do
    docker exec Sys1 sysbench --num-threads=64 --test=fileio \
        --file-total-size="100G" --file-test-mode=rndrw --init-rng=on \
        --max-time=180 --max-requests=0 run |
        awk '/transferred/ {print $8}' | sed 's|[()Mb/sec]||g' >> result64.txt
done
echo "2 cpu 8g mem" >> result64.txt

docker exec Sys1 sysbench --test=fileio --file-total-size="100G" cleanup
docker stop Sys1

docker start Sys2
docker exec Sys2 sysbench --test=fileio --file-total-size="100G" prepare

for i in {1..30}; do
    docker exec Sys2 sysbench --num-threads=2 --test=fileio \
        --file-total-size="100G" --file-test-mode=rndrw --init-rng=on \
        --max-time=180 --max-requests=0 run |
        awk '/transferred/ {print $8}' | sed 's|[()Mb/sec]||g' >> result2.txt
done
echo "4 cpu 16g mem" >> result2.txt

for i in {1..30}; do
    docker exec Sys2 sysbench --num-threads=16 --test=fileio \
        --file-total-size="100G" --file-test-mode=rndrw --init-rng=on \
        --max-time=180 --max-requests=0 run |
        awk '/transferred/ {print $8}' | sed 's|[()Mb/sec]||g' >> result16.txt
done
echo "4 cpu 16g mem" >> result16.txt

for i in {1..30}; do
    docker exec Sys2 sysbench --num-threads=32 --test=fileio \
        --file-total-size="100G" --file-test-mode=rndrw --init-rng=on \
        --max-time=180 --max-requests=0 run |
        awk '/transferred/ {print $8}' | sed 's|[()Mb/sec]||g' >> result32.txt
done
echo "4 cpu 16g mem" >> result32.txt

for i in {1..30}; do
    docker exec Sys2 sysbench --num-threads=64 --test=fileio \
        --file-total-size="100G" --file-test-mode=rndrw --init-rng=on \
        --max-time=180 --max-requests=0 run |
        awk '/transferred/ {print $8}' | sed 's|[()Mb/sec]||g' >> result64.txt
done
echo "4 cpu 16g mem" >> result64.txt
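The result*.txt files produced by the scripts above mix throughput samples with "N cpu Xg mem" marker lines. A small helper along these lines (a sketch, not part of the original test setup) could compute the mean throughput per file:

```shell
#!/bin/bash
# Sketch: mean throughput (Mb/sec) per result file. Lines that are not a bare
# number (e.g. the "2 cpu 8g mem" markers) fail the numeric filter and are skipped.
summarize_results() {
    for f in "$@"; do
        [ -f "$f" ] || continue
        awk '/^[0-9.]+$/ { sum += $1; n++ }
             END { if (n) printf "%s: %d samples, mean %.2f Mb/sec\n", FILENAME, n, sum / n }' "$f"
    done
}

summarize_results result2.txt result16.txt result32.txt result64.txt
```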
