
Master thesis

Master of Science Programme in Interaction Technology and Design, 300 credits
Degree Project in Interaction Technology and Design, 30 credits

Spring term 2021

BUILDING OCI IMAGES WITH 
 A CONTAINER ORCHESTRATOR

A comparison of OCI build-tools

Jonas Sjödin

Examiner: Ola Ringdahl
Supervisor: Paul Townend

External Supervisor: Mattias Persson & Anders Sigfridsson (Omegapoint)


Abstract

Cloud computing is a quickly growing field in modern computing science where new technologies arise every day. One of the latest trends in cloud computing is container-based technology, which allows applications to run in a reproducible and stateless fashion without requiring manually installed dependencies. Another trend in computer science is DevOps, a methodology where developers take part in the operations process. DevOps popularises the use of CI/CD workflows, where automatic pipelines run tests and scripts on new code. A container orchestrator, like Kubernetes, can be used to control and modify containers. Kubernetes allows integrating multiple third-party applications that can monitor performance, analyze logs, and much more. Kubernetes can be integrated into the CI/CD system to utilise its container orchestration perks. Building containers inside a container can cause security issues because of native security flaws with OCI build tools.

This thesis aims to look at these issues and analyse the field of container orchestrated OCI build tools using Kubernetes and OCI build tools. We also discover how to develop a test suite that can reliably test container orchestrated OCI build tools and export metrics. The thesis lastly compares different Dockerfile-compliant build tools with the test suite to find out which has the best performance and caching. The compared build tools are BuildKit, Kaniko, Img and Buildah, and overall BuildKit and Kaniko are the fastest and most resource-effective build tools. It is not obvious which build tool is the most secure. Kaniko, which runs as a root container, requires no privileges and is therefore tough to break out of, but an eventual breakout will give the attacker root access to the host machine. BuildKit and Img only require unconfined seccomp and AppArmor, which makes a container breakout more probable, although less so than Buildah, which must be run in a privileged container. Since they can run rootless, the attacker will only have the same access to the host as that user in case of a container breakout.


Acknowledgements

Throughout the writing of this thesis, I have received incredible help and support from my supervisors. I would first like to thank my supervisor at Umeå University, Paul Townend, for your help in guiding me in structuring the work and report. Your expertise in scientific writing was invaluable and helped me in organising the whole thesis experience. I would also like to thank my supervisors at Omegapoint, Mattias Persson and Anders Sigfridsson, for their support. They helped me immensely with inspiration and tips about the field and guided me whenever there were any problems.


Contents

1 Introduction
1.1 Aims and Objectives
1.1.1 Limitations
1.2 Background
1.3 Thesis Structure
2 Literature study
2.1 Cloud
2.1.1 Definitions and background
2.1.2 Common architectures
2.1.3 Virtual Machines
2.1.4 Containers
2.1.5 Growth in popularity of containers
2.2 Kubernetes
2.2.1 Definitions and background
2.2.2 Security
2.2.3 Performance
2.2.4 Caching
2.3 DevOps
2.3.1 Definitions and background
2.3.2 CICD
2.3.3 CICD in Kubernetes
2.4 Build tools
2.4.1 Definitions and background
2.4.2 Popular build tools
2.4.3 Security
2.5 Research questions
2.5.1 Security
2.5.2 Performance
2.5.3 Summary
3 Solution Design
3.1 Automation and Reproducibility
3.2 OCI Build tools
3.3 Setting up the Cluster
3.3.1 Applications
3.3.2 Terraform configuration
3.4 Test suite
3.4.1 Client
3.4.2 Server
3.4.3 Build tool Configuration
3.5 Criteria of Success
4 Implementation
4.1 Script
4.1.1 Configuration
4.1.2 Workflow
4.2 Kubernetes
4.3 Server
4.3.1 Configuration
4.3.2 Routes
4.3.3 Endpoints
4.4 Client
5 Results
5.1 Experiment Design
5.2 Testing specifications
5.3 Test runs
5.3.1 Minimal Build
5.3.2 Large Copy
5.3.3 Javascript Project
6 Evaluation
6.1.1 Test 1 - Minimal Image
6.1.2 Test 2 - Large Copy
6.1.3 Test 3 - Production Image
6.1.4 Security
6.2 Test Suite
6.3 Aims and Objectives
6.4 Experienced problems
7 Conclusion
7.1 Personal Reflection
7.2 Future work
7.2.1 Implementation Improvements
7.3 Finishing words
References
A Configuration file


1 Introduction

Cloud computing1 is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources. It was popularised in 2006 when Amazon released its cloud platform AWS. Cloud computing allowed users to rent virtual machines to run internet-facing production services. Before the cloud became viable, companies needed to provide their own infrastructure, making it harder and costlier to maintain and scale, especially for smaller players [5]. Over the years, more cloud platforms have emerged, and they have become more and more user-friendly. With new hosting solutions like the serverless design architecture [22], it is now possible to deploy an application directly on a cloud platform while requiring almost no operations knowledge.

However, cloud-based virtual machines are still relevant because of their broad use cases. Because they run a complete operating system [30], they can be configured to run nearly any application and configuration. When running in a cloud platform, they also provide easy backup management and creation. However, virtual machines are not resource-efficient since they require virtualization of an entire operating system (OS) to function. This leads to slow creation and boot times and a significant performance overhead.

Containers attempt to solve the problem of running an isolated application. They do so by sharing their kernel with the host and only running the most necessary OS applications [18], resulting in a tiny footprint. An example is the alpine container image, a Linux container image that is less than 3MB2, compared to its virtual machine counterpart, which is around 130MB3. Containers allow fast creation, fast boot times and a minimal performance overhead.

Enter the Open Container Initiative (OCI), an open-source specification for Linux containers [8]. It enables the usage of different container runtimes and container build-tools. Modern web services often need to scale onto multiple nodes to maintain adequate performance and availability. Container orchestrators solve this by managing the network and scheduling of the running services4. From a user's perspective, a container orchestrator makes a cluster of nodes seem like a single entity.

One commonly used container orchestrator is called Kubernetes. It is an open-source, declaration-based container orchestrator which handles a wide range of use cases.

A popular workflow in modern development is Continuous Integration (CI) [23].

1NIST. The NIST Definition of Cloud Computing. https://csrc.nist.gov/publications/detail/sp/800-145/final (visited 2021-05-23)

2Alpine. Alpine Container Image. https://hub.docker.com/_/alpine. Visited 2021-02-11

3Alpine. Alpine Virtual Machine Image. https://www.alpinelinux.org/downloads/. Visited 2021-02-11

4Kubernetes. Kubernetes. https://kubernetes.io/. Visited 2021-02-11.


In CI, it is common to run automatic tests on all committed code. If the tests pass, the application is built and pushed to a registry. Running the CI jobs in a container orchestrator is often preferable [9] when running a container-based workflow. In such a workflow, the built application is a container image, built by an OCI build-tool. These build-tools do not usually run well in containers unless given extended privileges, which is suboptimal for security reasons.

1.1 Aims and Objectives

This thesis aims to investigate the field of container orchestrated OCI build-tools. It compares multiple different build-tools by performance, caching, cost, security and UX. They are evaluated by reading articles and papers and by running custom tests.

The paper attempts to answer the following research questions:

Table 1 Aim 1. Investigate how securely an OCI build tool can run.

No. Objective

1 Find out what container privileges OCI build tools require.

2 Research if OCI build tools must be run as the root user.

3 Evaluate the security issues of containers.

Table 2 Aim 2. Investigate the performance of OCI build tools.

No. Objective

1 Evaluate running container orchestrated OCI build tools.

2 Find out how to measure the performance of containers.

3 Investigate how to develop a test suite for OCI build tools.

4 Find out which are the most prominent OCI build tools.

1.1.1 Limitations

This paper compares different orchestrated container build-tools. It will not compare how different container runtimes impact build-tools. One solution for running build-tools in a container is sysbox [25]. It is a container runtime which uses OS-virtualization and userspace applications to make containers function as virtual machines. It allows containers to run software like Systemd, Docker, and Kubernetes, which usually is impossible without changing some security features.

1.2 Background

The case study for this thesis is Omegapoint. It is a Swedish IT consulting firm specialized in cybersecurity. They are moving a lot of their software infrastructure to the cloud and embracing a more container-based workflow. This thesis helps Omegapoint in deciding which OCI build-tool to use in container orchestrated environments.

Omegapoint is an advanced consulting partner of Amazon and therefore provides credentials to Amazon’s cloud platform AWS, the platform used to run the tests.

Omegapoint also provides a sounding board and professional consultation for this thesis.

1.3 Thesis Structure

The thesis starts in chapter 2 with a literature study where scientific papers, articles and manuals are studied. It does so by describing the fields of cloud, containers, DevOps, Kubernetes, OCI build tools and usability analysis. The chapter also summarises the earlier work, which influences the rest of the thesis.

It continues in chapter 3 with a solution design which uses the conclusions of the literature study to create a solution. The chapter also specifies how to design the different solution components. It displays the system's overall functionality and components in a high-level figure.

The thesis continues with describing the implementation of the designed solution in chapter 4. It discusses how each component works at a low level and other specifics of the implementation. It continues to chapter 5, which presents the testing results. The chapter also explains the results and how to read them.

An evaluation of the results is available in chapter 6. It discusses the impact of the results and concludes how the different build-tools perform. The chapter also recommends which build-tool performs the best in the different test cases. The thesis ends with a conclusion in chapter 7, which discusses if and how the results are satisfactory. Lastly, further work in the field is discussed, and a personal reflection on the project is presented.


2 Literature study

Computer science is a vast field with much information available from different sources. In this chapter, the thesis examines the domain to make sure that the research question is viable and that the design of the implementation is substantiated. In modern web services, the cloud plays a central role. It is a well-researched area, but much material is not published in research papers because of the area's ever-changing state. Instead, software documentation and company websites are often the primary places for finding information. This can render some of the sources of this paper invalid in case a Git repository or a web page is moved or deleted.

2.1 Cloud

NIST defines cloud computing1 as a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources. It is an essential part of a modern software stack. Over the last few years, it has grown in popularity, and in 2020 cloud computing had a global revenue of almost 400 billion USD2. It has become a cost-effective on-demand service where users can run internet-facing software [28]. It is no longer necessary for companies to manage their infrastructure; instead they pay a cloud provider for their services. This change allows for cheap horizontal scaling capabilities and easy use of services such as hardware load balancing and DNS management.

2.1.1 Definitions and background

In the sky, a cloud consists of a large number of small drops of water. On their own, each drop of water is insignificant, but when enough of them come together they produce something: rain. The same goes for the "cloud" in Cloud Computing. One small server cannot do much, but when multiple servers work together, they create a "cloud" [1]. The cloud refers to a network and its infrastructure, whether that is a private or a public one. A cloud is most often available on the internet in some form.

Public clouds are reachable by users through their web browsers. Users hire the infrastructure or service that they desire and only pay for the duration they use them, often reducing operating costs.

1NIST. The NIST Definition of Cloud Computing. https://csrc.nist.gov/publications/detail/sp/800-145/final (visited 2021-05-23)

2Statista. Cloud computing market revenues worldwide from 2015 to 2020. https://www.statista.com/statistics/270811/cloud-computing-revenue-worldwide-since-2008/ Visited 2021-02-17.


There are public cloud companies of all sizes, and some common ones are Amazon AWS, Google Cloud and Microsoft Azure. Since an external cloud provider manages public clouds, the user must trust that the provider has taken sufficient security measures to prevent intruders. Some companies, therefore, require a private alternative.

Private clouds are both managed and used by an organization. This allows a company to have full control over both the hardware and software infrastructure. Private clouds, or on-premise clouds as they are often called, are by design limited to specific users, i.e. all developers and operations people at a company. Therefore, this cloud solution is often desirable for banking organizations and other companies where security is the number one priority.

2.1.2 Common architectures

Cloud platforms usually provide three primary architectures: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) [24]. They all solve different problems and give the user a different amount of control over the system.

Infrastructure as a Service gives the user the most control over a system. It allows a user to create virtual or physical servers, load balancers, networks, and more. The user can then install whatever operating system they desire onto those servers and tweak them to their liking. An IaaS user is responsible for managing updates, taking security measures and maintaining the servers.

In Platform as a Service, the platform provides the underlying infrastructure and application requirements to the user. The user does not need to update the platform but only install its application and configure it to its needs. An example of PaaS is a managed Postgres database. The provider manages the application and its infrastructure, and the user only configures it to meet its needs. The platform often also manages scaling, security and updates.

Software as a Service is a way of distributing software where the user does not need to care about the underlying infrastructure or requirements. A user only configures the platform with their desired application, and the platform manages it by itself. In this architecture, the user cannot configure networking and scaling to the same extent as in IaaS since SaaS handles this automatically.

IaaS, PaaS and SaaS all have their place in the market, but they solve different problems. IaaS gives full control, PaaS considerable control and SaaS the least control of a system. Which one to use depends on the context.

2.1.3 Virtual Machines

One of the critical components of cloud computing is virtualization. It allows multiple operating systems (OS) to run simultaneously on the same server by emulating the hardware of a computer [27]. The hardware specification of a virtual machine (VM) is configurable, allowing different VMs on the same host to perform better or worse. Virtual machines are often the base of modern cloud deployments because of their built-in host isolation. Virtual networks are also configurable, allowing VMs on the same or different hosts to communicate. Since virtual machines run in software, their creation and management can easily be automated, which is commonly done in cloud environments. VMs need to virtualize an entire OS and its hardware, and because of that, they have a slow startup time and a significant runtime overhead. Since virtualization is slow, an alternative is to use containers, which do not virtualize nearly as much.

2.1.4 Containers

Containers do not use full operating system virtualization but operating-system-level virtualization [26]. A container does not need to virtualize the hardware and all syscalls since it shares its kernel with the host machine. This functionality avoids a lot of the performance issues of VMs. It enables almost instant startup times but can come at a security cost. If the container runtime has security or isolation issues, it is possible to break free and access the host machine. By default, a container does not have full privileges to the host, limiting the attack surface. Containers are also often not run as the root user, further preventing these security issues. Many container implementations specify stateless containers, allowing a different workflow where the state usually is managed externally or by an orchestrator. The statelessness and low overhead of containers make it possible to develop and test the same code that runs in production.

2.1.5 Growth in popularity of containers

One of the most common container formats is OCI [8]. It specifies a public interface for how containers should function and has allowed the creation of custom container runtimes and build tools. A common approach to container technology is to create one container per application, effectively creating a sandboxed environment. This approach makes scaling individual services easy. According to the Cloud Native Computing Foundation's (CNCF) report from 2019 [4], running containers in production has increased in popularity. By CNCF's measures, more than 80% use containers in development, testing, staging and production environments. These statistics suggest that containers will be used frequently in the future.

2.2 Kubernetes

The previous section clearly shows that the cloud is a popular and flexible alternative for running internet-facing software. It also shows that OCI containers are often preferable to virtual machines when developing applications because of their low overhead, statelessness and easily scriptable interfaces. When these container workloads scale and more and more components need to interact, many problems can occur. It is often preferable to use a container orchestrator [12] to mitigate these issues, that is, a tool that automates the deployment, management, scaling and networking of containers. The most popular one is called Kubernetes (K8s) [32].


2.2.1 Definitions and background

Kubernetes was initially designed and developed by engineers at Google [12] and is now an open-source project available at Github [20]. Kubernetes integrates with the cloud provider it runs on, allowing automatic issuing of hardware load balancers and persistent storage. It uses a declarative configuration model [19] where YAML manifests are written and applied to create resources. The orchestrator has some predefined API resources, but it is also extendable with custom resources.

Kubernetes' smallest component is a Pod. A pod is a schedulable piece of software that contains at least one container, and its definition configures the pod's environment and status. To schedule a pod, Kubernetes uses a deployment or a stateful set. A deployment is a configuration that defines how many replicas of a specific pod to run and what resources they should have. Kubernetes' job is to make sure that what is defined is the current state of the system, and if one pod goes down, Kubernetes brings it back up. A stateful set works similarly but with a few key differences. Stateful sets have predictable names and start in order, whereas a deployment's pods start simultaneously. Both can handle data persistency by mounting persistent volumes.

The horizontal pod autoscaler resource automatically scales a deployment to a number of replicas within a specified interval. It does so by parsing the resource limit requirements and deciding if it should scale up or down. When running applications, it is essential to configure them differently in different environments. In a Kubernetes environment, config maps and their encrypted relatives, secrets, manage all pod configuration. They can be mounted as files or environment variables onto pods to dynamically provide a pod with its desired configuration.
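
As a minimal sketch of this declarative style (the name, image and replica count are placeholders, not values from the thesis setup), a deployment manifest could look like the following:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app            # placeholder name
spec:
  replicas: 2                  # Kubernetes keeps two pods of this template running
  selector:
    matchLabels:
      app: example-app
  template:                    # pod template used for every replica
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: example-app
          image: nginx:1.21    # placeholder container image
          resources:
            requests:
              cpu: 100m        # resource requests used for scheduling and autoscaling
              memory: 128Mi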

In Kubernetes, services exist to allow pods to communicate with each other by load balancing incoming connections onto pods. There are also headless services that do not have an IP address but allow communication with specific pods. To make a workload reachable from the internet, Kubernetes uses ingresses or hardware load balancers. An ingress in Kubernetes typically manages DNS services and SSL certificates. It also performs SSL termination and proxies the request to a specific service.

To isolate different applications, Kubernetes uses namespaces. Together with correctly configured service accounts, it is possible to limit a deployment's communication to a specific namespace or specific resources, as sketched below.
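
The following is an illustrative sketch where the namespace, names and the read-only rule set are assumptions rather than configuration taken from the thesis:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: build-runner            # placeholder service account
  namespace: ci                 # placeholder namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: ci
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]   # only pods and their logs, nothing else
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: build-runner-pod-reader
  namespace: ci
subjects:
  - kind: ServiceAccount
    name: build-runner
    namespace: ci
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io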

Kubernetes is extendable with third-party applications which integrate log management and parsing, resource monitoring, and security scanning, to name a few. All of this allows Kubernetes to manage high-performance and high-availability applications.

2.2.2 Security

It is essential to configure firewalls so that the cluster nodes' ports used for internal communication are not publicly available on the internet [19]. Even after that, there are many security issues when running Kubernetes clusters, but these issues are often easy to mitigate if configured correctly. Kubernetes is only as secure as its container runtime and the privileges of its running containers. It is therefore essential to only give the containers their necessary privileges. When a pod has extended privileges that allow more syscalls, many security issues arise. NCC Group investigated the area of container privileges [13]. They showed security issues with containers, and the issues increase in number with extended privileges. It is also vital to run pods as ordinary users instead of root to limit a hijacked container's attack surface. Pod isolation is achievable by giving each deployment a service account that limits the pod's communication to a specific namespace or specific resources. Sometimes these security issues cannot be easily prevented because a specific workload requires some extra privileges.
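
As a hedged sketch of these recommendations (the pod name, image and user id are placeholders), a pod can be forced to run unprivileged and as an ordinary user like this:

apiVersion: v1
kind: Pod
metadata:
  name: restricted-example          # placeholder
spec:
  containers:
    - name: app
      image: alpine:3.13            # placeholder image
      securityContext:
        runAsNonRoot: true          # refuse to start if the image would run as root
        runAsUser: 1000             # run as an ordinary user instead of root
        allowPrivilegeEscalation: false
        privileged: false           # no host-level access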

2.2.3 Performance

Kubernetes performance can be hard to reason around since it depends on a lot of different factors [19]. One of those factors is the container runtime. Different runtimes give different performance results in different use cases. Espe, Anshul, Podolskiy and Gerndt [6] showed that CRI-O with runc was the fastest for most use cases, but containerd with runc performed better for IO intensive workflows.

Another performance factor is the number of cluster nodes and their hardware specifications. Kubernetes' extensibility enables many different monitoring tools to be used. Two of those tools are Metrics Server and Prometheus. Metrics Server is a scalable service that scrapes container metrics. It is useful for autoscaling applications. Prometheus3 is a generic scraper configurable to scrape cluster metrics (IO, CPU, RAM) and application-specific metrics via custom integrations.
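
A minimal sketch of a Prometheus configuration that discovers and scrapes pods through the Kubernetes API could look as follows; the job name and interval are arbitrary choices, not the values used in the thesis setup:

global:
  scrape_interval: 15s              # how often targets are scraped
scrape_configs:
  - job_name: kubernetes-pods       # arbitrary job name
    kubernetes_sd_configs:
      - role: pod                   # discover every pod via the Kubernetes API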

2.2.4 Caching

By default, Kubernetes runs stateless containers [19], meaning the container's content is not stored at all after exit. Storing data can be done in many different ways. It is either saved to a mounted disk, pushed to an external registry or written to STDOUT. Kubernetes caches the running container images locally. When it needs to pull a new image, Kubernetes first checks if the image is present; if not, it downloads the missing layers.

2.3 DevOps

Because of Kubernetes' large and complicated ecosystem, inexperienced users can find it very difficult to deploy applications. To help with this, users can utilize a methodology called DevOps. It is an automation approach that integrates the developers into the operations workload [10]. It represents a set of ideas where a software feature goes from development to testing to production environments, often somewhat automatically. The approach often encourages declarative configuration, making it a good fit for Kubernetes. One of DevOps' key benefits is Continuous Integration (CI) and Continuous Deployment (CD) pipelines. CI/CD pipelines can run code analysis tools, tests or custom integrations automatically on each commit.

3Prometheus. Prometheus (Website). https://prometheus.io/. Visited 2021-02-11.


2.3.1 Definitions and background

The waterfall model is an approach to project management where each task is performed in sequence [31]. First, the requirements are specified, then a solution is designed, implemented, verified, and lastly maintained. It was long the prominent model for project management in software development, but due to its inability to accommodate quick changes, many companies decided to phase it out. A new, more flexible project management model came along called agile development. It is an iterative approach that focuses on swift deliveries and quick actions on new changes.

With this new project management model, the way software was developed changed. Previously, the software was developed by developers and later maintained by operations personnel. This new method, with quick iterative cycles and deliveries at small intervals, required that the software developers know operations management, and so DevOps was born. DevOps has promoted new testing and development environments, like containerization and automatic testing, by including the developers in the operations work.

2.3.2 CICD

Continuous Integration and Continuous Delivery are two core DevOps concepts [11] that focus on automation. They attempt to solve issues that may arise when multiple developers work on the same project. CICD pipelines allow developers to automatically run functions on their codebase on each commit. These functions can be linting, formatting, testing or whatever the environment requires. These pipelines are often well integrated with the Version Control System (VCS), allowing workflows where only correct code can be submitted to a specific staging or release branch.

Continuous Integration focuses on verifying that the code is correct before merging it to the main development branch, and because of its simplicity, it is usually implemented at least to some extent in many projects. On the other hand, Continuous Delivery often requires more work to implement because it needs to be integrated with the production environment, which may be tedious depending on that environment. Laukkanen, Itkonen and Lassenius [21] performed a literature study in 2016, where they concluded that implementing Continuous Delivery can be troublesome because of problems with the system and build design. They do, however, not discuss whether the issues are present in container-based systems, indicating that it is a somewhat unexplored subject.

2.3.3 CICD in Kubernetes

We now know that CI/CD can improve the developer experience and ensure good software quality. Setting it up can, however, be a tedious task. To lessen the burden, a container orchestrator like Kubernetes can be used, which allows the DevOps workflow to be integrated with the testing, staging and production environment. It allows pipeline jobs to be run inside Kubernetes, effectively enabling the testing and staging environments to be identical to the production environment. It also allows the CI/CD jobs to leverage Kubernetes' large ecosystem with security scanning, logging and metrics. In Kubernetes, there is tooling that simplifies the deployment stage by integrating into the Kubernetes cluster, enabling easy automation and scripting.

Using Kubernetes for the pipeline jobs makes it possible to have a declarative configuration that is identical in development, testing, staging and production. That is one reason why Kubernetes is part of the Cloud Native Computing Foundation's (CNCF) trail map to cloud-native software4.

2.4 Build tools

From the previous sections, we know that running containers in the cloud is preferred because it lessens the operations burden and makes maintaining a reproducible environment easier. We also know that using CI/CD pipelines can help with ensuring code quality and software security. In CI/CD pipelines, it is common to build software that is pushed to a registry. In a container landscape, the software is built within containers and pushed to a container registry. There exist many different build-tools for this job, designed with different aspects in mind. Some focus on performance and some on security. Running build tools inside an unprivileged container may also be troublesome.

2.4.1 Definitions and background

An OCI container image build tool is a piece of software that builds OCI containers. They are built according to the OCI Image specification5, and as long as the specification is followed, it enables interoperability. An OCI build tool builds images by creating a manifest containing metadata about the image content and dependencies, with references to the built layers. It also contains a configuration that includes information such as application arguments and environments. To run the container from its image, its manifest is used to reference and assemble the different layers. The configuration is used to configure the behaviour of a specific instance of the container image.

When running OCI build tools inside containers, they often require more privileges than what is given to a container by default. Giving extended privileges to a container may increase the attack surface in case of a compromised container. Therefore, it is important to give containers only the necessary privileges and take precautionary actions if more privileges are required. Some build tools require running as the root user to get access to all devices and syscalls, which may also be a security issue in a compromised container.

2.4.2 Popular build tools

When many people think of building OCI containers, they think of Docker. It is an application that builds and runs OCI containers, but there exist many alternatives.

4CNCF. Introducing the Cloud Native Landscape 2.0 - interactive edition. https://www.cncf.io/blog/2018/03/08/introducing-the-cloud-native-landscape-2-0-interactive-edition/ Visited 2021-02-28.

5Open Container Initiative. Image Format Specification. https://github.com/opencontainers/image-spec/blob/master/spec.md Visited 2021-02-28


One of the benefits of Docker is the use of Dockerfiles, small scripts that create a container image from specific instructions. Dockerfiles can extend other previously built images and even build an image in multiple stages, allowing a small final image footprint. Docker cannot run in daemonless mode, and since running a daemon is a big problem for OCI containers, Docker instead needs a separate service container running Docker, making the setup process more tedious. Akihiro Suda, the creator of the OCI build tool BuildKit, held a presentation about next-generation container image build tools at the Open Source Summit in Japan 2018 [29]. There he discussed other build tools that work as drop-in replacements for Docker and how they solve many issues with concurrency and running inside a container.

Table 3 shows OCI container image build tools and their usage information. BuildKit [3], Img [14], Buildah [2] and Kaniko [17] are alternatives to Docker, which can build OCI container images from Dockerfiles. They all support running in daemonless mode inside containers, allowing easy orchestration by Kubernetes.

Table 3 OCI container image build tools

Build tool   Frontends          Privileges                      User
Docker       Dockerfile         Privileged                      Root
BuildKit     Dockerfile         Unconfined seccomp & AppArmor   Rootless
Img          Dockerfile         Unconfined seccomp & AppArmor   Rootless
Buildah      Dockerfile         Privileged                      Rootless
Kaniko       Dockerfile         None                            Root
Jib          JVM Build system   None                            Rootless
Umoci        None               N/A                             N/A

BuildKit was originally a breakout project from Docker, which aimed to improve the build speed by allowing concurrent builds of layers. It is now an upstream of Docker, which allows full concurrency on multi-stage builds. It supports both local caching to/from a directory and remote caching with a container registry. BuildKit does require unconfined seccomp and AppArmor, but it can run rootless. Img is a downstream of BuildKit, allowing similar functionality. It aims to provide an interactive frontend, mimicking the Docker CLI. It has the same security profile as BuildKit, meaning it requires unconfined seccomp and AppArmor, but it can run rootless. It also supports both remote cache in a container registry and local cache to/from a directory. Buildah is a Red Hat build tool alternative that is by default integrated into Podman. It is the recommended OCI container system on Red Hat platforms like CentOS and Fedora. Because of its Red Hat backing, it can provide enterprise-grade support. It supports mounting volumes into the container it is building, which can speed up a compilation process with a build-environment-specific cache. It must be run in a privileged container, but it can be run rootless. Buildah can only use local cache to/from a directory. Kaniko is a build tool built by developers at Google, which requires no extended privileges but must be run as root. It can only cache to/from an external registry.
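
To illustrate how such a build tool can be orchestrated, the following is a rough sketch of a Kubernetes pod running a Kaniko build; the destination image, workspace volume and registry secret are placeholders and not the exact manifests used by the test suite:

apiVersion: v1
kind: Pod
metadata:
  name: kaniko-build                               # placeholder
spec:
  restartPolicy: Never
  containers:
    - name: kaniko
      image: gcr.io/kaniko-project/executor:latest
      args:
        - --dockerfile=Dockerfile
        - --context=dir:///workspace               # build context from the mounted volume
        - --destination=docker.io/username/app:latest   # placeholder destination image
        - --cache=true                             # use the remote registry as layer cache
      volumeMounts:
        - name: workspace
          mountPath: /workspace                    # contains the cloned repository
        - name: registry-credentials
          mountPath: /kaniko/.docker               # docker config.json with registry login
  volumes:
    - name: workspace
      emptyDir: {}                                 # the thesis mounts a persistent volume here instead
    - name: registry-credentials
      secret:
        secretName: regcred                        # assumed pre-created secret
        items:
          - key: .dockerconfigjson
            path: config.json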

OCI containers consist of multiple layers, but they do not have to be built from a Dockerfile. Jib [16] is a build tool that builds containers differently by integrating with the build manifest config. It is built for the JVM platform and only works with tooling like Maven or Gradle. It compiles the code on the host with the provided build system and then copies the compiled bytecode binary and any other specified files onto a final base image. It does not require any privileges or even running as the root user because it does not run commands in the container. It just copies files into the final image's file system and edits its manifest file. It can, therefore, not build complete container images from scratch; it needs a base image. Because of its integration with other build systems, it handles caching the same way as its build system, whether that is Maven or Gradle. If Jib runs in a container, its build system cache can be used by mounting a volume to store cache between runs.

Umoci [15] is an unusual OCI container image build tool. It does not use Dockerfiles and does not integrate into a build environment like Jib. Instead, it can unpack and edit container image layers in a CLI-like manner. It does not run commands within a container. Instead, the commands are run on the host, and umoci edits layers and copies files into the finished container via the host. When unpacking an image, the image file system is unpacked into a directory. This allows the directory to be used as an ordinary file system on the host machine. It can also edit the configuration manifest to set workdir, the container image working directory, and other options. Umoci cannot retrieve or distribute OCI images and recommends using an external tool, such as skopeo6.

2.4.3 Security

Analyzing the build tools described in section 2.4.2 shows that different container image build tools require different privileges and setups to function in a container. These can range from requiring a privileged container runtime, to having unconfined access to a few key components/devices, to just running as the root user. All of these options can impose different security issues if a container is compromised or goes rogue. Many container runtimes, including Docker, require running as the root user to function correctly. If the container is compromised, the main target is often to get control over the host system [7].

Root user. A container runs as a specified user. It can be the root user or an ordinary user. If a container breakout occurs, the attacker becomes the same user on the host operating system as it is in the container. Therefore, if a container runs as root and the host machine's root user runs the container runtime, a container breakout will give the attacker root access to the host. Therefore, containers should not be run as root if not required.

Privileged. Privileged containers have the same access to the host machine as an ordinary process running on the host [7]. This means that they can access all devices, including hard drives, on the host machine if the running user has access. A privileged container also has all isolation techniques such as cgroups, AppArmor and seccomp turned off. If the root user runs the container runtime, and the root user runs the privileged container that is compromised, the attacker has root access to the host machine.

AppArmor & seccomp. AppArmor is a feature built into the Linux kernel [19] which extends the default user and group-based permissions to enable more fine-grained access control of resources. seccomp is also a feature built into the Linux kernel.

6Containers. Skopeo (Github). https://github.com/containers/skopeo (visited 2020-03-07)


It sandboxes the privileges and restricts the syscalls of running applications and containers. These features can be used in Kubernetes to limit the access a running container has to its host. When a pod has unconfined access to one of them, it bypasses some security features of the container runtime. However, it is not as bad as a privileged container, which bypasses AppArmor and seccomp but also bypasses a lot more security features. Still, it increases the attack surface towards the host machine, especially when run as the root user.
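
As a sketch of what such extended privileges look like in a pod specification (the pod name, container name and image tag are illustrative), unconfined seccomp and AppArmor can be requested like this:

apiVersion: v1
kind: Pod
metadata:
  name: rootless-build                 # placeholder
  annotations:
    # disable the default AppArmor profile for the container named "builder"
    container.apparmor.security.beta.kubernetes.io/builder: unconfined
spec:
  containers:
    - name: builder
      image: moby/buildkit:rootless    # illustrative rootless build tool image
      securityContext:
        runAsUser: 1000                # rootless: an ordinary user inside the container
        seccompProfile:
          type: Unconfined             # disable the default seccomp filter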

2.5 Research questions

We now know that cloud computing is often preferable to other ways of deploying software because of its ease of use and competitive pricing. When deploying software, containers are useful for development, testing and deployment. However, they do have some issues with scaling and networking, and it is therefore often beneficial to use a container orchestrator, which solves these issues. Kubernetes is a popular open-source container orchestrator which allows for custom integrations and extensions. It makes it possible to easily run namespace-isolated custom workloads with log and security analysis. Because of Kubernetes' declarative approach, it is a good fit for DevOps. In Kubernetes, it is possible to run CI/CD pipelines that utilise Kubernetes' grand ecosystem.

It is common to build and push software in CI/CD pipelines. If the services are container-based, this may be troublesome because container image build tools sometimes require running with additional privileges or as the root user, introducing security issues. Configuring these privileges is done differently on different platforms, but CLIs often print cryptic error messages if they lack the correct privileges. The build tools usually support similar core features: building and pushing an OCI container image. If a build tool lacks a feature, such as remote caching, it requires the user to implement it. This can impact the usability of a build tool and make it harder to use for first-time users.

2.5.1 Security

From section 2.2.2 and section 2.1.4 we now know how to test the objectives of our first aim, "Investigate how securely an OCI build tool can run". To answer the question, we can compare the container image build tools' required privileges and discuss their security impact. Another security aspect of containers is which user they run as.

2.5.2 Performance

We also know how to test the objectives of the second aim, "Investigate the performance of OCI build tools", from section 2.2.3 and section 2.2.4. Kubernetes provides great resource control where CPU and memory usage can be displayed. This can be done with scrapers such as Prometheus or Metrics Server. The caching performance can be tested by analysing the cached image layers' size and their impact on build speed. To build a test suite, we can create a client-server application with a web interface that runs workloads on Kubernetes. It can test a specific container image build tool by running a deployment in the cluster. It can then analyse the results by checking Kubernetes metrics software and displaying the results in its GUI. For easy usage in multiple different clusters, the testing suite can be configured with a configuration file. The suite can also output its testing data to allow easy plotting of graphs.

2.5.3 Summary

This chapter has discussed everything from cloud computing to containers to DevOps. From the literature study, it is clear that we now know how to answer the research questions defined in chapter 1. This information will be used in the later chapters to argue why specific methods have been chosen and why the tests have been designed the way they have.


3 Solution Design

From the previous chapter, we know that it is possible to answer the research questions. To answer them, it is necessary to create a test suite that, when correctly configured, runs specific tests with a specified build-tool. This chapter aims to specify how the test suite should function and how it should be used. The following two components will need to be developed:

• A setup script that sets up the testing environment, which in our case is a Kubernetes cluster. The script should also install all required applications onto the cluster, such as the test suite and monitoring software.

• A test suite that can run tests on OCI build tools.

3.1 Automation and Reproducibility

Many different cloud providers provide a managed Kubernetes platform. This thesis will use Amazon Web Services (AWS) and their Kubernetes distribution EKS because of its maturity and popularity. Currently, EKS only supports Docker with containerd as its container runtime interface (CRI). We know from our literature study that different CRIs can perform differently in different situations, so the results in this thesis will only be reproducible with the same CRI. Otherwise, the choice of Kubernetes distribution should not have any impact on how the container image build tools compare to each other because of the containers' cloud-agnostic functionality. However, using a different cloud provider with differently configured nodes could impact the tests' speed depending on the hardware specification of the nodes.

When performing scientific work, it is crucial that it is possible to reproduce the results. To make this possible, automation is a key component of this master thesis, and it is done with Linux scripting and the infrastructure provisioning tool Terraform.

Terraform1 is an infrastructure as code (IaC) tool. It can provision all kinds of different components, including cloud infrastructure and their running applications.

It is configured with a declarative configuration language called HashiCorp Configuration Language (HCL), which supports variables, iterations and custom executable shells. Terraform can be extended with many different custom providers, allowing a declarative configuration of package managers, DNS services and cloud providers, among others. Terraform makes it possible to write configuration which declares the desired state of our infrastructure and then apply it. Terraform then communicates over REST calls with the cloud platform and creates the desired resources automatically.

1HashiCorp. Terraform Website. https://www.terraform.io/ Visited 2021-03-08


Terraform is used to make the automation experience smoother. It uses the AWS provider2 for provisioning an EKS Kubernetes cluster. Helm3 is used to provision the applications onto the cluster.

3.2 OCI Build tools

Many different OCI container image build tools function differently, but to narrow the thesis scope, we will only test the performance of daemonless Dockerfile-compliant build tools. This is done because containers cannot run a daemon and simultaneously run other software, and Dockerfiles are a very commonly used standard when building containers. The chosen build tools are therefore Kaniko [17], BuildKit [3], Img [14] and Buildah [2]. One of the aims of the thesis is to compare the most prominent build tools in the field. They are found by browsing popular sources for container technology like Github, forums, blogs and reports. These sources also present two other Dockerfile-compliant build tools: Docker and Orca-Build4. Docker requires another instance running in another container to function in a container orchestrator; it cannot simply be started and run within a single container. Orca-Build is another project which builds container images from Dockerfiles by using Umoci, runC and skopeo. Its last update was in 2017, meaning the project has not received any development in four years, making it obsolete.

The proposed test suite implementation will be generic, and the build tool information will not be hardcoded, to encourage further testing of other build tools. The test suite should be configurable with a configuration file to make this possible. YAML is the markup language used in Kubernetes and is therefore chosen as the test suite configuration file format. The configuration file configures build tools, allowing custom ones to be added at any time. To test the build tools against each other, the configuration file also supports test cases, which are Git repositories that contain a Dockerfile. Git is chosen for test repositories because of its widespread usage in the development ecosystem. Cloning a Git repository may take much time, so it is not done during the test itself. Instead, it is done before the test in a separate container which saves the cloned repository to a volume that can be mounted to a running build tool.

3.3 Setting up the Cluster

A setup script whose functionality is visualized in Figure 1 is developed to set up a cluster and install the required applications, including the test suite. The script first applies the Terraform configuration to create a Kubernetes cluster. It then builds a container image of the test suite, which is pushed to a container registry. Then the Helm package manager installs the desired applications (Prometheus/Grafana and the test suite) onto the Kubernetes cluster.

2HashiCorp. AWS Provider. https://registry.terraform.io/providers/hashicorp/aws Visited 2021-03-08

3Helm. Helm. https://helm.sh/ Visited 2021-03-08

4Cyphar. Orca-Build (Github). https://github.com/cyphar/orca-build (visited 2021-04-10)


A Dockerfile for running the script is also developed. This is done to limit the number of required dependencies to a container build tool and a container runtime (like Docker or Podman). The setup script uses Docker images while running so that the user does not need to install all dependencies and can focus on what is important: testing build tools.

Figure 1: Cluster setup architecture

3.3.1 Applications

Prometheus and Grafana are not required for the test suite to function, but the test results cannot be analysed without them, rendering the results useless. Prometheus scrapes data from Kubernetes, creating a database of metrics which it updates at a set interval. An ingress server is installed to make the test suite and Grafana publicly available. In this case, it is Nginx, a popular alternative which is recommended on the Kubernetes website [19]. Ingress servers make it possible for Kubernetes to parse a URL and forward the request to the correct service.

3.3.2 Terraform configuration

To be able to set up a Kubernetes cluster, Terraform configuration has to be developed. It consists of multiple files which declaratively define the desired state of the infrastructure. The configuration specifies the required IAM users, the Kubernetes specification, the cluster nodes and their applications.


3.4 Test suite

The main implementation component is the test suite which is able to take multiple different build tools of any type in its configuration. The test suite is divided into a web application and a server. The web application provides an interface for running tests. The server provides the web application and a REST API, which enables running tests on the cluster. The architecture is visualized in Figure 2.

Figure 2: Kubernetes Cluster architecture

3.4.1 Client

The test suite is used by navigating the web application through a web browser. It enables full control of the test suite and its configuration. It makes it possible to run tests, view results and change the configuration, making it a single source of truth for the test suite. It is written in TypeScript5 with the React6 framework. TypeScript is chosen because of its maturity and type safety, and its interoperability with JavaScript libraries. React is chosen because of its widespread usage and its great community providing open-source components. The client provides the following features; the client's workflow is visualized in Figure 3.

• Run a specific test for a build tool

• View test repositories

• View tests which have run, are running and are going to run.

• View test pod logs

• Edit configuration in YAML

• Link to the result visualization in Grafana

• Show feature comparison of build tools with explanation

5TypeScript. What is TypeScript? https://www.typescriptlang.org/ Visited 2021-03-11.

6React. React. https://reactjs.org/ Visited 2021-03-11.


Figure 3: Client Web App Features

3.4.2 Server

The other part of the test suite is a backend server. It is written in Kotlin7 because of its modern syntax and good interoperability with Java, allowing the test suite to use Java libraries. It is chosen over Java because of its modern and minimalist syntax. The server's main objective is to run the build tool tests in the Kubernetes cluster. It does so by using a queue. When a new test should be run, it is added to the queue; another thread then fetches queued tests and runs them sequentially in a producer-consumer fashion.

The server's runtime is configurable through environment variables, and the tests are configurable with the previously discussed YAML configuration file. The server serves a JSON-based REST API with support for both WebSockets and standard HTTP calls. WebSockets make it possible to stream data in both directions, allowing the server to continuously update the client with new information about the cluster state and enabling the client to follow pod logs. The REST API provides the following endpoints:

• Login to the server

• Get the configuration

• Update the configuration

• Add a test to the test queue

• Get the test queue

7Kotlin. https://kotlinlang.org/ (visited 2021-05-11)


• Follow a pod's logs

• Follow the cluster state. (View all test repositories, view all run test jobs, get generic information)

3.4.3 Build tool Configuration

The test suite tests generic OCI container image build tools. Which build tools to test is configured through a YAML configuration file accessible and editable through the web interface. It supports both build tools that use Dockerfiles and build tools that do not. To add a build tool, it is required to configure the container's required privileges and other security and usability information, as illustrated by the sketch below.
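
The real schema is listed in Appendix A; purely as an illustration (all field names below are hypothetical, not the suite's actual schema), such a configuration could pair build tools with test repositories like this:

buildTools:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:latest
    privileged: false                  # no extended privileges required
    runAsRoot: true                    # must run as the root user
  - name: buildah
    image: quay.io/buildah/stable
    privileged: true                   # needs a privileged container
    runAsRoot: false                   # can run rootless
testRepositories:
  - name: minimal-build
    gitUrl: https://github.com/example/minimal-build.git   # placeholder repository
    dockerfile: Dockerfile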

3.5 Criteria of Success

We know from the literature study that it is possible to answer the research questions. The tests are successful if the following is achieved:

• The test suite can run build tools in Kubernetes.

• The test suite can measure the performance of a build tool.

• The test suite can be configured with custom test cases.


4 Implementation

The implementation consists of three main components: a setup script that sets up the testing environment, a graphical web client which is controllable by a user, and a server that communicates with the running Kubernetes cluster.

4.1 Script

A standard Linux BASH script sets up the testing environment, and its only dependencies are Docker and standard shell utilities like sed1 and xargs2. The script is configurable with a file called .env which contains environment variables. It is described in section 4.1.1. The script first builds the setup Docker image that contains all the necessary binaries to run the setup scripts, which provision the cluster and its applications. Then it creates the infrastructure and builds the test suite container images. Lastly, it provisions the applications onto the cluster. To run the script and set up the test suite and an AWS EKS cluster, the user only needs to run ./run.sh.

When the script has finished, the user can enter Grafana or the client web app by entering their printed URL. To delete the infrastructure, run ./run.sh --destroy.

4.1.1 Configuration

The setup script is configurable with environment variables. They configure the caching container registry, Grafana, the test suite's images and container registry, the web client credentials and the AWS credentials. An example .env file is available in listing 4.1.

Listing 4.1: Example .env file

# URL to the caching container registry.
REGISTRY_URL=https://index.docker.io/v1/
# Username to the caching container registry.
REGISTRY_USERNAME=username
# Password to the caching container registry.
REGISTRY_PASSWORD=password
# Prefix of the container registry.
REGISTRY_PREFIX=docker.io/username
# Password to the Grafana Dashboard.
GRAFANA_PASSWORD=password
# Test suite client Image name.
SUITE_CLIENT_IMAGE=username/cbtt-client
# Test suite server Image name.
SUITE_SERVER_IMAGE=username/cbtt-server
# Username to the client web app.
CLIENT_USERNAME=admin
# Password to the client web app.
CLIENT_PASSWORD=password
# AWS Access Key ID.
AWS_ACCESS_KEY_ID=someaccesskeyid
# AWS Secret Access Key.
AWS_SECRET_ACCESS_KEY=somesecretaccesskey

1CLI which edits text in a scriptable manner.

2CLI which can split a string at spaces to create multiple arguments, useful when piping data onto arguments in shell scripts.

4.1.2 Workflow

The script uses the workflow in listing 4.2, and its functionality is also visualized in Figure 4.

Listing 4.2: Script Algorithm

Enable exit on error
Read env vars
Cleanup earlier runs
Build setup OCI image
Run create infrastructure container
Setup AWS credentials
Initialize terraform
Provision the cluster with terraform
Install ingress-nginx to the cluster
Copy kubeconfig from exited infrastructure container to host
Build and push test suite client container image
Build and push test suite server container image
Run install cluster app container
Wait for the IP of the ingress load balancer
Install the test-suite onto the cluster
Install Prometheus and Grafana

4.2 Kubernetes

The AWS EKS Kubernetes Terraform configuration is standard and minimal. It is inspired by the recommended default configuration, available on the AWS website3. It sets up a Kubernetes cluster with two worker nodes and nothing else.

3 AWS Admin. From Zero to EKS with Terraform and Helm. AWS (website). https://aws.amazon.com/blogs/startups/from-zero-to-eks-with-terraform-and-helm/


Figure 4: Low level setup script information

The script installs Grafana and Prometheus and specifies that Prometheus should scrape metrics every second. It also installs the test suite Helm chart.

The test suite Helm chart consists of a deployment that runs the client and server container images. The deployment is configured by a ConfigMap containing the default server configuration YAML file and by a Secret that provides the necessary environment variables for the running pod. The test suite deployment also mounts a persistent volume through a persistent volume claim to make it possible to change the configuration.

When the test suite first boots, it writes the YAML configuration present in the ConfigMap to the persistent volume. The persisted copy is then used from that point on and is not replaced on later boots, which makes it possible to edit the configuration.
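As an illustration, this first-boot behaviour can be sketched in Kotlin as a simple copy-if-missing operation. The mount paths used below (a ConfigMap mounted at /config and the persistent volume at /data) are assumptions for the sketch and not necessarily the paths used by the test suite.

import java.nio.file.Files
import java.nio.file.Paths
import java.nio.file.StandardCopyOption

// Assumed mount points: the ConfigMap is mounted read-only at /config,
// the persistent volume at /data.
private val defaultConfig = Paths.get("/config/config.yaml")
private val persistedConfig = Paths.get("/data/config.yaml")

// Copies the default configuration to the persistent volume on first boot only,
// so that later edits survive restarts.
fun bootstrapConfiguration() {
    if (Files.notExists(persistedConfig)) {
        Files.createDirectories(persistedConfig.parent)
        Files.copy(defaultConfig, persistedConfig, StandardCopyOption.REPLACE_EXISTING)
    }
}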

The running pods have admin access to the cluster through a role binding and a service account. The pods are reachable through a service, which is exposed by an ingress resource. The ingress redirects all requests where the URL path has the prefix /api to the server and everything else to the client.

4.3 Server

The server is responsible for communicating with the Kubernetes cluster and running the tests. It is written in Kotlin and uses the Ktor4 web server to provide a REST API. Other web servers exist for Kotlin, but Ktor was chosen because of its popularity and native Kotlin experience. The server has many public methods which interact with the cluster through a Kubernetes API client made by the open-source development platform fabric8io5.
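A minimal sketch of this combination is shown below: a Ktor server exposing one illustrative route that lists pods through the fabric8 client. The route path, port and namespace are assumptions for the example and do not correspond to the server's actual API, and the exact client calls may differ between fabric8 versions.

import io.fabric8.kubernetes.client.DefaultKubernetesClient
import io.ktor.application.call
import io.ktor.response.respondText
import io.ktor.routing.get
import io.ktor.routing.routing
import io.ktor.server.engine.embeddedServer
import io.ktor.server.netty.Netty

fun main() {
    // In-cluster configuration is picked up from the pod's service account.
    val k8s = DefaultKubernetesClient()

    embeddedServer(Netty, port = 8080) {
        routing {
            // Illustrative route: list the pod names in the namespace "default".
            get("/api/pods") {
                val names = k8s.pods().inNamespace("default").list().items.map { it.metadata.name }
                call.respondText(names.joinToString("\n"))
            }
        }
    }.start(wait = true)
}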

Running an OCI container build-tool test requires at least one specified test repository in the configuration. When the configuration is saved and when the suite starts, the state of the test repositories is synchronized with the current cluster; if the present cluster state differs from the configuration file, it is updated. The test suite creates a Kubernetes pod for each specified repository in the configuration, which mounts a persistent volume using a persistent volume claim. When running, the pod clones the Git repository and saves the specified repository subdirectory to the specified working directory, which is the mounted persistent volume. The suite has now created a persistent volume that only contains a Git repository or a subdirectory of a Git repository. This allows the build tools to run only their test without spending unnecessary time downloading a Git repository.
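The sketch below illustrates how such a repository pod could be created with the fabric8 client. The image, volume names, namespace and clone command are assumptions made for the example, and the exact builder and client calls may differ between fabric8 versions.

import io.fabric8.kubernetes.api.model.PodBuilder
import io.fabric8.kubernetes.client.DefaultKubernetesClient

// Creates a pod that clones a Git repository and copies the wanted subdirectory
// onto a persistent volume claimed by "repo-<name>".
fun createRepositoryPod(name: String, repoUrl: String, subDir: String) {
    val pod = PodBuilder()
        .withNewMetadata().withName("repo-$name").endMetadata()
        .withNewSpec()
            .withRestartPolicy("Never")
            .addNewContainer()
                .withName("clone")
                .withImage("alpine/git")
                .withCommand("sh", "-c",
                    "git clone $repoUrl /tmp/repo && cp -r /tmp/repo$subDir/. /workspace")
                .addNewVolumeMount().withName("workspace").withMountPath("/workspace").endVolumeMount()
            .endContainer()
            .addNewVolume()
                .withName("workspace")
                .withNewPersistentVolumeClaim().withClaimName("repo-$name").endPersistentVolumeClaim()
            .endVolume()
        .endSpec()
        .build()

    DefaultKubernetesClient().use { client ->
        client.pods().inNamespace("default").create(pod)
    }
}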

When running a test, the suite creates a pod that mounts a test repository persistent volume as read-only. On startup, the pod copies the mounted test repository to a new directory to make writing possible. It then changes the current working directory to the test repository and runs the specified setup command, which sets up all requirements for the build-tool test to function; its most common use case is to log in to a container registry. Lastly, the pod runs the build tool with a specified setting that controls if and how the build tool should use cache and whether it should push the image.

The server runs test jobs one at a time using a queue. When no job is running, it takes the next job from the queue and runs the test; if the queue is empty, it waits for a new job.
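A minimal sketch of this behaviour, using a blocking queue and a single worker thread, could look as follows. The job fields mirror the run endpoint described in section 4.3.3; the names are illustrative and not taken from the thesis's source code.

import java.util.concurrent.LinkedBlockingQueue

// Illustrative job description: repository, build tool, cache flag and push flag.
data class TestJob(
    val repository: String,
    val buildTool: String,
    val useCache: Boolean,
    val push: Boolean,
)

val jobQueue = LinkedBlockingQueue<TestJob>()

// One worker thread runs jobs one at a time; take() blocks while the queue is empty.
fun startWorker(runTest: (TestJob) -> Unit) = Thread {
    while (true) {
        val job = jobQueue.take()
        runTest(job)
    }
}.apply { isDaemon = true; start() }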

4.3.1 Configuration

A YAML configuration file configures the test suite's test cases. It configures the test pods' resources, the test case repositories and the build tools. Listing 4.3 explains all options that can be specified when creating a configuration file for the test suite.

Listing 4.3: Test Suite Configuration

# Resource limits and requests for the k8s pods
resources:
  limits:
    memory: 512Mi
    cpu: 0.8
  requests:
    memory: 256Mi
    cpu: 0.4

# The workdir where the test repo is stored in the configuration
workdir: /workspace

# Repositories containing the different test cases
repositories:
  # Name of the test case
  - name: quick
    # URL to the test Git repository
    url: https://github.com/JonasKop/TestRepo
    # The directory containing the project to build, defaults to "/"
    dir: /quick
    # Tagged build-tools can only run with repositories which have a specific tag
    tags: [sometag]

# Specifies the build-tools to test
buildTools:
  # Name of the build tool
  - name: kaniko
    # Security features of the build-tools, all options here are optional
    securityContext:
      # Unconfined seccomp
      seccomp: unconfined
      # Unconfined AppArmor
      apparmor: unconfined
      # The user ID
      runAsUser: 0
      # Privileged container
      privileged: false
    # Environment variables
    env:
      # Key-value properties
      SOME_FLAG: some-value
    # Build-tool container image
    image: gcr.io/kaniko-project/executor:debug
    # Tag a build-tool to only work with specific repositories
    tag: sometag
    # Testing commands. They are all run in an SH-shell.
    command:
      # Setup command, should include registry setup options if required.
      setup: echo '{"auths":{"${registry_url}":{"username":"${registry_username}","password":"${registry_password}"}}}' > /kaniko/.docker/config.json
      # Cache commands
      cache:
        # Command for using cache and pushing
        push: /kaniko/executor --context ${workdir} --dockerfile ${workdir}/Dockerfile --cache=true --cache-repo=${image} --destination=${image}
        # Command for using cache and not pushing
        noPush: /kaniko/executor --context ${workdir} --dockerfile ${workdir}/Dockerfile --cache=true --cache-repo=${image} --no-push
      # No cache commands
      noCache:
        # Command for not using cache and pushing
        push: /kaniko/executor --context ${workdir} --dockerfile ${workdir}/Dockerfile --cache=false --destination=${image}
        # Command for not using cache and not pushing
        noPush: /kaniko/executor --context ${workdir} --dockerfile ${workdir}/Dockerfile --cache=false --no-push

4 Ktor. https://ktor.io/ (visited 2021-05-13)
5 Fabric8io. Kubernetes Java Client. https://github.com/fabric8io/kubernetes-client (visited 2021-05-13)

4.3.2 Routes

The server’s functionality is accessed through a REST API, making it possible for custom clients to integrate with the test suite. It supports both standard REST HTTP requests and WebSocket connections for streaming data.

A client needs to authenticate itself with the server to be able to use it. To authenticate, a user logs in with a username and password at the login endpoint, which checks if the username and password match the configured credentials. The user can then use the username and password as Basic Authorization to access the REST HTTP endpoints.

When accessing the WebSocket connections, the endpoint does not check any HTTP header because WebSocket clients often lack support for custom headers. Instead, it checks a query parameter called auth, whose value is the base64-encoded credentials, encoded in the same way as the Basic Authorization header but without the Basic prefix. An authorization check protects all endpoints except the login endpoint.
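A sketch of the credential check could look as follows: it accepts a base64-encoded user:password pair taken either from the Basic Authorization header (without the Basic prefix) or from the auth query parameter. The environment variable names mirror the .env options in listing 4.1; the function name is illustrative.

import java.util.Base64

val expectedUser: String = System.getenv("CLIENT_USERNAME") ?: "admin"
val expectedPass: String = System.getenv("CLIENT_PASSWORD") ?: ""

// Validates a base64-encoded "user:password" pair against the configured credentials.
fun isAuthorized(encoded: String?): Boolean {
    if (encoded == null) return false
    val decoded = runCatching { String(Base64.getDecoder().decode(encoded)) }.getOrNull() ?: return false
    val parts = decoded.split(":", limit = 2)
    if (parts.size != 2) return false
    return parts[0] == expectedUser && parts[1] == expectedPass
}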

4.3.3 Endpoints

The server has a login endpoint, which logs a user in to the server by checking if the credentials in the request body match the specified credentials. If the credentials are correct, they can be used to authorize the user with the server.

It also has an endpoint at which a user can get the current configuration in string form. A user can also update the configuration by supplying it in string form; it is then validated and saved to the server. Lastly, the server synchronizes the configuration with the cluster state.

The server makes it possible to delete a pod or a repository by providing two endpoints; the only thing they require is the name of the repository or the pod.
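A sketch of how the configuration and deletion endpoints could look in Ktor is given below. The route paths and the configuration file location are assumptions for the example, and the configuration is handled as plain text so no content negotiation is required.

import io.ktor.application.call
import io.ktor.http.HttpStatusCode
import io.ktor.request.receiveText
import io.ktor.response.respond
import io.ktor.response.respondText
import io.ktor.routing.Route
import io.ktor.routing.delete
import io.ktor.routing.get
import io.ktor.routing.put
import java.nio.file.Files
import java.nio.file.Paths

// Assumed location of the persisted configuration file.
private val configPath = Paths.get("/data/config.yaml")

fun Route.configAndPodRoutes(deletePod: (String) -> Unit) {
    // Return the current configuration as a string.
    get("/api/config") { call.respondText(Files.readString(configPath)) }

    // Update the configuration; validation and cluster synchronization would happen here.
    put("/api/config") {
        Files.writeString(configPath, call.receiveText())
        call.respond(HttpStatusCode.OK)
    }

    // Delete a pod by name.
    delete("/api/pods/{name}") {
        val name = call.parameters["name"] ?: return@delete call.respond(HttpStatusCode.BadRequest)
        deletePod(name)
        call.respond(HttpStatusCode.OK)
    }
}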

The most important endpoint of the server allows a user to run a test. The endpoint takes a job consisting of a repository, a build tool, a boolean indicating whether to use cache and a boolean indicating whether to push. It then adds the job to the queue.

It also has two WebSocket endpoints: one that streams the logs of a specific pod and one that streams the cluster state. The cluster state endpoint sends all pods in the current namespace, the current queue and the URL to Grafana. Both endpoints update every second.
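The cluster state stream can be sketched as a Ktor WebSocket route that pushes a fresh snapshot every second. The route path is an assumption, stateAsJson() is a placeholder for serializing the pods, queue and Grafana URL, and the Ktor WebSockets feature is assumed to be installed on the server.

import io.ktor.http.cio.websocket.Frame
import io.ktor.routing.Route
import io.ktor.websocket.webSocket
import kotlinx.coroutines.delay

// Pushes the current cluster state to the connected client once per second.
fun Route.stateSocket(stateAsJson: () -> String) {
    webSocket("/api/state") {
        while (true) {
            outgoing.send(Frame.Text(stateAsJson()))
            delay(1_000)
        }
    }
}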

4.4 Client

The client is a React web application written in TypeScript that communicates with the server through HTTP requests and WebSocket connections. It uses React's Context API for managing the global state and localStorage for saving state (such as authorization tokens) between sessions.

When a user first enters the client, it presents a login page as seen in figure 5.

It allows a user to log in to the test suite. The user gets redirected to the home page on successful login, viewable in figure 6. The web application consists of a navigation bar with buttons that redirect the user to different pages. It also has a queue button, showing the number of active jobs and jobs in the queue. When clicked, a more detailed view appears as seen in figure 7.

On the home page, the user can see the current repositories and the previously run tests. The user can also click on the repositories and tests to view more specific information like logs and links to Grafana as seen in figure 8. On the pod page, a user can also delete the test or repository.

To run a test, the user can navigate to the test page by clicking the test button in the navigation bar. The client greets the user with the test page, which is viewable in figure 9. The page allows the user to run a build tool with a specific test repository and to specify whether to run with or without cache and whether to push to an external registry.

On the config page, which is viewable in figure 10, a user can view and edit the configuration in YAML format on the server. On the last page, Tools, which is viewable in figure 11, the user can view a comparison of the different build tools with an explanation of the importance of each compared statistic.

Figure 5: Client Login Page

Figure 6: Client Home Page

Figure 7: Client Queue Page
