
T-REX PHACO: enabling PaaS application using Twelve-Factor App

LIDIA FERNÁNDEZ GARCÉS

KTH ROYAL INSTITUTE OF TECHNOLOGY

SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE


Master ICT Innovation (Joint EIT Programme)
Date: August 31, 2018

Supervisor: Anne Håkansson
Examiner: Mihhail Matskin

School of Electrical Engineering and Computer Science

Swedish title: T-REX PHACO: möjliggör PaaS-applikation med Twelve-Factor App


Abstract

In order to take advantage of the full range of benefits of cloud computing, traditional applications should be adapted to conform with the cloud-native principles. In this work, a traditional Java enterprise application is transformed into a cloud-native application able to run in a Platform as a Service using cloud-related technologies (Docker, Kubernetes, OpenShift) and an automation server (Jenkins). The resulting application follows the Twelve-Factor App methodology and cloud-native principles.


Sammanfattning

För att dra nytta av det omfattande utbudet av cloud computing bör traditionella applikationer anpassas för att överensstämma med de molnbaserade principerna. I det här arbetet omvandlas en traditionell Java-företagsapplikation till en cloud-native applikation som kan köras i en plattform som en tjänst med hjälp av molnrelaterade teknologier (Docker, Kubernetes, OpenShift) och en automationsserver (Jenkins). Den resulterande tillämpningen följer Twelve-Factor App-metodiken och cloud-native principer.

Contents

1 Introduction
1.1 Background
1.2 Problem Description
1.3 Purpose and Goal
1.4 Method
1.5 Ethics and Sustainability
1.6 Stakeholders
1.7 Delimitation
1.8 Outline

2 Theoretic Background
2.1 Background
2.1.1 Java EE
2.1.2 Integration, Production, Development
2.1.3 Cloud Computing
2.1.4 SaaS and Cloud-Native Applications
2.1.5 Cloud Technologies
2.1.6 Continuous Deployment
2.2 Related Work

3 Architecture of Candidate Application
3.1 Software Architecture
3.2 Technologies
3.3 Codebase
3.4 Deployment
3.5 Development Workflow
3.6 Credentials

4 Analysis of Candidate Application
4.1 Twelve-Factor App
4.2 Cloud-native Principles

5 Presentation of Cloud Strategy
5.1 Cloud Architecture
5.2 Development Workflow
5.2.1 Development
5.2.2 Integration/Production
5.3 Twelve-Factor App
5.3.1 Codebase
5.3.2 Dependencies
5.3.3 Config
5.3.4 Backing Services
5.3.5 Build, Release, Run
5.3.6 Processes
5.3.7 Port Binding
5.3.8 Concurrency
5.3.9 Disposability
5.3.10 Dev/prod Parity
5.3.11 Logs
5.3.12 Admin Processes

6 Implementation of Cloud Strategy
6.1 Cloud Architecture
6.1.1 Docker
6.1.2 OpenShift
6.2 Development Workflow
6.2.1 Development Workflow Components
6.3 Automation Pipeline
6.3.1 Pipeline as Code
6.3.2 Dev/int/prod Workflow

7 Evaluation and Results
7.1 Twelve-Factor App
7.1.1 Codebase
7.1.2 Dependencies
7.1.3 Config
7.1.4 Backing Services
7.1.5 Build, Release, Run
7.1.6 Processes
7.1.7 Port Binding
7.1.8 Concurrency
7.1.9 Disposability
7.1.10 Dev/prod Parity
7.1.11 Logs
7.1.12 Admin Processes
7.2 Cloud Native Principles

8 Conclusions
8.1 Discussion
8.2 Future Work

Bibliography

A Dockerfile
B OpenShift template: trex-template.yaml
C Jenkins: Jenkinsfile, Jenkinstemplate.groovy and .env
C.1 Jenkinsfile
C.2 Jenkinstemplate.groovy
C.3 .env
D Jenkins execution environment: Dockerfile


1 Introduction

Cloud computing, as defined by the NIST1, is "a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction"[1].

Applications can be delivered over the Internet using cloud computing[2]. However, such applications differ from traditional applications deployed in traditional data centers. They should have particular characteristics in order to take advantage of the full range of benefits of this model, such as being operated on automation platforms, using softwarization of infrastructure and network, and being designed with migration and interoperability across different cloud infrastructures and platforms in mind. Applications with these characteristics are known as cloud-native applications.

1.1 Background

In order to take advantage of the cloud, traditional applications should be adapted to conform with the cloud-native principles. This migration is not straightforward; it depends on the particular characteristics of each application. However, some strategies are repeated, mainly when using the most popular technologies. One of the most popular technologies for traditional enterprise applications is Java EE.

According to the TIOBE Index2, Java3 has been the most popular programming language since 2002. There are four editions of Java targeting different application environments. Java EE is the one targeting large distributed enterprise and Internet environments.

Several practices are repeated in Java EE applications: they have a similar code organization, they use build systems, they communicate with external systems in similar ways, they use JavaScript frameworks for the front-end, they deploy using binaries, etc.[3] Several conventions are followed in this type of application. Consequently, such applications will all follow similar strategies when migrating to the cloud.

1NIST (National Institute of Standards and Technology), U.S. Department of Commerce - https://www.nist.gov/

2TIOBE Index - https://www.tiobe.com/tiobe-index/

3Java - https://www.java.com


1.2 Problem Description

Java applications in general, and Java EE applications in particular, are a standard in the IT industry. With the emergence of cloud computing, these applications should be adapted to take advantage of the multiple benefits offered by this model.

However, how should Java enterprise applications be migrated to the cloud? The answer is not straightforward, as every application is different and the strategy for migrating to the cloud should be adapted to each of them. Nevertheless, enterprise applications tend to follow a set of conventions. Consequently, a set of strategies can be defined for migrating this type of application to the cloud that can be repeated in similar projects.

Twelve-Factor App[4] is the best-known methodology documenting best practices for building PaaS applications, and cloud-native principles[5] must be followed by any cloud application to take advantage of the full range of benefits of cloud computing.

In this project, we will transform a traditional Java enterprise application into a cloud-native application following cloud-native principles and Twelve-Factor App methodology.

This thesis will apply recommended strategies for creating cloud-native applications to a non-cloud Java enterprise application. It will show the process followed, the problems encountered, and the solutions chosen for solving them.

1.3 Purpose and Goal

The purpose of this thesis is to present a practical case of migrating an enterprise application to the cloud. I will apply the theory (Twelve-Factor App methodology) to a particular application in order to make it compliant with cloud-native principles.

I expect that this project will give developers a clear idea of how some strategies can be applied to migrating Java EE applications to the cloud. By showing a practical case, I will provide insight into the difficulties found and how I solved them, providing developers with useful information for adapting their own enterprise applications to the cloud.

Consequently, three goals are pursued:

1. Analyze the targeted application, identifying which cloud-native principles are not followed.

2. Develop a strategy for making the application follow the Twelve-Factor App methodology.

3. Perform the implementation of this strategy.

1.4 Method

The method used in this project will be applied research, as defined by Håkansson[6]. This method involves answering specific questions or solving known and practical problems. The method examines a set of circumstances, and the results are related to a particular situation. It often builds on existing research, uses data directly from the real world, and applies it to solve problems and develop practical applications, technologies, and interventions. Applied research is used for all kinds of research or investigations; it is often based on basic research and has a particular application in mind.

In this thesis, I will solve the practical problem of transforming a traditional application into a cloud-native application. The characteristics of the traditional application will be examined, as well as how the resulting application fulfills the cloud-native principles. The project will draw on previous developer experience documented in the literature. In particular, a well-known methodology for building cloud-native applications will be used: the Twelve-Factor App.

This thesis will follow an abductive approach[6]. This reasoning uses both deductive and inductive approaches to establish conclusions. In this method, the hypothesis that best explains the relevant evidence is chosen. The approach starts with an incomplete set of data or observations and uses preconditions to infer or explain conclusions. The outcome is a likely or possible explanation and is hence useful as a heuristic.

This project starts with an original application that does not fulfill all the cloud-native principles. The Twelve-Factor App methodology by Wiggins [4] and the cloud-native principles by Kratzke and Quint [5] are used as ground truth for the requirements of a PaaS application, and changes will be made to the application in order to make it compliant with them.

We will use an action research strategy. Action research is performed through actions that contribute to practical concerns in a problematic situation[6]. This strategy seeks to improve the way people solve problems using different strategies, practices, and knowledge of the environment. Action research studies settings with defined boundaries; hence, qualitative methods are most suitable.

We will present action research in which we carry out a real-life example of the transformation of a traditional application to a cloud-native application. We will use several different strategies in order to adapt the legacy application to the cloud.

We will evaluate our work using the Twelve-Factor App methodology[4]. This is a qualitative method by which we analyze whether the final application fulfills each one of the principles of the methodology. Additionally, we will evaluate whether our final application follows cloud-native principles[5].

1.5 Ethics and Sustainability

Ethics plays a part in this project, as the resulting system should be secure. Sensitive information, such as credentials, is used. This information should be handled with appropriate security measures. The technologies used in this project provide several mechanisms for this purpose. In addition, the security recommendations for each of the technologies should be followed, such as using non-root users in Docker containers or not publishing credentials in the code and artifact repositories.

Sustainability is a potential effect of this project. Cloud computing enables better use of resources. Cloud applications allow dynamic changes to their assigned resources, adapting them to demand. Consequently, fewer resources are idle, saving energy in data centers.

Although this project does not focus on this branch of cloud computing, it takes the first step toward enabling this kind of practice.


1.6 Stakeholders

Amadeus[7] is a major IT provider for the global travel and tourism industry. Amadeus provides search, pricing, booking, ticketing and other processing services in real-time to travel providers and travel agencies. It also offers travel companies software systems which automate processes such as reservations, inventory management, and departure control.

As an IT provider, Amadeus is engaged with technological evolutions, mainly around four pillars: cloud, data intelligence, security and open API.

Cloud computing has changed rapidly in the last decade, realizing computing as a utility[8]. In order to take advantage of this innovation, Amadeus has in recent years been moving towards cloud-based technology and distributed deployment of services[9]. Following this strategy, the FOR division within Amadeus is migrating towards cloud solutions, adapting their internal tools to this new technology stack. They expect that these technologies will help them scale their applications and open up a wide variety of benefits, such as A/B testing or multi-datacenter deployment.

This project will be developed in Amadeus as a proof-of-concept for migrating to the cloud. The candidate application will be one of their internal tools. The technologies used in this project will be provided by the company and will be based on open source solutions.

1.7 Delimitation

In this project, we will not make changes to the original code provided by Amadeus.

Legacy applications usually need changes in their architecture, such as breaking down monoliths into microservices [8]. These modifications are out of the scope of this project, in which the candidate application for migration already has an adequate structure for the cloud. This project will also be limited by the choice of technologies. Amadeus provides its own cloud technologies that should be used. However, all of the Amadeus technologies used in this project are enterprise layers on top of open source technology, so it is expected that the results can be replicated with the open-source versions. We will deploy the final application in the Amadeus private cloud. Nevertheless, our final application will be implemented in such a way that it can also be easily deployed in a public cloud, or in any other private cloud.

1.8 Outline

In Chapter 2, we will provide the background necessary to understand the concepts presented in this work. In Chapter 3, we will present the candidate application for cloud migration, presenting its software architecture and development workflow. In Chapter 4, we will analyze the candidate application, describing which cloud-native principles and Twelve-Factor App factors it fulfills, and which ones it does not and consequently need to be adapted. In Chapter 5, we will decide on a strategy for adapting the application to be cloud-native. In Chapter 6, we will describe the implementation process of the strategy described in Chapter 5. In Chapter 7, we will evaluate our project from the point of view of the Twelve-Factor App and cloud-native principles. In Chapter 8, we will present the results and conclusions of our work.


2 Theoretic Background

2.1 Background

In this section, all the necessary background to understand this thesis is provided.

To understand the application targeted by the migration, knowledge of Java EE, usually referred to as Java enterprise, is needed, as well as of the frameworks used in the project.

Knowledge about software development, such as environments, is also needed.

Cloud computing basics are also provided, as well as the motivation for using it instead of traditional computing. Then, we introduce the concepts of SaaS and cloud-native applications, followed by the Twelve-Factor App, the methodology that will be used in this project.

2.1.1 Java EE

Java is a general-purpose computer programming language that is concurrent, class-based and object-oriented[10]. Java is one of the most popular programming languages in use[11][12].

Java has several editions, of which Java Platform, Standard Edition is the best known. For large distributed enterprise or Internet environments, the recommended edition is Java Platform, Enterprise Edition (Java EE).

Java allows application developers to "write once, run anywhere" (WORA)[13]. Compiled Java code runs on all platforms that support Java without the need for recompilation, thanks to the Java Virtual Machine. The Java Virtual Machine (JVM)[14] is a virtual machine that runs programs compiled to Java bytecode, whether written in Java or in any other language.

Java EE code is compiled to portable bytecode and packaged as a Web Application Archive (WAR) or Java Archive (JAR)1. These archive files comprise all classes and files required to ship an application[3].

In enterprise projects, the deployment artifact, a WAR or JAR file, is deployed to an application container, such as Tomcat2 or WildFly3. The application container integrates additional concerns such as the application life cycle or communication in various forms, for example communication over HTTP[15] or JDBC4.

1Open Source Java EE Reference Implementation - https://javaee.github.io/tutorial/packaging001.html

2Tomcat - http://tomcat.apache.org

3WildFly - http://wildfly.org

4Oracle Technology Network: JDBC - http://www.oracle.com/technetwork/java/javase/jdbc/index.html


Several Java frameworks provide an abstraction to build and deploy applications. Spring5 is one of them. The Spring Framework provides "a comprehensive programming and configuration model for modern Java-based enterprise applications on any deployment platform"[16]. A particular Spring convention-over-configuration solution[17], Spring Boot6, will be used. Spring Boot is Spring's convention-over-configuration solution for creating stand-alone, production-grade Spring-based applications that the developer can "just run"[18].
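To make this concrete, the following is a minimal sketch of a stand-alone Spring Boot application with an embedded server; the package, class, and endpoint names are illustrative and are not taken from the candidate application.

```java
package com.example.demo; // illustrative package, not from the candidate application

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// Minimal stand-alone Spring Boot application: SpringApplication.run() starts
// an embedded Tomcat server, so no external application container needs to be
// preconfigured to serve the application.
@SpringBootApplication
@RestController
public class DemoApplication {

    @GetMapping("/health")
    public String health() {
        return "OK";
    }

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }
}
```

Built with Maven and started with java -jar (or mvn spring-boot:run), such an application runs without a separately managed application server.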

2.1.2 Integration, Production, Development

In software development, different environments are typically used[19].

• Development Working environment for developers. In it, radical changes to the code can be performed without affecting the other environments.

• Integration Common environment where all developers commit code changes. It is used to validate the work of all developers before promoting it to the staging environment. It contains a limited subset of data useful for testing the application.

• Staging Environment as similar to the production environment as possible. The purpose of staging is to simulate as much of the production environment as possible. Hardware and software configuration must be comparable to the production system.

• Production Environment from which the application is served for final use.

Although this is usual practice, it is not always followed in the same way. Depending on the project, one of the environments might not be necessary. A common modification is to perform staging in the integration environment.

2.1.3 Cloud Computing

Cloud Computing, as defined by the National Institute of Standards and Technology (NIST)7, is "a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction"[1].

A more high-level definition can be found in Armbrust et al. [2], for whom cloud computing refers to "both the applications delivered as services over the Internet and the hardware and systems software in the data centers that provide those services".

The cloud model is composed of three service models, as defined by NIST [1]:

Infrastructure as a Service (IaaS) The capability provided to the consumer is to provision processing, storage, networks, and other fundamental computing resources where the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, and deployed applications; and possibly limited control of select networking components (e.g., host firewalls).

5Spring Framework - https://spring.io

6Spring Boot - https://spring.io/projects/spring-boot

7NIST (National Institute of Standards and Technology), U.S. Department of Commerce - https://www.nist.gov/

Platform as a Service (PaaS) The capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created or acquired applications created using programming languages, libraries, services, and tools supported by the provider. The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, or storage, but has control over the deployed applications and possibly configuration settings for the application-hosting environment.

Software as a Service (SaaS) The capability provided to the consumer is to use the provider's applications running on a cloud infrastructure. The applications are accessible from various client devices through either a thin client interface, such as a web browser (e.g., web-based email), or a program interface. The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings.

Cloud computing presents several advantages. Armbrust et al. [2] identify the following ones:

• Appearance of infinite computing resources on demand Traditionally, companies needed to plan far ahead for provisioning resources. With Cloud Computing, more computing resources are available on demand, quickly enough to follow load surges.

• Elimination of an up-front commitment by Cloud users Small companies can start small and increase hardware resources when there is an increase in their needs.

• Ability to pay for the use of computing resources on a short-term basis as needed Real-world estimates of average server utilization in data centers range from 5% to 20%[20][21]. Since few users deliberately provision for less than the expected peak, resources are idle at non-peak times. The more pronounced the variation, the more the waste. With cloud computing, resources can be paid for as you go. If one service needs only 100 servers at midnight, but 500 servers at noon, only the servers in use at each moment need to be paid for.

• Economies of scale due to very large data centers The construction and operation of large-scale commodity-computer data centers at low-cost locations is the main enabler of cloud computing, decreasing the cost of electricity, network bandwidth, operations, software, and hardware.

• Higher utilization by multiplexing of workloads from different organizations When using cloud computing, resources are shared among organizations. This allows multiplexing the workload across the servers and resources. For example, if one application uses at most 30% of a server, while another needs at most 40% of a server, both can be allocated on the same server. By using statistical multiplexing, even more applications could be allocated on the same server, for example two applications that each require 60% utilization of the server, but rarely at the same time. Statistical multiplexing techniques can be used even when both of them reach their peaks at the same time, with some trade-offs in performance, while the rest of the time both will be able to access enough resources.

• Simplify operation and increase utilization via resource virtualization The necessary statistical multiplexing requires automatic allocation and management. For this purpose, virtualization techniques are used. For IaaS, virtual servers are used. They might be offered as a single server, but one physical server in the data center may contain several virtual ones. For PaaS, RAM and CPU are offered using virtualization, or even a running platform that automatically assigns the resources needed, abstracting the management of resources from the user.

This new paradigm has great potential. For example, Slominski, Muthusamy, and Khalaf [22] use cloud computing to enable adaptive applications, where application-specific flexibility in the computation may be desired. The application adapts its resources depending on its needs at each moment. In this way, the authors maximize the application QoS while meeting both a time constraint and a resource budget limit. Another example is the work of Slominski, Muthusamy, and Khalaf [22], in which they use the resources of cloud computing to provide a multi-tenant application. They achieve tenant isolation by deploying each tenant's application using a particular virtualization technique: containers. These containers run on the same resources thanks to cloud computing techniques, but they are isolated from each other, each containing the application of a single tenant.

2.1.4 SaaS and Cloud-Native Applications

As just explained, cloud computing presents several advantages. However, to take advantage of the full range of benefits of cloud computing, applications need to be built considering specific requirements, such as scalability, maintainability, and portability[23]. This kind of application is known as a cloud-native application.

Kratzke and Quint [5] define a cloud-native application (CNA) as "a distributed, elastic and horizontal scalable system composed of (micro)services which isolate state in a minimum of stateful components. The application and each self-contained deployment unit of that application is designed according to cloud-focused design patterns and operated on a self-service elastic platform".

Based on this definition, and analyzing the cloud computing literature, Kratzke and Quint [5] state that cloud-native applications fulfill a set of principles:

• Being operated on automation platforms Automation platforms, also referred to as elastic platforms, are platforms that can adapt to workload changes by provisioning and de-provisioning resources automatically.

• Using softwarization of infrastructure and network By using techniques such as Infrastructure as Code or Pipeline as Code, the infrastructure and network can be defined using code.

• Providing migration and interoperability across different cloud infrastructures and platforms As several cloud providers exist, a need arose for being able to migrate and communicate easily from one to the other.

Sometimes it is possible to read about CNA being referred to as SaaS. CNA and SaaS are closely related terms but are not necessarily synonyms. CNA refers to the application, while SaaS refers to the method for providing the application to the client. Cloud-native applications are designed to run in a PaaS offering with all their benefits and challenges, including elastic scalability and typically container technology. SaaS refers to applications that are running in a cloud infrastructure, either CNA or monoliths delivered using a cloud infrastructure.

The Twelve-Factor App[4] emerged as a methodology that documents best practices for efficiently building SaaS. This methodology is being widely adopted in industry[23].

It provides best practices to achieve the properties for cloud-based deployments. Applications built using the Twelve-Factor App methodology, as defined by Wiggins [4]:

• Use declarative formats for setup automation, to minimize time and cost for new developers joining the project;

• Have a clean contract with the underlying operating system, offering maximum portability between execution environments;

• Are suitable for deployment on modern cloud platforms, obviating the need for servers and systems administration;

• Minimize divergence between development and production, enabling continuous deployment for maximum agility;

• And can scale up without significant changes to tooling, architecture, or development practices.

The Twelve-Factor App methodology defines the following factors. They are described here using the official documentation[4], complemented with Daschner's [3] interpretation for Java enterprise projects:

1. Codebase The codebase of a single application is tracked in a single repository, containing all specifications for potentially different environments. This factor leverages developer productivity, since all information is found in one repository. The repository should contain all source files that are required to build and run the application, including Java sources, configuration files, and infrastructure as code.

2. Dependencies Explicitly declare and isolate dependencies required in order to run the application. Explicitly specifying the version leads to far fewer compatibility issues in production. Java applications specify their dependencies using build systems. Container technologies simplify this principle by explicitly specifying all software installation steps. The resulting software artifacts should be accessible from central repositories, such as artifact repositories.

3. Config Configuration varies substantially across environments, while code does not. Application configuration that differs between environments, such as databases, external systems, or credentials, should be dynamically modified from outside of the application. This implies that configuration is retrieved via files, environment variables, or other external concerns (a minimal sketch is given after this list).

4. Backing Services Backing services are the databases and external systems that are accessed by the application. These resources must communicate with the application in a loosely coupled way. In this fashion, they can be replaced by new instances without affecting the application. Communication over HTTP or JDBC, for example, abstracts the implementations and enables systems to be replaced by others.

5. Build, Release, Run Application binaries are built, deployed, and run in separate steps. This is a best practice for Java enterprise developers. Software and configuration changes happen in the source code or in the deployment step. The deployment step brings application binaries and potential configuration together. It is a common practice to separate these steps and orchestrate stages in a Continuous Integration server.

6. Processes Applications should run as stateless processes. Potential state must be stored in an external database or discarded.

7. Port Binding Services are exposed via port binding. Applications should be self-sufficient and expose their functionality via network ports. Java EE applications support this approach, exporting a port which is used to communicate with the application.

8. Concurrency Cloud applications should be able to scale as the workload increases. In this fashion, resources can be used in a savvy manner, only requesting those necessary for the workload at a particular moment. Ideally, the application should be able to scale both horizontally and vertically.

9. Disposability Applications are discarded and restarted often in the cloud, so they should be prepared to be disposed of correctly. They should maximize robustness with fast startup and graceful shutdown, finishing in-flight requests and properly closing open connections and resources. Java EE supports both fast startups and graceful shutdowns, closing resources properly at JVM shutdown.

10. Dev/prod Parity Differences between environments should be minimal. Enterprise applications tend to have differences between the environments of the development process. The use of different tools, technologies, external services, and configurations introduces the risk that these differences may lead to errors. Maintaining development, staging, and production as similar as possible reduces these errors to a minimum.

11. Logs Enterprise applications traditionally write log files to disk. This strategy presents several shortcomings[3]: lots of verbose string objects that need to be stored, high use of CPU and memory, several layers of buffering, use of additional logging technologies that increase overhead, etc. In general, experience shows that logging has the biggest impact on an application's performance. In contrast, the cloud-native approach argues that logging should be treated as a stream of log events. Monitoring, journaling, or tracing solutions should be used instead of log files. The information that is still left can be logged simply to standard output, such as Java errors. Therefore, logs should only indicate fatal problems that require engineering action.

12. Admin Processes This factor describes that administrative or management tasks should be executed as separate short-lived processes. In Java EE applications, the number of required administration and management tasks is limited. Administrative tasks are usually required for debugging and troubleshooting.
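As a hedged illustration of factors 3 (Config) and 7 (Port Binding), the sketch below reads environment-specific settings from environment variables instead of hard-coding them. The variable names are hypothetical and are not taken from the candidate application; Spring Boot achieves the same effect through externalized properties.

```java
import java.util.Optional;

// Sketch of the Config and Port Binding factors: environment-specific values
// are read from environment variables at start-up, so the same build can run
// unchanged in development, integration, and production.
public class AppConfig {

    // Hypothetical variable names, used for illustration only.
    public static String databaseUrl() {
        return require("DATABASE_URL");
    }

    public static int httpPort() {
        // Port the application binds to; defaults to 8080 when the variable is unset.
        return Integer.parseInt(Optional.ofNullable(System.getenv("PORT")).orElse("8080"));
    }

    private static String require(String name) {
        return Optional.ofNullable(System.getenv(name))
                .orElseThrow(() -> new IllegalStateException("Missing environment variable: " + name));
    }
}
```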


2.1.5 Cloud Technologies

In order to enable the development of cloud-native applications, a broad range of technologies has emerged. In this project, the technologies are chosen in order to follow the cloud-native principles. Docker will be used to provide easy migration and interoperability. OpenShift, based on Kubernetes, will be used as the elastic platform, allowing resources to be provisioned and de-provisioned. Both Docker and OpenShift will be used for softwarization of infrastructure and network.

Containers: Docker

The cloud computing definition implies that computing resources are shared. Virtualization is used for this purpose, mainly using two particular techniques: virtual machines and containers[24].

In a simplified explanation, virtual machines are based on a piece of software called a hypervisor that sits on top of the physical hardware and abstracts the host machine's resources. This technology allows creating multiple simulated environments of dedicated resources from a single physical hardware system. This simulated environment is called a virtual machine, and it is used extensively in cloud computing, particularly for IaaS[25]. However, this technology has some drawbacks. Each virtual machine includes a separate operating system image, which adds overhead in memory and storage footprint. This results in slow start-up times and some losses in performance[25].

In order to overcome the drawbacks of virtual machines, a new virtualization technology was developed: containers[26]. Containers are an implementation of operating-system-level virtualization without a dedicated kernel. Unlike virtual machines, container virtualization does not mimic a physical server with its own OS and resources. This type of virtualization enables applications to execute in a common OS kernel. There is no need for a dedicated kernel for each application, and thus containers impose a lower overhead compared to virtual machines. Containers provide faster deployment and elasticity than virtual machines, as well as better performance[25].

There are different container solutions. The first standard distribution was LXC8, later expanded with LXD9. Another alternative is CoreOS rkt10. However, the most popular and de facto standard container solution is Docker. Docker11 is an open source project that provides a systematic way to automate applications inside portable containers. Packaging existing apps into containers immediately improves security, reduces costs, and provides cloud portability.

Docker containers are created using base images12. A container image, or simply image, is "a lightweight, stand-alone, executable package of a piece of software that includes everything needed to run it: code, runtime, system tools, system libraries, settings"13. Images can be created manually, but typically they are built using Dockerfiles. A Dockerfile is a script that defines a set of steps for building the image, such as commands or arguments.

8LXC - https://linuxcontainers.org/lxd/introduction/

9LXD - https://linuxcontainers.org/lxd/introduction/

10rkt - https://coreos.com/rkt/

11Docker - https://www.docker.com/

12Docker: about images, containers and storage drivers - https://docs.docker.com/v17.09/engine/userguide/storagedriver/imagesandcontainers/

13Docker: what is a container - https://www.docker.com/resources/what-container


Each of these steps taken for creating an image adds a new layer on top of the previous one.
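The following is a minimal Dockerfile sketch for packaging a Java artifact; the base image, paths, and user are assumptions for illustration and do not reproduce the Dockerfile of Appendix A.

```Dockerfile
# Minimal Dockerfile sketch for a Java application.
# Base image, paths, and user names are illustrative assumptions.
FROM openjdk:8-jre-alpine

# Create and use a non-root user, following the security note in Section 1.5.
RUN addgroup -S app && adduser -S -G app app

# Each instruction adds a new layer on top of the previous one.
COPY --chown=app:app target/app.jar /opt/app/app.jar

USER app
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "/opt/app/app.jar"]
```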

Orchestrators: Kubernetes

In order to manage containers, we use orchestrators. Orchestrators are in charge of the automatic deployment, scaling, and general management of containers.

There are different technologies, such as Mesos or Marathon. Kubernetes is one of the most popular solutions for orchestrating Docker containers in production environments.

The main objects provided by Kubernetes are14:

• Pod A Pod is the basic building block of Kubernetes. It represents a running process. A Pod encapsulates an application container, storage resources, a unique network IP, and options that govern how the container should run. A Pod represents a unit of deployment: a single instance of an application in Kubernetes, which might consist of either a single container or a small number of containers that are tightly coupled and that share resources.

• Service A Service is an abstraction which defines a logical set of Pods and a policy by which to access them. Pods are mortal: they are born, and when they die, they are not resurrected. Replication Controllers, in particular, create and destroy Pods dynamically (e.g., when scaling up or down or when doing rolling updates). While each Pod gets its own IP address, even those IP addresses cannot be relied upon to be stable over time. A Service solves this problem by providing a permanent means to access a set of Pods.

• Volume A Volume is a directory, possibly with some data in it, which is accessible to the containers in a pod. On-disk files in a container are ephemeral, so everything that needs to be persisted during the lifetime of a Pod needs to be stored in a volume.

In addition, Kubernetes contains a number of higher-level abstractions called Controllers. Controllers build upon the basic objects and provide additional functionality and convenience features. They include:

• Replication Controller A Replication Controller ensures that a specified number of pod replicas are running at any one time. In other words, a Replication Controller makes sure that a pod or a homogeneous set of pods is always up and available. If there are too many pods, the Replication Controller terminates the extra pods. If there are too few, the Replication Controller starts more pods. Unlike manually created pods, the pods maintained by a Replication Controller are automatically replaced if they fail, are deleted, or are terminated.

• Deployments A Deployment controller provides declarative updates for Pods and Replication Controllers. The desired state is described in the Deployment object, and the Deployment controller changes the actual state to the desired state at a controlled rate. Deployments can create new Replication Controllers, or remove existing Deployments and adopt all their resources with new Deployments.
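A minimal YAML sketch of the two basic objects discussed above is shown below; the names, labels, and image reference are placeholders and do not correspond to the template used later in this work.

```yaml
# Sketch of a ReplicationController and a Service; names and image are placeholders.
apiVersion: v1
kind: ReplicationController
metadata:
  name: demo-app
spec:
  replicas: 2                # desired number of pod replicas
  selector:
    app: demo-app
  template:                  # pod definition used to create the replicas
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: demo-app
          image: registry.example.com/demo-app:1.0
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: demo-app
spec:
  selector:
    app: demo-app            # stable entry point for the pods selected here
  ports:
    - port: 80
      targetPort: 8080
```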

14Kubernetes: concepts - https://v1-7.docs.kubernetes.io/docs/concepts/


PaaS: OpenShift

Containers and the orchestration tool should be hosted and managed by a platform, typically a Platform as a Service (PaaS). There are several alternatives, such as AWS Elastic Beanstalk15, Google App Engine16, or Cloud Foundry17. Red Hat offers a solution particularly suitable for containers and based on Kubernetes: OpenShift.

OpenShift18 is "an open source container application platform by Red Hat based on top of Docker containers and the Kubernetes container cluster manager for enterprise app development and deployment"[27]. It provides hosting for containers as well as extending Kubernetes features. For example, OpenShift provides integrated software-defined networking and load balancing (e.g., HAProxy, F5, etc.) for managing Kubernetes services. It also provides handling of routes for exposing services to the outside world, as well as templating and a service catalog.

The minimum deployment unit in OpenShift is a Pod. Containers run inside a Pod.

A Kubernetes Pod defines:

• Container Same concept as a Docker container, explained in 2.1.5.

• Volumes Same concept as a Kubernetes Volume, explained in 2.1.5.

Pods are managed by controllers. For pods that are not expected to terminate, such as web servers, a Replication Controller should be used. The Kubernetes ReplicationController19 object ensures that a specified number of pod replicas are running at any one time, making sure that a pod or a homogeneous set of pods is always up and available.

A Kubernetes ReplicationController object defines the following settings:

• The number of replicas desired (which can be adjusted at runtime).

• A pod definition to use when creating a replicated pod.

OpenShift expands the ReplicationController with the concept of a deployment, adding expanded support for the software development and deployment life cycle. Deployments are defined using the OpenShift DeploymentConfig20 object. They are created manually or in response to triggered events.

The OpenShift DeploymentConfig object defines the following details of a deployment:

• The elements of a Replication Controller definition.

• The strategy for transitioning between deployments.

• Triggers for creating a new deployment automatically.
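A hedged sketch of a DeploymentConfig covering these three elements follows; all names are placeholders, the exact fields may differ between OpenShift versions, and this is not the template of Appendix B.

```yaml
# Sketch of an OpenShift DeploymentConfig; names and image references are placeholders.
apiVersion: v1
kind: DeploymentConfig
metadata:
  name: demo-app
spec:
  replicas: 2                    # Replication Controller part: desired replicas
  selector:
    app: demo-app
  strategy:
    type: Rolling                # strategy for transitioning between deployments
  triggers:
    - type: ConfigChange         # new deployment when this configuration changes
    - type: ImageChange          # new deployment when the referenced image changes
      imageChangeParams:
        automatic: true
        containerNames:
          - demo-app
        from:
          kind: ImageStreamTag
          name: demo-app:latest
  template:                      # pod definition, as in a Replication Controller
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: demo-app
          image: registry.example.com/demo-app:latest
          ports:
            - containerPort: 8080
```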

15AWS Elastic BeanStalk - https://aws.amazon.com/elasticbeanstalk/

16Google App Engine - https://cloud.google.com/appengine/

17CloudFoundry - https://www.cloudfoundry.org/

18Openshift - https://www.openshift.com

19ReplicationController - https://kubernetes.io/docs/concepts/workloads/controllers/replicationcontroller/

20OpenShift: Deployments and deployment configuration - https://docs.openshift.com/container-platform/3.7/architecture/core_concepts/deployments.html#deployments-and-deployment-configurations


All Kubernetes objects are present in OpenShift, but not all OpenShift objects are present in Kubernetes. It would be accurate to refer to the objects that are present in OpenShift but not in Kubernetes as OpenShift objects, and to those native to Kubernetes as Kubernetes objects. However, for the sake of simplicity, in this thesis Kubernetes-native objects will sometimes also be called OpenShift objects.

2.1.6 Continuous Deployment

Several recommendations from the Twelve-Factor App are achieved not through the code, but through development workflows. Development workflows define the process from the writing of the source code to the running of the application in production. For cloud applications, development workflows should run automated and reliably, with as little human intervention as possible [3].

Development workflows can have different levels of complexity. Build systems are a basic start, automating the compilation, dependency resolution, and packaging of software projects. Continuous Integration goes one step further, orchestrating the whole development workflow from building artifacts to automated testing and deployments. Continuous Delivery builds on Continuous Integration, automating the shipping of built software to specific environments on each build. Continuous Deployment expands the concept of Continuous Delivery by automatically deploying each committed software version to production.

The main building block of the Continuous Delivery process is the automation pipeline.

This automation pipeline will include several steps, such as retrieving the code from the repository, publishing the artifacts, or deploying the app. In each of these steps other technologies are used: a code repository, an artifact repository, testing tools, or deployment platforms.

In this thesis, testing phases will not be included because our application does not contain any tests. This is not a recommended practice: tests should be added to the application. However, this task is outside of the scope of the presented work.

Figure 2.1: High-level overview of a Continuous Delivery pipeline. Obtained from Daschner [3].

Several techniques can be used in automation pipelines. In this project's pipeline, two techniques are used: IaC and PaC.

• Infrastructure as Code (IaC) The idea behind IaC is that all required steps, configuration, and versions are explicitly defined as code. These code definitions are directly used to configure the infrastructure. This can be done in different ways, from shell scripts to a declarative style using additional tools. IaC will be used with containers and with the PaaS. The former enables building containers automatically within the automation pipeline. The latter enables describing the PaaS infrastructure as code so that it can be deployed automatically.

• Pipeline as Code (PaC) Pipeline as Code definitions specify the Continuous Delivery pipeline as part of the software project. In this project, PaC will be used to configure the automation pipeline using code.

Automation Pipeline: Jenkins

Continuous Delivery pipelines consist of several pipeline build steps that are executed as part of a single build. Builds are usually triggered by committing or pushing code changes into version control.

There are several automation pipeline tools, such as Travis CI, Codeship, or Jenkins. For this project, Jenkins will be used. Jenkins21 is the leading open source automation server. It provides tools for Continuous Integration and Continuous Delivery using pipeline as code. In Jenkins, pipeline as code is defined in a Jenkinsfile, which is written using a Groovy DSL. Groovy is an optionally typed, dynamic JVM language that is well suited for DSLs and scripts.

Jenkins' architecture is Master+Agent22. The master is designed to do coordination and provide the GUI and API endpoints, and the agents are designed to perform the work. A server that runs an agent is often referred to as a node.
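As a minimal illustration of pipeline as code in Jenkins, the declarative Jenkinsfile sketch below defines a checkout, build, and image publication pipeline; the stage names, registry, and commands are assumptions and do not reproduce the Jenkinsfile of Appendix C.

```groovy
// Minimal declarative Jenkinsfile sketch; stages, registry, and commands are illustrative.
pipeline {
    agent any                               // run on any available agent (node)

    stages {
        stage('Checkout') {
            steps {
                checkout scm                // retrieve the code from the repository
            }
        }
        stage('Build') {
            steps {
                sh 'mvn -B clean package'   // build the binaries with Maven
            }
        }
        stage('Docker image') {
            steps {
                sh 'docker build -t registry.example.com/demo-app:${BUILD_NUMBER} .'
            }
        }
        stage('Publish') {
            steps {
                sh 'docker push registry.example.com/demo-app:${BUILD_NUMBER}'
            }
        }
    }
}
```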

Version Control System (VCS) and Code Repository: git and Bitbucket

Twelve-factor App methodology recommends that the application is always tracked in a version control system. A code repository is a copy of the revision tracking database. It is often shortened to code repo or just repo. The motivation behind this practice is to keep all code needed to build the application in one central place. The first step of a Continuous Deployment pipeline is the retrieval of code from the code repository.

Several version control systems exist, such as git23 or SVN24. The former will be used in this project. Several solutions can be found to store code based on git, such as GitHub25, GitLab26 or Bitbucket27. The latter will be used in this project.

Several git-related concepts will be used in this project:

21Jenkins - https://jenkins.io/

22Jenkins: Distributed Builds - https://wiki.jenkins.io/display/JENKINS/Distributed+builds

23Git - https://git-scm.com

24Apache Subversion (SVN) - https://subversion.apache.org

25GitHub - https://github.com

26GitLab - https://gitlab.com

27Bitbucket - https://bitbucket.org


• Commit Record changes in the repository.

• Release Git offers the feature of adding tags to specific points in history to mark them as important. This functionality is typically used to mark release points.

• Branch A branch in Git is a movable pointer to one commit. The default branch name in Git is master. When the developer initially makes commits, she is given a master branch that points to the last commit made. If she creates a new branch, a new pointer is created for that branch.

Build Artifact: Maven

The build process is the step in which Java code is compiled into bytecode. This step is repeated every time changes are made in the project. In the enterprise world, several different frameworks and libraries are used in the same project. Consequently, it is essential to organize and define all dependencies on APIs and implementations [3]. Packaging the compiled classes and their dependencies into deployment artifacts is also part of the build process. Build systems are integrated and launched from the Continuous Deployment pipeline.

Apache Maven28, also referred to as Maven, and Gradle29 are two popular build tools for Java. In this project, Apache Maven will be used. Maven addresses two aspects of building software: first, it describes how software is built, and second, it describes its dependencies. An XML file describes the software project being built, its dependencies on other external modules and components, the build order, directories, and required plug-ins. It comes with pre-defined targets for performing certain well-defined tasks such as compilation of code and its packaging.
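A minimal pom.xml sketch illustrating both aspects (explicit, versioned dependencies and packaging) follows; the coordinates and versions are placeholders rather than the candidate application's actual build file.

```xml
<!-- Minimal pom.xml sketch; coordinates and versions are placeholders. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.example</groupId>
  <artifactId>demo-app</artifactId>
  <version>1.0.0</version>
  <packaging>war</packaging>

  <dependencies>
    <!-- Explicitly declared, versioned dependency (Twelve-Factor factor 2). -->
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-web</artifactId>
      <version>1.5.9.RELEASE</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <!-- Packages the compiled classes and dependencies into the deployment artifact. -->
      <plugin>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-maven-plugin</artifactId>
        <version>1.5.9.RELEASE</version>
      </plugin>
    </plugins>
  </build>
</project>
```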

Build Container: Docker

Docker was already introduced in 2.1.5. One step of the Continuous Deployment pipeline is to build the Docker image, containing all the software necessary for the application to work.

Docker Registry: JFrog Artifactory

The Twelve-Factor App methodology recommends tracking not only the code in a version control system, but all artifacts. The motivation behind this practice is to keep all the artifacts needed to ship the application in one central place. Artifact repositories save the built artifact versions for later retrieval. This is important for SaaS applications, since previous versions of the application need to remain accessible, for example for rollback, or for deploying a different version to integration and production.

There are several solutions, most of them depending on the kind of artifacts that need to be stored. Since our artifacts will consist of Docker images, a solution providing version control for Docker images is needed. Some options are Sonatype Nexus or JFrog Artifactory. We use JFrog Artifactory.

28Apache Maven - https://maven.apache.org/

29Gradle - https://gradle.org/


2.2 Related Work

Similar work can be found in research literature. Balalaie et al. [28][29] present their experiences in migrating a Java application to the cloud. The work presents the original architecture before migration to the cloud, the target architecture, and the migration steps.

Instead of migrating a single service as presented in this master thesis, it presents the migration of several microservices. Development workflows are also implemented, allowing Continuous Deployment. They use similar technologies: Docker, Kubernetes, Jenkins.

Instead of deploying to OpenShift, they deploy to a CoreOS cluster, and they use GitLab instead of Bitbucket as the code repository, both using git as the version control technology. They provide a high-level view of their migration process. In contrast, this master thesis will provide further details of each step and its configuration.

Certain parts of the project resemble other people's work. Part of the work of Slominski, Muthusamy, and Khalaf [22] is similar to our project: in their case, the migration to the cloud is the first step toward reaching a multi-tenant cloud service. Truyen et al. [30] propose an architecture, in a similar way to our Chapter 5; a SWOT analysis of pros and cons is made, but an implementation is not presented. Pahl [31] focuses on the containerization of applications, discussing the requirements, for later deployment to PaaS. Wurster et al. [23] use the Twelve-Factor App methodology to build scalable, maintainable, and portable applications.


3 Architecture of Candidate Application

The candidate application for migration to PaaS is T-REX, an Amadeus internal tool. From now on, when referring to T-REX as the application prior to the migration, it will be called T-REX Original.

In this chapter, T-REX Original's software architecture, technologies used, deployment, and development workflow are presented.

3.1 Software Architecture

T-REX Original is a web application that works as a Jenkins templater. It is a User Interface (UI) to create and launch automated test campaigns for Amadeus internal tools. That way, the developer can already have a preliminary assessment of the quality of her tests before trying to merge the changes into the codebase or deliver them to the client.

Figure 3.1: T-REX Original User Interface

T-REX Original follows a traditional client-server architecture, formed by a front-end and a back-end. The front-end allows selecting the configurable options for the automated test. The back-end is in charge of creating and launching this test in Jenkins.


The application is served with a Tomcat server. This server is shared with two other applications, both Amadeus internal tools: Slam and Cops.

The application makes use of two databases: the T-REX database and the Spin database. The connection with these databases is made using JDBC.

Communication with external dependencies (LDAP, Carto Service, and Jenkins instance) is done through HTTP. In addition, authentication is done through the Amadeus LDAP1. The tests are created in a Jenkins instance.

Figure 3.2: T-REX Original Software Architecture

3.2 Technologies

T-REX Original's back-end is developed with Java EE. In order to facilitate the back-end development, the Spring Framework is used, with Spring's convention-over-configuration solution, Spring Boot.

T-REX Original's front-end is developed with AngularJS2. AngularJS is a JavaScript front-end web application framework[32]. It is used in combination with HTML3 and CSS4.

T-REX Original's development workflow is composed of a codebase repository, an automation server to perform Continuous Build, and the server on which the application runs. Bitbucket5 is used as the codebase repository, Jenkins6 as the automation server for Continuous Build, and Apache Maven7 as the Java build tool.

1Lightweight Directory Access Protocol - https://msdn.microsoft.com/en-us/library/aa367008(v=vs.85).aspx

2AngularJS - https://angularjs.org/

3HTML - https://www.w3.org/html/

4CSS - https://www.w3.org/Style/CSS/

5Bitbucket - https://bitbucket.org/

6Jenkins - https://jenkins.io/

7Apache Maven - https://maven.apache.org/


3.3 Codebase

T-REX Original's codebase is stored in the Amadeus Bitbucket. It has two main branches: 'master' and 'next'. The 'master' branch contains the production code. The 'next' branch contains development and integration code.

For development, the 'next' branch is used. The developer should launch the application with Maven using the 'development' profile. For integration, the 'next' branch is used, and a ROOT.xml file with the DataSource configuration is present on the integration server. For production, the 'master' branch should be used, and a ROOT.xml file with the DataSource configuration is present on the production server.

3.4 Deployment

In a traditional structured process for software development, there are four tiers: development, integration, staging, and production, as introduced in 2.1.2.

In T-REX Original, development, integration and production tiers are used. Staging was not found necessary by the creators of the application.

Spring Boot provides a mechanism for enabling different configurations for different environments: Spring Boot profiles. By using profiles, a different configuration is set up for development, integration, and production.

Each of the environments is set up as follows:

• Development The application runs on the developer's workstation. T-REX Original is served by an embedded Tomcat server provided natively by Spring Boot. Development versions of the T-REX database and the Spin database should be running on a local server on the developer's workstation. The application is started with Maven. For this tier, the 'development' Spring Boot profile is used.

• Integration T-REX Original is deployed on a Tomcat server reserved for integration. This server has access to the LDAP, the databases, and the Jenkins instance. This version connects to the integration versions of the T-REX Original DB and the Spin DB. For this tier, the 'integration' Spring Boot profile is used. The Tomcat server must be manually preconfigured to serve this application. Every time the application is updated, the developer stops the Tomcat server, substitutes the old binaries with the new ones, and restarts the server.

• Production T-REX Original is deployed on a Tomcat server reserved for production. This server has access to the LDAP, the databases, and the Jenkins instance. This version connects to the production versions of the T-REX Original DB and the Spin DB. For this tier, the 'production' Spring Boot profile is used. The Tomcat server must be manually preconfigured to serve the application. Every time the application is updated, the developer stops the Tomcat server, substitutes the old binaries with the new ones, and restarts the server.
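The development tier above relies on Spring Boot's embedded Tomcat server. A minimal, hypothetical entry point of this kind is sketched below; the real T-REX main class is not reproduced in this thesis.

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

// Hypothetical entry point. With spring-boot-starter-web on the classpath,
// running this class starts an embedded Tomcat server that serves the application.
@SpringBootApplication
public class TrexApplication {

    public static void main(String[] args) {
        // For the development tier this would be launched with the
        // 'development' Spring Boot profile, for example via Maven.
        SpringApplication.run(TrexApplication.class, args);
    }
}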

In order to connect to the T-REX Original and Spin databases, JDBC is used. Java uses the DataSource interface to connect through JDBC. Since each software development tier connects to a different version of the database, different DataSources must be provided. The DataSource for development is provided in the 'development' profile. The DataSources for integration and production are not present in a Spring Boot profile but in the Tomcat configuration: they must be set up manually on the integration Tomcat server and on the production Tomcat server. This is done for security reasons and is a usual practice [3]: the integration and production versions of the databases require credentials, and these should not be publicly visible in the code repository. The DataSource with the corresponding credentials is therefore configured manually on the integration server and on the production server using the context container.

8. Java 8, Interface DataSource - https://docs.oracle.com/javase/8/docs/api/javax/sql/DataSource.html
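To illustrate how the application can obtain a DataSource that has been declared in the Tomcat context rather than in the codebase, a minimal JNDI lookup is sketched below. The resource name jdbc/TrexDB is a placeholder; the actual name used in the ROOT.xml files is not shown in this thesis.

import java.sql.Connection;
import java.sql.SQLException;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

// Hypothetical sketch: the JNDI resource name is a placeholder.
public class TomcatDataSourceLookup {

    public Connection openConnection() throws NamingException, SQLException {
        // Tomcat exposes resources declared in the context (ROOT.xml) under java:comp/env,
        // so the credentials never have to appear in the code repository.
        InitialContext context = new InitialContext();
        DataSource dataSource = (DataSource) context.lookup("java:comp/env/jdbc/TrexDB");
        return dataSource.getConnection();
    }
}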

3.5 Development Workflow

Different development workflows are used for development, integration, and production.

• Development All steps are performed manually by the developer. The code is cloned from Bitbucket onto the developer's workstation and deployed locally using Maven; Spring Boot provides an embedded Tomcat server to run the application. The developer workstation must be able to access the databases, the LDAP, and the Ghost Service.

• Integration Continuous Build, launched automatically, followed by manual deployment. A Jenkins job performs Continuous Build as follows: every time there is a new commit on the 'next' branch, the Jenkins job checks out the T-REX Original code from Bitbucket and uses Maven to build the binaries, a WAR file. Once the WAR file is built, the developer deploys it manually on the integration server. The integration server must be able to access the databases, the LDAP, and the Ghost Service.

• Production Continuous Build, launched manually, followed by manual deployment. A Jenkins job performs Continuous Build as follows: every time a new release is prepared from 'master', the Jenkins job is started manually; it checks out the T-REX Original code from Bitbucket and uses Maven to build the binaries, a WAR file. Once the WAR file is built, the developer deploys it manually on the production server. The production server must be able to access the databases, the LDAP, and the Ghost Service.

3.6 Credentials

T-REX Original uses different types of credentials:

1. T-REX database and Spin DB credentials For development, credentials are provided in the Spring Boot profile. For integration and production, the database credentials are specified in the context of the Tomcat server, using the ROOT.xml file. To do so, the developer accesses the server manually and sets it up.

9. Apache Tomcat: The Context Container - https://tomcat.apache.org/tomcat-8.0-doc/config/context.html


Figure 3.3: T-REX Original Integration/Production Workflow

2. Bitbucket For development, Bitbucket credentials are provided by the developer. In the Continuous Build, they are provided in Jenkins using Jenkins Credentials.

10. Jenkins Credentials Binding - https://plugins.jenkins.io/credentials-binding


4 Analysis of Candidate Application

In this chapter, T-REX is analyzed to assess its compliance with cloud-native principles. For doing so, we use the Twelve-Factor App methodology [4] and the cloud-native principles defined by Kratzke and Quint [5].

4.1 Twelve-Factor App

The candidate application is not natively prepared to run in a PaaS. In this section, the Twelve-Factor App methodology is used to discover which properties the application lacks in order to be served from the cloud.

1. Codebase As explained in section 3.3, the T-REX code is kept under version control in a single repository, containing all specifications for the potentially different environments. However, some files are not present in this codebase: those containing credentials. This is a usual security practice. These files are added manually to the production server.

2. Dependencies As is usual in Java applications [3], T-REX dependencies are declared using a build system: Maven. In addition, software artifacts should be accessible via well-defined processes and distributed from a central place. For T-REX Original, the primary software artifact is the WAR file. This file is built in Jenkins and stored in its internal file system, but it is not published to any central artifact repository from which it could be accessed via well-defined processes.

3. Store config in environment Part of the configuration is handled automatically through Spring Boot profiles, which are selected using environment variables. Another part is done manually in the Tomcat server by the developer.

4. Backing Services The T-REX backing services consist of the databases (T-REX and Spin) and the attached services (LDAP, Jenkins, and Carto). Both databases are accessed through JDBC, and the services are accessed over HTTP. Consequently, T-REX fulfills this factor.

5. Build, Release, Run This is a popular approach among Java enterprise developers, and it is followed in T-REX. However, a large part of this process is manual. T-REX has set up a Continuous Build with Jenkins for integration and production. The release is done manually by the developer, who takes the resulting file of the Continuous Build and puts it onto the Tomcat server. Run is also manual, as the developer stops, starts, and restarts the Tomcat server to serve the application.

Factor          1  2  3  4  5  6  7  8  9  10  11  12
T-REX Original  ✓  ✗  ✗  ✓  ✗  ✗  ✓  ✗  ✓  ✗   ✗   ✓

Table 4.1: T-REX Original compliance with the Twelve-Factor App (✓ = factor fulfilled, ✗ = factor not fulfilled)

6. Processes Although T-REX is mainly a stateless application, it still relies on one stateful mechanism: sticky sessions. With sticky sessions, session data is stored in the local application instance. Consequently, if more than one instance is running at the same time and a request reaches a different instance, the session is lost (a sketch illustrating this issue is given after this list).

7. Port Binding Java EE applications typically expose services via port binding. As a Java EE application, T-REX also exposes its functionality, a UI for the user, via a network port. Consequently, T-REX already fulfills this factor.

8. Concurrency T-REX is not able to scale out via the process model. T-REX can scale vertically on the server on which it is deployed, but vertical scaling is limited, since the resources of the physical server are limited. T-REX cannot scale horizontally in an easy way: doing so would require setting up a load balancer that redirects users to the appropriate instance, keeps sessions, maintains consistency across instances, and so on.

9. Disposability As a Java EE application, T-REX supports fast startup and graceful shutdown, closing resources properly at JVM shutdown. Consequently, T-REX fulfills this factor.

10. Dev/prod Parity The development environment differs significantly from the integration and production environments. Integration and production have the same structure, deployed on different servers; this also poses a risk, since the servers could have different configurations.

11. Logs T-REX produces different logs, such as Java logs from the application code and Catalina logs from the Tomcat server. They are written to disk using a log rotation strategy. This is not the recommended way of dealing with logs for cloud applications, which should write only to standard output instead of keeping logs on disk.

12. Admin Processes The servers on which T-REX is deployed allow administrative tasks to be run, so this factor is fulfilled.
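To make the sticky-session issue of factor 6 concrete, the hypothetical controller below keeps a counter in the servlet HttpSession. It is not taken from the T-REX codebase; it only illustrates why state held in the session ties a user to one application instance, which in turn prevents the simple horizontal scaling discussed under factor 8.

import javax.servlet.http.HttpSession;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical controller illustrating state bound to a single instance
// through the HTTP session.
@RestController
public class SessionCounterController {

    @GetMapping("/visits")
    public String countVisits(HttpSession session) {
        Integer visits = (Integer) session.getAttribute("visits");
        visits = (visits == null) ? 1 : visits + 1;
        // The attribute lives in the memory of this instance only; another
        // instance behind a load balancer would start counting from scratch.
        session.setAttribute("visits", visits);
        return "Visits in this session: " + visits;
    }
}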

4.2 Cloud-native Principles

The candidate application is not prepared to run in the cloud. In this section, its compliance with cloud-native principles is analyzed.

• Operated on automation platforms T-REX is operated on a traditional server and served by Tomcat. This is not an automation platform, so this principle is violated.

• Softwarization of infrastructure and network The T-REX infrastructure is configured manually by the developer or by the operations team maintaining the server. It is not defined by software, so this principle is violated.

• Migration and interoperability across different cloud infrastructures and platforms T-REX is not prepared for easy migration from one cloud infrastructure to another, and it is even less prepared for interoperability among them. This principle is violated.


5 Presentation of Cloud Strategy

This chapter gives an overview of the strategy followed to adapt T-REX Original so that it can be deployed in a PaaS. The version prepared for the PaaS will be referred to as T-REX PHACO, PHACO being the PaaS instance in which T-REX aims to be deployed.

This chapter has three sections. In Cloud Architecture, we describe the high-level architecture that will be used for deploying T-REX PHACO in the PaaS. In Development Workflow, we describe the parts of the strategy not related to the PaaS, including the Continuous Deployment pipeline and the code and artifact repositories used. In the Twelve-Factor App section, we present how each principle of the methodology will be fulfilled with this strategy.

Figure 5.1: Target T-REX Cloud Architecture

