
Linköping University

Department of Computer and Information Science

Final Thesis

Protection of Non-Volatile Data

in IaaS-environments

by

Erik Sundqvist

LIU-IDA/LITH-EX-A--14/062--SE

2014-11-18

Supervisor: Marcus Bendtsen

Examiner: Prof. Nahid Shahmehri


Abstract

Infrastructure-as-a-Service (IaaS) cloud solutions continue to experience growth, but many enterprises and organizations are of the opinion that cloud adoption has decreased security in several aspects. This thesis addresses protection of IaaS-environment non-volatile data. A risk analysis is conducted, using the CORAS method, to identify and evaluate risks, and to propose treatments to those risks considered non-acceptable. The complex and distributed nature of an IaaS deployment is investigated to identify different approaches to data protection using encryption in combination with Trusted Computing principles. Additionally, the outcome of the risk analysis is used to decide the advantages and/or drawbacks of the different approaches: encryption on the storage host, on the compute host or inside the virtual machine. As a result of this thesis, encryption on the compute host is found to be the most beneficial, due to minimal needs for trust, minimal data exposure and key management aspects. At the same time, a high grade of automation can be obtained, retaining usability for cloud consumers without any specific security knowledge. A revisited risk analysis shows that both acceptable and non-acceptable risks are mitigated and partly eliminated, but leaves virtual machine security as an important topic for further research. Along with the risk analysis and treatment proposal, this thesis provides a proof-of-concept implementation using encryption and Trusted Computing on the compute host to protect block storage data in an OpenStack environment. The implementation directly follows the Domain-Based Storage Protection (DBSP) protocol, invented by Ericsson Research and SICS, for key management and attestation of involved hosts.


Acknowledgements

This thesis was carried out at the security division of Ericsson Research and at the Swedish Institute of Computer Science (SICS), as a part of a joint project aiming to secure the use of Infrastructure-as-a-Service deployments of cloud computing. I would especially like to thank Fredric Morenius (Ericsson Research) for his input, advice, audits of this thesis report and help throughout the thesis process. I would also like to thank Nicolae Paladi (SICS) for his extraordinary support with the test-bed, deployment and troubleshooting, and Ph.D. Antonis Michalas (SICS) for his support and overall help throughout the work with this thesis. I would also like to thank Professor Nahid Shahmehri at The Institute of Technology at Linköping University for her advice and suggestions, and Marcus Bendtsen for his help and guidance regarding the methodology and structure of this thesis report. Thanks to Andras Mehes (Ericsson Research) and Mudassar Aslam (SICS) for their understanding and explanations of TPM aspects of Trusted Computing.


Contents

1 Introduction
  1.1 Purpose
  1.2 Thesis Outline
  1.3 Problem Description
  1.4 Motivation
  1.5 Aim
    1.5.1 Contribution
  1.6 Limitations and Scope
2 Background
  2.1 Cloud Computing Basics
    2.1.1 Service Models
    2.1.2 Deployment Models
    2.1.3 Virtualization
  2.2 Non-volatile Data in IaaS-environments
    2.2.1 Logical Volume Management
    2.2.2 Storage Security Concerns
  2.3 The OpenStack Cloud Platform
    2.3.1 Architectural Overview
  2.4 Trusted Computing
    2.4.1 Trusted Platform Module
  2.5 IaaS Storage Security in Research and Industry
  2.6 Related work
    2.6.1 Domain-Based Storage Protection
3 Methodology
  3.1 The CORAS Method
    3.1.1 Structured Brainstorming
4 Risk Analysis of Non-volatile Data in IaaS-environments
  4.1 Context Establishment
    4.1.1 Step 1 - Preparations and Initial Observations
    4.1.2 Step 2 - Scope of Analysis
    4.1.3 Step 3 - Defining Target of Analysis
    4.1.4 Step 4 - Target Approval
  4.2 Risk Assessment
    4.2.1 Step 5 - Risk Identification and IaaS Security Concerns
    4.2.2 Step 6 - Risk Level Estimation
    4.2.3 Step 7 - Risk Evaluation
  4.3 Risk Treatment
5 Non-volatile Data Protection in IaaS-environments
  5.1 Storage Host Encryption
  5.2 Compute Host Encryption
  5.3 Intra-VM Encryption
  5.4 Choice of Data Protection Architecture
6 Implementing Block Storage Protection in OpenStack
  6.1 Implementation Design Choices
    6.1.1 Passivity
    6.1.2 Cloud Platform Independence
  6.2 Architecture
    6.2.1 Modifications to OpenStack
    6.2.2 Requirements on Verifiability
  6.3 Comments on Performance
  6.4 Comments on Applicability in OpenStack
7 Repeated Estimation and Evaluation of Risks
  7.1 Repeated Risk Level Estimation
  7.2 Repeated Risk Evaluation
8 Discussion and Conclusions
  8.1 Evaluation of Results
    8.1.1 Revisiting the CIA-triad
  8.2 Evaluation of Method
  8.3 Conclusions
    8.3.1 Contribution
    8.3.2 Recommendations
    8.3.3 Further Improvements to the Implementation
Bibliography
A CORAS Diagrams
  A.1 Context Establishment
  A.2 Risk Assessment Diagrams
  A.3 Risk Treatment Diagrams

List of Figures

2.1 A simplified high-level picture of a multi-tenant IaaS cloud architecture.
2.2 Basic structure of logical volume management.
2.3 How issues are transformed using cryptographic techniques.
2.4 Basic architecture of a small OpenStack-based IaaS cloud environment consisting of five separate hosts with block storage and networking capabilities.
3.1 The eight steps of the CORAS approach.
4.1 Symbols of the CORAS risk modelling language.
4.2 Overall target model of non-volatile data in IaaS environments.
4.3 Scope of analysis and physical classification of cloud services.
4.4 Complete target model for the risk analysis.
4.5 Asset diagram with cloud service user/tenant as stakeholder.
4.6 Asset diagram with cloud service provider as party.
4.7 Risk evaluation criteria.
4.8 Potential attack vectors using virtualization technologies.
4.9 Risk classification using the risk evaluation criteria.
5.1 Target model with data encryption and decryption on the storage host.
5.2 Target model with data encryption and decryption on the compute host.
5.3 Target model with data encryption and decryption inside the virtual machine on the compute host.
6.1 Overall architecture and interconnectivity between system components.
6.2 Secure Component basic architecture and libraries.
6.3 Additional time required for unlocking and locking of volumes introduced by this implementation of DBSP.
7.1 Re-evaluated risk classification using the risk evaluation criteria.
7.2 Distribution of risks before and after treatment.
A.1 Sequence diagram of creation of a volume.
A.2 Sequence diagram of attachment of a volume.
A.3 Sequence diagram of data read e.g. using SSH.
A.4 Threat diagram showing harm to indirect assets.
A.5 Threat diagram of deliberate risks.
A.6 Threat diagram of accidental risks.
A.7 Final threat diagram with estimated likelihoods.
A.8 Risk picture of non-volatile data in infrastructure clouds.
A.10 Threat diagram showing treatments to mitigate risks.
A.11 Threat diagram with re-estimated likelihoods and consequences.
B.1 Sequence diagram illustrating attachment of an unencrypted volume, showing the extensions made to the original sequence of actions.
B.2 Sequence diagram illustrating attachment of an encrypted volume, showing the extensions made to the original sequence of actions.

List of Tables

4.1 Estimated scale of likelihoods.
4.2 Asset table of cloud service user/tenant.
4.3 Asset table of cloud service provider.
4.4 Consequence scale of asset user data.
4.5 Consequence scale of asset cloud management software.
4.6 Consequence scale of asset user VM.
4.7 Risks to direct assets.
4.8 Risks to indirect assets.
5.1 Concluding advantages and disadvantages of different solutions.
7.1 Re-evaluated risks to direct assets.
7.2 Re-evaluated risks to indirect assets.


Chapter 1

Introduction

Over the past few years, the concept of cloud computing has experienced a rapid and significant increase in popularity. By applying the concept of resource sharing, enterprises and organizations deploying their applications in the cloud are offered scalability, reliability and availability at a reasonable cost in a pay-per-use model. The era of cloud computing is still highly topical in general, and the Infrastructure-as-a-Service model [18] in particular, due to increased customer interest in running scalable virtual servers in the cloud. It is understandable that enterprises and organizations in possession of sensitive data, e.g. personal data or health records, ask themselves how to comply with strict requirements on security while taking advantage of the wide range of benefits offered by cloud computing. This thesis aims to investigate such security aspects of cloud computing; more particularly, ensuring secure storage and use of non-volatile data in IaaS-environments.

To estimate the assumed decrease in security implied by storing and using data in IaaS-environments, compared to traditional storage on-site, a risk analysis is conducted using the CORAS method [35], yielding a summary of potential risks and treatments. The treatments are further investigated and combined into an overall security mechanism implementing encryption and Trusted Computing principles [10] to secure non-volatile data. To estimate the effect, risks are re-evaluated, and finally the security mechanism is implemented in an OpenStack environment as a prototype proving the concept. The implementation is based on the findings during the risk treatment, adopting a protocol [29] created by Ericsson Research and SICS.

1.1 Purpose

Refraining from adopting IaaS cloud solutions due to security reasons implies customer loss of the various appealing characteristics enabled by cloud computing. Cloud computing indeed introduces a variety of security issues, but the use of state-of-the-art technologies may facilitate addressing such issues, thus encouraging further adoption of IaaS cloud solutions.

This thesis is also a part of ongoing research by Ericsson Research and SICS, aiming to decide whether it is possible to migrate security-critical information systems to an infrastructure cloud environment. As a study object for the overall project, a health-care system is chosen.


The reasons for migrating such systems into a cloud environment are mainly financial, but due to the sensitive nature of health records and other data in such systems, strict security requirements must be met. A Swedish county is involved, providing health-care system software, and a pilot implementation of a secure cloud environment where such systems could be deployed and security-evaluated is under development. The secure cloud environment is based on the OpenStack cloud platform, thus motivating the choice of cloud platform in this thesis. Furthermore, involvement in this project has influenced the general direction of the thesis.

The motivation in section 1.4 furthermore substantiates the problem and thus the purpose and goals of this thesis.

1.2 Thesis Outline

Following the introduction, chapter 2 describes the background and the theoretical framework forming the basis of this thesis. The background consists of a brief introduction to cloud computing in general and the Infrastructure-as-a-Service model in particular, briefly describing virtualization and its role in cloud computing. Furthermore, non-volatile storage in IaaS-environments is described. Additionally, Trusted Computing principles and the OpenStack platform are described. Following the background, the overall methodology and the CORAS method are described in chapter 3.

The analysis part of the report is divided into three chapters, with one additional chapter providing a brief standalone description of the implementation (chapter 6). First, chapter 4 presents the documented results of the risk analysis: identified security concerns, risks, risk levels and risk-treating concepts. In chapter 5, the risk-treating concepts are combined into different solutions aiming towards risk mitigation. Finally, in chapter 7, risks are re-evaluated, followed by a discussion in chapter 8 containing an evaluation of the results and the method. Furthermore, chapter 8 concludes the thesis by providing recommendations for future work and briefly discussing the contribution.

Reading Instructions

A number of sections in the background contain information needed to fully understand the implementation. Not all information in these sections is necessary for understanding the risk analysis, but it is helpful. Such sections are 2.4.1, describing more detailed aspects of Trusted Computing, and 2.6.1, explaining the key management protocol implemented in chapter 6. Furthermore, readers may skip sections in the background chapter according to their knowledge in the area.

1.3 Problem Description

Adoption of cloud solutions has increased significantly and rapidly over the past few years. Extended support and improved performance of virtualization technologies have contributed to making the Infrastructure-as-a-Service (IaaS) service model increasingly popular. According to a survey initiated by the Everest Group, IaaS is the only cloud service model that experienced an increasing number of consumers between 2012 and 2013 [15]. The number of enterprises and organizations planning to adopt IaaS cloud solutions in the near future has also increased significantly during the same period, according to [15]. The referenced survey also shows that private and hybrid cloud solutions in particular are the IaaS deployment models becoming increasingly popular. While public cloud adoption is subject to some growth, future plans of enterprises and organizations are rather aiming towards the private and hybrid IaaS deployment models.


Another survey, conducted by the Ponemon Institute, reveals that almost half of the respondents believe that cloud adoption has decreased their security, and there are also major differences between consumers in their attitude about whom to hold responsible for securing and protecting data in the cloud; the consumer or the provider [18]. Taking the two above surveys into consideration, and adding the fact that the Cloud Security Alliance presents data breaches as the cloud computing top threat of 2013 [2], it is reasonable to suspect that lack of protection of data at rest, in transfer and in use is preventing potential cloud consumers from adopting public infrastructure cloud solutions. Instead, enterprises and organizations are using private cloud solutions to remain separated from others, or hybrid cloud solutions where sensitive data are handled separately.

There are already various existing solutions available addressing a variety of security problems such as confidentiality, integrity, key management and trust, as mentioned in chapter 2. Combining such solutions into an overall and accessible storage protection solution for infrastructure clouds is a logical step towards a more secure use of cloud computing, but it can certainly be performed in several ways, resulting in various characteristics of security, performance and usability.

1.4 Motivation

According to surveys [15] and [18], enterprises and organizations globally are uncertain whether sufficient security measures have been taken to ensure protection of data stored in public IaaS environments. Thus, by adopting private or hybrid IaaS solutions and refraining from using public IaaS environments, enterprises and organizations decrease their exposure and thereby the risk of unauthorized access to sensitive data. At the same time, the opportunity to make use of the often vast computing and storage resources available in a public IaaS environment is missed. Furthermore, single-tenancy tends to inhibit the degree to which resource sharing is adopted, potentially leaving computing resources unused, thus in some sense violating one of the most fundamental concepts of cloud computing.

1.5 Aim

A computer cloud is a complex structure, and which architecture and characteristics are desirable for a data protection mechanism depends on how the cloud is intended to be used, in combination with the needs of the cloud consumer (e.g. the balance between expectations on security and performance). There are four major questions which this thesis investigates:

Q1 What are the main risks and threat scenarios when using and storing non-volatile data in IaaS-environments?

Q2 How can a data protection mechanism be implemented to mitigate or eliminate risks, i.e. allow consumers to benefit from cloud computing without a decrease in security?

Q3 How can Trusted Computing principles be used to address important issues such as key management and trust in a data protection mechanism?

Q4 How prepared is an OpenStack-environment for the modifications needed to adapt a selected data protection mechanism?


The first question deems a risk analysis necessary to gain understanding of what to protect and what risks exist. The second question intends to clarify and analyse how different architectures of data protection mechanisms, i.e. design choices of the basic components and the location of such components in the cloud computing stack, affect the ability to protect non-volatile data in IaaS-environments. The third question aims to investigate how Trusted Computing principles can be used for further security improvements. The fourth question is addressed through a proof-of-concept implementation and aims to evaluate how the risks are treated, as well as the adaptability of the OpenStack-environment to the selected mechanism.

1.5.1 Contribution

This thesis contributes to the discussion and development of cloud data security solutions, benefiting further use and new adoption of public IaaS clouds in particular, but also continued security-related development of IaaS as a cloud service model in general. The risk analysis also adds to the general discussion of security aspects in cloud computing, which is highly motivated as enterprises and organizations migrate to the cloud. Furthermore, the implementation highlights a practical use-case of Trusted Computing principles.

1.6 Limitations and Scope

Analysis of different approaches to non-volatile data protection can quickly grow to a subject of unmanageable proportions, thus some limitations in the scope of this thesis are inevitable to be able to come up with concrete design choices, and from that produce a reasonable prototype. Assumptions, limitations and scope of the risk analysis are described separately in chapter 4.

- Data to Protect

The risk analysis and the resulting protection mechanism are supposed to protect data in block storage resources and not just arbitrary data. Reasoning during the risk analysis applies to non-volatile data in IaaS-environments in general, but the implementation is focused on block storage data. However, it is reasonable to consider protection of data on its way to or from a block storage resource as a feature of such a protection mechanism. Protecting data during transfer between e.g. two virtual machines without involving a storage resource is thus not in scope for the protection mechanism. A general data breach due to a malicious virtual machine is also not considered, but data breaches that may directly affect protection of data in a storage resource, e.g. leaked cryptographic keys, are taken into consideration and discussed briefly, as this relates to the key management problem.

- Cryptographic techniques

The protection mechanism relies on existing cryptographic techniques considered to be secure. Although it is theoretically possible to break these cryptographic techniques, such aspects are not considered a point of failure.

- Trusted Third Party

A Trusted Third Party (TTP) is assumed to be completely trustworthy and successful in fulfilling its tasks. Even if that is not always the case and a trusted third party may fail, no data protection mechanism in this thesis takes any actions to mitigate risk of such a failure, or to prevent any potential damage of such a failure.


- Trusted Platform Module

A Trusted Platform Module (TPM) is always considered to be secure in the context of this data protection mechanism. This means that metrics stored in any of the platform configuration registers (PCRs) are considered genuine and unforgeable. Cryptographic keys bound or sealed to a TPM are also considered completely secure in the context of the TPM. There are some assumptions made on what measurements can be made by the TPM. To produce a measurement able to be verified, it has to be the same between restarts of the trusted host, which implies that code has to be hashed in the same order, which is not the case for all operating system code (e.g. dynamically loaded libraries). In these protection mechanisms, the measurement and verification capabilities of the TPM might be extended, allowing the whole operating system to be a part of the trusted computing base for any host.

- Trusted Computing

Other aspects vital in fulfilling the security purpose of Trusted Computing, in addition to trust in the TTP and trust in the TPM, are also assumed to be secure.

- Cloud Management Platform

The implementation is made in an OpenStack-environment, but the conceptual design should be as applicable to other cloud management platforms as possible. However, the selection of a certain platform implies that the OpenStack-environment will be present in the analysis in cases where a specific issue or behavior needs to be put in the context of a cloud environment.

- Security of Virtual Machines

Breach of confidentiality caused by exploited vulnerabilities in a virtual machine in control of a cloud consumer is not actively prevented in the data protection mechanism discussed in this project. The security solutions assume that the cloud consumer or the cloud service provider is taking necessary precautions to avoid such exploits. However, the potential outcome of such an exploit is briefly discussed in the context of the different solutions.

- Privacy and Anonymity

Anonymity aspects of the implemented mechanism are out of scope of this thesis, and thus neither considered nor discussed.

From here and throughout the thesis report, the above-mentioned assumptions are assumed to hold.


Chapter 2

Background

The results of the literature studies are presented in this background chapter, organized in two parts, where the first part covers the fundamentals of cloud computing and use of non-volatile data in infrastructure cloud computing. The second part briefly explains OpenStack and Trusted Computing, which are used in the proof-of-concept implementation.

The term cloud computing has many explanations and there is no single agreed definition of cloud computing. It is referred to as a new computer paradigm and sometimes even a new technology [6]. The purpose of this thesis is not to contribute to the large variety of explanations and interpretations already available, but to be able to refer to some properties of cloud computing and address security concerns, a short review of its fundamental concepts is necessary. Readers who are already familiar with the topics covered in this chapter may skip these sections accordingly.

2.1 Cloud Computing Basics

The National Institute of Standards and Technology (NIST), a US measurement standards laboratory, has composed a widely used high-level definition of cloud computing [23]:

"Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction."

In this thesis, cloud computing aspects are explained according to the cloud definition developed by NIST. According to this definition, a cloud can typically be described as having five essential characteristics: on-demand self-service, broad network access, resource pooling (multiple consumers share the resources provisioned by the cloud provider in a multi-tenant model), rapid elasticity and measured service. From a security point of view, the resource pooling characteristic may be the most important and challenging, since the hardware boundary provided by separate computational hardware resources is put out of action.


2.1.1 Service Models

NIST classifies a cloud deployment as one out of three types of service models [23]. The fundamental service model provides a basis upon which the other service models are built. Simply put, the service model declares what the cloud consumer is able to use and control; just an application, an execution environment or an entire platform.

• Software as a Service (SaaS) - A cloud consumer is only allowed to use certain applications provided by the cloud provider. Applications are accessible from a client device (web browsers and programs are typical client devices), and configuration of applications using application-wide or system-wide settings is not permitted. SaaS is the most confined service model and builds on top of a cloud platform, and finally a cloud infrastructure. Web-based e-mail and office suites are examples of SaaS implementations.

• Platform as a Service (PaaS) - A cloud consumer holds capabilities to use the underlying platform to deploy its own applications, but is limited to using programming languages, libraries and tools that the cloud provider has chosen to support in the underlying infrastructure. The result is a confined environment, i.e. a platform. A web hosting service is one example of a PaaS provider, where a user deploys its own web applications using supported technologies.

• Infrastructure as a Service (IaaS) - The most fundamental service model, where the cloud consumer is in command of an entire virtual platform. Implicitly, the user is in full control of the operating system and able to deploy and run arbitrary software. The underlying cloud infrastructure is still out of control of the cloud consumer. Amazon Web Services is an example of a well-known IaaS vendor.

2.1.2 Deployment Models

According to NIST, how a cloud is deployed can be categorized into four different types depending on the intended group of tenants and their concerns:

• Private cloud - The cloud infrastructure is exclusively used by a single tenant, e.g. an organization or enterprise. Resource sharing is only applicable between cloud users within the same organization or enterprise.

• Community cloud - The cloud infrastructure is used by several tenants with common concerns. Resource sharing in this deployment model is between several tenants.

• Public cloud - The cloud infrastructure is accessible and open for use by any tenant (the general public), pushing the concept of resource sharing to its maximum.

• Hybrid cloud - A cloud infrastructure composed of two or more cloud infrastructures of the other deployment models, combined into one single cloud infrastructure.

Despite being classified as a private cloud or a community cloud, the cloud infrastructure may be provided and managed by an organization or a third party. Thus, the cloud deployment model only classifies a cloud infrastructure with respect to its intended type of usage.


As the focus of this thesis is to investigate security of non-volatile data in infrastructure clouds, the term cloud computing is from here onwards used interchangeably with infrastructure cloud computing. A simplified high-level picture of an IaaS cloud architecture is shown in Figure 2.1, illustrating k tenants jointly running j virtual guest machines sharing the computational resources of i physical hosts. Such resources are provisioned by a cloud provider through virtualization and managed by a cloud management platform.

[Figure 2.1: A simplified high-level picture of a multi-tenant IaaS cloud architecture.]

2.1.3 Virtualization

Virtualization is one of the key technologies in cloud computing, enabling resource sharing and also rapid elasticity as hardware is emulated. According to VMware (a software company providing virtualization software, among other things), "the term virtualization broadly describes the separation of a resource or request of a service from the underlying physical delivery of that service" [38]. In cloud computing, hardware virtualization is adopted to create one or several instances of virtual hardware upon a single instance of physical hardware. This allows one or several guest operating systems to run on a separate confined platform residing in a single physical machine. The separation between physical hardware and virtual hardware is achieved by introducing an abstraction layer which mediates all access between any so-called virtual machine and the physical machine [34]. This abstraction layer is referred to as a Virtual Machine Monitor [31] (VMM) or a hypervisor (the two terms are used interchangeably), and in addition to running and managing virtual machines, the hypervisor is also responsible for enforcing separation.

There exist various flavors of hardware virtualization [1], where full virtualization and paravirtualization are commonly used. Full virtualization enables configuration and emulation of a complete set of hardware, making it possible for an unmodified operating system to run on the virtual machine, even unaware of the fact that the machine is virtual [11]. For increased performance, direct device assignment can also be adopted to replace the performance-demanding emulation [4]. The other type of virtualization is paravirtualization, where the hypervisor makes one or several types of hardware resources of the physical host available to the guest through interfaces, thus eliminating the need for the performance-demanding emulation of all hardware [34]. Since the guest needs to run a modified operating system to make use of the interfaced hardware in paravirtualization, compatibility has to some extent been traded for performance.


2.2 Non-volatile Data in IaaS-environments

Each virtual machine is assigned its own instance storage, enabling storage of arbitrary data such as logs and configuration files. This storage is normally neither durable nor persisted beyond the end of the virtual machine life-cycle. In this sense, launching an instance of a virtual machine is comparable to booting a Live-CD on a desktop computer; saving data to files is possible, but everything is gone once the virtual machine is rebooted. Consequently, cloud consumers are likely to need persistent storage to store arbitrary data over time, where object storage and block storage are two common approaches [9].

- Block Storage - Using block storage, data are stored in blocks, i.e. sequences of bytes, which are written to/read from a block device. A block device is the most raw and lowest-level representation of a storage device, revealing the underlying physical structure to the user. Commonly, the block device is used to host a file-system, a layer of abstraction between the user and the end-device resulting in a more convenient representation, allowing the user to handle and organize files instead of blocks. Such abstraction layers are not always desirable, and there are applications which may benefit from running directly on a block device, such as databases. As databases may want to implement a specific type of caching, the caching capabilities provided and controlled by the operating system are not always desirable. A major benefit of using block devices is the freedom of choice, i.e. the possibility for users to deploy a file-system of their choice or use their own application-specific block device representations. To a virtual machine, the block storage device appears as directly attached to that certain machine, and is accessed through any device access mechanism provided by the operating system. The freedom of choice has made block devices an important approach to persistent storage in infrastructure clouds. (The block-level versus file-level distinction is illustrated in a short sketch following these descriptions.)

- Object Storage - The need for scalable and accessible storage which can be accessed from several devices simultaneously gave birth to object storage. Object storage is somewhat similar to a file-system deployed on a block device, but is generally accessed through a lightweight RESTful API, achieving broad accessibility. The abstraction layer also decouples the objects from the blocks where the data physically resides, making it highly scalable, since the natural boundaries of a block device are no longer a limitation. Objects are replicated in a distributed network of storage resources providing redundancy, and automated load balancing ensures high performance [8]. Objects are normally requested through a proxy, keeping track of at which storage resource an object is located.

Object storage is ideal for read-intensive data which is required to be widely accessible, such as images and pictures. As data is replicated in a distributed network, changes to objects may cause inconsistency, or an eventual delay while consistency is ensured. Thus, object storage is most suited for static data, or non-frequently modified data. It is important to mention that object storage is a type of storage in itself, and has a major area of use outside infrastructure cloud computing. Despite major benefits, there is still a need for block storage solutions in infrastructure clouds, as mentioned above. It is also important to mention that block storage can also be scalable to some extent, by adopting logical volume management (described in section 2.2.1) and using file-systems that allow resizing.
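To make the block-level versus file-level distinction concrete, the following Python sketch contrasts reading raw blocks from a block device with writing a file through a mounted file-system. The device and mount paths are hypothetical, such access requires appropriate privileges, and this is an illustration rather than part of the thesis implementation.

    import os

    # Hypothetical device path; reading a raw block device requires root
    # privileges. Inside a VM, an attached volume may appear as /dev/vdb.
    DEVICE = "/dev/vdb"
    BLOCK_SIZE = 4096  # a common block size

    # Raw block access: read the first block directly, no file-system involved.
    fd = os.open(DEVICE, os.O_RDONLY)
    try:
        first_block = os.read(fd, BLOCK_SIZE)
    finally:
        os.close(fd)
    print(f"read {len(first_block)} raw bytes from {DEVICE}")

    # File-system access: the same device, but mounted (mount point is
    # hypothetical) and used through the file abstraction.
    with open("/mnt/volume/notes.txt", "w") as f:
        f.write("stored via the file-system abstraction\n")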

Cloud storage is often provided as a separate service, e.g. in Amazon Web Services, the Amazon EC2 service is responsible for managing the virtual machines, while Amazon EBS provides block storage and Amazon S3 provides object storage. Depending on the existing application programming interfaces, it is possible to combine computational resources and storage resources from different providers, e.g. compute resources in OpenStack (described in section 2.3) can use block storage provided by Amazon EBS.

2.2.1 Logical Volume Management

Logical volume management enables scaling of block storage devices and resource pooling, which are two NIST cloud computing characteristics [23] also applicable to storage. Another benefit of using logical volume management is the possibility to create volumes larger than one single physical disk. Logical volume management can be seen as a type of virtualization applied on disk level, where a set of physical volumes are grouped together and presented as a large homogeneous disk, a so-called volume group [16]. This group is then used to hold virtual volume entities mapped to the physical volumes, so-called logical volumes. Since a logical volume is only a reservation of blocks on the physical volumes, it is easy to reserve more available blocks, or release blocks that are no longer needed, thus enabling scaling of the volume size.
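As an illustration of these operations, the sketch below drives the standard lvm2 command-line tools from Python to pool two hypothetical physical disks into a volume group, carve out a logical volume and later grow it. It assumes root privileges, the lvm2 tools and disks that are safe to overwrite.

    import subprocess

    def run(cmd):
        """Run an LVM command and fail loudly on errors."""
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    disks = ["/dev/sdb", "/dev/sdc"]  # hypothetical physical disks to pool

    for disk in disks:
        run(["pvcreate", disk])            # mark each disk as a physical volume
    run(["vgcreate", "vg0"] + disks)       # group them into volume group vg0

    # Carve a 10 GiB logical volume out of the group's pooled blocks.
    run(["lvcreate", "-L", "10G", "-n", "volume1", "vg0"])

    # Scaling: reserve 5 GiB more blocks from the group for the same volume
    # (a file-system on top must support resizing to use the new space).
    run(["lvextend", "-L", "+5G", "/dev/vg0/volume1"])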

[Figure 2.2: Basic structure of logical volume management.]

As previously discussed, the ability to scale block devices may depend on what the block device is used for. A block device itself is often not that useful without deploying an abstraction layer. To be able to benefit from the feature of resizing block devices, this abstraction layer must support resizing as well.

2.2.2 Storage Security Concerns

One of the most frequently proposed notions of computer security is the CIA triad [12], referring to three security properties: confidentiality, integrity and availability. The risk of these properties being violated is present regardless of whether data are available over a network or through physical access exclusively. Loss of data due to hard-drive breakdown (availability), or data breaches (confidentiality) caused by stolen hard-drives during a break-in, are examples of such physical violations. But making data available over a network, such as the Internet, introduces a new widely available and versatile entry point, in parallel to the physical entry point, resulting in a much greater risk that the CIA properties are violated.


An efficient and straightforward method of addressing the integrity and confidentiality issues is to use cryptographic techniques. While introducing cryptographic techniques solves confidentiality and integrity issues in theory, the reality is that the mentioned issues are transformed into a set of new concerns, shown in Figure 2.3, with the confidentiality/integrity problem as entry point.

[Figure 2.3: How issues are transformed using cryptographic techniques.]

By adopting encryption, the need to ensure confidentiality and integrity of data applies to the encryption key instead. A leaked key may result in compromised confidentiality, and illegal modifications to a key might lead to data loss. By outsourcing data protection or key management to an external party, that party must be trusted. A failing external party may lead to compromised data security. Figure 2.3 illustrates this correlation of issues.
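A minimal sketch of this transformation, using the symmetric Fernet scheme from the Python cryptography library (an illustrative choice, not the mechanism selected later in this thesis):

    from cryptography.fernet import Fernet  # pip install cryptography

    # Encrypting the data addresses confidentiality/integrity of the data...
    key = Fernet.generate_key()
    ciphertext = Fernet(key).encrypt(b"sensitive health record")

    # ...but the problem is transformed rather than removed: whoever holds
    # `key` can read the data (leaked key => breached confidentiality), and
    # if `key` is lost the ciphertext is unrecoverable (lost key => data loss).
    plaintext = Fernet(key).decrypt(ciphertext)
    assert plaintext == b"sensitive health record"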


2.3 The OpenStack Cloud Platform

OpenStack (https://www.openstack.org/) is an open-source cloud computing project and community joined by several major enterprises such as Ericsson, VMware, IBM and Intel Corporation. The OpenStack cloud management platform makes use of already existing hypervisor technologies, messaging software, logical volume management software and databases. OpenStack can be viewed as a higher-level coordinator, essentially composed of a set of general interfaces and protocols used to coordinate several software components into an IaaS-environment. Due to its generality, OpenStack can be deployed on a huge variety of platforms.

2.3.1 Architectural Overview

An OpenStack deployment is component-based and built upon several intercommunicating standalone services, each responsible for a certain function. The cornerstone service is the computing service, the cloud fabric controller, Nova. (Nova consists of several sub-services such as Nova API, Nova Scheduler, Nova Conductor and Nova Compute; the pure cloud fabric controller is Nova Compute, which for simplicity is referred to as Nova.) Thus, having Nova installed and configured on a host results in having the resources of the host, i.e. CPU and memory, provisioned to the cloud resource pool. Additionally there are services responsible for networking (Neutron) and storage, and one single computer can run several services, e.g. a compute node may also provision its hard drives by installing the block storage service (Cinder) or the object storage service (Swift). There are also a number of shared services, such as an identity managing service, a web-based dashboard service (Horizon), an image service and a telemetry service for billing purposes. Each service is accessed through an application programming interface, and user management of the cloud is performed from a command-line client or the web-based dashboard service. Every call between services is performed using a messaging system implementing AMQP (Advanced Message Queuing Protocol, an open standard application layer protocol for message-oriented middleware). The strict component-based approach used in OpenStack makes the cloud itself scalable, manageable and expandable.
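OpenStack's inter-service calls are handled by its own RPC layer on top of an AMQP broker such as RabbitMQ. As a minimal illustration of the underlying AMQP queueing model, the sketch below uses the pika client with a hypothetical queue name and message body; the real topics and payloads are managed by OpenStack's messaging library.

    import pika  # pip install pika; assumes an AMQP broker on localhost

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()

    # Hypothetical queue; one service publishes a request message onto it...
    channel.queue_declare(queue="compute.host1")
    channel.basic_publish(
        exchange="",
        routing_key="compute.host1",
        body=b'{"method": "attach_volume", "volume_id": "vol-1"}')

    # ...and a consumer on the receiving service would pick it up.
    method, properties, body = channel.basic_get(queue="compute.host1",
                                                 auto_ack=True)
    print("received:", body)
    connection.close()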

The compute service, the block storage service and the object storage service are the important services in the context of this thesis and are therefore explained in more detail, according to the OpenStack Configuration Reference [26].

Nova

Nova is the heart of the IaaS system, handling the virtual machines. Nova does not handle the virtual machines itself, but uses existing hypervisors and possibly emulators installed on the host, such as KVM/QEMU or Xen. However, Nova is responsible for communicating with the hypervisor/emulator and thus dispatching requests for launching virtual machines. It is also responsible for communicating with the hypervisor/emulator to attach and detach arbitrary virtual hardware such as virtual hard drives (provided by Cinder) and virtual network interfaces (provided by Neutron). To be precise, Nova uses the libvirt virtualization API to communicate with the hypervisor/emulator. For simplicity, libvirt is omitted while explaining the different cloud services below.
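As an illustration of the libvirt API that Nova builds upon, the following sketch (assuming the libvirt-python bindings and a local QEMU/KVM hypervisor) connects to the hypervisor and lists the running domains, i.e. virtual machines:

    import libvirt  # pip install libvirt-python

    # Connect to the system QEMU/KVM hypervisor, as Nova's libvirt driver does.
    conn = libvirt.open("qemu:///system")
    try:
        for domain in conn.listAllDomains():
            state, _ = domain.state()
            print(f"domain {domain.name()} (id={domain.ID()}), state={state}")
    finally:
        conn.close()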


Cinder

Cinder manages block devices, which includes creation, removal and expansion of volumes upon request. When Cinder has been installed and properly configured, physical devices at the host are provisioned to the cloud in the form of volume groups where users can host their volumes. To attach a volume to a virtual machine, a network-based storage protocol is used to first connect it to the compute node hosting the virtual machine, and finally it is forwarded to the virtual machine by configuring the hypervisor/emulator through Nova.
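The create-and-attach sequence can be illustrated with the standard OpenStack command-line client, here invoked from Python; the volume and instance names are hypothetical, and credentials are assumed to be available in the environment:

    import subprocess

    def openstack(*args):
        """Call the python-openstackclient CLI and fail loudly on errors."""
        cmd = ["openstack", *args]
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # Ask Cinder for a new 10 GiB block storage volume.
    openstack("volume", "create", "--size", "10", "my-volume")

    # Attach it to a running instance: Cinder exports the volume over a
    # network storage protocol to the compute node, and Nova configures the
    # hypervisor/emulator to present it to the VM as a virtual disk.
    openstack("server", "add", "volume", "my-instance", "my-volume")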

Swift

Swift provides object storage by combining physical hard drives into a scalable storage resource. Unlike block storage, a cloud consumer uses a REST API to write, read and delete data, so-called objects, residing in an object storage resource, a so-called container, over the network. Thus, an object storage resource is accessed through a virtual network interface in the virtual machine instead of the virtual disk interface, involving the network service (Neutron) in the provisioning of object storage. Despite the differences, the overall architecture is similar: storage nodes and compute nodes connected over a network.
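A sketch of such an object API, using the Python requests library against a hypothetical Swift endpoint and authentication token (in a real deployment both are obtained from the identity service):

    import requests

    # Hypothetical endpoint and token.
    SWIFT = "http://swift.example.com:8080/v1/AUTH_tenant"
    HEADERS = {"X-Auth-Token": "token-from-identity-service"}

    # Create a container, then write (PUT) and read (GET) an object in it.
    requests.put(f"{SWIFT}/my-container", headers=HEADERS)
    requests.put(f"{SWIFT}/my-container/report.txt",
                 headers=HEADERS, data=b"object payload")
    r = requests.get(f"{SWIFT}/my-container/report.txt", headers=HEADERS)
    print(r.status_code, r.content)

    # Objects are removed with the DELETE verb.
    requests.delete(f"{SWIFT}/my-container/report.txt", headers=HEADERS)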

With the description of the OpenStack overall architecture in mind, together with the description of Nova, Cinder and Swift, Figure 2.4 shows an example of what an OpenStack-based cloud environment can look like.

[Figure 2.4: Basic architecture of a small OpenStack-based IaaS cloud environment consisting of five separate hosts with block storage and networking capabilities.]

2.4 Trusted Computing

Trusted Computing is an initiative taken by the Trusted Computing Group (TCG), a consortium between several major organizations in the Information Technology industry aiming towards more secure computing. The TCG has developed a specification of a reference architecture that is used to ensure trust, which according to TCG holds if a device or system is behaving in a particular expected manner for a specific purpose [36].

Assuming that no entity in a system is trusted by default is a security precaution, but it greatly constrains the possibilities to use the system. Trustworthiness should be achieved as a result of some satisfactory verification of the system, which in the TCG specification is solved by introducing the Trusted Platform Module (TPM).

2.4.1 Trusted Platform Module

The TPM is typically an isolated piece of hardware that is physically attached to a certain platform, mounted on the motherboard, and accessed through a TCG-specified interface. Amongst the TPM features are platform attestation and protected storage [10]. The TPM architecture is mainly built upon the cryptographic co-processor, a portion of persistent memory, a portion of versatile memory and the input-output interface to communicate with the host platform. The architecture specifies the core components of the TPM enabling these features.

• Cryptographic co-processor - the fundamental cornerstone of the TPM is the ability to perform random number generation and key generation. It can also perform encryption, decryption, hashing and signing (in TCG specification version 1.2, used in this thesis, RSA with 2048-bit keys is used for encryption).

• Endorsement Key (EK) - is an asymmetric key-pair created and stored in the persistent memory when the TPM is manufactured. The key-pair is permanently embedded in the TPM hardware and the private key can never leave the TPM chip (non-migratable). Thus, the public key certificate issued by the TPM vendor can be used to identify a certain TPM hardware, i.e. a specific platform.

• Storage Root Key (SRK) - another non-migratable key stored in the persistent memory, but is generated inside the TPM when the user takes ownership of the platform. The SRK is used to protect other keys created using the platform by encrypting them with the SRK so that they cannot be used without that specific TPM, i.e. bound to that certain platform.

• Platform Configuration Registers (PCR) - contain strong cryptographic digests (also called integrity measurements) of metrics of the platform characteristics, such as program code loaded during the boot process. The digests are extended by re-hashing a previous PCR value in combination with a new metric (illustrated in the sketch following this list). This procedure results in a secure hash that identifies the integrity and state of the platform, e.g. what program code has been executed.

• Attestation Identity Keys (AIK) - an either migratable or non-migratable asymmetric key created to identify that a platform is trusted without revealing the identity of the TPM (the EK). This makes it possible to maintain anonymity while communicating with external parties. The AIK is created by signing properties of the EK, and can then be verified by the TPM manufacturer issuing the EK and later verified by an external party.

• Storage Keys - derived from the SRK to protect data. Can be used to seal and bind data to a certain TPM, as described below.
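The extend operation referenced above can be sketched as follows for TPM version 1.2, where each extend replaces a PCR with the SHA-1 digest of the old register value concatenated with a new 20-byte measurement:

    import hashlib

    def pcr_extend(pcr: bytes, measurement: bytes) -> bytes:
        """TPM 1.2 extend: new PCR = SHA-1(old PCR || measurement)."""
        return hashlib.sha1(pcr + measurement).digest()

    # PCRs start zeroed; each loaded component is hashed and extended in order
    # (the component contents here are stand-ins for real boot-time code).
    pcr = b"\x00" * 20
    for component in [b"bootloader", b"kernel", b"hypervisor"]:
        pcr = pcr_extend(pcr, hashlib.sha1(component).digest())
    print("final PCR:", pcr.hex())

    # Extending the same components in a different order yields a different
    # digest, which is why attestation requires a deterministic load order.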


These components are combined in various ways to provide the features described in [10], e.g. platform attestation is performed by signing an integrity measurement using an AIK and having it verified by some third party. Thus, the integrity and the identity of the platform can be attested. Since the TPM is a separate hardware module only accessible through a specific interface, it is resistant against software tampering. Thus, to tamper with the TPM, physical access to the hardware is required. Except for remote attestation, the TPM can be used to bind and seal data.

• Binding - by using a public key derived from the Storage Root Key stored in a certain TPM, any data (e.g. a symmetric encryption key) encrypted with that public key can only be decrypted by that specific TPM. The data is said to be bound to that specific platform (the idea is sketched after this list).

• Sealing - by binding data and additionally considering the TPM state, it is possible to encrypt data using a public key so that it can only be decrypted on a certain platform in a certain state. This is an extension to normal binding and is called sealing. By sealing data, a party sending data to a platform is able to verify not only the platform identity authorized to access the plain-text data, but also that it is in a trusted state.
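The idea of binding can be sketched in software with ordinary RSA public-key encryption; this is purely illustrative, since a real TPM keeps the private half in hardware and uses its own key-blob formats, and a software sketch cannot enforce the PCR conditions that sealing adds:

    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.hazmat.primitives import hashes

    # Stand-in for a non-migratable TPM key pair: in a real TPM the private
    # half never leaves the chip; here both halves live in process memory.
    tpm_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_half = tpm_key.public_key()

    # "Binding": anyone holding the public key can encrypt a secret (e.g. a
    # symmetric volume-encryption key) for that specific platform.
    volume_key = b"hypothetical-32-byte-volume-key!"
    oaep = padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)
    bound_blob = public_half.encrypt(volume_key, oaep)

    # Only the platform owning the private half can unbind the secret.
    assert tpm_key.decrypt(bound_blob, oaep) == volume_key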

Using a Trusted Third Party

A Trusted Third Party (TTP) is a trusted entity used as a third party in communication between two parties who both trust the third party to facilitate secure communication. In combination with public-key cryptography to identify the TTP, the TTP is well suited for performing different types of security-related actions, such as verifying integrity measurements, authentication and key generation. Another important benefit of using the TTP in Trusted Computing is that it makes it possible to maintain anonymity between communicating parties, but this is not considered for the rest of this thesis.

Limitations

The TCG specification as a security architecture does rely on trusting the fundamental components, such as the Trusted Platform Module and a Trusted Third Party. If such trusted entities fail, roots of trust fail accordingly, leaving the platform untrusted. Software tampering with the TPM is made difficult by isolating it as a standalone hardware chip, but with physical access to the TPM hardware the conditions change. The applicability of the attestation is also limited, since the expansion of integrity measurements needs to be performed in the same order every time to give the same measurement. This results in requirements on what code can be attested, and might lead to difficulties in verifying the integrity of dynamically loaded portions of code.

2.5 IaaS Storage Security in Research and Industry

Since the beginning of the cloud computing era, the security aspects of adopting cloud solutions have been a topic for research. By considering the top threats defined by the Cloud Security Alliance [2], a shift in focus can be observed: threats against cloud service providers caused by malicious users have been downgraded, and threats against cloud users are emphasized. As Trusted Computing technologies are getting increasingly mature, adoptions of such techniques in cloud computing are also a subject of research [28, 29].

It is reasonable to assume that many of the difficulties lie in implementing security without violating the attractive implications of cloud computing. One such difficulty is, for example, the ability to maintain control of outsourced data using cryptographic techniques while at the same time having it highly accessible. This puts requirements on key management schemes, which nowadays are a huge area of research, e.g. in [40]. While still very immature, homomorphic encryption to operate on encrypted data could prove to be a groundbreaking technique solving many of the problems with data storage outsourcing. Accordingly, such techniques are highly topical and a subject of research, e.g. in [24]. Trusted Computing is well established in terms of securing operating systems, and also frequently appears in research on virtualization, to perform integrity checks on virtual machines [5]. Furthermore, research is also aiming towards additional applications of Trusted Computing to virtualization in the context of cloud computing, such as verifying the integrity of the compute host [28] and secure use of cloud storage [29].

2.6 Related work

When it comes to cloud security in general and cloud storage security in particular, a lot of focus and work has been put into finding procedures for assessing, verifying and ensuring that data is safely stored at an acceptable performance impact. Wang et al. [39] propose a method to achieve publicly auditable cloud storage services by using a third party that assesses the risk of outsourced data. By making a cloud storage service publicly auditable, it is suggested that a data owner can gain trust in the provider. Furthermore, by letting a trusted third party perform the assessment, the data owner can also make use of the computational resources needed to assess the risk. The solution proposed in [39] addresses the issue of trusting the cloud storage provider.

Proper deletion of data on request is another important aspect of secure cloud storage, and in [30], Paul et al. propose a scheme able to securely erase data and also to prove that the actual data really has been removed. The scheme's ability to give this proof makes it an interesting solution to the issue of ensuring data destruction.

In [19], Kamara et al. suggest a data protection scheme where data is encrypted before it is sent to the cloud for storage. The authors suggest using searchable encryption schemes in combination with attribute-based encryption schemes to secure data while still being able to search through it without the need for decryption.

As the key enabling technology of cloud computing and multi-tenancy, virtualization security is a large research area. In [33], Sabahi discusses a number of security issues regarding the use of virtualization, and in [41], Zhang et al. performed an attack against a certain virtualization issue to retrieve cryptographic keys. Ensuring a secure virtualization platform in IaaS environments is up to the cloud service provider; therefore, trust is related to virtualization security. As all computations in a virtual machine are observable on the VM host, it is desirable that the compute host holding the VM instance can be trusted. In [28], Paladi et al. present a mechanism and protocol that allows VM instances to be launched on a provably trusted compute host; furthermore, the mechanism allows the client conducting the VM launch to verify that it indeed has been launched on a trusted compute host.


2.6.1 Domain-Based Storage Protection

In [29], Morenius et al. propose a protocol for storage protection in IaaS environments, which is a step towards combining individual techniques such as Trusted Computing and encryption to achieve protection of storage resources. The protocol is an extension to [28], where each virtual machine running on a trusted host and each volume protected under Domain-Based Storage Protection (DBSP) is additionally bound to a so-called domain. The domain is a theoretical quantity of data protected under a certain cryptographic key, where only the virtual machines belonging to a specific domain are allowed to access the data. Access to a domain is granted by proving access rights through an assertion to a trusted third party. The compute host running a virtual machine authorized to a certain domain receives a Session Domain Key from the TTP, used for further communication within the boundaries of that certain domain, which may serve as a proof of the access rights. As domain association is performed during launch, membership of a domain guarantees a successful attestation of the compute host. Writing to a protected domain for the first time implies creating the keys used to protect the data. The detailed behavior and sequence of events is described in [29], with the general idea that the so-called metadata needed to derive the protection keys is stored in the actual domain, making it self-contained. For first-time and subsequent data reads and writes, keys are derived by the TTP using the metadata in combination with a non-migratable master key. The software controlling and enforcing the above-described behavior is referred to as the Secure Component.
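The self-contained key derivation can be sketched as follows. The exact derivation function is specified in the DBSP paper [29]; the HKDF used here, and the metadata format, are illustrative assumptions.

    import os
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.kdf.hkdf import HKDF

    # The TTP's non-migratable master key, which never leaves the TTP.
    master_key = os.urandom(32)

    def derive_domain_key(master: bytes, metadata: bytes) -> bytes:
        """Derive a per-domain protection key from the domain's public
        metadata and the TTP master key (HKDF is an illustrative choice)."""
        return HKDF(algorithm=hashes.SHA256(), length=32,
                    salt=None, info=metadata).derive(master)

    # The metadata is stored inside the domain itself, making it
    # self-contained: on any later read or write it is sent back to the TTP,
    # which re-derives the same protection key.
    metadata = b"domain-id=42;volume-id=vol-7;version=1"
    assert (derive_domain_key(master_key, metadata)
            == derive_domain_key(master_key, metadata))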

The protocol proposes one way of combining different security technologies into an integrated protection mechanism, but there are aspects not discussed in the paper, e.g. verifiability, proving that an implementation of DBSP is operating as expected. The protection solutions discussed and proposed in this thesis use schemes proposed in the DBSP paper [29], and aim to further investigate and evaluate how such a protocol can be implemented in an IaaS-environment and integrated into a protection solution.


Chapter 3
Methodology

This project is divided into four methodological steps: study of literature, structured identification of risks and threats to non-volatile data in IaaS-environments using the CORAS method, proposing approaches to risk-treating data protection and, finally, revisiting the most important risks to evaluate the success of a selected implemented mechanism. A certain approach to data protection is selected as a result of qualitative reasoning and revisiting of the risks. In addition, a proof-of-concept implementation is performed to briefly evaluate implementation-level issues of such a mechanism. The CORAS approach is described in more detail in section 3.1.

The overall methodological approach and structure of the project consists of the following high-level steps, described in order of execution:

1. Study of Literature

The study of literature covers the basics of infrastructure cloud computing, plus use and storage of non-volatile data in the IaaS service model. To facilitate the risk analysis, a basic understanding of cloud computing security issues is gained by reading a wide range of articles covering cloud-computing security aspects.

2. Target Modeling and Risk Analysis

A CORAS risk analysis is conducted to identify the most important risks to non-volatile data in IaaS-environments and to propose treatments to such risks. Furthermore, the target model is built upon documentation of how non-volatile data is used in IaaS-environments in general and OpenStack in particular. The outcome of this step is the result of the CORAS risk analysis. Finally, at the point of risk treatment identification, any additional requirements are introduced and motivated.

3. Risk Treatment and Protection

The risk treatments from the CORAS analysis are evaluated in terms of their risk-treating ability and cost, and are combined into a complete set of risk treatments forming an approach to data protection. Each approach is subject to an evaluation considering the non-acceptable risks to motivate its risk-treating ability. Any additional advantages or drawbacks of a certain approach are also introduced and considered in the evaluation. The outcome of this step is a comparison between different approaches to data protection and a proposal for a risk-treating protection mechanism.


4. Implementation, Repeated Risk Level Estimation and Evaluation

The selected mechanism is implemented in OpenStack as a proof-of-concept prototype to identify implementation-level issues and briefly evaluate the adaptability of such a data protection mechanism to an IaaS-environment based on OpenStack. Furthermore, the selected solution is subject to a fully implemented and repeated risk assessment, implementing steps 6-7 of CORAS. In addition to the risk level estimation conducted while motivating the different solutions, risks affecting indirect assets are considered and repeatedly evaluated, and the entire resulting risk picture is presented in terms of before-and-after.

3.1 The CORAS Method

The risk analysis conducted in this project is based on the CORAS approach: a model-based risk analysis method using a customized language for threat and risk modeling. The results are presented using different types of diagrams defined in the language. CORAS consists of eight separate steps (shown in figure 3.1) organized into three phases: context establishment, risk assessment and risk treatment. Below is a description of the eight steps defined in [35] and their application in this specific project:

1. Preparation of the risk analysis process by gaining initial knowledge and understanding of the target; use and storage of non-volatile data in IaaS-environments in general and OpenStack in particular. Size and scope of the analysis are briefly discussed. The preparations are based on study of literature related to infrastructure cloud computing.

2. Introductory meeting with the customer to highlight the main goals and concerns of the customer and the initial target of the analysis from the customer’s point of view. The customer in this risk analysis is Ericsson Research. The result of the second step is a definition of focus and scope of the analysis.

3. Agree on the target of analysis, its scope and focus. In cooperation with Ericsson Research, the main direct and indirect assets are identified and a high-level risk analysis is conducted to identify the most important threat scenarios and vulnerabilities to be subject to further investigation. The outcome of the third step is documentation of the target description and the identified assets.

4. Decision point where all of the information collected in the previous steps is completed and approved by the customer. As a final preparation, risk evaluation criteria are created in cooperation with the customer, together with definitions of likelihoods and consequences concerning each direct asset. The CORAS risk evaluation criteria matrix is used to classify risks as either acceptable or non-acceptable, or according to a multi-step criterion [35].

5. Risk identification using structured brainstorming, organized as a walkthrough of the target model to identify threats, unwanted incidents and vulnerabilities, all substantiating the risks. The results and outcome are documented on the fly using threat diagrams as defined in the CORAS language.

6. Risk level estimation of the risks represented by the unwanted incidents, also performed using structured brainstorming. Risk level estimation consists of determining the likelihood of threat scenarios and unwanted incidents, followed by determining the consequence to each asset affected by the unwanted incidents. In this risk analysis, the likelihoods of the unwanted incidents and threat scenarios are derived from the likelihood of the preceding threat scenario from left to right, i.e. from the initial vulnerability up to the actual risk. The combination of likelihood and consequence defines the risk level of each identified risk.

[Figure 3.1: The eight steps of the CORAS approach: preparations for the analysis; customer preparation of the target; refining the target description using asset diagrams; approval of the target description; risk identification using threat diagrams; risk level estimation using threat diagrams; risk evaluation using risk diagrams; risk treatment using treatment diagrams.]

7. Risk evaluation, where each risk is evaluated as acceptable or non-acceptable. Whether a risk is acceptable is decided by placing the risk in the matrix defined by the risk evaluation criteria (a schematic example follows after this list). Indirect risks are also considered during the risk evaluation.

8. Risk treatments are proposed by evaluating their ability to reduce the likelihood and/or consequence of the preceding threat scenarios and vulnerabilities, thus mitigating the risk. Treatments are also discussed from a cost-benefit perspective and in terms of feasibility. The risk treatment plan concludes the risk analysis.
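As a schematic illustration of steps 6-7, the sketch below shows how a risk level can be computed from likelihood and consequence and classified against an evaluation criterion. The scales, threshold and indexing scheme are assumptions for illustration only; the actual criteria are defined together with the customer in step 4.

```python
# Illustrative CORAS-style risk evaluation: a risk is placed in a
# likelihood x consequence matrix and classified as acceptable or not.

LIKELIHOODS = ["rare", "unlikely", "possible", "likely", "certain"]
CONSEQUENCES = ["insignificant", "minor", "moderate", "major", "catastrophic"]

# Assumed evaluation criterion (not from the thesis): any risk whose
# combined index reaches 5 is non-acceptable.
THRESHOLD = 5

def risk_level(likelihood: str, consequence: str) -> int:
    """Combine the positions on the two scales into a single level."""
    return LIKELIHOODS.index(likelihood) + CONSEQUENCES.index(consequence)

def is_acceptable(likelihood: str, consequence: str) -> bool:
    return risk_level(likelihood, consequence) < THRESHOLD

print(is_acceptable("possible", "major"))   # False: 2 + 3 = 5, non-acceptable
print(is_acceptable("unlikely", "minor"))   # True: 1 + 1 = 2, acceptable
```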

The context establishment phase consists of the first four steps, i.e. up to and including approval of the target to be analyzed. The actual risk analysis is performed during the risk assessment phase, steps 5-7, resulting in an overall risk picture of the target. Finally, the last phase, risk treatment, consists of the last step in the CORAS approach, aiming to identify risk treatments from the overall risk picture.

The purpose of the context establishment phase of the risk analysis is to, in a structured way, set the focus of the analysis, resulting in a target model covering the goals of the customer. The context establishment helps in transforming the customer interests into such a target model, without influencing the customer. Due to pre-defined specifications in this project, the focus of the risk analysis was partly set in a strictly technical direction from the beginning. Yet, there are several major benefits of using an approach such as CORAS, as risks are identified and assessed instead of pure threat scenarios, as when e.g. using attack trees. Another benefit is the well-defined method for collecting information during the risk assessment, and the language used to document the findings. Finally, as a re-evaluation of identified risks is to be performed as a step in evaluating the risk treatment, the risk evaluation criteria in CORAS are suitable for such a comparison.

3.1.1 Structured Brainstorming

The CORAS approach makes use of structured brainstorming in the fifth and sixth steps, where risks are identified and assessed. Structured brainstorming is an approach to brainstorming where each participant is given an equal amount of time to contribute and introduce his or her ideas. Ideally, the participants in the structured brainstorming sessions represent several different interests and backgrounds, enabling identification of more risks than within a single homogeneous group of individuals. Due to the technical nature of the customer, the implicitly technological background of the individuals contributing with ideas and the pre-specified scope of analysis, it is possible that more risks could have been identified with additional interests contributing.


Chapter 4
Risk Analysis of Non-volatile Data in IaaS-environments

This chapter describes the execution and results of the conducted CORAS risk analysis and is organized into three sections according to the CORAS method: context establishment, risk assessment and risk treatment. In the first section, the target model and assets are described, followed by threat diagrams and descriptions of the identified security concerns in the second section. In the third section, the risk treatment concepts are introduced; they are combined into different approaches to protection of non-volatile data in chapter 5.

CORAS [35] defines the symbols and notation shown in figure 4.1.

[Figure 4.1: Symbols of the CORAS risk modelling language: threat scenario, human threat (accidental), human threat (deliberate), non-human threat, direct asset, indirect asset, party, vulnerability, treatment scenario, unwanted incident and risk.]

4.1 Context Establishment

Due to the technical orientation and the pre-defined scope, the target model defined in this risk analysis is a system architectural model involving end-point entities such as virtual machines, compute nodes and storage nodes hosting storage resources (i.e. the hard drives). Furthermore, systems and the interconnectivity between such systems, crucial in enabling persistent storage, are covered by the target model.

Important aspects are data transactions, management of block storage and management of object storage. A data transaction is the life cycle of the transaction between a storage resource on one side and a virtual machine on the other side, e.g. a read action or a write action. Management of block storage comprises the events of creating, removing, attaching and detaching block storage volumes. Correspondingly, management of object storage comprises the events of creating and removing object storage containers. The model is used as a basis to identify unwanted incidents and threat scenarios based on security-critical data paths, control paths and points of failure. Beyond the scope of the actual risk analysis, the model is involved in evaluating the proposed treatments and solutions in the context of IaaS environments.

4.1.1 Step 1 - Preparations and Initial Observations

Based on the high-level OpenStack documentation of non-volatile data, combined with an investigation of the IaaS structure in Nova [27, 25], an initial target model is created. Any tenant in control of a virtual machine is able to store and use arbitrary data in persistent block storage or object storage, provisioned by a network-connected storage node. A block storage volume is typically attached in two steps: first to the compute node hosting the virtual machine, and then to the actual virtual machine, where it appears as an ordinary local volume. Data is read and written by any process running inside a virtual machine. A cloud consumer can typically connect to the virtual machine through an arbitrary protocol, such as SSH or VNC, to control the virtual machine. An object storage container is similar to a volume but contains objects; unlike a block storage volume, however, a container is not attached to a certain virtual machine. To read an object within a container, the object is pulled over a network, and to write an object to a container, the object is pushed over a network. In object storage, the objects are typically distributed over several nodes. A storage node acts as a proxy to present the objects to the virtual machine.
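The two-step volume attach and the push/pull object pattern described above can be summarized in the following schematic Python sketch; the data structures, helper functions and device naming are illustrative assumptions and do not correspond to OpenStack's actual APIs.

```python
# Schematic sketch of the two storage models described above; all
# structures here are illustrative stand-ins, not OpenStack APIs.

def attach_block_volume(volume: str, compute_host: dict, vm: dict) -> str:
    """Block storage: a volume is attached in two steps, first to the
    compute host (e.g. as a network block device), then exposed to the
    VM, where it appears as an ordinary local disk."""
    device = f"/dev/disk/by-id/{volume}"
    compute_host.setdefault("devices", []).append(device)  # step 1: host-level attach
    vm.setdefault("disks", []).append(device)              # step 2: hand device to the VM
    return device

def put_object(store: dict, container: str, name: str, data: bytes) -> None:
    """Object storage: objects are pushed over the network to a proxy,
    which distributes them across storage nodes; no attach step exists."""
    store.setdefault(container, {})[name] = data

def get_object(store: dict, container: str, name: str) -> bytes:
    """Objects are read by pulling them back over the network."""
    return store[container][name]
```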

[Figure 4.2: Overall target model of non-volatile data in IaaS environments: a cloud consumer and a cloud manager (admin CLI) interact over networks with a compute end-point (VM, CM, BS) and a storage end-point (CM, BS, OBJ, distributed storage).]

Block and object storage are managed through a front-end cloud management layer connected to a back-end cloud management (CM) service on each node involved in provisioning of persistent storage. The user ordinarily interacts with such management layers through a command-line interface (CLI) or a graphical user interface over a network such as the Internet. Furthermore, data is transferred between compute nodes and storage nodes over such a network. In addition to the cloud user, there is typically a cloud administrator acting on behalf of the cloud service provider.
