Cloud computing from a privacy perspective


Umeå universitet

Bachelor Thesis

Spring -13


Author: Daniel Evertsson

Supervisor: Jerry Eriksson

September 6, 2013


Abstract

The cloud could simplify the everyday life of private individuals as well as big enterprises by renting out resources. Resources such as storage capacity, computational power or cloud-based applications can be accessed without the need to invest in expensive infrastructure. Even though many enterprises could benefit from using cloud services, they hesitate, partly because they fear data leakage when storing sensitive data in the cloud environment.

The goal has been to prevent unauthorized users from accessing the users’ data by using client-side encryption. The solution must be able to support existing features. For example, many applications support multiple devices, which means that the user can access the same data from devices such as smartphones, tablets and desktop computers.

The result showed that there are two main approaches to implementing client-side encryption. The first approach bases the encryption key on random elements. It is without a doubt the more secure of the two, but it is not user-friendly: the user has to distribute the generated encryption key between all devices, for example by moving files back and forth. The second approach bases the encryption key on a password. This reduces security but is more user-friendly.

It appears that the biggest problem related to client-side encryption is not the encryption itself but the distribution of encryption keys. As the number of users increases, the key distribution problem becomes more pronounced. Key distribution is often handled by a so-called key manager, which can operate at different levels: it can be built into the application or be an external application. Some organizations have published guidelines for how to design key management systems.


Acknowledgements

First of all I would like to thank Cristian Klein at the department for distributed systems for coming up with the idea for this thesis. He has also provided a lot of valuable input and support. I would also like to thank the teachers Jerry Eriksson and Pedher Johansson for valuable input to this project.


Introduction

In today’s society the use of internet-connected devices has increased dramatically. We access the internet through devices such as smartphones, tablets, laptops and desktop computers. Between 2003 and 2010 the number of devices increased from 500 million to 12.5 billion [1]. This is a 25-fold increase (2400%) in seven years. In 2010 there were almost twice as many devices as there were people in the world. Users have developed a need to store and access the same data from their different devices.

As a solution to the problem, a concept called cloud computing has been developed. The idea is to let the user access the cloud’s resources such as storage, software, platforms and infrastructure ¹ . As a user you get access to these resources through the internet, often by using a thin client like a web browser or a client application. You get access to the resources without having to invest in new infrastructure or develop new software. Another benefit of cloud computing is that the user only pays for the resources consumed.

Even though there are many advantages with cloud computing, many companies hesitate to use it. In 2012 Varonis Systems Inc presented a study which showed that 80 percent of the interviewed companies did not want to invest in cloud-based solutions. They did not even allow their employees to use existing cloud-based services [2]. The main reasons were that they feared data leakage, security breaches and compliance issues. 70 percent said that they would use cloud-based services if they were as robust as internal tools.

¹ If you want to know more about different kinds of cloud services, visit TechNet Magazine (http://technet.microsoft.com/en-us/magazine/hh509051.aspx)

Because security is a crucial element in whether companies will start using cloud-based services or not, it will be the main focus of this thesis.

This thesis will study different encryption techniques which could be used to encrypt data stored at the cloud provider.

Since users often need to access data from multiple devices, this factor should be taken into account. The users should be able to access their files from devices like desktop computers, laptops, smartphones and tablets. In order to identify the user, a single user account should be used. Since all devices involved should be able to use the encryption technique presented in this thesis, hardware limitations such as computational power should be taken into account.

1.1 Client-side encryption

To make it more difficult for unauthorized people ² to access the user’s data, it should be encrypted. One option would be to let the cloud provider encrypt all the data that is stored in the cloud. This method is called server-side encryption. The problem with this approach is that if an attacker gets access to the cloud provider, or if an employee of the cloud provider tries to access the data, they will also have access to the decryption key, which makes it very easy to decrypt the data.

To make the data less accessible a method called client-side encryption will be used to encrypt all the users’ data before it’s sent to the cloud provider.

In contrast to server-side encryption, where the encryption key is stored by the cloud provider, the client-side encryption approach only stores the encryption key locally. This prevents the cloud provider from accessing the data since they will not know how to decrypt it.

² Unauthorized people could be employees of the cloud provider or people who have broken into the cloud provider’s system

1.2 Problem statement

First of all I will look at existing solutions that offer Storage-as-a-service. The solutions of interest are those that offer some kind of client-side encryption. Secondly, the most common client-side encryption techniques will be identified and described in more detail. Advantages and disadvantages of the different approaches will be pointed out. The goal is to decide which encryption technique offers the highest security level. Then, in order to see how the encryption affects the performance of the client application, a test will be implemented to see how the encryption of large files affects the execution time of the application. In the last part a discussion about the different encryption techniques will be presented. Hopefully this thesis will be able to identify the biggest problems related to client-side encryption.

1.3 Definitions

In this section, terms often used in this thesis are defined.

Salt:

Salt is usually randomly generated data that is added to a password before an encryption key is derived from it. The purpose of the salt is to hinder so-called rainbow attacks [3]. In a rainbow attack the hacker generates a table of encryption keys. The table is generated once and then reused to test all the generated keys against a given number of users. By adding a salt when generating the encryption key, and making the salt random or at least different for every user, the hacker is forced to generate a new rainbow table for every user, which is a very expensive operation. The salt is considered public information: even if the salt is known to the hacker, it still increases the resources needed to crack the encryption.

SHA:

Secure Hash Algorithm (SHA) is a family of hash functions developed by the United States National Security Agency. Together with MD5, SHA is among the most commonly used hash functions in cryptography.

AES:

Advanced Encryption Standard (AES) is an encryption algorithm standardized by the National Institute of Standards and Technology. The algorithm is built to use encryption keys of length 128, 192 or 256 bits [4].

Account password:

This is a password that is used to authenticate a user when logging in to the system. The account password is stored in the cloud and is thereby accessible to anyone who has access to the cloud provider.

Archive password:

This is a password used to encrypt data. It is only stored locally, unlike an account password, which is stored online. It is also worth mentioning that if the archive password is lost there will be no way to decrypt the data.


Chapter 2

Existing solutions that offer Storage-as-a-service

There are cloud providers who try to ensure the privacy of their users. Researchers from Fraunhofer Institute for Secure Information Technology have written a report in which they compare different cloud storage providers and evaluate the applications based on different criteria [5]. Among other things, the report evaluates whether the applications support any kind of encryption technique. Out of the seven applications that are benchmarked, the four that support client-side encryption have been selected in order to identify common techniques used for client-side encryption. The applications presented in this chapter are CrashPlan, Mozy, TeamDrive and Wuala. In the last part of this chapter, other applications which are not covered in the report written by Fraunhofer Institute for Secure Information Technology will be studied in order to see if they have come up with any other solution to the client-side encryption problem.

2.1 CrashPlan

CrashPlan ¹ offers three kinds of encryption techniques. By default the account password, which is known by CrashPlan, is used to generate a 128-bit encryption key. Secondly, the user can choose an archive password, which is not known to CrashPlan; it is used to encrypt the encryption key. The encrypted key is stored in the cloud and distributed to other clients. In the third alternative the user enters an encryption key which is only stored locally.

¹ Application created by Code 42 Software

2.2 Mozy

Mozy ² offers two methods for encryption. All data is encrypted on the client before being sent to the cloud provider. The first option is to use a 448-bit encryption key provided by and also known to Mozy. The user can also enter a private 256-bit encryption key which is only stored locally.

2.3 TeamDrive

TeamDrive ³ uses a concept called a space, which is similar to a folder. When created, the space can be empty or based on an existing folder. All files that are stored in the space are transmitted to the cloud provider. For every space a unique AES-256 key is generated, which means that every space has an individual encryption key. In order to share spaces between different devices, the encryption key for that particular space has to be distributed to the other devices. This is done by letting the user export the key to a “.pss” file. The file then has to be transferred by the user to the new device.

2.4 Wuala

Wuala ⁴ uses something called convergent encryption. A hash is calculated from each file’s content, and the hash is used as the key to encrypt the file. The hash is then encrypted using the account key. The only way to obtain the key is to own the original file. The method has one big flaw: it is open to a so-called “confirmation of a file attack”, where the attacker knows the content of a file and can therefore verify that a user owns a copy of that file. The attack is most effective if the content is publicly available, for example copyrighted material. It is also very simple to see if two users share the same file.
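The convergent encryption idea and the confirmation-of-a-file attack can be illustrated with a minimal sketch. It assumes SHA-256 as the content hash and uses a hypothetical helper named `convergent_key`; it is not Wuala's actual construction, and the file encryption step itself is left out.

```python
import hashlib


def convergent_key(content: bytes) -> bytes:
    """Derive the file key from the file's own content (assumed: SHA-256)."""
    return hashlib.sha256(content).digest()


# Two users who store the same file derive the same 256-bit key.
alice_key = convergent_key(b"publicly available text")
bob_key = convergent_key(b"publicly available text")
print(alice_key == bob_key)

# An attacker who knows a candidate file can derive its key as well and
# can therefore check whether stored data corresponds to that file.
attacker_key = convergent_key(b"publicly available text")
print(attacker_key == alice_key)

# A file with different content yields an unrelated key.
print(convergent_key(b"private diary") == alice_key)
```

Because the key depends only on the content, identical files are always encrypted identically, which is what makes both the confirmation attack and the detection of shared files possible.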

² Application created by EMC Corporation

³ Application created by TeamDrive Systems

⁴ Application created by LaCie

2.5 Summary of common encryption techniques

Both CrashPlan and Mozy offer server-side encryption, or rather a key generated and stored by the cloud provider. Both applications also let the user enter an encryption key which is only stored locally. CrashPlan also offers a third alternative where the user enters an archive password. TeamDrive, on the other hand, generates a key when a so-called space is created, and this key is only stored locally. Wuala uses convergent encryption, where the encryption key is calculated from the content of the file being encrypted.

2.6 Other solutions

There are other cloud providers offering client-side encryption that are not mentioned in the report written by Fraunhofer Institute for Secure Information Technology, for example IDrive ⁵, SwissDisk ⁶ and SpiderOak ⁷. They have solved client-side encryption by using the techniques mentioned in the previous section. To be more specific, IDrive lets the user enter a private encryption key, while SwissDisk and SpiderOak use an archive password in order to generate an encryption key.

⁵ Application created by IDrive Inc

⁶ Application created by SwissDisk ICS

⁷ Application created by SpiderOak

Chapter 3

Client-side encryption strategies

By studying the existing solutions I have identified two main approaches to solving the problem of client-side encryption. In this chapter these approaches will be presented and their strengths and weaknesses will be pointed out.

3.1 User supplied key

It’s pretty common to let the user enter a generated encryption key which will only be stored locally. The key could sometimes be generated by the client application or in other cases third party programs like an online key generator could be used. In order to make it harder to crack the encryption the user should make sure that the encryption key is based on some random element.

The length of the key is also an important factor. Today the recommended length of an encryption key is 256 bits, which is the longest key that AES supports [4].

One flaw with this technique is that many devices can be connected to the same user account. If that is the case, the encryption key has to be distributed between the different devices. One simple solution would be to let the user memorize the 256-bit encryption key. If the encryption key were presented using the common characters ¹ used in passwords, it would be approximately 43 characters long. It is not reasonable to expect a user to memorize such a long randomly generated key.

¹ The common characters are defined as [0-9], [a-z] and [A-Z]

There are other ways to distribute the encryption key like the approach used by TeamDrive, where the encryption key is exported to a “.pss”-file. One thing to remember is the fact that no information about the encryption key should be stored in the cloud, for security reasons. The cloud provider can’t be involved in the key distribution for the same reasons as server-side encryption shouldn’t be used. The risk that the encryption key is hijacked by the cloud provider is too great a threat.

3.2 Password based key

Another common way to achieve client-side encryption is to let the user enter an archive password, which is used to encrypt the data. According to research by a scientist from the Council for Scientific and Industrial Research in 2009, most passwords are between 6 and 9 characters long [6]. For more detailed statistics see Figure 3.1. Compared to the 43 characters that a 256-bit encryption key corresponds to, a password would most likely result in a reduced number of possible key combinations. See Table 3.1 for information on how the password length affects the number of possible combinations.

Characters    Number of combinations    Number of bits

6             5.68002 · 10¹⁰            ∼ 36 bits

7             3.52161 · 10¹²            ∼ 42 bits

8             2.18340 · 10¹⁴            ∼ 48 bits

9             1.35371 · 10¹⁶            ∼ 54 bits

10            8.39299 · 10¹⁷            ∼ 60 bits

20            7.04423 · 10³⁵            ∼ 120 bits

30            5.91222 · 10⁵³            ∼ 180 bits

40            4.96212 · 10⁷¹            ∼ 240 bits

43            1.18261 · 10⁷⁷            ∼ 256 bits

Table 3.1: How the number of characters ([0-9][a-z][A-Z]) used in a password affects the number of possible key combinations. The last column shows how many bits are needed to represent the number of combinations.
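The figures in Table 3.1 follow from an alphabet of 62 characters. The sketch below recomputes them; the function names are chosen for illustration only.

```python
import math

ALPHABET_SIZE = 62  # [0-9], [a-z] and [A-Z]


def combinations(length: int) -> int:
    """Number of possible passwords of the given length."""
    return ALPHABET_SIZE ** length


def bits(length: int) -> float:
    """Bits needed to represent that number of combinations."""
    return length * math.log2(ALPHABET_SIZE)


def chars_for_bits(key_bits: int) -> int:
    """Shortest password length whose keyspace covers key_bits bits."""
    return math.ceil(key_bits / math.log2(ALPHABET_SIZE))


for n in (6, 8, 10, 43):
    print(n, f"{combinations(n):.5e}", round(bits(n), 1))

print(chars_for_bits(256))
```

Each character contributes slightly less than 6 bits, which is why 43 characters are needed to match a 256-bit key.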


Figure 3.1: The percentage of 46000 MySpace users who used a given number of characters in their passwords

To increase security, a Password-Based Key Derivation Function (PBKDF) can be used. The purpose of a PBKDF is to take a password and generate a more complex encryption key from it, thereby increasing the time needed to crack the encryption [7]. The function adds a salt to the password. The purpose of the salt is to prevent rainbow attacks; see Section 1.3 for more information. For this to work the salt has to be different for every user. A simple solution is to use the username as salt, which ensures that every user gets a unique salt.

Another solution could be to use a so-called "keyfile", where the salt is based on the content of a file. The file could be any file, for example a family photo. The strategy is used by applications like TrueCrypt [5]. As with the client-generated encryption key, this information has to be distributed between the clients. Since the salt is considered public information, the file could be stored in the cloud unencrypted.

To make it even harder to get access to the encrypted data, a unique randomly generated salt could be used. The salt has to be stored together with the encrypted data.

After the salt has been added, the resulting string is hashed using an approved hash function, like SHA-256, to generate a 256-bit key. In order to increase the resources needed to crack the encryption, the encryption key is hashed a given number of times. Like the salt, the number of iterations is considered public information. According to a report from the National Institute of Standards and Technology, the number of iterations should be at least 1000 [7]. This means that an attacker has to do 1000 hash computations for every password, which increases the time needed to test a given password. This is based on the assumption that the attacker knows the hash function and the number of iterations.
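The salt-and-iterate scheme described here can be sketched with the PBKDF2 implementation in Python's standard library. Note that PBKDF2 applies HMAC internally rather than plain repeated hashing, so this is one concrete PBKDF and not necessarily the exact construction described above. The password and the use of the username as salt are illustrative assumptions.

```python
import hashlib


def derive_key(password: str, salt: bytes, iterations: int = 1000) -> bytes:
    """Derive a 256-bit key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


# Assumption for illustration: the username serves as the per-user salt.
key_alice = derive_key("correct horse", b"alice")
key_bob = derive_key("correct horse", b"bob")

print(len(key_alice) * 8)    # key length in bits
print(key_alice != key_bob)  # same password, different salt
```

Because the salt differs per user, two users with the same password still end up with different keys, so a table precomputed for one user is useless against another.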

From the user’s perspective the time needed to make the calculations will not make a big difference, as long as the number of iterations is not so high that it causes a noticeable delay in the application. 1000 iterations is considered the minimum when using a PBKDF. Since a larger number of iterations increases the resources needed to calculate the encryption key, the higher the number the better. Since the system should support different devices, the device with the least computational power should determine the number of iterations. Smartphones should probably be considered the weakest link.

According to a report from the Horst Görtz Institute for IT-Security, a smartphone with a 1 GHz ARM processor should be able to do 4000-10000 iterations in what the authors define as a reasonable amount of time [8]. Since the number of iterations has a huge impact on the time needed to break the encryption, it is desirable to use as many iterations as possible. Using 4000 iterations instead of 1000 would make the attack take four times as long.

In their report they also suggest the use of a dynamic iteration count, where the number of iterations depends on the currently available computational power, for example how many iterations the system is able to do in a limited amount of time. The iteration count is then stored with the encrypted data to make sure that the data can be decrypted. With this method the number of iterations increases over time as hardware becomes faster.
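A dynamic iteration count could be calibrated roughly as follows. This is a sketch under stated assumptions: the 0.2 second time budget, the probe size and the helper names `calibrate_iterations` and `build_header` are hypothetical and not taken from the cited report.

```python
import hashlib
import time

MIN_ITERATIONS = 1000  # lower bound cited in the text


def calibrate_iterations(target_seconds: float, probe: int = 10_000) -> int:
    """Estimate how many PBKDF2 iterations fit into target_seconds."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"probe-password", b"probe-salt", probe)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        return MIN_ITERATIONS  # timer too coarse; fall back to the minimum
    estimate = int(probe * target_seconds / elapsed)
    return max(MIN_ITERATIONS, estimate)


def build_header(salt: bytes, iterations: int) -> dict:
    """Public parameters stored next to the ciphertext for later decryption."""
    return {"salt": salt.hex(), "iterations": iterations}


iterations = calibrate_iterations(target_seconds=0.2)
header = build_header(b"alice", iterations)
print(header)
```

Any device that later decrypts the data reads the iteration count from the stored header and has to perform the same number of iterations, regardless of its own speed.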


3.2.1 Test implementation

In order to test the time needed to encrypt data, a small-scale implementation has been made. To keep it simple, a client-server application which handles notes was developed. First, client-side encryption was implemented using Java’s Crypto library. In order to generate an encryption key, an existing password-based key derivation function was used. The function used the account username as salt and an archive password provided by the user. It hashed the salt and password combination 2000 times using SHA-1. The derived key conforms to the key lengths required by AES.

The implementation was used to test how the encryption affects the performance of the client application. In the test a number of files of given sizes were encrypted. The test showed that the encryption time depends linearly on the size of the file. It takes less than a second to encrypt 20 megabytes of data, which must be considered relatively fast. The test was made on a laptop with a 2.4 GHz Intel Core Duo processor and 2 GB DDR3 RAM. The operating system used was Windows 7 (32-bit).

Since users access the cloud through the internet, a comparison between the encryption speed and the upload speed of the internet connection was made.

According to a report from Akamai Technologies, the average internet speed in Sweden is 7.3 Mbit/s [9]. Converting it to megabytes per second shows how fast data can be sent to the cloud provider.

\[
\frac{\text{Megabits per second}}{\text{Number of bits per byte}} = \text{Speed in megabytes per second}, \qquad \frac{7.3}{8} = 0.9125
\]

In Figure 3.2 the time needed to encrypt data is compared to the time needed to upload the data to the cloud provider. The figure shows that the time needed to upload a file is much higher than the time needed to encrypt it. In this case the time needed to encrypt the data is insignificant. In order to see whether a faster connection would be able to compete with the encryption time, I chose an internet speed of 200 Mbit/s. In this case the encryption was slower than the upload, at least for files smaller than 30 megabytes. The result is presented in Figure 3.3.
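The upload side of this comparison is simple arithmetic and can be sketched as below. The function name is hypothetical, and protocol overhead and latency are ignored, so the values are idealized lower bounds rather than measurements.

```python
def upload_seconds(size_megabytes: float, link_mbit_per_s: float) -> float:
    """Time to upload a file, ignoring protocol overhead and latency."""
    megabytes_per_second = link_mbit_per_s / 8  # 8 bits per byte
    return size_megabytes / megabytes_per_second


for speed in (7.3, 200.0):
    print(speed, round(upload_seconds(20, speed), 2))
```

Under these assumptions a 20 megabyte file needs roughly 22 seconds at 7.3 Mbit/s but only 0.8 seconds at 200 Mbit/s, which is on the same order as the sub-second encryption time reported for 20 megabytes.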


Figure 3.2: The time needed to encrypt data of different sizes compared with the time needed to send the data to the cloud. Based on an internet connection of 7.3 Mbit/s


Figure 3.3: The time needed to encrypt data of different sizes compared with the time needed to send the data to the cloud. Based on an internet connection of 200 Mbit/s

3.3 PBKDF vs. Random based encryption key

In order to show how much time would be needed to break an encryption key made by a PBKDF compared to an encryption key generated from random elements, a small example will be presented. In this example it is assumed that a computer is able to test 10⁹ passwords per second in a brute force attack.

PBKDF:

The PBKDF creates an encryption key based on an 8-character ² long password. The number of password combinations would then be approximately 10¹⁴. It will be assumed that the time needed to generate a key is 0.2 seconds. To clarify, the time needed to generate the key is the time it takes to combine the password and the salt and perform the given number of hash computations. This means that the attacker is able to test only 5 keys every second when a PBKDF is used, since he has to compute the corresponding key for every candidate password. To be more exact, testing 5 keys takes 1 second + 5/10⁹ seconds, but this is rounded to one second.

² The characters that can be used in the password are [a-z], [A-Z] and [0-9]

Random based encryption key:

Since this encryption key is based on random elements, it has no shared starting point such as a password, as a key produced by a PBKDF has. If a 256-bit encryption key is generated, there are approximately 10⁷⁷ possible key combinations.

As mentioned before, it is assumed that the attacker is able to test 10⁹ keys every second. So for every key tested against the PBKDF, 2 · 10⁸ keys can be tested against the random based encryption key.

In order to get the number of seconds it would take to break an encryption key the total number of combinations has to be divided by the number of tested keys per second.

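Generic formula:

\[
\frac{\text{Number of possible combinations}}{\text{Number of tested keys per second}} = \text{Seconds needed to crack encryption}
\]

PBKDF:

\[
\frac{10^{14}}{5} = 2 \cdot 10^{13} \text{ seconds} \approx 634196 \text{ years}
\]

Random based key:

\[
\frac{10^{77}}{10^{9}} = 10^{68} \text{ seconds} \approx 3 \cdot 10^{60} \text{ years}
\]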

A summary of the number of possible key combinations and the time needed to crack a given encryption key is presented in Table 3.2. Let us compare how the number of combinations and the time needed to crack the encryption differ between the two approaches.


Relation between the numbers of combinations:

\[
\frac{10^{77}}{10^{14}} = 10^{63}
\]

Relation between the times needed to crack the encryption:

\[
\frac{10^{68}}{2 \cdot 10^{13}} = 5 \cdot 10^{54}
\]

The ratio between the times needed to break the encryption (5 · 10⁵⁴) is smaller than the ratio between the numbers of possible combinations (10⁶³). Even though the PBKDF increases the time needed to break the encryption, it is still not enough to compensate for the smaller number of key combinations.

                                   PBKDF                  Random based key

Number of key combinations         ∼ 10¹⁴                 ∼ 10⁷⁷

Time needed to crack encryption    ∼ 2 · 10¹³ seconds     ∼ 10⁶⁸ seconds

Table 3.2: A summary of the number of possible key combinations and the time needed to crack a given encryption key. The PBKDF is based on a password containing 8 characters while the random based key is a 256-bit encryption key.
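The figures in this section can be reproduced with exact integer arithmetic. The sketch below uses the rounded keyspace sizes and attack rates assumed in the example; the function name is chosen for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def seconds_to_exhaust(combinations: int, keys_per_second: int) -> int:
    """Worst-case time to try every key at a fixed test rate."""
    return combinations // keys_per_second


# PBKDF: about 10^14 passwords, 5 derived keys per second (0.2 s each).
pbkdf_seconds = seconds_to_exhaust(10**14, 5)
# Random based key: about 10^77 keys, 10^9 keys tested per second.
random_seconds = seconds_to_exhaust(10**77, 10**9)

print(pbkdf_seconds, pbkdf_seconds // SECONDS_PER_YEAR)
print(random_seconds, random_seconds // SECONDS_PER_YEAR)
print(random_seconds // pbkdf_seconds)  # ratio of cracking times
```

These are worst-case figures for exhausting the whole keyspace; on average an attacker would need about half that time, which does not change the comparison between the two approaches.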


Chapter 4

Conclusion

In this chapter a discussion about whether PBKDF or random based encryption keys should be used will be presented. Benefits and drawbacks of dynamic iterations will be pointed out. Then other major obstacles that companies should take into account before using client-side encryption will be presented. Finally, areas which could be studied further will be suggested.

4.1 Client-side encryption

The problem that prevents users from encrypting all data using client-side encryption, as I see it, is the fact that if the encryption key is lost all data will be irretrievable. Therefore the user should have a choice whether to use client-side encryption or not. The fact that the data will be irretrievable when the encryption key is lost should be pointed out to the users, as should the benefits of client-side encryption. An example of an application that does not show the benefits of client-side encryption is Mozy, even though it offers the service. The user is only informed that the data will be irretrievable if the password is lost. The fact that this would increase security is never mentioned.


4.2 PBKDF or Random based encryption key

One of the most secure ways of encrypting data is to use the "User supplied key" approach described in Section 3.1. Even though it is very hard to crack the encryption, the task of distributing the encryption key is rather complex.

There are applications like TeamDrive where an encryption key is generated and stored locally. In the end the same key distribution problem occurs with this approach. In order to make it easier to distribute the encryption key, TeamDrive has a feature where the user can export the encryption key to a “.pss” file. The user then has to transfer the “.pss” file to all the different devices. Personally, I would not appreciate having to transfer files between all my devices to be able to access my data. For example, some applications make it possible to access files through a web browser. Sometimes you are at a public place like an internet cafe and want to access data through the web browser. In order to do so you have to access your key in some way. It is not a practical solution, though I guess it could be solved in some way, maybe by using third-party software which could store the encryption key.

A more practical alternative would be to base the encryption key on a password and then use a password-based key derivation function. As mentioned before, the biggest disadvantage of this approach is that users tend to use short passwords that often follow some kind of pattern. This results in a weak encryption key, since the only private information is the password.

If this approach is used, I think it is important that the software informs the user when the archive password is considered weak. Personally I would prefer this approach because it is more practical. However, if the goal is to maximize the security of the user’s data, the random based encryption key approach is clearly more desirable.

4.3 Dynamic iteration

In Section 3.2, where the “Password based key” approach was presented, the use of a dynamic iteration count was introduced. The idea is good since the number of iterations increases relative to the computer’s computational power. The technique has one big flaw, however. When creating applications for multiple devices with large differences in computational power, there could be cases where a device is not able to do the hash computations in a reasonable amount of time. For example, a device with high computational power, let’s say a desktop computer, encrypts a file. Then the file is shared with a device with low computational power, let’s say a smartphone. In order for the smartphone to decrypt the file it has to do as many hash iterations as the computer. Because of the difference in computational power, it will probably take the smartphone a noticeable amount of time to decrypt the data.

4.4 Client-side encryption drawbacks

There are factors that companies have to take into account before they decide to use client-side encryption. The problem is that so far we have only considered systems involving a single user. Even though the use of multiple devices has been considered, a single user has been responsible for distributing the encryption key.

What if a company wants to start using a cloud provider? Let’s say they decide to use the “User supplied key” approach from Section 3.1 to encrypt their data. A key is generated and spread to all the employees. Later an employee, in this example called John, gets fired. Now he poses a security threat since he may have stored the encryption key used by the company. To prevent John from accessing the files, the company could download all the files from the cloud provider and re-encrypt them using a newly generated encryption key. This is not really efficient. As shown in the test implementation in Section 3.2.1, internet speeds are for the time being rather slow compared to encryption. There are also cloud providers that limit the download speed of their users, making the download even slower. Wouldn’t it be easier if John simply did not have access to the decryption key?

Note that there are algorithms which separate encryption and decryption keys. Such algorithms are called asymmetric and use a key pair consisting of a private and a public key. The private key is used to decrypt data, and it is also used for mathematically calculating new public keys. A public key is used to encrypt data, which can then only be decrypted using the private key.

Returning to the example, every employee could be given a public key for encrypting the data while only a few administrators have access to the private key. In order to keep the private key hidden and still enable regular employees to decrypt data, a centralized server could be used for decrypting the data. The centralized server could also be called a key manager.

A key manager is responsible for storing encryption keys. Key managers can operate at different levels: one could be built into an application, and sometimes an external key manager is used for handling keys for multiple software products at once. Normally new encryption keys are generated at given time intervals, for example once every month, as a security precaution. If a key is compromised, the intruder then won’t have access to all of the company’s data.

As mentioned before, it’s not efficient to re-encrypt all data with every new encryption key, since the data has to be downloaded every single time. To be able to access files encrypted with outdated encryption keys, the key manager has to keep a history of keys.
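A key history can be kept by giving every key a version number and tagging each encrypted file with the version that was used. The sketch below is a hypothetical in-memory key manager, not a design taken from any of the cited reports; the class name, method names and 256-bit key size are assumptions, and a real key manager would also need persistent, access-controlled storage.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class KeyManager:
    """Hypothetical in-memory key manager that keeps every key version."""

    _keys: Dict[int, bytes] = field(default_factory=dict)
    current_version: int = 0

    def rotate(self) -> int:
        """Generate a fresh random 256-bit key and make it the current one."""
        self.current_version += 1
        self._keys[self.current_version] = secrets.token_bytes(32)
        return self.current_version

    def current_key(self) -> bytes:
        """Key to use for newly encrypted files."""
        return self._keys[self.current_version]

    def key_for(self, version: int) -> bytes:
        """Look up an older key so files encrypted with it stay readable."""
        return self._keys[version]


manager = KeyManager()
first_version = manager.rotate()    # files uploaded now are tagged with this version
first_key = manager.current_key()
second_version = manager.rotate()   # scheduled rotation, old files are not re-uploaded
print(manager.key_for(first_version) == first_key)
```

New files use `current_key()`, while older files remain readable through `key_for()` without being downloaded and re-encrypted.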

In the example a centralized server was used, but some key management systems use a distributed approach where the data is encrypted and decrypted locally and then sent to the cloud provider. A distributed solution requires significantly less bandwidth, since the data won’t be sent to the centralized server for decryption. A distributed solution also eliminates the single point of failure, but the implementation will probably be more complex than a centralized solution.

The real problem with client-side encryption isn’t the encryption itself but how to manage the encryption keys. As shown in this thesis, there are a few common strategies to solve this when only a single user is involved.

The problem gets rather complex when multiple users are involved. There are organizations which have tried to make guidelines for how to design key management systems. For example, the National Institute of Standards and Technology wrote a report in 2012 in which they tried to show the problems related to key management, combined with some guidelines [10]. The same year Securosis, L.L.C. wrote a report where they tried to show the different levels of key management and when to apply them to get the best result.

They also mentioned that there is an increased standardization of communication protocols between key management systems and encryption systems [11]. In 2008 Nubridges published a report based on what they consider to be the eight best practices for designing a key management system [12].

4.5 Future work

Key management is a topic which could be investigated even further. As mentioned in this thesis, key management is a crucial element in whether client-side encryption could be used or not. It would be interesting to investigate which standards exist. Sometimes companies exchange encrypted data with each other. How does this affect the key management system?

Bibliography

[1] D. Evans, “The internet of things - how the next evolution of the internet is changing everything.” http://www.cisco.com/web/about/ac79/docs/innov/IoT_IBSG_0411FINAL.pdf, 2011. [Online; accessed 2013-05-17].

[2] R. Hartmann, “The bring your own services (byos) paradox.” http://www.varonis.com/news-events/press-releases/2012/byos-paradox.html, 2012. [Online; accessed 2013-04-29].

[3] J. Ullrich, “Isc diary – hashing passwords.” http://www.dshield.org/diary/Hashing+Passwords/11110, 2011. [Online; accessed 2013-05-08].

[4] J. McCaffrey, “Keep your data secure with the new advanced encryption standard.” http://msdn.microsoft.com/en-us/magazine/cc164055.aspx, 2003. [Online; accessed 2013-05-17].

[5] M. H. T. K. M. R. U. V. Moritz Borgmann, Tobias Hahn and S. Vowe, “On the security of cloud storage services.” https://www.sit.fraunhofer.de/fileadmin/dokumente/studien_und_technical_reports/Cloud-Storage-Security_a4.pdf, 2012. [Online; accessed 2013-04-24].

[6] R. van Heerden and J. Vorster, “A statistical analysis of large passwords lists, used to optimize brute force attacks.” http://researchspace.csir.co.za/dspace/bitstream/10204/3328/1/Van%20Heerden_2009.pdf, 2009. [Online; accessed 2013-04-24].

[7] W. B. Meltem Sönmez Turan, Elaine Barker and L. Chen, “Recommendation for password-based key derivation - part 1: Storage applications.” http://csrc.nist.gov/publications/nistpubs/800-132/nist-sp800-132.pdf, 2010. [Online; accessed 2013-04-24].

[8] M. K. C. P. T. Y. Markus Dürmuth, Tim Güneysu and R. Zimmermann, “Evaluation of standardized password-based key derivation against parallel processing platforms.” http://www.emsec.rub.de/media/crypto/veroeffentlichungen/2013/01/29/esorics_pbkdf2.pdf, 2013. [Online; accessed 2013-05-15].

[9] B. R. David Belson, Tom Leighton, “State of the internet.” http://www.akamai.com/stateoftheinternet/, 2012. [Online; accessed 2013-06-06].

[10] W. B. W. P. Elaine Barker, William Barker and M. Smid, “Recommendation for key management – part 1: General.” http://csrc.nist.gov/publications/nistpubs/800-57/sp800-57_part1_rev3_general.pdf, 2012. [Online; accessed 2013-08-16].

[11] R. Mogull, “Pragmatic key management for data encryption.” https://securosis.com/assets/library/reports/Pragmatic-Key-Management.v.1.pdf, 2012. [Online; accessed 2013-08-16].

[12] Nubridges, “Best practices in encryption key management data security.” http://www.northdoor.co.uk/_assets/_download/CBBF299D-5056-897F-ED891771F53907B2.pdf, 2008. [Online; accessed 2013-08-16].

Acknowledgements

First of all, I would like to thank Cristian Klein at the department for distributed systems for coming up with the idea for this thesis. He has also provided a lot of valuable input and support. I would also like to thank the teachers Jerry Eriksson and Pedher Johansson for valuable input to this project.

Contents

1 Introduction
    1.1 Client-side encryption
    1.2 Problem statement
    1.3 Definitions

2 Existing solutions that offer Storage-as-a-service
    2.1 CrashPlan
    2.2 Mozy
    2.3 TeamDrive
    2.4 Wuala
    2.5 Summary of common encryption techniques
    2.6 Other solutions

3 Client-side encryption strategies
    3.1 User supplied key
    3.2 Password based key
        3.2.1 Test implementation
    3.3 PBKDF vs. Random based encryption key

4 Conclusion
    4.1 Client-side encryption
    4.2 PBKDF or Random based encryption key
    4.3 Dynamic iteration
    4.4 Client-side encryption drawbacks
    4.5 Future work

Bibliography

Chapter 1

Introduction

If you want to know more about different kinds of cloud services, visit TechNet Magazine (http://technet.microsoft.com/en-us/magazine/hh509051.aspx).

that they would use cloud based services if they were as robust as internal tools.

Because security is a crucial element in whether companies will start using cloud-based services or not, it will be the main focus of this thesis.

This thesis will study different encryption techniques which could be used to encrypt data stored at the cloud provider.

1.1 Client-side encryption

To make the data less accessible a method called client-side encryption will be used to encrypt all the users’ data before it’s sent to the cloud provider.

In contrast to server-side encryption, where the encryption key is stored by the cloud provider, the client-side encryption approach only stores the encryption key locally. This will prevent the cloud provider from accessing the data, since they won’t know how to decrypt it.

Unauthorized people could be employees of the cloud provider or people who have broken into the cloud provider’s system.

1.2 Problem statement

1.3 Definitions

In this section, terms often used in this thesis will be defined.

Salt:

SHA:

Secure Hash Algorithm (SHA) was developed by the United States National Security Agency. Together with MD5, SHA is among the most commonly used hash functions in cryptography.

AES:

Advanced Encryption Standard (AES) is an encryption algorithm standardized by the National Institute of Standards and Technology. The algorithm is built to use encryption keys of length 128, 192 or 256 bits [4].

Account password:

This is a password that is used to authenticate a user when logging in to the system. The account password is stored in the cloud and is thereby accessible to anyone who has access to the cloud provider’s systems.

Archive password:

This is a password used to encrypt data. It’s only stored locally, unlike an account password, which is stored online. It’s also worth mentioning that if the archive password is lost, there will be no way to decrypt the data.

Chapter 2

Existing solutions that offer Storage-as-a-service

CrashPlan ¹ offers three kinds of encryption techniques. By default the account password, which is known by CrashPlan, will be used to generate a 128-bit encryption key. Secondly the user could choose an archive password, which is not known to CrashPlan, it will be used to encrypt the encryption

Mozy ² offers two methods for encryption. All data is encrypted on the client before being sent to the cloud provider. The first option is to use a 448-bit encryption key provided by and also known to Mozy. The user could also enter a private 256-bit encryption key, which will only be stored locally.

There are other cloud providers, which are not mentioned in the report written by Fraunhofer Institute for Secure Information Technology, which offer client-side encryption. Applications like Idrive ⁵ , Swissdisk ⁶ and SpiderOak

Password length    Number of combinations    Approximate key size
6                  5.68002 · 10¹⁰            ~36 bits
7                  3.52161 · 10¹²            ~42 bits
8                  2.18340 · 10¹⁴            ~48 bits
9                  1.35371 · 10¹⁶            ~54 bits
10                 8.39299 · 10¹⁷            ~60 bits
20                 7.04423 · 10³⁵            ~120 bits
30                 5.91222 · 10⁵³            ~180 bits
40                 4.96212 · 10⁷¹            ~240 bits
43                 1.18261 · 10⁷⁷            ~256 bits
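The figures in this table can be reproduced with a few lines of code. The sketch below assumes an alphabet of 62 characters (a-z, A-Z and 0-9); that alphabet size is inferred from the listed values rather than stated here, and the function names are hypothetical.

```python
import math

ALPHABET_SIZE = 62  # assumption: a-z, A-Z and 0-9, inferred from the table values


def combinations(length: int) -> int:
    """Number of possible passwords of the given length."""
    return ALPHABET_SIZE ** length


def equivalent_bits(length: int) -> float:
    """Size of a random key with the same number of possibilities."""
    return length * math.log2(ALPHABET_SIZE)


for length in (6, 7, 8, 9, 10, 20, 30, 40, 43):
    print(f"{length:>2}  {combinations(length):.5e}  ~{equivalent_bits(length):.0f} bits")
```

Each character contributes just under 6 bits, which is why the table rounds to 6 bits per character and why roughly 43 random alphanumeric characters are needed to match a 256-bit key.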