Elliptic Curve Digital Signatures in RSA Hardware


Institutionen för systemteknik
Department of Electrical Engineering

Elliptic Curve Digital Signatures in RSA Hardware

Master's thesis (examensarbete) in Cryptology, carried out at the Institute of Technology, Linköping University

by Martin Krisell
LiTH-ISY-EX--12/4618--SE

Linköping 2012

Department of Electrical Engineering, Linköpings tekniska högskola, Linköpings universitet

Elliptic Curve Digital Signatures in RSA Hardware

Master's thesis in Cryptology, carried out at the Institute of Technology, Linköping University

by Martin Krisell
LiTH-ISY-EX--12/4618--SE

Supervisors: Jan-Åke Larsson, ISY, Linköpings universitet
             Pablo García, Realsec, Madrid, Spain

Examiner: Jan-Åke Larsson, ISY, Linköpings universitet


Division, Department: Division of Information Coding, Department of Electrical Engineering, SE-581 83 Linköping
Date: 2012-08-31
Language: English
Report category: Examensarbete (Master's thesis)
URL for electronic version: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-81084
ISRN: LiTH-ISY-EX--12/4618--SE
Title: Digitala signaturer över elliptiska kurvor på RSA-hårdvara / Elliptic Curve Digital Signatures in RSA Hardware
Author: Martin Krisell

Sammanfattning

En digital signatur är den elektroniska motsvarigheten till en handskriven signatur. Den kan bevisa källa och integritet för valfri data, och är ett verktyg som blir allt viktigare i takt med att mer och mer information hanteras digitalt.

Digitala signaturer använder sig av två nycklar. Den ena nyckeln är hemlig och tillåter ägaren att signera data, och den andra är offentlig och tillåter vem som helst att verifiera signaturen. Det är, under förutsättning att nycklarna är tillräckligt stora och att det valda systemet är säkert, omöjligt att hitta den hemliga nyckeln utifrån den offentliga. Eftersom en signatur endast är giltig för datan som signerades innebär detta också att det är omöjligt att förfalska en digital signatur.

Den mest välanvända konstruktionen för att skapa digitala signaturer idag är RSA, som baseras på det svåra matematiska problemet att faktorisera heltal. Det finns dock andra matematiska problem som anses vara ännu svårare, vilket i praktiken innebär att nycklarna kan göras kortare, vilket i sin tur leder till att mindre minne behövs och att beräkningarna går snabbare. Ett sådant alternativ är att använda elliptiska kurvor.

Det underliggande matematiska problemet för kryptering baserad på elliptiska kurvor skiljer sig från det som RSA bygger på, men de har en viss struktur gemensam. Syftet med detta examensarbete var att utvärdera hur elliptiska kurvor presterar jämfört med RSA, på ett system som är designat för att effektivt utföra RSA.

De funna resultaten är att metoden med elliptiska kurvor ger stora fördelar, även om man nyttjar hårdvara avsedd för RSA, och att dessa fördelar ökar mångfaldigt om speciell hårdvara används. För några användarfall av digitala signaturer kan, under några år framöver, RSA fortfarande vara fördelaktigt om man bara tittar på hastigheten. För de flesta fall vinner dock elliptiska kurvor, och kommer troligen vara dominant inom kort.


Abstract

A digital signature is the electronic counterpart to the handwritten signature. It can prove the source and integrity of any digital data, and is a tool that is becoming increasingly important as more and more information is handled electronically.

Digital signature schemes use a pair of keys. One key is secret and allows the owner to sign some data, and the other is public and allows anyone to verify the signature. Assuming that the keys are large enough, and that a secure scheme is used, it is impossible to find the private key given only the public key. Since a signature is valid for the signed message only, this also means that it is impossible to forge a digital signature.

The most well-used scheme for constructing digital signatures today is RSA, which is based on the hard mathematical problem of integer factorization. There are, however, other mathematical problems that are considered even harder, which in practice means that the keys can be made shorter, resulting in a smaller memory footprint and faster computations. One such alternative approach is using elliptic curves.

The underlying mathematical problem of elliptic curve cryptography is different to that of RSA, however some structure is shared. The purpose of this thesis was to evaluate the performance of elliptic curves compared to RSA, on a system designed to efficiently perform the operations associated with RSA.

The discovered results are that the elliptic curve approach offers some great advantages, even when using RSA hardware, and that these advantages increase significantly if special hardware is used. Some usage cases of digital signatures may, for a few more years, still be in favor of the RSA approach when it comes to speed. For most cases, however, an elliptic curve system is the clear winner, and will likely be dominant in the near future.


Acknowledgments

First of all, I would like to thank Jesús Rodríguez Cabrero for allowing me to do my thesis at Realsec in Madrid.

I would also like to thank my co-workers at Realsec, including my supervisor Pablo García, for the warm welcome to Spain and for giving me expert guidance. I specifically want to thank Luis Jesús Hernández for all our great debugging and discussion sessions.

In addition, I would like to thank my examiner Jan-Åke Larsson for his interest in examining the thesis and for his valuable comments along the way.

Finally, I would like to thank my friends and family who have helped me proofread the thesis.

Linköping, August 2012
Martin Krisell


Contents

1 Background
  1.1 Introduction
  1.2 Realsec
  1.3 Purpose of Thesis
  1.4 Outline of Report

I Theory

2 Cryptography Overview
  2.1 Basic Concepts
  2.2 Historical Ciphers
    2.2.1 The Caesar Cipher
    2.2.2 Substitution Cipher
    2.2.3 The Vigenère Cipher
  2.3 Modern Cryptography
  2.4 Goals of Cryptography
  2.5 Attack Models
  2.6 Bits of Security
  2.7 Computer Security

3 Symmetric Cryptography
  3.1 A Symmetric Cipher
    3.1.1 The One Time Pad
  3.2 Security Definitions
    3.2.1 Perfect Secrecy
    3.2.2 Semantic Security
  3.3 Stream Ciphers
  3.4 Block Ciphers
    3.4.1 Pseudorandom Permutations
    3.4.2 DES
    3.4.3 AES
    3.4.4 Modes of Operation
  3.5 Hash Functions
    3.5.1 The Birthday Problem
  3.6 Message Authentication Codes
    3.6.1 CBC-MAC
    3.6.2 HMAC
    3.6.3 Authenticated Encryption

4 Asymmetric Cryptography
  4.1 The Key Distribution Problem
  4.2 Public and Private Keys
  4.3 Key Exchange
    4.3.1 Diffie-Hellman-Merkle Key Exchange
  4.4 Trapdoor Permutations
  4.5 Semantic Security
  4.6 ElGamal
  4.7 RSA
    4.7.1 RSA Encryption Standards
  4.8 Hybrid Systems
  4.9 Security of Public Key Algorithms
    4.9.1 RSA Security
    4.9.2 Solving the Discrete Logarithm Problem
    4.9.3 Shor's Algorithm
  4.10 Elliptic Curve Cryptography
    4.10.1 Elliptic Curves
    4.10.2 Elliptic Curves Over Finite Fields
    4.10.3 Projective Coordinate Representations
    4.10.4 The Elliptic Curve Discrete Logarithm Problem (ECDLP)
    4.10.5 Group Order
    4.10.6 Domain Parameters
    4.10.7 Elliptic Curve Key Pair
    4.10.8 Encryption Using Elliptic Curves (ECIES)
  4.11 Digital Signatures
    4.11.1 RSA Signatures
    4.11.2 Digital Signature Algorithm (DSA)
    4.11.3 Elliptic Curve DSA (ECDSA)
  4.12 Public Key Infrastructure
    4.12.1 Certificates

II Implementation and Performance Evaluation

5 Implementation
  5.1 Hardware Security Module
  5.2 Hardware
    5.2.1 Montgomery Multiplications
    5.3.1 Overall Code Structure
    5.3.2 Layer 1 - Finite Field Arithmetics
    5.3.3 Layer 2 - Point Addition and Doubling
    5.3.4 Layer 3 - Point Multiplication
    5.3.5 Layer 4 - Cryptographic Protocols
    5.3.6 Testing

6 Performance Evaluation of Implementation
  6.1 Performance of HSM Implementation
    6.1.1 Key Pair Generation
    6.1.2 Signature Generation
    6.1.3 Signature Verification
  6.2 Performance of Other ECDSA and RSA Implementations
  6.3 Conclusion

A Mathematical Prerequisites
  A.1 Complexity Theory
  A.2 Number Theory
  A.3 Modular Arithmetic
    A.3.1 The Chinese Remainder Theorem
    A.3.2 Modular Exponentiation
    A.3.3 Multiplicative Inverses
  A.4 Groups and Finite Fields
    A.4.1 Generators and Subgroups
    A.4.2 The Discrete Logarithm Problem
    A.4.3 Finite Fields


1 Background

This chapter gives an introduction to the thesis, describing the background of, and the goals of, the work performed. An outline of the report is provided, as well as definitions of commonly used abbreviations.

1.1 Introduction

Cryptography is an invaluable tool today and its applications are growing continuously. Even though most implementations are transparent to the user, it is almost impossible to use a computer today without having to rely on cryptographic constructions. The purpose of using cryptography is, of course, to provide security. However, when deciding which scheme to use, security is not the only concern. Often a decisive factor in the choice between two or more constructions is their performance. This is especially important on the Internet, where a server usually needs to be able to handle many simultaneous user requests, and where users generally are impatient. The performance of a cryptographic scheme is determined by two factors: the underlying algorithm and the implementation. The implementation can be done in either software or hardware; software is simpler and more maintainable, while hardware may give better performance as well as higher security.

1.2 Realsec

Realsec is a Spanish company, based in Madrid, that has since 2001 been providing information security solutions to banks, governments, and public organizations. Most of its cryptographic products are based on a Hardware Security Module (HSM), a device that provides several security services such as encryption, digital signatures, and public key infrastructure. My thesis work was performed at the Madrid office, and my implementation was done on their new HSM module, not yet released to customers.

1.3 Purpose of Thesis

The goal of this thesis was to evaluate the security and performance of elliptic curve based algorithms for digital signatures, in particular compared to those based on RSA, which is by far the most common choice today. A goal derived from this was to create an implementation for Realsec that combines the elliptic curve algorithms with an efficient implementation, and to compare its performance with their existing, RSA based, scheme for generating digital signatures. The new elliptic curve implementation runs on the same hardware as the current RSA implementation, which is specialized for the RSA operations. The goal was to show that elliptic curves have some great advantages over the RSA approach, especially at higher security levels, and that increased performance can be achieved without the need for new specialized hardware.

1.4 Outline of Report

The basic structure of this thesis is that the first part, chapters 2-4, covers the theoretical foundations of cryptography needed in order to understand the basics behind digital signatures. The second part, chapters 5-6, covers the implementation work performed and the results achieved.

If the reader is already familiar with the theoretical parts of cryptography, it is possible to jump straight to the second part. Whenever specific details from the first part are used, they are referenced and the page number on which they appear is given.

In order to not require too many prerequisites, there is an appendix covering the required mathematical background. These prerequisites are referred to in the text whenever needed.



Abbreviations

ACP    Asymmetric Crypto Processor
AES    Advanced Encryption Standard
ANSI   American National Standards Institute
BAU    Big-integer Arithmetic Unit
CBC    Cipher Block Chaining
CDH    Computational Diffie-Hellman
CTR    Counter Mode
DES    Data Encryption Standard
3DES   Triple DES
DH     Diffie-Hellman Key Exchange
DLP    Discrete Logarithm Problem
DSA    Digital Signature Algorithm
ECB    Electronic Code Book
ECC    Elliptic Curve Cryptography
ECDH   Elliptic Curve Diffie-Hellman Key Exchange
ECDLP  Elliptic Curve Discrete Logarithm Problem
ECDSA  Elliptic Curve Digital Signature Algorithm
ECIES  Elliptic Curve Integrated Encryption Scheme
GF     Galois Field
HMAC   Hash-based Message Authentication Code
JP     Jacobian Projective
KDC    Key Distribution Center
KDF    Key Derivation Function
LD     López-Dahab Projective
MAC    Message Authentication Code
MM     Montgomery Multiplier
NIST   National Institute of Standards and Technology
OAEP   Optimal Asymmetric Encryption Padding
OTP    One Time Pad
PKCS   Public Key Cryptography Standards
PRG    Pseudo Random Generator
PRP    Pseudo Random Permutation
RSA    Rivest Shamir Adleman
SHA    Secure Hash Algorithm

Part I: Theory

2 Cryptography Overview

This chapter gives a brief overview of cryptography. Some historical examples of ciphers are provided, as well as a quick introduction to modern cryptography. In addition, the goals of cryptography and the different attack models are defined, as they will be referred to throughout this thesis.

2.1 Basic Concepts

The basic purpose of cryptography is to allow two parties, often denoted Alice and Bob, to talk to each other over an insecure channel while preventing an evil adversary, Eve, from understanding and participating. Alice and Bob utilize cryptographic constructions in order to transform the insecure channel into a secure one. The communication between the two parties may take place over space (e.g. over the Internet) or in time (e.g. for disc encryption). The original message is often referred to as the plaintext, and the transformed, unreadable message is often called the ciphertext. The basic idea of cryptography can be visualized as in figure 2.1.

Figure 2.1: Basic overview of cryptography.

2.2 Historical Ciphers

The need for securing information has existed ever since humanity acquired the ability to write. Still, it is only very recently that cryptography started to be treated as a science, with constructions motivated by mathematical proofs. Up until the 20th century, the security of a cipher was based on obscurity and, more importantly, on the secrecy of the algorithm used. Once the method for encryption was released or reverse engineered, the cipher was always eventually broken. In this section, a quick overview of a few historical ciphers is given. More information about these, and the often very exciting stories surrounding them, can be found in Kahn [1].

2.2.1 The Caesar Cipher

The Caesar cipher is the simplest possible encryption scheme. All it does is shift the plaintext letters individually three steps forward in the alphabet, such that A → D, B → E, and so on. When the end of the alphabet is reached, it wraps around, i.e. X → A, Y → B, and Z → C. Described mathematically, the letters of the alphabet are assigned numbers in order from 0 to 25, and encryption is done by performing addition by 3 mod 26 (for an introduction to modular arithmetic, see Appendix A.3 on page 83). Decryption is then of course done by performing subtraction by 3 mod 26. This is in fact not a cipher at all, since there is no key (the definition of a symmetric cipher is given in chapter 3 on page 13). However, usually any cipher that is defined by adding a number n mod 26 is referred to as a Caesar cipher. This cipher is trivially broken by simply trying all 26 possibilities, an attack method referred to as exhaustive search or as a brute force attack.
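The shift-and-wrap operation and the exhaustive search attack can be sketched in a few lines of Python (the thesis itself contains no code; the function names here are illustrative):

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def caesar_encrypt(plaintext, shift=3):
    # Shift each letter forward, wrapping around at the end of the alphabet.
    return "".join(ALPHABET[(ALPHABET.index(c) + shift) % 26] for c in plaintext)

def caesar_decrypt(ciphertext, shift=3):
    # Decryption is subtraction mod 26.
    return "".join(ALPHABET[(ALPHABET.index(c) - shift) % 26] for c in ciphertext)

def brute_force(ciphertext):
    # Exhaustive search: only 26 candidate keys, so try them all.
    return [caesar_decrypt(ciphertext, s) for s in range(26)]

print(caesar_encrypt("HELLO"))  # -> KHOOR
```

Running `brute_force` on any ciphertext yields a list of 26 candidates, among which the original plaintext is easy to spot by eye.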

2.2.2 Substitution Cipher

In the previous example, the size of the key space (i.e. the number of keys) was small enough to make the exhaustive search approach practical. In a substitution cipher, instead of simply allowing rotations of the alphabet, each letter may map to any other letter. An example of such a mapping is given in figure 2.2.

ABCDEFGHIJKLMNOPQRSTUVWXYZ
BMCDAEOFGHRKYJZLPNQVUWSXIT

Figure 2.2: An example of a substitution cipher key.

Now, the key is given by an ordering of the letters of the alphabet, and since this can be done in 26! different ways, the key space is much larger than before. It turns out that this key space is large enough to prevent any exhaustive search attack, but the substitution cipher is still not secure. The reason is that not all letters in a message are equally common. The most frequent letter in the ciphertext is very likely to correspond to the most frequent letter in the given language, e.g. 'e' in English. By performing a so-called statistical analysis of letters, letter pairs, etc., a substitution cipher can always be broken. This means that a large key space is a necessary, but not sufficient, requirement for a cipher to be secure.
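The frequency leak can be demonstrated with a short sketch, using the example key from figure 2.2 (the message text and helper names below are chosen for illustration, not taken from the thesis):

```python
from collections import Counter

PLAIN  = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CIPHER = "BMCDAEOFGHRKYJZLPNQVUWSXIT"  # the key from figure 2.2
ENC = dict(zip(PLAIN, CIPHER))

def substitute(message):
    # Replace each letter according to the key mapping.
    return "".join(ENC[c] for c in message if c in ENC)

message = "THEQUICKBROWNFOXJUMPSOVERTHELAZYDOGWHILEEVELISTENS"
ciphertext = substitute(message)

# Each plaintext letter always maps to the same ciphertext letter, so the
# most frequent ciphertext letter betrays the most frequent plaintext letter.
top_plain  = Counter(message).most_common(1)[0][0]
top_cipher = Counter(ciphertext).most_common(1)[0][0]
print(top_cipher == ENC[top_plain])  # -> True
```

Here the most common plaintext letter is 'E', so the most common ciphertext letter is its image under the key, which is exactly the foothold a statistical attack needs.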

2.2.3 The Vigenère Cipher

The problem with the substitution cipher is that the same plaintext letter will always be encrypted into the same ciphertext letter, thus preserving all letter frequencies of the original message. The Vigenère cipher was designed to fix this. It uses a key of any length, which is repeated until it is as long as the message. Encryption is then done similarly to the Caesar case, by performing modular addition of each message character with the corresponding character of the expanded key. This means that a repeated plaintext character may be transformed into a different ciphertext character, depending on its position relative to the key. An example of the Vigenère cipher is given in figure 2.3.

This cipher was initially thought to be unbreakable, but further insight made it almost as easy to break as a normal substitution cipher. The problem is that as soon as the attacker knows the length of the key, i, they can pick every i'th character of the ciphertext; since these have all been encrypted by the same key character, statistical analysis can be performed on them. By repeating this for every position, the entire message can be retrieved. If the size of the key is unknown, it can be found by looking for repeated patterns in the ciphertext, or by simply trying different lengths.

The above approach to breaking the Vigenère cipher works because the key is repeated. However, if the key is long enough (or the message short enough), it will not work. In that case, the only potential problem is the way the key characters are chosen. If the characters of the key are chosen truly at random (uniformly) over the alphabet, then every plaintext letter is transformed into each other letter with equal probability, and no statistical analysis will be able to break the cipher. We will keep this idea in mind and return to it at the beginning of the chapter on symmetric cryptography.

HELLO BOB I LOVE YOU   (plaintext)
KEYKE YKE Y KEYK EYK   (repeated key)
RIJVS ZYF G VSTO CME   (ciphertext, addition mod 26)

Notice that the two "b"s in Bob are encrypted to different ciphertext letters.

Figure 2.3: An example of the Vigenère cipher in use.
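The encryption in figure 2.3, and the "every i'th character" slicing that the attack relies on, can be sketched as follows (illustrative code, not from the thesis):

```python
A = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def vigenere_encrypt(plaintext, key):
    # Repeat the key along the message and add characters mod 26.
    return "".join(
        A[(A.index(p) + A.index(key[i % len(key)])) % 26]
        for i, p in enumerate(plaintext)
    )

ciphertext = vigenere_encrypt("HELLOBOBILOVEYOU", "KEY")
print(ciphertext)  # -> RIJVSZYFGVSTOCME, as in figure 2.3

# Once the key length i is known (here 3), every i'th character was shifted
# by the same key letter, so each slice is a Caesar cipher that can be
# attacked by statistical analysis on its own.
slices = [ciphertext[j::3] for j in range(3)]
```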

2.3 Modern Cryptography

The first seed of what is today referred to as Modern Cryptography was planted by Auguste Kerckhoffs in 1883, when he published two articles proposing a few requirements for all future encryption schemes [2]. The most important one tackles the previous security-by-obscurity approach by stating that a cryptographic construction should remain secure even if all details about the algorithms fall into the wrong hands. Only one small parameter should need to be kept secret, referred to as the key.

This idea changed cryptography completely, and transformed it from an art form into a science. Today, ciphers are developed openly and scrutinized by the cryptographic community before being widely accepted. New cryptographic standards are sometimes decided through competitions where different cryptography research teams try to break each other's constructions. The believed security of a cipher increases over time as more and more effort is put into breaking it. In addition, a new kind of cryptography has more recently been invented, called public key cryptography. Its constructions are often based on more or less complex mathematics and are usually accompanied by formal proofs of security. The main practical difference is that the encryption and decryption keys are no longer the same, and besides solving the key distribution problem (discussed in chapter 4), it also permits some new and interesting applications of cryptography.

2.4 Goals of Cryptography

One of the main objectives when using cryptography is of course to prevent sensitive information from falling into the wrong hands. This is, however, not the only objective. Equally important, and in some cases even more important, is to be certain of who actually sent the data, and that it has not been modified during transmission. The four main objectives of cryptography, ordered alphabetically, are the following.

• Authentication
Providing assurance that the expected sender of the message is also the true sender.

• Confidentiality
Preventing unintended recipients from reading the message.

• Integrity
Preventing undetected modification during transmission.

• Non-repudiation
Ensuring that the sender of an authenticated message can never deny having sent it.

When describing cryptographic constructions throughout the rest of this thesis, it will be specified which of these goals the specific construction is intended to provide.

2.5 Attack Models

An attack model specifies the amount of information that the adversary has access to when trying to break encryptions. In the simplest case, the adversary, Eve, knows nothing about what kind of data is being sent over the channel, and can only see the ciphertext. In this case, Eve can only mount a so-called ciphertext only attack. This is the weakest kind of attack, and any system that is vulnerable to it (such as all of the historical ciphers mentioned earlier) should never be used.

There are cases when the adversary may know all or some of the plaintext that was used to generate some specific ciphertext. For example, an email message always starts with "From:", so even if the contents of the message are unknown, some information about the plaintext is known. In this case, Eve can mount a known plaintext attack.

Finally, it may be the case that the adversary actually gets to choose the messages being encrypted, and then uses the plaintext and ciphertext pairs to extract information from other ciphertexts. An example could be disc encryption, where the attacker somehow can affect the files being stored in the system, e.g. by sending an email with an attachment. This is called a chosen plaintext attack.

Generally, we try to provide security against all of these models, since we usually cannot assume anything about the attacker and thus have to assume the worst. In the next chapter we will give a formal definition of security, and see that any system fulfilling it will be secure also against chosen plaintext attacks.


2.6 Bits of Security

As previously stated, a minimal requirement for a cryptographic system to be secure is that the key space is large enough to prevent attacks by exhaustive search. If it is not, the attacker can simply try every possible key until the decrypted message makes sense (except for one case, described in the next chapter). For a perfectly built encryption scheme, exhaustive search would be the only possible attack, and a large key space would also guarantee security. Real systems, however, might have weaknesses that enable a faster way to break them. When we talk about the bits of security of a cryptographic scheme, we mean the number of bits corresponding to a key space for which exhaustive search would take the same amount of time as the best known algorithm for breaking the specific scheme. For example, if the best known attack against scheme A runs in time Θ(2^(n/2)) (see Appendix A.1 on page 81 for asymptotic notation), where n is the size of the key in bits, then A has n/2 bits of security.
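As a rough worked example of why the key space size matters, the expected scale of an exhaustive search can be computed directly (the rate of 10^9 keys per second is an assumed figure for illustration, not taken from the thesis):

```python
def brute_force_years(security_bits, keys_per_second=1e9):
    # Worst-case time to try all 2^n keys at the assumed rate.
    seconds = 2 ** security_bits / keys_per_second
    return seconds / (3600 * 24 * 365)

# 40 bits fall in well under an hour; 80, 112 and 128 bits are far beyond
# reach by exhaustive search alone.
for n in (40, 80, 112, 128):
    print(n, f"{brute_force_years(n):.3g} years")
```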

The National Institute of Standards and Technology (NIST) is a federal agency in the United States that, among other things, gives security recommendations. Up until 2010, they recommended using cryptographic systems with a minimum of 80 bits of security, but since 2011, the recommendation is at least 112 bits. This is supposed to provide security until 2030; after that, 128 bits is the minimum recommendation [3]. When choosing a construction, the required security depends not only on the time during which the construction will be used, but also on the time the encrypted information needs to be kept secret. A message valid for only a few minutes, e.g. a one-time login code, does not need the same security as a message valid for years.

2.7 Computer Security

One note has to be made about the relation between cryptography and computer security, since the two are sometimes incorrectly considered equivalent. Cryptography is only a subset of computer security; not all computer-related security problems can be solved by cryptography. Also, even if a theoretically secure cryptographic construction is being used, it is not necessarily secure in practice. One such example is so-called side-channel attacks, which are attacks not on the theoretical constructions but rather on the implementation, where e.g. the power usage or the time taken for encryption is measured and used to deduce information about the secret key or the message being encrypted. An example of such an attack is given in Kühn [4].
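As a small illustration of a timing side channel (a sketch for illustration, not an attack from the thesis): a comparison that exits at the first mismatching byte leaks, through its running time, how long a matching prefix the attacker has guessed. Python's standard library offers a comparison designed to avoid this.

```python
import hmac

def leaky_equal(a: bytes, b: bytes) -> bool:
    # Returns at the first difference, so the running time reveals the
    # length of the matching prefix: a classic timing side channel.
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True

def constant_time_equal(a: bytes, b: bytes) -> bool:
    # hmac.compare_digest is designed so that its running time does not
    # depend on where the inputs differ.
    return hmac.compare_digest(a, b)
```

Both functions compute the same boolean; only the second is suitable for comparing secrets such as MAC tags.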

Another attack that cryptography cannot defend against is a replay attack (or playback attack), where the adversary records a complete encrypted message and then replays this transmission at a later time. The attacker may not know what the message says, but any effect it has upon reception may be triggered again, thus clearly posing a security risk. To protect against this, additional measures must be taken, such as sequence numbering or time stamping.


3 Symmetric Cryptography

This chapter gives an overview of symmetric key encryption schemes, and also gives a few examples of the most well-known constructions. Note, however, that the focus of this thesis is on asymmetric cryptography, so this chapter will be fairly brief and only cover enough to understand the possibilities and problems of cryptography. For a much deeper and more thorough treatment of symmetric cryptography, see Menezes, van Oorschot, and Vanstone [5].

3.1 A Symmetric Cipher

A symmetric cipher provides a way of transforming a plaintext message into so-called ciphertext, by using a key. Anyone who has the key can use it to get back the original message from the ciphertext. We call this a symmetric scheme because the same key is used for both encryption and decryption. Throughout this chapter, we assume that the two parties communicating have already been able to share a secret key. The problem of obtaining this key is discussed in the next chapter. An overview of a symmetric encryption scheme is given in figure 3.1.

A symmetric cipher can be mathematically defined in the following way.

3.1 Definition (Cipher). Let K be the set of all keys, M be the set of all plaintext messages, and C be the set of all ciphertext messages.

A symmetric cipher defined over (K, M, C) is a pair of algorithms (E, D) where E : K × M → C and D : K × C → M, and where ∀k ∈ K, ∀m ∈ M : D(k, E(k, m)) = m.

Figure 3.1: Overview of symmetric cryptography.
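The defining property of definition 3.1 can be checked mechanically for any concrete cipher. A minimal sketch, using the Caesar cipher from chapter 2 as the instance (E, D) over a sample of the message space:

```python
A = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
K = range(26)                   # the key space
M = ["HELLO", "BOB", "ATTACK"]  # a small sample of the message space

def E(k, m):
    # Encryption: add the key to each letter mod 26.
    return "".join(A[(A.index(c) + k) % 26] for c in m)

def D(k, c):
    # Decryption: subtract the key mod 26.
    return "".join(A[(A.index(x) - k) % 26] for x in c)

# The correctness requirement: D(k, E(k, m)) == m for all k and m.
print(all(D(k, E(k, m)) == m for k in K for m in M))  # -> True
```

Note that the check only verifies correctness, not security; the definition itself says nothing about how hard the cipher is to break.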

We have already seen a few examples of symmetric ciphers in the historical overview. Note that there is no notion of security in the definition of a cipher. We will soon give some security definitions, but first we will formalize the observation made in the previous chapter when discussing the Vigenère cipher.

3.1.1 The One Time Pad

The One Time Pad (OTP) is a cipher that works similarly to the Vigenère Cipher, but instead of repeating a short key, we require that the key is at least as long as the message. This means that every letter in the message will be encrypted by a different key character and if those are chosen independently at random, the original contents of the message will be completely hidden. The definition of the one time pad over a binary alphabet is given below, but an equivalent definition can be given for any alphabet.

3.2 Definition (One Time Pad). The One Time Pad cipher is the pair (E, D) defined over (K = {0, 1}^n, M = {0, 1}^n, C = {0, 1}^n), where for k chosen uniformly at random from K, m ∈ M, c ∈ C : E(k, m) = m ⊕ k and D(k, c) = c ⊕ k.¹

Used correctly, this cipher intends to provide confidentiality, and none of the other goals of cryptography that were discussed in the previous chapter. Note that the cipher is called the one time pad for a very good reason. If you ever use the same key for two different messages, the adversary can simply add the two ciphertexts together (mod 2) and the key will be eliminated, leaving the XOR of the two plaintext messages. There is enough redundancy in written language to extract the two messages from this. So, a key for the one time pad must only be used for one message. However, even if the OTP is used correctly, it is a very impractical cipher since the key needs to be at least as long as the message. We need a secure way to transfer this key, and if we already had one, it could be used to transmit the message directly instead. Using the OTP as described here
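The "two time pad" failure described above is easy to demonstrate in a few lines of Python. This is a sketch for illustration only (the messages and variable names are invented): two messages encrypted under the same key leak the XOR of the plaintexts.

```python
import os

def otp_encrypt(key: bytes, msg: bytes) -> bytes:
    # One time pad over bytes: ciphertext = message XOR key
    assert len(key) == len(msg)
    return bytes(k ^ m for k, m in zip(key, msg))

# Encryption and decryption are the same XOR operation.
otp_decrypt = otp_encrypt

key = os.urandom(16)
m0 = b"attack at dawn!!"
m1 = b"retreat at noon!"
c0 = otp_encrypt(key, m0)
c1 = otp_encrypt(key, m1)   # key reuse: the "two time pad" mistake

assert otp_decrypt(key, c0) == m0

# XORing the two ciphertexts cancels the key entirely,
# leaving the XOR of the two plaintexts:
leak = bytes(a ^ b for a, b in zip(c0, c1))
assert leak == bytes(a ^ b for a, b in zip(m0, m1))
```

The final assertion holds for any key, which is exactly why a pad must never be reused.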


only makes sense if the two parties can meet in advance and exchange a large number of keys, to use for future communication.²

3.2 Security Definitions

In order to be able to talk about the security of a cryptographic construction, we first need to define what we mean when we say that a cipher is secure. In this section we give two such definitions of security.

3.2.1 Perfect Secrecy

After founding information theory in the 1940s, Claude Shannon applied these ideas to cryptography and came up with a notion referred to as perfect secrecy [7]. One way of defining this is as follows.

3.3 Definition (Perfect Secrecy). A cipher (E, D) over (K, M, C) has perfect secrecy if ∀m0, m1 ∈ M, where m0 and m1 have equal length, and ∀c ∈ C

Pr [E(k, m0) = c] = Pr [E(k, m1) = c]

where k is chosen uniformly at random from K.

The definition says that given a ciphertext, the probability that a specific plaintext generated this ciphertext is the same for all plaintexts of equal lengths, i.e. there is no way to determine the original message. Not even exhaustive search can break this system, regardless of the key space size, since there is no way to tell when the original message is found.

It is very easy to prove that the one time pad in fact has perfect secrecy. However, it is also easy to prove that the inconvenience of the one time pad, the fact that |K| ≥ |M|, i.e. that the key must be as long as the message, is in fact a requirement for perfect secrecy. A cipher that has keys shorter than the messages can never be perfectly secret. This makes perfect secrecy a very impractical definition.

3.2.2 Semantic Security

Our second definition of security does not require perfect secrecy in an information-theoretical sense; it requires only that the cipher is secure enough to be unbreakable by any "efficient" adversary, i.e. one running in polynomial time (see Appendix A.1 on page 81). We define semantic security when using the same key for multiple encryptions, also sometimes referred to as indistinguishability under a chosen plaintext attack (IND-CPA), as follows.

3.4 Definition (Semantic Security). Semantic security is defined through a game between a challenger and an adversary, through these steps:

²This approach was actually used for the Moscow-Washington hotline, where diplomats


1. The challenger chooses a random key from the key space, and also a random number b ∈ {0, 1}.

2. The adversary gets to submit any number of plaintext messages to the challenger, and the challenger sends back an encryption of these under the chosen key.

3. The adversary generates m0 and m1, of equal length, of his choice, and

sends these to the challenger.

4. The challenger returns the encryption of mb.

The used cryptosystem is said to have Semantic Security if no "efficient" adversary can determine which of the two submitted messages was returned, with probability significantly greater than 1/2 (the probability achieved if the adversary just guesses).

It is easy to realize that this definition prevents an adversary from learning any information about the plaintext; thus, achieving semantic security is the goal for all cryptographic constructions intended to provide confidentiality. Note that m0 and m1 may very well be one of the previously submitted messages. This means that a system which always transforms the same plaintext into the same ciphertext can never be semantically secure. Instead, the encryption algorithm needs to be randomized, meaning that in addition to the key and the plaintext, it also takes bits from some random source as input, such that the output is different even if the input is the same. Decryption, however, needs to be deterministic since it should always return the same plaintext when decrypting a ciphertext.

3.3 Stream Ciphers

We have seen the good security properties of the one time pad, but also the impracticality of using it. We also know that there is no way to achieve perfect secrecy unless the key is as long as the message. The question is if it is possible to instead achieve semantic security by using the same idea as the one time pad, but with a shorter key. This is what stream ciphers try to do. The idea is to use an expansion function that takes as input a short truly random key and creates a longer pseudorandom keystream. We call the expansion function a Pseudorandom Generator (PRG), defined as follows.

3.5 Definition (PRG). A Pseudorandom Generator is a function G : {0, 1}^n → {0, 1}^s where s ≫ n.

Now, one definition of a stream cipher could look as follows.

3.6 Definition (Stream Cipher). A stream cipher is a pair (E, D) defined over (K = {0, 1}^n, M = {0, 1}^s, C = {0, 1}^s), where G : {0, 1}^n → {0, 1}^s is a PRG and, for k chosen uniformly at random from K, m ∈ M, c ∈ C : E(k, m) = m ⊕ G(k) and D(k, c) = c ⊕ G(k).
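The definition above can be sketched in Python. The expansion function here is a toy built from SHA-256 in a counter-like fashion; it is an illustration of the interface only, not a vetted PRG, and all names are invented for the example.

```python
import hashlib

def prg(seed: bytes, s: int) -> bytes:
    # Toy expansion G: short seed -> s pseudorandom bytes.
    # Built from SHA-256 for illustration; NOT a vetted PRG.
    out = b""
    counter = 0
    while len(out) < s:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:s]

def stream_encrypt(key: bytes, msg: bytes) -> bytes:
    # E(k, m) = m XOR G(k), with G expanded to the message length
    keystream = prg(key, len(msg))
    return bytes(a ^ b for a, b in zip(msg, keystream))

stream_decrypt = stream_encrypt   # XOR is its own inverse

key = b"\x01" * 16
msg = b"a much longer message than the 16-byte key"
ct = stream_encrypt(key, msg)
assert stream_decrypt(key, ct) == msg
```

Note how the short key drives an arbitrarily long keystream, which is exactly the gain over the one time pad.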


A concept illustration of a stream cipher is given in figure 3.2.

Figure 3.2: An example of stream cipher encryption.

In order for this stream cipher to have any chance of being secure, the PRG used must be secure. Security in this case is equivalent to unpredictability, meaning that with random-looking input, sometimes called the seed, the result must also look random, such that given some part of the output from the PRG, it is impossible to predict any future output. It has in fact been proven that the above construction, using a secure PRG, gives a stream cipher that is semantically secure as long as the key is used only once. That means that we modify the previous definition of semantic security to omit the second step. The problem is that no one knows if it is possible to construct secure PRGs; however, there are some good candidates such as Salsa20, which is part of the ECRYPT³ Stream Cipher Project [8].

Like the one time pad, a stream cipher tries to achieve confidentiality. It is easy to realize that it does not provide any integrity at all, and also that ciphertext modifications not only go undetected, but have a known effect on the plaintext because of the linearity of addition. In order to achieve integrity, a stream cipher must be accompanied or replaced by other constructions. History has shown that it is hard to implement stream ciphers securely, and many systems that have used stream ciphers have eventually been broken. The recommendation is therefore to instead use a standardized block cipher, defined in the next section.

³ECRYPT is the European Network of Excellence for Cryptology, a project launched to increase


3.4 Block Ciphers

Up until now, we have only seen constructions that encrypt each symbol in the alphabet individually. A construction that instead encrypts a block of plaintext into a block of ciphertext is called a block cipher. It is hard to make block ciphers as fast as stream ciphers. However, they may be more secure and, more importantly, they will give us new capabilities, explained later in this chapter. Before giving examples of block ciphers, we will first define the abstract idea of a secure pseudorandom permutation.

3.4.1 Pseudorandom Permutations

A Pseudorandom Permutation (PRP) is, simply described, an invertible function that is "efficiently" calculated in both the forward and the backward direction.

3.7 Definition (PRP). A Pseudorandom Permutation is a function F defined over (K, X ), F : K × X → X such that ∀k ∈ K and ∀x ∈ X :

1. There exists an "efficient" algorithm to evaluate F(k, x).

2. The function F(k, · ) is one-to-one.

3. There exists an "efficient" algorithm to invert F(k, x).

We say that a PRP is secure if no "efficient" adversary can distinguish between that function and a completely random function defined over the same space. Formally, the security is defined by a game similar to the one used for defining semantic security. The adversary submits input values and the challenger returns the output of either a random function or a PRP, and the PRP is secure if the adversary cannot tell which he received. This means, for example, that changing just one bit of the input should flip every bit of the output with probability 1/2, since this is what a random function would do. This property is sometimes referred to as diffusion, or as the avalanche criterion.

When discussing modes of operation below, we will see that using a secure PRP, we can achieve semantic security as previously defined. Hence, all that block ciphers try to do is behave like a secure PRP. However, as in the case of PRGs, no one knows if it is possible to construct secure PRPs, but there are constructions that are believed to be close, such as AES.

3.4.2 DES

The Data Encryption Standard (DES) was released in 1977 and was the first block cipher to be standardized by NIST. It has a block size of 64 bits and a key size of 56 bits. The short key size was criticized from the beginning and allows for exhaustive search attacks. Also, the short block size makes better attacks possible. Today, DES can be easily broken and should never ever be used.

The algorithm still lives on in the form of 3DES, which is simply DES applied three times with different keys. The usual way to do this is not to perform three consecutive DES encryptions, but rather encryption, decryption, and encryption, using the three keys respectively. The reason for the middle decryption is that when letting all three keys be the same, these steps reduce to normal DES, only slower. 3DES solves the problem of the short key and is considered a secure construction, although very slow. 3DES should therefore only be used for backward compatibility, and all new systems should preferably use AES. The inner workings of block ciphers are complicated and will not add much value to this thesis, and therefore they are not described here. For more information about how to build block ciphers, see Menezes et al. [5].
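The encrypt-decrypt-encrypt chain and its backward-compatibility property can be illustrated with a stand-in toy "cipher" (modular byte addition, invented purely for this sketch; it is of course not DES and has no security whatsoever):

```python
# Toy invertible "block cipher" standing in for DES (illustration only):
def enc(k: int, block: int) -> int:
    return (block + k) % 256

def dec(k: int, block: int) -> int:
    return (block - k) % 256

def ede(k1: int, k2: int, k3: int, block: int) -> int:
    # 3DES-style encrypt-decrypt-encrypt chain
    return enc(k3, dec(k2, enc(k1, block)))

# With three equal keys, EDE collapses to a single encryption,
# which is what gives 3DES hardware backward compatibility:
assert ede(7, 7, 7, 42) == enc(7, 42)

# With independent keys it is a genuinely different mapping:
assert ede(1, 2, 3, 42) != enc(1, 42)
```

The first assertion is the whole point of the middle decryption step: setting k1 = k2 = k3 makes the middle step undo the first.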

3.4.3 AES

The Advanced Encryption Standard was elected as the new standard in 2002, after a competition that was won by the cipher Rijndael. It has a block size of 128 bits, and key sizes of 128, 192, or 256 bits. Some attacks that reduce the number of bits of security exist; the best one, against AES-256, gives it only 99.5 bits of security in some special cases, as described in Biryukov and Khovratovich [9]. This is, however, still too much for exhaustive search, and AES is today considered to be a secure block cipher.

3.4.4 Modes of Operation

Block ciphers only act on messages that have a size of one block. If the message is shorter, padding is applied. However, if a message is longer than one block, we have to specify how to utilize the block cipher in order to encrypt this message. This is called the mode of operation, and there are several, with different advantages and disadvantages. Note that the purpose of these is still only to provide confidentiality. Some combinations of mode and block cipher may achieve other goals as well, e.g. integrity, but these should in general not be relied on. If more than confidentiality is intended, additional constructions, soon to be explained, should be used to provide this.

ECB

In Electronic Codebook (ECB) mode, each block of plaintext is encrypted individually into ciphertext, as shown in figure 3.3 on the next page. One way to look at this is that a simple table, a codebook, can be used to look up what ciphertext the different plaintext blocks will be transformed into. When using ECB mode, the same plaintext block will always generate the same ciphertext block, and so it is easy to realize that this is not semantically secure (it is not randomized), and therefore ECB mode should never be used.
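The leak is easy to observe with a toy deterministic block transform standing in for a real block cipher. The transform below (a truncated keyed SHA-256, invented for this sketch) is not even invertible and certainly not secure, but it is deterministic, which is all ECB's weakness needs:

```python
import hashlib

def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    # Deterministic keyed transform standing in for a block cipher
    # (illustration only; not invertible, not secure).
    return hashlib.sha256(key + block).digest()[:16]

def ecb_encrypt(key: bytes, msg: bytes) -> bytes:
    # Each 16-byte block is encrypted independently.
    assert len(msg) % 16 == 0
    return b"".join(toy_block_encrypt(key, msg[i:i + 16])
                    for i in range(0, len(msg), 16))

key = b"k" * 16
msg = b"SAME BLOCK HERE!" * 2 + b"other block data"
ct = ecb_encrypt(key, msg)

# The two identical plaintext blocks produce identical ciphertext
# blocks, revealing structure of the plaintext:
assert ct[0:16] == ct[16:32]
assert ct[32:48] != ct[0:16]
```

An eavesdropper learns, without any key, that blocks one and two of the message are equal, which already rules out semantic security.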

CBC

There are two weaknesses with ECB mode that the Cipher Block Chaining (CBC) mode tries to fix. First of all, ECB is deterministic. In order to fix this, we will use something called an Initialization Vector, IV⁴, which is simply a value chosen

⁴Sometimes the word "nonce" (number used once) is used interchangeably with IV. However, a

Figure 3.3: ECB mode of operation.

uniformly at random from some space, in this case over one block. Moreover, in ECB, a repeated plaintext block leads to a repeated ciphertext block, also eliminating any chance of being semantically secure. In CBC mode, each ciphertext block does not just depend on the key and the plaintext, but also on the previous block of ciphertext. For the first block, we use the IV in place of the previous block. This means that even if the same plaintext block is repeated, the ciphertext blocks will differ. The IV is then sent in the clear along with the ciphertext. Figure 3.4 on the next page describes the CBC mode of operation.

Given that the IV is chosen truly at random, and that the used block cipher is a secure PRP, CBC mode is semantically secure.
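A sketch of CBC is given below. Since CBC needs an invertible block cipher, the sketch builds a toy four-round Feistel network from SHA-256 (invertible by construction, but an invented illustration, not a vetted cipher):

```python
import hashlib, os

BLOCK = 16

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Keyed round function, truncated to half a block.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

def feistel_encrypt(key: bytes, block: bytes) -> bytes:
    # Toy 4-round Feistel network; invertible by construction.
    L, R = block[:8], block[8:]
    for i in range(4):
        L, R = R, xor(L, _round(key, i, R))
    return L + R

def feistel_decrypt(key: bytes, block: bytes) -> bytes:
    # Run the rounds backwards to undo the encryption.
    L, R = block[:8], block[8:]
    for i in reversed(range(4)):
        L, R = xor(R, _round(key, i, L)), L
    return L + R

def cbc_encrypt(key: bytes, iv: bytes, msg: bytes) -> bytes:
    ct, prev = b"", iv
    for i in range(0, len(msg), BLOCK):
        prev = feistel_encrypt(key, xor(msg[i:i + BLOCK], prev))
        ct += prev
    return ct

def cbc_decrypt(key: bytes, iv: bytes, ct: bytes) -> bytes:
    pt, prev = b"", iv
    for i in range(0, len(ct), BLOCK):
        block = ct[i:i + BLOCK]
        pt += xor(feistel_decrypt(key, block), prev)
        prev = block
    return pt

key, iv = b"K" * 16, os.urandom(BLOCK)
msg = b"SAME BLOCK HERE!" * 2           # two identical plaintext blocks
ct = cbc_encrypt(key, iv, msg)
assert cbc_decrypt(key, iv, ct) == msg
assert ct[:BLOCK] != ct[BLOCK:]         # chaining hides the repetition
```

The last assertion is the fix for ECB's second weakness: the repeated plaintext block no longer produces a repeated ciphertext block, because each block is first XORed with the previous ciphertext block.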

CTR

In Counter (CTR) mode, the block cipher is actually not applied to the plaintext at all. Instead, it is used to create a key stream that is then XORed with the plaintext, much like how stream ciphers work. We know from before that in order for a stream cipher to be secure, the same key stream can never be used more than once. The way we achieve this here is to use the block cipher to encrypt an IV, chosen at random for each message, concatenated with a counter that is increased for each block. If this is done properly, CTR mode can also be proven to be semantically secure, of course assuming that the used block cipher is a secure PRP. The operation of counter mode is shown in figure 3.5 on page 22. Note that for counter mode, the different blocks in the keystream can be calculated in parallel, and even before the message is known. This makes counter mode much more efficient than CBC mode.
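The keystream construction can be sketched as follows; SHA-256 again stands in for the block cipher applied to IV-plus-counter (an invented illustration, not a standardized CTR implementation). Note that CTR mode only uses the cipher in the forward direction, so encryption and decryption are the same function:

```python
import hashlib, os

def ctr_keystream(key: bytes, iv: bytes, nblocks: int):
    # E(k, IV || counter) as a keystream. Every block is independent,
    # so this loop could run in parallel, even before the message is
    # known. SHA-256 stands in for the block cipher (illustration only).
    for ctr in range(nblocks):
        yield hashlib.sha256(key + iv + ctr.to_bytes(8, "big")).digest()

def ctr_crypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    nblocks = -(-len(data) // 32)        # ceiling division
    ks = b"".join(ctr_keystream(key, iv, nblocks))[:len(data)]
    return bytes(a ^ b for a, b in zip(data, ks))

key, iv = b"K" * 16, os.urandom(16)
msg = b"counter mode needs no inverse of the block cipher"
ct = ctr_crypt(key, iv, msg)
assert ctr_crypt(key, iv, ct) == msg     # encryption == decryption
```

Because decryption never inverts the block transform, CTR mode even works with non-invertible primitives, which is why the sketch gets away with a hash function.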

Examples of additional modes are cipher feedback mode and output feedback mode. More information about these can be found in Menezes et al. [5].

Figure 3.4: CBC mode of operation.

3.5 Hash Functions

A hash function is a deterministic function that takes an arbitrary-length input and produces a fixed-length output. Hash functions are not ciphers at all, since they use no key; they are covered here because they are very useful in many cryptographic constructions. Note that hash functions are also used outside the area of cryptography, usually with much less strict requirements. In this thesis, only cryptographic hash functions are discussed.

3.8 Definition (Cryptographic hash function). A cryptographic hash function is a function F : {0, 1}* → {0, 1}^n, where n is the size of the output, that has the following properties.

Preimage Resistance The function must be one-way, i.e. given h, it should be difficult to find any m such that F(m) = h.

Second Preimage Resistance Given m0, it should be difficult to find any m1 ≠ m0 such that F(m0) = F(m1).

Collision Resistance It should be difficult to find any pair m0, m1 (m0 ≠ m1) such that F(m0) = F(m1).

Just as in the case of secure PRGs and secure PRPs, no one knows if it is possible to construct cryptographic hash functions. Examples of famous current constructions are MD5, SHA-1, and SHA-2, which all use the Merkle-Damgård construction [10]. Only the last of these is considered secure today. However, there is an ongoing hash function competition, issued by NIST, ending in 2012. The winning hash function will be called SHA-3 and will be the new standard, intended

Figure 3.5: CTR mode of operation.

to remain secure for a long time.

The hardest property to fulfill for cryptographic hash functions is collision resistance. When it is possible to violate this property, we say that we have found a collision. Finding one by brute force is, however, not as hard as it might first seem.

3.5.1 The Birthday Problem

Assume that we have a standard school class with 30 students; what is the probability that two of them share the same birthday? Intuition tells us that this probability should be fairly low, since there are a lot more days than students, but the mathematics tells us otherwise. The probability of a collision is the complement of the probability of no collision. For no collision to occur, the first student can be born on any day, the second on any of the remaining days, and so on for all students. So,

Pr[two share a birthday] = 1 − ∏_{k=0}^{29} (365 − k)/365 ≈ 0.706.

That is, in a class with 30 students, the probability that two share the same birthday is over 70 %, and the probability is over 50 % for only 23 students. These numbers hold if birthdays are uniformly distributed over the year. In reality, they are not, which makes the probability of a collision even higher.
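The computation above is easy to verify directly (a small sketch; the function name is invented for the example):

```python
from math import prod

def p_collision(n: int, days: int = 365) -> float:
    # 1 minus the probability that all n birthdays are distinct
    return 1 - prod((days - k) / days for k in range(n))

assert abs(p_collision(30) - 0.706) < 0.001   # ~70.6 % for 30 students
assert p_collision(23) > 0.5                  # crosses 50 % at 23 students
assert p_collision(22) < 0.5
```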

In general, it can be proven that the number of calculations that need to be performed in order to find a collision among N elements is approximately √(2 ln 2 · N) ≈ 1.18√N. This result is important for hash functions, and as we will soon see also in other cryptographic constructions, since it tells us that the maximum number of bits of security that can be achieved when collisions need to be avoided is half the size of the output. Using this fact to attack a system is called a birthday attack.
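A small experiment illustrates the bound. Truncating SHA-256 to 4 bytes gives N = 2^32 possible digests, so a brute-force search should find a collision after roughly √(2 ln 2 · 2^32) ≈ 77 000 attempts on average, rather than the 2^32 a preimage search would need. The setup below is a toy invented for this sketch:

```python
import hashlib
from math import sqrt, log

def trunc_hash(data: bytes, nbytes: int = 4) -> bytes:
    # 32-bit toy hash: SHA-256 truncated to nbytes bytes
    return hashlib.sha256(data).digest()[:nbytes]

def find_collision(nbytes: int = 4):
    # Hash counter values until some digest repeats.
    seen = {}
    i = 0
    while True:
        m = i.to_bytes(8, "big")
        h = trunc_hash(m, nbytes)
        if h in seen:
            return seen[h], m, i + 1   # two colliding inputs, work done
        seen[h] = m
        i += 1

m0, m1, tries = find_collision(4)
assert m0 != m1 and trunc_hash(m0) == trunc_hash(m1)

expected = sqrt(2 * log(2) * 2**32)    # ~ 77 000 expected attempts
assert tries < 20 * expected           # finishes well within bounds
```

Compare with a preimage search on the same 32-bit hash, which would take around 2^32 attempts: the birthday attack halves the effective number of security bits.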


3.6 Message Authentication Codes

We have so far only discussed how to achieve confidentiality in a system, which is enough to make a system secure when the adversary only has eavesdropping capabilities. In the real world, however, the adversary can be active and modify the contents of a message during transmission. We will now see how we can provide other goals, namely integrity, authentication, and in some sense non-repudiation.

Many network protocols utilize some kind of checksum to detect errors during transmission. These are, however, intended to detect random errors, and a malicious adversary can easily make undetectable modifications. A Message Authentication Code (MAC) is, just like a hash function, a short digest of a message, but in addition to the message it also takes a key as input.⁵ Usually, the output is called a tag. The tag is calculated and sent together with the message and, upon reception, the MAC is verified, usually by simply calculating the tag again and comparing the two. The goal is that only those with access to the correct key will be able to create tags that verify. We define a MAC as follows.

3.9 Definition (MAC). A Message Authentication Code, defined over (K, M, T ) is a pair of algorithms (S, V ) where S : K × M → T and V : K × M × T → {"yes", "no"} and where ∀m ∈ M, ∀k ∈ K : V (k, m, S(k, m)) = "yes".

The definition of a secure MAC is similar to the definition of semantic security for a cipher.

3.10 Definition (Secure MAC). The security of a MAC under a chosen plaintext attack is defined through a game between a challenger and an adversary, through these steps:

1. The challenger chooses a random key from the key space.

2. The adversary gets to submit any number of plaintext messages to the challenger, and for each the challenger sends back a tag for the message under the chosen key.

3. The adversary sends a message-tag pair, (m, t), not equal to any of the previously created pairs.

4. The challenger runs the verification algorithm on (m, t).

The MAC used is said to be secure, or existentially unforgeable, if for all "efficient" adversaries the output of the verification is "yes" with negligible probability.

By using a secure MAC, an attacker will not be able to modify anything in the message without being detected by the verification algorithm, i.e. MACs provide message integrity. This also means that if a message is received, along with a MAC that verifies, the receiver can be certain of who sent the message, assuming that only one other person has access to the secret key, so we also achieve authentication. Between the two, we can also argue that non-repudiation is achieved, since no one else could have created the tag. However, since the key is not completely personal, a third party can never be convinced of who actually sent the message.

Note that any MAC is vulnerable to the birthday attack, since the attacker may try to find any two messages that map to the same tag. This means that the size of the tag always needs to be at least twice the number of intended bits of security.

We have already presented all the necessary tools to construct MACs that are believed to be secure. Two such constructions are CBC-MAC and HMAC.

3.6.1 CBC-MAC

In CBC-MAC, the idea is the same as when using the CBC mode of operation for block ciphers. We know that the last block of a message encrypted with CBC mode depends on the contents of all previous blocks. However, some changes need to be made in order for the MAC scheme to be secure. First of all, the IV should be fixed instead of random. Remember that the randomization was necessary for semantic security, which is not what we are trying to achieve here. Moreover, if messages can have different lengths, then actions need to be taken in order to defend against something called extension attacks. These are attacks where a valid message-tag pair, (m, t), is known, and the attacker tries to create a new, valid, pair (m′, t′), where m′ is simply m concatenated with some additional data. Exactly how such attacks can be carried out, and how to protect against them, can be found in Black and Rogaway [11].

3.6.2 HMAC

Remember that a hash function can be used to reduce the size of a large message down to something small, which we now know that a MAC normally does as well. However, a hash function does not depend on a secret key, which we know that a MAC must. HMAC (Hash-based MAC) utilizes a hash function on a combination of the message and the secret key in order to construct a MAC. This combination has to be done with care in order for the MAC to be secure, and the full definition of HMAC specifies exactly how it should be performed. By following this specification, the security of the MAC depends on the strength of the hash function used; if, for example, SHA-512 is used, HMAC is considered to be secure. HMAC may actually be secure even if the used hash function is not fully cryptographic. In particular, the collision resistance property may not be required. A detailed description of HMAC and the necessary security demands on the used hash function can be found in the original paper by Bellare, Canetti, and Krawczyk [12].
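Python's standard library implements HMAC directly; a minimal usage sketch follows (the key and message values are invented for the example):

```python
import hmac, hashlib

key = b"a shared secret key"
msg = b"transfer 100 SEK to Bob"

# Tag the message with HMAC-SHA512:
tag = hmac.new(key, msg, hashlib.sha512).digest()

# Verification: recompute and compare in constant time.
assert hmac.compare_digest(tag, hmac.new(key, msg, hashlib.sha512).digest())

# Any modification of the message makes verification fail:
forged = b"transfer 900 SEK to Eve"
assert not hmac.compare_digest(tag, hmac.new(key, forged, hashlib.sha512).digest())
```

The constant-time comparison in `hmac.compare_digest` matters in practice: a naive `==` on tags can leak timing information to an attacker probing one byte at a time.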

3.6.3 Authenticated Encryption

We now know how to achieve confidentiality and integrity individually, but we have yet to discuss how to combine these constructions in order to achieve both simultaneously. A few approaches may work with some combinations of encryption and MAC scheme, but one approach is recommended and guaranteed to be secure for all combinations of secure ciphers and MACs, and that is to first encrypt the message under one key, and then MAC the ciphertext under a different, independent, key. A more detailed analysis of authenticated encryption is given in Bellare and Namprempre [13].

It is quite easy to see why the recommended approach is secure. Since the cipher is secure, the ciphertext reveals nothing about the message, and so the tag cannot either, since the keys are independent. Also, since the MAC is verified first, messages with broken integrity are never decrypted, thus saving time. Whenever both confidentiality and integrity are intended, this is the solution that should be used, as illustrated in figure 3.6.
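The encrypt-then-MAC recipe can be sketched as follows. The encryption here is a toy SHA-256-based keystream (invented for illustration, not a vetted cipher), while the MAC is Python's standard hmac module; the function names seal/open_ are invented:

```python
import hmac, hashlib, os

def stream_crypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    # Toy CTR-style keystream from SHA-256 (illustration only).
    ks = b"".join(hashlib.sha256(key + iv + i.to_bytes(8, "big")).digest()
                  for i in range(-(-len(data) // 32)))
    return bytes(a ^ b for a, b in zip(data, ks))

def seal(k_enc: bytes, k_mac: bytes, msg: bytes):
    # Encrypt first, then MAC the ciphertext (and the IV).
    iv = os.urandom(16)
    ct = stream_crypt(k_enc, iv, msg)
    tag = hmac.new(k_mac, iv + ct, hashlib.sha256).digest()
    return iv, ct, tag

def open_(k_enc: bytes, k_mac: bytes, iv: bytes, ct: bytes, tag: bytes):
    # Verify first; never decrypt unauthenticated data.
    expect = hmac.new(k_mac, iv + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expect):
        raise ValueError("MAC verification failed")
    return stream_crypt(k_enc, iv, ct)

k_enc, k_mac = os.urandom(16), os.urandom(16)   # independent keys
iv, ct, tag = seal(k_enc, k_mac, b"Hi Bob!")
assert open_(k_enc, k_mac, iv, ct, tag) == b"Hi Bob!"
```

Note the two design points stressed in the text: the MAC covers the ciphertext, not the plaintext, and the encryption and MAC keys are independent.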

Figure 3.6: The correct order of operations when both confidentiality and integrity are intended.


4 Asymmetric Cryptography

In this chapter, asymmetric cryptography is defined and compared to symmetric cryptography. Since the purpose of this thesis was to implement and evaluate an asymmetric construction, this chapter will be more thorough, and include more detailed descriptions of the constructions, than previous chapters.

4.1 The Key Distribution Problem

We have up until now seen some very neat ways to achieve confidentiality and integrity by using a shared secret key. In fact, the ideas already presented are the most efficient and secure ones for providing data confidentiality and integrity, and are in wide use today. However, one big problem with these constructions is the assumption that there already exists a shared secret key. In almost all realistic settings, e.g. when communicating over the Internet, the two communicating parties do not have this shared secret to begin with, and they may even never have communicated before. Since the private keys need to be completely private, each communicating pair needs a different key, and so for n entities, we need Θ(n²) (see Appendix A.1 on page 81 for notation) keys distributed in advance over some secure channel. In the Internet case, there is no such secure channel, and we need something else in order to solve this problem.

4.2 Public and Private Keys

The main difference between symmetric and asymmetric cryptographic systems is that the latter uses different keys for encryption and decryption. The two keys must clearly have an intimate relation for this to work, but the schemes are constructed such that given the encryption key, it is "hard" (see Appendix A.1 on page 81 for definition) to figure out what the decryption key is. This makes it possible to let the encryption key be completely public and known to everyone, while the private key is kept secret by the owner. For this reason, asymmetric cryptography is also known as public key cryptography.

We can already see that this solves the key distribution problem mentioned above, since we no longer need a new key for each communicating pair, and since the public key can be distributed over an insecure channel (partly true, see the section about public key infrastructure). Anyone who wants to send messages to Bob uses the same key, namely Bob's public key. Before showing how we can build public key cryptosystems, we will first look at another solution to the key distribution problem, which was also the starting point for public key cryptography.

4.3 Key Exchange

A key exchange system is a way for two parties to generate a shared secret key over an insecure channel. The secret key can then be used with a symmetric cryptosystem in order to secure the communication, as described in the previous chapter. The fact that it is possible to utilize an insecure channel for this application is quite remarkable. An analogy could be that you enter a room full of people that you have never met before. You then start shouting to someone on the other side of the room, such that everyone can hear what you are saying. After shouting to each other for a while, you can keep shouting and understanding each other completely, while no one else in the room can understand a thing of what you are saying. Intuition tells us this is impossible, and yet we can construct such schemes. See Appendix A on page 81 for the required mathematical background before continuing.

4.3.1 Diffie-Hellman-Merkle¹ Key Exchange

In 1976, Whitfield Diffie and Martin Hellman published a paper, New Directions in Cryptography [15], that changed cryptography forever. Among many other things, they improved an idea proposed by Ralph Merkle a few years earlier, and constructed a scheme for key negotiation², as follows.

4.1 Definition (Diffie-Hellman-Merkle Key Exchange). Assume that Alice and Bob want to generate a shared secret key. First, they need to agree upon a cyclic group (see Appendix A.4 on page 86) (G, · ) of order q and a generator element, g. This data is not secret and can be sent completely in the clear. Then, the following steps are performed.

¹The scheme is usually referred to as just "Diffie-Hellman Key Exchange (DH)" since those are the authors of the paper. However, in 2002, Hellman proposed that also Merkle should be included if the scheme is given a name [14].

²It is usually called "exchange", even though that name implies that the secret is known to one of the parties before the protocol is run, which is not the case. Negotiation is a better name since the secret key is generated and negotiated while running the protocol.


1. Both sides choose a random number each, x_A, x_B ∈ Z_q.

2. The two sides calculate g^(x_A) ∈ G and g^(x_B) ∈ G respectively, and transmit the result to the other side.

3. Now, Alice can calculate (g^(x_B))^(x_A) ∈ G and Bob can calculate (g^(x_A))^(x_B) ∈ G.

4. By the rules of exponentiation, they both end up with the same secret value: (g^(x_B))^(x_A) = g^(x_B x_A) = g^(x_A x_B) = (g^(x_A))^(x_B), all ∈ G.
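The four steps above can be sketched in a few lines of Python, using the built-in pow for modular exponentiation. The group parameters below are toy values chosen for illustration (a Mersenne prime and g = 3); real deployments use standardized groups of 2048+ bits or elliptic curves:

```python
import secrets

# Toy parameters: p = 2^127 - 1 (a Mersenne prime), generator g = 3.
# Far too small for real use; illustration only.
p = 2**127 - 1
g = 3

x_a = secrets.randbelow(p - 2) + 1   # Alice's secret exponent
x_b = secrets.randbelow(p - 2) + 1   # Bob's secret exponent

A = pow(g, x_a, p)   # g^xA, sent in the clear
B = pow(g, x_b, p)   # g^xB, sent in the clear

shared_alice = pow(B, x_a, p)   # (g^xB)^xA
shared_bob   = pow(A, x_b, p)   # (g^xA)^xB
assert shared_alice == shared_bob
```

An eavesdropper sees p, g, A, and B, but recovering the shared value from these is exactly the Diffie-Hellman problem discussed next.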

See figure 4.1 for a visualization of the scheme. An adversary listening to this communication will learn which cyclic group and generator are being used, and will also see the values g^(x_A) and g^(x_B). The problem of finding g^(x_A x_B) from these values is called the Diffie-Hellman problem, and the assumption that it is hard is called the Computational Diffie-Hellman (CDH) assumption. It is easy to realize that solving the Diffie-Hellman problem is at least as easy as solving the discrete logarithm problem (DLP) in the group (see Appendix A.4 on page 86), because one solution is to find x_A from g^(x_A) and then calculate (g^(x_B))^(x_A) from this, i.e. just like Alice does. However, the DLP is considered "hard", meaning that no "efficient" adversary can solve it. It is, however, not known if the Diffie-Hellman problem is equivalent to that of finding discrete logarithms, or if there might be another way to find g^(x_A x_B) from g, g^(x_A), and g^(x_B). Nevertheless, no attack better than solving the DLP exists today.

Figure 4.1: Overview of the Diffie-Hellman-Merkle key exchange.

It is important to realize that this scheme, as described here, can only be secure against an eavesdropping adversary. This is because Alice has no way of knowing that she is actually talking to Bob, and the other way around. This enables a man-in-the-middle attack, where an adversary, Eve, can intercept and replace all communication. When talking to Bob, Eve will pretend to be Alice, and when talking to Alice, she will pretend to be Bob. Alice and Bob have no way of detecting this and will continue with their communication as normal, believing it is secure. In order to protect against this, additional measures have to be taken. In particular, Alice and Bob need to know something about each other in advance in order to verify the authenticity of the messages and prevent this attack. The problem of verifying the identity of the other side will be discussed in the section about public key infrastructure.

4.4 Trapdoor Permutations

We have already stated that asymmetric cryptography uses systems where the keys for encryption and decryption are different, but we have yet to describe how this will work. Just like the pseudorandom permutation is the ideal block cipher, we will now look at something called a trapdoor permutation, which will be the ideal basis for an asymmetric cryptosystem.

4.2 Definition (Trapdoor Permutation). A trapdoor permutation is a set of three "efficient" algorithms, (G, F, F^(-1)), referred to as key-pair generation, encryption, and decryption.

G - Outputs a public-private key pair, called (k_pub, k_priv).

F - F(k_pub, x) evaluates the trapdoor permutation at point x.

F^(-1) - F^(-1)(k_priv, y) inverts F, such that F^(-1)(k_priv, F(k_pub, x)) = x.

We say that the trapdoor permutation is secure if it is "hard" to invert F without knowledge of k_priv.

In words, a trapdoor permutation is a function that has an "efficient" algorithm for calculating it in the forward direction, but is "hard" to invert unless you have access to some extra information, the trapdoor. A common metaphor is that of a padlock, which anyone can lock. Opening it again is, however, hard unless you have access to the key or combination. Just like the case with pseudorandom permutations, no one knows if it is possible to construct trapdoor permutations. We do, however, have a few promising suggestions that will soon be discussed.
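As a concrete illustration of the (G, F, F^(-1)) interface, the sketch below instantiates it with "textbook" RSA, the classic candidate trapdoor permutation. The tiny fixed primes are only there to make the algebra visible; real key generation uses large random primes.

```python
# A toy instance of the trapdoor permutation interface (G, F, F^-1),
# using unpadded "textbook" RSA. Completely insecure parameters.

def G():
    """Key-pair generation: returns (k_pub, k_priv)."""
    p, q = 61, 53                       # toy primes (insecure)
    n = p * q
    e = 17                              # public exponent, coprime to (p-1)(q-1)
    d = pow(e, -1, (p - 1) * (q - 1))   # the trapdoor: inverse of e mod phi(n)
    return (n, e), (n, d)

def F(k_pub, x):
    """Forward direction: easy for everyone who knows the public key."""
    n, e = k_pub
    return pow(x, e, n)

def F_inv(k_priv, y):
    """Inverse direction: easy only with the trapdoor d."""
    n, d = k_priv
    return pow(y, d, n)

k_pub, k_priv = G()
x = 1234
assert F_inv(k_priv, F(k_pub, x)) == x   # the permutation round-trips
```

Anyone can evaluate F, just as anyone can snap the padlock shut; inverting it without d is believed to be as hard as the underlying RSA problem.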

4.5 Semantic Security

In order for our definition of semantic security, defined in the previous chapter, to be applicable to the asymmetric world, we need to make some slight adjustments to the first two steps. Instead of having the challenger choose one random key, it will instead run the key-pair generation algorithm, G, to acquire a public-private key pair, and then send the public key to the adversary. After this, the adversary can generate any number of ciphertexts himself instead of sending messages to the challenger for encryption. With these adjustments, the remaining steps are the same, i.e. the adversary submits two messages and gets back the encryption of one of them, and the goal is to determine which one was sent back. We say that a scheme is semantically secure if no "efficient" adversary can be significantly better at this game than the guessing adversary would be.
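A minimal sketch of this game makes clear why the adversary's ability to encrypt on his own matters. Run against a deterministic toy scheme (unpadded textbook RSA, an assumption for illustration), the adversary simply encrypts both candidate messages himself and compares:

```python
# Sketch of the public-key semantic security game, played against a
# deliberately deterministic toy scheme to show how such schemes fail.
import secrets

def keygen():
    n, e, d = 3233, 17, 2753          # toy RSA parameters (insecure)
    return (n, e), (n, d)

def encrypt(k_pub, m):
    n, e = k_pub
    return pow(m, e, n)               # deterministic: same m gives same c

def game():
    """One run of the game; returns True if the adversary guesses b."""
    k_pub, _ = keygen()
    m0, m1 = 42, 99                   # the adversary's two chosen messages
    b = secrets.randbits(1)           # the challenger's secret bit
    c = encrypt(k_pub, m1 if b else m0)
    # The adversary re-encrypts both messages with the public key
    # and checks which ciphertext matches the challenge.
    guess = 1 if c == encrypt(k_pub, m1) else 0
    return guess == b

assert all(game() for _ in range(100))   # the adversary wins every time
```

Since this adversary wins with probability 1 rather than 1/2, any deterministic public-key scheme fails semantic security; secure schemes must therefore be randomized.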

4.6 ElGamal

In 1984, the Egyptian cryptographer Taher Elgamal described a way to leverage the discrete logarithm problem in cyclic groups (see Appendix A.4 on page 86) to construct a public-key encryption scheme and thus provide confidentiality [16]. Before Alice can send any messages to Bob, he needs to generate a public and a private key. This is done by choosing a cyclic group G of order q, a generator element g, and a random integer x ∈ Z_q. The public key is then (G, q, g, h = g^x) and the private key is x. The hardness of the discrete logarithm problem makes it infeasible for Eve to figure out x, assuming the group is large enough. ElGamal is now defined, using multiplicative group notation.

4.3 Definition (ElGamal Encryption). In order for Alice to send an encrypted message m to Bob, with public key (G, q, g, h = g^x) and private key x, the following steps are taken.

ElGamal Encryption

• Alice chooses a random y in [1, q − 1] and calculates c_1 = g^y.
• Alice calculates s = h^y = (g^x)^y.
• Alice calculates c_2 = m′ · s, where m′ is the message to send, represented as an element in the chosen group.
• Alice sends (c_1, c_2) = (g^y, m′ · h^y) = (g^y, m′ · (g^x)^y) to Bob.

ElGamal Decryption

• Bob calculates (c_1)^x = (g^y)^x = s.
• Bob calculates c_2 · s^(−1) = m′ · (g^x)^y · ((g^x)^y)^(−1) = m′ ⇒ m.

We can see that the above system is in fact a cipher, since the original message is retrieved upon decryption.

The described version of ElGamal has some problems that must be fixed by applying a padding scheme to the message prior to encryption. If done properly, e.g. as described by Cramer and Shoup in [17], the security of the scheme depends on the hardness of the discrete logarithm problem in the chosen group.
