A Performance Evaluation of Post-Quantum Cryptography in the Signal Protocol


Linköpings universitet SE–581 83 Linköping

Linköping University | Department of Electrical Engineering

Master’s thesis, 30 ECTS | Computer Science

2019 | LITH-ISY-EX--19/5211--SE


En prestandautvärdering av kvantsäkert krypto i

Signal-protokollet.

Markus Alvila

Supervisor: Guilherme B. Xavier

Examiner: Jan-Åke Larsson


Copyright (Swedish notice)

This document is kept available on the Internet, or its future replacement, for 25 years from the date of publication, provided that no extraordinary circumstances arise.

Access to the document implies permission for anyone to read, download and print single copies for personal use, and to use it unchanged for non-commercial research and for teaching. A later transfer of the copyright cannot revoke this permission. All other use of the document requires the consent of the author. Technical and administrative solutions exist to guarantee authenticity, security and accessibility.

The moral rights of the author include the right to be named as the author to the extent required by good practice when the document is used as described above, as well as protection against the document being altered or presented in a form or context that is offensive to the author’s literary or artistic reputation or distinctive character.

For further information about Linköping University Electronic Press, see the publisher’s website http://www.ep.liu.se/.

Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a period of 25 years starting from the date of publication barring exceptional circumstances.

The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purposes. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/.


Abstract

The Signal protocol can be considered state-of-the-art when it comes to secure messaging, but advances in quantum computing stress the importance of finding post-quantum resistant alternatives to its asymmetric cryptographic primitives.

The aim is to determine whether existing post-quantum cryptography can be used as a drop-in replacement for the public-key cryptography currently used in the Signal protocol and what the performance trade-offs may be.

An implementation of the Signal protocol using commutative supersingular isogeny Diffie-Hellman (CSIDH) key exchange operations in place of elliptic-curve Diffie-Hellman (ECDH) is proposed. The benchmark results on a Samsung Galaxy Note 8 mobile device equipped with a 64-bit Samsung Exynos 9 (8895) octa-core CPU show that it takes roughly 8 seconds to initialize a session using CSIDH-512 and over 40 seconds using CSIDH-1024, without platform-specific optimization.

To the best of our knowledge, the proposed implementation is the first post-quantum resistant Signal protocol implementation and the first evaluation of using CSIDH as a drop-in replacement for ECDH in a communication protocol.


Acknowledgments

Foremost, I would like to thank my mother and father, for their unfaltering support and teaching me the true meaning of Finnish sisu. I would also like to thank my brothers, for standing by my side through thick and thin.

My thanks and appreciation also go to my supervisors Marcus Kardell and Guilherme B. Xavier for their mentorship and encouragement, as well as to Peter Schwabe, Jonathan Jogenfors and Christan Vestlund for their invaluable advice.


List of Figures

2.1 A KDF chain

2.2 A complete Diffie-Hellman ratchet

2.3 Deriving message keys via the symmetric key ratchet

2.4 Illustration of how the different chains advance

3.1 Original ECC implementation layout

3.2 CSIDH implementation layout

3.3 Benchmarking app user interface

3.4 Android Studio CPU profiler overview

3.5 An example call chart illustrating the involved time concepts

3.6 Android Studio CPU profiler hover panel

3.7 Call chart for a message initializing a session

3.8 Call chart for a consecutive message


List of Tables

2.1 CSIDH attack cost in number of operations for different NIST levels

3.1 Key sizes in bytes

4.1 Mean thread time [ms] for sending/receiving the first message in a session

4.2 Mean thread time [ms] for sending/receiving consecutive messages

4.3 Mean thread time [ms] for sending/receiving replies

A.1 Thread time [ms] for sending the first message in a session

A.2 Thread time [ms] for receiving the first message in a session

A.3 Thread time [ms] for sending consecutive messages

A.4 Thread time [ms] for receiving consecutive messages

A.5 Thread time [ms] for sending a reply


Abbreviations

AEAD Authenticated Encryption with Associated Data
AES Advanced Encryption Standard
CBC Cipher Block Chaining
CPU Central Processing Unit
CSIDH Commutative Supersingular Isogeny Diffie-Hellman
ECC Elliptic-curve Cryptography
ECDH Elliptic-curve Diffie-Hellman
EdDSA Edwards-curve Digital Signature Algorithm
EU European Union
HKDF HMAC-based Extract-and-Expand Key Derivation Function
HMAC Hash-based Message Authentication Code
IPsec Internet Protocol Security
KDF Key Derivation Function
MAC Message Authentication Code
NATO North Atlantic Treaty Organization
NIST National Institute of Standards and Technology
PQC Post-Quantum Cryptography
RSA Rivest-Shamir-Adleman
SD Standard Deviation
SHA Secure Hash Algorithm
SIDH Supersingular Isogeny Diffie-Hellman
SIKE Supersingular Isogeny Key Encapsulation
SSH Secure Shell
TLS Transport Layer Security
X3DH Extended Triple Diffie-Hellman


Chapter 1 Introduction

Over the past decades there has been a stable set of secure cryptographic primitives upon which to rely, but as quantum computing continues to be developed, these primitives come under new threats and the work to find potential replacements is ongoing [3, 11]. These new primitives are however still in their infancy and need further public scrutiny and evaluation in real-world contexts.

1.1 Motivation

It has been shown that current public key cryptography based on the difficulty of integer factorization or the hardness of the discrete logarithm problem is breakable in polynomial time on a theoretical quantum computer using Shor’s algorithm [29]. It has also been shown that Grover’s algorithm provides a quadratic speed-up for unstructured search problems, such as brute-force key search [18]. Therefore, parties with confidential information and a legal obligation to keep it secure for many decades to come have already started to look into post-quantum replacements [3, 11]. The currently available post-quantum cryptographic primitives come with performance trade-offs, and there is limited research on whether these can be used in practice in an end-to-end encrypted mobile messaging context.

This thesis project was conducted at Sectra Communications, an international cybersecurity company providing secure communication solutions to government authorities and defense organizations in the EU and NATO, where the ability to communicate quickly and securely is of critical importance.

1.2 Aim

The aim is to determine whether existing post-quantum cryptography can be used as a drop-in replacement for the public-key cryptography currently used in the Signal protocol and what the performance trade-offs may be.

Considering the quantum attacks made possible by the algorithms developed by Shor and Grover, the priority is to replace public-key cryptography vulnerable in the post-quantum era with quantum-resistant alternatives. Present symmetric cryptography and cryptographic hash functions are not as critical to replace, due to their higher resilience to known quantum attacks [4].

The Signal protocol was of particular interest as it can be considered state-of-the-art in secure messaging. Other messaging apps such as Facebook Messenger, WhatsApp and Skype, among others, have adopted the Signal protocol for their secure messaging functionality [13]. After using existing post-quantum cryptography to replace the currently used public-key cryptography in the Signal protocol and implementing the protocol as part of a simple proof of concept app, CPU performance metrics of the modified protocol are collected and compared to the original protocol.


1.3 Research Questions

1. How can a post-quantum resistant version of the Signal protocol be implemented without losing its secure messaging properties?

2. How does the post-quantum resistant version compare to the original Signal protocol regarding CPU utilization?

1.4 Scope

Only parts of the Signal protocol relevant for single peer communication were considered, meaning that Signal’s multi-device and group chat features were left for future work. Additionally, properties only partially provided by the Signal protocol were not considered.

To maintain a feasible set of post-quantum cryptographic algorithms, only PQCRYPTO recommendations and NIST submissions were considered, with the exception of CSIDH due to its unique properties described further in section 2.5.3.

The EdDSA signature present in the original Signal protocol was excluded from the performance evaluations, as a practical CSIDH-based signature scheme was yet to be published at the time of writing.

All performance tests were carried out on a Samsung Galaxy Note 8 Android device. Other devices and mobile operating systems were not considered.

How current hardware acceleration features affect the performance comparisons between common cryptographic primitives and newer post-quantum cryptographic primitives was left for future work.


Chapter 2 Theory

2.1 Common Cryptographic Primitives

The cryptographic primitives described below form a non-exhaustive list of those used in the Signal protocol to achieve the desired secure messaging properties.

2.1.1 Public Key Cryptography

In contrast to symmetric cryptography such as the Advanced Encryption Standard (AES) [30], which uses the same key for both encryption and decryption, public key cryptography uses a key pair. A key pair consists of a public key that can be distributed to other parties and a private key that must be kept secret. This makes key distribution much easier, as the public key can be published without compromising security. When public key cryptography is used for sending an encrypted message, the sender encrypts the message using the recipient’s public key and the recipient uses the corresponding private key to decrypt it. Unfortunately, the encryption and decryption processes are generally slower than for symmetric cryptography. Public key cryptography is therefore typically used to establish a shared secret key that can then be used with symmetric cryptography, and the Signal protocol is no exception.

2.1.2 Digital Signature

Digital signatures can be used to detect forged or tampered messages.

A complete signature scheme consists of three parts: key generation, signing and verification. Key generation produces a key pair, consisting of a public key and a private key. The private key is used for signing and the public key is used to verify the signatures.

2.1.3 Cryptographic Hash Function

Given an input message of any length, a cryptographic hash function outputs a string of a fixed length. Hashing the same message must always give the same output and it should be very hard to find the input message given only the output hash.

The avalanche effect is another important property of hash functions: a minimal change to the input message should drastically change the output hash.
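The fixed output length and the avalanche effect can be observed with a short experiment. The following is a minimal sketch using SHA-256 from the Python standard library; the two input messages and the helper `hamming_distance` are illustrative assumptions, not part of the thesis implementation.

```python
import hashlib


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Two hypothetical messages that differ in a single bit ('l' vs 'm').
msg1 = b"post-quantum signal"
msg2 = b"post-quantum signam"

digest1 = hashlib.sha256(msg1).digest()
digest2 = hashlib.sha256(msg2).digest()

# The output length is fixed regardless of the input length.
print(len(digest1) * 8)  # 256

# For a well-behaved hash, roughly half of the 256 output bits are expected to flip.
print(hamming_distance(digest1, digest2))
```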

2.1.4 Message Authentication Code

To confirm that a message has not been tampered with and was sent by the expected party, a message authentication code (MAC) can be used. Similar to a digital signature scheme, a MAC scheme consists of key generation, signing and verifying algorithms. A major difference is that a MAC uses a single symmetric key shared by the sender and the recipient. Only the parties in possession of the symmetric key can generate valid MAC values and verify them.
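The three MAC algorithms can be sketched with HMAC-SHA256, the MAC construction the Signal protocol relies on. This is a minimal illustration using the Python standard library; the message contents and the helper name `verify` are assumptions made for the example.

```python
import hashlib
import hmac
import secrets

# Key generation: a single symmetric key shared by sender and recipient.
key = secrets.token_bytes(32)
message = b"meet at noon"

# "Signing": compute the MAC tag over the message.
tag = hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


print(verify(key, message, tag))         # True
print(verify(key, b"meet at one", tag))  # False
```

Because the same key both creates and checks the tag, a valid tag does not tell a third party which of the key holders produced it, unlike a digital signature.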

2.1.5 Diffie-Hellman Key Exchange

By performing a Diffie-Hellman key exchange, Alice can establish a shared secret key with Bob over an insecure channel. First, a large prime $p$ and a public parameter $g \in \{2, 3, \ldots, p-2\}$ are generated and published. Alice then chooses a secret key $a \in \{2, 3, \ldots, p-2\}$ and calculates her public key $A = g^a \bmod p$. She then sends the public key $A$ to Bob. Bob performs the equivalent operations: he chooses a secret key $b \in \{2, 3, \ldots, p-2\}$, calculates his public key $B = g^b \bmod p$ and sends $B$ to Alice. Both Alice and Bob can then calculate the shared secret key:

$$K \equiv B^a \equiv g^{ba} \equiv A^b \equiv g^{ab} \pmod{p}$$
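The exchange can be traced with deliberately tiny numbers. The values $p = 23$, $g = 5$, $a = 6$ and $b = 15$ below are illustrative assumptions chosen so the arithmetic can be checked by hand; real deployments use far larger parameters.

```python
# Toy Diffie-Hellman exchange with hand-checkable numbers (not secure).
p, g = 23, 5       # public parameters

a = 6              # Alice's secret key
b = 15             # Bob's secret key

A = pow(g, a, p)   # Alice's public key: 5^6 mod 23 = 8
B = pow(g, b, p)   # Bob's public key: 5^15 mod 23 = 19

k_alice = pow(B, a, p)   # Alice computes B^a mod p
k_bob = pow(A, b, p)     # Bob computes A^b mod p

print(A, B, k_alice, k_bob)  # 8 19 2 2
```

An eavesdropper sees only `p`, `g`, `A` and `B`; recovering `a` or `b` from them is the discrete logarithm problem.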

Protocols like TLS, IPsec and SSH commonly use an elliptic curve variant of the standard Diffie-Hellman key exchange. The Signal protocol only uses the elliptic curve variant.

When Alice and Bob want to exchange a secret key with each other using the elliptic curve Diffie-Hellman key exchange, they first have to agree on all the elements defining the elliptic curve, the domain parameters. Alice then generates a private key $d_A$ and calculates a public key $Q_A = d_A G$, where $G$ is the generator for the curve. Similarly, Bob generates a private key $d_B$ and calculates a public key $Q_B = d_B G$. After exchanging public keys, an eavesdropper would only know $Q_A$ and $Q_B$, while Alice and Bob can calculate the shared secret key $x_k$ as the x-coordinate of the calculated point $(x_k, y_k)$:

$$d_A Q_B = d_A d_B G = d_B d_A G = d_B Q_A$$

2.1.6 Key Derivation Function

A key derivation function (KDF) can be used to turn weak key material, together with a pseudorandom KDF key, into cryptographically stronger key material. A KDF can derive one or more secret keys of a required format from a master key or a password. An example use case for a KDF is to convert the result of a Diffie–Hellman key exchange into a symmetric key of the required length for use with AES-256.
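As an illustration of the AES-256 use case, the sketch below implements the extract-and-expand construction of HKDF (RFC 5869) with SHA-256 using only the Python standard library. The function name `hkdf_sha256`, the placeholder Diffie-Hellman output and the `info` label are assumptions made for the example; production code should rely on a vetted library.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a pseudorandom key, then expand it."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long")
    if not salt:
        salt = bytes(hash_len)  # default salt: HashLen zero bytes
    # Extract: concentrate the entropy of the input key material.
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand: generate as many output blocks as needed.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Hypothetical Diffie-Hellman output turned into a 32-byte key for AES-256.
dh_output = bytes.fromhex("02") * 16
aes_key = hkdf_sha256(dh_output, salt=b"", info=b"example session", length=32)
print(len(aes_key))  # 32
```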

2.1.7 Authenticated Encryption with Associated Data

Assuming an attacker uses active techniques, it might not be enough to ensure just confidentiality. By using authenticated encryption with associated data (AEAD), the receiver can ensure that the data has been constructed by the sender possessing the right key. AEAD provides confidentiality, integrity and data authenticity, at the same time as it can authenticate associated data like header information.

2.2 Secure Messaging Properties

A wide range of properties is relevant for secure messaging. The property definitions below are based on SoK: Secure Messaging by Unger et al. [31], a comprehensive study of three problem areas in secure messaging: trust establishment, conversation security and transport privacy.

The properties below are a non-exhaustive list of properties provided by the Signal protocol, limited to conversation security. Partially provided properties, multi-device and group chat specific properties have been excluded as specified in section 1.4.


2.2.1 Confidentiality

Confidentiality assures that only the intended recipients are able to read a message. Specifically, the message is not readable by any server operator that is not a conversation participant.

2.2.2 Integrity

Integrity assures that no honest party will accept a message that has been modified in transit.

2.2.3 Authentication

Each participant in the conversation receives proof of possession of a known long-term secret from all other participants that they believe to be participating in the conversation. In addition, each participant is able to verify that a message was sent from the claimed source.

2.2.4 Forward Secrecy

The property of perfect forward secrecy is fulfilled as the compromise of long-term keys does not affect the confidentiality of past conversations. Long-term key material is combined with ephemeral key material generated for every session. The ephemeral key material is never reused and never stored between sessions. If an attacker also manages to compromise a session’s ephemeral key material, the impact is limited to that particular session.

2.2.5 Post-Compromise Security

The property of post-compromise security, also referred to as future secrecy, is fulfilled as the compromise of both long-term and session keys does not affect the confidentiality of succeeding conversations. This is achieved by regularly mixing and replacing old key material with new, for every session and message.

2.2.6 Asynchronicity

In the Signal Protocol, messages can be sent securely to disconnected recipients and received upon their next connection. This is referred to as asynchronous communication.

2.2.7 Other Properties

• Message Unlinkability: If a judge is convinced that a participant authored one message in the conversation, this does not provide evidence that they authored other messages.

• Message Repudiation: Given a conversation transcript and all cryptographic keys, there is no evidence that a given message was authored by any particular user. We assume that the accuser has access to the session keys, but not to the accused participant’s long-term secret keys.

• Participation Repudiation: Given a conversation transcript and all cryptographic key material for all but one accused participant, there is no evidence that the honest participant was in a conversation with any of the other participants.

• Participant Consistency: At any point when a message is accepted by an honest party, all honest parties are guaranteed to have the same view of the participant list.

• Destination Validation: When a message is accepted by an honest party, they can verify that they were included in the set of intended recipients for the message.

• Causality Preserving: Implementations can avoid displaying a message before messages that causally precede it.


2.3 The Signal Protocol

In 2013 the open-source project group Open Whisper Systems released the first version of the TextSecure Protocol, later renamed to the Signal Protocol, enabling end-to-end encrypted messaging between multiple parties. The Signal protocol combines prekeys and an extended triple Diffie-Hellman key agreement protocol (X3DH) [23] with a ratcheting key management algorithm (Double Ratchet) [22], using standard cryptographic primitives such as elliptic curve Diffie-Hellman calculations, AES-256 and HMAC-SHA256. The Signal Protocol has over time been adopted by other messaging apps such as Facebook Messenger, WhatsApp and Skype, establishing itself as the state-of-the-art secure messaging protocol.

2.3.1 The X3DH Key Agreement Protocol

The X3DH key agreement protocol is used to establish an initial shared secret between the communicating parties, while providing mutual authentication, cryptographic deniability and forward secrecy. The communication is asynchronous, meaning that one party can be offline and later establish the shared secret with the help of a server storing the necessary information.

The following elliptic curve keys are used in the X3DH key agreement protocol when Alice wants to initiate a new one-to-one session with Bob:

• $IK_A$: Alice’s long-term identity key

• $EK_A$: Alice’s ephemeral key, used for a single protocol run

• $IK_B$: Bob’s long-term identity key

• $SPK_B$: Bob’s signed prekey, changed periodically

• $OPK_B$: Bob’s one-time prekey

All elliptic curve keys use either Curve25519 or Curve448. The signed prekey is signed by the corresponding identity key using EdDSA. Note that identity keys have long lifetimes, signed prekeys have medium lifetimes, while ephemeral and one-time prekeys are used only once. We denote the private keys with prefix priv and || denotes a concatenation.

Before Alice can begin a conversation with Bob, Bob has to upload his key bundle to the server. The key bundle contains a set of one-time prekeys, $IK_B$, $SPK_B$ and the corresponding signature. The first thing Alice does when she wants to start a new session with Bob is to download his $IK_B$, $SPK_B$ and an $OPK_B$ from the server. As soon as Alice has downloaded an $OPK_B$, it is removed by the server.

One of the first operations Alice performs is to verify the $SPK_B$ signature; if it is valid, she proceeds by generating $EK_A$. The secret session key $SK_{AB}$ is then calculated from three to four elliptic curve Diffie-Hellman operations:

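1. $DH_1 = \mathrm{ECDH}(\mathrm{priv}IK_A, SPK_B)$

2. $DH_2 = \mathrm{ECDH}(\mathrm{priv}EK_A, IK_B)$

3. $DH_3 = \mathrm{ECDH}(\mathrm{priv}EK_A, SPK_B)$

4. $DH_4 = \mathrm{ECDH}(\mathrm{priv}EK_A, OPK_B)$

5. $SK_{AB} = \mathrm{KDF}(DH_1 || DH_2 || DH_3 || DH_4)$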

$DH_1$ and $DH_2$ provide mutual authentication, while $DH_3$ and $DH_4$ provide forward secrecy. $DH_4$ is only performed if there is a one-time prekey available on the server. The outputs $DH_1$ to $DH_4$ and Alice’s ephemeral secret key are deleted as soon as the session key $SK_{AB}$ has been calculated.

An associated data AD byte sequence is then calculated, containing identity information for both parties. The Signal protocol uses the identity keys from Alice and Bob, but other information like usernames, phone numbers and certificates could also be included.

$$AD = \mathrm{Encode}(IK_A) || \mathrm{Encode}(IK_B)$$

The first message sent from Alice to Bob contains Alice’s identity key $IK_A$, ephemeral key $EK_A$, an identifier for the used $OPK_B$ and the first message ciphertext. Alice includes this information in the header until Bob has decrypted and responded to the first message. The ciphertext is encrypted using an AEAD encryption scheme with $AD$ as associated data, and the encryption key is either $SK_{AB}$ or a key derived from a KDF with $SK_{AB}$ as input.

When Bob receives Alice’s first message, he calculates the session key $SK_{AB}$ by repeating the elliptic curve Diffie-Hellman operations, using the private keys corresponding to the public keys Alice used:

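1. $DH_1 = \mathrm{ECDH}(\mathrm{priv}SPK_B, IK_A)$

2. $DH_2 = \mathrm{ECDH}(\mathrm{priv}IK_B, EK_A)$

3. $DH_3 = \mathrm{ECDH}(\mathrm{priv}SPK_B, EK_A)$

4. $DH_4 = \mathrm{ECDH}(\mathrm{priv}OPK_B, EK_A)$

5. $SK_{AB} = \mathrm{KDF}(DH_1 || DH_2 || DH_3 || DH_4)$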

Bob then constructs the same associated data byte sequence $AD$ as Alice and tries to decrypt the received ciphertext using $SK_{AB}$ and $AD$. If the decryption fails, Bob aborts the communication and deletes $SK_{AB}$. If the decryption succeeds, Bob deletes $\mathrm{priv}OPK_B$ to ensure forward secrecy.
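That both parties arrive at the same session key can be checked with a small simulation. The sketch below is not the Signal implementation: it assumes a toy finite-field Diffie-Hellman group (modulus $2^{127}-1$, base 3) in place of Curve25519, uses SHA-256 as a stand-in for the KDF, and omits the signature check on the signed prekey. All function and variable names are hypothetical.

```python
import hashlib
import secrets

# Toy Diffie-Hellman group standing in for Curve25519 (NOT secure).
P = 2**127 - 1
G = 3


def keygen():
    """Return a (private, public) key pair in the toy group."""
    priv = secrets.randbelow(P - 3) + 2
    return priv, pow(G, priv, P)


def dh(priv, pub):
    """Diffie-Hellman output encoded as 16 bytes."""
    return pow(pub, priv, P).to_bytes(16, "big")


def kdf(*dh_outputs):
    """Stand-in KDF: hash the concatenated DH outputs."""
    return hashlib.sha256(b"".join(dh_outputs)).digest()


# Bob's key bundle (the public halves are what the server would store).
ik_b_priv, ik_b = keygen()
spk_b_priv, spk_b = keygen()
opk_b_priv, opk_b = keygen()

# Alice's identity key and fresh ephemeral key.
ik_a_priv, ik_a = keygen()
ek_a_priv, ek_a = keygen()

# Alice's side: DH1..DH4 with her private keys and Bob's public keys.
sk_alice = kdf(dh(ik_a_priv, spk_b), dh(ek_a_priv, ik_b),
               dh(ek_a_priv, spk_b), dh(ek_a_priv, opk_b))

# Bob's side: the mirrored operations with his private keys.
sk_bob = kdf(dh(spk_b_priv, ik_a), dh(ik_b_priv, ek_a),
             dh(spk_b_priv, ek_a), dh(opk_b_priv, ek_a))

print(sk_alice == sk_bob)  # True
```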

Note that Alice and Bob are advised to use an out-of-band channel to verify the long-term identity key of the other party, to know for sure whom they are talking to. With the Signal app, the public key fingerprint can be shown and scanned as a QR code by the other party.

2.3.2 The Double Ratchet Key Management Algorithm

After the communicating parties have authenticated each other and agreed on a shared secret using the X3DH protocol, the parties can exchange encrypted messages.

The Double Ratchet key management algorithm is used to introduce new key material regularly and to derive new keys, by combining an elliptic curve Diffie-Hellman ratchet with a symmetric key ratchet.

KDF chains are a core concept in the Double Ratchet key management algorithm. A KDF chain is simply a chain of key derivation functions, where the output from a previous KDF is used as the KDF key in the next KDF, see figure 2.1. The Diffie-Hellman ratchet and symmetric key ratchet both utilize KDF chains to derive new keys.
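A KDF chain can be sketched in a few lines. The example below assumes, purely for illustration, that each step is an HMAC-SHA512 call whose 64-byte output is split into the next KDF key and an output key; the names `kdf_step` and `run_chain` and the inputs are hypothetical, and the construction is not the one used by Signal.

```python
import hashlib
import hmac
from typing import List, Tuple


def kdf_step(kdf_key: bytes, kdf_input: bytes) -> Tuple[bytes, bytes]:
    """One link of the chain: returns (next KDF key, output key)."""
    out = hmac.new(kdf_key, kdf_input, hashlib.sha512).digest()
    return out[:32], out[32:]


def run_chain(initial_key: bytes, inputs: List[bytes]) -> Tuple[bytes, List[bytes]]:
    """Feed each input into the chain, carrying the KDF key forward."""
    key = initial_key
    outputs = []
    for item in inputs:
        key, out = kdf_step(key, item)
        outputs.append(out)
    return key, outputs


final_key, output_keys = run_chain(bytes(32), [b"input 1", b"input 2", b"input 3"])
print(len(output_keys))  # 3
```

Since every step is one-way, holding a later KDF key does not reveal earlier output keys, which is the property the ratchets build on.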

The parties’ signed prekeys are used to initialize the Diffie-Hellman ratchet. The shared session key $SK_{AB}$ is used as the Double Ratchet’s initial root key. From the root key, two chain keys are derived, one for sending and one for receiving messages. From each chain key a message key is derived. Alice’s sending chain corresponds to Bob’s receiving chain and vice versa. We denote the Double Ratchet’s different keys as:

• $DH_R$: Diffie-Hellman ratchet public key (Received)

• $DH_S$: Diffie-Hellman ratchet key pair (Self/Generated)

• $RK$: Root key

• $CK_R$: Chain key for receiving

• $CK_S$: Chain key for sending

• $MK$: Message key

Figure 2.1: A KDF chain

Every time Alice and Bob exchange a message, the Diffie-Hellman ratchet key pair and the received public key are updated. The Diffie-Hellman output secret is used as new input to the root chain, so that later keys cannot be calculated from earlier ones. The output keys from the root chain become the new KDF keys for the receiving and sending chains, see figure 2.2. The key derivation functions in figure 2.2 use HKDF with SHA-256.

Every message contains a header, including the public part of $DH_S$, which is $DH_R$ from the recipient’s perspective. When Alice sends a message to Bob, Alice’s $DH_S$ public key is $DH_R$ from Bob’s perspective, and vice versa. As soon as Alice or Bob receives a new $DH_R$, the party generates a new $DH_S$ and performs a Diffie-Hellman ratchet step. In other words, the parties take turns replacing their Diffie-Hellman ratchet key pairs.
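The turn-taking can be simulated to confirm that each party's new sending chain matches the other party's new receiving chain. The sketch below makes several assumptions for the sake of a self-contained example: a toy finite-field Diffie-Hellman group replaces Curve25519, HMAC-SHA512 split in half replaces HKDF with SHA-256, and the initial root key is a placeholder. All names are hypothetical.

```python
import hashlib
import hmac
import secrets

# Toy Diffie-Hellman group standing in for Curve25519 (NOT secure).
P = 2**127 - 1
G = 3


def generate_dh():
    """Return a (private, public) ratchet key pair."""
    priv = secrets.randbelow(P - 3) + 2
    return priv, pow(G, priv, P)


def dh(priv, pub):
    """Diffie-Hellman output encoded as 16 bytes."""
    return pow(pub, priv, P).to_bytes(16, "big")


def kdf_rk(root_key, dh_out):
    """Root chain step: returns (new root key, new chain key)."""
    out = hmac.new(root_key, dh_out, hashlib.sha512).digest()
    return out[:32], out[32:]


# State shared after the handshake (placeholder root key).
root_key = hashlib.sha256(b"session key from X3DH").digest()
bob_priv, bob_pub = generate_dh()

# Alice generates a new key pair and ratchets against Bob's public key.
alice_priv, alice_pub = generate_dh()
alice_rk, alice_cks = kdf_rk(root_key, dh(alice_priv, bob_pub))

# Bob sees alice_pub in a message header and derives his receiving chain.
bob_rk, bob_ckr = kdf_rk(root_key, dh(bob_priv, alice_pub))

# Bob replaces his key pair and derives his new sending chain.
bob_priv, bob_pub = generate_dh()
bob_rk, bob_cks = kdf_rk(bob_rk, dh(bob_priv, alice_pub))

# Alice sees the new bob_pub and derives her new receiving chain.
alice_rk, alice_ckr = kdf_rk(alice_rk, dh(alice_priv, bob_pub))

print(alice_cks == bob_ckr, bob_cks == alice_ckr)  # True True
```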

Every time a message is sent or received, the sending or receiving chain is advanced and the output is used as the new message key $MK$ for encryption or decryption, see figure 2.3. This ensures that earlier keys cannot be calculated from later ones. The key derivation functions in figure 2.3 use HMAC with SHA-256.

For the sending case, after encrypting a message and calculating the new sending chain key CKS, the used message key and old sending chain key can be deleted.

For the receiving case and as messages may arrive delayed, out-of-order or not at all, the receiver needs to read the message header and determine the message index, in order to advance the receiving chain key $CK_R$ the corresponding number of steps. This also means that the receiving party cannot delete receiving message keys, until a message has been decrypted under each one. In the current open-source Signal protocol implementation, there is a hard-coded limit of 2000 messages or 5 Diffie-Hellman ratchet updates before unused receiving message keys are deleted.
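Handling of out-of-order messages on the receiving chain can be sketched as follows. The HMAC inputs `0x01` and `0x02` and the limit of 2000 skipped messages follow the text; the class `ReceivingChain`, its bookkeeping and the way the limit is enforced are assumptions made for illustration and do not mirror the open-source implementation.

```python
import hashlib
import hmac


def chain_step(chain_key):
    """Symmetric ratchet step: returns (next chain key, message key)."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


class ReceivingChain:
    """Tracks a receiving chain and stores keys for skipped messages."""

    def __init__(self, chain_key, max_skip=2000):
        self.chain_key = chain_key
        self.next_index = 0
        self.skipped = {}
        self.max_skip = max_skip

    def message_key(self, index):
        if index in self.skipped:
            return self.skipped.pop(index)  # delayed message, key used once
        if index < self.next_index:
            raise KeyError("message key already used or deleted")
        if index - self.next_index > self.max_skip:
            raise ValueError("too many skipped messages")
        # Advance the chain, storing the keys of messages not yet received.
        while self.next_index < index:
            self.chain_key, mk = chain_step(self.chain_key)
            self.skipped[self.next_index] = mk
            self.next_index += 1
        self.chain_key, mk = chain_step(self.chain_key)
        self.next_index += 1
        return mk


# The sender derives three message keys in order.
ck0 = bytes(32)  # placeholder initial chain key
ck, sent_keys = ck0, []
for _ in range(3):
    ck, mk = chain_step(ck)
    sent_keys.append(mk)

# The receiver gets message 2 first, then 0, then 1.
receiver = ReceivingChain(ck0)
received = [receiver.message_key(i) for i in (2, 0, 1)]
print(received == [sent_keys[2], sent_keys[0], sent_keys[1]])  # True
```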

The message keys are used for protecting messages with AES-256 in cipher block chaining (CBC) mode for encryption and HMAC-SHA256 for authentication.


Figure 2.2: A complete Diffie-Hellman ratchet

An example session from Alice’s point of view is used to summarize the different ratchet steps and illustrate how the different chains advance together, see also figure 2.4:

1. X3DH establishes the shared session secret $SK$ and initializes $DH_S$ and $DH_R$ to the parties’ signed prekeys. $RK_{init}$ and $CK_{S0}$ are then derived using HKDF with SHA-256:

$$RK_{init}, CK_{S0} = \mathrm{HKDF}(SK, \mathrm{ECDH}(\mathrm{priv}DH_S, DH_R))$$

2. Alice generates her new $DH_S$, to be used in the Diffie-Hellman ratchet step for the receiving chain when Bob answers.

3. Alice performs a symmetric key ratchet step in the sending chain, creating $MK_{S0}$ and $CK_{S1}$ using HMAC with SHA-256:

$$MK_{S0} = \mathrm{HMAC}(CK_{S0}, \mathtt{0x01})$$

$$CK_{S1} = \mathrm{HMAC}(CK_{S0}, \mathtt{0x02})$$

4. Alice uses $MK_{S0}$ to encrypt her first message to Bob, using AES-256 in CBC mode.

5. To send a second message, Alice performs another symmetric key ratchet step in the sending chain, creating $MK_{S1}$ and $CK_{S2}$ using HMAC with SHA-256:

$$MK_{S1} = \mathrm{HMAC}(CK_{S1}, \mathtt{0x01})$$

$$CK_{S2} = \mathrm{HMAC}(CK_{S1}, \mathtt{0x02})$$

6. Bob sends a response to Alice, including a new $DH_R$ in the message header. Alice performs a Diffie-Hellman ratchet step for the receiving chain using the latest $DH_S$ and $DH_R$, creating $RK_{0,0}$ and $CK_{R0}$ using:

$$RK_{0,0}, CK_{R0} = \mathrm{HKDF}(RK_{init}, \mathrm{ECDH}(\mathrm{priv}DH_S, DH_R))$$

This is followed by one symmetric key ratchet step creating $CK_{R1}$ and $MK_{R0}$. Alice can then use $MK_{R0}$ to decrypt the received message.

7. Alice receives two more messages from Bob before sending any response, so she performs two more symmetric key ratchet steps for the receiving chain, creating $MK_{R1}$ and $CK_{R2}$, then $MK_{R2}$ and $CK_{R3}$. She uses $MK_{R1}$ and $MK_{R2}$ to decrypt the received messages.

8. Alice wants to send a response to Bob. She generates a new $DH_S$ and performs a Diffie-Hellman ratchet step, creating $RK_{1,0}$ and $CK_{S0}$ for a new sending chain, followed by a symmetric key ratchet step creating $CK_{S1}$ and $MK_{S0}$. She uses the new $MK_{S0}$ to encrypt her message to Bob.

9. Alice receives a response from Bob containing a new $DH_R$. She performs a Diffie-Hellman ratchet step creating $RK_{1,1}$ and $CK_{R0}$ for a new receiving chain, followed by a symmetric key ratchet step creating a new $CK_{R1}$ and $MK_{R0}$. She uses $MK_{R0}$ to decrypt the response.

Figure 2.3: Deriving message keys via the symmetric key ratchet


Figure 2.4: Illustration of how the different chains advance

2.4 Quantum Computing

The section below is a brief introduction to quantum computing. A more thorough introduction can be found in Quantum Computation and Quantum Information by M. A. Nielsen and I. L. Chuang [1].

2.4.1 Quantum Bits

While classical computers use bits as their basic unit of information to perform calculations, quantum computers use quantum bits, qubits for short, as their basic unit of quantum information. A qubit can be represented by any two-state quantum mechanical system. The two-state system can for example be the spin of an electron (spin-up or spin-down) or the polarization of a photon (vertical or horizontal).

A classical bit can be in two different base states, 0 or 1. Qubits can also be in two different base states, $|0\rangle$ or $|1\rangle$, but an important difference is that qubits can be in a coherent superposition of both states simultaneously:

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle \tag{2.1}$$

where $\alpha$ and $\beta$ are complex numbers. The superposition property is fundamental to quantum computing. A classical bit can easily be determined to be either 0 or 1, while the quantum state of a qubit is not as easily determined. When measuring a qubit we get either the result 0, with probability $|\alpha|^2$, or 1, with probability $|\beta|^2$. The probabilities always sum to one:

$$|\alpha|^2 + |\beta|^2 = 1 \tag{2.2}$$
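The measurement rule can be simulated classically for a single qubit. The sketch below samples outcomes according to equations 2.1 and 2.2; the amplitudes, the seed and the helper name `measure` are assumptions chosen for the example.

```python
import math
import random


def measure(alpha: complex, beta: complex, rng: random.Random) -> int:
    """Sample one measurement outcome of the state alpha|0> + beta|1>."""
    p0 = abs(alpha) ** 2
    p1 = abs(beta) ** 2
    if not math.isclose(p0 + p1, 1.0, abs_tol=1e-9):
        raise ValueError("state is not normalized")
    return 0 if rng.random() < p0 else 1


# Hypothetical state with |alpha|^2 = 0.2 and |beta|^2 = 0.8.
alpha = complex(math.sqrt(0.2), 0.0)
beta = complex(0.0, math.sqrt(0.8))

rng = random.Random(1)
counts = [0, 0]
for _ in range(10_000):
    counts[measure(alpha, beta, rng)] += 1

# About 20 % of the outcomes are expected to be 0 and 80 % to be 1.
print(counts)
```

Each simulated measurement yields a single classical bit, which reflects why the complex amplitudes themselves cannot be read out of a qubit.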

It might seem as if you could store an infinite amount of data in a single qubit, but this is not the case. When a qubit is measured it only ever gives 0 or 1, with a certain probability, collapsing it from its superposition. While this might suggest a qubit is similar to a classical bit, the superposition makes them very different, as physical phenomena like quantum entanglement can be utilized for parallel calculations not possible on classical computers.

Simplified, quantum entanglement can be described as a quantum mechanical phenomenon in which the quantum states of two or more particles can only be described with reference to each other, regardless of the distance between them. A measurement on one of the particles is immediately reflected in the measurement outcomes of the others. A common example of entangled particles is two electrons prepared in a single quantum state, spinless as a pair: when one of the electrons is observed to be spin-up, the other electron will always be observed to be spin-down. The utilization of quantum entanglement is a powerful tool for information processing, enabling new algorithms.

2.4.2 Quantum Gates

Classical gates operate on classical bits and quantum gates operate on qubits. A key difference is that quantum gates can leverage the two quantum mechanical properties superposition and entanglement. Qubits that are entangled on their way into a quantum gate remain entangled on the way out.

The classical NOT gate interchanges the states 0 and 1: a bit with state 0 becomes 1 and a bit with state 1 becomes 0. The quantum NOT gate works in a similar fashion, mapping the state in equation 2.3 to the state in equation 2.4. The gate acts linearly, so the roles of $|0\rangle$ and $|1\rangle$ are interchanged. For a qubit in superposition, the two amplitudes α and β are thereby interchanged.

$$\alpha|0\rangle + \beta|1\rangle \tag{2.3}$$

$$\alpha|1\rangle + \beta|0\rangle \tag{2.4}$$

Other notable quantum gates are the controlled NOT gate and the Hadamard gate. The controlled NOT gate has two input qubits, referred to as the control qubit and the target qubit. If the control qubit is set to 0, the target qubit is left unaffected. If the control qubit is set to 1, the target qubit is flipped.

The Hadamard gate has a single input qubit and turns $|0\rangle$ into $(|0\rangle + |1\rangle)/\sqrt{2}$ and $|1\rangle$ into $(|0\rangle - |1\rangle)/\sqrt{2}$, halfway between $|0\rangle$ and $|1\rangle$ in both cases. In other words, the Hadamard gate places a qubit in a superposition of the two states. The Hadamard gate is a key component in both Shor’s and Grover’s quantum algorithms.
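The three gates described above can be sketched as operations on amplitude vectors. The sketch below assumes real-valued amplitudes for brevity (general qubit states have complex amplitudes), and the class and method names are hypothetical.

```java
public class QuantumGates {
    private static final double S = 1.0 / Math.sqrt(2.0);

    /** NOT gate on one qubit {alpha, beta}: interchanges the two amplitudes. */
    public static double[] not(double[] q) {
        return new double[] { q[1], q[0] };
    }

    /** Hadamard gate on one qubit {alpha, beta}. */
    public static double[] hadamard(double[] q) {
        return new double[] { S * (q[0] + q[1]), S * (q[0] - q[1]) };
    }

    /** Joint state of control and target, amplitudes ordered |00>, |01>, |10>, |11>. */
    public static double[] pair(double[] control, double[] target) {
        return new double[] {
            control[0] * target[0], control[0] * target[1],
            control[1] * target[0], control[1] * target[1]
        };
    }

    /** Controlled NOT: flips the target when the control is 1, i.e. swaps |10> and |11>. */
    public static double[] cnot(double[] s) {
        return new double[] { s[0], s[1], s[3], s[2] };
    }

    public static void main(String[] args) {
        double[] zero = { 1.0, 0.0 };
        // Hadamard on the control followed by CNOT gives (|00> + |11>)/sqrt(2).
        double[] bell = cnot(pair(hadamard(zero), zero));
        System.out.println(java.util.Arrays.toString(bell));
    }
}
```

The resulting two-qubit state has non-zero amplitude only for |00> and |11>, so it cannot be written as a product of two single-qubit states. This is the kind of entanglement discussed in section 2.4.1.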

2.4.3 Quantum Algorithms

Many public key cryptographic primitives depend on integer factorization or the discrete logarithm problem not being solvable in an efficient manner. There is currently no published algorithm that can solve these problems efficiently for the general case on a classical computer. However, there are algorithms for quantum computers running in polynomial time, the most well known being Shor’s algorithm published in 1994 [29]. Shor’s algorithm would on a quantum computer be able to efficiently break current prime factor based cryptography such as RSA, cryptography based on the discrete logarithm problem, as well as the elliptic-curve based variant [27].


The most efficient quantum algorithm for brute-forcing a symmetric cryptographic key is Grover’s algorithm published in 1996, which provides a quadratic speedup compared to its classical counterparts [18]. Classically, brute-forcing the key requires $O(N)$ time, where N is the number of possible keys, but with Grover’s algorithm it only takes $O(\sqrt{N})$ time. This means that Grover’s algorithm could brute-force a 128-bit symmetric cryptographic key in about $2^{64}$ iterations, or a 256-bit key in about $2^{128}$. To protect against future quantum attacks, symmetric key lengths could be doubled.
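As a worked example of the quadratic speedup, a k-bit key has $N = 2^k$ possible values, so the square root halves the exponent:

```latex
\[
\sqrt{2^{k}} = 2^{k/2}
\quad\Longrightarrow\quad
\sqrt{2^{128}} = 2^{64},
\qquad
\sqrt{2^{256}} = 2^{128}.
\]
```

Doubling the key length from 128 to 256 bits therefore restores an attack cost of about $2^{128}$ iterations under this simplified cost model, which ignores constant factors and the cost of each individual iteration.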

2.5 Post-Quantum Cryptography

This section is a brief overview of the three most important categories of post-quantum resistant cryptography, related to key exchange algorithms. The only algorithm capable of performing a Diffie-Hellman style key exchange is CSIDH, which can be found in the category of isogeny-based cryptography.

2.5.1 Lattice-based Cryptography

Lattice-based problems are well researched and have received a lot of attention during the past decade. They rely on the Shortest Vector Problem, the problem of finding the shortest non-zero vector within a lattice. Lattice problems benefit from the worst-case to average-case reduction, meaning that all keys are as hard to break in the easiest case as in the worst case.

In 1998 Hoffstein et al. published NTRU [19], the first practical lattice-based cryptosystem. The NTRU encryption scheme has better performance than classical cryptography, but larger public key sizes than for example RSA.

There is a related problem called Learning With Errors (LWE), enabling cryptosystems whose security can be reduced to lattice problems over general lattices. Notable NIST submissions based on LWE or variants of it are CRYSTALS-Kyber, FrodoKEM and NewHope [2, 8, 7].

2.5.2 Code-based Cryptography

Code-based cryptography uses error correcting codes, for example binary Goppa codes. A secure coding scheme can be implemented by keeping the encoding and decoding functions secret and publishing a disguised encoding function that maps a plaintext message to a scrambled set of code words. Only the party aware of the secret decoding function can remove the secret mapping and recover the plaintext message. The underlying mathematical problem is called syndrome decoding, which is hard to reverse for both classical and quantum computers.

The first encryption scheme based on binary Goppa codes was introduced by McEliece in 1978 [24]. The McEliece cryptosystem is very fast in encryption and reasonably fast in decryption, but has extremely large key sizes. Attempts have been made to reduce the large key sizes, typically by introducing more structure into the codes, but adding structure has led to successful attacks.

2.5.3 Isogeny-based Cryptography

Couveignes proposed the first isogeny-based cryptosystem in 1997, which described how to perform a non-interactive key exchange using the isogeny classes of ordinary elliptic curves defined over a finite field, but the corresponding paper was not formally published until 2006 [14]. The method was eventually rediscovered by Rostovtsev and Stolbunov [28].

In 2010, Childs, Jao, and Soukharev [12] showed how the Couveignes–Rostovtsev–Stolbunov scheme can be broken with the same computational complexity as solving an instance of the abelian hidden-shift problem, for which there are known quantum algorithms with a time complexity of $L_q[1/2]$, see Kuperberg [21] and Regev [25].


In 2011, De Feo, Jao, and Plût [16] considered the use of supersingular elliptic curves and the resulting key-agreement scheme was called "Supersingular Isogeny Diffie–Hellman" (SIDH). Galbraith, Petit, Shani, and Ti [17] later showed that SIDH keys succumb to active attacks and should not be reused.

In 2018, Castryck et al. published CSIDH, a new post-quantum cryptographic primitive that can serve as a drop-in replacement for the (EC)DH key exchange protocol [10]. CSIDH is a commutative group action based on supersingular elliptic curves defined over a large prime field, providing a non-interactive (static–static) key exchange with full public key validation. Keys can thus be reused for CSIDH. The speed of CSIDH is practical and the public key size is small. The published proof-of-concept implementation, run on an Intel Skylake i5 processor at 3.5 GHz, performs the CSIDH group action in 41 ms for a 512-bit prime and 64-byte public keys. The implementation features 512-bit field arithmetic written in assembly for Intel Skylake processors and generic C code for other field sizes and platforms.

Castryck et al. claim that CSIDH-512 with public keys of 64 bytes achieves 128 bits of classical security while matching NIST’s post-quantum security category 1, see Table 2.1.

CSIDH-log p    NIST level    Cost of quantum attack    Cost of classical attack
CSIDH-512      1             $2^{62}$                  $2^{128}$
CSIDH-1024     3             $2^{94}$                  $2^{256}$
CSIDH-1792     5             $2^{129}$                 $2^{448}$

Table 2.1: CSIDH attack cost in number of operations for different NIST levels.

2.6 Standardization Efforts

As governmental organizations around the globe are investing heavily in developing quantum computers, standardization efforts have also begun to prepare society for the post-quantum era.

The European Union (EU) has published initial recommendations for post-quantum cryptographic algorithms as part of the PQCRYPTO project for symmetric encryption, symmetric authentication, public-key encryption and public-key signatures [3]. The recommendations are focused on providing long-term security rather than efficiency, and are summarized below:

• Symmetric Encryption: Thoroughly analyzed ciphers with 256-bit keys achieving $2^{128}$ post-quantum security, including AES-256.

• Symmetric Authentication: Message authentication codes with underlying post-quantum secure ciphers.

• Public-key Encryption: The code-based McEliece cryptosystem with parameters as included in McBits [5]. A lattice-based cryptosystem is also under evaluation.

• Public-key Signatures: The hash-based signatures XMSS [9] and SPHINCS-256 [6].

The National Institute of Standards and Technology (NIST) has also begun standardization efforts for post-quantum cryptography [11]. Established NIST standards for conventional cryptography that are vulnerable to attacks in the post-quantum era include:

• FIPS 186-4: Digital Signature Standard

• SP 800-56 A: Recommendation for Pair-Wise Key Establishment Schemes Using Discrete Logarithm Cryptography


• SP 800-56 B: Recommendation for Pair-Wise Key-Establishment Schemes Using Integer Factorization Cryptography

In 2016, NIST sent out a call for proposals with the intention to standardize one or more unclassified, publicly disclosed digital signature, public-key encryption and key establishment algorithms for the post-quantum era.

The NIST post-quantum cryptography standardization process consists of two to three rounds of submissions, evaluations and standardization efforts. In January 2019, NIST sent out a status report on the first round of its post-quantum cryptography standardization process [2]. The chosen candidates for round two include 17 public-key encryption and key-establishment algorithms and 9 digital signature algorithms.


Chapter 3 Method

3.1 Affected Components

As Shor’s algorithm breaks the public-key cryptography used inside the Signal protocol, while Grover’s algorithm only weakens the symmetric cryptography, the primary focus of the implementation was to replace the public-key cryptography. The symmetric cryptography was left intact, as the primitives used still maintain 128-bit post-quantum security.

By analyzing the Signal protocol’s use of public-key cryptography, see section 2.3, one learns that the X3DH key agreement protocol contains four ECDH operations and an EdDSA signature, while the Double Ratchet key management algorithm contains one ECDH operation. This is all the asymmetric cryptography present in the Signal protocol. It is important to note that the identity key is used both in non-interactive key exchange operations and for signing, as this affects the possible choices of post-quantum resistant cryptographic primitives. To make the Signal protocol post-quantum resistant, all elliptic-curve cryptography (ECC) inside X3DH and the Double Ratchet must be replaced with post-quantum resistant alternatives.

3.2 Post-Quantum Resistant Drop-In Replacement

Using any of the current PQCRYPTO recommendations [3] or NIST round two candidates [2] would have required a redesign of the protocol, which was out of scope. All applicable PQCRYPTO recommendations and NIST round two candidates work by encapsulating a generated secret with the other party’s public key, requiring a transfer of the ciphertext in addition to knowing the other party’s public key, before a shared secret can be established. In other words, none of them could be used to perform a non-interactive key exchange, which was necessary to act as a drop-in replacement for the ECDH operations inside the Signal protocol.
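The difference between the two interfaces can be sketched with a toy example. The code below is not CSIDH and not any NIST candidate: it uses classical modular exponentiation with small, insecure parameters purely to contrast the data flows, and all names (NikeVersusKem, agree, encapsulate, decapsulate) are hypothetical.

```java
import java.math.BigInteger;

public class NikeVersusKem {
    // Toy group parameters (assumption, NOT secure): p = 2^127 - 1, g = 3.
    static final BigInteger P = BigInteger.ONE.shiftLeft(127).subtract(BigInteger.ONE);
    static final BigInteger G = BigInteger.valueOf(3);

    public static BigInteger publicKey(BigInteger privateKey) {
        return G.modPow(privateKey, P);
    }

    /** Non-interactive style: needs only our private key and their published public key. */
    public static BigInteger agree(BigInteger ourPrivate, BigInteger theirPublic) {
        return theirPublic.modPow(ourPrivate, P);
    }

    /** KEM style: returns {ciphertext, sharedSecret}; the ciphertext must be sent to the peer. */
    public static BigInteger[] encapsulate(BigInteger theirPublic, BigInteger ephemeralPrivate) {
        return new BigInteger[] { publicKey(ephemeralPrivate), agree(ephemeralPrivate, theirPublic) };
    }

    /** KEM style: the receiver cannot derive the secret until the ciphertext has arrived. */
    public static BigInteger decapsulate(BigInteger ourPrivate, BigInteger ciphertext) {
        return agree(ourPrivate, ciphertext);
    }

    public static void main(String[] args) {
        BigInteger alicePrivate = new BigInteger("123456789123456789");
        BigInteger bobPrivate = new BigInteger("987654321987654321");
        BigInteger alicePublic = publicKey(alicePrivate);
        BigInteger bobPublic = publicKey(bobPrivate);

        // Both sides derive the same secret from published keys alone, no extra message.
        System.out.println(agree(alicePrivate, bobPublic).equals(agree(bobPrivate, alicePublic)));

        // With a KEM, Bob additionally needs the ciphertext produced by Alice.
        BigInteger[] enc = encapsulate(bobPublic, new BigInteger("555555555555"));
        System.out.println(decapsulate(bobPrivate, enc[0]).equals(enc[1]));
    }
}
```

In the first call pattern each party needs only stored public keys, which mirrors how the ECDH operations are used in the Signal protocol. In the second, the extra ciphertext is exactly the additional data flow that a drop-in replacement must avoid.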

In the Signal protocol, one of the parties can be offline at any time, allowing asynchronous communication with the help of a server, storing the necessary public keys or encrypted messages until the other party comes online again.

As a redesign of the Signal protocol was undesired, a post-quantum resistant candidate was sought outside the PQCRYPTO and NIST initiatives, to avoid losing the Signal protocol’s asynchronicity property or having to introduce new data flows.

Achieving a non-interactive key exchange in the post-quantum era was considered an open problem, until Castryck et al. published CSIDH, providing a drop-in replacement for ECDH [10]. CSIDH was therefore considered the most suitable candidate for the implementation phase and later benchmarking. Unfortunately, no practical CSIDH-based signature scheme implementation had been published by the time of writing, meaning that finding a drop-in replacement for the EdDSA signature was left for future work.


3.3 Implementation

The Signal Protocol is released as open-source under the GNU General Public License (GPLv3) in Java, C and JavaScript. This implementation used the Java version¹, specifically commit 3c1a8ee representing the most recent Signal protocol version 2.7.0 (April 4, 2019). Android Studio version 3.3.2 was used for all builds and benchmarking runs together with Java SE Runtime Environment 1.8.0.

A dependency of the Signal protocol is the Curve25519 Java library², containing functions for elliptic curve key pair generation and shared secret calculation. This implementation used commit 70fae57, the most recent Curve25519 Java library version 0.5.0 (May 4, 2018). For the post-quantum resistant implementation of the Signal protocol, the original CSIDH source code³ as published by Castryck et al. was used.

The official Signal protocol imports the Curve25519 Java library, with multiple elliptic-curve cryptography providers for different platforms. For the Android platform, the native curve provider was used, using the Java Native Interface to call native C code for all elliptic-curve cryptography. See figure 3.1 for the related implementation layout.

Figure 3.1: Original ECC implementation layout.

The leftmost column represents the highest abstraction layer. Note how the Signal protocol library imports functionality from Curve25519.java, in turn importing functionality from NativeCurve25519Provider.java, using the Java Native Interface and curve25519-jni.c to call native code for all elliptic-curve cryptography.

To mimic the call structure of the original protocol and obtain benchmarks that are as comparable as possible, the CSIDH library was also wrapped using the Java Native Interface and the original implementation layout was left as intact as possible, see figure 3.2.

Figure 3.2: CSIDH implementation layout.

¹ https://github.com/signalapp/libsignal-protocol-java
² https://github.com/signalapp/curve25519-java
³ https://csidh.isogeny.org/software.html


Note how Curve.java on the Signal protocol layer now imports our implemented CsidhMain.java that uses the Java Native Interface and our implemented csidh-jni.c to call native code implemented by Castryck et al. for all CSIDH isogeny-based cryptography.

When building the Android app (APK), the layers below are first compiled into Android archive (AAR) files and imported as modules. The rightmost layer in figures 3.1 and 3.2 with cryptographic primitives implemented in C is compiled into a shared library and loaded by the Java Native Interface.

For all code listings below, parts of the code irrelevant for describing the method have been truncated with "..." for better readability.

3.3.1 Cryptographic Libraries

The original Signal protocol uses an optimized Curve25519 implementation by Google. The original CSIDH implementation as released by Castryck et al. had yet to be optimized for the ARMv8 platform at the time of writing, with only a generic C code implementation available [10]. As part of this thesis the generic C code version was compiled for the ARMv8 platform without further optimization.

The Curve25519 library entry point is in Curve25519.java and exposes functionality for generating key pairs, calculating shared secrets, and generating and verifying signatures. Any cryptographic library intended as a drop-in replacement for the existing Curve25519 library must implement equivalent functionality, otherwise larger modifications to the Signal protocol are required.

To better understand the CSIDH implementation published by Castryck et al., the file tree, dependencies and internal workings were analyzed:

• params.h: Defines a prime field, constants and type definitions.

• constants.c/.h: Pre-computed values for arithmetic operations on the prime field.

• uint.c/.h/.s: Unsigned integer implementation in generic C code and optimized assembler instructions for Intel processors.

• fp.c/.h/.s: Finite field implementation in generic C code and optimized assembler instructions for Intel processors.

• rng.c/.h: Pseudo-random number generator utilizing "/dev/urandom" as source.

• mont.c/.h: Montgomery arithmetic implementation for isogeny computations.

• csidh.c/.h: Acts as an API with csidh_private() for generating CSIDH private keys and csidh() for generating CSIDH public keys and calculating shared secrets.

• bench.c: Benchmarking functionality for running a large number of CSIDH iterations, measuring clock cycles, wall clock time and stack memory usage. Not used for the implemented CSIDH library.

• main.c: Runs a single CSIDH iteration for Alice and Bob, generating a key pair each and calculating their shared secrets. Not used for the implemented CSIDH library.

As part of the CSIDH library implementation, CsidhMain.java in Listing 1 and csidh-jni.c in Listing 2 were implemented, wrapping the original CSIDH functionality exposed through csidh.h with the Java Native Interface. Compared to the Curve25519 library with multiple providers to consider, CsidhMain.java only has to consider a single cryptographic provider.


...
public class CsidhMain {
    ...
    // Generates a private key natively, then derives the matching public key.
    public CsidhKeyPair generateKeyPair() {
        byte[] privateKey = this.generatePrivateKey();
        byte[] publicKey = this.generatePublicKey(privateKey);
        return new CsidhKeyPair(publicKey, privateKey);
    }

    // Implemented in csidh-jni.c and resolved through the Java Native Interface.
    public native byte[] generatePrivateKey();
    public native byte[] generatePublicKey(byte[] privateKey);
    public native byte[] calculateAgreement(byte[] ourPrivate,
                                            byte[] theirPublic);

    // Loads the native shared library when the class is initialized.
    static {
        System.loadLibrary("csidh-jni");
    }
}

Listing 1: Main class for the JNI backed CSIDH library implementation.

Refer to the original Curve25519 library and implemented CSIDH library source code as necessary. Note how listing 1 loads the native library built from csidh-jni.c with loadLibrary() and that it contains no functionality for generating and verifying signatures, as this was left for future work. The code in listing 2 has been truncated for readability.

#include <jni.h>
#include "csidh.h"
...
/* Calculates the shared secret from their public key and our private key. */
JNIEXPORT jbyteArray JNICALL
Java_com_example_csidhlibrary_CsidhMain_calculateAgreement(...) {
    ...
    csidh(&sharedSecretNative, &theirPublicKeyNative, &ourPrivateKeyNative);
    ...
    return sharedSecretArray;
}

/* Generates a new CSIDH private key. */
JNIEXPORT jbyteArray JNICALL
Java_com_example_csidhlibrary_CsidhMain_generatePrivateKey(...) {
    ...
    csidh_private(&privateKeyNative);
    ...
    return privateKeyArray;
}

/* Derives the public key by applying the private key to the base curve. */
JNIEXPORT jbyteArray JNICALL
Java_com_example_csidhlibrary_CsidhMain_generatePublicKey(...) {
    ...
    csidh(&publicKeyNative, &base, &privateKeyNative);
    ...
    return publicKeyArray;
}


Important to note is how the functions calculateAgreement(), generatePrivateKey() and generatePublicKey() are made accessible to the Java layer and how the CSIDH functions csidh_private() and csidh() implemented by Castryck et al. are called from csidh-jni.c in listing 2. Truncated parts of the code take care of, among other things, memory handling and data type conversion between the Java and C layers. As can be seen in listing 1, the native functions in CsidhMain.java all return a byte array corresponding to the jbyteArrays returned in csidh-jni.c, and generateKeyPair() uses the two native functions generatePrivateKey() and generatePublicKey(), following the same pattern as can be observed in the original Curve25519 library.

After implementing CsidhMain.java and csidh-jni.c wrapping the original CSIDH functionality using the Java Native Interface, the key generation and shared secret calculation functions were ready to be imported and called from the Signal protocol layer.

3.3.2 The Signal Protocol

The implemented CSIDH library purposefully has an interface equivalent to that of the original Curve25519 library and can act as a drop-in replacement, with only minor modifications to the Signal protocol.

Two essential functions made available in Curve.java at the Signal protocol layer are generateKeyPair() and calculateAgreement(), see listing 3 for the original implementation. These two functions are dependencies of multiple other parts of the Signal protocol, namely all parts where key pairs must be generated or shared secrets calculated.

import org.whispersystems.curve25519.Curve25519;
import org.whispersystems.curve25519.Curve25519KeyPair;
...
public class Curve {
    ...
    public static ECKeyPair generateKeyPair() {
        Curve25519KeyPair keyPair = Curve25519.getInstance(...).generateKeyPair();
        return new ECKeyPair(new DjbECPublicKey(keyPair.getPublicKey()),
                             new DjbECPrivateKey(keyPair.getPrivateKey()));
    }

    public static byte[] calculateAgreement(...) {
        ...
        return Curve25519.getInstance(...).calculateAgreement(...);
        ...
    }
    ...

Listing 3: Original Curve.java implementation in the Signal protocol.

The two function interfaces for generateKeyPair() and calculateAgreement() were kept as is, but instead of importing and using functionality from the Curve25519 library, our implemented CSIDH library was used, see listing 4 for the modified Curve.java.

The original Signal protocol and the CSIDH implementation variant require different key sizes. A comparison of the different key sizes can be seen in table 3.1. Note that the CSIDH implementation has two different parameter sets. The original Curve.java hard-codes the expected public key size, so offsets and the number of expected bytes had to be increased for the CSIDH implementation.


import com.example.csidhlibrary.CsidhMain;
import com.example.csidhlibrary.CsidhKeyPair;
...
public class Curve {
    ...
    public static ECKeyPair generateKeyPair() {
        CsidhKeyPair keyPair = CsidhMain.getInstance().generateKeyPair();
        return new ECKeyPair(new DjbECPublicKey(keyPair.getPublicKey()),
                             new DjbECPrivateKey(keyPair.getPrivateKey()));
    }

    public static byte[] calculateAgreement(...) {
        ...
        return CsidhMain.getInstance().calculateAgreement(...);
        ...
    }
    ...

Listing 4: New Curve.java implementation in the Signal protocol.

               Curve25519    CSIDH-512    CSIDH-1024
Public Key     32            64           128
Private Key    32            32           64

Table 3.1: Key sizes in bytes.
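The effect of the key sizes in table 3.1 on parsing code can be sketched as follows. This is a hypothetical illustration rather than the actual Curve.java code: it assumes a serialized public key consisting of a one-byte type prefix followed by the raw key bytes, and the names KeyDecoding, TYPE_BYTE and decodePublicKey are invented for the example.

```java
import java.util.Arrays;

public class KeyDecoding {
    // Assumption: a single type byte precedes the raw key material.
    static final int TYPE_BYTE = 0x05;

    /** Extracts keyLength raw key bytes that follow the type byte at the given offset. */
    public static byte[] decodePublicKey(byte[] serialized, int offset, int keyLength) {
        if (serialized.length - offset < 1 + keyLength) {
            throw new IllegalArgumentException("serialized key too short for " + keyLength + " bytes");
        }
        if ((serialized[offset] & 0xFF) != TYPE_BYTE) {
            throw new IllegalArgumentException("unexpected key type");
        }
        return Arrays.copyOfRange(serialized, offset + 1, offset + 1 + keyLength);
    }

    public static void main(String[] args) {
        byte[] serialized = new byte[1 + 64]; // type byte plus a 64-byte CSIDH-512 public key
        serialized[0] = (byte) TYPE_BYTE;
        // A length of 32 would fit Curve25519; 64 and 128 correspond to the CSIDH columns in table 3.1.
        System.out.println(decodePublicKey(serialized, 0, 64).length);
    }
}
```

Passing the key length as a parameter, rather than hard-coding 32, is one way to keep the same parsing logic usable for all three columns of the table.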

As mentioned in section 1.4, generating and verifying signatures was left for future work, as no practical CSIDH-based signature scheme was found at the time of writing and a redesign of the Signal protocol was out of scope. To accommodate this delimitation, signature generation functions were modified to return empty byte arrays and the signature verification taking place in the Signal protocol was commented out, see listing 5.

/*
if (preKey.getSignedPreKey() != null &&
    !Curve.verifySignature(preKey.getIdentityKey().getPublicKey(),
                           preKey.getSignedPreKey().serialize(),
                           preKey.getSignedPreKeySignature())) {
    throw new InvalidKeyException("Invalid signature on device key!");
}
*/

Listing 5: Commented out signature verification.

3.3.3 Benchmarking App

An Android app was implemented in two different versions, one importing the original Signal protocol and one importing the CSIDH-based Signal protocol, acting as benchmarking platforms for collecting comparative CPU metrics. The implemented app was based on an open-source Signal demo app⁴, using its most recent commit 087f742 (April 6, 2016).


The Signal protocol has interfaces and callback functions requiring implementations on the client side of the application importing it. For example, how and where the client stores keys and messages is left unimplemented by the Signal protocol. For a production app, it is critical that the developer understands the importance of storing all private keys securely, especially the long-term identity key.

For the original Signal app, a remote server is used for storing, receiving and sending key bundles and encrypted messages between registered clients. For the implemented benchmarking app, there is no persistent data store or remote server involved; instead, all key bundles and messages are temporarily stored in memory on the device itself. The benchmarking app user interface consists of two scrolling text fields, two input fields and two send buttons, representing two participants, see figure 3.3.

Figure 3.3: Benchmarking app user interface.

For every message sent or received, a complete Signal protocol run is executed for each of the two participants. Note that the two protocol runs are executed in sequence and that simulated server operations and transmission time between the two participants are excluded from the benchmarks. The benchmarks in section 3.4 only measure CPU performance for the two participants’ session initialization, key pair generation, shared secret calculations, encryption and decryption of messages.


Apart from the different Signal protocol imports, the two versions of the benchmarking app are identical, sharing the same Android manifest, layout specifications, graphical resources and Java classes:

• MainActivity.java: Sets up two participants and a chat fragment with interaction listeners, a communication channel between the two participants and a simulated server instance for storing key bundles.

• ChatFragment.java: Contains interactive UI elements for two participants and updates affected views when new messages are sent or received.

• Participant.java: Generates key bundles, registers the participants with the server and contains methods for sending and receiving encrypted messages.

• Channel.java: Handles the transmission of encrypted messages between participants.

• Server.java: Simulates a server, storing key bundles for registered participants and enabling them to be retrieved.

3.4 Performance Evaluation

The Android Studio CPU profiler was used for all benchmarking, providing two options for recording trace information, sampling or instrumenting the app. Sampling was preferred to minimize overhead and avoid impacting runtime performance.

All collected data were compared using thread time. Thread time is wall clock time minus any portion of that time during which the thread was not consuming CPU resources. Using thread time instead of wall clock time gives a better understanding of the thread’s actual CPU usage.
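The distinction between the two clocks can be illustrated on a desktop JVM. The sketch below is not how the Android Studio CPU profiler measures thread time; it uses the standard ThreadMXBean API as an analogue, and the class name ThreadTimeDemo is hypothetical. On Android, android.os.Debug.threadCpuTimeNanos() offers a comparable per-thread CPU clock.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class ThreadTimeDemo {
    /** Returns {wallNanos, cpuNanos} for a task that first sleeps and then computes. */
    public static long[] measure(long sleepMillis) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long wallStart = System.nanoTime();
        long cpuStart = bean.getCurrentThreadCpuTime();
        try {
            Thread.sleep(sleepMillis); // the thread is idle: wall clock advances, CPU time barely does
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long acc = 0;
        for (int i = 0; i < 1_000_000; i++) {
            acc += i % 7; // short burst of real CPU work
        }
        long cpu = bean.getCurrentThreadCpuTime() - cpuStart;
        long wall = System.nanoTime() - wallStart;
        if (acc < 0) {
            throw new IllegalStateException("unreachable, keeps the loop from being discarded");
        }
        return new long[] { wall, cpu };
    }

    public static void main(String[] args) {
        long[] t = measure(200);
        System.out.printf("wall clock: %d ms, thread (CPU) time: %d ms%n",
                t[0] / 1_000_000, t[1] / 1_000_000);
    }
}
```

The sleep inflates the wall clock figure but not the thread time, which is why thread time is the more suitable basis for comparing the computational cost of the two protocol variants.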

The Android Studio CPU profiler layout consists of five main components, see figure 3.4. The five components correspond to 1-5 below, describing what settings were used in the CPU profiler user interface for the benchmarking runs:

1. Recording configuration menu: Before trace information can be recorded, a recording configuration must be set. A custom configuration was created for Java method sampling, capturing the app’s call stack at 500 µs intervals. The profiler then automatically compares the captured data to derive timing and resource usage information.

2. Selected range: After recording a trace, the entire length of the recording is automatically selected. A subset of the recorded trace data can be inspected by manually selecting a smaller time range.

3. Trace pane tabs: Call charts were chosen as graphical representation for recorded method traces, see figure 3.5 for an example. The period and timing of a call are represented on the horizontal axis and its callees are shown along the vertical axis. Calls to system APIs are shown in orange, calls to the benchmarking app’s own methods are shown in green and calls to third-party APIs are shown in blue.

4. Time reference menu: Timing information for all method calls was measured as thread time. Care had to be taken, as this setting defaults to wall clock time.

5. Trace pane: Information from recorded traces is viewed in the trace pane. As mentioned above, call charts were used as the graphical representation. By hovering over a method call with the mouse, its period and timing information is displayed, see figure 3.6.


Figure 3.4: Android Studio CPU profiler overview.


Figure 3.6: Android Studio CPU profiler hover panel.

The Signal protocol takes different execution paths depending on whether a message is the first one sent or received in a session, a consecutive message or a reply to a previously received message. This has to do with how X3DH and the Double Ratchet work, see sections 2.3.1 and 2.3.2. X3DH is only performed when establishing a new session and the Diffie-Hellman ratchet in the Double Ratchet algorithm is typically only performed when a participant receives a reply. Note that the term Diffie-Hellman ratchet is used for both the original and the CSIDH-based implementation.

The results of the different execution paths can be observed in the call charts, by sending and receiving messages between the two participants, while recording trace information. For all benchmarking runs, the text string "lorem ipsum" was sent from one participant to the other. Different periods and timings are observed for initializing a new session by sending the first message from one participant to the other, for sending multiple messages in a row and for participants alternately replying to each other.

In figure 3.7, the red rectangle surrounds method calls involved when sending the first message and the blue rectangle surrounds method calls involved when receiving the first message. When measuring the total time taken to send the first message, the top level method calls in the Signal Protocol are SessionBuilder.Process and SessionCipher.Encrypt. For the receiving case they are SessionBuilder.Process and SessionCipher.Decrypt. The total thread time for the Signal protocol’s top level method calls was combined from the recorded method traces, giving the total time taken for sending or receiving the first message. Note that this excludes operating system and app layer method calls, conforming to the second research question, see section 1.3.

Figure 3.7: Call chart for a message initializing a session.

When sending or receiving consecutive messages, the session has already been established, meaning that only a method call to SessionCipher.Encrypt or SessionCipher.Decrypt is involved, see figure 3.8.

The same goes for sending or receiving replies: the session has already been established, see figure 3.9. However, an important difference for replies is that SessionCipher.Decrypt then also executes the Diffie-Hellman ratchet in the Double Ratchet algorithm, which can be observed as a much longer total time.

Figure 3.8: Call chart for a consecutive message.

Figure 3.9: Call chart for a reply.

All benchmarking runs were performed on a Samsung Galaxy Note 8 model SM-N950F, with an Exynos 9 (8895) 64-bit ARMv8 octa-core processor, 6 GB RAM (LPDDR4) and running Android 9.0 (Pie). The octa-core processor features four Mongoose 2 (2.3 GHz) cores and four Cortex-A53 (1.7 GHz) cores. After performing a factory reset, sending of usage and diagnostic data was disabled. WiFi, GPS, Bluetooth, NFC, synchronization, sound, screen rotation, adaptive brightness, edge lighting, Google Play Protect and input assistance were all turned off. Default apps that could be uninstalled were uninstalled and the remaining apps that could be disabled were disabled. To be able to connect to the phone over USB, install the benchmarking app and run the Android Studio CPU profiler, developer options and USB debugging were enabled.


Chapter 4 Results

4.1 Implementation

The implementation phase resulted in a post-quantum resistant version of the Signal protocol, not considering the EdDSA signature, replacing the original Curve25519 library for elliptic-curve cryptography with a post-quantum resistant drop-in replacement. All ECDH operations were successfully replaced with post-quantum resistant CSIDH operations.

The post-quantum resistant cryptographic library was implemented from the original post-quantum resistant CSIDH key generation and key exchange algorithms, wrapped with the Java Native Interface and made to conform with the existing interfaces and function calls in the Signal protocol.

A lightweight benchmarking app was also implemented so that the two different Signal protocol implementations could be executed in full, while recording trace information for all method calls, finding the CPU utilization for each version.

The Signal protocol’s secure messaging properties were largely unaffected, as the CSIDH library was successfully implemented as a drop-in replacement for the existing Curve25519 library. However, as replacing the EdDSA signature for the signed prekey inside the X3DH key agreement protocol was left for future work, forward secrecy was weakened. Quoting Marlinspike and Perrin [23]:

It might be tempting to observe that mutual authentication and forward secrecy are achieved by the DH calculations, and omit the prekey signature. However, this would allow a “weak forward secrecy” attack: A malicious server could provide Alice a prekey bundle with forged prekeys, and later compromise Bob’s identity key to calculate the shared secret key.

4.2 Performance

The mean and standard deviation, presented as "Mean (Standard Deviation)" in the tables below, were calculated from 25 separate benchmark runs and rounded to the nearest millisecond. See appendices A.1, A.2 and A.3 for the individual benchmark runs.
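As a minimal sketch of how one table entry could be produced from a list of per-run thread times, consider the snippet below. The function name `summarize` and the sample values are hypothetical and not taken from the appendices. The text does not say whether the sample (n - 1) or the population standard deviation was used, so the sample variant is an assumption.

```python
from statistics import mean, stdev


def summarize(thread_times_ms):
    """Format thread times as 'Mean (Standard Deviation)' in whole milliseconds."""
    m = mean(thread_times_ms)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    s = stdev(thread_times_ms)
    return f"{round(m)} ({round(s)})"


# Hypothetical thread times in milliseconds, for illustration only.
example_runs = [350, 360, 340, 355, 345]
print(summarize(example_runs))
```

With 25 runs per measurement, the same function would be applied once per row and column of each table.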

Sending the first message in a session requires the largest computational workload for both parties, as an initial shared secret has to be established using X3DH and the sending or receiving chain has to be initialized. The sender initiating the session also has to generate an ephemeral key pair for the X3DH algorithm, resulting in a slightly higher computational workload than for the receiver. See table 4.1 for thread times in milliseconds to send or receive the first message in a session. Note how the thread time increases by roughly one order of magnitude from Curve25519 to CSIDH-512 (about 12 times for sending and 19 times for receiving) and by roughly half an order of magnitude from CSIDH-512 to CSIDH-1024 (about 5 times).

|               | Curve25519 | CSIDH-512  | CSIDH-1024  |
|---------------|------------|------------|-------------|
| Send First    | 352 (19)   | 4110 (159) | 21931 (960) |
| Receive First | 213 (15)   | 3947 (133) | 21890 (850) |

Table 4.1: Mean thread time [ms] for sending/receiving the first message in a session.
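The orders of magnitude quoted for Table 4.1 can be checked with a short calculation. The sketch below uses the mean values from the table. The dictionary layout and the helper name `slowdown` are illustrative choices, not part of the benchmarking app.

```python
import math

# Mean thread times [ms] copied from Table 4.1.
TABLE_4_1 = {
    "Send First": {"Curve25519": 352, "CSIDH-512": 4110, "CSIDH-1024": 21931},
    "Receive First": {"Curve25519": 213, "CSIDH-512": 3947, "CSIDH-1024": 21890},
}


def slowdown(baseline_ms, candidate_ms):
    """Return (factor, orders of magnitude) of candidate relative to baseline."""
    factor = candidate_ms / baseline_ms
    return factor, math.log10(factor)


for operation, times in TABLE_4_1.items():
    for baseline, candidate in [("Curve25519", "CSIDH-512"),
                                ("CSIDH-512", "CSIDH-1024")]:
        factor, orders = slowdown(times[baseline], times[candidate])
        print(f"{operation}: {baseline} -> {candidate}: "
              f"{factor:.1f}x ({orders:.2f} orders of magnitude)")
```

The step from Curve25519 to CSIDH-512 comes out at roughly 12 times for sending and 19 times for receiving, and the step from CSIDH-512 to CSIDH-1024 at a little over 5 times in both directions.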

When sending and receiving consecutive messages, the computational workload is very low, as the parties only have to perform symmetric ratchet steps and derive new message keys. No time-consuming key generation or asymmetric ratchet steps are involved. The thread times are therefore low and similar for both cryptographic libraries, as well as for sending and receiving; see table 4.2.

|                     | Curve25519 | CSIDH-512 | CSIDH-1024 |
|---------------------|------------|-----------|------------|
| Send Consecutive    | 12 (2)     | 12 (2)    | 13 (1)     |
| Receive Consecutive | 11 (2)     | 11 (1)    | 11 (2)     |

Table 4.2: Mean thread time [ms] for sending/receiving consecutive messages.
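To illustrate why consecutive messages are cheap regardless of the public-key library, a symmetric ratchet step can be sketched with nothing but a keyed hash. The snippet below follows the recommendation in the Double Ratchet specification (HMAC-SHA256 keyed with the chain key, with the constants 0x01 and 0x02 as input). It is an assumption that the benchmarked implementation derives its keys in exactly this way, and the function name is hypothetical.

```python
import hashlib
import hmac


def symmetric_ratchet_step(chain_key: bytes):
    """Derive a message key and the next chain key from the current chain key."""
    # Constant 0x01 yields the message key, 0x02 the next chain key
    # (assumed here, following the Double Ratchet specification's suggestion).
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


# Three consecutive messages: only hashing, no Curve25519 or CSIDH operation.
chain_key = bytes(32)  # placeholder initial chain key, for illustration only
for _ in range(3):
    chain_key, message_key = symmetric_ratchet_step(chain_key)
    print(message_key.hex())
```

Since no Diffie-Hellman operation appears in this step, its cost does not depend on whether Curve25519 or CSIDH is used, which is consistent with the near-identical columns in table 4.2.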

For sending replies, the computational workload is as low as for sending a consecutive message. The same is not true for receiving a reply, because both the new receiving chain and the new sending chain are created at that point, resulting in a much larger computational workload. This is a design choice by Marlinspike and Perrin [22] to reduce complexity in the Signal protocol. The difference in thread times between the sender and the receiver is large. In the case of CSIDH, the thread time for receiving a reply is more than two orders of magnitude higher than the thread time for sending a reply (roughly 160 times for CSIDH-512 and 850 times for CSIDH-1024); see table 4.3.

|               | Curve25519 | CSIDH-512 | CSIDH-1024 |
|---------------|------------|-----------|------------|
| Send Reply    | 11 (2)     | 11 (2)    | 11 (2)     |
| Receive Reply | 74 (5)     | 1778 (88) | 9364 (505) |

Table 4.3: Mean thread time [ms] for sending/receiving replies.


Upphovsrätt
Copyright
Acknowledgments
Contents
List of Figures
List of Tables
Abbreviations
Chapter 1 Introduction
1.1 Motivation
1.2 Aim
1.3 Research Questions
1.4 Scope
Chapter 2 Theory
2.1 Common Cryptographic Primitives
2.1.1 Public Key Cryptography
2.1.2 Digital Signature
2.1.3 Cryptographic Hash Function
2.1.4 Message Authentication Code
2.1.5 Diffie-Hellman Key Exchange
2.1.6 Key Derivation Function
2.1.7 Authenticated Encryption with Associated Data
2.2 Secure Messaging Properties
2.2.1 Confidentiality
2.2.2 Integrity
2.2.3 Authentication
2.2.4 Forward Secrecy
2.2.5 Post-Compromise Security
2.2.6 Asynchronicity
2.2.7 Other Properties
2.3 The Signal Protocol
2.3.1 The X3DH Key Agreement Protocol
2.3.2 The Double Ratchet Key Management Algorithm
2.4 Quantum Computing
2.4.1 Quantum Bits
2.4.2 Quantum Gates
2.4.3 Quantum Algorithms
2.5 Post-Quantum Cryptography
2.5.1 Lattice-based Cryptography

2.5.2