

Degree project

Partitioning oracle attacks

against variants of AES-GCM and ChaCha20-Poly1305

Author: Pontus Tordsson

Supervisor: Per Anders Svensson Examiner: Marcus Nilsson Date: 2021-06-09

Course Code: 5MA41E Subject: Cryptography Level: Master

Department Of Mathematics


Abstract

We investigate so-called partitioning oracle attacks against AES-GCM and ChaCha20-Poly1305 along with some improvements. Such attacks against these two cryptosystems are efficient because they can be reduced to solving linear systems of equations over finite fields. We show, with some randomness assumptions, that such linear systems must have at least as many columns as rows. We have also chosen two finite (non-field) rings as replacements for the respective fields used by AES-GCM and ChaCha20-Poly1305 for message authentication. These rings make the problem of linear system arrangement in a partitioning oracle attack extremely hard for large linear system dimensions.

Acknowledgements

The subject of partitioning oracle attacks was suggested to us last year by Carl Löndahl,¹ who also proposed an additional lattice reduction step against ChaCha20-Poly1305. This step is an improvement to the original attack [LGR20, pp. 20–23]. The lattice reduction step presented here is slightly different, but it is based on the same idea. I appreciate the discussions with Carl and with my supervisor Per Anders on the subject.

Keywords: partitioning oracle attack, AES-GCM, ChaCha20-Poly1305

¹ Carl Löndahl has a PhD in information theory from Lund University.


1 Introduction

Cryptanalysis is the discipline of analyzing cryptosystems to see how good they are at their job. Cryptanalysis is roughly on a spectrum between theoretical and practical, the former being common in academia and the latter being common in practice. Some cryptographers [Aum19] have argued that purely theoretical cryptanalysis of AES [RD99] (for example) is no longer necessary, their reason being that AES in its purest form may not be broken before humans go extinct. But practical implementation issues are still many, and cryptosystems will probably remain insecure in various real-life situations. There seem to be no exceptions to this. Two examples are as follows. Curve25519 [Ber06] is an elliptic curve that is very fit for key exchange between two black boxes, but not so much between two smartphones whose power usage can leak information about their private key [GVY17, p. 847]. AES is extremely secure when used in a black box, but not so much in a computer whose memory access patterns can leak information about the secret key [Ber05a].

One theoretical attack, called a partitioning oracle attack (POA), is a relatively new concept introduced by Len, Grubbs, and Ristenpart last year [LGR20]. It is a key recovery attack involving a new type of oracle that answers where a secret key is located in a partition of a set of keys. An oracle in cryptography is an entity that has special access to some form of information or process. There are various types of oracles, each with their own specific purpose. Actually realizing a given oracle in practice is usually difficult if not impossible, but Len et al. managed to realize an efficient partitioning oracle against two cryptosystems called AES-GCM [MV04] and ChaCha20-Poly1305 [SSS17]. This oracle took the form of a proxy server used in Virtual Private Networks (VPNs), and it answered by having an exposed side effect vary between different queries.

This thesis investigates how easy key recovery by POA can be under certain circumstances of using AES-GCM and ChaCha20-Poly1305. Our main context is relatively abstract. It consists of two parties called Alice and Bob, who communicate with encrypted messages, and an adversary called Eve, who tries to obtain Alice’s and Bob’s secret key. The two cryptosystems rely on finite field arithmetic; we also investigate how switching to non-field arithmetic might change various forms of security.

1.1 Limitations

The main attack presented in this thesis is essentially a form of key search, and it relies on a few assumptions in order to be practically possible. We assume that keys are derived from passwords, in which case the candidate keys to search could be derived from password leak data. Since passwords are notoriously non-random [DMR10], the number of candidate keys may be around the order of magnitude 10¹², which is significantly fewer than all possible keys of, say, 128-bit AES (which has 2¹²⁸ ≈ 3.4 · 10³⁸ possible keys). We also assume that the function f used for deriving keys from passwords is known to the attacker, and that f only takes a password as input. Some technical assumptions are also made along the way to make analysis a bit easier. Any deviation from these assumptions only decreases the applicability of our analysis. Most importantly, this thesis is irrelevant if Alice and Bob choose their secret key completely at random.


1.2 Contributions

Partial simulations of a POA against AES-GCM and ChaCha20-Poly1305 were made using the Number Theory Library (NTL) for C++, mainly to get a feeling for the speed that an adversary might expect with a personal computer. The simulations only measure the “main workhorse” of the attack. We describe in Section 2.2 an improvement (courtesy of Carl Löndahl) of the original POA [LGR20, p. 20] against ChaCha20-Poly1305, the improvement being a lattice reduction step. This changes the time complexity of the attack from expected-case exponential to worst-case polynomial.

Given certain assumptions on finite ring arithmetic, we also show in Section 2.3 that the linear system approach to the POA presented here requires linear systems with at least as many columns as rows; otherwise the chance of ever finding a solution is close to zero. A constraint against AES-GCM is that linear systems have at most 2³² rows. The implication is that message size grows at least linearly with respect to the number of keys tested simultaneously.

We also show in Section 3 that certain non-field rings change the problem of linear system arrangement from trivial to extremely difficult (against both AES-GCM and ChaCha20-Poly1305), at least for high-dimensional linear systems. The suggested rings are polynomials modulo x¹²⁸ + x² + x ∈ F₂[x] for AES-GCM, and integers modulo 2¹³⁰ − 50 for ChaCha20-Poly1305, both as respective replacements for the fields used by the two cryptosystems. These rings appear to make it extremely time consuming to arrange linear systems for testing large sets of keys. We also provide in Sections 3.3 and 3.4 upper bounds on the probability of successful arrangement of linear systems.

1.3 Preliminaries

The reader is assumed to be somewhat familiar with ring theory within abstract algebra, but some notation is still clarified to avoid ambiguity. Finite fields of q = pⁿ elements (with p being prime) are denoted F_q. Up to isomorphism, this is sometimes called the finite field of order q, because all finite fields of a fixed order (i.e. fixed size) q are isomorphic to each other. Isomorphism is an important concept used here; if two rings A and B are isomorphic, then they are qualitatively the same ring from an algebraic standpoint. They have the same algebraic structure, in other words.

Definition 1. Two ideals I, J of a ring S are said to be coprime if I + J = {i + j | i ∈ I , j ∈ J } = S .

A commonly used ring here is the set of polynomials modulo 2, denoted F₂[x]. These are all the possible polynomials in x where coefficients are integers reduced modulo 2. The set of integers is also a commonly used ring. The rings F₂[x] and Z are so-called principal ideal domains, meaning that any ideal of these rings is generated by a single element. By definition, any element of such an ideal has the generator as a factor. An ideal of F₂[x] generated by a polynomial f(x) is denoted ⟨f(x)⟩. An ideal of Z generated by an integer n is denoted nZ.

Polynomials modulo f(x) define the quotient ring F₂[x]/⟨f(x)⟩, and integers modulo n define the quotient ring Z_n. Such quotient rings are isomorphic to some (finite) fields if f(x) ∈ F₂[x] and n ∈ Z are irreducible (for n, this means prime). We may also denote modulo reduction as (mod I), where I is an ideal.

Given two finite sets A and B, and the set F of all functions mapping A to B, a random function f : A → B is one which is randomly sampled from F. If A = B and if f is bijective, then f is a random permutation. When no distribution is mentioned, it is understood that the distribution of random samples is uniform, either with or without replacement. This does not really work for infinite sets, but F (as defined here) is finite in size, with a total of |B|^|A| functions to choose from. The set of all permutations on B coincides with the symmetric group S_|B|, which has order |B|!. Both the sizes |A| and |B| will be very large to begin with, so whether sampling from the set of functions (or permutations) happens with or without replacement is not a major concern.

The set of bit sequences of length n is denoted {0, 1}ⁿ. Concatenation of two bit sequences a ∈ {0, 1}^m and b ∈ {0, 1}ⁿ is denoted a || b and the result is a member of {0, 1}^(m+n). For example, a = 1011 and b = 0110110 can be concatenated to give a || b = 10110110110. For simplicity, n consecutive 0-bits are sometimes denoted 0ⁿ; it should be clear from the context whether 0ⁿ denotes concatenation or exponentiation of 0. An n-bit sequence may also be called an n-bit block. Bit sequences can be converted from (and to) polynomials in F₂[x] and non-negative integers. Addition in F₂[x] corresponds to the XOR operation on bit sequences (addition of bits modulo 2, also denoted ⊕).

Definition 2. The set of all keys of a cryptosystem is denoted 𝒦. For 128-bit AES, 𝒦 = {0, 1}¹²⁸. For ChaCha20, 𝒦 = {0, 1}²⁵⁶.

Definition 3. A nonce is a number used only once. The set of all nonces is denoted 𝒩, and is equal to {0, 1}⁹⁶.

Remark 1. A possible source of confusion is that a nonce is both a number and a bit sequence at the same time. The corresponding number is the one whose binary expansion is determined by the bit sequence in 𝒩 (the set of all nonces). It helps to think of them as either numbers or 96-bit sequences depending on the context. It does not matter how nonces are chosen, as long as they are never re-used. Nothing really stops Alice or Bob from misusing nonces (i.e. re-using them), but it is not in their best interest to do so.

A major assumption is that keys are derived from passwords, and that passwords are chosen from a relatively small set (possibly in the billions or trillions). We therefore do not care about all possible keys. This justifies the following definition.

Definition 4. The set of all candidate (or potential) keys is denoted 𝒫. For both cryptosystems, |𝒫| should be a manageable size, so it should be significantly smaller than the number |𝒦| of all possible keys.

Definition 5. A message authentication code (MAC) is a function, denoted MAC_K(N, C, A^[d]) in the most general form. Using the secret key K shared by Alice and Bob, it takes a nonce N, a ciphertext C, and some associated data A^[d] as input. It produces a tag T which is used by Bob to verify the integrity of Alice’s ciphertext C.

Remark 2. The exponent in A^[d] means nothing in particular; it is just a symbol. Other authors abbreviate Associated Data as AD. The notation AD looks too much like multiplication, and A^[d] is never raised to any power here.


Definition 6. A lattice is a subset of Z^d spanned by basis vectors b_i ∈ Z^d (i = 1, . . . , n ≤ d).

Definition 7. A random variable X coming from the geometric distribution is either denoted X ∈ Geom^t(p) or X ∈ Geom^f(p), where Geom^t(p) is the trial-based geometric distribution and Geom^f(p) is the failure-based geometric distribution. The probability mass function for X ∈ Geom^t(p) is P[X = k] = (1 − p)^(k−1) p, while for Y ∈ Geom^f(p) it is P[Y = k] = (1 − p)^k p.

Remark 3. The geometric distribution has two versions, namely trial-based and failure-based as stated. The domain for X is all positive integers, while the domain for Y is all non-negative integers. The two versions are related by the equivalence Geom^t(p) ∋ X = 1 + X′ ⇐⇒ X′ ∈ Geom^f(p).

A geometric variable X ∈ Geom^t(p) models the number of “trials” taken for a series of independent experiments to yield one “success”. A geometric variable Y ∈ Geom^f(p) models the number of “failures” taken for a series of independent experiments to yield one “success”. All such experiments yield “success” with equal probability. In other words, some arbitrary kind of experiment (having failure probability 1−p and success probability p) is repeated over and over until success is achieved. What the words “failure” and “success” mean is ambiguous, and they may be swapped depending on context.

Example 1. A coin toss is defined as a failure if we get “tails”, and success if we get “heads”. The variable X ∈ Geom^t(1/2) then represents the number of coin tosses made until we get “heads”. The variable Y ∈ Geom^f(1/2) represents the number of “tails” made until we get “heads”. The number of trials is clearly 1 greater than the number of failures, because the last non-failure is still counted as a trial.
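The relation between the two versions of the geometric distribution can be checked numerically. The following is a minimal sketch in Python; the function names pmf_trials and pmf_failures are ours and only mirror the two probability mass functions of Definition 7, using exact rational arithmetic.

```python
from fractions import Fraction

def pmf_trials(k, p):
    """P[X = k] for the trial-based version Geom^t(p), k = 1, 2, ..."""
    return (1 - p) ** (k - 1) * p

def pmf_failures(k, p):
    """P[Y = k] for the failure-based version Geom^f(p), k = 0, 1, ..."""
    return (1 - p) ** k * p

p = Fraction(1, 2)  # the fair coin of Example 1

# X = 1 + Y: shifting the failure-based pmf by one gives the trial-based pmf.
shift_ok = all(pmf_trials(k + 1, p) == pmf_failures(k, p) for k in range(50))

# Partial sum of the trial-based pmf over k = 1, ..., 50.
total = sum(pmf_trials(k, p) for k in range(1, 51))
```

For the fair coin, the partial sum over the first 50 values falls short of 1 by exactly (1/2)⁵⁰, which is the probability of seeing 50 “tails” in a row.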

We also assume some familiarity with computer terminology, for example counter and endianness. A counter is just an integer that counts something. An integer can be written in a binary and hexadecimal format in several ways, but the most common ways are big-endian and little-endian. For example, the number 2⁸ + 2² = 260 would occupy 2 bytes. In little-endian format, “260” would be denoted “04 01” in hexadecimal and “00000100 00000001” in binary.
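The byte orders can be reproduced with Python's built-in integer conversion. This is only an illustration of the example above; the variable names are ours.

```python
n = 2**8 + 2**2  # 260

little = n.to_bytes(2, "little")  # least significant byte first
big = n.to_bytes(2, "big")        # most significant byte first

little_hex = little.hex(" ")
big_hex = big.hex(" ")
little_bin = " ".join(f"{byte:08b}" for byte in little)

# Converting back requires knowing which byte order was used.
roundtrip = int.from_bytes(little, "little")
misread = int.from_bytes(little, "big")
```

Reading the little-endian bytes with the wrong byte order yields 1025 instead of 260, which is why a specification has to fix the convention.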

1.4 Stream ciphers

The one-time pad is a well known encryption scheme used for encrypting single messages only once, by XORing plaintext with a one-time key. The key consists of randomly generated bits, and is as long as the plaintext. The one-time pad achieves perfect secrecy, assuming the key bits are independently random Bernoulli variables (with probability 1/2).

Example 2. Suppose we encode letters using the ASCII binary format. Then we can encrypt the word “hello” with a one-time pad as follows.

plaintext (“hello”):   01101000 01100101 01101100 01101100 01101111
one-time key:        ⊕ 11101100 11101000 10101011 01101011 10011001
ciphertext:          = 10000100 10001101 11000111 00000111 11110110


The key was randomly generated. Note the regularity of the upper (leftmost) bits of the plaintext bytes. With many more letters, it would be relatively easy to detect from the plaintext that something like ASCII was being used. This would be impossible to detect from just the ciphertext; it is in fact indistinguishable from random noise.
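Example 2 can be reproduced in a few lines of Python. The sketch below hard-codes the one-time key from the example; the helper name xor_bytes is ours.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equally long byte strings."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))

plaintext = b"hello"  # ASCII encoding, as in Example 2
key = bytes([0b11101100, 0b11101000, 0b10101011, 0b01101011, 0b10011001])

ciphertext = xor_bytes(plaintext, key)
recovered = xor_bytes(ciphertext, key)  # XORing twice undoes the encryption

ciphertext_bits = " ".join(f"{byte:08b}" for byte in ciphertext)
```

The same function serves for both encryption and decryption, a property that carries over to the stream ciphers discussed next.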

Stream ciphers try to mimic the one-time pad like this, by generating something resembling a one-time key. They generate so-called key streams, which are pseudorandom sequences of bits. Parts of the key stream are then XORed with plaintext to obtain ciphertext. Because the method of encryption is so simple (i.e. just XORing the plaintext), the same sequence of bits from the key stream must never be used more than once. Nonces help ensure this requirement. XOR as an encryption procedure also makes it very easy to manipulate the underlying plaintext. MACs help verify that such manipulation does not occur. A common property of stream ciphers is that the decryption function is the same as the encryption function, since XOR just undoes the encryption if applied twice. We may however differentiate between encryption and decryption functions for the sake of clarity. A stream cipher is broken if its key streams can be predicted; it is academically broken if its key streams are somehow distinguishable from a truly random bit sequence.

1.4.1 AES-CTR

AES-CTR denotes AES in counter mode [LR00]. Rijndael, the block cipher defining AES, can be thought of as a collection of |𝒦| permutations on {0, 1}¹²⁸, one for each possible key. Among other requirements, such permutations should appear to be randomly sampled. Choosing a key is equivalent to choosing one such permutation. This is rarely a useful way to think of Rijndael, simply because |{0, 1}¹²⁸| = 2¹²⁸ is so large, and because the symmetric group S_{2^128} is even larger (with a whopping (2¹²⁸)! elements, 2¹²⁸ of which correspond to Rijndael). However, we will think of Rijndael as a randomly sampled permutation indexed by a key. For our purposes, further details of the cipher itself are not necessary to know.

Suppose we want to encrypt a plaintext P consisting of some number of bits. We first select a nonce N ∈ 𝒩 (a 96-bit sequence) and initialize a 32-bit counter c to some constant value. To obtain the first 128 bits of the key stream, we encrypt a copy of the source block N || c. To obtain the next 128 bits of the key stream, we increment c by 1 and encrypt another copy of the source block N || c. With |P| denoting the number of bits of P, we encrypt a total of ⌈|P|/128⌉ blocks.
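The counter-mode construction can be sketched as follows. Since the Python standard library does not include AES, the block function below is a stand-in (a keyed BLAKE2b hash truncated to 128 bits), which is neither AES nor a permutation; it only illustrates how the source blocks N || c are formed and consumed. The initial counter value c0 and the big-endian counter encoding are assumptions made for the illustration.

```python
import hashlib

def block_fn(key: bytes, block: bytes) -> bytes:
    # Stand-in for AES_K (NOT AES): keyed hash with a 128-bit output.
    return hashlib.blake2b(block, key=key, digest_size=16).digest()

def ctr_keystream(key: bytes, nonce: bytes, nbytes: int, c0: int = 1) -> bytes:
    """Key stream from the source blocks N || c, N || (c+1), ..."""
    assert len(nonce) == 12           # 96-bit nonce
    nblocks = -(-nbytes // 16)        # ceil(nbytes / 16) blocks of 128 bits
    blocks = (
        block_fn(key, nonce + (c0 + i).to_bytes(4, "big"))  # assumed encoding
        for i in range(nblocks)
    )
    return b"".join(blocks)[:nbytes]

def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts by XORing data with the key stream."""
    stream = ctr_keystream(key, nonce, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))

demo_key = bytes(range(16))
demo_nonce = bytes(12)
demo_plain = b"forty bytes of plaintext, more or less.."
demo_cipher = ctr_xor(demo_key, demo_nonce, demo_plain)
demo_back = ctr_xor(demo_key, demo_nonce, demo_cipher)
```

The 40-byte demo message needs ⌈320/128⌉ = 3 blocks of key stream, of which the last is only partially used.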

1.4.2 ChaCha20

ChaCha20 [Ber08a] [Ber08b] makes use of what we can call the “ChaCha function”, which can be thought of as one single function on {0, 1}⁵¹². Instead of encrypting a 128-bit block N || c as for AES-CTR, data is arranged into a “ChaCha block” as follows

\[
\begin{pmatrix}
\bar{C}_1 & \bar{C}_2 & \bar{C}_3 & \bar{C}_4 \\
K_1 & K_2 & K_3 & K_4 \\
K_5 & K_6 & K_7 & K_8 \\
N_1 & N_2 & N_3 & c
\end{pmatrix}
\in \{0, 1\}^{512}
\]

and is then transformed by the ChaCha function. Components of this matrix are 32-bit words (i.e. 32-bit sequences), where C̄_i denote constants, K_i denote parts of the key, N_i denote parts of the nonce, and c is a 32-bit counter initialized to some constant value. Among other requirements, the ChaCha function should appear to be randomly sampled.² The security of ChaCha20 comes from the difficulty of inverting the ChaCha function without knowledge of K_i. As far as we are concerned, all ChaCha20 does is generate pseudorandom bits by transforming 512-bit blocks of the given form. For our purposes, further details of the cipher are not necessary to know.

Suppose we want to encrypt a plaintext P consisting of some number of bits. We first select a nonce N ∈ 𝒩 (a 96-bit sequence) and initialize a 32-bit counter c to some constant value. To obtain the first 512 bits of the key stream, we transform a copy of the given 512-bit ChaCha block. To obtain the next 512 bits of the key stream, we increment c by 1 and transform another copy of the ChaCha block. With |P| denoting the number of bits of P, we transform a total of ⌈|P|/512⌉ blocks.

1.5 Authenticated encryption (AEAD)

This section gives an overview of schemes performing authenticated encryption with associated data (AEAD). Encryption is all about hiding data from the outside world. Information security refers to how well such data (i.e. plaintext) is hidden. This is especially important in internet communication, where we usually have no control over who gets to read transmitted data. Ciphers have the job of ensuring information security. Authentication is also necessary, to ensure the integrity of transmitted ciphertext.

Suppose Alice wants to send some plaintext P to Bob. She first selects a nonce N. Then she encrypts the plaintext into C = E_K(N, P). Then she creates a tag T = MAC_K(N, C, A^[d]). She finally sends the message M = {N, C, T} to Bob. Bob receives M′ = {N′, C′, T′} (in the normal case M′ = M). He computes τ = MAC_K(N′, C′, A^[d]) and compares it with T′. If τ = T′, he decrypts C′ into P′ = D_K(N′, C′), confident that it was actually Alice who sent the message. He can also be confident that C′ has not been tampered with along the way. But if τ ≠ T′, he assumes the worst (Alice being an impostor), and either responds with an error message or nothing, depending on the protocol. We say that Bob finds M′ to be authentic if τ = T′, and that he authenticates M′ by comparing τ with T′.

In describing the attacks, we may play the role of Eve, and Eve is trying to obtain Bob’s (and Alice’s) secret key. We mainly target Bob for the sake of simplicity, but the attack is against both parties because they share the same secret key. We do not have Bob’s encryption (or decryption) function available, so in order to test keys, we have to send messages to Bob and ask whether they result in errors (i.e. whether τ ≠ T′). In the simplest form, we send one message for each key, until Bob stops responding with error. Such errors can also be interpreted as asking Bob whether messages are authentic to him. If he responds with error, then the message was inauthentic. If he does not respond, then the message was authentic.
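Bob's authentication step can be sketched with a stand-in MAC. The code below uses HMAC-SHA256 truncated to 128 bits in place of the MACs studied in this thesis, so it illustrates only the tag comparison and not GHASH or Poly1305. The function names and the length-prefixed input encoding are assumptions made for the illustration.

```python
import hashlib
import hmac

def mac(key: bytes, nonce: bytes, ciphertext: bytes, ad: bytes) -> bytes:
    # Stand-in for MAC_K(N, C, A^[d]): HMAC-SHA256 truncated to 128 bits.
    # Each field is length-prefixed so that the combined input is unambiguous.
    data = b"".join(len(x).to_bytes(8, "big") + x for x in (nonce, ciphertext, ad))
    return hmac.new(key, data, hashlib.sha256).digest()[:16]

def bob_finds_authentic(key: bytes, message, ad: bytes) -> bool:
    """Bob recomputes tau from (N', C') and compares it with T'."""
    nonce, ciphertext, tag = message
    tau = mac(key, nonce, ciphertext, ad)
    return hmac.compare_digest(tau, tag)

shared_key = b"K" * 16
ad = b"2022-04-01"
nonce, ciphertext = bytes(12), b"some ciphertext"

sent = (nonce, ciphertext, mac(shared_key, nonce, ciphertext, ad))
tampered = (nonce, b"Some ciphertext", sent[2])

accepts_sent = bob_finds_authentic(shared_key, sent, ad)
accepts_tampered = bob_finds_authentic(shared_key, tampered, ad)
accepts_other_context = bob_finds_authentic(shared_key, sent, b"2022-04-02")
```

Changing either the ciphertext or the associated data makes the recomputed τ differ from the received tag, so Bob rejects the message in both cases.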

² An exception is the all-zero block corresponding to 0⁵¹², which is a fixed point with respect to the ChaCha function. This fixed point is avoided by the constants C̄_i.


1.6 Message forgery attacks (MFA)

The reader might be confused why it says M′ = {N′, C′, T′} instead of M = {N, C, T} in the previous section. If everything goes as planned, there should be no need to denote it M′ instead of M. But there is a risk that Alice is an impostor (whom we assume does not have Bob’s key). This impostor could have forged the message M′, possibly with the intent to spam Bob with garbage information, or to impersonate Alice for more sinister reasons. We refer to such an attack against Bob as a message forgery attack (MFA). The main purpose of the MAC is to protect against MFAs, so that Bob can be as sure as possible that Alice was the sender of M′ (and that M′ is in fact the same as M). Associated data A^[d] also helps against a type of this attack, where Eve tries to swap two messages from different contexts. This is the only purpose of A^[d], although its usefulness may not be clear at first, so it deserves the following example.

Example 3 (associated data). Alice and Bob define the context as simply the current date, and they define A^[d] depending on this context. The date is April 1 2022, when Alice sends a joke message M_1 to Bob. Eve somehow has the ability to delay messages to Bob until the next day. The date is now April 2 2022, when Alice sends a very important message M_2 to Bob, now using some other associated data A^[d]. Eve manages to swap the contents of M_1 and M_2, and she then proceeds to send M_2 to Bob. Without associated data, Bob would decrypt and authenticate M_2, not knowing that the received message is an April Fools’ joke from yesterday. But with associated data, Bob can detect that there is something wrong with the contents of M_2.

A message forgery attack is as difficult to perform as creating the tag T′ without Bob’s key K, given some nonce N′ and ciphertext C′. The main requirement for the MAC is therefore that

T′ = MAC_K(N′, C′, A^[d])

should be practically impossible for Eve to evaluate. This gives a sort of security of integrity for Bob, which is higher the harder it is to evaluate T′ without K. We sometimes refer to this security as tag security. This is separate from information security given by the cipher. The best possible tag security is determined by the so-called birthday bound,³ which for a tag space {0, 1}ⁿ is equal to √(2ⁿ) = 2^(n/2). This bound is named after the birthday paradox, which is about the probability that some pair among randomly selected people have the same birthday. It is not really a paradox, and is only named as such because people tend to be misled by the problem formulation.

In order to find two messages M and M′ producing the same tag T, one must search among 2^(n/2) messages before expecting to succeed (this assumes that the MAC is as good as possible). For our purposes, the tag space will be {0, 1}¹²⁸, which leads to a maximum tag security corresponding to 2⁶⁴. Exceeding 2⁶⁴ ≈ 1.84 · 10¹⁹ messages implies that future messages M′ may produce the same tag T as some past message M. In other words, an MFA is within the realms of possibility after 2⁶⁴ messages, no matter how well designed the MAC is. Message forgery is not the main subject of this thesis, but it is relevant in Section 3.
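The numbers above can be checked with a short calculation. The sketch below uses the standard approximation 1 − exp(−q(q − 1)/2^(n+1)) for the probability that q uniformly random n-bit tags contain a repeated value; this approximation is a textbook estimate and is not derived in this thesis.

```python
import math

def collision_probability(q: int, n: int) -> float:
    """Approximate probability of a repeat among q uniform n-bit tags."""
    return -math.expm1(-q * (q - 1) / 2 ** (n + 1))

birthday_bound = 2 ** (128 // 2)  # 2^64 for 128-bit tags

p_at_bound = collision_probability(birthday_bound, 128)
p_far_below = collision_probability(2 ** 32, 128)
```

At the birthday bound the estimate is roughly 0.39, while for 2³² messages it is negligible, which matches the fuzzy nature of the bound mentioned in the footnote.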

³ See [HPS08, pp. 228–231] for a derivation of this bound. The birthday bound is also more fuzzy than an exact value, but the square root is a good asymptotic bound.


1.7 Brute force attacks (BFA)

The basis of our context is a form of brute force attack, hinted at towards the end of Section 1.5. Eve can create a message M′ = {N′, C′, T′}, where the tag is simply evaluated as follows

T′ = MAC_K′(N′, C′, A^[d]).

We assume that Eve knows the associated data A^[d] used by Alice and Bob. The ciphertext C′ could be anything as long as Bob allows Eve to spam him with meaningless plaintext. Most importantly, K′ is a key chosen by Eve, so it is trivial for her to evaluate T′. Suppose she sends M′ to Bob, after which she asks whether M′ is authentic. If Bob answers with “no, it is not authentic” (by far the most common scenario), then Eve knows that K′ is not Bob’s key. But if he answers with “yes, it is authentic”, then Eve can be very sure that she has found Bob’s key (this assumes the MAC is not of poor quality). Eve can search for Bob’s key by repeating this process for various keys. This is a brute force attack targeting Bob’s key. The following figure crudely illustrates such an attack, where Bob sends error messages instead of answering on authenticity.

[Figure 1 diagram: Eve sends Bob three messages of the form M = {N, C, T}, each built with a different candidate key (10101100..., 00101111..., 11100101...). Bob's responses shown are "Error!" and "...".]

Figure 1: Using various keys, Eve sends authenticated messages to Bob until he stops responding with error. The last of the three instances is where Eve (most likely) found Bob’s key, because Bob did not respond with error. We assume that their connection is uninterruptible, because Bob’s lack of response may otherwise be due to some technical problem. An easy solution for Bob is to simply not respond with error, but in practice there may still be a bug in the communication system that reveals whether Bob finds messages authentic, so this attack applies not only to this context.
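The brute force attack can be simulated in a few lines. In the sketch below, the key derivation function derive_key (an unsalted, truncated SHA-256 hash), the HMAC-based stand-in MAC, and the class Bob are all hypothetical choices made for the illustration; they are not the functions used by AES-GCM or ChaCha20-Poly1305.

```python
import hashlib
import hmac

def derive_key(password: str) -> bytes:
    # Hypothetical f: takes only a password as input, as assumed in Section 1.1.
    return hashlib.sha256(password.encode()).digest()[:16]

def mac(key: bytes, nonce: bytes, ciphertext: bytes, ad: bytes) -> bytes:
    # Stand-in MAC (HMAC-SHA256 truncated to 128 bits), not GHASH or Poly1305.
    data = b"".join(len(x).to_bytes(8, "big") + x for x in (nonce, ciphertext, ad))
    return hmac.new(key, data, hashlib.sha256).digest()[:16]

class Bob:
    def __init__(self, key: bytes, ad: bytes):
        self._key, self._ad = key, ad

    def responds_with_error(self, message) -> bool:
        nonce, ciphertext, tag = message
        tau = mac(self._key, nonce, ciphertext, self._ad)
        return not hmac.compare_digest(tau, tag)

def brute_force(bob: Bob, candidates, ad: bytes):
    """Eve sends one message per candidate key until Bob stays silent."""
    nonce, ciphertext = bytes(12), b"meaningless"
    queries = 0
    for password in candidates:
        queries += 1
        tag = mac(derive_key(password), nonce, ciphertext, ad)
        if not bob.responds_with_error((nonce, ciphertext, tag)):
            return password, queries
    return None, queries

ad = b"context"
leaked = ["123456", "password", "qwerty", "hunter2", "letmein"]
bob = Bob(derive_key("hunter2"), ad)
found, queries = brute_force(bob, leaked, ad)
```

Each candidate costs Eve one message to Bob, so the number of queries grows linearly with the position of the correct password in the list. Reducing this cost is the motivation for testing several keys per message.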

1.8 Partitioning oracle attacks (POA)

As the original authors describe it [LGR20], a partitioning oracle attack (POA) is an attack that utilizes a so-called partitioning oracle. Upon receiving a message M = {N, C, T}, such an oracle answers whether Bob’s key belongs to some known (or chosen) subset of keys. In other words, the oracle answers with either “yes, Bob’s secret key is in your given subset of keys” or “no”. Such a subset is denoted 𝕂 in the next sections. The authors described a partitioning oracle attack against AES-GCM and ChaCha20-Poly1305 using a VPN server as the oracle. AES-GCM and ChaCha20-Poly1305 are two AEAD schemes.

A POA can be realized as a way for Eve to test several keys at once, with just one message. Here, it is best to think of it as an “enhanced” brute force attack. A POA in a non-theoretical form is done by constructing a ciphertext C, along with a nonce N and a tag T, such that the following holds

\[
\begin{aligned}
T &= \mathrm{MAC}_{K_1}(N, C, A^{[d]}) \\
T &= \mathrm{MAC}_{K_2}(N, C, A^{[d]}) \\
T &= \mathrm{MAC}_{K_3}(N, C, A^{[d]}) \\
  &\;\;\vdots \\
T &= \mathrm{MAC}_{K_m}(N, C, A^{[d]})
\end{aligned}
\]

for a set 𝕂 = {K_i ∈ 𝒦 | 1 ≤ i ≤ m} of keys, where 𝒦 is the set of all keys. If Bob’s secret key is in the set 𝕂, he will find the message M = {N, C, T} to be authentic. But if his secret key is not in 𝕂, it is extremely unlikely that he will find M to be authentic.

If Bob also somehow reveals whether M is authentic or not, he essentially acts as a partitioning oracle against himself, telling us which of the two sets in the partition {𝕂, 𝒦 \ 𝕂} his secret key is in, where 𝒦 is the set of all keys and 𝕂 is the tested subset. We refer to the ciphertext C as a splitting ciphertext because of how it splits the key space 𝒦 into 𝕂 and 𝒦 \ 𝕂, with respect to the oracle. Finding such a ciphertext is called the splitting ciphertext problem. Len et al. [LGR20] showed that AES-GCM and ChaCha20-Poly1305 are relatively weak against POAs because the splitting ciphertext problem can be mostly reduced to solving a linear system of equations. Such reduction is possible because the MACs use polynomial evaluation over a finite field.


2 Analysis of POA

This section gives an overview, along with some analysis, of partitioning oracle attacks against AES-GCM and ChaCha20-Poly1305. The attacks very much depend on solving linear systems, so the optimality of linear system dimensions is also analyzed. Section 2.3 applies to both the standard versions of the two AEAD-schemes, as well as the slightly modified variants described in Section 3.

To make analysis slightly easier, we assume that A^[d] is constant for the whole duration of an attack. The specific POA that we will consider can be described as follows.

1. Partition the set 𝒫 of key candidates (a subset of the set 𝒦 of all keys) into subsets, each denoted 𝕂.

2. For each set 𝕂 of keys,

2.1. Find a splitting ciphertext C with respect to 𝕂.

2.2. If C is not found, choose a new nonce N and go back to step 2.1.

2.3. If C is found, send the message {N, C, T} to Bob and ask him whether the message is authentic.

2.4. If Bob answers “no”, we know that Bob’s key is not in 𝕂. Choose another set 𝕂 from the partition of 𝒫 and go back to step 2.1.

2.5. If Bob answers “yes”, we can be very sure that Bob’s key is in 𝕂. If |𝕂| > 1, reduce 𝒫 to 𝕂 and go back to step 1. If |𝕂| = 1, we have found Bob’s key.

Depending on how the candidate set is partitioned into key sets, this attack can have a divide-and-conquer style to it. But we will not analyze the whole attack, because that requires a much more elaborate context. The most interesting steps for us are 2.1–2.2. These steps are the computationally difficult part. The act of asking Bob about authenticity can be replaced with a more sophisticated procedure of detecting authenticity, but the abstract notion of asking Bob is sufficient for us. Bob essentially acts as an oracle against himself, which is appropriate given our brute force context.
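The outer loop of the attack can be sketched with the oracle abstracted away. In the code below, oracle(subset) stands for steps 2.1–2.3 as a whole (finding a splitting ciphertext and asking Bob), and we assume it always succeeds and answers truthfully. The function names, the subset size m, and the halving rule used when refining a subset are hypothetical choices, since the thesis leaves the partitioning strategy open.

```python
def partition(keys, m):
    """Step 1: split the candidate list into subsets of at most m keys."""
    return [keys[i:i + m] for i in range(0, len(keys), m)]

def poa_search(candidates, oracle, m):
    """Return (key, queries); key is None if no candidate is Bob's key."""
    pool = list(candidates)
    queries = 0
    while pool:
        for subset in partition(pool, m):
            queries += 1                  # one message to Bob per subset
            if oracle(subset):            # step 2.5: Bob answers "yes"
                if len(subset) == 1:
                    return subset[0], queries
                pool = subset             # reduce the candidates to the subset
                m = max(1, len(subset) // 2)  # hypothetical refinement rule
                break
        else:
            return None, queries          # every subset answered "no"
    return None, queries

secret = 777
candidates = list(range(1000))

def oracle(subset):
    return secret in subset

found, queries = poa_search(candidates, oracle, m=100)
```

With 1000 candidates and an initial subset size of 100, the search needs far fewer than 1000 queries, which illustrates the divide-and-conquer style mentioned above. The cost hidden inside each oracle call is the subject of the following sections.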

2.1 AES-GCM

The following algorithm describes the MAC of AES-GCM. An overview of AES-GCM can be found in Appendix Section 5.4. Other variants of AES-GCM may merge the encryption and authentication steps (probably to increase speed) but we present a decoupled version here for clarity. Plaintext, ciphertext, and associated data are simplified to be multiples of 128 bits. It may look different in other documents and specifications, but it should give a sufficient overview. The quantities P_j, C_j, A^[d]_j are interpreted as blocks of 128 bits at position j. A straightforward way to convert 128-bit blocks to polynomials (and vice versa) is implicit. All computations happen in the field F_{2^128} defined modulo x¹²⁸ + x⁷ + x² + x + 1 ∈ F₂[x], except in exponents where regular integer operations take place. Field addition is emphasized in the algorithm with the ⊕ symbol, to make it clearer. The function encode_b(·) outputs a b-bit representation of the input. In our measurements, encode_b(·) converts integers to bytes arranged in little-endian order (this may also differ from other specifications).

Algorithm 1 GHASH_K(N, C, A^[d])

    ℓ ← |C| (the number of 128-bit blocks of C)
    a ← |A^[d]| (the number of 128-bit blocks of A^[d])
    H ← AES_K(0¹²⁸)
    R ← AES_K(N || encode_32(1))
    L ← encode_64(|A^[d]|) || encode_64(|C|)
    T ← R ⊕ LH
    for j from 1 to ℓ do
        T ← T ⊕ C_{ℓ+1−j} H^{1+j}
    end for
    for j from 1 to a do
        T ← T ⊕ A^[d]_{a+1−j} H^{ℓ+1+j}
    end for
    Output: T
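The polynomial evaluation in Algorithm 1 can be sketched in Python as follows. Two simplifications are assumed: the values H, R and L are passed in as integers instead of being computed with AES, and a 128-bit block is identified with a polynomial by letting bit i of the integer be the coefficient of x^i. The latter is one possible "straightforward" conversion and differs from the bit ordering used in the official GCM specification. The function names are ours.

```python
MODULUS = (1 << 128) | 0x87  # x^128 + x^7 + x^2 + x + 1

def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of F_{2^128} (inputs below 2^128)."""
    result = 0
    while b:
        if b & 1:
            result ^= a        # field addition is XOR
        b >>= 1
        a <<= 1                # multiply a by x
        if a >> 128:
            a ^= MODULUS       # reduce modulo the field polynomial
    return result

def gf_pow(a: int, e: int) -> int:
    """Square-and-multiply exponentiation in F_{2^128}."""
    result = 1
    while e:
        if e & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        e >>= 1
    return result

def ghash_tag(H: int, R: int, L: int, C: list, A: list) -> int:
    """Tag of Algorithm 1; C[0] is block C_1 and A[0] is block A^[d]_1."""
    ell, a = len(C), len(A)
    T = R ^ gf_mul(L, H)
    for j in range(1, ell + 1):
        T ^= gf_mul(C[ell - j], gf_pow(H, 1 + j))        # C_{ell+1-j} H^{1+j}
    for j in range(1, a + 1):
        T ^= gf_mul(A[a - j], gf_pow(H, ell + 1 + j))    # A_{a+1-j} H^{ell+1+j}
    return T

# x^127 times x equals x^128, which reduces to x^7 + x^2 + x + 1.
wraparound = gf_mul(1 << 127, 2)
```

For fixed H, R, L and A^[d], the tag is an affine function of the ciphertext blocks: the XOR of the tags of two ciphertexts of equal length equals the keyless polynomial part evaluated at the XOR of the ciphertexts. This linearity is what the attack below exploits.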

The attack to be presented originally comes from Len et al. [LGR20, pp. 5–7]. They went further in their attack against AES-GCM by obtaining a Vandermonde matrix, which can give an O(n²) time complexity in solving n × n linear systems [LGR20, p. 6] instead of cubic complexity (an n × n linear system is generally solved in O(n³) time). But we mainly include a variant of their attack here to give some context. As described before, we want to construct a ciphertext C (along with tag T and nonce N) for which T = GHASH_{K_i}(N, C, A^[d]) holds for every key K_i in some set 𝕂 of candidate keys. By sending the resulting message {N, C, T} to Bob and asking whether it is authentic to him, we essentially test all keys in 𝕂 at once. We do this until Bob answers “yes, it is authentic”, in which case we repeat the same process for subsets of 𝕂. In case we fail to construct C, we choose a new N and try again.

The most straightforward way to construct $C$ is to arrange a linear system of equations over $\mathbb{F}_{2^{128}}$, where the tag and ciphertext are unknowns. This works because GHASH is purely based on polynomial evaluation over $\mathbb{F}_{2^{128}}$. In creating a tag, a certain polynomial is evaluated at a key-dependent point $H \in \mathbb{F}_{2^{128}}$. For keys $K_i$ ($i = 1, \ldots, m$), evaluation of GHASH looks as follows:

$$\mathrm{GHASH}_{K_i}(N, C, A^{[d]}) = T = R_i + L H_i + \sum_{j=1}^{\ell} C_{\ell+1-j} H_i^{1+j} + \sum_{j=1}^{a} A^{[d]}_{a+1-j} H_i^{\ell+1+j}$$

from which we can obtain

$$T + \sum_{j=1}^{\ell} C_{\ell+1-j} H_i^{1+j} = R_i + L H_i + \sum_{j=1}^{a} A^{[d]}_{a+1-j} H_i^{\ell+1+j} \tag{1}$$

by re-arranging terms. Plus and minus signs are the same for elements of $\mathbb{F}_{2^{128}}$. The unknown variables are on the left-hand side. As stated, we assume that $A^{[d]}$ is constant for the whole duration of the attack. The term $L$ is also constant because the lengths of the ciphertext and associated data are the same for every $i = 1, \ldots, m$. This gives

$$T + \sum_{j=1}^{\ell} C_{\ell+1-j} H_i^{1+j} = B_i.$$

Here, the term $B_i$ can be modelled as uniformly sampled from $\mathbb{F}_{2^{128}}$, with one small constraint: for a fixed $K_i$, it is a sample without replacement with respect to the underlying nonce $N$. This is due to the following three reasons.

• Nonces are never re-used for constructing $R_i = \mathrm{AES}_{K_i}(N \,\|\, \mathrm{encode}_{32}(1))$.

• The AES cipher can be modelled as a random permutation.

• The term $R_i$ is the only term on the right-hand side of (1) that depends on the nonce (see the GHASH algorithm).

We solve the following linear system

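$$Mc = \begin{pmatrix} 1 & H_1^2 & H_1^3 & \cdots & H_1^{\ell+1} \\ 1 & H_2^2 & H_2^3 & \cdots & H_2^{\ell+1} \\ 1 & H_3^2 & H_3^3 & \cdots & H_3^{\ell+1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & H_m^2 & H_m^3 & \cdots & H_m^{\ell+1} \end{pmatrix} \begin{pmatrix} T \\ C_\ell \\ C_{\ell-1} \\ \vdots \\ C_1 \end{pmatrix} = \begin{pmatrix} B_1 \\ B_2 \\ B_3 \\ \vdots \\ B_m \end{pmatrix} = b. \tag{2}$$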

Once we have solved this, we have essentially found a splitting ciphertext and a corresponding tag.
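To make the square case of system (2) concrete, the following Python sketch builds $M$ for random stand-in values of $H_i$ and $B_i$ and solves it by plain Gauss-Jordan elimination over the field. It is a sketch under assumptions: the $H_i$ and $B_i$ are random integers rather than AES outputs, bit $i$ of an integer is the coefficient of $x^i$, the helper names are hypothetical, and the cubic-time elimination is used instead of the Vandermonde-based method of [LGR20]. The measurements in this section were made with NTL, not with this code.

```python
import random

MOD_POLY = (1 << 128) | (1 << 7) | (1 << 2) | (1 << 1) | 1  # x^128 + x^7 + x^2 + x + 1


def gf_mul(a, b):
    """Multiply in GF(2^128); addition in this field is XOR."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> 128:
            a ^= MOD_POLY
    return result


def gf_inv(a):
    """Inverse of a nonzero element, computed as a^(2^128 - 2)."""
    result, e = 1, (1 << 128) - 2
    while e:
        if e & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        e >>= 1
    return result


def build_row(H, ell):
    """Row (1, H^2, H^3, ..., H^(ell+1)) of the matrix M in (2)."""
    row, power = [1], gf_mul(H, H)
    for _ in range(ell):
        row.append(power)
        power = gf_mul(power, H)
    return row


def solve(M, b):
    """Gauss-Jordan elimination for a square system; returns None if M is singular."""
    n = len(M)
    aug = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            return None               # in the attack: pick a new nonce and retry
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = gf_inv(aug[col][col])
        aug[col] = [gf_mul(inv, v) for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [v ^ gf_mul(factor, w) for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


rng = random.Random(1)                # fixed seed, illustrative values only
ell = 3
m = ell + 1                           # square case m = n = ell + 1
Hs = [rng.getrandbits(128) for _ in range(m)]
Bs = [rng.getrandbits(128) for _ in range(m)]
M = [build_row(H, ell) for H in Hs]
solution = solve(M, Bs)               # (T, C_ell, C_{ell-1}, ..., C_1)
```

The first component of `solution` plays the role of the tag $T$ and the remaining ones are the ciphertext blocks in reverse order, matching the unknown vector in (2).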

Ideally, we do not want the ciphertext to be large, because large messages require more bandwidth to Bob. This means that $n = \ell + 1$ should be as small as possible. Moreover, we want to test as many keys $K_i$ as possible, so $m$ should be as large as possible. This forces the matrix $M$ to have at least as many rows as columns. It turns out that $M$ being a square matrix (i.e. $m = n$) is the optimal scenario, at least when attacking Bob from a personal computer (see Section 2.3).

The following is a graph that roughly shows how the speed performance of step 2.1–2.2 of the attack scales on a computer using the Intel Core i7-3770 processor. The Number Theory Library (NTL) for C++ was used to simulate the steps. More accurate estimates are given in the next table, which shows only a handful of dimensions. Microseconds were rounded to at least 4 significant digits in the table. A splitting ciphertext was conceptually found in all cases.

[Figure 2 plot: running time in seconds (vertical axis, 0.00 to 0.40) against dimensions (horizontal axis, 0 to 250).]

Figure 2: Speed performance of step 2.1–2.2 of the AES-GCM attack. It just boils down to solving linear systems. Dimension is denoted $m$. The matrix $M$ had $m$ rows and $m$ columns. A total of 100 runs of step 2.1–2.2 were made for each dimension.

dimensions (m)    microseconds
2                 8.456
4                 21.84
8                 73.65
16                249.8
32                1200
64                7521
128               52915
256               398233

Table 1: More precise measurements of step 2.1–2.2 of the AES-GCM attack. A total of $10^4$ runs of step 2.1–2.2 were made for each row of the table.
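One way to read Table 1 is to look at how the time grows each time the dimension doubles. The short Python sketch below computes, for each consecutive pair of rows, the ratio of the timings and the corresponding local exponent $k$ in a model of the form time $\approx c \cdot m^k$. The power-law model and the variable names are assumptions made for illustration; the timings are copied from Table 1.

```python
import math

# Timings from Table 1, in microseconds, keyed by dimension m.
timings_us = {2: 8.456, 4: 21.84, 8: 73.65, 16: 249.8,
              32: 1200, 64: 7521, 128: 52915, 256: 398233}

dims = sorted(timings_us)
local_exponent = {}
for small, large in zip(dims, dims[1:]):
    ratio = timings_us[large] / timings_us[small]   # growth when m doubles
    local_exponent[large] = math.log2(ratio)        # k such that ratio = 2^k
```

For the last doubling, from 128 to 256, the ratio is about 7.5 and the local exponent is about 2.9, which is in line with the cubic cost of solving a general linear system mentioned earlier. The smaller exponents at low dimensions suggest that fixed overheads dominate there.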

2.2 ChaCha20-Poly1305

The following algorithm describes the MAC of ChaCha20-Poly1305 (or at least a variant of it). An overview of ChaCha20-Poly1305 can also be found in Appendix Section 5.4. Plaintext, ciphertext, and associated data are simplified to be multiples of 128 bits. The presentation may differ from other documents and specifications, but it should give a sufficient overview. The quantities $P_j$, $C_j$, $A^{[d]}_j$ are interpreted as blocks of 128 bits at position $j$. A special case is $B$, which consists of 4 blocks of 128 bits, denoted $B_k$ ($k = 1, 2, 3, 4$) only in the algorithm. Arithmetic is done over the prime field $\mathbb{F}_{2^{130}-5}$. A straightforward way to convert 128-bit blocks to non-negative integers (and vice versa) is implicit. The function clear-bits(·) comes from [Ber05b, p. 35] and sets a handful of bits to 0 in order to increase speed [Ber05b, p. 37].

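Algorithm 2 $\mathrm{Poly1305}_K(N, C, A^{[d]})$

1: $p \leftarrow 2^{130} - 5$
2: $\ell \leftarrow |C|$ (the number of 128-bit blocks of $C$)
3: $a \leftarrow |A^{[d]}|$ (the number of 128-bit blocks of $A^{[d]}$)
4: $B \leftarrow \mathrm{ChaCha20}_K(N, 0)$
5: $H \leftarrow \text{clear-bits}(B_1)$
6: $R \leftarrow B_2$
7: $L \leftarrow \mathrm{encode}_{64}(|A^{[d]}|) \,\|\, \mathrm{encode}_{64}(|C|)$
8: $T \leftarrow (2^{128} + L) H \pmod{p}$
9: for $j$ from 1 to $\ell$ do
10:     $T \leftarrow T + (2^{128} + C_{\ell+1-j}) H^{1+j} \pmod{p}$
11: end for
12: for $j$ from 1 to $a$ do
13:     $T \leftarrow T + (2^{128} + A^{[d]}_{a+1-j}) H^{\ell+1+j} \pmod{p}$
14: end for
15: $T \leftarrow T + R \pmod{2^{128}}$
16: Output: $T$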
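Steps 8 to 16 of Algorithm 2 can be sketched in Python with ordinary integer arithmetic. The sketch rests on assumptions not stated in the text: the ChaCha20 call is omitted (so `H`, `R`, and `L` are passed in as integers), the function names are hypothetical, and `clear_bits` uses the clamping mask from the standard Poly1305 specification (RFC 8439), assuming that this is what clear-bits(·) refers to when blocks are read as little-endian integers.

```python
P = (1 << 130) - 5                       # the prime p = 2^130 - 5

# Clamping mask from the standard Poly1305 definition (assumed to match clear-bits).
CLAMP_MASK = 0x0ffffffc0ffffffc0ffffffc0fffffff


def clear_bits(block: int) -> int:
    """Set the masked-out bits of a 128-bit integer to zero."""
    return block & CLAMP_MASK


def poly1305_variant(H: int, R: int, L: int, C: list, A: list) -> int:
    """Steps 8-16 of Algorithm 2; C[0] is C_1 and A[0] is A_1 (hypothetical layout)."""
    ell, a = len(C), len(A)
    T = ((1 << 128) + L) * H % P
    for j in range(1, ell + 1):
        # C_{ell+1-j} in the text is C[ell - j] with 0-based indexing.
        T = (T + ((1 << 128) + C[ell - j]) * pow(H, 1 + j, P)) % P
    for j in range(1, a + 1):
        T = (T + ((1 << 128) + A[a - j]) * pow(H, ell + 1 + j, P)) % P
    return (T + R) % (1 << 128)          # final reduction modulo 2^128
```

The last line is the step that breaks the field structure: everything before it is polynomial evaluation modulo $p$, while the tag itself is only known modulo $2^{128}$.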

Step 2.1–2.2 of the attack against ChaCha20-Poly1305 is slightly different. This is an adaptation of the attack of Len et al. [LGR20, pp. 20–23], with an additional lattice reduction step. Poly1305 complicates the algebraic structure of polynomial evaluation by further reducing the evaluation modulo $2^{128}$ at the final step, so a linear system must be arranged differently. Moreover, solutions to such a linear system must have components in the range $[2^{128}, 2^{129})$ because of how $2^{128}$ is added modulo $p$ (see steps 8, 10 and 13 in the algorithm).⁴ This range is roughly a quarter of the whole range $[0, p)$ of $\mathbb{F}_p$. A lattice reduction step can be made to “squeeze” solution components into the necessary range $[2^{128}, 2^{129})$. We want to determine

$$\mathrm{Poly1305}_{K_i}(N, C, A^{[d]}) = T = \left[ \left( (2^{128}+L) H_i + \sum_{j=1}^{\ell} (2^{128}+C_{\ell+1-j}) H_i^{1+j} + \sum_{j=1}^{a} (2^{128}+A^{[d]}_{a+1-j}) H_i^{\ell+1+j} \bmod p \right) + R_i \right] \bmod 2^{128}$$

⁴ Addition by $2^{128}$ is usually implemented in a smarter way, by appending a byte with the hex value 0x01 next to the most significant byte of a non-negative 128-bit integer [Ber05b]. We just describe it as addition here because that is the most relevant effect of the algorithm.

where $H_i$, $R_i$ are derived using keys $K_i \in \mathcal{K}$ and a nonce $N \in \mathcal{N}$. Here, $T$ is one and the same tag value for all $i = 1, \ldots, m$. The best strategy seems to be selecting an arbitrary value for $T$, so suppose $T$ is given. We have found a splitting ciphertext if the blocks $C_j$ satisfy

$$(T - R_i) \bmod 2^{128} = (2^{128}+L) H_i + \sum_{j=1}^{\ell} (2^{128}+C_{\ell+1-j}) H_i^{1+j} + \sum_{j=1}^{a} (2^{128}+A^{[d]}_{a+1-j}) H_i^{\ell+1+j} \pmod{p}$$

for all keys in $\mathcal{K}$. An observation similar to that of the original authors [LGR20] is that by defining $\tilde{B}_i = (T - R_i) \pmod{2^{128}}$, we can ignore the reduction above modulo $2^{128}$. This enables us to deal only with the field $\mathbb{F}_p$. Doing this, we have

$$\sum_{j=1}^{\ell} (2^{128}+C_{\ell+1-j}) H_i^{1+j} = \tilde{B}_i - (2^{128}+L) H_i - \sum_{j=1}^{a} (2^{128}+A^{[d]}_{a+1-j}) H_i^{\ell+1+j} = B_i$$

where everything is computed modulo $p$. Here the unknowns are on the left-hand side, and the equation above (for $i = 1, \ldots, m$) can be made into a linear system of the form

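$$Mc = \begin{pmatrix} H_1^2 & H_1^3 & H_1^4 & \cdots & H_1^{\ell+1} \\ H_2^2 & H_2^3 & H_2^4 & \cdots & H_2^{\ell+1} \\ H_3^2 & H_3^3 & H_3^4 & \cdots & H_3^{\ell+1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ H_m^2 & H_m^3 & H_m^4 & \cdots & H_m^{\ell+1} \end{pmatrix} \begin{pmatrix} \tilde{C}_\ell \\ \tilde{C}_{\ell-1} \\ \tilde{C}_{\ell-2} \\ \vdots \\ \tilde{C}_1 \end{pmatrix} = \begin{pmatrix} B_1 \\ B_2 \\ B_3 \\ \vdots \\ B_m \end{pmatrix} = b. \tag{3}$$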
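As a rough illustration of system (3), the Python sketch below solves a small square instance modulo $p$ and records which solution components fall in the required range $[2^{128}, 2^{129})$. It is a sketch under assumptions: the $H_i$ and $B_i$ are random stand-ins rather than ChaCha20 outputs, the clamping mask is the one from the standard Poly1305 specification, the helper names are hypothetical, and the square case is used only for simplicity. `pow(x, -1, P)` requires Python 3.8 or later.

```python
import random

P = (1 << 130) - 5
CLAMP_MASK = 0x0ffffffc0ffffffc0ffffffc0fffffff   # assumed form of clear-bits


def solve_mod_p(M, b):
    """Gauss-Jordan elimination over F_p for a square system; None if singular."""
    n = len(M)
    aug = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] % P), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)            # modular inverse
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(v - factor * w) % P for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


rng = random.Random(7)                             # fixed seed, illustrative values only
n = 6                                              # n = ell columns, and m = n rows here
Hs = [rng.getrandbits(128) & CLAMP_MASK for _ in range(n)]
Bs = [rng.randrange(P) for _ in range(n)]

# Row i of M is (H_i^2, H_i^3, ..., H_i^(ell+1)), as in (3).
M = [[pow(H, e, P) for e in range(2, n + 2)] for H in Hs]
c_tilde = solve_mod_p(M, Bs)

# A usable solution needs every component in [2^128, 2^129).
in_range = [(1 << 128) <= v < (1 << 129) for v in c_tilde]
```

Only when every entry of `in_range` is true do the components translate back into valid 128-bit ciphertext blocks, which is the difficulty that motivates the lattice reduction step.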

Let $n = \ell$ denote the number of columns. The components $\tilde{C}_i$ now correspond to $2^{128} + C_i$, where $C_i$ should by definition be in the interval $[0, 2^{128})$. The original authors mentioned that an initial solution $c$ is very rarely in the desired interval when dimensions are large. Their solution to this was to let the number of rows $m$ be slightly smaller than the number of columns (specifically $m \approx (127/129) \cdot n$), in order to have a non-trivial kernel for $M$. Then they keep adding vectors $v \in \ker M$ until all components are in the desired interval. This is apparently very inefficient, because the success rate of their solution is proportional to $(1/4)^n$ (with some uniform randomness assumptions on $\ker M$): each of the solution's $n$ components must fit in the necessary interval $[2^{128}, 2^{129})$, which is roughly a quarter of the whole interval, as stated. There is a better strategy than this, described as follows.
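To get a feeling for how quickly the $(1/4)^n$ success rate decays, the following Python sketch evaluates it exactly. It assumes, as a simplification for illustration, that each of the $n$ components is independent and uniform over $[0, p)$; the function names are hypothetical.

```python
from fractions import Fraction

P = (1 << 130) - 5

# Probability that one uniform element of [0, p) lands in [2^128, 2^129).
per_component = Fraction(1 << 128, P)


def success_probability(n: int) -> float:
    """Chance that all n components land in the range at once (independence assumed)."""
    return float(per_component ** n)


def expected_trials(n: int) -> float:
    """Expected number of candidate solutions to try before one fits."""
    return float(1 / per_component ** n)
```

Under this model, already $n = 10$ gives a success probability close to $(1/4)^{10} \approx 9.5 \cdot 10^{-7}$, that is, roughly a million candidate solutions on average, and each additional column multiplies the expected effort by about four.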

We start by defining the following vectors and matrices. Let $m < n$, so that $\ker M$ is non-trivial. Define $\hat{c} = c - 1.5 \cdot 2^{128} f \in \mathbb{Z}^n$ (not reducing modulo $p$), where $c$ is an initial solution to (3) and $f = (1, 1, \ldots, 1)^T$. Define $k = \dim(\ker M)$; usually $k = n - m$. Define $K \in \mathbb{Z}^{k \times n}$ such that its rows are
