Characterization of cipher suite selection, downgrading, and other weaknesses observed in the wild


Linköpings universitet SE–581 83 Linköping

Linköping University | Department of Computer and Information Science

Bachelor’s thesis, 15 ECTS | Computer Science

2021 | LIU-IDA/LITH-EX-G--21/063--SE


Swedish title: Karaktärisering av cipher suite val, nedgradering och andra svagheter som observerats i det vilda

Sebastian Frisenfelt

Edvin Kjell

Supervisor: Niklas Carlsson

Examiner: Marcus Bendtsen



Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a period of 25 years starting from the date of publication barring exceptional circumstances.

The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purposes. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/.


Abstract

The importance of security on the web grows every day, yet domains differ in how they handle and prioritize it. Trade-offs between security and convenience have to be made to uphold a website’s public image. This thesis uses a subset of domains from the Alexa Top 1M list to create datasets, collected through active scans with testssl.sh. Using these datasets, we compared domains with regard to several security aspects and analyzed how they balance security and convenience. We performed our scans over the course of two weeks to analyze each domain’s level of security, and we also looked at the top domains in several popular categories.

Our analysis mainly focused on comparing the domains on their choice of Transport Layer Security (TLS) version, cipher suite, support for HSTS, and exposure to known vulnerabilities. About 50% of the domains in our subset had implemented TLS 1.3. We discovered that the most popular domains tend to treat availability as one of their highest priorities, leaving them exposed to vulnerabilities in earlier versions of the TLS protocol. Most domains that were exposed to at least one vulnerability were also exposed to BEAST, which was the most prominent vulnerability among all domains. We also showed that many of the negotiated cipher suites still use cipher block chaining, which is known to be weak. Our results show that different browsers, mobile operating systems, and the time of day had a negligible impact on the choice of TLS version. Most of the domains in the popular categories had not yet adopted TLS 1.3 and were overall more exposed to the tested vulnerabilities than those on the top million list. Support for HSTS was low both in the categories and on the Alexa top list. We conclude that upgrading to the latest recommended standard should always be a priority for server operators.


Acknowledgments

We want to thank our advisor Niklas Carlsson who has provided us with great support and information vital to our thesis. We also want to thank our fellow students who have helped and given us feedback on our work. We would also like to thank Henrik Wendt and Matteus Henriksson for providing us with last year’s Alexa list of categories.


1 Introduction
1.1 Motivation
1.2 Aim
1.3 Contribution
1.4 Research questions
1.5 Delimitations

2 Background
2.1 The TLS protocol
2.2 TLS handshakes
2.3 TLS downgrading/version negotiation
2.4 Cipher suites
2.5 HSTS and HTTP security
2.6 Vulnerabilities

3 Related work

4 Method
4.1 Testssl.sh
4.2 Domain selection
4.3 Alexa categories
4.4 Limitations

5 Results
5.1 Negotiated TLS version
5.2 Negotiated cipher suites
5.3 HSTS
5.4 Vulnerabilities
5.5 TLS versions in different browsers
5.6 TLS versions for mobile
5.7 Categories

6 Discussion
6.1 Results
6.2 Method
6.3 The work in a wider context

7 Conclusion
7.1 Future work

A Appendix
A.1 Scripts and headers
A.2 Venn-diagrams for vulnerabilities per block


List of Figures

2.1 Full TLS 1.2 and TLS 1.3 handshakes.

2.2 TLS 1.3 handshakes with 1-RTT and 0-RTT.

5.1 Average number of domains, rounded to closest integer, for each TLS version.

5.2 Average number of domains not supporting HSTS.

5.3 Graphs illustrating average of all analysed vulnerabilities and interesting findings.

5.4 Venn-diagram illustrating tested domains that were exposed to the analysed vulnerabilities at least once.

5.5 TLS version negotiation in selected browsers.

5.6 TLS negotiation for mobile clients.

5.7 Graphs depicting findings from categories.

A.1 Script used with testssl.sh.

A.2 Venn-diagram illustrating tested domains that were exposed to the analysed vulnerabilities at least once in domain range (0,500].

A.3 Venn-diagram illustrating tested domains that were exposed to the analysed vulnerabilities at least once in domain range (500,1000].

A.4 Venn-diagram illustrating tested domains that were exposed to the analysed vulnerabilities at least once in domain range (9500,10 000].

A.5 Venn-diagram illustrating tested domains that were exposed to the analysed vulnerabilities at least once in domain range (99 500,100 000].

A.6 Venn-diagram illustrating tested domains that were exposed to the analysed vulnerabilities at least once in the last 500 domains.


List of Tables

4.1 All flags used for testssl.sh.

4.2 Domains taken from Alexa Top 1M divided into five blocks.

5.1 Average number of domains, rounded to closest integer, for each TLS version.

5.2 Total number of cipher suites negotiated per protocol for all domains and for the top 1000 domains. Text highlighted in red signifies weak cipher suites and text highlighted in orange signifies potentially weak cipher suites.

5.3 Average number of domains exposed to each vulnerability.

5.4 Average of chosen TLS version per block for each browser.

5.5 Security information from each category.

A.1 Miscellaneous headers, with information such as domain name, IP, position on the Alexa Top 1M list and more.

A.2 Protocol headers, with information on selected protocol, such as SSLv2 and TLS 1.2.

A.3 Server headers, with information on what the server has implemented.

A.4 Cipher order headers, with information about negotiated protocol and cipher. Also cipher orders from the server.

A.5 Session headers, with information about how the session is handled, for example in the case of a resumption.

A.6 Certificate headers 1, with information about the server’s certificate.

A.7 Certificate headers 2, with information about the server’s certificate.

A.8 HTTP headers, with information about what HTTP features the server has implemented.

A.9 Vulnerabilities headers, with information about the server’s exposure to the vulnerabilities.

A.10 Cipher headers 1, with information on available cipher suites in the security protocols.

A.13 Client simulation headers, with information on negotiated cipher and security protocol for the simulated clients.


1 Introduction

This thesis delves into the topics of TLS, TLS downgrades, cipher suite selection, and other weaknesses observed in the wild.

1.1 Motivation

Today the Transport Layer Security (TLS) protocol is essential to all types of secure Internet connections. TLS has undergone many changes and has been improved with the help of many contributors. TLS version 1.3 is the current standard after a design process that took four years, starting in 2014. TLS 1.3 has seen fast adoption on the web but has still not reached everyone. Servers that have not implemented the new TLS standard could either be exposed to vulnerabilities or force connections to downgrade [12].

Even though TLS 1.3 is the latest and most secure version of the protocol, many connections are still downgraded to earlier versions such as TLS 1.2. Exactly why this happens has not yet been researched thoroughly. One reason suggested by passive measurements is that the client’s browser might not be updated to the latest version. Browsers also tend to favor a better user experience over security [23]. An active attacker can gain access to important or private information by forcing a server to choose a vulnerable cipher suite or by exploiting flaws present in earlier versions of TLS and SSL [1]. Being aware of a connection’s security level is therefore important to all web users.

1.2 Aim

This thesis aimed to create datasets containing sufficient information to answer questions about TLS downgrading and cipher suite selection from a general point of view. The collected information was chosen to help answer questions such as why some connections are downgraded and what conditions lead to a downgrade. We hoped to learn more about TLS and TLS downgrading and to contribute to further research.


1.3 Contribution

Our main contribution is a set of datasets containing security information from a subset of popular domains, with data collected at two different times of day, as well as similarly constructed datasets for top domain categories.

1.4 Research questions

1. Do browsers differ in the frequency of TLS downgrading and use of less secure cipher suites?

2. To what degree do domains on the Alexa top 1M list prioritize conveniences, such as availability, over security?

3. Can we see any correlation between a domain’s security level and the time of day?

4. Why is downgrading still a supported function; what are the advantages of allowing downgrades?

1.5 Delimitations

The analysis was performed on the latest versions of some of the most popular browsers (Chrome, Firefox, Microsoft Edge, and Safari) and on the mobile operating systems Android and iOS. These were chosen due to their popularity. We decided not to analyze results for all browsers or browser versions present in our datasets, since many of them are of limited relevance. Because of time limitations, all tests were performed on a subset of the Internet’s most popular domains, twice a day over two weeks. Our data collection and analysis focused mainly on the information most relevant to our topic.


2 Background

In this section, we discuss some relevant protocols and vulnerabilities that will later be brought up in the results and discussion.

2.1 The TLS protocol

Transport Layer Security, or TLS, as mentioned in the introduction, is a cryptographic protocol that is primarily used for secure connections between devices over the Internet. A connection usually consists of an initiator and a responder. The initiator is typically a client and the responder a server.

The TLS protocol is made up of two sub-protocols: the handshake protocol and the record protocol. The handshake protocol handles the negotiation of a suitable cipher suite and protocol version. It is also responsible for key exchange and authentication between server and client. The record protocol is in charge of transporting protected data, encrypted with keys that were established during the initial handshake. The TLS protocol was created to provide security and integrity between communicating devices [3].

2.2 TLS handshakes

The TLS protocol has undergone many changes over the years, thus multiple versions are available. The standard version that is currently deployed is TLS 1.3. When this version of TLS was in development there was a larger focus on security, when compared to TLS 1.2. Both versions use similar handshake protocols with some caveats, which are illustrated in Figure 2.1.

TLS 1.2

The TLS 1.2 handshake is initiated by a client that sends a ClientHello message to a server. This message contains the maximum TLS version that the client supports, a random client value n_I, an optional session ID that can be used if the session is resumed, a preference-ordered list of the cipher suites that the client supports, a preference-ordered list of the compression methods that the client supports, and finally an optional list of extensions.


Figure 2.1: Full TLS 1.2 and TLS 1.3 handshakes.

The server then responds with a ServerHello message, which answers the proposals made by the client in the earlier message. This message contains the TLS version that the server selects, the server’s random value n_R, the optional session ID, the cipher suite selected from the list sent by the client, the compression method selected from the list of supported compression methods, and finally an optional list of extensions supported by the server and requested by the client.

After this, the server sends a ServerCertificate message that contains the server’s certificate; this is only required if server authentication is needed. The server then sends a ServerKeyExchange message. This message is only sent if the communication uses the ephemeral Diffie-Hellman (DHE) key exchange algorithm. It does not have to be sent if the Rivest-Shamir-Adleman (RSA) algorithm is used instead. The message contains the server’s public key and a signature that covers the random values n_I and n_R. The server finalizes its part of the exchange by sending a ServerHelloDone message to signal its completion.

When the client receives the ServerHelloDone, it has to verify that the choices made in the ServerHello message are compatible with what it supports. If they are, the client proceeds by sending its ClientKeyExchange message to establish the master secret. The content of this message depends on which key exchange algorithm was selected. If RSA was selected, the client sends a pre-master secret that has been encrypted with the server’s long-term RSA public key. If DHE was selected, the client instead sends its DHE public value so that the server can calculate the shared DHE secret.

Both parties then calculate the master secret and the two session keys k_I and k_R, which will be the client and server keys once this exchange has been completed. These keys are calculated using the Pseudo-Random Function (PRF) and the master secret.
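The key calculation described above can be sketched in a few lines. This is a minimal illustration, assuming the default TLS 1.2 PRF from RFC 5246 (HMAC with SHA-256); cipher suites that specify SHA-384 use a different hash. The function names and the all-zero example inputs are our own and are not taken from the thesis or from any library.

```python
import hashlib
import hmac


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash data expansion from RFC 5246, section 5, with SHA-256."""
    out = b""
    a = seed  # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i) = HMAC(secret, A(i-1))
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def tls12_prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """PRF(secret, label, seed) = P_hash(secret, label + seed)."""
    return p_sha256(secret, label + seed, length)


def master_secret(pre_master: bytes, n_i: bytes, n_r: bytes) -> bytes:
    """48-byte master secret from the pre-master secret and both random values."""
    return tls12_prf(pre_master, b"master secret", n_i + n_r, 48)


def key_block(master: bytes, n_i: bytes, n_r: bytes, length: int) -> bytes:
    """Key material that is later split into the client and server keys.

    Note that the server random comes first in the key expansion seed.
    """
    return tls12_prf(master, b"key expansion", n_r + n_i, length)
```

How the key block is cut into the individual keys k_I and k_R depends on the negotiated cipher suite, so that step is left out of the sketch.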

After this, the client sends the ChangeCipherSpec message followed by a ClientFinished message, the latter being the first message protected with the negotiated algorithms and keys. The handshake is then concluded by the server, which responds with a ChangeCipherSpec message and a ServerFinished message that has to be verified by the client for the handshake to be acknowledged as completed. Once both parties have acknowledged the handshake as completed, they can start sending and receiving encrypted data using the created keys k_I and k_R [3].

TLS 1.3

TLS 1.3 has eight criteria that have to be followed during a handshake, and number five on that list is downgrade protection. One major change made to increase the security of the protocol was to exclude all known weak cryptographic algorithms, such as RSA key exchange and static Diffie-Hellman [6]. TLS 1.3 also has a reduced handshake time. A full handshake requires one round trip (1-RTT) while a resumed handshake can send data with zero round trips (0-RTT) [7]. This means that a full TLS 1.3 handshake requires one round trip less than the TLS 1.2 handshake, as can be seen in Figure 2.2.

Three key exchange modes are available for TLS 1.3: pre-shared key exchange, Diffie-Hellman key exchange, and a combination of the two. These three modes are useful in different situations and offer different levels of security. Currently, Diffie-Hellman is the default mode for TLS 1.3. A TLS 1.3 handshake using Diffie-Hellman is initiated similarly to TLS 1.2: the client sends a ClientHello message, which contains a random value together with a list of preferred symmetric algorithms. The client also sends a KeyShare message, which is used to compute application traffic and handshake keys.

The server then responds by sending a ServerHello message containing the symmetric algorithm it selected from the list sent by the client, together with a randomly generated value. The server also sends a KeyShare message and an EncryptedExtensions message with an optional CertificateRequest message. The server additionally sends a Certificate message to the client, which contains the server’s certificate, together with a CertificateVerify message, which contains a digital signature. Both of these messages are used by the client to validate the server.

The conversation ends with the server’s final Finished message. It contains a message authentication code, that is used to authenticate and bind the server to the computed keys.


If requested by the server, the client will send Certificate and CertificateVerify messages. The client then sends its own Finished message, which completes the handshake [6].
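The result of either handshake can be observed from the client side. The following is a minimal sketch using Python’s standard ssl module; it is not the method used in this thesis (the datasets were collected with testssl.sh), and the helper names build_context and probe are hypothetical. Capping the maximum version shows which version and cipher suite a server negotiates when the client does not offer TLS 1.3.

```python
import socket
import ssl


def build_context(max_version: ssl.TLSVersion = ssl.TLSVersion.MAXIMUM_SUPPORTED) -> ssl.SSLContext:
    """Client context that offers TLS 1.2 up to max_version."""
    ctx = ssl.create_default_context()  # verifies certificate and hostname
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = max_version
    return ctx


def probe(host: str, port: int = 443,
          max_version: ssl.TLSVersion = ssl.TLSVersion.MAXIMUM_SUPPORTED,
          timeout: float = 5.0):
    """Return (negotiated version, cipher suite name, secret bits) for one handshake."""
    ctx = build_context(max_version)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            name, _protocol, bits = tls.cipher()
            return tls.version(), name, bits
```

Calling probe("example.org") and then probe("example.org", max_version=ssl.TLSVersion.TLSv1_2) would show the difference between the two cases for that host; the host name is only a placeholder, and the result depends on the server and on the local OpenSSL build.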

2.3 TLS downgrading/version negotiation

In TLS 1.2 and lower, the client sets the value of Ver_c to the latest version that it supports and sends it to the server. At this point, one of three scenarios may occur. In the first, Ver_c is lower than the lowest version supported by the server, and the connection terminates with a protocol_version alert message. In the second, the server supports the version chosen by the client and therefore sets Ver_s to the same value as Ver_c. In the last, Ver_c is higher than what the server supports, and the server sets Ver_s to the latest version it supports. In this last case, after Ver_s is set, the client needs to check whether it supports the version that the server suggested. If the client does not support Ver_s, the connection is terminated with a protocol_version alert message.

In TLS 1.3, sv_c and sv_s are used. The client sets sv_c to a list of versions that it supports, and the server chooses the latest one that it supports. The deprecated Ver_c and Ver_s are still present and are used for backward compatibility [15].
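The two negotiation schemes can be summarized as a small model. This is a sketch under simplifying assumptions: versions are encoded as the integers 10 to 13 for TLS 1.0 to TLS 1.3, the alert is modelled as a Python exception, and all names are illustrative rather than taken from a TLS implementation.

```python
from typing import Iterable


class ProtocolVersionAlert(Exception):
    """Stands in for the fatal protocol_version alert."""


def negotiate_legacy(client_versions: Iterable[int], server_versions: Iterable[int]) -> int:
    """Version negotiation for TLS 1.2 and lower; returns Ver_s."""
    client, server = set(client_versions), set(server_versions)
    ver_c = max(client)  # the client announces the latest version it supports
    if ver_c < min(server):  # first scenario: client is too old for the server
        raise ProtocolVersionAlert("Ver_c is below the server's lowest version")
    # Second and third scenario: the server picks Ver_c if it supports it,
    # otherwise the latest version it supports that does not exceed Ver_c.
    ver_s = max(v for v in server if v <= ver_c)
    if ver_s not in client:  # the client must still accept the server's choice
        raise ProtocolVersionAlert("client does not support Ver_s")
    return ver_s


def negotiate_tls13(sv_c: Iterable[int], server_versions: Iterable[int]) -> int:
    """TLS 1.3 style negotiation: the server picks the latest version from the client's list."""
    common = set(sv_c) & set(server_versions)
    if not common:
        raise ProtocolVersionAlert("no version in common")
    return max(common)
```

For example, a client supporting TLS 1.0 to 1.2 and a server supporting only TLS 1.0 and 1.1 end up with TLS 1.1, while a client that supports only TLS 1.2 is rejected by the same server.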

TLS 1.2 will utilize fallback if a connection fails. The fallback function allows TLS 1.2 to downgrade to earlier versions of TLS, such as TLS 1.1, and retry the connection. TLS 1.2 will fall back until it establishes a successful connection [15]. TLS 1.3 requires handshake protocols to handle downgrade protection [6].

2.4 Cipher suites

In TLS, the handshake protocol is used for setting up a secure environment for the session. The client and server decide on algorithms used for authentication, confidentiality, and integrity, as well as for deriving symmetric keys. For the TLS 1.3 handshake, required parameters such as extensions are also set up. This negotiated set of algorithms is called the cipher suite.

All cipher suites follow the same basic structure. Every cipher suite is made up of an encryption algorithm, an authentication algorithm, a key exchange algorithm, and a Message Authentication Code (MAC). Cipher suites from TLS 1.0 up to TLS 1.2 are constructed accordingly:

TLS_KeyExchangeAlg_WITH_EncryptionAlg_MessageAuthenticationAlg,

where TLS denotes the protocol. KeyExchangeAlg represents one of six available key exchange algorithms: PSK, RSA, DH, DHE, ECDH, or ECDHE. A key exchange algorithm is used to perform key exchanges between servers and clients. The KeyExchangeAlg part uses “_” for separation and clarity; these characters will henceforth be called separators. It can be constructed in two ways: the first uses one separator and the second uses two separators.

PSK is used if there is one separator present. The usage of PSK implies that the pre-master secret has been calculated through symmetric algorithms that use pre-shared keys.

If two separators are present, the first abbreviation is a Diffie-Hellman key exchange algorithm, meaning any of DH, DHE, ECDH, or ECDHE. If the first abbreviation is either DH or ECDH, the server’s public key contained within its certificate is used for a DH or ECDH key exchange, and the second abbreviation signifies the algorithm used for signing the server certificate. If, on the other hand, the first abbreviation is either DHE or ECDHE, an ephemeral DHE or ephemeral ECDHE key exchange is used, and the second abbreviation represents the public key type that is used to validate the server’s ephemeral public key.

The abbreviation EncryptionAlg stands for encryption algorithm. An encryption algorithm is used to encrypt conversation data. The abbreviation consists of the symmetric encryption algorithm and the mode in which it is used. The encryption algorithm is either AES_128 or AES_256, and the available modes are GCM (Galois/Counter Mode), CCM (Counter with CBC-MAC), or CBC (Cipher Block Chaining mode).

The abbreviation MessageAuthenticationAlg represents a message authentication algorithm, a hash algorithm that is often used in conjunction with HMAC.

One cipher suite that follows this structure is TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA. This cipher suite contains the key exchange algorithm ECDHE, elliptic-curve Diffie-Hellman with ephemeral keys, used in conjunction with RSA. The cipher suite handles confidentiality with AES-128, an encryption standard that uses 128-bit keys, in cipher block chaining mode. Secure Hash Algorithm, or SHA, handles message authentication.
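The naming structure described above can be made concrete with a small parser. This is a sketch that only covers names of the form discussed in this section (AES with an explicit mode); the function name and the dictionary keys are our own choices. Note that for GCM and CCM suites the final component names the hash used by the PRF rather than a separate MAC.

```python
def parse_legacy_suite(name: str) -> dict:
    """Split a TLS 1.0 to 1.2 cipher suite name into its components."""
    prefix, _, rest = name.partition("_")
    kx_part, sep, cipher_part = rest.partition("_WITH_")
    if prefix != "TLS" or not sep:
        raise ValueError(f"not a TLS 1.0-1.2 style cipher suite name: {name}")

    kx_tokens = kx_part.split("_")
    key_exchange = kx_tokens[0]
    # With a single abbreviation (for example PSK or RSA), the same algorithm
    # also provides authentication.
    authentication = kx_tokens[1] if len(kx_tokens) > 1 else kx_tokens[0]

    enc_and_mode, _, hash_alg = cipher_part.rpartition("_")
    encryption, _, mode = enc_and_mode.rpartition("_")
    return {
        "key_exchange": key_exchange,
        "authentication": authentication,
        "encryption": encryption,
        "mode": mode,
        "hash": hash_alg,
    }
```

Applied to TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, the parser yields ECDHE as key exchange, RSA as authentication, AES_128 in CBC mode, and SHA as the hash.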

TLS 1.3 arranges cipher suites differently. The cipher suite name does not include which key exchange algorithm is being used. This is why TLS 1.2 and TLS 1.3 use different cipher suites. TLS 1.3 cipher suites have the following structure [18]:

TLS_AEAD_HASH,

where the abbreviation AEAD [17] represents an authenticated encryption with associated data algorithm, which provides authenticated encryption (AE) while also being able to check the authenticity of additional authenticated data (AAD). AEAD is used for message authentication, integrity, and confidentiality. These are the available AEAD algorithms:

• AES_128_GCM,
• AES_256_GCM,
• AES_128_CCM,
• AES_128_CCM_8,
• CHACHA20_POLY1305.

The abbreviation HASH represents one of two hash algorithms, either SHA256 or SHA384. Combined with the AEAD algorithms, this gives the following cipher suites:

• TLS_AES_128_GCM_SHA256,
• TLS_AES_256_GCM_SHA384,
• TLS_AES_128_CCM_SHA256,
• TLS_AES_128_CCM_8_SHA256,
• TLS_CHACHA20_POLY1305_SHA256.

A cipher suite that could be used for TLS 1.3 is, as stated above, TLS_AES_256_GCM_SHA384. This cipher suite handles message authentication and confidentiality with AES-256 together with GCM, or Galois/Counter Mode. The hash function SHA-384 is used for both HMAC (hash-based message authentication code) and the PRF (pseudo-random function).
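Because the TLS 1.3 naming scheme has no key exchange component, a name can be split into its AEAD and hash parts directly. The sketch below assumes the five cipher suites listed above are the only valid ones; the constant and function names are hypothetical.

```python
TLS13_SUITES = frozenset({
    "TLS_AES_128_GCM_SHA256",
    "TLS_AES_256_GCM_SHA384",
    "TLS_AES_128_CCM_SHA256",
    "TLS_AES_128_CCM_8_SHA256",
    "TLS_CHACHA20_POLY1305_SHA256",
})


def parse_tls13_suite(name: str) -> tuple:
    """Return (AEAD algorithm, hash algorithm) for a TLS 1.3 cipher suite name."""
    if name not in TLS13_SUITES:
        raise ValueError(f"not a TLS 1.3 cipher suite: {name}")
    body = name[len("TLS_"):]
    # The hash is always the last component; everything before it is the AEAD.
    aead, _, hash_alg = body.rpartition("_")
    return aead, hash_alg
```

A name such as TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA is rejected by this function, which gives a simple way to tell the two naming schemes apart when processing scan output.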

For a cipher to be considered secure in TLS it needs to fulfill the following criteria. All certificates should provide a minimum key size of 112 bits while also providing hashing algorithms equal to or stronger than SHA-224. This holds for all cryptography, all ephemeral keys used by the client and server, and all symmetric algorithms used to protect the TLS data. What generally determines the strength of a cipher is the key size and algorithm used [18].


2.5 HSTS and HTTP security

HTTP Strict Transport Security, or HSTS [10] for short, is a mechanism that helps prevent man-in-the-middle attacks. The mechanism lets servers inform their user agents, such as browsers, that only connections over HTTPS are allowed. Using HTTPS is safer than HTTP since HTTPS provides users with Transport Layer Security. HTTP does not encrypt the communication between two parties, which means that all data is exposed to eavesdroppers. HTTPS, on the other hand, uses encryption to prevent unwanted listeners from accessing private conversation data.
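A server announces HSTS through the Strict-Transport-Security response header. As an illustration, the following sketch parses such a header value into its max-age and includeSubDomains directives as defined in RFC 6797. The function name and the returned dictionary layout are our own, and the parser is deliberately simple: it ignores unknown directives and does not check for duplicated ones.

```python
from typing import Optional


def parse_hsts(header_value: str) -> Optional[dict]:
    """Parse a Strict-Transport-Security header value.

    Returns None when the required max-age directive is missing or invalid.
    """
    directives = {}
    for part in header_value.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directive names are case-insensitive; values may be quoted.
        directives[name.strip().lower()] = value.strip().strip('"')

    max_age = directives.get("max-age", "")
    if not max_age.isdigit():
        return None
    return {
        "max_age": int(max_age),  # seconds the browser should remember the policy
        "include_subdomains": "includesubdomains" in directives,
    }
```

A value of max-age=31536000; includeSubDomains therefore yields a policy lasting one year that also covers sub-domains, while a response without a valid max-age is treated as not supporting HSTS.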

2.6 Vulnerabilities

These are some of the vulnerabilities that exploit weaker TLS or SSL versions.

POODLE

The Padding Oracle On Downgraded Legacy Encryption attack, or POODLE [3] for short, is a man-in-the-middle exploit that takes advantage of a feature available in earlier versions of TLS that allows downgrading to protocols that are usually not offered. This is exploitable for versions up to TLS 1.2. The “downgrade dance” is a client-side technique that is used by some TLS clients, such as browsers. If the initial handshake fails for any reason, the client falls back to a lower TLS version and retries the handshake. A POODLE attack abuses this mechanism by interfering with the handshake, for example by dropping the ClientHello message, which in turn forces the client to fall back to earlier versions of TLS. Allowing such fallbacks could open up the path to other potential exploits available in earlier versions of TLS.

TLS-fallback

The TLS-fallback [3] vulnerability is a collective term for a category of several attacks in which a man-in-the-middle makes a server or client perform a protocol downgrade.

Testssl.sh mentions that the TLS-fallback option checks for TLS-FALLBACK-SCSV mitigation [28]. SCSV stands for Signaling Cipher Suite Value, and it is meant to prevent unintended protocol downgrades. The check works by letting a client try to negotiate an older protocol version with a server while signaling that the attempt is a fallback. A server with functioning fallback protection should, in this case, check whether the requested protocol version is the highest version that both client and server support. If the server notices that it supports a higher protocol version than the one requested, it responds with a fatal inappropriate fallback alert, and the client should retry with a higher protocol version. This alert can also appear if a connection suffers from network glitches, meaning that not only intentional downgrade attacks cause inappropriate fallback attempts [20].
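The server-side rule can be written down compactly. The sketch below follows our reading of RFC 7507, where the signaling value TLS_FALLBACK_SCSV has the code 0x5600. Versions are simplified to the integers 10 to 12 for TLS 1.0 to TLS 1.2, the alert is modelled as an exception, and the function name is hypothetical.

```python
TLS_FALLBACK_SCSV = 0x5600  # signaling cipher suite value from RFC 7507


class InappropriateFallback(Exception):
    """Stands in for the fatal inappropriate_fallback alert."""


def check_fallback(client_version: int, offered_suites: list, server_max_version: int) -> int:
    """Server-side SCSV check; returns the version to continue with.

    The SCSV in the ClientHello means "this is a retry with a lowered version".
    If the server could have done better, the fallback was unnecessary and may
    have been caused by an attacker, so the handshake is aborted.
    """
    if TLS_FALLBACK_SCSV in offered_suites and server_max_version > client_version:
        raise InappropriateFallback(
            f"client fell back to {client_version}, server supports {server_max_version}"
        )
    return min(client_version, server_max_version)
```

A ClientHello for TLS 1.1 that carries the SCSV is rejected by a server supporting TLS 1.2, whereas the same ClientHello without the SCSV is accepted, since the server cannot tell a fallback from a client that simply does not support anything newer.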

BEAST

The BEAST [25] attack takes advantage of issues in the implementations of TLS 1.0 and older SSL protocols. It exploits the implementation of cipher block chaining, or CBC for short, by predicting the initialization vector, which in TLS 1.0 is known to be predictable. Through repeated guessing, the attacker is then able to decrypt packets and eavesdrop on the conversation.
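Why a predictable initialization vector matters can be shown with a toy model. The sketch below is not an implementation of BEAST: the block cipher is replaced by a keyed hash (an assumption made so that the example needs only the standard library), and the attacker is simply allowed to submit one chosen block to the victim’s channel. It shows the core check that the attack repeats: if the next IV is known in advance, a guess for an earlier plaintext block can be confirmed by comparing ciphertexts.

```python
import hashlib
import os

BLOCK = 16


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Stand-in for a block cipher: any deterministic keyed function will do here."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


class Tls10Channel:
    """Sender that, like TLS 1.0, reuses the last ciphertext block as the next IV."""

    def __init__(self) -> None:
        self._key = os.urandom(BLOCK)  # never revealed to the attacker
        self.iv = os.urandom(BLOCK)    # observable on the wire

    def send(self, block: bytes):
        used_iv = self.iv
        ciphertext = toy_encrypt_block(self._key, xor(used_iv, block))  # CBC step
        self.iv = ciphertext  # predictable: the attacker has already seen this value
        return used_iv, ciphertext


def confirm_guess(channel: Tls10Channel, seen_iv: bytes, seen_ct: bytes, guess: bytes) -> bool:
    """Check a guess for the plaintext block that produced seen_ct under seen_iv."""
    # Chosen so that the CBC input becomes guess XOR seen_iv.
    chosen = xor(xor(guess, seen_iv), channel.iv)
    _, ct = channel.send(chosen)
    return ct == seen_ct
```

The comparison succeeds exactly when the guess equals the original block, because both encryptions then process the same input. Later TLS versions use an explicit, unpredictable IV for each record, which removes this check.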


Logjam

The Logjam [3] attack takes advantage of handshakes that use the Diffie-Hellman key exchange. The attacker modifies a Hello message so that the server selects an export-grade DHE cipher suite, which generates a weak DHE key. Export-grade cipher suites are cipher suites that today are considered insecure because of their small key sizes [14]. The exploit relies on the fact that earlier versions of TLS, up to TLS 1.2, do not authenticate the server’s selected cipher suite until the Finished message in a handshake is received. This means that the client obtains weak keys based on the server’s weakened state. The delay before the selected cipher suite is authenticated gives the attacker enough time to recover the master secret before the Finished message has been sent. The attacker can then hide its tracks by forging a Finished message and decrypt the conversation data.

FREAK

The Factoring RSA Export Keys attack, or FREAK [3] for short, works similarly to the Logjam attack in that it misdirects the server into selecting an export-grade cipher suite. The difference is that FREAK is not effective against the Diffie-Hellman key exchange, as Logjam is, but against the RSA key exchange. The exploit takes advantage of a client implementation vulnerability that makes a client accept a ServerKeyExchange message with weak, short-lived export-grade RSA key parameters, even though the client does not support export-grade cipher suites. This in turn tricks the client into encrypting the pre-master secret with the weak RSA parameters provided by the ServerKeyExchange message instead of the presumably strong long-term RSA key contained within the server Certificate message. This information can in turn be used to acquire the application data by fabricating a Finished message.
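Both Logjam and FREAK depend on export-grade cipher suites still being accepted, and such suites can be recognized from their names. The sketch below labels IANA-style suite names by the attack they are relevant to. It is only a name-based heuristic with a hypothetical function name; it is not how testssl.sh tests for these vulnerabilities and it does not prove that a server is exploitable.

```python
from typing import Optional


def export_exposure(suite_name: str) -> Optional[str]:
    """Label an export-grade cipher suite by the attack it is relevant to.

    Returns None for suites that are not export-grade.
    """
    tokens = suite_name.upper().split("_")
    if "EXPORT" not in tokens or len(tokens) < 2:
        return None
    key_exchange = tokens[1]  # first abbreviation after the TLS prefix
    if key_exchange == "RSA":
        return "FREAK"   # export-grade RSA key exchange
    if key_exchange == "DHE":
        return "Logjam"  # export-grade ephemeral Diffie-Hellman
    return "other export-grade"
```

For instance, TLS_RSA_EXPORT_WITH_RC4_40_MD5 is labelled as relevant to FREAK and TLS_DHE_RSA_EXPORT_WITH_DES40_CBC_SHA as relevant to Logjam, while modern suites return None.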


3 Related work

Multiple papers have discussed how TLS 1.3 has come about and what it is meant to do. Some studies have been made on the risks of downgrading in an attacking situation. Also, studies and experiments have been conducted on what the current certificate landscape looks like, using passive and active measurements.

Holz et al. [12] talk about the designing and implementation of TLS 1.3. There they also say that already in 2017, a year before the final release of TLS 1.3, ten percent of the sites on Alexa Top1M were using it. In November of 2019, it had reached 31%. What they have focused on doing is looking at the TLS 1.3 ecosystem from different perspectives. Using active scans they have been able to identify support for TLS 1.3. With passive scans, they monitored connections to get an understanding of the usages of TLS 1.3. They also did capture traffic on mobile devices to get an overview of how it is used there. They saw that a little over a year after the release of TLS 1.3 19.5% of connections were negotiating it and about 60% of clients supported it.

Cremers et al. [6] created a symbolic model of the TLS 1.3 specification in 2017, in which they considered all types of handshakes available in TLS 1.3. They were able to prove that the majority of TLS 1.3’s security requirements were dependent on the secrecy of session keys. In doing so, they came across some behaviors that could lead to security risks even in more advanced properties such as perfect forward secrecy.

Adrian et al. [1] discuss limitations of the TLS protocol and how it can be exploited to force the protocol to downgrade to earlier versions. Their study reviews the disadvantages of using Diffie-Hellman in the TLS protocol. Based on their findings, they constructed concrete recommendations to recover the expected security level of Diffie-Hellman. The study recommended that all users of TLS transition to elliptic curve Diffie-Hellman and increase the minimum requirements for key strengths. According to the authors, transitioning to elliptic curve Diffie-Hellman should avoid most known attacks. They also mentioned that clients and browsers could prevent potential downgrade attacks when contacting a server by increasing the accepted minimum size for Diffie-Hellman groups to at least 1024 bits. The authors notified major clients and servers about the vulnerabilities that they had uncovered, and convinced browsers such as Microsoft Edge (previously Internet Explorer), Chrome, Opera, and Firefox to take action and follow some of their recommendations.


In 2017, Amann et al. [4] conducted a large active Internet scan of 193M domains, looking at several different security aspects. They show that only 3.5% of all domains, mainly the top domains, implement HSTS. Of these, 0.2% were inconsistent. They also show that 56% of all domains that use HSTS set the attribute includeSubDomains, which has some security benefits but can pose availability problems if these sub-domains do not use HTTPS. Apart from HSTS, they also conclude that the security aspects that are easy to configure and carry low risk are the ones that see the most implementation on these domains.

Our thesis differs in some ways from the related work. A majority of the work that we have read was published before or just after TLS 1.3 was introduced as the new standard. Considering how long it took for TLS 1.2 to be implemented by a majority of the top domains, this might mean that security features that saw next to no implementation at the time could grow as TLS 1.3 becomes more widely used.

This thesis has a larger focus on dividing the domains on the Alexa Top 1M list into blocks and comparing them on similar features. Another aspect that is discussed more prominently in this thesis is the analysis of categories from the Alexa Top 1M website.


4 Method

In this section, we describe how we collected data from 2500 domains and used the findings to analyze some security aspects.

4.1 Testssl.sh

Data collection was done using testssl.sh [28], a free, open-source command-line tool that checks a specified server’s service on any port for its support of TLS/SSL ciphers and protocols, while also checking for recent cryptographic flaws and more. The tool also uses these findings to indicate how considerable a security risk each domain poses: testssl.sh gives every tested domain a score based on how well it handles security threats.

By default, testssl.sh collects a broad set of security information from a specified domain. This would provide our thesis with a lot of data, but collecting it takes time, which is why testssl.sh allows the use of flags to control what kind of data is collected. We included almost every test flag, such as the flag that makes testssl.sh check all 370 ciphers using OpenSSL [22] for every protocol. OpenSSL is an open-source implementation of the TLS and SSL protocols that can also be used as a cryptography library. We also included a flag that makes testssl.sh run a client simulation test. The client simulation performs a simulated handshake that is used to tell which cipher and protocol each client negotiated. We wanted a dataset with as much information as possible, while still keeping the time required for each test at a reasonable level. We had to keep the time within a few hours since we wanted to perform two tests on 2500 domains each day: one in the morning at 08:00 CEST and one at 17:00 CEST. We did this because we wanted to see if there were any major differences between accessing servers in the morning versus the evening, and to see whether we connected to any domains through different servers between the tests and whether that had any impact on the security aspects. We continued this process for 14 days, from the 20th of April 2021 to the 3rd of May 2021.

Testssl.sh allowed us to perform active scans [11]. In an active scan, a scanner actively sends packets to a remote target to generate network and application information. Testssl.sh takes a specified domain name as input and utilizes OpenSSL to conduct full TLS/SSL handshakes. After a successful handshake, the data sent by the server is stored or printed, depending on preference.


Modification

We only made some minor changes to the source code of testssl.sh. One was including the total scan time per domain in the CSV file, which was used to sanity check the results from day to day. The other change we made was how many domains we tested in parallel. The default was set to 20 domains and we changed it depending on the hardware of the machine running the test.

We created a small bash script that runs testssl.sh [28] with the flags specified in Table 4.1. The script is provided in Figure A.1 which makes the experiment easier to replicate.

| Flags | Purpose |
|---|---|
| --mode parallel | Allows running domains in parallel. |
| --ip one | Specifies that we only perform the search on the domain’s preferred IP address. |
| -p | Checks TLS/SSL protocols. |
| -h | Performs several HTTP header tests. |
| -s | Tests certain lists of cipher suites by strength. |
| -f | Checks robust (perfect) forward secrecy key exchange. |
| -P | Displays the server’s preferences: cipher order, with used OpenSSL client: negotiated protocol and cipher. |
| -S | Displays information from the server hello(s). |
| -c | Performs a client simulation that simulates a handshake with several standard clients so that you can figure out which clients can or cannot connect to your domain. |
| -E | Checks each of the 370 ciphers per protocol. |
| -U | Runs all available vulnerability tests, which include: heartbleed, ccs-injection, ticketbleed, robot, renegotiation, compression, breach, poodle, tls-fallback, sweet32, freak, drown, logjam, beast, lucky13, rc4. |
| --quiet | Suppresses terminal output banners from printing. |

Table 4.1: All flags used for testssl.sh.
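To make the flag combination in Table 4.1 concrete, the following is a minimal sketch of how such a command line could be assembled. It is not the script from Figure A.1. The file names, the path to the script, and the use of the --file and --csvfile options for mass testing and CSV output are assumptions made for illustration; nothing is executed.

```python
import shlex

# Flags copied from Table 4.1.
TABLE_4_1_FLAGS = [
    "--mode", "parallel",
    "--ip", "one",
    "-p", "-h", "-s", "-f", "-P", "-S", "-c", "-E", "-U",
    "--quiet",
]


def build_command(domain_file, csv_file, script="./testssl.sh"):
    """Return the argument list for one mass scan; the command is not run here."""
    # Assumption: --csvfile names the CSV output and --file names the list of targets.
    return [script, *TABLE_4_1_FLAGS, "--csvfile", csv_file, "--file", domain_file]


# Hypothetical file names, chosen only for this example.
cmd = build_command("domains.txt", "2021-04-20_0800.csv")
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids quoting problems if a file name contains spaces.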

Successfully collected data is stored in one CSV file per test, named after the date and time at which the test was initiated. We decided to store our collected data in CSV files since they are easy to manipulate and easy to view in spreadsheets. When our script is run, it calls testssl.sh with the flags specified above. Testssl.sh then runs a set number of tests in parallel, and as soon as it completes one of these tests it starts a new one. If the set number is 50, for example, it initiates 50 tests at the same time and then keeps 50 tests running at all times, which speeds up the process significantly. We also included a timeout of 20 minutes, which allows testssl.sh to complete longer tests while preventing the tool from getting stuck.
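The combination of a fixed number of parallel tests and a per-test timeout can be sketched as follows. This is an illustration of the idea only, not how testssl.sh implements it internally. The stand-in commands, the function names, and the small numbers are assumptions so that the sketch runs quickly; in our setting the corresponding values would be, for example, 50 workers and a timeout of 1200 seconds.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_with_timeout(cmd, timeout_s):
    """Run one command and report 'timeout' instead of waiting forever."""
    try:
        subprocess.run(cmd, timeout=timeout_s, check=False,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        return "done"
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return "timeout"


def run_pool(commands, workers, timeout_s):
    """Keep at most `workers` commands running; a freed slot picks up the next one."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: run_with_timeout(c, timeout_s), commands))


# Stand-in commands so the sketch runs without testssl.sh: one quick, one too slow.
quick = [sys.executable, "-c", "pass"]
slow = [sys.executable, "-c", "import time; time.sleep(30)"]
results = run_pool([quick, slow, quick], workers=2, timeout_s=3)
print(results)
```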

Datasets and statistics creation

To create the datasets we used pandas [21], a library for Python that allows easy manipulation of data frames and extraction of data from CSV files. This allowed us to configure the data frames with the information from the testssl.sh tests. One dataset was created from each test, for a total of 14 datasets per week. The datasets created with pandas contain all the important findings from one domain on one line, making them easier to read and compare than the raw data from testssl.sh. The raw data from testssl.sh came in the form of an ID for each test performed on a domain, together with the finding for that test. These tests were distributed on one row each, meaning that the information from one domain spanned multiple rows. We took these IDs and used them as columns, or headers, for each domain, with the finding for each corresponding ID as the value. We ended up with 513 headers in each dataset. For more information on the headers in the datasets, see Section A.1 in the appendix.
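The reshaping from one row per test to one row per domain can be sketched as below. The sketch deliberately uses only the Python standard library as a stand-in for the pandas step, so it is not the code used to build our datasets. The column names (id, fqdn/ip, finding), the domains, and the findings in the toy input are assumptions for illustration.

```python
import csv
import io

# Toy input in the long layout: one row per (domain, test ID).
RAW = """id,fqdn/ip,finding
TLS1_2,example.org/192.0.2.1,offered
TLS1_3,example.org/192.0.2.1,offered with final
TLS1_2,example.net/192.0.2.2,offered
TLS1_3,example.net/192.0.2.2,not offered
"""


def long_to_wide(rows):
    """Collect all (ID, finding) pairs of a domain into one dict per domain."""
    wide = {}
    for row in rows:
        domain = row["fqdn/ip"].split("/")[0]
        wide.setdefault(domain, {})[row["id"]] = row["finding"]
    return wide


wide = long_to_wide(csv.DictReader(io.StringIO(RAW)))
# The test IDs become the headers of the wide layout.
headers = sorted({test_id for findings in wide.values() for test_id in findings})
for domain, findings in wide.items():
    print(domain, [findings.get(h, "") for h in headers])
```

A domain that has no row for a given ID ends up with an empty field, which corresponds to the empty fields mentioned for domains that did not record any data.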

Virtual Machine

To shorten the time it takes to collect data from 2500 domains using testssl.sh, we split the list across three machines. Two of them were running Windows and therefore had to use virtual machines to run testssl.sh. For these we decided on the long-term support version of Ubuntu, Ubuntu 20.04 [16]. The last machine was an HP Envy 13 laptop running Pop!_OS [27].

4.2 Domain selection

The domains used in our tests were selected from the Alexa Global Top 1M list [5], which is a list of the Internet’s most popular domains. The list changes every day, but we decided to use the list from the 19th of April 2021, which means that we used the same domains for every test that was performed. We extracted five domain blocks with 500 entries each from the Alexa list: the first 500, the next 500, the 500 up to 10 000, the 500 up to 100 000, and the last 500. We split the Alexa list into domain blocks from different parts of the list to more easily isolate and identify differences and similarities between popular and less popular domains. We could have selected the first 2500 domains, but that would have provided us with less interesting results overall. The total number of domains in the Alexa list from this date is almost half a million (484 418 domains). The block selection can be seen in Table 4.2.

| Block | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Domains | 1-500 | 501-1000 | 9501-10 000 | 99 501-100 000 | The last 500 from the list |

Table 4.2: Domains taken from Alexa Top 1M divided into five blocks.
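The block selection in Table 4.2 amounts to simple slicing of the rank-ordered list. The sketch below illustrates this under the assumption that the list is available as a Python list ordered by rank; the placeholder domain names and the function name are hypothetical.

```python
def select_blocks(ranked, size=500):
    """Return the five blocks of Table 4.2 from a rank-ordered list of domains."""
    ends = [500, 1000, 10_000, 100_000]        # rank at which blocks 1 to 4 end
    blocks = [ranked[end - size:end] for end in ends]
    blocks.append(ranked[-size:])              # block 5: the last 500 entries
    return blocks


# Placeholder names stand in for the real list of 484 418 domains.
ranked = [f"domain-{rank}" for rank in range(1, 484_418 + 1)]
blocks = select_blocks(ranked)
print([(block[0], block[-1]) for block in blocks])
```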

4.3 Alexa categories

Alexa used to publish lists of top domains for several categories on the web, but these were discontinued as of September 21st, 2020 [5]. We wanted to compare our results from the Top 1M list to these categories and decided to borrow a list of domains from a similar project, conducted in 2020, which used these categories.

There are 16 different categories, each containing a list of the top 50 domains within that category. The 16 categories are: Adult, Arts, Business, Computers, Games, Health, Home, Kids and Teens, News, Recreation, Reference, Regional, Science, Shopping, Society, and Sports. We performed two scans on the categories on the 11th of May 2021, at 08:00 and 17:00 CEST.

4.4 Limitations

We have considered some limitations. Our method was limited to the boundaries of the tools that were used to perform the tests. We decided to separate this section into two parts, limitations with testssl.sh and limitations to the Alexa top 1M list.

One limitation with testssl.sh is that it only supports Unix-based systems, which means that it will not run natively on Windows. Additionally, testssl.sh is limited to its built-in functionality. This meant that testssl.sh did not always have all the options that we needed unless we rewrote the program ourselves. For example, testssl.sh only offers an option for running the tests on all IP addresses or only the preferred one, not on a set number of addresses for each domain. Testssl.sh also does not consider every known vulnerability, which meant that we had to limit ourselves to the threats that were available within the program.

The Alexa Top 1M list can only provide the most popular websites, which does not include, for example, websites that are known to be vulnerable. This could portray a false reality of the Internet with only functional websites. Alexa Top 1M contains only websites that users intend to visit, not sites that they happen upon accidentally, for example by following links. An alternative method that might provide a more realistic view of the Internet could be to create a list of completely random domains. The list also contains several non-working domains, which made testing difficult. Finally, the Alexa list contains base domains such as google.com. When entered into a browser, such a domain redirects to a sub-domain such as https://www.google.com, which is the actual domain serving the expected page. With testssl.sh we only access the base domains on the Alexa list, and therefore we do not collect any security information for these sub-domains.


5 Results

This section presents the results of the tests that were performed on the 2500 domains and the 16 categories. We tested all domains, but not all of them were reachable for every test and some were not reachable at all, which meant that some data was lost during our scans. Also, some domains did not record any data for some of the runs, leaving their corresponding fields empty. These domains were still counted as tested domains for that scan, but they do not contribute to the statistics. We do not have a definitive answer to why this happened, but from the output of testssl.sh we suspect that it has something to do with our firewalls. The average number of tested domains was 2443 for 08:00 and 2446 for 17:00.

5.1 Negotiated TLS version

Testssl.sh tests domains with TLS versions 1.3, 1.2, 1.1, and 1.0, as well as SSL versions 3 and 2. Protocol tests are performed through simulated handshakes, where testssl.sh assumes the role of a client machine supporting all protocol versions on specified domain names. If the server does not support TLS 1.3 the connection will downgrade to one of the other versions. We then extracted the negotiated protocol for each connection and calculated the average number of tested domains that use each TLS version. We did this for both times of day and each domain block defined in Table 4.2. This can be seen in both Table 5.1 and Figure 5.1.
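The downgrade behaviour described above can be illustrated with a simplified model in which both sides have a set of supported versions and the newest common version is chosen. This is a sketch of the outcome only; it does not model the actual handshake messages, and the function name and the example server configurations are assumptions.

```python
# Protocol versions ordered from oldest to newest, as listed in the text.
ORDER = ["SSLv2", "SSLv3", "TLS 1.0", "TLS 1.1", "TLS 1.2", "TLS 1.3"]


def negotiate_version(client_versions, server_versions):
    """Return the newest version both sides support, or None if there is none."""
    common = set(client_versions) & set(server_versions)
    for version in reversed(ORDER):
        if version in common:
            return version
    return None


client = ORDER  # the scanner is assumed to offer every version
print(negotiate_version(client, ["TLS 1.2", "TLS 1.3"]))
print(negotiate_version(client, ["TLS 1.0", "TLS 1.2"]))
print(negotiate_version(client, []))
```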

From Table 5.1 we can see that the most used TLS version was TLS 1.3. It was used more frequently than any other version at both times of day, as can be seen more clearly in Figure 5.1a, which illustrates the average number of domains negotiating each TLS version at each time of day. The only relevant difference we could determine between the times of day is that we completed more connections in the afternoon than in the morning. Other than this, no significant differences were identified.

Figure 5.1b illustrates the variation in protocol selection between the five predetermined blocks. The largest difference was observed in domain range (99 500,100 000], where TLS 1.3 was used over 50% more than its predecessor 1.2. Domain range (0,500] had an equal spread of the two latest versions of TLS. Domain range (99 500,100 000] also had the most uses of TLS 1.0 and in the last 500 domains, a version below 1.2 was never negotiated.

Figure 5.1b shows similar results for most of the tested blocks: TLS 1.3 was slightly more used than TLS 1.2 across all 2500 domains. Domain range (99 500,100 000], on the other hand, had a significant increase in the use of TLS 1.3, with a corresponding decrease in TLS 1.2 for that block.


| Time of day | 08:00 | 17:00 |
|---|---|---|
| Number of tested domains | 2443 | 2446 |
| Domains with no result | 198 | 188 |
| TLS 1.3 | 1211 | 1220 |
| TLS 1.2 | 1029 | 1033 |
| TLS 1.1 | 0 | 0 |
| TLS 1.0 | 5 | 5 |

Table 5.1: Average number of domains, rounded to closest integer, for each TLS version.

Figure 5.1: Average number of domains, rounded to closest integer, for each TLS version. Panels: (a) Time of day (08:00, 17:00), (b) Per block. y-axis: number of domains; series: TLS 1.3, TLS 1.2, TLS 1.1, TLS 1.0.

This could be because this set of domains leaned more towards security than convenience. This block could, for example, have consisted either of newly created web pages that had made sure to implement the new standard, or of domains that simply required a higher level of security.

From Table 5.1 we also determined that TLS 1.3 was used in around 50% of the connections at both 08:00 and 17:00. The domains that did not use 1.3 downgraded to an earlier version of TLS. TLS 1.2 was the second most used protocol and was used 1029 times at 08:00 and 1033 times at 17:00. This corresponds to about 42% of the connections. This told us that about 8% of all domains either negotiated a security protocol below TLS 1.2 or did not return any information.
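The percentages above follow directly from Table 5.1, as the short calculation below shows. The numbers are copied from the table; the dictionary layout and the function name are only illustrative.

```python
# Averages copied from Table 5.1.
table_5_1 = {
    "08:00": {"tested": 2443, "no_result": 198, "1.3": 1211, "1.2": 1029, "1.1": 0, "1.0": 5},
    "17:00": {"tested": 2446, "no_result": 188, "1.3": 1220, "1.2": 1033, "1.1": 0, "1.0": 5},
}


def shares(row):
    """Percentage of tested domains for TLS 1.3, TLS 1.2, and everything else."""
    rest = row["1.1"] + row["1.0"] + row["no_result"]
    counts = {"TLS 1.3": row["1.3"], "TLS 1.2": row["1.2"], "rest": rest}
    return {name: round(100 * n / row["tested"], 1) for name, n in counts.items()}


for time_of_day, row in table_5_1.items():
    print(time_of_day, shares(row))
```

The "rest" share combines the domains that negotiated a version below TLS 1.2 with the domains that returned no result, which is the roughly 8% mentioned in the text.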

The reason we did not see any servers negotiating TLS 1.1 as their highest version is probably that most servers are Unix/Linux-based and make use of OpenSSL, as was also discussed in [4]. OpenSSL introduced TLS 1.1 and 1.2 in the same release in 2012. This means that these servers most likely chose the strongest version available at the time.

5.2 Negotiated cipher suites

Cipher suite selection is tested similarly to how testssl.sh tests TLS protocols. Testssl.sh sends its supported ciphers, ordered from most to least secure, and the server chooses the highest one it supports. A list of the negotiated ciphers from the 28 performed scans is compiled in Table 5.2, divided by TLS version.
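The selection rule described above can be sketched as picking the first entry in the client's ordered offer that the server also supports. This is a simplified illustration, assuming the server honours the client's order; a server may instead enforce its own preference order. The function name and the example offer are assumptions, with cipher names taken from Table 5.2.

```python
def pick_cipher(client_offer, server_supported):
    """Return the first cipher in the client's ordered offer that the server supports."""
    for cipher in client_offer:
        if cipher in server_supported:
            return cipher
    return None


# Illustrative ordering only, from stronger to weaker.
offer = [
    "ECDHE-RSA-AES256-GCM-SHA384",
    "ECDHE-RSA-AES128-GCM-SHA256",
    "AES128-SHA",
]
print(pick_cipher(offer, {"AES128-SHA", "ECDHE-RSA-AES128-GCM-SHA256"}))
print(pick_cipher(offer, {"AES128-SHA"}))
```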

Table 5.2 contains the total number of times each cipher suite was negotiated per protocol. The column on the far right shows the counts for the top 1000 domains. We have highlighted the cipher suites that testssl.sh considers insecure in red. Some of these weak cipher suites are marked with a parenthesis containing CBC. These cipher suites use cipher block chaining, a mode of operation that testssl.sh considers insecure. The cipher suites marked with the parenthesis matching cipher in list missing were cipher suites unknown to testssl.sh. The cipher suites that were not available in testssl.sh’s list of ciphers but still provided a significant level of security, due to their algorithm or key size, are highlighted in orange.

| Cipher suite | Times used total | Times used (0,1000] |
|---|---|---|
| TLS 1.3 | | |
| TLS_AES_256_GCM_SHA384, 253 bit ECDH (X25519) | 24876 | 9733 |
| TLS_AES_128_GCM_SHA256, 253 bit ECDH (X25519) | 4651 | 2169 |
| TLS_AES_256_GCM_SHA384, 253 bit ECDH (X25519) (limited sense as client will pick) | 2654 | 330 |
| TLS_AES_256_GCM_SHA384, 384 bit ECDH (P-384) | 703 | 74 |
| TLS_AES_256_GCM_SHA384, 256 bit ECDH (P-256) | 605 | 237 |
| TLS_AES_128_GCM_SHA256, 256 bit ECDH (P-256) | 295 | 191 |
| TLS_CHACHA20_POLY1305_SHA256, 253 bit ECDH (X25519) | 251 | 140 |
| TLS 1.2 | | |
| ECDHE-RSA-AES128-GCM-SHA256, 256 bit ECDH (P-256) | 11622 | 6709 |
| ECDHE-RSA-AES256-GCM-SHA384, 256 bit ECDH (P-256) | 8509 | 2456 |
| AES128-GCM-SHA256 (matching cipher in list missing) | 1529 | 561 |
| ECDHE-RSA-AES256-GCM-SHA384, 384 bit ECDH (P-384) | 1342 | 457 |
| ECDHE-ECDSA-AES128-GCM-SHA256, 256 bit ECDH (P-256) (matching cipher in list missing) | 557 | 253 |
| ECDHE-RSA-AES256-SHA384, 256 bit ECDH (P-256) (cbc) | 493 | 196 |
| DHE-RSA-AES256-SHA256, 2048 bit DH (cbc) (matching cipher in list missing) | 477 | 58 |
| ECDHE-ECDSA-AES128-GCM-SHA256, 256 bit ECDH (P-256) | 459 | 398 |
| DHE-RSA-AES128-GCM-SHA256, 2048 bit DH (matching cipher in list missing) | 435 | 28 |
| DHE-RSA-AES256-SHA256, 1024 bit DH (cbc) (matching cipher in list missing) | 413 | 220 |
| ECDHE-ECDSA-AES256-GCM-SHA384, 256 bit ECDH (P-256) | 395 | 28 |
| AES128-GCM-SHA256 | 280 | 280 |
| AES256-SHA (cbc) | 252 | 140 |
| DHE-RSA-AES256-GCM-SHA384, 1024 bit DH | 221 | 83 |
| ECDHE-RSA-RC4-SHA, 521 bit ECDH (P-521) (matching cipher in list missing) | 221 | 110 |
| ECDHE-RSA-RC4-SHA, 256 bit ECDH (P-256) (matching cipher in list missing) | 219 | 56 |
| AES256-GCM-SHA384 | 172 | 144 |
| AES128-SHA (cbc) | 138 | 83 |
| AES256-SHA256 (cbc) | 112 | 57 |
| ECDHE-RSA-AES128-GCM-SHA256, 256 bit ECDH (P-256) (matching cipher in list missing) | 89 | 61 |
| ECDHE-RSA-AES128-GCM-SHA256, 384 bit ECDH (P-384) | 84 | 56 |
| ECDHE-ECDSA-AES128-SHA, 521 bit ECDH (P-521) (cbc) (matching cipher in list missing) | 84 | 0 |
| ECDHE-RSA-AES128-GCM-SHA256, 570 bit ECDH (B-571) | 56 | 28 |
| ECDHE-RSA-AES256-SHA, 256 bit ECDH (P-256) (cbc) | 56 | 28 |
| ECDHE-RSA-RC4-SHA, 570 bit ECDH (B-571) (matching cipher in list missing) | 56 | 28 |
| DHE-RSA-AES256-GCM-SHA384, 2048 bit DH | 56 | 28 |
| ECDHE-RSA-AES128-SHA, 256 bit ECDH (P-256) (cbc) | 56 | 56 |
| ECDHE-RSA-AES256-SHA, 521 bit ECDH (P-521) (cbc) | 28 | 0 |
| DHE-RSA-AES256-GCM-SHA384, 4096 bit DH | 28 | 0 |
| ECDHE-RSA-AES256-GCM-SHA384, 521 bit ECDH (P-521) | 28 | 28 |
| AES256-SHA256 (cbc) (limited sense as client will pick) | 28 | 0 |
| ECDHE-RSA-AES128-SHA256, 521 bit ECDH (P-521) (cbc) (matching cipher in list missing) | 28 | 0 |
| ADH-AES256-GCM-SHA384, 3072 bit DH (matching cipher in list missing) | 28 | 0 |
| ECDHE-RSA-AES256-SHA384, 384 bit ECDH (P-384) (cbc) (matching cipher in list missing) | 28 | 28 |
| ECDHE-RSA-AES128-GCM-SHA256, 521 bit ECDH (P-521) (matching cipher in list missing) | 28 | 0 |
| ECDHE-ECDSA-AES128-SHA, 256 bit ECDH (P-256) (cbc) (matching cipher in list missing) | 28 | 0 |
| ECDHE-RSA-AES256-SHA, 521 bit ECDH (P-521) (cbc) (matching cipher in list missing) | 26 | 0 |
| ECDHE-ECDSA-AES128-SHA, 570 bit ECDH (B-571) (cbc) (matching cipher in list missing) | 26 | 0 |
| Default cipher empty (if IIS6 give OpenSSL 1.0.1 a try) (matching cipher in list missing) | 3 | 3 |
| TLS 1.1 | | |
| TLS 1.0 | | |
| DHE-RSA-AES256-SHA, 1024 bit DH (cbc) (limited sense as client will pick) | 78 | 28 |
| AES128-SHA (cbc) | 56 | 28 |

Table 5.2: Total number of cipher suites negotiated per protocol for all domains and for the top 1000 domains. Text highlighted in red signifies weak cipher suites and text highlighted in orange are potentially weak ciphers.

The most used cipher suite for TLS 1.3 was TLS_AES_256_GCM_SHA384, 253 bit ECDH (X25519), which was used 24876 times. The most used cipher suite for TLS 1.2 was ECDHE-RSA-AES128-GCM-SHA256, 256 bit ECDH (P-256), which was used 11622 times. TLS 1.1 was never used. The most used cipher suite for TLS 1.0 was DHE-RSA-AES256-SHA, 1024 bit DH (cbc), which was used only 78 times. The third most used cipher suite in TLS 1.2 was AES128-GCM-SHA256 (matching cipher in list missing), which was considered a potentially weak cipher since testssl.sh had no record of it. The sixth most used cipher suite in TLS 1.2 was ECDHE-RSA-AES256-SHA384, 256 bit ECDH (P-256) (cbc), a cipher suite that utilizes cipher block chaining (CBC) and was considered insecure by testssl.sh; it was used 493 times. TLS 1.0 only negotiated insecure ciphers utilizing CBC.

In total there were 47 unique cipher suites, and 24 of these were considered insecure. This means that more than 50% of the unique negotiated cipher suites (24 of 47, about 51%) were insecure. Even though the secure ones were used more often than the insecure ones, as seen in Table 5.2, it is still concerning that more than half of the negotiated cipher suites were considered insecure.

One would expect the top 1000 most popular domains to be significantly better than those further down the list. However, many of the weak ciphers were also present there. Among the top 1000 domains, the fourth cipher suite in TLS 1.3, TLS_AES_256_GCM_SHA384, 384 bit ECDH (P-384), was used less than the fifth, sixth, and seventh, unlike across all domains, even though it provides better security due to its key size. If a cipher was used 28 times, it was probably only used by one domain, since we ran a total of 28 scans. With this in mind, we see that a majority of the 47 cipher suites were being used by the top 1000 most popular domains. This is worrisome, since some of the more popular cipher suites in TLS 1.2 were insecure, such as those using CBC or weak key parameters.
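The reasoning that 28 uses correspond to roughly one domain can be applied to any row of Table 5.2 by dividing the total by the number of scans. The sketch below does this for a few rows copied from the table. It is only an estimate: it assumes that a domain negotiated the same cipher suite in every scan, which need not hold.

```python
SCANS = 28  # two scans per day for 14 days

# (cipher suite, times used in total), copied from Table 5.2.
rows = [
    ("TLS_AES_256_GCM_SHA384, 253 bit ECDH (X25519)", 24876),
    ("ECDHE-RSA-AES128-GCM-SHA256, 256 bit ECDH (P-256)", 11622),
    ("AES256-SHA (cbc)", 252),
    ("ECDHE-RSA-AES256-SHA, 521 bit ECDH (P-521) (cbc)", 28),
]


def approx_domains(times_used, scans=SCANS):
    """Average number of domains per scan that negotiated the cipher suite."""
    return times_used / scans


for name, used in rows:
    print(f"{approx_domains(used):7.1f}  {name}")
```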

5.3 HSTS

From our datasets we extracted all domains that had not offered HSTS, in total and per domain block. A significant number of domains did not support HSTS, which meant that they were unprotected against, or exposed to, possible man-in-the-middle attacks. The average number of domains that did not offer HSTS was 1677. Figure 5.2 illustrates the average number of domains per block that did not support HSTS.

Figure 5.2: Average number of domains not supporting HSTS. x-axis: block; y-axis: number of domains.

Domain range (500,1000] had the highest number of domains that supported HSTS and the last 500 had the least. Persistent throughout the five blocks was that about 60% or more of the tested domains did not support HSTS. Excluding domain range (0,500], we could see almost linear growth of the number of domains not offering HSTS.

As previously mentioned, we only looked at base domains, meaning that we did not follow any redirects. This could have led to inaccuracies, as noticed in [4], where base domains such as google.com only support HSTS for their sub-domains. We believe that this could be true for some domains in the first 500, in which case the number of domains not offering HSTS would grow approximately linearly across all five blocks.
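Whether a domain offers HSTS is decided by the Strict-Transport-Security response header. As a minimal sketch of what such a check involves, the code below parses the header value into its directives, including the includeSubDomains attribute mentioned in the related work. It is not how testssl.sh performs the test; the function names and the example headers are assumptions.

```python
def parse_hsts(header_value):
    """Parse a Strict-Transport-Security header value into its directives."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for part in header_value.split(";"):
        name, _, value = part.strip().partition("=")
        name = name.lower()  # directive names are case-insensitive
        if name == "max-age":
            policy["max_age"] = int(value.strip('"'))
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


def offers_hsts(headers):
    """True if the response headers carry an HSTS policy with a positive max-age."""
    value = next((v for k, v in headers.items()
                  if k.lower() == "strict-transport-security"), None)
    return value is not None and (parse_hsts(value)["max_age"] or 0) > 0


print(parse_hsts("max-age=31536000; includeSubDomains"))
print(offers_hsts({"Content-Type": "text/html"}))
```

A max-age of zero is treated as not offering HSTS here, since it instructs the browser to forget the policy.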


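| | Total average | (0,500] | (500,1000] | (9500,10 000] | (99 500,100 000] | last 500 |
|---|---|---|---|---|---|---|
| POODLE | 44 | 12 | 8 | 16 | 7 | 2 |
| Fallback | 189 | 51 | 39 | 57 | 29 | 13 |
| FREAK | 1 | 0 | 0 | 1 | 0 | 0 |
| Logjam | 32 | 7 | 12 | 9 | 4 | 0 |
| BEAST | 1275 | 285 | 263 | 261 | 251 | 215 |

Table 5.3: Average of domains weak to each vulnerability.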

Figure 5.3: Graphs illustrating average of all analysed vulnerabilities and interesting findings. Panels: (a) Fallback, (b) Logjam, (c) BEAST, each per block with the number of domains on the y-axis; (d) All analysed vulnerabilities (POODLE, Fallback, FREAK, Logjam, BEAST) with the number of domains on the y-axis.

5.4 Vulnerabilities

Our datasets contain information about more vulnerabilities than the ones that are presented in the thesis. The ones presented in the thesis were selected because they are associated with the TLS and SSL protocols. The tested vulnerabilities included POODLE, TLS-fallback, FREAK, Logjam, and BEAST. The collected data on these vulnerabilities are presented in Table 5.3 and Figure 5.3.

Table 5.3 shows the average number of domains weak to each of the selected vulnerabilities, in total and per block. Out of the tested domains, 1275 were weak to BEAST, making it the vulnerability that most domains were exposed to. Only one domain was exposed to FREAK, and it was located in block three. Summed over all vulnerabilities, the first 500 domains were the most exposed of the 2500 domains.
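The statement about the most exposed block can be checked by summing the columns of Table 5.3. The per-block values below are copied from the table; note that a domain weak to several vulnerabilities is counted once per vulnerability in these sums.

```python
blocks = ["(0,500]", "(500,1000]", "(9500,10 000]", "(99 500,100 000]", "last 500"]

# Per-block averages copied from Table 5.3.
table_5_3 = {
    "POODLE":   [12, 8, 16, 7, 2],
    "Fallback": [51, 39, 57, 29, 13],
    "FREAK":    [0, 0, 1, 0, 0],
    "Logjam":   [7, 12, 9, 4, 0],
    "BEAST":    [285, 263, 261, 251, 215],
}

# Sum the five vulnerabilities column by column to get one figure per block.
per_block = [sum(column) for column in zip(*table_5_3.values())]
most_exposed = blocks[per_block.index(max(per_block))]

print(dict(zip(blocks, per_block)))
print("most exposed:", most_exposed)
```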


Figure 5.4: Venn diagram illustrating tested domains that were exposed to the analysed vulnerabilities at least once. Sets: POODLE, Fallback, Logjam, BEAST. Region counts in extraction order, without their region assignment: 1117, 10, 1, 40, 125, 2, 18, 0, 30, 0, 1, 0, 15, 0, 1.

Figure 5.3 shows four different diagrams. Figures 5.3a, 5.3b, and 5.3c contain the average number of domains in each block weak to Fallback, Logjam, and BEAST, respectively. Figure 5.3d illustrates the average for each of the selected vulnerabilities over all domains, and clearly shows how many more domains were exposed to BEAST than to the other vulnerabilities. In Figure 5.3a we can see that the number of domains possibly exposed to Fallback followed no clear pattern. Domain ranges (0,500] and (9500,10 000] contained the highest number of domains weak to this vulnerability. For Logjam, Figure 5.3b shows that most of the domains exposed to this vulnerability were located in domain range (500,1000]. Figure 5.3c shows a roughly linear decline in the number of domains exposed to BEAST per block, which can be seen by how closely the bars follow the trend line.

Figure 5.4 depicts the tested domains that were exposed to the analyzed vulnerabilities at least once. Just as in Figure 5.3d, BEAST was the largest threat to the tested domains. As before, only a small number of domains were exposed to Fallback and Logjam. Domains that were exposed to the other vulnerabilities seem, in general, to also be exposed to BEAST. The one domain weak to all four vulnerabilities in the diagram was goo.ne.jp, located in domain range (500,1000], which can be seen in Figure A.3 in the appendix.

The only domain weak to FREAK was xunbaozhifu.com in domain range (9500,10 000] as seen in Table 5.3. This domain was also weak to POODLE and BEAST, meaning it counts towards the 30 domains located at the bottom of Figure 5.4.
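The regions of a Venn diagram such as Figure 5.4 correspond to exact combinations of vulnerabilities per domain. The sketch below shows how such regions could be counted. It uses only the two domains named in the text, so the output is not the data behind the figure; the data structure and function name are assumptions.

```python
from collections import Counter

# Only the two domains named in the text; a full run would include every scanned domain.
weak_to = {
    "goo.ne.jp": {"POODLE", "Fallback", "Logjam", "BEAST"},
    "xunbaozhifu.com": {"FREAK", "POODLE", "BEAST"},
}

# FREAK is not one of the sets drawn in Figure 5.4.
VENN_SETS = {"POODLE", "Fallback", "Logjam", "BEAST"}


def venn_regions(weak_to):
    """Count domains per exact combination of the vulnerabilities shown in the diagram."""
    regions = Counter()
    for vulnerabilities in weak_to.values():
        combo = frozenset(vulnerabilities & VENN_SETS)
        if combo:
            regions[combo] += 1
    return regions


regions = venn_regions(weak_to)
for combo, count in regions.items():
    print(sorted(combo), count)
```

Because FREAK is dropped before counting, xunbaozhifu.com lands in the region shared by only POODLE and BEAST, which matches how the text places it in the diagram.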


Venn diagrams for all blocks are presented in Section A.2. Looking at all the graphs in that section, we can deduce that the differences between the blocks were minuscule. Also, the spread of domains with different combinations of vulnerabilities was similar to that in Figure 5.4, which contains all domains.

5.5 TLS versions in different browsers

Testssl.sh runs client simulations by simulating different browsers and returns the negotiated TLS version and cipher suite. We focused on analyzing data from the latest versions of the more popular browsers [26]: Chrome_79_win10, FireFox_71_win10, Edge_17_win10, and Safari_130_osx_10146. The collected data from these browsers are presented in Table 5.4 and Figure 5.5.

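(a) TLS 1.3

| | Total average | (0,500] | (500,1000] | (9500,10 000] | (99 500,100 000] | last 500 |
|---|---|---|---|---|---|---|
| Chrome | 984 | 236 | 223 | 231 | 294 | 0 |
| Firefox | 1213 | 236 | 223 | 231 | 294 | 229 |
| Edge | 0 | 0 | 0 | 0 | 0 | 0 |
| Safari | 1213 | 236 | 223 | 231 | 294 | 229 |

(b) TLS 1.2

| | Total average | (0,500] | (500,1000] | (9500,10 000] | (99 500,100 000] | last 500 |
|---|---|---|---|---|---|---|
| Chrome | 823 | 236 | 218 | 200 | 169 | 0 |
| Firefox | 1029 | 236 | 217 | 200 | 169 | 207 |
| Edge | 2241 | 472 | 440 | 430 | 463 | 436 |
| Safari | 1029 | 236 | 218 | 199 | 169 | 207 |

(c) TLS 1.1

| | Total average | (0,500] | (500,1000] | (9500,10 000] | (99 500,100 000] | last 500 |
|---|---|---|---|---|---|---|
| Chrome | 0 | 0 | 0 | 0 | 0 | 0 |
| Firefox | 0 | 0 | 0 | 0 | 0 | 0 |
| Edge | 1 | 0 | 1 | 0 | 0 | 0 |
| Safari | 0 | 0 | 0 | 0 | 0 | 0 |

(d) TLS 1.0

| | Total average | (0,500] | (500,1000] | (9500,10 000] | (99 500,100 000] | last 500 |
|---|---|---|---|---|---|---|
| Chrome | 5 | 1 | 1 | 1 | 2 | 0 |
| Firefox | 5 | 1 | 1 | 1 | 2 | 0 |
| Edge | 5 | 1 | 1 | 1 | 2 | 0 |
| Safari | 5 | 1 | 1 | 1 | 2 | 0 |

Table 5.4: Average of chosen TLS version per block for each browser.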

Table 5.4 shows which TLS version was negotiated per block for each browser. Table 5.4a covers TLS 1.3, Table 5.4b TLS 1.2, Table 5.4c TLS 1.1, and Table 5.4d TLS 1.0. Firefox and Safari saw practically equal support among the domains, differing by at most one domain per block.

Chrome had the same results as Firefox and Safari for the first four blocks in TLS 1.3, but negotiated no connections in the last 500 domains. In fact, our selected version of Chrome never seemed to be supported by the last 500 domains for any TLS version. Upon further investigation, we saw that the machine running the test for this block had that version misplaced in the source script. However, we believe that it would have yielded results similar to those of Firefox and Safari.

Edge did not negotiate TLS 1.3 with any domain. The cause is that the simulated version of Microsoft Edge does not support TLS 1.3 natively. It is, however, possible to turn this functionality on by marking a checkbox in the settings tab [19]. All browsers negotiated TLS 1.0 equally often, and only Edge negotiated TLS 1.1, for one domain.
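If Edge falls back to TLS 1.2 wherever the other browsers negotiate TLS 1.3, Edge's TLS 1.2 counts should match the sum of another browser's TLS 1.3 and TLS 1.2 counts. The short check below compares Edge with Firefox using the per-block values copied from Table 5.4a and Table 5.4b.

```python
blocks = ["(0,500]", "(500,1000]", "(9500,10 000]", "(99 500,100 000]", "last 500"]

# Per-block averages copied from Table 5.4a and Table 5.4b.
firefox_tls13 = [236, 223, 231, 294, 229]
firefox_tls12 = [236, 217, 200, 169, 207]
edge_tls12 = [472, 440, 430, 463, 436]

# Difference between Edge's TLS 1.2 count and Firefox's TLS 1.3 + TLS 1.2 count.
differences = [e - (f13 + f12)
               for e, f13, f12 in zip(edge_tls12, firefox_tls13, firefox_tls12)]

for block, diff in zip(blocks, differences):
    print(block, diff)
```

The counts agree to within one domain per block, which is consistent with rounding of the averages and with the single domain for which Edge negotiated TLS 1.1.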

Figure 5.5 presents the average number of domains that negotiated each TLS version per block for the selected browsers. Figure 5.5a illustrates the results for Firefox and Figure 5.5b describes the results for Microsoft Edge. Firefox switched between TLS 1.3 and TLS 1.2 quite evenly, while TLS 1.1 and TLS 1.0 were seldom negotiated. Edge used almost only TLS 1.2 for all connections.

Figure 5.5: TLS version negotiation in selected browsers. Panels: (a) Firefox, (b) Edge. x-axis: block; y-axis: number of domains; series: TLS 1.3, TLS 1.2, TLS 1.1, TLS 1.0.

5.6 TLS versions for mobile

The client simulation also included tests for mobile devices. In this section, we present the results from Android_X and Safari_121_ios_122.

[Figure 5.6, panels (a) Android and (b) iOS: number of domains (y-axis) per block (x-axis: (0,500], (500,1000], (9500,10 000], (99 500,100 000], last 500) for TLS 1.3, TLS 1.2, TLS 1.1 and TLS 1.0.]

Figure 5.6: TLS negotiation for mobile clients.

As shown in Figure 5.6, which counts the negotiated TLS version per block, the two mobile clients showed similar results for every block, differing only by one domain on average for TLS 1.2 in the first 500 domains. Both Android and iOS negotiated TLS 1.1 and TLS 1.0 equally rarely.

5.7 Categories

The News category had more domains that negotiated TLS 1.2 than TLS 1.3, a high number of domains vulnerable to BEAST, and a high number of domains not using HSTS. This indicates that News is overall the least secure category. A user browsing the Adult category could expect a much higher security level than someone browsing the News category. It seems reasonable to assume that domains allowing users to post their own content would have stronger security than domains that do not, since they handle more personal information. The same reasoning applies to other types of domains that handle personal information, found in categories such as Health and Recreation. Domains in these categories probably allow their users to create accounts and interact with others.
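The comparison above rests on three per-category indicators: the negotiated TLS version, BEAST vulnerability, and missing HSTS. A minimal sketch of how such indicators could be tallied per category is shown below. The record layout, the function `summarize`, and every value in `SAMPLE` are invented for illustration; they are not measurements from this study and not the output format of testssl.sh.

```python
from collections import defaultdict

# Hypothetical per-domain scan records. All values are made up for illustration.
SAMPLE = [
    {"category": "News", "tls": "TLS 1.2", "beast": True, "hsts": False},
    {"category": "News", "tls": "TLS 1.2", "beast": True, "hsts": True},
    {"category": "News", "tls": "TLS 1.3", "beast": False, "hsts": False},
    {"category": "Adult", "tls": "TLS 1.3", "beast": False, "hsts": True},
    {"category": "Adult", "tls": "TLS 1.3", "beast": False, "hsts": True},
]


def summarize(records):
    """Count the three indicators per category."""
    summary = defaultdict(
        lambda: {"domains": 0, "tls12": 0, "tls13": 0, "beast": 0, "no_hsts": 0}
    )
    for record in records:
        counts = summary[record["category"]]
        counts["domains"] += 1
        if record["tls"] == "TLS 1.2":
            counts["tls12"] += 1
        elif record["tls"] == "TLS 1.3":
            counts["tls13"] += 1
        if record["beast"]:
            counts["beast"] += 1
        if not record["hsts"]:
            counts["no_hsts"] += 1
    return dict(summary)


if __name__ == "__main__":
    for category, counts in summarize(SAMPLE).items():
        print(category, counts)
```

Keeping the indicators as separate counts, rather than folding them into one score, avoids implying a weighting between them that the text does not define.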

