Threat modelling of historical attacks with CySeMoL


DEGREE PROJECT, IN COMPUTER SCIENCE, SECOND LEVEL

STOCKHOLM, SWEDEN 2015

Threat modelling of historical attacks with CySeMoL

CARL SVENSSON

KTH ROYAL INSTITUTE OF TECHNOLOGY

Threat modelling of historical attacks with CySeMoL

Hotmodellering av historiska attacker med CySeMoL

CARL SVENSSON

Master’s Thesis at CSC

Supervisor: Sonja Buchegger

Abstract

This report investigates the modelling power of the Cyber Security Modelling Language, CySeMoL, by examining three documented cyber attacks and attempting to model the systems in which they occurred. In doing so, strengths and weaknesses of the model are investigated and proposals for improvements to CySeMoL are explored.

Referat

Hotmodellering av historiska attacker med CySeMoL

Acknowledgements

I would like to thank my supervisor at KTH, Sonja Buchegger, for her invaluable input and support throughout the project.

I would also like to thank my supervisor at Foreseeti, Mathias Ekstedt, who provided great discussions about the work and pointed me in the right direction throughout the course of the project.

  1.2 This report
2 Background
  2.1 Threat modelling
  2.2 Bayesian networks
  2.3 CySeMoL
3 Method
4 Case studies
  4.1 Stuxnet
    4.1.1 Background
    4.1.2 Modelling
    4.1.3 Analysis
  4.2 Diginotar
    4.2.1 Background
    4.2.2 Modelling
    4.2.3 Analysis
  4.3 Logica
    4.3.1 Background
    4.3.2 Modelling
    4.3.3 Analysis
  4.4 Summary of analysis
5 Conclusion
Bibliography
Appendices
  B The Diginotar model
  C Other findings

Chapter 1

Introduction

Over time, IT systems have grown larger. This has led to an increase in both complexity and the difficulty of maintaining full knowledge about the system[1]. Furthermore, the attack surface and the number of vulnerabilities in a system grow with its size. This presents a problem for administrators and security officers, who often work under a constrained budget and need to prioritize where to investigate or improve the system. To make these kinds of decisions effectively, it is desirable to have relevant information to base them on. Ideally, one would have a full understanding of the entire system, including both hardware and software components and their interactions. Unfortunately, due to the sheer size and complexity of modern systems, this is usually infeasible.

Different tools have been proposed to aid decision makers with these kinds of problems. In addition to traditional methods such as penetration testing and code review, one proposed class of tools consists of various kinds of models, where the analyst tries to create a representation of the system to aid in decision making.

One such tool is CySeMoL (Cyber Security Modelling Language), which uses Bayesian networks to calculate security risks in a model of the system. CySeMoL was created at KTH[2] and is being further developed by Foreseeti, a startup company at KTH, into a fully integrated threat modelling tool. By using CySeMoL to model known previous attacks, it is possible both to validate the model and to find areas that can be improved.

1.1 Goal and scope


1.2 This report

This report is divided into five parts. This introduction aims to frame the discussion and give some context to the problem. It is followed by a background chapter where threat modelling is described and some alternatives are discussed. We also introduce CySeMoL and describe how it works. With this in place we move on to the actual methods and experiments, where several attacks are studied, modelled and analysed. Finally, we finish with some conclusions about these attacks, about CySeMoL in particular and about threat modelling in general.


Chapter 2

Background

When designing or maintaining any system of non-trivial size there are many qualities that can be assessed. Security is one of them and is the focus of this study. The analysis has been performed using CySeMoL, a threat modelling tool used to create and evaluate system-centric threat models using Bayesian networks. This chapter aims to provide some context on threat modelling and overall background on CySeMoL.

2.1 Threat modelling

Threat modelling is a process whereby a model is created which represents a subset of the possible attacks that can be performed against a system. Such a model is useful when reasoning about the system, when determining where security efforts should be focused, and when deciding which mechanisms and policies can be effective in different areas of the system. It is of interest before deploying a system, as a design tool to investigate different scenarios and variants of the system without having to actually implement them. It can also be used to assess the security properties of an existing system in order to understand where improvement is needed.

A threat model can be built in several ways, for example by starting from different perspectives. It is possible to take an attacker-centric view and try to answer the question: "What is this particular attacker capable of doing?". Starting from this question and looking at which attacks are applicable to the analysed system, it is possible to create a threat model.

A second view, the one taken in CySeMoL, is the system-centric approach, where the modeller instead starts with the actual system[2]. Here questions such as "What software and hardware is present?" and "What does the network look like?" are the basis of the model. By looking at the system it is possible to determine what attacks and attack steps are possible and how they affect each other.


Figure 2.1. The result of a CORAS model[6]

the security of individual software and not larger networks with multiple servers, network zones and users.

Another modelling framework is Secure Tropos, which extends Tropos, a methodology for software engineering, to include security considerations[5]. Tropos is based on considering large IT systems as a group of smaller individual agents with specific goals for each agent. This means that Secure Tropos is a methodology for developing secure software and is not intended to be used for analysing existing software or the interactions between them.

A higher-level framework for threat modelling is CORAS. Like CySeMoL, CORAS uses a visual tool to model systems. However, CORAS is more similar to traditional risk assessment methods in that it focuses on general classes of problems and how these can lead to valuable assets being compromised[6][7]. It combines estimated probabilities and consequences for different scenarios with their relations to each other. An example of a part of a CORAS model can be seen in Figure 2.1. Here we see that an actor "Employee" has certain attributes associated with it, e.g. "Insufficient training", and how those relate to some risks, e.g. "Sloppy handling of records". These risks are then associated with a consequence, e.g. "Compromises confidentiality of health records", which in turn affects concrete business aspects, e.g. "Patient’s health".

2.2 Bayesian networks

CySeMoL uses Bayesian networks to create its statistical model. A Bayesian network is a model which represents a set of random variables and their conditional dependencies. It can be visualized as a directed acyclic graph (DAG), where an edge from a node A to a node B indicates that B has a probability distribution that is conditioned on A. For example, Figure 2.2 shows a simple Bayesian network with three boolean variables and their conditional dependencies. By inspecting the tables one can see that if the sprinklers are on and it is not raining, there is a 90% probability that the grass is wet, in other words: P(Grass wet = True | Rain = False ∧ Sprinklers = True) = 0.9. It is also possible to make backward inferences from a Bayesian network[8], i.e. if we know that the grass is indeed wet, we can use conditional probability to calculate the probability that it is raining and the probability that the sprinklers are on.

Figure 2.2. A simple Bayesian network (example from Wikipedia)
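The backward inference mentioned for the sprinkler example can be illustrated by brute-force enumeration of the joint distribution. This is a minimal sketch: only the 0.9 entry is stated in the text, and all other probabilities below are assumed values in the style of the commonly cited sprinkler example, chosen purely for illustration.

```python
from itertools import product

# Conditional probability tables. Only P(wet | sprinkler on, no rain) = 0.9
# is taken from the text; every other number is an assumed, illustrative value.
P_RAIN = 0.2
P_SPRINKLER_GIVEN_RAIN = {True: 0.01, False: 0.4}
P_WET_GIVEN = {  # keyed by (sprinkler, rain)
    (False, False): 0.0,
    (False, True): 0.8,
    (True, False): 0.9,
    (True, True): 0.99,
}


def bernoulli(p_true, value):
    """Probability that a boolean variable with P(True) = p_true takes `value`."""
    return p_true if value else 1.0 - p_true


def joint(rain, sprinkler, wet):
    """Joint probability, factorised along the edges of the network."""
    return (
        bernoulli(P_RAIN, rain)
        * bernoulli(P_SPRINKLER_GIVEN_RAIN[rain], sprinkler)
        * bernoulli(P_WET_GIVEN[(sprinkler, rain)], wet)
    )


def posterior_rain_given_wet():
    """P(Rain = True | Grass wet = True), obtained by summing out the sprinkler."""
    wet_and_rain = sum(joint(True, s, True) for s in (True, False))
    wet = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
    return wet_and_rain / wet


print(f"P(rain | grass wet) = {posterior_rain_given_wet():.4f}")
```

Enumeration is only practical for very small networks, since the number of joint states doubles with every boolean variable; for larger networks sampling or more structured inference is needed.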

Bayesian networks provide at least two major advantages over having a single joint distribution over all the variables. First, they save memory, especially if the graph is sparse. Second, it is more intuitive to understand the relations between the variables from the graph than from a single large distribution. They are used in many different fields, such as computational biology, image processing and risk analysis. Bayesian networks can also be used for threat modelling[9] and play a central role in CySeMoL[10], where the variables are either attack steps that the attacker must perform or defence mechanisms that reduce the attacker’s probability of succeeding.

2.3 CySeMoL

CySeMoL is used to create models of a system. The system is represented as a graph in which nodes represent parts of the system and edges represent how they are connected. The edges can be of different types, creating different conditional relations between nodes. Even the same pair of node types can be connected by different edge types, depending on the kind of relation the nodes have.


Figure 2.3. A part of a CySeMoL model

different concrete models can be instantiated. They describe what kind of objects should be contained in the model and how they can be related to each other. They are both based on Bayesian networks but differ on what classes of objects they contain and how different real world objects are mapped to objects in the model.

The CySeMoL graph is not a Bayesian network in itself but is instead used to generate one when the actual computation is performed. Every node in the CySeMoL graph has several attributes belonging to one of two categories: "attack steps" and "defences". Every such attribute is a node in the Bayesian network and their conditional dependencies are based on the CySeMoL model. For example, the CySeMoL subgraph shown in Figure 2.3 has three nodes and two edges. When the calculations are to be performed, this is transformed into a Bayesian network with 27 nodes and even more edges. The relations in the network are based both on previous research and on assessments from domain experts elicited with Cooke’s method[11].

By grouping attack steps together and focusing on more concrete parts of the system, CySeMoL helps abstract away a lot of the details of the attack graph by allowing the user to focus on the larger components of the system instead of details of the exact method of attack. For example, there are a lot of different ways in which a server can be compromised which leads to the same outcome but the user only needs to define the server in terms of its operating system, other software and relations to the world around it.

As stated above, the actual calculations are performed on the Bayesian network inferred from the CySeMoL model. CySeMoL is built on P2AMF, which is based on OCL. OCL is a declarative language which performs its calculations with a recursive dynamic programming algorithm. When the state of one node is to be calculated, the algorithm recurses on all nodes on which it depends and calculates those, making sure to save results and only calculate each node once.
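The idea of recursing on dependencies while computing each node only once can be sketched with memoisation. The node names, the dependency graph, the effort values and the combination rule (local effort plus the slowest dependency) are all hypothetical and chosen only to make the caching visible; they are not taken from OCL, P2AMF or CySeMoL.

```python
from functools import lru_cache

# Hypothetical dependency graph: each node lists the nodes it depends on.
DEPENDS_ON = {
    "compromise": ("deployExploit", "access"),
    "deployExploit": ("findExploit", "access"),
    "findExploit": (),
    "access": (),
}

# Hypothetical local effort needed for each node once its dependencies are done.
LOCAL_EFFORT = {
    "compromise": 1.0,
    "deployExploit": 2.0,
    "findExploit": 3.0,
    "access": 0.5,
}

evaluations = []  # records every node that is actually computed


@lru_cache(maxsize=None)
def evaluate(node):
    """Effort to reach `node`: its own effort plus that of its slowest dependency."""
    evaluations.append(node)
    slowest = max((evaluate(dep) for dep in DEPENDS_ON[node]), default=0.0)
    return LOCAL_EFFORT[node] + slowest


print(evaluate("compromise"))
print(evaluations)
```

Although "access" is a dependency of two different nodes, it appears only once in `evaluations`, because the cached result is reused on the second request.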

P2AMF, on the other hand, uses OCL to create a forward algorithm much like Dijkstra’s shortest path algorithm, with the additional constraint of logical AND-nodes: some nodes can only be traversed if two or more conditions are fulfilled.
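A forward, Dijkstra-like traversal with AND-nodes can be sketched as follows. This is not the P2AMF implementation, which the text does not describe in detail. It is an illustrative variant under stated assumptions: edge costs are non-negative, an ordinary node is reached through its cheapest parent, and an AND-node is reached only after all of its parents, at the cost of the slowest one. The function name and the example graph are hypothetical.

```python
import heapq


def earliest_reach(edges, and_nodes, source):
    """Return the cheapest cost at which each reachable node is reached.

    edges: dict mapping a node to a list of (successor, step_cost) pairs.
    and_nodes: set of nodes that require all of their parents to be reached.
    """
    # Count how many incoming edges each AND-node is still waiting for.
    waiting = {}
    for successors in edges.values():
        for succ, _ in successors:
            if succ in and_nodes:
                waiting[succ] = waiting.get(succ, 0) + 1

    slowest_arrival = {}  # AND-node -> largest arrival cost seen so far
    settled = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node in settled:
            continue  # already reached at a cost that is no higher
        settled[node] = cost
        for succ, step_cost in edges.get(node, []):
            arrival = cost + step_cost
            if succ in and_nodes:
                slowest_arrival[succ] = max(slowest_arrival.get(succ, 0.0), arrival)
                waiting[succ] -= 1
                if waiting[succ] == 0:  # every precondition is now fulfilled
                    heapq.heappush(queue, (slowest_arrival[succ], succ))
            else:
                heapq.heappush(queue, (arrival, succ))
    return settled


# Hypothetical attack graph: deploying an exploit needs both network access
# and a previously found exploit.
EXAMPLE_EDGES = {
    "start": [("access", 1.0), ("findExploit", 4.0)],
    "access": [("deployExploit", 1.0)],
    "findExploit": [("deployExploit", 0.5)],
    "deployExploit": [("compromise", 2.0)],
}

print(earliest_reach(EXAMPLE_EDGES, {"deployExploit"}, "start"))
```

If `deployExploit` is treated as an ordinary node instead, it is reached through the cheaper `access` branch alone, which shows how the AND constraint changes the result.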

CySeMoL is an instantiation of the P2AMF framework with actual relations and probabilities defined. The possibility to "inject" evidence into the model, and thus sidestep any calculations for a particular node, forces CySeMoL to perform something slightly more complicated than a simple forward traversal of the graph. CySeMoL uses Monte Carlo sampling with either the acceptance-rejection algorithm or the Metropolis-Hastings algorithm[1] to instantiate valid states of the nodes.

The acceptance-rejection algorithm generates a large number of uniformly distributed samples over the whole sample space and then removes (rejects) those that do not fall within the target distribution[12]. For example, to sample points inside the unit circle, we uniformly sample pairs (x, y) from the enclosing square and reject any for which x^2 + y^2 > 1. The resulting samples are uniformly distributed over the unit circle. The Metropolis-Hastings algorithm is an improvement of the acceptance-rejection algorithm to decrease the number of rejected samples.
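The unit circle example can be written out directly. The sketch below draws candidate points uniformly from the square [-1, 1] × [-1, 1] and keeps only those inside the circle; the function name and the use of a seeded generator are choices made here for illustration.

```python
import random


def sample_unit_disc(n, rng):
    """Draw n points uniformly from the unit disc by acceptance-rejection.

    Returns the accepted points and the number of proposals that were needed.
    """
    accepted = []
    proposals = 0
    while len(accepted) < n:
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        proposals += 1
        if x * x + y * y <= 1.0:  # keep the point only if it lies in the disc
            accepted.append((x, y))
    return accepted, proposals


points, proposals = sample_unit_disc(1000, random.Random(0))
print(f"accepted {len(points)} of {proposals} proposals")
```

The expected acceptance rate is the ratio of the two areas, π/4 (roughly 79%), so about one proposal in five is wasted. When the accepted region is a much smaller fraction of the sample space, most proposals are rejected and plain acceptance-rejection becomes expensive.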


Chapter 3

Method

The study has been carried out by performing three case studies of documented attacks on IT systems. Each case study can be divided into three phases: research, modelling and analysis.

In the research phase, information about the attack has been gathered and studied. Due to the sensitive nature of these attacks it has proven difficult to find technical details about them. In several cases some details have been replaced by qualified assumptions about what probably occurred. As such, the studied models may not accurately represent the actual attack and should be viewed as models of an attack that could have occurred in a system similar to the one studied.

In the modelling phase two kinds of models are constructed. First a "free-hand" model is created, where the attack is represented with a graph describing both the attacked system and the performed attack. In practice this model was created with pen and paper to minimize any constraints that could exist in other kinds of tools. From this model, the documented attack path is expressed in terms of CySeMoL attack steps to make the comparison with the results from CySeMoL easier. After the free-hand model is created, it is translated into a CySeMoL model. In this step, all hardware and software are implemented in the CySeMoL model. All components that have been implicitly assumed to exist, such as dataflows and users, have to be explicitly defined.

When both models are created it is possible to do the actual analysis by studying the CySeMoL model and comparing it to the "free-hand" model. Here, questions such as "Does the CySeMoL model accurately describe the attack?", "Do the most likely attack paths in CySeMoL correspond with what happened?" and "Could more details be added to the model?" are studied. This is a qualitative analysis for which no metric is defined. One challenge with the analysis is to assess the risks that CySeMoL provides. It is known for a fact that the attack happened. However, it is difficult to relate this posterior knowledge to the probability of compromise that CySeMoL calculates.

CySeMoL model. Adding more levels of depth and detail to the Bayesian network is only productive if some knowledge of the conditional probabilities between the new nodes is available. On the other hand, this can also expose the fact that some parts of the model hide internal dependencies for which the relations are unknown and which might be of interest for further research. The other two factors are how much these additions affect the computational complexity of the model and the "modelling" complexity, i.e. the extra burden put on the user to provide knowledge on the existence of added parts.

The results from each case study are summarized in their respective section. Furthermore, general observations and broader ideas are summarized at the end of the case studies section.


Chapter 4

Case studies

Three different attacks from the last few years have been studied. The attacks were chosen based both on their relevance and on the availability of information about them, as this kind of information is typically difficult to acquire. First, the Stuxnet attack, which struck and disrupted operations in Iran’s nuclear facilities[13][14], has been studied. Second, the attack on Diginotar[15], which ultimately led to the bankruptcy of the company, was studied. Finally, an attack on Logica, a Swedish service provider which was hacked and from which sensitive information was stolen[16][17], was studied.

4.1 Stuxnet

4.1.1 Background

Stuxnet is a computer worm that was discovered in 2010. At the time it was considered one of the most sophisticated pieces of malware ever created. Samples of the worm have been thoroughly analysed by researchers[13]. Stuxnet’s goal was to infect programmable logic controllers (PLCs) in industrial systems. Specifically, it is believed that the targets were Siemens SCADA systems in the nuclear facilities of Iran.

While it is not known exactly what occurred in that specific facility and how the worm propagated, there are several models of the attack based on reference systems and best practice specifications[14]. Based on the findings of this study by Byres et al., a network which could be similar to the facility, and which is representative of SCADA networks in general, has been modelled.


Figure 4.1. The Siemens best practice reference network[18]

When modelling an attack such as Stuxnet there is one major difference from a traditional attack, where an attacker manually goes through and tries different attack steps. Stuxnet spreads and replicates between computers, which results in new instances of Stuxnet that operate independently from the "parent" instance. Consequently, compared to a human attacker, the capabilities of Stuxnet grow exponentially as it spreads through the network.

In the aforementioned study by Byres et al. it is proposed that Stuxnet reached the PLC by the attack path described below and shown in Figure 4.2. Following an initial handoff via a physical drive, the malware spread through the Enterprise Control Network via SMB shares until it found a computer with the right capabilities, namely VPN access to the Perimeter Network. From there it piggybacked on the connection to the Central Archive Server, CAS, and exploited it to gain a foothold on the Perimeter Network. Basically the same procedure was repeated to gain access to the Process Control Network, where it eventually infected PCS7 project files which were uploaded to the PLCs, which were thus compromised. This can be split into a few distinct attack parts described below:

Figure 4.2. The Stuxnet attack as described by Byres et al.[14]

1. An infected USB drive is given to an off-site contractor, for example by planting it at their office or handing it out at a conference.

2. The infected drive is inserted into a workstation in the Enterprise Control Network, allowing Stuxnet to infect it.

4. Stuxnet piggybacks on the SQL database connection established by the privileged user to the server on the Perimeter Network.

5. Stuxnet spreads within the Perimeter Network and infects several servers.

6. Stuxnet again piggybacks on the connection to the historian server on the Process Control Network.

7. There it infects PCS7 project files which are ultimately downloaded to an engineering workstation.

8. Stuxnet installs itself on the PLC and performs two tasks: it causes harmful operation of the machinery and tricks the monitoring systems into reporting that everything is running as normal.

4.1.2 Modelling

With the description of the attack, it is possible to create a sequence of CySeMoL attack steps which can later be compared to the actual output of CySeMoL. In those terms, the attack can be described as listed below. In Figure 4.3 it can be seen how the eight parts of the attack have been roughly mapped to the CySeMoL objects they involve.

1. SocialZone.sharePortableMedia

2. OperatingSystem.accessThroughPortableMedia, OperatingSystem.deployExploit, OperatingSystem.compromise

3. NetworkZone.access, OperatingSystem.deployExploit, OperatingSystem.compromise

4. ApplicationClient.compromise, DataFlow.produceRequest, ApplicationServer.access, ApplicationServer.deployExploit, ApplicationServer.compromise

5. NetworkZone.access, OperatingSystem.deployExploit, OperatingSystem.compromise

6. ApplicationClient.compromise, DataFlow.produceRequest, ApplicationServer.access, ApplicationServer.deployExploit, ApplicationServer.compromise

7. DataFlow.produceResponse, ApplicationClient.deployExploit, ApplicationClient.compromise, OperatingSystem.compromise

8. ApplicationClient.compromise, DataFlow.produceRequest, ApplicationServer.access, ApplicationServer.deployExploit, ApplicationServer.compromise

It should be noted that this is not a one-to-one mapping and that several translations are possible. It is possible to describe the attack with more or less detail, but this translation was chosen as a reasonably detailed description of the attack in CySeMoL terms. Finally, the last part about what Stuxnet did with the PLC once it was compromised, i.e. causing harmful behaviour and disabling monitoring, is not reflected in the CySeMoL model at all.

Figure 4.4. The network zones, interfaces and firewalls of the Stuxnet model

Based on the descriptions of the network topology and data flows, a CySeMoL model was created. Even though the network is quite small and the details have been kept to a minimum the resulting model consists of around 80 nodes. Images of the full model can be found in Appendix A. Overall, the network has been assumed to employ good security measures with strict firewall rules and regularly updated software.

A part of the model is shown in Figure 4.4. This sub view of the CySeMoL model shows the overall network topology of the system, excluding any computers.

4.1.3 Analysis

Finally, an attack path to one of the PLCs was calculated from the model. CySeMoL is quite detailed and as a result the attack path from the attacker to the PLC contains many steps. Furthermore, CySeMoL does not generate a single attack path but the whole attack graph, so several choices of attack path are possible. The one which most closely matches the hypothesized path has been picked. The full attack path can be found below. Steps marked in bold correspond to the steps in the proposed path. It should be noted that almost all of those steps are included in the path; thus CySeMoL agrees that this was a possible and probable attack path.

1. Attacker.start, Contractor Office.sharePortableMedia

2. ECN Workstation 2.accessThroughPortableMedia, ECN Workstation 2.executeArbitraryCode, ECN Workstation 2.compromise

3. Enterprise Control Network.access, ECN Workstation.findUnknownService, ECN Workstation.findExploit, ECN Workstation.deployExploit, ECN Workstation.executeArbitraryCode, ECN Workstation.compromise

4. Historian Web Client.compromise, CAS ECN-PN.produceRequest, CAS Server.access, CAS Server.findExploit, CAS Server.deployExploit, Historian Server OS.executeArbitraryCode, Historian Server OS.compromise

5. Skipped in the model

6. OS Web Client - PN.compromise, PCS7 PN-PCN.produceRequest, OS Web Server - PCN.access, OS Web Server - PCN.compromise

7. PCS7 PCN Server-Engineer.produceResponse, OS Web Client Engineer.findExploit, OS Web Client Engineer.deployExploit, Engineering Workstation.executeArbitraryCode, Engineering Workstation.compromise

8. Siemens PLC Studio.compromise, Siemens PLC Transfer.produceRequest, Siemens PLCStudio Server.access, Siemens PLCStudio Server.findExploit, Siemens PLCStudio Server.deployExploit, S7-400H.executeArbitraryCode, S7-400H.compromise

The attack path is also depicted in the CySeMoL graph shown in Figures 4.5, 4.6 and 4.7. The red arrows indicate all properties that influence the value of a node in the attack path while the overlaid blue path shows the attack path. The images are cluttered and can be hard to decipher, especially in the operating system nodes. This is made easier by cross-referencing the attack step list above.
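The cross-referencing between the hypothesised attack steps and the path calculated by CySeMoL can also be sketched in code. The comparison in this study was done by hand; the helper names below, and the choice to match on the attack-step name only (the text after the last dot) while ignoring the object name, are assumptions of this sketch. The data covers parts 2 and 3 of the two lists.

```python
def attack_step(qualified_name):
    """Return the attack-step part of 'Object.attackStep' (text after the last dot)."""
    return qualified_name.rsplit(".", 1)[1]


def matched_in_order(hypothesised, computed):
    """Greedily match hypothesised steps, in order, against the computed path."""
    matched = []
    position = 0
    for step in hypothesised:
        name = attack_step(step)
        for index in range(position, len(computed)):
            if attack_step(computed[index]) == name:
                matched.append(step)
                position = index + 1  # later steps must appear after this one
                break
    return matched


# Parts 2 and 3 of the hypothesised path, in generic CySeMoL terms.
HYPOTHESISED = {
    2: ["OperatingSystem.accessThroughPortableMedia",
        "OperatingSystem.deployExploit",
        "OperatingSystem.compromise"],
    3: ["NetworkZone.access",
        "OperatingSystem.deployExploit",
        "OperatingSystem.compromise"],
}

# The same parts of the path calculated by CySeMoL for the concrete model.
COMPUTED = {
    2: ["ECN Workstation 2.accessThroughPortableMedia",
        "ECN Workstation 2.executeArbitraryCode",
        "ECN Workstation 2.compromise"],
    3: ["Enterprise Control Network.access",
        "ECN Workstation.findUnknownService",
        "ECN Workstation.findExploit",
        "ECN Workstation.deployExploit",
        "ECN Workstation.executeArbitraryCode",
        "ECN Workstation.compromise"],
}

for part in sorted(HYPOTHESISED):
    found = matched_in_order(HYPOTHESISED[part], COMPUTED[part])
    print(f"part {part}: {len(found)} of {len(HYPOTHESISED[part])} hypothesised steps matched")
```

Matching on step names alone is deliberately coarse: it cannot tell whether a step was performed on the intended object, so the result is an aid for reading the figures rather than a metric.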


Figure 4.5. The Stuxnet attack calculated by CySeMoL, pt.1

From these results it can be seen that CySeMoL agrees that the attack could have occurred as described by Byres et al., given that the network looked like this. However, CySeMoL attributes some positive probability to every connection in the model and it is thus hard to draw any conclusions about the actual probabilities in this case.

There are two aspects of the attack that cannot be properly modelled by CySeMoL. First of all, CySeMoL does not have a concept of privileges. In the real attack, Stuxnet spread between hosts in the network through SMB shares. This in itself only required regular user privileges and not root access to the machine. In the CySeMoL model, however, it is modelled as a full compromise of the host. The other aspect is domain-specific attacks such as destroying the PLC. Currently the attack only goes as far as considering the PLC compromised and not what that results in.

Currently, one way to represent access levels in CySeMoL is to have multiple copies of the same application and connect different AccessControlPoints to them. This way, one physical application is represented by several virtual applications, each representing the environment the user sees. An example of this is shown in Figure 4.9. This can be extended to the operating system level by creating two copies of the same computer with slight variations depending on what capabilities the user has. There are at least two problems with this approach. First of all, it duplicates a lot of work and makes the model larger. Secondly, it might not properly reflect the real conditional probabilities between the objects involved. For example, in the case of two user environments CySeMoL would treat this as two separate computers connected to the same network. This is a problem since the probability of compromising an admin account given that you have compromised a regular user is not the same as the probability of compromising a computer given that you have compromised another computer in the network.


anything to the model. If done this way it would be enough to have only one instance of each software and computer but the connection between the "PasswordAccount" and "AccessControlPoint" could be chosen to be either "User" or "Admin" instead of the current "Credentials". This would only introduce a slight additional modelling burden but could potentially improve results. An example of what this could look like is shown in Figure 4.10.


Figure 4.8. How SMB could be modelled in the Stuxnet network


Figure 4.10. How ACL could be modelled in CySeMoL


4.2 Diginotar

4.2.1 Background

Diginotar was a Dutch certificate authority. In the summer of 2011 it fell victim to an attack, which led to the compromise of the keys of several certificate authorities (CAs). With these keys, the attackers were able to forge certificates for a number of host names including "*.google.com" and "*.*.com", i.e. all sites with a .com top-level domain. An investigation[19][15] by the Dutch security company Fox-IT showed that it could not be ruled out that all of Diginotar’s CA certificates had been compromised. This eventually led to the Dutch government taking over operations of Diginotar’s systems, and the company was declared bankrupt. This is a prime example of what the consequences of an attack can be.

4.2.2 Modelling

The report from Diginotar[15] contains a lot of detail about how the attack happened. An overview of the network zones of Diginotar and some of the central systems in them can be seen in Figure 4.11. Unfortunately, due to limitations in the investigations, it was not possible to perform a forensic analysis of one of the computers involved in the attack. It is therefore unknown how that computer was compromised, and thus one step of the attack is missing. The attack was modelled up to the known part. Furthermore, part of how the rest of the attack was performed was also modelled. It is difficult to say exactly in which order each step of the attack occurred, as there was a lot of lateral movement in the attack and many systems in the same network were compromised. The investigations did, however, reveal a likely attack path. The attack can roughly be described as follows:

1. The web servers Main-web and Docproof2 were compromised through an outdated version of DotNetNuke, a content management system, with known vulnerabilities.

2. The attacker used a connection from Main-web to the database server BAPI-db, which was allowed through the firewall, to compromise it.

3. The attacker escalated privileges to compromise the whole database server.

What happened next is unclear, but somehow the attacker managed to compromise the BAPI-production server in Secure-net and connected back to Main-web in DMZ-ext-net to use it as a stepping stone for further attacks in Secure-net. In particular, this means that according to the information in the report there is no explanation for how a connection could have been initiated from outside Secure-net into it. Three different hypotheses are proposed:

Figure 4.11. An overview of the central network zones and important systems of Diginotar.[15]

2. BAPI-production is reachable from the DMZ-ext-net network in violation of the descriptions of the firewall policies.

3. BAPI-production was compromised through physically transmitted malware, by infecting a storage drive in Office-net which was later brought into Secure-net.

The important thing is that all three of these scenarios are easily represented by CySeMoL as seen in the previous example, the Stuxnet model. Overall, it is probable that if more details were known about the attack, it would be possible to fully model the attack.

The network described in the report is considerably larger than in the Stuxnet case, however most details were not thoroughly investigated and thus left out of the report. Consequently, most of the network has been left out in the model. Furthermore, only the connections which are explicitly part of the suspected main attack are modelled. Even with a lot left out and some network zones just labelled "other nets", this resulted in a CySeMoL model of about 70 nodes. Images of the full model can be found in Appendix B.

4.2.3 Analysis

There are at least two important points which can be gathered from the model. The first is that the first half of the attack is satisfactorily described by the model. The model shows that by compromising the web server of Main-web it is possible to compromise the whole system and use it as a stepping stone for the attack. Also by allowing an SQL connection from DMZ-ext-net into Office-net it is possible to compromise the database server and thus the whole network.

The other, and maybe more interesting, point is that CySeMoL claims that there is a risk that the HTTP connection from within Secure-net out to DMZ-ext-net can cause a compromise inside Secure-net. This is a lot like the latter part of the Stuxnet attack, where the engineering workstation is infected by a compromised PCS7 server. In that case the model describes the scenario correctly. In this case, however, we want to illustrate the fact that technically the firewall allows this HTTP connection but in general no such connection is made. This is a potential problem. There are two real-world scenarios that both most naturally translate to the same CySeMoL model, i.e. that there is a compromised server with a possible dataflow to a client. In one scenario the emphasis is on the fact that this dataflow actually exists and can be used to compromise the client. In the other case, however, the emphasis is on the fact that the dataflow is allowed and could be used to compromise the server if it wasn’t already.

represent the fact that in real-world systems full information about our systems may not be available.


Figure 4.12. How virtual environments currently can be modelled with CySeMoL

4.3 Logica

4.3.1 Background

In 2012, an attack on the Swedish company Logica (now CGI Group) was discovered. The attack was suspected to have been going on for as long as two years. Logica is a service provider for several customers, including the Swedish Tax Agency, which is presumed to have been the main target of the attack. Eventually two people were arrested and convicted for the attack. As part of the trial, investigation reports from Logica and other affected parties were used as evidence[17][20]. Based on this material it is possible to understand, at least partially, how the attack was performed. Unfortunately, many of the details are redacted from the material[17].

4.3.2 Modelling

Even though the details are limited, it is known that the attack involved a large IBM mainframe computer which is a central part of Logica’s IT system. This poses a problem for CySeMoL since the mainframe is divided into multiple logical partitions which serve as a virtualization environment for the system. Since the mainframe plays a central role in the Logica IT system, and the attack itself was centred around it, it is of little interest to try to model the rest of the system. Instead the model will consider how virtualization could be modelled.


Figure 4.13. How real virtual environments could be implemented in CySeMoL

4.3.3 Analysis

There are at least two major problems with the model above. The first is that the model should be able to distinguish the host system from the guest systems, as compromising them has very different implications. A compromise of one virtual host does not lead to the same capabilities as compromising the host system, in which case all guest systems are automatically compromised. Secondly, the probabilities involved in these relations are not the same as for a physical system. It is unreasonable to assume that the conditional probability of compromising a physical system, given that another system in the network has been compromised, is the same as the conditional probability of compromising a virtual host, given that another virtual host on the same host system has been compromised. It is however possible to model the two scenarios in conceptually similar ways.

In addition to the regular "NetworkZone" node, there would be a "VirtualEnvironment" node with two types of connection to "OperatingSystem" nodes, representing host OS connections and guest OS connections, instead of just the single "Zone" connection. An example of this model is shown in Figure 4.13. The conditional probabilities involved would have to be explored further to be able to implement these additions to the model.
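The proposed structure can be sketched in code. The following is a minimal illustration and not CySeMoL's implementation: the class names mirror the node names proposed above, while the method `compromised_systems`, the propagation rule and the system names (`mainframe-host`, `lpar-a`, `lpar-b`) are assumptions introduced here. It encodes only the qualitative point from the analysis, namely that compromising the host compromises every guest, whereas compromising one guest does not by itself compromise the others. The conditional probabilities are deliberately left out since they remain to be explored.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class OperatingSystem:
    name: str


@dataclass
class VirtualEnvironment:
    """Hypothetical node with two connection types to OperatingSystem nodes."""
    host: OperatingSystem                                         # "host OS" connection
    guests: List[OperatingSystem] = field(default_factory=list)   # "guest OS" connections

    def compromised_systems(self, entry: OperatingSystem) -> Set[str]:
        """Return the systems that are certainly compromised once `entry` is."""
        if entry == self.host:
            # Assumed rule: control of the host implies control of every guest.
            return {self.host.name} | {g.name for g in self.guests}
        if entry in self.guests:
            # Assumed rule: a guest compromise alone does not spread with certainty.
            return {entry.name}
        raise ValueError(f"{entry.name} is not part of this environment")


# Illustrative names only; they are not taken from the Logica material.
mainframe = OperatingSystem("mainframe-host")
lpar_a = OperatingSystem("lpar-a")
lpar_b = OperatingSystem("lpar-b")
env = VirtualEnvironment(host=mainframe, guests=[lpar_a, lpar_b])

print(sorted(env.compromised_systems(mainframe)))
print(sorted(env.compromised_systems(lpar_a)))
```

A probabilistic version would replace the second rule with a conditional probability of moving from one guest to another, which is exactly the quantity that would need further study.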

4.4 Summary of analysis

Overall, it can be seen that CySeMoL is capable of modelling the first attack satisfactorily but fails to provide a fully satisfactory representation of the second and third attacks. The third attack, which involves virtualization, is especially problematic for CySeMoL. Access levels are not represented in the model, which has the effect that a reasonable attack path is still calculated but with less accuracy as to what extent the system is actually compromised. The model also does not represent domain-specific attack steps such as causing malfunction in SCADA systems. There is also the problem of to what extent a client-to-server dataflow should be considered existing: is it used regularly or is it simply technically possible? Overall, office systems with regular computers, network appliances and software are easy to model, whilst more advanced features like virtualization are impossible to model and "soft" parts like access control and people are both difficult and cumbersome.

Currently, access levels can be implemented by creating separate systems for separate accounts, as shown in Figure 4.9. Instead, it could be fruitful to give every system two access levels, "user" and "admin", which would reduce duplication and simplify modelling, as shown in Figure 4.10.
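The two-level idea can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the `AccessLevel` enum, the `Model` class, the system names and the rule that "admin" access implies "user" access are not part of CySeMoL.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict


class AccessLevel(IntEnum):
    NONE = 0
    USER = 1
    ADMIN = 2


@dataclass
class Model:
    """Tracks the highest access level an attacker holds on each system."""
    access: Dict[str, AccessLevel] = field(default_factory=dict)

    def grant(self, system: str, level: AccessLevel) -> None:
        # Never downgrade: admin access is assumed to imply user access.
        current = self.access.get(system, AccessLevel.NONE)
        self.access[system] = max(current, level)

    def has(self, system: str, level: AccessLevel) -> bool:
        return self.access.get(system, AccessLevel.NONE) >= level


# Hypothetical system names, used only to exercise the sketch.
model = Model()
model.grant("web-server", AccessLevel.USER)
model.grant("db-server", AccessLevel.ADMIN)

print(model.has("web-server", AccessLevel.ADMIN))
print(model.has("db-server", AccessLevel.USER))
```

Compared with one modelled system per account, each system appears once and carries its access level as an attribute, which is the reduction in duplication the proposal aims for.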

The uncertainty in the dataflows could be modelled by adding something similar to the "discover hidden service" attack step, which is an attack step present in the "Operating System" CySeMoL node.
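One possible reading of this suggestion is to treat "the dataflow is actually in use" as an extra step whose probability is multiplied into the attack path. The sketch below shows that reading only; the function name, the product rule as the way to combine the two steps, and all numbers are illustrative assumptions and are not taken from CySeMoL or from the case studies.

```python
def compromise_probability(p_dataflow_in_use: float, p_exploit_given_flow: float) -> float:
    """P(client compromised) = P(dataflow in use) * P(exploit succeeds | dataflow in use)."""
    for p in (p_dataflow_in_use, p_exploit_given_flow):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_dataflow_in_use * p_exploit_given_flow


# Illustrative values only.
regularly_used = compromise_probability(1.0, 0.5)   # dataflow known to exist
merely_allowed = compromise_probability(0.1, 0.5)   # dataflow only permitted by the firewall

print(regularly_used, merely_allowed)
```

With such a step, the two real-world scenarios that currently map to the same model would yield different results, while the rest of the model stays unchanged.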

Virtualization can currently be modelled by representing virtual systems as physical systems, as shown in Figure 4.12. While this might work on a conceptual level, it will introduce errors in the calculations. A better way to implement it would be analogous to the physical systems but with its own nodes and connections, as shown in Figure 4.13.


Chapter 5

Conclusion

As seen from the modelling, CySeMoL handles many aspects of threat modelling but there is still room for improvement. One of the challenges is to strike a balance in level of detail. An over-detailed model will be cumbersome and hard to work with, but an overly simplistic model will give meaningless results.

Due to the varying levels of detail in the descriptions of the attacks, it was not meaningful to follow the intended method fully. Nonetheless, several insights were gathered throughout the project.

Privileges and access control are central concepts in any system and should therefore be possible to model explicitly. A challenge is that access control and identity management are already very difficult problems. Creating an adequate model of them must be done with great care to strike the above-mentioned balance.

Domain-specific attacks are currently not present in the model. To keep the model simple it might be better not to add specific concrete concepts to the model to cover them. Instead the model should be flexible, and proper tools should be created to allow for customization to ease modelling within fields that require specific concepts. Overall, there are a number of usability issues with the CySeMoL tool. These are outside the scope of this study but have been recorded in Appendix C with suggestions on how to improve modelling and visualisation.

There is also the issue of how to interpret the results in CySeMoL. The idea of the model is to be able to calculate the probability that an attacker will succeed with a certain attack step within a specified timeframe. This does not currently work as intended and therefore the numbers should only be looked at in relative terms. Even when this is implemented, the meaning of the results will still vary between use cases. In this project, only the relative sizes of the probabilities and whether any significant probability exists at all have been looked at.

[...] a valuable tool for system administrators and decision makers in the future. This study has shown that CySeMoL manages to represent a large portion of systems with the possibility to manage even more in the future.


Bibliography

[1] Pontus Johnson et al. “An Architecture Modeling Framework for Probabilistic Prediction”. In: (2014).

[2] Teodor Sommestad, Mathias Ekstedt, and Hannes Holm. “The Cyber Security Modeling Language: A Tool for Assessing the Vulnerability of Enterprise System Architectures”. In: (2014).

[3] Introduction to Microsoft Security Development Lifecycle (SDL) Threat Modeling.

[4] Adam Shostack. “Experiences threat modeling at microsoft”. In:

[5] Haralambos Mouratidis and Paolo Giorgini. Secure Tropos: A Security-oriented Extension Of The Tropos Methodology. 2006.

[6] Haralambos Mouratidis and Paolo Giorgini. “Secure Tropos: A Security-oriented Extension Of The Tropos Methodology”. In: International Journal of Software Engineering and Knowledge Engineering 17.02 (2007), pp. 285–309. doi: 10.1142/S0218194007003240. eprint: http://www.worldscientific.com/doi/pdf/10.1142/S0218194007003240. url: http://www.worldscientific.com/doi/abs/10.1142/S0218194007003240.

[7] Fredrik Vraalsen et al. “Specifying Legal Risk Scenarios Using the CORAS Threat Modelling Language”. English. In: Trust Management. Ed. by Peter Herrmann, Valérie Issarny, and Simon Shiu. Vol. 3477. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2005, pp. 45–60. isbn: 978-3-540-26042-4. doi: 10.1007/11429760_4. url: http://dx.doi.org/10.1007/11429760_4.

[8] Kevin P. Murphy. Machine Learning: A Probabilistic Perspective. 2012.

[9] Teodor Sommestad, Mathias Ekstedt, and Pontus Johnson. “A Probabilistic Relational Model for Security Risk Analysis”. In: (2010).

[10] Hannes Holm, Matus Korman, and Mathias Ekstedt. “A Bayesian network model for likelihood estimations of acquirement of critical software vulnerabilities and exploits”. In: (2014).


[12] Michael I. Jordan. “Stat260: Bayesian Modeling and Inference”. In: (2010).

[13] Aleksandr Matrosov et al. “Stuxnet Under the Microscope”. In: (2011).

[14] Eric Byres and Andrew Ginter. “How Stuxnet Spreads - A Study of Infection Paths in Best Practice Systems”. In: (2011).

[15] Fox-IT. “Black Tulip - DigiNotar Certificate Authority breach - "Operation Black Tulip"”. In: (2011).

[16] Polismyndigheten. “Förundersökningsprotokoll - Logicafallet”. In: (2012).

[17] Polismyndigheten and Logica. “Bilaga A - Logicas utredningsrapport”. In: (2012).

[18] Siemens. “Process Control System PCS 7 Security concept PCS 7 & WinCC (Basic)”. In: (2012).

[19] Fox-IT. “Interim Report - DigiNotar Certificate Authority breach - "Operation Black Tulip"”. In: (2011).

[20] Polismyndigheten. “Bilaga B - Ovriga externa rapporter”. In: (2012).


Appendix A

The Stuxnet model

The Stuxnet model was created as a CySeMoL model with the Enterprise Architecture Analysis Tool (EAAT). The tool allows the model to be broken down into different views to make it more manageable. Below are all the views from the Stuxnet model. They are included for completeness and to give a better understanding of what a CySeMoL model looks like.


Figure A.2. View showing half of the dataflows in the Stuxnet model

Figure A.3. View showing other half of the dataflows in the Stuxnet model


Figure A.4. View showing the Control Systems Network in the Stuxnet model


Figure A.6. View showing the Manufacturing Operations Network in the Stuxnet model

Figure A.7. View showing the network topology in the Stuxnet model


Figure A.8. View showing the Perimeter Network in the Stuxnet model


Figure A.10. View showing the Process Control Network in the Stuxnet model

Figure A.11. View showing the software in the Stuxnet model


Appendix B

The Diginotar model


Figure B.2. View showing the DMZ-ext network in the Diginotar model

Figure B.3. View showing the DMZ-int network in the Diginotar model


Figure B.4. View showing the firewalls in the Diginotar model


Figure B.6. View showing the network overview in the Diginotar model

Figure B.7. View showing the office network in the Diginotar model


Figure B.8. View showing the protocols in the Diginotar model


Figure B.10. View showing the software in the Diginotar model


Appendix C

Other findings

This section contains other findings discovered throughout the project. They mostly concern the modelling tool itself and not the actual CySeMoL model. Many of the issues here are outside the scope of the work performed but still directly or indirectly related to some of the issues discovered and are thus included for completeness.

C.1 Visualization

The visualization options in CySeMoL are currently very primitive. It is difficult to get a good overview of the model and understand what the important aspects are. Below are some suggestions for how it could be improved.

• Give different amounts of space to different objects. Currently, every node is a rectangle occupying roughly the same amount of space. An operating system or network zone could be more important than a single piece of software.

• Create visual cues for how objects fit together. Currently, if you don’t know what types of node a specific node can connect to, you have to consult the manual or guess. A better way would be, for example, to add something like a jigsaw-puzzle edge to indicate connection types and, when connecting nodes, to explicitly state "Can connect to: A, B or C". It is also possible to highlight the objects which the selected object can connect to.

• Enable expandable and collapsible nodes. Views are great but it’s better to be able to instantly shift focus within the same view. For example, make it possible to encapsulate OS+software into a box and collapse it. This way, one can immediately zoom in on parts of the model, make changes and then switch back to a more overview-like perspective.


C.2 Modelling

There are also some problems with the actual modelling which make it difficult to work with the tool. The tool should help you in your work and not be something you have to struggle with to get things right. Below are some suggestions for how to improve the modelling tools.

• Make it easier to create and duplicate compound objects like combinations of OS and software. For example, it should be possible to create a template for a typical workstation in a system, consisting of an operating system, some software and a connection to some kind of authentication mechanism. This template can then be named and reused throughout the model. Ideally it should be possible to create both shallow and deep copies of the template, which enables the user to choose whether changes to the template propagate to the copies or not.

• Create "wizards" or "generators" to help create compound objects. These guides should remind the user that, for example, an OS typically has some software connected to it and is usually connected to a network zone. They could present the user with some pre-created templates to base their model on.

• Inform of missing mandatory components. The presence of some objects does not make sense without being connected to certain other objects. The tool should clearly inform the user of this, mark the objects red and provide suggestions to solve the problems. Currently nothing happens and it is possible to make calculations, but the results are probably not what one expects.

• Give hints of what types of objects are usually added in connection with others. This is almost the same as the previous point but for non-compulsory objects. It may mention that a certain kind of object typically is connected to another object.
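The distinction between shallow and deep copies of a template, raised in the first suggestion above, can be sketched with Python's standard `copy` module. The `Workstation` class, its fields and the example values are hypothetical and serve only as illustration; the tool itself may represent templates quite differently.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class Workstation:
    """Hypothetical template: an operating system plus installed software."""
    os: str
    software: List[str] = field(default_factory=list)


template = Workstation(os="Windows 7", software=["browser", "mail client"])

linked = copy.copy(template)           # shallow copy: shares the software list
independent = copy.deepcopy(template)  # deep copy: gets its own software list

# A later change to the template's software list...
template.software.append("PDF reader")

print(linked.software)       # follows the template
print(independent.software)  # unaffected by the change
```

Note that in this sketch a shallow copy only follows changes made inside shared sub-objects such as the software list; assigning a new value to `template.os` would not affect either copy.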
