Analysis and Visualization of Attacks on Organizations


DEGREE PROJECT IN INFORMATION AND COMMUNICATION TECHNOLOGY,

SECOND CYCLE, 30 CREDITS

STOCKHOLM, SWEDEN 2016

Analysis and Visualization of Attacks on Organizations

MIN GU

TRITA-ICT-EX-2016:41

KTH ROYAL INSTITUTE OF TECHNOLOGY

SCHOOL OF INFORMATION AND COMMUNICATION TECHNOLOGY


www.kth.se


Acknowledgements

First, I would like to express my sincere gratitude to my supervisor Christian W. Probst, who has always been supportive and inspiring. Each time I was stuck on an obstacle, he offered great guidance and led me to the right solution.

In addition, he was encouraging whenever I was tired or confused. I could not have finished my thesis without his supervision.

I also want to thank Johan Montelius, my supervisor at KTH, who kindly accepted my request to supervise me. He often reminded me of the time schedule and explained the thesis format I should follow. Although we were not able to meet in person because of the distance, I greatly appreciate his kindness.

Last but not least, I want to thank my family, my friends, and all my NordsecMob fellows, who were always there to support me and to cheer me up when I was bored or annoyed.


Abstract

Graphical system models enable the modelling of organisations on layers that are relevant for attacks – the physical, virtual, and social layer. Recently, these models have been used for automatically identifying possible attacks on the modelled organisation. The generated attacks consider all three layers, making the contribution of building infrastructure, computer infrastructure, and humans (insiders and outsiders) explicit. However, this contribution is only visible in the attack trees as part of the performed steps; it cannot be mapped back to the model directly since the actions usually involve several elements (attacker and targeted actor or asset). Especially for large attack trees, visualising these relations between several model components quickly results in a large quantity of interrelations, which are hard to grasp. In this work we present several approaches for visualising attributes of attacks such as likelihood of success, impact, and required time or skill level. The resulting visualisations provide a link between graphical attack models and graphical system models.


Abstrakt

Graphical system models enable the modelling of organisations on the layers that are relevant for attacks: the physical, virtual, and social layers. Recently, these models have begun to be used for automatically identifying possible attacks on the modelled organisation.

Generated attacks take all three layers into account, which makes explicit the influence of building infrastructure, computer infrastructure, and humans (inside and outside the organisation). However, this contribution is only visible in the attack trees as part of the performed steps; it cannot be mapped back to the model directly, since the actions usually involve several elements (attacker and attacked actors or assets). Especially for large attack trees, visualising these connections between several model components quickly results in a large number of interrelations, which are hard to survey. In this work we present several methods for visualising attack attributes such as likelihood of success, impact, and the time or skill level required to carry out the attack. The resulting visualisations form a link between graphical attack models and graphical system models.


List of figures

2.1 model example
2.2 attacktree
4.1 occurrenceCount
4.2 weightMeasurement
4.3 pathCount
4.4 ATVisualization
4.5 attack tree visualization example
4.6 Pareto-efficient
5.1 model2json
5.2 modelElementJson
5.3 mappingAT2Model
5.4 modelVisualizationJson
5.5 modelView


Chapter 1 Introduction

1.1 Socio-technical system threats

Organizations are becoming increasingly complicated because technical systems are intertwined with social aspects, which has led to the concept of the socio-technical system. Compared with traditional technical systems, such systems are more exposed to attacks, since an attack can be launched through weaknesses in the technical part, the social part, or both at once. Technical security has been investigated extensively, and quite a few risk assessment methods have been proposed. These methods describe processes that can be used to identify attacks and to explain an attack’s potential impact on the organization. However, the focus of these techniques is often rather technical and ignores the internal structure and functioning of the organization. Analysis of the social layer is still far from satisfactory because of the complexity of human factors.

To improve the scope of risk assessment and the level of scrutiny, security researchers have suggested socio-technical security models, which include the physical, virtual, and social layer of organisations. Socio-technical security models acknowledge the need of considering all these levels in assessing the risk faced by an organisation since an increasing number of attacks today do involve attack steps on all three levels. The recent attack on a German steel mill, for example, started with a spear phishing campaign, installing malware that gave the attackers access to the office network, and from there to the industrial control system. Eventually, the attack is said to have caused physical damage to the mill’s production system.


1.2 System component analysis

To present the attacks found in an organization, attack trees [19, 10] are often used. Since attack trees are relatively loosely defined, they can be adapted to the requirements of many different settings. Attack trees provide structure to the represented attacks by relating a node representing the goal of an attack with different alternative or required sub-goals, which an attacker may or must perform. This structure also makes attack trees an appropriate target for automated identification of attacks [22, 6, 5].

The TREsPASS project [21] applies attack trees as an intermediate representation of attacks. Attacks are generated from a socio-technical system model [7, 8] and are the basis of computing the risk faced by an organisation if one or more of the identified attacks are realised. Properties of interest of these attacks include required resources, such as time or money, likelihood of success, or impact of the attack. The analyses also identify the Pareto frontier of incomparable properties, for example, the likelihood of success of an attack, and the required budget.

When communicating the result of risk assessment, two components are of interest: the actual attacks and the contribution of components of the organisation under scrutiny to these attacks. While properties of attack trees or other attack models can be visualised in enlightening ways [20], the same does not hold for the connection between components of the organisation and the attack.

The generated attacks make the contribution of building infrastructure, computer infrastructure, and humans (insiders and outsiders) to the attack explicit. However, this contribution is only visible in the attack trees as part of the performed steps, for example, as leaf labels.

Mapping this back to the system model is in principle not complicated. However, the actions usually involve several elements (attacker and targeted actor or asset) that may be located far apart in the model. Especially for large attack trees, visualising these relations quickly results in a large quantity of interrelations, which are hard to grasp.

In this work we present several approaches for visualising attributes of attacks such as likelihood of success, impact, and required time or skill level. The resulting visualisations provide a link between graphical attack models and graphical system models. After a discussion of visualising properties of attack trees, we present our approach of using metrics to identify the importance or contribution of parts of the attack tree, and mapping it to the system model. Our approach currently only considers the contribution of model elements; it does not, for example, include information on how assets and actors are used in an attack.

Our approach is independent of the attack model or socio-technical system model used. The only requirement is that all model elements have unique identifiers that establish the link between their occurrences in the attack tree and the model, respectively. While we present our approaches in the setting of the TREsPASS model, which is similar to ExASyM [13] and Portunes [4], the general approach can be applied to any graphical system model and any attack model. For example, the metrics used for visualising model components can also be output as a text file for sorting and further analysis.

1.3 Structure of the paper

The rest of this paper is structured as follows.

Chapter 2 gives a background overview of graphical models for systems and attacks.

Chapter 3 introduces techniques for analyzing the contribution of components to attacks.

Chapter 4 then describes the implementation of the analysis techniques from Chapter 3. At the end of that chapter, a visualization of an attack tree based on its attributes is also presented.

Based on the analysis results from Chapter 4, Chapter 5 shows a mechanism for mapping all components of an attack to the model and implements the visualization of the model.

Finally, Chapter 6 concludes the paper and discusses future work.


Chapter 2 Background

2.1 Insider attack

Humans are an important component of an organization: most of a company’s value-generating processes involve employees to some degree. At the same time, this involvement increases the company’s attack risk to some extent. Threats to an organization can be classified as coming from insiders or outsiders [15]. Outsiders are attackers who exploit weaknesses in a technical system to intrude into the company and cause damage, while insiders are employees, or people who have managed to obtain access to an internal system, who abuse that access by violating the organization’s privacy and causing it loss. Organizations usually have a good chance of fending off attacks from outsiders if their systems have a sound security architecture. Insider attacks, however, which are quite common nowadays, can be unexpectedly hard to detect or prevent, since insiders may hold legitimate privileges when performing their actions [2]. In some cases, this kind of attack can have worse consequences for the organization if it is not taken seriously.

Of course, organizations can take measures to prevent insider attacks by regulating insider actions or by placing all insiders under surveillance [18]. However, neither method seems very feasible. Over-regulation may provoke resentment among employees, which could harm working efficiency and corporate culture. Strict surveillance, in turn, is still controversial in many places around the world, so it is easy to imagine how difficult it would be to implement.

A viable solution might be to analyze the human factors before a real attack happens, so that organizations can narrow down the actions that would have to occur for the attack to succeed. In this way, we might overcome a shortcoming of traditional risk assessment methods, namely that they pay little attention to the internal structure or human factors of an organization [6].


If a more accurate result is required, insider attacks should be distinguished according to the attacker’s level of insiderness. The insiderness level can be affected by parameters such as trust, knowledge, and access [15].

Because collecting data on human behavior and analyzing it is extremely complex, we restrict ourselves to one of the worst attackers an organization could meet, namely the attacker with absolute insiderness. Such attackers know everything about the model and have access to all locations in the organization, so they can launch the strongest attack on the model. Absolute insiderness is not rare in reality: for example, both the CEO and a cleaner of a company have access to large parts of the organization, so either of them could run an attack on the company for financial or other reasons. In reality, this issue is addressed by background checks; in our system model, we instead specify access control policies to avoid this problem.

2.2 Model

Before discussing the contribution of components of organizations to attacks, we need to summarize the system and attack models we consider in our work. As stated above, our approach is not limited to specific models for systems and attacks. We only require system models to provide unique identifiers for model elements, and attack models to use these identifiers in describing attack steps.

2.2.1 System Model

A model can represent both the physical and the digital infrastructure of an organization. In fact, one organization can be represented by different system models from different perspectives. In this paper, we only consider the organization as a socio-technical system, which includes the components stated below.

For a better overview, the components of the system model can be grouped into the following layers [6]:

• Physical Layer defines the physical infrastructures of an organization, which could be buildings, rooms, doors, etc., and the interconnection among these physical infrastructures. This layer also includes the physical items.


• Virtual layer represents the elements of the network domain (e.g., computers, servers) together with their router connections. Relevant assets or data are also part of this layer.

• Access control layer refers to the access control regulations of the physical and virtual layers. The policies of the model, which we describe later, belong to this layer.

• Social layer comprises the actors and the roles they play in the organization. All actors are restricted by access rights, and each of them also owns some data or assets.

With the above being stated, we would like to make it clear that whenever we refer to an example system model of an organization later on, it is an instance of the above system model.

Once the physical and digital infrastructures are clear, we need to define the elements that make up these infrastructures. These fundamental elements are the focus of this work.

We use nodes to represent the different elements in the system; elements can be actors, locations, and all kinds of assets, such as computers or data, that exist in the model.

The elements can be summarized as follows:

• Locations are where all actors and assets are placed. Locations are physically connected, which means that actors or assets in the model can be moved along the links between locations. A location can be any part of a building.

• Actors are able to move from one location to another through the connections and to access internal resources, subject to access constraints.

• Data are resources distributed over different locations, for example files or documents on a computer, or the cipher code needed to enter a room.

• Actions are the behaviors granted to actors; an action results in a change of an actor’s location or in a modification of some data.

• Policies are the access regulations for locations or data, together with reasonable expectations about the behavior of an employee. Policies specify the allowed actions and the credentials each of them requires: an allowed action may be performed once the actor has provided sufficient credentials.


In our abstraction of the model, nodes represent the organizational components that enable and contribute to attacks. All elements in the model provide a unique identifier that can be used to refer to the element and to obtain, for example, information on its concrete type, model, or other relevant properties. This information is used in the attack generation, but it can also provide input to the visualization of the system models, for example, whether two elements should be connected by an edge (e.g., two locations) or one should be contained within the other (e.g., two items).

While models such as ExASyM [13] and Portunes [4] also define actions that can be performed by actors and processes, these are not required for our approach. We only expect to be able to extract actors and arguments of actions from leaf nodes in attack trees.

2.2.2 Attack model

Similarly, attack models represent possible attacks on the modelled organization. For the approach in this paper, we only require that an attack goal can be divided into sub-goals that are combined either conjunctively, meaning that all sub-goals have to be achieved in order to achieve the goal, or disjunctively, meaning that only one sub-goal needs to be achieved. This is very similar to attack trees, which we discuss in Section 2.3, and just as for attack trees it would be interesting to allow more complex combinations at a later point.

As mentioned before, we require the attack model to support extraction of actors and assets from the actions in an attack tree [19, 10]. In our current work, actions are contained in the attack tree leaves. The leaf labels contain words from a regular language that provides, for example, information about the type of action, the performing actor, which asset is obtained, and where the asset is obtained from. The arguments to the action are exactly the identifiers that connect the attack tree with the system model. We do not need to impose other assumptions that are often found, e.g., about the ordering of sub-goals from left to right; this is due to the flow-insensitive nature of our visualization.
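To illustrate how identifiers might be extracted from leaf labels, the following is a minimal sketch in Python. The label grammar used here ("ACTION actor [asset [source]]"), the action names, and the helper names `parse_label` and `identifiers` are assumptions made purely for illustration; they are not the label language of the attack tree generator.

```python
import re
from typing import NamedTuple, Optional, Set

# Hypothetical label grammar: "ACTION actor [asset [source]]".
# The real leaf-label language is not reproduced here.
LABEL = re.compile(
    r"^(?P<action>[A-Z]+)\s+(?P<actor>\w+)"
    r"(?:\s+(?P<asset>\w+))?(?:\s+(?P<source>\w+))?$"
)

class Step(NamedTuple):
    action: str             # type of action
    actor: str              # identifier of the performing actor
    asset: Optional[str]    # identifier of the obtained asset, if any
    source: Optional[str]   # identifier of where the asset is obtained from

def parse_label(label: str) -> Step:
    """Split a leaf label into its action type and identifier arguments."""
    m = LABEL.match(label.strip())
    if m is None:
        raise ValueError(f"unrecognised leaf label: {label!r}")
    return Step(m["action"], m["actor"], m["asset"], m["source"])

def identifiers(step: Step) -> Set[str]:
    """All model identifiers mentioned by one attack step."""
    return {x for x in (step.actor, step.asset, step.source) if x is not None}

step = parse_label("GET charlie pin_alice alice")
print(step.actor, sorted(identifiers(step)))
```

Because only the set of identifiers is kept, the order of steps plays no role, which matches the flow-insensitive treatment described above.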

2.2.3 Running example

We use the same running example in this paper as in [6], which is based on a case study in the TREsPASS project[21] centered around an actor Alice, who receives some kind of service, e.g., care-taking, provided by an actor Charlie. Charlie’s employer has a company policy that forbids him to accept money from Alice or to steal money. Figure 2.1 shows a graphical representation of the example scenario, consisting of Alice’s home, a bank with an ATM, and a bank computer.


A basic introduction to the model:

• Alice is an elderly person who needs a care-taking service from a company. Charlie is an employee of that company and is responsible for providing the care-taking service to Alice.

• Alice’s home includes a door that requires a cipher code to enter, and a workstation containing a hard drive on which the password of the workstation is stored. Alice can use the workstation to transfer money via the bank computer.

• Alice holds a payment card, which contains the owner’s name and the associated pin code. Alice knows the pin and also the password needed to initiate transfers from her workstation via the bank computer.

• Charlie also owns a payment card; similarly, the card contains a pin code that is known only to Charlie.

• A bank, a bank computer, and an ATM provide the payment service. Obtaining money at the ATM requires a card with a pin code, as well as that very pin code, modelled as input.

Figure 2.1 shows the running example from [6]. The locations, represented by small rectangles, are connected through directed edges. Actors are represented as rectangles with a location, e.g., Alice is at home and Charlie is in the city. Both actor nodes and location nodes can contain a pin code, and Alice also has the pin code for her card. Actor nodes can represent processes running on the corresponding locations as well. The processes at the workstation and the bank computer represent the required functionality for transferring money; they initiate transfers from Alice’s home and check credentials for transfers.

Note that all elements have either a unique name or a unique value, which serves as their identifier. If an element occurs more than once, for example, the password or Alice’s pin, these occurrences represent copies of the same artefact.


Fig. 2.1 Running model example. The white rectangles represent locations or items, the gray rectangles represent processes and actors; actors contain the items or data owned by the actor. The round nodes represent data. Solid lines represent the physical connections between locations, and the dotted lines represent the present location of the actors and processes. The dashed rectangles in the upper right part of some nodes represent the policies assigned to these nodes.


2.3 Attack tree

Attack trees [19, 10] are widely used in both industry and academia; they are informative and easy to interpret. By analysing an attack tree, it is straightforward to identify the potential attacks that could be carried out against the system.

Attack trees depict attacks hierarchically: the root represents the goal of the attack, and the leaf nodes represent the actions that must be taken in order to achieve the goal. In between are intermediate nodes, which represent sub-goals into which the final goal is divided.

Replacing rule

All non-leaf nodes can be classified into two kinds. A conjunctive node has the same effect as the conjunction of all its child nodes, so the node and the conjunction of its children are equivalent. A disjunctive node can be replaced by any one of its child nodes with the same consequence [6].

Figure 2.2 shows the result of applying the replacing rule to a simple attack tree. The goal of this attack tree is to steal a computer located in a room. This goal can be decomposed into entering the room and grabbing the computer; these two sub-goals are conjunctive. Entering the room can in turn be replaced either by breaking into the room or by social engineering of a person who has a key to the room.

Fig. 2.2 How to steal a computer in a room
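The replacing rule can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions: the `Node` class, the `kind` values "leaf", "and", and "or", and the function `attacks` are hypothetical names, and each attack is represented as an unordered set of leaf actions.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import FrozenSet, List

@dataclass
class Node:
    label: str
    kind: str = "leaf"   # "leaf", "and" (conjunctive) or "or" (disjunctive)
    children: List["Node"] = field(default_factory=list)

def attacks(node: Node) -> List[FrozenSet[str]]:
    """Return every attack as the set of leaf actions it requires."""
    if node.kind == "leaf":
        return [frozenset([node.label])]
    child_attacks = [attacks(c) for c in node.children]
    if node.kind == "or":
        # Disjunctive: any single child may replace the parent.
        return [a for alternatives in child_attacks for a in alternatives]
    # Conjunctive: pick one attack per child and combine them.
    return [frozenset().union(*combo) for combo in product(*child_attacks)]

# The tree of Fig. 2.2: steal a computer located in a room.
tree = Node("steal computer", "and", [
    Node("enter room", "or", [
        Node("break into room"),
        Node("social-engineer key holder"),
    ]),
    Node("grab computer"),
])

for attack in attacks(tree):
    print(sorted(attack))
```

For this tree the sketch yields two attacks, one per way of entering the room, each combined with grabbing the computer.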


We now briefly explain how attack trees are generated from attack scenarios. Since the implementation of the attack tree generator is outside the scope of this work, we only describe how to use the tool to obtain its output.

Fig. 2.3 Overview for the tool to generate attack tree

Figure 2.3 shows how we use the attack tree generator to obtain an attack tree file. The input is an XML file that represents the whole scenario. The scenario file contains information about the profit, i.e., how much the attackers would gain from a successful attack, together with data about the credentials owned by the attacker. The scenario also needs the model information when it is parsed; the model information is introduced in the next section. In summary, the tool parses and analyses the scenario file and then generates the attack tree file, an XML file that shows all possible attacks in the model. An example of such an attack tree follows.

Figure 2.4 is an example of a whole attack tree. With the root being its goal, all sub-nodes are combined by their parent nodes in a conjunctive or disjunctive way, and the leaf nodes are the basic actions that need to be performed to run the attack. For readability we do not present a complicated attack tree, because an entire attack tree would take up too much space and make the plot messy. Note that this attack tree is for presentation purposes only; our following analysis examples are based on a more complex one.

Fig. 2.4 An example of a general attack tree

Chapter 3 Analysis of component contribution to attacks

Attack trees give a good view of the steps required to carry out an attack, and it is feasible to enumerate all possible paths by following the grammar of the attack tree. However, for a large attack tree, visualizing the relations between several model components is almost impossible, since the number of interrelations grows exponentially. Besides, the generated attacks consider all three layers and analyse their contribution accordingly, but the contribution is only visible in the attack tree as part of the performed steps; it cannot be mapped back directly to the targeted actors or assets. We therefore have to provide a good mechanism for visualizing attributes of the attack tree, such that the resulting visualization provides a link between the graphical attack model and the graphical system model.

The analytic risk assessment based on socio-technical security models operates on attack trees and on judgments about quantitative properties of the actions performed and the actors performing them. After briefly discussing how to evaluate attack models, we present a simple approach for visualizing several, potentially incomparable properties of such models.

3.1 Evaluating attack models

The attack models generated from system models form the basis of analytic risk assessment. Properties of interest [11] of these attacks include required resources, such as time or money, likelihood of success, or impact of the attack, based on annotations of the leaf nodes in attack trees. Analyses [1] also identify the Pareto frontier of incomparable properties, for example, the likelihood of success of an attack and the required budget.
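As a small illustration of what a Pareto frontier over two incomparable properties means, consider the following sketch in Python. The attack names and numbers are invented for illustration only, and `dominates` and `pareto_frontier` are hypothetical helper names; the sketch takes the attacker's point of view, in which a higher likelihood of success and a lower required budget are both preferable.

```python
from typing import List, Tuple

# (name, likelihood of success, required budget); illustrative values only.
Attack = Tuple[str, float, float]

def dominates(a: Attack, b: Attack) -> bool:
    """a dominates b if it is no worse in both properties and better in one."""
    no_worse = a[1] >= b[1] and a[2] <= b[2]
    strictly_better = a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly_better

def pareto_frontier(attacks: List[Attack]) -> List[Attack]:
    """Keep the attacks that no other attack dominates."""
    return [a for a in attacks if not any(dominates(b, a) for b in attacks)]

candidates = [
    ("A", 0.9, 500.0),
    ("B", 0.6, 100.0),
    ("C", 0.5, 300.0),   # dominated by B: less likely and more expensive
    ("D", 0.9, 800.0),   # dominated by A: equally likely but more expensive
]

print([name for name, _, _ in pareto_frontier(candidates)])
```

Attacks A and B remain on the frontier because neither is better than the other in both properties at once.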


The mapping of actions to metrics can again be achieved by mapping the action and its arguments to a specific value. These metrics can represent any quantitative knowledge about components, for example, likelihood, time, price, impact, or probability distributions. The latter could describe the behavior of actors or timing distributions. For the visualization described in this article, the mapping of leaf nodes to metrics and the analyses performed are irrelevant; we assume an attack tree and a mapping from its nodes to an analysis result.

3.2 Contribution of Components of Organisations to Attacks

Now we put the different elements described above together to figure out the relation between attack trees and system models. Remember that we require all elements in the model to have unique identifiers; we use this identifier to associate model components and attack tree actions.

As for attack trees we need a measure for how much a model element contributes to a given attack. We apply techniques similar to our earlier work on insiderness [14].

3.2.1 Measuring Impact

Computing the actual impact of a model component on an attack is as difficult as computing the impact of an attack; the results can be used for ordering attacks or influence, but they should not be taken as absolute answers. With this in mind we have applied several techniques for measuring the impact of components on attacks.

As mentioned before, we require the attack model to support extraction of actors and assets from the actions in an attack tree, and actions are contained in the attack tree leaves. Leaf labels provide information about the type of action, the performing actor, which asset is obtained, and where the asset is obtained from. All this information is provided through the identifiers that connect the attack tree with the system model.

Counting Occurrences:

The simplest concept of measuring impact is that of counting occurrences of identifiers. It computes for a given entity in how many places it contributes to the whole attack tree or a path. The occurrence-based impact ignores impact, likelihood, or other analysis results. It is measured either as an absolute number or as a percentage of the occurrences of identifiers in the path or tree being analysed. It is computed per identifier id for a set of nodes in a subtree of the attack tree that represents an attack, assuming that id ∈ S returns 1 if true, and 0 otherwise, and that node n has successors c ∈ succ(n):

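\[
I(\mathit{id}, n) :=
\begin{cases}
[x, x], \quad x = (\mathit{id} \in \mathit{actor}(n)) + (\mathit{id} \in \mathit{assets}(n)), & \text{if } n \text{ is a leaf node} \\
[l, u], \quad l = \min(I(\mathit{id}, c)),\ u = \max(I(\mathit{id}, c)), & \text{if } n \text{ is a disjunctive node} \\
[l, u], \quad l = \sum_{c} \{\, l' \mid [l', \_] = I(\mathit{id}, c) \,\},\ u = \sum_{c} \{\, u' \mid [\_, u'] = I(\mathit{id}, c) \,\}, & \text{if } n \text{ is a conjunctive node}
\end{cases}
\tag{3.1}
\]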

As a first crude measure, this impact provides a defender with a quick overview of which components of the organisation actually occur in the attack tree.

The occurrence-based impact provides for every identifier a lower and an upper bound on the number of occurrences. For leaf nodes the two bounds are the same; for conjunctive nodes they are the sums of the lower and of the upper bounds of the child nodes; for disjunctive nodes the lower bound is the minimum of the lower bounds, and the upper bound is the maximum of the upper bounds of the child nodes. The combination of lower and upper bounds provides a measure for how reliable the numbers are. It also allows us to identify whether certain elements occur in all attacks: if I(id, r) = [x, _] for some identifier id, where r is the root of the attack tree, and x > 0, then the element with identifier id is contained in every attack in the tree.
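The recursion of Equation (3.1) can be sketched directly in Python. The `Node` class and the function name `occurrence_impact` are assumptions made for illustration, as is the small example tree, which reuses identifiers in the spirit of the steal-a-computer example; the bounds are returned as a pair (lower, upper).

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

@dataclass
class Node:
    kind: str                                   # "leaf", "and" or "or"
    actor: Set[str] = field(default_factory=set)
    assets: Set[str] = field(default_factory=set)
    children: List["Node"] = field(default_factory=list)

def occurrence_impact(ident: str, n: Node) -> Tuple[int, int]:
    """Lower and upper bound of occurrences of ident below node n."""
    if n.kind == "leaf":
        x = int(ident in n.actor) + int(ident in n.assets)
        return (x, x)
    bounds = [occurrence_impact(ident, c) for c in n.children]
    lows = [low for low, _ in bounds]
    highs = [high for _, high in bounds]
    if n.kind == "or":
        # Disjunctive: minimum of lower bounds, maximum of upper bounds.
        return (min(lows), max(highs))
    # Conjunctive: bounds of all children add up.
    return (sum(lows), sum(highs))

# Hypothetical example: enter the room (two alternatives), then grab the computer.
tree = Node("and", children=[
    Node("or", children=[
        Node("leaf", {"charlie"}, {"key"}),
        Node("leaf", {"charlie"}, {"alice"}),
    ]),
    Node("leaf", {"charlie"}, {"computer"}),
])

for ident in ("charlie", "key", "computer"):
    print(ident, occurrence_impact(ident, tree))
```

In this example the lower bound for "charlie" at the root is greater than zero, so the actor takes part in every attack, whereas "key" has a lower bound of zero because only one of the alternatives uses it.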

Weighted Sum:

The impact factor based on occurrences in the generated attacks is a rather crude approximation, since every occurrence of an identifier is assigned the same impact, independent of its actual contribution to the attack. Given that the analyses of attack trees described in Section 3.1 provide us with quantitative information about attacks, we can improve on the occurrence-based ranking by weighting occurrences of identifiers with the impact of the attack they occur in. The factors we can choose from are limited only by the available analyses, and include, for example, the likelihood of success, required time, difficulty, and cost.

In contrast to the occurrence-based impact, we now include one of the analysis results by weighting the count for an identifier with the weight of the path, and potentially normalising it. As before, it is measured either as an absolute number or as a percentage of the occurrences of identifiers in a subtree of the tree being analysed. It is computed per identifier id for a node on a path in the attack tree, assuming that id ∈ S returns 1 if true, and 0 otherwise, that node n has successors c ∈ succ(n), and that val returns the result of the attack tree analysis for a node n in the (sub-)tree p:

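\[
I(\mathit{id}, n, p) :=
\begin{cases}
v_l, \quad v_l = (\mathit{id} \in \mathit{actor}(n)) + (\mathit{id} \in \mathit{assets}(n)), & \text{if } n \text{ is a leaf node} \\
v_{ca}, \quad v_{ca} = \max(I(\mathit{id}, c, p)), & \text{measuring difficulty, time, cost if } n \text{ is a conjunctive node; and likelihood if a disjunctive node} \\
v_{cm}, \quad v_{cm} = \min(I(\mathit{id}, c, p)), & \text{measuring likelihood if } n \text{ is a conjunctive node; and difficulty, time, cost if a disjunctive node}
\end{cases}
\tag{3.2}
\]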
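One possible reading of Equation (3.2) is sketched below in Python. The `Node` class, the metric names, and the function names are assumptions made for illustration. In particular, the equation as printed does not show where val enters, so the final multiplication by a path value in `weighted_impact` is our assumption about how the count is weighted, not a statement of the original definition.

```python
from dataclasses import dataclass, field
from typing import List, Set

COST_LIKE = {"difficulty", "time", "cost"}      # metrics treated alike in (3.2)

@dataclass
class Node:
    kind: str                                   # "leaf", "and" or "or"
    actor: Set[str] = field(default_factory=set)
    assets: Set[str] = field(default_factory=set)
    children: List["Node"] = field(default_factory=list)

def aggregated_count(ident: str, n: Node, metric: str) -> int:
    """Follow the case distinction of Equation (3.2) for one identifier."""
    if metric not in COST_LIKE and metric != "likelihood":
        raise ValueError(f"unknown metric: {metric!r}")
    if n.kind == "leaf":
        return int(ident in n.actor) + int(ident in n.assets)
    values = [aggregated_count(ident, c, metric) for c in n.children]
    # max: cost-like metric at a conjunctive node, likelihood at a disjunctive one;
    # min: likelihood at a conjunctive node, cost-like metric at a disjunctive one.
    use_max = (n.kind == "and") == (metric in COST_LIKE)
    return max(values) if use_max else min(values)

def weighted_impact(ident: str, root: Node, metric: str, path_value: float) -> float:
    """Assumption: weight the count by the analysis result val of the path."""
    return aggregated_count(ident, root, metric) * path_value

tree = Node("and", children=[
    Node("or", children=[
        Node("leaf", {"charlie"}, {"key"}),
        Node("leaf", {"charlie"}, {"alice"}),
    ]),
    Node("leaf", {"charlie"}, {"computer"}),
])

print(weighted_impact("charlie", tree, "cost", 250.0))
```

Normalising the result, for example by the sum over all identifiers, would turn the absolute numbers into percentages; that step is left out of the sketch.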

3.2.2 Importance of visualization

The analysis techniques above give us basic information on how each component contributes to the attacks on an organization. However, the outcome of the analysis is a large amount of data, which is hard for an organization to digest. We therefore propose a straightforward visualization mechanism.

Visualising attack tree:

The existing attack tree demonstration tool, which we used to present the attack tree in 2.3, is good enough to convey the structure of the attack tree. However, it does not emphasise the parts with high impact, so it is easy to get lost when facing a large attack tree. In Chapter 4 we present how to apply analyses to attack trees and how to use the analysis results for a better visualization of an attack tree. Given enough information showing that the nodes in the attack tree are not equally important, readers can focus on the nodes with higher importance.

Visualising Paths:

Depending on the kind of attack tree, it may or may not contain information about the moves of the attacker in the organisation. If the move information is contained in the attack tree, then the methods above extend to visualising, in the system model, which locations of the modelled organisation are most important for the attack. This information is especially interesting for deciding about the need for (better) surveillance.


Visualising system model:

Understanding the attacks in an attack tree is important, but the interrelation of different paths in the attack tree can be confusing. Knowing which path of an attack tree carries a high attack risk is of limited help to an organization; it is more practical for the organization to apply protective measures to the components. Thus we need to provide a visual system model that helps the organization to prevent attacks. The two visualizations above provide a link between graphical attack models and graphical system models. In Chapter 5, we elaborate how to map the attack analysis on the attack tree to the relevant system model, and how to present the elements of the system model in different visual styles to show their importance.

Visualising different components:

There exist many different analyses on attack trees, and it may be interesting to investigate and visualise several values combined on a system model. For many interesting counting approaches, such as the ones discussed here, one can combine different values into a vector and apply the targeted counting operation to each value. Since the values generally may not be directly comparable, one can then either apply Pareto-based techniques, visualise the different values simultaneously, or apply a summation function that combines the individual values.
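As an illustration of the Pareto-based option, the following sketch keeps the identifiers whose value vectors are not dominated by any other vector. It assumes, purely for this example, that every coordinate is oriented so that a larger value means a more important component; the names and numbers are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as large as b everywhere and larger somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores):
    """Return the identifiers whose vectors no other vector dominates."""
    return sorted(
        name
        for name, vector in scores.items()
        if not any(dominates(other, vector) for other in scores.values())
    )


# Hypothetical (occurrence count, normalised likelihood) per identifier.
scores = {"Charlie": (4, 0.9), "door": (2, 0.5), "card": (1, 0.95)}
print(pareto_front(scores))
```

Here "door" is dominated by "Charlie" in both coordinates, while "Charlie" and "card" are incomparable and both remain on the front.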


Chapter 4 Visualizing attacks

As mentioned before, the interrelations in a large attack tree can cause considerable confusion. We therefore use a variety of visual styles to differentiate nodes and paths according to their importance. The importance of each node and path relies on the impact analysis described in the previous chapter.

In this chapter, we first introduce the tools and frameworks used to implement the attack tree analysis and to visualize attack trees. Following that, we discuss the implementation of the analysis techniques and how we use the analysis results to build the attack tree visualization. The result of each analysis step is presented together with some discussion.

4.1 Tools and frameworks involved

First, Python [16] is the main programming language used for the scripts that parse the attack tree. We chose it because its syntax is clear and tidy and because it provides many libraries and frameworks that are helpful in this project. Since Python is well known, we do not describe it in more detail. Here, we focus on the two Python technologies that we use most frequently: xml.dom.minidom and NetworkX.


4.1.1 Interface xml.dom.minidom

Attack trees are a very important part of our project; most of the analysis is based on the attributes of the attack tree. In our project, attack trees are originally stored as XML files, so we need an XML parser to obtain all the attributes and their corresponding values stored in the attack tree. In Python, the interface xml.dom.minidom serves as this parser.

xml.dom.minidom [17] is a minimal implementation of the DOM (Document Object Model) interface, and it provides a user-friendly API for dealing with XML documents.

The interface offers two functions that return a Document object representing the content of an XML document.

The first function is

xml.dom.minidom.parse(filename_or_file[, parser[, bufsize]])

which returns a Document object from the given input. The input may be a file name or a file-like object; the second parameter, if given, must be a SAX2 parser object. The function changes the document handler of that parser and activates namespace support.

The other function is

xml.dom.minidom.parseString(string[, parser])

It works in almost the same way as xml.dom.minidom.parse. The only difference is that parse takes a file name or file-like object as its first parameter, whereas parseString takes a string containing the XML document.

What parse() and parseString() do is connect an XML parser with a "DOM builder" that accepts parse events from any SAX parser and converts them into a DOM tree.

Once the Document object is available, it is easy to access the different parts of the XML document through its properties and methods, which are defined in the DOM specification. The most important property of the Document object in our case is documentElement [17], which gives the root element of the document. From there, we can tell whether a node is a leaf node based on its number of children: if it has none, it is a leaf node. For a non-leaf node, the parser gives us the sub-tree of the node, to which we then apply further analysis.
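The following sketch shows how parseString and documentElement can be combined to find leaf nodes. The XML layout (element name "node", attributes "label" and "refinement") is an assumption made for this example and need not match the files produced by the attack tree generator. Note that minidom also stores the whitespace between tags as text nodes, so the sketch counts element children only.

```python
from xml.dom.minidom import parseString

# Hypothetical attack tree layout; the real generator's schema may differ.
XML = """<node label="get money" refinement="disjunctive">
  <node label="FORCE Charlie door"/>
  <node label="steal card" refinement="conjunctive">
    <node label="MOVE Eve hall office"/>
    <node label="IN Eve item card c1 actor Charlie a1"/>
  </node>
</node>"""


def element_children(node):
    """Child elements only; whitespace text nodes are skipped."""
    return [c for c in node.childNodes if c.nodeType == c.ELEMENT_NODE]


def is_leaf(node):
    """A node without child elements is a leaf of the attack tree."""
    return len(element_children(node)) == 0


root = parseString(XML).documentElement
# getElementsByTagName returns all descendants in document order.
leaf_labels = [n.getAttribute("label")
               for n in root.getElementsByTagName("node") if is_leaf(n)]
print(leaf_labels)
```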

4.1.2 Networkx

Before choosing a tool for plotting the attack tree, we decided on a few visual styles for differentiating the impact factors of nodes and paths, which are discussed in section 4.2.4. From these we know that we need a tool that can display a tree-structured network with different colors, customized edge widths and varying transparency. After evaluating several Python packages, we chose NetworkX.

NetworkX [12] is a Python package for creating and manipulating complex networks.

It provides basic functionality for drawing nodes and edges, and it integrates with other packages to display nodes and edges with different colors, transparency and edge widths.

The framework provides classes such as Graph (undirected graphs), DiGraph (directed graphs) and MultiGraph (undirected graphs that allow multiple edges between the same pair of nodes).

Here, we introduce the two drawing functions that are essential for our implementation.

Draw networkx nodes

draw_networkx_nodes(G, pos, nodelist=None, node_size=300, node_color='r', node_shape='o', alpha=1.0, cmap=None, vmin=None, vmax=None, ax=None, linewidths=None, label=None, **kwds)

This function draws the nodes of a graph. We explain the meaning of the parameters we use and the requirements on their arguments.

• G is a graph instance of class Graph, DiGraph or MultiGraph;

• pos is a dictionary giving the position of every node;

• node_size defines the size of the nodes:

if it is a scalar, all nodes have the same size;

if it is an array whose length equals the number of nodes, each node takes its size from the corresponding array entry.


• node_color has the same requirements as node_size but determines the color of the nodes.

• node_shape takes a string that determines the node shape; in our case we keep the default 'o'.

• alpha sets the transparency of the nodes. For the attack tree, all nodes are drawn half transparent; for the edges, which we elaborate on later, alpha is a critical parameter.

• label is an optional string that labels the drawn nodes in the plot legend.

• linewidths sets the border width of the nodes. In our project we use this parameter to distinguish node types: since nodes are either disjunctive or conjunctive, the border identifies the type.

The remaining parameters are not relevant for us, so we do not elaborate on them.

Draw networkx edges

draw_networkx_edges(G, pos, edgelist=None, width=1.0, edge_color='k', style='solid', alpha=1.0, edge_cmap=None, edge_vmin=None, edge_vmax=None, ax=None, arrows=True, label=None, **kwds)

This function [12] shares most of its parameters with draw_networkx_nodes, except that all styles apply to edges. We have to pay attention to the following parameters:

1. alpha: in draw_networkx_nodes we always set alpha to 0.5, because the transparency of nodes carries no meaning in our visualization. The transparency of an edge, however, represents the likelihood of the path in our attack tree visualization, so the alpha of each edge has to be set carefully.

2. width sets the width of the edges; this is another fundamental parameter whose values we need to assign carefully.

Similarly, the remaining parameters can be ignored since we do not use them.

From the above, it is clear that many functions in NetworkX take either a single value or an array to specify properties of the nodes or edges. NetworkX is also quite efficient, since it is based on a "dictionary of dictionaries" data structure, and it is scalable and portable as well.

Ideally, the attack tree should be visualized in a shape similar to the one produced by the attack tree generator. If we use the built-in NetworkX layout functions to obtain the node positions, the nodes are distributed rather arbitrarily. An advantage of NetworkX is that it is compatible with graphviz_layout, which places the nodes in the positions of a tree, so that the plotted nodes still appear in the shape of a tree.

Matplotlib is a Python plotting library; we use it to generate the attack tree visualization plots and to produce high-quality figures in hardcopy formats.

Both Graphviz and Matplotlib integrate well with NetworkX, so it is convenient to use this tool to implement all the features required in our project. This is another key reason why we chose this framework for our work.
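A minimal sketch of how the two drawing functions might be combined is given below. It is not the code of our module: the tree, the colours and the hand-set positions are hypothetical (the text obtains positions from a Graphviz layout), and the edgecolors argument used to make node borders visible is assumed to be available, as it is in recent NetworkX releases. The plotting libraries are imported inside draw_tree so that the small helper can be used on its own.

```python
# Hypothetical three-level tree with positions set by hand.
EDGES = [("root", "a"), ("root", "b"), ("b", "c"), ("b", "d")]
POS = {"root": (1.5, 2), "a": (0.5, 1), "b": (2.5, 1), "c": (2, 0), "d": (3, 0)}
NODE_TYPES = {"root": "or", "a": "leaf", "b": "and", "c": "leaf", "d": "leaf"}


def border_widths(nodes, node_types, width=2.0):
    """One border width per node: only conjunctive nodes get a border."""
    return [width if node_types[n] == "and" else 0.0 for n in nodes]


def draw_tree(filename="attack_tree.png"):
    """Draw the example tree; requires networkx and matplotlib."""
    # Imported here so that border_widths works without the plotting stack.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt
    import networkx as nx

    graph = nx.DiGraph()
    graph.add_edges_from(EDGES)
    nodes = list(graph.nodes())
    # edgecolors is assumed to be supported by the installed NetworkX.
    nx.draw_networkx_nodes(graph, POS, nodelist=nodes, node_size=600,
                           node_color="lightgrey", alpha=0.5,
                           linewidths=border_widths(nodes, NODE_TYPES),
                           edgecolors="black")
    # One call per path, so each path gets its own width, alpha and colour.
    nx.draw_networkx_edges(graph, POS, edgelist=[("root", "a")],
                           width=1.0, alpha=0.3, edge_color="green")
    nx.draw_networkx_edges(graph, POS,
                           edgelist=[("root", "b"), ("b", "c"), ("b", "d")],
                           width=4.0, alpha=0.9, edge_color="red")
    nx.draw_networkx_labels(graph, POS)
    plt.axis("off")
    plt.savefig(filename)
    plt.close()
```

Calling draw_tree() writes the plot to attack_tree.png. Drawing the edges of each path in a separate call keeps width and alpha as plain scalars.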

4.2 Implementation

As mentioned in 3.2, in order to relate attack trees to system models, we need a measure of how much a model element contributes to a given attack.

As a premise, we require all elements in the model to have unique identifiers; we use these identifiers to associate model components with attack tree actions.

In this section, we describe how we implement the techniques for analyzing the attack tree. Besides the two impact measurement techniques mentioned in section 3.2, counting occurrences and weight sum, we also describe a method for computing the attack tree paths. We then present in detail how the analysis results are used for plotting the attack tree.

Finally, some test samples are used to verify the correctness of our analysis and its results, and we discuss the strengths and weaknesses of our analysis techniques.

4.2.1 Implementation for counting occurrence

The first technique we use to measure the impact is to count the occurrences of each element in the attack model. The occurrence count gives a basic idea of how many times each element contributes to the whole attack model. This measurement ignores all other impact factors of the element, such as likelihood, difficulty, or cost.

Our starting point is the attack tree XML file generated in 2.3. We use an attack tree parser to go through each node in the attack tree and record how often each element appears. Each element must be one of the items in the element list of the system model, and all model elements need to have unique tags.

Algorithm to reach all the leaf nodes in attack tree

Due to the structure of attack trees, only the element occurrences in leaf nodes matter, since every non-leaf node is just a conjunctive or disjunctive combination of its child nodes. Therefore, the occurrence count starts from the leaf nodes. Here, we describe how to reach all leaf nodes in the attack tree with the help of the parser.

When applying the parser xml.dom.minidom.parse to the attack tree XML file, we obtain the root node and its children. We can then treat each child node as a sub-root and continue the process. To record the occurrence count for all leaf nodes, we use a post-order traversal to visit them. Starting from the root node, the traversal descends until it arrives at a leaf node, and then returns to the leaf node's parent so that the other child branches of the parent can be traversed. Similarly, the traversal returns one level further up once all nodes in the parent's sub-tree have been visited; when it gets back to the root node, it terminates. Once the whole traversal is done, all leaf nodes have been visited. The parser interface also provides the children of each node, so each time we reach a node we use the parser to obtain its sub-tree. If the number of children is zero, we know that this is a leaf node.

We use recursion to realize the computation. Starting from the root, we descend to its children; once the recursion reaches a leaf node, the occurrence count for the leaf node is recorded. When the traversal returns, the occurrence counts for the non-leaf nodes are computed layer by layer. When it is back at the root node, we know the occurrences of all elements that contribute to the attack tree.
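The recursive post-order traversal described above can be sketched as follows. For brevity the tree is given as nested dictionaries rather than as parsed XML; this layout and the node labels are assumptions made for the example.

```python
def post_order(node, visit):
    """Call visit on every node, children first and the node itself last."""
    for child in node.get("children", []):
        post_order(child, visit)
    visit(node)


# Hypothetical tree as nested dictionaries; a node without children is a leaf.
tree = {"label": "root", "children": [
    {"label": "a"},
    {"label": "b", "children": [{"label": "c"}, {"label": "d"}]},
]}

visited, leaves = [], []


def record(node):
    """Remember the visiting order and collect the leaf labels."""
    visited.append(node["label"])
    if not node.get("children"):
        leaves.append(node["label"])


post_order(tree, record)
print(visited)
print(leaves)
```

Because a node is visited only after its whole sub-tree, the results of the children are already available when a non-leaf node is processed, which is what the layer-by-layer computation relies on.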


Leaf node grammar

Every time we arrive at a leaf node, we analyze the leaf label by splitting it into its words. The leaf labels follow a few fixed, publicly known patterns. The leaf grammar can be defined as follows:

1. FULFILL actor1 action actor2 means that the goal is achieved if and only if actor1 fulfils the described requirement. Since we do not consider actions such as "fulfill" for the occurrence, we only count actor1 and actor2.

2. FORCE/MOVE actor location locality is straightforward; we keep track of the occurrences of both the actor and the locality in the leaf labels.

3. IN actor AssetKind AssetName AssetID TargetKind TargetName TargetID. The interpretation of this in-action is that the actor needs to input an asset from someone or somewhere else in order to meet the requirements. In this case, we only count the occurrences of the actor, the AssetID and the TargetID that appear in the statement.

4. MAKE actor1 actor2 + {one-of-the-patterns-above}. This make-action means that the attacker picks a victim and tries to make him perform a certain action. Here we only take actor1 and the statement following actor2 into account, since the victim actor2 is certain to appear in that statement.
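The four patterns above can be turned into a small label parser, sketched below. The sketch assumes that the words of a label are separated by whitespace, that the locality is the last word of a FORCE/MOVE label, and that the nested statement of a MAKE label starts directly after actor2; the example labels are hypothetical.

```python
from collections import Counter


def identifiers(label):
    """Return the model identifiers that one leaf label contributes."""
    tokens = label.split()
    keyword = tokens[0]
    if keyword == "FULFILL":          # FULFILL actor1 action actor2
        return [tokens[1], tokens[3]]
    if keyword in ("FORCE", "MOVE"):  # FORCE/MOVE actor location locality
        return [tokens[1], tokens[-1]]
    if keyword == "IN":               # IN actor kind name AssetID kind name TargetID
        return [tokens[1], tokens[4], tokens[7]]
    if keyword == "MAKE":             # MAKE actor1 actor2 <nested statement>
        return [tokens[1]] + identifiers(" ".join(tokens[3:]))
    raise ValueError("unknown leaf pattern: " + label)


labels = [
    "FULFILL Eve trusts Charlie",
    "MOVE Eve hall office",
    "IN Eve item card c1 actor Charlie a1",
    "MAKE Eve Charlie MOVE Charlie hall office",
]
counts = Counter(name for label in labels for name in identifiers(label))
print(counts)
```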

Algorithm to get the occurrence count of non-leaf nodes

Once the occurrence counts of all leaf nodes are known, the occurrence count of the elements at each non-leaf node is computed by merging the counts of its child nodes according to the node type, either disjunction or conjunction. For a conjunction, the occurrences in all child nodes are summed up; for a disjunction, a range from the minimum to the maximum occurrence value of each element is extracted from the child nodes.

The occurrence counts for leaf nodes and non-leaf nodes are computed during the traversal. When the traversal has visited all nodes in the sub-tree of a sub-root, which must be a non-leaf node, the occurrence count of this sub-root is computed with the algorithm above.

A sample occurrence result is represented as a map, for example 'Charlie': {'min': 1, 'max': 4}, which denotes that Charlie appears at least once and at most four times in this attack. Figure 4.1 shows the functions in the module that implements the element occurrence count in the attack tree.

Fig. 4.1 Main functions in the occurrence count module

In this module, not only the element occurrence count of each node is recorded; the child nodes and the parent node of each node are also stored, together with the node type, either conjunctive or disjunctive. This information is helpful for the later implementation phases discussed in the following sections.
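The merging rule for non-leaf nodes can be sketched as follows, using the {'min', 'max'} map format shown above. The function name and the treatment of an element that is missing from a child (it counts as zero occurrences there) are assumptions made for this example.

```python
def merge_occurrences(children, conjunctive):
    """Combine the {'min', 'max'} occurrence maps of the child nodes."""
    absent = {"min": 0, "max": 0}  # element does not occur in that child
    merged = {}
    for name in set().union(*children):
        lows = [child.get(name, absent)["min"] for child in children]
        highs = [child.get(name, absent)["max"] for child in children]
        if conjunctive:
            # all children are needed: occurrences add up
            merged[name] = {"min": sum(lows), "max": sum(highs)}
        else:
            # one child is chosen: keep the range over the alternatives
            merged[name] = {"min": min(lows), "max": max(highs)}
    return merged


leaf_a = {"Charlie": {"min": 1, "max": 1}}
leaf_b = {"Charlie": {"min": 3, "max": 3}, "door": {"min": 1, "max": 1}}
leaf_c = {"Charlie": {"min": 1, "max": 1}, "Eve": {"min": 1, "max": 1}}

either = merge_occurrences([leaf_a, leaf_b], conjunctive=False)
both = merge_occurrences([either, leaf_c], conjunctive=True)
print(both["Charlie"])
```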

4.2.2 Implementation of Weight Measurement

The basic concept of weight measurement has been described in section 3.2; the implementation follows the idea described there.

A few factors determine the weight of a single element, which can be any actor or asset in the attack model; the weights of the elements are then used to derive the weights of the nodes.

These factors are the likelihood, time consumption, difficulty and cost of carrying out an operation on an element; they can be seen as attributes of this element. Each factor is divided into four levels, and the value is based on the ease of access and availability of the element.

One major difference between occurrence and weight measurement is that, when computing the occurrence, we only consider the elements that appear in the model and ignore the actions in the leaf label, whereas for the weight measurement the actions are analyzed as well. For example, actions such as "force" or "move" require some cost and might also increase the difficulty or decrease the likelihood; furthermore, an action like "trust" might be quite time consuming. It is therefore necessary to consider the effects of these actions. We consider the effect of these actions on the attack to be comparable to that of the other elements.

Note: When we mention elements in the occurrence count module, we refer to the elements that make up the components in the attack model. In the rest of this work, elements means the component elements as well as all actions that can be applied to these component elements.

In order to assign impact values to all elements, we use four maps, one for each of the four factors mentioned above. Each map is divided into four levels.

1. Time consumption map: four levels MT, HR, DY and MN, which represent the minute, hour, day and month level of time.

2. Difficulty map: four levels V, H, M and L, which stand for very high, high, medium and low, respectively.

3. Likelihood map: defined in the same way as the difficulty map.

4. Cost map: the levels are the digits 1 to 4, ranging from low to high cost.

For each element, the four factors are independent. For example, if an attacker wants to perform an operation on the actor Charlie in the attack tree, the attacker needs time at the hour level and cost at level 3, the difficulty of the operation is low, and the likelihood is very high. All factors depend only on the characteristics of the element, which are predefined in the system; they have no connection with each other.

It is important to note that, for the same impact factor, the gap between one level and the next higher level is considerable, so the effect of the lower level can be ignored when it is combined with a higher level. For example, if one element has the time consumption level MN (month) and another action has the level HR (hour), the hour can be ignored next to the month level; the result of combining the two is MN (month). The same rule applies to the other three factors.


Algorithm to get the weight sum for nodes in the attack tree

The computation of the weight sum shares many similarities with the occurrence count, and both are carried out in the same traversal.

When we measure the weight of the nodes, we first record the weight of each leaf node; following the grammar for leaf labels described in section 4.2.1, we compute the weight of the node for all four factors.

As described at the beginning of this section, when analyzing a leaf label, all words in the label, including the actions and the component elements, contribute to the weight measurement of the node. Our goal is to determine the four factors referred to above for each node.

1. For leaf nodes, we analyze all elements in the leaf label. For each word that appears in the label, we consult the four impact factor maps defined above to look up its weight. For difficulty, cost and time, we take the highest level among the elements and assign it to the node; for likelihood, we take the lowest level among the elements.

2. For a non-leaf node, if it is conjunctive, the likelihood is the lowest among its child nodes, while the cost, difficulty and time are the highest levels among its child nodes. For a disjunctive node, the cost, difficulty and time are the lowest, while the likelihood is the highest among its child nodes. The root node is treated like any other non-leaf node.

As stated before, the weight measurement is carried out in parallel with the occurrence count, and the same recursive approach is used. One example is "FORCE Charlie door" {'likelihood': 'L', 'cost': 4, 'time': 'HR', 'difficulty': 'V'}, which means that forcing Charlie to the door has a low likelihood, a cost of 4, a time consumption at the hour level, and a very high difficulty.
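The level-based combination rules can be sketched as follows. The level orders follow the four maps defined above; the per-word weights of "FORCE" and "door" are hypothetical values chosen so that the leaf reproduces the example, while the values for Charlie are those given in the text.

```python
# Levels of each factor, ordered from lowest to highest.
LEVELS = {
    "time": ["MT", "HR", "DY", "MN"],
    "difficulty": ["L", "M", "H", "V"],
    "likelihood": ["L", "M", "H", "V"],
    "cost": [1, 2, 3, 4],
}


def combine(weights, conjunctive=True):
    """Merge the factor levels of several elements or child nodes."""
    result = {}
    for factor, order in LEVELS.items():
        ranks = [order.index(w[factor]) for w in weights]
        # Conjunctive: highest time/difficulty/cost, lowest likelihood.
        # Disjunctive: the other way round.
        take_highest = conjunctive != (factor == "likelihood")
        result[factor] = order[max(ranks) if take_highest else min(ranks)]
    return result


# Hypothetical per-word weights; only Charlie's values come from the text.
WORDS = {
    "FORCE": {"likelihood": "L", "cost": 4, "time": "MT", "difficulty": "V"},
    "Charlie": {"likelihood": "V", "cost": 3, "time": "HR", "difficulty": "L"},
    "door": {"likelihood": "H", "cost": 1, "time": "MT", "difficulty": "M"},
}

# A leaf combines its words with the same rule as a conjunctive node.
leaf = combine([WORDS[word] for word in "FORCE Charlie door".split()])
other = {"likelihood": "M", "cost": 2, "time": "DY", "difficulty": "H"}
either = combine([leaf, other], conjunctive=False)
print(leaf)
print(either)
```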

Figure 4.2 shows the functions involved in the implementation of the weight measurement. This module is an important part of the whole project: the path computation, the attack tree visualization and even the system model visualization are based on the weight sum information obtained from this module.


Fig. 4.2 Main functions in the weight measurement module

4.2.3 Implementation for Path Computation

The two impact measurement techniques above are simple analyses of the attack tree, and they provide enough information if we only need to understand the individual nodes. However, knowing only the information about individual nodes is of limited use, since an attack is always a combination of several nodes. Thus we need to extend the previous techniques to the paths in which these nodes occur.

There are many different paths in an attack tree, but all of them share one characteristic: after performing all the actions described in the leaf nodes of one path, the attack goal, which is the root of the attack tree, is achieved. It is therefore fundamental to determine all possible paths in the attack tree and the attributes of each path, for example its cost, likelihood and difficulty. After comparing the attributes of all paths, we can determine which paths are the weakest points in the system and prepare related countermeasures.

From the characteristics of attack trees, it follows that the number of paths in the attack tree depends on the ratio between disjunctive and conjunctive nodes as well as on their locations in the attack tree.

Algorithm for path computation

Based on the replacement rules for conjunctive and disjunctive nodes illustrated in 2.3, the computation starts from the root node. We first define a sequence, which initially contains only one node, the root.

• When arriving at a conjunctive node, the node is replaced in the sequence by all of its child nodes.

• When arriving at a disjunctive node, the sequence is split into several new sequences, one for each child node, in which the node is replaced by that child node.

The process continues until the newly generated sequences contain only leaf nodes.

Once the whole process terminates, all possible paths have been determined.

The replacement steps above only yield the leaf nodes of each path. However, we also need the non-leaf nodes in the path, since the attack tree visualization is based on the whole path. This is not difficult to implement, because we can use the parent information stored by the occurrence count module to collect all distinct non-leaf nodes up to the root node. At the same time, all edges from a parent node to its child node in the path are recorded for use in the next step.
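The replacement procedure and the recovery of the non-leaf nodes can be sketched as follows. The tree is assumed to be stored as a dictionary from node name to node type and child list, and the parent map as a dictionary from child to parent; both layouts and the node names are hypothetical.

```python
def attack_paths(tree, root):
    """Expand [root] into all leaf sequences using the replacement rules."""
    finished, pending = [], [[root]]
    while pending:
        sequence = pending.pop()
        inner = next((n for n in sequence if tree[n]["children"]), None)
        if inner is None:  # only leaf nodes left: one complete path
            finished.append(sequence)
            continue
        at = sequence.index(inner)
        children = tree[inner]["children"]
        if tree[inner]["type"] == "and":
            # conjunctive: replace the node by all of its children
            pending.append(sequence[:at] + children + sequence[at + 1:])
        else:
            # disjunctive: one new sequence per child
            for child in children:
                pending.append(sequence[:at] + [child] + sequence[at + 1:])
    return finished


def with_ancestors(path, parent):
    """Add every non-leaf node between the leaves of a path and the root."""
    nodes = set(path)
    for node in path:
        while node in parent:
            node = parent[node]
            nodes.add(node)
    return nodes


tree = {
    "root": {"type": "or", "children": ["a", "b"]},
    "b": {"type": "and", "children": ["c", "d"]},
    "d": {"type": "or", "children": ["e", "f"]},
    "a": {"type": "leaf", "children": []},
    "c": {"type": "leaf", "children": []},
    "e": {"type": "leaf", "children": []},
    "f": {"type": "leaf", "children": []},
}
parent = {"a": "root", "b": "root", "c": "b", "d": "b", "e": "d", "f": "d"}

paths = sorted(attack_paths(tree, "root"))
print(paths)
print(sorted(with_ancestors(["c", "e"], parent)))
```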

Get the weight sum for each path

Knowing the number of possible paths and the nodes of each path is not enough for our analysis of the attack model; it is also essential to carry out the weight measurement for each path. Thus, while the paths are generated, the weight of each individual path is computed as well. The mechanism for obtaining the weight factor values of a path is similar to that for the conjunctive non-leaf nodes described in 4.2.2. The weight of a path depends on all leaf nodes in this path: all leaf nodes are collected, and the method for computing the weight of a conjunctive node is applied to them to obtain the weight of the path.

Figure 4.3 gives an overview of the functions implemented for the path computation module.

Fig. 4.3 Main functions in the path count module

After obtaining the properties of each path, we can compare all paths and figure out which path is the easiest and most likely to be exploited by attackers to run the attack.

Although the analyses above and the generated data provide a considerable amount of information and a good assessment of the risk in the attack model, the result is still not easy to grasp, especially when the attack tree becomes large. We therefore need a good visualization of the attacks.

4.2.4 Implementation of the visualization of attack tree

The main technology for visualizing the attack tree is NetworkX, which has been described in section 4.1.2. The attack tree visualization module does not produce new information; it uses the analysis results to highlight the paths with high or low impact values in the attack tree.

Three visual styles are used to reflect three impact measures of each path. First, the line width of the edges represents the resource of a specific path, that is, how demanding this attack path is. We assume that attackers always choose the path with the lowest cost, lowest time consumption and lowest difficulty. Thus the resource is inversely proportional to these three parameters. Secondly, the transparency reflects the likelihood of a path in an attack, as defined in the weight measurement.

The more transparent a path is, the lower its likelihood. The last and most important style is color, which represents the overall impact of each path on the whole attack. The impact value is determined by the resource, the likelihood and the profit of the attack; the color scale runs from green to red, which means that the impact increases from low to high. More detailed information on the meaning of the visual styles is given at the end of this section.

A few adjustments have been applied to improve the appearance of the plotted attack tree. For example, the transparency represents the likelihood of an attack, but due to our calculation method the likelihood value of a path may lie outside the range from 0 to 1 that the NetworkX library requires. We therefore have to ensure that the value used for the transparency lies in that range while still reflecting the likelihood of the path. To do so, we take the largest likelihood value among all paths and divide all likelihood values by it.

For the edge width, widths larger than 5 would result in a cluttered visualization of the attack tree, so we implemented a method that keeps the edge width between 1 and 5. The implementation is similar to that for the transparency, but we take the lowest resource value and assign edge width 1 to the path with the lowest resource; the other paths get an edge width equal to the ratio of their resource value to the lowest resource. If the ratio is higher than 5, the path is assigned edge width 5.

Since many paths may share the same edge, one edge may be assigned several color values. As the more critical attacks have to be clearly visible, we need to prioritize the paths. The solution is to sort all paths by their color factor value in descending order, so that the colors of paths with higher impact are more visible than those with lower impact.
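The scaling of transparency and edge width described above can be sketched as follows. The sketch assumes positive numeric likelihood, resource and impact values per path; the linear blend from green to red and the function names are assumptions made for this example, not the exact code of our module.

```python
def alphas(likelihoods):
    """Scale likelihoods into the range (0, 1] by dividing by the largest."""
    top = max(likelihoods)
    return [value / top for value in likelihoods]


def widths(resources, limit=5.0):
    """Width 1 for the lowest resource, the ratio to it otherwise, capped."""
    lowest = min(resources)
    return [min(limit, value / lowest) for value in resources]


def colour(impact, highest):
    """Blend linearly from green (0 %) to red (100 % of the highest impact)."""
    share = impact / highest
    return "#{:02x}{:02x}00".format(round(255 * share), round(255 * (1 - share)))


# Hypothetical values for three paths.
print(alphas([2.0, 4.0, 1.0]))
print(widths([2.0, 3.0, 14.0]))
print(colour(10, 10), colour(0, 10))
```

Colours are returned as hexadecimal strings so that they can be passed to a drawing call as a single colour for all edges of one path.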

Figure 4.4 shows the functions in the tree visualization module. There are four main functions in this module; their descriptions and usage are given in the figure. This module shares some functions with the attack tree path computation.

While not at the core of our work, we briefly discuss the mapping from attack tree analysis results to visualisations, since these map directly to the visualisation of the contribution of components of organisations to the risk faced by the organisation. Applying the attack tree visualization module to all the plotting information gives the following plot.


Fig. 4.4 Main functions in the attack tree visualization module

Figure 4.5 shows an example of an attack tree we have plotted. We have applied three visual styles to illustrate the influence of paths in the attack tree on the overall result for the tree:

• The line width of edges indicates the resource usage of a specific path, that is, how resource-demanding an attack path is. We assume that attackers always choose the path with the lowest cost, lowest time consumption, and lowest difficulty. Thus the line width is inversely proportional to these three parameters: the lower the resource usage of a path, the more likely the attacker is to take it (modulo other factors that come next).

• The transparency reflects the likelihood of success of a path in an attack. This attribute is directly defined in the weight measurement: the more transparent a path is, the lower its likelihood of success.

• The last and foremost property is color, which represents the overall impact of a path, normalised to a percentage of the highest impact for the whole attack. The impact value is determined by the required resources, the likelihood of success, and the profit of the attack. In general, the color scale runs between two colors, where one color represents 0%, the other 100%, and all other values are a combination of the two. In our example the color scale goes from green to red, which means the impact increases from low to high.
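The two-color scale can be illustrated with a simple linear blend. This is a sketch under the assumption that the combination of the two colors is a linear interpolation in RGB space, which the text does not state; the function name `impact_to_rgb` is hypothetical.

```python
def impact_to_rgb(impact, highest):
    """Blend from green (0%) to red (100%) according to relative impact.

    `impact` is the impact of one path and `highest` the highest impact in
    the whole attack. Returns an (r, g, b) tuple with components in 0..1.
    """
    share = impact / highest if highest > 0 else 0.0
    share = max(0.0, min(1.0, share))  # keep the share inside 0..1
    return (share, 1.0 - share, 0.0)


low = impact_to_rgb(0.0, 10.0)    # pure green
mid = impact_to_rgb(5.0, 10.0)    # equal mix of red and green
high = impact_to_rgb(10.0, 10.0)  # pure red
```

Tuples of this form are accepted as colors by common plotting libraries, so the result can be passed on as an edge color.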

Clearly, more advanced visualisations provide even deeper insights into the scenarios represented by an attack tree. In the TREsPASS project we have explored many such methods [20].


Fig. 4.5 Nodes with a border represent conjunctive nodes, nodes without a border disjunctive nodes. The two red paths represent the two attacks with the biggest likelihood of success. The left-hand path, however, has a higher chance of success, which is represented by a higher saturation of the colors.


4.3 Verification of the result

The above implementation produces 4 outputs: 3 csv files, namely occurrence.csv, weight_measurement.csv and path.csv, and a graph that shows the visual result for the attack tree. All three csv files can be found in the appendix.

4.3.1 File occurrence.csv

This file is generated by the occurrence count module and records how many times each element appears in the attack tree. The file contains the following fields:

ID, Node_Occurrence_Count, Children, Node_Type, Label, Parent

The fields have the following meaning:

1. ID is the unique tag used to refer to a node. It is important because both impact measurement techniques analyse all nodes, and the path computation and attack tree visualization modules later use the node information as well. An id is assigned to each node while traversing the whole attack tree. Ids start from 0 and increase automatically, and the node with the largest id is the root node. The id of the root node therefore also tells us how many nodes exist in the attack tree. Another critical feature of this field is that several nodes may carry the same label while sitting under different branches or paths of the attack tree; assigning different ids lets us differentiate them.

2. Node_Occurrence_Count is the occurrence count of each element in each node. For a leaf node, the value gives the number of times each element appears in the leaf node label. For a non-leaf node, it gives the element occurrence count over all leaf nodes of the subtree rooted at the current node. The element occurrence count is an array whose lower and upper boundaries are the minimum and maximum number of times the element must appear to fulfil the attack requirements. Taking the Node_Occurrence_Count of the root node as an example, most entries have a lower boundary of 0. This makes sense because, seen from the root node, the attacker has many alternative ways to run the attack, and an element that is necessary in one path may not be needed in another. The difference between the lower and the upper boundary depends on the number of conjunctive and disjunctive child nodes passed through.
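The ID and Node_Occurrence_Count fields can be illustrated with a small sketch. It rests on assumptions that the text does not confirm: ids are handed out in post-order, so that the root receives the largest id; a conjunctive node adds up the boundaries of its children; and a disjunctive node takes the smallest lower and the largest upper boundary. The dictionary layout, the function names, and the elements alice, bob, door, and file are hypothetical.

```python
from itertools import count


def assign_ids(node, counter=None):
    """Number the nodes in post-order, so the root gets the largest id."""
    if counter is None:
        counter = count()  # yields 0, 1, 2, ...
    for child in node["children"]:
        assign_ids(child, counter)
    node["id"] = next(counter)
    return node


def occurrence_bounds(node):
    """Return {element: (lower, upper)} for the subtree rooted at `node`."""
    if not node["children"]:
        # Leaf: each element appears exactly as often as in the label.
        return {e: (n, n) for e, n in node["elements"].items()}
    child_bounds = [occurrence_bounds(c) for c in node["children"]]
    result = {}
    for element in set().union(*child_bounds):
        # An element missing from a child counts as (0, 0) for that child.
        pairs = [b.get(element, (0, 0)) for b in child_bounds]
        lows = [p[0] for p in pairs]
        highs = [p[1] for p in pairs]
        if node["type"] == "AND":
            # Assumed rule: all children are needed, so boundaries add up.
            result[element] = (sum(lows), sum(highs))
        else:
            # Assumed rule: one child suffices, so the range widens.
            result[element] = (min(lows), max(highs))
    return result


def leaf(elements):
    return {"type": "LEAF", "elements": elements, "children": []}


leaf_a = leaf({"alice": 1, "door": 1})
leaf_b = leaf({"alice": 1, "file": 1})
leaf_c = leaf({"bob": 1, "file": 1})
both = {"type": "AND", "elements": {}, "children": [leaf_a, leaf_b]}
root = {"type": "OR", "elements": {}, "children": [both, leaf_c]}

assign_ids(root)
bounds = occurrence_bounds(root)
```

In this example the root receives id 4, the largest of the five ids. Its entry for alice is (0, 2), since alice is needed twice on one alternative and not at all on the other, while file is (1, 1) because every alternative uses it exactly once.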
