Identifying Resilience Against Social Engineering Attacks


DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS

STOCKHOLM, SWEDEN 2020


Lazar Cerovic


Abstract

Social engineering (SE) attacks are among the most common cyber attacks and frauds, and they cause large economic damage to individuals, companies and governments alike. The attacks are hard to protect against, since SE-attacks are based on exploiting human weaknesses. The goal of this study is to identify indicators of resilience against SE-attacks from individual computer space data, such as network settings, social media profiles, web browsing behaviour and more.

This study is based on qualitative methods to collect information and to analyse and evaluate data. Resilience is evaluated with models such as the theory of planned behaviour and the big five personality traits, as well as personal and demographic information. Indicators of resilience were found in network settings such as service set identifiers (SSID) and routers, web history, social media use and more. The framework developed in this study could be expanded with more aspects of individual data and different evaluation criteria. Further studies on this subject can be done with tools such as artificial intelligence and machine learning.

Keywords

Social engineering attacks, Phishing, Human Resilience, Digital Footprint


Abstract

Social engineering attacks are among the most common cyber attacks and frauds, causing enormous economic damage every year to individuals, companies and public authorities. These attacks are hard to protect against, since social engineering exploits human weaknesses as a means of stealing money or information.

The goal of the study is to identify indicators of resilience against social engineering attacks, using individual data such as network settings, social media profiles and web activity, among others.

This study is based on qualitative methods for collecting, analysing and evaluating data. Resilience against social engineering is evaluated with the help of relevant theories and models concerning behaviour and personality; personal and demographic information is also used in the evaluation.

The indicators identified included router settings, web history and social media use. The theoretical framework developed to evaluate resilience against social engineering attacks can be extended with more aspects of individual data. Important societal events and contexts may be an interesting factor related to the subject. Future studies could combine this framework with techniques such as machine learning and artificial intelligence.

Keywords

Social engineering, Phishing, Digital footprint, Individual data


Authors

Lazar Cerovic <lazarc@kth.se>

Information and Communication Technology KTH Royal Institute of Technology

Place for Project

Stockholm, Sweden

Examiner

Robert Lagerström Stockholm, Sweden

KTH Royal Institute of Technology

Supervisor

Mathias Ekstedt Stockholm, Sweden

KTH Royal Institute of Technology


Contents

1 Introduction
1.1 Background
1.2 Problem
1.3 Purpose
1.4 Goal
1.5 Methodology
1.6 Delimitations
1.7 Outline

2 Theoretical Background
2.1 Technical Overview
2.2 Digital Footprint
2.3 Social Engineering
2.4 Related Work

3 Method
3.1 Qualitative Study
3.2 Research Phases

4 Defining Resilience
4.1 Resilience
4.2 Indicators of SE-attack
4.3 Human Behaviour
4.4 Validating the Psychology Models
4.5 Personal & Demographic Information
4.6 Modelling Resilience

5 Indicators of Resilience
5.1 Network Settings & Use
5.2 Applications & Tools
5.3 Social Media & Personal Websites
5.4 Online activity
5.5 Web Browsing History

6 Discussion
6.1 Evaluation of the Project
6.2 Future Work
6.3 Conclusions

References


1 Introduction

1.1 Background

Attacks that exploit human psychology are often called social engineering[14]. There are multiple ways to perform a social engineering (SE) attack, but the most common are usually referred to as pretexting, baiting and quid pro quo. Pretexting is based on deceiving another person, through a false identity or a prepared story, into giving up valuable information or some form of authorization. Baiting is a technique where the attacker plants an object such as a USB stick in a strategic position, which may result in malicious software being injected into the targeted system. Quid pro quo is a technique in which the attacker gives something, such as a gift, or does a favor for an employee in order to acquire information or a favor in return. Another common technique is phishing, a type of fraud in which the victim receives an email masquerading as coming from a trusted source, such as a bank or an authority. By acting on such an email, the victim may get malicious software installed or unknowingly reveal private information.

Calculating or estimating the damage caused by social engineering is hard, but according to Infosec, companies lose billions of dollars to phishing scams alone[26]. In 2016 the loss was approximately 1.5 million dollars per spear-phishing attack, and most companies still rely on traditional antivirus programs for security[26]. Another example of the damage social engineering may cause is the attack on the aerospace parts maker FACC[29]. The technique behind the fraud was a common phishing attack through email, which resulted in close to 50 million dollars being stolen as well as damage to the company's reputation.

1.2 Problem

Social engineering attacks are a multidimensional problem that is very hard for individuals, companies and even governments to prevent and recover from. The key reason why social engineering attacks are so successful is that they primarily focus on exploiting human weaknesses[36]. One example of a setting where social engineering attacks are difficult to prevent is libraries[37]. The personnel are often untrained in the matter and have a professional obligation to help visitors without prejudice.

The first aspect of the problem formulation is to define what resilience is and how it can be estimated or identified. This includes finding suitable models that can form a bridge between different subject areas, such as psychology and computer science. The second aspect of the problem is, given some individual data (from the individual computer space), to be able to identify indicators of social engineering resilience. The individual data may include web browsing behaviour, browser configuration, password metadata and more.


1.3 Purpose

In order to protect against social engineering attacks, such as phishing, knowledge about these manipulative attacks must be gained. Information about how the attacks work, together with some form of estimation of resilience, can contribute to fewer scams and frauds. The estimation is a prediction of whether an individual is resilient against SE-attacks. Since social engineering attacks pose a great threat to both individuals and organizations, the purpose of this thesis is to increase the understanding of the problem as well as to discuss possible solutions.

1.4 Goal

The goal of this thesis is to investigate how people can fall prey mainly to phishing attacks, but also to social engineering attacks in general. The thesis examines which aspects of the individual digital footprint, such as network and operating system settings, social media use and web browsing behaviour, are relevant, and how they can be used to predict whether a person would be targeted by a social engineering attack. The social engineering technique of primary interest in this thesis is phishing, and a further goal is to develop a prototype that estimates how people's individual data can be used for social engineering attacks.

1.4.1 Benefits, Ethics and Sustainability

In order to protect against cyber attacks it is common to perform a test attack, so-called penetration testing (pen test). A pen test is usually a test attack against a system, carried out under some form of agreement with the human parties involved. According to [19], social engineering test attacks are ethically questionable, since the main idea behind the attack is human-to-human testing. It is hard to establish consent before performing a social engineering attack, since the nature of the attack involves deceiving another person.

SE-attacks are, as mentioned, very costly for organizations and individuals[26]. Finding more information on how to prevent and protect against cyber attacks such as phishing is key to restoring trust in various systems today. This may directly benefit organizations such as companies and government institutions, but also individuals who lack the technical knowledge to protect themselves from attacks.

1.5 Methodology

Collecting and analysing data is one of the most essential parts of scientific methodology. According to Walliman (2018) [39], data can be divided into two groups: primary and secondary data. Primary data is data collected first-hand, such as measurements, observations and studies. Secondary data is data that has been collected by someone else. The thesis focuses on secondary data, which is data that has already been interpreted and recorded. Secondary data is mostly collected through databases of scientific articles. The relevant databases used in the study are presented in table 1.1. Once the data is collected it needs to be analysed, which can be done mainly in two ways: quantitatively and qualitatively[39].

The present study is based on a qualitative approach[17], in which inductive methods are used, since the formulated problem is theory-driven rather than aimed at measuring a phenomenon. The study is a non-experimental case study, which involves empirical research about social engineering attacks. In order to validate the results it is important to argue, with descriptive validity, for the choice of models, theories and data, so that they are relevant to the study.

A literature study in the early stages of the project is a good approach for becoming familiar with relevant topics. Social engineering is a broad subject with various mediums and techniques, where knowledge about networks and communication, human psychology, computer security and more is required. One important aspect of a literature study is source criticism to validate the information, especially when it comes to website sources.

Databases used

Name | Subject Area
Google Scholar | Various
ScienceDirect | Physical sciences, Engineering, Life sciences, Health sciences and Social sciences
IEEE Xplore | Computer science, Electrical engineering and Electronics
ACM Digital Library | Computer Science, Engineering

Table 1.1: The databases used in the study

1.6 Delimitations

The goal and problem of this thesis are too broad for the constraints of this study. Investigating every indication of SE-attacks would require more time and resources. Social engineering attacks are, as mentioned, a category of many attacks that involve exploiting human weaknesses, while this study primarily focuses on phishing and attacks that have similar principles. The scope of individual data is also too large to fit this study. This thesis mainly focuses on social media profiles, network configurations, online activity and web history. These particular aspects of individual data were chosen because they were recommended by the supervisor of this project.

Certain aspects of social engineering attacks can be associated with culture, especially when it comes to discussions about behaviour, which is discussed further in the last chapter of this thesis. Unless stated otherwise, the assumed cultural reference is a common Western culture. Another limitation is that this project investigates individual human resilience, which may or may not differ from collective resilience. This study does not take important societal events and crises into account, which could be an interesting aspect.

1.7 Outline

The present thesis consists of six chapters and appendices. Chapter 2 presents theory about the different topic areas to which social engineering attacks are related. Chapter 3 describes the methods used to arrive at the project's results. Chapter 4 describes the models used in the study and defines resilience. Chapter 5 summarizes the results found in the project, which are the indicators of resilience against social engineering attacks. Chapter 6 contains the discussion of the results found in the fifth chapter and the conclusions of the project.


2 Theoretical Background

In this section relevant information about social engineering attacks is described. The chapter is divided into four parts. The first part describes technical topics that are relevant to the subject. The second part is about the digital footprint and what can be categorized as individual data. The third part gives an overview of what social engineering attacks are and the principles behind them. The last part describes related work in different subject areas, which can help to give a deeper understanding of the topic.

2.1 Technical Overview

There are many technical aspects of social engineering attacks, such as choosing the technology over which the SE-attack is performed. The perpetrator might use knowledge about operating systems, networks and malicious programs to acquire information about the victim, but also to perform an actual attack.

2.1.1 Operating System

The interface between the hardware and software is called the operating system (OS)[2]. The OS is mainly responsible for three things: the efficiency of the system, making the system user friendly and ensuring that the system runs correctly. The OS uses virtualization, concurrency and persistence to fulfill its tasks. Virtualization takes a physical resource (such as a processor, video card or memory) and turns it into a virtual resource that is much easier and more efficient to use. The second aspect of the OS is concurrency, which enables multiple processes (programs) to run simultaneously. The last principle is persistence, which is about storing and organizing data.

The OS uses software called a file system to store files on the disk[2]. In the file system model it is common to refer to files (readable and writable linear arrays of bytes) and directories (which hold files and other directories). File systems keep track of the metadata of files and directories, such as permissions, date of creation, last access and more. All of this is stored in a data structure called an inode.
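The file metadata described above can be inspected from a program. The following is a minimal sketch using Python's standard `os.stat` call; the helper name `describe_file` and the temporary file are assumptions made for illustration. The creation date is left out on purpose, since not every platform exposes it through the same field.

```python
import os
import stat
import tempfile
import time


def describe_file(path):
    """Return a few metadata fields that the file system keeps for `path`."""
    info = os.stat(path)  # reads the metadata, not the file contents
    return {
        "inode_number": info.st_ino,
        "permissions": stat.filemode(info.st_mode),  # e.g. '-rw-------'
        "size_bytes": info.st_size,
        "last_accessed": time.ctime(info.st_atime),
        "last_modified": time.ctime(info.st_mtime),
    }


if __name__ == "__main__":
    # Create a throwaway file so the example does not depend on any real path.
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(b"hello")
        example_path = handle.name
    try:
        for field, value in describe_file(example_path).items():
            print(f"{field}: {value}")
    finally:
        os.remove(example_path)
```

Timestamps such as the last access time are the kind of metadata that could, in principle, reveal usage patterns without the file contents ever being read.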

2.1.2 Malicious Programs

Many programs have a hidden feature that elevates the user's permissions[14] in the program and makes it possible to use it with no user restrictions. The feature is called a backdoor and is usually installed by the software developer to get access from another platform or device, but it could also serve debugging purposes. One of the best known malware types[14] is probably the computer virus, a program that replicates its code and performs some type of harmful action on the victim's system. One way of protecting against computer viruses is to study and identify the virus, which can then be detected in a system and thereafter deleted. Another common type of malicious software is the Trojan, which masks itself as doing something useful, but in reality, and often secretly, steals information or performs other ill-intentioned actions. Many types of malware include components that monitor and steal information from the victim.

Spyware is one example of a program that is usually embedded in some malware; once installed it can collect information such as screenshots, files, passwords and more. All of this can be done without the knowledge of the victim. There are some measures that can be taken to protect against different types of malware attacks. Malware tends to target widely used systems, so less popular systems are less likely to be affected. When designing software it is important that it is correctly implemented and that permissions are granted only when necessary, in order to prevent attackers from exploiting the system[14]. Another measure is to restrict hardware and software (such as USB drives, CDs and certain websites) that can start executing code without direct permission from the user.

2.1.3 The Web

One very important part of network communication is the World Wide Web (also known as the web or WWW). The network protocol used for communication on the web is the HyperText Transfer Protocol (HTTP). HTTP uses the client-server model and runs on top of the Transmission Control Protocol (TCP), which is a connection-oriented protocol. In the client-server model, the web browser acts as the client and the web server as the server. A website is a set of web pages, which contain objects such as HTML documents, JPEG files etcetera. Cookies are used by websites to restrict user access or to produce content based on the user's identity. Cookies are carried in the HTTP messages and are stored on the client's end system as well as in the server's database[24].
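To illustrate how cookies travel inside HTTP messages, the sketch below uses Python's standard `http.cookies` module to build a `Set-Cookie` response header and to parse a `Cookie` request header. The cookie names and values (`session_id`, `abc123`, `theme`) are hypothetical and chosen only for the example.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header sent in an HTTP response.
response_cookie = SimpleCookie()
response_cookie["session_id"] = "abc123"  # hypothetical identifier
response_cookie["session_id"]["path"] = "/"
response_cookie["session_id"]["httponly"] = True
set_cookie_header = response_cookie.output(header="Set-Cookie:")
print(set_cookie_header)

# Client side: the browser stores the value and returns it in later requests.
request_header_value = "session_id=abc123; theme=dark"  # hypothetical header
request_cookie = SimpleCookie()
request_cookie.load(request_header_value)
for name, morsel in request_cookie.items():
    print(name, "=", morsel.value)
```

Because the same identifier is stored on both ends, cookies are a typical example of data that remains on the user's device after a visit.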

2.1.4 Wireless Networks and Devices

There is a lot of theory behind network communication, specifically wireless network communication. Two common types of wireless network are wireless LANs and cellular networks[24]. Wireless LANs based on IEEE 802.11 are better known as WiFi networks and are characterized by a short range. Cellular networks, on the other hand, have a much larger range for devices to connect to; technologies such as 3G and LTE belong to this group of networks. Hosts connected to a network have IP addresses and MAC addresses, which are used to identify them in that network. MAC addresses are fixed and unique for each network card, while IP addresses are not fixed and change when the host is moved to a different network. In IEEE 802.11 the Service Set Identifier (SSID) is used to name the network.


2.2 Digital Footprint

Many activities done on computer devices leave a trace, which is referred to as a digital footprint. The digital footprint can be partitioned into two types[20]: active and passive. Active digital footprints are caused directly by the user, for example by uploading an image or a status update on social media. Passive digital footprints, on the other hand, are not actively produced, such as search history or cookies. Usually the digital footprint can give information about the user's interests, hobbies, personal information, family, friends etcetera. Social media is a large part of the digital footprint and is covered in more detail in section 2.2.1. One property of digital footprints is that in many cases they are public, for example a Facebook profile or a Twitter post. It is often the case that traces of online (or offline) activity are permanent or at least hard to erase, for example when a company logs information or someone downloads a video[20].

Examples of things that are, or can produce, a digital footprint are:

• Any activity on social media, such as Twitter, Facebook, Instagram, Snapchat among many others.

• Web history including search engine history, for example Google searches.

• Cookies stored both on the digital device and web server of the visited site.

• Actions on web sites, such as filling out an online form or surveys.
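The distinction between active and passive footprints can be expressed as a small data structure. The sketch below is only an illustration: the class names and the example items are assumptions, and classifying a submitted web form as active is an interpretation rather than something stated in the cited source.

```python
from dataclasses import dataclass
from enum import Enum


class FootprintType(Enum):
    ACTIVE = "active"    # produced deliberately by the user
    PASSIVE = "passive"  # left behind without deliberate action


@dataclass(frozen=True)
class FootprintItem:
    source: str
    description: str
    kind: FootprintType


footprint = [
    FootprintItem("social media", "uploaded image or status update", FootprintType.ACTIVE),
    # Assumption: a form the user fills in is treated as an active footprint.
    FootprintItem("web form", "submitted online survey", FootprintType.ACTIVE),
    FootprintItem("browser", "search history", FootprintType.PASSIVE),
    FootprintItem("browser", "stored cookies", FootprintType.PASSIVE),
]


def by_kind(items, kind):
    """Return the descriptions of all footprint items of the given kind."""
    return [item.description for item in items if item.kind is kind]


print("active: ", by_kind(footprint, FootprintType.ACTIVE))
print("passive:", by_kind(footprint, FootprintType.PASSIVE))
```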

2.2.1 Social Media

One of the most recognizable features of the modern era is social media. Social media can be defined[18] as an online service in which the user may create or change content, share it and interact with other people. There are many different types of social media platforms, each with somewhat different functions, content and properties. Some of the categories of social media are:

• Social sharing is a type of social media where the service is about sharing books, pictures, videos or sounds. Some examples of the social sharing category are YouTube, Twitter, Instagram and Netflix, among others.

• Social networking services are the type of social media services in which the users are linked to each other through online profiles that display personal information. Facebook is by far the most prominent example of this category, but there are also other services such as LinkedIn, Instagram and more.

• Messaging services are the type of services that involve communication with one person or a small group of people at once. The communication can be through voice, text messages or video messages. Facebook Messenger, Skype, Viber and Twitter are examples of social media that have the mentioned attributes.


2.3 Social Engineering

Social engineering is, as mentioned, a very broad spectrum of ways to manipulate people. What makes each type of social engineering attack unique is mainly the technique (or techniques) used, the medium over which the attack is carried out, the goal the perpetrator wants to achieve and the target group that can fall prey to the attack.

2.3.1 Techniques

One of the most recognizable techniques of social engineering is phishing. Phishing is done by sending mail masked as coming from some trusted organization or person. Generally, the technique for many types of social engineering attacks can be broken down into three steps[36]:

1. Collecting information

2. Planning the attack

3. Performing the actual attack

The first step is to collect information about the potential victim by conducting some form of research. The research may consist of finding email addresses, phone numbers, company information etcetera, depending on what type of attack is going to be executed. This can be done through various mediums such as social media, websites and search engines. When all data and information about the victim has been collected, the next phase is to plan the attack. Planning an attack may involve, for example, purchasing a website domain similar to the domain of the victim's website, or determining which employees should be attacked with scam emails. The last stage is to execute the SE-attack and, based on the plan, steal information or money.
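Since the planning phase may involve registering a domain that resembles the victim's own, a defender can look for such lookalikes. The sketch below is a simple illustration based on string similarity from Python's standard `difflib` module; it is not a method taken from the cited sources, and the domain names and the threshold of 0.8 are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a ratio in [0, 1]; 1.0 means the two strings are identical."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def looks_like(candidate, legitimate, threshold=0.8):
    """Flag domains that resemble, but are not equal to, a legitimate one."""
    if candidate.lower() == legitimate.lower():
        return False  # the real domain is not a lookalike of itself
    return similarity(candidate, legitimate) >= threshold


legitimate_domain = "example-bank.com"  # hypothetical legitimate domain
observed = ["example-bank.com", "examp1e-bank.com", "weather.org"]
for domain in observed:
    print(domain, looks_like(domain, legitimate_domain))
```

A plain similarity ratio is a coarse heuristic; it will miss some deceptive domains and may flag harmless ones, so it should be read as a starting point only.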

2.3.2 Platforms

There are also many different platforms or mediums which can be used for conducting social engineering attacks and are relevant to the study of this phenomenon. Phishing primarily uses electronic mail as the medium to reach the target, while other techniques such as quid pro quo or pretexting might involve non-internet mediums such as face-to-face contact, phone calls, mail, pamphlets and text messages. Online (internet) mediums that can be used for attacks include electronic mail (e-mail), social media and websites[27].

2.3.3 Goal & Targets

Spear-phishing is an often mentioned technique that differs from phishing in that the target group is more specific[30]. Unlike phishing, which for example broadcasts fraudulent emails to many people, spear-phishing attackers usually find an organization of interest and send out mail that is specifically tailored to manipulate the target. Target groups may consist of banks, individuals, government institutions (including military defence), companies etcetera.

There are many different goals or preferred outcomes which the attacker wants to achieve. Often the attacker wants financial gain, confidential information (e.g. military information), sabotage and more[30]. The goal the attacker has set determines which methods are used and which target group is chosen.

2.4 Related Work

The subject of social engineering attacks spans many scientific fields, for example computer science and psychology. A study by Ekstedt et al. (2015), "Investigating personal determinants of phishing and the effect of national culture", investigates culture and susceptibility to phishing attacks[11], which is a relevant aspect of this study and could be used as a basis for identifying an indicator of SE-attacks. A scientific article by Chung (2020)[7] covers vulnerabilities of companies to cyber attacks, which could be related to individual data. Another interesting publication related to human resilience is Connor et al. (2003), which presents the Connor-Davidson Resilience Scale (CD-RISC)[8]. Although this thesis is not based on the CD-RISC assessment of human resilience, it can still be viewed as a related model for defining resilience.


3 Method

In this section the method and methodologies are described further. These are the systematic processes of performing the work involved in the research, including systematic investigation in order to establish evidence. The methods chosen for data collection, data analysis and quality assurance are described in the subsections of this chapter.

3.1 Qualitative Study

Methodologies must be carried out correctly in order to achieve the goals and obtain valid results. The current research is a qualitative study, since the research question is about studying a phenomenon and possibly coming up with theories that address the topic[17]. The phenomenon studied is how aspects of the individual digital footprint can be used to estimate resilience against social engineering attacks. Like other qualitative studies, the present study uses smaller quantities of data than quantitative studies do.

3.2 Research Phases

This study is based on applied research, a method[17] that is used when addressing a specific research question and problem. The method relies on already existing research and uses it to answer the formulated research question. The blueprint for the applied research method may be described as four different phases[4], as shown in figure 3.1. The initial phase is to formulate the research goal and understand the problem that needs to be solved. Once the researcher is familiar with the subject, the design phase takes place, in which decisions on how data should be collected are made. It is also important to critically review the selected methods of gathering data and argue why they are used. The goal of the design phase is to maximize the validity of the study, with biases and errors preferably eliminated or at least minimized.

Definition -> Designing -> Implementing -> Reporting

Figure 3.1: Phases of applied research according to the guidelines[4]


3.2.1 Data Collection

Case studies are a flexible set of methods, which are used for collecting data in this study[34]. The case study is an inductive study, which means that it is a theory-building study rather than a form of measurement. Data can be divided into two types: qualitative (non-statistical and categorical) and quantitative (mostly statistical measurements). Since the present study is primarily based on inductive methods, the natural approach is to collect qualitative data, but the study may also contain some quantitative data. Another categorization of data is into primary and secondary data. Secondary sources are studies that are based on primary data, and these are incorporated in this study.

Sources consist mainly of scientific publications, which are either recommended by the supervisor or found by searching databases. The search strings are based on different keywords, which correspond to various research topics relevant to the study. Logical conjunction (AND) and disjunction (OR) are used to construct advanced search queries, with the purpose of increasing the relevance of the results and possibly narrowing down the search results. The search strings can be seen in table 3.1, which also includes the database used, the time filter and the date on which the search occurred.

Database | Search String | Year span | Date (Time)
ScienceDirect | "Measuring AND Resilience AND Cyber" | 2000-2020 | 2020-05-09 (14:07)
ScienceDirect | "Measuring AND Resilience AND Phishing" | 2000-2020 | 2020-05-09 (14:55)
Google Scholar | "Phishing Resilience" | 2005-2020 | 2020-05-10 (16:16)
IEEE Xplore | "Social AND Engineering AND Attack AND Indicator" | 2005-2020 | 2020-05-10 (16:55)
ACM Digital Library | "(Digital AND Footprint) OR (Digital AND Footprint AND Attributes)" | 2005-2020 | 2020-05-12 (12:12)
Google Scholar | "Measuring Fraud Resilience" | 2010-2020 | 2020-05-13 (11:57)
ScienceDirect | "(Human AND Personality AND Behaviour) OR (Personal AND Traits AND Behaviour)" | 2005-2020 | 2020-05-13 (12:04)
Google Scholar | "demographic information phishing" | 2005-2020 | 2020-05-20 (17:35)
Google Scholar | "phishing attack big five personality" | 2005-2020 | 2020-05-31 (17:32)
IEEE Xplore | "(IEEE 802.11 OR Wifi) AND (sensing OR scanner) AND (behaviour OR behavior)" | 2005-2020 | 2020-06-08 (16:08)

Table 3.1: Search queries with corresponding metadata
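The way keywords are combined with AND and OR can be shown with a short sketch. The helper functions `all_of` and `any_of` are hypothetical names introduced here for illustration; the example rebuilds the ACM Digital Library query from table 3.1.

```python
def all_of(*terms):
    """Join terms with logical conjunction (AND)."""
    return " AND ".join(terms)


def any_of(*groups):
    """Join parenthesised groups with logical disjunction (OR)."""
    return " OR ".join(f"({group})" for group in groups)


query = any_of(
    all_of("Digital", "Footprint"),
    all_of("Digital", "Footprint", "Attributes"),
)
print(query)
# prints: (Digital AND Footprint) OR (Digital AND Footprint AND Attributes)
```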

3.2.2 Data Analysis

The analytic induction method[17] is used for analysing data. It is an iterative approach that constantly shifts between collecting and analysing data. Most scientific articles follow the IMRAD format, which consists of an abstract followed by introduction, method, results and discussion sections[28]. This structure makes it easier to skim through the vast amount of information. Data can then be rejected or accepted based on the relevance of the keywords and abstract. The method is repeated until some form of validation of the theory occurs.
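The accept or reject step based on keywords and abstracts could be sketched as follows. This is an illustration of the idea and not the procedure actually used in the study: the function name, the paper titles, the abstracts and the threshold of two keyword hits are all hypothetical.

```python
def is_relevant(abstract, keywords, minimum_hits=2):
    """Accept a publication when enough keywords occur in its abstract."""
    text = abstract.lower()
    hits = sum(1 for keyword in keywords if keyword.lower() in text)
    return hits >= minimum_hits


keywords = ["phishing", "resilience", "social engineering"]
candidates = {  # hypothetical titles and abstracts
    "Paper A": "We measure resilience of employees against phishing emails.",
    "Paper B": "A new antenna design for cellular networks.",
}
accepted = [title for title, abstract in candidates.items()
            if is_relevant(abstract, keywords)]
print(accepted)  # prints: ['Paper A']
```

In an iterative setting, the keyword list would be revised after each round of reading, and the screening would be repeated on new search results.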


3.2.3 Quality Assurance

It is of paramount importance to analyse and validate the research material in order to assure quality. The research publications used in the present study have undergone a quality and relevance assessment based on four main criteria, which are influenced by source criticism principles[35]:

• Only publications that have undergone scientific scrutiny are included among the selected sources. The reason is to minimize errors or fallacies, which can be spotted through peer review.

• Tracking the information to its original source is a good way of ensuring the authenticity of the data. Reducing dependence on intermediate sources is a good way to reduce bias and distortion.

• Although it is not common, authors affiliated with, for example, political or business organizations are considered less reliable sources. Their credibility might be affected, since there might be a self-interest in distorting the information.

• Selecting studies based on publication date is a good way to filter through a vast amount of scientific articles. If an article was published a long time before it is cited, the chance of the information being outdated is higher.
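The four criteria can be read as a checklist applied to each candidate source. The sketch below encodes them as simple yes/no checks; the field names, the cut-off year and the decision to treat every criterion as a hard filter are assumptions made for illustration, since the text describes affiliated authors only as less reliable, not as excluded.

```python
from dataclasses import dataclass


@dataclass
class Source:
    title: str
    year: int
    peer_reviewed: bool
    is_original_source: bool
    affiliated_with_interest_group: bool


def passes_quality_check(source, earliest_year=2000):
    """Apply the four criteria as simple yes/no checks (assumed hard filters)."""
    return (
        source.peer_reviewed
        and source.is_original_source
        and not source.affiliated_with_interest_group
        and source.year >= earliest_year
    )


# Hypothetical candidate sources.
recent = Source("Recent peer-reviewed study", 2018, True, True, False)
dated = Source("Older study", 1995, True, True, False)
print(passes_quality_check(recent), passes_quality_check(dated))
```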

There are many ethical aspects of conducting a study about social engineering attacks. General scientific ethics consists[39] of honesty in the work, where the responsibility of citing and acknowledging used research material properly is important. Another ethical responsibility is to use a neutral language register and not to distort information in any way. The present study does not use any human participants nor any data that may infringe on personal integrity and confidentiality.


4 Defining Resilience

In this section the concept of resilience is researched and defined. Different models that are relevant for achieving the results are described. The models are then validated to assure relevance and to support the results. This chapter is also essential for how the indicators of resilience are to be interpreted.

4.1 Resilience

In order to measure human resilience against social engineering attacks, the first thing that needs to be done is to define what human resilience is. According to a scientific article[8], resilience can be seen as a measure of a person’s successful stress-coping ability. This ability may stem from many different things and relate to different topics. The article[8] uses the measure of resilience in relation to mental illness such as depression and anxiety. There are also other fields where resilience is defined in a similar or somewhat different way and which may be related to social engineering attacks, for example fraud, cyber attacks, psychology and social contexts, among others. A study by David Gragg (2003) states that there is no difference between a traditional fraud and a social engineering attack[15].

In a study of cyber resilience against zero-day malware attacks[38], resilience is defined as the ability of a network to resist a zero-day attack, that is, an attack that exploits a software weakness left by the developer. Resilience in that case can be modeled as the relation between the recovery rate and the incident rate. The recovery rate is defined as the phishing removal rate, which needs to be higher than the incident rate.
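The relation between the two rates can be sketched as a simple comparison. This is only a minimal illustration, assuming that both rates are measured in the same unit (for example events per day); the function name and the sample numbers are hypothetical and are not taken from [38].

```python
def is_resilient(recovery_rate: float, incident_rate: float) -> bool:
    """Return True when recoveries outpace incidents (both in the same unit)."""
    if recovery_rate < 0 or incident_rate < 0:
        raise ValueError("rates must be non-negative")
    return recovery_rate > incident_rate


# Hypothetical example: 12 removals per day against 9 new incidents per day.
print(is_resilient(recovery_rate=12.0, incident_rate=9.0))
```

Equal rates are treated as not resilient here, since the text requires the recovery rate to be strictly higher than the incident rate.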

In this study, resilience is defined as an item or concept which indicates that a social engineering attack is made harder or impossible to carry out. Resilience is also modeled as a dichotomous variable, which means that it takes one of two values (true or false). When resilience is false it is seen as vulnerability. Therefore, resilience can also be viewed as non-vulnerability, since the lack of vulnerability is itself a form of resilience.

4.2 Indicators of SE-attack

There are many things which may indicate a social engineering attack. The first thing a social engineering attack requires is to establish trust between the victim and the perpetrator[15]. There are so-called very attacked persons (VAP), who are of special interest to attackers and therefore also when building a defence strategy. VAPs are usually system administrators, librarians, help desk personnel, secretaries and assistants. The psychological principles behind an SE-attack are the following[15]:

• The element of surprise can be favorable for the attacker. Feelings such as anger and joy may put the victim in a less reasonable state of mind. Usually the attacker makes the victim feel ashamed and/or guilty.

• Sometimes the perpetrators want to overwhelm the victim with questions or information, which can make the victim more passive and mentally laid-back.

• Reciprocation is a method for the attacker to make the victim feel obligated to help the attacker, after initially doing something for the victim.

• The attacker establishes a false friendship or relationship with the victim, for example by talking negatively about other people and creating some type of similarity.

• Many times the attacker tries to play the role of some authority. The victim then complies with orders and submits to the attacker’s wishes.

• Lack of training of the personnel to deal with certain situations is often something which attackers exploit. Unaware people are easier to trick than those who are familiar with social engineering attacks.

The study of Ekstedt et al. (2015) showed that the intention to resist phishing attacks and computer experience had a significant correlation with phishing resistance, while there were some differences based on national culture[11]. The study also showed that age and gender did not have a significant impact on resistance to phishing.

According to Chung (2020)[7] there are a couple of things which are likely to pose a great security threat to companies. The first thing companies should consider is to view third-party partners or suppliers as a potential risk of breach. A company should also have a response plan and defence mechanisms to prevent or recover from an attack. The right software may be a great tool for protecting against various attacks. Companies also need to develop a good defence policy, training, identification of vulnerabilities, security protocols (such as security questions and logging) and so on[15].

4.3 Human Behaviour

Human behaviour is a crucial factor when building a model of resilience against social engineering attacks. The theory of planned behaviour (TPB) model[1] states that human behaviour is strongly influenced by personal intentions and perceived behaviour control. According to the model, perceived behaviour control together with personal intention can be used to predict the occurrence of a behaviour. TPB also states that perceived behaviour control is actually of more significance than actual control.

The model is limited when the person has little knowledge about the behaviour. Other factors that impact behaviour are personal obligations, moral norms and past behaviours, such as habits.

There are two theories/models that are used in this study to model behaviour and personalities. The big five model[25] illustrates that there are five personality traits, which can be seen as rough dimensions of difference between people. The five personality traits included in the model are openness, conscientiousness, agreeableness, extraversion and neuroticism. The dimensions of the five personality traits make up human individuality, which can account for a unique behavioural pattern in people in a vast number of situations.

There are some personality traits which correspond to vulnerabilities against social engineering attacks[9]. The extrovert personality trait is more susceptible to the likeability SE approach, where the attacker tries to create a sense of similarity. The personality trait categorized as ”thinking” is more likely to be targeted with an authoritarian SE approach, while a person leaning towards a more emotional type of character tends to be more vulnerable to social proof. People with a more judging personality trait are especially susceptible to the method of reciprocation. The more sensing type of person is more likely to be weak against a method based on making the victim feel obliged to fulfil a commitment. The masquerading approach, where the attacker pretends to be someone else, is effective against people with perceiving personality traits[9].

4.4 Validating the Psychology Models

The big five model can be embodied in different ways. In this study it is implemented as five dichotomous variables. Each of the five personality traits can be viewed as either high or low (true or false), which simplifies the use of the model.

Vulnerability and resilience against social engineering attacks have been shown to be linked with certain personality traits in the big five model. The association between each personality trait and SE-resilience/vulnerability is the following:

• The openness trait can be ambiguous in relation to vulnerability to phishing attacks. People who are considered very open to new ideas can be more resilient against phishing attacks, primarily because of their openness towards protective measures. On the contrary, people who are more open-minded to new ideas, routines and experiences can also fall prey to phishing attacks[31].

• The conscientiousness trait is primarily an indicator of resilience against phishing. In combination with training, it is the most resilient of all big five traits. The reason is that conscientious persons are more likely to follow security protocols and policies and not give up information[31].

• People who are extroverted (high extraversion) are more likely to give up information in order to get social affirmation. One study[9] showed that extroverts are more susceptible to the liking persuasion principle.

• One main aspect of agreeableness is the ability to form bonds, relationships and trust with other people, which is a strong indicator of vulnerability against social engineering attacks. Altruism and compliance are also factors of agreeableness that have a negative impact on the resilience towards SE in general[31].

• Neuroticism is also ambiguous, since it can have both a positive and a negative effect on resilience. One aspect of neuroticism is that persons high in it are more susceptible to certain persuasion principles (social proof)[9]. On the other hand, people who have more computer anxiety and paranoia tend to be more protective, for example by not giving up passwords and by following routines[40].
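The associations listed above can be sketched as five dichotomous variables with a direction attached to each high trait. This is a minimal illustration under stated assumptions: the names `BigFive`, `TRAIT_EFFECT` and `trait_indicators` are hypothetical, and the sketch assumes that a low trait carries no indication at all, which the text does not state.

```python
from dataclasses import dataclass

# Direction of each HIGH trait as summarised in the list above:
# +1 = indicates resilience, -1 = indicates vulnerability, 0 = ambiguous.
TRAIT_EFFECT = {
    "openness": 0,
    "conscientiousness": 1,
    "extraversion": -1,
    "agreeableness": -1,
    "neuroticism": 0,
}


@dataclass(frozen=True)
class BigFive:
    """Each trait is dichotomous: True means high, False means low."""
    openness: bool
    conscientiousness: bool
    extraversion: bool
    agreeableness: bool
    neuroticism: bool


def trait_indicators(profile: BigFive) -> dict:
    """Label every high trait; low traits are skipped (an assumption)."""
    labels = {1: "resilience", -1: "vulnerability", 0: "ambiguous"}
    return {
        trait: labels[effect]
        for trait, effect in TRAIT_EFFECT.items()
        if getattr(profile, trait)
    }


# Hypothetical profile: high conscientiousness and high extraversion.
profile = BigFive(False, True, True, False, False)
print(trait_indicators(profile))
```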

The theory of planned behaviour is relevant to this study because of its connection between intentions and behaviour[1]. The behaviour of interest in this study is security behaviour, which requires a model that can predict it.

4.5 Personal & Demographic Information

Besides the behaviour and personality aspects of the resilience evaluation, demographic and personal information is relevant. This is important in the resilience evaluation because in certain SE-attacks, such as spear-phishing, the attacker tailors the attack specifically to a victim. The following information will be considered:

1. Geographic locations of the victim.

2. Names of networks, routers and MAC addresses.

3. Information about the types of devices, such as computers, smartphones and tablets.

4. Personal information, such as name, friends, gender, age, education and profession.

5. Perception of computer security and vulnerabilities, and policies that need to be followed.

4.6 Modelling Resilience

The model of resilience against social engineering attacks is built up of four phases, as can be seen in figure 4.1. In the first phase the principles behind SE-attacks are described. In the second phase, the principles behind the attacks are converted into psychological aspects, which include behaviour, personality traits and knowledge. In the third phase, the psychological aspects are converted into technical traits, which can be categorized as a digital shadow (or digital footprint). The final phase is about estimating the SE resilience.


Figure 4.1: Model of identifying resilience


5 Indicators of Resilience

The results of the project are presented in this section. Here, indicators of social engineering resilience and vulnerability are identified and described. The indicators are individual data which can be found in a computer space and can also be viewed as an individual digital footprint. The key part of this section is associating a digital trait with the models defined in the previous section, or motivating it with technical theory.

5.1 Network Settings & Use

Smartphones and other online devices often have a setting which enables the WiFi connection. This setting makes it possible for the device to be detected by a monitor, which can be used to capture recurring presence. Detecting devices can be used to estimate how many people are located in a building or room[10] and, implicitly, to obtain information about the devices. Similar things can be done with Bluetooth technology, but it has a shorter range than WiFi.

There is also the vulnerability of so-called ”fake WiFi hotspots”, where an attacker creates an access point with the same SSID as the trusted station, from which the unique MAC address is also copied in order to hijack the connection[21]. What makes the hijacking even more troublesome is that it is hard to detect whether the connection goes to the malicious access point. Through it the perpetrator can collect all types of information, for example by monitoring the online activities of the victim and stealing passwords and other sensitive information. The main way to prevent network hijacking is to disable the setting in smartphones and other devices that automatically connects to an already known network. Many smartphones have the feature of becoming an access point in order to share internet access. This feature is an indication of vulnerability to a potential access point hijack[5], since every smartphone can potentially be at risk of a malware attack in which the smartphone is remotely controlled.

Network settings in the router are also a weak point in many systems. Many routers come with a preset default password directly from the manufacturer[6]. Routers are also equipped with a feature called identifier broadcasting, which announces the router’s presence to other nearby devices. It is also important to change the unique identifier preset by the manufacturer for each router, since it is something of an open secret.
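The network indicators described in this subsection can be collected into a simple checklist. The sketch below is only an illustration: the class and field names are hypothetical, and the values are assumed to have been gathered by some other means (for example a manual inspection), since no specific tool for reading them is given in this study.

```python
from dataclasses import dataclass


@dataclass
class NetworkSettings:
    auto_connect_known_networks: bool  # device rejoins remembered SSIDs automatically
    hotspot_enabled: bool              # smartphone acts as an access point
    router_default_password: bool      # manufacturer password was never changed
    router_default_ssid: bool          # manufacturer identifier was never changed


def network_vulnerabilities(settings: NetworkSettings) -> list:
    """Return the vulnerability indicators that are present."""
    checks = [
        (settings.auto_connect_known_networks, "automatic connection to known networks"),
        (settings.hotspot_enabled, "smartphone access point feature enabled"),
        (settings.router_default_password, "default router password"),
        (settings.router_default_ssid, "default router identifier (SSID)"),
    ]
    return [label for flagged, label in checks if flagged]


# Hypothetical device: auto-connect on and router password never changed.
print(network_vulnerabilities(NetworkSettings(True, False, True, False)))
```

An empty list corresponds to resilience being true for every network item in the dichotomous sense used in section 4.1.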

5.2 Applications & Tools

Most smartphones today are equipped with readers for Quick Response codes (better known as QR codes), a two-dimensional optically readable code often used as a marketing tool[5]. QR-code technology can be used to trick the user into downloading malware and for social engineering, including creating fake WiFi spots[5]. Unlike URLs, for example, QR codes are not comprehensible to people, which can be exploited for tricking or baiting.


Anti-virus and anti-spyware software are a good indication of protection against malware installed on the computer device[6]. Such software can be used to prevent, detect and remove any breach. The firewall settings are another important aspect of resilience against attacks through the network. A firewall that is off or blocks very little network traffic is generally viewed as vulnerable.

5.3 Social Media & Personal Websites

According to a study (2012) by Daniel Quercia et al.[33] that investigated the relation between Facebook popularity and personality, there is a correlation between the number of contacts and the extraversion trait. In the study an average person is said to have 130 contacts on Facebook. There is also evidence that neuroticism is negatively correlated with a large number of Facebook friends.

The number of Facebook likes is positively correlated with openness, neuroticism and agreeableness[3], while extraversion and openness are significantly linked with the number of status updates. There are also some studies about the content of Facebook likes and how it reflects individual personality[22]:

• Topics that may characterize the openness trait are ”Oscar Wilde”, ”Leonardo Da Vinci”, ”Charles Bukowski”, ”Leonard Cohen” and ”Plato”.

• People with high conscientiousness tend to like ”Law Officer”, ”Accounting”, ”Sunday Best”, ”Glock Inc” and ”Kaplan University”.

• The corresponding things for extraversion are ”Michael Jordan”, ”Beerpong”, ”Chris Tucker”, ”Cheerleading” and ”Modeling”.

• The corresponding things for agreeableness are ”Christianity”, ”Circles Of Prayer”, ”The Book Of Mormon”, ”Pornography Harms” and ”Redeeming Love”.

• Lastly, a good indicator of neuroticism may be ”Emo”, ”The Addams Family”, ”Kurt Donald Cobain”, ”Sixbillionsecrets.com” and ”Vampires Everywhere”.
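The topics listed above could be used to look for trait indicators in a list of liked pages. The sketch below is a minimal illustration, assuming the likes are already available as plain strings; the function name and the exact, case-insensitive matching are assumptions, and a handful of example topics is far too small a basis for a reliable personality estimate.

```python
# Example topics per trait, copied from the list above [22].
LIKE_TOPICS = {
    "openness": {"Oscar Wilde", "Leonardo Da Vinci", "Charles Bukowski",
                 "Leonard Cohen", "Plato"},
    "conscientiousness": {"Law Officer", "Accounting", "Sunday Best",
                          "Glock Inc", "Kaplan University"},
    "extraversion": {"Michael Jordan", "Beerpong", "Chris Tucker",
                     "Cheerleading", "Modeling"},
    "agreeableness": {"Christianity", "Circles Of Prayer", "The Book Of Mormon",
                      "Pornography Harms", "Redeeming Love"},
    "neuroticism": {"Emo", "The Addams Family", "Kurt Donald Cobain",
                    "Sixbillionsecrets.com", "Vampires Everywhere"},
}


def count_trait_likes(likes):
    """Count, per trait, how many of the example topics appear among the likes."""
    normalised = {like.strip().lower() for like in likes}
    return {
        trait: sum(1 for topic in topics if topic.lower() in normalised)
        for trait, topics in LIKE_TOPICS.items()
    }


# Hypothetical list of liked pages.
print(count_trait_likes(["Plato", "emo", "Accounting", "Chess"]))
```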

Associations between the five factor model (big five) and social media features, similar to those found for Facebook, also exist for Twitter. A study (2013) by Michal Kosinski et al.[32] showed that a high number of followers on Twitter and of accounts followed indicates high extraversion and low neuroticism. A good indicator of the openness trait is when a person appears on other people’s Twitter lists. Another thing that was studied is the influence a person has on Twitter, which can be measured by the number of retweets and clicks on tweets. Influence is, like the number of followers, correlated with extraversion.

Personal websites, as well as social media profiles, can contain personal information about profession, age, gender and education, among other things. This type of information may indicate a person’s educational level and whether the person is a VAP. Social media may also display geographic locations and names of friends.


5.4 Online activity

A study (2017) by Ted Grover et al. showed the relation between the big five personality traits and digital device activity (smartphones and computers)[16]. The study showed that there are correlations between at least:

• The agreeableness trait and the ratio of smartphone usage between 8 and 10 pm to total usage.

• Openness and Facebook activity on a computer between 12 and 2 am. There is also a correlation with people who use social media up to 5 minutes a day.

• Neuroticism and the average number of daily screen switches (phone) and the serial correlation of hourly screen-switching frequency (computer).
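The first feature in the list, the share of smartphone usage that falls between 8 and 10 pm, can be computed from a usage log. The sketch below is an illustration under simplifying assumptions: the log format (start time and duration in minutes) is hypothetical, and each session is attributed entirely to the hour in which it starts, which is not necessarily how the feature was computed in [16].

```python
from datetime import datetime


def evening_usage_ratio(sessions):
    """Share of total usage minutes from sessions that start between 20:00 and 21:59.

    sessions: iterable of (start: datetime, minutes: float) tuples.
    """
    sessions = list(sessions)
    total = sum(minutes for _, minutes in sessions)
    if total == 0:
        return 0.0
    evening = sum(minutes for start, minutes in sessions if 20 <= start.hour < 22)
    return evening / total


# Hypothetical usage log for one day.
log = [
    (datetime(2020, 5, 1, 9, 0), 30),
    (datetime(2020, 5, 1, 20, 30), 20),
    (datetime(2020, 5, 1, 21, 45), 10),
]
print(evening_usage_ratio(log))
```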

5.5 Web Browsing History

Web browsing history displays what type of websites the user has visited in the past. One study showed that there is a parallel between personality traits and website preferences[23], according to the big five personality model. Table 5.1 lists the website categories associated with each big five trait when the trait is high (true). Table 5.2 lists the website categories associated with each trait when it is considered low on the personality spectrum. Apart from these website topics it is also relevant to add a computer security category and a computer technology category, since they are good indicators of computer knowledge and intentions (in accordance with the theory of planned behaviour[1]). The computer technology and security categories would include tech journals, online forums and other websites containing keywords such as the word ”security” linked with computer technology, or the names of different cyber attacks (such as ”phishing”, ”computer worms” and more).
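The keyword-based security category could be sketched as follows. This is a minimal illustration, assuming the browsing history is available as a list of page titles; the keyword tuple and function names are hypothetical. Plain substring matching does not check that ”security” is linked with computer technology, so a title about, for example, social security would be counted as well.

```python
# Keywords taken from the examples in the text; the list is illustrative only.
SECURITY_KEYWORDS = ("security", "phishing", "computer worm")


def is_security_related(title):
    """True if the page title contains any of the security keywords."""
    text = title.lower()
    return any(keyword in text for keyword in SECURITY_KEYWORDS)


def security_share(titles):
    """Fraction of visited pages that fall into the security category."""
    titles = list(titles)
    if not titles:
        return 0.0
    return sum(is_security_related(t) for t in titles) / len(titles)


# Hypothetical browsing history.
history = [
    "How to spot phishing emails",
    "Football results",
    "Router security checklist",
    "Computer worms explained",
]
print(security_share(history))
```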


High Agreeableness    | High Conscientiousness   | High Extraversion     | High Neuroticism      | High Openness
Education (Reference) | Education (Reference)    | Internet (Computers)  | Pets (Recreation)     | Animation (Arts)
Internet (Computers)  | Electronics (Shopping)   | Education (Reference) | Scouting (Recreation) | Marketing (Business)
Logistics (Business)  | Children (Shopping)      | Environment (Science) | Physics (Science)     | Services (Business)
Diseases (Health)     | Dictionaries (Reference) | Music (Arts)          | Hockey (Sports)       | Photography (Arts)

Table 5.1: Website categories corresponding to the high spectrum of each personal trait[23]


Low Agreeableness      | Low Conscientiousness  | Low Extraversion    | Low Neuroticism      | Low Openness
Kids & Teens (Society) | Mental Health (Health) | Movies (Arts)       | Photography (Arts)   | Education (Reference)
Mental Health (Health) | Music (Arts)           | Children (Shopping) | Maths (Science)      | Television (Arts)
Physics (Science)      | Animation (Arts)       | Literature (Arts)   | Marketing (Business) | Football (Sports)
Pets (Recreation)      | Literature (Arts)      | Comics (Arts)       | Logistics (Business) | Children (Shopping)

Table 5.2: Website categories corresponding to the low spectrum of each personal trait[23]


6 Discussion

This section describes the evaluation of the results and methods used in the project. Positive effects and drawbacks of the project are discussed, and future work relevant to this thesis is presented. The section ends with the conclusions of the project.

6.1 Evaluation of the Project

The problem formulation for this thesis is how to identify indicators of resilience against social engineering attacks, primarily phishing but also other types of attacks. The goal of the thesis was to determine what digital footprints can be used to identify resilience against SE-attacks. A positive aspect of the goal formulation is that it is easy to understand and to start working with, while a drawback may be that it is too broadly defined. Operating system settings, which were included in the goal, were not found to be an indicator of resilience against SE-attacks. Therefore, the goal should be considered partially fulfilled, since most of the aspects of individual data were found to indicate resilience in some form. SE-attacks are a large set of scams, and the definition of digital footprint (shadow) can also be argued to be vague. Another drawback is that there is no data that can be used to test the theories by developing a prototype, which could have turned the study into a more experimental one.

The results showed that there are certain things which can be used to identify and possibly measure resilience to SE-attacks. One positive effect of this study is that different subject areas, such as the technical and the psychological, are linked together. The study has shown that there exists a digital trait corresponding to a type of persuasion used in SE-attacks, and how the big five personality model can be a bridge between information technology and SE-attacks. A theoretical prototype that could measure individual resilience could be developed by taking the results and evaluating a scale of resilience/vulnerability in which items of the digital footprint are compared to each other, or, to simplify, by assigning every item a uniform resilience value (there are limitations to that type of model) and measuring resilience directly.
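The simplified variant, in which every item has the same weight, could be sketched as follows. This is only an illustration of the idea and not a tested prototype: the function name and the example items are hypothetical, and each item is assumed to have already been judged as resilient (true) or vulnerable (false).

```python
def uniform_resilience_score(indicators):
    """Fraction of digital footprint items judged resilient, all weighted equally.

    indicators: dict mapping an item name to True (resilient) or False (vulnerable).
    """
    if not indicators:
        raise ValueError("at least one indicator is required")
    return sum(1 for resilient in indicators.values() if resilient) / len(indicators)


# Hypothetical assessment of one individual.
items = {
    "anti-virus installed": True,
    "firewall enabled": True,
    "router default password changed": False,
    "automatic WiFi connection disabled": True,
}
print(uniform_resilience_score(items))
```

The uniform weighting treats, for instance, a default router password and a disabled firewall as equally important, which is one of the limitations of this type of model.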

Another interesting aspect that was mentioned in the first section, but was outside the scope of this study, is the context which society is in. Certain time periods and world events, such as pandemics and other types of crises, may have a large impact on people’s resilience against social engineering attacks[12]. Perpetrators can take advantage of situations and emotions that arise during, for example, a pandemic lockdown, when people stay at home more and rely on online applications. An indication from individual data could be apps used during a crisis, as for example in Norway, where an app was used to trace the number of people infected with Covid-19. Another example of an indication would be checking search engine queries, web history and so on.


6.1.1 Summary of the Framework

The concept of resilience can be summarized with figure 6.1, which displays all the mentioned resilience and vulnerability indicators. There are multiple ways this could be implemented. It could serve as a checklist for a physical inspection or be formulated as questions in some form of survey. Another theoretical implementation could be a software application which scans the computer device for web browsing history, network settings et cetera.

The implementations may vary depending on who is intended as the end-user of the implementation, whether it is an individual, company/organization or attacker.

Figure 6.1: One uniform concept of the framework developed

6.1.2 Reliability & Validity of the Methods

There are a couple of relevant evaluation points for the qualitative methods used in the project. Firstly there is the principle of neutrality, which can be embodied in using a neutral language register to the maximum extent. One important aspect of discussing