Machine Learning for Metamorphic Testing

DEGREE PROJECT IN ELECTRICAL ENGINEERING, SECOND CYCLE, 30 CREDITS

STOCKHOLM, SWEDEN 2018

ZHEYU ZHANG

KTH ROYAL INSTITUTE OF TECHNOLOGY


TRITA TRITA-EECS-EX-2018:561 ISSN 1653-5146


Abstract

A test oracle is a mechanism for checking whether the software under test behaves correctly. When no test oracle is available, software testing becomes difficult. Metamorphic testing is a state-of-the-art approach to automated software testing without test oracles; it relies on metamorphic relations, which are properties relating the inputs and outputs of a program. However, it is usually difficult to identify metamorphic relations for unknown programs. This thesis aims to generate metamorphic relations automatically by using machine learning algorithms together with the random walk kernel, taking control flow graphs as input. When applying the previous work of Kanewala et al. [1] in our target system environment, we encountered a series of difficulties, which we also describe in this thesis. It was therefore important to introduce a working alternative solution for our test suite. The performance of our model is evaluated by different measures, including the area under the receiver operating characteristic curve and the mean squared error. The results show promise for automatically predicting metamorphic relations for unknown programs. The study was conducted on a software system in the telecommunication domain.


Sammanfattning (Summary, translated from Swedish)

A test oracle is a mechanism used to determine whether a program is correct for all the functionality of the program under test. Metamorphic testing is a state-of-the-art method for automated software testing which, when a test oracle is missing, can use metamorphic relations instead. Unfortunately, it is normally difficult to identify metamorphic relations for unknown programs. This thesis therefore aims to generate metamorphic relations automatically by using machine learning algorithms with the random walk kernel method and input data from control flow graphs. Metamorphic relations are a set of properties between the inputs and outputs of a program. By replicating and applying an earlier study by Kanewala et al. [1] in our own target system environment, we encountered a number of difficulties, which we also describe in this thesis. It is important to introduce alternative solutions that work for our test suite. The performance of our model is evaluated by different measures, including AUC and MSE. Our results appear promising for automatically predicting metamorphic relations for unknown programs. The study was carried out on software systems in the telecommunications domain.


Acknowledgement

First of all, I would like to express my sincere gratitude to everyone who has helped me with this project. Ms. Sigrid Eldh from Ericsson AB and Mr. Karl Meinke from KTH acted as my supervisors and gave me the opportunity to carry out my master's thesis at Ericsson AB. It has been a privilege to work with them, and I could not have finished my thesis work without their guidance and encouragement. I am grateful for the continuous support and advice from Mr. Per-Olof Gatter from Ericsson AB and Mr. Aravind Ashok Nair from KTH. I would also like to thank Ms. Ulrika Nordlund from Ericsson AB for her kind help. Secondly, I would like to thank Mr. Ming Xiao at the Department of Information Science and Engineering at KTH, who acted as my thesis examiner.

Finally, I thank my family for their firm and constant support; without them, I would never have had the chance to study in Sweden.


Acronyms

5G Fifth generation mobile systems.

AUC Area under the receiver operating characteristic curve.

CFG Control flow graph.

CFGs Control flow graphs.

KRR Kernel ridge regression.

MR Metamorphic relation.

MRs Metamorphic relations.

MSE Mean squared error.

MT Metamorphic testing.

ROC Receiver operating characteristic curve.

SVM Support vector machine.

SVMs Support vector machines.


List of Figures

2.1 A survey of application areas of metamorphic testing

3.1 The test oracle problem

3.2 The overview of the metamorphic testing process

3.3 Example of permutative metamorphic relation

4.1 The overview of the proposed approach

4.2 An example of CFG

4.3 Kernel methods

4.4 An example of the direct product graph

5.1 An example of labeling the nodes by their operation

5.2 An example of the functions (bubble sort) in my data set and its corresponding control flow graph. The combination of number and notation "%" in each node stands for an automatically generated fake name which can be used to distinguish each node.

5.3 Stratified k-fold cross-validation. Both classes (red and green) have approximately the same distribution in each fold.

5.4 ROC analysis of classifier for predicting the permutative metamorphic relation

5.5 ROC analysis of classifier for predicting the additive metamorphic relation

5.6 ROC analysis of classifier for predicting the inclusive metamorphic relation

5.7 The comparison of performance among all classifiers

5.8 The comparison of performance among all regression models

5.9 Variation of AUC with different value of λ


List of Tables

5.1 Number of positive and negative examples for each MR

5.2 Confusion matrix


Chapter 1 Introduction

This degree project aims to automatically predict the metamorphic relations (MRs) used in metamorphic testing (MT) by means of graph kernels and machine learning algorithms. The thesis was carried out at Ericsson AB, a world-leading Swedish telecom company.

1.1 Motivation

In software testing, it is usually difficult to detect bugs in some types of software (e.g. scientific software) because of the test oracle problem. As discussed by Weyuker [2], an oracle is a mechanism that checks the correctness of the outputs to indicate whether the program under test is working correctly. However, it is often difficult to detect subtle faults, one-off errors or defects in many scientific software systems, because there are no reliable test oracles to indicate what the correct outputs are for arbitrary inputs. As a consequence, many scientific software systems become "non-testable programs" [2].

Metamorphic testing is a property-based testing technique proposed by Chen et al. to alleviate the oracle problem [3]. Instead of checking the correctness of the outputs for arbitrary inputs, metamorphic testing checks the program under test against relations between inputs and their corresponding outputs, which are known as metamorphic relations. The MRs specify how a particular change to the inputs should change the outputs. If the MRs do not hold for the program under test, it is a sure sign that the program contains defects.


However, identifying MRs usually requires knowledge of the particular domain, so it is a labor-intensive task for domain experts and programmers. In this thesis, we therefore aim to automatically predict the MRs of previously unseen functions by modeling this task as a supervised machine learning problem.

1.2 Problem Formulation and Method

This thesis aims to study metamorphic testing and metamorphic relations, and to use machine learning methods together with graph kernels to automatically predict the MRs of programs under test. We design our method and experiment by adapting ideas from the previous literature, and then evaluate the performance based on the area under the receiver operating characteristic curve (AUC) and the mean squared error (MSE). The objectives of this thesis are:

• Study metamorphic testing and metamorphic relations in a literature study, and then identify MRs for real mathematical functions implemented in the C programming language.

• Study the idea of using machine learning methods for automatic MR prediction in the literature study, identify the limitations of the previously successful approach, and modify it to fit our case.

• Study graph kernel algorithms in the literature study, and then implement the random walk kernel algorithm to calculate the similarity score matrix, or Gram matrix, used for training the machine learning models.

• Build two different types of machine learning models by using the kernel trick, which takes the precomputed similarity score matrix from the graph kernel for training.

• Evaluate the accuracy of the machine learning models by calculating the AUC or MSE value together with stratified k-fold cross-validation, and analyze the effectiveness of the graph kernel method.


1.3 Previous Work

The first attempt at automatically predicting metamorphic relations was made by Kanewala et al. in 2013 [4]. In this work, they tried to predict the MRs of Java programs under test that take arrays as inputs. They used two kinds of machine learning algorithms (support vector machines and decision trees) to train binary classifiers based on node and edge features extracted from the programs' control flow graphs (CFGs). Kanewala et al. then further developed their approach in 2015. In this work, they started to use kernel-based machine learning methods combined with graph kernels, which are a set of algorithms that can effectively compute the similarity between two graphs. They also compared the performance obtained with different graph kernel functions and with different graph representations, namely CFGs and data dependency information.

In 2018 [5], Kanewala et al. transferred their approach to Java programs that take matrices as inputs, in order to validate that their methods can be used for a wider range of test cases.

Besides the machine learning methods for automatically predicting metamorphic relations, Zhang et al. [6] developed a dynamic approach that identifies MRs by executing the programs under test with certain inputs. This approach could be a complementary technique to the method developed by Kanewala et al. [4][1][5], which leverages static properties of a program [1].

In this thesis, our approach is based on the ideas proposed by Kanewala et al. in [4] and [1]. We modified their methods to adapt them to our particular test cases.

1.4 Thesis Outline

This thesis report is organized as follows.

• In Chapter 2, we briefly introduce the background of software testing and the software system that needs to be tested with the metamorphic testing technique.


• In Chapter 3, we introduce detailed definitions of the state-of-the-art software testing technique, metamorphic testing, and of metamorphic relations, including the basic ideas and applications, and then state the challenges in identifying metamorphic relations.

• In Chapter 4, we introduce the machine learning approach used in this thesis for automatically predicting metamorphic relations, including the method overview, the graph representation of programs, the idea of graph kernels, and the machine learning algorithms that are used.

• In Chapter 5, we introduce the details of our experiment, including the composition of our data set, the metamorphic relations we chose to predict, the evaluation measures for the different machine learning methods, the corresponding results, and the analysis of the graph kernel method.

• In Chapter 6, we summarize the objective of this thesis, review the methods we used to implement our approach and evaluate its performance, and draw conclusions from the results obtained so far. We then discuss several directions for future work.


Chapter 2 Background

In this chapter, we first introduce the background of software testing, including the sources of software defects and the benefits of software testing techniques. We then describe the target system under test and the reason for using the metamorphic testing technique combined with machine learning methods.

2.1 Software Testing

2.1.1 The Overview of Software Testing

Software testing is an evaluation process conducted to obtain information about the quality of software systems. Software testing techniques operate by checking the behavior of a software product against the test oracle, which is a mechanism that tells whether the software behaves correctly. The basic idea of software testing is to verify correctness by finding bugs or defects through the execution of the software system under test. This evaluation process provides the relevant stakeholders with objective and independent information about the quality of the software and its risks of failure.

2.1.2 Source of Software Systems Defects

Software bugs or defects can have several causes. One of the most common causes is coding errors. Some of these bugs or defects can be detected and corrected when programmers review the code, but other coding mistakes stem from coding ideas that initially appeared correct, and the programmers do not notice them. Another common cause is requirements gaps, for example the incorrect omission of needed information by programmers who have not fully recognized a specific requirement and its implications [7]. In addition, defects can also be introduced by changes in the environment, such as the input data and the hardware platform.

2.1.3 The Benefits of Software Testing

The defects mentioned above usually cannot be detected and corrected through review by programmers alone. Furthermore, these defects can result in unexpected failures of software systems, which can lead to huge economic losses. For example, a study conducted by NIST in 2002 showed that software bugs cause economic losses of $59.5 billion in the USA every year, and that more than one-third of these losses could be avoided by feasible and effective software testing strategies [8]. This number has very likely increased considerably since then, because most businesses are now built on software-based systems.

Today, software system failures result in serious economic losses worldwide every year, and they can affect many aspects of society, such as entertainment, government, finance, and transportation. In fact, most of these unexpected failures could be avoided by proper testing techniques, yet a large amount of software is simply pushed forward to production without a proper evaluation or testing procedure.

Since software testing can help avoid unnecessary economic losses and bring more profit at the same time, nearly every technology company has a professional software testing department to ensure higher correctness and a lower risk of failure of its software products.


2.2 Software System Under Test

2.2.1 Fifth Generation Mobile Systems

With global mobile data traffic expected to grow very fast in the near future, there is a strong trend towards developing more efficient and advanced technology with higher data rates and better spectrum utilization. State-of-the-art applications such as 4K/8K video streaming, emerging industrial use cases, and virtual and augmented reality will also require higher bandwidth, greater capacity, better security, and lower latency. The fifth generation mobile system (5G) is expected to provide these capabilities, and this cutting-edge technology will bring new opportunities to the whole of society. As a world-leading telecom company, Ericsson is committed to developing more advanced communication technology to realize the commercial 5G networks expected in 2020.

2.2.2 Utilization of Metamorphic Testing

5G is being developed with more complex and highly innovative standardization, which leads to an urgent need for more sophisticated and powerful software to build a solid foundation for the 5G system. In some cases, a newly developed software system may not have sufficient test oracles to verify its correctness in several respects, which makes the software system untestable; this is known as the oracle problem. If such software systems are pushed forward to production without a proper test process, unexpected failures will result in unnecessary economic losses for Ericsson.

To this end, metamorphic testing is an interesting and effective software testing technique for alleviating the oracle problem that occurs in some software. Metamorphic testing was first introduced by Chen et al. [3] in 1998. Over the last 20 years it has developed rapidly, and it has been used in a variety of research areas, as shown in Figure 2.1 [9]. For instance, the technique has been used to validate widely used compilers such as Clang and GCC [10]; Xie et al. [11] applied metamorphic testing to the testing and validation of machine learning algorithms; multiple bugs in the Data Collection JavaScript Library of Adobe Analytics were detected using metamorphic testing [12]; Lindvall et al. applied metamorphic model-based testing to NASA's Data Access Toolkit [13]; and metamorphic testing has also been used for testing autonomous cars [14]. Most of these studies focused on using metamorphic testing to detect bugs in widely used software and other applications. Only 8% of the case studies are related to machine learning. Furthermore, only a few of these machine learning case studies have explored the automatic prediction of metamorphic relations by machine learning methods.

Figure 2.1: A survey of application areas of metamorphic testing

Therefore, it is worthwhile to apply this recent software testing technique to support the development of 5G systems at Ericsson. The next chapter introduces metamorphic testing in detail.


Chapter 3 Software Testing with Metamorphic Testing

In this chapter, metamorphic testing is introduced in detail. We first describe the test oracle problem and then explain how metamorphic testing works by checking programs against metamorphic relations. Finally, we state the challenges of this technique and the motivation for using machine learning methods.

3.1 Test Oracle Problem

In order to automate software testing and make the testing process easier and more reliable, a test oracle is needed. A test oracle is a mechanism for determining whether an execution of a piece of software is correct or incorrect. It checks the correctness of the software under test by comparing the actual output with the output expected for the corresponding input.

In some types of software testing (e.g. scientific software), it is difficult to tell whether a test has passed or failed because no test oracle is available. The lack of an available test oracle, or the difficulty of implementing one, gives rise to the test oracle problem (Figure 3.1), which is one of the biggest challenges in software testing.

The test oracle problem makes it difficult to detect subtle faults and one-off errors, which can seriously affect the correctness and stability of programs or software. When dealing with such problems, domain experts or scientists usually need to specify particular test oracles in an inefficient and unsystematic way. Therefore, an automated technique for testing software without test oracles is highly desirable.

Figure 3.1: The test oracle problem

3.2 Metamorphic Testing

Metamorphic testing is a software testing technique proposed by Chen et al. [3] that aims to alleviate the test oracle problem. It is a property-based testing method that checks whether the program under test satisfies previously identified properties, which are called metamorphic relations. A metamorphic relation states how a specific change to an arbitrary input should affect the output of a program. If the inputs and corresponding outputs of the program under test violate these expected relations, the program contains faults.

Generally, metamorphic testing can be implemented through the following steps (Figure 3.2):

1. Identify a set of metamorphic relations between the inputs and outputs of the target program.

2. Generate new source test cases or select existing source test cases.

3. Generate follow-up test cases by applying the previously identified MRs to the initial source test cases.

4. Execute the follow-up test cases using the target program, and then check whether there are violations of MRs.

Figure 3.2: The overview of the metamorphic testing process

The overall objective of metamorphic testing is to use follow-up test cases, newly generated according to the metamorphic relations, to detect faults that may be contained in programs without test oracles.

The most intuitive example of MT is an implementation of the sine function $y = \sin(x)$. By a well-known property, the output remains the same if $2\pi$ is added to the input angle $x$, that is, $y = \sin(x) = \sin(x + 2\pi)$. This property can be regarded as a metamorphic relation: if $x' = x + 2\pi$, then $\sin(x') = \sin(x)$. Using this metamorphic relation, the implemented sine function can be examined twice, by the original test case $(x, \sin(x))$ and by the follow-up test case $(x', \sin(x'))$. If the metamorphic relation between input angle and output value is violated, i.e. $x' = x + 2\pi$ but $\sin(x') \neq \sin(x)$, then it is a sure sign that this implementation of the sine function is faulty.
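The sine example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the input range, the tolerance, and the deliberately faulty `buggy_sin` are hypothetical and do not come from the thesis experiments.

```python
import math
import random

def check_sine_periodicity(sin_impl, trials=1000, tol=1e-9, seed=0):
    """Return the source inputs for which sin_impl violates sin(x) = sin(x + 2*pi)."""
    rng = random.Random(seed)
    violations = []
    for _ in range(trials):
        x = rng.uniform(-10.0, 10.0)       # source test case
        x_follow = x + 2.0 * math.pi       # follow-up test case derived from the MR
        # No oracle is needed: only the two outputs are compared with each other.
        if abs(sin_impl(x) - sin_impl(x_follow)) > tol:
            violations.append(x)
    return violations

def buggy_sin(x):
    # Hypothetical faulty implementation: a Taylor polynomial without range reduction.
    return x - x**3 / 6.0 + x**5 / 120.0

correct = check_sine_periodicity(math.sin)   # library implementation
faulty = check_sine_periodicity(buggy_sin)   # faulty implementation
```

Under these assumptions the library implementation produces no violations, while the truncated polynomial is flagged without anyone having to know the true value of the sine for any input.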

3.3 Metamorphic Relations

As mentioned before, metamorphic relations are expected relations between arbitrary inputs and the corresponding outputs of a program. Consider the simple example in Figure 3.3, where metamorphic testing is applied to a mathematical function that calculates the sum of an array. Some particular changes to the input array are expected to leave the output unchanged. For instance, randomly permuting the input array will not affect its sum. This kind of relation between input and output held by a program is referred to as a permutative metamorphic relation.

Figure 3.3: Example of permutative metamorphic relation

Apart from the permutative metamorphic relation, there are other relations between inputs and outputs that a program can hold. According to previous work by Murphy et al. [15], six metamorphic relations that can be applied to mathematical functions taking an array as input are described as follows:

• Permutative: Randomly permute the elements of the input array; the corresponding output will remain constant.

• Additive: Add a positive constant to the input; the corresponding output will increase or remain constant.

• Multiplicative: Multiply the input by a positive constant; the output will increase or remain constant.

• Invertive: Take the inverse of each element; the output will decrease or remain constant.

• Inclusive: Add a new element to the input array; the output will increase or remain constant.

• Exclusive: Remove an element from the input array; the output will decrease or remain constant.

If the change in the output of a function after altering the input corresponds to the prediction, then the function is said to satisfy the corresponding metamorphic relation.
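The six relations can be written down as pairs of an input transformation and an expected output relation. The sketch below is an illustration only: the table `RELATIONS`, the constant 3, the input range of positive integers, and the helper names are assumptions made for this example. Passing all trials does not prove that a relation holds; it only means that no violation was observed.

```python
import math
import random

def _same(before, after):
    return math.isclose(before, after, rel_tol=1e-9, abs_tol=1e-9)

def _not_less(before, after):      # output increases or remains constant
    return after > before or _same(before, after)

def _not_more(before, after):      # output decreases or remains constant
    return after < before or _same(before, after)

# Each entry: (input transformation, expected relation between the outputs).
RELATIONS = {
    "permutative":    (lambda xs, rng: rng.sample(xs, len(xs)), _same),
    "additive":       (lambda xs, rng: [x + 3 for x in xs], _not_less),
    "multiplicative": (lambda xs, rng: [x * 3 for x in xs], _not_less),
    "invertive":      (lambda xs, rng: [1 / x for x in xs], _not_more),
    "inclusive":      (lambda xs, rng: xs + [rng.randint(1, 100)], _not_less),
    "exclusive":      (lambda xs, rng: xs[:-1], _not_more),
}

def satisfied_relations(func, trials=200, seed=0):
    """Return the relations for which no violation was observed on random inputs."""
    rng = random.Random(seed)
    remaining = set(RELATIONS)
    for _ in range(trials):
        # Source test case: 2 to 10 positive integers (assumed input domain).
        source = [rng.randint(1, 100) for _ in range(rng.randint(2, 10))]
        before = func(source)
        for name in list(remaining):
            transform, expected = RELATIONS[name]
            if not expected(before, func(transform(source, rng))):
                remaining.discard(name)
    return remaining

sum_relations = satisfied_relations(sum)
min_relations = satisfied_relations(min)
```

For the array sum of Figure 3.3, all six relations survive on this input domain, whereas for `min` the inclusive relation is refuted as soon as a newly added element is smaller than the current minimum.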

To apply metamorphic testing, it is crucial to correctly identify the metamorphic relations that the program under test should satisfy. However, this is usually difficult for ordinary testers, since they often do not have enough prior domain knowledge of the programs. Besides, metamorphic relations often need to be identified manually, which makes the task labor-intensive and time-consuming. In the next chapter, an efficient and systematic machine learning approach is introduced to deal with this problem.


Chapter 4 Approach

In this chapter, the difficulties of replicating previous studies are stated. Then, we introduce our alternative approach including the generation of control flow graphs and machine learning methods combined with graph kernels.

4.1 Method Overview

The proposed method leverages kernel-based machine learning algorithms to train a classification model and predict the metamorphic relations of previously unseen programs. An overview of this approach is shown in Figure 4.1. During the training phase, we first generate control flow graphs from functions written in the C programming language, where each function is labeled with its corresponding metamorphic relations. We then use a graph kernel to calculate the similarity score between each pair of control flow graphs; the final output of the graph kernel function is a similarity score matrix, or Gram matrix. Next, we train a classification model using kernel-based supervised machine learning algorithms. During the testing phase, we first extract the control flow graph from a previously unseen function in the same way as in the training phase. After that, we compute the similarity scores between the previously unseen control flow graph and each training graph using the same graph kernel function. Finally, we use the trained classification model to predict the metamorphic relations that this function should satisfy.


Figure 4.1: The overview of the proposed approach
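The data flow of the training and testing phases can be sketched with a precomputed similarity matrix. This is a minimal illustration under loud assumptions: a kernel perceptron is swapped in here as a deliberately simple stand-in for the kernel-based learners used in this thesis, and the Gram matrix values, labels, and function names are invented for the example.

```python
def train_kernel_perceptron(gram, labels, epochs=20):
    """Learn dual coefficients from an n x n Gram matrix and labels in {-1, +1}."""
    n = len(labels)
    alpha = [0] * n
    for _ in range(epochs):
        for i in range(n):
            score = sum(alpha[j] * labels[j] * gram[i][j] for j in range(n))
            if labels[i] * score <= 0:     # misclassified: give example i more weight
                alpha[i] += 1
    return alpha

def predict(alpha, labels, similarities):
    """Classify an unseen graph from its similarity to every training graph."""
    score = sum(a * y * k for a, y, k in zip(alpha, labels, similarities))
    return 1 if score > 0 else -1

# Training phase: hypothetical similarity scores between four training CFGs.
# Label +1: the function satisfies a given MR; label -1: it does not.
gram = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
labels = [1, 1, -1, -1]
alpha = train_kernel_perceptron(gram, labels)

# Testing phase: similarities of two unseen CFGs to the four training CFGs.
first_prediction = predict(alpha, labels, [0.8, 0.7, 0.1, 0.1])
second_prediction = predict(alpha, labels, [0.1, 0.2, 0.9, 0.7])
```

The point of the sketch is that the learner never sees the graphs themselves: training consumes the square matrix of pairwise similarities, and prediction consumes one row of similarities between the unseen graph and the training graphs.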

4.1.1 The Difficulties of Replicating Previous Works

In the previous work by Kanewala et al. [1], CFGs were generated with a tool called Soot, which provides intermediate representations for analyzing and transforming source code. However, this method has a limitation: Soot can only be used for Java. At Ericsson, most programs are written in the C programming language, which means that the previously successful method cannot simply be replicated.

Therefore, we had to choose an alternative method. Most of the open source tools (e.g. CoFlo) that can be found on the Internet could not be used, for various reasons. Eventually, we found a way to generate CFGs for C programs using the LLVM framework and the Clang compiler.

In addition, the data set they used to train machine learning models consists of Java functions, so we had to create our own data set by implementing similar mathematical functions in the C programming language. We also needed to investigate the properties of these functions in order to label them with their corresponding metamorphic relations. This work was time-consuming and took more than 6 weeks.

4.2 Control Flow Graphs

According to previous work by Kanewala et al. [1], functions whose control flow graphs have similar structures tend to have similar metamorphic relations, so it is reasonable to use this feature of a program for training and testing. A CFG of a program is a directed graph that models the sequence of operations. It can be represented by $G = (V, E)$, where $V$ and $E$ are the sets of nodes and edges respectively. Each node $v_x \in V$ represents a basic block, which is a straight-line code sequence with no branches in except at the entry and no branches out except at the exit. A directed edge $e = (v_x, v_y)$ represents a jump, or control flow, from node $v_x$ to node $v_y$, i.e. the code of $v_y$ can be executed immediately after the code of $v_x$ is executed.

Figure 4.2: An example of CFG

Figure 4.2 shows a simple example of what a CFG looks like. As can be seen, each straight-line piece of codes sequence is contained in a basic block (node), and each related pair of basic blocks is connected by a directed edge.
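As a concrete illustration of $G = (V, E)$, the following sketch stores the CFG of a hypothetical single-loop function as a node list and an edge list, and converts it to a matrix of directed edges. The block names and the helper `adjacency_matrix` are assumptions made for this example; they are not the output of the LLVM/Clang tooling used in this thesis.

```python
def adjacency_matrix(nodes, edges):
    """Build a 0/1 matrix with a 1 at [i][j] for every directed edge (nodes[i], nodes[j])."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
    return matrix

# Hypothetical basic blocks of a function with a single loop.
nodes = ["entry", "loop_cond", "loop_body", "exit"]
edges = [
    ("entry", "loop_cond"),      # fall through into the loop test
    ("loop_cond", "loop_body"),  # condition true: run the body
    ("loop_body", "loop_cond"),  # back edge to the loop test
    ("loop_cond", "exit"),       # condition false: leave the loop
]
A = adjacency_matrix(nodes, edges)
```

The matrix is not symmetric, which reflects the fact that control can flow from `entry` to `loop_cond` but never back.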


4.3 Graph Kernels

As introduced in previous sections, the proposed approach leverages kernel-based supervised machine learning algorithms. In what follows, some relevant background on kernel methods, graph kernels, and machine learning methods is provided.

4.3.1 Kernel Methods

Kernel methods are a class of machine learning algorithms for pattern analysis (e.g. classification, clustering, and principal component analysis). The basic idea of kernel methods is illustrated in Figure 4.3: the raw representation of the data is mapped into a suitable high-dimensional feature space in which a linear relation can be found by machine learning algorithms such as support vector machines and kernel ridge regression. These methods rely on kernel functions of the form $k(x, x') = \langle \phi(x), \phi(x') \rangle$, where $\phi$ is the feature mapping. A kernel function enables a kernel method to compute the inner product in the high-dimensional feature space directly from the original representation of the data, instead of explicitly mapping the data into the new feature space and calculating the coordinates there.

Figure 4.3: Kernel methods

These kernel methods are often effective when the target data, such as sequence data, text, and graphs, do not have a natural representation in a fixed-dimensional space.
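The identity $k(x, x') = \langle \phi(x), \phi(x') \rangle$ can be checked numerically on a textbook case. The degree-2 polynomial kernel below is chosen only because its feature mapping is easy to write out; it is an assumption for illustration and is not the kernel used in this thesis.

```python
import math

def poly_kernel(x, y):
    """Degree-2 polynomial kernel, computed in the original 2-D space."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def feature_map(x):
    """Explicit feature mapping phi of the same kernel (3-D feature space)."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

x, y = (1.0, 2.0), (3.0, 4.0)
via_kernel = poly_kernel(x, y)                          # no mapping needed
via_mapping = inner(feature_map(x), feature_map(y))     # map first, then inner product
```

Both routes give the same value up to rounding, but the kernel route never constructs the feature vectors. For graphs, where a useful explicit feature space may be very large or infinite, this is what makes kernel-based learning practical.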


4.3.2 Graph Kernels

A graph kernel is a kernel-based algorithm for graph comparison and classification. Graph kernels have become a popular approach in many domains where the target data is not represented in a fixed-dimensional space, such as biology [16], chemistry [17], and social network analysis [18]. These areas usually focus on structured objects, and graphs are the most common and natural representation for modeling such data. In graph comparison problems, the similarity of two graphs is often obtained by comparing their structures. Graph kernels, first proposed by Gärtner et al. in 2003 [19], are a group of functions that compute the similarity score of a pair of graphs by comparing the structure of the two graphs in certain ways. Kernel-based machine learning algorithms can then use the precomputed similarity scores to perform classification and prediction tasks.

4.3.3 Random Walk Kernel

According to previous work by Kanewala et al. [1], the random walk kernel proposed by Gärtner et al. [19] usually performs better on this specific task, so this graph kernel function is used in this thesis. The details of this kernel function are as follows.

First, we define the adjacency matrix of a graph $G = (V, E)$ with $n$ vertices. For a directed graph, the adjacency matrix $A$ is an $n \times n$ matrix with element $A_{ij} = 1$ if there is a directed edge from vertex $v_i$ to vertex $v_j$, and $A_{ij} = 0$ otherwise; in general $A_{ij} \neq A_{ji}$, so $A$ need not be symmetric. For an undirected graph, the adjacency matrix $A$ is an $n \times n$ matrix with element $A_{ij} = 1$ if vertices $v_i$ and $v_j$ are neighbors, and $A_{ij} = 0$ otherwise; here $A_{ij} = A_{ji}$, which means that $A$ is a symmetric matrix. For both types of graphs, the elements on the diagonal are always equal to zero. Since the operations in a program are always executed in a certain order, the corresponding CFG is always a directed graph, so only directed graphs are considered in this thesis.

The random walk kernel is based on the idea of counting the number of matching walks in two graphs. A random walk on a graph $G = (V, E)$ is a sequence of vertices $v_1, v_2, \ldots, v_k$ of length $k$ in which consecutive vertices are connected by a directed edge, for example from $v_{k-1}$ to $v_k$. In principle, two graphs can be compared by performing random walks of every length on both graphs and then counting how many matching walks exist in both graphs. However, this task is usually difficult to accomplish due to the high diversity of the two graphs. Therefore, instead of generating vertex sequences of different lengths from both graphs, the random walk is usually performed on the direct product graph [20] calculated from the two graphs.
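Counting walks does not require enumerating them: for an adjacency matrix $A$, the entry $(A^s)_{ij}$ equals the number of walks with $s$ edges from $v_i$ to $v_j$. The sketch below illustrates this standard fact. The example matrix and the helper names are assumptions made for illustration, and walk length is counted here in edges rather than in vertices.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    size = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(size)) for j in range(size)]
            for i in range(size)]

def count_walks(adjacency, steps):
    """Total number of walks with `steps` edges, i.e. the sum of all entries of A^steps."""
    size = len(adjacency)
    power = [[int(i == j) for j in range(size)] for i in range(size)]  # A^0 = identity
    for _ in range(steps):
        power = mat_mul(power, adjacency)
    return sum(sum(row) for row in power)

# Hypothetical directed graph with vertices 0..3 and edges 0->1, 1->2, 1->3, 2->1.
cfg = [
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
walks_one_edge = count_walks(cfg, 1)    # one walk per edge
walks_two_edges = count_walks(cfg, 2)   # 0-1-2, 0-1-3, 1-2-1, 2-1-2, 2-1-3
```

Applying the same matrix-power idea to the adjacency matrix of the direct product graph counts the walks that the two original graphs have in common.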

Let $G = (V, E)$ and $G' = (V', E')$ be two directed graphs. The direct product graph of these two graphs is defined as $G_\times = (V_\times, E_\times)$, where the vertex set $V_\times$ and the edge set $E_\times$ are given by:

$$V_\times = \{(v_i, v'_m) \in V \times V'\} \tag{4.1}$$

$$E_\times = \{((v_i, v'_m), (v_j, v'_n)) \in V_\times \times V_\times \mid (v_i, v_j) \in E,\ (v'_m, v'_n) \in E'\} \tag{4.2}$$

The vertices of the direct product graph are all pairs of vertices from $G$ and $G'$. Two vertices in $G_\times$ are connected by a directed edge if and only if there are directed edges between the corresponding vertices in both $G$ and $G'$.

As can be seen in Figure 4.4, each vertex in the direct product graph $G_\times$ is a combination of two vertices, one from directed graph $G$ and one from $G'$. Two vertices in $G_\times$ are neighbors if and only if the corresponding original vertices are neighbors in both $G$ and $G'$. For instance, vertices $a'a$ and $b'b$ in $G_\times$ are adjacent because vertices $a$ and $b$ are adjacent in graph $G$ and vertices $a'$ and $b'$ are connected by a directed edge in graph $G'$. Vertex $a'c$ in $G_\times$ is not adjacent to any other vertex, since no other combination of vertices from the original graphs is connected to $a'$ and $c$ simultaneously.

Let $A$ and $A'$ denote the adjacency matrices of two directed graphs $G$ and $G'$ with $n$ and $m$ vertices respectively. The adjacency matrix of the direct product graph $G_\times$ can then be written as $A_\times = A \otimes A'$, where $\otimes$ denotes the Kronecker product, defined as:

$$A_\times = A_{n \times n} \otimes A'_{m \times m} = \begin{pmatrix} A_{11}A' & \cdots & A_{1n}A' \\ \vdots & \ddots & \vdots \\ A_{n1}A' & \cdots & A_{nn}A' \end{pmatrix}_{mn \times mn} \tag{4.3}$$
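The construction in Equation 4.3 can be sketched with NumPy's `np.kron`. The two small graphs below are made up for illustration (they are not the graphs of Figure 4.4), and `product_index` is a hypothetical helper that maps a vertex pair to its row in the product matrix.

```python
import numpy as np

# Two tiny directed graphs, chosen only for illustration.
# G:  a -> b -> c        G': a' -> b'
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]])
A_prime = np.array([[0, 1],
                    [0, 0]])

# Adjacency matrix of the direct product graph (Equation 4.3).
A_x = np.kron(A, A_prime)

def product_index(i, m, n_prime):
    """Row/column in np.kron(A, A_prime) of the product vertex (v_i, v'_m)."""
    return i * n_prime + m

n_prime = A_prime.shape[0]
src = product_index(0, 0, n_prime)  # (a, a')
dst = product_index(1, 1, n_prime)  # (b, b')

print(A_x.shape)      # (6, 6)
print(A_x[src, dst])  # 1, because a -> b in G and a' -> b' in G'
```

The product graph has one vertex per pair of original vertices, and an edge appears only where both original graphs have an edge, which matches Equation 4.2.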


Figure 4.4: An example of the direct product graph

Then, based on the direct product graph, Gärtner et al. defined the random walk kernel as [19]:

$$k(G, G') = \sum_{i,j=1}^{|V_\times|} \left[ \sum_{k=0}^{\infty} \lambda^k A_\times^k \right]_{ij} \tag{4.4}$$


where $0 \leq \lambda < 1$ is a weighting parameter and $k$ is the length of a random walk (the experiments in Chapter 5 cap it at a maximum length). Usually, for better classification performance, the adjacency matrices are normalized as:

$$\bar{A} = A \cdot D^{-1} \tag{4.5}$$

where $D$ is a diagonal matrix defined as:

$$D = \begin{pmatrix} \sum A_{*1} & & \\ & \ddots & \\ & & \sum A_{*n} \end{pmatrix}_{n \times n} \tag{4.6}$$

where $A_{*j}$ is the $j$th column of the adjacency matrix $A$. After this normalization, each column with at least one non-zero entry sums to one. The final form of the random walk kernel is then:

$$k(G, G') = \sum_{i,j=1}^{|V_\times|} \left[ \sum_{k=0}^{\infty} \lambda^k \bar{A}_\times^k \right]_{ij} \tag{4.7}$$

where $\bar{A}_\times = \bar{A} \otimes \bar{A}'$ denotes the normalized adjacency matrix of the direct product graph $G_\times$.
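A minimal sketch of Equations 4.5 to 4.7 in NumPy follows. It is an illustration under stated assumptions rather than the implementation used in the thesis: the infinite sum is truncated at `max_len`, columns that sum to zero are left as zeros (the text does not say how they are handled), and the function names are hypothetical.

```python
import numpy as np

def column_normalise(A):
    """Return A D^{-1}, where D holds the column sums of A (Equations 4.5-4.6).

    Assumption: all-zero columns (vertices without incoming edges) stay zero.
    """
    A = np.asarray(A, dtype=float)
    col_sums = A.sum(axis=0)
    safe = np.where(col_sums == 0, 1.0, col_sums)  # avoid division by zero
    return A / safe                                # divides column j by d_j

def random_walk_kernel(A, A_prime, lam=0.8, max_len=9):
    """Equation 4.7 with the sum truncated at k = max_len."""
    A_x = np.kron(column_normalise(A), column_normalise(A_prime))
    power = np.eye(A_x.shape[0])   # term for k = 0
    total = power.copy()
    for _ in range(max_len):
        power = lam * (power @ A_x)  # lam^k * A_x^k
        total += power
    return float(total.sum())

# Toy graphs (illustrative only): a -> b -> c and a' -> b'.
A = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
A_prime = np.array([[0, 1], [0, 0]])
print(random_walk_kernel(A, A_prime, lam=0.5))  # 6 + 0.5 * 2 = 7.0
```

In the toy case the $k = 0$ term contributes one per product vertex (6), the two product edges contribute $2\lambda$, and no longer matching walks exist, so the value can be checked by hand.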

4.4 Machine Learning

As mentioned in the previous section, the prediction of MRs is regarded as a machine learning problem. Machine learning algorithms are a well-known and effective approach for discovering relations in data. They can be divided into two broad groups: supervised learning and unsupervised learning. Supervised learning aims to learn a function that maps input data to the corresponding output labels, based on previously seen examples of input-output pairs. Unsupervised learning focuses on finding structure in unlabeled data. In this thesis, only supervised learning is considered.

There are many widely used algorithms for supervised learning. Two kernel-based algorithms, support vector machines (SVMs) and kernel ridge regression (KRR), are used in this thesis.


4.4.1 Support Vector Machines

SVMs [21] are a group of machine learning algorithms used for classification and regression in many real-world applications. The basic idea of SVMs is to perform linear classification by constructing a hyperplane in a high-dimensional space that separates the examples in the training set according to their class labels. This usually depends on the "kernel trick" introduced in the previous section. Many popular kernel functions can perform the "kernel trick" by mapping the original low-dimensional data into a higher-dimensional space, such as the linear kernel, polynomial kernel, Gaussian kernel, and sigmoid kernel. In this thesis, the kernel function is the self-defined graph kernel function.
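As a sketch of how an SVM can consume a self-defined kernel, Scikit-Learn's `SVC` accepts `kernel="precomputed"` and is then trained on a Gram matrix instead of feature vectors. The data below is synthetic and the Gram matrix is a plain inner product standing in for the graph kernel; both are assumptions made only to keep the example self-contained.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in data: two well-separated clusters. In the thesis setting, K[i, j]
# would instead hold the graph kernel value between CFG i and CFG j.
X = np.vstack([rng.normal(-2, 0.5, (10, 2)), rng.normal(2, 0.5, (10, 2))])
y = np.array([0] * 10 + [1] * 10)
K = X @ X.T  # Gram matrix of all pairs

test_idx = np.arange(0, 20, 5)                      # every fifth sample
train_idx = np.setdiff1d(np.arange(20), test_idx)

clf = SVC(kernel="precomputed", C=10)
# Training needs the (n_train, n_train) block of the Gram matrix.
clf.fit(K[np.ix_(train_idx, train_idx)], y[train_idx])
# Prediction needs kernel values between test and training samples.
pred = clf.predict(K[np.ix_(test_idx, train_idx)])
print(pred)
```

The key detail is the shape of the matrices: rows index the samples being fitted or predicted, and columns always index the training samples.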

4.4.2 Kernel Ridge Regression

KRR [22] is an algorithm that combines kernel methods with ridge regression. Ridge regression, also referred to as Tikhonov regularization, builds a regression model by minimizing the least-squares cost function with l2-norm regularization. KRR learns a function that is linear in the space induced by the chosen kernel, which corresponds to a linear or non-linear function in the original space. Any kernel function that can be used for SVM can also be used for KRR. In this thesis, the kernel function is the self-defined graph kernel function.
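For intuition, kernel ridge regression has a closed-form solution in terms of the Gram matrix $K$: the dual coefficients are $(K + \alpha I)^{-1} y$ and a prediction is a kernel-weighted sum of them. The sketch below uses this standard formulation with NumPy; the function names are hypothetical and the tiny Gram matrix is invented for illustration. The thesis itself relies on Scikit-Learn rather than on this code.

```python
import numpy as np

def krr_fit(K_train, y_train, alpha=1.0):
    """Solve (K + alpha * I) c = y for the dual coefficients c."""
    n = K_train.shape[0]
    return np.linalg.solve(K_train + alpha * np.eye(n), y_train)

def krr_predict(K_test_train, dual_coef):
    """Predict from kernel values between test and training samples."""
    return K_test_train @ dual_coef

# Illustrative Gram matrix for two training samples with labels 1 and 0.
K = np.array([[2.0, 0.0],
              [0.0, 2.0]])
y = np.array([1.0, 0.0])

coef = krr_fit(K, y, alpha=1.0)
print(krr_predict(K, coef))  # predictions on the training samples
```

A larger `alpha` shrinks the coefficients and pulls predictions towards zero, which is the regularization effect described above.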


Chapter 5 Replication of the Previous Experiment

In previous work, Kanewala et al. used the tool Soot to generate CFGs of their target programs, which are written in the Java programming language. This open source tool is designed specifically for Java. This thesis aims to apply the same state-of-the-art technique to programs written in the C programming language, which is the language most often used at Ericsson. Therefore, it is impossible to directly replicate the previous work. Our experiment is instead designed to validate whether the technique generalizes, so that it can be used more widely for different types of programs. The rest of this chapter describes the data set, the metamorphic relations, the evaluation measures used in this experiment, and the results so far.

5.1 Data Set

There is no existing data set of C code labeled with the corresponding metamorphic relations. Therefore, to evaluate the effectiveness of the proposed method, we created our own data set by implementing mathematical functions in C that take arrays as inputs. Our data set is similar to the one Kanewala et al. used for their machine learning model, which was written in Java [4].

The proposed method leverages a graph kernel, which calculates the similarity score between a pair of graphs, so we need to generate the corresponding CFGs for these functions. Since Soot is not available for C, the combination of the LLVM framework and the Clang compiler under Linux was chosen to generate the CFGs. There is a difference between the two methods. Soot generates a typed three-address representation for each operation in the Java code, so each node in the resulting CFG has a label that specifies its corresponding operation in the code. In contrast, the combination of LLVM and Clang only generates the CFG without specifying the operation of each node. Consider the example in Figure 5.1. In the CFG on the left, the nodes are labeled by their corresponding mathematical operation in the code. When using graph kernel methods, Kanewala et al. [1] also computed the similarity between the labeled nodes of two graphs, and the similarity score between a pair of nodes contributes to the comparison of the two graphs. In the CFG on the right, each node only has a number that distinguishes it from the others, so when using graph kernel methods we can only compare the graphical structure of two graphs.

Figure 5.1: An example of labeling the nodes by their operation

This difference between the two methods of generating CFGs always exists. As a consequence, we cannot compare the similarity between nodes of two graphs as Kanewala et al. did [1]. An example of a function in my data set and its corresponding CFG is shown in Figure 5.2. The generated CFGs are saved as DOT files, which can later be read directly by Python libraries.
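To make the last step concrete, the sketch below turns the text of a DOT file into an adjacency matrix using only the Python standard library. It is a simplified stand-in for the Python libraries mentioned above, not the thesis's code: it assumes one edge per line in the form `src -> dst;` (optionally with a port suffix such as `:s0`), it ignores nodes that have no edges, and the name `dot_to_adjacency` and the sample text are hypothetical.

```python
import re

# Matches lines such as:  Node0x1 -> Node0x2;   or   Node0x1:s0 -> Node0x2;
EDGE = re.compile(r'^\s*"?(\w+)"?(?::\w+)?\s*->\s*"?(\w+)"?(?::\w+)?',
                  re.MULTILINE)

def dot_to_adjacency(dot_text):
    """Return (node_names, adjacency_matrix) for the edges found in dot_text."""
    edges = EDGE.findall(dot_text)
    nodes = sorted({name for edge in edges for name in edge})
    index = {name: i for i, name in enumerate(nodes)}
    A = [[0] * len(nodes) for _ in nodes]
    for src, dst in edges:
        A[index[src]][index[dst]] = 1  # directed edge src -> dst
    return nodes, A

# Hand-written sample in DOT syntax (not real tool output).
sample = '''digraph "CFG for 'f' function" {
    Node0x1 [shape=record,label="{%1}"];
    Node0x1 -> Node0x2;
    Node0x1:s0 -> Node0x3;
    Node0x2 -> Node0x3;
}'''

nodes, A = dot_to_adjacency(sample)
print(nodes)
print(A)
```

Because the nodes are unlabeled, only the edge structure is kept, which is exactly the information the graph kernel can use in this setting.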


Figure 5.2: An example of the functions (bubble sort) in my data set and its corresponding control flow graph. The combination of number and notation "%" in each node stands for an automatically generated fake name which can be used to distinguish each node.

5.2 Metamorphic Relations

For simplicity, three metamorphic relations are used in this experiment.

Permutative MR: Randomly permute the elements of the input array, the corresponding output will remain constant.

Additive MR: Add a positive constant to the input, the corresponding output will increase or remain constant.

Inclusive MR: Add a new element to the input array, the output will increase or remain constant.
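The three relations above can be written as small checks on a function under test. The sketch below is illustrative only: the helper names, the default constant `c = 1`, and the inserted element `new = 1` are assumptions. A single passing input does not prove that a relation holds in general, while a single failing input is enough to refute it.

```python
import random

def holds_permutative(f, xs, rng):
    """Output is unchanged when the input elements are shuffled."""
    shuffled = xs[:]
    rng.shuffle(shuffled)
    return f(shuffled) == f(xs)

def holds_additive(f, xs, c=1):
    """Output increases or stays constant when c > 0 is added to every element."""
    return f([x + c for x in xs]) >= f(xs)

def holds_inclusive(f, xs, new=1):
    """Output increases or stays constant when a new element is appended."""
    return f(xs + [new]) >= f(xs)

xs = [3, 1, 4, 1, 5]
rng = random.Random(0)

print(holds_permutative(sum, xs, rng))        # True: addition is order-independent
print(holds_additive(max, xs))                # True: the maximum grows by c
print(holds_inclusive(max, xs))               # True: appending cannot lower the maximum
print(holds_additive(lambda v: -sum(v), xs))  # False: the negated sum decreases
```

In the training data, the outcome of such an analysis for a function becomes the 0/1 label attached to its CFG.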

Following the proposed method, each metamorphic relation is used to label the CFGs for training a machine learning model: if the function behind a graph satisfies the metamorphic relation, the graph's label is 1, otherwise it is 0. Table 5.1 shows the number of total examples, positive examples, and negative examples for each MR.

MR            Total examples   Positive examples   Negative examples
Permutative   45               19                  26
Additive      39               25                  14
Inclusive     30               19                  11

Table 5.1: Number of positive and negative examples for each MR

5.3 Evaluation Measures

As mentioned in the previous chapter, two kernel-based machine learning algorithms, SVM and KRR, are used in this toy example, since the proposed method is based on graph kernels. Both algorithms are available in Scikit-Learn, a machine learning library for Python.

In the training phase, we feed the training set split from the data set to a random walk kernel to calculate the similarity matrix which will be then used for training a machine learning model.

5.3.1 Stratified k-fold Cross-Validation

Stratified k-fold cross-validation (Figure 5.3) is used for cross-validation in this thesis. This method randomly partitions the original data set into k subsets, or folds. One fold is used for testing while the other k − 1 folds are used for training the predictive model, and the procedure is repeated until every fold has been used for testing once. In addition, the stratified k-fold method splits the data set according to the label of each sample, so that each class keeps approximately the same proportion in every fold. The final result is the average over the cross-validation folds.
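The splitting procedure described above can be sketched in plain Python as follows. This is a simplified illustration, not Scikit-Learn's implementation: samples of each class are shuffled and dealt round-robin into the folds, and the function name and `seed` parameter are hypothetical. The label counts in the usage example are those of the permutative MR in Table 5.1.

```python
import random
from collections import defaultdict

def stratified_k_fold(labels, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the samples of this class evenly across the folds.
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)

    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test

# 19 positive and 26 negative examples, as for the permutative MR.
labels = [1] * 19 + [0] * 26
for train, test in stratified_k_fold(labels, k=3):
    positives = sum(labels[i] for i in test)
    print(len(train), len(test), positives)
```

Each test fold ends up with 6 or 7 of the 19 positive examples, so the class ratio in every fold stays close to that of the full data set.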

5.3.2 Area Under the Receiver Operating Characteristic Curve

Figure 5.3: Stratified k-fold cross-validation. Both classes (red and green) have approximately the same distribution in each fold.

For support vector machines, we use the area under the receiver operating characteristic curve (AUC) to evaluate the performance, since it is an effective analysis tool for binary classifiers in machine learning. In binary classification the outputs are either positive or negative, so there are four possible outcomes in total: true positive (TP), true negative (TN), false positive (FP), and false negative (FN). With these four possible outcomes, a 2 × 2 confusion matrix can be formulated as in Table 5.2.

                     Actual positive   Actual negative
Predicted positive   TP                FP
Predicted negative   FN                TN

Table 5.2: Confusion matrix

According to the confusion matrix, the true positive rate (TPR) and false positive rate (FPR) can be calculated as below:

$$TPR = \frac{\sum TP}{\sum TP + \sum FN} \tag{5.1}$$

$$FPR = \frac{\sum FP}{\sum FP + \sum TN} \tag{5.2}$$

A receiver operating characteristic (ROC) curve is a graphical plot whose x and y axes are the FPR and the TPR respectively, and the AUC is the area under this curve. Usually, the larger the AUC, the better the model performs. AUC = 1 means that the model under analysis is ideal: the classifier makes no false positive or false negative decisions. AUC = 0.5 means that the classifier does no better than a random guess. As a rough guide, a binary classifier is considered good when its AUC ≥ 0.8.
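The quantities above can be computed in a few lines of plain Python. The sketch below is illustrative: the function names and the example scores are made up, and the AUC is computed through its rank interpretation (the fraction of positive-negative pairs in which the positive example receives the higher score, with ties counted as one half), which equals the area under the ROC curve.

```python
def rates(y_true, y_pred):
    """Return (TPR, FPR) from true labels and hard 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)

def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]            # e.g. classifier decision values
y_pred = [int(s >= 0.5) for s in scores]  # one point on the ROC curve

print(rates(y_true, y_pred))  # (0.5, 0.5)
print(auc(y_true, scores))    # 0.75
```

Thresholding the scores at different values produces different (FPR, TPR) points; the AUC summarizes all of them in a single number.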

5.3.3 Mean Squared Error

For kernel ridge regression, since its predicted value is continuous rather than binary, the mean squared error (MSE) was chosen to measure the quality of the regression model instead of the AUC. The MSE is defined as:

$$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{5.3}$$

where $n$ is the total number of samples, and $y_i$ and $\hat{y}_i$ represent the original labels and the predictions respectively. The MSE is always non-negative, and usually, the smaller the MSE, the better the estimator performs. As a rough guide, a regression model trained on binary labels is considered good when its MSE ≤ 0.2.
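Equation 5.3 translates directly into code. The labels and predictions below are invented for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between labels and predictions."""
    n = len(y_true)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n

y_true = [1, 0, 1, 0]          # binary MR labels
y_pred = [0.9, 0.2, 0.6, 0.1]  # continuous regression outputs

print(mean_squared_error(y_true, y_pred))  # (0.01 + 0.04 + 0.16 + 0.01) / 4
```

With 0/1 labels, a model that always predicts 0.5 has an MSE of 0.25, which gives a reference point for the 0.2 guideline.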

5.4 Results

The results of our experiment are presented in this section. We conducted a grid search together with stratified 3-fold cross-validation to find the best hyperparameters (the regularization parameters alpha for KRR and C for SVM) for the different metamorphic relations. With the longest walk length k in the random walk kernel set to 9, the best values of C for the support vector machine and λ for the random walk kernel were 10 and 0.8 respectively. For kernel ridge regression, the best results for all three MRs were obtained with alpha = 1 and λ = 0.6.

5.4.1 AUC Performance for Support Vector Machine

Figures 5.4, 5.5, and 5.6 depict the ROC curve of each fold and the average ROC curve for the models predicting the permutative MR, the additive MR, and the inclusive MR. Each fold does not differ much from the other folds, which means the variance is relatively small for all three models. The AUC values also show that the prediction results are much better than a random guess.

Figure 5.4: ROC analysis of classifier for predicting the permutative metamorphic relation

Figure 5.5: ROC analysis of classifier for predicting the additive metamorphic relation

Figure 5.6: ROC analysis of classifier for predicting the inclusive metamorphic relation

From the comparison of the AUC values of the three models depicted in Figure 5.7, we can conclude that all three models perform well according to the common rule of thumb that AUC ≥ 0.8 indicates a good classifier. In addition, the prediction of the inclusive MR is relatively more accurate than that of the other two MRs.


5.4.2 MSE Performance for Kernel Ridge Regression

Figure 5.8 shows the comparison of the MSE results for the three regression models. All the MSE values are around 0.2 or smaller, which means the models are acceptable. Additionally, the regression model for predicting the inclusive MR is again relatively more accurate than those for the other two MRs.

Figure 5.8: The comparison of performance among all regression models

5.4.3 Analysis of Random Walk Kernel

To evaluate the random walk kernel when using SVM and KRR, we computed the AUC (for SVM) and the MSE (for KRR) with values of λ from 0.1 to 0.9, since λ is a parameter of the random walk kernel that must be smaller than 1.

Figure 5.9 shows the result when using SVM. For the permutative MR, the AUC increases slightly as λ grows, and for the additive MR a higher value of λ also results in a better AUC. This indicates that the prediction of the permutative and additive MRs relies more on comparing longer walks in the two graphs. This can be explained by Equation 4.7: every random walk in the direct product graph is weighted by λ^k according to its length k, so longer walks, which have a larger k, contribute more when λ is higher. For the inclusive MR, the AUC decreases slightly with higher values of λ, which shows that the prediction of the inclusive MR depends more on the similarity between shorter walks of the two graphs.
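The effect of λ on the weighting can be illustrated numerically. The sketch below looks only at the weights λ^k for walk lengths 0 to 9 and reports the share of the total weight that falls on walks of length 5 or more. The cut-off of 5 is an arbitrary choice for illustration, and the actual contribution of each length also depends on the entries of the matrix powers in Equation 4.7, which this sketch ignores.

```python
def weight_share_of_long_walks(lam, max_len=9, long_from=5):
    """Fraction of the total weight lam**k (k = 0..max_len) with k >= long_from."""
    weights = [lam ** k for k in range(max_len + 1)]
    return sum(weights[long_from:]) / sum(weights)

for lam in (0.1, 0.5, 0.9):
    print(lam, round(weight_share_of_long_walks(lam), 4))
```

The share grows with λ, which is consistent with the interpretation that a higher λ puts more emphasis on longer walks.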

Figure 5.9: Variation of AUC with different value of λ

Figure 5.10 shows the result when using KRR. For the permutative and additive MRs, the MSE does not vary much as the value of λ changes. The prediction of the inclusive MR becomes more accurate with larger values of λ. This shows that the choice of machine learning algorithm also affects how the random walk kernel responds to different values of λ.


Figure 5.10: Variation of MSE with different value of λ

5.4.4 Test for Existing Telecom Software

We also tried to use the machine learning models, trained on our data set, to predict MRs for existing telecom software at Ericsson. The predicted MRs were not applicable to the programs under test.

This result can be explained by several reasons. First, the functions we used for training are at the unit or function level, whereas the code we tested is more complex and at the system or program level. Properties that can be revealed in functions may not be detectable in complex programs. Second, our data set consists of mathematical functions that take an array as input, whereas the telecom programs are not purely mathematical and can take different types of inputs. Additionally, the metamorphic testing community has no standard set of metamorphic relations that applies to all kinds of programs and software systems, so the MRs in our model may not be feasible for the telecom programs.


Chapter 6 Conclusion and Future Work

This thesis aims to achieve automatic prediction of metamorphic relations by utilizing machine learning algorithms combined with graph kernel methods. In this chapter, the overall conclusion drawn from this thesis and the future work are presented.

6.1 Conclusion

Metamorphic testing [3] is an effective software testing technique for programs that lack test oracles. The approach is based on checking a set of properties between inputs and outputs, called metamorphic relations. The violation of an MR that a program is expected to satisfy indicates that the program under test contains errors. However, it is often difficult for testers to identify the MRs of a program. This thesis therefore investigates whether machine learning algorithms together with graph kernel techniques can help to predict MRs for programs under test.

In previous work, Kanewala et al. [1][5] successfully applied machine learning methods to predicting MRs by using graph kernels. Their results show that the control flow graph is an effective feature of a program under test for training a machine learning model, and that kernel-based machine learning methods together with graph kernels work well in predicting MRs. Although their experiment was successful in this task, a simple replication of their work cannot be conducted for the common programs at Ericsson. There are two reasons for this. First, the code corpus of their experiment is written in Java, while the C programming language is most often used at Ericsson. There are differences between these two programming languages, which could lead to different structures of the control flow graphs used for training a machine learning model. Second, the tool they used for generating control flow graphs is Soot, a library designed specifically for Java programs. This means that their method is tied to a third-party tool, so their approach cannot be simply transferred to other programming languages.

To validate that their idea can be used more widely, we first investigated how to generate control flow graphs for C code. We decided to use the LLVM framework together with the Clang compiler to generate unlabeled control flow graphs for C programs. We then built our own code corpus by implementing 45 mathematical functions similar to those Kanewala et al. used in their previous work [4], and generated CFGs for all the functions in our code corpus with this method. After that, we implemented the random walk kernel to calculate the similarity score matrix, or Gram matrix, in which each entry is the similarity score between the CFGs of two programs in our code corpus. Next, we trained two kinds of machine learning models for each metamorphic relation, using the support vector machine and kernel ridge regression algorithms. We evaluated our models using the area under the receiver operating characteristic curve or the mean squared error, together with stratified k-fold cross-validation.

The results of our experiments show that the AUC values for all metamorphic relations are larger than 0.8, and the MSE values for all MRs are around or smaller than 0.2. According to a common rule of thumb, AUC ≥ 0.8 and MSE ≤ 0.2 indicate that the trained model performs well. In addition, both machine learning algorithms, SVM and KRR, performed better on predicting the inclusive MR of a program than on the permutative and additive MRs. We also analyzed the effectiveness of the random walk kernel under variation of the parameter λ. The results show that machine learning algorithms together with graph kernel methods can be effective for predicting metamorphic relations for previously unseen C programs.


6.2 Future Work

The model in this thesis work is the first step of this specific approach, and it can be extended in several directions.

First, the code corpus we used consists of 45 mathematical functions, which is a relatively small data set and could threaten external validity. To reduce this problem, we can build a larger data set with more functions, which could lead to better accuracy and a smaller threat to external validity.

Second, the prediction of metamorphic relations in our model is at the function or unit level. In real applications, however, programs are often more complex and call multiple functions. If the model is trained on code at the program level, the metamorphic relations at the function level sometimes cannot be predicted. As a result, some subtle errors at the function level may not be revealed by checking the metamorphic relations predicted at the program level [23]. We therefore need to investigate the relationship between functions and programs, and then develop more general and complex algorithms that can predict metamorphic relations at the function level and the program level simultaneously.

Third, the model in this thesis predicts a single metamorphic relation at a time, which means that a binary classifier is trained in our work. However, most programs can satisfy more than one metamorphic relation at the same time. In this case, we can use a softmax function, which predicts the probabilities of multiple classes in a single output vector.

Besides, the functions in our code corpus take arrays as inputs. In real applications, functions can take different types of inputs, such as matrices, volumes, and graphs. For this situation, we could train specific machine learning models for different kinds of programs, or train a more general model based on a data set with a larger variety.

In addition, our method is still limited by third-party tools. The way we generate control flow graphs only works for the C and C++ programming languages, so our approach cannot be simply transferred to programs written in other programming languages. There should be a general tool that can generate labeled or unlabeled control flow graphs for different kinds of code. However, this task may be more feasible for professional tool developers in this area.

Finally, in an age of fast development of artificial intelligence, we could build a graph kernel based deep neural network to predict the possible metamorphic relations that a program under test could have [24].


Bibliography

[1] Upulee Kanewala, James M Bieman, and Asa Ben-Hur. “Predicting metamorphic relations for testing scientific software: a machine learning approach using graph kernels”. In: Software Testing, Verification and Reliability 26.3 (2016), pp. 245–269.

[2] Elaine J Weyuker. “On testing non-testable programs”. In: The Computer Journal 25.4 (1982), pp. 465–470.

[3] Tsong Y Chen, Shing C Cheung, and Siu Ming Yiu. Metamorphic testing: a new approach for generating next test cases. Tech. rep. Technical Report HKUST-CS98-01, Department of Computer Science, Hong Kong University of Science and Technology, Hong Kong, 1998.

[4] Upulee Kanewala and James M. Bieman. “Using machine learning techniques to detect metamorphic relations for programs without test oracles”. In: IEEE, Nov. 2013, pp. 1–10. ISBN: 9781479923663.

[5] Karishma Rahman and Upulee Kanewala. “Predicting Metamorphic Relation for Matrix Calculation Programs”. In: arXiv preprint arXiv:1802.06863 (2018).

[6] Jie Zhang et al. “Search-based inference of polynomial metamorphic relations”. In: Proceedings of the 29th ACM/IEEE international conference on Automated software engineering. ACM. 2014, pp. 701–712.

[7] Dorota Huizinga and Adam Kolawa. Automated defect prevention: best practices in software management. John Wiley & Sons, 2007.

[8] Strategic Planning. “The economic impacts of inadequate infrastructure for software testing”. In: National Institute of Standards and Technology (2002).

[9] Sergio Segura et al. “A survey on metamorphic testing”. In: IEEE Transactions on software engineering 42.9 (2016), pp. 805–824.

[10] Vu Le, Mehrdad Afshari, and Zhendong Su. “Compiler validation via equivalence modulo inputs”. In: ACM SIGPLAN Notices. Vol. 49. 6. ACM. 2014, pp. 216–226.

[11] Xiaoyuan Xie et al. “Testing and validating machine learning classifiers by metamorphic testing”. In: Journal of Systems and Software 84.4 (2011), pp. 544–558.

[12] Zhenyu Wang et al. “Metamorphic Testing for Adobe Analytics Data Collection JavaScript Library”. In: ().

[13] Mikael Lindvall et al. “Metamorphic Model-Based Testing Applied on NASA DAT–An Experience Report”. In: Software Engineering (ICSE), 2015 IEEE/ACM 37th IEEE International Conference on. Vol. 2. IEEE. 2015, pp. 129–138.

[14] Yuchi Tian et al. “DeepTest: Automated testing of deep-neural-network-driven autonomous cars”. In: arXiv preprint arXiv:1708.08559 (2017).

[15] Christian Murphy et al. “Properties of Machine Learning Applications for Use in Metamorphic Testing.” In: SEKE. Vol. 8. 2008, pp. 867–872.

[16] Roded Sharan and Trey Ideker. “Modeling cellular machinery through biological network comparison”. In: Nature biotechnology 24.4 (2006), p. 427.

[17] Danail Bonchev. Chemical graph theory: introduction and fundamentals. Vol. 1. CRC Press, 1991.

[18] Ravi Kumar, Jasmine Novak, and Andrew Tomkins. “Structure and evolution of online social networks”. In: Link mining: models, algorithms, and applications. Springer, 2010, pp. 337–357.

[19] Thomas Gärtner, Peter Flach, and Stefan Wrobel. “On graph kernels: Hardness results and efficient alternatives”. In: Conference on Learning Theory. 2003, pp. 129–143.

[20] Svn Vishwanathan et al. “Graph Kernels”. In: Journal Of Machine Learning Research 11 (Apr. 2010), pp. 1201–1242. ISSN: 1532-4435.

[21] Corinna Cortes and Vladimir Vapnik. “Support-vector networks”. In: Machine learning 20.3 (1995), pp. 273–297.

[23] Christian Murphy. Metamorphic testing techniques to detect defects in applications without test oracles. Citeseer, 2010.

[24] Giannis Nikolentzos et al. “Kernel Graph Convolutional Neural Networks”. In: arXiv preprint arXiv:1710.10689 (2017).
