
Self-healing Middleware Support for Django Web Applications


Academic year: 2021


DEGREE PROJECT IN COMPUTER SCIENCE AND ENGINEERING, SECOND CYCLE, 30 CREDITS

STOCKHOLM, SWEDEN 2020

Self-healing Middleware Support for Django Web Applications

YI-PEI TU

KTH ROYAL INSTITUTE OF TECHNOLOGY


Master in Computer Science
Date: September 30, 2020
Supervisor: Long Zhang
Examiner: Martin Monperrus

School of Electrical Engineering and Computer Science


Abstract

Web frameworks are increasingly used to build modern websites, so their robustness and resilience are critical in production. Automatic software repair refers to software resolving its own bugs without developer involvement. A web framework equipped with automatic software repair can improve the resilience of a website at runtime.

In this thesis, we investigate the possibility of repairing common web framework errors, such as URL interpretation errors and database errors, with self-healing techniques at runtime. The common web framework errors are analyzed using 16 popular open-source web framework projects from Github. In addition, a self-healing architecture and self-healing techniques are implemented in the web framework to resolve exceptions at runtime. Flexibility and robustness are considered in the design of the self-healing techniques, which are evaluated with respect to effectiveness and performance overhead.


Sammanfattning (Swedish Abstract)

Using web frameworks to build modern websites is becoming increasingly common. The robustness and resilience of the web framework are therefore crucial in production. Automatic software repair is about the software resolving the bug automatically by itself. Web frameworks equipped with automatic software repair can improve the resilience of the website at runtime. This degree project investigates the possibility of repairing common web framework errors, such as URL interpretation errors and database errors, with self-healing techniques at runtime. The common web framework errors are analyzed using 16 popular open-source web framework projects from Github. In addition, the self-healing architecture and techniques are implemented in the web framework to resolve exceptions at runtime. Flexibility and robustness are taken into account in the design of the self-healing techniques. Metrics of effectiveness and performance overhead are considered when evaluating the self-healing techniques.

Contents

1 Introduction
  1.1 Motivation
  1.2 Problem Statement
  1.3 Contribution
  1.4 Ethics & Sustainability
  1.5 Outline
2 Background
  2.1 Web Applications
  2.2 Python Web Framework
  2.3 Middleware
  2.4 Automatic Software Repair
  2.5 Summary
3 Related Work on Self-healing Software
  3.1 Self-healing Techniques
  3.2 Resilience Engineering
  3.3 Summary
4 Empirical Study of Errors in Django Applications
  4.1 Protocol
    4.1.1 Target Projects
    4.1.2 Issue Analysis on Each Project
  4.2 Results
  4.3 Summary
5 Design and Implementation of Self-healing Middleware
  5.1 Design of Self-healing Middleware
    5.1.1 Architecture and Workflow
    5.1.2 Request Monitor
    5.1.3 Controller
    5.1.4 Strategy Executor
    5.1.5 Strategy Verifier
  5.2 Implementation
  5.3 Summary
6 Experiment Protocol on Self-healing Middleware
  6.1 Experiment Targets
  6.2 Protocols
  6.3 Summary
7 Experimental Results
  7.1 Effectiveness
  7.2 Performance Overhead
  7.3 Summary
8 Discussion
  8.1 RQ1: Django Common Errors
  8.2 RQ2: Self-healing Prototype
  8.3 RQ3: Effectiveness Evaluation
  8.4 RQ4: Overhead Evaluation
9 Conclusions
  9.1 Recapitulation
  9.2 Future Work
Bibliography
A Appendix
  A.1 Source Code & Raw Data

Chapter 1

Introduction

1.1 Motivation

Over the last few years, the complexity of web applications has increased dramatically in order to provide more dynamic behavior to web users. Because of this complexity, developers need a robust, integrated web framework, such as Django, to reduce development time. Django has become one of the most popular Python frameworks for website development and is used by a large number of products such as Instagram, Spotify, and YouTube [1]. Although developers benefit from a web framework during development, fixing bugs after the website is launched remains painful. Bugs may occur on the client side or on the server side, and severe back-end bugs can damage the website through data leaks, server crashes, SQL injection, etc. [2, 3, 4]. Self-healing is one way to improve the resilience of web applications.

Although various self-healing techniques for web applications have been presented for the front-end, the back-end, and the database, no literature addresses self-healing capabilities in the Django web framework. This thesis differs from related work because it investigates how self-healing techniques can be applied in middleware for Django web applications at runtime. The idea was inspired by the work of Durieux et al. [5], Fully Automated HTML and Javascript Rewriting for Constructing a Self-healing Web Proxy, which proposed a proxy to heal bugs in the front-end. In this thesis, we aim to design and implement a system that provides self-healing middleware support for Django web applications.

The challenges are to conduct an in-depth analysis of the errors of Django-built websites and to analyze possible solutions at runtime. Identifying Django's common errors is the first puzzle to solve before implementing the self-healing plugin. After the errors are fetched, analyzing and categorizing them is the second puzzle. Knowing the category of an error provides extra information, such as when the error is triggered and what causes it. To the author's knowledge, Django's common runtime errors have not been assessed with real projects.

In addition to self-healing software, failure-oblivious computing is considered as well. Failure-oblivious computing is a technique that detects memory errors and then ignores them, so that a fragile program overcomes the errors by itself and keeps running. This technique is also one of the self-healing strategies for enhancing software resilience. This research strives to implement a self-healing plugin that helps Django web developers heal back-end errors at runtime.

1.2 Problem Statement

This thesis aims to improve the resilience of Django web applications by adding different self-healing features. To evaluate the result of the thesis, the following research questions should be answered.

• RQ1: What are the common errors in the current Django web framework?

• RQ2: How to design and implement self-healing strategies for Django web applications?

• RQ3: How to evaluate the self-healing prototype with respect to effectiveness?

• RQ4: How to evaluate the self-healing prototype with respect to overhead?


1.3 Contribution

My contributions are:

• An empirical study of 16 popular Django projects on Github to study common exceptions in Django applications.

• Three self-healing strategies for common back-end errors, specifically designed to automatically recover from the most frequent errors in the Django web framework.

• The design and implementation of self-healing middleware for the Django web framework. The source code is publicly available for future research:

– Django issues crawler and analyzer: https://github.com/yipeitu/django_issues

– Django self-healing middleware: https://github.com/yipeitu/django_self-healing

– Evaluation of experiment code and data for reproducibility: https://github.com/yipeitu/django_self-healing/experiments

1.4 Ethics & Sustainability

The Django self-healing middleware and the Django error analysis tool are provided as open source; all results are available on Github and aim to be reproducible.

1.5 Outline

Chapter 2

Background

This chapter introduces the background needed to understand the later parts of the thesis. We briefly introduce web applications, Python web frameworks, middleware, and automatic software repair.

2.1 Web Applications

The World Wide Web hosts 1.6 billion websites [6] so far, in which people search for documents and other web resources. Every website is identified by its link address, a Uniform Resource Locator (URL). Users need a web browser to resolve URLs in order to locate and fetch web resources from the host server. Web browsers render the web resources as web pages and communicate with the server to update the page, for example by sending HTTP requests and streaming data. A web application is composed of a front-end and a back-end. The front-end presents the web application and is manipulated by users. The back-end, hosted on a server, returns the right response and data for each user request. Typically, modern website architecture is database-driven in order to manage growing amounts of data. In other words, a basic web application needs a front-end, a back-end, and a database to provide common usability and a good user experience.

Robustness, stability, resilience, and flexibility are vital for websites, especially for those serving millions of active users. The biggest issue for both users and developers is an unavailable server [7, 8]. Heavy traffic congestion or unexpected requests, such as hacking attacks, can make a server unavailable, and an unavailable server may lead to extra cost. In 2011, Egyptian web businesses lost 90 million dollars because the entire internet was shut down for five days [9]. However, there are many ways to prevent servers from becoming unavailable. One way to prevent a shutdown is to enhance server availability, and an essential means of achieving this is to monitor website activity and act on unexpected events [10]. Load balancing avoids request congestion so that servers remain available to serve clients. Auto-scaling can be applied as well, avoiding server overload by adding instances when needed. Fixing errors at runtime is also a strategy for improving server resilience without interrupting the server. Runtime recovery is another way to keep a server alive when it encounters errors at runtime. Apart from the server itself, a shutdown is sometimes caused by database errors. Data replication is a backup mechanism that ensures data availability by storing data at more than one site. Thus, server availability depends on database availability as well.

Because modern web applications are complex, a web application needs many components. The components are connected and transfer data among themselves to serve users. Once an error happens in one component, the whole web application may crash. Therefore, building a robust and stable website that does not go down is a complex challenge.

2.2 Python Web Framework

A web framework is a collection of packages and modules that allows developers to create web applications without dealing with low-level details such as internet protocols, session management, and multithreading/multiprocessing [11]. With the help of a web framework, developers can launch a website in a short time and focus on application development. Nowadays, Python is widely used in the web development industry because of its high readability and maintainability in the back-end. Python web frameworks are either full-stack or non-full-stack. A full-stack framework usually gives developers full support for common web components, such as web form generators, form validation, and templates, that are not provided by a non-full-stack framework [12]. Table 2.1 shows the three full-stack and the three non-full-stack frameworks with the most Github stars. It indicates that the full-stack framework Django is the most popular one.

Table 2.1: Python web frameworks

Full-stack    Github stars    Non-full-stack    Github stars
Django        43,998          Bottle            6,342
Web2py        1,794           CherryPy          1,053
TurboGears    652             Flask             2,277

Django is a Python web framework that was released in 2005. Fifteen years later, it dominates the Python web framework market. Django's ambition, to ease the creation of complex, database-driven websites, has led to its popularity. Its main features are reusability and pluggability of components and rapid development with less code. Django follows a model, view, and template architecture, also known as MVT. The model handles the data exchanged between the data model and the database. The view handles incoming requests, dispatches them, and generates the HTTP responses. The template combines the data with the mapped HTML file to generate the static page. Figure 2.1 shows how Django processes incoming requests and outgoing responses with its model, view, and template:

1) The incoming HTTP request passes through the middlewares to the URL resolver.
2) The URL resolver forwards the request to the matched view function.
3) The matched function reads or writes data via the model to access the database.
4) The view sends the data to the template, which wraps the data.
5) The rendered template is sent back as the response and passes through the middlewares back to the user.
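The five steps above can be illustrated without Django itself. The following is a minimal, framework-free sketch in plain Python. The names `ARTICLES`, `URLCONF`, `article_view`, `render_template`, and `logging_middleware` are hypothetical stand-ins for Django's model layer, URL configuration, view, template engine, and middleware; they are assumptions for illustration and not Django APIs.

```python
from string import Template

# "Model": stand-in for database access (hypothetical in-memory table).
ARTICLES = {1: {"title": "Hello", "body": "First post"}}


def get_article(article_id):
    return ARTICLES[article_id]


# "Template": combines data with markup to produce the page.
ARTICLE_TEMPLATE = Template("<h1>$title</h1><p>$body</p>")


def render_template(template, context):
    return template.substitute(context)


# "View": handles the request, reads data via the model, renders the template.
def article_view(request, article_id):
    article = get_article(article_id)
    return {"status": 200, "body": render_template(ARTICLE_TEMPLATE, article)}


# URL configuration: maps a path prefix to a view.
URLCONF = {"/articles/": article_view}


def resolve(path):
    for prefix, view in URLCONF.items():
        if path.startswith(prefix):
            return view, int(path[len(prefix):].strip("/"))
    raise LookupError(f"no view for {path}")


def handle(request):
    view, article_id = resolve(request["path"])
    return view(request, article_id)


# Middleware wraps the handler and sees both the request and the response.
def logging_middleware(get_response, log):
    def middleware(request):
        log.append(("request", request["path"]))
        response = get_response(request)
        log.append(("response", response["status"]))
        return response
    return middleware


log = []
app = logging_middleware(handle, log)
response = app({"path": "/articles/1/"})
```

The request enters through the middleware, is resolved to a view, reads from the model, is rendered through the template, and leaves through the same middleware, which mirrors the order of steps 1) to 5).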

2.3 Middleware

Middleware is software that provides services between software applications and the operating system. Middleware makes it easier for developers to implement communication and input/output for the different purposes of their applications [13]. In other words, once middleware knows the input and output, it can communicate with different systems. Thus, middleware can be independent of the system and easily reused. In web development, middleware plays a vital role in applications.

Django middlewares show how components can be reused and plugged in. Django supports middlewares that hook into different stages of request processing and fulfill customized functions. Moreover, developers can write custom middleware as well. Django provides developers built-in

Figure 2.1: Django workflow

Figure 2.2: Django middleware
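As an illustration of how a middleware can hook into request processing, the sketch below follows the shape of a Django middleware class: it is constructed with `get_response`, is callable with a request, and offers a `process_exception(request, exception)` hook that Django invokes when a view raises. The class imports nothing from Django so that it runs on its own. The driver `make_handler`, the dictionary-based requests and responses, and the fallback policy for `KeyError` are hypothetical simplifications, and the sketch is not the middleware implemented in this thesis.

```python
class SelfHealingMiddleware:
    """Shaped like a Django middleware class; runs without Django."""

    def __init__(self, get_response):
        self.get_response = get_response
        self.healed = []  # names of the exceptions that were handled

    def __call__(self, request):
        return self.get_response(request)

    def process_exception(self, request, exception):
        # Returning a response replaces the error page;
        # returning None lets the default error handling proceed.
        if isinstance(exception, KeyError):
            self.healed.append(type(exception).__name__)
            return {"status": 200, "body": "fallback page"}
        return None


def make_handler(view, middleware_cls):
    """Hypothetical, simplified stand-in for Django's request handler."""

    def get_response(request):
        try:
            return view(request)
        except Exception as exc:
            response = middleware.process_exception(request, exc)
            if response is None:
                return {"status": 500, "body": "server error"}
            return response

    middleware = middleware_cls(get_response)
    return middleware


def broken_view(request):
    # Raises KeyError when the "page" parameter is missing.
    return {"status": 200, "body": request["params"]["page"]}


handler = make_handler(broken_view, SelfHealingMiddleware)
ok = handler({"params": {"page": "home"}})
healed = handler({"params": {}})
```

In a real Django project, such a class would be listed in the `MIDDLEWARE` setting, and Django itself would play the role of `make_handler`.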

2.4 Automatic Software Repair

Automatic software repair is a field in which software bugs are fixed automatically, without involvement from developers. Monperrus [15] gives an overview of all kinds of automatic repair in Automatic Software Repair: a Bibliography, which presents two main families of automatic repair techniques together with the related research so far. One is behavioral repair, and the other is state repair, also known as runtime repair or runtime recovery. When a program is under repair, behavioral repair modifies its behavior, either offline or online at runtime, while state repair changes its state. Unlike behavioral repair, state repair can only be done at runtime.

adding visual information to find possible repair solutions [22] has been proposed to repair bugs and discover solutions for web test suites.

Self-healing is one of the state repair strategies. Self-healing software must have two features [23]: 1) the software system must not fail, and 2) if it does fail for some reason, it must be able to recover properly from the failure. Self-healing components can be designed with the following steps to build robust, stable, and flexible software [24]. First, a self-healing component should be capable of detecting the faulty object inside itself. Second, the inter-component and intra-component settings should be reconfigured before and after the object is repaired. Third, after the repair, the healed object must be tested to make sure that the healing strategy worked. Self-healing systems may be combined with machine learning to repair the system according to different computing architectures and environments and to avoid human intervention. Sometimes human involvement is useful, although it is also costly and may introduce potential problems [25]. The variety of self-healing software reveals the complexity of repairing a faulty object at runtime [26]. Overall, self-healing strategies vary with the application, the purpose, and the time of occurrence. We address more self-healing strategies in chapter 3.

2.5 Summary

Chapter 3

Related Work on Self-healing Software

As mentioned in section 2.4, self-healing belongs to state repair, which means healing the software automatically at runtime. Automatic software repair is a popular research field, and web-focused self-healing has recently become much more common. Because of the complexity of modern websites, web self-healing technologies can be classified into three topics: front-end, back-end, and database. Each topic applies different self-healing strategies, customized to its circumstances. The following paragraphs introduce the work related to this thesis.

Front-end self-healing strategies should consider the role of the front-end, which is to show a functional page to users. In other words, self-healing for the front-end aims to prevent browsers from hitting compilation failures and showing errors to users. Front-end errors usually have one of two leading causes: unexpected operations from users, or a wrong response from the back-end [27].

3.1 Self-healing Techniques

The following works address self-healing in the front-end. Carzaniga et al. [28] present a browser extension that automatically executes workarounds for buggy web applications at runtime. The extension is application-independent and rewrites the faulty JavaScript program using three categories of operations. Functionally null operations are operations without functional effects. Invariant operations are pairs of opposite operations, such as an add followed by a remove. Alternative operations are different sets of operations that lead to the same result. Nguyen et al. [29] show that detecting and fixing code smells on the server side is challenging, and they propose a tool, WebScent, to detect errors in the generated web applications. WebScent detects potential errors and then locates them by mapping the client-side code to the server-side locations in which it is embedded. Although WebScent is not a self-healing tool, it shows a way to detect possible errors in the front-end. Carzaniga et al. [30] proposed an automatic self-healing approach that uses the intrinsic redundancy of applications to fix or tolerate errors. Durieux et al. [5] proposed an HTTP self-healing proxy, BikiniProxy, that repairs buggy HTML and JS code so that front-end applications still work even if the back-end sends an erroneous response. BikiniProxy is composed of a stateless HTTP proxy that intercepts the requests between client and server, a JS/HTML rewriter that contains the self-healing strategies, and a monitoring and self-healing back-end that stores known errors and the self-healing strategies that successfully fixed each error.

Self-healing in the back-end focuses on fast recovery. Brown et al. [31] proposed recovery-oriented computing, which focuses on recognizing unexpected failures and recovering the server. Nguyen et al. [32] proposed a framework that analyzes client-side code in PHP and builds call graphs for the embedded HTML, CSS, and JS based on the analysis. Alimadadi et al. [33] implemented a tool, SAHAND, that captures the execution of full-stack JavaScript applications to help developers understand programs whose asynchronous interactions are spread over the client and the server. Both Nguyen et al. and Alimadadi et al. demonstrate the concept of capturing the runtime execution of full-stack web applications, which opens a new way to inject self-healing tools for fixing web application runtime errors.

approach that could support both static and online recovery. Online recovery can accept new transactions while the database is under recovery. The approach repairs only the transactions that have been affected by malicious transactions. Nijjar et al. [36] proposed an automatic system to repair flawed data models of web applications. They propose a set of data property patterns and several heuristics to identify the properties of data models. To verify the inferred properties, bounded and unbounded verification techniques are applied. In the end, the system repairs the faulty data model based on the verified inferences.

Different healing strategies have been proposed for the self-healing techniques above. Runtime recovery/repair is one of them and aims to fix the system at runtime. Candea et al. [37] proposed JAGR, an autonomous self-recovering application server that recovers quickly from downtime without modifying applications. Lewis et al. [38] proposed a distributed, event-driven runtime software-fault monitor that automatically repairs faulty states, which makes the system's specification enforceable at runtime. The authors indicate that this architecture makes the software more reliable and keeps its execution within specification. The architecture is composed of a rule engine, a software-fault monitor, and an event analyzer for the system under test. The rule engine repairs the faulty events and sends them back to the system. Gu et al. [39] use exception handling to recover automatically from unexpected errors at runtime with two strategies: error transformation and early return. Error transformation changes the unanticipated error into another type of error for which the caller has an appropriate handler. Early return ignores the error and returns to the caller with a default value to prevent unhandled errors.
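The two strategies described last, error transformation and early return, can be sketched with Python decorators. This is only an illustration of the idea under stated assumptions, not the implementation of Gu et al. [39]; the decorator names `early_return` and `transform_error` and the example functions are hypothetical.

```python
import functools


def early_return(default, catch=Exception):
    """Ignore an unexpected error and return a default value to the caller."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except catch:
                return default
        return wrapper
    return decorator


def transform_error(source, target):
    """Re-raise an unanticipated error as a type the caller already handles."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except source as exc:
                raise target(str(exc)) from exc
        return wrapper
    return decorator


@early_return(default=0, catch=ZeroDivisionError)
def ratio(a, b):
    return a / b


@transform_error(KeyError, ValueError)
def lookup(table, key):
    return table[key]


def caller(table, key):
    try:
        return lookup(table, key)
    except ValueError:  # the handler the caller already has
        return "unknown"
```

Early return trades correctness of the single result for continued execution, whereas error transformation keeps the failure visible but routes it to an existing handler.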

3.2 Resilience Engineering

Resilience is the ability of incomplete applications to recover from specific failures and to keep providing service from the user's viewpoint. Resilience engineering provides systematic ways to evaluate and improve a system's resilience. Wang et al. [42] proposed a flexible web framework that handles errors by automatically generating a fault handling strategy for failed services. The system includes an exception analyzer, a decision-maker, and a strategy selector; the accuracy of the automated fault handling strategy is over 95% with acceptable overhead. The system uses system logs to analyze exceptions. The decision-maker constructs the fault handling decision with k-means clustering. The strategy selector uses an integer program solver to find the solution. Zhang et al. [43] present a resilience improvement system for Java applications equipped with automated monitoring, automated perturbation injection, and automated resilience improvement through failure-oblivious computing. Zhang et al. [44] proposed ChaosMachine to analyze the exception-handling capabilities of Java programs in production, at the level of try-catch blocks. Chaos engineering is an active technique that evaluates software resilience and discovers flaws by injecting perturbations in production. ChaosMachine shows how to interfere with software and discover its weaknesses in production.

3.3 Summary

Chapter 4

Empirical Study of Errors in Django Applications

To answer RQ1, stated in section 1.2, an empirical analysis is conducted on open-source Django projects. The method of crawling and collecting possible errors for analysis is inspired by [5]. To the author's knowledge, this is the first study of its kind. By analyzing the most common Django errors in open-source projects, the dataset informs the design and implementation of the self-healing strategies in chapters 5 and 6. As mentioned in section 1.1, Django's common runtime errors have not been measured with real projects. This chapter provides a way to define Django's common errors.

4.1 Protocol

This section describes the protocols for selecting Django projects and analyzing the collected issues, and introduces the Django exceptions.

4.1.1 Target Projects

We collect open-source Django projects according to the following criteria: 1) the number of stars, 2) the latest commit is in or after January 2017, 3) the main programming language is Python, 4) the documentation is in English, 5) the number of issues, and 6) the reproducibility of the project. Using the search bar on Github with Django as the keyword, we find the matching open-source projects. The results are sorted by number of stars, since the star count indicates how successful a project is. Besides the star count, the main programming language should be Python, since Django is based on Python, and the project labels should include at least Python and Django. The latest commit is also considered and should be in or after January 2017. The project language should be English, since English-speaking developers are still the majority in open-source projects. The number of issues is taken into account because issues provide the needed information, such as error messages and call stacks, which is valuable for error analysis.
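The machine-checkable part of these criteria can be expressed as a small filter. The sketch below is an assumption-laden illustration: the `Repo` record, the sample repositories, and the thresholds `min_stars` and `min_issues` are hypothetical, since the thesis does not state numeric cut-offs, and the language of the documentation and reproducibility still require manual inspection.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Repo:
    name: str
    stars: int
    language: str
    last_commit: date
    topics: tuple
    issues: int


def select_projects(repos, min_stars, min_issues, cutoff=date(2017, 1, 1)):
    """Keep Python/Django repositories that are active and have enough issues."""
    chosen = [
        r for r in repos
        if r.language == "Python"
        and {"python", "django"} <= set(r.topics)
        and r.last_commit >= cutoff
        and r.stars >= min_stars
        and r.issues >= min_issues
    ]
    # Most-starred projects first, mirroring the sort order used in the search.
    return sorted(chosen, key=lambda r: r.stars, reverse=True)


# Hypothetical sample data, not the projects studied in the thesis.
repos = [
    Repo("forum", 1500, "Python", date(2018, 7, 1), ("python", "django", "forum"), 120),
    Repo("shop", 5000, "Python", date(2019, 5, 1), ("python", "django"), 400),
    Repo("old-blog", 3000, "Python", date(2016, 3, 1), ("python", "django"), 250),
    Repo("js-admin", 4000, "JavaScript", date(2019, 1, 1), ("django",), 300),
]
selected = select_projects(repos, min_stars=300, min_issues=30)
```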

4.1.2 Issue Analysis on Each Project

Normally, issues in a GitHub repository can be labeled as ideas, enhancements, tasks, or bugs. Apart from the label, a user who creates an issue labeled as a bug usually provides as much relevant content as possible so that the repository developers can reproduce and fix the error. This content is valuable input not only for assessing common errors but also for reproducing them. Besides the issue content itself, users and developers can discuss the issue in comments, which also contain useful information for the analysis. A closed issue means either that the problem has been solved or that it cannot be reproduced and fixed; even when an issue is not reproducible or fixable, its error still counts when assessing common errors. For these reasons, fetching issues through the Github web APIs from public Django projects with many followers, measured by the number of stars, is a sound way to collect common Django errors.

After the errors are collected, they must be classified in order to assess how frequently each occurs. The collected errors are assigned to categories, and we count how many times each category occurs among all errors. A systematic and explainable classification is therefore necessary. The official Django documentation [45] defines five categories of Django exceptions based on failure scenarios:

• Core Exceptions are raised when a configuration setting, the data model, validation, or web security triggers an exception.

• URL Resolver Exceptions are raised when a URL defined in a template cannot be resolved while the template is rendered into HTML.

• Http Exceptions are raised if a user cancels an upload.

• Database Exceptions are raised if any standard database exception is raised.

• Transaction Exceptions are raised if any error related to database transactions occurs.

Under each exception category, Django defines sub-exceptions with more specific triggering conditions. In total, there are 27 distinct sub-exceptions under the five categories.

To classify the errors, the names of these exceptions are used as keywords to filter Django runtime errors from issue titles and issue descriptions. As soon as a keyword matches any text in the title or description, the filtering script labels the issue as a Django error. If an issue contains more than one keyword, it is counted once per keyword, so each additional match is labeled as another Django error. Besides, we only take runtime exceptions into account, since the thesis aims to repair the website at runtime. Therefore, runtime exceptions are the primary concern while analyzing common errors.
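The keyword-based filtering can be sketched as follows. This is a minimal illustration and not the thesis's crawler: the keyword table lists only a subset of Django's documented exception names, and the issue dictionaries with `title` and `body` keys are hypothetical sample data.

```python
from collections import Counter

# Subset of Django exception names mapped to their documentation category.
EXCEPTION_KEYWORDS = {
    "NoReverseMatch": "URL Resolver Exceptions",
    "Resolver404": "URL Resolver Exceptions",
    "ValidationError": "Core Exceptions",
    "OperationalError": "Database Exceptions",
    "IntegrityError": "Database Exceptions",
    "DatabaseError": "Database Exceptions",
    "TransactionManagementError": "Transaction Exceptions",
}


def classify_issue(issue):
    """Return every exception keyword found in the issue title or description."""
    text = f"{issue['title']} {issue['body']}"
    return [name for name in EXCEPTION_KEYWORDS if name in text]


def count_errors(issues):
    """Count one Django error per (category, exception) keyword match."""
    counts = Counter()
    for issue in issues:
        for name in classify_issue(issue):
            counts[(EXCEPTION_KEYWORDS[name], name)] += 1
    return counts


# Hypothetical issues for illustration.
issues = [
    {"title": "NoReverseMatch on login page",
     "body": "Traceback ... IntegrityError: duplicate key"},
    {"title": "Typo in README", "body": "Fix spelling"},
    {"title": "Crash on startup",
     "body": "django.db.utils.OperationalError: no such table"},
]
counts = count_errors(issues)
```

The first sample issue matches two keywords and therefore contributes two errors, which reflects the counting rule described above.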

Core Exceptions

Under Core Exceptions, there are 14 sub-exceptions, such as the process pre-loading exception (AppRegistryNotReady), database querying exceptions (ObjectDoesNotExist, EmptyResultSet, FieldDoesNotExist, MultipleObjectsReturned), suspicious operations (SuspiciousOperation), permission denial (PermissionDenied), configuration errors (ViewDoesNotExist, MiddlewareNotUsed, ImproperlyConfigured), and validation errors. The Core Exceptions cover multiple aspects of Django. Not all of them are triggered while users are using the website; process pre-loading and configuration errors, for example, are not. Since the Core Exceptions include both runtime and non-runtime errors, the category is kept for classification.

URL Resolver Exceptions

URL Resolver Exceptions have 2 sub-exceptions: Resolver404 and NoReverseMatch. Resolver404 means that the requested path does not map to a view. NoReverseMatch means that Django failed to reverse a URL pattern, for example one referenced in a template that is to be rendered to the user, because no pattern matches the supplied name and parameters.

Http Exceptions

Http Exceptions are raised when HTTP-related errors occur, for example when the user cancels an upload or a page is not found. Since Http Exceptions are raised at runtime, they are also a category considered for classification.

Database Exceptions

Database Exceptions are the standard database exceptions, wrapped by Django code so that developers can handle them. When the database is not available, Django raises the sub-exception DatabaseError. When the user tries to insert a null or duplicated key into the database, Django raises the sub-exception IntegrityError. If a data table does not exist in the database, Django raises the sub-exception OperationalError. All errors related to the standard database errors are caught as Database Exceptions. Since database operations are usually triggered while users are browsing the website, Database Exceptions are a category considered for classification.
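The standard database exceptions that Django wraps follow the Python DB-API hierarchy, so the same exception names can be observed with the standard-library `sqlite3` module alone. The sketch below reproduces the duplicate-key, null-key, and missing-table scenarios from the paragraph above; the table layout and the function `observe_database_errors` are hypothetical, and in a Django project the corresponding errors would surface as the wrapped exceptions in `django.db`.

```python
import sqlite3


def observe_database_errors():
    """Trigger the scenarios described above and record the exception names."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
    )
    conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")

    statements = [
        "INSERT INTO users (email) VALUES ('a@example.com')",  # duplicated key
        "INSERT INTO users (email) VALUES (NULL)",             # null value
        "SELECT * FROM missing_table",                          # table does not exist
    ]
    seen = []
    for sql in statements:
        try:
            conn.execute(sql)
        except sqlite3.DatabaseError as exc:
            # IntegrityError and OperationalError both derive from DatabaseError.
            seen.append(type(exc).__name__)
    conn.close()
    return seen


observed = observe_database_errors()
```

Catching the common base class keeps a single handler for all three scenarios while the concrete subclass still identifies which repair strategy would apply.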

Transaction Exceptions

Transaction Exceptions are raised when errors related to database transactions occur. The difference between Database Exceptions and Transaction Exceptions is that transaction-related errors are handled by Transaction Exceptions, while the remaining standard database-related errors belong to Database Exceptions. Since transactions happen while users are using the website, Transaction Exceptions are also a category considered for classification.

4.2 Results

Table 4.1: 16 open-source Django projects with Github stars, lines of code of the entire project, commits, and closed issues, accessed on 2019.08. See A.2 for project links.

Github name                    Github stars   Lines of code   Commits   Closed/total issues
awx                            6,297          320,160         26,448    4,082/4,682 (87.18%)
saleor                         5,150          135,493         16,756    4,524/4,599 (98.36%)
taiga-back                     4,845          77,752          3,931     1,254/1,460 (85.89%)
django-oscar                   3,860          74,762          8,414     3,147/3,254 (96.71%)
django-rest-framework-jwt      2,647          2,058           369       329/492 (66.86%)
django-react-redux-base        2,474          3,530           134       106/125 (84.8%)
django-jet                     2,265          19,304          891       189/457 (41.35%)
healthchecks                   1,934          41,053          1,410     345/404 (85.39%)
django-shop                    1,920          22,883          5,370     753/826 (91.16%)
django-blog-zinnia             1,897          25,576          3,155     540/569 (94.9%)
bootcamp                       1,757          16,862          1,977     222/236 (94.06%)
django-rest-auth               1,723          3,334           517       409/622 (65.75%)
Misago                         1,659          165,260         5,160     1,276/1,341 (95.15%)
Spirit                         915            41,576          740       238/280 (85%)
django-realworld-example-app   668            1,163           18        15/39 (38.46%)
drum                           374            11,686          228       51/56 (91.07%)
Total                                                                   17,480/19,442 (89.9%)

Based on the projects in Table 4.1, we analyze the common errors following the error classification protocol. Table 4.2 summarizes 5 exception categories, 27 sub-exceptions, and their frequency of incidence. In total, there are 194 Django runtime errors with 16 sub-exceptions. The discovered exception types are 16 out of 27 Django exceptions. The filtering script excludes three non-runtime sub-exceptions and one general exception, since the thesis aims to repair runtime Django errors with self-healing strategies. The top 6 sub-exceptions cover 193 out of 265 errors, which is around 72.8%. In order to enhance Django's resilience during runtime, the top 6 exceptions should be addressed.

1. URL Resolver Exceptions-NoReverseMatch: 44 / 265 (16.6%)
2. Database Exceptions-OperationalError: 42 / 265 (15.85%)
3. Core Exceptions-ValidationError: 38 / 265 (14.33%)
4. Database Exceptions-IntegrityError: 28 / 265 (10.57%)
5. Database Exceptions-DatabaseError: 24 / 265 (9.05%)


Table 4.2: Number of Django runtime exceptions in the 16 Django projects (see footnote 3)

| Category | Exception | Number of errors | Percentage |
|---|---|---|---|
| Core Exceptions | ObjectDoesNotExist | 14 | 5.28% |
| | EmptyResultSet | 0 | 0% |
| | FieldDoesNotExist | 2 | 0.75% |
| | MultipleObjectsReturned | 13 | 4.9% |
| | SuspiciousOperation | 1 | 0.37% |
| | PermissionDenied | 6 | 2.26% |
| | ViewDoesNotExist | 0 | 0% |
| | FieldError | 16 | 6.03% |
| | ValidationError | 38 | 14.33% |
| | NON_FIELD_ERRORS | 17 | 6.4% |
| URL Resolver Exceptions | Resolver404 | 1 | 0.37% |
| | NoReverseMatch | 44 | 16.6% |
| Http Exceptions | UnreadablePostError | 0 | 0% |
| Database Exceptions | InterfaceError | 4 | 1.51% |
| | DatabaseError | 24 | 9.05% |
| | DataError | 11 | 4.15% |
| | OperationalError | 42 | 15.85% |
| | IntegrityError | 28 | 10.57% |
| | InternalError | 2 | 0.75% |
| | NotSupportedError | 1 | 0.37% |
| | models.ProtectedError | 0 | 0% |
| Transaction Exceptions | TransactionManagementError | 1 | 0.37% |
| Total | | 265 | 100% |

6. Core Exceptions-NON_FIELD_ERRORS: 17 / 265 (6.4%)

Within the top 6, the database exceptions cover 35.47% of the total number of errors, and the non-database exceptions cover 37.33%. This indicates that database exceptions occur nearly as often as non-database exceptions.
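The shares above can be recomputed from the counts in Table 4.2. The following is a minimal sketch for illustration only; the variable names are ours. Percentages recomputed from the raw counts can differ in the last digit from sums of the rounded table percentages.

```python
# Non-zero error counts per sub-exception, copied from Table 4.2.
counts = {
    "ObjectDoesNotExist": 14, "FieldDoesNotExist": 2,
    "MultipleObjectsReturned": 13, "SuspiciousOperation": 1,
    "PermissionDenied": 6, "FieldError": 16, "ValidationError": 38,
    "NON_FIELD_ERRORS": 17, "Resolver404": 1, "NoReverseMatch": 44,
    "InterfaceError": 4, "DatabaseError": 24, "DataError": 11,
    "OperationalError": 42, "IntegrityError": 28, "InternalError": 2,
    "NotSupportedError": 1, "TransactionManagementError": 1,
}

# Sub-exceptions that belong to the Database Exceptions category.
database = {
    "InterfaceError", "DatabaseError", "DataError", "OperationalError",
    "IntegrityError", "InternalError", "NotSupportedError",
}

total = sum(counts.values())
# Six most frequent sub-exceptions, highest count first.
top6 = sorted(counts.items(), key=lambda item: item[1], reverse=True)[:6]
top6_total = sum(n for _, n in top6)
top6_db = sum(n for name, n in top6 if name in database)
top6_other = top6_total - top6_db

print(f"top 6: {top6_total}/{total} = {100 * top6_total / total:.1f}%")
print(f"database share within top 6: {100 * top6_db / total:.2f}%")
print(f"non-database share within top 6: {100 * top6_other / total:.2f}%")
```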

4.3

Summary

In this chapter, we fetched the Django errors from popular open-source Django projects. We also analyzed the most common errors together with their causes and the time at which they are triggered. With this dataset of common Django errors and their possible causes, the design and implementation of the self-healing middleware in the next chapter can build on the empirical study.

Footnote 3. Non-runtime exceptions: Core Exceptions - AppRegistryNotReady, Core Exceptions - MiddlewareNotUsed, Core Exceptions - ImproperlyConfigured, Database Exceptions - ProgrammingError.


Chapter 5

Design and Implementation of

Self-healing Middleware

This chapter introduces the design and implementation of the self-healing middleware for the Django web framework in order to answer the research questions listed in section 1.2. The chapter has three sections, covering the architecture, the implementation, and a summary. As Figure 2.1 shows, the Django framework architecture provides a feasible mechanism, middleware, for repairing errors during runtime. Figure 2.2 shows the flexibility and extensibility of self-defined Django middleware. All incoming requests and outgoing responses pass through the middlewares. Therefore, a self-healing middleware is a workable way to implement self-healing strategies. The thesis proposes three strategies to repair runtime errors for different exceptions.

5.1

Design of Self-healing Middleware

According to the Django framework architecture introduced in section 2.3, the self-healing middleware is placed last in the middleware order, after the official middlewares.

5.1.1

Architecture and Workflow

Figure 5.1: Self-healing middleware architecture: The yellow boxes stand for the core components of Django. The green boxes stand for the standard components for Django. The white boxes stand for the provided self-healing strategies.

Figure 5.1 presents the overall architecture of the self-healing middleware. It has four components: Request Monitor, Controller, Strategy Executor, and Strategy Verifier. The Request Monitor monitors exceptions between the incoming requests and the outgoing responses. The Controller is responsible for dispatching the exception to the matching strategy inside the Strategy Executor according to the exception type. The Strategy Executor fixes the runtime exception. The Strategy Verifier verifies whether the strategy has resolved the exception.

Figure 5.2 shows the workflow of the time-triggered self-healing middleware and how it resolves an exception during runtime. The HTTP request is the starting point on the left side of the figure. Figure 2.1 shows the request and response processing flow of the Django framework. The Request Monitor listens to all events in order to catch exceptions. Once the Request Monitor catches an exception, it sends the exception to the Controller. The Controller checks the status of the self-healing middleware. If it is ready for a self-healing operation, the Controller dispatches the exception to the Strategy Executor. Once the Strategy Executor has resolved the exception, it sends the request to the Strategy Verifier for verification. The Strategy Verifier sends the resolved request back to the first middlewares, so that it goes through the whole request process again for verification.

5.1.2

Request Monitor


Figure 5.2: Self-healing middleware workflow if an exception is raised. The green sections represent Django defined modules that must exist. The blue sections are self-healing middlewares.

The metrics recorded for a caught exception are: 1) the request header and body, such as URL and timestamp, 2) the exception type and message, 3) the original response code, and 4) the exception call stack. The request monitor catches all exceptions because it sits right before the HTTP response (see Figure 2.1). As soon as the monitor detects an exception, it passes the exception to the controller. Otherwise, the monitor lets the response pass back to the user (see Figure 5.2).
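The four groups of metrics can be held in a single record per caught exception. The following is a minimal, framework-independent sketch; the names ExceptionRecord and record_exception are hypothetical and are not taken from the thesis implementation.

```python
import traceback
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ExceptionRecord:
    """One monitored exception (hypothetical structure)."""
    url: str               # 1) request data
    method: str
    body: bytes
    exception_type: str    # 2) exception type and message
    message: str
    original_status: int   # 3) response code before any repair
    call_stack: str        # 4) formatted exception call stack
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))


def record_exception(url, method, body, exc, original_status=500):
    """Build a record from a caught exception and basic request data."""
    stack = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__))
    return ExceptionRecord(
        url=url,
        method=method,
        body=body,
        exception_type=type(exc).__name__,
        message=str(exc),
        original_status=original_status,
        call_stack=stack,
    )
```

A record of this kind is what the monitor would hand over to the controller.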

5.1.3

Controller


and the response is sent back to users. Otherwise, when an exception is triggered during request processing, the controller dispatches the exception to the matching strategy in the strategy executor. If that strategy has already been applied to the current exception, the controller dispatches the exception to the default strategy. Figure 5.2 shows that all exceptions during runtime can be caught, except those raised outside the white box of the Request Monitor.
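The dispatch rule described above can be sketched as follows. This is an illustration under our own assumptions, not the thesis code: the Controller class, the request_id argument, and the bookkeeping with a set are hypothetical.

```python
class Controller:
    """Chooses a strategy for a caught exception (illustrative sketch)."""

    def __init__(self, strategies, default_strategy):
        # strategies maps an exception class name to a strategy callable.
        self.strategies = strategies
        self.default_strategy = default_strategy
        # Remembers which (request, exception type) pairs were already tried.
        self._applied = set()

    def dispatch(self, request_id, exc):
        name = type(exc).__name__
        key = (request_id, name)
        strategy = self.strategies.get(name)
        # No matching strategy, or it was already applied to this exception:
        # fall back to the default strategy.
        if strategy is None or key in self._applied:
            return self.default_strategy
        self._applied.add(key)
        return strategy
```

For example, the first `dispatch` call for a given request and exception type returns the matching strategy, and a second call for the same pair returns the default strategy.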

5.1.4

Strategy Executor

We currently provide three strategies under the strategy executor as a prototype to show the feasibility of self-healing middleware during runtime. The following sections introduce the three strategies separately. Because the strategy executor is flexible, it is easy to extend or replace a strategy based on developers' needs.

Strategy 1: URLResolveError

These errors happen when Django cannot resolve a project URL pattern inside a template. Figure 5.3 indicates how the exception is triggered in the request processing workflow. When Template includes a project URL inside the HTML, Django needs to interpret the project URL into the right HTML format before the response is sent back to the user. Project Conf. lists all URL paths inside the project. Each application can define its unique namespace and URL paths under the namespace.


Figure 5.3: URLResolveError-NoReverseMatch is triggered in this workflow

The namespace can be used to distinguish identical URL names under different applications. If the URL name cannot be found under the declared namespace, the exception is triggered. 6) Incorrect Regular Expression. As mentioned, parameters can be declared for a URL path in URLs. More complicated parameters can also be declared with regular expressions. When the data passed with the URL name in Template does not match the defined regular expression, the exception is triggered. So far, we have listed all the possible causes of URLResolveError-NoReverseMatch.

The six possible causes can be grouped into two categories based on their solution. Template Modification means the solution is to modify the template during runtime. It permanently overrides the current template with the right URL pattern, so the error does not occur the next time the same request arrives. Configuration Modification means the solution is to modify the configuration during runtime. It modifies the configuration that has been loaded in memory, so it benefits the following requests as well. However, the risk of Configuration Modification is that the in-memory modification is lost when the server restarts or is upgraded.


Figure 5.4: DatabaseError-IntegrityError is triggered in this workflow

the URL name cannot be found, self-healing middleware replaces the wrong URL with a workable one in Template. The solution for 5) is to rename the namespace in Project Conf. according to the namespace in Template. However, the solutions for 4) and 5) are risky as well, because the changes may not be 100% correct for the website.
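Template Modification for an unknown URL name can be illustrated on a template string. The thesis does not state how the workable URL is chosen, so the closest-name heuristic below (via difflib) is purely our assumption, and repair_url_names is a hypothetical helper. The sketch only rewrites a string in memory and only handles single-quoted names in `{% url %}` tags.

```python
import difflib
import re

# Matches the start of a Django url tag with a single-quoted name,
# e.g. {% url 'blog:entry_detail'
URL_TAG = re.compile(r"\{%\s*url\s+'([^']+)'")


def repair_url_names(template_source, known_names):
    """Replace unknown URL names by the closest known name, if any."""

    def fix(match):
        name = match.group(1)
        if name in known_names:
            return match.group(0)      # already valid, keep as is
        candidates = difflib.get_close_matches(name, known_names, n=1)
        if not candidates:
            return match.group(0)      # nothing similar, leave untouched
        return match.group(0).replace(name, candidates[0], 1)

    return URL_TAG.sub(fix, template_source)


known = ["blog:entry_detail", "blog:entry_list"]
print(repair_url_names("{% url 'blog:entry_detial' pk=1 %}", known))
```

As the text above notes, such a replacement is risky: the closest name is not guaranteed to be the URL the developer intended.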

Strategy 2: DatabaseError

DatabaseError-IntegrityError happens for three reasons: 1) Duplicated key. The inserted data has a key that duplicates an existing one in the database. 2) Null value. One of the inserted data fields is null although the column does not allow null. 3) Foreign key constraint violation. A foreign key links two tables by using the primary key of the parent table as a foreign key in the child table. The foreign key constraint prevents the links between tables from being destroyed. It also prevents invalid values from being inserted into the column, since the value must be a primary key in the parent table. Figure 5.4 shows when DatabaseError-IntegrityError is triggered. A Model in Django is an object that is mapped to the Database. Once a model has been created, Django creates the corresponding table in the Database according to the fields and information of the Model. The Model is thus the interface through which developers create, read, update, and delete data in the Database.
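The three causes can be reproduced at the database-driver level with Python's standard sqlite3 module, without Django. This is an illustrative sketch with a made-up schema (tables author and entry); in a Django project the driver's error would surface wrapped as Django's own IntegrityError.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE entry (id INTEGER PRIMARY KEY, "
    "author_id INTEGER NOT NULL REFERENCES author(id))")
conn.execute("INSERT INTO author (id, name) VALUES (1, 'alice')")

# One failing statement per cause listed above.
statements = {
    "duplicated key": "INSERT INTO author (id, name) VALUES (1, 'bob')",
    "null value": "INSERT INTO author (id, name) VALUES (2, NULL)",
    "foreign key": "INSERT INTO entry (id, author_id) VALUES (1, 99)",
}

errors = {}
for cause, sql in statements.items():
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError as exc:
        errors[cause] = str(exc)   # driver message naming the constraint

for cause, message in errors.items():
    print(f"{cause}: {message}")
```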


database. If the database migration fails to fix the error, the self-healing middleware takes users to an error page with a readable error message, which is better than showing HTTP status code 500 to the user.

Strategy 3: Graceful-degradation

The last strategy, Graceful-degradation, catches all unhandled and unfixable exceptions during runtime and returns a useful page to the users. These exceptions may include the errors of Strategy 1: URLResolveError and Strategy 2: DatabaseError if the errors have not been fixed by those strategies. Graceful-degradation offers three options: 1) Redirect users to the home page, 2) Redirect users to the last page, 3) Show the error message to the user. Although Graceful-degradation does not fix the current error, it at least prevents the server from crashing and prevents users from getting an unexpected non-functional page with HTTP status code 500.

Option 1) Redirect users to the home page. By modifying the response, the user can be redirected to the home page. The drawback is that users may be confused by the redirection and keep trying the same operation again and again, which increases the server load. Option 2) Redirect users to the last page. This requires parsing the request to find out which page it came from, and then modifying the response to point to the last visited page. This option has the same drawback as the first one: users may keep sending the same request since they do not know why they stay on the same page after the request. Option 3) Show the error message to the user. A default blank page is filled with the error message from the Controller and sent to the user. Although this prevents users from repeatedly sending requests to the server, the original request is not accomplished. Also, the error message may be too technical for users to understand.
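The three options can be sketched independently of Django with a small response record. Everything here is hypothetical: the Response class, the degrade function, the use of the Referer header to find the last page, and the choice of status 200 for the message page are our assumptions, not the thesis implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    """Minimal stand-in for an HTTP response (illustrative only)."""
    status: int
    headers: dict = field(default_factory=dict)
    body: str = ""


def degrade(request_headers, error_message, option="message", home_url="/"):
    """Build a fallback response for an exception that was not repaired."""
    if option == "home":
        # Option 1: redirect to the home page.
        return Response(302, {"Location": home_url})
    if option == "last_page":
        # Option 2: redirect to the previous page, taken from the Referer
        # header; fall back to the home page when the header is missing.
        last = request_headers.get("Referer", home_url)
        return Response(302, {"Location": last})
    # Option 3: a plain page that carries the error message.
    return Response(200, {"Content-Type": "text/plain"},
                    f"Something went wrong: {error_message}")
```

In a Django middleware, the same decisions would be expressed with the framework's own response classes.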

The Extensibility of Strategy Executor


Figure 5.5: Architecture of the extensibility of Strategy Executor

starts. If an exception is raised, the self-healing middleware sends the mapped exception type to the right strategy module to fix the error. The label Others is a reserved word in Strategy Conf., since it is used to handle all unfixed exceptions.
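A mapping of this kind could look as follows. The actual format of Strategy Conf. is not shown in this section, so the dictionary layout, the strategy names, and the registration decorator are assumptions made for illustration; only the reserved label Others comes from the text.

```python
# Hypothetical Strategy Conf.: exception class name -> strategy name.
STRATEGY_CONF = {
    "NoReverseMatch": "url_resolve",
    "IntegrityError": "database",
    "Others": "graceful_degradation",  # reserved label for everything else
}

# Registry of available strategy functions, filled by the decorator below.
STRATEGIES = {}


def strategy(name):
    """Register a strategy function under the name used in STRATEGY_CONF."""
    def register(func):
        STRATEGIES[name] = func
        return func
    return register


@strategy("url_resolve")
def fix_url(request, exc):
    return "template or configuration modification"


@strategy("database")
def fix_database(request, exc):
    return "database migration"


@strategy("graceful_degradation")
def fallback(request, exc):
    return "fallback page"


def select_strategy(exc):
    """Look up the strategy for an exception, defaulting to Others."""
    name = STRATEGY_CONF.get(type(exc).__name__, STRATEGY_CONF["Others"])
    return STRATEGIES[name]
```

With this layout, adding or replacing a strategy means registering one more function and editing one entry of the configuration.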

5.1.5

Strategy Verifier

The strategy verifier checks whether the strategy executor has fixed the exception. To verify the request, the strategy verifier processes the resolved request again. If the request passes the verification, the strategy verifier sends the request back to the request monitor, which processes the request from View. If it fails, the strategy verifier sends the failed request back to the controller. The drawback is that the strategy verifier cannot verify exceptions that happen before View, because re-processing is implemented by overriding the Django function get_response, which only processes the request from View onwards. However, the exceptions handled by Strategy 1: URLResolveError and Strategy 2: DatabaseError happen in and after View, so both can be verified by the strategy verifier. Therefore, the current strategy verifier supports the verification of the current strategies.
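The verification loop can be sketched as a plain function. This is a simplified illustration: verify, send_to_controller, and the max_attempts bound are hypothetical names and parameters, and get_response stands for any callable that re-processes the request.

```python
def verify(request, get_response, send_to_controller, max_attempts=3):
    """Re-process a repaired request; hand failures back to the controller.

    Returns the response of the first successful re-processing run, or
    None after max_attempts failed runs (the caller can then degrade
    gracefully).
    """
    for _ in range(max_attempts):
        try:
            return get_response(request)
        except Exception as exc:
            # Verification failed: let the controller try another repair
            # and use the request it returns for the next run.
            request = send_to_controller(request, exc)
    return None
```

For instance, with a view that raises until the request carries a patched flag and a controller callback that sets the flag, the second run succeeds and its response is returned.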

5.2

Implementation


function, process_exception, should be overridden as well. By doing so, the request monitor can notify the controller for the next step. The controller is implemented inside process_exception, which provides the controller with the needed information, such as the request content and the exception message. The request content includes information such as the HTTP body, the full path of the requested page, and the HTTP method. The exception message includes the type of the Django exception and details of the exception such as the stack trace. However, the middleware cannot obtain the exception message if the exception happens in a middleware that runs after the self-healing middleware during response processing.
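The hook described above can be sketched as a class-based Django middleware. This is a minimal skeleton under our own assumptions, not the project's code: the class name and the controller method are hypothetical, while the constructor signature, `__call__`, and `process_exception(request, exception)` follow Django's middleware interface. The class needs no Django import and can be listed in the MIDDLEWARE setting of a project.

```python
class SelfHealingMiddleware:
    """Skeleton of a middleware that forwards view exceptions to a controller."""

    def __init__(self, get_response):
        # Django passes the next handler in the chain.
        self.get_response = get_response

    def __call__(self, request):
        # Normal path: let the request continue through the chain.
        return self.get_response(request)

    def process_exception(self, request, exception):
        # Called by Django when a view raises an exception.
        info = {
            "path": request.get_full_path(),
            "method": request.method,
            "body": request.body,
            "exception_type": type(exception).__name__,
            "message": str(exception),
        }
        # Returning None lets Django continue its default error handling;
        # returning a response object replaces the error page.
        return self.controller(info)

    def controller(self, info):
        # Hypothetical extension point: subclasses decide how to repair.
        return None
```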

The methods are implemented as Python Django middleware. The implementation is an open-source project on GitHub: https://github.com/yipeitu/django_self-healing. For evaluation, we use a Docker container to exclude outside incidents.

5.3

Summary


Chapter 6

Experiment Protocol on Self-healing

Middleware

To answer RQ3 and RQ4, we need a sound evaluation setup. A series of experiments is conducted on two open-source Django projects to evaluate the designed self-healing middleware. Each experiment uses the same metrics so that the differences can be assessed.

6.1

Experiment Targets

We select issues from the 16 Django open-source projects that meet the following criteria: 1) the issue is closed, 2) it matches one of the top 6 exceptions among the Django common errors in chapter 4, 3) the issue is reproducible, 4) the number of stars of the reproducible project, and 5) the Django version is 1.11 or higher. Following the Django common errors in chapter 4, we select matching closed issues from top to bottom to evaluate the self-healing middleware. In the common case, an issue is closed after developers fix the bug. Although a small portion of closed issues are closed by repository owners because they are not reproducible, closed issues are still attractive because they carry a large amount of useful information. Aside from the issue content itself, users and developers can discuss the issue in comments, which also contain useful information. That information suggests possible self-healing strategies. Therefore, taking closed issues is a feasible way to design and evaluate self-healing strategies. Since a self-healing strategy has to be applied to an error to measure whether it works, reproducibility is an essential condition for evaluation, because the self-healing middleware repairs the error during runtime. Among projects with reproducible issues, the more stars, the better. Also, the Django version cannot be lower than 1.11, since Django 1.11 still supports Python 2.7, and its latest patch was released on 18.12.2019. Django 2.0 supports Python 3.4 and higher.

Table 6.1: Strategy and project evaluation information.

| Strategy under evaluation | Strategy 1 | Strategy 2 |
|---|---|---|
| Exception name | URLResolver-NoReverseMatch | DatabaseError-IntegrityError |
| Project name | django-oscar | django-blog-zinnia |
| Django version | 1.11.25 | 2.1.15 |
| Issue number | 3,254 | 569 |
| Commit | 41b824102 | 9bb09e1b |

Based on the criteria above, two reproducible issues are selected for evaluation. The first one relates to URLResolver-NoReverseMatch and belongs to django-blog-zinnia, a simple yet powerful and extendable application for managing a blog within a Django website. It has 1,897 stars (see Table 4.1) and uses Django 1.11. The second one relates to DatabaseError-IntegrityError and belongs to django-oscar, an e-commerce framework for Django designed for building domain-driven sites, ranging from large-scale B2C sites to complex B2B sites rich in domain-specific business logic. It has 3,860 stars (see Table 4.1) and uses Django 2.0. With both issues, we can evaluate the self-healing middleware across different versions of Django. Table 6.1 summarizes the basic information about these two projects together with the issues that reproduce the corresponding exceptions. The default strategy takes over whenever strategy 1 and strategy 2 cannot handle the exception, so no additional project is needed to evaluate the default strategy.

6.2

Protocols

In order to trigger the specific exception under study, a workload is necessary for each experiment. We set up 50 users to simulate a real-life website and to balance the workload. Each user sends four consecutive requests per round: visit the website homepage, log in, visit the specific page that triggers the error to be healed by the middleware, and log out. We call these consecutive requests Action Requests. Each user has to log out at the end of the round to ensure that the user is not in the previous session in the next round. Each user executes the four requests 30 times within 10 minutes, which means the website receives 50*4*30 = 6,000 requests in each experiment. Figure 6.1 shows the experiment workflow. Each case is run at least 3 times to avoid bias, and the average value of the metrics is taken as the result for the case. Mahajan et al. [48] use a similar approach and represent the result with median scores.

Figure 6.1: Experiment workflow

As mentioned in section 1.2, both the effectiveness and the performance overhead of the self-healing middleware need to be evaluated. To evaluate its effectiveness, we consider responses with HTTP status code 2xx/3xx/4xx as correct. Any other HTTP status code means that the server encountered an unresolvable error, which may crash the website. Correctness shows whether the self-healing middleware fixes all the runtime errors. Therefore, the effectiveness of the self-healing middleware is measured as the correctness of the HTTP response status code without and with the self-healing middleware. In order to measure correctness and response time, we use JMeter to simulate the users and collect the metrics. JMeter is an open-source Java application for testing functional behavior and measuring performance. With the help of JMeter, we simulate 50 robots that send the Action Requests for 30 rounds, and we collect HTTP-related data automatically.
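The workload size and the correctness criterion stated above can be written down as a short worked example. The function names are ours, and the status codes in the usage line are made-up sample values, not measured data.

```python
USERS, ROUNDS, ACTIONS = 50, 30, 4

# Each action request is sent once per user per round.
REQUESTS_PER_ACTION = USERS * ROUNDS              # 1,500
TOTAL_REQUESTS = REQUESTS_PER_ACTION * ACTIONS    # 6,000


def is_correct(status_code):
    """A response counts as correct if its status is 2xx, 3xx or 4xx."""
    return status_code in range(200, 500)


def correctness_ratio(final_status_codes):
    """Percentage of correct final responses for one action request."""
    codes = list(final_status_codes)
    return 100.0 * sum(is_correct(code) for code in codes) / len(codes)


# Made-up sample: three correct responses and one server error.
print(correctness_ratio([200, 302, 404, 500]))
```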

To evaluate its performance overhead, we measure CPU and memory usage to see whether the self-healing middleware increases the load on the website during runtime. Besides, the response time is of concern, since the self-healing middleware may increase the response time, which can hinder the user experience. The server response time indicates how long it takes for the server to process the request and return the response to the users. However, it is also affected by the speed of the internet connection and other external factors. We run the experiments on localhost so that internet factors are excluded.


Table 6.2: Effectiveness is composed of correctness. Overhead is composed of response time, CPU usage, and memory usage.

| Metrics | Indicator | Measured by |
|---|---|---|
| Effectiveness | Correctness | JMeter |
| Overhead | Response time | JMeter |
| | CPU | cAdvisor |
| | Memory | cAdvisor |

In order to reduce interference from the system and any unpredicted incidents, we use a virtual environment to make sure each experiment is conducted in the same environment. A Docker container is a virtual environment that shares a single operating system kernel and bundles user-selected software, libraries, and configuration files, which keeps the environment free of interference and excludes outside incidents. The metrics inside the container are measured by cAdvisor and stored in Prometheus. cAdvisor, also known as Container Advisor, gives container users an understanding of the resource usage and performance characteristics of their running containers. With the help of cAdvisor, we collect OS-related data while the websites hosted inside Docker are under experiment. Prometheus is an open-source monitoring system with a dimensional data model, a flexible query language, and an efficient time-series database, which plays a vital role in data visualization. With Prometheus, the results from cAdvisor can be stored in a database and used flexibly through a powerful user interface for data visualization. Table 6.2 shows how the indicators map to effectiveness and performance overhead, and which software measures them.

6.3

Summary

To measure correctness and performance with and without the self-healing middleware, we conducted two controlled experiments to evaluate the self-healing strategies. Each experiment was run three times to avoid bias. The first controlled experiment evaluates Strategy 1-URLResolverError fixing the URLResolverError-NoReverseMatch exception. It is conducted on django-blog-zinnia, one of the 16 Django open-source projects (see Table 4.1). The second controlled experiment observes Strategy 2-DatabaseError resolving the DatabaseError-IntegrityError exception. It is conducted on django-oscar, which also belongs to the 16 Django open-source projects (see Table 4.1). Although each experiment is designed to test its own exception, both evaluate whether the self-healing middleware works in the Django framework. Besides, Strategy 3-Graceful-degradation is applied implicitly when the other self-healing strategies cannot fix the errors.


Chapter 7

Experimental Results

In this chapter, we mainly focus on two indicators, correctness and performance. Correctness is defined by the final HTTP response code. Performance is composed of three indexes: elapsed time, CPU usage percentage, and memory usage.

7.1

Effectiveness

Table 7.1 shows the correctness of each experiment without and with the self-healing middleware for each project. Each experiment is executed 3 times with 4 action requests (1500 requests per action request). In this context, the action requests are: visit the admin page, log in, visit the exception page, and log out, as defined in Section 6.2. Correctness means that the last returned HTTP status code is 200. We consider a series of redirected requests as one action request. For example, we send one action request, exception, to the website, and the website processes it by redirecting it twice to different URLs. In total, there are 3 requests for the action request exception. However, we only take the final response into account to measure correctness. The correctness ratio is the number of successful HTTP status codes per action request divided by the number of requests (1500 requests in total).

In the first controlled experiment, which evaluates Strategy 1-URLResolverError, blog-zinnia with the self-healing middleware shows a 100% correctness ratio, fixing the exceptions in the action request exception. Besides, comparing blog-zinnia with and without self-healing shows that the self-healing middleware does not affect the response correctness of the other action requests. The average correctness ratios for admin, login, and logout are 100% in both cases.

Table 7.1: The correctness of 4 actions without self-healing and with self-healing for blog-zinnia and django-oscar.

| Exception type | Project | Setting | Run | admin | login | exception | logout |
|---|---|---|---|---|---|---|---|
| URLResolver | blog-zinnia | without self-healing | Exp.1 | 100% | 100% | 0% | 100% |
| | | | Exp.2 | 100% | 100% | 0% | 100% |
| | | | Exp.3 | 100% | 100% | 0% | 100% |
| | | | Average | 100% | 100% | 0% | 100% |
| | blog-zinnia | with self-healing | Exp.1 | 100% | 100% | 100% | 100% |
| | | | Exp.2 | 100% | 100% | 100% | 100% |
| | | | Exp.3 | 100% | 100% | 100% | 100% |
| | | | Average | 100% | 100% | 100% | 100% |
| DatabaseError | django-oscar | without self-healing | Exp.1 | 100% | 35.94% | 0% | 85.86% |
| | | | Exp.2 | 100% | 29.14% | 0% | 87.53% |
| | | | Exp.3 | 100% | 23.87% | 0% | 88.14% |
| | | | Average | 100% | 29.65% | 0% | 87.17% |
| | django-oscar | with self-healing | Exp.1 | 100% | 42.2% | 11.07% | 88.34% |
| | | | Exp.2 | 100% | 19.07% | 11.2% | 86.47% |
| | | | Exp.3 | 100% | 20.07% | 13.34% | 91.4% |
| | | | Average | 100% | 27.11% | 11.87% | 88.73% |


7.2

Performance Overhead

Table 7.2 shows the elapsed time (seconds) for each of the four action requests and the total elapsed time under the different experiments. Elapsed time is the server processing time from receiving the request to returning the response, including the redirected requests.

In the first controlled experiment, which evaluates Strategy 1-URLResolverError, blog-zinnia without the self-healing middleware has significantly less elapsed time than with self-healing for each action request. With the self-healing middleware, the processing time of blog-zinnia more than doubles. The action requests login, exception-URLResolver, and logout without self-healing are redirected twice before returning to users, so each completed action request consists of three consecutive requests. The action request exception-URLResolver takes longer, since it is the point at which the self-healing middleware intervenes. The self-healing middleware catches the exception and fixes it by applying either Template Modification or Configuration Modification. After fixing the exception, the Strategy Verifier is triggered to verify the patched request by processing the request again. If the exception has not been fixed, the Request Monitor catches the exception at most three times before it is fixed and the response is ready to be sent back to the user. With the self-healing middleware, the number of redirections for exception-URLResolver increases by two.

In the second controlled experiment, which evaluates Strategy 2-DatabaseError, the total elapsed time of django-oscar with the self-healing middleware is about double that without it. Interestingly, exception-DatabaseError takes the least time among the action requests in both experiments. Login and logout require database operations to complete the requests. Besides, they are redirected 3 and 5 times, which explains why they take longer than exception-DatabaseError in both cases. However, admin also needs more time than exception-DatabaseError to process the request, even with the self-healing middleware. Even so, the elapsed-time behavior of django-oscar without and with the self-healing middleware is the same as that of blog-zinnia. Besides, Strategy 2-DatabaseError has no serious impact on elapsed time, which is a useful finding.

Table 7.2: The elapsed time (seconds) for 4 action requests without self-healing and with self-healing for blog-zinnia and django-oscar.

| Project | Setting | Run | admin | login | URLResolver | logout | Total time |
|---|---|---|---|---|---|---|---|
| blog-zinnia | without self-healing | Exp.1 | 0.81 | 4.72 | 7.33 | 2.56 | 15.44 |
| | | Exp.2 | 1.11 | 6.13 | 9.16 | 3.46 | 19.88 |
| | | Exp.3 | 1.35 | 7.11 | 10.94 | 4 | 23.41 |
| | | Average | 1.09 | 5.98 | 9.14 | 3.34 | 19.57 |
| blog-zinnia | with self-healing | Exp.1 | 2.69 | 10.77 | 27.75 | 6.14 | 47.37 |
| | | Exp.2 | 2.83 | 11.11 | 29.05 | 6.56 | 49.5 |
| | | Exp.3 | 2.62 | 10.11 | 27.4 | 5.65 | 45.8 |
| | | Average | 2.71 | 10.66 | 28.06 | 6.11 | 47.55 |

| Project | Setting | Run | admin | login | DatabaseError | logout | Total time |
|---|---|---|---|---|---|---|---|
| django-oscar | without self-healing | Exp.1 | 4.15 | 8.15 | 2.44 | 11.34 | 26.08 |
| | | Exp.2 | 3.02 | 5.88 | 1.93 | 8.53 | 19.36 |
| | | Exp.3 | 4.05 | 10.03 | 2.48 | 12.51 | 29.07 |
| | | Average | 3.74 | 8.02 | 2.28 | 10.79 | 24.83 |
| django-oscar | with self-healing | Exp.1 | 8.93 | 19.39 | 5.43 | 23.75 | 57.5 |
| | | Exp.2 | 10.81 | 23.92 | 5.65 | 27.62 | 68 |
| | | Exp.3 | 4.84 | 11.36 | 3.83 | 13.19 | 33.22 |
| | | Average | 8.19 | 18.22 | 4.97 | 21.52 | 52.9 |

In the first controlled experiment, which evaluates Strategy 1-URLResolverError, average CPU usage is lower with the self-healing middleware than without it, while average memory usage is slightly higher: CPU usage decreases by 15.16 percentage points and memory usage increases by 0.74 MB. We can infer that Strategy 1-URLResolverError does not noticeably increase the CPU and memory load. Looking at the individual runs, the CPU usage of Exp.1 with the self-healing middleware is closer to Exp.2 and Exp.3 without it. For memory usage, Exp.1 and Exp.2 with the self-healing middleware are closer to Exp.3 without it. Therefore, we conclude that Strategy 1-URLResolverError does not increase the CPU and memory overhead.

(47)

40 CHAPTER 7. EXPERIMENTAL RESULTS

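Table 7.3: CPU system usage (%) and memory usage (MB) without and with self-healing for blog-zinnia and django-oscar.

| Exception type | Project | Configuration | Run | CPU usage (%) | Memory usage (MB) |
| --- | --- | --- | --- | --- | --- |
| URLResolver | blog-zinnia | without self-healing | Exp.1 | 62.32 | 13.73 |
| URLResolver | blog-zinnia | without self-healing | Exp.2 | 65.88 | 10.56 |
| URLResolver | blog-zinnia | without self-healing | Exp.3 | 66.06 | 9.53 |
| URLResolver | blog-zinnia | without self-healing | Average | 64.5 | 10.31 |
| URLResolver | blog-zinnia | with self-healing | Exp.1 | 47.19 | 9 |
| URLResolver | blog-zinnia | with self-healing | Exp.2 | 48.43 | 16.9 |
| URLResolver | blog-zinnia | with self-healing | Exp.3 | 52.46 | 8.77 |
| URLResolver | blog-zinnia | with self-healing | Average | 49.34 | 11.05 |
| DatabaseError | django-oscar | without self-healing | Exp.1 | 73.02 | 12.32 |
| DatabaseError | django-oscar | without self-healing | Exp.2 | 74.54 | 9.65 |
| DatabaseError | django-oscar | without self-healing | Exp.3 | 72.15 | 12.16 |
| DatabaseError | django-oscar | without self-healing | Average | 73.23 | 11.37 |
| DatabaseError | django-oscar | with self-healing | Exp.1 | 94.38 | 26.39 |
| DatabaseError | django-oscar | with self-healing | Exp.2 | 95.42 | 24.6 |
| DatabaseError | django-oscar | with self-healing | Exp.3 | 108.94 | 29.31 |
| DatabaseError | django-oscar | with self-healing | Average | 99.58 | 26.76 |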

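In the second controlled experiment, which evaluates Strategy 2-DatabaseError, the overhead increases significantly for CPU usage and memory usage: on average, CPU usage rises from 73.23% to 99.58% and memory usage from 11.37 MB to 26.76 MB (Table 7.3). One potential reason is that Strategy 2-DatabaseError heals by migrating the database, which may increase CPU usage and memory usage. This could be addressed in the future by replacing the strategy with a more efficient one.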
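The resource overhead of each strategy can be read off as the difference between the reported averages. The sketch below assumes the Average rows of Table 7.3 as input; `AVERAGES` and `overhead` are illustrative names that do not come from the thesis.

```python
# Reported averages from Table 7.3: (CPU usage in %, memory usage in MB).
AVERAGES = {
    "URLResolver": {"without": (64.5, 10.31), "with": (49.34, 11.05)},
    "DatabaseError": {"without": (73.23, 11.37), "with": (99.58, 26.76)},
}


def overhead(entry):
    """Return (CPU delta in percentage points, memory delta in MB), with minus without."""
    cpu_without, mem_without = entry["without"]
    cpu_with, mem_with = entry["with"]
    return round(cpu_with - cpu_without, 2), round(mem_with - mem_without, 2)


for exception_type, entry in AVERAGES.items():
    cpu_delta, mem_delta = overhead(entry)
    print(f"{exception_type}: CPU {cpu_delta:+.2f} points, memory {mem_delta:+.2f} MB")
```

A negative CPU delta means the runs with self-healing middleware used less CPU on average than the runs without it.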

7.3

Summary


Chapter 8

Discussion

8.1

RQ1: Django Common Errors

For RQ1, Table 4.2 shows the common Django errors found in the open-source projects. The analysis of 19,442 issues from 16 Django projects in Chapter 4 indicates that the top 6 common Django errors cover 70.01% (208/297) of issues:

1) URLResolver-NoReverseMatch means Django fails to interpret the URL name in a Template to HTML according to the Project Conf. (see Figure 5.3).
2) Database Exceptions-OperationalError happens when database operations fail, for example because of a missing table, a missing column, a mismatched foreign key, or a lost connection.
3) Core Exceptions-ValidationError means that some field values do not meet the required criteria. Developers can build their own Validators for form validation and value checking.
4) Database Exceptions-IntegrityError happens when a data insertion fails because of a duplicated primary key, a null field, or a foreign key constraint.
5) Database Exceptions-DatabaseError means the database itself triggers the exception, but the exception does not belong to one of the more specific Django Database Exceptions, for example a deadlock.
6) Core Exceptions-NON_FIELD_ERRORS means a validation error is raised, but the invalid data does not belong to a specific field in a form or model.
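The relationship between the three database-related categories above can be sketched in plain Python. The classes below are stand-ins, defined here so the example runs without Django installed; they only mirror the fact that Django's OperationalError and IntegrityError are subclasses of its DatabaseError. The `classify` helper, the `CATEGORIES` list, and the example messages are hypothetical and are not part of Django or of the thesis prototype.

```python
# Stand-in classes (assumption): they mimic the shape of Django's database
# exception hierarchy so the sketch runs without Django installed.
class DatabaseError(Exception):
    """Generic database failure."""


class OperationalError(DatabaseError):
    """For example a missing table or column, or a lost connection."""


class IntegrityError(DatabaseError):
    """For example a duplicate primary key, null field, or foreign key violation."""


# Most specific types first, so the generic DatabaseError acts as the fallback.
CATEGORIES = [
    (OperationalError, "Database Exceptions-OperationalError"),
    (IntegrityError, "Database Exceptions-IntegrityError"),
    (DatabaseError, "Database Exceptions-DatabaseError"),
]


def classify(exc):
    """Return the label of the first category the exception is an instance of."""
    for exc_type, label in CATEGORIES:
        if isinstance(exc, exc_type):
            return label
    return "uncategorized"


print(classify(OperationalError("no such table")))  # hypothetical message
print(classify(DatabaseError("deadlock detected")))  # hypothetical message
```

Ordering the list from specific to generic matters: if DatabaseError came first, every database exception would be counted under it and the more specific categories would never be reached.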

Both Django users and developers can benefit from knowing the common Django errors. Django users can avoid these errors by designing test suites that test the website before it launches. For Django developers, the common errors provide a starting point for improving the reliability and resilience of the Django framework. Furthermore, the common Django errors follow the trustworthy exception types officially declared by Django.

The result is limited by the number, popularity, and versatility of the projects. Table 4.1 shows the number of projects and their popularity, measured by the number of stars. Sixteen projects cannot represent all Django projects. The popularity of a project indicates how many developers are interested in it, but it does not show whether those followers use the project for further development. Versatility refers to the genres the projects cover; however, it is hard to tell how versatile the collected projects are as a whole.

To summarize and answer RQ1, "What are common errors in the current Django web framework?": the common Django errors are derived from 16 popular open-source Django projects (see Table 4.1). The errors follow the exceptions officially defined by Django. The most common errors are related to template interpretation failures and database errors. However, the common Django errors are limited by the number of collected open-source Django projects, which is 16 in this thesis. Although these cannot represent all Django projects, the result still provides useful information for Django users and developers to strengthen their websites. Different features enable a restricted and more accessible setup of monitoring.

8.2

RQ2: Self-healing Prototype

