Managing Technical Debt in Django Web Applications

PER CLASSON PCLASSON@KTH.SE

2016-03-12

Master’s Thesis at CSC
Supervisor: Karl Meinke
Examiner: Johan Håstad

Abstract

Technical debt is a metaphor that refers to the consequences of suboptimal software development. Developers will have to pay interest on this debt, in terms of costs of maintenance. The term helps developers communicate the importance of software quality. This thesis has studied technical debt in the context of Django web applications.

In a survey conducted, the main causes of technical debt in Django applications were found to be architectural issues and lack of testing. This is in line with other studies of causes of technical debt. Tools and practices used in Django development were evaluated. From this evaluation several guidelines were formulated on how to best manage and limit technical debt. The results suggest that static code analyzers should be used to maintain code standards.

Furthermore, the evaluation shows that log aggregation tools like Sentry are helpful. Application monitoring should be used if there are performance issues, deprecation patterns can be used when refactoring, and the identification and removal of dead code is probably unnecessary. Finally, pre-commit tools help in preventing technical debt.

Managing technical debt in Django web applications

Technical debt is a metaphor that describes the consequences of suboptimal software development. Developers have to pay interest on this debt, in the form of maintenance costs. The term helps developers communicate the importance of software quality. This report has studied technical debt in the context of Django web applications.

A survey was conducted, and the main causes of technical debt in Django applications turned out to be architectural problems and lack of testing. This is in line with other studies of the causes of technical debt.

Tools and methods used in Django development were evaluated. From this evaluation, several guidelines were formulated on how best to manage and limit technical debt. The results indicate that static program analysis should be used to maintain code standards. Furthermore, the evaluation shows that log aggregation tools such as Sentry are useful. Application monitoring should be used if there are performance problems, deprecation patterns can be used when refactoring, and identifying dead code is probably unnecessary. Finally, pre-commit tools prove helpful in preventing technical debt.

Chapter 1

Introduction

The technical debt metaphor refers to the consequences of sub-optimal software development [3]. Examples of technical debt include architectural issues and unused code. For any system that has technical debt, developers will have to pay interest on this debt in terms of maintenance costs. By making technical debt visible, and eventually repaying it (for example by refactoring), developers can make sure that the debt does as little harm as possible. The term technical debt is useful for developers when discussing and communicating the importance of software quality.

Technical debt exists in all types of software, but it takes different forms depending on the framework or programming language used. The most popular web framework for Python is the free and open-source Django framework. It is a full-stack web framework that makes it easy to develop complex, database-driven web applications. To keep such applications maintainable, it is important to manage their technical debt, and that is the topic of this thesis.

1.1 Motivation

A lot of research has been done on technical debt and its management. The aim of this research has been to improve current software techniques in order to increase software development productivity.

The technical debt metaphor can have many interpretations, depending on programming language, framework choice, or hardware dependencies. Field studies have been lacking, so it is useful to investigate the causes of technical debt in the context of Django web applications. In this thesis this is done by conducting a survey. This not only helps to further define the metaphor, but can also help in managing technical debt in Django, as the causes of debt can be identified more easily.

According to recent mapping studies of technical debt [6], there is a lack of empirical studies on technical debt management approaches. This study can hopefully fill some of that void by suggesting and testing solutions to manage technical debt in practice. The study of management approaches can help not only researchers but also Django developers, as this study tries to suggest some guidelines for managing technical debt. By improving how Django developers think about technical debt in their applications, it is hopefully possible for them to minimize the costs of maintenance. As this report only discusses software development and technical debt in broad terms, no analysis has been made of the ethical aspects of this work.

1.2 Research questions

To help define the concept of technical debt in the context of Django web applications, the first research question asks what the main causes of technical debt are.

RQ1. What are the main causes of technical debt in Django web applications?

It is then interesting to know the practices and tools available to manage this technical debt. This makes it possible to find out how Django developers identify and prevent technical debt.

RQ2. Are there practices and tools to manage technical debt in Django web applications?

Knowing the causes of debt, and the practices and tools used to manage it, it should then be possible to set up some guidelines for how best to manage technical debt when developing Django web applications.

RQ3. Can we establish some guidelines for developing with Python and Django to manage technical debt?

1.3 Scope and limitations

Technical debt exists not only in code but also in, for example, tests and infrastructure. This thesis does not investigate all kinds of technical debt that might occur in a software system that uses Django; instead it focuses on code and design debt. The typical infrastructure or documentation of a Django web application is not much different from that of any other web application, which is why such debt is not discussed.

The management of technical debt can be separated into several types of activities, like measurement and communication. This thesis mainly covers three types of activities: identification, prevention, and repayment. These are some of the main activities of managing technical debt, but with more time and resources other activities could have been evaluated as well. The case study that evaluated management of technical debt was performed on only one software project that used Django. With more time it would have been interesting to evaluate management activities on more projects.

Chapter 2

Background and Literature Study

This chapter gives a background to the metaphor technical debt and the recent research done on managing technical debt. Common causes of technical debt have been gathered from other papers and talks and are presented in a section below.

The last section lists the different activities done in technical debt management.

2.1 The technical debt metaphor

The term and metaphor technical debt was first coined by Ward Cunningham in 1992 [1]. He described a situation where the long-term goal of software quality was traded for short-term gain, creating technical debt.

“Shipping first time code is like going into debt. A little debt speeds development so long as it is paid back promptly with a rewrite. [...] The danger occurs when the debt is not repaid. Every minute spent on not-quite-right code counts as interest on that debt. Entire engineering organizations can be brought to a stand-still under the debt load of an unconsolidated implementation.”

The metaphor helps us think about problems that are prevalent in software projects, and it says that if we do things the “quick and dirty” way we will get technical debt. An example might be technical debt from a poor software architecture, that developers have implemented as a compromise to meet a deadline. Like financial debt, we need to pay interest on technical debt. In our example it would be the extra time needed to implement a new feature because of the poor architecture.

Either the developers can continue to pay interest, or they can pay back the whole debt by refactoring the architecture.

When the term was coined in 1992 it saw little use, but around 2000 it started to gain traction. It was mostly used in blogs and essays as a term for all software problems [3]. Software engineers often advocate for investments in code quality and documentation, which might not generate short-term revenue but are crucial to a system’s lasting quality. The term is therefore attractive to engineers as a tool to communicate the importance of software quality to managers.

2.1.1 Technical debt quadrant

The term was further defined when Martin Fowler in 2009 [2] proposed the technical debt quadrant. It came out of a debate over whether technical debt should be reserved for considered decisions that give short-term gains but create debt in the long term, so that messy code created by programmers ignorant of good practices would not count as technical debt. Fowler suggested that instead of thinking in terms of debt and non-debt, we should differentiate between prudent and reckless debt.

Another dimension can then be added, distinguishing deliberate from inadvertent debt. Reckless deliberate debt arises when developers do things the “quick and dirty” way, while reckless inadvertent debt comes from lack of knowledge. Prudent deliberate debt arises when developers take on technical debt for a good reason; for example, it might not be worth the time to implement the perfect code design in a system that is rarely touched. The last type is prudent inadvertent debt, which arises when developers realize, after they have created a system, what the design should have been.

Table 2.1. Technical debt quadrant

              Reckless                         Prudent
Deliberate    We don’t have time for design    We must ship now and deal with the consequences
Inadvertent   What’s layering?                 Now we know how we should have done it

2.1.2 Managing technical debt research

The topic of technical debt got further attention in 2010 when several researchers from different institutes and universities met at the Carnegie Mellon Software Engineering Institute. They published a paper [4] that sets the agenda for future research in the field of managing technical debt, by establishing the workshop on managing technical debt.

It says that research on managing technical debt can turn the intuitive understanding of technical debt into a more rigorous definition. Research can also help the technical community by clarifying the strengths and weaknesses of different types of technical debt management. The paper suggests several open research questions, which are listed below.

• The first research question is how we find refactoring opportunities. We mostly depend on developers’ experience to detect debt, but tools have been developed to detect symptoms of poor design in code. However, such tools are difficult to evaluate quantitatively, as it is hard to estimate the impact of refactorings.

• While small local refactorings can improve a code base, system-wide architectural issues often need larger efforts. It can become a subjective matter to evaluate if and how to re-architect a system. Having tools to measure and manage technical debt in architecture could help developers make more informed decisions.

• Identifying dominant sources of debt in different contexts through field studies can give direction on what research should focus on.

• Measurement of technical debt has been researched and there are many issues with it. To be useful, measurements need to be combined in a way that supports decision-making. The measurements should be expressed in terms of three specific concepts. The first is the principal, which is the cost of eliminating the technical debt. The second is the interest probability, which “is the probability that a particular type of technical debt will in fact have visible consequences” [4]. The third is the interest amount, which is the added cost of performing maintenance because of the technical debt.

• Non-code artifacts can also be subject to technical debt research, for example how technical debt occurs when test plans are not followed.

• Monitoring and visualization tools for technical debt could help provide insight into when there is “too much” debt. To know whether there is too much debt, acceptability thresholds need to be researched.

• To ensure that the technical debt metaphor leads to good practices, research is needed to evaluate projects that have been launched to remove technical debt. To make techniques of removing technical debt explicit, current defined development processes should then be adapted to include tracking and management of technical debt.
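As a rough illustration of how the three measurement concepts from the list above (principal, interest probability and interest amount) could be combined to support a decision, the following minimal sketch compares the expected interest with the principal. The class, the function names, the unit (hours), the per-release framing and all numbers are assumptions made for this example; they are not taken from [4].

```python
from dataclasses import dataclass


@dataclass
class DebtEstimate:
    """Hypothetical estimate for one piece of technical debt (all values assumed)."""
    principal_hours: float       # cost of eliminating the debt
    interest_probability: float  # chance per release that the debt causes extra work
    interest_hours: float        # extra maintenance cost when it does


def expected_interest(estimate: DebtEstimate, releases: int) -> float:
    """Expected extra maintenance cost over a number of future releases."""
    return estimate.interest_probability * estimate.interest_hours * releases


def worth_repaying(estimate: DebtEstimate, releases: int) -> bool:
    """Suggest repayment when the expected interest exceeds the one-off principal."""
    return expected_interest(estimate, releases) > estimate.principal_hours


# Same principal, very different interest: a rarely touched module and a hot spot.
rarely_touched = DebtEstimate(principal_hours=40, interest_probability=0.1, interest_hours=5)
hot_spot = DebtEstimate(principal_hours=40, interest_probability=0.9, interest_hours=8)
```

Under these assumed numbers and a horizon of ten releases, the rarely touched module would be left alone while the hot spot would be repaid, which matches the intuition behind prudent deliberate debt.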

2.1.3 Types of technical debt

A systematic literature review from 2014 [6] mapped 94 studies about managing technical debt. From these studies the paper collected the different types of technical debt that had been identified, which resulted in a classification of 10 types.

1. Requirements technical debt is the difference between the optimal requirements’ specification and what is actually implemented.

2. Architectural technical debt is the limitations of an architecture that for example impair maintainability.

3. Design technical debt is defects in the code design, for example having overly complex classes.

4. Code technical debt is poorly written code that does not follow the best practices and standards.

5. Test technical debt refers to shortcuts taken when creating tests, for example low test coverage or inaccurate test cases.

6. Build technical debt refers to problems in the build system, for example that it is hard to deploy code.

7. Documentation technical debt refers to insufficient or outdated documentation.

8. Infrastructure technical debt refers to sub-optimal configuration of systems or problems with development-related supporting technologies.

9. Versioning technical debt refers to problems with the version control system.

10. Defect technical debt refers to defects or bugs in a software system.

The paper [6] also tries to define what is not technical debt. It lists examples like unimplemented features and functionalities. Some types above are not strictly defined, for example the problem of overly complex code or duplicated code can be identified as both code and design debt.

2.2 Causes of technical debt

There are many causes of technical debt in software projects. Listed below are the main identified causes of technical debt, that have been chosen after review of what other studies [6, 9] reported about technical debt.

This thesis puts the technical debt metaphor in the context of Django applications, and it mostly discusses code and design debt. Other types of debt, like versioning or infrastructure technical debt, are not specific to Django applications and are therefore out of scope for this thesis.

2.2.1 Architectural issues

Software architecture refers to the high-level structures of a program, for example how different modules communicate or which framework and programming language are used. There are many types of architectural issues. Examples are over-abstraction or lack of layering (instead of using, for example, the Model-View-Controller pattern). Architectural debt often comes from architectural decisions that made sense when the system was first built; once the system has grown, other patterns are needed to solve its problems well.

2.2.2 Lack of adhering to coding style and conventions

Code is more often read than written, therefore it is important to make sure that code is readable. By following a style guide when coding, the readability can improve, as the code is consistent within the code base (e.g., using the same indentation pattern). Python has a de-facto style guide called PEP8, which should be followed for projects to increase readability [7]. Lack of adhering to this code standard, or the one decided upon in a project, is a form of technical debt.

2.2.3 Code duplication

Duplicated code occurs frequently in code bases, but it should be avoided. It often arises from copying and pasting code, as that is faster than writing code from scratch. It is however considered a bad practice. When developers need to fix a bug or make a change in the code, they will need to check all possible duplicates for that fix or change. Code duplication often indicates that there are design problems in the code, and that abstraction and refactoring are possible.
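A minimal sketch of the kind of refactoring this points to is given here. It assumes that two functions each used to carry their own copy of the expression `raw.strip().lower()`; the function names and the data they return are hypothetical. After the refactoring, a fix to the cleaning rule has to be made in one place only.

```python
def normalize_email(raw: str) -> str:
    """Single shared implementation of the (assumed) cleaning rule."""
    return raw.strip().lower()


def register_user(raw_email: str) -> dict:
    # Previously contained its own copy of the strip/lower logic.
    return {"email": normalize_email(raw_email), "active": False}


def subscribe_newsletter(raw_email: str) -> dict:
    # Previously contained a second, pasted copy of the same logic.
    return {"email": normalize_email(raw_email), "topics": []}
```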

2.2.4 Dead and obsolete code

Dead, obsolete, or unreachable code is code that is never executed in a program or code that does not do anything useful. This creates technical debt as it can confuse the reader of the source code, and make the code less clear. In large programming projects, it can become difficult to recognize and delete dead code, as whole classes can be unused. The dead code might still have tests, but is actually never used in production.
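One way such code might be located is sketched here using Python's standard `ast` module. The function `unused_functions` and the sample source are assumptions made for this illustration. The check is only a heuristic: it looks at a single module and misses dynamic references (for example names looked up from strings), so its output would still need manual review.

```python
import ast


def unused_functions(source: str) -> set:
    """Return names of top-level functions that are never referenced by name."""
    tree = ast.parse(source)
    defined = {node.name for node in tree.body if isinstance(node, ast.FunctionDef)}
    # Collect every plain name and attribute name that appears anywhere in the module.
    referenced = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    referenced |= {node.attr for node in ast.walk(tree) if isinstance(node, ast.Attribute)}
    return defined - referenced


SAMPLE = '''
def used():
    return 1

def never_called():
    return 2

print(used())
'''

dead = unused_functions(SAMPLE)
```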

2.2.5 Overly complex code

To make code maintainable developers should write their code as readable and robust as possible. This means that they should avoid creating overly complex code, that might look elegant, but will be hard to understand by other developers.

As the computer scientist Brian Kernighan wrote [11]: “Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.”

2.2.6 Legacy software package dependencies

Developers often use external software packages in order to save time and resources.

Making sure the external packages are of the latest version is important, as new releases might include bug fixes, but also new features. When upgrading packages, each dependency is a potential problem or obstacle. Updating one package might require other packages to be updated as well. Having to update all packages at once can create many problems (e.g. APIs changing). Therefore, it is important to keep all external packages up to date, to avoid any technical debt.

2.2.7 Lack of testing

Testing is an important part of software development. There are many types of tests, like acceptance, unit, performance, load, integration or regression tests. Having extensive test suites helps ensure that software changes do not break existing functionality, so that bugs are caught before they reach production. Extensive tests also make developers confident that their program does what it is expected to do, and does not do what it is not supposed to do. It is often impossible (and not economical) to get full test coverage, but having no tests is a cause of technical debt.

2.2.8 Poor tests

Tests for software programs are expected to be deterministic. A test should always pass or always fail for the same code. In practice there are usually several flaky tests (due to, for example, test order dependency) in large software projects [8]. A non-deterministic test can give false positive or false negative results when the same test is run several times. Such tests make it difficult to trust test suites and are considered technical debt.

As code and requirements change, tests need to be updated as well. If the tests are not maintained, they cause more problems than they solve. Therefore, any poor or outdated test is technical debt.
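The test order dependency mentioned above can be reproduced with a small sketch built on the standard `unittest` module. The test class, the shared `_cache` dictionary and the helper `run_in_order` are hypothetical and exist only for this illustration: the second test passes or fails depending on whether the first one happened to run before it.

```python
import unittest

_cache = {}  # module-level state shared between tests (the source of the flakiness)


class OrderDependentTests(unittest.TestCase):
    def test_a_fill_cache(self):
        _cache["user"] = "ada"
        self.assertEqual(_cache["user"], "ada")

    def test_b_read_cache(self):
        # Passes only if test_a_fill_cache has already run.
        self.assertEqual(_cache.get("user"), "ada")


def run_in_order(names):
    """Run the named tests in the given order from a clean state; True if all pass."""
    _cache.clear()
    suite = unittest.TestSuite(OrderDependentTests(name) for name in names)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


forward = run_in_order(["test_a_fill_cache", "test_b_read_cache"])
reverse = run_in_order(["test_b_read_cache", "test_a_fill_cache"])
```

A common remedy is to give each test its own state, for example by building the data it needs in `setUp`, so that the outcome no longer depends on execution order.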

2.2.9 Lack of or outdated documentation

Many developers lack the motivation to document code, as they perceive it as a secondary task compared to writing code. Some software development techniques even say that code should be self-explanatory and that code comments are unnecessary. However, studies show that lack of or outdated documentation increases maintenance costs [10]. So wherever code is created without needed documentation, there is technical debt. The documentation also needs to be up to date, as any outdated or out-of-sync documentation only creates confusion.

2.3 Python and Django applications

Python is a dynamically typed programming language that supports both object-oriented and functional programming. Its syntax is intended to be highly readable, and often uses English keywords rather than special characters. It also uses whitespace indentation to delimit blocks, instead of the curly braces that are common in other languages.

According to the TIOBE Programming Community index, which measures the usage of programming languages, Python is the 5th most popular language [12].

Django is an open-source web framework written in Python. It follows the Model-View-Controller architectural pattern. It also includes its own object-relational mapper, internationalization system, template system, form serialization and validation system and many other components. It also comes bundled with an authentication system and an admin interface, which is often one of the main selling points of the Django framework.

2.4 Django-specific causes of technical debt

When talking about causes of technical debt in the context of Django web applications, there are several Django-specific causes of technical debt. The main causes of technical debt in Django applications that have been identified are listed below; they were collected from talks [13, 14], books [15, 16, 17] and blog posts by Django developers.

2.4.1 Monolithic Django apps

In the Django framework small libraries are called apps. A Django project often consists of several apps that each have their own responsibility. They should be kept small so that they are potentially shareable between different projects. James Bennett, one of the core developers of Django, said in a talk [18] that one should write apps that “do one thing and do it well”.

Some Django projects consist of one large monolithic app that contains most of the features for the whole site. Large monolithic apps are often hard to understand, in comparison to small well-defined apps. These large apps often encourage and enable tightly coupled code, which makes it hard to break out code or make changes.

2.4.2 Misuse of signal receivers

Signals are used in Django to notify when an action occurs, for example when a model is created or when an HTTP response is finished. A receiver can be set up to take some action upon that signal. This concept is not unique to Django; it exists in many other languages (for example Java has listeners and events).

Below is an example of a signal receiver that is executed after the model MyModel is saved.

    from django.db.models.signals import post_save
    from django.dispatch import receiver
    from myapp.models import MyModel

    # Called synchronously every time a MyModel instance has been saved.
    @receiver(post_save, sender=MyModel)
    def my_handler(sender, **kwargs):
        pass

Signals are often unnecessary and can obfuscate code. For example, the logic in the post_save receiver above could have been put in the model itself, where other model operations live. One of the core Django developers, Aymeric Augustin, says [15]:

“I advise not to use signals as soon as a regular function call will do. Signals obfuscate control flow through inversion of control. They make it difficult to discover what code will actually run.”

Many developers [15] also make the mistake of thinking that signals are asynchronous, when they are in fact synchronous and blocking.
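The synchronous behaviour can be illustrated with a framework-free toy dispatcher. `ToySignal`, `toy_post_save` and the functions below are stand-ins written for this sketch and are not Django's implementation; they only mirror the idea that sending a signal calls each receiver in turn, in the caller's thread, before control returns.

```python
class ToySignal:
    """Toy stand-in for a signal dispatcher; not Django's implementation."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, sender, **kwargs):
        # Receivers are plain function calls: send() returns only after
        # the last receiver has finished.
        return [(receiver, receiver(sender=sender, **kwargs)) for receiver in self._receivers]


events = []
toy_post_save = ToySignal()


def update_search_index(sender, **kwargs):
    events.append("receiver ran")


toy_post_save.connect(update_search_index)


def save(instance):
    events.append("row written")
    toy_post_save.send(sender=type(instance), instance=instance)
    events.append("save() returned")


save(object())
```

Because the receiver runs inside `save()`, a slow receiver makes every save slow, and the call to it is invisible at the place where `save()` is invoked. A regular function call would have the same cost but would be easier to find when reading the code.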

2.4.3 Overuse of middleware and context processors

Middlewares in Django are used to hook into the process of every incoming request and outgoing response that reaches the application. It is common to use middlewares to log requests, compress response content or cache responses. These tasks could be done in each individual view, but that would require a lot of boilerplate code that the middleware class removes.

Figure 2.1. Shows the flow from request to response.

Context processors are similar to middlewares, but these processors are run over the context that is passed to templates. They provide common information for all templates, for example data that is shown in the footer and header of a website.

While both middlewares and context processors are useful to avoid repeating oneself in the code, they can be misused. If heavy operations or large database queries are executed in middlewares or context processors, they can impair performance. As they are executed for every request, response or template, it is important that they are kept minimal and perform well. If business logic is hidden in context processors or middlewares, it might also confuse the reader of the code, as it is not the obvious place for such code to exist.

2.4.4 Heavy queries from Django ORM

Django comes with an ORM (object-relational mapping) that is used to interact with the database. With an ORM developers do not have to write SQL code, and can instead write database queries in Python. The Django ORM is often efficient at its job, and it simplifies a lot of database interaction for developers. But it also has disadvantages: critics in the Django community [21, 22] argue that the ORM is inconsistent and that it lacks composability.

It is possible to use the Django ORM in a wrong way, causing performance problems. Developers may create heavy and complex queries using Python and the ORM, when the calculation could have been done better at the database level (with a raw SQL query). In an example made by Greg Gaughan [23], we assume there is a model of a Book.

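    from django.db import models

    # Author, Jacket and Shelf are assumed to be models defined elsewhere.
    # Django 2.0 and later also require an on_delete argument on ForeignKey.
    class Book(models.Model):
        title = models.CharField(max_length=100)
        author = models.ForeignKey(Author)
        jacket = models.ForeignKey(Jacket)
        shelf = models.ForeignKey(Shelf)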

In the view, the books are fetched with the following query.

    books = Book.objects.filter(shelf__position__lte=10)

The books are then listed in the template.

    {% for book in books %}
        {{ book.jacket.image }} {{ book.title }}
        by {{ book.author.name }}
    {% endfor %}

It might seem like a simple operation, but it will make 21 queries to the database: first one query to fetch the 10 books (one for each shelf position), and then, when iterating in the template, 2 queries for each book to fetch the related models. It could have been done with only one query:

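    -- Table names are simplified; foreign key columns follow Django's fieldname_id naming.
    SELECT
        book.id, book.title,
        jacket.image AS jacket_image,
        author.name AS author_name
    FROM book
    JOIN shelf ON book.shelf_id = shelf.id
    JOIN author ON book.author_id = author.id
    JOIN jacket ON book.jacket_id = jacket.id
    WHERE shelf.position <= 10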

The raw SQL query above can be created with the ORM as well (using select_related to perform joins). But the example shows a common problem: many Django projects carry technical debt in the form of unnecessarily heavy queries from the Django ORM.
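The query counts can be checked without a Django installation by replaying the same access pattern against an in-memory SQLite database. This is a sketch under stated assumptions: the simplified table names, the sample rows and the `query` helper that records each statement are invented for this example, and the per-book lookups only imitate what the template triggers through the ORM. In Django itself, adding `select_related("author", "jacket")` to the queryset is the usual way to get the joined form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE jacket (id INTEGER PRIMARY KEY, image TEXT);
    CREATE TABLE shelf (id INTEGER PRIMARY KEY, position INTEGER);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT,
                       author_id INTEGER, jacket_id INTEGER, shelf_id INTEGER);
""")
for i in range(1, 11):  # ten books, one per shelf position
    conn.execute("INSERT INTO author VALUES (?, ?)", (i, f"Author {i}"))
    conn.execute("INSERT INTO jacket VALUES (?, ?)", (i, f"jacket{i}.png"))
    conn.execute("INSERT INTO shelf VALUES (?, ?)", (i, i))
    conn.execute("INSERT INTO book VALUES (?, ?, ?, ?, ?)", (i, f"Book {i}", i, i, i))

executed = []  # every SQL statement sent through query()


def query(sql, params=()):
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def list_books_lazily():
    """One query for the books, then two more per book (the N+1 pattern)."""
    books = query(
        "SELECT book.title, book.author_id, book.jacket_id FROM book "
        "JOIN shelf ON book.shelf_id = shelf.id "
        "WHERE shelf.position <= 10 ORDER BY book.id"
    )
    rows = []
    for title, author_id, jacket_id in books:
        image = query("SELECT image FROM jacket WHERE id = ?", (jacket_id,))[0][0]
        name = query("SELECT name FROM author WHERE id = ?", (author_id,))[0][0]
        rows.append((image, title, name))
    return rows


def list_books_joined():
    """The same rows fetched with a single joined query."""
    return query(
        "SELECT jacket.image, book.title, author.name FROM book "
        "JOIN shelf ON book.shelf_id = shelf.id "
        "JOIN author ON book.author_id = author.id "
        "JOIN jacket ON book.jacket_id = jacket.id "
        "WHERE shelf.position <= 10 ORDER BY book.id"
    )


lazy_rows = list_books_lazily()
lazy_queries = len(executed)
executed.clear()
joined_rows = list_books_joined()
joined_queries = len(executed)
```

Both functions return the same rows; only the number of round trips to the database differs.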

2.4.5 Django models become god objects

In Django code it is not always obvious where one should put business logic. Some prefer to keep the logic in their models [19]. This can be problematic, as the models may become god objects. This anti-pattern [20] refers to objects that do too much. The models hold so much data and functionality, and are required by so much of the code, that they become “god like”. As they grow larger, they might become hard to understand. They are hard to maintain as well, as they are so tightly coupled with the rest of the code.

2.4.6 Complex template hierarchies

To generate HTML dynamically Django has a template language. It makes it possible to use constructs like if conditions, and for loops embedded in the HTML. One of the most useful parts of Django’s template engine is the template inheritance.

Developers create a base template, with all the common information of the site, and then let child templates extend and override parts of the base template.

An example of a base template from the Django tutorial [24]:

    <!DOCTYPE html>
    <html>
    <head>
        <link rel="stylesheet" href="style.css" />
        <title>{% block title %}My amazing site{% endblock %}</title>
    </head>

    <body>
        <div id="content">
            {% block content %}{% endblock %}
        </div>
    </body>
    </html>

A child template can then extend the base and override blocks like content and title.

    {% extends "base.html" %}
    {% block title %}{{ block.super }} - Child template{% endblock %}
    {% block content %}This is the content!{% endblock %}

A two-tier template architecture like the one above works really well. However, as websites grow, it is possible to extend further and create deeply nested template hierarchies. While it is advantageous to avoid code duplication, creating overly complex template blocks can be just as bad. This can create systems that are difficult to debug and modify [15].

2.5 Technical debt management

To prevent technical debt and deal with existing technical debt there are several activities that are done, and they can be summarized into the list below.

• Technical debt identification is the activity of detecting technical debt, for example through code analysis: analyzing the source code to find violations of coding standards or lack of tests, or looking at software metrics to identify design issues.

• Technical debt measurement quantifies the cost of maintaining the technical debt in a system. This can be done with calculation models, code metrics, and human estimations.

• Prioritization of technical debt ranks the identified issues, so the most urgent technical debt can be repaid first.

• Prevention of technical debt is done to limit any future technical debt. This can be done by for example improving the development process, so that developers have to follow best practices.

• Monitoring of technical debt follows the changes in technical debt over time.

• Repayment of technical debt resolves technical debt in a software system with for example refactoring.

• Technical debt documentation refers to ways of documenting technical debt, often with the use of what is called a “TD item”. There is no strict definition of a TD item, but it usually contains information such as who is responsible for repaying the item, where the debt is located and the estimated cost of repaying it.

• Communication of technical debt makes the “TD items” visible to stakeholders, for example by putting them on a dashboard or backlog.

The three most fundamental activities for managing technical debt are identification, measurement and repayment. These three types of activities to manage technical debt have also been given the most attention in research [6].
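To make the documentation and prioritization activities in the list above more tangible, the following sketch records “TD items” as simple data objects and ranks them. Since there is no strict definition of a TD item, the field names, the ranking rule (repayment cost divided by monthly interest, which assumes a non-zero interest) and the sample backlog are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class TDItem:
    """Hypothetical 'TD item' record; field names are illustrative only."""
    description: str
    location: str                    # where in the code base the debt lives
    responsible: str                 # who is expected to repay it
    repay_cost_hours: float          # estimated cost of repaying the debt
    interest_hours_per_month: float  # estimated extra maintenance cost (assumed > 0)


def prioritize(items):
    """Rank items by how many months it takes for repayment to pay for itself."""
    return sorted(items, key=lambda item: item.repay_cost_hours / item.interest_hours_per_month)


backlog = [
    TDItem("Monolithic core app", "core/", "team-a", 80, 4),
    TDItem("No tests for checkout view", "shop/views.py", "team-b", 16, 8),
    TDItem("PEP8 violations", "legacy/", "team-a", 6, 1),
]

ranked = [item.description for item in prioritize(backlog)]
```

A list like `ranked` is also the kind of output that could be shown on a dashboard or added to a backlog to communicate the debt to stakeholders.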

Chapter 3

Method

The results in this report are supported by three activities. First, the literature study gives a background and support to the conclusions of the work. Secondly, a survey on technical debt, with content from the literature study, was done to further define the concept of technical debt in the context of Django web applications. Thirdly, and lastly, a case study was done where different solutions for managing technical debt were tested and evaluated on a Django project.

3.1 Survey on technical debt

A survey was used to identify what software engineers think is the main cause of technical debt in Django web applications. Developers were asked through a questionnaire about their experiences of and opinions about technical debt. This method [5] makes it possible to collect quantitative but objective data.

The survey was sent out to developers at companies that use Django. The survey was inspired by a paper [9] that researched what the industry in general believes about technical debt. The survey done for this thesis put the concept of technical debt in the context of Django.

The survey asked whether the respondent was familiar with the concept of technical debt. If not, a definition was presented. The survey then asked how many years of experience with Django the respondent had and what role he or she had when working with Django (for example Software Architect or Software Engineer). Then it asked which causes of technical debt occur in their typical Django project. The causes of debt mentioned in the background were listed, and for each one the respondent could answer that it occurs often, sometimes or never, or that it is not applicable.

Finally, the survey asked which tools are used to manage technical debt in their typical Django project.


3.2 Case study of managing technical debt

To test approaches to managing technical debt, several tools and practices were evaluated on the code base of a Django project. The evaluation was made in a project that was under active development. The tools were tested from the perspective of a developer, to make the evaluations close to a real-world situation of managing technical debt. All tools were evaluated against predefined criteria, which are described in section 3.2.2.

3.2.1 Test code base

The tools were tested with an existing Django project described below.

• The total number of lines of code is 5545.

• It has had 8 contributors during its lifetime.

• Development started in October 2008 and has been active until today.

• It runs Django 1.5, and has over 14 other Python package dependencies.

• The database MySQL is used, and its schema has been changed multiple times during the project’s lifetime.

3.2.2 Evaluation criteria

When evaluating tools for identifying, repaying and preventing technical debt, the following criteria were used.

• Cost of usage. How easy was it to use and set up the tool? Are the reports from the tool understandable?

• Technical debt impact. Did the tool identify, repay or prevent technical debt that had high value and critical symptoms?

• Interest amount saved. How much extra work would be needed in future development, if this technical debt had not been prevented or identified and repaid?

• Accuracy. How accurately did the tool identify or prevent technical debt (or did it report false positives)?


Chapter 4

Results

The results are divided into two parts. First the responses from the survey are presented in charts and figures with explanations. The second part consists of the evaluations of tools to identify, repay and prevent technical debt. Each tool is first described, followed by an evaluation of how well the tool worked in the project.

4.1 Survey responses

There were 23 responses from Django developers about technical debt and their experience in managing it. No response rate was measured, but the survey was sent out to 8 different companies and organizations that use Django.

4.1.1 Demographic information

The majority of the respondents were familiar with the concept of technical debt, as can be seen in figure 4.1.

Figure 4.1. Familiar with technical debt. (Pie chart: Yes 83 %, No 17 %.)


The vast majority of the surveyed had the role of developer or software engineer.

Of the 23 respondents, 20 were software engineers or developers, 2 were software architects and 1 was a project manager. Most of the respondents had between 1 and 5 years of experience with Django, as can be seen in figure 4.2.

Figure 4.2. Experience with Django. (Bar chart, scale 0 to 10, with the categories more than 5 years, 3-5 years, 1-2 years and less than 1 year.)

4.1.2 Cause of technical debt

The respondents were asked what is causing technical debt in their typical Django project. The answers for general types of technical debt can be seen in figure 4.3 and the Django-specific types are in figure 4.4.

Figure 4.3. Cause of technical debt. (Bar chart, scale 0 to 25, with the answers Often, Sometimes, Never and Don’t know/not applicable for each cause: poor tests, lack of adhering to coding style and conventions, code duplication, dead and obsolete code, lack of or outdated documentation, legacy software package dependencies, overly complex code, lack of testing, and architectural issues.)


Figure 4.4. Cause of Django-specific technical debt. (Bar chart, scale 0 to 25, with the answers Often, Sometimes, Never and Don’t know/not applicable for each cause: overuse of middlewares and context processors, misuse of signal receivers, heavy queries from Django ORM, complex template hierarchies, Django models become god objects, and monolithic Django apps.)

When asked which tools they used in their typical Django project to manage technical debt, most developers answered error logging aggregation and code style and error analysis.

Figure 4.5. Tools used to manage technical debt. (Bar chart, scale 0 to 15, with the categories dead code detection, test coverage, code style and error analysis, and error logging aggregation.)

4.2 Evaluations of tools and practices identifying technical debt

The tools for identifying technical debt that have been evaluated in this thesis can be put into three different categories. The first one is logging aggregators, the second one is application monitoring and the third one is static code analyzers.


4.3 Logging aggregators

When something goes wrong in a Django web application, it saves an error log entry. Informational or debugging logs are also saved to better understand how an application works. As applications grow, so will the number of log entries. Therefore, it is useful to aggregate logs and present them in a format so that developers can investigate the most important problems. If a web application runs on multiple servers, then it becomes even more important to put all logs in one place. Below are two log aggregation tools that have been evaluated.
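The core idea of log aggregation, grouping identical errors so that the most frequent ones surface first, can be sketched with the Python standard library. This is only an illustration of the principle and not how any of the evaluated tools are implemented. The log entries and the function name most_common_errors are hypothetical.

```python
from collections import Counter

# Hypothetical error log entries as (exception type, location) pairs.
entries = [
    ("KeyError", "shop/views.py:42"),
    ("ValueError", "shop/forms.py:17"),
    ("KeyError", "shop/views.py:42"),
    ("KeyError", "shop/views.py:42"),
]


def most_common_errors(log_entries, limit=10):
    """Group identical errors and return them ordered by frequency."""
    return Counter(log_entries).most_common(limit)


summary = most_common_errors(entries)
for (error, location), count in summary:
    print(f"{count:>3}  {error} at {location}")
```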

4.3.1 Sentry

Sentry aggregates logs and presents them in a web interface where it is possible to view the most common events and errors. It can also give alerts to users if there is a sudden spike in errors, or if new unique errors occur. This makes it easy for developers to tackle the most critical errors in an application. A screenshot of the Sentry dashboard can be seen in figure 4.6.

Figure 4.6. Sentry dashboard

The installation of Sentry for Django can be very easy if one decides to pay $29 per month for Sentry’s hosted service. It only requires a few minutes of work and everything is set up. There is also the free option of hosting Sentry on your own server. This should however be done with caution, as error reports will not be accessible if your own server goes down. In many cases it might be more reliable to host Sentry with a third party.

For all error logs Sentry gives a lot of context that is really useful when debugging errors. The error reports contain information such as the release version of the code base, readable HTTP headers, and well formatted tracebacks. It is also possible to search among the logs, filtering by time or content. All these features make it easier to identify visible technical debt. Hidden technical debt, for example scaling issues, will not be observable with this tool.

4.3.2 ElasticSearch, Logstash and Kibana

One of the most powerful ways to aggregate logs is to use the ELK-stack (ElasticSearch, Logstash and Kibana). Logstash is responsible for collecting logs and parsing them. The formatted logs are then stored in ElasticSearch, which is a text search engine. On top of these tools is Kibana, which is a web interface to present data from ElasticSearch. Together the ELK-stack can be used to present aggregated error logs. With further configuration you can also add alerts for error spikes or unique errors.

Figure 4.7. Screenshot of Kibana

It requires a lot of work and configuration to install the ELK-stack. Today there are no packages to send formatted logs from Django to Logstash. This means that some configuration of Logstash is required to parse the standard Django logs. Kibana also has many features, which can make the interface hard to learn and understand; this means it can be difficult to create dashboards in it. If there is a need to set up an alert system, that will require more configuration.
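One way to reduce the parsing configuration needed on the Logstash side could be to let the application write its logs as JSON, one object per line. The sketch below uses only the Python standard library and was not part of the evaluation. The class name JsonFormatter and the chosen field names are assumptions made for illustration.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record):
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the formatted traceback when an exception was logged.
            payload["traceback"] = self.formatException(record.exc_info)
        return json.dumps(payload)


# Build a record by hand to show the output without configuring handlers.
record = logging.LogRecord(
    name="django.request", level=logging.ERROR, pathname="views.py",
    lineno=42, msg="Internal Server Error: %s", args=("/cart/",),
    exc_info=None,
)
line = JsonFormatter().format(record)
print(line)
```

In a Django project such a formatter would typically be referenced from the formatters section of the LOGGING setting. How the resulting lines are shipped to Logstash is left open here.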

All in all, this system requires extensive knowledge to be used. It will also have to be maintained, which might create more technical debt for the developers than it will be able to identify. The ELK-stack should not be used for only Django logs, as it is too much infrastructure to maintain. When there is a need to display more custom logs, the ELK-stack is probably better suited as it can be customized heavily.

4.4 Application monitoring

Monitoring tools are used to analyze and display information about a web application’s execution. The information makes it possible for developers to identify and fix problems. They can for example monitor database query performance or template rendering times.

4.4.1 New Relic

New Relic is a web monitoring service that costs $75 per month to use. The service monitors response times, database queries, error rates, CPU use, and uptime for web applications. The reports and monitoring can be viewed in a web interface.

One of the most useful features of New Relic is the reporting of slow loads in an application. Slow loads can then be investigated through transaction traces, to see if the problem lies in the view handler, calls to external services, a middleware, template rendering or the database. This makes it easy to tackle performance problems.

Figure 4.8. Transaction trace in New Relic


It is very easy to deploy New Relic’s monitoring system, as it only requires a Python package to be installed and a small modification of Django’s startup script. From testing New Relic and learning about its features, it appears to be a really good tool for tackling technical debt. It gives great insight into the monitored Django application, making it easier to improve its performance. However, when using it with the test Django project it was hard to find improvements to make. In a larger Django web application with more traffic, a tool like New Relic can probably give better insight.

4.4.2 AppEnlight

AppEnlight is an application monitoring system that is free to use but not open source. As it is a hosted service, it might cost money to use in the future. AppEnlight’s reports can be viewed through their web interface. It reports slow calls and breaks down how much time is spent in each request layer (middleware, template, view, etc.).

It will also report on slow SQL calls, so that the queries can be improved for better performance. This is helpful when an application is performing badly, and improvements need to be made. When using AppEnlight with the test project, it did not report any slow database calls or other information that could be used to improve performance. It is easy to install AppEnlight though, as it only requires their Python package and some changes to Django’s settings.
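The breakdown of time per request layer that these monitoring tools report can be illustrated with a small sketch based on the standard library. It is not how New Relic or AppEnlight are implemented. The names timed and handle_request are hypothetical, and the sleep calls merely stand in for real work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per named layer of request handling.
timings = defaultdict(float)


@contextmanager
def timed(layer):
    """Add the time spent inside the with-block to the given layer."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[layer] += time.perf_counter() - start


def handle_request():
    """Stand-in for a request; the sleeps simulate work in each layer."""
    with timed("database"):
        time.sleep(0.05)
    with timed("template"):
        time.sleep(0.005)


handle_request()
# The layer with the largest accumulated time is the first to investigate.
slowest = max(timings, key=timings.get)
for layer, seconds in sorted(timings.items()):
    print(f"{layer}: {seconds * 1000:.1f} ms")
```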

Figure 4.9. Screenshot of AppEnlight


4.5 Static code analyzers

The static code analyzers that were evaluated were pep8, Pyflakes, Flake8, Pylint and Vulture. They all have checks to see that the analyzed code is formatted in a correct manner. All of them are very popular packages for Python and they are consistently downloaded thousands of times per month (see table 4.1).

Table 4.1. Statistics from PyPI in January, 2016. [26]

Tool        Last updated    Downloads per month
pep8        2016-01-12      937670
Pyflakes    2015-09-20      669146
Flake8      2015-12-15      572869
Pylint      2016-01-15      370066
Vulture     2015-09-28      3541

A comparison of the supported functionality for the different tools can be seen in table 4.2.

Table 4.2. Functionality (checkers) for the static code analyzers.

pep8 Pyflakes Flake8 Pylint Vulture

Coding standard X X X

Naming conventions X X

Docstring conventions X X

Errors X X X

Complex code X X

Duplicated code X

Dead code X X

4.5.1 pep8

This tool checks Python code to see if it is compliant with the coding standard PEP8. The PEP8 standard [7] defines how indentation should be done, how strings should be constructed, how long lines should be, and much more. It is executed from the command line, and output from an example run can be seen below:

prime/forms.py:15:1: E302 expected 2 blank lines, found 1
prime/forms.py:20:1: E302 expected 2 blank lines, found 1
prime/forms.py:24:1: W293 blank line contains whitespace
prime/forms.py:25:1: W293 blank line contains whitespace
prime/forms.py:30:8: W291 trailing whitespace
prime/forms.py:31:80: E501 line too long (82 > 79 characters)
prime/forms.py:32:1: W293 blank line contains whitespace
prime/forms.py:41:8: W291 trailing whitespace

The tool is very easy to set up, as it only requires a Python package to be installed. The reports are also very understandable and it only takes pep8 about 3 seconds to run through the entire code base of 5000 lines. The technical debt impact from not following the PEP8 standard can be seen as very low though, as the tool does not identify any critical errors in code (even if most Python projects should follow PEP8 as it is the de-facto coding standard for Python).
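To show what kind of rule lies behind three of the message codes in the output above (E501, W291 and W293), a much simplified checker is sketched below. It is not the implementation used by pep8. The function name check_style is hypothetical and only these three checks are covered.

```python
def check_style(path, source, max_length=79):
    """Return pep8-style messages for long lines and stray whitespace."""
    messages = []
    for number, line in enumerate(source.splitlines(), start=1):
        prefix = f"{path}:{number}"
        stripped = line.rstrip()
        if len(line) > max_length:
            messages.append(
                f"{prefix}:{max_length + 1}: E501 line too long "
                f"({len(line)} > {max_length} characters)"
            )
        if line and not stripped:
            # The line consists of whitespace only.
            messages.append(
                f"{prefix}:1: W293 blank line contains whitespace"
            )
        elif stripped != line:
            column = len(stripped) + 1
            messages.append(f"{prefix}:{column}: W291 trailing whitespace")
    return messages


sample = "x = 1  \n   \ny = 2\n"
report = check_style("example.py", sample)
for message in report:
    print(message)
```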

4.5.2 Pyflakes

Pyflakes analyzes programs and detects defects such as unused imports, assignment errors, return statements outside of methods or functions, and many more errors and warnings. Here is some output from an example run:

wishlist/forms.py:2: 'IntegerField' imported but unused
wishlist/forms.py:3: 'HiddenInput' imported but unused
wishlist/forms.py:4: 'ForeignKeyRawIdWidget' imported but unused
wishlist/forms.py:7: undefined name 'platform'
wishlist/forms.py:20: redefinition of unused 'models' from line 5

Pyflakes is easy to set up and use, as it is only a Python package that needs to be installed. Pyflakes can theoretically find errors that can have a big impact on technical debt, but in the Django project that it was tested against it only found module imports that were unused. Unused imports slow down the startup time and might clutter the namespace, but other than that they are not a big issue.
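The detection of unused imports can be illustrated with the ast module from the standard library. The sketch below is a simplification and not the algorithm used by Pyflakes. For instance, it ignores scopes and names listed in __all__. The function name unused_imports is hypothetical.

```python
import ast


def unused_imports(source):
    """Return imported names that are never referenced in the module."""
    tree = ast.parse(source)
    imported = {}
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds the name "a"; "as" overrides the binding.
                name = alias.asname or alias.name.split(".")[0]
                imported[name] = node.lineno
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(name for name in imported if name not in used)


# The sample is only parsed, never executed, so Django is not required.
code = (
    "from django.forms import IntegerField, CharField\n"
    "import sys\n"
    "field = CharField()\n"
)
result = unused_imports(code)
print(result)
```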

4.5.3 Flake8

Flake8 is a wrapper around three other tools: Pyflakes, pep8 and Ned Batchelder’s McCabe script. Pyflakes and pep8 have been described above, and the McCabe script calculates cyclomatic complexity for functions. Cyclomatic complexity is a measure of the complexity of a program, and the script can give warnings when code is regarded as too complex.
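Cyclomatic complexity can be approximated as one plus the number of decision points in the code. The sketch below counts such points with the ast module. It is an approximation written for this illustration and not the algorithm of the McCabe script, which may count some constructs differently. The function name cyclomatic_complexity is hypothetical.

```python
import ast

# Node types that open an extra path through the code.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source):
    """Approximate complexity: one plus the number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra paths.
            complexity += len(node.values) - 1
    return complexity


code = (
    "def grade(score, bonus):\n"
    "    if score > 90 and bonus:\n"
    "        return 'A'\n"
    "    for limit in (80, 70):\n"
    "        if score > limit:\n"
    "            return 'B'\n"
    "    return 'C'\n"
)
result = cyclomatic_complexity(code)
print(result)
```

For the sample function the count is one for the function itself, two for the if statements, one for the boolean operator and one for the loop.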

Together these tools form a powerful tool that gives you a lot of information about how you should structure your Python code to keep it consistent, readable and correct. It has the same kind of output as pep8 and Pyflakes, so the results are very easy to understand, but the identified problems can be categorized as having low technical debt value when tested against the Django project used in this report. The tool is very fast, and it takes only a second to run it over the entire code base.


4.5.4 PyLint

PyLint is one of the most comprehensive static code analyzers for Python. It checks for coding standard, dead code, duplicated code, naming conventions and much more. When run on the test code base, it outputs a huge report. With the default configuration the report contains over 30 000 lines. Most of the reported issues are also of little importance (causing little technical debt). Examples are “missing trailing whitespace” and “variable names of too few characters”. To get any useful information out of PyLint, one must choose which errors and warnings to use.

It is also difficult to set up PyLint. It might just be the tested code base that it has problems with, but there are several Python files that PyLint cannot process. Therefore, several files had to be excluded for the tool to work. There were also encoding errors when trying to export the results as HTML. When running the tool over the entire code base, it takes over 4 minutes for PyLint to finish.

4.5.5 Vulture

Vulture is easy to set up to identify dead code, but it has several issues with the Django framework. As Django uses variable names and classes implicitly, Vulture will mistake a lot of these variables and classes for dead code. Examples of wrongly identified dead code are management commands, migration files, middlewares, tests, admin options, model variables, form variables and settings variables. Most of the correctly identified lines from Vulture are unused classes and functions. Therefore it cannot be said that Vulture has good accuracy when trying to identify technical debt.
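Why implicit usage leads to false positives can be shown with a naive dead code detector. The sketch below is not Vulture's algorithm. It only compares defined names with names referenced as identifiers, and the sample module is hypothetical. The middleware class is referenced only through a string in a settings list, which is how Django 1.5 loads middleware, so the detector cannot see that it is used.

```python
import ast


def naive_dead_code(source):
    """Flag classes and functions whose names never appear as identifiers."""
    tree = ast.parse(source)
    defined = set()
    referenced = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef)):
            defined.add(node.name)
        elif isinstance(node, ast.Name):
            referenced.add(node.id)
        elif isinstance(node, ast.Attribute):
            referenced.add(node.attr)
    return sorted(defined - referenced)


code = (
    "class AuditMiddleware(object):\n"
    "    def process_request(self, request):\n"
    "        return None\n"
    "\n"
    "def helper():\n"
    "    return 1\n"
    "\n"
    "MIDDLEWARE_CLASSES = ['project.middleware.AuditMiddleware']\n"
    "value = helper()\n"
)
flagged = naive_dead_code(code)
print(flagged)
```

Both flagged names are in fact used by the framework at runtime, so each one would have to be verified and dismissed manually.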

A comparison between PyLint and Vulture’s dead code detection can be seen in table 4.3. It should be noted though that most of the dead code found from PyLint was unused imports and variables, while Vulture identified unused classes and methods.

Table 4.3. Accuracy of dead code detection tools.

Tool       Lines of unused code identified    Lines of unused code identified, after manual verification    Accuracy
vulture    184 occurrences                    27 occurrences                                                 14 %
pylint     103 occurrences                    103 occurrences                                                100 %
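The accuracy column in table 4.3 is the share of reported occurrences that were confirmed by manual verification. The short calculation below reproduces it from the numbers in the table. The function name accuracy is only chosen for this illustration.

```python
def accuracy(reported, confirmed):
    """Share of reported occurrences that manual verification confirmed."""
    return confirmed / reported


# Occurrence counts taken from table 4.3.
vulture_accuracy = accuracy(reported=184, confirmed=27)
pylint_accuracy = accuracy(reported=103, confirmed=103)

print(f"vulture: {vulture_accuracy:.1%}")
print(f"pylint:  {pylint_accuracy:.1%}")
```

For Vulture this gives 27/184, or about 14.7 %, which the table reports as 14 %.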

4.6 Evaluations of tools and practices to repay technical debt

4.6.1 Tombstone decorators

In a large code base it might be hard to determine whether a class or a function is used or not. To solve such a problem, one can use the practice called “tombstones”. By marking a function with a tombstone decorator (see code example in listing 1), a log entry will be saved if the function is executed. If such a log entry is then created, we will know that the code actually is used, and that it should not be removed. If a function has not been used for a certain period of time, we can conclude that the function is indeed “dead” and not used, and we can safely remove it.

@tombstone
def example_function():
    pass

Listing 1: A function marked with a tombstone decorator
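Listing 1 leaves the decorator itself undefined. A minimal sketch of how such a decorator could be written with the Python standard library is given below. The logger name and the message format are assumptions made for this illustration and are not taken from an existing tombstone package.

```python
import functools
import logging

# Hypothetical logger that collects all tombstone hits in one place.
logger = logging.getLogger("tombstones")


def tombstone(func):
    """Log every call so that code suspected dead can prove it is alive."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.warning(
            "Tombstone hit: %s.%s", func.__module__, func.__qualname__
        )
        return func(*args, **kwargs)
    return wrapper


@tombstone
def example_function():
    return "still alive"
```

Every call to example_function now leaves a log entry. If no entry has appeared after the chosen observation period, the function is a candidate for removal.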

The problem with using tombstone decorators is that they require too much work just to detect dead code in Python. Developers using tombstones will have to maintain the tombstone code and its logs, to make sure they work as intended. This maintenance will create more technical debt than the removal of dead code will repay.

4.6.2 Automated code formatting

There are several tools to automatically fix code formatting issues in Python. The three tools tested are isort, autopep8, and autoflake. To sort Python imports alphabetically and to separate them into sections, isort can be used. To format code correctly, autopep8 will fix most formatting issues in accordance with the PEP8 standard. Finally, autoflake removes any unused variables or imports. An example of badly formatted code can be seen below.

import math
import sys

class ClassExample( object ):
    def __init__ ( self, val ):
        #There should be a space after the hash in comments.
        if val : val+=1; val=val* val ; return val
        return sys.path

After running the tools on the code example, it will then be formatted correctly and unused imports are removed:

import sys


class ClassExample(object):

    def __init__(self, val):
        # There should be a space after the hash in comments.
        if val:
            val += 1
            val = val * val
            return val
        return sys.path

When running these tools over the entire code base, they work really well. They make sure that the code is readable and consistent. It is also really easy to use these tools, as they are just Python packages that need to be installed. There are some exceptions to what these automated tools can do, though. If for example non-standard libraries are imported but unused, they will not be removed automatically, as they might have side effects that are intended.

4.7 Evaluations of tools and practices to prevent technical debt

4.7.1 Pre-commit hooks

For projects that use version control, it is possible to add hooks that run when developers are committing code. This way it is possible to run “sanity checks” on all the code that is added or changed in a source code repository. The hooks that run on the added or changed code can be any program or code analyzer. To make it easy to add pre-commit hooks to git, there is the Python package called pre-commit.

The installation of pre-commit is very easy, and there are several hooks that already exist for Python. Pre-commit hooks are very useful, as they can be used to coordinate several developers so that they standardize their code consistently. Hooks can then make sure that all developers are running the same automated code formatting tools or that no deprecated function name is included in the added file.

As pre-commit hooks are only executed over the changed files, it is useful to use these hooks to make big refactorization efforts. Instead of for example trying to change the usage of a deprecated function in the whole code base, we can make sure that incremental changes are made with pre-commit. This makes pre-commit very useful in preventing technical debt, and also in repaying it.
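A hook that rejects newly added calls to deprecated functions could be sketched as a small Python script. The sketch assumes that the hook receives the paths of the changed files as arguments, which is how the pre-commit package normally invokes hooks. The deprecated function names, the message format and the function names are hypothetical.

```python
import re

# Hypothetical names of functions that the team has deprecated.
DEPRECATED_NAMES = ("legacy_price", "old_checkout")
PATTERN = re.compile(r"\b(" + "|".join(DEPRECATED_NAMES) + r")\s*\(")


def find_deprecated_calls(path, source):
    """Return one message per line that calls a deprecated function."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = PATTERN.search(line)
        if match:
            name = match.group(1)
            problems.append(f"{path}:{number}: call to deprecated {name}()")
    return problems


def main(paths):
    """Check the given files; a non-zero return value blocks the commit."""
    problems = []
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            problems.extend(find_deprecated_calls(path, handle.read()))
    for problem in problems:
        print(problem)
    return 1 if problems else 0
```

When used as a hook, the script would end by passing the command line arguments to main and exiting with its return value. Since the pattern also matches the definition of a deprecated function, the file that defines it would have to be excluded from the check.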

4.7.2 Deprecation decorators

As systems grow, some features or functions become suspended as they are not recommended for use anymore. It might not be possible to remove some functions that are widely used in a system, though. A good compromise is then to deprecate such a function, by giving out warnings about its usage. The Python package debtcollector helps in deprecating Python functions and methods. An example of using debtcollector’s deprecation decorators can be seen below, where the open method is deprecated.

from debtcollector import removals


class Project(object):
    @removals.remove
    def open(self):
        pass

If one tries to use the open method, the following warning is printed on the command line:

__main__:1: DeprecationWarning: Using function/method 'Project.open()' is deprecated

This can help developers avoid the mistake of using deprecated functions. The package also includes decorators for situations when a function is moved or renamed. It requires that developers are disciplined enough to look for warnings in new code, as the deprecated code might still work as expected. It was hard to test the usefulness of deprecation decorators, but if they keep any developer from using a deprecated function, they can be said to prevent technical debt. They are probably most useful in large projects where many developers simultaneously work on the same code base.
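The mechanism behind such a decorator can be sketched with the warnings module from the standard library. This is a stand-in written for illustration and not debtcollector's implementation. The decorator name deprecated is hypothetical, and the message text merely imitates the output shown above.

```python
import functools
import warnings


def deprecated(func):
    """Emit a DeprecationWarning each time the wrapped function is called."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"Using function/method '{func.__qualname__}()' is deprecated",
            category=DeprecationWarning,
            stacklevel=2,  # attribute the warning to the calling code
        )
        return func(*args, **kwargs)
    return wrapper


class Project(object):
    @deprecated
    def open(self):
        return "opened"


# Record warnings instead of printing them, to inspect what was emitted.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = Project().open()
```

Since the deprecated code still works, the warning is easy to overlook. One option is to let the test suite turn such warnings into errors, for example by running Python with the option -W error::DeprecationWarning.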


Chapter 5

Discussion

An interpretation of the survey results is given in this chapter, followed by a short analysis of how the definition of technical debt can change the responses that were given. From the evaluation, some guidelines are given on how to best manage technical debt in a typical Django project.

5.1 Interpretation of survey results

5.1.1 Familiarity with technical debt

The results show that most (83 %) of the surveyed already were familiar with the concept of technical debt. This shows that the concept of technical debt is well known amongst Django developers. Other studies [9, 3] have reported that technical debt is a well used concept in the industry, so this was not a surprise.

5.1.2 Causes of technical debt

When developers were asked what was causing technical debt in their typical Django project, the two top answers were “architectural issues” and “lack of testing”. Over 43 % said that both of these issues were causing technical debt often. The third most reported cause, with 30 % saying it occurs often, is “overly complex code”.

When other studies surveyed developers about technical debt, they came to very similar results. The big survey [9] saw “bad architecture choices” as the main cause of technical debt, with “overly complex code” as the second most common, while “inadequate testing” was named the fourth most common cause. That the results are so similar is not surprising, as all these kinds of technical debt are very general and can be applied to most projects. The fact that the questions were asked in the context of Django applications seems not to have made any difference.

“Lack of testing” is a common cause of technical debt, so we can only assume that many Django projects lack tests. One of the least reported causes of technical debt is poor tests, which might be because it requires a large test suite for such problems to occur frequently in the first place.


From the answers of the Django-specific technical debt causes, we can see that monolithic Django apps, Django models becoming god objects and complex template hierarchies were the three most reported causes. They can all three be seen as subsets of “architectural issues” and “overly complex code”, which were also reported as causing technical debt often.

With this information it is possible to answer the research question “What are the main causes of technical debt in Django web applications?”. The answer to that is badly implemented architecture, lack of tests and overly complex code. This can then manifest itself in a Django project, with monolithic Django apps or too large Django models.

5.1.3 Definition of technical debt

The definition of technical debt was not discussed in this survey, but it has been shown that there can be several interpretations of the metaphor. If the developers have different definitions for what is considered technical debt, then the replies on what is causing technical debt can be hard to analyze. When other studies have surveyed definitions of technical debt [9], they have used similar methods as this paper, and they also came to the conclusion that the industry had similar definitions of technical debt as the science community. This is expected as the concept of technical debt did not originate from research, but became popular in the software industry first.

5.2 Guidelines when developing with Django to manage technical debt

From evaluating and testing tools on a Django project, there were several tools that worked really well to manage technical debt. Some of the tools might not be necessary to use, while others should be used in all projects. Below is an analysis, and some guidelines, for how to develop with Django and how to manage technical debt on a typical Django project.

5.2.1 Follow code standards with help of static code analyzers

Even though the survey showed that “lack of adhering to coding standards” is rarely causing technical debt in Django projects, there are still tools for doing style checks that should be used. They require very little effort to set up, but will make it easy to keep code bases consistent in style. Therefore any Django developer should make use of such tools, to make the code as readable as possible. From the results we can see that the most comprehensive analyzer was flake8. Not only does it check for the PEP8 standard in Python code, but it also tries to find errors.

The alternative to flake8 would be PyLint, but as it can take over 4 minutes to run the tool over the code base, it is not optimal to use. If one has to wait too long, it will only hinder development and the tool will become a nuisance. PyLint also requires a lot of configuration to set up, compared to the default configuration of flake8, which is better suited for Django projects.

5.2.2 Use Sentry to aggregate logs

Having all error, warning and informational logs in one interface is very useful. It makes it easy to investigate problems that occur, so that it is possible to fix them. That is why Sentry is strongly recommended for log aggregation. Compared to the ELK-stack it is very easy to set up. Sentry also needs very little maintenance, which keeps the technical debt low.

5.2.3 If there are performance issues use application monitoring

For a small application, like the one that was tested, it might not be necessary to have application monitoring. For the tested project it was hard to find any improvements to make with the application monitoring, as can be seen in the results. Improving performance might not be an application problem, instead better infrastructure or front-end improvements might be more effective.

If an application has a lot of traffic, and has a problem with performance, then it can be useful to use application monitoring. Such tools will then be able to identify technical debt causes like “overuse of middleware” and “heavy queries from Django ORM”. Currently New Relic has the most features, however AppEnlight is a good free alternative.

5.2.4 Deprecate functions with Python decorators

As seen in the results, there are high quality Python packages that can be used to warn about deprecated functions with Python decorators. Deprecating functions or methods can often be easier than trying to refactor the whole code base to remove the deprecated function calls. By using deprecation the developers will not have to debate the cost versus benefit of “removing” a certain function. Deprecation will create some technical debt, but it will at least be deliberate and visible debt that can be dealt with in the future in a planned manner.

5.2.5 Prevent technical debt with pre-commit tools

Pre-commit hooks should be used in all Django projects. These hooks make it possible to keep development practices consistent for all developers, for example by making sure that tests pass before developers commit code, or that no deprecated functions are added to the source code repository. As they are really easy to implement (see results 4.7.1), pre-commit hooks are a cheap way of doing “continuous integration”.


5.2.6 Active identification of dead code might be unnecessary

In the survey, few developers thought that dead code was a common cause of technical debt. The tools that exist for detecting dead code are also not that accurate. For example, the static analyzer Vulture can be avoided, as seen in results 4.5.5, because its results are inaccurate due to how Django uses variables and classes “implicitly”. The practice of using tombstone decorators to conclude whether a class or method is indeed dead is also not optimal. The maintenance of tombstones will create more technical debt than the removal of dead code will repay.

5.3 Reliability of the results

The evaluation of tools to manage technical debt was only carried out in one Django project. Depending on the size of the project, the number of developers, and the amount of traffic the web application has, the challenges of managing technical debt will be different. This study has to be seen as one evaluation of many, and the conclusions are based on the experience from this “case study”, which could be different from others.

The distinction between repayment, prevention and identification of technical debt is also not always clear. There are no strict definitions for the terms, which means that some tools and practices can be hard to put into one category.
