Blocking violations in reactive Java frameworks


Academic year: 2021


Robin Sundström

Morgan Vallin

Computer Engineering BA (C), Final Project, 15 hp Main field of study: Computer Engineering

Credits: 15


Robin Sundström, Morgan Vallin

BSc. Thesis, Software Engineering (Programvaruteknik)

Institute of Computer and Systems Sciences

Mid Sweden University Östersund, Sweden

{rosu1701, mova1701}@student.miun.se

Abstract 

Concurrency in programming is a way of interleaving tasks in order to enhance the performance of an application. Previous research has found that concurrency errors are hard to avoid, hard to find, and that they often degrade the performance of the application. Reactive programming provides an abstraction that makes it easier to implement complex concurrent and asynchronous tasks. Reactive programming in Java is often done with a reactive framework, where RxJava and Project-Reactor are two of the more popular choices. Blocking a thread that is not supposed to be blocked will result in concurrency errors, without the Java compiler providing a warning. In order to find incorrect blocking, a tool called BlockHound can be used. BlockHound wraps the original code, intercepts any blocking calls, and provides an error if the blocking call was used incorrectly. In this study, BlockHound was used to detect erroneous blocking calls in open source projects which use RxJava or Project-Reactor. A JavaAgent was created to automate the process of adding BlockHound to a project. The projects to test were selected by evaluating community usage and choosing the projects with the most stars and forks, as this indicates that the projects are widely used. The projects were tested with BlockHound, and the errors were saved to external log files for analysis. The study found that a considerable percentage of the projects investigated exhibited blocking violations. These violations were all caused by a small number of system calls, made from methods in threads that forbid blocking. Generalizable solutions to the violations were applied, and considered successful.

 

Index terms 

RxJava, Project-reactor, Concurrency Error Study, BlockHound, Reactive programming

1. Introduction

 

Concurrency in programming is the concept of interleaving the execution of multiple tasks, to enhance the speed and performance of software [1]. If one task is waiting, the program is not put on hold. Instead, the execution continues with another task. With the advancement of computers, and their rising number of processor cores, the need for concurrent applications has increased [2]. Implementing concurrency correctly creates programs with the potential to utilize a larger part of the processor’s capacity. The drawback is that it is often difficult to implement a concurrent program flawlessly [3]. Recently, the reactive programming paradigm has gained popularity, partly because it makes concurrency easier to implement [4]. The reactive paradigm is useful when asynchronous stream tasks become complicated. Reactive frameworks add a layer of abstraction which simplifies the implementation of concurrency significantly [5]. More developers are taking advantage of the reactive paradigm, and some high profile companies, like Netflix and Microsoft, are using it to a great extent in their products and services. Therefore, understanding the pitfalls of the paradigm is essential [5]. This will help in avoiding future misuse and maintenance of faulty code [5].

 

Programming reactively is possible in multiple languages, and the guidelines are stated in the Reactive Manifesto [4]. It is an event-driven paradigm, and according to the Reactive Manifesto, reactive programming should be non-blocking in its asynchronous message-passing [4]. Thus, code that blocks an asynchronous or concurrent thread is not allowed. Violating this may cause the application to be slow or unresponsive, which will lead to a degradation in performance [6].

 


This violation could potentially create the performance issues stated earlier [6].

 

Concurrency errors are among the most difficult errors to debug [11], as it may not be obvious that an error exists, or in which way it impacts the performance of the program. They are also one of the most widely studied types of errors in traditional multithreaded programming languages [11]. Blocking a non-blocking thread may be considered a concurrency error, since the error is caused by implementing concurrency incorrectly, and it may potentially reduce a program’s performance. It is therefore essential to know how to locate and fix such errors.

 

Previous research has investigated generic concurrency errors in the Java API [12]. Researchers have also created and compared tools for finding and fixing different concurrency errors [13]–[20]. To the best of the authors’ knowledge, no previous research has been done to examine how common it is for non-blocking threads to get blocked. Therefore, this study aims to address this issue.

 

This will be an empirical study to estimate how frequently this error occurs in open source projects which implement either the RxJava or the Project Reactor framework. BlockHound will be used to examine open source projects, to determine the occurrence of such a flaw [10]. The study will identify where such violations exist, the causes of the violations, and suggest potential fixes. Knowledge of how to identify and fix errors in a project will aid in developing higher-quality code going forward.

 

2. Problem statement

 

The Java compiler does not prohibit the use of blocking calls in non-blocking threads [10]. When using a reactive framework, it is possible to violate the rules of reactive programming by blocking incorrectly. Using a blocking call in a non-blocking or concurrent thread is not safe, as race conditions and deadlocks may occur due to bad timing [21]. Another problem with using blocking operations incorrectly is that such an error may not be found during testing of the software, as concurrency errors are difficult to find [11]. To avoid this, blocking calls should not be used in threads which are supposed to be non-blocking [4]. However, this could prove to be problematic, as asynchronous APIs may utilize blocking calls, even though the documentation states otherwise [21]. It is also possible for the developer to overlook this problem, as no warnings are given by the compiler. To the best of the authors’ knowledge, no previous research has been done to find out how frequently incorrect blocking occurs in open source projects which use the Project Reactor or RxJava frameworks. As finding and fixing errors is crucial for developing high quality software, this study aims to collect empirical data on the frequency, causes, and fixes of these blocking violations. The data collected in the study may help in preventing developers from using blocking calls incorrectly in the future, as it will provide knowledge of the pitfalls and potential fixes of blocking violations.

 

3. Research Questions 

Incorrect usage of blocking calls may decrease an application’s performance [6]. Given bad timing, the worst case scenario is that the violation will cause a deadlock or race condition [21]. The Java compiler does not give a warning when such a violation is made, and as such, there could exist concurrency errors that the developers are not aware of. The focus of this study is to identify how frequently open source projects violate the contract by using blocking calls in non-blocking threads. Furthermore, the study will examine the errors found, determine the causes, and try to find potential fixes. Blocking violations are expected to be found, since concurrency errors are common, and the violations are difficult to catch without the use of external tools or software [11], [12], [21]. The following Research Questions (RQs) have been formulated to help in achieving the goals of the research:

 

1. RQ1: How often do open source projects violate the contract by blocking non-blocking threads?

2. RQ2: What is the main cause of the errors produced by blocking incorrectly?

3. RQ3: What suggestions can be made to fix the blocking violations?

 

4. Limitations 

The empirical data extracted and the results produced by the study should be considered in light of some limitations. The collection of the empirical data was entirely dependent on the BlockHound API, as well as its implementation through the constructed agent. There are also limitations regarding the source of the collection. The open source projects examined, their test suites, and the criteria used to choose them are limiting factors of the study. The study was additionally bound to a deadline and thus limited in time.

 


taken is that the number of projects investigated, which is low relative to the overall domain, may give an inaccurate estimation. The quality of the research is dependent on the criteria being without self-selection bias. Furthermore, it is necessary for the criteria to be an accurate representation of quality in open source projects. Quality may be assumed given a large community usage, since this suggests that the project has been used and inspected by the community without finding major flaws. This correlation between criteria and research quality limited the research, since sub-optimal criteria may yield sub-standard projects to examine. As with all statistics, the empirical data collected by the study is susceptible to standard statistical deviations [22]. Regardless of the sources, the data might not be representative of the true frequency with which these violations occur. Since the testing was primarily achieved through the test suites provided by the projects, the data collected depends on the quality of those tests. If a test suite is insufficient, blocking violations might be missed.

BlockHound was chosen as the tool to be used in the study. The study was therefore limited by the accuracy of this API. If the tool is inaccurate, meaning that it cannot find every violation or that it produces false positives, the empirical data collected will itself be inaccurate. The BlockHound API also limited the research to projects that use supported frameworks.

 

5. Background 

A strategy for solving a programming problem is called a programming paradigm [23]. Different paradigms exist, and they have different use cases and features [23]. They vary in their allowance of side effects, their code flow and the code structure [24]. Reactive programming is an event-driven paradigm, which uses asynchronous data streams to communicate change [4].

 

RxJava [8] and Project Reactor [9] are two of the most popular reactive frameworks for Java [6]. They are used for implementing asynchronous and event-driven programs. An asynchronous event is one that occurs independently of the main execution flow of the program, and it does not expect a response immediately [25]. The use of asynchronous events is a helpful tool in many applications, as the program can continue its execution and retrieve the value of the asynchronous task when it’s finished. The alternative would be a synchronous event, which will effectively wait until one task is finished before proceeding to the next [26]. If a synchronous solution were used for slow and cumbersome tasks, such as I/O or network calls, it would result in an unresponsive application, as the interface can’t be updated while the cumbersome task is being executed. Related to synchronous and asynchronous are the concepts of blocking and non-blocking execution.

 

Blocking code is code which blocks further execution until it is finished [27]. The opposite is non-blocking code, which does not block further execution [27]. Asynchronous and non-blocking, as well as synchronous and blocking, might seem like synonyms, but there are subtle differences. Synchronous and asynchronous refer to the order in which tasks are executed, while blocking and non-blocking refer to whether or not we hold up further execution before proceeding [27]. Asynchronous events are usually non-blocking, while synchronous events are often blocking.
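The distinction can be illustrated with a minimal sketch that uses only the Java standard library. The class and method names are hypothetical and chosen for this illustration; `slowTask` merely stands in for an I/O or network call. Both variants start the task asynchronously, but only one of them holds up the caller.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

final class BlockingVsNonBlocking {

    // Stand-in for a slow task such as an I/O or network call.
    static int slowTask() {
        try {
            TimeUnit.MILLISECONDS.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 42;
    }

    // Asynchronous but blocking: join() holds up the caller until the value exists.
    static int blockingCall() {
        return CompletableFuture.supplyAsync(BlockingVsNonBlocking::slowTask).join();
    }

    // Asynchronous and non-blocking: a callback is registered and the
    // future is returned to the caller immediately.
    static CompletableFuture<String> nonBlockingCall() {
        return CompletableFuture.supplyAsync(BlockingVsNonBlocking::slowTask)
                .thenApply(value -> "result=" + value);
    }

    public static void main(String[] args) {
        CompletableFuture<String> pending = nonBlockingCall();
        System.out.println("caller keeps running while the task is in flight");
        System.out.println(blockingCall());
        // Joining here only keeps the demo JVM alive until the callback has run.
        System.out.println(pending.join());
    }
}
```

Calling `join()` from a thread that is supposed to stay responsive is the kind of blocking this study looks for.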

If blocking code is used in threads that are not supposed to be blocked, the program may potentially suffer from concurrency errors [6]. A concurrency error may be defined as “a behaviour that could not be obtained by a sequential execution” [28]. Errors can be detected at compile-time or at run-time [29]. Compile-time (usually) occurs before the program is running, while the compiler is turning the source code into machine code. Compile-time errors are errors that happen at compile-time. The compiler shows the cause of the error, such as wrongly used syntax or semantics. Run-time is the time when the program is being executed. A run-time error is an error that passes all compiler tests, but fails while running the program. Examples of run-time errors are infinite loops, trying to access a variable that is null, or trying to divide by 0.

For detecting incorrectly used blocking calls in Java, a tool called BlockHound can be used [10]. BlockHound wraps the original code, and alters the bytecode by adding a checkBlocking() method, to detect if there is any blocking code present where it is not allowed. BlockHound has support for RxJava2 and Project Reactor 3.2.x out of the box.
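The idea behind such a check can be sketched with the standard library alone. This is a toy model and not BlockHound's implementation: BlockHound instruments bytecode and reports its own error type, whereas here the guard is called by hand, the marker interface and thread class are hypothetical, and an `IllegalStateException` stands in for the error.

```java
final class BlockingGuardSketch {

    // Hypothetical marker for threads that must never block.
    interface NonBlockingThread { }

    // Hypothetical thread type carrying the marker.
    static final class EventLoopThread extends Thread implements NonBlockingThread {
        EventLoopThread(Runnable task) {
            super(task, "event-loop");
        }
    }

    // Fails when a blocking call is reached on a thread that carries the marker.
    static void checkBlocking(String call) {
        if (Thread.currentThread() instanceof NonBlockingThread) {
            throw new IllegalStateException("Blocking call! " + call);
        }
    }

    // A blocking operation with the check placed in front of it.
    static void guardedSleep(long millis) {
        checkBlocking("java.lang.Thread.sleep");
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs guardedSleep on a marked thread and returns the error message,
    // or null when no violation was reported.
    static String runOnEventLoop() {
        final String[] message = new String[1];
        Thread thread = new EventLoopThread(() -> {
            try {
                guardedSleep(1);
            } catch (IllegalStateException e) {
                message[0] = e.getMessage();
            }
        });
        thread.start();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return message[0];
    }
}
```

The same call is accepted on an ordinary thread and rejected on a marked one, which is why a violation depends on where the code runs and not only on what it calls.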

Build automation tools have been defined as “the technology which automatically compiles and tests software changes, packages the software changes into a binary, and prepares the created binary for deployment” [30]. Two popular build tools are Maven [31] and Gradle [32]. Both tools perform the same tasks, but differ in syntax. Maven has been around for a longer time and thus has more IDE support, whilst Gradle has higher performance [33]. Both tools are popular and frequently used in Java projects [7].

 

6. Related work 


and challenging to test and debug than non-concurrency related errors [34]. There have been many studies which focus on the detection and avoidance of different types of concurrency errors, such as deadlocks [14]–[17], race conditions [14], [19], [20], [35] and atomicity violations [35]–[38]. The previous research has been done for varying languages, systems and frameworks. All of the above papers present a technique for finding and fixing concurrency errors already present in the software, which are discovered at run-time. Unlike the above studies, our study will not come up with a new technique for identifying concurrency errors. We will instead map the frequency, causes and potential fixes of blocking violations in reactive open source projects.

 

Contrary to the above studies, Liu et al. [12] try to implement a technique for enforcing correct usage at compile-time. They state that there is a misuse of well-designed Java concurrency APIs, which creates hard-to-debug concurrency errors. Their suggested solution is using annotations for concurrency-related Java APIs, and developing a lightweight type checker to detect API misuse. After testing, it is confirmed that misuse of concurrency-related APIs is indeed a problem in open source projects present on GitHub. They conclude that “15% of the time when developers want to guarantee exclusive access on the concurrent collections, they use it wrongly” [12].

 

Instead of avoiding errors by enforcing correct usage, Mamun et al. [13] try to find concurrency errors at run-time. The study compares four static analysis tools for Java, and their capabilities of detecting Java concurrency errors and error patterns. The average defect detection ratio among the tools was 25%, which confirms the belief that concurrency errors are hard to find. This also suggests that using one static analysis tool alone will not find all concurrency errors.

 

When an error has been located, the next step is fixing it. Yu et al. [39] present a study that characterizes non-deadlock concurrency errors. The study takes a look at errors that have been marked as “fixed” or “closed”, and determines which fix patterns (e.g. lock, condition check, redesign etc.) were used for the different failure types (e.g. crash, hang, delay etc.). The study may be used by engineers to help in fixing non-deadlock concurrency errors.

 

Tu et al. [11] studied concurrency errors in open source projects that use the programming language Go. They looked at the root causes of the errors, and their fixes. They found and analyzed 171 concurrency errors, and concluded that their study provided a base for researchers and programmers to improve the code quality in the future.

 

To the best of the authors' knowledge, no previous research has been done to detect concurrency errors caused by using blocking calls incorrectly in Java. This study will focus on finding those erroneous blocking calls in open source projects available on GitHub. Our study will, like Liu et al. [12], try to enforce correct usage, by mapping the frequency of blocking violations in open source projects. Unlike Liu et al., our mapping will happen at run-time, for projects which already contain the violation. Additionally, this study will focus on finding concurrency errors and suggesting fixes, like Tu et al. [11], but it will target RxJava and Project Reactor projects instead of Go projects. Mamun et al. [13] showed that a single static analysis tool would not be able to detect all concurrency errors. Our study investigates concurrency errors caused by blocking violations. By enforcing correct usage, concurrency errors caused by incorrect blocking can be avoided.

7. Research methodology 

The aim of the research was to investigate the frequency of blocking violations, the causes as well as potential solutions to these, in reactive open source projects. The research was separated into four distinct parts. These are the steps suggested to achieve the aim of the study and to replicate the results. The first part was to create the testing tool, the second part was to find projects to test, the third part was to collect the data, and the fourth part was to analyse the data.

In the first part, the goal was to create a universal tool, which could be applied to different types of projects. The testing tool was created in order to make the installation of BlockHound easier, as well as to log the run-time errors. The second part was to find projects that were considered to be of the most value to the thesis. The value of a project was determined by the number of stars and forks it had on GitHub, since this would suggest that the project is widely used. The third part was to apply the tool to the chosen projects, in order to collect the testing data. The fourth part was to analyse the data found in the third part, and suggest potential fixes for the blocking violations found.

7.1 Construction of tools 


main methods was needed, since this would make the tool more universal. The tool would ideally install BlockHound to the target project, and log any errors to a log file. Installing BlockHound before any testing would make it possible to successfully register all blocking violations. Logging all blocking violations to an external file allows for analysing the data at a later time.

 

The first attempt to create a tool for automating the testing was semi-successful. The tool worked by adding it to a project as an external JAR. To use the JAR, the project package to be tested was supplied as a command-line argument, and the main method of the JAR was run. The JAR started by installing BlockHound, to ensure all testing was done with it enabled. It then searched the supplied project package for all JUnit tests annotated with the “@Test” annotation. All the found tests were then added and executed within a JUnit TestSuite. The errors found, which contained a BlockingOperationError, were logged to an external file. This JAR was tried on the first ten projects successfully. It installed the BlockHound agent, ran the tests and logged the data. However, the tool's limited flexibility restricted the project selection, because it depended on the “@Test” annotation.

 

The shortcomings of the previous tool led to the development of the final tool. This was the tool used for the majority of the research. The JAR created was added to the JVM options in the run configuration and used as a JavaAgent. This ensured that the JAR was executed before any code in the project [40]. The advantage of this was the flexibility, which allowed it to be added to both Gradle [32] projects and Maven [31] projects, and it was not dependent upon the “@Test” annotation. This means that the agent installed BlockHound regardless of what the class files contained, as long as they could be executed. The criteria were thus extended, and the number of testable projects increased. The drawback of the final tool was that automatic logging did not work. To compensate for this, the stack traces had to be copied manually when blocking violations were found.
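The mechanism described above relies on the standard `java.lang.instrument` entry point. The following is a minimal sketch of such an agent class and not the authors' tool: the class name and the `started` flag are hypothetical, and the BlockHound installation is only indicated by a comment so that the sketch compiles without the dependency.

```java
import java.lang.instrument.Instrumentation;

final class InstallerAgentSketch {

    // Set once premain has run, so that the start of the agent can be confirmed.
    static volatile boolean started;

    // The JVM calls premain before the application's main method when the JAR
    // is passed with -javaagent and its manifest names this class in Premain-Class.
    public static void premain(String agentArgs, Instrumentation instrumentation) {
        // An agent like the one in this study would install BlockHound here,
        // for example with BlockHound.install(); omitted to avoid the dependency.
        started = true;
        System.err.println("agent started, args=" + agentArgs);
    }
}
```

Because `premain` runs before any project code, the installation does not depend on how the project's tests are written or discovered.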

7.2 Selection of projects 

This section describes the selection process used to find the projects which were of most value to the research. This was achieved by applying criteria to the selection and evaluating the metrics.

  

7.2.1 Criteria and evaluation 

The criteria were designed to derive projects which were supported by the BlockHound agent. Following the criteria resulted in a list of projects which were testable by the tool created. The criteria enforced the following requirements:

● The project needs to be developed in Java. 

● The project must implement either RxJava, Project-reactor or both of the frameworks, since these are the frameworks which the thesis aimed to investigate.

● It needs to have a test suite implemented, or be an executable that runs a major part of the code. This requirement was enforced to provide BlockHound with as much data as possible.

The projects provided by using the criteria needed to be evaluated. This evaluation was used to find the highest rated projects. GitHub provides different metrics for the projects. Repositories may be sorted by these metrics, to prioritize those with the highest score in a given metric. The metrics used were the number of forks and the number of stars [41], [42]. Consequently, the projects tested were the highest-ranked projects in the list reduced by the criteria. The reasoning behind using the fork metric as a value indicator was that it indicates the usage of the project. Wider usage suggests that a violation, if found, would have a larger impact on the community. The star rating was used as an additional value indicator. The number of stars of a project indicates interest and appreciation from the community, which may then be translated into usage [43]. The threshold chosen for representing an acceptable community usage was at least 20 forks or 20 stars. There are other metrics as well, such as number of commits, number of contributors and code frequency. These were excluded, since they do not necessarily correlate to community usage.
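The threshold and ranking can be expressed as a short filter. This is a sketch under assumptions: the text does not state how forks and stars were combined when ranking, so ordering by forks first and stars second is a choice made for this illustration, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ProjectSelectionSketch {

    static final class Project {
        final String name;
        final int forks;
        final int stars;

        Project(String name, int forks, int stars) {
            this.name = name;
            this.forks = forks;
            this.stars = stars;
        }
    }

    // Threshold stated in the text: at least 20 forks or at least 20 stars.
    static final int THRESHOLD = 20;

    static boolean hasAcceptableUsage(Project project) {
        return project.forks >= THRESHOLD || project.stars >= THRESHOLD;
    }

    // Keeps the projects that pass the threshold, highest ranked first.
    // Assumption: forks are compared first, stars break ties.
    static List<Project> select(List<Project> candidates) {
        List<Project> selected = new ArrayList<>();
        for (Project project : candidates) {
            if (hasAcceptableUsage(project)) {
                selected.add(project);
            }
        }
        selected.sort(Comparator.comparingInt((Project p) -> p.forks)
                .thenComparingInt(p -> p.stars)
                .reversed());
        return selected;
    }
}
```

Since the threshold is a disjunction, a project with fewer than 20 forks still qualifies when it has at least 20 stars.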

  

7.2.2 Projects 

The criteria were primarily used to exclude projects which were of less value to the research. In conjunction with the evaluation of the project value, the criteria were used to collect projects from GitHub. Given the evaluation, these projects are considered to be representative of the community usage of the RxJava and Project-reactor frameworks. As such, the data collected may be used to infer the frequency and causes of blocking violations in projects implementing these frameworks.

 


selection process, just over 50% of the projects were successfully tested with the agent.

7.3 Collecting data 

The data was collected by adding the agent to each project. The errors presented by the stack trace were then logged. To validate the collected data, a control test of the agent installation was executed.

  

7.3.1 Data collection 

The agent was used in two different ways, depending on the build tool. Furthermore, depending on the project structure, different ways of running the executable files were used. For Gradle projects, the agent was added in the build.gradle file. The instructions added (see figure 1) install BlockHound before all test tasks, meaning that all tests that are recognized by Gradle will run with the agent deployed.

Figure 1: Instructions added to build.gradle

 

If the project structure allowed, all tests could be executed in a single process, which also created the logs and wrote to them. This was achieved by executing the tests from the terminal with a command (see figure 2) from the root directory.

 

Figure 2: Terminal command to execute all the tests in a project

 

 

If the projects were built with Maven, the agent was added from the run configurations of the task to execute. The ‘javaagent’ argument, together with the path to the JAR, was added to the program VM options (see figure 3).

 

 

Figure 3: VM options to run the BlockHound installation tool.

 

Usage of the Java Development Kit (JDK) versions 13 and newer prevented the BlockHound agent from being installed correctly. This posed a problem, since some of the projects were compiled with those versions. To circumvent this problem, the “-XX:+AllowRedefinitionToAddDeleteMethods” flag was added, alongside the javaagent argument, to the VM options.

7.3.2 Control of agent installation 

Avoiding false negatives was of primary concern, since missed blocking violations would undermine the research. The data would not be as accurate if not all violations were found. To avoid this, a test was added to the projects at an arbitrary class in the test suite (see figure 4). This test would cause a blocking violation, and if the agent is installed correctly, it should find the violation.

  

Figure 4: Code to cause a blocking violation with project-reactor.

 

7.4 Analysis and correction 

In order to fix the blocking violations, the log files produced in part 3 were analysed, and the number of tests and blocking violations for each project were summarized, as well as their main causes. An attempt to solve the blocking violations was made, by trying to find general solutions that could be applied to multiple projects.
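The summarising step can be sketched as a scan over the log text that counts violations per blocking call. This is an illustration and not the authors' procedure: it assumes that each violation appears in the log as a line containing `BlockingOperationError: Blocking call!` followed by the name of the blocking method, and the sample log lines are made up for the demonstration.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ViolationLogSummary {

    // Assumed shape of a violation line; the method name is captured in group 1.
    private static final Pattern VIOLATION =
            Pattern.compile("BlockingOperationError: Blocking call! (\\S+)");

    // Counts the violations in the log text, grouped by the blocking call.
    static Map<String, Integer> countByCall(String log) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher matcher = VIOLATION.matcher(log);
        while (matcher.find()) {
            counts.merge(matcher.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical log excerpt, not taken from the study's data.
        String log = String.join("\n",
                "reactor.blockhound.BlockingOperationError: Blocking call! java.lang.Thread.sleep",
                "\tat java.base/java.lang.Thread.sleep(Thread.java)",
                "reactor.blockhound.BlockingOperationError: Blocking call! java.lang.Thread.sleep",
                "reactor.blockhound.BlockingOperationError: Blocking call! java.io.FileInputStream#readBytes");
        System.out.println(countByCall(log));
    }
}
```

Grouping by the blocking call instead of by the calling method keeps the number of distinct causes small, which suits the search for general solutions.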

7.4.1 Analysing projects


used as a basis for finding general solutions that are applicable to multiple projects.

7.4.2 Correction 

When the root causes of the violations had been established, the next step was fixing the errors. The suggested solutions needed to be general solutions that are applicable to multiple errors, and as such, there might have been specific errors that were not fixed. There were two general approaches to fixing the errors. Either the error was fixed by changing the blocking method to a non-blocking variant, or by changing the thread to one that allows blocking calls. There was also the possibility of whitelisting certain methods in BlockHound, but as that merely ignores a blocking call instead of fixing the problem, that was not considered a solution. As we sought a generalizable solution, changing specific methods in the different projects was not considered feasible. By editing the projects which contained errors, general solutions to some problems could be found.

 

8. Results 

To be able to find blocking violations in open source projects, BlockHound was used. In order to analyse and fix the bugs found, logs of the errors were collected. This led to the creation of two tools, each with its own drawback. The first tool installed BlockHound, ran all JUnit tests in a TestSuite, and logged the blocking errors to an external file. The drawback was that the tool was only compatible with tests annotated with “@Test”. The second tool installed BlockHound as a Java agent in JVM options, but lacked any automatic logging. The logging was either done manually by copying the stack trace to a file, or by redirecting the output with the “>” symbol in the command prompt.

RQ1: How often do open source projects violate the contract by blocking non-blocking threads?

 

In order to find out how common it is for reactive Java projects to use blocking incorrectly, open source projects were tested with BlockHound. In total, 29 open source projects were tested. Out of those, 7 had blocking violations, meaning that 24.1% of the tested projects had at least one instance of incorrectly used blocking (see figure 5).

 

 

 

 

 

 

 

Figure 5: The distribution of tested projects with blocking                  violations 

 

The results of the testing were logged to external files, and the frequency of the blocking violations was summarised. Out of the 7 projects which failed the BlockHound testing, the percentage of failed tests caused by blocking violations varied between 0.09% and 32.90%. In order to show how widespread the blocking violations are in the reactive community, the number of stars and forks for each project, as well as the number of tests and violations, is shown in Table 1.

 

 

Name                Forks  Stars  Tests  Violations  Percent
feign-reactor-core     54    148    138          32   23.19%
reactor-netty         274   1041    484           3    0.62%
cf-java-client        293    295   2259           2    0.09%
reactor-core          649   2972   7193          94    1.31%
hivemq-mqtt            56    289   2383           7    0.29%
reactor-kafka          97    243    155          51   32.90%
reactor-spring         14     36     35           1    2.86%
Total                1437   5024  12647         190    1.50%

Table 1: Number of tests which generated blocking violations in the tested projects.
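The Percent column of Table 1 is the number of violations divided by the number of tests. As a quick check, the following sketch recomputes the column from the counts in the table. The class and method names are only illustrative and are not part of the tools created in this study.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Recomputes the "Percent" column of Table 1 from the test and violation counts.
class ViolationShare {

    // Share of tests that raised a blocking violation, in percent.
    static double percent(int violations, int tests) {
        return 100.0 * violations / tests;
    }

    // Two decimals and a fixed locale, so the output always uses a decimal point.
    static String format(int violations, int tests) {
        return String.format(Locale.ROOT, "%.2f%%", percent(violations, tests));
    }

    public static void main(String[] args) {
        // {tests, violations} per project, copied from Table 1.
        Map<String, int[]> rows = new LinkedHashMap<>();
        rows.put("feign-reactor-core", new int[] {138, 32});
        rows.put("reactor-netty", new int[] {484, 3});
        rows.put("cf-java-client", new int[] {2259, 2});
        rows.put("reactor-core", new int[] {7193, 94});
        rows.put("hivemq-mqtt", new int[] {2383, 7});
        rows.put("reactor-kafka", new int[] {155, 51});
        rows.put("reactor-spring", new int[] {35, 1});

        int totalTests = 0;
        int totalViolations = 0;
        for (Map.Entry<String, int[]> row : rows.entrySet()) {
            int tests = row.getValue()[0];
            int violations = row.getValue()[1];
            totalTests += tests;
            totalViolations += violations;
            System.out.println(row.getKey() + ": " + format(violations, tests));
        }
        System.out.println("Total: " + format(totalViolations, totalTests));
    }
}
```

Note that the total is computed from the summed counts, not as an average of the per-project percentages, which is why large test suites such as reactor-core dominate it.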

 


In order to determine what the main causes of the blocking violations were, the logs were analysed and summarised. The main causes were blocking system calls used incorrectly. The method in the project code that triggered each blocking violation was not of interest, as counting those would yield an extremely large number of causes. The blocking violations found were caused by 5 different system calls:

● sun.misc.Unsafe#park
● java.io.FileOutputStream#writeBytes
● java.lang.Thread.sleep
● jdk.internal.misc.Unsafe#park
● java.io.FileInputStream#readBytes

Each project had either 1 or 2 different system calls which        caused all of the violations, which is shown in Table 2. 

 

Name                Caused by
feign-reactor-core  java.io.FileOutputStream#writeBytes
reactor-netty       java.io.FileOutputStream#writeBytes, sun.misc.Unsafe#park
cf-java-client      java.io.FileInputStream#readBytes
reactor-core        java.lang.Thread.sleep, jdk.internal.misc.Unsafe#park
hivemq-mqtt         sun.misc.Unsafe#park
reactor-kafka       java.io.FileOutputStream#writeBytes, sun.misc.Unsafe#park
reactor-spring      java.io.FileOutputStream#writeBytes

Table 2: The main causes of the blocking violations in each project

 

8.1 Error messages 

This section will describe the error messages and the general        causes of them. 

8.1.1 Unsafe#park 

The error indicates that a thread is parked, that is, waiting and unable to execute. The error message may indicate a deadlock as a result of waiting on an unobtainable resource. The call is similar to Object.wait(), with the difference that it uses architecture-specific code. For this reason, it is considered unsafe. Methods like await() call it internally as their waiting logic.
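That await() ends up parking the calling thread can be observed with the standard library alone. The sketch below is an illustration, assuming a standard JDK in which CountDownLatch.await() is built on LockSupport.park(), which in turn delegates to Unsafe#park. The class name ParkDemo is hypothetical and not taken from any of the tested projects.

```java
import java.util.concurrent.CountDownLatch;

// Shows that a thread waiting in CountDownLatch.await() is parked:
// the JVM reports it as WAITING until another thread releases it.
class ParkDemo {

    // Starts a worker that waits on a latch, samples its state, then releases it.
    static Thread.State stateWhileAwaiting() {
        CountDownLatch latch = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                latch.await(); // the worker is parked here until countDown()
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "worker");
        worker.start();

        // Poll until the worker has actually parked (bounded to about 5 seconds).
        Thread.State state = worker.getState();
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (state != Thread.State.WAITING && System.nanoTime() < deadline) {
            Thread.yield();
            state = worker.getState();
        }

        latch.countDown(); // releases the worker so it can finish
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return state;
    }

    public static void main(String[] args) {
        System.out.println("State while awaiting: " + stateWhileAwaiting());
    }
}
```

While the worker is parked it cannot run any other task, which is the reason such a call is reported as a violation when it happens on a thread that is shared by many reactive pipelines.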

   

8.1.2 Read/write streams 

The system calls “FileOutputStream#writeBytes” and “FileInputStream#readBytes” were called as write/read logic from different methods. These calls block the thread while it writes or reads, which causes a blocking violation if the thread is non-blocking.

8.1.3 Thread.sleep() 

Thread.sleep makes the current thread wait for a defined        period of time. If done within a NonBlocking thread, a        blocking violation will occur. 

8.2 Violation cause and fixes 

The causes of the blocking violations, as well as the general        suggestion for fixes for each project will be listed in this        section.

  

 

RQ2: What is the main cause of the errors produced by                      blocking incorrectly? 

 

RQ3: What suggestions can be made to fix the blocking                    violations? 

8.2.1 Unsafe#park solution 

The blocking violations that involved Unsafe#park were caused by operations that used time delays. Executing these on threads marked as non-blocking resulted in blocking violations.

Depending on how the waiting was called, there were different solutions to remove the violation. If the waiting was done as a result of methods calling “Unsafe#park”, it may be solved by changing the thread pool on which the call was made. This was done by running the code on a Scheduler marked as blocking (see figures 6 and 7). The Schedulers are named differently depending on the framework, but the principle is the same.

 

 

Figure 6: The adjusted code with comments showing the cause of the violations. Project Reactor.


 

Figure 7: The adjusted code with comments showing the                  cause of the violations. RxJava. 

 

If the violation occurred as a consequence of the “block()” method being called with a timer, the solution was different. To solve this violation, the call must be wrapped by an object which will be scheduled on a thread marked as blocking (see figure 8).

  

 

Figure 8: Wrapping the blocking call.   
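The wrapping idea can also be sketched with the Java standard library alone. The following is a minimal illustration, not the code from the tested projects: the blocking call is wrapped in a task and handed to a pool whose threads are allowed to block. In Project Reactor the analogous construction would, as we understand it, be Mono.fromCallable(...) combined with subscribeOn(Schedulers.boundedElastic()). The names OffloadBlockingCall, blockingCall and the thread name are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class OffloadBlockingCall {

    // Stand-in for any call that blocks, such as block() or blockingGet().
    // Returns the name of the thread it ran on, so the effect is visible.
    static String blockingCall() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return Thread.currentThread().getName();
    }

    // Wraps the blocking call in a task and runs it on the given pool, so the
    // calling thread only receives a future and is not blocked itself.
    static CompletableFuture<String> offload(ExecutorService blockingPool) {
        return CompletableFuture.supplyAsync(OffloadBlockingCall::blockingCall, blockingPool);
    }

    // Demo helper: creates a pool for blocking work and reports where the call ran.
    static String runOffloaded() {
        ExecutorService blockingPool =
                Executors.newCachedThreadPool(task -> new Thread(task, "blocking-worker"));
        try {
            // join() blocks, which is acceptable here because the demo is called
            // from the main thread and not from a non-blocking thread.
            return offload(blockingPool).join();
        } finally {
            blockingPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("Blocking call ran on: " + runOffloaded());
    }
}
```

The design choice is that only the wrapped task moves to the blocking-capable pool, while the rest of the pipeline can stay on its original thread.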

Another solution is to remove the blocking call altogether and use subscribe instead. This is the best solution if possible. The block() method should be avoided where possible, since it ties up the thread and contradicts the event-driven nature of the reactive paradigm.

  

8.2.2 writeBytes/readBytes solution 

The “FileOutputStream#writeBytes” and “FileInputStream#readBytes” system calls were responsible for a large share of the violations. Writing and reading operations block the thread they are running on, and thus a blocking violation will occur if the scheduler used is non-blocking. The calls were made from methods using them as underlying logic, such as the “log()” method. These violations may be solved by using a scheduler that allows blocking. This can be applied to the entire pipeline or just to the method calling write/read. Furthermore, the use of blocking operations, such as “blockingGet()”, may also cause a blocking violation with the system call “writeBytes”. This is solved by adding a wrapper as demonstrated in figure 8.

8.2.3 Thread.sleep() 

The method will cause the same kind of violations as Unsafe#park, if used incorrectly. It will block the thread for the duration specified by the user. If it is applied to a non-blocking thread, this will cause a blocking violation. To solve this, the operation must be scheduled on a thread marked as blocking (see figure 9).

  

 

Figure 9: Example of scheduling Thread.sleep() on the elastic scheduler.
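Where the delay itself is the goal, the other general approach from section 7.4.2, replacing the blocking method with a non-blocking variant, may also apply. The sketch below illustrates this with the standard library only: instead of sleeping on the calling thread, a timer completes a future after the delay. This is an analogy for the delay operators of the reactive frameworks, not their implementation, and the names NonBlockingDelay, delayed and demo are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class NonBlockingDelay {

    // Returns immediately with a future that the timer completes after the delay.
    // The calling thread never sleeps; only the timer's own thread waits.
    static CompletableFuture<String> delayed(ScheduledExecutorService timer,
                                             String value, long millis) {
        CompletableFuture<String> result = new CompletableFuture<>();
        timer.schedule(() -> { result.complete(value); }, millis, TimeUnit.MILLISECONDS);
        return result;
    }

    // Demo helper: chains a callback onto the delayed value.
    static String demo() {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            CompletableFuture<String> done =
                    delayed(timer, "tick", 100).thenApply(v -> v + " after delay");
            // join() is used only so the demo can return the value; a reactive
            // pipeline would keep chaining callbacks instead of waiting.
            return done.join();
        } finally {
            timer.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```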

8.2.4 Other causes 

The use of block operations such as “Mono::block()” was a common reason for the blocking violations. These calls, although part of the reactive frameworks, should be used conservatively. Deadlocks may occur as a consequence of using them irresponsibly [21]. If they are used, it should not be on non-blocking threads. Another cause of violations was blocking network I/O. The DNS connection might be implemented as a blocking task, and thus a blocking violation might be present when connecting to servers, such as the ones accessed when using “@SpringBootTest(webEnvironment = WebEnvironment.RANDOM_PORT)”. Loading of test suites will block the main thread and may therefore cause a blocking violation. An additional aspect to consider when implementing the reactive frameworks is that certain methods run on certain schedulers by default. The consequence of this is that a blocking violation may occur since the method is called on a different scheduler than expected.

  

 

9. Discussion 

The results showed that approximately 24.1% of the tested projects had concurrency errors caused by using blocking operations incorrectly, which was not anticipated. This could be due to concurrency errors being difficult to find, as shown by Mamun et al. [13]. Their study found a detection ratio of 25% when comparing four different concurrency bug finding tools. The BlockHound tool was not, however, part of that comparison. The tools tested in the above study investigated concurrency errors as a whole. Since BlockHound is dedicated to one kind of concurrency error, it is considered to be more accurate, and thus the number of violations found is considered to be near the true value.

 


Java API. This is less than the 24.1% this study found. However, their study had a focus on the number of projects, and tested 344 projects, whereas this study tested projects with wider community usage, but only 29 of them. As a larger testing pool provides a more accurate result, this could explain the difference. Furthermore, one type of concurrency error is not directly comparable to another. Our study, like [12], provides tools for enforcing correct usage.

Like the studies of Yu et al. [39] and Tu et al. [11], our study provides guidelines on how to identify and fix concurrency errors. This will give future researchers and developers a foundation upon which further research can be done.

 

The main causes of the violations were five different Java/system calls. The calls were usually not made directly in the code, but further down the stack. The errors were caused by using methods that implemented blocking calls in threads which were marked as NonBlocking by Project Reactor or RxJava. For Project Reactor, those are the threads of the schedulers parallel() and single(). For RxJava, the corresponding schedulers are trampoline(), single() and computation(). Blocking calls should be avoided when executing in those threads. Instead, elastic() and boundedElastic() should be used for Project Reactor, and io() should be used for RxJava.
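The idea of marking threads as non-blocking can be modelled in a few lines. The sketch below is a simplified illustration and not BlockHound's actual implementation: it assumes that non-blocking threads carry a marker interface, and that a guard placed in front of a blocking call checks the current thread for that marker. All names (NonBlockingMarker, EventLoopThread, assertBlockingAllowed, the thread names) are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class NonBlockingMarkerModel {

    // Hypothetical marker for threads that must never block.
    interface NonBlockingMarker {}

    // A thread type carrying the marker, comparable to a parallel() worker.
    static class EventLoopThread extends Thread implements NonBlockingMarker {
        EventLoopThread(Runnable task) {
            super(task, "event-loop");
        }
    }

    // Guard placed in front of a blocking call: fails on marked threads.
    static void assertBlockingAllowed(String call) {
        if (Thread.currentThread() instanceof NonBlockingMarker) {
            throw new IllegalStateException("Blocking call: " + call);
        }
    }

    // A guarded Thread.sleep that reports what happened instead of throwing.
    static String guardedSleep() {
        try {
            assertBlockingAllowed("java.lang.Thread.sleep");
            Thread.sleep(10);
            return "allowed";
        } catch (IllegalStateException e) {
            return "violation";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    // Runs guardedSleep() on a single-thread pool of the requested kind.
    static String runOn(boolean nonBlocking) {
        ExecutorService pool = Executors.newSingleThreadExecutor(
                task -> nonBlocking ? new EventLoopThread(task) : new Thread(task, "elastic"));
        Callable<String> job = NonBlockingMarkerModel::guardedSleep;
        try {
            return pool.submit(job).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("marked thread:   " + runOn(true));
        System.out.println("unmarked thread: " + runOn(false));
    }
}
```

In this model the same call is a violation on one pool and legitimate on the other, which mirrors why moving a call to a scheduler that allows blocking removes the violation without changing the call itself.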

 

This study has been dependent upon the BlockHound API, which means that potential flaws in the framework may have produced inaccurate results. Furthermore, this study was heavily dependent upon the JUnit tests implemented in the tested projects, and therefore relies on the tests providing sufficient coverage. Additionally, the tests posed a risk of introducing blocking violations that were caused by the test implementation but were not present in the production environment. Such errors may still have implications, as a concurrency error in the test could produce a false negative, leading the developer to believe that a passed test means the production code is without errors. An additional implication is that the developer might not be aware of this type of blocking violation, which suggests that the developer may have introduced the same error in the production code. A large part of the violations found was caused by the use of pure blocking methods such as “block()”. The use of these in a testing environment may not be as critical as if they are used incorrectly in a production environment. As such, the developer may have recognized the risks of utilizing these operations but considered them trivial when used in testing. This would imply that the violations caused by block() should not necessarily be interpreted as developer ignorance. As this cannot be verified without insight given by the developer, it will, however, be assumed that these violations are in fact a product of ignorance.

 

The majority of the projects investigated in this study used Project Reactor as their reactive framework. This made the ratio between RxJava and Project Reactor projects uneven. The consequence is that the results of the research are mostly based on Project Reactor projects and may not provide an accurate mapping of blocking violations in RxJava projects. The aim was to have roughly a 50/50 ratio between the frameworks. This was, however, not possible due to the lack of highly rated Java projects which used RxJava. During this study, it became clear that the RxJava framework seems to be mostly used in Android development. As BlockHound does not support Android, a majority of the RxJava projects could not be tested. To compensate for this, five student projects which utilized the RxJava framework were tested. The projects were added to achieve a smaller divide between RxJava projects and Project Reactor projects. This is not to say that they only functioned as filler projects. The projects utilized the framework to a large extent, and a majority scheduled tasks manually on schedulers. The projects cannot, however, be considered representative of the RxJava community, given that they are not used by the public.

 

Some projects had a large number of tests, and these projects could be assumed to test the majority of the production code. Other projects had a very small number of tests, which would imply that a majority of the production code was left untested. If no violations were present in a project which had extensive testing, this would suggest that the project is unlikely to contain any violations. However, if no violation is found in a project which implemented a narrow test suite, there is a high likelihood that the production code may exhibit blocking violations regardless of what the tests showed. The majority of the projects which had violations in this study had an extensive test suite, and therefore left little room for violations to be missed. Given the previous logic, the criteria of the research should have been extended to include the scope of the tests in the projects investigated. This criterion could be based on how much coverage the test suite provided for the production code.

 

The societal and ethical implications and consequences of the work provided by this study are considered to be non-existent. This conclusion is drawn from the fact that the study is purely connected to a limited part of computer science, and limited to discovering and fixing errors of little impact to society at large.

  

10. Conclusion 


therefore be concluded that blocking violations are indeed a problem in the open source community.

 

The results also showed that a limited number of method calls were the main causes of the blocking violations, as only five different main causes were found. These were, however, used further down the stack in many different methods. As it is hard to know exactly which methods use blocking calls without deeper investigation, it is beneficial to use BlockHound as a safeguard when it is unclear which methods use blocking calls.

 

For fixing blocking violations in reactive projects, we found        that the easiest way is by changing the thread on which the        blocking call is made. Sometimes it might make sense to        change the method instead of the thread, if a non-blocking        variant of the method exists. However, it is often difficult to        do this, as a blocking call might be necessary. 

 


References 

[1] L. Lamport, ‘Time, Clocks, and the Ordering of Events in a Distributed System’, Communications of the ACM, 1978, vol. 21, no. 7, pp. 558–565.

[2] C. Saidu, A. Obiniyi, and P. Ogedebe, ‘Overview of Trends Leading to Parallel Computing and Parallel Programming’, Br. J. Math. Comput. Sci., 2015, vol. 7, no. 1, pp. 40–57.

[3] M. Batty, K. Memarian, K. Nienhuis, J. Pichon-Pharabod, and P. Sewell, ‘The Problem of Programming Language Concurrency Semantics’, in Programming Languages and Systems, 2015, vol. 9032, pp. 283–307.

[4] ‘The Reactive Manifesto’. https://www.reactivemanifesto.org/ (accessed Apr. 01, 2020).

[5] T. Nurkiewicz and B. Christensen, Reactive programming with RxJava: creating asynchronous, event-based applications, First edition. Sebastopol, CA: O’Reilly Media, Inc, 2016.

[6] T. Nield, Learning RxJava: build concurrent, maintainable, and responsive Java in less time, First edition. Packt Publishing, 2017.

[7] F. Cheng, Exploring Java 9. Apress, 2018.

[8] ‘ReactiveX/RxJava’. https://github.com/ReactiveX/RxJava (accessed May 11, 2020).

[9] ‘Project Reactor’. https://projectreactor.io/ (accessed May 11, 2020).

[10] ‘reactor/BlockHound’. https://github.com/reactor/BlockHound (accessed May 06, 2020).

[11] T. Tu, X. Liu, L. Song, and Y. Zhang, ‘Understanding Real-World Concurrency Bugs in Go’, in Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, Apr. 2019, pp. 865–878.

[12] S. Liu, G. Bai, J. Sun, and J. S. Dong, ‘Towards Using Concurrent Java API Correctly’, in 2016 21st International Conference on Engineering of Complex Computer Systems (ICECCS), Nov. 2016, pp. 219–222.

[13] A. A. Mamun, A. Khanam, H. Grahn, and R. Feldt, ‘Comparing Four Static Analysis Tools for Java Concurrency Bugs’, in MCC-10, Sep. 2010, p. 7.

[14] D. Engler and K. Ashcraft, ‘RacerX: Effective, Static Detection of Race Conditions and Deadlocks’, ACM SIGOPS Operating Systems Review, Oct. 2003, p. 16.

[15] H. Jula, D. Tralamazza, C. Zamfir, and G. Candea, ‘Deadlock Immunity: Enabling Systems To Defend Against Deadlocks’, in OSDI'08: Proceedings of the 8th USENIX conference on Operating systems design and implementation, Dec. 2008, pp. 295–308.

[16] M. Naik, C.-S. Park, K. Sen, and D. Gay, ‘Effective static deadlock detection’, in 2009 IEEE 31st International Conference on Software Engineering, 2009, pp. 386–396.

[17] Y. Wang, T. Kelly, M. Kudlur, S. Lafortune, and S. Mahlke, ‘Gadara: Dynamic Deadlock Avoidance for Multithreaded Programs’, in 8th USENIX Symposium on Operating Systems Design and Implementation OSDI 2008, Dec. 2008, pp. 281–294.

[18] M. Eslamimehr and J. Palsberg, ‘Sherlock: scalable deadlock detection for concurrent programs’, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014, Nov. 2014, pp. 353–365.

[19] M. D. Bond, K. E. Coons, and K. S. McKinley, ‘PACER: proportional detection of data races’, in PLDI '10: Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation, Jun. 2010, pp. 255–268.

[20] Y. Yu, T. Rodeheffer, and W. Chen, ‘RaceTrack: Efficient Detection of Data Race Conditions via Adaptive Tracking’, in ACM SIGOPS Operating Systems Review, Oct. 2005, pp. 221–234.

[21] ‘JavaZone 2019 - September 11th - 12th’, javazone.no. https://2019.javazone.no (accessed Apr. 01, 2020).

[22] D. G. Altman and J. M. Bland, ‘Standard deviations and standard errors’, in BMJ, Oct. 2005, vol. 331, no. 7521, p. 903.

[23] P. V. Roy, ‘Programming Paradigms for Dummies: What Every Programmer Should Know’, p. 39.

[24] ‘Programming Paradigms’. https://cs.lmu.edu/~ray/notes/paradigms/ (accessed Apr. 07, 2020).

[25] L. Ziarek, K. Sivaramakrishnan, and S. Jagannathan, ‘Composable Asynchronous Events’, in ACM SIGPLAN Notices, June 2011, vol. 46, no. 6, p. 12.

[26] I. Cassar and A. Francalanza, ‘On Synchronous and Asynchronous Monitor Instrumentation for Actor-based systems’, Electron. Proc. Theor. Comput. Sci., Feb. 2015, vol. 175, pp. 54–68.

[27] ‘Network Programming: Blocking & Non-blocking, Sync & Async’, Mike Xiao, Aug. 26, 2017. http://magickaichen.com/unblock-block/ (accessed Apr. 14, 2020).

[28] E. Bartocci and Y. Falcone, Eds., Lectures on Runtime Verification, vol. 10457. Cham: Springer International Publishing, 2018.

[29] ‘Debugging | Think Java | Trinket’. https://books.trinket.io/thinkjava/appendix-c.html (accessed Apr. 23, 2020).


Build Automation Tools?’, in 2017 IEEE/ACM 3rd International Workshop on Rapid Continuous Software Engineering (RCoSE), May 2017, pp. 20–26.

[31] ‘Maven – Introduction’. https://maven.apache.org/what-is-maven.html (accessed Apr. 22, 2020).

[32] ‘Gradle User Manual’. https://docs.gradle.org/current/userguide/userguide.html (accessed Apr. 22, 2020).

[33] ‘Gradle | Gradle vs Maven Comparison’, Gradle. https://gradle.org/maven-vs-gradle/ (accessed Apr. 23, 2020).

[34] S. Abbaspour Asadollah, D. Sundmark, S. Eldh, and H. Hansson, ‘Concurrency bugs in open source software: a case study’, J. Internet Serv. Appl., Dec. 2017, vol. 8, no. 1, p. 15.

[35] G. Li, S. Lu, M. Musuvathi, S. Nath, and R. Padhye, ‘Efficient scalable thread-safety-violation detection: finding thousands of concurrency bugs during testing’, in Proceedings of the 27th ACM Symposium on Operating Systems Principles - SOSP ’19, 2019, pp. 162–180.

[36] C. Flanagan and S. N. Freund, ‘Atomizer: A Dynamic Atomicity Checker For Multithreaded Programs’, in Science of Computer Programming, Apr. 2008, vol. 71, no. 2, pp. 89–109.

[37] Z. Lai, S. C. Cheung, and W. K. Chan, ‘Detecting atomic-set serializability violations in multithreaded programs through active randomized testing’, in Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - ICSE ’10, 2010, vol. 1, pp. 235–244.

[38] M. Vaziri, F. Tip, and J. Dolby, ‘Associating synchronization constraints with data in an object-oriented language’, in ACM SIGPLAN Notices, Jan. 2006, vol. 41, no. 1, pp. 334–345.

[39] M. Yu, Y.-S. Ma, and D.-H. Bae, ‘Characterizing non-deadlock concurrency bug fixes in open-source Java programs’, in Proceedings of the 31st Annual ACM Symposium on Applied Computing - SAC ’16, Apr. 2016, pp. 1534–1537.

[40] ‘Java Documentation’, Oracle Help Center. https://docs.oracle.com/en/java/ (accessed Mar. 24, 2020).

[41] ‘Fork a repo - GitHub Help’. https://help.github.com/en/enterprise/2.13/user/articles/fork-a-repo (accessed May 15, 2020).

[42] ‘Saving repositories with stars - GitHub Help’. https://help.github.com/en/github/getting-started-with-github/saving-repositories-with-stars (accessed May 15, 2020).

[43] H. Borges, A. Hora, and M. T. Valente, ‘Understanding the Factors That Impact the Popularity of GitHub Repositories’, in 2016 IEEE International Conference on Software Maintenance and Evolution (ICSME), Oct. 2016, pp. 334–344.

[44] M. Sulír and J. Porubän, ‘A quantitative study of Java


     

Appendix I: Time Plan 


Appendix II: Contributions 

Introduction 

Written 80% by Morgan, 20% by Robin.   

Problem statement 

Written 60% by Morgan, 40% by Robin.   

Research Questions 

Written 60% by Morgan, 40% by Robin.   

Limitations 

Written 10% by Morgan, 90% by Robin.   

Background 

Written 80% by Morgan, 20% by Robin.   

Related work 

Written 90% by Morgan, 10% by Robin.   

Research methodology 

Written 10% by Morgan, 90% by Robin.   

Results 

Written 20% by Morgan, 80% by Robin.   

Discussion 

Written 30% by Morgan, 70% by Robin.   

Conclusions 

Written 60% by Morgan, 40% by Robin. 

   

Code implementation 

Done 100% by Morgan, 0% by Robin.

Testing 

Done 70% by Morgan, 30% by Robin.   

Analysing (checking log files and summarizing numbers              and causes) 

Done 80% by Morgan, 20% by Robin.   

Error fixing 

Done 20% by Morgan, 80% by Robin. 

 

General 

The writing of the sections was split at roughly 50/50. However, the parts written by Robin were usually longer, so Robin has probably written 60% and Morgan 40%.

 

For the implementation and testing, Morgan did a majority.        For the analysing of data, Morgan did a larger part, and for        the error fixing, Robin did a larger part. In the practical        section overall, we would estimate that Morgan did 60%        and Robin did 40%. 

 

In general, we would consider the amount of work done was        distributed evenly at 50% each. 

 

Appendix III: Logs 

https://github.com/morganwallin/BlockHoundLogs   
