

Linköping University | Department of Computer and Information Science

Bachelor’s thesis, 16 ECTS | Datateknik

2021 | LIU-IDA/LITH-EX-G--2021/020--SE

Intelligent and Context-Aware Information Filtering in Continuous Integration Pipeline using the Eiffel Protocol

Intelligent och kontextmedveten informationsfiltrering i kontinuerlig integrationsrörledning med Eiffel-protokollet

Robin Gustafsson

Supervisor: Azeem Ahmad
Examiner: Kristian Sandahl


Copyright

The publishers will keep this document online on the Internet ‐ or its possible replacement ‐ for a period of 25 years starting from the date of publication barring exceptional circumstances.

The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purposes. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/.


Abstract

Software development has become more complex, and parts of it are increasingly automated. Continuous integration practices with automated building and testing greatly benefit the development process. Combined with continuous deployment, software can go from commit to deployment within hours or days, which means that every commit is a possible deployment. The ability to trace links between artifacts is known as software traceability, which has become a necessity and a requirement in the industry. Following these traces, answering questions, and basing decisions on them is a complex problem. Tools already used in the industry are hard to adapt since every stakeholder has different needs. The Eiffel protocol aims to be as flexible and scalable as possible in order to fit as many stakeholder needs as possible. This thesis extends Eiffel-store, an existing open-source application that visualizes events in the Eiffel protocol, with functionality to filter events and answer some of the questions stakeholders might have.


Acknowledgments

I would like to thank both my examiner Kristian Sandahl and supervisor Azeem Ahmad. They have both been wonderful and pleasant to work with, helping and giving me the information I needed to complete this thesis.

I would also like to thank my peer reviewer Jonatan Barr for his comments, in particular for pointing out that the user interface was hard to use and that the test data could be a limiting factor when testing the implementation.


Contents

Abstract
Acknowledgments
Contents
List of Figures
List of Tables
1 Introduction
   1.1 Motivation
   1.2 Aim
   1.3 Research questions
   1.4 Delimitations
2 Background
3 Theory
   3.1 Version control
   3.2 Continuous practices
   3.3 CI/CD pipeline
   3.4 Software traceability
   3.5 Eiffel protocol
   3.6 Eiffel-store
4 Method
   4.1 Implementation
   4.2 Evaluation
5 Results
   5.1 Implementation
   5.2 Evaluation
6 Discussion
   6.1 Results
   6.2 Method
   6.3 The work in a wider context
7 Conclusion
   7.1 How can intelligent and context-aware searching be implemented in the Eiffel protocol based on information needs?
   7.2 Is the queried information from the implementation correct?
Bibliography
Appendices
   A Event Sequence 1
   B Event Sequence 2
   C Event Sequence 3
   D Event Sequence 4
   E parseHelper function
   F filterEventSequences function
   G Algorithm Output


List of Figures

2.1 Eiffel-store visualization.
3.1 Minimal example of a CI/CD pipeline.
4.1 Example tree structure.
4.2 Process of traversing the tree with dropdowns.
4.3 Simplification of the event sequence used for the algorithm.
4.4 Simplification of the event sequences used for the information needs.
5.1 Data from the algorithm displayed in a tree structure.
5.2 The aggregation view with "TSF", "Name" and "TestSuite 1" selected.
5.3 "TestSuite 1" event listed in the details view.
5.4 Confidence for "TestSuite 1" and "TestSuite 2".
5.5 The aggregation view for "test-case-5" and error handler in red.
5.6 Test case events for "test-case-5" with error handler set to "Include event(s)".
5.7 Test case events for "test-case-5" with error handler set to "Exclude event(s)".
5.8 Environment names for "test-case-5" and "test-case-6".
5.9 Source code submitted information for id "1" and "5".
5.10 Test suites for ArtC with "pkg:maven/com.mycompany.myproduct/my-artifact@2".
5.11 Test cases for ArtC with "pkg:maven/com.mycompany.myproduct/my-artifact@2".
5.12 List of source code changes for id "alice".


List of Tables

3.1 Eiffel terminology and their corresponding events.
3.2 Eiffel events not included in the terminology.
3.3 Eiffel-store names for Eiffel events.


1

Introduction

1.1 Motivation

Software development projects often involve multiple people, and manually tracking changes and version releases can be challenging. The open-source community is also increasingly likely to adopt continuous integration, which provides automatic testing, building and even deployment [1]. When a project grows and more developers maintain it, tracking what has changed in each artifact becomes harder and requires more work hours for every merged change. Manually tracking artifacts is then no longer an option, and automated tracking brings its own challenges [2]. Google, for instance, runs 150 million test executions per day across over 13,000 individual projects [3]. Keeping track of which release contains which source code changes, bug fixes or test case outcomes can become unmaintainable when automation is involved, and tools are becoming a necessity to keep track of changes.

The practice of tracking software artifacts such as builds, test executions or releases is called software traceability. It has become an unwritten rule and a necessity in the industry, with governments even requiring its use [4]. One problem is that stakeholders, which may include developers, researchers, managers and anyone else with an interest, must come together and decide on which tool to use and what information they want out of the traceable data. Different stakeholders want different things out of traceability, and usage in the industry varies extensively since it is company-dependent [2]. To understand stakeholder needs, Ahmad et al. [5] asked companies what information needs they have and listed them together with the importance, frequency, effort, and time associated with each need. They found 27 needs, each related to one of the following: artifacts, builds, bugs, code changes, and tests. Another problem in software traceability is that the tool must scale, be flexible and traceable, and incorporate the third-party services that are commonly used in continuous integration. Many different tools and services may be used interdependently, which the tracking must take into account and be able to construct links for.

The Eiffel protocol aims to solve these issues with its standardized vocabulary and usage [6]. From its inception, it was designed to provide a high degree of traceability in interconnected and interdependent software pipelines. It provides a system where events produced anywhere in the pipeline are published on a global bus. The events can then be stored and traced bi-directionally, which makes it easier to filter and search the gathered information. The data provided in each event is minimal and abstracted to accommodate many needs. If more data is required by the company or stakeholder, it can simply be added since the protocol is open and flexible.

1.2 Aim

Stakeholders require different information about artifacts and software. There is no standardized tool in the industry for stakeholders to use, and the software in use is not extensive enough to track and filter through events to find a specific event based on user input. This thesis aims to add a filtering mechanism to an existing application that visualizes Eiffel events, so that events can be filtered to answer specific information needs stakeholders might have. This will show that the Eiffel protocol is flexible, scalable, and traceable in an interdependent system so that specific information needs can be answered. To verify that the information filtering is correct, an evaluation will be performed using five information needs from the study by Ahmad et al. [5].

1.3 Research questions

The problems that this thesis aims to solve are:

RQ1: How can intelligent and context-aware searching be implemented in the Eiffel protocol based on information needs?

RQ2: Is the queried information from the implementation correct?

The information needs from the study by Ahmad et al. [5] that are examined in this thesis are the following:

1. How much confidence do we have in a specific test suite?
2. In which environment/machine did the specific test cases fail?
3. Is the new feature implemented?
4. Which test case/suite have been run on which product and at what time?
5. How often does a specific employee deliver new code to the system?

1.4 Delimitations

The Eiffel framework and the information filtering could have been implemented in any language or application. Instead, this thesis focuses on extending the existing application Eiffel-store with the functionality needed to answer the research questions.

The selection of information needs from [5] could have been made in any way. In this thesis, five needs were selected based on how easy they would be to implement in Eiffel-store; no other selection process was used.


2

Background

By tracing events, commits, code, and test cases in the workflow of large-scale, heterogeneous development projects in real time, Eiffel gives multiple stakeholders an unprecedented ability to craft and tailor intelligence solutions to their needs. Visualization and analysis tools can effectively query large sets of events for monitoring or troubleshooting.

One implementation that uses the Eiffel protocol is Eiffel-store, which can visualize events and commit traceability as in the figure below:

Figure 2.1: This graph shows an Eiffel-store visualization of sample events. All nodes are events, time flows left to right and the arrow between the events shows what it links to. It starts with an artifact creation (ArtC) which starts an activity (ActS). This activity is a test suite (TSF) with four test cases (TCS). It also defines two environments (EDef) and a test recipe (TestExeRecipColl).

Azeem Ahmad was looking for a tool that could visualize data to answer questions, provide information filtering to show the appropriate information, and offer intelligent, context-aware filtering. This could either be integrated into Eiffel-store or developed as a new tool from scratch. It had to fulfill these four requirements:


1. The tool must use Eiffel events as input.

2. The output should be tailored to the needs of developers, testers, project managers or any other stakeholders.

3. The output graph should be human readable and provide enough information for decision making.

4. The tool should be open source.

It was decided to extend Eiffel-store with filtering functionality. The filtering method was also discussed and chosen to be dropdowns from which the user can choose values to filter by, and the information needs to answer were chosen freely from his paper on information needs. The needs were therefore selected by the ease of implementing them.


3

Theory

This chapter presents the theory needed to support the problem statements and includes the following sections: version control, continuous practices, CI/CD pipeline, software traceability, the Eiffel protocol, and Eiffel-store.

3.1 Version control

Tracking changes in source files or projects with version control allows for better management, and the ability to look through past revisions makes it possible to revert code changes. Version control is rarely done on its own by manually copying the files occasionally; instead, a version control system (VCS) such as Git is used to track the files in a local repository. Common features in a VCS include commit, which adds the changes to the repository; branch, where new changes can be developed separately before they are committed to the main branch; and merge, which combines branches. Code changes can be committed, and the VCS handles the rest and saves it to the local repository where all the information is stored.

Collaboration with a local repository is not ideal since a shared repository requires every developer to always have an up-to-date copy. A centralized version control system (CVCS) instead has the repository stored on a remote server, and each change is committed to that repository. This can be cumbersome to handle and maintain: a study by De Alwis and Sillito revealed that CVCS in production requires each maintainer to have access, which makes it harder to contribute, and that it has no atomicity, which can lead to corruption in branches [7].

A decentralized version control system (DVCS) improves on the shortcomings of CVCS: each developer has their own local copy of the repository where changes can be stored and later pushed to the remote repository. This makes collaboration easier, since changes can also be requested to be merged into the main branch, and it has atomicity. Git, a DVCS, is widely used in software engineering and is becoming more common in classrooms [8]. Services such as GitHub enable developers to have their Git repository stored in the cloud for both private and public use, which has greatly benefited collaboration and version management among developers [9].

Git: https://git-scm.com/
GitHub: https://github.com/


3.2 Continuous practices

Software development has many general practices, and each company or even project has its own standard. A widely known and growing practice is to use continuous integration in the project. A study by Hilton et al. shows that over 40% of open-source projects make use of CI and that popular projects are more likely to utilize it [1]. CI can also be extended further by automating the delivery and deployment process as well.

3.2.1 Continuous integration

Continuous integration (CI) is a widely known term in software engineering for the practice of developers in a team combining their code into a shared mainline branch [10]. It expands on the VCS and remote repositories by adding automation of building and testing the changes. The main branch can then be protected, and employing this practice can result in a reduction of bugs and an improvement in the quality of the software [11]. Each developer is supposed to commit their changes often, multiple times a day, to ensure their work is saved and to enable more frequent merges to the mainline so the changes can be shipped and debugged faster [12]. Test automation can be used to run regression tests for each commit or pull request to the repository through a CI service such as Travis CI, providing information on whether the changes were successful [1]. Code can be evaluated and tested before it enters the main branch, which also makes it easier to find where a bug is because it is known what has changed.

3.2.2 Continuous delivery

Continuous delivery (CDe) extends continuous integration by always keeping the software in a state where it could be delivered to production at any time after successful testing [11]. That the software is production-ready and has passed automated testing does not mean it can be deployed directly; it might require further manual testing and acceptance in a test environment that the CDe pipeline can automatically set up for the tester [13]. After passing, it can be deployed with the click of a button.

3.2.3 Continuous deployment

Completing the continuous practices is continuous deployment (CD), which extends CDe with automation of the deployment to production. The process is completely automated from code change to production deployment, and every merge to the main branch will continuously be deployed to customers within hours or days [14]. Since code changes are deployed rapidly, problems can also be reported back to the developer by the users of the software within a short amount of time. There are many more benefits, such as faster feedback, more frequent releases, and improved quality due to smaller releases [14]. However, CD can also cause problems, such as experimental features that should not be deployed yet, and it requires infrastructure both to handle the automated builds and to track each deployment [15].

3.3 CI/CD pipeline

All three of the continuous practices from 3.2 make up what is called the CI/CD pipeline, where every process from commit to deployment is automated. Automatic deployment does not mean that every build gets deployed; a build must pass certain criteria, such as passing all unit tests or an automatic test of the build, to be deployed. The pipeline can look very different based on needs or company requirements. Compiled languages might have to pass a build stage and a test stage, while software written for the web in JavaScript may only pass a build stage. The software and services used in the pipeline can also differ, since there are no standard tools for all languages and needs. A pipeline might use Git through GitHub and the CI services of Travis CI. Commits and pull requests can be configured to trigger automatic builds through Travis CI and return the results to GitHub via its integration. Below is a minimal example of the CI/CD pipeline outlined above.

Figure 3.1: Minimal example of a CI/CD pipeline.

3.4 Software traceability

Artifacts can be a variety of things: builds, test cases, test suites, source code changes, and releases. Keeping track of these artifacts and how they relate to each other is known as software traceability. For small or open-source software with a low artifact count this can be manageable, but for larger projects with many commits and artifact builds, keeping track of what each artifact contains becomes increasingly harder. Efficient and relevant information finding is the goal of software traceability [16], which has made it a necessity and a tool that should be used in software development [16, 4]. Using links and events in each step of the software pipeline creates a trace of events that can be followed to gather information. Each built artifact can link all the way back to the corresponding causing event. In figure 3.1, each stage creates one or more events that link back to the previous event. That way, information about the artifact can be traced backwards for information retrieval, such as the build outcome, test outcome, and the commit that caused the build.
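To make the idea of following trace links concrete, the sketch below walks events upstream from a given event by following its links. It is only an illustration: the event shape follows the Eiffel convention of a links array with target ids, and the function and variable names are hypothetical rather than taken from any real tool.

// Illustrative sketch (hypothetical helper): collect all events reachable by
// following links backwards, i.e. the upstream causes of a given event.
// Assumes eventsById is a Map from event id to an event of the shape
// { meta: { id, type }, links: [{ type, target }] }.
function traceUpstream(startId, eventsById, visited = new Set()) {
    const event = eventsById.get(startId);
    if (!event || visited.has(startId)) return [];
    visited.add(startId);
    const upstream = [];
    for (const link of event.links) {
        upstream.push(link.target);
        upstream.push(...traceUpstream(link.target, eventsById, visited));
    }
    return upstream;
}

Called on an artifact creation event, such a traversal would, for example, return the build activity and source change events that the artifact originated from.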

The perception of software traceability can be viewed from multiple perspectives. Cleland-Huang et al. discussed and built upon previous work and presented three interdependent perspectives [2]. The first is a goal-based perspective with seven quality goals derived from challenges: traceability should be (1) purposed to support needs; (2) cost-effective; (3) configurable to support needs; (4) trusted with full confidence; (5) scalable; (6) portable across projects; and (7) valued by all. The second is a process-oriented perspective, where traceability is planned, implemented accordingly, and links are created as the project goes on. The last is the technical perspective, where complex trace links are introduced that can produce links in the infrastructure and between services. The events that are created must then also support the different information that each service might provide, or an abstraction layer must be created to handle the information. It has also been found that the large amount of information provided by continuous integration services is too much for stakeholders or developers to handle [11].

3.4.1 Stakeholder needs

A stakeholder could be a developer, a researcher, a manager, etc. These are the people adopting the trace solution, and they must therefore drive the research on traceability forward [2]. Goals one, three, four, and seven from section 3.4 are all decided by the stakeholders. Since these people have different interests, they require different solutions. A developer might require specific details about failed test cases, while someone higher up only wants to know whether it failed, not why. This requires a large amount of coordination and management to be able to cooperate, which has been confirmed to be a problem [17]. There is also little research on stakeholder needs, since they depend on the company, and the industry needs to do more research [2].


3.4.2 Industry tools

Software traceability is not standardized in the industry, as mentioned in section 3.4.1, and therefore there is no standard tool to utilize. Customizing open-source tools for individual company needs is often not feasible. Maro and Steghöfer collected requirements for a traceability management tool and created a tool based on them [18]. They found three requirements: (1) the ability to create links to arbitrary artifacts; (2) the ability to define custom trace links for projects; and (3) the ability to view links in a matrix or graph view. Based on this they created Capra, a plugin for Eclipse that can trace code on a function level and has support for external services such as continuous integration tools.

Another tool is the Software Artefact Traceability Analyzer (SAT Analyzer), which has access to structured XML files and tracks changes in artifacts [19]. Relationships are then built in a semi-automatic way with learning algorithms that can detect commonalities among artifacts. Through extended research, this tool has gained a plugin for Jenkins, which allows it to be extended and used with continuous integration artifacts [20]. Since the SAT Analyzer does not support external services, they also had to set up automatic listening so that when the CI service has finished, it can trigger the tool and tell it that a new artifact was built.

3.5 Eiffel protocol

In need of a more adequate solution, Ericsson created the Eiffel protocol in 2012, designed to be flexible, scalable, and traceable, especially in interconnected and interdependent software pipelines [21, 6]. It was later open-sourced in 2016 and is available on GitHub. It provides a heavily standardized vocabulary and usage, and the user can decide which parts of it to use. This makes it flexible and highly customizable to fit individual needs. Since it is only a protocol, the user has to set up a system to use it. However, the Eiffel community on GitHub hosts and maintains software that can be used with the Eiffel protocol.

The concept is to produce globally broadcast atomic events, triggered by each activity in the pipeline, that reference each other with semantic trace links and can form a directed acyclic graph (DAG) [6]. It is also emphasized that the protocol supports bi-directional links, so that the graph can be traversed both upstream and downstream; this gives higher traceability and supports more needs. Events are triggered when activities happen, such as a build starting or a test case running. These events are then sent on a globally available bus, for instance RabbitMQ, that is always available. The payload of these messages is objects in JSON format that listeners can make use of.

3.5.1 Structure

Each Eiffel event sent through the global bus has different required members in its JSON object, depending on what the event describes. However, there are three required top-level members that every event has [22]: (1) meta, which contains metadata about the event; (2) data, which contains the payload and can include non-Eiffel entries; and (3) links, which contains links to other Eiffel events.
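To illustrate the three members, a minimal event could look roughly as follows. This is only a sketch: the field values are made up, and the exact required fields depend on the event type and protocol version.

// Hypothetical example of an Eiffel event with the three top-level members.
const exampleEvent = {
    meta: {
        id: "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",   // unique event id (made-up value)
        type: "EiffelArtifactCreatedEvent",
        version: "3.0.0",
        time: 1622540000000                           // epoch milliseconds
    },
    data: {
        identity: "pkg:maven/com.mycompany.myproduct/my-artifact@2"
    },
    links: [
        { type: "COMPOSITION", target: "11111111-2222-3333-4444-555555555555" }
    ]
};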

3.5.2 Glossary and vocabulary

The Eiffel terminology used in the vocabulary, as explained in [22], is listed in table 3.1. Since the vocabulary contains more events than terms, the remaining events can be found in table 3.2.

Jenkins: https://www.jenkins.io/
RabbitMQ: https://www.rabbitmq.com/


Activity: EiffelActivityTriggeredEvent, EiffelActivityStartedEvent, EiffelActivityFinishedEvent. Describes some sort of activity or action in the CI/CD system, for example that an activity has started or ended, or the outcome of something. Test cases and test suites are activities and have their own specific events.

Artifact: EiffelArtifactCreatedEvent, EiffelArtifactPublishedEvent, EiffelArtifactReusedEvent. Built software packages generated in a CI/CD pipeline, for example a built binary.

Composition: EiffelCompositionDefinedEvent. An immutable way to group specific versions of artifacts or source changes.

Confidence Level: EiffelConfidenceLevelModifiedEvent. Describes the confidence in something, for example an artifact, composition or source change. It is usually expressed in text form rather than numbers.

Environment: EiffelEnvironmentDefinedEvent. Describes an environment that an activity can execute in.

Source Code: EiffelSourceChangeCreatedEvent, EiffelSourceChangeSubmittedEvent. A commit to a Git repository; submitted means it has been merged.

Table 3.1: Eiffel terminology and their corresponding events.

3.5.3 Usage in the industry

Eiffel has not yet gained much traction and is not widely used in the industry [23, 24], although usage is growing as Ericsson is deploying it in its own development [23]. A study by Hramyka and Winqvist, interviewing experts from Axis Communications, shows that using Eiffel for software traceability brings them closer to their desired state and that it is a viable strategy in the industry [24].

EiffelTestCaseTriggeredEvent, EiffelTestCaseCanceledEvent, EiffelTestCaseStartedEvent, EiffelTestCaseFinishedEvent: Subset of the Activity events used for test cases.

EiffelTestSuiteStartedEvent, EiffelTestSuiteFinishedEvent: Subset of the Activity events used for test suites; binds multiple test cases into a single suite.

EiffelIssueDefinedEvent, EiffelIssueVerifiedEvent: Defines an issue in external software; it is verified with a text message such as "SUCCESSFUL_ISSUE".

EiffelFlowContextDefinedEvent: Can describe the context of other events.

EiffelTestExecutionRecipeCollectionCreatedEvent: Describes that a test recipe has been declared.

EiffelAnnouncementPublishedEvent: An announcement of some sort.

Table 3.2: Eiffel events that are not included in the terminology in table 3.1.

3.6 Eiffel-store

Eiffel-store is one of the community-made applications described in section 3.5 and is a persistence solution for Eiffel events [25]. The application is built with Meteor, a JavaScript framework for building cross-platform applications, with a MongoDB backend, and produces a simple website. Events can either be added directly to the database or via a small tool that enables the use of a RabbitMQ bus. When an event is added, the application builds up sequences from the links, which can later be visualized and interacted with in the browser.

From the Eiffel events, the application builds its own events to compress them and be able to link them together. For example, it condenses the four test case events in table 3.2 into a single test case event that is displayed, and the data from all of these events is transferred to the new event. The application also abbreviates the event type names to be more manageable; the Eiffel events from table 3.1 and table 3.2 become those in table 3.3. When linking, the new events are chained together to form a data structure that contains all the chained events. This structure is called a sequence. Single events that are not linked to anything, have links to Eiffel events that do not exist, or are not linked to by another event are placed in their own sequence. Sequences are fundamental to Eiffel-store, and its user interface in the browser depends on them.
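As a rough sketch of what such a condensed event can carry, the object below uses the field names that appear in the code in chapter 5 (id, type, startEvent, targets, targetedBy, data); the concrete values and the type string are hypothetical.

// Hypothetical condensed Eiffel-store event built from the four test case events.
const condensedTestCase = {
    id: "tcf-id",                               // id of the finished event
    type: "TestCase",                           // condensed event type (illustrative name)
    startEvent: "tcs-id",                       // id of the corresponding started event
    targets: ["tss-id"],                        // ids of events this event links to
    targetedBy: [],                             // ids of events that link to this event
    data: { outcome: { verdict: "PASSED" } }    // merged data from the original events
};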

The website consists of three different views. It starts with the aggregation view, which renders the aggregation of all the sequences within a selectable time frame; each node represents a type of event and contains all the events of that type in the sequences. A node can be clicked to get more information about it and offers a button to load all the contained events into a second view. The events are then listed in the details view, where additional information is located together with a button to display the sequence of a specific event in the next view. This loads the sequence and shows all the links between the events in the sequence in the event chain view, which is the last view. The selected event from the details view is shown with a yellow outline, and every node can be clicked for more information.

Eiffel-store: https://github.com/eiffel-community/eiffel-store
Meteor: https://www.meteor.com/

EiffelAnnouncementPublishedEvent: AnnP
EiffelArtifactCreatedEvent: ArtC
EiffelArtifactPublishedEvent: ArtP
EiffelTestExecutionRecipeCollectionCreatedEvent: TestExeRecipColl
EiffelEnvironmentDefinedEvent: EDef
EiffelActivityTriggeredEvent: ActT
EiffelActivityStartedEvent: ActS
EiffelActivityFinishedEvent: ActF
EiffelTestSuiteStartedEvent: TSS
EiffelTestSuiteFinishedEvent: TSF
EiffelConfidenceLevelModifiedEvent: CLM
EiffelTestCaseStartedEvent: TCS
EiffelTestCaseFinishedEvent: TCF
EiffelTestCaseTriggeredEvent: TCT
EiffelActivityCanceledEvent: ActC
EiffelSourceChangeCreatedEvent: SCC
EiffelSourceChangeSubmittedEvent: SCS
EiffelCompositionDefinedEvent: CDef

Table 3.3: Eiffel-store names for Eiffel events.


4

Method

The method is divided into two sections: the implementation, which focuses on implementing the features needed to answer the first research question, and an evaluation to verify the implementation and answer the second research question.

4.1 Implementation

Eiffel-store already has some features, described in section 3.6, that benefit and simplify the implementation needed to answer the research question. The missing feature is a way to filter through all the events and find a specific event based on user input. Since the user interface is built with a top-to-bottom design, where a node in the aggregation view must be selected before moving on to the next view, and the same applies to the details view, the preferred place to add this filtering feature is the aggregation view.

To filter Eiffel events in the aggregation view, dropdowns will be placed above it, from which specific data from the nodes can be selected to filter by. The first dropdown selects the node in the aggregation view to filter from, and further dropdowns then appear to select what information to filter by. This process resembles a tree structure with the root node representing no filter, the next level representing the nodes that can be visible in the aggregation view, and children or leaf nodes added dynamically to the structure. An example of such a tree can be found in figure 4.1. When a specific dropdown item is selected, its children become available in the next dropdown that appears; if it is a leaf node, no more dropdowns appear. This process can be seen in figure 4.2. It provides the application with sufficient information to filter for specific data in the events, as the leaf node also holds additional information for filtering. For this, an algorithm needs to be constructed.

All the changes and additions implemented in the following sections are available, together with all the source code, on the GitHub page for the application: https://github.com/eiffel-community/eiffel-store.


4.1.1 Algorithm

Since the aggregation view works with sequences rather than directly with Eiffel events, as explained in section 3.6, the algorithm also needs to work with sequences. Fortunately, the sequence structure contains all the chained events, so the events in the structure can be parsed for information to form the tree-based structure used by the user interface. The produced structure is also stored in the database for faster usage, since it only needs to be recomputed when the sequences are updated. This also ensures that events that are not in any sequence are excluded.

Figure 4.1: Example of a tree structure that can be used to visualize data and provide the user with an interface that can later be used for filtering.

Figure 4.2: Shows the process of traversing the tree in figure 4.1 in four stages. The values in the first dropdown are the children of the "root" node, as this node itself is not visible (1). The second stage shows that "B" has been selected and "F" is now present in another dropdown (2). When selecting "F", "I" and "J" appear in another dropdown (3). Lastly, when selecting "J", since it is a leaf node, the application queries the database and finds the values related to the chosen path, which results in "Value J1" and "Value J2" in a final dropdown (4). The user can then select one of these, and the application filters based on the chosen path and value.
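A minimal sketch of this traversal, using a nested object as a stand-in for a fragment of the tree in figure 4.1 (the keys are the hypothetical node names from the figure, not real event fields):

// Sketch of the dropdown traversal in figure 4.2: the options of the next dropdown
// are simply the keys of the subtree selected so far.
const tree = { B: { F: { I: {}, J: {} } } };   // fragment of the example tree

function optionsForPath(tree, path) {
    let node = tree;
    for (const key of path) {
        node = node[key];                      // descend along the selections made so far
    }
    return Object.keys(node);                  // keys become the entries of the next dropdown
}

optionsForPath(tree, []);          // ["B"]      (first dropdown)
optionsForPath(tree, ["B"]);       // ["F"]
optionsForPath(tree, ["B", "F"]);  // ["I", "J"] (selecting a leaf triggers the database query)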

4.1.2 Error handling

The aggregation view coupled with the algorithm will produce filterable options. However, a problem arises because this view is based on event chains while the algorithm provides filters for single events. This occurs when two events of the same Eiffel type in a sequence have different values for the filtered field. One instance where this comes up is when there are multiple test cases in a sequence, since some test cases pass and some fail. When filtering for test cases with only successful outcomes, the failed ones will also be present. To mitigate this, the user is presented with a dropdown to select an action if this occurs. Three options are provided: (1) include events; (2) exclude events; and (3) discard sequence. The first option does nothing, the second removes the clashing events from the sequence they belong to, and the third removes the whole sequence. It is up to the user to decide what they want to happen.
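The sketch below illustrates the three options for a single clashing event type. It is not the thesis implementation (that is the filterEventSequences function in appendix F); the function and parameter names are hypothetical.

// Illustrative only: apply one of the three error-handling options when events of the
// same type in a sequence disagree with the filtered value.
function resolveClash(sequence, eventType, getValue, filterValue, action) {
    const clashing = sequence.events.filter(
        (evt) => evt.type === eventType && getValue(evt) !== filterValue);
    if (clashing.length === 0) return sequence;     // no clash, nothing to do
    if (action === "include") return sequence;      // (1) keep all events as they are
    if (action === "exclude")                       // (2) drop only the clashing events
        return { ...sequence, events: sequence.events.filter((evt) => !clashing.includes(evt)) };
    return null;                                    // (3) discard the whole sequence
}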

4.1.3 Information needs

After the implementation of the filtering feature, some of the information need questions from section 1.3 require additional features to be answered. Specifically, questions one, two, and three need implementations. Four and five can already be answered with the implementation of the algorithm, the dropdowns, and the existing views of the application.

4.1.3.1 How much confidence do we have in a specific test suite?

Specific test suites can be found by interacting with the dropdowns in the aggregation view, using the data provided by the algorithm, and selecting the correct information. Since a test suite contains several test cases, each with their own results, the confidence can be calculated from all the test cases: all test case events that link to the test suite are found, and the confidence is the ratio of passed test cases to the total number of test cases. For example, if three out of four test cases pass, the confidence is 75%. This value is presented in the third view, the event chain view, when the test suite node is pressed.

4.1.3.2 In which environment/machine did the specific test cases fail?

Environments are described with the EiffelEnvironmentDefinedEvent, and the specific test case can be found as explained above. The name of the environment is presented in the event chain view. Any of the test case events from table 3.2 can carry an ENVIRONMENT link to the environment defined event, and the environment name is read from that event to be displayed.

4.1.3.3 Is the new feature implemented?

The Eiffel protocol has no event for when a feature has been implemented. Instead, the EiffelSourceChangeSubmittedEvent can be used, since it is sent when the source code has been merged to its intended branch. To filter for this event in Eiffel-store, it is important to add an identifier to the Eiffel event so that it can be found from the dropdowns. Another approach is to look at the source code change itself to find out whether it has been submitted. This can be accomplished by looking at the EiffelSourceChangeCreatedEvent and checking whether an EiffelSourceChangeSubmittedEvent has a link to it. This value is then presented in the event chain view, like the other information needs above.

4.2 Evaluation

To be usable in the industry, the implementation needs to display the correct values to the user based on the filtering options, and as described in goal 4 in section 3.4, it needs the full confidence of the stakeholders. The evaluation is therefore divided into two parts: one where the algorithm is tested, and another where the individual information needs from section 1.3 are tested visually together with the ability to answer them. The structure is as follows: test environment, algorithm, and information needs.


4.2.1 Test environment

For results that are as consistent and replicable as possible, it is important that the system offers the same conditions for all tests and that the environment is specified. Different versions of software might have fixed bugs or gained performance, and could therefore show different results. In this thesis, such differences matter less since no comparisons or performance tests are performed. However, a different version can break features, so it is important to list the hardware and software that the implementation was tested on. All the software in the performed tests runs on the same computer to minimize latency in the browser and to rule out as many possible sources of failure as possible. The environment and software versions are specified in table 4.1 below:

Operating system: Windows 10 Pro Version 2004

Node.js: Version 14.17.0 LTS (includes npm 6.14.13)

Meteor: Version 1.4.3.2

Database: MongoDB Version 3.2.6

MongoDB Compass: Version 1.22.1

Processor: Intel Core i7-4790K @ 4.00GHz

RAM: 16 GB

Table 4.1: Environment specifications.

Before testing both the algorithm and the information needs, the database must be purged to remove old data. This ensures that no entries are carried over to the next test and that there is a fresh state to test with. Every direct interaction with the database is handled with the MongoDB Compass application through an interactive graphical user interface (GUI).

To run Eiffel-store, some prerequisites must be installed on the system. These depend on the platform; as this is tested on Windows 10, the following instructions are for that platform. On Windows 10, Node.js with npm is required in order to install Meteor. It is important that the 64-bit version of Node.js is installed, as Meteor does not support the 32-bit architecture. Node.js also includes npm in the installation. After installing Node.js, Windows PowerShell can be opened, and running the command below installs Meteor with npm. Meteor also handles and installs MongoDB automatically.

$ npm install -g meteor

After meteor has been installed, the Eiffel-store source code can be downloaded from GitHub and started with the three following commands:

$ git clone https://github.com/eiffel-community/eiffel-store.git
$ cd eiffel-store/visualization

$ meteor

When running Meteor on Eiffel-store for the first time, some additional packages might have to be installed before it can function. These packages are listed by Meteor in the output as a command to run. An example of this can be seen below, where the babel-runtime package has to be installed. Run these commands and then run Meteor again; repeat until no errors occur. The application can then be used.

$ meteor npm install --save babel-runtime

MongoDB Compass: https://www.mongodb.com/products/compass
Node.js: https://nodejs.org/en/


4.2.2 Algorithm

The algorithm is tested with a small sequence of Eiffel events in JSON format. It depicts a built artifact that starts an activity to run a test suite with two test cases. A simplified view of the event sequence can be found in figure 4.3, and the full sequence can be found in appendix A in JSON format. After inserting the events into the application with MongoDB Compass, the "eventfilter" collection, where the algorithm inserts its data, is exported as a single JSON file and put into a tree for visual inspection. The tree represents all the paths that can be taken in the dropdowns in the user interface; selecting a node makes the children of that node visible in the next dropdown.
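For the visual inspection step, a small helper along the following lines can print the exported nested object as an indented tree; this is only a convenience sketch and not part of the evaluated implementation.

// Convenience sketch: print a nested object (such as the exported "eventfilter"
// structure) as an indented tree for visual inspection.
function printTree(node, name = "root", depth = 0) {
    console.log("  ".repeat(depth) + name);
    for (const key of Object.keys(node)) {
        printTree(node[key], key, depth + 1);
    }
}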

Figure 4.3: Simplification of the event sequence used for the algorithm.

4.2.3 Information needs

There are five information needs in section 1.3, and with the implementation from section 4.1 it should be possible to answer them. For this, three independently created sequences are used, located in appendices B, C and D. A simplification of these events can be found in figure 4.4. The first sequence is like the one in figure 4.3 but adds individual environment events for each of the two test cases. The second is the same as the first except that the activity, test suite, and test cases have different outcomes. The last sequence contains source code change and submitted events; this one is special, since it consists of small sequences or single events that are not linked to anything. Four source code changes and four submits are linked, and then there are two free-floating source code changes. All the events across these sequences have different ids to differentiate them, so they can all be added together.

Figure 4.4: Simplification of the event sequences used for evaluating the information needs in appendices B, C and D. The first two are represented by a and b, the last by c. The arrows mean that the events are chained together to form a sequence.

When testing, all the sequences are inserted with MongoDB Compass into the already cleared application. Each information need question is then tested against the application in the browser and visually inspected for correctness.

4.2.3.1 How much confidence do we have in a specific test suite?

The data contains two test suites that have been executed. Both have a name in the data section of their EiffelTestSuiteStartedEvent, so the filtering will be done on that. Since the confidence is calculated from a suite's own test cases, both suites are tested. The first suite has the name "TestSuite 1" and the second "TestSuite 2". Finished test suites are abbreviated to "TSF" in the application, so select that in the first dropdown, "Name" in the second, and then the actual test suite name in the last dropdown. The aggregation view updates according to the selection. To display all the test suite events from the selection, click on "TSF" and then on the "Show all events" button, and all the test suites will appear in the details view. Locate the correct suite and click on the "Event chain" button to display the sequence in the event chain view. Then click on the node with the yellow outline, and the confidence is displayed there.

4.2.3.2 In which environment/machine did the specific test cases fail?

To get a unique test case, its identifier in the EiffelTestCaseTriggeredEvent, located in the testCase object in the data field, can be used. Feed that into the filtering options for the aggregation view and proceed as in section 4.2.3.1 to get to the event chain view. The popup shown when clicking on the yellow node has an environment field containing the environment name. A deviation here is that each sequence with test cases has two environments, which causes the clash described in section 4.1.2; this can be solved by selecting Exclude event(s) in the dropdown. The value tested here is the test case with id "test-case-5".

4.2.3.3 Is the new feature implemented?

As described in section 4.1.3.3, the EiffelSourceChangeSubmittedEvent must be used in some way. In the event sequences, the submitted event has no unique identifier and instead has a link back to the EiffelSourceChangeCreatedEvent, which has an id in the change object located in its data field. Put that into the aggregation view, then do the same as in section 4.2.3.1 to get to the event chain view and click on the yellow event. The field Submitted will have either the value Yes or No. From the inputted data, id "1" and id "5" will be used for testing.

4.2.3.4 Which test case/suite have been run on which product and at what time?

The EiffelArtifactCreatedEvent describes what product it is. To get all the test cases and test suites, fill in the unique identifier from the identity value in the data field of the event. Then click on either TCS or TSF and then Show all events. All the test cases or suites that have been executed on that product will now be in the details view. The value to test is "pkg:fimage/artifact/name@V2".

4.2.3.5 How often does a specific employee deliver new code to the system?

Authors are specified in the author object in the data field of the EiffelSourceChangeCreatedEvent and have a name, email, id, and group field. Since the id is the most unique field, it is used for filtering in the aggregation view. Then click on SCC and on Show all events. The details view then lists all the events associated with that id. Two tests are performed, one with the id "alice" and the other with "bob".


5

Results

5.1 Implementation

Since there are many additions to an already quite large codebase, not all changes and additions are described in detail. Only the parts that are most important and relevant to the following sections are listed, and all the source code is available at the following address: https://github.com/eiffel-community/eiffel-store.

5.1.1 Algorithm

Three key functions enable the functionality of the buttons in the aggregation view. Each event in a sequence from Eiffel-store goes through the parseData function, which is included in appendix E. The output from that function is merged with the outputs from other events of the same type and stored as the option field for that event type. After all sequences have been processed, the result is inserted into the "eventfilter" collection in the database. The first thing the parseData function does is call the parseHelper function.

parseHelper is a recursive function that can be found in listing 1. It takes the following parameters in order: (1) data, which can be an array, an object, or a value; (2) the current path in the object; and (3) an optional flag indicating whether the current path contains an array. This allows it to search for keys through multiple levels of arrays and create a path to find them. The returned array can contain some empty arrays as well, so the output is flattened to remove them.

The flattened output array is then looped over in parseData to alter and rename some paths: it changes "time.diff" to "executionTime" and moves everything from both the data object and customData to the top level. The structure is then looped over to create a nested object by calling createNestedObjPath, which takes each entry in the array and creates an object from it. The function can be found in listing 2.

The important value from this construction is the filterBy member, since it can be used to search in the database. Because MongoDB can search through arrays and look for a member via dot notation, this path is the only thing that has to be used. The dropdowns in the GUI can then be created by fetching the "eventfilter" collection and using the keys of the object for display. This is visualized as a tree structure in section 5.2.1.
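As an illustration of how such a dotted path can be used, a query along the following lines matches events by following the path with MongoDB's dot notation, which also works when the path crosses an array. The collection and the value are only examples; the actual filtering code in Eiffel-store works on its own sequence data.

// Hypothetical query: dot notation follows the stored filterBy path, so the path
// alone is enough to find matching events.
const matching = EiffelEvents.find({ "data.name": "TestSuite 1" }).fetch();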


function parseHelper(data, currentPath, pathIncludesArray = false) {
    let arr = [];
    if (_.isArray(data)) {
        data.forEach((item) => {
            arr.push(parseHelper(item, currentPath, true));
        });
    } else if (_.isObject(data)) {
        Object.keys(data).forEach((key) => {
            const newPath = ((currentPath !== "") ? currentPath + "." : "") + key;
            arr.push(parseHelper(data[key], newPath, pathIncludesArray));
        });
    } else {
        if (allowPath(currentPath) && data !== null)
            arr.push({filterBy: currentPath, pathIncludesArray: pathIncludesArray});
    }
    return arr;
}

Listing 1: The parseHelper function. The function is essentially divided into three sections. The first checks whether the current data is an array; if so, all elements are iterated and parseHelper is called again. The second checks whether the data is an object; if so, a dot and the key are appended to the path, the result is added to arr, and parseHelper is called again. The last section handles the case where data is a plain value: the allowPath function is called to see if the path is allowed, and the entry is then pushed to arr. This essentially gathers all the paths that exist in an object and returns them in an array. As a small example, the object {nested: {v1: 1}, v2: 2} returns the array [[[{"filterBy": "nested.v1", "pathIncludesArray": false}]], [{"filterBy": "v2", "pathIncludesArray": false}]].

function createNestedObjPath(obj, path, data = {}) {
    path = path.replace(/\[(\w+)\]/g, '.$1');
    path = path.replace(/^\./, '');
    const arr = path.split('.');
    let nest = obj;
    for (let i = 0; i < arr.length; ++i) {
        let key = arr[i];
        if (key in nest) {
            nest = nest[key];
        } else {
            nest[key] = (i === arr.length - 1) ? data : {};
            nest = nest[key];
        }
    }
}

Listing 2: The createNestedObjPath function. It takes an object, a path, and data to add. The path is a dotted string, meaning that each level in a nested object is separated by a dot. By iterating through the path, objects that do not exist can be created and the data can be placed at the correct position. For example, take the two paths produced by parseHelper in listing 1; running them on the same initially empty object with an empty data object results in the following object: {"nested": {"v1": {}}, "v2": {}}.


5.1.2 Error handling

When the application is aggregating events, it calls the filterEventSequences function that can be found in appendix F. This function will check if a clash exists in the sequences, meaning if two events of the same type have different values in the filter path. If a clash is found, actions provided by the GUI will appear with selections based on the options from section 4.1.2.

5.1.3 Information needs

The following sections explain the implementation details for each of the information needs, except needs four and five since they require no additional implementation.

5.1.3.1 How much confidence do we have in a specific test suite?

When an event has been added to the database and the application builds the sequences, each event is checked, and if it is a test suite event the confidence is calculated. For each test suite, all the test cases included in that suite are checked for whether they passed or not; how this is done can be found in listing 3. The value is the percentage of passed test cases, which is then stored in the database together with the test suite.

_.each(sequence.events, (evt) => {
    if (isTestSuiteEvent(evt.type)) {
        const tests = sequence.events.filter((obj) => {
            return isTestCaseEvent(obj.type) && obj.targets.find((id) => {
                return id === evt.startEvent || id === evt.id;
            });
        });
        let totalTests = 0;
        let passedTests = 0;
        _.each(tests, (test) => {
            ++totalTests;
            if (test.data.outcome.verdict === 'PASSED') {
                ++passedTests;
            }
        });
        evt.confidence = Math.round((totalTests > 0) ? (passedTests / totalTests) * 100 : 0);
    }
});

Listing 3: How the confidence is calculated for test suites. For each test suite, all the test cases that link to that suite are gathered in the tests variable. The confidence is then calculated by iterating over the gathered test cases and counting how many of them passed.

5.1.3.2 In which environment/machine did the specific test cases fail?

When the last view in the application gathers the event chain data to display, the test cases for that sequence are checked for a link to an environment event that can be used. Since Eiffel-store has disabled the ENVIRONMENT link, the implementation instead finds the environment at a lower level by looking up the Eiffel event in the database and checking its links. This solution can be found in listing 4.


const startTestCaseEvt = EiffelEvents.find({"meta.id": event.startEvent}).fetch();
let envFound = false;
if (startTestCaseEvt && startTestCaseEvt[0].meta.type === "EiffelTestCaseStartedEvent") {
    for (const t_link of startTestCaseEvt[0].links) {
        if (t_link.type === "ENVIRONMENT") {
            for (const t_event of events) {
                if (t_event.id === t_link.target && isEnvironmentDefinedEvent(t_event.type)) {
                    envFound = true;
                    node.data.environment = t_event.data.name;
                    break;
                }
            }
            break;
        }
    }
}
if (!envFound) {
    node.data.environment = "No data";
}

Listing 4: How the environment name is found for a test case. Since Eiffel-store has its own event structure that compresses events, the start event is fetched from the database. If it is a test case started event, its links are checked for an ENVIRONMENT link. The link target id is then used to check whether the corresponding environment event is contained in the sequence.

5.1.3.3 Is the new feature implemented?

As in section 5.1.3.2, source code changes are checked for whether they have been submitted when the application is chaining events for the last view. If a source code submitted event links back to a source code change, the change has been implemented. The process of finding out whether the change has been submitted is shown in listing 5.


let isSubmitted = false;
for (const t_evtID of event.targetedBy) {
    for (const s_evt of events) {
        if (s_evt.id === t_evtID && isSourceChangeSubmittedEvent(s_evt.type)) {
            isSubmitted = true;
            node.data.isSubmitted = "Yes";
            break;
        }
    }
}
if (!isSubmitted) {
    node.data.isSubmitted = "No";
}

Listing 5: Shows the code snippet of how a source code change event is checked for whether it has been submitted. The event variable is the source code change event and the events variable contains all the events in the current sequence. The event has a member targetedBy that lists the ids of all events that target this event. By matching those ids against the events in the sequence, the submitted status can be determined.

5.2 Evaluation

This section contains the evaluation of the algorithm and of the information needs, respectively.

5.2.1 Algorithm

The output from the application after processing the events located in the ”eventfilter” collection of the database can be found in appendix G. From this data, the tree in figure 5.1 was constructed.

Figure 5.1: Data from the algorithm displayed in a tree structure. The ”root” node is not visible here.

5.2.2 Information needs

In this section, the five information needs questions will be tested. The order is the same as in section 1.3.


5.2.2.1 How much confidence do we have in a specific test suite?

Figure 5.2: The aggregation view with ”TSF”, ”Name” and ”TestSuite 1” selected in the dropdown menus.

Figure 5.3: ”TestSuite 1” event listed in the details view.

Figure 5.4: Shows the confidence of the two tested test suites. To the left is the test suite with id ”TestSuite 1”, which has a confidence of 100%. To the right is the test suite with id ”TestSuite 2”, which has a confidence of 50%.


5.2.2.2 In which environment/machine did the specific test cases fail?

Figure 5.5: The aggregation view with ”TestCase”, ”Id” and ”test-case-5” selected in the dropdown menus with the error handler in red showing that there are 3 conflicts with this selection. Default error handling selection is ”Include event(s)”.

Figure 5.6: Shows the four test case events associated with id ”test-case-5” when filtering with the error handler set to ”Include event(s).”

Figure 5.7: Shows the test case event associated with id ”test-case-5” when filtering with the error handler set to ”Exclude event(s)”.


Figure 5.8: Shows the environment names of the two test cases that were tested. Left is the test case with id ”test-case-5” that was executed in the ”Environment 4” environment and to the right is the test case with id ”test-case-6” which was executed in the ”Environment 3” environment.

5.2.2.3 Is the new feature implemented?

Figure 5.9: Shows if the source code changes were submitted or not for the two tested cases. Left is the source code change with id ”1” that is submitted and to the right is the source code change with id ”5” that is not submitted.


5.2.2.4 Which test case/suite have been run on which product and at what time?

Figure 5.10: Shows the details view of the test suite associated with the created artifact with identity ”pkg:maven/com.mycompany.myproduct/my-artifact@2”.

Figure 5.11: Shows the details view of the four test cases associated with the created artifact with identity ”pkg:maven/com.mycompany.myproduct/my-artifact@2”.

5.2.2.5 How often does a specific employee deliver new code to the system?

Figure 5.12: Shows the details view that lists the three source code changes associated with the id ”alice”.

Figure 5.13: Shows the details view that lists the three source code changes associated with the id ”bob”.


6 Discussion

6.1 Results

The results show that an implementation of the Eiffel protocol can solve problems in the industry with an easy-to-use interface for quickly querying the questions that stakeholders need answered. The problems described in section 3.4, that stakeholders have a hard time understanding the data from CI services and that existing solutions in the industry are not adequate, can be addressed, and the goals stated in section 3.4 can be fulfilled. The information is organized so that it can be found easily. Other tools are not scalable or flexible enough to be customized to individual needs, whereas the Eiffel protocol with this implementation answers the stakeholder needs and is configurable, scalable, and portable across projects.

All the tests combined also show that the algorithm gathers the correct information and search parameters for the user interface. The dropdowns in the aggregation view are an essential part of answering the information needs since they sit at the highest level. If they were incorrect, the other parts of the filtering process might not work at all and no answer could be found. Since the algorithm produces the correct information and the information needs could be answered, it can be considered correct for this type of input.

6.2 Method

Eiffel events are by nature very abstract, and it is up to the company or user to define how they want to use them. The way the information needs are answered here, with the created events and sequences linked together, provides one solution. The test data created for the evaluation could therefore have made wider use of the linkage, since there are many more ways to create links, and this solution might not work for another company that uses the Eiffel protocol for its own needs. More work could certainly have gone into creating more general events to provide a broader solution that works in more cases. The error handling in section 4.1.2 is also a side effect of putting the dropdown filtering in the aggregation view. It could have been placed in the details view instead, which would remove the need for it; however, that would also remove the ability to view the aggregation based on the filtering. More research is probably needed to answer whether the aggregation view is important or not.

The algorithm testing could have been extended to allow a wider range of input to be parsed, since it is important that it functions correctly; there could be an edge case where it does not work. Extensive software testing could have been performed where every line is tested to achieve a large test coverage and minimize the failure rate. However valuable such testing would be, the aim of this thesis was not to have a perfect implementation of information filtering; the focus was on answering the information needs, which was achieved. The algorithm does have its own test, and it has also been tested indirectly in every other test since it is an essential part of the filtering.

This solution for filtering with the Eiffel protocol works for a small number of sequences. Software companies today can produce a very large number of commits per day. With so many events, the algorithm might degrade in performance since it must search through all the sequences for filtering purposes. If there were a couple of million events, rebuilding the database would certainly take a while, and if the commit frequency from developers were high, more time could be spent rebuilding sequences than using them. It might also be a problem for the website, since it must receive a large amount of data to display. A more optimized solution would be needed to filter and efficiently render the data for the user.

Another issue is with the filtering implementation: since it is based around dropdowns, the user interface can suffer if the events have a large amount of data in one of their members. If a member value is a long string, the filtering dropdown for that value will be very wide, which can take up the screen and be hard to use. The user experience will also suffer with a large amount of data, as the dropdowns will have many values and will require the user to scroll very far to find a value.

There is a large quantity of research on software traceability and continuous integration, how they are used, their relevance, and the problems that exist. The sources selected for this thesis are mainly conference proceedings and research articles published by IEEE or ACM, although some sources are regular articles and links to websites. The available research on the Eiffel protocol is also limited since the protocol is still young and the industry has not yet committed to it. It was also a problem to find newly published papers on software traceability and the continuous integration pipeline that were relevant and appeared credible.

6.3 The work in a wider context

Continuous integration and software traceability in general put a lot of strain on the stakeholders and developers to keep track of all their artifacts. Having a flexible and scalable solution can keep the work hours down and let people focus on other things instead. The software can therefore grow faster with new features, and bugs can be fixed, which enables more frequent releases to customers. Together with the Eiffel protocol, which can provide a solution for filtering artifacts and events, the stakeholders can better understand the project and make the decision-making process faster. The job of the stakeholder can also become easier, since they can use the tool to answer questions without having deep knowledge about the project. Even if people higher up have only limited knowledge about the project, answers can simply be looked up quickly in the interface.

To answer questions with the tool, it must first be customized for the needs of the stakeholders in the company or for the user. This can initially require some implementation to be done to answer their needs, and this will take some time to develop. After the initial period, new additions will only be needed when more stakeholder needs require answers. This could also be a double-edged sword: if a new question arises that must be implemented in the tool, the stakeholders may decide that it takes too much time to implement or add, and that question is then disregarded in the decision-making process. Decisions can also be made wrongly, as a decision might require more information than the answers provided to the stakeholders.


7 Conclusion

7.1 How can intelligent and context-aware searching be implemented in the Eiffel protocol based on information needs?

The Eiffel protocol provides a solution that is flexible, scalable, and abstract enough to be used across multiple services and projects. Eiffel-store, with its ability to visualize Eiffel events, provides a solid base to build on. An algorithm was created that parses the events in the application and builds a tree structure, which the user interface then uses to display a series of dropdowns for event filtering. Since filtering for a single event in a sequence can be a problem, error handling is also necessary in case multiple events of the same type clash; three options are provided for such conflicts: include the events, exclude the events, or discard the sequence. This gives a way to find individual events in a sequence, and solutions were implemented that provide answers to the five information needs that the stakeholders have.
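As a hypothetical sketch (not the thesis implementation) of how the three conflict options could be applied to a sequence, assuming a sequence object with an events array and a predicate that tells whether an event matches the selected filter value:

// Illustrative only; option names and semantics are assumptions based on
// the description above.
function resolveClash(sequence, eventType, matchesFilter, option) {
    if (option === 'discard') {
        // Drop the whole sequence from the result.
        return null;
    }
    if (option === 'exclude') {
        // Keep only the events of the filtered type that match the filter value.
        return {
            ...sequence,
            events: sequence.events.filter((evt) => evt.type !== eventType || matchesFilter(evt)),
        };
    }
    // 'include': keep all clashing events regardless of their filter value.
    return sequence;
}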

7.2 Is the queried information from the implementation correct?

To verify that the application can query the proper information, four different event sequences were created and used for testing. The sequence used to test the algorithm produced the correct result in the form of a tree structure. Each of the five information needs questions was also tested individually, and the results show that the correct output is displayed. All five information needs could be answered, which also demonstrates the correctness of the dropdown menus, since they were correct in every test.

7.3 Future work

One of the areas that needs improvement is the user interface. The provided solution with dropdowns works in theory but has problems in practice, both with the unpredictable length of the values and with finding a value among the many that can be present in the list. Since JSON objects usually do not have long member names, it is the values that are impractical to handle, and the last dropdown will therefore contain the most entries. Eiffel events usually resemble each other, so new members are not as likely to appear as new values. A viable solution would be to replace the last dropdown with a search box that the user can type into. Based on the input, the application would then gather search results for the user to select from. This should result in a better user experience, as the value can be searched for instead of being located in a long list when there is a large amount of data to query.
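A minimal sketch of such a search box backend is given below, assuming the same sequence and event structure as earlier in the thesis; the function name and details are illustrative only, not a proposed final design.

// Hypothetical sketch of the suggested search box: given the user's input,
// collect the matching values for the selected event type and member path
// instead of listing every value in a dropdown.
function searchValues(sequences, eventType, filterPath, query) {
    const matches = new Set();
    for (const sequence of sequences) {
        for (const evt of sequence.events) {
            if (evt.type !== eventType) {
                continue;
            }
            // Resolve the dotted member path on the event.
            const value = filterPath.split('.').reduce(
                (cur, key) => (cur === undefined ? undefined : cur[key]), evt);
            if (value !== undefined && String(value).includes(query)) {
                matches.add(String(value));
            }
        }
    }
    return [...matches];
}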
