

Västerås, Sweden

Thesis for the Degree of Master of Science (60 credits) in Computer

Science with Specialization in Software Engineering

COMPLEX VARIABILITY MODELING

Elton Toçka

eea18004@student.mdh.se

Examiner: Antonio Cicchetti

Mälardalen University, Västerås, Sweden

Supervisor: Jan Carlson

Mälardalen University, Västerås, Sweden

Company supervisor: Zulqarnain Haider,

Bombardier Transportation Sweden AB, Västerås, Sweden


Abstract

Software Product Line Engineering has reached a stage of evolution in which Software Product Lines have become more accessible and easier to use, not only for specialists but for everyone. We acknowledge the benefits gained from this methodology and the contributions made by different authors since its introduction. Its widespread usage initiated the development of a new approach, called Variability Modeling, designed specifically to handle the variability of Software Product Lines. The best-known Variability Modeling technique for developing software products turned out to be Feature Modeling, because the use of features made the modeling process easier, so features became the primary element to be dealt with. Modeling engineers decided to use feature diagrams to include and present these features.


List of Figures

1 A sample feature model [1]

2 Cardinality formalization [2]

3 Feature referencing [3]

4 Simple diagram

5 Cardinality

6 Features with a higher impact

7 Accounts chosen specifically for the sectors

8 Sectors chosen specifically for the accounts

9 Features relations

10 Mandatory and Optional features

11 Base and Option possibilities, XOR relation

12 Base and Option possibilities, OR relation


1. Introduction

In the present market it is important to provide services that fulfill the needs of each individual customer. Twenty or more years ago, it was difficult to build a product that could simultaneously satisfy the whole market and be successful. The method followed was therefore to release the product and then issue re-releases based on customer reactions. This was costly in both time and money, as product sales did not always meet expectations. To change this, the companies offering these products decided to anticipate the market and release several variants of the same product concurrently, or at least have them available on request.

In mass production societies, where products tend to be customized to achieve market success, a new modeling approach was needed [1]. In software engineering terms, this mass-customized production is known as software product lines [4] or software product families [5]. The motivation of this type of production is to produce families of similar systems rather than individual systems. Inspired by reuse-driven development in the more traditional engineering fields, this notion found application in software engineering practice and was given the name Software Product Line Engineering [6].

The benefits gained from software product line engineering can be considered as savings in the time and effort of generating requirements, architectural design, components, modeling and analysis, testing, planning, processes and people [7]. Products in the same product line share many commonalities but also have variability points [8], points where the products vary or differ. Adding a new product to the product line therefore requires less work with regard to the above-mentioned factors. This also offers product flexibility, which is a highly demanded criterion in the current marketplace.

Generating system variants is simplified by first creating a system architecture and deciding, based on the system requirements, which variable artefacts will be used [9]. By system variants we mean the software products that are part of a software product line, and by artefacts the individual functionalities of a system. Variability Modeling is the approach that records and reports all possible combinations of these variable artefacts. It is this modeling approach that can state whether a combination is valid and the system can exist. Since it is reasonable and more practical to represent artefacts by their most identifiable characteristics, features came to be used instead. Features are words that describe the character of an artefact; later it became possible to include abstract features that are not directly related to artefacts. We discuss this further in the Background, Section 3. As a result, the best-known and most widely used variability modeling method is Feature Modeling.

The first feature modeling attempts offered only boolean choices. The results showed that such models were very limited in the choices they could express. After the move to non-boolean modeling, the biggest struggle for modeling engineers was extracting the complex variability information and representing it clearly and correctly.

1.1. Objectives and Problem Formulation

In the development of software product lines, features are considered an essential factor, especially for capturing the commonalities and variabilities of a product line [6]. A set of feature combinations explicitly describes a product-line member, and specific features help to differentiate members from each other. The methodologies used, so-called feature modeling approaches, arrange features into trees called feature diagrams (FD) [2]. The semantics of these feature diagrams and the relationships among features are all defined and explained, but their full potential is not yet understood and used [1].


limitations of the existing methods, and therefore the work to be done during this thesis will consist of developing a theoretical and practical proposal for solving the identified problems. Our intention is that the suggestions we provide can help future projects that use this modeling approach in a similar way. This thesis aims to answer the following research questions:

RQ1: What are the identified challenges coming as a consequence of the limitations of the existing approaches regarding complex variability scenarios?

RQ2: What theoretical and practical contributions can be given to solve these challenges?

1.1.1. Expected outcomes

The desired outcomes of this thesis work are to identify the limitations of the existing theory, approaches and tools with respect to complex variability scenarios and to propose a general approach to overcome these limitations. To answer the research questions, we plan to use product families to identify and express the limitations of their models. First, suggestions on how to solve these limitations will be given; by applying them, we will attempt to achieve a feature model for our use case and leave well-defined instructions for modeling similar cases in the future.

1.2. Structure


2. Research Methodology

The research methodology chosen for this thesis is the case study. The research on this modeling problem in software engineering aims to investigate how the development and operation of the modeling process conducted by software engineers affects the quality of a variability model of a Software Product Line (SPL), and what limitations the modeling techniques and approaches have. We hypothesized that enough information exists regarding the modeling process of an SPL, so we needed to verify and validate the modeling techniques. We carried out the verification by investigating the existing studies and research papers closely related to our topic, and the validation by executing the modeling procedure in our scenario using all the information gathered. Considering our work as an experiment in modeling software product families, the framework of this experiment consists of four phases: 1) definition, 2) planning, 3) operation, and 4) interpretation [10].

In the definition phase we present the motivation for choosing this methodology: by isolating the problem and using contributions from similar fields, we can achieve the goals of this thesis. The purpose of the work is to improve the modeling practices used to generate feature models from use cases of general software engineering problems. An in-depth study was made of the theoretical propositions for feature modeling of software product families. The investigation focused on a single phenomenon, in our case variability modeling with its main approach, feature modeling, and used qualitative research methods to gather enough information to fulfill our modeling criteria. The data sources used for the qualitative data collection are research papers on this phenomenon and guide books of modeling tools, while the approach followed for the qualitative data analysis was text analysis.

We proceed with planning, where the next step is to evaluate the techniques and methods used in the similar cases described in the research papers we found. We try to apply them in our case and conclude whether the models we aim to build and deploy achieve the degree of expressiveness we are looking for. As indicated in [11], modeling languages can have different levels of expressiveness depending on the notions the language offers.

The operation starts with the extraction of the variabilities and commonalities from the documentation of a real project. In this thesis we work with a real project, the ZEFIRO Express project from Bombardier Transportation Sweden AB. The project document gives an overview of the system and its possible configurations. It states all the needed information about the functionalities, the feature relations and dependencies, and the criteria for valid combinations. Evaluating and deciding whether the variabilities contained in this document are sufficient and can be used in our modeling process is the beginning of the second part of the problem. We then continue with modeling this project. Many variations of the definitions of the relations can be used in this modeling process, but there is no existing definition of how to achieve a feature diagram whose main focus is emphasizing the features that are common or variable. For this reason, the more complicated variability scenarios we intend to model can be achieved, but they cannot easily be addressed and identified through the existing feature diagrams. Our modeling process proceeded with a first modeling attempt after we had gathered the basic information on feature modeling. We realized that, at a few steps of our modeling, there were challenges that the information collected so far could not solve. We therefore had to extend our research, moving the topic from variability modeling and requirements to variability modeling tools and approaches. We found some extensions that helped us progress in our modeling procedure. In other words, we followed an iterative strategy, going back and forth between investigating theoretical contributions and applying them to our model.

Later, we continued to face similar challenges, but this time the theoretical contributions did not offer solutions to our problems. At this point we realized that we had to provide our own solutions to these more complicated scenarios, finding ways of expressing commonalities and variabilities through the feature model. Our contributions can be found in Section 4 and can be used in future work if considered reliable.


3. Background

To become familiar with the topics of this thesis report, in this section we describe the concepts that will be used. We start with the general field in which this topic is situated, Software Product Line Engineering, and continue with the general approach used in this area of study, Variability Modeling, with its main technique, Feature Modeling. We discuss and explain the elements of this technique, which are mainly types of features and relations. We conclude by describing the scenarios that arise from the wide usage of variability modeling, called complex variabilities, and how they occur.

3.1. Software Product Lines

A product line, also known as a product family, is an assemblage of closely related products whose functionalities may differ but whose purpose is principally the same [4]. The factor that led to mass customization was mainly the growing need for individualized products developed to fulfill the requirements of each customer. Clients had the chance to obtain individualized products, but this meant higher prices for these products. To counter this effect, stages were introduced in which plans were made before production, choosing which parts would be used in various product types. By using these stages, producers were able to offer an increasing variety of products and to decrease their costs. Factors such as quality improvement and time and cost reduction were the fundamental points of interest of product line engineering. In a similar way, Software Product Line Engineering achieves the production of software products in a faster, cheaper and better way by using software engineering techniques and tools. The advantages of mass customization led to a sharp increase in the usage of software product lines, especially because of the opportunity they offer to create product families [7]. The large number of products that can be created within a product line share the essential intention behind their development, but they are difficult to maintain, particularly because the needs and problems of each customer are very different. SPLE has proven to be a successful approach to handling issues such as commonality and variability in a software product line.

According to [7], software product line production focuses on these essential technology areas: Domain Engineering, Architecture, Architecture-Based Development, and Re-engineering. Domain Engineering aims at revealing commonalities and variations among a set of products. Architecture is seen as the foundation of a product line, as it provides the framework into which the variable components plug. Architecture-Based Development is the disciplined derivation or generation, from the architectural skeleton, of the product components and, once the components are ready, of the whole product. Re-engineering helps mine reusable assets from legacy assets. The outcome is an innovative strategy that can deliver a large number of custom software products rapidly and dependably by utilizing components from the component repository according to their specific application.


required due to the commonality of the applications.

3.2. Feature Modeling

The products that are part of an SPL are also called system variants. These variants are produced by first generating the architecture of the product family and deciding where the components that are not part of the domain model belong. The domain model is the simplest product of the product family and is usually the base product that comes at the base price. Every other component joins the domain model according to the system requirements of each customer to create system variants. The number of possible combinations of all the variable components gives the number of possible variants. Variability Modeling is a very wide methodology that, as the name states, focuses on variability and is able to store, combine and handle all the possible operations that can be done with an SPL. Variability modeling is considered the key approach to the successful management of industrial software product families. In the past few years, several variability modeling techniques have been developed, each using its own concepts to capture the variability provided by reusable artifacts. These techniques, as mentioned in [12], are:

1. Variability Specification Language (VSL), which distinguishes between variability at the specification level and at the realization level, and between pre-runtime and runtime

2. ConIPF, which aims at configuring products at the feature level after all required knowledge has been captured before the configuration

3. COVAMOF, which models variability in terms of Variation Points and Dependencies, where variation points are locations in the model where a choice is provided by the product family and dependencies represent the constraints on these choices

4. Cardinality-Based Feature Modeling (CBFM), which models variability only in terms of choices between Features, while also allowing the abstraction of features

5. Koalish, a variability modeling language that specifies the logical structure of the software system using components and interfaces

6. Pure::Variants, which models variability using an object model with specialization and aggregation types

The components, also referred to as artefacts, are expressed through features, which describe the main element that identifies the component. Features have become essential for the variability modeling approach because of the simplicity they offer. One can decide at a very high level of modeling which components will be part of a software product just by including them in diagrams made of features. In this way we obtain feature diagrams with the highest level of expressiveness among the existing forms of variability modeling, which makes feature modeling the most used type of variability modeling.

The first feature diagram was part of the Feature-Oriented Domain Analysis (FODA) method [13], with the main purpose of capturing commonalities and variabilities at the requirements level. A successful and important method that will be used as a model in this thesis is FeatuRSEB [14], which combines FODA with a method called Reuse-Driven Software Engineering Business (RSEB). FeatuRSEB models choices between different behaviours using variation points in the use cases, which are used to model the FD. To obtain a valid instance of the model after the modeling process is finished, the developer has to select the needed behaviour of the instance.


Feature diagrams start with a concept node at the root position, which can be a property, product or domain type, followed beneath by features organized in a hierarchy [15]. It is considered so far to be a good and expressive technique. The initially proposed and most used relations between features and the concept, or between features and subfeatures, are mandatory, alternative and optional. More relations are described in the following subsection.

3.3. Feature relations

Figure 1: A sample feature model [1]

Figure 1, taken from the literature review by Benavides et al. [1], is a sample feature model in which some basic feature relations are shown. The concept node Mobile Phone is positioned at the top and is also known as the root of the tree, since a feature diagram is sometimes referred to as a tree. It is then split into branches that show the functionalities or artefacts the concept node can have. As shown in the figure, Mobile Phone is related to features such as Calls, GPS, Screen and Media, and some of them are connected to features of a lower level or, as we refer to them in this work, subfeatures. The relations used in the simplest version of the feature diagram are:

• Mandatory, where the feature, at any level of the diagram, must be included whenever its parent feature is included

• Optional, where the feature can optionally be included depending on the choice of the customer

• Alternative (XOR), which is used between a feature and more than one subfeature or child feature. This relation indicates that exactly one of the subfeatures must be chosen from all the possible choices.

• OR, which is also used between a feature and more than one subfeature and indicates that at least one of the subfeatures must be selected

• Requires, where a feature requiring another feature implies the inclusion of the second feature whenever the first feature is chosen

• Excludes, where two features cannot be part of the same product


Table 1: Feature diagram simple notations [1]

Using the explanations given in the list and Table 1, the example shown in Figure 1 can be described as follows: Mobile Phone must include Calls and Screen, expressed as mandatory, and can include GPS and Media, expressed as optional. Screen can be Basic, Colour or High Resolution, and exactly one of them must be chosen, because the relation type XOR states that only one feature can be selected. Media will contain Camera or MP3, or both, and at least one of them has to be selected because of the relation type OR. Whenever GPS is selected, Screen type Basic cannot be chosen and vice versa, because of the relation type Excludes. Whenever Media type Camera is chosen, Screen type High Resolution must also be selected, because of the relation type Requires.
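The description of Figure 1 can be turned into a small executable check. The following is only a minimal sketch, not part of any modeling tool: it assumes that Camera and MP3 are the OR-children of Media, as in the model of Benavides et al. [1], and the function names `is_valid` and `all_products` are hypothetical names chosen for illustration.

```python
from itertools import combinations

# Feature names follow the Mobile Phone example; the root is implicit.
FEATURES = ["Calls", "GPS", "Screen", "Basic", "Colour",
            "High Resolution", "Media", "Camera", "MP3"]


def is_valid(selected):
    """Return True if the selected features form a valid product."""
    s = set(selected)
    screens = s & {"Basic", "Colour", "High Resolution"}
    media = s & {"Camera", "MP3"}
    return all([
        {"Calls", "Screen"} <= s,                     # mandatory features
        len(screens) == 1,                            # XOR: exactly one screen type
        ("Media" in s) == (len(media) >= 1),          # OR: children only with Media, at least one
        not ({"GPS", "Basic"} <= s),                  # GPS excludes Basic
        "Camera" not in s or "High Resolution" in s,  # Camera requires High Resolution
    ])


def all_products():
    """Enumerate every valid product by brute force over all subsets."""
    return [set(c)
            for r in range(len(FEATURES) + 1)
            for c in combinations(FEATURES, r)
            if is_valid(c)]
```

Under these assumptions, `is_valid({"Calls", "Screen", "Basic", "GPS"})` is rejected because of the Excludes relation, and the brute-force enumeration yields 14 valid products. Brute force is used only because the example is tiny; it does not scale to realistic feature models.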

Other relation types that are seen as more advanced exist and will be explained in the following subsections.

3.3.1. Cardinality

If further information needs to be included, such as the number of Calls, information or attributes for each Call, or categorizations of features, the basic feature relations are not sufficient. Many theoretical proposals have been made over the years, which is why two different solutions may have been proposed for the same problem. We start in this subsection with the relation type Cardinality.


Figure 2: Cardinality formalization [2]

The contribution of Czarnecki et al. [2] states that cardinality can be used to express every relation type as a feature cardinality or group cardinality. As shown in Figure 2, a mandatory relation can be expressed as a solitary feature with feature cardinality [1..1], an optional relation as a solitary feature with feature cardinality [0..1], an OR relation as a feature group with group cardinality <1-n>, and an alternative relation as a feature group with group cardinality <1-1>.
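The mapping from basic relations to cardinalities can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and intervals are represented as plain `(min, max)` tuples rather than in the notation of [2].

```python
def feature_cardinality(relation):
    """Map a solitary-feature relation to its [min..max] interval."""
    return {"mandatory": (1, 1), "optional": (0, 1)}[relation]


def group_cardinality(relation, n):
    """Map a group relation over n subfeatures to its <min-max> interval."""
    return {"or": (1, n), "alternative": (1, 1)}[relation]


def satisfies(cardinality, count):
    """Check whether the number of selected features lies in the interval."""
    low, high = cardinality
    return low <= count <= high
```

For example, `satisfies(group_cardinality("alternative", 3), 2)` is false, because an alternative group admits exactly one selected subfeature, whereas the same selection satisfies an OR group of three subfeatures.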

3.3.2. References


This relation type is used when one subfeature is contained by more than one feature in a feature diagram. It is demonstrated in the bottom part of Figure 3, where permission is contained by both filepath(String) and environmentVariables. The way of representing it is also shown in the figure. Its main purpose is to avoid inclusions that could confuse the reader of the diagram. Similarly, a modeler can decide to use a feature reference early in the modeling if they intend to reuse that feature later in the diagram as part of another parent feature.
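In a data structure, a feature reference corresponds to two parents pointing to the same feature object instead of holding two copies. The sketch below illustrates this idea under our own assumptions: the `Feature` class and the `read` subfeature are hypothetical and are not taken from [3].

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A feature with a name and a list of child features."""
    name: str
    children: list = field(default_factory=list)


# One shared subfeature, defined once.
permission = Feature("permission")

# Both parents hold a reference to the same object, not a copy.
filepath = Feature("filepath", [permission])
environment_variables = Feature("environmentVariables", [permission])

# A change made through the shared feature is visible under both parents.
permission.children.append(Feature("read"))  # "read" is a hypothetical subfeature
```

Because `filepath.children[0]` and `environment_variables.children[0]` are the same object, the subfeature is defined only once and cannot diverge between its two parents.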

3.4. Feature types

Feature diagrams, although very similar in the way they are constructed, are not the same as decompositions of software modules or part-of hierarchy diagrams [3]. Features may correspond to different parts of the software and not necessarily only to physical components. There are concrete features, which are associated with concrete software modules; abstract features, which are associated with performance requirements that map to configurations of components or aspects; and grouping features, which are associated with variation points for plug-compatible components. Features in general express elements of software at a level between the requirements model and the design model. They describe the system family at a higher level than requirements and are thus an abstraction. The level of abstraction they express also determines their classification into abstract or concrete features.

Another important concept related to feature relations and types is that of feature categories and annotations. According to [3], FeatuRSEB proposes functional, architectural and implementation feature categories. Other categorizations could be made according to priorities, stakeholders, default selections, and exemplar systems. This concept is very useful in our work, as we intend to use it to express the commonalities of a product family through categorizations. More will be said in Section 4.

3.5. Complex Variability Scenarios

The extraction of information in a product family is done over a collection of existing software systems with the purpose of modeling their variability. With complex variability scenarios we no longer refer only to the variable or common elements that can be seen on a first examination of the product line. We now face challenges that include multiplicities of elements, groups of elements that have dependencies on other groups of elements, ways of representing accessibility in the selection of these elements, and ways of including more information than relations in a variability model. These and other complex scenarios have to be represented clearly and accurately in the variability model, which is the biggest complex variability challenge for engineers. The relations of the more complicated scenarios, which can express more complex variability types, are our main interest in this thesis.


4. Identified challenges and proposed solutions

The working process in this thesis consisted first of an investigation of Variability Modeling: why it was necessary, what it contributed to software modeling, and the real effect its outcome had on the market. We saw that its usage originated from the creation of Software Product Families or Software Product Lines (SPL). The techniques for developing software products were changing, and the usage of features became the primary thing to be considered. To include and represent these features, modeling engineers decided to use feature diagrams. During the investigation we saw that many proposals on how to model Software Product Lines exist, and that many relations between features, such as dependencies or inclusions, have been added.

The work continued with the investigation of feature modeling tools such as FeatureIDE and the Base Variability Resolution (BVR) tool. From this step we saw that the progress of feature modeling proposals towards a more comprehensive model diagram was further advanced in the theoretical contributions than in the tools. However, the solutions proposed so far do not completely cover all the possible scenarios. The only relation types supported by the modeling tools are the simple relations such as mandatory, optional, OR and XOR, and dependency relations such as exclusions or inclusions. Other feature relations, even though they have been proposed, have not yet been included in the existing tools. That is why we decided to engage only with the state of the art provided by the theoretical research.

We performed our work by repeatedly modeling and recreating our feature diagram after every piece of information we gathered. In other words, the work was incremental. We used the case of the ZEFIRO project from Bombardier Transportation AB. The project documentation we were given offered an overview of the system functions and features, represented in tables for every type of wagon included. The project itself consists of possible combinations of wagons, and the wagons, or train cars, include functionalities that can be of different types. These functionalities are the ones referred to as features in the tables of the project document. In our attempts to model this product line, we saw that building a diagram with the amount of information we had posed challenges. These challenges, which result from the limitations of current variability modeling languages and tools, and our endeavour to solve them are presented in the following sections.

4.1. Introduction to the identified challenges

In this section we discuss the challenges faced during the modeling process. We introduce the challenges and briefly explain why they are considered as such. We want to emphasize that the challenges were faced incrementally, starting with those arising from the inclusion of the concept node and ending with the very last feature. However, the issues we faced were of different types, and so they are categorized into challenges that resulted from the lack of information in previous research and challenges that resulted from the techniques for representing an SPL as a feature diagram. Our work was done using the existing notations and terminology described in the Background, Section 3.

Challenges that came as a result of the lack of information from previous research:

• The first challenge, which can also be seen as the first in importance, is choosing the right notation for expressing the relationships between features. Most of these relations are explained in Section 3.3. Features are related to each other, and these relations are of different types and express different relationships. They therefore have to be chosen carefully and correctly.


consistency in the feature model, features with a higher impact should be positioned so that they are selected first. This is important to consider from the beginning because it determines the degree of expressiveness of the diagram.

• The next challenge is both a conceptual and a representation challenge and concerns the possibility of including Feature Categorizations for features that tend to occur together. In our case, the features are not related, but they are included together for a considerable number of car types. It is difficult to present these features together using the existing formalisms, which are specific to presenting feature groups, so we will try to give our own solution to this challenge.

• The last challenge of this category is about increasing the expressiveness, in feature diagrams, of the variable features that correspond to the variabilities of Software Product Families. As a solution we propose including feature specializations in such a way that they cannot be mistaken during reading and analysis.

Challenges that came as a result of the techniques of representing a Software Product Line into a feature diagram:

• The first challenge in this category arises when a feature is repeated more than once in a diagram. We refer to it as Feature Repetitions. This problem was identified not because it is impossible to build a diagram that repeats the same feature, but because our goal is to make feature diagrams as efficient as possible, so we try to avoid the repetition of features where possible.

• Another challenge was enabling the customer to understand from the diagram alone which services come with the domain model, that is, the simplest product variant, and which other options can be chosen. We considered this challenge mainly because of the conceptual understanding that a reader should gain from the model diagram.

• This challenge consists of two parts. The first part is more modeling related and concerns expressing the commonalities of two or more features that include the same feature. The second part is presenting these commonalities in a diagram in a way that is understandable and aesthetically pleasing for the customer.

4.2. Challenges and solutions

In this Section we give a more thorough description of each challenge and present our solution to it. As mentioned earlier in this Section, the challenges are presented in the order in which they were identified during our modeling process. Since the reader might not be familiar with our case study, a project for modeling trains, we include similar scenarios from the development of other software products to demonstrate that the challenges we faced can occur in any Software Product Line. These examples are only meant to be descriptive; our focus remains on the train case study. We also note that the figures and scenarios of the train are not fully accurate, as the company considers the information confidential.

subsections is necessary. If the solutions are evaluated as correct, we suggest implementing them in a modeling tool, for example FeatureIDE, following a test-driven development approach. For feature models to include the complex variability relations we propose, the tool has to be extended to support them. In this way, it is easier to handle both validation and implementation in the tools. Lastly, we would model the SPL and check the consistency of the generated product variants.

4.2.1. Feature Repetitions

The first challenge, identified as Feature Repetitions, refers to cases in which multiple features contain the same subfeatures. In a simple diagram, it would be presented as in Figure 4.

Figure 4: Simple diagram

This diagram demonstrates what a model of a train with two end-cars and four middle cars would look like, where each car can be one of two possible car types. In our actual ZEFIRO Express project there are two end-cars and six middle cars, where each end-car has four possible car types and each middle car has 10. This level of features would therefore contain 2 × 4 + 6 × 10 = 68 choices of car types. On the next level, each of them would contain the features ”Doors”, ”Energy”, ”Toilet” and ”Braking”, and some of them would also contain ”Cabin” and ”Auxiliary”. Our feature diagram would turn into a massive number of features that looks messy and confusing, with more than two branches that have exactly the same content. To solve this challenge, we used the Feature Cardinality notation. We evaluated the existing notation as sufficient to solve our first challenge, so we emphasize that we made no modification to it.

Figure 5: Cardinality

feature root name Train must have precisely 2 end-cars and can have from 0 to 6 middle-cars. In our case under study, a train can include cars that consist of features such as ”Doors”, ”Toilet” etc. that are identical, and such cars can be represented in the same branch. For example, an engineer can decide to include two middle cars that are identical and then a third middle car that is different; the different car can be achieved through the solutions given in the following subsections.
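To make the cardinality constraint concrete, the following is a minimal sketch of how a selection of cars could be checked against the cardinalities stated above. It is only an illustration under our own assumptions: the identifiers `EndCar` and `MiddleCar`, the function `check_cardinalities` and the representation of a train as a list of car kinds are hypothetical and do not come from any modeling tool.

```python
from collections import Counter

# Cardinalities from the example above: exactly 2 end-cars, 0 to 6 middle-cars.
# The identifiers "EndCar" and "MiddleCar" are hypothetical names.
CARDINALITIES = {"EndCar": (2, 2), "MiddleCar": (0, 6)}


def check_cardinalities(instances, cardinalities=CARDINALITIES):
    """Return a list of violations; an empty list means the selection is valid."""
    counts = Counter(instances)
    violations = []
    # Every declared feature must occur within its [low..high] interval.
    for feature, (low, high) in cardinalities.items():
        n = counts[feature]
        if not low <= n <= high:
            violations.append(f"{feature}: {n} selected, allowed [{low}..{high}]")
    # Features that the model does not declare are reported as well.
    for feature in counts:
        if feature not in cardinalities:
            violations.append(f"{feature}: not declared in the model")
    return violations


train = ["EndCar", "MiddleCar", "MiddleCar", "MiddleCar", "EndCar"]
print(check_cardinalities(train))
```

The check only counts instances; how the individual cars differ from each other is left to the solutions of the following subsections.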

4.2.2. Selecting features with a higher impact

As mentioned in Section 4.1., the way a feature diagram is structured is very important. The right types of features need to be positioned at a higher level of the tree, so that the right functionalities are selected at the right moment. The product line and the product variants have to be consistent and correct, so we consider this challenge crucial for product line consistency. This is achieved in two steps: first, by carefully reading the project documentation and understanding what needs to be emphasized, and second, by concluding which decisions are important enough to be taken at a higher level.

Figure 6: Features with a higher impact

Figure 6 shows two levels of features: one is the car type and the other is the type of energy, which can be with or without a pantograph. Since we are at the beginning of modeling our feature diagram, we should take care to place the features with a higher impact on a higher level. In our case the type of energy is a feature of the train cars, so the customer first has to choose the car type and then the energy type for each car. Any combination of car types and energy types can be made. If it were the other way around, the focus would shift from selecting the car type to selecting the energy type first, which is not what we want to express through our diagram.

Figure 7: Accounts chosen specifically for the sectors

If the accounts all follow the same structure for all the sectors, then choosing the account type should come first, as shown in Figure 8. There will be an ”Admin” account for each of the sectors ”Logistics”, ”Quality” and ”Manufacturing”, a ”Member” account for ”Logistics” and ”Manufacturing”, and a ”Visitor” account for the ”Customers” sector. Accounts of the same type will have similar functionalities and will be built with the same structure, even though they are used for different sectors.

Figure 8: Sectors chosen specifically for the accounts

4.2.3. Choosing the right feature relations

This challenge consists of more than choosing the right feature relations. As already mentioned, these relations are described in Section 3.3. The modeling engineer should be very familiar with the terminology of the relations and should have read and understood the information given about the Software Product Family and its context. The engineer will then understand the relations between the features and be able to choose the right notation.

A possible error, demonstrated in Figure 9, is choosing the relation OR instead of XOR, so that the customer can choose more features than allowed and the product becomes inoperative. The diagram on the left of the figure is correct because the customer is allowed to choose only one of the two options. In the diagram on the right, both car types can be chosen at the same time for the same car, which is not possible, so the product is invalid.

To be more clear, a feature can be mandatory or optional, and it may or must include at least one of its possible subfeatures. All this information comes from the descriptive document and has to be represented correctly in the diagram. If the wrong notation is chosen, the relations are represented wrongly, and the model is implemented and transformed into product variants that contain wrong information. These variants are produced, if selected, and released to the market. By releasing a wrong product that might not be safe or work properly, the company can suffer serious financial and legal damage. All these scenarios can be avoided by properly understanding the product line, and we recommend reviewing the diagram after every level of features that is added to it. Our focus in this challenge is to emphasize the importance of knowing and using the right feature relations.
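The difference between the OR and XOR relations discussed above can be sketched as a small check on a selection. This is only an illustration under our own assumptions: the function `valid_group_selection` and the car type names `TypeA` and `TypeB` are hypothetical, and OR is taken to mean "at least one" while XOR means "exactly one".

```python
def valid_group_selection(relation, children, selected):
    """Check the selected features against one group relation ('XOR' or 'OR')."""
    chosen = [child for child in children if child in selected]
    if relation == "XOR":
        # Alternative group: exactly one subfeature may be chosen.
        return len(chosen) == 1
    if relation == "OR":
        # Or group: at least one subfeature must be chosen.
        return len(chosen) >= 1
    raise ValueError(f"unknown relation: {relation!r}")


car_types = ["TypeA", "TypeB"]  # hypothetical car types for one car
both = {"TypeA", "TypeB"}

print(valid_group_selection("XOR", car_types, both))  # rejected by the correct model
print(valid_group_selection("OR", car_types, both))   # accepted by the wrong model
```

With the wrong relation, the invalid product in which one car has two car types passes the check, which is exactly the error described above.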

4.2.4. Base and Option possibilities

In this challenge, as explained in Section 4.1., the aim is to give a better understanding to the person looking at the diagram. Most modifiable products come with a base model from the domain and other features that can be added and edited. The existing research papers do not offer a clear solution for presenting the Base features, so with the existing proposals the feature diagram does not directly convey this information. We think that the feature diagram has the potential, with a possible extension, to show possibilities that are considered ”Base” and ”Option”. The only solution mentioned so far is including the Base features as mandatory features, as shown in Figure 10. In this case 2 pantographs and 1 LCB (200 km/h) must always be included, and if the client wants the optional features, they are simply appended to the mandatory ones. This shows, to some extent, that the mandatory features are Base features, but it makes them obligations instead of choices, which does not always match the relations that exist between the features. The client should be able to choose only the two features on the left or only the two features on the right, without being restricted in this choice. The only reason the features are considered Base is that they are included in the Domain model and come with the basic price. The optional ones cost extra.

Figure 10: Mandatory and Optional features

Our proposed solution uses the XOR relation, which means that only one subfeature can be chosen at a time, with two possible choices: Base and Option. As shown in Figure 11, from the feature called ’Energy’ you can choose only one of the two subfeatures ’Base’ and ’Option’, where Base contains the features included in the base domain model and Option contains all the possible modifiable features. The idea is that, if this were implemented in a Feature Modeling tool, the customer could get the simplest version of the product by selecting all features called ’Base’. The product can then be modified by adding ’Options’ along the way.
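The idea of obtaining the simplest product by selecting every ’Base’ can be sketched as follows. This is a minimal illustration under our own assumptions, not an implementation in an existing tool: the dictionary `MODEL`, the function `configure`, the feature ’Doors’ as a second configurable feature and all placeholder subfeature names are hypothetical; only the Base content of ’Energy’ is taken from the example above.

```python
# Each configurable feature offers exactly one of 'Base' or 'Option' (XOR).
MODEL = {
    "Energy": {
        "Base": ["2 pantographs", "1 LCB (200 km/h)"],
        "Option": ["hypothetical optional energy feature"],
    },
    "Doors": {
        "Base": ["hypothetical base door feature"],
        "Option": ["hypothetical optional door feature"],
    },
}


def configure(model, options=()):
    """Choose 'Option' for the named features and 'Base' for all the others."""
    unknown = set(options) - set(model)
    if unknown:
        raise KeyError(f"unknown features: {sorted(unknown)}")
    choice = {name: ("Option" if name in options else "Base") for name in model}
    # Resolve each choice to the subfeatures it stands for.
    return {name: list(model[name][alt]) for name, alt in choice.items()}


simplest = configure(MODEL)                # every feature uses its Base
modified = configure(MODEL, {"Energy"})    # Energy uses its Option instead
print(simplest["Energy"])
print(modified["Energy"])
```

Because each feature maps to exactly one alternative, the XOR relation holds by construction and no separate validation step is needed in this sketch.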

Figure 11: Base and Option possibilities, XOR relation

between the feature and its subfeatures instead of XOR relation as shown in Figure 12.

Figure 12: Base and Option possibilities, OR relation

4.2.5. Categorizations

This challenge is both a conceptual and a representation challenge. It concerns the possibility of including Feature Categorizations for features that tend to appear together. In our case, the features are not related, but they are included together for a considerable number of car types. It is difficult to present these features together using the existing formalisms, which are specific to presenting feature groups, so we will try to give our own solution to this challenge.

We saw that some features are prone to appear together with other features in the same product, even if there are no dependencies between them. As mentioned above, presenting these features together using the existing formalisms, which are specific to presenting feature groups, is difficult. We therefore build on the feature groups concept, which assembles features that are related to each other. Our contribution to this concept is that in our case the features are not categorized according to a common attribute, but according to how often they occur together across the car types.

Figure 13 shows the three feature categories that we used in our feature model. The first one is a ’Basic Set’ that includes the features shared by all the existing cars. The second group is called ’Advanced Set 1’ and includes the features that are shared by most of the existing cars, but not all of them. The third group is called ’Advanced Set 2’ and includes the features that are included by only some of the cars. The demonstration shows only two features in the Basic Set and one feature in each Advanced Set, but in reality our example includes more features per category, which is also the purpose of this solution. It is easier to include the Basic Set in every car type and then the appropriate Advanced Sets for each one. Furthermore, when the number of features to be grouped is large, this is a good way to keep the work organized. This notation will also be used to solve the next challenge.
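The grouping described above can be sketched as a simple frequency count over the car types. This is an illustration under our own assumptions: the function `categorize`, the car type names and their feature sets are hypothetical, and the text does not define "most", so the threshold of more than half of the car types is an assumption as well.

```python
def categorize(car_features, most_threshold=0.5):
    """Group features by how many car types include them.

    car_features maps a car type to the set of features it includes.
    'most' is assumed to mean a share above most_threshold but below 100%.
    """
    total = len(car_features)
    all_features = set().union(*car_features.values())
    sets = {"Basic Set": [], "Advanced Set 1": [], "Advanced Set 2": []}
    for feature in sorted(all_features):
        count = sum(feature in features for features in car_features.values())
        if count == total:
            sets["Basic Set"].append(feature)       # included by all cars
        elif count / total > most_threshold:
            sets["Advanced Set 1"].append(feature)  # included by most cars
        else:
            sets["Advanced Set 2"].append(feature)  # included by some cars
    return sets


# Hypothetical car types; the feature names are taken from the case study.
cars = {
    "CarA": {"Doors", "Energy", "Toilet", "Cabin"},
    "CarB": {"Doors", "Energy", "Toilet"},
    "CarC": {"Doors", "Energy", "Toilet", "Auxiliary"},
    "CarD": {"Doors", "Energy"},
}
print(categorize(cars))
```

Each car type can then reference the Basic Set and only those Advanced Sets whose features it actually needs.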

4.2.6. Expressing commonalities

To explain how we propose to express commonalities, we first need to mention the importance of Feature Referencing. This relation type is already explained in Section 3. and is needed for scenarios in which one subfeature is contained by more than one feature.

This relation type can also be used to express hierarchy levels in a diagram. However, a feature diagram already has levels of hierarchy: if one feature is considered a concept node, everything underneath it can be a separate diagram. We therefore considered that this use of references would not be very beneficial in this work and decided not to use it.

Figure 14: Features Referencing

Figure 15: Features commonalities

Since capturing and showing the commonalities and variabilities of a Software Product Line is very important for this thesis work, we focused on finding a way to present them in a feature diagram. As mentioned in Section 4.1., expressing commonalities is more modeling related, whereas presenting them in the diagram is more a matter of understanding and representation.

In our scenario we use the Categorizations from the previous challenge to represent commonalities. These commonalities still have to be related to the train cars. To achieve this we use the concept of Feature Referencing, but with some new notations proposed by us. The subfeatures to be referenced are indexed in a way similar to vector indexing, starting from 0 and going from left to right in the order in which they appear in the existing diagram. The index is placed inside curly brackets and consists of a single number. The referencing features are the ones that contain the subfeatures, and instead of the normal referencing notation we use double square brackets. If such a feature contains a single subfeature, a single number is put between the double brackets. If it includes the subfeatures from ’i’ to ’j’ and these are consecutive numbers, the notation [[ i-j ]] is used. If the numbers are not consecutive, they are listed one by one, separated by commas, as in [[ i, j ]], or given as multiple intervals, as in [[ i-j, m-n ]], where i < j < m < n.
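The double-bracket notation described above can be made precise with a small parser. The following is a minimal sketch under our own assumptions: the functions `parse_reference` and `resolve` are hypothetical, and the order of the indexed subfeatures in the example is invented for illustration.

```python
import re


def parse_reference(text):
    """Expand a reference such as '[[ 0-2, 4 ]]' into the indices [0, 1, 2, 4]."""
    match = re.fullmatch(r"\s*\[\[(.*)\]\]\s*", text)
    if match is None:
        raise ValueError(f"not a double-bracket reference: {text!r}")
    indices = []
    for part in match.group(1).split(","):
        bounds = part.split("-")
        if len(bounds) == 1:                  # a single index, e.g. '4'
            start = end = int(bounds[0])
        elif len(bounds) == 2:                # an interval, e.g. '0-2'
            start, end = int(bounds[0]), int(bounds[1])
            if start >= end:
                raise ValueError(f"interval must satisfy i < j: {part.strip()!r}")
        else:
            raise ValueError(f"malformed part: {part.strip()!r}")
        # Parts must be strictly increasing, as in i < j < m < n.
        if indices and start <= indices[-1]:
            raise ValueError("indices must be strictly increasing")
        indices.extend(range(start, end + 1))
    return indices


def resolve(text, indexed_subfeatures):
    """Return the subfeatures that a reference selects, in diagram order."""
    return [indexed_subfeatures[i] for i in parse_reference(text)]


# Hypothetical left-to-right order of subfeatures carrying the indices {0}..{4}.
subfeatures = ["Doors", "Energy", "Toilet", "Braking", "Cabin"]
print(resolve("[[ 0-2, 4 ]]", subfeatures))
```

Rejecting overlapping or unordered parts keeps each reference unambiguous, so two different strings never denote the same set of subfeatures in a different order.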

4.2.7. Specializations

of Specialization: one is of type Overwrite, which overwrites the feature group with the same name, and the other is of type Append, which is added to the feature group with the same name.

Figure 16: Overwrite

To put it simply, if the car ’DM1 - panto’ already has a feature reference to the ’Basic Set’, it automatically contains the feature ’Energy’ with all the subfeatures underneath it. If there is nothing to add to those subfeatures for this type of car, no changes are made.
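The effect of the two specialization types on a referenced feature group can be sketched as follows. This is only an illustration under our own assumptions: the function `specialize`, the tuple representation of a specialization and the subfeature names under ’Energy’ are hypothetical.

```python
def specialize(referenced, specializations):
    """Apply Overwrite and Append specializations to referenced feature groups.

    referenced maps a group name (e.g. 'Energy' from the 'Basic Set') to its
    subfeatures; specializations is a list of (type, group name, subfeatures).
    """
    result = {name: list(subs) for name, subs in referenced.items()}
    for kind, name, subs in specializations:
        if name not in result:
            raise KeyError(f"no referenced feature group named {name!r}")
        if kind == "Overwrite":
            result[name] = list(subs)     # replaces the group with the same name
        elif kind == "Append":
            result[name].extend(subs)     # adds to the group with the same name
        else:
            raise ValueError(f"unknown specialization type: {kind!r}")
    return result


# Hypothetical subfeatures that a car obtains through its reference.
basic_set = {"Energy": ["base energy subfeature"]}

unchanged = specialize(basic_set, [])
appended = specialize(basic_set, [("Append", "Energy", ["extra subfeature"])])
overwritten = specialize(basic_set, [("Overwrite", "Energy", ["other subfeature"])])
print(unchanged, appended, overwritten, sep="\n")
```

The referenced groups are copied before they are modified, so a specialization of one car type never changes the shared ’Basic Set’ that other car types reference.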

Figure 17: Append

5. Related Work

During our research process we came across some contributions that we consider related to our work. The topics they cover range widely, from cases of variability systems in large industrial product lines and open-source software, to modeling approaches proposed in specific projects, and lastly to the importance of system requirements and their transformation into the right features and relations.

The work by Berger et al. [16] is an exploratory case study of three companies that apply variability modeling. The authors present industrial practices and how this way of modeling is used and valued in industry, since according to them there is a huge gap between the proposed theoretical and practical solutions and their actual application in everyday use. They chose these three cases because they represent a wide range of organizational sizes, cover the most common adoption strategies and originate from domains that apply variability modeling regularly. The three companies under study are a consulting company, a component producer and a car manufacturer. The information extracted from them concerns issues that focus mostly on managing variability and features. Each case is presented in the article with the same structure: first the notations used, tools, modularization approaches, model sizes, feature types, relation types, cross-tree constraints and model hierarchy depths, and then variability aspects such as variable artifacts, variability mechanisms and feature-to-artifact mapping. The results of the study showed that:

”Feature models are perceived as intuitive and simple notations that organize unique domain knowledge and foster understanding and collaboration among developers. Interestingly, instead of declaring and maintaining constraints, our subjects prefer to manage a set of configurations or to let experts configure products. Thus, the primary benefit of variability modeling lies in variability management—organizing, visualizing, and scoping features—less in configuration and automation for our subjects.”

It is very important to see how much of the theoretical proposals made over the years fits industrial needs and usage, how much is strictly followed in a modeling process and how much is considered completely unnecessary. In this way, we think that academic contributions can follow the demand of industry and develop along with it, instead of providing advancements that are not feasible and will never be exploited.

In another, similar work by some of the same authors [17], variability models and languages in the systems software domain are studied. Many variability languages have been proposed since the introduction of Feature-oriented domain analysis (FODA) [13]. The study focuses on the usage, semantics, constructs and associated tools of two variability modeling languages, Kconfig and CDL, which were developed independently and are used extensively in software projects outside academia. 128 variability models from 12 open-source projects using these languages were analyzed and evaluated. The results show to what degree the existing languages and tools support real-world projects, address how well the variability tools cover the size and complexity of variability models reported in academic papers, and provide requirements for concepts and mechanisms that are not commonly considered in academic techniques. Like the previous work, this article is valuable for modeling engineers who want to evaluate the practicability of theoretical modeling and analysis techniques in real-world industrial projects.

Through this process, elements and parts of the SPL can be evaluated and classified into project specific or general for all the generated models.

6. Discussion

Our contribution aims to give guidance on how to carry out the modeling of a Software Product Line into a feature diagram from the very beginning. We give instructions on important concerns that should be considered before starting to model, and then gradually explain why it would be beneficial to follow our approach in scenarios similar to ours. We focused on how Variability Modeling manages the variabilities and commonalities in the product line. This topic has been tackled in previous work, and we tried to refine it to some extent and present how it was applied in our project. Our main target was the extraction and representation of variabilities and commonalities in the feature model, which reflects the variabilities and commonalities among the products of the product line.

Variability modeling proposals are mainly focused on ways of presenting the variability. They mostly propose diagrams or other approaches to guide modeling engineers in presenting product lines, but very few, if any, mention the strategy or practice one should follow to conclude that a certain feature is common or variable. There are no defined criteria for categorizing features. We were given the overall document of the project, with tables of features as a way of representing the product line. The tables were descriptive and not comparative, so at first we did not know how the features would be used, except that they would be included in a diagram. We knew that they had to present the commonalities and variabilities of the product line in some way, but we had to find that way ourselves. What needs more work in the research area is defining criteria for categorizing features. Our suggestion, as a result of our work, is to treat features or feature groups that are common at a rate of 50%-75% as commonalities, and the others as variabilities. We cannot say for sure that this is the right solution, as it is the outcome of a single working process, but we are confident that it produces decent results.

Continuing with the modeling process itself: after we evaluated the features and what each of them represented for the product line, we faced many challenges, related to notations, to concepts or to ways of representing them in the diagram. We consider the identified challenges a result of the limitations of the existing approaches, especially for the more complex variability scenarios of product families. We think that by following all the solutions proposed in Section 4.2. step by step, one can generate a very accurate diagram. It is up to the modeling engineer or the customer to decide whether it is acceptable according to their needs and requirements.

We mention several times that implementing these suggestions would make it easier to produce these diagrams and transform them into product variants. As an example, with the ’Base’ and ’Option’ functionality from challenge 4 in Section 4.2., one click to select Base would be enough to get the domain model of the product line, also known as the simplest version of the product. By adding other options, different and more complicated variants could be obtained. By more complicated we mean more modified and including more features.

6.1. Threats to Validity

If we consider the information we obtained from research papers in the scientific databases to be scientifically accurate, then we think that there are no threats to the validity of that information. We restate that all the concepts we used and worked with come from verified documents such as research papers and the guide books of the modeling tools.

7. Conclusions

All the challenges that were tackled and identified as such came as a consequence of the limitations of the existing approaches regarding complex variability scenarios. With this sentence and all the demonstrated work in Section 4. we try to give an answer to the first Research Question. To conclude our work, some of the challenges came as a result of the lack of information and previous research. We can mention here Challenge 2: Selecting features with a higher impact, Challenge 3: Choosing the right feature relations, Challenge 6: Expressing commonalities and Challenge 7: Specializations. Challenges that came as a result of the techniques of representing a Software Product Line into a feature diagram are Challenge 1: Feature Repetitions, Challenge 4: Base and Option possibilities and Challenge 5: Categorizations.

We think that the work done in this thesis is a small contribution to current modeling techniques, but an important step in inspiring and motivating other engineers to extend them. We have established conditions and principles for setting up criteria for deciding on the techniques, approaches and practices to follow when carrying out a modeling process from start to finish.

As for the second Research Question, we tried to give theoretical contributions to the identified Challenges and to explain in detail what is insufficient in the existing approaches, which led us to point these insufficiencies out and propose new approaches. The practical contribution followed, displaying and representing the theoretical contribution in an actual feature diagram. After explaining the new notations, it was necessary to give an idea of how to present them and integrate them into the existing modeling techniques. It is sometimes difficult to come up with something completely new and fit it into existing means, which is one reason why there are many variations of the same modeling procedure. There are several different ways of representing the same relation type in a diagram, even for the simplest feature relations such as OR or XOR. We think this results from authors not investigating other works and introducing new symbols for every concept discovered during their own work. That is why, even though we did not mention it previously, carrying out this integration into the existing contributions was a challenge.

7.1. Future Work

References

[1] D. Benavides, S. Segura, and A. Ruiz-Cortés, “Automated analysis of feature models 20 years later: A literature review,” Information systems, vol. 35, no. 6, pp. 615–636, 2010.

[2] K. Czarnecki, S. Helsen, and U. Eisenecker, “Formalizing cardinality-based feature models and their specialization,” Software process: Improvement and practice, vol. 10, no. 1, pp. 7–29, 2005.

[3] ——, “Staged configuration using feature models,” in International Conference on Software Product Lines. Springer, 2004, pp. 266–283.

[4] P. C. Clements, L. G. Jones, L. M. Northrop, and J. D. McGregor, “Project management in a software product line organization,” IEEE software, vol. 22, no. 5, pp. 54–62, 2005.

[5] K. Pohl, G. Böckle, and F. J. van Der Linden, Software product line engineering: foundations, principles and techniques. Springer Science & Business Media, 2005.

[6] D. Streitferdt, M. Riebisch, and K. Philippow, “Details of formalized relations in feature models using OCL,” in 10th IEEE International Conference and Workshop on the Engineering of Computer-Based Systems, 2003. Proceedings. IEEE, 2003, pp. 297–304.

[7] P. Clements, “Software product lines: A new paradigm for the new century,” Crosstalk, vol. 12, no. 2, pp. 20–22, 1999.

[8] D. L. Webber and H. Gomaa, “Modeling variability with the variation point model,” in International Conference on Software Reuse. Springer, 2002, pp. 109–122.

[9] J. Carbonnel, M. Huchard, and C. Nebut, “Towards complex product line variability modelling: Mining relationships from non-boolean descriptions,” Journal of Systems and Software, vol. 156, pp. 341–360, 2019.

[10] V. R. Basili, R. W. Selby, and D. H. Hutchens, “Experimentation in software engineering,” IEEE Transactions on software engineering, no. 7, pp. 733–743, 1986.

[11] S. Patig, “Measuring expressiveness in conceptual modeling,” in International Conference on Advanced Information Systems Engineering. Springer, 2004, pp. 127–141.

[12] M. Sinnema and S. Deelstra, “Classifying variability modeling techniques,” Information and software technology, vol. 49, no. 7, pp. 717–739, 2007.

[13] K. C. Kang, S. G. Cohen, J. A. Hess, W. E. Novak, and A. S. Peterson, “Feature-oriented domain analysis FODA feasibility study,” Carnegie-Mellon Univ Pittsburgh Pa Software Engineering Inst, Tech. Rep., 1990.

[14] M. L. Griss, J. Favaro, and M. d’Alessandro, “Integrating feature modeling with the RSEB,” in Proceedings. Fifth International Conference on Software Reuse (Cat. No. 98TB100203). IEEE, 1998, pp. 76–85.

[15] M. Riebisch, K. Böllert, D. Streitferdt, and I. Philippow, “Extending feature diagrams with UML multiplicities,” in 6th World Conference on Integrated Design & Process Technology (IDPT2002), vol. 23, 2002, pp. 1–7.

[16] T. Berger, D. Nair, R. Rublack, J. M. Atlee, K. Czarnecki, and A. Wąsowski, “Three cases of feature-based variability modeling in industry,” in International Conference on Model Driven Engineering Languages and Systems. Springer, 2014, pp. 302–319.

[17] T. Berger, S. She, R. Lotufo, A. Wąsowski, and K. Czarnecki, “A study of variability models and languages in the systems software domain,” IEEE Transactions on Software Engineering, vol. 39, no. 12, pp. 1611–1640, 2013.

Table of Contents

List of Figures
1. Introduction
1.1. Objectives and Problem Formulation
1.2. Structure
2. Research Methodology
3. Background
3.1. Software Product Lines
3.2. Feature Modeling
3.3. Feature relations
3.4. Feature types
3.5. Complex Variability Scenarios
4. Identified challenges and proposed solutions
4.1. Introduction to the identified challenges
4.2. Challenges and solutions
5. Related Work
6. Discussion
6.1. Threats to Validity
7. Conclusions
7.1. Future Work
References