
Bachelor Degree Project in Automation Engineering
Bachelor Level, 30 ECTS

Spring term 2015

Rafael Adrián García Martín
José Manuel Gaspar Sánchez

Supervisor: Amos H.C. Ng
Examiner: Matías Urenda Moris

SCREENING FOR IMPORTANT FACTORS IN LARGE-SCALE SIMULATION MODELS: SOME INDUSTRIAL EXPERIMENTS


The University of Skövde 15 August 2015

The present report was written during the spring term of 2015 in a four-month project at the University of Skövde. The research described herein was conducted under the supervision of Professor Amos H.C. Ng and was examined by Matias Urenda Moris, Senior Lecturer and Division Head of Production and Automation Engineering at this university.

All material in this thesis which is not the authors’ own work has been identified, and no material is included for which a degree has previously been conferred on them.

Signed by,

The authors

Rafael Adrián García Martín (ragm1993@gmail.com, +34 665252386)
José Manuel Gaspar Sánchez (jmgs.we@outlook.com, +34 653247406)

The supervisor
Amos H.C. Ng
Professor of Production & Automation Engineering
amos.ng@his.se, +46 500448541

The examiner
Matías Urenda Moris
Senior Lecturer and Division Head for Production and Automation Engineering
matias.urenda.moris@his.se, +46 500448523


Abstract

Screening for important factors in large-scale simulation models: some industrial experiments

by

García Martín, R. Adrián and Gaspar Sánchez, J. M.

Submitted to the School of Engineering Science in partial fulfillment of the requirements for obtaining the Bachelor Degree in Automation Engineering

The University of Skövde 15 August 2015

The present project discusses the application of screening techniques to large-scale simulation models, with the purpose of determining whether this kind of procedure could be a substitute for, or a complement to, simulation-based optimization for bottleneck identification and improvement.

Based on sensitivity analysis, screening techniques aim to find the most important factors in simulation models that contain many factors, of which presumably only a few are important. The screening technique selected for study in this project is Sequential Bifurcation. This method groups the potentially important factors and repeatedly divides the groups depending on the response generated by the model of the system under study.

The results confirm that the Sequential Bifurcation method can considerably reduce the simulation time, since it requires fewer simulations than the optimization study. Furthermore, by introducing two-factor interactions into the metamodel, the results become more accurate and may even match the accuracy of the optimization results. On the other hand, it was found that the application of Sequential Bifurcation can become problematic in terms of accuracy when there are many storage buffers in the decision-variable list.

For these reasons, the screening techniques cannot be a complete alternative to simulation-based optimization. However, as some initial results show, the combination of these two methods could yield a promising roadmap for future research, which the authors of this project highly recommend.


Dedications

To my family, my friends and my partner, without whom I would not be what I am.

Rafael Adrián

I would like to dedicate my work to my family and friends, who have been an indispensable support. Special thanks to the exceptional people I met at Skövde.

José Manuel


Acknowledgements

This project would not have been possible without the love, support, and encouragement I received from my parents, my brother, my grandma and Natalia, my partner. I do not have words to describe my deep gratitude for all they have provided me. I love you all.

Thanks to my friends, especially to Fernando, Marina, Manolo, Cristina, Sarai and Loli for supporting me and sharing with me very many experiences. I am grateful to them for their affection and their advice during my life.

Thanks also to Professor Amos H.C. Ng and Matias Urenda Moris, for their professionalism, dedication and assistance with the present research.

Finally, thanks to José Manuel, for putting up with me during this intense and wonderful year.

Rafael Adrián

I would like to show my gratitude to our supervisor, Amos H.C. Ng, for providing all the help required. His great vision in recommending a topic to us and his knowledge gave us the opportunity to work in this promising area.

I would also like to thank Matias Urenda Moris for his useful feedback, which helped turn our research into a success.

Finally, thanks to my project partner, Rafael Adrián, for his constant motivation.

José Manuel


Table of Contents

ABSTRACT ... III
DEDICATIONS ... IV
ACKNOWLEDGEMENTS ... V
TABLE OF CONTENTS ... VI
LIST OF FIGURES ... VIII
LIST OF TABLES ... IX

1. INTRODUCTION ... 1

1.1. BACKGROUND TO THE STUDY, PROBLEM AND SIGNIFICANCE ... 1

1.2. AIM AND OBJECTIVES OF THE STUDY ... 2

1.3. METHODOLOGY AND REPORT STRUCTURE... 3

1.4. LIMITATIONS AND DELIMITATIONS ... 4

1.5. SUSTAINABILITY ... 5

1.6. CONFIDENTIALITY ... 5

2. LITERATURE REVIEW ... 6

2.1. SIMULATION FIELD ... 6

2.1.1. Discrete event simulation ... 6

2.1.2. Verification, validation and preparatory experiments ... 6

2.1.3. Simulation software ... 7

2.2. SYSTEMS IMPROVEMENT VIA SIMULATION ... 7

2.2.1. Design of experiments ... 7

2.2.2. Screening techniques ... 7

2.2.3. Optimization ... 8

2.3. SEQUENTIAL BIFURCATION ... 9

2.3.1. Original Sequential Bifurcation ... 9

2.3.2. Sequential Bifurcation under uncertainty ... 10

2.3.3. Controlled Sequential Bifurcation ... 10

2.3.4. Two-Face Sequential Bifurcation ... 10

2.4. SPLITTING INTO DIFFERENT GROUPS ... 11

2.5. CASE STUDIES RELATED TO SEQUENTIAL BIFURCATION ... 11

2.6. SUMMARY OF THE CHAPTER ... 11

3. SEQUENTIAL BIFURCATION METHOD ... 12

3.1. NOTATION ... 12

3.2. ASSUMPTIONS ... 12

3.3. EXPLANATION OF THE METHOD ... 13

3.4. UPPER LIMITS ... 14

3.5. SEQUENTIAL BIFURCATION CONSIDERING INTERACTIONS ... 14

3.5.1. Calculation of the interactions ... 14

3.5.2. Limitations when calculating the interactions ... 15

3.6. CONCLUDING REMARKS ... 15

4. AUTOMATION OF THE METHOD ... 16

4.1. PROGRAMMING LANGUAGE SELECTED ... 16


4.2. FLOWCHART OF THE COMPUTER PROGRAM ... 16

4.3. PROGRAM STRUCTURE: MAIN FUNCTIONS ... 17

4.3.1. Packages ... 17

4.3.2. Screening ... 17

4.3.3. Screening when considering interactions ... 18

4.3.4. Simulate ... 19

4.3.5. Results ... 19

4.4. SUMMARY OF THE CHAPTER ... 19

5. INDUSTRIAL SIMULATION MODELS ... 20

5.1. TABLE PRODUCTION SIMULATION MODEL (TPS) ... 20

5.1.1. Steady state analysis: TPS ... 21

5.1.2. Replication analysis: TPS ... 21

5.2. VCC ASSEMBLY ... 22

5.2.1. Steady state analysis: VCC Assembly ... 22

5.2.2. Replication analysis: VCC Assembly ... 22

5.3. VCC L-FAB ... 23

5.3.1. Steady state analysis: VCC L-fab ... 23

5.3.2. Replication analysis: VCC L-fab ... 23

5.4. NBS ... 24

5.4.1. Steady state analysis: NBS ... 24

5.4.2. Replication analysis: NBS ... 24

5.5. SUMMARY OF THE CHAPTER ... 24

6. RESULTS ... 25

6.1. TPS ... 25

6.2. VCC ASSEMBLY ... 29

6.2.1. Sequential Bifurcation analysis: VCC Assembly ... 29

6.2.2. Comparison with the optimization study: VCC Assembly ... 33

6.3. VCC L-FAB ... 34

6.3.1. Sequential Bifurcation analysis: VCC L-fab ... 34

6.3.2. Comparison with the optimization study: VCC L-fab ... 34

6.4. NBS ... 38

6.4.1. Sequential Bifurcation analysis: NBS ... 38

6.4.2. Comparison with the optimization study: NBS ... 38

6.5. SUMMARY OF THE RESULTS ... 42

7. DISCUSSIONS... 43

8. CONCLUSIONS ... 45

8.1. MAIN CONCLUSIONS ... 45

8.2. FUTURE RESEARCH PROPOSED ... 46

REFERENCES ... 47

APPENDIX ... 49

APPENDIX A: LIST OF FORMULAS ... 49

APPENDIX B: SB-PROGRAM ... 50


List of figures

Figure 1. Work methodology followed in the project ... 3
Figure 2. Sequential Bifurcation in relation to sustainability ... 5
Figure 3. MOO for systems improvement ... 9
Figure 4. SB applied to an Ericsson factory in Sweden ... 10
Figure 5. Main techniques to find the most important factors in a simulation model ... 11
Figure 6. Explanation of the SB with a ten factors group ... 13
Figure 7. Flowchart of the program ... 16
Figure 8. TPS model ... 20
Figure 9. Steady state of the TPS model ... 21
Figure 10. VCC Assembly model ... 22
Figure 11. Steady state of the VCC Assembly model ... 22
Figure 12. VCC L-fab model ... 23
Figure 13. Steady state of the VCC L-fab model ... 23
Figure 14. NBS model ... 24
Figure 15. Steady state of the NBS model ... 24
Figure 16. Most important factors of the TPS model without considering interactions ... 25
Figure 17. Most important factors of the TPS model when considering interactions ... 25
Figure 18. Pareto graphs of the TPS model ... 26
Figure 19. Most important interactions of the TPS model ... 27
Figure 20. Group bifurcations of the TPS model ... 28
Figure 21. Most important factors of the VCC Assembly model without considering interactions ... 29
Figure 22. Most important factors of the VCC Assembly model when considering interactions ... 30
Figure 23. Pareto graphs of the VCC Assembly model ... 31
Figure 24. Most important interactions of the VCC Assembly model ... 32
Figure 25. Comparison of SB and optimization in the VCC Assembly model: Pareto graphs ... 33
Figure 26. Comparison of SB and optimization in the VCC L-fab model: Pareto charts ... 34
Figure 27. SB results of the VCC L-fab model ... 35
Figure 28. Pareto graphs of the VCC L-fab model ... 36
Figure 29. Most important interactions of the VCC L-fab model ... 37
Figure 30. Comparison of SB and optimization in the NBS model: Pareto graphs ... 38
Figure 31. SB results of the NBS model ... 39
Figure 32. Pareto graphs of the NBS model ... 40
Figure 33. Most important interactions of the NBS model ... 41
Figure 34. Buffers' influence in the SB: TPS model example ... 43
Figure 35. Incongruence of the results: An experiment with the TPS model ... 44
Figure 36. Combination of SB and optimization: initial results ... 44


List of tables

Table 1. Replication analysis of the TPS model ... 21
Table 2. Replication analysis of the VCC Assembly model ... 22
Table 3. Replication analysis of the VCC L-fab model ... 23
Table 4. Replication analysis of the NBS model ... 24
Table 5. Simulations required by SBO, SB and SBi to find the most important factors ... 42


1. INTRODUCTION

Background to the study, aim and objectives, report structure, limitations, sustainability and confidentiality

In the present project, the Sequential Bifurcation method (SB), one type of screening for important factors in large-scale simulation models, is studied and analyzed by performing some experiments in order to compare its results with the optimization study of different industrial models.

The screening technique studied in this project is the original SB presented by Bettonvil and Kleijnen (1996), which is applied in an automated manner to four discrete event simulation models: one of these models serves as an explanation of the method, whereas the other three are used to compare simulation-based optimization (SBO) and Sequential Bifurcation.

1.1. Background to the study, problem and significance

Detecting important factors in industrial systems is necessary to identify the changes required, e.g. to the availability of the machines or the capacity of the warehouses, with the purpose of improving these systems in terms of throughput, work-in-progress (WIP) or lead time, among other parameters. In this sense, simulation is an effective method for reaching this objective and, generally, for the analysis of large-scale, complex, real-world or artificial systems.

Nowadays, there are many techniques that can be applied to simulation models with the aim of finding their most important factors. Among them, design of experiments and simulation-based optimization are the most well-known. SBO, which can be defined as finding the best input variables, i.e. the most important factors in this context, among all the possible variable combinations without explicitly evaluating each possibility (Carson & Anu, 1997), is currently the preferred method because of its accuracy.

The problem arises when a large number of factors in a simulation model can be subject to changes. When this happens, the search space in an optimization study increases exponentially. Take a simple example: for a 2-level full factorial study, the number of combinations of 10 variables is 2^10 = 1024. If the number of factors is increased tenfold to 100, the total number of possible combinations becomes 2^100 ≈ 1.27×10^30, which makes any enumeration impossible. The problem is even worse in terms of time: due to the complexity of the models, each simulation takes a relatively long time; consequently, and since the search space is larger, the time required to find the most important factors using optimization is extremely long compared with other techniques. For this reason, simulation models with many factors pose a difficult challenge to a simulation-based optimization study.
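The growth of the search space described above can be checked with a few lines of code. This is only an illustration of the arithmetic, not part of the SB program developed in this thesis:

```python
# Size of a 2-level full factorial search space as the number of factors grows.
def search_space_size(n_factors: int, levels: int = 2) -> int:
    return levels ** n_factors

for k in (10, 20, 50, 100):
    print(f"{k:>3} factors -> {search_space_size(k):.3e} combinations")
# 10 factors give 1024 combinations; 100 factors give about 1.27e+30,
# far more than any simulation study could ever enumerate.
```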

In the literature, a common method to ameliorate this problem is screening, a type of design of experiments. By using the screening techniques, and more precisely, Sequential Bifurcation, which is particularly suitable for large-scale simulation models, fewer simulations are required to find the most important factors in a simulation model, reducing the time of the study significantly. However, the reliability of the SB is under discussion (Kleijnen, et al., 2006), especially in terms of accuracy. Therefore, it is necessary to perform a comparison between Sequential Bifurcation and simulation-based optimization in order to test the precision of the procedure.


1.2. Aim and objectives of the study

The purpose of the present project is to apply the Sequential Bifurcation method in different industrial models in order to determine whether the screening techniques could be a substitute for or a complement to the current method that is based on stochastic evolutionary optimization.

In order to reach the aim, the following objectives have to be achieved:

1. Understand and describe the Sequential Bifurcation method.

A proper study of the method is fundamental to a correct implementation. Therefore, a concomitant objective is to perform a literature review, necessary to summarize and study the different variants of the original procedure.

2. Develop a computer program for the application of the Sequential Bifurcation method.

The program should show the results of the screening method in terms of time, number of simulations and most important factors in the four discrete event simulation models to be studied. It should be emphasized that the program needs to be combined with FACTS Analyzer, simulation software developed at the University of Skövde, to run the simulations. This is because the models studied in this project were built with this program.

Furthermore, other sub-objectives are proposed with regards to the development of the computer program, such as:

2.1. Create the code for the method, structuring it into different functions.

2.2. Show the Sequential Bifurcation results using different charts, similarly to the one shown in the publication presented by Kleijnen, Bettonvil and Persson (2006).

2.3. Develop the code to calculate a Pareto chart based on the results of the corresponding simulation models, both of Sequential Bifurcation and simulation- based optimization.

This objective and its corresponding sub-objectives represent the key part of the project; the proper accomplishment of the aim of this thesis depends on their achievement.

3. Analyze the reliability of the Sequential Bifurcation method.

This objective is especially related to the aim of the project, although it could be executed without any comparison with optimization. Notwithstanding, without this comparison the validation of the results would be more difficult to achieve. Thus, after finishing the automation of the method and the experiments, the reliability of the SB should be discussed based on the results obtained directly from the method and especially from the comparison with the optimization study.

Furthermore, the problems found during the project should be explained with the purpose of creating a roadmap for future researchers.


1.3. Methodology and report structure

In order to achieve the goals of this project properly, a work methodology based on different steps needs to be followed. The flowchart shown in figure 1 represents the stages carried out during the project with their corresponding chapters in the report.

First of all, it is essential to perform a literature review with the objective of analyzing the results obtained in previous research of this field as well as to summarize the different variants of the Sequential Bifurcation method that have been studied previously.

Once the different types of screening methods are studied, the Sequential Bifurcation method is described and analyzed, based especially on the 1996 and 2006 Sequential Bifurcation publications co-authored by Professor Kleijnen.

After that, the automation of the method starts. It is the most difficult and time-consuming task of the project. The experiments are done while developing the code and the results obtained are used as feedback to check the correctness of the implementation. The final results are achieved through several revisions and improvements of the code.

As a final point, the comparison with the optimization study is performed in order to contrast the accuracy of the Sequential Bifurcation method with that of simulation-based optimization.

Additionally, the conclusions need to be drawn based on the results obtained from the experiments and also from the comparison with the optimization study.

Regarding the report structure, it is divided into eight chapters, following the same organization as the methodology.

Figure 1. Work methodology followed in the project. Source: own elaboration

[Flowchart summary: Literature study (Ch. 2) → Understanding of the method (Ch. 3) → Developing of the code (Ch. 4) → Experiments (Ch. 5, 6) → Analysis of the results (Ch. 7, 8), with feedback from the experiments to the code development.]


1.4. Limitations and delimitations

The present project has some influences that cannot be controlled. The following limitations are the most significant ones:

1. Simulations require specific software.

Since the models to be studied are built in FACTS Analyzer, this program is required to perform the simulations. More precisely, an executable, xsim-runner.exe, is needed to perform the experiments. This file is invoked from the code, as shown in chapter 4.

This limitation represents a potential weakness in the project. All the simulations depend on this program, which is updated and controlled by the University of Skövde.

2. High-performance computers are required for the simulations.

This is a limitation especially related to time. Depending on the complexity of the model to be studied, a complete experiment can take more or less time, so this is an important variable to take into account. In this sense, many parts of the SB computer program require substantial computing resources, such as the computation of the interactions or the calculation of the graphs.

3. The factors to study are decided by the provider of the models.

The majority of the models in this project have a large number of factors, although not all of them are studied in the SB implementation. The provider of the models determines beforehand the factors to be studied, which entails a limitation, especially in terms of flexibility.

4. Likewise, the time limit for carrying out the project represents a significant limitation.

Regarding the delimitations, there are some boundaries set in this project that have to be mentioned:

1. The output variable to study is the throughput (TH).

Even though the SB program is designed to study the TH, WIP and lead time, only the throughput is studied in the experiments, both because the original SB implementation analyzes the factors in terms of TH and because of the time it would take to study the other parameters.

2. Only the interactions among the first ten most important factors are calculated.

The majority of the simulation models do not have more than ten relevant main effects (see chapter 3). Based on the assumption explained in that chapter, the rest of the interactions are not significant.

3. The minimum number of replications is set to 5.

Statistical practice suggests that this is the minimum number of replications needed in a simulation experiment to obtain credible results. However, it could be insufficient (Hoad, et al., 2007); hence, a replication analysis needs to be performed for each model.

4. The scope of the project is limited to the implementation and analysis of the existing Sequential Bifurcation method, i.e. it does not include the development of any new screening method.


1.5. Sustainability

Sustainability can be defined as “the successful management of resources to satisfy human needs while maintaining and enhancing the quality of the environment and conserving resources” (Hietala-Koivu, 1999). In this context, gearing the companies’ activities toward sustainable development is a necessary task nowadays.

The sustainability of enterprises depends on the efficiency of their processes. Today, the complexity of systems has increased due to technological development. Therefore, simulating these processes has become an important task for understanding the behavior of the systems and also for improving their efficiency.

The main objective of the SB is to find the important factors in a simulation model; thus, the efficiency of the system can be improved, decreasing the number of changes to implement.

Additionally, finding the most important factors in the system can indirectly reduce any environmental impact, e.g. due to adding unnecessary resources.

Furthermore, the SB method can be applied to different types of models. Even though in this research all the models are related to industry, the Sequential Bifurcation method can improve sustainability if the models consider environmental factors, as in the models studied by Bettonvil and Kleijnen in 1996. Figure 2 represents the main aspects of sustainability in the simulation field and its relationship with Sequential Bifurcation.

Figure 2. Sequential Bifurcation in relation to sustainability. Source: own elaboration

1.6. Confidentiality

The present project was carried out at the University of Skövde, without any enterprise cooperation; hence, no confidentiality document needed to be signed. However, some privacy aspects have been considered because many real industrial models were used during the project. The names of the factories, details of the companies and real data of the simulation models used in the study have not been explicitly revealed. Data whose disclosure could be sensitive, such as throughput results, have not been included in this report, especially in the charts and final results.



2. LITERATURE REVIEW

Review of the simulation literature and the progress and development of the SB

In this chapter, a thorough review of the simulation literature is presented in order to relate the progress of this field to the Sequential Bifurcation method, as well as to introduce the different variants of this technique that have been studied previously. Furthermore, some real case studies related to Sequential Bifurcation are described.

2.1. Simulation field

Simulation is the process of designing a model of a system with which experiments are performed to understand the operation of that system or to evaluate different strategies (Shannon, 1976). This tool is usually used to predict situations and gather information before implementing changes in real systems.

2.1.1. Discrete event simulation

Nowadays, the majority of real systems are too complex to be modeled using analytical techniques (Ng, et al., 2008). Hence, one of the key points when conducting a simulation study is the type of simulation modeling to use. Discrete event simulation (DES) is the most common one in this area because it "provides an intuitive and flexible approach to representing complex systems" (Karnon, et al., 2012). The main characteristic of DES is that it considers the operation of a system as a set of discrete events, which facilitates the simulation process.

In this project, and as mentioned before, the experiments are performed with DES models.

2.1.2. Verification, validation and preparatory experiments

The verification and the validation of the models are very important aspects that have to be taken into account when starting a simulation study.

Once the model is finished, it is important to determine whether it has been correctly built in the simulation software, i.e. to start a debugging process (Law, 2003). This is called verification. Afterward, it is necessary to carry out a validation process, which compares the model with the real system in order to determine the accuracy of its representation and also to reduce or eliminate the variability of that system (Law, 2009). In this research, it has been assumed that the models provided were verified and validated beforehand; no verification and validation process has been carried out.

Another aspect to be considered in a simulation study is the preparatory experiments. Both a steady-state and a replication analysis must be performed to validate the results. A steady-state study determines the time that a system requires to become stable (Mahajan & Ingalls, 2004), and it is usually the experience of the researchers that determines the correct value. Regarding the replication analysis, most authors debate the minimum number of replications necessary in a simulation experiment: although statistics suggests at least five, this number is currently under discussion.

These experiments are crucial to improving the accuracy of a simulation study because the data collection must be performed in the steady state and with the correct number of replications. In the present project, a replication analysis sheet provided by the University of Skövde has been used for the preparatory experiments.
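The replication analysis mentioned above can be sketched as a simple sequential procedure: keep adding replications until the confidence interval of the output is narrow enough relative to its mean. This is only an illustrative sketch; the 5% precision threshold, the normal z-value and the toy output model are assumptions for the example, not taken from the replication analysis sheet used in the project:

```python
import math
import random
import statistics

def replications_needed(run_replication, precision=0.05, min_reps=5,
                        max_reps=1000, z=1.96):
    """Add replications until the confidence-interval half-width is within
    `precision` (here 5%) of the running mean, a common rule of thumb."""
    results = [run_replication() for _ in range(min_reps)]
    while len(results) < max_reps:
        mean = statistics.mean(results)
        half_width = z * statistics.stdev(results) / math.sqrt(len(results))
        if half_width <= precision * abs(mean):
            break
        results.append(run_replication())
    return len(results), statistics.mean(results)

# Hypothetical stochastic output, e.g. the throughput of one simulation run.
random.seed(1)
n_reps, mean_th = replications_needed(lambda: random.gauss(100.0, 5.0))
print(n_reps, round(mean_th, 2))
```

In practice `run_replication` would execute one replication of the simulation model; here it is replaced by a normally distributed placeholder so the sketch is self-contained.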


2.1.3. Simulation software

Attention should also be paid when choosing simulation software. Nowadays, there are many options that can be considered before starting a simulation project. Some of them are well known, such as Arena or Plant Simulation, while others are specific to certain companies or universities. Since the models studied in the present project were built in FACTS Analyzer, all the simulations have been performed using this program. However, its graphical interface has not been used because the SB program calls its simulator, xsim-runner, automatically.

Additionally, FACTS Analyzer offers different optimization tools whose results have been used in this project to compare with the ranking of the most important factors obtained from the Sequential Bifurcation method.

2.2. Systems improvement via simulation

There are many techniques, as stated previously, that can be applied to discrete event simulation models with the aim of improving the systems. Among them, design of experiments and simulation-based optimization are the most important ones.

2.2.1. Design of experiments

The number of factors in a simulation model reflects the complexity of the system it represents. Related to this, design of experiments, a set of techniques based on sensitivity analysis, has the objective of finding the most important factors with the fewest possible simulations. There are different strategies that can be applied; the most important ones are: one-factor-at-a-time, 2^k factorial design and 2^(k-p) fractional factorial design (Law, 2014), where k is the number of factors and p depends on the experimental design.

One-factor-at-a-time simulates two different values of an input factor while keeping the rest of the factors at fixed values. This procedure is inefficient in terms of simulations, and it assumes that there are no interactions in the metamodel.

One way to calculate these interactions is by using a 2^k factorial design, choosing two levels for each factor and simulating all the 2^k possible input combinations. This technique is much more accurate than the one-factor-at-a-time approach when there are interactions between the factors of the model, but it is also more demanding in terms of simulations required.
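As a concrete illustration of a 2^k design, the snippet below enumerates all level combinations in coded units (-1/+1) and estimates the main effects from a toy response. The response function is invented for the example and is not one of the thesis models:

```python
from itertools import product

def full_factorial(k):
    """All 2**k level combinations, coded as -1 / +1."""
    return list(product((-1, +1), repeat=k))

def main_effects(design, responses):
    """Main effect of factor j: mean response at +1 minus mean at -1."""
    k = len(design[0])
    effects = []
    for j in range(k):
        hi = [y for x, y in zip(design, responses) if x[j] == +1]
        lo = [y for x, y in zip(design, responses) if x[j] == -1]
        effects.append(sum(hi) / len(hi) - sum(lo) / len(lo))
    return effects

# Toy linear response: factor 1 matters a lot, factor 2 not at all, factor 3 a little.
design = full_factorial(3)                      # 2**3 = 8 runs
responses = [10 + 4 * x[0] + 0 * x[1] + 1 * x[2] for x in design]
print(main_effects(design, responses))          # -> [8.0, 0.0, 2.0]
```

Note that the estimated effect of each factor is twice its coefficient, since the design moves each factor from -1 to +1; the 8-run cost is exactly the 2^k growth that screening techniques try to avoid.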

Finally, the 2^(k-p) fractional factorial design makes it possible to estimate the main effects and the two-factor interactions with a reduced number of simulations. The time required decreases as the value of p increases; however, the accuracy of the calculations decreases as well.

On the other hand, quadratic effects can be estimated with many of these methods. In this case, k extra runs are needed and each factor has to be simulated at more than two values (Kleijnen, 1998).

2.2.2. Screening techniques

Screening techniques are a type of design of experiments. They can be defined as a group of methods whose objective is to find the factors for which a small variation of the inputs implies an improvement in the output of the system. These techniques assume that the response of the model depends only on a few factors, which is called the principle of parsimony.


Several screening techniques have been developed over the last decades. Among them, it is worth highlighting supersaturated design, group screening, multiple group screening and Sequential Bifurcation (Dupuy, et al., 2014).

A supersaturated design uses fewer runs than the number of factors to be studied; it was developed by Satterthwaite (1959), as cited by Dupuy et al. (2014). Afterward, many different methods were created in this field, like the one published by Watson (1961), based on Dorfman's publications (1943). This method, called group screening, starts by combining the factors into small groups that are treated as single factors. The aim of this procedure is to use a simpler design to study the system.

Another important variant of this method is called multiple group screening (Morris, 1987). This method assigns each factor to more than one group. After making the necessary simulations, a factor is declared potentially influential if all the groups that contain it are active.

Finally, Sequential Bifurcation is a group-screening technique published by Bettonvil and Kleijnen (1996). This method and its different variants are explained in section 2.3.

2.2.3. Optimization

According to Carson and Anu (1997), and as stated before, simulation-based optimization is “the process of finding the best input variable values from among all possibilities without explicitly evaluating each possibility”, whose goal is to maximize the information obtained in a simulation experiment while reducing the resources spent.

Thus, the current optimization method, based on stochastic evolutionary optimization, is commonly used in the simulation field because of the accuracy of its results, especially for bottleneck identification and improvement (Ng, et al., 2014).

On the other hand, it should be noted that optimizing a system is difficult, and the difficulty depends especially on the number of input parameters. In this sense, the major problem of optimization is that it is not possible to conclude with assurance that one configuration is the best among all the possibilities, because the optimization method does not study all the input combinations.

In this context, many researchers have focused on evaluating different algorithms, as did Riley in her study (2013). More precisely, this author discussed and reviewed the discrete-event simulation optimization methodology from a theoretical standpoint, proposing areas for future research, such as designing intelligent optimization interfaces or dynamically-integrated optimization software.

Another aspect to emphasize in regard to optimization is the possibility of applying multiple objectives in the experiments. In their study, Ng, Pehrsson and Bernedixen (2014) show the benefits of implementing multi-objective optimization (MOO) in a simulation experiment. Figure 3 shows the concept of using MOO for bottleneck analysis, in which the TH is to be maximized while the number of changes is minimized.


In figure 3, it is possible to observe the multiple objectives proposed: f1(x) represents the TH to be increased and f2(x), the number of changes to be minimized. Moreover, TH0 is the throughput without implementing any change, and THmax is the maximum throughput the system can reach, obtained when Cmax changes are implemented. Cmax represents the maximum number of changes worth performing: if more than Cmax changes were implemented, f2(x) > Cmax, the TH would remain the same; these outcomes are called inferior solutions. Finally, THt represents the throughput when implementing a given number of changes lower than Cmax.

Figure 3. MOO for systems improvement. Source: Ng et al. (2014)

Based on their results, it is possible to conclude that MOO is a powerful tool to use in simulation studies, in the sense that the optimal combinations are generated in a single optimization run.

2.3. Sequential Bifurcation

The present project focuses on the Sequential Bifurcation method. Here follows a deeper analysis of this method and its different variants developed.

2.3.1. Original Sequential Bifurcation

Sequential Bifurcation originated in 1996, when Bettonvil and Kleijnen published a report explaining the method and applying it to a model. The aim of the method is to find the most important factors (k) among the great many (K) in large-scale models while reducing the number of simulations required (n), where k<<K and n<<K.

Moreover, the authors also extended their study to consider two-factor interactions, which makes the metamodel more precise. With this extended metamodel it is possible to estimate whether the interactions are important or not, although their values cannot be calculated.


Ten years later, Kleijnen, Bettonvil and Persson (2006) explained, applied, and discussed the SB. They showed graphically its behavior in a case study with different models from an Ericsson factory in Sweden, as shown in figure 4. In this picture, it is possible to observe the division of the factors in their experiment.

Figure 4. SB applied to an Ericsson factory in Sweden. Source: Kleijnen et al. (2006)

2.3.2. Sequential Bifurcation under uncertainty

The method created by Bettonvil and Kleijnen was developed for deterministic simulations, which is appropriate only in a few cases. In this context, several authors tried to improve the original method, e.g. Cheng (1997), who extended the original SB, naming it SB-under-uncertainty (SBU). The aim of this variant is to apply SB to models whose response is stochastic and subject to significant error, maintaining flexibility instead of reaching a precise level of significance.

2.3.3. Controlled Sequential Bifurcation

Another important variant of Sequential Bifurcation is Controlled Sequential Bifurcation (Wan, et al., 2003). The Controlled Sequential Bifurcation (CSB) procedure contains the main concepts from the original SB and the SBU method, and controls the correctness of the results with several algorithms in order to determine the importance of the factors more precisely.

2.3.4. Two-Face Sequential Bifurcation

Sequential Bifurcation requires the user to establish the direction of the effect of each factor. This can be a problematic step, because any mistake could change the results completely. Sanchez, Thomas and Wan (2005) tried to solve this problem by developing the Two-Face Controlled Sequential Bifurcation method (FF-CSB). It consists of two phases: firstly, the signs and magnitudes of the effects are estimated; secondly, these results are used to apply CSB.

It has to be emphasized that the original SB is the variant implemented in the present project, although many aspects of the other variants are included in the study, e.g. the inclusion of stochastic models or assuring the correctness of the results by means of several preparatory experiments. Furthermore, since the aim of the project is to compare the screening techniques with SBO, the selection of one variant or another would not affect the validity of the results.


2.4. Splitting into different groups

One of the key elements in a Sequential Bifurcation procedure is the way of splitting the group of factors. Frazier and Li Chen (2012) studied the optimal group-splitting problem using a Bayesian-dynamic programming method. They introduced the SB and formulated the optimal group splitting problem analyzing two different versions.

The first case studied in their article assumed that the probability of having an important factor was homogeneously distributed. In this circumstance, the conclusion was that dividing a group in half is the optimal decision when the number of factors is a power of two, consistent with the rule used by Bettonvil and Kleijnen (1996). The second case had two types of factors: one with a high probability of importance and a second one with a low probability of importance. In this circumstance, the optimal solution is to divide based on the category of the factors.
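The halving rule discussed above can be sketched with a small helper (our own illustration, not code from the cited papers): split after the largest power of two strictly smaller than the group size, which reduces to an even halving when the size is itself a power of two.

```r
# Split a group of factor indices after the largest power of two
# strictly smaller than its size
split_group <- function(group) {
  n <- length(group)
  stopifnot(n >= 2)
  cut <- 2^floor(log2(n - 1))
  list(first = group[1:cut], second = group[(cut + 1):n])
}

split_group(1:10)  # first = 1..8, second = 9..10
split_group(1:8)   # first = 1..4, second = 5..8 (even halving)
```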

2.5. Case studies related to Sequential Bifurcation

There are many case studies in the field of screening and, more precisely, related to Sequential Bifurcation. Two of them are explained next.

Yaesoubi and Roberts (2007) applied screening techniques in a medical simulation model for colorectal cancer to find the most important factors. By performing the experiment, they found that, out of the 72 factors considered, only 8 seemed to be important. The system was driven by a short list of factors although many others had a contribution.

In closer relation to Sequential Bifurcation, Shi, Shang, Liu and Zuo (2014) applied this method in a supply chain for JIT operations. In this case, the results were also significant: three important factors were found out of the 58 considered, with far fewer simulations.

The Sequential Bifurcation method that they applied considered the main effects, the interactions between two factors and the quadratic effects.

2.6. Summary of the chapter

In this chapter, design of experiments and optimization, the main techniques used to find the most important factors in a simulation model, have been described together with their different variants.

Furthermore, the Sequential Bifurcation method, which belongs to the screening techniques, a type of design of experiments, has been explained in detail, analyzing its characteristics and variants carefully. Figure 5 shows these techniques and their corresponding variants.

Figure 5. Main techniques to find the most important factors in a simulation model. Source: own elaboration. (The figure organizes the techniques as a tree: optimization; and design of experiments, comprising one-factor-at-a-time, the 2^k factorial design and the screening techniques. Within screening: supersaturated design, group screening and Sequential Bifurcation, the latter with its variants: original SB, SB under uncertainty, Controlled SB and Two-Face SB.)


3. SEQUENTIAL BIFURCATION METHOD

Assumptions, explanation of the method and SB extended with interactions

This chapter presents mathematically the SB method used in the present project. As mentioned before, it is based on the original Sequential Bifurcation method developed by Bettonvil and Kleijnen (1996).

3.1. Notation

𝐊: Number of total factors in the model.

𝐤: Number of most important factors.

𝐧: Number of simulations.

𝐱𝐣: Value of the jth input, standardized to lie in [−1, 1].

𝐲(𝐣): Output of the system when the inputs 1,…,j are at their high value and the inputs j+1,…, K are at their low value.

𝐲−(𝐣): Output of the system when the inputs 1,…,j are at their low value and the inputs j+1,…, K are at their high value.

𝛃𝟎: Average output of the model.

𝛃𝐣: Main effect of the jth factor.

𝛃𝐣,𝐣′: Interaction between factor j and factor j’.

3.2. Assumptions

The method has two basic assumptions:

Assumption 1: A first-order polynomial is a good approximation to the output of the system over the experimental domain of the simulation model.

y = β0 + ∑(j=1 to K) βj xj + ε (1)

This polynomial can be augmented with two-factor interactions including the cross-products between the factors.

y = β0 + ∑(j=1 to K) βj xj + ∑(j=1 to K−1) ∑(j′=j+1 to K) βj,j′ xj xj′ + ε (2)

Assumption 2: The direction of the effect of every factor is known.

This assumption is required because otherwise the main effects could cancel each other out.

With the standardized inputs (xj), this assumption implies that all the main effects (βj) are non-negative. Therefore, if the effect of the interactions is not studied, y(x) is non-decreasing when x increases.


3.3. Explanation of the method

The aim of the SB procedure is to solve the screening problem, i.e. k<<K and n<<K, by estimating the β values of all the factors. The initial step is to assign a high value, i.e. the one that generates the higher output, and a low value, the one that generates the lower output, to each factor of the model. This is performed before starting the iterative process of the SB, which consists of several steps.

Firstly, all factors placed in a single group are studied by observing the two extreme factor combinations: y(0), all factors set at their low value, and y(K), all factors set at their high value.

Secondly, the method estimates whether the group has an important effect, that is β1−K > 0 (see formula 3). In that case, the group is split into two subgroups, as described in section 2.4. Then, the new subgroups are analyzed in the same way, estimating their importance.

If any of the subgroups analyzed during the study is declared non-important (βj−j′ ≤ 0), all the factors contained in that group are considered non-important, so they are not divided into new subgroups. Eventually, the groups contain only one factor and the individual effect of each factor is estimated.

The process can be observed in figure 6. First, the method studies y(0) and y(10) and determines that this group is important. Secondly, the initial group is divided at the biggest possible power of two, eight. When analyzing the subgroups, the second one is determined to be non-important (β9−10 ≤ 0); therefore, it is not divided any further. The method continues until the most important factors are found, in this case the 6th one.

Figure 6. Explanation of the SB with a ten factors group. Source: own elaboration

The effect of a group is estimated with this formula:

βj−j′ = (y(j′) − y(j−1)) / 2 (3)

Moreover, the individual main effect of the factor is estimated with this formula:

βj = (y(j) − y(j−1)) / 2 (4)

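The whole procedure of section 3.3 can be condensed into a short recursive sketch. The toy “simulation” below is our own deterministic assumption (only factor 6 matters), chosen to mirror the ten-factor example of figure 6; it is not one of the project models:

```r
# Deterministic toy model: K = 10 factors, only factor 6 important (beta = 5)
betas_true <- c(0, 0, 0, 0, 0, 5, 0, 0, 0, 0)
y <- function(j) 100 + sum(betas_true[seq_len(j)])  # first j factors at high value

group_effect <- function(lo, hi) (y(hi) - y(lo - 1)) / 2  # formula (3)

sb <- function(lo, hi) {
  if (group_effect(lo, hi) <= 0) return(integer(0))  # whole group unimportant
  if (lo == hi) return(lo)                           # important single factor
  cut <- lo - 1 + 2^floor(log2(hi - lo))             # split rule (section 2.4)
  c(sb(lo, cut), sb(cut + 1, hi))
}

sb(1, 10)  # 6
```

Tracing this call reproduces the splits of figure 6: {1..10} is important, {9,10} is discarded, and the search narrows through {1..8}, {5..8} and {5,6} down to factor 6.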


3.4. Upper limits

This concept is closely related to the SB. When running the method, the current upper limit corresponds to the largest group effect that has not been split yet. This group is then chosen by the method to be divided, studying its factors. After dividing and making a new simulation, the upper limits are updated. Furthermore, the experiment can stop as soon as the upper limits are small enough, although in the implementation described in this project the method continues until all the subgroups have been studied.

3.5. Sequential Bifurcation considering interactions

In order to calculate the main effects when considering the interactions, it is necessary to introduce mirror observations (y−(j)), which denote the output when the first j factors are at their low level and the rest of them are at their high level.

The method works in the same way, although the group effects and the individual main effects are calculated differently: they are estimated by subtracting the mirror observations to cancel out the effect of the interactions.

The effect of a group considering two-factor interactions is estimated with this formula:

βj−j′ = ((y(j′) − y−(j′)) − (y(j−1) − y−(j−1))) / 4 (5)

Moreover, the individual main effect considering two-factor interactions is estimated with this formula:

βj = ((y(j) − y−(j)) − (y(j−1) − y−(j−1))) / 4 (6)

In summary, considering the interactions improves the accuracy. However, the number of simulations is doubled, which increases the time required to complete the method accordingly.
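A quick numerical check (our own toy example, deterministic, with an assumed interaction between factors 1 and 3) shows how subtracting the mirror observations cancels the interaction in formula (6), while formula (4) would be biased:

```r
K <- 4
beta <- c(2, 0, 1, 0)   # assumed main effects
b13  <- 3               # assumed interaction between factors 1 and 3

x <- function(j, mirror = FALSE) {
  s <- rep(-1, K); s[seq_len(j)] <- 1   # first j factors at high value
  if (mirror) -s else s                 # mirror run reverses every level
}
y <- function(j, mirror = FALSE) {
  s <- x(j, mirror)
  10 + sum(beta * s) + b13 * s[1] * s[3]
}

(y(1) - y(0)) / 2                                    # -1: biased by the interaction
b1 <- ((y(1) - y(1, TRUE)) - (y(0) - y(0, TRUE))) / 4
b1                                                   # 2: the true main effect of factor 1
```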

3.5.1. Calculation of the interactions

The original SB method finishes at this point, without calculating the interactions. The other versions of the method, described in section 2.3, do not calculate the effect of the interactions either. The interactions in our study are estimated using design of experiments and adapting the formulas to the current metamodel.

The effect of the interaction is estimated with the following formula:

βj,j′ = ((y++ + y−−) − (y+− + y−+)) / 4 (7)

Where:

y++ Represents the output of the system when xj= 1, xj′= 1.

y−− Represents the output of the system when xj= −1, xj′= −1.

y+− Represents the output of the system when xj= 1, xj′= −1.

y−+ Represents the output of the system when xj= −1, xj′= 1.
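Formula (7) can be checked on an assumed two-factor toy response (all other inputs held fixed); the coefficients below are our own illustration, not values from the project models:

```r
# Assumed toy response: main effects 3 and 1, interaction coefficient 2
f <- function(xj, xjp) 10 + 3*xj + 1*xjp + 2*xj*xjp

# Formula (7): contrast of the four corner observations
b_int <- ((f(1, 1) + f(-1, -1)) - (f(1, -1) + f(-1, 1))) / 4
b_int  # 2: the interaction coefficient is recovered
```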


3.5.2. Limitations when calculating the interactions

According to Bettonvil and Kleijnen (1996), if a factor is not important, it does not have interactions with any other factor. As a consequence, the number of interactions to calculate depends on the number of important factors:

Number of interactions(k) = (k² − k) / 2 (8)

The number of simulations required for each interaction, four, can be reduced to one or two using the data obtained from previous simulations, depending on whether the interactions to calculate are consecutive or non-consecutive. This can be performed as follows:

• Non-consecutive interactions: βj,j′ ∀ j+1 ≠ j′

In this case, two simulations are needed. The value of y(j) can be used as y+−. This implies that y−− can be estimated with the value of y(j*), where j* = j−1. If y(j*) has not been calculated yet, meaning that y(j−1) is unimportant, j* is decreased by 1 until a calculated value is found, e.g. y(j−2), y(j−3), and so on. If there are no preceding values calculated, y−− is equal to y(0). Regarding the last two parameters, y++ is calculated with a simulation with the same inputs as y+− but setting j′ at its high value, whereas y−+ is calculated with a simulation with the same inputs as y−− but setting j′ at its high value.

• Consecutive interactions: βj,j′ ∀ j+1 = j′

In this case, only one simulation is needed. The value of y(j′) can be used as y++. This implies that y+− can be estimated with the value of y(j), already calculated. Moreover, y−− can be estimated with the value of y(j*), where j* = j−1, following the same procedure explained above. Regarding the last parameter, y−+ is calculated with a simulation with the same inputs as y−− but setting j′ at its high value.

It can be seen that the total number of simulations required to calculate all the interactions cannot be completed in an acceptable period of time as the number of important factors increases. For this reason, the following assumption has been made.

Assumption 3: The effect of the interaction of a factor with the rest of the parameters is insignificant if the main effect of that factor is irrelevant.

This assumption is justified by the results obtained when studying the different models of the project. It has been observed that the number of factors with a relevant effect is usually not larger than ten; therefore, the calculation of the interactions is limited to the ten most important factors, meaning that there are at most 45 significant interactions.

3.6. Concluding remarks

An important remark after analyzing and explaining the SB by Bettonvil and Kleijnen (1996) mathematically is the limitation of this method in handling interactions. In this chapter, a calculation of the interactions based on design of experiments has been defined, which represents an important contribution and improvement to the Sequential Bifurcation procedure.


4. AUTOMATION OF THE METHOD

Programming language used, flowchart and structure of the computer program

This chapter describes the automation process of the Sequential Bifurcation method studied in this project. This automation is needed to analyze the results easily, given the amount of simulations required. The flowchart of the program as well as some functions of the full code are explained next.

4.1. Programming language selected

The programming language with which the code has been developed is R. It has been selected especially for the following reasons:

1. R is a free and open source software.

2. R has multiple packages that can be downloaded to add features to the program.

3. R is a flexible language, e.g. it can work with different archive formats easily.

4. R has many graphic capabilities, providing high-quality results in an easy programmable environment.

This program follows the SB procedure, although it does not perform the simulations itself. The code calls the simulator, xsim-runner, which simulates using the input files generated by the SB program.

4.2. Flowchart of the computer program

When programming, it is important to know the different tasks that the program has to perform. The structure presented in this flowchart corresponds to the version of the method that does not consider interactions in its calculations.

In the flowchart (see figure 7), it is possible to observe that the program keeps simulating while "i", the work-already-done counter, and "j", the work-to-do counter, are different.

The next factor combination to be studied depends on the previous results. After each simulation, the subgroups generated are analyzed. If the subgroup studied is determined to be important, it is added to the M matrix, which stores all the groups that have to be studied in future simulations.

Finally, the betas are calculated using the simulation data, and the results are saved in different text files.

Figure 7. Flowchart of the program. Source: own elaboration



4.3. Program structure: main functions

The program has been divided into different functions, following the logical order described in the flowchart above (figure 7). Moreover, the whole code includes additional functions which are not related to the SB flowchart but are necessary for user purposes, e.g. the function in charge of generating the Pareto chart, as can be observed in Appendix B. The main functions of the computer program are described next.

4.3.1. Packages

The code was developed using R and additional software packages. This function installs all the R packages necessary for the program:

1. ggplot2, required for plotting.

2. XLConnect, required to extract the information from the Excel files.

3. XML, required to manage the data from the XML files.

4. sendmailR, required to send e-mails from R.

4.3.2. Screening

This screening function (Screening in the program) is in charge of performing the SB without considering the interactions. One of the most important parts of this function is the estimation of the importance of the sub-groups:

if ((Y[M[i,"High"]] - Y[n] > 0) & (n+1 < M[i,"High"])) {
  # Compare the result with the higher extreme of the interval and
  # check whether the interval is successive or not
  j <- j+1;
  M[j,"Low"] <- n;
  M[j,"High"] <- M[i,"High"];
  M[j,"Betas"] <- (Y[M[i,"High"]] - Y[n])/2;
}
# If the subgroup is important, a new interval ([factor n, previous higher
# interval extreme]) is added; if not, nothing is added to the "work to do" matrix

Another important aspect is the order in which the groups are studied. They could be called based on the order of entrance, processing the oldest first. Nevertheless, the groups are studied according to their estimated importance. Therefore, the M matrix is ordered after each simulation:

M[(i+1):j,] <- M[i+order(M[(i+1):j,"Betas"], decreasing = TRUE),];
# Sort the matrix in order to find the correct upper limit
cat("U(",s,") = ", round(M[i+1,"Betas"], digits = 5), "\n", "\n", sep = "");
# Print the upper limit for the "s" simulation
U[i+1] <- paste("U(",s,") = ", round(M[i+1,"Betas"], digits = 5), sep = "");
# Save the corresponding upper limit

After finishing the simulations, the individual main effects are calculated:

cat("Calculating the Betas", "\n");
B <<- Y;               # Create B with the same dimension as Y
B[] <<- NA;            # Fill the B vector with NA to identify the effects easily
B0 <<- (Y0 + Y[K])/2;  # Not calculated in the loop (vectors start at 1)
B[1] <<- (Y[1] - Y0)/2;
i <- 2;                # Re-use the i counter; now it counts the factors
while (i <= length(B)) {  # Calculate each B
  B[i] <<- (Y[i] - Y[i-1])/2;
  i <- i+1;
}

(28)

Once the main effects are calculated, the negative betas are eliminated from the results and the most important factors are counted:

j <- 1;  # Re-use the j counter; now it counts the factors
while (j <= length(B)) {
  # Post-process the B vector: change negative and zero B values
  # (considered not important) to NA
  if (is.na(B[j]) == FALSE & B[j] <= 0) {
    B[j] <<- NA;
  }
  j <- j+1;
}

i <- 1;  # Re-use the i counter; now it counts the most important factors
j <- 0;  # Re-use the j counter; now it counts the most important factors
while (i <= length(B)) {
  # Post-process the B vector: count the number of important factors (B > 0)
  if (is.na(B[i]) == FALSE) {
    j <- j+1;
    k <<- j;  # Number of most important factors
  }
  i <- i+1;
}

B <<- B;  # Save B as a global variable to analyze the results
cat("Betas calculated", "\n", "\n");
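The two loops above can also be written in vectorized R. The sketch below (our own, with example values for B) is behaviourally equivalent but is not the project code:

```r
B <- c(2.5, -0.3, NA, 0, 1.1)   # example effect estimates
B[!is.na(B) & B <= 0] <- NA     # non-positive effects become NA (unimportant)
k <- sum(!is.na(B))             # count of important factors
k  # 2
```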

4.3.3. Screening when considering interactions

This screening function (ScreeningI in the program) performs the SB method considering the interactions. Two simulations are needed per point so that the interactions cancel out when calculating the main effects:

Y[K,"Normal"] <- Simulate(K, p, m = 0);  # All the factors at the high value
Y[K,"Mirror"] <- Simulate(K, p, m = 1);  # All the factors at the low value

The next steps are similar to the Screening function except for the last part. After counting the most important factors, the interactions are estimated in an iterative process based on the formula and explanation given in section 3.5.2 (see also Appendix A: List of formulas).

for (i in 1:(K-1)) {        # Counter for the interactions (j)
  for (j in (i+1):K) {      # Counter for the interactions (j')
    if (is.na(B[i])==FALSE & is.na(B[j])==FALSE & B[i]>=Limit & B[j]>=Limit) {
      if (i+1 == j) {       # Consecutive interactions
        Y11 <- Y[j];
        Y10 <- Y[i];
        Y01 <- Simulate(i-1, p, c = j);
        # For Y00 we need to find the last value calculated
        k <- i;             # Variable for each Y[j]. Re-used
        a <- 3;             # Variable for the loop
        while (a != 15 & k != 1) {
          if (is.na(Y[k-1]) == FALSE) {
            Y00 <- Y[k-1];
            a <- 15;
          }
          k <- k-1;
        }
        if (k == 1) {       # Condition for the Y0 value
          Y00 <- Y0[,"Normal"];
        }
      }

As mentioned previously, the number of simulations can be reduced to one if the factors are consecutive, since Y11, Y10 and, usually, Y00 are re-used from previous runs.
