
Master Thesis, 30 credits

Master of Science in Industrial Engineering and Management, 300 credits

WHERE IS MY INHALER?

A simulation and optimization study of the Quality Control on

Symbicort Turbuhaler at AstraZeneca

Shirin Haddad & Marie Nilsson


WHERE IS MY INHALER?: A SIMULATION AND OPTIMIZATION STUDY OF THE QUALITY CONTROL ON SYMBICORT TURBUHALER AT ASTRAZENECA

Department of Mathematics and Mathematical Statistics
Umeå University
901 87 Umeå, Sweden

Supervisors: Mats Johansson, Umeå University; Magnus Liljenberg, AstraZeneca AB

Examiner: Jonas Westin, Umeå University

Abstract

Symbicort Turbuhaler is an inhaler manufactured by AstraZeneca for the treatment of asthma and symptoms of chronic obstructive pulmonary disease.

The delivery reliability of the product depends on the performance of the whole supply chain, and as part of that chain the results from the Quality Control (QC) department are mandatory for releasing the produced batches to the market. The performance of QC is thus an important part of the supply chain. In order to reduce the risk of supply problems and market shortage, it is important to investigate whether the performance of QC can be improved.

The purpose of the thesis is to provide AstraZeneca with scientifically based data to identify sensitive parameters and readjust work procedures in order to improve the performance of QC. The goal of this thesis is to map out the flow of the QC Symbicort Turbuhaler operation and construct a model of it. The model is intended to be used to simulate and optimize different parameters, such as the inflow of batch samples, the utilization of the instrumentation and staff workload.

QC is modelled in simulation software. The model is used to simulate and optimize different scenarios using discrete event simulation and an optimization technique based on evolution strategies.

By reducing the number of analytical robots from 14 to 10, it is possible to maintain the existing average lead time. Through this reduction, the utilization of the robots increases while the workload decreases for some of the staff. However, it is not possible to extend the durability of the system suitability test (SST) and still achieve the existing average lead time.

From the investigation of different parameters, it is found that an added laboratory engineer at the high-performance liquid chromatography (HPLC) station has the best outcome on lead time and overall equipment effectiveness, whereas removing a laboratory engineer from the Minispice robots has the worst outcome. With the resources available today, the lead times cannot be maintained in the long run if the inflow is 35 batch samples a week or more. By adding a laboratory engineer at the HPLC station and using an SST with a durability of 48 hours, the best outcome is obtained in terms of average lead time and the number of batch samples with a lead time of less than 10 days.

Keywords: Discrete event simulation, Evolution strategies, Distribution fitting

Sammanfattning

Symbicort Turbuhaler is an inhaler manufactured by AstraZeneca for the treatment of asthma and the symptoms of chronic obstructive pulmonary disease.

The delivery reliability of the product depends on the performance of the whole supply chain, and as part of that chain the results from Quality Control (QC) are mandatory for releasing a batch of the product to the market. The performance of QC is therefore an important part of the supply chain. To reduce the risk of supply problems and market shortage, it is important to investigate whether the performance of QC can be improved.

The purpose of the work is to provide AstraZeneca with scientifically based data to identify sensitive parameters and adjust work procedures in order to improve the performance of QC. The goal of the work is to map out the flow of QC Symbicort Turbuhaler and construct a model from that flow. The model is intended for simulating and optimizing different parameters, such as the inflow of batch samples, the utilization of instrumentation and the workload of staff.

QC is modelled in simulation software. The model is used to simulate and optimize different scenarios through discrete event simulation and an optimization technique based on evolution strategies.

By reducing the number of analytical robots from 14 to 10, it is possible to maintain the existing average lead time. Through this reduction, the utilization of the robots increases while the workload decreases for some of the staff. It is not possible to extend the durabilities of the robots' system suitability test (SST) and still achieve the existing average lead time. The investigation of different parameters indicates that an additional laboratory engineer at the high-performance liquid chromatography (HPLC) station has the best effect on lead time and production effectiveness, whereas removing a laboratory engineer from the Minispice robots has the worst effect. With the resources available today, the lead times cannot be maintained in the long run if the inflow is 35 batch samples per week or more. By adding a laboratory engineer at the HPLC station and using an SST with a durability of 48 hours, the best result is obtained in terms of average lead time and the number of batch samples with an individual lead time of less than 10 days.

Swedish title: Var är min inhalator?: En simulerings- och optimeringsstudie på kvalitetskontrollen av Symbicort Turbuhaler vid AstraZeneca

Acknowledgments

This journey would not have been possible without the support of many. First, we would like to express our deep gratitude to Caroline Eriksson and our supervisor Magnus Liljenberg for your continuous support and encouragement throughout this thesis. It has meant a lot during the setbacks that arose along the way. We would also like to thank our supervisor at Umeå University, Mats Johansson, for valuable advice and guidance throughout the thesis.

We would like to express our gratitude to all employees at the Quality Control laboratory in Snäckviken, Södertälje. Especially, the laboratory engineers Thomas Gustavsson, Mattias Hagerlund, Zeljana Magic, Ann-Christine Ståhl and Sofia Thuresson for helping us understand the flow and answering countless questions.

Big thanks to Peter Aurosell, Thomas Bertilsson and Hassanein Sater for all assistance with the modeling and your answers related to the simulation software. Thanks to Terje Albrigtsen and Raied Kassab for providing us with important data. Finally, we would like to thank Gunnar Nordahl and Magnus Welander who, without hesitation, have helped us with all statistics-related issues.

Södertälje, Sweden May 29, 2019

Shirin Haddad & Marie Nilsson

Contents

1 Introduction
  1.1 AstraZeneca
  1.2 Symbicort Turbuhaler
  1.3 Background
  1.4 Purpose and goal
  1.5 Problem formulations
  1.6 Delimitations
  1.7 Outline
2 Current situation
  2.1 Sample receiving
  2.2 Analyses
    2.2.1 Delivered dose analysis
    2.2.2 Multistage liquid impinger analysis
    2.2.3 Infrared spectroscopy analysis
  2.3 Review
3 Theory
  3.1 Modeling and simulation methodology
  3.2 Probability distributions
    3.2.1 Distribution fitting
    3.2.2 The Weibull distribution
    3.2.3 Interval censoring
  3.3 Evolution strategies
    3.3.1 Algorithm
    3.3.2 Selection
    3.3.3 Mutation
    3.3.4 Recombination
4 Method
  4.1 Interviews
  4.2 Tours and observations
  4.3 Software
    4.3.1 ExtendSim®
    4.3.2 Microsoft Excel
    4.3.3 Minitab® Statistical Software
  4.4 Modeling
    4.4.1 Problem formulation and setting of objectives
    4.4.2 Data collection
    4.4.3 Finding distributions
    4.4.4 Conceptual model
    4.4.5 Model translation
    4.4.6 Verification and validation of the model
  4.5 Simulation
    4.5.1 Formulation of new analysis setups
    4.5.2 Optimization
    4.5.3 Identifying sensitive parameters
    4.5.4 Modified inflow
5 Results
  5.2 Optimal solution for an SST durable for 48 hours
  5.3 Optimal solutions for extended durabilities of the SST
  5.4 Analysis of sensitive parameters
  5.5 Modified inflow
    5.5.1 Inflow of 35 batch samples every week
    5.5.2 Inflow of 40 batch samples every week
    5.5.3 Inflow of 45 batch samples every week
6 Discussion
  6.1 Choice of method
  6.2 Analysis of results
  6.3 Recommendations
7 Conclusions
8 Further work
9 References
Appendices
  A Analyses
    A.1 Delivered dose analysis
    A.2 Multistage liquid impinger analysis
    A.3 High-performance liquid chromatography analysis
    A.4 Ultraviolet-visible spectroscopy analysis
    A.5 Infrared spectroscopy analysis
  B Description of model
    B.1 Snäckviken
    B.2 Gärtuna
  C Analysis setup schedules
  D OEE formula
  E Lead times and OEE for sensitive parameters

Abbreviations

AD Anderson-Darling
ADMS Automate dose measurement station
COPD Chronic obstructive pulmonary disease
DD Delivered dose
EA Evolutionary algorithm
ES Evolution strategies
HPLC High-performance liquid chromatography
IR Infrared
LRT Likelihood-ratio test
MLI Multistage liquid impinger
OEE Overall equipment effectiveness
OOS Out of specification
OOT Out of trend
PDF Probability density function
PMF Probability mass function
QA Quality Assurance
QC Quality Control
SOP Standard operating procedure
SST System suitability test
TTF Time to failure
TTR Time to repair


1 Introduction

This section gives an introduction to the thesis, including the background of the problem. The introduction also contains information about the purpose and goal, problem formulations and delimitations of the thesis.

1.1 AstraZeneca

AstraZeneca is a global pharmaceutical company operating in over 100 countries. The company’s focus areas are Cardiovascular and Metabolic Diseases; Oncology; Respiratory, Inflammation and Autoimmunity; Infection and Neuroscience. Their products are distributed to hospitals and pharmacies around the world (AstraZeneca, 2018).

The work is conducted at the department of Quality Control (QC) at AstraZeneca Sweden Operations in Snäckviken, Södertälje. QC is responsible for analyzing and ensuring that products achieve predetermined quality requirements.

1.2 Symbicort Turbuhaler

AstraZeneca manufactures Symbicort Turbuhaler in Snäckviken, Södertälje. It is a pharmaceutical device for treating asthma and symptoms of chronic obstructive pulmonary disease (COPD).

Symbicort Turbuhaler is an inhaler (see Figure 1) containing a powder which is inhaled. The powder consists of two active pharmaceutical substances: Formoterol and Budesonide (FASS, 2018).

Figure 1: Picture of the pharmaceutical device Symbicort Turbuhaler.

Formoterol is a long-acting beta-2 agonist, which belongs to a group of pharmaceuticals that widen the bronchi. The substance stimulates specific receptors, so-called beta-2 receptors, in the muscle cells of the bronchi. This enables the muscles to relax and the bronchi to expand (1177, 2018). The other substance, Budesonide, is of the corticosteroid type and belongs to the pharmaceutical group of glucocorticoids. Budesonide counteracts the inflammation of the airways and heals irritated tissue. By reducing the inflammation, asthma attacks can be prevented and COPD alleviated. Cortisone does not cure asthma or COPD, but it suppresses the inflammation and soothes the symptoms considerably (1177, 2018).

The pharmaceutical device is prescribed and available in three different amounts of Budesonide and Formoterol:

• Symbicort Mite Turbuhaler, containing 80 µg Budesonide and 4.5 µg Formoterol.

• Symbicort Turbuhaler, containing 160 µg Budesonide and 4.5 µg Formoterol.

• Symbicort Forte Turbuhaler, containing 320 µg Budesonide and 9 µg Formoterol.

Symbicort Turbuhaler is sold in multiple markets and each market has different demands. The medicine is therefore produced with different numbers of doses: 30, 60 and 120 (FASS, 2018).

1.3 Background

The delivery reliability of the product is dependent on the performance of the whole supply chain and, as part of the chain, the results from the QC department are mandatory in order to release the product to the market. The performance of QC is thus an important part of the supply chain. If the performance of QC can be improved, the risk of supply problems and market shortage is decreased. There is also a potential to reduce buffer stocks at different points along the supply chain.

The process at QC begins with a delivery of a varying number of batch samples from production. The batch samples are received by a sample receiver who later distributes them to the laboratory engineers. The product undergoes three different analyses: delivered dose (DD), multistage liquid impinger (MLI), in which the high-performance liquid chromatography (HPLC) analysis is included, and infrared (IR) spectroscopy. The DD analysis ensures that the inhaler has the correct torque and that it contains the correct amount of substances per dose. The MLI analysis ensures that the delivered dose has the correct particle size distribution, and the IR spectroscopy analysis ensures that the active substances have the correct identity. The analyzed batch samples go through reviews and, when they are approved, the results are submitted to Quality Assurance (QA), which later releases the batch to the market. Based on historical data (2017-11-03 to 2019-02-04), the average lead time is 9.14 days.

The robots used in the DD analysis often stop during an analysis. There are two outcomes of this scenario. In the best case, the robot can be rebooted, which means that the ongoing analysis can continue. In the worst case, the ongoing analysis is rejected due to the established directives of the laboratory. This is time consuming since the robots are expected to conduct the analysis unattended for up to 48 hours. If an unplanned robot stop occurs during the night (that is, past working hours), the analysis cannot continue until a laboratory engineer detects the stop the next day. Before an analysis starts, a system suitability test (SST) is carried out on the analysis robot to ensure that the measured values are correct throughout the analysis. The SST is durable for 48 hours, which means that the SST has to be renewed before each analysis. In case of an unplanned robot stop, the SST might become invalid depending on the length of the stop. The SST must then be renewed to analyze the remaining inhalers.

These delays in QC can halt an entire shipment and delay the availability of medicines on the market.

There are currently a total of 14 analysis robots that carry out the DD and MLI analyses. Six of them are Automate dose measurement station (ADMS) robots and eight of them are Minispice robots. Today, the unplanned robot stops are due to both operational and technical reasons.

Depending on the reason, the remaining analysis may need to be carried out on another robot, which affects the predetermined lead times negatively. Despite this, the laboratory engineers believe that the number of robots can be reduced. Today, there is no scientific foundation for this belief.

The laboratory engineers believe that there are other sensitive parameters in the flow that affect the lead time negatively, apart from the unplanned robot stops. However, it has not been possible to identify where these sensitive parameters exist and what contributes to them.


1.4 Purpose and goal

The purpose of the thesis is to provide AstraZeneca with scientifically based data to be used in their efforts to identify sensitive parameters and readjust work procedures in order to improve the performance of QC. The thesis will also present recommendations to AstraZeneca on suitable actions and suggestions for further work.

The goal of this thesis is to map out the flow of the QC Symbicort Turbuhaler operation and construct a model of it. The intended use of the model is to simulate and optimize different parameters, such as the inflow of batch samples, the utilization of the instrumentation and the workload of staff. The model will later be administered by QC at AstraZeneca and used as a tool to put different decisions on a quantitative footing.

1.5 Problem formulations

Investigating how a department can improve its performance is complex, as there are many areas to explore. Therefore, the scope has been limited to the following questions:

Q1. If the number of analytical robots is reduced, financial savings can be made. Is it possible to achieve the existing average lead time of 9.14 days or shorter if the number of robots is reduced through optimization and prevailing conditions persist, that is, with unplanned robot stops remaining and the number of employees kept constant? How does this affect the lead time, the utilization of instrumentation and the workload for staffing?

Q2. If the durability of the SST is extended from 48 hours to 72 or 96 hours, new opportunities to combine analyses become available. Through optimization, is it still possible to reduce the number of robots and achieve the existing average lead time of 9.14 days or shorter when the durabilities are extended? How do extended durabilities affect the lead time, the utilization of instrumentation and the workload for staffing?

Q3. Which of the investigated parameters have the biggest impact on the lead time and overall equipment effectiveness (OEE) in the Quality Control of Symbicort Turbuhaler? Which parameters are sensitive in the flow apart from the unplanned robot stops?

Q4. Since there is a large variation in the number of batch samples arriving at QC from production, it is of interest to investigate how a more constant inflow affects the average lead time. How is QC affected by different constant inflows of batch samples? How does this differ for different durabilities of the SST?

1.6 Delimitations

• The IR spectroscopy analysis will be included in the model, but no further consideration will be given to that part of the flow. The flow of the IR spectroscopy analysis does not have any major disruptions today, which is the reasoning behind this delimitation. However, for the product to be approved by QA, all three analyses need to be completed; therefore, it is relevant to include this part of the flow. The parameters included in the flow of the IR spectroscopy analysis are working time, analysis time, resources and analysis setup. No optimization is executed with respect to these parameters.


• The batch samples are delivered three times a day to Snäckviken and Gärtuna. This is not taken into consideration when constructing the simulation model. Instead, the batch samples are delivered once a day in the model. This is because batch samples that arrive during one day are first analyzed the following day. This simplification is therefore not believed to have any major impact on the results.

• For each analysis, different reference standards and solutions are used. These solutions are prepared by a laboratory engineer who ensures that there are always solutions available. This is a part of QC that works properly today (that is, with no delays or risk of solutions running out) and has no major influence on the flow. This preparation step is regarded as a side activity and is therefore not included in the model.

• The inflow of batch samples with inhalers containing 30 doses is a small part of the total batch samples that arrive at QC (approximately 2 %). Since this type of inhaler constitutes a minor part of the inflow, it is not considered to have a major impact on the analysis setups and consequently the model's results. Therefore, these are not included in the analysis setups that are made in the analysis of DD.

• The time period for data collection, 2017-11-03 to 2019-02-04, is arbitrarily determined. Initially, this was the only data available, which is why this time period was chosen. Halfway through the project, there was a possibility to expand the number of data points, but by this time the period of data collection was finished.

There is a trade-off in the amount of data used in a simulation. A time period that is too long can give a misleading picture of the current inflow: the inflow at the beginning of a long interval may differ considerably from the current inflow, which can lead to undifferentiated results. On the other hand, too short a time period can affect the result by reflecting an unrealistic inflow, for example campaign months or holiday seasons. However, the number of data points included in this period is seen as sufficient for this type of simulation.

• Mistakes during an analysis, as well as analytical results that fail to meet specifications, must be treated as formal deviations and be investigated. In certain time periods these investigations can accumulate and consume more resources, which impacts the lead times. This is, however, not included in the model.

• In the flow, rescheduling and reprioritizations of batches occur. However, these are not included in the simulation of scenarios since they are hard to predict. This delimitation can affect the result by possibly presenting shorter lead times than in reality.

• When generating the probability distributions for time to failure (TTF) and time to repair (TTR) of the analysis robots, the same distributions are used for all analysis robots. This is because of the difficulties involved in collecting sufficient robot data. In reality, the robots that are available are widely used and therefore have different frequencies of unplanned robot stops and repair times. Since the probability distributions used are the same for all robots in the flow, it can affect the result by not reflecting each individual robot’s unplanned stops and repair times. Instead, a generalized picture is given.


1.7 Outline

The thesis begins with Section 1, Introduction, which contains information about the company, the product Symbicort Turbuhaler and the background to the problems being investigated.

In this section the purpose and goal of the work are presented, as well as the work's problem formulations and delimitations. Section 2, Current situation, describes the studied flow and its components. The general theory that underlies the methods used during the work is described in Section 3, Theory. The theory includes modeling and simulation methodology, probability distributions and the optimization technique evolution strategies.

Section 4, Method, describes the methods used to answer the stated problem formulations. It contains a description of the data collection, the software used, the modeling process, the model and the formulation of different events that are intended to be simulated. The result of the validation of the model and the results generated through different simulations are presented in Section 5, Results.

Section 6, Discussion, contains a discussion of the choice of method, the validation of the model and the results for the problem formulations. Recommendations to the company are also found in this section. Section 7, Conclusions, summarizes the main findings of the report and, finally, suggestions for possible future work are presented in Section 8, Further work.

Readers who are interested in the model and its creation should read Section 4, Method. The reader who is interested in the main findings is recommended to read Section 5, Results, and Section 7, Conclusions. Readers with good knowledge of the Symbicort Turbuhaler QC can spend less time on Section 2, Current situation.


2 Current situation

This section describes the current situation of the flow at QC, where the sample receiving, analyses and review are described further. Figure 2 demonstrates the work flow of Symbicort Turbuhaler within QC. The sites, Snäckviken and Gärtuna, are distinguished and their specific operations are illustrated in the figure. The boxes Batch sample leaves production and Release of batch sample mark where a batch sample enters and leaves QC. The blue box in Figure 2 illustrates the flow in Snäckviken and the red box illustrates the flow in Gärtuna. The purple boxes represent the laboratories at each site, where the analyses of the samples are conducted.

Figure 2: Illustration of the analysis flow of Symbicort Turbuhaler.

2.1 Sample receiving

Each day, the flow begins with assigning a person to receive and manage samples from production. The person who has this task also works as a reviewer. Since the DD and MLI analyses are stationed in Snäckviken and the IR spectroscopy analysis is stationed in Gärtuna, the batch samples are delivered and received at both sites. As mentioned in Section 1.6, the batch samples are delivered several times a day to both Snäckviken and Gärtuna (A-C. Ståhl, Personal communication, 28 January, 2019).

In Snäckviken, the sample receiver retrieves the samples two times a day. In Gärtuna, the sample receiver retrieves the samples sporadically. This is due to the location of the goods receiving of samples at each site: in Snäckviken it is located on another floor, whereas in Gärtuna it is located at QC (A-C. Ståhl, Personal communication, 28 January, 2019).

When the samples are retrieved, the sample receiver registers the samples, prints protocols and places them in folders. Afterwards, these folders are placed in a file cabinet where laboratory engineers later collect them for analysis (A-C. Ståhl, Personal communication, 28 January, 2019).

Today, QC has access to an inflow schedule, but the usability of this schedule is limited. This is due to frequent changes regarding rescheduling and reprioritization of batches.


2.2 Analyses

A batch sample of Symbicort Turbuhaler needs to go through three types of analyses before it passes through a final review: the DD analysis, the MLI analysis (in which the HPLC analysis is included) and the IR spectroscopy analysis. The laboratory engineers are assigned an analysis schedule for each day in advance. At the end of each analysis, an evaluation of the results is conducted.

2.2.1 Delivered dose analysis

The DD analysis is conducted to ensure that the correct amount of substance is obtained from each dose. The analysis confirms if the amount of substances is within predetermined intervals.

The analysis is also conducted to ensure the uniformity of the doses delivered (M. Hagerlund, Personal communication, 4 February, 2019).

The laboratory engineer starts by bringing a folder with relevant protocols to conduct the analysis. For this type of analysis, both the ADMS and the Minispice robots can be used.

Before the analysis starts, an SST is carried out. The robot is function-tested to ensure that the measured values are correct throughout the analysis. During the SST, two reference standards with known concentrations of Budesonide and Formoterol are used (M. Hagerlund, Personal communication, 4 February, 2019).

Analyzing a batch sample containing 30 doses takes 7 hours, one containing 60 doses takes 9 hours and one containing 120 doses takes 14 hours to complete. Samples can be put together in different analysis setups as long as the total analysis time does not exceed 48 hours. Since the robot does not need supervision, the laboratory engineer aims to create a setup of samples that can be analyzed during non-working hours (that is, evenings and weekends). A start smart schedule with different analysis setups is used as a guideline to combine different samples. These analysis setups are said to be optimal with respect to the laboratory engineers' working hours but also the SST, which expires after 48 hours. By following the schedule, the end time of the analysis is synchronized with the laboratory engineer's following work day, so that the evaluation of the results can begin in the morning. The steps the robot performs during the analysis mimic a person's handling when a dose is inhaled. Therefore, the robot also measures the torque of each dose to control whether the dose was properly withdrawn or not (M. Hagerlund, Personal communication, 4 February, 2019).
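To make the packing constraint behind these setups concrete, the following is a minimal sketch (not from the thesis; the analysis times and the 48-hour SST limit are taken from the text above, everything else is illustrative) that enumerates dose-count combinations fitting within one SST window:

    from itertools import combinations_with_replacement

    # Analysis times per batch sample, from the text: 30 doses -> 7 h,
    # 60 doses -> 9 h, 120 doses -> 14 h. A setup must finish before
    # the SST expires after 48 h.
    ANALYSIS_HOURS = {30: 7, 60: 9, 120: 14}
    SST_LIMIT_H = 48

    def feasible_setups(max_samples=5):
        """Enumerate dose-count combinations whose total time fits one SST."""
        setups = []
        for n in range(1, max_samples + 1):
            for combo in combinations_with_replacement(sorted(ANALYSIS_HOURS), n):
                total = sum(ANALYSIS_HOURS[d] for d in combo)
                if total <= SST_LIMIT_H:
                    setups.append((combo, total))
        # Setups that use most of the 48 h window are attractive, e.g. for
        # unattended analysis over evenings and weekends.
        return sorted(setups, key=lambda s: -s[1])

    for combo, total in feasible_setups()[:5]:
        print(f"doses per sample: {combo}, total analysis time: {total} h")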

Today, the laboratory engineers' work is interrupted by the unplanned stops caused by the robots. The unplanned stops occur randomly, which makes it difficult to predict when a stop will take place. For technical information about the analysis, see Appendix A.1.

2.2.2 Multistage liquid impinger analysis

The purpose of the MLI analysis is to determine the particle size distribution of the delivered dose, as this determines how far the dose penetrates into different sections of the lung. For this type of analysis, the ADMS robot is used. A laboratory engineer brings the relevant folder with protocols containing information about the batch sample to be analyzed (Z. Magic, Personal communication, 29 January, 2019).

The next step is to prepare the impinger for analysis. An impinger is supposed to imitate a human lung, see Appendix A.2 for more information. Three impingers are needed for each batch sample. A filter is placed at the bottom of the impinger and an inflatable inlet is placed at the top. The ADMS robot is cleaned and tested before each analysis. The laboratory engineer inserts sample-specific information into pre-coded sequences in a software that communicates with the robot. The sequences are pre-coded to reduce operational risk (Z. Magic, Personal communication, 29 January, 2019).

When the sequences are set, the inhalers are placed in the analysis robot. The robot analysis of one inhaler takes ten minutes to perform. The robot is supervised during the whole analysis since the impinger needs to be changed for each inhaler. When the analysis is finished, the impinger is moved to a shake table for 20 minutes. A laboratory engineer later collects and places the substances from the impinger in test tubes. The test tubes are then refrigerated overnight (Z. Magic, Personal communication, 29 January, 2019).

Lastly, the liquid from the impinger and filter is analyzed with high-performance liquid chromatography (HPLC) to confirm the concentration of fine particles (Z. Magic, Personal communication, 29 January, 2019). For technical information about the analysis, see Appendix A.2.

2.2.3 Infrared spectroscopy analysis

Infrared (IR) spectroscopy analysis identifies molecules and determines their molecular structure. For Symbicort Turbuhaler, this means that the two active pharmaceutical substances are identified to ensure the contents of the inhaler.

Normally, four to ten batch samples are analyzed during one analysis. To perform the IR spectroscopy analysis, preparations are made one day before: the inhaler is opened, and its contents are emptied and weighed into test tubes. The next day, when the analysis is carried out, the two substances are separated. The substances are placed in a vacuum drying cabinet until they become solid. Then, they are analyzed separately by a laboratory engineer who grinds each substance with potassium bromide (which is not visible in the IR spectrometer) and presses it into tablets. The freshly pressed tablet is placed in the infrared spectrometer, where light is shone through the sample. A sample spectrum is obtained which is compared with a reference spectrum to ensure that the sample contains the correct substance. This step is done twice, once to detect Formoterol and once to detect Budesonide (T. Gustafsson, Personal communication, 31 January, 2019).

Today, the laboratory engineers do not experience any problems with this analysis method. For technical information about the analysis, see Appendix A.5.

2.3 Review

The review is managed by two reviewers in Snäckviken. Their function is to examine the evaluation of the analyses. When the evaluation of an analysis is submitted by a laboratory engineer, the first reviewer uses E-plan to choose which analysis should be examined depending on priority and lead time. E-plan is an electronic planning tool used by the laboratory and industrial engineers. Usually the review of the evaluation is done the following day. The first reviewer controls that the laboratory engineer's operations are done correctly and that the results are within predefined limits. Afterwards, the second reviewer verifies the review by tracking which documents have been used by the first reviewer. The second reviewer also controls whether the samples are out of trend (OOT) or out of specification (OOS). OOT refers to predefined limits used internally, while OOS refers to predefined limits for different markets. The OOT has a stricter interval than the OOS and appears more frequently. Even though an OOT result is within specification for other markets, an internal investigation is conducted to identify the OOT's origin (C. Eriksson, Personal communication, 11 February, 2019). For the IR analysis in Gärtuna, the review is carried out by only one reviewer.

To enable a complete review of the MLI analysis, the review of the DD analysis first needs to be completed. When the review is completed and approved by the second reviewer, the batch is formally released in the Laboratory Information Management System, and also marked as approved and completed in E-plan. QA later releases the batch to the market when the reviews from both Snäckviken and Gärtuna are completed (C. Eriksson, Personal communication, 11 February, 2019).


3 Theory

This section contains the theory relevant for this thesis. A description of modeling and simulation methodology, probability distributions and evolution strategies is presented. This section also includes the theory used to find suitable distributions for the model used in this thesis.

3.1 Modeling and simulation methodology

A simulation is an approximate imitation of the operations of a process or a system, but first it requires a developed model. The model represents the system itself, whereas the simulation represents its operation over time. This type of model is useful when the effects of changes in an existing system are to be investigated and predicted. It can also be used to predict the performance of a new system with new conditions (Banks et al. 2010, 21). Simulation can be used advantageously when dealing with complex problems that cannot be solved deterministically. This category includes many real problems that are most conveniently solved with simulation (Banks et al. 2010, 22).

When creating a model, a problem needs to be formulated and its objective needs to be defined. Simultaneously, data needs to be collected and a conceptual model completed. This makes it possible to translate the model and to verify and validate it. Banks et al. have defined and broken down these steps, see Figure 3. The various items are also described in detail below.

Figure 3: Flow chart over modeling and the simulation process according to Banks et al.


Problem formulation: Initially, a number of different steps are carried out to build a model. To achieve this, a simulation study starts with a clear problem formulation, where everyone involved understands the problem to be solved. The problem formulation may need to be changed for various reasons during the work (Banks et al. 2010, 34).

Setting of objectives and overall project plan: After the problem formulation is determined, the objectives are set, which refer to the questions that the simulation should answer. At this stage, decisions are also made regarding whether a simulation is the best-suited method to solve the problem. When it is determined that a simulation study will be carried out, a project plan is made which describes how the project is planned to be implemented (Banks et al. 2010, 36).

Data collection: The data collection is an important part of a simulation study and can occupy a large part of it. Therefore, a proper data collection methodology is important in order to not waste time on collecting data which may not be necessary for the simulation. The identification and collection of data is also closely connected to the conceptual model (Hill and Onggo 2014, 195).

Model conceptualization: In connection with the data collection, a model conceptualization is made (Banks et al. 2010, 36). A conceptual model is a representation of a system that uses concepts and ideas to facilitate the understanding of what a model should represent. The conceptual model is often created by hand or with digital visualization tools. The creation of a conceptual model is an iterative process (Liu, 2011).

Model translation: To be able to perform advanced calculations and manage the results that the simulation provides, the model needs to be translated into suitable computer software (Banks et al. 2010, 36).

Verification: This step is performed to check that the model is correctly implemented in the program. A higher degree of complexity results in a more extensive verification process. The most useful tools for this step are debugging tools and common sense (Banks et al. 2010, 37).

Validation: Validating the model involves comparing it with the real system to ensure that the model represents the real system at an acceptable level. It is an iterative process where the model is calibrated to achieve desired results. Including the end user in this process also increases the reliability of the model's results (Banks et al. 2010, 37).

Experimental design: This step determines which scenarios should be simulated. For each simulation project, the length of the start period, the number of simulation runs and the number of replications must be determined (Banks et al. 2010, 37).

Production runs and analysis: Simulations are carried out and the results are analyzed and evaluated (Banks et al. 2010, 37).

More runs: Depending on the outcome of the simulations, more runs may be required with a new design (Banks et al. 2010, 37).

Documentation and reporting: The project results in a final report which is submitted to the clients. In the report, information about the final model, the results and the final recommendations is found (Banks et al. 2010, 38).

Implementation: The final step is to implement the results from the simulation runs. How well this succeeds mostly depends on how well the previous steps have been performed. The implementation also depends on how well the final user understands the model and its execution. If the final user has been continuously involved in the construction, the chance of a successful implementation is greater (Banks et al. 2010, 38).

3.2 Probability distributions

Due to variations, a model based on the real world is probabilistic rather than deterministic. When developing a simulation model, a distribution can be sampled and its fit can be determined. A distribution can be discrete or continuous.

A discrete distribution can be described by letting X be a random variable. X is called a discrete random variable if the set of possible values $x_1, x_2, \dots, x_n$ is finite. The probability of X is determined by the following probability mass function (PMF):

$$p_X(x) = P(X = x), \quad \forall x \in \mathbb{R}$$

where $p_X : \mathbb{R} \to [0, 1]$ such that $P(X = x)$ is the probability that the realization of the random variable equals x (Banks 2001, 190).

A continuous distribution can be described by letting X be a random variable. X is defined as a continuous random variable if the range space $R_X$ of the random variable is an interval (Banks 2001, 191). The probability of X falling in the interval [a, b] is determined by the following function:

$$P(a \leq X \leq b) = \int_a^b f(x)\,dx$$

The function f(x) represents the probability density function (PDF) of X, where the following conditions are satisfied (Banks 2001, 192):

1. $f(x) \geq 0$ for all $x \in R_X$

2. $\int_{R_X} f(x)\,dx = 1$

3. $f(x) = 0$ if $x \notin R_X$
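As a quick worked example (not from the thesis), the exponential distribution with rate $\lambda > 0$ and $R_X = [0, \infty)$ satisfies all three conditions: $f(x) = \lambda e^{-\lambda x} \geq 0$ on $R_X$, $\int_0^{\infty} \lambda e^{-\lambda x}\,dx = \left[-e^{-\lambda x}\right]_0^{\infty} = 1$, and $f(x) = 0$ for $x < 0$. Interval probabilities then follow directly from the definition above: $P(a \leq X \leq b) = e^{-\lambda a} - e^{-\lambda b}$.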

3.2.1 Distribution fitting

Distribution fitting is the procedure of finding a distribution that best fits the collected data. One way to implement it is to use a Goodness-of-Fit test, which determines how well the observed data match the expected data. Goodness-of-Fit tests often used in statistics are:

• Anderson-Darling

• Kolmogorov-Smirnov

• Chi-square

• Shapiro-Wilk

The Anderson-Darling (AD) Goodness-of-Fit test measures how well the data fit a specified distribution. Typically, a smaller AD value signifies that the data fit the distribution better. However, only comparing AD values is not recommended, since the AD statistics are distributed individually for different distributions. Instead, it is preferable to use measured p-values and probability plots (Minitab, LLC 2019b).

The p-value is a probability that measures the evidence against the null hypothesis that the data follow the distribution. A low p-value indicates that the null hypothesis can be rejected and that the data do not fit the distribution; instead, a p-value as high as possible is preferred. How low the p-value must be in order for the alternative hypothesis to be considered most probable is determined on the basis of the significance level (α) (Minitab, LLC 2019b). For a 95 percent confidence level, α = 0.05, and the p-value is interpreted in the following way:

• p ≤ α: indicates strong evidence against the null hypothesis; the null hypothesis is rejected.

• p > α: indicates weak evidence against the null hypothesis; the null hypothesis cannot be rejected.

• p = α: is considered insignificant.

When finding a distribution, probability plots with hypothesis tests for a particular distribution are often included in the analysis. The collected data is displayed against a theoretical distribution, which has a good fit if the data points follow along the distribution line. If the points deviate from the line, it indicates that the data does not fit the given distribution, which thus cannot be accepted (Minitab, LLC 2019b).

For distributions that have additional parameters (often a third parameter), the likelihood-ratio test (LRT) p-value can be utilized to decide whether the additional parameter increases the fit of the distribution. If the LRT p-value is less than the significance level (LRT p-value < α), the 3-parameter distribution is considered a better fit than the corresponding 2-parameter distribution. If both the 2-parameter and the 3-parameter distributions give a good fit, it may be advantageous to choose the simplest distribution, which is the 2-parameter one (Minitab, LLC 2019b).
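The thesis performs this fitting in Minitab (Section 4.3.3); purely as an illustration of the workflow, a sketch using Python and SciPy is shown below. The synthetic data and the candidate set are assumptions, and a Kolmogorov-Smirnov test stands in for Minitab's Anderson-Darling output:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    # Synthetic "time to failure" data in hours; real data would come
    # from the SAP/Empower extracts described in Section 4.4.2.
    data = rng.weibull(1.5, size=200) * 30.0

    candidates = {"weibull_min": stats.weibull_min,
                  "lognorm": stats.lognorm,
                  "expon": stats.expon}

    for name, dist in candidates.items():
        params = dist.fit(data, floc=0)   # fix the location at 0 for life data
        ks = stats.kstest(data, name, args=params)
        print(f"{name:12s} KS statistic = {ks.statistic:.3f}, "
              f"p-value = {ks.pvalue:.3f}")
    # A large p-value means the null hypothesis "the data follow this
    # distribution" cannot be rejected at the chosen significance level.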

3.2.2 The Weibull distribution

There are many distributions used within reliability engineering and life data analysis. Life data refers to measurements of the lifetime of a product, such as the time the product operated before it failed (also called time to failure). The Weibull distribution is the best known, but other distributions used are the exponential, lognormal and normal distributions. Based on experience and Goodness-of-Fit tests, the most appropriate distribution is chosen for each collected data set (Weibull 2019).

The Weibull distribution can be applied in different forms (1-parameter, 2-parameter, 3-parameter and mixed Weibull). The PDF of the 3-parameter Weibull distribution is written as:

$$f(t) = \frac{\beta}{\eta} \left( \frac{t - \gamma}{\eta} \right)^{\beta - 1} e^{-\left( \frac{t - \gamma}{\eta} \right)^{\beta}}$$

where β is the shape parameter, η is the scale parameter, γ is the location parameter, t ≥ γ and β, η > 0. If γ is assumed to be zero, the 2-parameter Weibull is applied:

$$f(t) = \frac{\beta}{\eta} \left( \frac{t}{\eta} \right)^{\beta - 1} e^{-\left( \frac{t}{\eta} \right)^{\beta}}$$

3.2.3 Interval censoring

Interval censoring is a method used when an event occurs but the time of the event is not directly observed, only known to lie within some time interval. Event times can be both continuous and discrete. If the exact time of the event can be observed and the event can occur at any time, the event time is continuous. When the event is only known to have happened within a specified time interval, the exact time of the event is continuous but unobtainable. Such data are referred to as interval-censored data. By using interval censoring on discrete data, a continuous distribution can be found (Qiu, Stein and Elston 2013, 1).
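As a sketch of how such a distribution can be fitted (an illustrative assumption, not the procedure used in the thesis), the log-likelihood for interval-censored Weibull data sums log(F(upper) − F(lower)) over the observation intervals and can be maximized numerically:

    import numpy as np
    from scipy.optimize import minimize

    # Illustrative interval-censored observations in hours: each event is
    # only known to lie in (lower, upper], e.g. a robot stop during the
    # night that is first detected the next morning.
    lower = np.array([0.0, 8.0, 16.0, 8.0, 24.0])
    upper = np.array([8.0, 16.0, 24.0, 16.0, 48.0])

    def weibull_cdf(t, beta, eta):
        """CDF of the 2-parameter Weibull distribution from Section 3.2.2."""
        return 1.0 - np.exp(-np.power(t / eta, beta))

    def neg_log_likelihood(params):
        beta, eta = params
        if beta <= 0.0 or eta <= 0.0:
            return np.inf
        # Probability mass the fitted distribution puts on each interval
        p = weibull_cdf(upper, beta, eta) - weibull_cdf(lower, beta, eta)
        return -np.sum(np.log(np.clip(p, 1e-12, None)))

    result = minimize(neg_log_likelihood, x0=[1.0, 20.0], method="Nelder-Mead")
    beta_hat, eta_hat = result.x
    print(f"estimated shape = {beta_hat:.2f}, scale = {eta_hat:.2f} hours")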

3.3 Evolution strategies

Evolution strategies (ES) is an optimization technique based on evolutionary algorithms (EA). Other popular techniques within EA are Genetic Algorithms and Evolutionary Programming. This optimization technique is especially suitable for solving ill-structured optimization problems with several local extreme points. ES is used to optimize problems by mimicking mechanisms from natural evolution, such as selection, mutation and recombination. Two main factors that drive ES are to modify variables randomly at all times and to reject a new set of variables if they do not improve the solution (Beyer and Schwefel 2002, 5). The algorithm in ES is based on a set of µ parental individuals. From this parental population, a set of λ offspring is created through random mutation and recombination in each generation (Beyer and Schwefel 2002, 6).

3.3.1 Algorithm

The goal of ES is to optimize an objective function F with respect to the decision variables (also referred to as object parameters in ES) $y := (y_1, y_2, \dots, y_N)$:

$$F(y) \rightarrow \text{opt.}, \quad y \in \mathcal{Y},$$

where $\mathcal{Y}$ can be a set of data structures, such as an N-dimensional search space of real values or integers.

Within ES, populations P contain individuals a. Each individual can be defined as follows:

$$a_k := (y_k, s_k, F(y_k))$$

where k is an index ($k \in \{1, \dots, P\}$), $y_k$ is an object parameter variable, $s_k$ is an endogenous strategy parameter that keeps track of statistical properties (used, for example, in the mutation) and $F(y_k)$ is the objective function value, also referred to as the fitness, of $y_k$.

During a generation g, λ offspring $a_l$ ($l \in \{1, \dots, \lambda\}$) are created from a set of µ parents $a_m$ ($m \in \{1, \dots, \mu\}$). The population of offspring individuals is denoted $P_o$ and the population of parents is denoted $P_p$. The parameters µ, λ and ρ (the number of parents involved in creating one offspring) are referred to as exogenous strategy parameters. These parameters are held constant during the evolution run (Beyer and Schwefel 2002, 9).

When ρ > 1, the parents are recombined. A special case of ES is when only one parent is involved in the reproduction (ρ = 1); then no recombination is made (Beyer and Schwefel 2002, 10).

Algorithm 1 Evolution strategies

1:  begin
2:    g := 0
3:    initialize P_p^(0) := {(y_m^(0), s_m^(0), F(y_m^(0))), m = 1, ..., µ}
4:    while termination conditions are not fulfilled do
5:      for l := 1 to λ do
6:        ε_l := marriage(P_p^(g), ρ)
7:        s_l := s_recombination(ε_l)
8:        y_l := y_recombination(ε_l)
9:        s̃_l := s_mutation(s_l)
10:       ỹ_l := y_mutation(y_l, s̃_l)
11:       F̃_l := F(ỹ_l)
12:     end for
13:     P_o^(g) := {(ỹ_l, s̃_l, F̃_l), l = 1, ..., λ}
14:     case selection_type of
15:       (µ, λ): P_p^(g+1) := selection(P_o^(g), µ)
16:       (µ + λ): P_p^(g+1) := selection(P_o^(g), P_p^(g), µ)
17:     end case
18:     g := g + 1
19:   end while
20: end

The algorithm for ES is shown in Algorithm 1. At generation g = 0, an initialization of the parental population $P_p^{(0)} := (a_1, \dots, a_\mu)$ is performed. At generation g, a new offspring population $P_o^{(g)}$ is generated from the parental population. This is done λ times, and through each cycle one offspring is generated. First, a family ε of ρ parents is randomly selected (the marriage) from the parent pool of size µ. This marriage selection is based on randomization, which means it is independent of F, the parental objective values. Afterwards, recombination of the endogenous strategy parameters and the object parameters occurs. Then, the mutation of the strategy parameters and the object parameters takes place. In order to ensure proper self-adaptation, the order of the mutation operators must not be changed (Beyer and Schwefel 2002, 11).


When a complete offspring population $P_o^{(g)}$ is obtained, a selection is performed that provides a new parent population $P_p^{(g+1)}$. At last, an inspection of the termination condition is made. As termination conditions, the standard criteria can be used:

• Resource criteria:

– Maximum number of generations

– Maximum CPU time

• Convergence criteria in the space of:

– Fitness values

– Object parameters

– Strategy parameters

3.3.2 Selection

Within ES, selection sets the direction for the evolution of generations. The selection is based on individuals with high fitness values. From these, a new population of parents is generated (Beyer and Schwefel 2002, 11):

$$P_p^{(g+1)} := (a_{1:\gamma}, \dots, a_{\mu:\gamma})$$

where $a_{m:\gamma}$ is the m-th best of the γ individuals (γ = µ + λ), for m = 1, ..., µ.

From this selection technique, two instances are possible: (µ + λ) and (µ, λ). In the plus selection (µ + λ), the fittest individual found is guaranteed to survive (survival of the fittest). In this instance, both the parents and the offspring are included in the selection pool, which gets the size γ = µ + λ. This instance also has no limitations on the number of offspring λ. The case (µ + 1) is a special case of (µ + λ) and is seen as the steady state of ES, since the population size is kept constant: there are µ parents at a time, two of which are chosen at random and recombined to generate an offspring. The selection can be equated to "the elimination of the worst", where either one parent or one offspring is eliminated (Beyer and Schwefel 2002, 6).

In the other instance (µ, λ), comma selection, parents from generation g are neglected, and only the λ new offspring generated define the selection pool. (µ, λ) is advantageously used when the search space Y is unbounded while (µ + λ) should be used for discrete finite search spaces (Beyer and Schwefel 2002, 12).

3.3.3 Mutation

The main source of variation within ES is mutation. The design of how the mutation should operate depends on the nature of the problem. There is no established method to date, but there are three rules that should be considered when implementing mutation (Beyer and Schwefel 2002, 12):

• Reachability

To ensure global convergence, it should be possible to reach a finite state $(\tilde{y}, \tilde{s})$ from a given parental state $(y_p, s_p)$ within finite time (Beyer and Schwefel 2002, 13).


• Unbiasedness

There shall not be any preference in the selection of parental individuals, and the variation operators shall not introduce any bias (Beyer and Schwefel 2002, 13).

• Scalability

The strength of mutation needs to be adjustable in order to adapt to the properties of the fitness landscape. A smooth evolutionary path through the fitness landscape towards the optimal solution is possible by generating small variations. Since the fitness landscape depends on the objective function and the variation operators, a smooth evolution is considered essential for effective optimization (Beyer and Schwefel 2002, 13).

3.3.4 Recombination

Recombination utilizes data from at most ρ parents, unlike mutation, which performs search steps based on one single parent. Recombination consists of two instances (Beyer and Schwefel 2002, 18):

• Dominant recombination

Let $a = (a_1, \dots, a_D)$ be a parental vector and $r = (r_1, \dots, r_D)$ a recombinant (offspring) produced by ρ randomly chosen parents. For each component, a random selection is done among the indices of the ρ parents, which are recombined to produce the new offspring:

$$(r)_k := (a_{m_k})_k, \quad \text{with } m_k := \text{Random}\{1, \dots, \rho\}$$

where the k-th component of the recombinant is taken from the randomly chosen parent $m_k$ (Beyer and Schwefel 2002, 19).

• Intermediate recombination

Unlike the dominant recombination, the intermediate recombination uses all ρ parents. This instance of recombination takes the center of mass of the ρ parent vectors $a_m$ into account (Beyer and Schwefel 2002, 19):

$$(r)_k := \frac{1}{\rho} \sum_{m=1}^{\rho} (a_m)_k$$
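Pulling Sections 3.3.1-3.3.4 together, the following is a minimal Python sketch of a (µ + λ)-ES with intermediate recombination and self-adaptive mutation strength, run on a toy objective. The thesis itself uses the optimizer built into the simulation software, so everything below, including the learning rate and population sizes, is an illustrative assumption:

    import numpy as np

    rng = np.random.default_rng(0)

    def objective(y):
        """Toy fitness to minimize (stand-in for a simulation response)."""
        return float(np.sum(y ** 2))

    def evolution_strategy(mu=5, lam=20, rho=2, dim=4, generations=100):
        """Minimal (mu + lambda)-ES following the scheme of Algorithm 1."""
        # An individual is (object parameters y, strategy parameter s, F(y)).
        population = []
        for _ in range(mu):
            y = rng.normal(0.0, 1.0, dim)
            population.append((y, 1.0, objective(y)))
        tau = 1.0 / np.sqrt(2.0 * dim)  # learning rate for mutation strength
        for _ in range(generations):
            offspring = []
            for _ in range(lam):
                # Marriage: rho parents chosen at random, independent of fitness.
                family = [population[i]
                          for i in rng.choice(mu, size=rho, replace=False)]
                # Intermediate recombination: center of mass of the rho parents.
                y = np.mean([ind[0] for ind in family], axis=0)
                s = float(np.mean([ind[1] for ind in family]))
                # Mutate the strategy parameter first, then the object parameters.
                s_new = s * np.exp(tau * rng.normal())
                y_new = y + s_new * rng.normal(0.0, 1.0, dim)
                offspring.append((y_new, s_new, objective(y_new)))
            # Plus selection: the best mu of parents and offspring survive.
            population = sorted(population + offspring, key=lambda ind: ind[2])[:mu]
        return population[0]

    best_y, best_s, best_f = evolution_strategy()
    print(f"best fitness {best_f:.2e} at y = {np.round(best_y, 3)}")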


4 Method

This section presents the methods and the model used for solving the problems formulated in Section 1.5. First, the methods for collecting information about QC are described. Next, the software used in this thesis is presented. This section also contains a description of the modeling process, which includes the data collection, the distribution fitting and the model translation. Lastly, the simulations performed to answer the problem formulations are presented.

4.1 Interviews

Interviews are used during the work to create a deeper understanding of the different analyses of Symbicort Turbuhaler. Interviews are the primary method for gathering information about people's experiences and opinions. There are three different types of interviews: unstructured, semi-structured and structured (Wildemuth 2017, 239). Unstructured interviews are used during this work, which means that the person is asked open questions throughout the interview. It is then possible for the interviewer to steer the interview toward relevant topics and to ask supplementary questions if something is especially interesting or unclear. This type of interview is suitable when the interviewer has an unclear picture of the subject to be examined. One disadvantage of unstructured interviews is that it can be difficult to summarize and compare several interviews (Wildemuth 2017, 240).

Interviews are carried out during the project together with laboratory engineers and operative managers, usually in connection with a tour.

4.2 Tours and observations

To understand the analysis flow of Symbicort Turbuhaler at AstraZeneca, multiple tours are carried out in the laboratory together with laboratory engineers. During these tours it is possible to interview laboratory engineers and get a perception of the flow, and observations are made to create an understanding of the analyses that are conducted. Observations can be used to understand how a task is performed and how people behave, especially actions the professionals are unaware of. This is useful for creating an understanding of an area and collecting information about different cases (Wildemuth 2017, 210).

For this work, the laboratory engineers are observed while conducting the different analyses, and by following the entire process, an understanding of the flow is formed.

4.3 Software

The software used during this work are ExtendSim®, Microsoft Excel and Minitab® Statistical Software. They are described in Sections 4.3.1, 4.3.2 and 4.3.3.

4.3.1 ExtendSim®

ExtendSim® is a software package that provides tools for modeling discrete event, continuous, agent-based and discrete rate processes (Krahl 2002, 205). For this project, ExtendSim® version 9.2 is used to simulate discrete events in the analysis flow of Symbicort Turbuhaler.

Discrete event simulation

A discrete event simulation model is used for systems where variables change state at discrete points in time (Banks et al. 2010, 34). Such models can be used to predict complex systems of a real-world problem. Components of these systems are described by different entities, which correspond to elements such as people, equipment and materials. The entities pass through the model as different sequences of events occur over a specified time period (Ullrich and Lückerath 2017, 2). The whole process is organized around events, which can occur at both random and predictable intervals (Imagine That Inc 2013, 93).
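ExtendSim's engine is proprietary, but the core mechanism can be illustrated with a few lines of Python (the event names and times below are invented for illustration): a priority queue of time-stamped events processed in chronological order, where handling one event may schedule new ones.

    import heapq

    def run(seed_events, horizon_h):
        """Process (time, label) events in time order up to a horizon."""
        queue = list(seed_events)
        heapq.heapify(queue)
        log = []
        while queue:
            time, label = heapq.heappop(queue)
            if time > horizon_h:
                break
            log.append((time, label))
            if label == "batch_arrival":
                # State change: the sample occupies a robot and the analysis
                # completes 9 hours later (the 60-dose time from Section 2.2.1).
                heapq.heappush(queue, (time + 9.0, "analysis_done"))
        return log

    for t, event in run([(0.0, "batch_arrival"), (24.0, "batch_arrival")], 48.0):
        print(f"t = {t:5.1f} h: {event}")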

4.3.2 Microsoft Excel

Microsoft Excel is used to store collected data. Calculations on the data are performed in Microsoft Excel by utilizing built-in functions.

4.3.3 Minitab® Statistical Software

Minitab® Statistical Software offers tools for the analysis of data (Minitab, LLC 2019a). It is used in this project for finding distributions that fit the collected robot data. Minitab® Statistical Software is also used to create plots and other visualizations.

4.4 Modeling

In order to answer the problems formulated in Section 1.5, the majority of the time is used to construct a model. In this section, steps needed to construct this model are presented.

4.4.1 Problem formulation and setting of objectives

The model is based on the purpose, goal and problem formulations defined in Sections 1.4 and 1.5. The model is intended to be used as a tool for QC, where it can also be used in the future to answer other questions related to the flow.

4.4.2 Data collection

Data relevant to the simulation are identified and collected from multiple units at QC. Historical data about the inflow from production, the number of robot stops, robot maintenance, the number of days to complete each batch sample and the analysis setups are collected. For this, data from the time period 2017-11-03 to 2019-02-04 are used, containing 1507 data points. The time unit of the data is days.

Inflow data

Data regarding the inflow of batch samples is collected as input to the simulation model. The data is collected from E-plan, where information about incoming and outgoing batch samples each day is stored in Microsoft Excel files. The collected files contain information about several parameters, of which the following are relevant for the model:

• Day of batch sample delivery

• Item quantity

• Article number

• Sample number Snäckviken

• Sample number Gärtuna


• Number of doses (30, 60 and 120)

• Amount of substance (80 µg, 160 µg and 320 µg)

These properties are essential to include in the simulation model. Depending on which properties a batch sample has, it has a different flow-through time, and thus impacts the flow differently.

In order to later validate the simulation model, historical data on the delivery of batches and on planned and actual finish dates is also collected.

Date of batch sample delivery

This is the date when a batch sample arrives from production to Snäckviken or Gärtuna for analysis. The number of delivered batch samples differs from day to day, and the time between deliveries also varies.

Planned finish date of batch sample

This is the planned date when a batch sample has been analyzed, reviewed and is ready to be released to QA. Some batches may be prioritized and therefore need to be completed within a shorter time interval.

Actual finish date of batch sample

This is the actual date when a batch sample is released to QA. The actual finish date can sometimes exceed the planned date for when the batch sample should be ready.

Robot data

Since the robots are a major part of the flow, it is of interest to collect data on how often robot stops occur. There are two types of stops: unplanned stops that occur at random, and planned stops that are scheduled maintenance. Both types of stops are relevant to include in the flow. However, it is only of interest to study the unplanned stops when the robot is used to conduct a DD analysis. This is because the robots used for this type of analysis are unsupervised during the majority of the total analysis time. Since the MLI analysis is supervised during the entire analysis, the effect of unplanned stops is not as severe as for a DD analysis.

Data relating to time to failure (TTF) and time to repair (TTR) of the robots is not compiled by any specific unit at AstraZeneca; consequently, data from different sources is collected.

The collected data originates from the enterprise resource planning software SAP and from the software Empower. The data from these sources is stored in Microsoft Excel files.

Data from SAP originates from the maintenance unit at QC and contains information on Registered Date of Failure, Type of Failure, Start Date of Repair, and Finish Date of Repair. In some cases there is no start or end date of repair for a failure; in these cases an arbitrary time consistent with a similar type of repair is added. Also, since there were inconsistencies in the registered date of failure (for example, a repair being registered before the failure actually occurred), data from Empower is used to complement this data.

Empower is used by the laboratory engineers during analyses and contains data about every conducted analysis. The data contains Sample Set Start Date, Sample Set Finish Date and Comments Sample. A robot stop is identified if a finish date is missing for a conducted analysis. In some cases, the laboratory engineers comment on what went wrong during the analysis. The TTF is calculated by

    \Delta t_{TTF,i} = t_i - t_{i-1}

and the TTR is calculated by

    \Delta t_{TTR,i} = t_{i,T} - t_{i,0}

where i ∈ {1, ..., N}, N is the number of failures, t_i is the day of failure i, t_{i,0} is the start date of repair of failure i and t_{i,T} is the end date of repair of failure i.
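As an illustration, both quantities can be computed directly from the collected dates. The Python sketch below assumes the SAP and Empower exports have already been parsed into date lists; the dates themselves are made up.

    from datetime import date

    # Hypothetical failure and repair dates parsed from the SAP/Empower exports
    failure_days = [date(2018, 1, 4), date(2018, 1, 19), date(2018, 2, 2)]
    repair_start = [date(2018, 1, 4), date(2018, 1, 22), date(2018, 2, 2)]
    repair_end   = [date(2018, 1, 5), date(2018, 1, 23), date(2018, 2, 5)]

    # TTF_i = t_i - t_{i-1}: days between consecutive failures
    ttf = [(failure_days[i] - failure_days[i - 1]).days
           for i in range(1, len(failure_days))]

    # TTR_i = t_{i,T} - t_{i,0}: days from start to end of each repair
    ttr = [(end - start).days for start, end in zip(repair_start, repair_end)]

    print("TTF (days):", ttf)  # [15, 14]
    print("TTR (days):", ttr)  # [1, 1, 3]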

Analysis setup schedule

In the DD analysis, different analysis setups in the start smart schedule are used. The analysis time for these setups is collected from a standard operating procedure (SOP), containing standardized step-by-step instructions developed by AstraZeneca.

4.4.3 Finding distributions

To identify which type of distribution the TTF and TTR data fit, a reasonable approach is to conduct a Goodness-of-Fit test and inspect the probability plots for different distributions.

Time to failure

For the TTF data, a Goodness-of-Fit test is conducted. By inspecting the p-value for a normal distribution in Figure 4, it is possible to conclude that the data is not normally distributed, due to the low p-value (< 0.005). By inspecting the p-values for the remaining distributions, it is also possible to conclude that the data does not fit any of the tested distributions (for more information see Section 3.2.1).

Figure 4: Goodness-of-Fit test for TTF.
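A comparable normality check can be reproduced outside Minitab® Statistical Software. The sketch below uses SciPy's Anderson-Darling test, which reports the test statistic against critical values rather than a p-value; the data is synthetic and purely illustrative.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    # synthetic stand-in for the TTF data: right-skewed, clearly non-normal
    ttf = rng.weibull(1.2, size=200) * 10

    result = stats.anderson(ttf, dist='norm')
    print("AD statistic:", round(result.statistic, 3))
    for crit, sig in zip(result.critical_values, result.significance_level):
        # statistic > critical value => reject normality at that level
        print(f"  {sig:4.1f}% level: critical value {crit:.3f}")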

A histogram of the data is plotted (see Figure 5) to get an overview of the spread of the data points. The histogram shows that the data contains three points above 120 days. It is assumed that these points may influence how the data fits a distribution. These values are therefore removed, and a new Goodness-of-Fit test is performed.


Figure 5: Histogram for the TTF data.

By inspecting the p-values in Figure 6, it is possible to conclude that the data still does not fit any distribution. The data is discrete, since the exact times at which a stop occurs during a day are unknown, while the tested distributions are continuous. An attempt to use interval censoring (see Section 3.2.3) on the data is therefore made to get a better fit. This is done by constructing a time interval for each robot failure event. For example, the time interval for failures that occur on day one is [0.00, 0.99].

Figure 6: Goodness-of-Fit test for TTF without extreme values.
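One way to implement interval censoring outside Minitab® Statistical Software is to fit the Weibull distribution by maximum likelihood, where each observation contributes the probability mass its interval carries, log(F(b) − F(a)). The sketch below is a self-contained illustration of this idea on synthetic day-resolution data, not the procedure Minitab uses internally.

    import numpy as np
    from scipy import stats
    from scipy.optimize import minimize

    rng = np.random.default_rng(7)
    # synthetic TTF data: exact times are unknown, only the day is recorded
    true_ttf = stats.weibull_min.rvs(1.3, scale=12, size=300, random_state=rng)
    days = np.floor(true_ttf)
    lower, upper = days, days + 0.99  # interval [d, d + 0.99] for day d

    def neg_log_likelihood(params):
        shape, scale = params
        if shape <= 0 or scale <= 0:
            return np.inf
        cdf = stats.weibull_min.cdf
        mass = cdf(upper, shape, scale=scale) - cdf(lower, shape, scale=scale)
        return -np.sum(np.log(np.clip(mass, 1e-300, None)))

    fit = minimize(neg_log_likelihood, x0=[1.0, 10.0], method='Nelder-Mead')
    shape_hat, scale_hat = fit.x
    print(f"shape = {shape_hat:.2f}, scale = {scale_hat:.2f}")  # near 1.3 and 12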

Probability plots for different distributions are visualized to inspect which distribution line best matches the data. The normal distribution can be ruled out, since barely any of the data points follow the straight line (see Figure 7). However, the data points for both the 2-parameter and the 3-parameter Weibull distribution follow the straight lines. Based on the AD values and the correlation coefficients in Figure 8, the Weibull distributions have the best fit. Because of their similarities, their probability plots are studied more closely, and no major difference between them can be distinguished. Since the 3-parameter Weibull distribution offers no significant advantage, the simpler 2-parameter Weibull distribution is selected (for more information see Section 3.2.1). The values for the shape and scale parameters are found in Figure 9 and are used for all robots in the flow. See Section 1.6 for the motivation of this assumption.


Figure 7: Probability plots for TTF.

Figure 8: Goodness-of-Fit test for TTF with interval censoring.

Figure 9: Probability plot for TTF, Weibull.

Time to repair

The method used to find a suitable distribution for the TTR data is similar to the method used for the TTF data. When a Goodness-of-Fit test is performed on the TTR data, no suitable distribution is found. Since the data is discrete and the tested distributions are continuous, an attempt to use interval censoring on the data is made to get a better fit. Since the probability plots in Figure 10 show a similar outcome as for the TTF data, the same procedure is performed and the Weibull distribution is selected for the TTR data as well. The values for the shape and scale parameters are found in Figure 11 and are used for all robots in the flow. See Section 1.6 for the motivation of this assumption.

Figure 10: Probability plots for TTR.

Figure 11: Probability plot for TTR, Weibull.
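In the simulation, the fitted distributions are used to draw robot failure and repair times. Below is a sketch of how such draws could feed the model; the shape and scale values are placeholders, not the fitted values from Figures 9 and 11.

    import numpy as np
    from scipy import stats

    # placeholder parameters -- the actual values come from the fits
    # in Figures 9 and 11 and are not reproduced here
    TTF_SHAPE, TTF_SCALE = 1.3, 12.0   # days between unplanned stops
    TTR_SHAPE, TTR_SCALE = 0.9, 2.0    # days a repair takes

    rng = np.random.default_rng(42)

    def next_failure_delay():
        """Draw the time (days) until the next unplanned robot stop."""
        return stats.weibull_min.rvs(TTF_SHAPE, scale=TTF_SCALE, random_state=rng)

    def repair_duration():
        """Draw the time (days) a repair keeps the robot unavailable."""
        return stats.weibull_min.rvs(TTR_SHAPE, scale=TTR_SCALE, random_state=rng)

    print("next failure in", round(next_failure_delay(), 2), "days")
    print("repair takes", round(repair_duration(), 2), "days")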

4.4.4 Conceptual model

A conceptual model is initially created (see Figure 12) to get an understanding of the flow of QC and an overall picture of which input data is required for the actual model. During the process of creating the conceptual model, assumptions about the flow of QC are made and the model content is defined.


Figure 12: Conceptual model of QC Symbicort Turbuhaler.

4.4.5 Model translation

Based on the current situation described in Section 2, the model is translated into the simulation software ExtendSim®. An overview of the model is shown in Figure 13, where the batch samples leave production and pass through the flow of QC until they are released to QA. The flow splits into Snäckviken (A) and Gärtuna (B), which are visualized by hierarchical blocks. Inflow data is generated by a schedule containing information about each sample entering the flow. All staff in the flow follow an 8-hour schedule each day, apart from weekends.

Figure 13: Overview of the model.

An overview of Snäckviken is shown in Figure 14. The inflow reaches Sample receiving in A1, where samples are registered and then moved to Batching of analysis setups in A2. For the different analyses, the samples are batched together in various ways. When batching DD, the size of each analysis setup varies, depending partly on the inflow of samples and partly on the number of doses in each batch sample. Depending on these factors, an equation block with specified conditions is executed, which assembles an analysis setup; see Appendix C for the different analysis setup schedules. For MLI, four samples are batched together depending on the amount of the substance Budesonide.

When the different samples are batched together, the analyses are carried out in A3, A4 and A5 (Figure 14). If all Minispice robots are busy with DD analysis, the analysis setup is transferred to an ADMS robot where a DD analysis is carried out. It is also possible to prohibit this by using the gate between A3 and A4. Finally, a review is conducted in A6 where the analyses are reviewed.

Figure 14: Overview of the flow in Snäckviken.

For a more comprehensive description of this part of the model, see Appendix B.1.

An overview of the flow in Gärtuna is shown in Figure 15. In the same way as in Snäckviken, the inflow reaches Sample receiving in B1. At B2, the samples that arrive within 48 hours of each other are batched together, which is controlled by an equation block with specified conditions. Thereafter, IR analysis is performed in B3 and reviewed in B4.

Figure 15: Overview of the flow in Gärtuna.

For a more comprehensive description of this part of the model, see Appendix B.2.
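The 48-hour rule in B2 can be expressed compactly in code. Below is a minimal sketch of one way such window-based batching could work; the function and its arguments are hypothetical and are not taken from the ExtendSim® model.

    def batch_by_window(arrival_days, window=2.0):
        """Group arrival times (in days) so that every sample in a batch
        arrived within `window` days of the first sample in that batch."""
        batches, current = [], []
        for day in sorted(arrival_days):
            if current and day - current[0] > window:
                batches.append(current)  # window exceeded: close the batch
                current = []
            current.append(day)
        if current:
            batches.append(current)
        return batches

    # arrivals on days 0, 1.5, 2.5, 6 and 6.5 -> [[0.0, 1.5], [2.5], [6.0, 6.5]]
    print(batch_by_window([0.0, 1.5, 2.5, 6.0, 6.5]))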

Model translation considerations

A challenge that needs to be considered is the batching of different analysis setups. For batching, the batch block can be used, which offers two options for creating a batch: either statically, where the batch size is defined in the block, or dynamically, where the batch size is indicated by a signal through a connector. Since the inflow varies, the assembly of batches also varies from one point in time to another. For this type of problem it is not possible to use static batching, since it does not distinguish different items from each other. Instead, the batching needs to be done dynamically through an equation block.

The equation block receives inputs and returns outputs, where the logic is managed by the programming language ModL (which is similar to the programming language C). In order to define the different analysis setups, if-else statements can be used. These statements check which types of batch samples are available and select an analysis setup according to a prioritization order in the script; a sketch of this logic is given below. This is an important part of the model translation, as the laboratory engineers themselves follow a similar schedule (the start smart schedule) according to different priorities.
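As an illustration of that prioritization logic, here is a sketch in Python rather than ModL. The setup definitions and quantities are entirely hypothetical; the real prioritization order follows AstraZeneca's start smart schedule.

    def select_setup(queue):
        """Pick the highest-priority analysis setup that the waiting
        samples can fill; `queue` maps sample type to count waiting."""
        # hypothetical priority order, checked top to bottom like the
        # if-else chain in the ModL equation block
        setups = [
            ("full setup",  {"120-dose": 2, "60-dose": 2}),
            ("mixed setup", {"60-dose": 2, "30-dose": 1}),
            ("small setup", {"30-dose": 2}),
        ]
        for name, needs in setups:
            if all(queue.get(kind, 0) >= n for kind, n in needs.items()):
                for kind, n in needs.items():  # consume the samples
                    queue[kind] -= n
                return name
        return None  # wait for more samples to arrive

    queue = {"120-dose": 2, "60-dose": 3, "30-dose": 1}
    print(select_setup(queue))  # "full setup"
    print(select_setup(queue))  # None: remaining samples fill no setup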
