Simulation and Analysis of Queueing System

DEGREE PROJECT IN ELECTRICAL ENGINEERING, SECOND CYCLE, 30 CREDITS

STOCKHOLM, SWEDEN 2019

YUCONG ZHANG

KTH ROYAL INSTITUTE OF TECHNOLOGY


Examiner

Gyorgy Dan

Academic adviser

Viktoria Fodor

Industrial adviser

Olga Grinchtein

John Karlsson

KTH ROYAL INSTITUTE OF TECHNOLOGY
Electrical Engineering and Computer Science

Abstract

This thesis provides a discrete-event simulation framework that can be used to analyze and dimension computing systems. The simulation framework can define and parametrize flexible queueing systems. We use the simulation framework to explore data collected from a real-world system. We analyze metrics, including waiting time and server utilization, of single-server and multi-server queueing systems. In particular, we study the impact of the number of servers on waiting time and server utilization. The experiments show that it is possible to increase server utilization and decrease the number of servers without significantly increasing waiting time, and that flexible architectures can lead to significant gains.

Keywords

1. Queueing theory; 2. Queueing system simulation; 3. System optimization; 4. Computer systems; 5. Data analytics


Summary

This degree project provides a framework that can be used to analyze and dimension computer systems. The simulation framework can define and parametrize a flexible queueing system based on data from a system in operation. We use the simulation framework to examine the data collected from production systems. We analyze performance metrics, such as waiting time and utilization, for systems with one server and with several servers. Above all, we examine how the number of servers affects waiting time and utilization. The experiments show that it is possible to increase utilization and reduce the number of servers without noticeably increasing waiting time, and that a flexible architecture can lead to noticeable improvements.


List of figures

Figure 2-1: An example of the queueing system with full flexibility
Figure 3-1: The architecture of the simulation framework
Figure 3-2: Timeline of data clean case 1
Figure 3-3: Timeline of data clean case 2
Figure 3-4: Data clean algorithm
Figure 3-5: Event selection algorithm
Figure 3-6: Job selection algorithm
Figure 3-7: Queue selection algorithm
Figure 3-8: Server selection algorithm
Figure 3-9: Execution flow of the simulator
Figure 4-1: UML class diagram of job, queue, and server
Figure 4-2: UML class diagram of event and event queue
Figure 4-3: The topology generation algorithm
Figure 4-4: Example of a flexible queueing system
Figure 5-1: Difference rate for ten queueing systems
Figure 5-2: Error rate for nine queueing systems
Figure 5-3: Distribution of interarrival time
Figure 5-4: Distribution of service time
Figure 5-5: Simulation results based on different types of distribution
Figure 5-6: Result of system1
Figure 5-10: Server utilization for systems with different server number


List of acronyms and abbreviations

NHS  National Health Service
ICT  Information Communication Technology
DPM  Data Preparation Module
FIFO  First-In, First-Out
LIFO  Last-In, First-Out
SIRO  Service In Random Order
EDF  Earliest Deadline First
DES  Discrete Event Simulation
FDES  Fuzzy Discrete Event Simulation
$TW_{\text{historical}}$  Mean waiting time of historical data
$TW_{\text{dataclean}}$  Mean waiting time of data clean algorithm
$TW_{\text{replay}}$  Mean waiting time of system replay
$T_{\text{maintenance}}$  Maintenance time
$TW_{\text{priority}}$  Mean waiting time of system with priority queues
$TW_{\text{nopriority}}$  Mean waiting time of system with non-priority queues


1 Introduction

1.1 Background

Shared resources are very common in society and industry. Familiar examples of shared resources include [1]:

• Transportation systems, such as highways, railways, airlines, and ports

• Basic public services, such as schools, hospitals, and water systems

• Communication systems, such as telephone networks

• Computer networks, such as databases, programs, and hardware

Deploying too many resources in a system makes it too costly. For example, in a computer network, the waiting time of jobs would in theory drop to zero if the number of servers were larger than the number of jobs in the system, but this is infeasible because of the high cost of servers. Too few resources are also a problem, as they lead to long waiting times. An example from the National Health Service (NHS) is given in [2]: a shortage of bed capacity leads to long waiting times and delays for patients.

Therefore, we need to make optimal decisions on how much of the resources to deploy and how to access them. In this project, we focus on a computer system, where jobs compete with each other to access the limited resources in the system [3]. In our case, the limited resources are the servers in the system. We need to find a way to optimize the resources (servers) in the system and figure out how the jobs should be assigned to the servers.

Queueing theory is the theoretical tool for modeling resource-sharing systems and optimizing their use. It is mainly seen as a branch of applied probability theory [4]. Its applications can be found in different areas, including transportation systems [5], telecommunication applications [6], call centers [7], manufacturing [8], and computer systems [9]. In this thesis, we mainly focus on the queueing system in a test execution engine that distributes test tasks among servers.

The use of resources in the system needs to be optimized for two reasons: first, the users need to receive service with short waiting time; second, the resources should be highly utilized to make the system economically feasible.

In this thesis, we build a simulation framework which can be used to simulate and analyze different kinds of queueing systems based on the data provided by Ericsson AB. The results of the simulation system will be used to measure the performance of the real system and optimize the use of resources.

1.2 Problem

Simulation is widely used when a real-world situation is too complex to be analyzed by mental reasoning alone. To simulate any system, we face two problems: collecting data from the source system, and relating the measurement data to the state variables that govern the behavior of the system.

In our case, data collection is not included, as the measurement data are stored in a provided database which records the information about the jobs in the source system. But we need to extract the data necessary for simulation from the large amount of data stored in the database, and then transform the job information into parameters that define an equivalent queueing system.

An important reason to build a simulation framework is that optimizing the existing system requires changes to it. Applying changes directly to the existing system is too costly. It is much easier to apply the changes to the simulation framework and analyze the results.

After building the simulation framework, we want to find out how to use the simulation framework to optimize the existing queueing system, such as decreasing the waiting time and increasing server utilization. In this thesis, we will find out whether flexible queueing systems can be an adequate methodology for this issue.

1.3 Purpose

The purpose of this project is to build a simulation framework based on the data provided by Ericsson AB, which can be used for system optimization. This simulation system is built based on a real-world scenario, which means that it can be used to simulate different queueing models, such as the single-queue single-server queueing model and the multiple-queue multiple-server queueing model.

1.4 Objectives

The objective of this project is to build a simulation framework. The main steps of the work would be:

• Define and parametrize the queueing system based on collected data

• Build a simulator which can measure the performance of queueing systems, such as the waiting time and server utilization

• Analyze the distribution of interarrival time and service time based on the historical data

• Analyze the impact of the number of servers on waiting time and server utilization

1.5 Methodology

In this project, we will use simulation to analyze the queueing system. A simulation model has several advantages:

• A simulation model can be used to investigate a wide variety of “what-if” questions about the real-world system. It is much easier to apply changes and predict their effects in a simulation model than in a real-world system.

• Time can be compressed in a simulation model. For example, in our case, we need to study the performance of the real-world system over recent months. Doing so with experiments would be time-consuming, but a simulation model can produce the results for those months in a few minutes.

• A simulation model can be used to study a complex real-world system. For such a system, building a mathematical model is complex. Any model is based on assumptions about the real-world system, but a mathematical model requires more assumptions than a simulation model, as the information about the real-world system is imprecise and hard to measure.

First, we develop and set up a simulation framework based on discrete event simulation (DES). Then, we simulate different kinds of queueing systems, including systems with a single server and systems with multiple servers, and measure the simulation results. Finally, we run experiments to find possible ways of increasing server utilization.

1.6 Sustainability

Information Communication Technology (ICT) is vitally important for driving innovation, productivity, and economic growth. However, it creates environmental problems at the same time [10]. For example, IT hardware causes environmental problems and resource waste both during its production and its disposal [11]. The quest for sustainability and green growth has become a key policy concern in both developed and developing countries [12]. Green ICT is about reducing the impact of ICT on the environment, and aims to reduce the energy use of computers, servers, and data centers. In this thesis, we aim to build a simulation framework that can help optimize resources by reducing the number of servers in the system. Reducing the number of servers saves the energy spent on deploying the hardware and running the servers, which makes the existing system more eco-friendly.

1.7 Outline

The structure of the report will be summarised as follows:

Chapter 1 presents the basic background of the thesis and describes the specific problem that will be addressed in this thesis. Chapter 2 presents the background and the related work of this thesis in detail. Chapter 3 gives an introduction to the data clean algorithm and the simulation system we build in this thesis and how it works. Chapter 4 shows how to implement the simulation framework, including how to define the entities used in the system and generate the different queueing models. Chapter 5 presents the simulation results of the data clean algorithm and different queueing systems, such as the single server queueing system. Chapter 6 is mainly about limitations and future work: further improvements that can be applied to the simulation system and further experiments that can be run. Chapter 7 concludes the work of this thesis.


2 Literature study

2.1 Queueing theory

Queues are common in computer systems. Typically, a queue has one service facility and a waiting room [9]. There may be one or more servers in the service facility. In general, a waiting room in the queue can be of finite or infinite capacity. The infinite waiting room means that the number of jobs waiting in the line will not be limited.

Not all jobs in the queueing system need to be treated equally. The queueing systems in which some jobs get preferential treatment are called priority queueing systems [13]. In a priority queueing system, the queues are ordered and the higher priority jobs will be served first. It is assumed that all the jobs are divided into different priority classes, which are numbered from 1 to n. A higher priority class is denoted by a smaller number. For instance, a job with priority 1 will be handled before a job with priority 10. Jobs with the same priority are still served according to the FIFO (First-in, First-out) service discipline.

There are two basic classes of priority policies: the preemptive-resume policy and the non-preemptive priority policy. The preemptive-resume policy means that a job of a higher priority class has the right to interrupt the service of a lower priority job [14]. On the other hand, the non-preemptive priority policy means that when a server begins to handle a job, this process will not be stopped until the completion of this job. When applying the non-preemptive priority policy, a higher priority job cannot interrupt the processing of another job.

The service discipline indicates the manner in which the units are taken for service [15]. There are several different queueing disciplines:

• FIFO (First-in, First-out): The jobs will be served in the order in which they arrive in the system.

• LIFO (Last-in, First-out): The jobs will be served in the reverse of the order in which they arrive in the system.

• SIRO (Service In Random Order): The jobs will be served in random order.

• EDF (Earliest Deadline First): The job with the earliest deadline will be served first.
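As a rough illustration of these four disciplines, the sketch below picks the next job from a waiting list under each rule. The `Job` fields and the `select_next` helper are hypothetical names introduced only for this example; they are not taken from the thesis implementation.

```python
import random
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    arrival_time: float
    deadline: float


def select_next(waiting, discipline, rng=random):
    """Return the job that the given discipline would serve next."""
    if discipline == "FIFO":   # earliest arrival first
        return min(waiting, key=lambda j: j.arrival_time)
    if discipline == "LIFO":   # latest arrival first
        return max(waiting, key=lambda j: j.arrival_time)
    if discipline == "SIRO":   # uniformly random choice
        return rng.choice(waiting)
    if discipline == "EDF":    # earliest deadline first
        return min(waiting, key=lambda j: j.deadline)
    raise ValueError(f"unknown discipline: {discipline}")


# Illustrative data only: (id, arrival time, deadline).
waiting = [Job(1, 0.0, 9.0), Job(2, 1.0, 4.0), Job(3, 2.0, 7.0)]
```

With this sample list, FIFO and LIFO differ only in which end of the arrival order they take, while EDF ignores arrival order entirely.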

In queueing theory, a special notation, Kendall’s notation, is used to describe and classify queueing systems. It has the form [16]:

$A/S/c/K/N/D$

• $A$ describes the interarrival time distribution.

• $S$ describes the service time distribution.

• $c$ is the number of servers in the system.

• $K$ is the maximum number of jobs in the system, including the jobs in service and the jobs in the waiting room. If we assume this number is infinite ($K = \infty$), the notation can be simplified to $A/S/c$.

• $N$ is the calling population. The calling population can be finite or infinite, which affects how the arrival rate is defined. In an infinite queueing model, the arrival rate is not affected by the number of jobs in the system, while in a finite queueing model, the number of jobs in the system significantly affects the arrival rate.

• $D$ is the queue’s discipline, including the service discipline and the priority order.

In the notation, different symbols describe the interarrival time distribution and the service time distribution:

• M: exponential distribution (M stands for Markov)

• D: deterministic distribution

• G: general distribution

The symbol G for general distribution can be used for both the interarrival time distribution ($A$) and the service time distribution ($S$). For example, the G/G/1 system is a queueing system with general distributions for interarrival time and service time and only one server.
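To make the shorthand concrete, the following sketch splits a Kendall string into its six fields. The helper name `parse_kendall` and the defaults for omitted trailing fields (infinite capacity, infinite population, FIFO) are assumptions made for this illustration.

```python
def parse_kendall(notation):
    """Split a Kendall string such as 'G/G/1' into its six fields.

    Omitted trailing fields receive assumed defaults: infinite
    capacity, infinite calling population, and FIFO discipline.
    """
    fields = notation.split("/")
    if not 3 <= len(fields) <= 6:
        raise ValueError("expected between 3 and 6 fields")
    defaults = ["inf", "inf", "FIFO"]
    # Append only the defaults for the fields that were left out.
    fields += defaults[len(fields) - 3:]
    keys = ["A", "S", "c", "K", "N", "D"]
    return dict(zip(keys, fields))
```

For instance, `parse_kendall("G/G/1")` fills in all three trailing fields, whereas `parse_kendall("M/M/2/10")` keeps the stated capacity and fills in only the population and the discipline.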


2.2 Flexible queueing system

A flexible queueing system is a queueing system with multiple classes of jobs and heterogeneous servers, where jobs have the flexibility of being processed by more than one server and servers possess the capability of processing more than one job class [17].

Figure 2-1 provides an example of the queueing system with full flexibility, which is a special kind of flexible queueing system. A queueing system with full flexibility means that in this system:

1) Each server has the capability of processing any job class. It can be seen that server 1 and server 2 can handle the jobs from queue 1 and queue 2.

2) The job can be processed by any server. The jobs from queue 1 and queue 2 can be allocated to server 1 and server 2.

Figure 2-1: An example of the queueing system with full flexibility

Figure 2-2 provides an example of the queueing system with limited flexibility, which is a more general case of the queueing system. A queueing system with limited flexibility means that:

1) Each server has the capability of processing one or more job classes. Server 1 can handle the jobs from queue 1 and queue 2, while server 2 can handle the jobs from queue 2 and queue 3.

2) The job can be processed by one or more servers. The job from queue 3 can only be processed by server 2, while the job from queue 2 can be processed by server 1 and server 2.
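One possible way to represent such a topology in code is a mapping from each queue to the set of servers that may process its jobs. The sketch below encodes the limited-flexibility example (server 1 serves queues 1 and 2, server 2 serves queues 2 and 3); the dictionary layout and the helper names are assumptions made for illustration.

```python
# Queue -> set of servers allowed to process its jobs.
limited = {
    "queue1": {"server1"},
    "queue2": {"server1", "server2"},
    "queue3": {"server2"},
}


def servers_to_queues(topology):
    """Invert the mapping: server -> set of queues it can serve."""
    inverted = {}
    for queue, servers in topology.items():
        for server in servers:
            inverted.setdefault(server, set()).add(queue)
    return inverted


def is_fully_flexible(topology):
    """True if every queue is connected to every server in the system."""
    all_servers = set().union(*topology.values())
    return all(servers == all_servers for servers in topology.values())
```

Under this representation, full flexibility is simply the special case in which every queue maps to the complete server set.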


2.3 Discrete event simulation

Modeling and simulation are general tools of engineering. In practice, simulation is needed because in some cases, experimenting with the real-life system is not feasible due to the budget or the risks [18] and analytic modeling is mathematically challenging to obtain.

A general approach to building a simulation system is provided in [19]. First, we select a source system, which is the real-world system we are interested in. Second, we collect the data and behaviour from the source system, which will be fundamental for simulation. In our case, the database provided by Ericsson contains the information collected from the source system. The final step is to build a simulator. The simulator is the core part of the simulation framework and behaves similarly to the source system.

In general, there are mainly three approaches to simulation:

• Quantum simulation

• Continuous simulation

• Discrete event simulation

Discrete-event simulation and continuous simulation are widely used, but quantum simulation is a special kind of simulation, as it is mainly used in high-energy physics, atomic physics, or similar areas. Quantum simulation relies on a quantum computer because of the huge amount of memory required [20]. It is very hard to perform quantum simulation on a classical computer.

Continuous simulation is suitable for systems in which the states can change continuously [21]. In the continuous simulation, a continuous function will be applied using real numbers to represent a continuously changing system, which means that we can track the variables in continuous time.

On the other hand, discrete event simulation is suitable for problems in which states change in discrete times and by discrete steps. In the discrete event simulation, the source system will be modelled as a sequence of events ordered by the event occurrence time [22]. The state of the system will not change between two adjacent events.

To fully understand discrete-event simulation, we introduce its basic components:

• State

The state is a set of variables collected from the system, each of which can be a particular measurable property of the system [18]. For instance, in this thesis, the server state is one of the variables that can be used to describe the system. The server state ‘BUSY’ means that the server is occupied by a job.

• Event

An event is an instance of a change in the state variables [18]. When an event occurs, the state of the system will change. In a simulation, there is a limited number of event types. For example, in our case, there are two events: Event ‘ARRIVAL’ and Event ‘DEPARTURE’.
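The relation between state and event can be sketched for a single server without a waiting room. The ‘BUSY’ state and the ‘ARRIVAL’ and ‘DEPARTURE’ events come from the text; the ‘IDLE’ state, the enum classes, and the `apply_event` helper are assumptions added for this illustration.

```python
from enum import Enum


class ServerState(Enum):
    IDLE = "IDLE"   # assumed counterpart of 'BUSY'
    BUSY = "BUSY"


class EventType(Enum):
    ARRIVAL = "ARRIVAL"
    DEPARTURE = "DEPARTURE"


def apply_event(event_type):
    """Return the server state right after one event occurs."""
    if event_type is EventType.ARRIVAL:
        return ServerState.BUSY
    return ServerState.IDLE


# Events are (time, type) pairs; the state changes only at these instants.
events = [(5.0, EventType.DEPARTURE), (2.0, EventType.ARRIVAL)]

state = ServerState.IDLE
history = []
for time, event_type in sorted(events, key=lambda e: e[0]):
    state = apply_event(event_type)
    history.append((time, state))
```

Between the two recorded instants in `history`, nothing is simulated at all, which is what distinguishes this approach from continuous simulation.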

2.4 Related work

The related work consists of two parts: queueing theory and simulation. In the queueing theory part, we mainly focus on examples which analyze and model queueing systems. In the simulation part, we present several examples which build simulation systems in different ways.

2.4.1 Queueing theory

In [23], the authors provide a model of a single-server queueing system, which can be used to analyze real-world problems. In our work, we extend the analysis to multi-server queueing systems.

A framework for the representation, modeling, and analysis of the flexible queueing system is provided in [17]. The models are generic and can be used to analyze the flexible queueing system in a variety of applications. The model in this paper is useful for us in analyzing the performance of the different queueing system.

An example of flexible queueing system modeling is provided in [24]. This paper mainly focuses on the effect of different service mechanisms. To implement different service mechanisms, the authors consider two kinds of servers: parallel servers and cooperating servers. Different service mechanisms can be applied to these two kinds of servers while keeping the flexibility the same.

The authors of [25] analyze how to dynamically change the queueing structure to minimize the waiting time. Similar to our work, they also study a general class of queueing systems with multiple job types and a flexible service facility. This work inspires us to study the effect of priority in the queueing system.

2.4.2 Simulation

Building a simulation system is widely used for analyzing the queueing performance. An approach for analyzing queueing system performance with discrete event simulation is provided in [26]. They build a fuzzy discrete event simulation (FDES) framework, which can calculate the queue performance metrics such as average queue length and waiting time. This paper provides a good example for us to study how to calculate queue performance metrics. In our project, we also need to calculate the queue performance metrics such as the average waiting time and server utilization.

An example of simulating and analyzing a single-server queueing system is given in [5]. The authors implement the simulation by modeling the queue with cyclic service. This is a single-server multi-queue queueing model, in which the server visits the queues in cyclic order. In our project, we also need to build a single-server queue model; the difference is that it does not use cyclic service. This paper provides us with a typical example of a single-server queueing model. From it, we learn how to analyze the waiting time and server utilization in a single-server queueing system.

Flexible queueing systems are simulated in [27]. In this paper, the authors analyze the effect of flexibility between the queues and the servers. Although our project mainly focuses on the queueing system with full flexibility, this work still inspires us to simulate a flexible queueing system to optimize the system resources.

3 Simulation framework

In this study, we build a simulation framework to simulate different queueing systems in different conditions. This section presents the functions that we want to achieve in the simulation framework and gives an overview of the simulation framework. Then, we introduce the details of the components in the simulator.

3.1 Capabilities

In this thesis, we want to build a simulator with the following capabilities:

1) Queueing system replay

The database contains the timestamps and other features of jobs recorded from the real system. Queueing system replay is the process of finding the topology of the source system and reproducing it using the data in the database, which means that the number of queues and servers, and the connections between queues and servers, will be the same as in the real system.

The results of queueing system replay contain the topology of the queueing system and the measurements of queueing system performance, such as the mean waiting time and server utilization. By comparing the results of system replay with the historical data, we can check the correctness of the simulator.

2) Queueing system simulation based on the estimated distribution

Unlike system replay, this type of simulation uses four different kinds of continuous distributions (exponential, gamma, lognormal, and Weibull) to estimate the interarrival time and service time.

The estimated distribution of interarrival time and service time will be the input of the simulator, while the measurements of the queueing system, such as the mean waiting time and server utilization, will be the outputs of the simulator. To compare the results of queueing system simulation and replay, the topology of the queueing system will be the same as the source system.
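A minimal sketch of how sampled distributions could feed such a simulation is given below, using only Python's standard `random` module. The function names, the parameter names, and the numeric parameter values are placeholders chosen for illustration; they are not the distributions fitted in the thesis.

```python
import random


def make_sampler(name, rng, **params):
    """Return a zero-argument function that draws one sample."""
    if name == "exponential":
        return lambda: rng.expovariate(params["rate"])
    if name == "gamma":
        return lambda: rng.gammavariate(params["shape"], params["scale"])
    if name == "lognormal":
        return lambda: rng.lognormvariate(params["mu"], params["sigma"])
    if name == "weibull":
        # random.weibullvariate takes (scale, shape) in that order.
        return lambda: rng.weibullvariate(params["scale"], params["shape"])
    raise ValueError(f"unknown distribution: {name}")


def generate_jobs(n, interarrival, service):
    """Build (arrival_time, service_time) pairs from two samplers."""
    jobs, clock = [], 0.0
    for _ in range(n):
        clock += interarrival()          # next arrival instant
        jobs.append((clock, service()))  # service demand of that job
    return jobs


rng = random.Random(42)  # fixed seed so that runs are repeatable
jobs = generate_jobs(
    5,
    make_sampler("exponential", rng, rate=0.5),        # placeholder parameters
    make_sampler("gamma", rng, shape=2.0, scale=1.5),  # placeholder parameters
)
```

The resulting list of arrival and service times plays the same role as the cleaned historical data in system replay, so both kinds of input could be run through the same simulator.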

3) Queueing system transformation

In queueing system transformation, we generate a new fully flexible topology, in which each queue is connected to all the servers in the system, and then simulate with the new topology. The jobs in the new system will be the same as in the real system, while the number of queues and servers will change.

The purpose of the queueing system transformation is to find a possible way to optimize the resources of the system by decreasing the number of servers at the cost of increased system complexity.
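Generating a fully flexible topology amounts to connecting every queue to every server. The sketch below shows this for a chosen number of servers; the function name, the server naming scheme, and the dictionary representation are assumptions for illustration only.

```python
def fully_flexible_topology(queues, num_servers):
    """Connect every queue to each of `num_servers` servers."""
    servers = [f"server{k}" for k in range(1, num_servers + 1)]
    return {queue: set(servers) for queue in queues}


# Hypothetical example: three queues sharing a reduced pool of two servers.
topology = fully_flexible_topology(["queue1", "queue2", "queue3"], num_servers=2)
```

Varying `num_servers` while keeping the job set fixed is one way to explore how far the server pool can shrink before waiting times grow.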

3.2 Overview of the simulation framework

An overview of the simulation framework can be seen in Figure 3-1. The simulation framework consists of two main parts: the Data Preparation Module (DPM) and the simulator. The DPM is used for data preprocessing, as the historical data in the database cannot be used directly for simulation. The simulator is the core part of the simulation framework, which simulates a queueing system with the data provided by the DPM.

Figure 3-1: The architecture of the simulation framework

3.3 Data Preparation Module

Data Preparation Module is used to capture and clean the historical data from the MySQL database provided by Ericsson. This database consists of six tables with a total of 10830603 records collected during six months.

The database records detailed information about the jobs. But not all of this information is needed in the simulation, so we need to capture the useful information. Moreover, some parameters required by the simulation, such as the interarrival time of jobs, cannot be obtained directly from the database; this is why we need data clean.

In our case, the Data Preparation Module mainly consists of two parts: data selection and data clean. In the following subsection, we will introduce these two parts in detail.

3.3.1 Data selection

The historical data in the database contains a large number of features for each stored object. Only a part of these is necessary for simulation of the queueing system. The objective of the data selection process is to prepare a database containing the necessary data only. Table 3.1 shows an example of the features that are kept after data selection.

In the table, ‘ID’ is the unique id of each object in the database, which can be used to identify the jobs. ‘Use_server’ indicates the server that should handle the job. ‘Priority’ gives the priority class of the job as a number (a higher priority is denoted by a smaller number). ‘Capability’ is another feature that distinguishes a job. The timestamp ‘Created’ records the time when the job was created, ‘Pickup’ shows the time when the job was picked up by the server, and ‘Finished’ shows the time when the job left the system. These timestamps will be used in the data clean to calculate the parameters required for simulation.

| ID | Use_server | Priority | Capability | Created | Pickup | Finished |
|---|---|---|---|---|---|---|
| 19214239 | Server1 | 8 | Capability1, Capability6 | 2018-11-01 12:14:26 | 2018-11-01 12:14:44 | 2018-11-01 12:14:54 |

3.3.2 Data clean

In data selection, we capture three timestamps from the database: ‘Created’, ‘Pickup’, and ‘Finished’. But the simulation requires the interarrival time and the service time. In queueing theory, the interarrival time is the difference between the arrival time of a job and that of the next job [4], while the service time is the time a server takes to handle a job. In this subsection, we show how to calculate the interarrival time, the service time, and the waiting time from the timestamps obtained in data selection.

1) Case 1:

Figure 3-2 presents the timeline of data clean case 1. It shows the created, pickup, and finished times of $Job_i$: $Job_i$ arrives in the system and is picked up by a server after waiting for a period.

In the real system, when a job leaves the system, the server needs a maintenance time before it begins to handle the next job. This is the reason why there is a time slot between $T_{\text{created}(i)}$ and $T_{\text{pickup}(i)}$. But in the simulation, there is no maintenance time. To solve this problem, we add the maintenance time to the service time of this job.

Figure 3-4 presents the pseudocode of the data clean algorithm. From line 8 and line 9 in Figure 3-4, we know that we can calculate the waiting time and the service time of $Job_i$ with the following equations:

$$T_{\text{waiting}(i)} = 0$$

$$T_{\text{service}(i)} = T_{\text{finished}(i)} - T_{\text{created}(i)}$$

Figure 3-2: Timeline of data clean case 1

2) Case 2:

It can be seen from Figure 3-3 that when $Job_{i-2}$ leaves the system, $Job_i$ has already been waiting in the queue. But according to the timeline, $Job_i$ is not picked up by the server; the server chooses to handle $Job_{i-1}$ instead of $Job_i$. This could happen because the jobs in the system have different priorities. If $Job_{i-1}$ has a higher priority, it will be handled before $Job_i$.

In this case, the maintenance time is not the time slot between $T_{\text{created}(i)}$ and $T_{\text{pickup}(i)}$, because in this time slot $Job_{i-1}$ is executed by the system. The maintenance time is the time slot between $T_{\text{finished}(i-1)}$ and $T_{\text{pickup}(i)}$ in this case.

From line 12 and line 13 in Figure 3-4, we know that the waiting time and service time can be calculated with the following equations:

$$T_{\text{waiting}(i)} = T_{\text{finished}(i-1)} - T_{\text{created}(i)}$$

$$T_{\text{service}(i)} = T_{\text{finished}(i)} - T_{\text{finished}(i-1)}$$

Figure 3-3: Timeline of data clean case 2


Figure 3-4: Data clean algorithm
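The two cases can be sketched in Python as follows. This is not the pseudocode of Figure 3-4: it is a minimal reconstruction from the case 1 and case 2 equations, assuming that the jobs of one server are processed in pickup order and that case 2 applies whenever a job was created before the previous job on that server finished. The function names and sample timestamps are hypothetical.

```python
from datetime import datetime


def interarrival_times(created):
    """Seconds between consecutive arrival (creation) times."""
    ordered = sorted(created)
    return [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]


def waiting_and_service(jobs):
    """Return (waiting, service) seconds for each job of one server.

    `jobs` is a list of (created, pickup, finished) datetimes sorted by
    pickup time. The pickup timestamp only fixes the processing order.
    """
    cleaned = []
    prev_finished = None
    for created, _pickup, finished in jobs:
        if prev_finished is None or created >= prev_finished:
            # Case 1: no waiting; maintenance is folded into service.
            waiting = 0.0
            service = (finished - created).total_seconds()
        else:
            # Case 2: the job waits until the previous job has finished.
            waiting = (prev_finished - created).total_seconds()
            service = (finished - prev_finished).total_seconds()
        cleaned.append((waiting, service))
        prev_finished = finished
    return cleaned


def at(second):
    """Helper for sample timestamps within one illustrative minute."""
    return datetime(2018, 11, 1, 12, 14, second)


sample = [
    (at(0), at(2), at(10)),    # case 1
    (at(5), at(12), at(20)),   # case 2: created before the first job finished
    (at(30), at(31), at(40)),  # case 1
]
cleaned = waiting_and_service(sample)
gaps = interarrival_times([created for created, _, _ in sample])
```

In both branches the waiting time and the service time add up to the interval between creation and finish, so the total time a job spends in the system is preserved by the cleaning step.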

3.4 Queueing system simulator

In our study, the simulator is based on discrete event simulation. This means that the simulation framework is a discrete-state, event-driven system, in which state changes depend entirely on the occurrence of discrete events over time [18].

Figure 3.9 presents the execution flow of the simulator. To figure out how the simulator works, we need to introduce the events and the selection algorithm we used in the system. In the simulation, Event Arrival is generated when a job arrives at the system while Event Departure is generated when a job leaves the system.

Four selection algorithms are designed for event, job, queue, and server. In the following paragraphs, we will introduce the selection algorithms in detail.

1) Event selection algorithm

In the simulator, we maintain an event queue sorted by the time at which each event occurs. As shown in Figure 3-5, the event selection algorithm returns an event from a list of events. In our program, it returns the first event in the list, so that the event that occurs first is handled first, and then removes the selected event from the list.

Figure 3-5: Event selection algorithm
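A time-ordered event queue of this kind can be sketched with Python's `heapq` module. Note that a binary heap is used here in place of the sorted list described in the text; the observable behaviour (earliest event out first, then removed) is the same. The function names and the tie-breaking counter are assumptions for this example.

```python
import heapq
import itertools

# Tie-breaker so that events with equal times never compare their payloads.
_counter = itertools.count()


def schedule(event_queue, time, kind, job_id):
    """Insert an event while keeping the queue ordered by occurrence time."""
    heapq.heappush(event_queue, (time, next(_counter), kind, job_id))


def select_event(event_queue):
    """Remove and return the earliest event as (time, kind, job_id)."""
    time, _, kind, job_id = heapq.heappop(event_queue)
    return time, kind, job_id


event_queue = []
schedule(event_queue, 4.0, "DEPARTURE", 1)
schedule(event_queue, 1.0, "ARRIVAL", 1)
schedule(event_queue, 2.5, "ARRIVAL", 2)
```

Successive calls to `select_event(event_queue)` then yield the events in increasing time order, regardless of the order in which they were scheduled.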

2) Job selection algorithm

The job selection algorithm is used to select the next job to be executed from a queue. All the jobs in a queue have the same priority, but their arrival times differ. In the simulation, we assume that the jobs in a queue follow the First-In, First-Out (FIFO) rule. As shown in Figure 3-6, the job selection algorithm returns the first-arrived job and then removes this job from the queue.

Figure 3-6: Job selection algorithm
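FIFO job selection maps directly onto a double-ended queue. The sketch below assumes that jobs are appended in arrival order, so the leftmost element is always the first arrival; `select_job` is a hypothetical helper name.

```python
from collections import deque


def select_job(queue):
    """Remove and return the first-arrived job, or None if the queue is empty."""
    return queue.popleft() if queue else None


queue = deque()
for job_id in (101, 102, 103):  # jobs are appended in arrival order
    queue.append(job_id)
```

Using `deque.popleft()` keeps removal from the head of the queue at constant cost, which matters when queues grow long.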

3) Queue selection algorithm

In the simulation, we can generate two kinds of queues: priority queues and non-priority queues. Figure 3.7 presents the pseudocode of the queue selection algorithm.

For priority queues, the highest priority is denoted by the smallest number. When we need to select a queue from the list of queues, we find all non-empty queues and return the one with the highest priority.

For non-priority queues, we use the arrival time of the first-arriving job in each queue as the timestamp of the queue. According to the FIFO rule, the queue with the smallest timestamp is selected.

Figure 3-7: Queue selection algorithm
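The two queue selection rules can be sketched as below. The dictionary layout of a queue and the flag `use_priority` are assumptions made for this illustration; how ties between queues are broken is not stated in the text, so the sketch simply keeps the first candidate.

```python
def select_queue(queues, use_priority):
    """Pick the next queue to serve, or None if every queue is empty.

    Each queue is a dict with a 'priority' number (smaller = higher priority)
    and a 'jobs' list holding arrival times in FIFO order.
    """
    non_empty = [q for q in queues if q["jobs"]]
    if not non_empty:
        return None
    if use_priority:
        # Priority queues: smallest priority number wins.
        return min(non_empty, key=lambda q: q["priority"])
    # Non-priority queues: the queue whose head job arrived first wins.
    return min(non_empty, key=lambda q: q["jobs"][0])

queues = [
    {"name": "q1", "priority": 8, "jobs": [7.0, 9.0]},
    {"name": "q2", "priority": 10, "jobs": [3.0]},
    {"name": "q3", "priority": 1, "jobs": []},
]
by_priority = select_queue(queues, use_priority=True)
by_arrival = select_queue(queues, use_priority=False)
```

In this example the empty queue q3 is skipped even though it has the highest priority, so the priority rule returns q1 while the timestamp rule returns q2.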

4) Server selection algorithm

In the system, a job can be allocated to any of several servers if its queue is connected to several servers. The server selection algorithm selects a server among the connected servers. To balance the load across servers, we adopt a random selection algorithm. As shown in Figure 3.8, an idle server is picked at random from the server pool.

Figure 3-8: Server selection algorithm
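A minimal sketch of random server selection, assuming the server states are kept in a dictionary keyed by server name; the function name and the optional `rng` parameter (added here to make runs reproducible) are hypothetical.

```python
import random

def select_server(connected_servers, state, rng=random):
    """Return a random idle server among those connected to the queue, or None."""
    idle = [s for s in connected_servers if state[s] == "IDLE"]
    return rng.choice(idle) if idle else None

state = {"server1": "BUSY", "server2": "IDLE", "server3": "IDLE"}
chosen = select_server(["server1", "server2", "server3"], state, random.Random(0))
none_idle = select_server(["server1"], state)
```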


Figure 3-9: Execution flow of the simulator
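To illustrate how arrival and departure events drive the simulation, the following sketch replays jobs on one FIFO queue and one server. It is a simplified illustration under stated assumptions, not the flow of Figure 3-9 itself: jobs are given as (arrival time, service time) pairs, and the function name `simulate` and the event tuple layout are hypothetical.

```python
import heapq
from collections import deque

def simulate(jobs):
    """Replay (arrival_time, service_time) pairs on one FIFO queue and one server.

    Returns the waiting time of each job in the order in which service starts.
    """
    # Event queue ordered by time; the counter breaks ties between equal times.
    events = [(arrival, i, "arrival", service)
              for i, (arrival, service) in enumerate(jobs)]
    heapq.heapify(events)
    counter = len(events)
    waiting = deque()   # jobs waiting for the server: (arrival, service)
    busy = False
    waits = []

    def start(now, arrival, service):
        """Put a job on the server and schedule its departure event."""
        nonlocal busy, counter
        busy = True
        waits.append(now - arrival)
        heapq.heappush(events, (now + service, counter, "departure", None))
        counter += 1

    while events:
        now, _, kind, service = heapq.heappop(events)
        if kind == "arrival":
            if busy:
                waiting.append((now, service))
            else:
                start(now, now, service)
        else:  # departure: the server becomes idle and takes the next job
            busy = False
            if waiting:
                arrival, next_service = waiting.popleft()
                start(now, arrival, next_service)
    return waits

waits = simulate([(0.0, 5.0), (1.0, 2.0), (10.0, 1.0)])
```

In the example, the second job arrives while the server is busy and waits until the first job departs; the third job finds the server idle and starts immediately.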


4 Implementation

In this section, we will introduce how we implement the simulation framework. Section 4.1 presents how we define the entities of the system, which is the foundation of the simulator. Section 4.2 introduces how to generate the topology of the queueing system, which is of great importance in simulating different queueing systems.

4.1 Entity generation

The basic elements that define a queueing system are the job, the queue, and the server. Figure 4.1 shows how we define these three entities in the program.

The id of a job is unique and is used to identify and search for the job among the job population. ‘arrivalTime’ is the time when the job arrives at the system, while ‘serverTime’ is the time a server needs to handle the job. ‘priority’ is the priority class of the job, which is denoted by a number. ‘useServer’ indicates a requirement that the job be handled by a specific server. In our simulator, a queue is a list of jobs waiting to be served and is defined by three attributes: ‘priority’, ‘capability’, and ‘useServer’. The method ‘addJob’ places a job in a queue, while the method ‘removeJob’ removes a job from a queue.

A server has two states: BUSY and IDLE. The state BUSY means that the server is occupied by a job, while the state IDLE means that the server is ready to be used. In the simulation, we assume that all the servers that can serve a job are identical, which means that the service time of a job does not depend on which server handles it. To distinguish the servers, each server has a unique server name.


Figure 4-1: UML class diagram of job, queue, and server
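The three entities can be sketched as Python data classes. The attribute and method names follow the text; the field types, the default values, and the internal `jobs` list are assumptions, since the class diagram only names the attributes.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional

class ServerState(Enum):
    IDLE = "IDLE"
    BUSY = "BUSY"

@dataclass
class Job:
    id: str
    arrivalTime: float
    serverTime: float                  # time a server needs to handle the job
    priority: int
    useServer: Optional[str] = None    # required server, if any (assumed type)

@dataclass
class Queue:
    priority: int
    capability: str                    # assumed to be a plain string here
    useServer: Optional[str] = None
    jobs: List[Job] = field(default_factory=list)

    def addJob(self, job: Job) -> None:
        """Place a job at the end of the queue."""
        self.jobs.append(job)

    def removeJob(self, job: Job) -> None:
        """Remove a job from the queue."""
        self.jobs.remove(job)

@dataclass
class Server:
    name: str                          # unique server name
    state: ServerState = ServerState.IDLE

queue = Queue(priority=8, capability="default")
job = Job(id="job-1", arrivalTime=0.0, serverTime=5.0, priority=8)
queue.addJob(job)
server = Server(name="server1")
```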

As mentioned before, the simulator is based on discrete-event simulation. Events and the event queue are essential for building a discrete-event simulator. Figure 4.2 shows how the event and the event queue are defined in the program. There are two types of events: the arrival event and the departure event. In a discrete-event simulation system, the state of the system does not change until a new event occurs [18]. The arrival event occurs when a new job arrives at the system, while the departure event occurs when a job leaves the system.

Similar to the queue, the event queue is a list of events waiting to be handled. In the event queue, the events are ordered by the time at which they occur. During the simulation, new events are added to the event queue, and events that have been handled are removed from it.


Figure 4-2: UML class diagram of event and event queue

4.2 Topology generation

In the database, the records are related to the jobs, and it is hard to figure out the structure of the real queueing system directly. Topology generation provides a way to visualize the structure of the system, including the queues and the servers in the system and how the queues are connected to the servers. Figure 4.3 presents the pseudocode of the topology generation algorithm. This algorithm returns a map that contains the connections between the queues and the servers in the system. The variable ‘server state’ is recorded in the database and shows which queue the job was allocated to in the real system.

Figure 4-3: The topology generation algorithm

With the result of the topology generation algorithm, we can visualize the topology of the real system. Figure 4.4 shows an example of a queueing system. This system consists of three queues and two servers. We can see that queue1 is connected to server1, queue3 is connected to server2, and queue2 is connected to both server1 and server2.

Figure 4-4: Example of a flexible queueing system with limited flexibility
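A minimal sketch of how such a queue-to-server map could be built from job records. It assumes that each record names the queue of the job and the server that handled it; the record fields and the function name `generate_topology` are hypothetical, and the algorithm in Figure 4-3 may derive the connections differently.

```python
from collections import defaultdict

def generate_topology(records):
    """Map each queue to the set of servers that handled its jobs."""
    topology = defaultdict(set)
    for record in records:
        topology[record["queue"]].add(record["server"])
    return dict(topology)

# Records that reproduce the example system with three queues and two servers.
records = [
    {"job": 1, "queue": "queue1", "server": "server1"},
    {"job": 2, "queue": "queue2", "server": "server1"},
    {"job": 3, "queue": "queue2", "server": "server2"},
    {"job": 4, "queue": "queue3", "server": "server2"},
]
topology = generate_topology(records)
```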


5 Results and Analysis

In this section, we present the results generated by our simulation framework. First, we present the results of the data clean algorithm, which is the most important step before simulation. Comparing the data clean result with the historical data helps us select suitable test cases.

Secondly, we analyze the queueing system with a single server. A single-server system may have one or more queues. For the system with one queue and one server, we provide a way to simulate the real system with estimated distributions. For the system with several queues and one server, we focus on how the priority of the queues affects their mean waiting time.

Finally, we move to a more complex system, the queueing system with multiple servers. In this part, we present the result of system replay and system transformation by analyzing the mean waiting time and server utilization.

5.1 Result of data clean

In our study, the data clean algorithm is applied to each server. To present its result, we collect data from ten different queueing systems, each of which has a single server. The mean waiting time captured from the database is compared with the mean waiting time calculated by the data clean algorithm.

For the sake of simplicity, the term ‘mean waiting time of historical data ($TW_{historical}$)’ refers to the mean waiting time captured from historical data, while the term ‘mean waiting time of data clean ($TW_{dataclean}$)’ refers to the mean waiting time calculated by the data clean algorithm. In Section 3.3.1, we presented an example of data selection in Table 3.1. There are three timestamps in the table: ‘Created’, ‘Pickup’, and ‘Finished’. The mean waiting time of historical data is calculated from these timestamps. The details of how to calculate the mean waiting time of data clean can be found in Section 3.3.2, where we introduce the data clean algorithm.

To measure the difference between the mean waiting time of historical data and data clean algorithm, we calculate the difference rate using the following equation:

$$\mathit{DifferenceRate} = \frac{TW_{historical} - TW_{dataclean}}{TW_{historical}} \times 100\%$$

Figure 5.1 presents the difference rate for ten queueing systems. As shown in the figure, most of them are less than 25%, which means that the waiting time after data clean is reasonably close to the historical waiting time. For system 3, however, the difference rate is about 75%. This large difference rate is caused by long maintenance time. A real-world server needs maintenance time between two jobs, but in our simulator we assume that the server works continuously and the maintenance time is added to the service time. The maintenance time for each job can be calculated by the following equation:

$$T_{maintenance} = TW_{historical} - TW_{dataclean}$$

Many large maintenance times $T_{maintenance}$ therefore lead to a larger difference between $TW_{historical}$ and $TW_{dataclean}$. The difference rate is of vital importance when we select the test system: a system with a large difference rate may not be simulated accurately. For example, if the maintenance time in a system is 24 hours while the waiting time is 10 minutes, the simulation will have a large error. Before the simulation, we check the difference rates of the test systems and pick the systems with a small difference rate, so that the simulation can reproduce them well.


Figure 5-1: Difference rate for ten queueing systems
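The difference rate and the screening of test systems can be illustrated with a short sketch. The first example reads the 24-hour case from the text as a cleaned waiting time of 10 minutes plus 24 hours of maintenance, which is an interpretation; the 25% cut-off and the two sample systems are hypothetical values chosen only for illustration.

```python
def difference_rate(tw_historical, tw_dataclean):
    """Relative gap between historical and cleaned mean waiting time, in percent."""
    return (tw_historical - tw_dataclean) / tw_historical * 100.0

# Example from the text, read as: cleaned waiting time 10 min, maintenance 24 h.
tw_dataclean = 10.0
t_maintenance = 24 * 60.0
tw_historical = tw_dataclean + t_maintenance
rate = difference_rate(tw_historical, tw_dataclean)

# Hypothetical screening step: keep systems below an assumed 25 % cut-off.
# Each entry is (TW_historical, TW_dataclean) in minutes.
systems = {"A": (30.0, 27.0), "B": (1450.0, 10.0)}
selected = [name for name, (hist, clean) in systems.items()
            if difference_rate(hist, clean) < 25.0]
```

With these numbers the difference rate exceeds 99%, so such a system would be excluded from the test set.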

5.2 Analysis of the queueing system with a single server

A queueing system with a single server may have one or more queues. To cover the different cases, we begin with the queueing system consisting of one queue and one server, followed by the queueing system consisting of several queues and one server.

5.2.1 Modeling of one queue and one server system

• Results of queueing system replay

The first step in analyzing the queueing system is to evaluate the existing system by replay. In this step, the mean waiting time of data clean ($TW_{dataclean}$), which serves as the baseline, is compared with the mean waiting time of replay ($TW_{replay}$). The difference between these two results indicates the accuracy of the simulator. To measure this accuracy, we calculate the error rate with the following equation:

$$\mathit{ErrorRate} = \frac{TW_{dataclean} - TW_{replay}}{TW_{dataclean}} \times 100\%$$

As the equation shows, if $TW_{replay}$ is close to $TW_{dataclean}$, the error rate is close to 0%, which means that the simulator reproduces the existing queueing system with high accuracy.

In the experiment, nine different queueing systems, each consisting of one queue and one server, are chosen for the test. As indicated in Figure 5.2, the error rates of eight systems equal 0%, which means that the replay result matches the result of data clean. This supports the accuracy and reliability of the simulation.

For system 6, the error rate is approximately 5%, which is much higher than for the others. After analyzing the details of the jobs, we find that the job execution order may be the reason. In our simulation of a queueing system with one queue and one server, the scheduler applies FIFO, but the real system might use a more complex scheduler. A different scheduler leads to a different job order, and in some cases, such as system 6, the job order plays a decisive role in the waiting time.

Figure 5-2: Error rate for nine queueing systems

• Results of queueing system simulation based on estimated distributions

In the replay, we use the timestamps captured from the database to simulate the existing queueing system, which places high requirements on data integrity. As introduced in the previous section, several timestamps are needed to define a job, and a single missing timestamp leads to a missing job in the system. To solve this problem, we provide a more general way to simulate: using estimated distributions to simulate the real system. In our study, we assume that the system is a G/G/1 queueing system, in which both the interarrival times and the service times follow general distributions. Note that the distribution of the interarrival time and the distribution of the service time are different [28]. Four continuous probability distributions are selected to model the interarrival time and the service time: the exponential, gamma, lognormal, and Weibull distributions. As the interarrival time and the service time are considered independent, the four distributions give 4 × 4 = 16 combinations.

Before the simulation, we visualize the distributions of the interarrival time and the service time. Several independent queueing systems are selected in our experiments; here we show the result for one of them. Figure 5.3 presents the distribution of the interarrival time obtained from the data clean algorithm, which is marked in black as the baseline, together with the results generated with the four continuous distributions. Considering all four plots in Figure 5.3, especially the Q-Q (quantile-quantile) plot and the P-P (probability-probability) plot, the Weibull distribution fits the interarrival time best. According to the degree of similarity, we can order the four distributions: [1. Weibull 2. Lognormal 3. Gamma 4. Exponential]. Figure 5.4 presents the corresponding results for the service time. Similarly, we can order the four distributions: [1. Weibull 2. Gamma 3. Exponential 4. Lognormal].

Different distributions of interarrival time and service time lead to different waiting times in the simulation. In our experiments, we collect the results of all 16 possible combinations and repeat the simulation 1000 times. In Figure 5.5, the horizontal axis shows the combination used in the simulation. For example, ‘ExpExp’ refers to the simulation that uses interarrival times generated with the exponential distribution and service times generated with the exponential distribution. The red line indicates the mean waiting time of data clean. From the orders obtained above, we predict that if we use the Weibull distribution for both the interarrival time and the service time, the simulator will produce a waiting time close to the data clean result. Figure 5.5 shows that the median value of ‘WeibullWeibull’ approximately equals the data clean result, which supports the prediction. Besides, regardless of the service time distribution, the median values for interarrival times generated with the Weibull distribution are the closest to the red line. In summary, the results shown in Figures 5.3, 5.4, and 5.5 support simulation based on estimated distributions.
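A distribution-based experiment of this kind can be sketched with Lindley's recursion, a standard way to compute waiting times in a FIFO single-server queue; it is used here for brevity in place of the event-driven simulator. All distribution parameters below are hypothetical placeholders; in practice they would come from fitting the cleaned data. The names `mean_waiting_time` and `samplers` are illustrative.

```python
import random
import statistics

def mean_waiting_time(interarrival, service, n_jobs, rng):
    """Mean waiting time of a FIFO G/G/1 queue via Lindley's recursion.

    interarrival and service are callables that take rng and return one sample.
    """
    wait = 0.0          # the first job finds an empty system
    total = 0.0
    for _ in range(n_jobs):
        total += wait
        # W_{n+1} = max(0, W_n + S_n - A_{n+1})
        wait = max(0.0, wait + service(rng) - interarrival(rng))
    return total / n_jobs

# Hypothetical interarrival-time models (time unit: minutes).
samplers = {
    "Exp": lambda rng: rng.expovariate(1.0 / 10.0),        # mean 10
    "Weibull": lambda rng: rng.weibullvariate(10.0, 0.8),  # scale 10, shape 0.8
}
# Hypothetical service-time model: gamma with shape 2 and scale 3 (mean 6).
service = lambda rng: rng.gammavariate(2.0, 3.0)

rng = random.Random(42)
# Median of the mean waiting time over 100 repetitions of 500 jobs each.
results = {
    name: statistics.median(
        mean_waiting_time(sampler, service, 500, rng) for _ in range(100)
    )
    for name, sampler in samplers.items()
}
```

Comparing the medians in `results` against the mean waiting time of data clean corresponds to comparing the box plots with the red line in Figure 5.5.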


Figure 5-4: Distribution of service time


5.2.2 Modeling of several queues and one server system

As introduced in the previous section, the simulator can generate either priority queues or non-priority queues. In this section, we use four different source systems to test the effect of priority (in our queueing system, a higher priority is denoted by a smaller number).

In the experiment, we first generate priority queues and collect the mean waiting time ($TW_{priority}$) of each queue. The second step is to remove the feature ‘priority’ from the queues and collect the mean waiting time ($TW_{nopriority}$) again. The difference ($\mathit{Difference}$) between these two results shows how priority affects each queue. It is calculated with the following equation:

$$\mathit{Difference} = TW_{priority} - TW_{nopriority}$$

1. System1

System1 consists of two queues and one server; the priority is 8 for queue1 and 10 for queue2. During the test period, there are 476 jobs in total. In Figure 5.6, the horizontal axis shows the priorities, which are denoted by numbers, while the vertical axis presents the difference in mean waiting time. The black dashed line indicates a difference of zero. A point above the black line means that the mean waiting time increases after applying priority to the system, while a point below the black line means that the mean waiting time decreases.

After applying priority to the queues, the mean waiting time of the queue with higher priority (priority: 8) decreases by 2 minutes, while the mean waiting time of the queue with lower priority (priority: 10) increases by 1 minute. The mean waiting time of the whole system is 21.85 minutes for the non-priority queueing system and decreases to 21.63 minutes for the priority queueing system.


Figure 5-6: Result of system1

2. System2

System2 consists of seven different priority queues (priority: 4, 5, 6, 7, 8, 9, 10). During the test period, there are 1193 jobs in total. It can be observed in Figure 5.7 that the differences of all the queues are quite close to 0, which means that the mean waiting time of each queue stays almost unchanged after applying priority to the queues. The mean waiting time of the whole system is 14 minutes for both the priority queueing system and the non-priority queueing system.


3. System3

System3 consists of ten different priority queues (priority: 1, 2, 3, 4, 5, 6, 7, 8, 10, 11). During the test period, there are 1506 jobs in total. Figure 5.8 shows that the mean waiting time of the queue with priority 11 increases by about 26 minutes, while the mean waiting time of the queue with priority 3 decreases by 58 minutes. There are small fluctuations in the mean waiting time of priority 1 and priority 10. It is reasonable that the mean waiting time of a queue with higher priority decreases while the mean waiting time of a queue with lower priority increases. The mean waiting time of the whole queueing system is 59 minutes for both the priority queueing system and the non-priority queueing system.


4. System4

System4 consists of 12 different priority queues (priority: 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 15). During the test period, there are 1827 jobs in total. It can be observed in Figure 5.9 that the mean waiting time of the queues with priority 3, 4, 6, 7, 8, 10, and 11 decreases, while the mean waiting time of the queues with priority 5 and 12 increases. It is surprising that a queue with a relatively high priority (priority 5) waits longer. After analyzing the arrival times of all the jobs in the system, we find that the jobs with priority 3 and 5 arrive at the system in batches. The queue with priority 3 is served before the queue with priority 5, which is why the queue with priority 5 waits longer. The mean waiting time of the whole system is 119 minutes for the non-priority queueing system and decreases to 116 minutes for the priority queueing system.


5.3 Analysis of the queueing system with multiple servers

A queueing system with multiple servers is much more complex than one with a single server. In this section, we mainly focus on the result of system transformation, which provides a possible way to improve server utilization by changing the structure of the queueing system.

5.3.1 Result of system transformation

In our study, we find that the utilization of some servers is lower than 20%. In this section, we try to find a possible way to improve server utilization. The server utilization is calculated by the following equation:

$$\mathit{ServerUtilization} = \frac{\mathit{ServiceTime}}{\mathit{TestPeriod}} \times 100\%$$



In system transformation, we vary the number of servers in the simulation and record the server utilization and mean waiting time of each test. In the experiments, we apply system transformation to different systems. In scenario 1, we choose a system consisting of 40 queues and six servers. In scenario 2, we choose three separate single-server systems.

5.3.1.1 Scenario 1

The first step of system transformation is to figure out the structure of the real queueing system. In our experiment, we choose a queueing system (the original system) that consists of 40 queues and 6 servers. The second step is to redefine the queues in the system: as introduced before, a queue is defined by ‘priority’, ‘use_server’, and ‘capability’. To simplify the queueing system, we redefine the queues using only ‘priority’ (the total number of jobs in the system does not change). In our experiment, the 40 queues are redefined as 5 different queues with priority 8, 9, 10, 20, and 100. The last step is to generate a new queueing system with full flexibility, which means that every queue is connected to all the servers in the system.

In the simulation, we change the number of servers in the system and record the mean waiting time and server utilization. In Table 5.1, the server number indicates how many servers we use in the simulation. The mean waiting time decreases rapidly as the number of servers in the system increases. In the original system, there are six servers, and the mean waiting time of the cleaned data is 13.89 minutes. For the simulated system with four servers, the mean waiting time is 9.0 minutes, which is 35% lower than 13.89 minutes.

Figure 5.10 shows the server utilization of systems with different server numbers. The horizontal axis presents the number of servers used in the system, while the vertical axis shows the detail of server utilization of each server. For the system with four servers, the mean server utilization of each server is 36.52%, while for the original system, the mean server utilization of each server is 24.85%.

Based on the comparison of waiting time and server utilization, we find that a system with four servers and full flexibility between queues and servers can have a shorter waiting time and higher server utilization than the original system. The result shows that transforming the system into a queueing system with full flexibility is a possible way to save resources.

Table 5-1: Mean waiting time of the system with different number of servers

Server number          Mean waiting time (min)   Mean server utilization
1                      92781.5                   99.8%
2                      689.9                     74.55%
3                      62.7                      49.7%
4                      9.0                       36.25%
5                      1.5                       29.82%
6                      0.4                       24.85%
6 (original system)    13.89                     24.85%

Figure 5-10: Server utilization for systems with different server number
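The effect of the server number in a fully flexible system can be sketched as follows. This is a simplified illustration, not the thesis simulator: it serves all jobs in one FIFO order and ignores the priority classes, and it assigns each job to the server that becomes free first, which gives the same waiting times as a random choice among idle servers when the servers are identical. The function name `simulate_pool`, the job list, and the test period are hypothetical.

```python
import heapq

def simulate_pool(jobs, n_servers, test_period):
    """FIFO service of (arrival, service) jobs by n identical, fully shared servers.

    Returns (mean waiting time, mean server utilization as a fraction).
    """
    free_at = [0.0] * n_servers        # time at which each server becomes idle
    heapq.heapify(free_at)
    total_wait = 0.0
    total_service = 0.0
    for arrival, service in sorted(jobs):
        earliest = heapq.heappop(free_at)
        start = max(arrival, earliest)  # wait only if no server is idle
        total_wait += start - arrival
        total_service += service
        heapq.heappush(free_at, start + service)
    # Utilization per server: service time divided by the test period.
    utilization = total_service / (n_servers * test_period)
    return total_wait / len(jobs), utilization

jobs = [(0.0, 4.0), (1.0, 4.0), (2.0, 4.0), (3.0, 4.0)]
one_server = simulate_pool(jobs, 1, test_period=20.0)
two_servers = simulate_pool(jobs, 2, test_period=20.0)
```

Because the total service time is fixed, the mean utilization in this sketch scales inversely with the number of servers, while the mean waiting time drops quickly once enough servers are available.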


5.3.1.2 Scenario 2

In the previous experiment, we chose a system with six servers, but in the real case there are also systems with a single server. In scenario 2, we try to combine several single-server systems into one multiple-server system. This time we choose three single-server systems as the original system and apply system transformation to them.

Table 5.2 shows the mean waiting time of the system with different numbers of servers. The mean waiting time of the simulated system with two servers is 16.89 minutes, which is 40.5% shorter than the mean waiting time of the original system (28.42 minutes).

Figure 5.11 shows the details of server utilization. For the system with two servers, the mean server utilization is 13.86%, which is larger than the mean server utilization of the original system (9.24%). The result shows that a system with two servers and full flexibility between queues and servers may have a shorter waiting time and higher server utilization than three separate single-server systems.

Table 5-2: Mean waiting time of the system with different number of servers

Server number          Mean waiting time (min)   Mean server utilization
1                      527.12                    27.72%
2                      16.89                     13.86%
3                      4.25                      9.24%
3 (original system)    28.42                     9.24%


6 Limitations and future work

This chapter will present the limitations we find in the simulation framework and the future work which can improve the accuracy of the simulation.

6.1 Limitations

6.1.1 Random server selection in simulation

In the simulation, when we need to select an idle server from the server pool, we apply the random selection algorithm because we assume that all the servers are identical. In the real system, however, some jobs declare that they cannot be executed by certain servers. For example, a job with capability ‘use Server_ !=Server1’ cannot be executed by server 1. In transformation, when we generate a queueing system with full flexibility, the small difference caused by these jobs is ignored, as the number of these jobs is quite small compared to the total number of jobs in the simulation.

6.1.2 Dependent variables in the queueing system

When we simulate the queueing system with the distributions of interarrival time and service time, we assume that the system is a G/G/1 queueing system: the interarrival times and service times are independent random variables with general distribution $g(x)$ and common cumulative distribution function (CDF) $G(x)$ [4]. In the real case, however, the interarrival times are sometimes not independent. In the database, we find that jobs sometimes arrive in bursts. For example, in one case most of the jobs follow a fixed pattern: they arrive within one hour, and no other jobs arrive during the rest of that day. In this case, the interarrival times cannot be represented by independent random variables. For such cases, the simulation results based on the distributions are not reliable, but the result of system replay still reflects the performance of the real system.


6.2 Future work

6.2.1 Extension in simulation based on the estimated distribution

Queueing system simulation based on the estimated distribution is applied only to the single-server system. For a complex queueing system, such as a multiple-server queueing system, it leads to a large error in waiting time.

The coarse-grained estimation algorithm can be one of the reasons. At present we use four continuous distributions to estimate the interarrival time and the service time, but this is not enough, as there might be a large difference between the real distribution and the estimated distribution. In the future, we should design a fine-grained fitting algorithm to estimate the distributions of interarrival time and service time, which can reduce the difference between the real distribution and the estimated distribution, and then apply the fitting algorithm to more complex queueing systems such as the multiple-server system.

Another reason is the scale of the source database. We find that in some queueing systems, most of the queues contain fewer than ten jobs. It is not feasible to estimate a distribution from ten samples. In this case, a possible solution is to combine the estimated distribution and the real distribution in the simulation. For example, we divide the queues into two groups according to the number of jobs in the queue: the estimation is applied to the queues that contain a large number of jobs, while the queues that contain only a few jobs keep the real interarrival times and service times.

6.2.2 Improvement in system transformation

As introduced before, the system transformation generates a new queueing system with full flexibility. The result shows that it is possible to replace the source system with a queueing system that has full flexibility and fewer servers. In the future, we can analyze whether full flexibility is necessary. A queueing system with limited flexibility and fewer servers may be enough, which would save more resources than the system with full flexibility.


7 Conclusions

In this master's thesis, we presented a simulation framework for queueing systems that can be used to optimize shared-resource systems.

Before the simulation, we introduced the data selection algorithm and the data clean algorithm used in this project. This is of vital importance to define and parametrize the queueing system based on the data collected from the source system. In data selection, we captured the necessary objects for simulation from the database. Then in data clean, the measured data was transformed into the variables needed for the simulator.

We introduced the simulation framework and how to implement it. In the result part, we validated that the functions of the simulation framework were correctly implemented. The queueing system replay and the queueing system simulation based on the estimated distribution were applied in the simulation of the single server queueing system. The queueing system transformation was used to analyze the multi-server queueing system.

To analyze the distribution of the interarrival time and service time, we chose a single server queueing system and used four different continuous distributions to estimate the real distribution. The results showed that it is possible to use the estimated distributions to simulate the system.

We analyzed the impact of the number of servers using the queueing system transformation, which enabled us to build a queueing structure with full flexibility using different numbers of servers. The results showed that a queueing system with full flexibility and fewer servers may have a waiting time similar to that of the source system and a higher server utilization.

In summary, this thesis provides a simulation framework that can be used to optimize the resources in a computing system. The simulation results indicate a possible way to increase utilization and decrease cost without significantly degrading performance. The implemented simulation framework will be used for future studies.
