Application of Machine Learning techniques to Optimization algorithms

GEOFFREY DABERT

Degree Projects in Optimization and Systems Theory (30 ECTS credits)
Degree Programme in Engineering Physics
KTH Royal Institute of Technology, year 2017

Supervisor at Artelys: Alexandre Marié
Supervisor at KTH: Xiaoming Hu
Examiner at KTH: Xiaoming Hu

TRITA-MAT-E 2017:20
ISRN-KTH/MAT/E--17/20--SE

Royal Institute of Technology
School of Engineering Sciences
Stockholm, Sweden 2017

Abstract

Optimization problems were immune to any attempt at combination with machine learning until a decade ago, but this is now an active research field. This thesis has studied the potential implementation of a machine learning heuristic to improve the resolution of optimization scheduling problems based on a Constraint Programming solver. Some scheduling problems, known as NP-hard problems, suffer from large computational cost (a large number of jobs to schedule) and consequent human effort (well-suited heuristics need to be derived).

Moreover, industrial scheduling problems obviously evolve over time, but many features and the basic structure remain the same. Hence they have potential for the implementation of a supervised-learning-based heuristic.

The first part of the study was to model a given benchmark of instances and implement some famous heuristics (such as earliest due date, combined with largest duration) in order to solve the benchmark. Based on the non-optimality of the returned solutions, primary instances were chosen on which to implement our method.

The second part presents the procedure set up to design a supervised-learning-based heuristic. An instance generator was first built to map the potential industrial evolutions of the instances. It returned secondary instances constituting the learning database. Then a CP-well-suited node extraction scheme was set up to collect relevant information from the resolution of the search tree: it collects data from nodes of the search tree given a proper criterion. These nodes are next projected onto a constant-dimensional space which describes the system, the underlying subtree and the impact of the assignments. Upon these features and designed target values, statistical models are implemented. A linear regression and a gradient boosting regression were implemented, calibrated and tuned on the data. The last step was to integrate the supervised-learning model into a heuristic framework. This was done via a soft propagation module which tries the instantiation of all the children of the considered node and applies the given model to them. The selection decision rule was based on a reconstructed score. The third part was to test the implemented procedure: new secondary instances were generated and the supervised-learning-based heuristic was tested against the earliest-due-date one.

The procedure was tested on two different instances. The integrated heuristic returned positive results for both instances. For the first one (10 jobs to schedule), gains in the first solution found and in the number of backtracks of 18% and 13% respectively were realized. For the second instance (90 jobs to schedule), a gain in the first solution found of at least 16% was obtained. These results validate the implemented procedure and the methodology used.


Sammanfattning

Optimization problems were immune to any attempt at combination with machine learning until a decade ago, but this is now an active research field. This thesis has studied the potential implementation of a machine learning heuristic to improve the resolution of optimization scheduling problems based on a constraint programming solver. Some scheduling problems, known as NP-hard problems, suffer from large computational costs (a large number of jobs to schedule) and hence human effort (well-suited heuristics must be derived).

Moreover, industrial scheduling problems obviously evolve over time, but many of the features and the basic structure remain the same. They therefore have potential for the implementation of a supervised-learning-based heuristic.

The first part of the study was to model a given benchmark and implement some famous heuristics (such as earliest due date, combined with the largest duration) in order to solve the benchmark. Based on the non-optimality of the returned solutions, primary instances were chosen on which to implement our method.

The second part presents the procedure set up to design a supervised-learning-based heuristic. An instance generator was first built to map potential industrial evolutions of the instances. It returned secondary instances constituting the learning database. Then a CP-suited node extraction scheme was set up to collect relevant information from the search tree, gathering data from the nodes of the search tree according to a proper criterion. These nodes are then projected onto a constant-dimensional subspace describing the system, the underlying subtree and the impact of the assignments. On these features and designed target values, statistical models are implemented. A linear regression and a gradient boosting regression were implemented, calibrated and tuned on the data. The last step was to integrate the supervised learning model into a heuristic framework. This was done via a soft propagation module that tries the instantiation of all children of the considered node and applies the given model to them. The selection decision rule was based on a reconstructed score.

The third part was to test the implemented procedure. New secondary instances were generated and the supervised-learning-based heuristic was tested against the earliest-due-date heuristic. The procedure was tested on two different instances. The integrated heuristic gave positive results for both. For the first (10 jobs to schedule), a gain of 18% in the first solution found (resp. 13% in the number of backtracks) was achieved. For the second instance (90 jobs to schedule), a gain in the first solution found of at least 16% was obtained. The results validate the implemented procedure and the methodology used.


Acknowledgements

I would first like to thank my thesis advisor Alexandre Marié, Project Manager at Artelys. He consistently allowed this paper to be my own work, but steered me in the right direction whenever he thought I needed it.

I would also like to thank the experts who were involved in the validation survey for this research project: Sylvain Mouret, Product and Project Manager at Artelys, who was always available whenever I ran into a trouble spot or had a question about my research or writing; Benjamin Horvilleur, Optimization Engineer at Artelys; and obviously Arnaud Renaud, CEO at Artelys, without whom none of this would have been possible. Without their passionate participation and input, the validation survey could not have been successfully conducted.

Finally, I must express my very profound gratitude to my parents, my family and all my friends for providing me with unfailing support and continuous encouragement throughout my years of study and through the process of researching and writing this thesis. This accomplishment would not have been possible without them.

Thank you.

Geoffrey Dabert


Contents

1 Introduction
  1.1 Problem overview
  1.2 Approach
2 Background
  2.1 Constraint Programming
    2.1.1 Basic principles
    2.1.2 Some features
  2.2 Scheduling problems
    2.2.1 Definition and heuristics rules
    2.2.2 Solution candidates and their representation for the single-machine case
    2.2.3 State of the art of hybrid techniques of resolution
  2.3 Statistical models
    2.3.1 Linear regression
    2.3.2 Gradient boosting
3 Scheduling instances - Benchmark, modelling and results
  3.1 Definition and classification of the scheduling problems
    3.1.1 Master model
    3.1.2 Benchmarking protocol
  3.2 Modelling of the basic instance
  3.3 Results of some instances
    3.3.1 NCOS 02a scheduling problem
    3.3.2 NCGS 21a scheduling problem
    3.3.3 NCOS 03b scheduling problem
4 Procedure
  4.1 Data collection
    4.1.1 Instance generator
    4.1.2 Extraction of node information
  4.2 Learning environment
    4.2.1 Input node space
    4.2.2 Score and target value of the learning module
  4.3 Weighting of the observations
  4.4 Heuristic integration
5 Results
  5.1 Statistical results
    5.1.1 Features analysis
    5.1.2 Linear regression model
    5.1.3 Gradient boosting method implementation
  5.2 Heuristic results
    5.2.1 Instance NCOS 03b
    5.2.2 Instance NCOS 41b
    5.2.3 Conclusion
6 Conclusions and Future Work
  6.1 Summary
  6.2 General conclusions and recommendations
  6.3 Future studies


List of Figures

2.1 Example of constraint propagation for finite domain variables
2.2 Example of a search tree for a scheduling problem. Green node is a solution. Each blue line is a branching between two nodes.
2.3 Candidate solution representation in the search tree
2.4 Example of different trade-offs between loss and regularization terms
2.5 Example of a tree ensemble model for two decision variables
3.1 Gantt chart of the optimal schedule for instance NCOS 02a. Black lines represent due dates of each task.
3.2 Search tree of the resolution of instance NCOS 02a using the EDD heuristic.
3.3 Gantt chart of the optimal schedule for instance NCGS 21a.
3.4 Gantt chart of the optimal schedule for instance NCOS 03b. Black lines represent due dates of each task.
4.1 Scheme of the entire procedure
4.2 Scheme of the instance generator function
4.3 Map of the secondary instances in the instance-features space
4.4 Example of an optimal child-father relation and one where it cannot be set.
4.5 Left: the node-candidate solution space associated to permutation σ, i.e. the decision tree where a σ-heuristic has been implemented. Right: the search tree, illustrating the goal of the heuristic: return a score that evaluates the quality (in terms of reachable objective function values).
4.6 4-depth search tree where the extraction scheme is depicted. Green nodes are the father-optimal ones.
4.7 Histogram and cumulative distribution of the objective function value for all secondary instances.
4.8 Scheme of the ML-based heuristic variable selector
5.1 Position of the results in the general procedure scheme
5.2 Plots of the objective function value as a function of the secondary instance concerned.
5.3 Cumulative distribution of the max degree descriptor.
5.4 Plot of the min res avail duration descriptor.
5.5 Evolution of the model selection criterion C_p as a function of the number of features considered. Right is a zoom of the left one.
5.6 Upper left: residual errors versus fitted values. Lower left: standard Q-Q plot. Upper right: scale-location plot, showing the square root of the standardized residuals as a function of the fitted values. Lower right: leverage of each point, a measure of its importance in determining the regression result, with contour lines for Cook's distance (another measure of the importance of each observation to the regression) superimposed.
5.7 Predicted values vs observed values for the scheduling problem NCOS 03b.
5.8 Predicted values vs observed values for the scheduling problem NCOS 41b.
5.9 Importance gain for the gradient boosting method for the instance NCOS 03b.
5.10 Predicted values vs observed values for the scheduling problem NCOS 03b.
5.11 Importance gain for the gradient boosting method for the instance NCOS 03b.

List of Tables

3.1 Equivalence between conventional notation and the benchmark
3.2 CPU time limit as a function of the size of the instance
3.3 Notations for the different sets of variables
3.4 Search tree resolution features for instance NCOS 02a
3.5 Search tree resolution features for instance NCGS 21a
3.6 Search tree resolution features for instance NCOS 03b
4.1 Descriptors of the task information category
5.1 Statistical criteria results for the fitting of the objective value for instance NCOS 03b.
5.2 Statistical criteria results for the fitting of the objective value for instance NCOS 41b.
5.3 Statistical criteria results for the fitting of the objective value for instance NCOS 03b.
5.4 Statistical criteria results for the fitting of the objective value for instance NCOS 41b.
5.5 Statistical criteria results for both instances and both models.
5.6 Plots of the solutions (first and best) returned by the EDD heuristic vs the ML-based heuristics. Left: the linear-regression-based heuristic; middle: the gradient-boosting-based heuristic; right: the lower bound of the objective function heuristic. Dots represent the best solution found; triangles the first solution found. Colors represent the three scenarios from which instances are generated.
5.7 Histograms of the gain in first solution (left); CPU time (top right); nodes scanned (mid right); backtracks (bottom right). Black lines represent the zero value in gain. Left: the linear-regression-based heuristic; middle: the gradient-boosting-based heuristic; right: the lower bound of the objective function heuristic. Colors represent the three scenarios from which instances are generated.
5.8 Table of the performance criteria for the heuristics. Colors for the gain in first solution and best solution correspond to the scenario from which instances are generated.
5.9 Plots of the first solutions returned by the EDD heuristic vs the ML-based heuristics. Left: the linear-regression-based heuristic; middle: the gradient-boosting-based heuristic; right: the lower bound of the objective function heuristic. Colors represent the three scenarios from which instances are generated.
5.10 Table of the performance criteria for the heuristics. Colors for the gain in first solution correspond to the scenario from which instances are generated.
5.11 Table of the performance criteria for the heuristics. Colors for the gain in first solution correspond to the scenario from which instances are generated.
6.1 Results of the benchmark solved with Xpress-Kalis

Chapter 1

Introduction

Serious research on scheduling and manufacturing problems began in the 1960s, and the topic has become popular in the past decades as computing power has grown and efficient parallel programming algorithms have appeared. Many studies have been done on the complexity, classification and proper techniques for solving these problems. However, due to the NP-hardness of this class of problems, most researchers have restricted their studies to simple classes of problems, or to approximation heuristics for solving them.

A scheduling or manufacturing problem deals with the allocation of resources and time slots, and integrates constraints, in order to optimize processes with respect to an objective function. The output is a schedule which satisfies the feasibility constraints and the optimization objectives. It is part of the discrete optimization area. Common generic solution approaches are Mixed-Integer Programming (MIP) and Constraint Programming (CP). In the latter a problem is represented by means of variables, domains, constraints and an objective function. This technique, in an oversimplified vision, is based on the enumeration of all solutions in order to return optimality. It is effective in solving decision problems, but performs poorly on problems with large feasible spaces or many variables.

Many existing CP-based scheduling techniques implement hand-crafted heuristics, based on instance features, to guide the exploration of the search tree in the resolution of the CP model. However their performance is tightly tied to the class of the scheduling problem, and only an expert in both scheduling and constraint programming would know which one to select. Moreover, since CP explores a huge feasible space, the computational cost is heavy.

In the industrial framework of daily optimization problem resolution, one could think about re-using the previous problems to learn information, structure and common features, in order to improve the next resolution. This thesis focuses on the representation of the solution of a CP model (only scheduling problems are considered), and presents an implementation of a heuristic based on a learning mechanism. From resolutions of scheduling problems, the learning mechanism will have learned how to solve a new problem in an optimal way, without any expert input or experimentation. The goal is for this supervised-learning-based heuristic to substitute many hand-crafted heuristics and save both computational cost, by improving the resolution, and effort spent looking for a well-suited heuristic.

1.1 Problem overview

This thesis examines the resolution of scheduling problems as CP optimization problems. The details about the instances and the modeling will be explained later, but for now let us consider the problem as a CP model with variables, domains, constraints and an objective value.

CP is an optimization technique whose principle is to instantiate a variable to a value in its domain, propagate the effect of this instantiation on other variables and study the resulting domains. If a failure is encountered, the solver backtracks and cancels the instantiation; otherwise it continues by instantiating another variable. An objective function is designed in the optimization problems, and a bound constraint is created any time a new solution is found, so that non-improving assignments are not considered. Strengths of CP are its ability to explore a search space of non-complete assignments and the guarantee that any solution found satisfies the problem constraints.

Moreover, CP makes use of heuristic criteria, for example for choosing the variable/value assignment to extend a partial solution. As stated before, their performance depends on the class of the instances and they need to be tuned by an expert. A learning mechanism will be set up in order to avoid this human effort, cut computational cost and capture, during the resolution search, information on the best decision to take in order to return the optimal (or nearly optimal) solution in the end.

First of all, a procedure will be implemented to generate scheduling problems representing the potential evolutions of a problem over the days and months.

Secondly, an effort will be put into solving these problems and retrieving information which can be of interest for solving future scheduling problems. Since learning mechanisms are subject to the curse of dimensionality, a constant-dimension abstract space which can discriminate the decisions (interpreted as specific reward function values) will be set up. Once the database and learning environment are built, a statistical model will be applied on them. The last part covers the integration of the learning module into a heuristic in order to create a black-box module that can be used on other scheduling problems.

1.2 Approach

The objective of designing a generic learning heuristic for all CP problems, or even for all scheduling problems, would be too ambitious. The study focuses on a given class of scheduling problems, those with only a single machine involved. Some future work is described in order to extend the method developed in the thesis.

The necessary theoretical and technical background on Constraint Programming and statistical learning techniques is given in chapter 2. The scheduling problems treated are explained, modeled and solved in chapter 3. The main matter of this investigation, the procedure deployed involving a learning mechanism during the resolution of scheduling problems, is described in chapter 4. The proposed approach was to create several building blocks in a row in order to: create a scheduling instance generator, collect data, learn from the latter and test the heuristic on other scheduling problems. The results produced from these experiments can be found in chapter 5. Finally, we present our conclusions and mention some future directions of research in chapter 6.


Chapter 2

Background

This chapter provides a brief discussion of the essential background needed throughout the thesis. First, CP basic principles are explained. The second section introduces scheduling problems, and the final section discusses two statistical models, linear regression and gradient boosting.

2.1 Constraint Programming

"Constraint programming represents one of the closest approaches computer science has yet made to the Holy Grail of programming: the user states the problem, the computer solves it." - Eugene C. Freuder

Constraint Programming is a problem solving technique which was born in the 1960s and is becoming more widespread thanks to many successes in the effective solving of large combinatorial problems. Research results from different fields such as artificial intelligence, discrete mathematics, graph theory and operations research are involved in the core of CP. It provides a powerful tool for decision-making; it is able to search quickly through an enormous space of choices, and infer the implications of those choices. A non-exhaustive list of applications solved using CP includes: scheduling and manufacturing, timetabling, resource allocation, network configuration, transport, online business, defense and industrial production.

The strength of CP lies in its high flexibility and in the fact that the high-level semantics for stating the constraints preserves their original meaning, and so preserves the problem structure. Another optimization programming technique that should be mentioned is Mixed-Integer Programming (MIP), which is similar. Indeed, after the same modelling step, both are solved with the Branch and Bound (B&B) technique, an illustration of the divide-and-conquer principle. But the difference lies in the informative objects: in MIP, information is stated in the linear (or non-linear) relaxations and in the cutting planes, whereas in CP information lies in the constraints and their characteristics (cardinality, globality) through their propagation.

Remark: the B&B algorithm consists in an enumeration of candidate solutions by means of a tree search representation. It recursively splits the search space into smaller spaces and then optimizes the objective function on them (branching), and checks the bounds of the solutions reachable in order to eliminate candidate spaces that cannot contain the optimal solution (bounding).

2.1.1 Basic principles

In Constraint Programming, a problem is defined by means of variables and constraints. Each variable is defined by a domain, the set of possible values for this variable. A constraint represents a property that should be satisfied by a set of variables. Moreover an objective function can be defined as a function of the decision variables. The next paragraphs explain the basic principles of problem solving in CP.

Filtering and propagation mechanisms

For each subproblem there is an associated specific filtering algorithm, whose purpose is to delete, from the domains of the involved variables, values which are not valid with respect to the other domains. These algorithms are typically based on results from other areas, such as graph theory. The constraints in a CP problem are linked by a mechanism called constraint propagation. Indeed, whenever the domain of a variable is modified this can affect other constraints and may cause further domain reductions, so a re-evaluation of all constraints is activated. The propagation mechanism is illustrated in figure 2.1 (see [1] and [13]); a small sketch follows the figure.

Figure 2.1: Example of constraint propagation for finite domain variables
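To make the mechanism concrete, here is a minimal sketch of finite-domain filtering and fixpoint propagation. The LessThan class and propagate function are illustrative inventions for this example only, not the Xpress-Kalis API used later in the thesis:

```python
# Finite-domain filtering and propagation sketch (cf. figure 2.1).

class LessThan:
    """Constraint x < y over finite integer domains."""
    def __init__(self, x, y):
        self.x, self.y = x, y            # variable names
        self.scope = {x, y}

    def filter(self, domains):
        """Remove inconsistent values; return True if any domain changed."""
        dx, dy = domains[self.x], domains[self.y]
        new_dx = {v for v in dx if v < max(dy)}   # x must be below some y value
        new_dy = {v for v in dy if v > min(dx)}   # y must be above some x value
        changed = (new_dx != dx) or (new_dy != dy)
        domains[self.x], domains[self.y] = new_dx, new_dy
        return changed

def propagate(domains, constraints):
    """Re-run filtering until a fixpoint is reached (constraint propagation)."""
    queue = list(constraints)
    while queue:
        c = queue.pop()
        if c.filter(domains):
            # A domain changed: re-examine every constraint sharing a variable.
            queue.extend(c2 for c2 in constraints
                         if c2 is not c and c2.scope & c.scope)
    return all(domains.values())         # False if some domain became empty

domains = {'x': {1, 2, 3, 4}, 'y': {1, 2, 3}, 'z': {2, 3, 4}}
ok = propagate(domains, [LessThan('x', 'y'), LessThan('y', 'z')])
print(ok, domains)   # True, with x in {1,2}, y in {2,3}, z in {3,4}
```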

Solution mechanism

In order to reach a solution, the global domain is explored by successively assigning a variable to a value. Right after an assignment, the filtering and propagation mechanisms are applied, since the domains have evolved. Sometimes an assignment makes the problem infeasible. The last assignment is then suspicious: the search backtracks in the search tree, and another assignment is made, until a feasible solution is returned or the domain is entirely covered. When a feasible solution is found, the backtrack mechanism is called in order to be able to explore other parts of the search tree. A solution is denoted optimal when the search tree (or variable domain) has been entirely explored, and the optimal solution is the minimum (or maximum, w.r.t. the considered problem) over all feasible solutions.

The resolution of a CP problem is represented by a search tree. It represents in an efficient way the successive assignments, the backtracks and the solutions found. Each node corresponds to the instantiation of a variable to a value. On a given layer, all nodes have the same number of assigned variables, i.e. the depth is set. On a given branch of the search tree which leads to a solution node, all variables are assigned, i.e. the depth evolves from 0 to the number of variables to assign. Figure 2.2 below shows an example of a search tree for a scheduling instance. The algorithm used was DFS (depth-first search, i.e. the exploration is first top-down and then left-right). When the search cannot go down, it backtracks; this is because either the problem is infeasible or the solution found was greater (in a minimization problem) than the incumbent. A minimal sketch of this search mechanism follows figure 2.2.

Figure 2.2: Example of a search tree for a scheduling problem. Green node is a solution. Each blue line is a branching between two nodes.
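Building on the previous sketch (it reuses the illustrative LessThan and propagate helpers defined there), here is a minimal depth-first search with backtracking; again a toy, not the thesis's solver:

```python
# DFS with propagation and backtracking over a search tree of instantiations.
import copy

def search(domains, constraints):
    """Propagate, assign a variable, recurse; backtrack on failure.
    Yields every feasible complete assignment (CP as enumeration)."""
    if not propagate(domains, constraints):
        return                                 # empty domain: dead end, backtrack
    unbound = [v for v in domains if len(domains[v]) > 1]
    if not unbound:
        yield {v: next(iter(d)) for v, d in domains.items()}
        return
    # Variable-selection heuristic: smallest domain first (cf. section 2.1.2).
    var = min(unbound, key=lambda v: len(domains[v]))
    for value in sorted(domains[var]):
        child = copy.deepcopy(domains)         # copy so backtracking restores domains
        child[var] = {value}                   # instantiate var := value (one tree node)
        yield from search(child, constraints)

sols = list(search({'x': {1, 2, 3}, 'y': {1, 2, 3}, 'z': {1, 2, 3}},
                   [LessThan('x', 'y'), LessThan('y', 'z')]))
print(sols)   # [{'x': 1, 'y': 2, 'z': 3}]
```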

2.1.2 Some features

CP offers a high level of flexibility, as there are several ways to interfere in the resolution of a problem. First of all, one may define new constraints (upon the objective function of an optimization problem, for example). Second, one may define hand-crafted heuristics in order to guide the search process, stating a criterion for choosing the next variable and the next value to be assigned. These heuristics are mainly based upon features of the instance. A non-exhaustive list is drawn up in the next section.

CP is really flexible, since no hypothesis has to be made either on the kind of constraints or on the variable domains. Since all the feasible solutions can be derived, it is also called an enumeration algorithm.


2.2 Scheduling problems

Scheduling or job shop scheduling problems are optimization problems in which given jobs are allocated to specific resources and time slots, given some constraints and an objective function to optimize.

Firstly a proper definition and classification of these problems will be given, and secondly some well-known heuristic rules will be derived. Then the representation of the solutions of such problems will be treated. Finally a brief state of the art of solving methods combining CP and Machine Learning techniques will be discussed.

2.2.1 Definition and heuristics rules

Definition and classification

The jobshop scheduling problem in the configuration studied is an NP-hard combinatorial optimization problem. It is about finding the optimal schedule under various objectives, different machine environments and characteristics of the jobs. Let us provide a definition of the terms involved in these problems.

A job can be made up of any number of tasks. For example a job could be making a product, and each task is an activity that contributes to making that product. One can set precedence constraints: some jobs must be done before other jobs. In addition, each job also has a specific order of performing its tasks. A job or a task is defined by a ready time (the time at which a job is available to be processed), a processing time (the length of time to process the job or the task) and a due date (the last time to complete a job). The completion time is the time at which a job is finished. A job can be late, hence tardiness and earliness are defined: the tardiness is the maximum between 0 and the difference between the completion time and the due date; the earliness is the maximum between 0 and the difference between the due date and the completion time.

A machine is available to execute jobs and tasks. It could be a single machine, or parallel machines. Each machine can only process one job at a time, and each job can only be processed by one machine at a time.

The optimal schedule is the mapping of jobs to machines which is feasible with respect to the feasibility constraints and optimizes the objectives (an objective function, for example). Several objective functions can be set up: the maximum completion time (also referred to as the makespan), defined as $\max\{C_1, \dots, C_n\}$, where $C_i$ denotes the completion time of a job $J_i$ of the sequence $J = \{J_1, \dots, J_n\}$; or the total tardiness, weighted by some costs $w$, $\sum_{i=1}^{n} w_i T_i$, where $T_i = \max\{0, C_i - d_i\}$ denotes the tardiness. If each job $J_i$ is assumed to have a fixed processing cost $p_i$, the objective function becomes $\sum_{i=1}^{n} p_i + \sum_{i=1}^{n} w_i T_i$.

In this thesis only deterministic scheduling problems are treated. The notation introduced in [7] consists of three fields $\alpha$, $\beta$ and $\gamma$: the field $\alpha$ describes the machine environment, $\beta$ the job characteristics and $\gamma$ the objective function.

The possible environments are:

Single-stage problems:

• 1 - single machine
• $P$, $P_m$ - parallel (identical) machines
• $Q$, $Q_m$ - related machines (different speeds)
• $R$, $R_m$ - unrelated machines (processing time depends on job and machine)

Multi-stage (shop) problems:

• $J$ - jobshop (each job has linear precedence constraints among its tasks)
• $F$ - flowshop (each job has the same linear precedence constraint among its tasks)
• $O$ - open shop (no constraints among tasks)

The job characteristics or constraints are:

• $r_i$ - for each job a release time is given, before which it cannot be scheduled
• $d_i$ - for each job a due date is given, after which it cannot be scheduled (without tardiness costs)
• pmtn - preemption
• prec - precedence constraints
• $s_{jk}$ - sequence-dependent setup times

And the objective functions are (a small worked example follows the list):

• $\max_i \{C_i\}$ - maximum completion time
• $\sum_i L_i = \sum_i (C_i - d_i)$ - lateness
• $\sum_i T_i = \sum_i \max\{0, C_i - d_i\}$ - tardiness
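The following sketch works the definitions above through on a single machine: given jobs $(r_i, p_i, d_i)$ processed in a fixed order, it computes completion times and the three objectives. The data are invented toy values, chosen only for illustration:

```python
# Worked example of the scheduling objectives on a single machine.

def evaluate(sequence):
    """sequence: list of (r_i, p_i, d_i) triples in processing order."""
    t = 0
    C, L, T = [], [], []
    for r, p, d in sequence:
        t = max(t, r) + p            # start no earlier than the release date
        C.append(t)                  # completion time C_i
        L.append(t - d)              # lateness L_i = C_i - d_i (can be negative)
        T.append(max(0, t - d))      # tardiness T_i = max{0, C_i - d_i}
    return max(C), sum(L), sum(T)    # makespan, total lateness, total tardiness

makespan, lateness, tardiness = evaluate([(0, 3, 4), (2, 2, 5), (0, 4, 8)])
print(makespan, lateness, tardiness)   # 9 0 1
```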

Given this notation, the scheduling problems treated in the following study are $1 \mid r_i, d_i \mid \sum_i T_i$ (referred to below as NCOS **a), $1 \mid r_i, d_i \mid \sum_i w_i T_i$ (NCOS **b), $m \mid r_i, d_i, prec \mid \sum_i T_i$ (NCGS **a) and $m \mid r_i, d_i, prec \mid \sum_i w_i T_i$ (NCGS **b). All of them are strongly NP-hard.

Heuristic rules

The following heuristic rules are combined with a CP solver in order to guide the search to a first solution. These are expert-derived rules that give the next job to schedule, or in a CP perspective the next variable to assign. This is a non-exhaustive list from [12] and [1] (a sketch of the jobshop rules follows the list):

• Heuristic rules with a jobshop point of view

Shortest Processing Time (SPT): the next job to schedule is the one with the shortest processing time. This algorithm is optimal for $1 \mid\mid \sum_i C_i$ scheduling problems.

Earliest Due Date (EDD): the next job to schedule is the one with the earliest due date. EDD finds the optimal schedule for $1 \mid\mid L_{\max}$ (maximum lateness) scheduling problems. This is one of the most popular hand-crafted heuristics and will be set as the reference for the thesis. Moreover it runs in $O(n \log n)$ time.

Minimum Slack Time (MST): it measures the urgency of a job by its slack time, defined as $d_i - p_i$, the difference between the due date and the processing time, and schedules the most urgent job first. It is a popular rule for lateness- and tardiness-related objectives.

• Heuristic rules with a CP point of view

Maximum Constraint Graph Degree: the next variable (job or task, basically) to assign is the one with the largest degree in the constraint graph (that is, the one involved in the maximum number of different constraints).

Smallest Domain Size: the next variable to assign is the one with the smallest domain size (number of distinct values in the domain).

Smallest Domain Size to Degree Ratio: the next variable to assign is the one with the smallest domain-size-to-degree ratio, the degree being defined as the number of constraints using this variable.
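A minimal sketch of the three jobshop dispatching rules, assuming each job carries a processing time 'p' and a due date 'd' (illustrative helpers, not the thesis's solver-side implementation):

```python
# Dispatching rules as simple sort keys.

def spt_order(jobs):
    """Shortest Processing Time: sort by processing time."""
    return sorted(jobs, key=lambda j: j['p'])

def edd_order(jobs):
    """Earliest Due Date: sort by due date."""
    return sorted(jobs, key=lambda j: j['d'])

def mst_order(jobs):
    """Minimum Slack Time: sort by slack d_i - p_i."""
    return sorted(jobs, key=lambda j: j['d'] - j['p'])

jobs = [{'name': 'J1', 'p': 4, 'd': 9},
        {'name': 'J2', 'p': 2, 'd': 5},
        {'name': 'J3', 'p': 6, 'd': 8}]
print([j['name'] for j in edd_order(jobs)])   # ['J2', 'J3', 'J1']
```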

2.2.2 Solution candidates and their representation for the single-machine case

In this paragraph a single-machine scheduling problem is considered in order to derive properties and concepts of solution representation. First, note that for single-machine scheduling problems and flowshop problems, permutations of jobs are candidate solutions.

Consider an instance $I$ with $n$ tasks to schedule. Solving the scheduling problem $I$ is equivalent to finding the optimal schedule with respect to the objective function, that is, finding the optimal permutation of the tasks $T = \{T_1, \dots, T_n\}$ which minimizes the objective function, say $f(T)$. This is a specificity of the single-machine instances, because the assignment of a task to a time slot is fully determined by the set of tasks already assigned: a non-assigned task will be assigned to the first time the machine is available (w.r.t. its constraints also), which corresponds to the maximum completion time of the set of assigned tasks. One can easily argue that the space of candidate solutions of a scheduling problem, say $S_I$, is included in $\Sigma_n$, the set of permutations of $n$ elements (it is not an equality because some permutations may not be feasible w.r.t. the constraints). The specific order of the permutation is in fact directly linked to the assignment progression in the search tree. Hence the former set $\Sigma_n$ can be written, without changing its meaning, as the set of choices made at each given depth (layer) of the decision tree. For example, if one considers $n = 3$, $\sigma_1 = \{1, 3, 2\} \in \Sigma_3$ and the objective function value $y_1 = f(\sigma_1(T))$, then the (non-ordered) set $(\sigma_{1i})_{i \in \{1,2,3\}}$, where $\sigma_{1i}$ denotes the variable to assign at depth $i$ for the permutation $\sigma_1$, represents exactly the same object. In other words, a candidate solution can be expressed as a permutation $\sigma$, or as the set of decisions (at depth $i$, assign variable $\sigma_i$ in order to reach $y_1$), as illustrated in figure 2.3. It is typically this set of decisions that is really interesting to learn from.

Figure 2.3: Candidate solution representation in the search tree

2.2.3 State of the art of hybrid techniques of resolution

The literature on hybrid solving techniques combining metaheuristics and CP is quite rich, and covers algorithmic theory to applications. To solve Combinatorial Optimization Problems (COP) there are mainly two dual approaches ([8]) which may be considered: firstly, the standard branch, propagate and bound described in 2.1, and secondly metaheuristic approaches. CP combines an exhaustive tree-based exploration with bounding techniques which reduce the search space. The optimal solution is returned, but an exponential computation time can be expected: CP fails to reach high solution quality within an acceptable computation time. On the other hand, metaheuristics have been shown to be very effective for solving many optimization problems. The combination of these two methods is discussed in the next paragraph, along with the use of supervised machine learning and reinforcement learning for the control of CP search algorithms.

According to the state of the art reviewed and [14], the combination of metaheuristics and CP can mainly be achieved at two levels: on one hand at a low level, with a pure combinatorial and algorithmic point of view, and on the other hand at a high level, with a generic jobshop point of view. At a low level, very efficient metaheuristics have been combined with CP, based on a local search framework in which the search space is explored by iteratively perturbing combinations. For example, a TSAB tabu search algorithm was implemented in [10], a guided local search algorithm in [2], and most recently a hybrid tabu search / simulated annealing algorithm in [5]. They are based on problem-specific move operators and so cannot learn from the generic structure of the problems; they work from a purely algorithmic perspective. At a high level, hyper-heuristics were designed and applied to the class of jobshop scheduling problems. This implies implementing a learning mechanism, such as Learning Vector Quantization Neural Networks in [11], in order to map the properties of the instance to variable ordering heuristics, or case-based reasoning in [3] to map an augmented representation of the problem space to heuristics.

Next we discuss the use of a new kind of heuristic, one which has been learned for the control of search algorithms. It involves the exploitation of a dataset which records appropriate features and the associated target result. Supervised machine learning is applied on the dataset to extract a model of the target result based on the descriptive features of problem instances. In SATzilla [16] a regression model predicting the runtime of each solver on a problem instance is built, and used to select the solver with minimal expected runtime. CPHydra [4] uses a similarity-based approach and builds a switching policy based on the most efficient solvers for the problem instance. In [15], machine learning is likewise applied to adjust the CP heuristics online. [9] advocates the use of another ML approach, namely reinforcement learning, to support the CP search; it extends the Monte-Carlo Tree Search (MCTS) algorithm to control the exploration of the CP search tree.

2.3 Statistical models

In this section a small amount of theory is developed to provide a framework for the subsequent derivations of two different models. Given a vector of inputs, the goal is to use it to predict the values of the outputs. This exercise is called supervised learning. Linear regression and gradient boosting are two different supervised learning models and will be explained. These elements are based on [6].

Let $X \in \mathbb{R}^p$ denote a real-valued random input vector; its components, which may be qualitative or quantitative input variables, have some influence on the output (or outputs). Consider also $Y \in \mathbb{R}$, a real-valued random output variable, with joint distribution $P(X, Y)$. The function we are looking for is $f(X)$, for predicting $Y$ from $X$. Therefore we require a loss function $L(Y, f(X))$ for penalizing errors in prediction. Some penalization terms linked to the models could be added and will be treated in the next sections. If one chooses the squared error loss $L(Y, f(X)) = (Y - f(X))^2$, this leads to the following criterion to choose $f$: the risk of $f$, also called the expected squared prediction error, $EPE(f) = E(Y - f(X))^2$, which measures the expected cost of a prediction error. The optimal learner is then the minimizer of $EPE(f)$:

$$f^* = \underset{f}{\operatorname{argmin}}\ EPE(f), \qquad f^*(x) = \underset{c}{\operatorname{argmin}}\ E_{Y|X}\left[(Y - c)^2 \mid X = x\right],$$

and the solution is $f^*(x) = E(Y \mid X = x)$. We will show how the next two models fit this criterion.
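For completeness, here is the standard one-step justification of this solution, under the squared loss defined above:

```latex
% Pointwise minimization of the conditional risk under squared loss:
E\big[(Y-c)^2 \,\big|\, X=x\big]
  = \underbrace{\operatorname{Var}(Y \mid X = x)}_{\text{independent of } c}
  + \big(E[Y \mid X = x] - c\big)^2
  \;\Longrightarrow\; c^{*} = E[Y \mid X = x],
```

so minimizing pointwise in $c$ gives $f^*(x) = E(Y \mid X = x)$.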


2.3.1 Linear regression

The linear model makes strong assumptions about structure and yields stable but possibly inaccurate predictions. First some derivations will be made, and second the backward stepwise subset selection will be explained.

Linear regression derivations

Suppose that $f(x) = E(Y \mid X = x) \approx x^T\beta$, i.e. the regression function is approximately linear in its arguments. Plugging this into $EPE$ and differentiating w.r.t. $\beta$ returns

$$\hat{\beta} = [E(XX^T)]^{-1} E(XY) \qquad (2.1)$$

Replacing the expectations by averages over the training data, exactly the same result is derived as from the differentiation of $RSS(\beta) = \sum_{i=1}^{N} (y_i - x_i^T\beta)^2 = \|y - X\beta\|^2$, the residual sum of squares to minimize. Uniqueness comes from the normal equations $X^T(y - X\beta) = 0$.

The assumptions made are quite significant. Indeed, the deviations of $Y$ around its expectation are supposed additive and Gaussian, which allows us to write

$$Y = E(Y \mid X) + \varepsilon = X\beta + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2),$$

and so $\hat{\beta} \sim \mathcal{N}(\beta, (X^TX)^{-1}\sigma^2)$. The linear regression estimator is then unbiased. These assumptions must be checked after fitting any linear regression model.

Linear regressions are powerful since they do not require a lot of computation, are simple, and provide an adequate and interpretable description of how the inputs affect the output. A small numerical sketch follows.
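A minimal numerical sketch of equation (2.1) on synthetic data (NumPy only): the estimator is obtained by replacing expectations with sample averages, i.e. by solving the normal equations.

```python
# Least squares via the normal equations, cf. (2.1).
import numpy as np

rng = np.random.default_rng(0)
N, p = 200, 3
beta_true = np.array([1.5, -2.0, 0.5])

X = rng.normal(size=(N, p))
y = X @ beta_true + rng.normal(scale=0.3, size=N)   # Y = X beta + eps

# beta_hat = (X^T X)^{-1} X^T y, the empirical version of [E(XX^T)]^{-1} E(XY).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)          # close to beta_true; unbiased under the Gaussian model

residuals = y - X @ beta_hat
print(X.T @ residuals)   # ~0: the normal equations X^T(y - X beta) = 0 hold
```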

Linear regression and backward-stepwise selection

Rather than searching through all possible subsets and considering all features, one idea is to seek a good path through them. Backward-stepwise selection starts with the full model and sequentially deletes the predictor that has the least impact on the fit. Several criteria can be used to weigh the decisions: $C_p$, AIC and $R^2$ can be used, or simply the RSS. See below the definitions of a non-exhaustive list of criteria (a small selection sketch follows the list):

o RMSE - Root mean squared error.

o $R^2$ - Coefficient of determination. It represents the goodness of fit of a model: a number between 0 and 1 indicating the proportion of the variance explained by the linear relationship between Y and X.

o AIC - Akaike information criterion. $AIC = 2k - 2\log(\hat{L})$, where $k$ is the number of free parameters to be estimated and $\hat{L}$ is the maximized likelihood of the model. AIC rewards the goodness of fit of a model but also includes a penalty that is an increasing function of the number of estimated parameters.

o $C_p$ - Mallows's $C_p$. It is an estimator of the scaled mean squared prediction error $\Gamma_p = \frac{1}{\sigma^2}\, E\left[\sum_j \left(\hat{Y}_j - E(Y_j \mid X_j)\right)^2\right]$, where $\hat{Y}_j$ is the fitted value from the regression model for the $j$th case.
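A minimal sketch of the selection scheme on synthetic data, using AIC under a Gaussian model (so $AIC = N\log(RSS/N) + 2k$ up to an additive constant). Illustrative only, not the implementation used in the thesis:

```python
# Backward-stepwise selection driven by AIC.
import numpy as np

def aic(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    N, k = X.shape
    return N * np.log(rss / N) + 2 * k

def backward_stepwise(X, y):
    """Start from the full model; drop the feature whose removal most
    improves AIC, until no single removal helps."""
    features = list(range(X.shape[1]))
    best = aic(X[:, features], y)
    while len(features) > 1:
        scores = [(aic(X[:, [f for f in features if f != j]], y), j)
                  for j in features]
        new_best, drop = min(scores)
        if new_best >= best:
            break                        # no removal improves the criterion
        best, features = new_best, [f for f in features if f != drop]
    return features, best

rng = np.random.default_rng(1)
X = rng.normal(size=(150, 6))
y = 2.0 * X[:, 0] - 1.0 * X[:, 2] + rng.normal(scale=0.5, size=150)
print(backward_stepwise(X, y))           # keeps (roughly) features 0 and 2
```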

2.3.2 Gradient boosting

In this section a regularization term is added to the loss function. The so-called objective function, for a model $\Theta$ with parameters $\theta$, is then

$$Obj(\Theta) = L(\theta) + \Omega(\Theta) \qquad (2.2)$$

The regularization term controls the complexity of the model, which helps us to avoid overfitting. Figure 2.4 shows the impact of the complexity upon the fitting of the data.

Figure 2.4: Example of different trade-offs between loss and regularization terms

This section introduces the tree ensemble, boosting trees and the gradient boosting method.

Tree ensemble

Tree-based methods aim to partition the feature space into a set of rectangles and then choose a simple model to fit the output in each one. They are conceptually simple yet powerful. Figure 2.5 draws an example for two decision variables $X_1$ and $X_2$.

Generalizing the two-variable model, suppose the feature space is partitioned into $M$ regions $R_1, \dots, R_M$. The response is modeled as a constant $c_k$ in each region, so that $f(x) = \sum_{k=1}^{M} c_k\, I(x \in R_k)$. Regarding the minimization of the sum-of-squares criterion, the best $\hat{c}_k$ is the average of the $y_i$ in region $R_k$. Finding the best binary partition in terms of minimum sum of squares is generally computationally infeasible, so it is derived with a greedy algorithm.

Figure 2.5: Example of a tree ensemble model for two decision variables

We look for the pair $(X_j, s)$ such that the regions $\{X \mid X_j \leq s\}$ and $\{X \mid X_j > s\}$ define the best split, i.e. minimize the sum of squares $\sum_i (y_i - f(x_i))^2$. The process is then repeated on each of the two resulting regions. Introducing $|T|$, the tree size (number of terminal nodes), a trade-off has to be made between the goodness of fit and the regularization $\alpha|T|$, where $\alpha$ is a tuning parameter. This is the most popular method for tree-based regression, called CART, and it captures the important features of tree regression; a sketch of the greedy split search is given below.
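A minimal sketch of the greedy split search: scan every (feature, threshold) pair and keep the one minimizing the sum of squares of the two mean-predicted regions. Synthetic data, illustrative only:

```python
# Greedy CART split search.
import numpy as np

def sse(y):
    """Sum of squared errors of a region predicted by its mean c_k."""
    return float(np.sum((y - y.mean()) ** 2)) if len(y) else 0.0

def best_split(X, y):
    """Return (j, s, cost) minimizing SSE over splits {X_j <= s} / {X_j > s}."""
    best = (None, None, np.inf)
    for j in range(X.shape[1]):
        for s in np.unique(X[:, j])[:-1]:        # candidate thresholds
            left = X[:, j] <= s
            cost = sse(y[left]) + sse(y[~left])
            if cost < best[2]:
                best = (j, s, cost)
    return best

rng = np.random.default_rng(2)
X = rng.uniform(size=(100, 2))
y = np.where(X[:, 0] <= 0.5, 1.0, 3.0) + rng.normal(scale=0.1, size=100)
print(best_split(X, y))    # splits on feature 0, threshold near 0.5
```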

Boosting trees

Now that the general concept of tree ensembles is set, let us explain boosting trees. To derive some computations, consider a tree that partitions the feature space into disjoint regions $R_j$, $j = 1, \dots, J$, with a constant $c_j$ assigned to each such region (recall that $x \in R_j \implies f(x) = c_j$). Hence, formally, the tree can be expressed as

$$T(x; \Theta) = \sum_{j=1}^{J} c_j\, I(x \in R_j) \qquad (2.3)$$

with parameters $\Theta = \{R_j, c_j\}_{j \in [1, \dots, J]}$, where $J$ is also a (hyper)parameter. As motivated in the introduction of this section, the goal of this regression is to minimize the empirical risk, so the parameters should satisfy the minimization problem

$$\hat{\Theta} = \underset{\Theta}{\operatorname{argmin}} \sum_i L(y_i; f(x_i)) = \underset{\Theta}{\operatorname{argmin}} \sum_j \sum_{x_i \in R_j} L(y_i; c_j) \qquad (2.4)$$

which is a combinatorial optimization problem. Most of the time it is divided into two parts: finding the $R_j$ with a greedy algorithm, and then the $c_j$.

The boosted tree model is a sum of such trees, induced in a forward manner with an additive strategy: fix what has been learned, and add one new tree at a time. Let $\hat{y}_i^{(t)}$ denote the prediction value at step $t$ and $f_t$ the sum of the trees, so that $f_t(x) = \sum_{k=1}^{t} T(x; \Theta_k)$. Equation 2.4 can then be written as

$$\hat{\Theta}_t = \underset{\Theta_t}{\operatorname{argmin}} \sum_i L\left(y_i, \hat{y}_i^{(t)}\right) \qquad (2.5)$$

$$\phantom{\hat{\Theta}_t} = \underset{\Theta_t}{\operatorname{argmin}} \sum_i L\left(y_i, f_{t-1}(x_i) + T(x_i; \Theta_t)\right) \qquad (2.6)$$

for the parameters $\Theta_t = \{R_{jt}, c_{jt}\}_{j_t \in [1, \dots, J_t]}$, given the model $f_{t-1}(x)$.

Gradient boosting method

Fast approximate algorithms for solving 2.5 can be derived from numerical optimization. Consider the squared error criterion: the loss function becomes

$$L(y_i; f_{t-1}(x_i) + T(x_i; \Theta_t)) = (y_i - f_{t-1}(x_i) - T(x_i; \Theta_t))^2 = (y_i - \hat{y}_i^{(t-1)} - T(x_i; \Theta_t))^2 = (r_{i,t-1} - T(x_i; \Theta_t))^2$$

where $\{r_{i,t-1}\}_i$ denotes the residuals of step $t - 1$. This leads to

$$\hat{\Theta}_t = \underset{\Theta}{\operatorname{argmin}} \sum_i (r_{i,t-1} - T(x_i; \Theta))^2 \qquad (2.7)$$

More generally, the residuals can be approximated by a vector proportional to the negative gradient $-\left[\frac{\partial L(y_i, f(x_i))}{\partial f(x_i)}\right]_{f = f_{t-1}}$. Therefore we can do steepest descent on top of this: regression trees approximate the (negative) gradient of the loss function, and each tree is a successive gradient descent step. The final model is just the sum of all the gradient descent steps; a hand-rolled sketch follows.
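A hand-rolled sketch of the squared-loss case (2.7): each step fits a depth-1 tree (a stump, built with the illustrative best_split helper from the previous sketch) to the current residuals and adds it to the model. The thesis used a full gradient boosting library, so this is only an illustration of the principle:

```python
# Gradient boosting with squared loss, using stumps as base learners.
import numpy as np

def fit_stump(X, y):
    """Fit one stump T(x; Theta) to targets y, reusing best_split() above."""
    j, s, _ = best_split(X, y)
    left = X[:, j] <= s
    return j, s, y[left].mean(), y[~left].mean()    # the {R_j, c_j} parameters

def stump_predict(stump, X):
    j, s, c_left, c_right = stump
    return np.where(X[:, j] <= s, c_left, c_right)

def gradient_boost(X, y, n_trees=50, lr=0.1):
    pred = np.full(len(y), y.mean())     # f_0: constant model
    stumps = []
    for _ in range(n_trees):
        r = y - pred                     # residuals, proportional to the negative gradient
        stump = fit_stump(X, r)          # solve (2.7) greedily with one stump
        pred += lr * stump_predict(stump, X)
        stumps.append(stump)
    return stumps, pred

rng = np.random.default_rng(3)
X = rng.uniform(-2, 2, size=(300, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=300)
_, pred = gradient_boost(X, y)
print(np.mean((y - pred) ** 2))          # training MSE shrinks as trees are added
```

The learning rate lr shrinks each descent step, the usual regularization trade-off discussed around (2.2).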

This gradient boosting method has many advantages: it is suitable for heterogeneous data, it supports different loss functions and it automatically detects non-linear feature interactions. Disadvantages are that it requires careful parameter tuning, it is slow to train and it cannot extrapolate. All the tuning parameters are listed and explained in annex 6.3.


Chapter 3

Scheduling instances - Benchmark, modelling and results

This chapter presents more precisely the scheduling problems studied in the thesis. First, the benchmark of instances considered and the associated formalism are discussed. Second, the modelling as a CP model is studied and the associated results presented. Finally, a new feature is added to the modelling, the cost of electricity per time slot; the impact of such considerations is studied and the results derived.

3.1 Definition and classification of the scheduling problems

ILOG came up in 2007 with a benchmark of manufacturing planning, batching and scheduling problems. The idea was to provide the community with a set of instances simple enough to be solved without developing huge amounts of code and representative of the difficulties encountered on real problems. Indeed, progress of the community on these features has a potential impact on many applications. Realistic instances are needed in order to represent the details of reality. Instances come from different sources: some of them are simplifications of a real problem, with fewer constraints and a smaller size; some instances come from academia, such as hard random instances or instances generated with respect to a real-life pattern.

3.1.1 Master model

A master model is defined and is common to all instances. There are resources with given capacities and calendars that are used to consume or produce materials (raw, intermediate and final products, with storage limitations and costs) through recipes, instantiated with given batch sizes and consisting of modes specifying the resources to be used, the time needed depending on the batch size, and the resulting cost. Each activity consumes and produces materials in given quantities. There are also setup initial states, times and costs. On the other side there are demands with due dates, which correspond to production orders (recipe instances), and material flow arcs. Finally there are activities to be scheduled.

In the following list each feature of the master model is detailed together with its characteristics (an instance Excel sheet is available in annex 6.3):

• an instance has a name (following the classification explained below), a time scale, and a common earliest start time and end time for all activities;

• a resource has a capacity. By default the capacity is 1 (corresponding to a machine) but it can be more (for the resource electricity, for example);

• a material is a raw, intermediate or final product;

• a recipe is a generic process consuming and producing materials. Planning consists of creating instances of recipes, called production orders. It has a minimal/maximal size of a production order for the recipe under consideration;

• an activity refers to the recipe to which it belongs. For each production order of this recipe, an instance of the activity must be scheduled.

Moreover, for each activity several modes are defined, linking the activity to a configuration given by a fixed and a variable processing time and a fixed and a variable cost. Materials are linked to the activity by a given quantity, produced (if the quantity is positive) or consumed (if negative). Demands, in turn, concern materials through a table of quantities. They are also characterized by a due time and by variable earliness and tardiness costs. The master model's objective function is the total cost, defined as a linear combination of the production costs, the earliness and tardiness penalties and, where applicable, inventory costs. Note that the production costs are fixed and cannot be cut.

3.1.2 Benchmarking protocol

The instances defined previously are classified according to the following nomenclature:

• A category defining the active constraints and costs:
  – NCOS (No-Calendar One-Shop)
  – NCGS (No-Calendar General-Shop)

• A number in the category; the higher the number, the bigger the instance.

• A letter for the earliness, tardiness and storage costs:
  – a instances are such that earliness and storage costs are not considered. In addition, tardiness costs are such that different production orders have the same weight.
  – b instances are such that earliness and storage costs are not considered, but tardiness costs give different weights to different production orders.

In order to relate this to the conventional notation, see table 3.1.

ILOG notation    Conventional notation
NCOS **a         $1 \mid r_i, d_i \mid \sum_i T_i$
NCOS **b         $1 \mid r_i, d_i \mid \sum_i \alpha_i T_i$
NCGS **a         $m \mid r_i, d_i, prec \mid \sum_i T_i$
NCGS **b         $m \mid r_i, d_i, prec \mid \sum_i \alpha_i T_i$

Table 3.1: Equivalence between conventional notation and the benchmark

The objectives of this benchmark are multiple, such as increasing the robustness of generic techniques across a great variety of instances (size, numeric characteristics, side constraints). A CPU time limit is set as an algorithm execution constraint. It is given depending on the size (first digit of the number) of the instance (see table 3.2).

First digit of the instance number    CPU time limit (s)
0                                     60
1                                     150
2                                     300
3                                     450
4                                     600
5                                     1200

Table 3.2: CPU time limit as a function of the size of the instance

3.2 Modelling of the basic instance

Let us derive the mathematical formulation of the model and present the results. First, some notation for the sets of variables is introduced in table 3.3.

P    set of production orders
I    set of recipes
J    set of activities
M    set of modes
T    set of tasks
R    set of resources
D    set of demands
A    set of production order-demand arcs

Table 3.3: Notations for the different sets of variables

The set A reflects the link between a production order and a demand. One should note that a production order can satisfy several demands; hence for $(p, d) \in A$ the production order $p$ is linked to at least one demand $d$. The set of tasks T plays a similar role, reflecting the existence of links between production orders, recipes, activities and modes. Moreover, for $(p, i, j, m) \in T$ the task $(p, i, j, m)$ is defined by a starting time $ST(p, i, j, m)$, an end time $ET(p, i, j, m)$, a processing time $pt_{p,i,j,m}$ and an assignment to a resource, $assign(p, i, j, m) = r$, $r \in R$. The processing time $pt_{p,i,j,m}$ of a task is the sum of a fixed processing time and a variable processing time multiplied by the batch size of the production order.

Let us first derive the constraints upon the tasks. They are of several kinds: time constraints (3.1 and 3.2), resource constraints (3.3) and precedence constraints (3.4).

Relation between start and end time:

$$\forall (p, i, j, m) \in T: \quad ET(p, i, j, m) = ST(p, i, j, m) + pt_{p,i,j,m} \qquad (3.1)$$

A production order is linked to one recipe and several activities; it carries time constraints:

$$\forall (p, i, j, m) \in T: \quad ST(p, i) \leq ST(p, i, j, m), \qquad ET(p, i, j, m) \leq ET(p, i) \qquad (3.2)$$

Resource (machine) sharing constraints take a disjunctive form: for any two tasks $(p_1, i_1, j_1, m_1)$ and $(p_2, i_2, j_2, m_2)$ in $T$ assigned to the same resource $r \in R$,

$$ST(p_1, i_1, j_1, m_1) - ET(p_2, i_2, j_2, m_2) \geq 0 \quad \text{OR} \quad ST(p_2, i_2, j_2, m_2) - ET(p_1, i_1, j_1, m_1) \geq 0 \qquad (3.3)$$

Precedence constraints hold between activities (for an intermediate product, for example):

$$\forall ((p, i, j_1, m_1), (p, i, j_2, m_2)) \in T \times T: \quad ST(p, i, j_1, m_1) - ET(p, i, j_2, m_2) \geq \delta_{1,2} \qquad (3.4)$$

where $\delta_{1,2}$ is a constant delay between activities $j_1$ and $j_2$.

Let us define the demand shipment variable $SH(d)$, which denotes the time at which demand $d$ can be shipped:

$$\forall d \in D: \quad SH(d) = \max_{(p,i,j,m) \in T \ \text{s.t.}\ (p,d) \in A} ET(p, i, j, m) \qquad (3.5)$$

Finally the objective function can be written as the sum of three terms: the processing cost (3.6), the earliness cost (3.7) and the tardiness cost (3.8).

Processing cost:

$$ProcessCost = \sum_{(p,i,j,m) \in T} Pc_{p,i,j,m} \qquad (3.6)$$

where $Pc_{p,i,j,m}$ is, for the task $(p, i, j, m) \in T$, the sum of the fixed processing cost and the variable processing cost multiplied by the batch size of the corresponding production order.

Earliness cost:

$$EarliCost = \sum_{d \in D} \max\{0, \ Ec_d\,(duetime(d) - SH(d))\} \qquad (3.7)$$

where $Ec_d$ is the product of a variable earliness cost and the batch size of the corresponding production order.

Tardiness cost:

$$TardiCost = \sum_{d \in D} \max\{0, \ Tc_d\,(SH(d) - duetime(d))\} \qquad (3.8)$$

where $Tc_d$ is the product of a variable tardiness cost and the batch size of the corresponding production order.

Therefore the total cost (i.e. the objective function) can be written as the linear combination (3.9) of these three terms. In the rest of the thesis, the multipliers are set to $\alpha_{PC} = \alpha_{TC} = 1$ and $\alpha_{EC} = 0$.

$$TotalCost = \alpha_{PC}\, ProcessCost + \alpha_{EC}\, EarliCost + \alpha_{TC}\, TardiCost \qquad (3.9)$$

Hence, the scheduling problems studied can be formulated as

$$\min \ TotalCost \quad \text{s.t.} \quad (3.1), (3.2), (3.3), (3.4) \qquad (3.10)$$

An illustrative sketch of such a model in a generic CP solver follows.
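To make the disjunctive model concrete, here is a minimal single-machine instance of formulation (3.10), written with Google OR-Tools CP-SAT as a stand-in. This is an assumption for illustration only: the thesis's actual implementation is the Xpress-Kalis/MOSEL model described above, and the data are toy values.

```python
# Single-machine total-tardiness model, cf. (3.1), (3.3), (3.9), (3.10).
from ortools.sat.python import cp_model

tasks = [(0, 3, 4), (2, 2, 5), (0, 4, 8)]   # toy (release, processing, due) triples
horizon = max(r for r, _, _ in tasks) + sum(p for _, p, _ in tasks)

model = cp_model.CpModel()
intervals, tardiness = [], []
for k, (r, p, d) in enumerate(tasks):
    start = model.NewIntVar(r, horizon, f'ST_{k}')          # ST >= release date
    end = model.NewIntVar(0, horizon, f'ET_{k}')
    intervals.append(model.NewIntervalVar(start, p, end, f'task_{k}'))  # ET = ST + pt, cf. (3.1)
    late = model.NewIntVar(-horizon, horizon, f'L_{k}')
    model.Add(late == end - d)                              # lateness ET - duetime
    tard = model.NewIntVar(0, horizon, f'T_{k}')
    model.AddMaxEquality(tard, [late, model.NewConstant(0)])  # T = max{0, ET - d}
    tardiness.append(tard)

model.AddNoOverlap(intervals)    # disjunctive machine-sharing constraint, cf. (3.3)
model.Minimize(sum(tardiness))   # TotalCost with alpha_EC = 0, cf. (3.9)

solver = cp_model.CpSolver()
if solver.Solve(model) == cp_model.OPTIMAL:
    print('total tardiness =', solver.ObjectiveValue())
```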

3.3 Results of some instances

The complete results are available in Annex 6.3; for now, three results will be presented: one NCOS instance for which the optimal solution was found, and two instances (one NCOS and one NCGS) for which the solver did not return optimality. The last example will also be part of the scientific approach of the next chapter.

Remark: All the mathematical modelling and the resolution were implemented in the Xpress software using the Xpress-Kalis solver through the Mosel language. One generic model was created, capable of solving every instance from its Excel sheet definition.

3.3.1 NCOS 02a scheduling problem

Let us first consider the instance NCOS 02a, which has 10 tasks to be scheduled and intermediate products. Please refer to (3.1.2) for the conventional notations. Several heuristics were tested, such as EDD and SPT (see 2.2.1); only the best result is reported. Details on the instance characteristics are available in Annex 6.3. The optimal schedule is presented as a Gantt chart in Figure 3.1. The TotalCost found is the optimal one (the instance is closed and the optimal result is known from the literature). The search tree of the resolution is shown in Figure 3.2. Each green node is a solution found; the biggest and brightest one represents the best solution, which is optimal since the decision tree has been totally explored. Moreover, the search tree resolution characteristics are aggregated in Table 3.4.

Figure 3.1: Gantt chart of the optimal schedule for instance NCOS 02a. Black lines represent due dates of each task.

Figure 3.2: Search tree of the resolution of instance NCOS 02a using the EDD heuristic.

Computation time (s)    1.11
Number of nodes         1981
Backtracks              1964

Table 3.4: Search tree resolution features for instance NCOS 02a
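For reference, the two branching rules mentioned above can be sketched as simple priority orderings. This is an illustrative Python sketch; the candidate objects and their due_date/pt fields are hypothetical stand-ins for the solver's actual branching mechanism.

```python
# Earliest Due Date (EDD): branch first on the task whose
# associated demand is due soonest.
def edd_order(candidates):
    return sorted(candidates, key=lambda t: t.due_date)

# Shortest Processing Time (SPT): branch first on the task
# with the smallest processing time.
def spt_order(candidates):
    return sorted(candidates, key=lambda t: t.pt)
```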

3.3.2 NCGS 21a scheduling problem

This instance is made of 60 tasks and 5 machines. Results are obtained with the EDD heuristic, as previously. Figure 3.3 shows the Gantt chart of the optimal schedule. With this heuristic, the TotalCost returned is not optimal but 36.3% higher.

Computation time (s)    17.23
Number of nodes         8316
Backtracks              8299

Table 3.5: Search tree resolution features for instance NCGS 21a

Although the optimal schedule was returned without a heuristic, the fact that a hand-crafted heuristic cannot reach it is interesting for the next study. Indeed, for some large problems, a common resolution scheme is to find an initial solution with a heuristic and then apply local-search algorithms to improve it. Therefore, the better the initial solution, the better the final one. This topic will be developed in the next chapter of the thesis.


Figure 3.3: Gantt chart of the optimal schedule for instance NCGS 21a.

3.3.3 NCOS 03b scheduling problem

This instance is a 10 × 1 scheduling problem. Even if it seems fairly easy to derive the optimal solution, the candidate solution space contains up to 10! = 3,628,800 candidates. With the EDD heuristic, the sub-optimal schedule returned is shown in Figure 3.4; it is sub-optimal since there is a gap of 6% with the optimal solution (known from the literature). Moreover, the first solution found during the resolution has a gap of 9% with the optimal one.

Figure 3.4: Gantt chart of the schedule returned for instance NCOS 03b with the EDD heuristic. Black lines represent due dates of each task.

Computation time (s)    11.29
Number of nodes         48559
Backtracks              48552

Table 3.6: Search tree resolution features for instance NCOS 03b

Therefore, from an industrial business perspective, and assuming one scheduling optimization is performed per day, the considered plant could have saved 6% of its cost every day. This is a significant potential improvement, and one way to tackle it is to argue that the optimization method should learn from previous scheduling optimizations and from its mistakes, so as to perform better every day. Since the scheduling instances are quite similar from one day to the next, some features stay the same and should be re-used to solve the next scheduling problems. This is the idea that will be developed in Chapter 4.


Chapter 4

Procedure

In the introduction and in Chapter 2 it was mentioned that hand-crafted heuristics are not optimal for all problem classes, and an example was derived in Chapter 3 where a scheduling problem solved with the EDD heuristic (the most used heuristic rule in applications) did not reach the optimal solution. From an industrial perspective, scheduling problems are similar from one day to another for a given plant. Moreover, practitioners are often interested in finding only one solution, returned by a heuristic, because of the large size of their instances. One could imagine a heuristic that learns from these features, day after day, how to reach a better solution faster. This would have an impact both on the computational cost and on the human effort needed to design the heuristics.

In the rest of the study, a heuristic will be developed that learns from one resolution to another and is able to capitalize on previous resolutions in its strategy scheme. It is a batch learning problem: after every optimization resolution, data are collected and the learning procedure is called to calibrate the model on the new observations. This study will focus on a supervised learning method in which, from static and dynamic features collected on the search tree, the value of the objective function at the end of the resolution is predicted. This study can be seen as a first step towards a reinforcement learning module, where the value function is approximated by our method. Some insights about this will be given in the conclusion.
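A minimal sketch of this batch-learning loop is given below. It assumes a hypothetical collection of per-node feature vectors (the actual feature design is the subject of the following sections), and uses a generic scikit-learn regressor purely as a placeholder for the models studied later.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

class BatchLearner:
    """Accumulate (node features, final objective) pairs across
    resolutions and refit the model after each one."""

    def __init__(self):
        self.X, self.y = [], []
        self.model = GradientBoostingRegressor()

    def record_resolution(self, node_features, final_objective):
        # node_features: one feature vector per node collected
        # during a single optimization resolution
        for f in node_features:
            self.X.append(f)
            self.y.append(final_objective)

    def refit(self):
        # Called after every resolution: calibrate the model
        # on all observations gathered so far
        self.model.fit(np.asarray(self.X), np.asarray(self.y))

    def predict(self, features):
        # Predicted end-of-resolution objective for a new node
        return self.model.predict(np.asarray([features]))[0]
```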

The heuristic should learn at two different scales: across problem resolutions and within the decision tree itself, i.e. from the nodes traveled. This will be developed in Section 4.1. Once the data collection has been realized, the heuristic should be able to learn an evaluation, or score, of the underlying subtree for each unassigned variable. In Section 4.2 the learning environment will be set up; both the input space and the target value will be designed there.

Finally, the supervised learning method should be integrated into the framework of a heuristic, to be tested on other instances. This will be done in Section 4.4.

Figure 4.1 sums up the whole procedure.

The procedure will be applied to the instance I = NCOS 03b, because of the non-optimality of its solution as shown previously. Chapter 5 will show the results, while the present chapter presents the justifications.

Figure 4.1: Scheme of the entire procedure

4.1 Data collection

This section is divided into two parts. The first one is about an instance generator, used to map the scheduling problems that are likely to appear in an industrial setting; the second one focuses on the extraction of data from the generated instances.

4.1.1 Instance generator

The goal is to design an instance generator that represents the potential evolutions that could occur in an industrial plant. This mapping of the space of instances introduces randomness that will help to catch and learn the structure. This part has to be done because of the lack of client data: if months of scheduling optimization resolutions had been available, we would have worked on those sets.

The structure of an instance is defined by two features: the activities (their processing times and costs) and the demands (due dates and costs), which are linked by the production orders. The instance generator module takes as input an instance, called the primary instance, defined by its features, and outputs a given number of instances, called secondary instances, with modified features. Figure 4.2 shows the basic principle of the instance generator; a sketch is given below.
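The following Python sketch illustrates such a generator under an assumed instance layout (dictionaries of activity and demand features); the bounded relative perturbation is an illustrative choice, not necessarily the perturbation scheme used in the thesis.

```python
import copy
import random

def generate_secondaries(primary, n, rel_noise=0.2, seed=None):
    """Produce n secondary instances by randomly perturbing the
    features of the primary instance.

    primary is assumed to expose two dicts (hypothetical layout):
      primary.activities[a] = {"pt": ..., "cost": ...}
      primary.demands[d]    = {"due": ..., "cost": ...}
    """
    rng = random.Random(seed)

    def jitter(v):
        # Multiply by a random factor in [1 - rel_noise, 1 + rel_noise]
        return v * (1.0 + rng.uniform(-rel_noise, rel_noise))

    secondaries = []
    for _ in range(n):
        inst = copy.deepcopy(primary)
        for a in inst.activities.values():
            a["pt"] = max(1, round(jitter(a["pt"])))   # keep times positive
            a["cost"] = jitter(a["cost"])
        for d in inst.demands.values():
            d["due"] = max(0, round(jitter(d["due"])))
            d["cost"] = jitter(d["cost"])
        secondaries.append(inst)
    return secondaries
```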
