
FIRST CYCLE, 15 CREDITS
STOCKHOLM, SWEDEN 2019

Using Machine Learning to Predict Employee Resignation in the

Swedish Armed Forces

AMANDA FOLEY

KTH ROYAL INSTITUTE OF TECHNOLOGY

SCHOOL OF INDUSTRIAL ENGINEERING AND MANAGEMENT


Abstract— Since the Swedish government reinstated conscription in 2017, the Swedish Armed Forces are once again able to meet the wartime staffing requirements. In addition to the increase in employees, the Swedish Armed Forces have been able to shift focus from external recruiting to internal human resource management. High employee turnover is a costly affair, especially in an organization like this one, where the initial investments, by way of training, are expensive and arduous. Predicting which employees are about to resign can help retain employees, decrease turnover and in turn save resources. With sufficient data, machine learning can be used to predict which employees are about to resign. This study shows that the machine learning model random forest can increase the accuracy and precision of such predictions, and points to variables and behavioral indicators that have been found to have a strong correlation to employee resignation.

Index Terms— conscription, machine learning, employee turnover, employee retention, random forest, Swedish Armed Forces

I. SAMMANFATTNING

This thesis explores the possibility of using machine learning, more specifically the random forest model, to predict employee resignation in the Swedish Armed Forces. The work stems from the reinstatement of conscription in 2017, which followed from the fact that only about 60% of the wartime staffing requirement could be met under the voluntary model. The thesis finds that the random forest model can be used to predict resignations to a non-trivial degree, with 89% accuracy and 72% precision. The largest source of uncertainty in the study is the quantity and the characteristics of the data.

The study is based on data from 1500 full-time group leaders, soldiers and seamen (GSS-K). To improve the results, and the precision in particular, more data is needed, as well as data with a stronger correlation to behavior. For future studies it is recommended to explore whether other machine learning models are suited to this particular organization, but also how the handling, collection and management of data within the Swedish Armed Forces can be developed.

1 This number is based solely on the cost of food, trips and compensation

II. INTRODUCTION

A. Brief overview

In 2017 the Swedish government reinstated conscription after eight years of a voluntary application model for enlisting in the Swedish Armed Forces. For the Swedish Armed Forces this implied an ability to once again meet the wartime staffing requirements [1]. In meeting the recruiting targets, the reinstated conscription brought a shift in focus from external marketing to internal resource management, specifically employee retention.

Employee retention, defined as the firm's ability to keep desirable employees, is measured as the percentage of retained employees over a specified time period, also referred to as the retention rate [2]. Inversely, employee turnover is the loss of employees, measured as the turnover rate, or churn rate, which is the percentage of lost employees over a period of time. Turnover can be categorized into subsets: desirable and undesirable turnover, the former referring to the loss of employees who have a negative impact on the organization, the latter to the loss of valuable employees; the latter is more common than the former [3].

Consequently, a high turnover rate is associated with costs. Aside from the direct costs of recruiting, such as marketing, selection and training [4], to name a few, turnover is associated with numerous hidden costs, including a dip in morale, disruption of workflow and loss of production.

In the case of the Swedish Armed Forces, each recruit requires approximately one year of training prior to generating any value to the organization, a training priced at 83.91 thousand SEK per recruit as of 2017 [1]. As turnover even at entry level is associated with steep costs, employee retention is exceedingly important in an organization such as the Swedish Armed Forces. A step in the direction of a lowered turnover rate is understanding the cause of the turnover. This can be achieved in several ways, such as exit interviews, employee surveys or, as in this thesis, the quantitative study of parameters leading up to a termination of employment. This study aims not only to evaluate the use of a machine learning model to predict employee turnover but also to study the factors indicating resignation.


B. Research question

This study explores the research question: “To what extent can the machine learning model random forest be used to predict employee resignation in the Swedish Armed Forces?”

C. Aim of the study

This study aims to answer the research question and analyze the impact that the input data has on the precision and accuracy of the prediction.

D. Relevance

The relevance of this study arises from the ideologically controversial topic of conscription and the recent parliamentary decision to reinstate it, in combination with the high costs of employee turnover. The aim of the study is to evaluate if, and to what degree, machine learning can be used to predict employee turnover. Inductively, the ability to predict employee turnover may provide insight into how an effective retention policy might be approached. Aside from contributing to a potential decrease in the employee attrition rate, and thereby the costs associated with it, a predictive model of employee attrition and an understanding of the causes of resignation can be applied to improve the capability of wartime staffing with a voluntary recruiting model, thereby contributing to a potential move away from the conscription model of staffing and toward the less controversial voluntary model.

E. Specified problem definition

The Swedish Armed Forces differs from many other organizations in the respect that it is government run. This implies regulations on employment terminations that make it difficult to terminate an employment in practice. Involuntary turnover is near non-existent and as such is disregarded in this study. Subsequently, all turnover is assumed to be voluntary.

This unilateral approach created the premise for the hypothesis from which this study is sprung, namely: "employees in the Swedish Armed Forces share a pattern of behavior prior to terminating their employment. This pattern of behavior can be detected and used to predict employee resignation with the aid of machine learning". The intuitive connection between behavior and termination of employment has been documented in previous studies, along with studies showing that employees share behavioral progression patterns leading up to the termination [5][6], as well as studies that remain inconclusive [7]. The studies show a correlation between absenteeism and resignation [5]. This study has explored behavioral variables as well as demographic variables in order to answer the question: "To what extent can the machine learning model random forest be used to predict employee resignation in the Swedish Armed Forces?".

Previous comparative studies on machine learning to predict employee turnover have found that random forest is an apt machine learning model, with top or close to top scores on both precision and accuracy compared to other machine learning models. For this reason, random forest was chosen for this study.

F. Challenges

The challenges of this study are twofold, both pertaining to data collection. The first challenge is the quality of the data. Due to data inconsistency and the sensitive nature of the data, only a small portion was made available for this study. Furthermore, data related to behavior, which would be of great interest to this study given the link between behavior and resignation, has not been recorded to a significant extent. The second challenge has been obtaining the existing data. The distribution of the data is restricted in primarily two regards: on the one hand by the rights of the employees under the General Data Protection Regulation (GDPR) recently instated in the EU, and on the other hand as a result of the nature of the organization, since access to the entire data set may give too great an insight into the organization and publishing it is an issue of national security. In summary, both obtaining the data and the quantity and quality of the data pose a challenge to this study.

III. BACKGROUND

A. Swedish Armed Forces, organization

The Swedish Armed Forces is the employer of 50 000 men and women at the time of this study, whereof 20 000 constitute the daily operations [1]. These 20 000 employees can be further divided into the employment categories officer, specialist officer, group leader/soldier/seaman (hereinafter GSS), civilian employees and others. The remaining 30 000 employees consist of part-time officers and GSS as well as the national guard, with a requirement to serve a minimum of 8 days per year. Due to the consistency in the working hours and routines, as well as the importance of retaining the intellectual capital of the full-time employees, this study will focus on the 20 000 that make up the daily operations. Among the full-time officers, civilian and GSS employees the churn rates as of 2017 are 6%, 13% and 15% respectively, whereof the latter category sees 21% of employees leave within the first year.

During the course of the past 10 years the Swedish Armed Forces have changed their organizational structure four times. This may affect the turnover rate depending on the effect it had on the employees and how the changes were received. This also provides a potential comparative study in the future; the notion is explored further in the discussion section of this study. In the Swedish Armed Forces the initial investment in employees is high and the time to payoff is long. Under the voluntary recruitment model approximately 60% of the wartime staffing requirements could be met [8]. The increase in employees as a result of conscription caused an increase in the cost of training new recruits from 68 million SEK in 2016 to 155 million SEK in 2017 [1]. This resulted in 2419 trained recruits at a cost of 64 thousand SEK per recruit [9]. Of these, 89% decided to accept a position in the Armed Forces after approximately one year of training [1].

B. Economic aspects of attrition

Employee turnover is, and ought to be, an important component for organizations to monitor and manage. A high attrition rate implies steep costs for the affected organization, varying from an added cost of 50% of the annual salary for entry level employees to 250% for senior and executive level employees [10]. The cause of the costs can be derived from several factors such as loss of productivity, loss of employee knowledge, cost of training and recruiting, a decline in staff morale and customer dissatisfaction [11]. In addition to the high initial costs of training, the Swedish Armed Forces are faced with another heightened risk. According to the organization's annual report on exit interviews, employees of the Swedish Armed Forces value comradery highly. A large sense of community may contribute to a low employee attrition rate; however, where there is a large sense of comradery the resignation of one employee may cause a domino effect and a decline in staff morale [12].

C. Background machine learning

Machine learning was coined and defined by Arthur Samuel (1959) as a "field of study that gives computers the ability to learn without being explicitly programmed". After gradually being infused with probability, decision theory and optimization, there is today a multitude of powerful machine learning models. Of these models, this study will be applying random forest. In short, random forest is a collection of instances of another machine learning model called the decision tree. The decision tree classifier aims to classify data by repeatedly dividing it into classes determined by logical disjunctions [13][14]. However, there is a limit to how complex a decision tree can usefully become: a tree that is allowed to fit the training data too closely over-fits, at the cost of accuracy when run on the test data. The random forest model was created by T. K. Ho [15] and later developed by L. Breiman and A. Cutler as a solution to this problem. The random forest model is essentially a collection of decision trees that are trained on slightly different data and on slightly varying features. The classification is determined by a majority "vote" of the outcomes of the decision trees [16].
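
To make the ensemble idea concrete, the following minimal sketch (not the thesis code; the names and the per-tree feature sampling are illustrative assumptions, and X and y are assumed to be NumPy arrays with binary 0/1 labels) trains a collection of scikit-learn decision trees on bootstrap samples and random feature subsets and classifies by majority vote. Note that scikit-learn's RandomForestClassifier, used later in this study, samples features at every split rather than once per tree.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_forest(X, y, n_trees=100, seed=42):
    # Train n_trees decision trees, each on a bootstrap sample of the rows
    # and a random subset of the columns (features).
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    n_feats = max(1, int(np.sqrt(n_cols)))
    forest = []
    for _ in range(n_trees):
        rows = rng.integers(0, n_rows, size=n_rows)            # sample rows with replacement
        cols = rng.choice(n_cols, size=n_feats, replace=False)  # random feature subset
        tree = DecisionTreeClassifier(random_state=seed).fit(X[rows][:, cols], y[rows])
        forest.append((tree, cols))
    return forest

def predict_forest(forest, X):
    # Each tree votes 0 (stay) or 1 (resign); the majority decides the class.
    votes = np.array([tree.predict(X[:, cols]) for tree, cols in forest])
    return (votes.mean(axis=0) >= 0.5).astype(int)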

The methodology in this thesis takes on the O.S.E.M.N. outline as first described by H. Mason and C. Wiggins. O.S.E.M.N. is an acronym for obtain, scrub, explore, model and interpret, and was first introduced in the 2010 article A Taxonomy of Data Science [17]; the methodology has since been included in several educational textbooks [18].

The outline provides an approach to processing data-heavy tasks, as "pointing and clicking does not scale" (Mason & Wiggins, 2010). Obtain, in this context, refers to the process of obtaining information in an appropriate form and in sufficient amounts. Scrubbing is the process of preparing the data for the machine learning program, including processes such as removing extra characters and formatting the data until it can be understood by the program. Exploring the data through means of visualization is the final step in preparing the data for the model. At this point the data can be read by the program and run through the model. The program splits the data into a training set and a testing set. The training data is the data in which the model identifies patterns and is used as the reference to predict the outcome in the test data. To create the best premise for accurate predictions, the training data makes up a majority of the total data, often with a split of around 75-80% training data and 20-25% test data. The better the model is, the more accurate and precise the predictions will turn out. The final stage of the O.S.E.M.N., interpret, is the stage of turning the results of the prediction from numbers into significance in terms of the initial problem; in the case of this study, for example, interpreting a 0 as an employee staying with the organization and a 1 as an employee leaving.

Finally, the machine learning model and its application to the Swedish Armed Forces is to be evaluated on its ability to predict future resignations. The evaluation is founded on the work of D. M. Powers's study Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation, evaluating the accuracy, precision and recall. These are calculated as different ratios of the resulting predictions. The predictions can be separated into true positives (TPP), true negatives (TNP), false positives (FPP) and false negatives (FNP), along with the total actual positives (TAP). Accuracy is the measure of the true predictions, positive or negative, over the total number of predictions, $A = \frac{TPP + TNP}{TPP + TNP + FPP + FNP}$. Precision, $P = \frac{TPP}{TPP + FPP}$, is the true positive predictions over all positive predictions, and recall, $R = \frac{TPP}{TAP}$, is the true positives over all actual positives, also called the true positive accuracy. Together these form the primary metrics against which the model will be evaluated.
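
As a minimal worked example of these definitions (the counts below are hypothetical, chosen only to illustrate the formulas, not taken from the study's data):

# Hypothetical confusion counts for a test group of 100 employees
TPP, TNP, FPP, FNP = 10, 75, 5, 10
TAP = TPP + FNP          # total actual positives (resignations)
TAN = TNP + FPP          # total actual negatives (stays)

accuracy  = (TPP + TNP) / (TPP + TNP + FPP + FNP)   # (10 + 75) / 100 = 0.85
precision = TPP / (TPP + FPP)                        # 10 / 15 = 0.67
recall    = TPP / TAP                                # 10 / 20 = 0.50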

D. Previous studies on machine learning and employee attrition

Multiple comparative studies have been conducted evaluating the precision and accuracy with which an assortment of machine learning models can predict employee attrition [19][20]. Among these studies, Gabrani and Kwatra (2018) [21] have compared the machine learning models logistic regression, decision tree, random forest and AdaBoost to predict employee attrition in the tech industry. Among these, random forest had the highest scores on the metrics used. In a similar study, Alduayj and Rajpoot (2018) compared linear support vector machines (SVM), quadratic SVM, cubic SVM, Gaussian SVM, random forest and k-nearest neighbors with regard to accuracy, precision and recall, resulting in Gaussian SVM having the highest average score and random forest the second highest.


E. Societal and Ethical Aspects

Aside from the ethical aspects of dealing with personal data as well as analyzing the longevity of an individual’s employment at the source of their livelihood, ethical dilemmas can be divided into three main categories: placing different values on employees, promoting negative behavior and promoting an ideologically controversial model.

In a world with limited resources, and with machine learning used as a means of retaining employees, it might not be a far leap to anticipate the development of a method to determine which employees are worth spending resources on in order to retain. In fact, employee value models (EVM) already exist. In the event of applying an EVM, different employees would be valued differently and possibly be treated accordingly. In an organization with limited resources, retention practices may not be afforded to all. The ethical dilemmas that may arise pertain to the question: who should be provided with an incentive to stay? Different EVMs function differently; nevertheless, they are all based on a cost-profit optimization calculation, often benefiting those with medium to high positions within the organization, in which case resources would be spent to retain these employees, even though they require more resources to be incentivized.

On the other hand, a utilitarian approach could be taken, with the goal of driving down the overall turnover rate, in which case the incentive would be provided to as many employees deemed likely to resign as possible. In summary, the decision on how to make use of the information that the machine learning model could provide is an ethical dilemma in and of itself.

Another possible ethical implication of this study is the risk of promoting negative behavior. This study builds on the hypothesis that discrepancies in employee behavior can be used to predict resignations. Given this premise, this study could potentially justify rewarding employees displaying negative behavior, whereas employees who go about their work diligently and without any indication of quitting may be at a disadvantage. The result of this study could thereby promote undesirable behavior.

Lastly, and on a broader societal scope, this study addresses the ideologically controversial issue of conscription. The ethical dilemma is depicted in two conflicting philosophies, namely that of the individual as described by Robert Nozick’s notions on the Night-Watchman State [22], and that of the collective as depicted by John Rawls as the welfare state [23].

It can be argued that this paper aids the functioning of the latter perspective and, in doing so, takes an ideological stance, which is not its intent. Any application of the results of this study should be made with consideration of both the means and the end, and within a rigorous ethical framework [24].

F. Conclusion

Selecting a thoroughly tested machine learning model with exceptional results when predicting employee turnover in previous studies creates a stable foundation for achieving a high level of accuracy in this study. Given the high scores for the random forest model in relevant studies, this study is conducted using this model throughout. The machine will be trained using 80% of the data collected and tested on the remaining 20%. Aside from an evaluation by percentage accuracy, precision and recall of the prediction, it is also relevant to look at several other metrics, among them the true positive results (positive predictions made by the machine found to be true according to the given data) in particular. As a majority of employees will not resign over the given period of one year, the true negatives (negative predictions made by the machine found to be true according to the given data) will make up a large part of the statistics; this makes the true positives a more significant indicator of the machine's success.

With regard to the ethical challenges of this work, it is encouraged that the results are used within an ethical framework.

IV. METHOD

A. Overview

The study is conducted on 1500 employees in the Swedish Armed Forces. The employees belong to GSS-K, the group of full-time employees with the highest turnover rate. The data is centrally sourced from the Swedish Armed Forces' headquarters. The model is built in Python using the integrated development environment (IDE) Spyder. The Python library pandas is used to edit and format the data and the Python tool scikit-learn is used to train the model.

B. Obtain Data

Data collection is a challenging and important part of the machine learning process, and the data collected has an impact on the success of the model's application. There are several attributes of the data that can affect the model; these can be separated into two categories, namely quality and quantity. The quality of the data refers to aspects such as the selected variables and the quality of the data in terms of outliers and missing data points. These aspects contribute to the accuracy of the model. The quantity of the data refers to both the number of variables and, more importantly, the number of cases; in this study, cases are employees in the Swedish Armed Forces. These aspects contribute to the scores on the metrics against which the study is being evaluated.

Qualitative aspects of data

Within the topic of data quality, the nature of the chosen variables as well as the quality of the data will be looked at in further detail. In order to increase the accuracy of this study, variables which have a strong correlation to resignation must be found. As previously mentioned, the study is based on the hypothesis that there is a correlation between behavior and resignation. If the hypothesis is correct, data in the form of variables that can be tied to behavior, such as tardiness, hours of overtime per month and sick leave, is desirable. However, these variables are often harder to obtain than more generic data such as age, gender, salary and place of employment. The generic variables do not indicate behavior but still contribute to predicting employee resignation and provide insight into structural patterns (if department, for example, affects the accuracy or precision of the prediction, it would indicate a correlation between department and resignation). The data set obtained may be missing certain data points or have outliers that may skew the model. In order to remedy the effect of this, the data needs to be scrubbed, a process described further on in this study.

Quantitative aspects of data

Although this study would ideally be based on a larger amount of data, for previously mentioned reasons this was not available. The data received for this study consists of 1500 individuals and covers 13 variables. The variables obtained were the following:

1. Gender
2. Age
3. Employment category (a, b, c, d)
4. Branch (1, 2, 3, 4, 5)
5. Month of discontinued employment
6. Category of transfer (in case of internal transfer)
7. Cause of discontinued employment
8. Level (a-e, n, o)
9. Number of months with scheduled working hours
10. Average monthly working hours
11. Number of 24 consecutive working hours
12. Unplanned working hours taken
13. Other operations

Some of the variables will not be useful to the machine learning process (such as cause of discontinued employment) and will be removed from the data set. Furthermore, some of the data has been encrypted with substitute data to ensure the anonymity of the employees as well as to conceal certain aspects of the organizational structure. Regardless, the learning process is not affected by this, and discussion and analysis can be made regarding the influence of these variables, despite the encryption.

C. Scrub

Overview of the scrubbing process

The scrubbing includes several steps before the data is ready for the next step of the O.S.E.M.N. methodology. The first is considering the variables and what information they hold. Given that the data set is based on availability, and not specifically tailored to this study and its research question, the variables are considered first: are there any co-dependencies that could obstruct the machine learning? Once these have been identified and neutralized, the specific data points within the remaining variables can be scrubbed. Different terms with the same meaning are aligned, and excessive symbols are removed. One-hot encoding is used to convert non-numeric values (such as employment type) into binary representations. The features, the variables that the machine uses to make predictions, and the target, that is the classification we wish to predict, are set. The data is then randomly divided into training and test data at an 80 to 20 ratio.
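
A minimal sketch of these final scrubbing steps, assuming the cleaned data has been loaded into a pandas DataFrame with a binary 'Discontinued employment' column (the file and column names are assumptions; the full program used in the study is listed in Appendix I, which uses a 75/25 split rather than the 80/20 split described here):

import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv('scrubbed_employee_data.csv')      # assumed file name

# One-hot encode the remaining non-numeric columns
df = pd.get_dummies(df)

# Target: 1 = discontinued employment, 0 = still employed
y = df['Discontinued employment']
X = df.drop('Discontinued employment', axis=1)      # features used for prediction

# Random 80/20 split into training and test data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)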

Variables

Of the received variables there are a few that ought to be removed so as not to obstruct the learning and hinder predictions. Month of discontinued employment and cause of discontinued employment only hold data if there is a discontinued employment; this information will not be available prior to a termination of employment and as such cannot contribute to predictions in the future, and it would also be a dead giveaway for the model. These two variables are therefore removed. Category of transfer will only hold data if an employee has transferred positions within the organization, and out of the GSS-K group that is being studied. Furthermore, a vast majority of the transfers are to part-time employment as a complement to studies or another, primary source of income; for the purpose of this study they are considered a discontinued employment, and the variable is therefore removed. The target feature, discontinued employment, is not explicitly represented in the data received but can be derived from the data in cause of discontinued employment. The data is extracted and represented in binary in a column called Discontinued employment.

Although these variables have little indication of behavior, save perhaps average working hours, they hold demographic information (age, gender), information about where the employee is stationed (defense branch, level and k-category) as well as some information about the type of work they do and how much they work (average monthly working hours, number of 24 consecutive working hours, unplanned working hours taken and other operations).

Data preparation

The received file contained data in different forms. Symbols are converted to an appropriate numerical value (dashes might for example be converted to zeros), and any other non-numeric values are converted into a numeric representation. If the values are binary (such as gender in this particular collection of data) they can be converted in Excel to representations of 0s and 1s.

Table I: sample of employee data


| Gender | Age | Employment category | Branch | Level | No. months with scheduled hours | Av. monthly working hours | No. of 24 consecutive hours of work | Unplanned working hours | Other operations | Discontinued employment |
|--------|-----|---------------------|--------|-------|---------------------------------|---------------------------|-------------------------------------|-------------------------|------------------|-------------------------|
| Man    | 29  | a                   | 1      | b     | 7                               | 124                       | 0                                   | 0                       | 73               | 0                       |
| Man    | 29  | d                   | 4      | d     | 12                              | 98                        | 49                                  | 0                       | 0                | 0                       |
| Man    | 24  | c                   | 4      | d     | 12                              | 100                       | 13                                  | 0                       | 0                | 0                       |
| Woman  | 23  | d                   | 1      | a     | 12                              | 101                       | 20                                  | 0                       | 32               | 0                       |
| Man    | 32  | b                   | 4      | o     | 1                               | 100                       | 2                                   | 0                       | 0                | 1                       |

Where there are more than two non-numeric options, one-hot encoding is used; this allows the data to be represented numerically without assigning weighted values to the categories. The K-Category variable is demonstrated below.

In Figure III an excerpt of the K-Category data before one-hot encoding can be seen. Prior to the one-hot encoding, the K-Category data is represented in alphabetical categories; the goal of the encoding is to convert the alphabetical categories into a binary representation.

Figure III: sample of data before one-hot encoding

Figure IV shows the K-Category data in binary representation. This adds a number of columns to the total data set; however, it allows the categories to be void of weighted values, which would be the case if the alphabetical categories (the current a, b, c and d) had been replaced with integers (for example 1, 2, 3 and 4).

Figure IV: sample of data after one-hot encoding

One-hot encoding converts the non-numeric data into binary data distinguished by position.
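
A small sketch of this encoding step, assuming a pandas DataFrame with a 'K-Category' column holding the values a-d (the column name and values mirror the example above and are otherwise assumptions):

import pandas as pd

before = pd.DataFrame({'K-Category': ['a', 'd', 'c', 'd', 'b']})

# One column per category; each row holds a single 1 marking its category
after = pd.get_dummies(before, columns=['K-Category'], dtype=int)
print(after)
#    K-Category_a  K-Category_b  K-Category_c  K-Category_d
# 0             1             0             0             0
# 1             0             0             0             1
# ...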

D. Explore

Once the data has been scrubbed, visual aids are used to identify any outliers, missing data points, anomalies or misinformation, and to correct them.

Checking for outliers and missing data

Outliers and missing data can skew the outcome of the machine learning process. These anomalies are identified in two steps. The first step is a check, done numerically in Excel, that varies depending on what might be expected from the data. For example, with employment category we expect the outcome to be one of four possible values: a, b, c or d. This can easily be checked with Excel's IF-OR function, yielding a result of 0 if the value in the cell is a, b, c or d and a value of 1 if it is anything else. If the sum of the resulting values is equal to 0, there are no outliers. Similar processes are applied when applicable.
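
The same kind of check can also be done directly in pandas; the following is a hedged sketch (file and column names are assumptions) that counts values falling outside the expected categories and lists missing data points per column:

import pandas as pd

df = pd.read_csv('scrubbed_employee_data.csv')              # assumed file name
expected = {'a', 'b', 'c', 'd'}

# Rows whose employment category is not one of the expected values
outliers = df[~df['Employment category'].isin(expected)]
print(len(outliers))                                        # 0 means no outliers

# Missing values per column
print(df.isna().sum())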

In some cases a numeric check is not indicative enough, or the data is easier to analyze graphically; this is the case for ages (Fig. I), as a certain span and spread is expected, as well as for average working hours per month (Fig. II).

Figure I: Age of employees

Figure I above shows no significant outliers in the Age data set.


Figure II: Average hours per month

From figure I and figure II it is apparent that the data sets Age and Average hours per month have no significant outliers nor missing data points.

E. Model

After thorough data preparation, the modelling aspect of this study is the briefest part of the O.S.E.M.N. process. The prepared data is imported into the Python program. The program splits the data into the aforementioned training and test data using sklearn's train_test_split method, ensuring a randomly selected split of the data. Sklearn's RandomForestClassifier is imported and the model is trained using the built-in fit function. The predict function is used on the test data. The results of the predictions are exported and calculations are made on the prediction metrics.
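
The calculations on the prediction metrics can also be made directly in Python; a brief sketch is given below (the names test_labels and predictions follow the program listed in Appendix I, and the mapping of the confusion matrix entries to the study's acronyms is this author's reading of Powers's definitions):

from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score

# For binary 0/1 labels, confusion_matrix returns [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(test_labels, predictions).ravel()
print('TPP:', tp, 'TNP:', tn, 'FPP:', fp, 'FNP:', fn)

print('Accuracy:',  accuracy_score(test_labels, predictions))
print('Precision:', precision_score(test_labels, predictions))
print('Recall:',    recall_score(test_labels, predictions))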

F. Interpret

In order to complete the final step of the O.S.E.M.N. methodology, baselines and metrics must first be established. This is in order to aid in getting a more detailed interpretation of the results.

Establish a baseline

This study will compare the random forest prediction against two baselines that are plausible today. The first baseline is the zero guess. Currently, the Swedish Armed Forces have no way of knowing who is going to quit; however, the probability of an employee quitting is smaller than that of the employee staying, given the current turnover rate of 15%. Hence, for any given situation without prior knowledge of the individual, guessing that the employee will not be leaving will more often be right than wrong. This is what is meant by the zero guess (as not leaving has a value of 0 in the random forest algorithm and leaving has a value of 1). Sticking with the zero guess naturally yields an accuracy of 85%; however, as precision depends on the number of true positive predictions (TPP), this method will have 0 precision, as it never makes a positive prediction (positive as in targeting the sought-for output variable, that is resignation). Another approach is the random guess: the Swedish Armed Forces have well documented statistics regarding the turnover rate, and suppose these were used to randomly predict who will resign. This method is less accurate, with an accuracy of 74%, but more precise, with a precision of 15% (some of the positive guesses are probably going to be right). It should be noted, however, that the random guess has a high variability of 13. Nevertheless, these are the baselines available to us today.
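
The two baselines can be reproduced with a short calculation, assuming a turnover rate of about 15% and guesses that are made independently of the true outcome (Appendix II contains the full simulation of the random guess; the exact rate below is an assumption chosen to match the reported figures):

p = 0.155                                    # share of employees who resign (turnover rate)

# Zero guess: always predict "stays"
zero_accuracy  = 1 - p                       # ~0.85: right whenever the employee stays
zero_precision = 0.0                         # no positive predictions are ever made

# Random guess: predict "resigns" with probability p, independently of the truth
rand_accuracy  = p * p + (1 - p) * (1 - p)   # ~0.74: guess and outcome happen to agree
rand_precision = p                           # of the positive guesses, ~15% are right
rand_recall    = p                           # ~15% of actual resignations are flagged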

Metrics

The results will be measured and conveyed through a number of metrics, of which the most significant are accuracy and precision, as previously mentioned. Accuracy is measured as the total number of true predictions (positive and negative) over the total number of predictions, i.e. the size of the test group. Precision is measured as the ratio between the true positive predictions, that is terminated employments that were correctly predicted by the machine, and the total number of positive predictions (true and false); in other words, how large a percentage of the total predicted resignations were correct. These two measurements will be the ones used to benchmark against the zero guess and the random guess methods. Aside from these metrics there are a number of less significant ratios that are interesting to mention. One of these is the ratio between the false positive predictions and the total actual negatives in the test group, that is to say, how many employees that stayed were wrongfully predicted to terminate their employments. To describe these relationships more efficiently, a number of acronyms are used.

Table II: clarification of acronyms

| Type | Acronym | Definition |
|------|---------|------------|
| True positive predictions | TPP | The number of predictions of resignation or otherwise terminated employment (hereinafter resignation) which were correct |
| False positive predictions | FPP | The number of predictions that an employee in the test group would resign which were incorrect |
| True negative predictions | TNP | The number of predictions that an employee in the test group would not resign which were correct |
| False negative predictions | FNP | The number of predictions that an employee in the test group would not resign which were incorrect |
| Total actual negatives | TAN | The total number of employees in the test group that did not resign |
| Total actual positives | TAP | The total number of employees in the test group that resigned |

Sensitivity test

Lastly, to get a more nuanced picture of the results and the impact of the data, a sensitivity test is carried out, in order to further answer the question of which variables influence resignation the most. The test is conducted by re-running the data through the model while removing one feature at a time. If removing a feature causes the metrics to deviate substantially, it is assumed to bear greater significance to the employee's pending decision to resign.
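
A sketch of one way to implement this sensitivity test, retraining the model once per removed feature (the names X_train, X_test, y_train and y_test follow the earlier split sketch; dropping single columns rather than whole feature groups is a simplifying assumption):

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score

for feature in X_train.columns:
    # Drop one feature from both the training and the test data
    X_tr = X_train.drop(feature, axis=1)
    X_te = X_test.drop(feature, axis=1)

    clf = RandomForestClassifier(n_estimators=1000, random_state=42)
    clf.fit(X_tr, y_train)
    pred = clf.predict(X_te)

    print(feature,
          round(accuracy_score(y_test, pred), 3),
          round(precision_score(y_test, pred), 3),
          round(recall_score(y_test, pred), 3))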

V. RESULTS

After running the data through the model the following results were obtained.

Table III: Results

| Metric | Percentage (%) | Calculation |
|--------|----------------|-------------|
| Percentage true predictions of total test group (accuracy) | 89.067 | (TPP + TNP) / (TAN + TAP) |
| Percentage false predictions of total test group | 10.933 | (FPP + FNP) / (TAN + TAP) |
| Percentage true positive predictions of all positive predictions (precision) | 71.795 | TPP / (TPP + FPP) |
| Percentage true positive predictions out of total actual resignations (recall) | 48.276 | TPP / TAP |
| Incorrect positive predictions out of employees not resigning | 3.470 | FPP / TAN |
| Percentage true negative predictions out of all negative predictions | 91.071 | TNP / (TNP + FNP) |
| Percentage of actual stays correctly predicted | 96.530 | TNP / TAN |
| Percentage of incorrect negative predictions | 51.724 | FNP / TAP |

Out of the total positive predictions made, 71.795% were true. 48.276% of the test group's actual resignations were predicted, and 3.470% of the employees who did not resign were wrongfully predicted to resign. Out of the total negative predictions that were made, 91.071% were true. Of the total test group, the model correctly predicted 96.530% of the persons who did not resign. Furthermore, the sensitivity study yielded the results found in Table IV.

Table IV: Sensitivity Test

| Metric | All data included | Gender | Age | K-category | Branch | Level | No. months with scheduled hours | Average hours per month | Consecutive 24h shifts | Unplanned shifts | Other operations |
|--------|-------------------|--------|-----|------------|--------|-------|---------------------------------|-------------------------|------------------------|------------------|------------------|
| Percentage of total true predictions | 89 | 88 | 87 | 89 | 90 | 87 | 85 | 87 | 88 | 88 | 89 |
| Percentage of total false predictions | 11 | 12 | 13 | 11 | 10 | 13 | 15 | 13 | 12 | 12 | 11 |
| Percentage of true positive predictions | 72 | 68 | 62 | 73 | 78 | 62 | 53 | 61 | 66 | 67 | 73 |
| Percentage of actual resignations predicted | 48 | 47 | 45 | 47 | 50 | 50 | 28 | 48 | 50 | 48 | 50 |
| Incorrect positive predictions | 3 | 4 | 5 | 3 | 3 | 6 | 4 | 6 | 5 | 4 | 3 |
| Percentage of negative predictions which were true | 91 | 91 | 90 | 91 | 91 | 91 | 88 | 91 | 91 | 91 | 91 |
| Percentage of actual stays predicted | 97 | 96 | 95 | 97 | 97 | 94 | 96 | 94 | 95 | 96 | 97 |
| Percentage of incorrect negative predictions | 52 | 53 | 55 | 53 | 50 | 50 | 72 | 52 | 50 | 52 | 50 |

Table IV shows the metrics resulting from removing one feature at a time from the original data set (each column after the first shows the results with that feature removed). A larger deviation, or a larger overall impact on the metrics, implies that the feature is more closely linked to resignation.


Figure V: graph of sensitivity test

The sensitivity test resulted in varying dependability of predictions depending on which variable was removed. The variations of the metrics from the original data set can be seen in Fig. V. The largest of these variations was for the removal of the number of months with scheduled hours feature.

Interpretation

Table V below compares the accuracy and reliability of the chosen machine learning model to the current method, the zero guess, and a hypothetical alternative, the random guess. Not guessing at all results in an accuracy of around 85%, which reflects the average 15% turnover seen in the Swedish Armed Forces. Knowing the average turnover and guessing randomly results in a higher precision and recall, both around 15%, and since so few of the guesses are positive the accuracy will be fairly high as well, at 74%.

Table V: comparison to the zero guess and the random guess

| Metric | Random forest | Zero guess | Random guess |
|--------|---------------|------------|--------------|
| Accuracy | 89.067 | 84.533 | 73.873 |
| Precision | 71.795 | 0 | 15.413 |
| Recall | 48.276 | 0 | 15.467 |

VI. DISCUSSION

It is evident from the results that using machine learning with the random forest model to predict which employees will resign yields a non-trivial result. The implication is that by running these 10 features through the random forest model, predictions of which employees are about to resign can be made more accurate and precise, with an accuracy of 89% and a precision of 72%. However, in this case the model only catches around 50% of the employees that end up resigning; then again, that is 50% more than the current strategy. From an ethical or resource-oriented perspective (depending on how the model is implemented) it is reassuring that the machine errs on the side of caution, falsely predicting a mere 3.5% to leave when they in reality stay. The implication of this could, for example, be that if resources were spent to keep employees from leaving, they would go to waste when spent on the 3.5% intending to stay but would be well placed when spent on the 50% of the leaving employees.

The research question

The research question “To what extent can the machine learning model random forest be used to predict employee resignation in the Swedish Armed Forces?” is answered in that the model yields a better result than the other methods tested.

Nevertheless, the question remains: is it good enough? Are the uncertainties in the result small enough to act on? These questions should be answered within an ethical frame to ensure the fair treatment of employees.

Sensitivity test

Another interesting discovery is the impact that the different variables had on the prediction. This thesis sprang out of the hypothesis that employees about to resign share a pattern of behavior. Although the collected variables were not particularly indicative of behavior, the ones that came closest, such as number of months with scheduled hours and average hours per month, had some noteworthy impacts on the metrics. The most significant of these was the impact that removing the data on number of months with scheduled hours had on nearly all metrics, lowering accuracy, precision and several others. Removing the data on average hours per month resulted in the lowest precision, and removing demographic data such as age and gender resulted in predicting fewer resignations.

Assumptions

As previously mentioned, the strict regulations prohibiting non-voluntary terminations of employment in the Swedish Armed Forces are the basis for the assumption that all terminated employments are voluntary. This may lead to false conclusions and implies that the resignation in and of itself is negative, which is not necessarily true; as mentioned, turnover is positive in the case of a bad recruit. This may cause behavioral outliers, and the conditions of the terminated employment may need to be distinguished between in future studies.

Limitations

The study is based on a sample of 1500 individuals and 10 variables, randomly selected by the Swedish Armed Forces. The amount of data available for outside studies is quite limited: out of the 20 000 regular employees, only approximately 13% could be used in this study due to security reasons. This, however, also implies that the Swedish Armed Forces could run the model internally using the remaining data and, in all likelihood, get a better result. Another possibility is to use the model on the total 50 000 employees; however, different variables may have to be used, as the contracts and work vary greatly from those in this study. Furthermore, most of the 10 variables included were quite far from indicative of behavior; although the technology to track certain behavioral aspects, such as tardiness and overtime, exists and is implemented in the Swedish Armed Forces today, systems are not in place to sufficiently gather that kind of information. Remedies to these circumstances could lead to more reliable results.

Future studies

The Swedish Armed Forces is a very rewarding organization for this type of study. The technology to collect data exists, exit interviews are conducted at the time of departure and there is an interest in further exploring the use of machine learning in human resource management. With more time and systems in place, there is a need for further studies in both data collection and the machine learning models. This study was limited to the machine learning model random forest based on the evaluations in previous studies. A possible future investigation is whether there are other models better suited for this type of organization.

Furthermore, as mentioned previously, the Swedish Armed Forces have in the past years seen several large organizational changes. One possible future study is to use historic data to determine how different organizational structures impact employee turnover, in order to optimize future organizational structures.


VII. REFERENCES

[1] Försvarsmakten (2018). FM2017-13845:3. Försvarsmaktens årsredovisning 2017. Stockholm: Försvarsmakten HKV. Bilaga 1 Personalberättelse.

[2] Das, B. and Baruah, M. (2013). Employee Retention: A Review of Literature. IOSR Journal of Business and Management, 14(2), pp. 8-16.

[3] Weiss, G. E. and Lincoln, S. A. (1991, reprinted March 16, 1998). "Departing Employee Can Be Nightmare." Electronic News, p. 1.

[4] Abbasi, S. and Hollman, K. (2000). Turnover: The Real Bottom Line. Public Personnel Management, 29(3).

[5] Nagadevara, V., Srinivasan, V. and Valk, R. (2008). Establishing a link between employee turnover and withdrawal behaviours: Application of data mining techniques. Research and Practice in Human Resource Management, 16(2), pp. 81-97.

[6] Kanfer, R., Crosby, J. V. and Brandt, D. M. (1988). Investigating behavioral antecedents of turnover at three job tenure levels. Journal of Applied Psychology, 73(2), pp. 331-335.

[7] Rosse, J. (1988). Relations among Lateness, Absence, and Turnover: Is There a Progression of Withdrawal? Human Relations, 41(7), pp. 517-531.

[8] Statens Offentliga Utredningar (2016). En robust personalförsörjning av det militära försvaret (SOU 2016:63). Stockholm: Alanders Sverige AB.

[9] Olsson, P., Bäckström, P., Johansson, M., Lehman, J., Lusua, J., Ädel, M. and Öhrn-Lundin, J. (2018). Structural Challenges within the Swedish Military Supply of Personnel and Materiel. A18107. Totalförsvarets Forskningsinstitut.

[10] Kung, M. and O'Connell, M. (2007). Employee Turnover & Retention: Understanding the True Costs and Reducing them through Improved Selection Processes. Industrial Management, 49(1), p. 14.

[11] Conerly, B. (2019). Companies Need To Know The Dollar Cost Of Employee Turnover. [online] Forbes.com. Available at: https://www.forbes.com/sites/billconerly/2018/08/12/companies-need-to-know-the-dollar-cost-of-employee-turnover/#11def435d590 [Accessed 5 Apr. 2019].

[12] Richtsmeier, S. (2019). The Actual Cost of Employee Attrition. [online] Tinypulse.com. Available at: https://www.tinypulse.com/blog/the-actual-cost-of-employee-attrition [Accessed 5 Apr. 2019].

[13] Marsland, S. (2015). Machine Learning: An Algorithmic Perspective. 2nd ed. Boca Raton: Taylor and Francis Group, pp. 249-265.

[14] Patel, S. (2017). Chapter 3: Decision Tree Classifier — Theory. [online] Medium. Available at: https://medium.com/machine-learning-101/chapter-3-decision-trees-theory-e7398adac567 [Accessed 9 Apr. 2019].

[15] Ho, T. K. Random Decision Forests. AT&T Bell Laboratories, 600 Mountain Avenue, 2C-548C, Murray Hill, NJ 07974, USA.

[16] Breiman, L. (2001). Random forests. Machine Learning, 45(1), pp. 5-32.

[17] Mason, H. and Wiggins, C. (2010). A Taxonomy of Data Science. [online] Dataists.com. Available at: http://www.dataists.com/2010/09/a-taxonomy-of-data-science/ [Accessed 26 May 2019].

[18] Janssens, J. (2015). Data Science at the Command Line. 1st ed. Sebastopol: O'Reilly Media.

[19] Saradhi, V. and Palshikar, G. (2010). Employee churn prediction. Maharashtra, India: Tata Research Development and Design Center, Tata Consultancy Services.

[20] Punnoose, R. and Ajit, P. (2016). Prediction of Employee Turnover in Organizations using Machine Learning Algorithms. Xavier School of Management, Jamshedpur, India.

[21] Gabrani, G. and Kwatra, A. (2018). Machine Learning Based Predictive Model for Risk Assessment of Employee Attrition. School of Engineering and Technology, BML Munjal University, Gurgaon, India.

[22] Nozick, R. (1974). Anarchy, State, and Utopia. Basic Books.

[23] Rawls, J. (2005). A Theory of Justice. Cambridge, Mass.: Belknap Press.

[24] Schumann, P. (2001). A Moral Principles Framework for Human Resource Management Ethics. College of Business, Minnesota State University.


VIII. APPENDIX

I. Random Forest

# -*- coding: utf-8 -*-
"""
Created on Sat May 25 12:50:21 2019

@author: amand
"""
import pandas as pd
import numpy as np

# Read in prepped data from HKV
features = pd.read_csv('data_GSSK_red1_other_operations.csv')

# One-hot encode categorical features
features = pd.get_dummies(features)

# Set the target (labels) and remove it from the features
labels = np.array(features['Discontinued employment'])
features = features.drop('Discontinued employment', axis=1)
feature_list = list(features.columns)

# Convert to numpy array
features = np.array(features)

# Using scikit-learn to split data into training and testing sets
from sklearn.model_selection import train_test_split
train_features, test_features, train_labels, test_labels = train_test_split(
    features, labels, test_size=0.25, random_state=42)

# Import the model from sklearn
from sklearn.ensemble import RandomForestClassifier

# 1000 decision trees
clf = RandomForestClassifier(n_estimators=1000, random_state=42)

# Train the model on training data
clf.fit(train_features, train_labels)

# Test model on test data
predictions = clf.predict(test_features)

# Print the predictions for the test set
# print(test_labels)
print(predictions)

# Calculate the absolute errors
# errors = abs(predictions - test_labels)

# Print out the mean absolute error (mae)
# print('Mean Absolute Error:', round(np.mean(errors), 2), 'degrees.')

II. Random guess

# -*- coding: utf-8 -*-
"""
Created on Wed Aug 21 19:29:55 2019

@author: amand
"""
import pandas as pd
import numpy

# Read the actual outcomes (0 = stayed, 1 = resigned) for the 375-person test group
data = pd.read_excel(r'C:\Users\amand\Documents\KTH rond 2\KEX\pyt_ref.xlsx')
df = pd.DataFrame(data, columns=['results'])

j = 0
resultsAccuracyArray = []
resultsPrecisionArray = []
resultsPositiveRandomCount = []
accuracyCheck = []
fnp = 0
resultsRecallArray = []

# Repeat the random guess 10000 times to estimate its average performance
while j < 10000:
    x = 0
    i = 375                      # size of the test group
    randomArray = []
    positiveRandomCount = 0

    # Guess "resign" (1) with the observed turnover probability, otherwise "stay" (0)
    while x < i:
        r = numpy.random.choice(numpy.arange(0, 2),
                                p=[0.845333333333333, 0.154666666666667])
        randomArray.append(r)
        if r == 1:
            positiveRandomCount += 1
        x = x + 1
    resultsPositiveRandomCount.append(positiveRandomCount)

    # Compare the random guesses to the actual outcomes
    resultAccuracy = 0
    truePositive = 0
    n = 0
    while n != len(randomArray):
        if randomArray[n] == 1 and df.values[n] == 1:
            accuracyCheck.append(1)
            resultAccuracy = resultAccuracy + 1
            truePositive = truePositive + 1
        elif randomArray[n] == 0 and df.values[n] == 0:
            resultAccuracy = resultAccuracy + 1
            accuracyCheck.append(1)
        elif randomArray[n] == 0 and df.values[n] == 1:
            fnp = fnp + 1
        else:
            accuracyCheck.append(0)
        n = n + 1

    # Precision: true positives over all positive guesses
    if positiveRandomCount == 0:
        precision = 0
    else:
        precision = truePositive / positiveRandomCount

    # Recall: true positives over the 58 actual resignations in the test group
    recall = truePositive / 58

    resultsAccuracyArray.append(resultAccuracy)   # number of correct guesses out of 375
    resultsPrecisionArray.append(precision)
    resultsRecallArray.append(recall)
    j = j + 1
