DEGREE PROJECT IN ELECTRICAL ENGINEERING, SECOND CYCLE, 30 CREDITS

STOCKHOLM, SWEDEN 2017

Toward Better Health Care Service:

Statistical and Machine Learning Based Analysis of Swedish Patient Satisfaction Survey

YU WANG


Abstract

Patients, as customers of health care services, have the right to evaluate the service they receive, and health care providers and professionals may take advantage of these evaluations to improve the service. To investigate the relationship between patients' overall satisfaction and their satisfaction with specific aspects, this study uses classical statistical and machine learning based methods to analyze Swedish national patient satisfaction survey data.

Statistical methods, including cross tabulation, the chi-square test, the correlation matrix and linear regression, identify the relationships between features. It is found that patients' demographics have a significant association with overall satisfaction, and that patients' responses in each dimension show a similar trend that contributes to their overall satisfaction.

Machine learning classification approaches, including the Naïve Bayes classifier, logistic regression, tree-based models (decision tree, random forest, adaptive boosting decision tree), support vector machines and artificial neural networks, are used to build models that classify patients' overall satisfaction (positive or negative) based on the survey responses in each dimension and patients' demographic information. These models all have relatively high accuracy (87.41%–89.85%) and could help to find the important features of health care service and hence improve the quality of health care service in Sweden.


Sammanfattning

Patienter som kunder av hälsovårdstjänsten har rätt att utvärdera den tjänst de fått, och vårdgivare och yrkesverksamma kan utnyttja dessa utvärderingar för att förbättra vården. För att undersöka förhållandet mellan patientens övergripande tillfredsställelse och tillfredsställelse med specifika aspekter använder den här studien klassisk statistisk och maskininlärningsbaserad metod för att analysera svenska nationella patientundersökningsdata.

Statistisk metod, inklusive tvärtabulering, chi-square-test, korrelationsmatris och linjär regression, identifierar förhållandet mellan funktioner. Det är konstaterat att patienternas demografi har en betydande koppling till övergripande tillfredsställelse, och patienternas svar i varje dimension visar en liknande trend som bidrar till patienternas övergripande tillfredsställelse.

Klassificeringsmetoder för maskininlärning, inklusive Naïve Bayes-klassificeraren, logistisk regression, trädbaserad modell (beslutsträd, slumpmässig skog, adaptivt förstärkt beslutsträd), stödvektormaskiner och konstgjorda neurala nätverk, används för att bygga modeller för att klassificera patientens övergripande tillfredsställelse (positiv eller negativ) baserat på undersökningssvar i dimensioner och patienters demografiinformation. Dessa modeller har alla relativt hög noggrannhet (87.41%–89.85%) och kan hjälpa till att hitta de viktigaste egenskaperna hos vården och därmed förbättra kvaliteten på vården i Sverige.


Acknowledgment

First and foremost, I would like to express my gratitude to my thesis advisor, Saikat Chatterjee. As my advisor, Professor Chatterjee helped me through all of my research, offering constructive advice about experiment design, giving prompt and insightful feedback, and so forth. I would also like to thank the committee members, Professor Markus Flierl and Professor Anne Håkansson, for their precious time, valuable suggestions and comments.

In addition, I would like to thank all the staff at IC Quality. I especially want to thank Nils Press, whose instructions have contributed a great deal to my thesis writing.

Thanks also to all my classmates, who kindly supported each other over the past two years.

Especially, I would like to express my deepest love to my parents, for their love, selfless support and encouragement to finish my studies.

All of the above help has contributed much to my study and research, and I express my gratitude again to all of them.


Contents

1 Introduction
  1.1 Motivation
  1.2 Problem statement
  1.3 This study

2 Background
  2.1 Patient satisfaction
  2.2 Swedish national patient survey
    2.2.1 Implementation
    2.2.2 Measurement scales
    2.2.3 Dimensions
  2.3 Survey Data

3 Related work
  3.1 Statistical based approaches
  3.2 Machine learning based approaches

4 Method
  4.1 Data pre-processing
    4.1.1 Missing data and imputation methods
    4.1.2 Weight calculation in dimension
    4.1.3 Balancing data
    4.1.4 Dummy coding
  4.2 Basic analysis
    4.2.1 Cross tabulation
    4.2.2 Pearson's chi-square test
    4.2.3 Correlation matrix
    4.2.4 Linear regression
  4.3 Machine learning classification techniques
    4.3.1 Naïve Bayes classifier
    4.3.2 Logistic regression
    4.3.3 Tree-based model
    4.3.4 Support vector machines (SVM)
    4.3.5 Artificial neural networks

5 Experiments and results
  5.1 Experiment design
    5.1.1 Data split
    5.1.2 Data pre-processing
    5.1.3 Cross validation
    5.1.4 Grid search
    5.1.5 Evaluate model
  5.2 Naïve Bayes classifier
  5.3 Logistic regression
  5.4 Tree-based model
    5.4.1 Decision tree
    5.4.2 Random forest
    5.4.3 Adaptive boosting decision tree
  5.5 Support vector machines (SVM)
  5.6 Artificial neural networks
  5.7 Model performance comparison

6 Conclusions and Future Work

A 2016 patient satisfaction survey

B Basic analysis results
  B.1 Chi-square test
    B.1.1 Education and overall satisfaction
    B.1.2 Gender and overall satisfaction
    B.1.3 Occupation and overall satisfaction
    B.1.4 Question X000031 and overall satisfaction
    B.1.5 Question X000032 and overall satisfaction
  B.2 Linear regression


Chapter 1

Introduction

Over the past decade, patient satisfaction has gained increasing attention, as it is an important and widely accepted measure of care efficiency [1]. The consumerism of today's society has led to a competitive health care environment; therefore, health care providers take patient perception into account when designing strategies for quality improvement.

The measurement of patient satisfaction can be described as the difference between the perceived and expected satisfaction for each dimension. Patients' satisfaction is related to the extent to which general health care needs are met. Patients today have a higher level of education and increasingly hope to learn more about their health conditions, and even to participate in planning their own health care and decision-making [2]. Therefore, the quality of health care services cannot be judged only by health care providers on the basis of their professional standards and assessments.

Moreover, patient involvement influences quality in health care [3]. On the one hand, patients' experiences help to point out improvement areas that health care professionals had not previously recognized. On the other hand, patient involvement in quality improvement projects helps to fill existing gaps between organizational functions, supporting a view of care from a patient-process perspective. Thus, patient involvement contributes to an extended view of quality dimensions in health care.

The use of survey instruments to identify issues of concern to patients and to provide feedback to health care service providers is a market-driven approach that turns patient satisfaction surveys into a quality improvement tool for health care providers. For example, evaluation of patient satisfaction has been mandatory for all French hospitals since 1996 [4]. In Germany, measuring patients' satisfaction has been a required element of quality management reports since 2005 [5]. In Sweden, all county councils and regions have been involved in the National Patient Survey since 2009.

1.1 Motivation

Swedish health care providers show very good performance on medical outcomes. For instance, Sweden has high cancer survival rates compared with other Western countries [6], and low infant mortality rates compared with other European countries and the United States [7]. By contrast, compared to other Western countries, Swedish health care performs poorly in informing patients and enabling them to participate and take on a more active role [8].

As mentioned, understanding the patient's value-creation process from her or his perspective is a focus for health care providers that want to enhance the patient's perceived value. The national patient survey is a powerful tool for understanding patients' thoughts.

The survey is implemented by the company IC Quality¹, on behalf of Sweden's county councils and regions, in cooperation and coordination with Sveriges Kommuner och Landsting (SKL). IC Quality is today in possession of enormous amounts of survey data about patient experience, and the data is constantly growing. However, the best stories in the survey data remain hidden behind rows and columns in tables that are too difficult or expensive to customize. Compared to computers, the human mind is weak at performing calculations but much stronger at recognizing patterns; the average person cannot derive meaning and get help from such a voluminous data set.

1.2 Problem statement

Patient satisfaction has proved a difficult concept to measure, and its validity and usefulness have been increasingly questioned. Classical statistical methods (descriptive statistics) are currently used to analyze the Swedish national patient survey data², and bar charts are made to show the average positive response. However, no model-based approach is used at present.

The purpose of this work is to provide patients with better access to medical services and to help health care organizations in various medical management decisions. Using modern machine learning techniques, we build classification models for the health care survey data in Sweden to predict patient attitude toward the health care service (positive or negative).

With the models built, health care providers can easily find the factors that influence patients' satisfaction and thus narrow the gap between expectations (what patients want) and perceptions (what patients get) of relevant service attributes.

1.3 This study

This thesis is composed of six parts. Chapter One is an introduction, which presents the research problem and the structure of the thesis. Chapter Two addresses the research background: the significance of patient satisfaction, the Swedish national patient survey and the data collected. Chapter Three reviews related work in this domain in two parts: statistical based approaches and machine learning based approaches. Chapter Four focuses on the methodology of this research, which includes the data pre-processing methods, classic statistical techniques and machine learning techniques. Chapter Five presents the experiment design for this research, together with the results and a discussion of the outcomes. In Chapter Six, the research conclusion is drawn by summarizing the assessment results; limitations and implications of this research are also discussed in this last part.

¹The official site of IC Quality is https://icquality.se/

²The results of the 2015 survey data can be found at https://patientenkat.se/sv/resultat/ta-del-av-resultat/


Chapter 2

Background

In this chapter, the background to the problem is addressed. The background consists of a theory section about patient satisfaction and sections introducing the Swedish national patient survey and the survey data.

2.1 Patient satisfaction

The key to a successful business is understanding one's customers: knowing what they want and how satisfied they are with one's product and service. Would it be appropriate to address patients as "customers"? The word customer is defined as "a person who purchases goods or services" [9]. Today the patient sees himself as a buyer of health services. Therefore, there is a need to recognize that every patient has certain rights, which puts a special emphasis on the delivery of quality health care.

In 1948, the World Health Organization (WHO) defined health as "a state of complete physical, mental, and social well-being and not merely the absence of disease or infirmity" [10]. The traditional method of patient assessment is largely based on observer ratings by health professionals. Modern medicine is slowly beginning to recognize the importance of the patient's perspective in health care. More investigations are needed to understand the importance of patient satisfaction [11].

Customer satisfaction is defined as "the degree of satisfaction provided by the goods or services of a company as measured by the number of repeat customers" [12]. Evaluating to what extent patients are satisfied with health services is clinically relevant, as satisfied patients are more likely to take an active role in their own care [13], to comply with treatment [14], and to continue using medical care services and stay with a health provider. In addition, patients' views on health care are an important source for continually improving the care. Health professionals may take advantage of satisfaction surveys that identify potential areas for service improvement, and health expenditure may be optimized through patient-guided planning and evaluation [15].

However, customer satisfaction is not necessarily observed directly. In a social science context, analysis of such measures is done indirectly by employing proxy variables (also referred to as latent variables). In order to measure customer satisfaction, survey questionnaires are used, in which respondents are asked to express their degree of satisfaction with regard to multiple aspects of the product or service. The survey contains a set of observed variables: some attributes are objective, related to the service's technical characteristics; others are subjective, dealing with behaviours, feelings and psychological benefits.

2.2 Swedish national patient survey

All county councils and regions in Sweden have been involved in the National Patient Survey since 2009. The work is coordinated by the Swedish Association of Local Authorities and Regions. National joint surveys have been conducted every two years in primary care, somatic outpatient and inpatient care, emergency care, psychiatric outpatient and inpatient care, child outpatient and inpatient care and child psychiatry. By repeating the measurements, they have continuously collected knowledge about patients' views on the care received [16].

2.2.1 Implementation

Implementation of the measurements takes four steps: planning, implementation, evaluation and joint work. First, the measurement period, participants and methodological choices are planned based on actual needs. Second, the supplier collects data from patients. Third, once the survey data are collected, the process is evaluated and possible changes are proposed for future measurements. Finally, SKL is responsible for the guidelines for the conduct of the surveys and the management of historical results.

The survey is sent to people who have recently visited the health service, and they are asked to evaluate their recent visit. No results are traceable to individual persons. The questionnaire is carried out in accordance with the Data Protection Act and the industry codes of conduct. As an example, the details of the implementation of the 2015 Swedish national patient survey are shown in Table 2.1 [16].

2.2.2 Measurement scales

The National Patient Survey uses a 5-point scale, called a Likert scale, where the minimum integer value represents 'complete dissatisfaction' or 'completely disagree' while the maximum value stands for 'complete satisfaction' or 'completely agree'. In addition to the 5-point scale, a 'Not applicable' option is given.

The advantage of a five-point response scale is that it allows the respondent to take a neutral position on an issue while providing the freedom to rate their experience as positive or negative. With this response scale, significant differences can be identified even in small populations and over time.
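As a concrete illustration, a response on such a scale can be encoded numerically, with 'Not applicable' mapped to a missing value rather than a sixth scale point. The following is a minimal Python sketch; the response labels and the decision to treat 'Not applicable' as missing are illustrative assumptions, not the survey's actual coding.

```python
import math

# Hypothetical label-to-value mapping for one 5-point Likert item.
# "Not applicable" becomes NaN so it cannot distort scale statistics.
LIKERT_SCALE = {
    "Completely disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Completely agree": 5,
    "Not applicable": math.nan,  # treated as item nonresponse
}

def encode(responses):
    """Map raw answer strings to numeric Likert values (NaN = missing)."""
    return [LIKERT_SCALE[r] for r in responses]

codes = encode(["Agree", "Not applicable", "Completely agree"])
```

Keeping the neutral midpoint (3) distinct from 'Not applicable' preserves the respondent's ability to take a neutral position while still flagging the item as missing for later imputation.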

2.2.3 Dimensions

The National Patient Survey contains seven dimensions; each question in the questionnaire belongs to one of them [16]. The definition of each dimension is given below:


Table 2.1: Implementation of Swedish national patient survey in 2015

Measurement: Primary care, autumn 2015
Survey company: IC Quality
Survey methodology: Postal questionnaire (mailing, postal/digital invitation with response ability, and a reminder containing a postal questionnaire with digital response ability)
Sampling: All medical appointments at the health care unit in the selection period (some county councils/regions have also chosen to measure visits to the nurse)
Selection period: September 2015
Inclusion: Individual visits per care unit (health care/medical center) / all ages / out-of-county (utomläns) patients
Survey: Vårdbas and primary module (additional questions for some counties)
Language: Swedish (digital and postal reply possibility), Spanish, French, English, Arabic, Farsi, Finnish, Somali (digital reply possibility)
Time for response: October 12 – November 17
Participating counties/regions: All 21

Emotional support

This dimension is intended to show whether the patient feels that the staff are attentive and responsive to the patient's anxiety, concern, fear or pain, and, in turn, accessible and supportive in a manner the patient finds satisfactory.

Information and knowledge

This dimension is intended to show how well patients find that information is communicated, based on individual circumstances and in a proactive manner. It covers, for example, information on delays or waiting times, whether the patient is able to ask questions in an understandable way, and whether the patient is informed about the treatment, medication and warning signs that he or she should pay attention to. The dimension shows the patient's perception of how well the parties involved communicate.

Participation and involvement

This dimension is intended to show whether the patient feels involved and participates in his or her own care and in the decision-making.


Continuity and coordination

This dimension is intended to show patients' experience of health care's capacity for continuity and coordination, that is, how well the individual's care is coordinated, both internally and externally. It includes how patients experience the staff's ability to cooperate with each other and in relation to the patient.

Availability

This dimension refers to the patient's experience of health care accessibility, in terms of both proximity and contact channels, and of staff availability for the patient as well as for families.

Respect and treatment

This dimension illuminates the patient's experience of health care's capacity for treatment tailored to individual needs and conditions, for example, whether the reception is characterized by respect on the basis of equal worth, compassion and commitment. This dimension is related to the dimension Participation and involvement.

Overall satisfaction

This dimension shows the patient’s experience of health care in terms of overall experience, perceived effectiveness, clinical outcomes and security, etc.

2.3 Survey Data

In 2016, 186686 questionnaires¹ were sent, with a response rate of 42%. The 78575 respondents varied by gender (39.4% men), age (range 15–104 years), education and employment status. A basic analysis of respondents' demographics is presented in Figure 2.1, Table 2.2 and Table 2.3.

Table 2.2: Respondents by education

Education                                          Frequency    (%)
Post-secondary education, university or college        31463   40.0
High school or equivalent                              23904   30.4
Elementary school, folk school or equivalent           19485   24.8
Others                                                  2480    3.2
No completed education                                  1242    1.6

Along with the questions about the patients themselves, the remaining thirty-one questions concern patient satisfaction in the seven dimensions, which will be analyzed to help improve health care quality.

¹The 2016 Swedish national patient satisfaction survey questionnaire can be found in the appendix.


Figure 2.1: Age distribution of respondents.

Table 2.3: Respondents by employment status

Employment status   Frequency    (%)
Pensioner               41382   52.7
Employee                28557   36.3
Student                  2793    3.6
Other                    2526    3.2
Unemployed               1816    2.3
Missing                  1497    1.9


Chapter 3

Related work

This chapter introduces related work in this domain, covering statistical based approaches and machine learning based approaches.

3.1 Statistical based approaches

Statistical tools and techniques play a major role in many areas of science, such as biology, economics and physics. In particular, the science of survey sampling, survey data collection methodology, and the analysis of survey data date back a little more than one hundred years.

Descriptive statistics have long been used as a classic tool for survey data analysis [17, 18]. Estimation of totals, means, variances, and distribution percentiles for survey variables may be the primary objective of an analysis plan, or possibly an exploratory step on a path to a more multivariate treatment of the survey data. Major reports of survey results can be filled with detailed tabulations of descriptive estimates [19]. Descriptive statistics are applied in [20] to help health care providers estimate their patients' satisfaction by summarizing measurements of diagnostic accuracy, including sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV).
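The four diagnostic accuracy measures just listed are simple ratios over a 2x2 confusion table, which the following sketch computes; the counts are made-up numbers for illustration, not figures from [20].

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Summarize a 2x2 confusion table (true/false positives/negatives)."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Illustrative counts only:
measures = diagnostic_accuracy(tp=80, fp=20, fn=10, tn=90)
```

Sensitivity and specificity describe the test itself, while PPV and NPV describe what a positive or negative result means for an individual patient, which is why reports typically include all four.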

During 1950–1990, analytical treatments of survey data expanded as new developments in statistical theory and methods were introduced, empirically tested, and refined. Important classes of methods introduced during this period included log-linear models and related methods for contingency tables, generalized linear models [21] (e.g., logistic regression), survival analysis models, general linear mixed models [22] (e.g., hierarchical linear models), structural equation models, and latent variable models. Many of these new statistical techniques applied the method of maximum likelihood to estimate model parameters and standard errors of the estimates, assuming that the survey observations were independent observations from a known probability distribution (e.g., binomial, Poisson, normal).

Regression models still play an important role in modern survey analysis. Linear regression models are used to analyze cross-sectional survey data in [23], to examine the relationship between physician advice to quit smoking and patient care experiences. [24] estimated linear ordinary least squares regression equations, using national random telephone survey data, to test for direct effects of parenthood on measures of punitive attitudes toward juveniles, adults and overall. [25] performed multilevel regression analyses to identify factors associated with lack of preventive dental care among U.S. children and state-level factors that explain variation in preventive dental care access across states. Logistic regression is used in [26] to examine the relationship of alcohol outlet density (AOD) and neighborhood poverty with binge drinking and alcohol-related problems among drinkers in married and cohabiting relationships, and to assess whether these associations differ across sex. [27] estimated the association between residential mobility and non-medical use of prescription drugs (NMUPD), adjusting for potential confounders using logistic regression for survey data.

3.2 Machine learning based approaches

Machine learning is a modern analysis technique that has the potential to help researchers in survey analysis. At this analytical stage, data analysis can be either confirmatory or exploratory. In statistical data analysis, t-tests and analysis of variance are examples of confirmatory analysis, while factor analysis is a common exploratory technique. By contrast, machine learning algorithms are predominantly exploratory [28].

Machine learning provides methods, techniques, and tools that can help solve diagnostic and prognostic problems in a variety of medical domains. Successful implementation of machine learning methods can support the work of medical experts and ultimately improve the efficiency and quality of medical care.

Medical diagnostic reasoning is a very important application area of machine learning. With machine learning, patient records and their accurate diagnoses are input into a computer program that executes a learning algorithm. The resulting classifier can subsequently be used to help physicians diagnose new patients. For example, Bayesian classifiers and backpropagation learning of neural networks have been used to improve the predictive power of the ischaemic heart disease (IHD) diagnostic process.

Another field of application is biomedical signal processing. Machine learning methods use sets of physiological signal data, which can be produced more easily, and can help to model the nonlinear relationships that exist between these data and to extract parameters and features that can improve medical care. For instance, [29] introduces a machine learning based signal processing approach to stress cognition: K-nearest neighbor (KNN) and support vector machine (SVM) classifiers were used to classify stress into three levels based on collected biological data such as respiration, GSR (hand), GSR (foot), heart rate and EMG.

Nevertheless, the use of machine learning in the health care domain is not limited to clinical support. [30] applies a regression tree (Cubist) model for predicting the length of stay (LOS) of hospitalized patients, which could be an effective tool for healthcare providers to enable more efficient utilization of manpower and facilities in hospitals. The Naïve Bayes algorithm is used in [31] to improve outcomes and decrease the cost of health care delivery by reducing preventable hospital readmission rates.

Furthermore, machine learning techniques also play a significant role in patient satisfaction assessment. Nonlinear decision trees were used to classify patient satisfaction based on survey response data of emergency departments in [32]. Logistic regression models were applied in [33], based on survey data, to identify process measures that significantly influence patient satisfaction in emergency departments. [34] analyzed telephone interview data and sociodemographics of hospital patients with logistic regression to investigate the main predictors of patient satisfaction in municipal emergency departments. Besides, machine learning can also deal with unstructured, free-text information about the quality of health care: [35] tried to predict patient satisfaction from free-text descriptions on the NHS Choices website using sentiment analysis based on machine learning and natural language processing techniques.


Chapter 4

Method

This chapter describes the methods, including data pre-processing and data analysis techniques, used to solve the proposed problem.

4.1 Data pre-processing

4.1.1 Missing data and imputation methods

Missing values are a common problem in many data sets and seem especially widespread in social and economic studies, including our patient satisfaction surveys. Patients may fail to express their satisfaction level concerning their experience with a specific condition because of lack of interest, unwillingness to criticize, or other reasons.

Missing data

There are two situations found in missing data: unit nonresponse and item nonresponse. Unit nonresponse occurs when a selected patient does not provide any of the information being sought. Item nonresponse occurs when a patient responds to some items but not to others. Figure 4.1 shows the missingness map (where missingness occurs) of the patient satisfaction survey data, with black representing missing data and grey representing observed data.
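The two nonresponse situations, and the boolean mask behind a missingness map such as Figure 4.1, can be sketched as follows; the tiny response matrix is invented for illustration.

```python
# Toy respondent-by-item matrix; None marks a missing answer.
rows = [
    [4, 5, 3],           # complete response
    [2, None, 4],        # item nonresponse: only some items missing
    [None, None, None],  # unit nonresponse: no information at all
]

def classify_nonresponse(row):
    """Label a respondent's row by its missingness pattern."""
    n_missing = sum(v is None for v in row)
    if n_missing == len(row):
        return "unit nonresponse"
    return "item nonresponse" if n_missing else "complete"

labels = [classify_nonresponse(r) for r in rows]

# The missingness map is simply this boolean mask: True where a value is
# missing (black in Figure 4.1), False where it is observed (grey).
mask = [[v is None for v in row] for row in rows]
```

Plotting `mask` as an image of black and white cells reproduces the kind of missingness map shown in Figure 4.1.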

There are three missing-data mechanisms as described in [36]:

• MCAR (Missing Completely At Random): the missing data are independent of any variable observed in the data set.

• MAR (Missing At Random): the missing data may depend on variables observed in the data set, but not on the missing values themselves.

• NMAR (Not Missing At Random, or NI, Non-Ignorable): the missing data depend on the missing values themselves, and not on any other observed variable.
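The three mechanisms can be illustrated by what a missingness rule is allowed to look at. The sketch below uses an invented two-variable record (age, satisfaction) where only satisfaction can go missing; the thresholds and rates are arbitrary assumptions for illustration.

```python
import random

def mcar(age, satisfaction, rng):
    # MCAR: a coin flip that ignores both variables.
    return None if rng.random() < 0.2 else satisfaction

def mar(age, satisfaction, rng):
    # MAR: missingness may depend on the observed variable (age) only.
    return None if age > 80 and rng.random() < 0.5 else satisfaction

def nmar(age, satisfaction, rng):
    # NMAR: missingness depends on the unobserved value itself.
    return None if satisfaction <= 2 and rng.random() < 0.5 else satisfaction

rng = random.Random(0)
# Under this MAR rule a 30-year-old's answer is never dropped; under the
# NMAR rule a satisfied patient's answer is never dropped.
```

The practical consequence is that an imputation model fitted on observed variables can account for MCAR and MAR, but not for the NMAR rule, whose trigger is never observed.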

The method used to deal with missing data depends on why the data are missing. Only the first two types of missing data will be considered for imputation, since under NMAR it is difficult to construct an imputation model based on unobserved data. When data are missing from the responses to a questionnaire, as pointed out by [37], it is more likely that the missingness mechanism is MAR than MCAR. Thus, we can use imputation methods on our survey data.

Figure 4.1: Missingness map of the patient satisfaction survey data

Imputation methods

According to [36], methods for analyzing incomplete data can be grouped into four main categories:

1. Procedures Based on Completely Recorded Units: analyze subsets of the data set without missing data, discarding incompletely recorded units and analyzing only the units with complete data.

2. Weighting Procedures: deal with unit nonresponse by increasing the survey weights for responding units in an attempt to adjust for nonresponse as if it were part of the sample design.

3. Imputation-Based Procedures: fill in missing values with plausible values; the resultant completed data are then analyzed by standard methods as if there never were any missing values.

4. Model-Based Procedures: define a model for the observed data, and inferences are based on likelihood or Bayesian analysis.
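As a small illustration of the difference between categories 1 and 3, the sketch below applies complete-case analysis and a simple mean imputation to an invented response vector; mean imputation stands in here for whatever plausible-value rule a real procedure would use.

```python
data = [4.0, 5.0, None, 3.0, None]  # one survey item; None marks missing

# Category 1, complete-case analysis: discard incompletely recorded units.
complete = [v for v in data if v is not None]
mean_complete = sum(complete) / len(complete)

# Category 3, imputation: fill each missing entry with a plausible value
# (here the observed mean), then analyze by standard methods.
imputed = [v if v is not None else mean_complete for v in data]
mean_imputed = sum(imputed) / len(imputed)
```

Complete-case analysis throws away two of the five units, while imputation keeps the full sample size at the cost of depending on the quality of the filled-in values.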

A missing-data method is required to yield statistically valid answers for scientific estimands, including population means, variances, correlation coefficients, and regression coefficients. By a scientific estimand we mean a quantity of scientific interest that can be calculated in the population and does not change its value depending on the data collection design used to measure it. Only multiple imputation and model-based procedures can lead to valid inferences. Here we use a model-based procedure (K-nearest neighbor) to deal with the missing values in our survey data.

Imputation methods based on models generally consist of creating a predictive model to estimate the values that will substitute for the missing items. These approaches model the missing data estimation based on information available in the data set. If the observed data contain useful information for predicting the missing values, an imputation procedure can make use of this information and maintain high precision.

K-nearest neighbor imputation

The K-nearest neighbor (KNN) imputation algorithm uses only cases similar to the one with the incomplete pattern (item nonresponse). Given an incomplete pattern x, this method selects the K closest cases that do not have missing values in the attributes to be imputed, such that they minimize some distance measure. An example of the simple algorithm is shown in Figure 4.2.

Figure 4.2: Example of KNN. The incomplete pattern marked with a blue star should be imputed. If k = 3 it is assigned to the first class (green rectangles) because there are 2 rectangles and only 1 triangle inside the inner circle. If k = 5 it is assigned to the second class because there are 3 triangles and only 2 rectangles inside the outer circle.

The idea behind KNN imputation is to take advantage of positive correlations between rows. Once the K nearest neighbours have been found, a replacement value for the missing attribute value must be estimated. The type of data determines how the replacement value is calculated: for example, the mode is frequently selected for discrete data, while the mean is used for numerical data. Our survey uses Likert data, where the magnitude of a value matters, so we instead use the median for estimating a replacement value. The algorithm is as follows [38]:

1. Divide the dataset into two groups: D_complete represents the observations with complete item information and D_missing represents the set containing the observations in which at least one of the items is missing.

2. For each vector x in D_missing:

(a) Divide the missing observation vector into two parts: observed and missing.

(b) Calculate the distance between the missing observation and all the observation vectors from the complete group D_complete.

3. Calculate the weights of the k nearest observation vectors.

4. Use the K nearest neighbors to estimate the missing values.

5. Repeat the algorithm for multiple variables to fill in the missing values.

Using a small value of k represents a good compromise between performance and the need to preserve the original distribution of the data; here we apply 5-nearest-neighbour imputation to the dataset.
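The imputation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used for the actual survey data: the toy responses, the Euclidean distance over the observed items, and k = 3 are all assumptions for the example; the median replacement matches the Likert-data choice described above.

```python
import math
from statistics import median

def knn_impute(incomplete, complete_rows, k=3):
    """Impute missing items (None) in `incomplete` using the k nearest
    complete rows, with distance measured on the observed attributes only."""
    observed = [j for j, v in enumerate(incomplete) if v is not None]
    missing = [j for j, v in enumerate(incomplete) if v is None]

    def dist(row):
        # Euclidean distance restricted to the attributes that are observed
        return math.sqrt(sum((incomplete[j] - row[j]) ** 2 for j in observed))

    neighbours = sorted(complete_rows, key=dist)[:k]
    filled = list(incomplete)
    for j in missing:
        # Likert items: use the median of the neighbours' values
        filled[j] = median(row[j] for row in neighbours)
    return filled

complete = [[5, 4, 5], [4, 4, 4], [1, 2, 1], [2, 1, 2]]
print(knn_impute([5, 4, None], complete, k=3))   # → [5, 4, 4]
```

The same routine is simply applied row by row over D_missing; the thesis uses k = 5 on the full survey.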

4.1.2 Weight calculation in dimension

The patient satisfaction survey consists of seven dimensions, as mentioned in Section 2.2.3, and each dimension consists of several questions. These questions are weighted, and the weights determine how much each question contributes to the satisfaction score within its dimension.

Based on the collected data, an algorithm based on principal component analysis (PCA) is applied to determine the weights. This work was already done by IC Quality, and a detailed theoretical explanation is beyond the scope of this project.

Our focus is that, based on the weights provided, our prediction target can be calculated within the dimension 'Overall satisfaction'. The calculations are done at the individual level, as shown below:

• Respondent A has answered all the questions included in the dimension 'Overall satisfaction', so the question weights are the same as the original weights calculated by the algorithm. For him/her:

– weight for question 1 is 0.34
– weight for question 2 is 0.23
– weight for question 3 is 0.43

• Respondent B has only replied to two of the questions (for example, 1 and 3) in the dimension, so the weights are renormalized over the answered questions. For him/her:

– weight for question 1 is 0.44:

w_1 = 0.34 / (0.34 + 0.43) = 0.44  (initial weight divided by the sum of the weights of questions 1 and 3)

– weight for question 2 is 0
– weight for question 3 is 0.56:

w_3 = 0.43 / (0.34 + 0.43) = 0.56  (initial weight divided by the sum of the weights of questions 1 and 3)


Now each patient, and each question answered by that patient, has a specific weight in each dimension. The weighted score S_weighted in the dimension can be calculated as

S_weighted = Σ_{i=1}^{N} w_i s_i

where w_i represents the weight for question i and s_i represents the patient's response to question i.
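The renormalization for respondent B and the weighted score S_weighted can be sketched as below. The initial PCA-based weights are simply taken as given (they come from the IC Quality algorithm), and the responses are invented for illustration.

```python
def renormalize(weights, answered):
    """Re-scale the initial question weights over the answered questions only,
    so that the weights of the answered questions sum to 1."""
    total = sum(weights[q] for q in answered)
    return {q: (weights[q] / total if q in answered else 0.0)
            for q in weights}

def weighted_score(weights, responses):
    """S_weighted = sum_i w_i * s_i over the answered questions."""
    w = renormalize(weights, set(responses))
    return sum(w[q] * s for q, s in responses.items())

initial = {1: 0.34, 2: 0.23, 3: 0.43}        # weights from the PCA step
w_b = renormalize(initial, {1, 3})           # respondent B answered 1 and 3
print(round(w_b[1], 2), round(w_b[3], 2))    # → 0.44 0.56
print(round(weighted_score(initial, {1: 4, 3: 5}), 2))   # → 4.56
```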

4.1.3 Balancing data

For classification problems, practical datasets are usually imbalanced; i.e., at least one of the classes constitutes only a very small minority of the data. In this context, the concern is how to correctly classify the "rare" class. However, the most commonly used classification algorithms do not work well for imbalanced-data classification, because they aim to minimize the overall error rate rather than paying special attention to the "rare" class.

There are three common approaches to the problem of an imbalanced dataset [39]. The first is data-level methods, which modify the collection of samples to balance the distribution of the dataset. The second is algorithm-level methods, which directly modify existing learning algorithms for cost-sensitive learning, assigning a high cost to misclassification of the minority class. The third is hybrid methods, which combine the data-level and algorithm-level methods.

In this study, we use a data-level method to tackle the problem, modifying the training set so that it is balanced. Random under-sampling is performed to balance the class distribution by randomly removing majority-class examples until the majority- and minority-class instances are balanced.
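Random under-sampling can be sketched as below; the toy data and the fixed seed are illustrative only, not the actual survey pipeline.

```python
import random

def undersample(samples, labels, seed=0):
    """Randomly drop majority-class examples until every class has as
    many instances as the minority class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    n_min = min(len(v) for v in by_class.values())
    balanced = []
    for y, xs in by_class.items():
        for x in rng.sample(xs, n_min):   # keep n_min examples per class
            balanced.append((x, y))
    rng.shuffle(balanced)
    return balanced

X = list(range(10))
y = [1] * 8 + [0] * 2            # 8 positive vs 2 negative: imbalanced
data = undersample(X, y)
print(len(data))                 # → 4 (two examples per class)
```

Only the training set is balanced this way; the test set keeps its original distribution.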

4.1.4 Dummy coding

Some analytical methods, such as the support vector machine (SVM) used in this study, were initially designed for continuous variables. These methods can still handle nominal variables (education, occupation, gender, etc. in our case) if they are coded appropriately. Here we use dummy coding, which is a bitwise representation of a discrete variable. For each nominal attribute, new attributes are created; in every sample, the new attribute that corresponds to the actual nominal value of that sample gets the value 1 and all other new attributes get the value 0. Take "gender" (male or female) as an example: it will be replaced by two dummies, i.e. "gender = male" and "gender = female".

As shown in Table 4.1, the “gender = male” is 1 if the “gender” variable is male, otherwise it is 0; the “gender = female” is 1 if the “gender” variable is female, otherwise it is 0.

Table 4.1: Dummy coding for “gender”

gender | gender = male | gender = female
man    | 1             | 0
woman  | 0             | 1
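The dummy coding of Table 4.1 can be sketched as a one-per-category indicator; the category list is passed in explicitly so the column order is fixed.

```python
def dummy_code(values, categories):
    """Replace a nominal attribute by one 0/1 indicator per category."""
    return [[1 if v == c else 0 for c in categories] for v in values]

genders = ["male", "female", "female"]
print(dummy_code(genders, ["male", "female"]))
# → [[1, 0], [0, 1], [0, 1]]  ("gender = male" first, "gender = female" second)
```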


4.2 Basic analysis

In this section, classical statistics are used to analyze the survey data.

4.2.1 Cross tabulation

Cross-tabulation (also called contingency-table) analysis [40] is most often used to analyze categorical data. A cross-tabulation is a two-dimensional table that records the counts of respondents that have the specific characteristics described in the cells of the table. Here we use cross-tabulation tables to analyze the relationship between the overall satisfaction and the categorical variables in the patient satisfaction survey, i.e. gender, education, occupation and two questions that state facts about (rather than attitudes towards) the health care visit. The result is shown in Table 4.2.

Table 4.2: Cross tabulation of the survey overall satisfaction¹

Variable          | Value      | Unsatisfied count | Proportion | Satisfied count | Proportion
Gender            | Man        |  5600 | 0.181 | 25391 | 0.819
                  | Woman      | 10389 | 0.218 | 37195 | 0.782
Education         | NA         |   510 | 0.206 |  1963 | 0.794
                  | 1²         |  3144 | 0.161 | 16341 | 0.839
                  | 2³         |  5061 | 0.212 | 18843 | 0.788
                  | 3⁴         |  6935 | 0.220 | 24528 | 0.780
                  | 4⁵         |   336 | 0.271 |   906 | 0.729
Occupation        | NA         |   334 | 0.224 |  1156 | 0.776
                  | Employee   |  7177 | 0.251 | 21380 | 0.749
                  | Unemployed |   518 | 0.285 |  1298 | 0.715
                  | Student    |   936 | 0.335 |  1857 | 0.665
                  | Pensioner  |  6259 | 0.151 | 35123 | 0.849
                  | Other      |   761 | 0.301 |  1765 | 0.699
Question X000031⁶ | Yes        |  4767 | 0.104 | 41104 | 0.896
                  | No         |  6858 | 0.489 |  7153 | 0.511
                  | NA         |  4119 | 0.236 | 13307 | 0.764
Question X000032⁷ | Yes        |  5428 | 0.137 | 34279 | 0.863
                  | No         |  7434 | 0.442 |  9373 | 0.558
                  | NA         |  2994 | 0.139 | 18477 | 0.861
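A cross tabulation with within-row proportions, as in Table 4.2, can be computed with a few lines of Python; the (gender, satisfaction) pairs below are invented purely for illustration.

```python
from collections import Counter

def cross_tab(rows):
    """Count (variable value, satisfied?) pairs and compute the
    within-row proportions, as in Table 4.2. `rows` holds tuples
    (value, label) with label 1 = satisfied, 0 = unsatisfied."""
    counts = Counter(rows)
    values = sorted({v for v, _ in counts})
    table = {}
    for v in values:
        unsat, sat = counts[(v, 0)], counts[(v, 1)]
        total = unsat + sat
        table[v] = (unsat, unsat / total, sat, sat / total)
    return table

pairs = ([("man", 1)] * 8 + [("man", 0)] * 2
         + [("woman", 1)] * 6 + [("woman", 0)] * 4)
print(cross_tab(pairs))
# → {'man': (2, 0.2, 8, 0.8), 'woman': (4, 0.4, 6, 0.6)}
```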

4.2.2 Pearson’s chi-square test

The Pearson’s chi-square test provides the mechanism to test the statistical significance of the cross-tabulation table [40]. Chi-square tests whether or not

1Due to the space limitation, some of the variable names and values are in footnotes

2Elementary school, folk school or equivalent

3High school or equivalent

4Post-secondary education, university or college

5No completed education

6Feel pain during visit to health care center

7Discuss patients’ improvement to health care

(25)

the two variables are independent.

If the variables are independent (have no relationship), then the results of the statistical test will be “non-significant” which means that there is no rela- tionship between the variables. If the variables are related, then the results of the statistical test will be “statistically significant” which means that we can state that there is some relationship between the variables.

The chi-square test is based upon the chi-square distribution. The chi-square statistic χ² is computed as

χ² = Σ_{i=1}^{n} (f_o − f_e)² / f_e

where f_o and f_e are the observed and expected frequencies for each of the possible outcomes, respectively.

The contingency coefficient C, which ranges from 0 (no association) to 1 (the theoretical maximum possible association), provides the magnitude of the association between the attributes in the cross tabulation. Chi-square tests the significance of an association between two attributes, while the contingency coefficient indicates the extent of that association. The contingency coefficient can be calculated as

C = √( χ² / (χ² + N) )

where N is the sum of all frequencies in the contingency table.

SPSS is used to perform Pearson's chi-square test. In SPSS, the p-value is used to test the relationship between two variables: the chi-square result is said to be significant at the 5% level if the p-value is less than 0.05, and insignificant if it is more than 0.05.

The results of the chi-square tests are shown in Appendix B, which implies that there is a significant association between overall satisfaction and patients' gender, education, occupation and visit facts.
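The χ² statistic and the contingency coefficient C defined above can be computed directly from a contingency table. The 2 × 2 table below is a toy example (the thesis itself runs these tests in SPSS).

```python
import math

def chi_square(observed):
    """Pearson chi-square statistic and contingency coefficient C for a
    two-way contingency table given as a list of rows of counts."""
    row_sums = [sum(r) for r in observed]
    col_sums = [sum(c) for c in zip(*observed)]
    n = sum(row_sums)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, f_o in enumerate(row):
            f_e = row_sums[i] * col_sums[j] / n   # expected frequency
            chi2 += (f_o - f_e) ** 2 / f_e
    c = math.sqrt(chi2 / (chi2 + n))
    return chi2, c

chi2, c = chi_square([[10, 20], [20, 10]])
print(round(chi2, 4), round(c, 4))   # → 6.6667 0.3162
```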

4.2.3 Correlation matrix

A correlation matrix is used here to find the relationships between the different responses in the survey data. The responses are on a Likert scale. Likert-scale data is a type of ordinal data, so their relationships should be considered both in terms of the quantitative values and in terms of the ranking of the data. Hence both the Pearson and the Spearman correlation coefficients are used.

Pearson correlation matrix

The Pearson correlation matrix is a matrix of product-moment correlation coefficients r. The product-moment correlation coefficient is an index which provides the magnitude of the linear relationship between any two variables [40]. r is given by

r = Σ (x − m_x)(y − m_y) / √( Σ (x − m_x)² Σ (y − m_y)² )

where m_x and m_y are the means of the x and y variables.


In addition, r ranges from -1 to +1. The positive value of r means higher scores on one variable tend to be paired with higher scores on the other, or lower scores on one variable tend to be paired with lower scores on the other. On the other hand, negative value of r means higher scores on one variable tend to be paired with lower scores on the other and vice versa.

Here we analyze the relationship between the question responses given as Likert-scale data. The result is shown in Figure 4.3.

Figure 4.3: Pearson correlation matrix of question responses in the survey

One of the main limitations of the correlation coefficient is that it measures only the linear relationship between two variables. Thus, the correlation coefficient should be computed only when the data are measured on either an interval scale or a ratio scale.

Spearman correlation matrix

The Spearman correlation method computes the correlation between the rank of x and the rank of y. The Spearman correlation between two variables will be high when observations have a similar rank (i.e. relative position of the observations within the variable: 1st, 2nd, 3rd, etc.) between the two variables, and low when observations have a dissimilar rank between the two variables. ρ is given by

ρ = Σ (x′ − m_x′)(y′ − m_y′) / √( Σ (x′ − m_x′)² Σ (y′ − m_y′)² )

where x′ = rank(x) and y′ = rank(y).

Figure 4.4: Spearman correlation matrix of question responses in the survey

Figure 4.3 and Figure 4.4 show the same trend of correlation in both the quantitative and the ranking measurements, so we can reasonably believe that these questions are strongly linked.
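The two coefficients can be computed directly from their definitions; the sketch below keeps the ranking step deliberately simple (no tie handling) and uses invented data, a monotone but nonlinear pair that separates the two measures.

```python
import math

def pearson(x, y):
    """Product-moment correlation coefficient r."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def ranks(v):
    """Rank 1 for the smallest value (ties ignored, for illustration)."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(x, y):
    """Pearson correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]            # monotone but nonlinear
print(round(pearson(x, y), 3))   # → 0.981 (linear association only)
print(spearman(x, y))            # → 1.0 (the ranks agree perfectly)
```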

4.2.4 Linear regression

It is common in practice to use linear regression to analyze ordinal data, which ignores the categorical nature of the response variable and uses standard parametric methods for continuous response variables. This approach assigns numerical scores to the ordered categories and then uses linear regression.

Multiple linear regression is used here to model the relationship between several explanatory variables (questions in the dimensions other than 'Overall satisfaction') and a response variable (a question in the dimension 'Overall satisfaction', or the weighted score calculated for the dimension 'Overall satisfaction' according to Section 4.1.2) by fitting a linear equation to the observed data. Every value of the independent variable x is associated with a value of the dependent variable y. The regression line is written as:

y = β_0 + β_1 x_1 + β_2 x_2 + ... + β_n x_n + ε

The regression line describes how y changes with the explanatory variables x. β_0, ..., β_n are the model parameters and ε is the error or noise term. Also, ε is assumed to have a normal distribution with mean 0 and constant variance σ², i.e. ε ∼ N(0, σ²). Thus

y ∼ N(β_0 + β_1 x_1 + β_2 x_2 + ... + β_n x_n, σ²)

and

E(y) = β_0 + β_1 x_1 + β_2 x_2 + ... + β_n x_n

We also have categorical independent variables in our dataset, such as gender, education and occupation. These variables also have an influence on our target y, as described in Section 4.2.2. However, linear regression cannot handle such values directly. Thus, the dummy coding described in Section 4.1.4 is used to solve this problem.

The linear regression results are shown in Table 4.3 and Table 4.4. Detailed coefficients are given in Appendix B.2.

Table 4.3: Model summary of the linear regression

R    | R Square | Adjusted R Square | Std. Error of the Estimate
.888 | .789     | .789              | .467688085

Table 4.4: ANOVA of the linear regression

           | Sum of Squares | df    | Mean Square | F        | Sig.
Regression | 64365.681      | 40    | 1609.142    | 7356.678 | .000
Residual   | 17177.910      | 78534 | .219        |          |
Total      | 81543.592      | 78574 |             |          |
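For intuition, the least-squares fit can be sketched for a single explanatory variable as below (the thesis fits the full multiple regression with dummy-coded categoricals in SPSS; the data here are invented):

```python
def fit_line(x, y):
    """Least-squares estimates of beta0 and beta1 in y = beta0 + beta1*x + eps."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    beta1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    beta0 = my - beta1 * mx          # the fitted line passes through (mx, my)
    return beta0, beta1

x = [1, 2, 3, 4]
y = [3.1, 4.9, 7.0, 9.0]             # roughly y = 1 + 2x with noise
beta0, beta1 = fit_line(x, y)
print(round(beta0, 2), round(beta1, 2))   # → 1.05 1.98
```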

4.3 Machine learning classification techniques

The fundamental goal of machine learning algorithms is to discover meaningful relationships in a body of training data, presented as individual examples, and to produce a generalization of those relationships that can be used to interpret subsequently presented test data. As a result, several learning paradigms have arisen to deal with different conditions. When the training data contain explicit examples of what the correct output should be for given inputs, it is called supervised learning; when the training data do not contain any output information at all, it is called unsupervised learning; and when the training data do not say what the target output should be but instead contain some possible output together with a measure of how good that output is, it is called reinforcement learning.

In this study, we have our target output available and will focus on supervised learning. As stated above, a correct or targeted output is available; when the output variable is a numeric value, the problem is a regression problem, and when the output variable is a categorical value, the problem is a classification problem. Classification is the most commonly performed exploratory task in machine learning, and it is what we focus on here.

Here we apply classification techniques to classify a patient's attitude towards the health care service he/she received, and try to find the important factors that affect patient satisfaction. We transform the weighted score in the dimension 'Overall satisfaction', as mentioned in Section 2.2.3, into binomial labels (positive = 1 or negative = 0): when the score is greater than 3.5 we label it as a positive response, and when it is smaller than 3.5 as a negative response. Several machine learning algorithms are applied to deal with this classification problem.

4.3.1 Naïve Bayes classifier

The naïve Bayes classifier is based on Bayes' rule [41], which states that the posterior probability of a random event can be determined once the relevant evidence is taken into account:

P(B|A) = P(A|B) P(B) / P(A),   i.e.   posterior = (likelihood × prior) / evidence

A naïve Bayes classifier is a probabilistic model that naïvely assumes that the features (x_1, x_2, ..., x_n) are independent. It can be written as

P(C | x_1, x_2, ..., x_n) = P(C) P(x_1, x_2, ..., x_n | C) / P(x_1, x_2, ..., x_n)

where C is the class label.

As assumed by the naïve Bayes classifier, the features are all independent, so P(x_1, x_2, ..., x_n | C) can be rewritten as a product of the component probabilities. Hence the posterior probability becomes

P(C | x_1, x_2, ..., x_n) ∝ P(C) P(x_1, x_2, ..., x_n | C)
                          ∝ P(C) P(x_1|C) P(x_2|C) ... P(x_n|C)
                          ∝ P(C) Π_{i=1}^{n} P(x_i | C)

The naïve Bayes classifier then assigns a class to a test sample using the above equation with the maximum a posteriori probability (MAP).
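For categorical features such as ours, the required probabilities reduce to simple counts. The sketch below is a bare-bones illustration on invented data: it omits smoothing for unseen feature values, which a real implementation would need.

```python
from collections import Counter, defaultdict

def train_nb(samples, labels):
    """Estimate P(C) and P(x_i | C) by counting (categorical features)."""
    prior = Counter(labels)
    likelihood = defaultdict(Counter)     # (class, feature index) -> value counts
    for x, c in zip(samples, labels):
        for i, v in enumerate(x):
            likelihood[(c, i)][v] += 1
    return prior, likelihood

def classify(x, prior, likelihood):
    """Pick the class with maximum P(C) * prod_i P(x_i | C) (MAP)."""
    n = sum(prior.values())
    best, best_p = None, -1.0
    for c, cnt in prior.items():
        p = cnt / n                       # prior P(C)
        for i, v in enumerate(x):
            p *= likelihood[(c, i)][v] / cnt   # likelihood P(x_i | C), no smoothing
        if p > best_p:
            best, best_p = c, p
    return best

X = [("man", "yes"), ("man", "yes"), ("woman", "no"), ("woman", "no")]
y = ["satisfied", "satisfied", "unsatisfied", "unsatisfied"]
prior, like = train_nb(X, y)
print(classify(("man", "yes"), prior, like))   # → satisfied
```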


4.3.2 Logistic regression

Logistic regression investigates the relationship between a categorical response variable and a set of explanatory variables [41]. The logistic regression model is a member of the generalized linear model (GLM) class, and it is an appropriate model for studying the relationship between a binary response variable y, representing a positive response (y = 1) or a negative response (y = 0), and a set of explanatory variables (x_1, x_2, ..., x_n).

Instead of approximating the 0 and 1 values directly, logistic regression builds a linear model based on a transformed target variable. The ratio of event to nonevent, called the odds ratio, is considered; the odds ratio can take any value between 0 and infinity and is on a ratio scale. The odds ratio can be modeled with an exponential expression as shown below:

P / (1 − P) = e^{β_0 + β_1 x_1 + β_2 x_2 + ... + β_n x_n}

where P and (1 − P) are the probabilities of events and nonevents, x_1, x_2, ..., x_n are the independent variables, and β_0, β_1, β_2, ..., β_n are the parameters. Taking the log on both sides converts the nonlinear equation into a linear equation, as shown below:

log( P / (1 − P) ) = β_0 + β_1 x_1 + β_2 x_2 + ... + β_n x_n

or equivalently,

P = exp(β_0 + β_1 x_1 + β_2 x_2 + ... + β_n x_n) / (1 + exp(β_0 + β_1 x_1 + β_2 x_2 + ... + β_n x_n))

log( P / (1 − P) ) is called the link function, and the parameters are estimated using the maximum likelihood method. Unlike linear regression, logistic regression can handle both numeric and categorical data as independent variables.
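The maximum likelihood estimation can be illustrated with stochastic gradient ascent on the log-likelihood for a single explanatory variable; real software uses more refined solvers, and the data below are invented for the sketch.

```python
import math

def train_logreg(x, y, lr=0.5, epochs=2000):
    """Fit log(P/(1-P)) = b0 + b1*x by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    for _ in range(epochs):
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            b0 += lr * (yi - p)          # gradient of the log-likelihood w.r.t. b0
            b1 += lr * (yi - p) * xi     # ... and w.r.t. b1
    return b0, b1

def predict(xi, b0, b1):
    """P(y = 1 | x) under the fitted model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))

x = [0, 1, 2, 3, 4, 5]
y = [0, 0, 0, 1, 1, 1]                   # illustrative binary responses
b0, b1 = train_logreg(x, y)
print(predict(0, b0, b1) < 0.5, predict(5, b0, b1) > 0.5)   # → True True
```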

4.3.3 Tree-based model

Tree-based methods are data-driven tools based on sequential procedures that recursively partition the data [42]. They exploit tree graphs to provide visual representations of the rules underlying a given data set, which offers a simple way of understanding and summarizing the main features of the data.

Decision tree

A decision tree is a tree graph composed of nodes and edges (as Figure 4.5 shows). The leaves of the final tree are the base hypotheses; for example, the leaves of the classification example in Figure 4.5 represent different attitudes towards a service (satisfied or unsatisfied).

Decision trees are generally constructed from a dataset which contains a dependent variable Y and K (K ≥ 1) predictors X_1, ..., X_K for n (n ≥ 1) sample units: D = {(y_i, x_{i1}, ..., x_{iK}); i = 1, ..., n}. The process of constructing a tree from the learning set is called tree building or tree growing. The procedure for growing trees starts from the entire learning set, which is graphically represented as a tree composed of a single leaf, also called the root node. The learning set is partitioned into two or more subsets according to a given splitting rule; this produces a new tree with an increased size.


Figure 4.5: Decision tree example from [42]. The dependent variable considered in the example is the satisfaction of a customer with a given service provided by an international company (Satisfaction, with categories no and yes), and the predictors are the country in which the customer lives (Country, with categories Benelux, Germany, Israel and United Kingdom) and how many years the customer has used the service (Seniority).

The partitioning procedure is then recursively applied by splitting the leaves of the tree constructed in the previous step; tree growing is thus performed in a top-down manner. The result of the recursive partitioning process is a sequence of nested trees with an increasing number of leaves.

For tree growing, it is of vital importance to choose a suitable goodness-of-split criterion, which is used to score all possible splitting rules for a given leaf node and select the best one, and to define criteria for stopping the recursive partitioning. Many recursive partitioning methods have been proposed, including ID3, C4.5, CART, CHAID, QUEST, GUIDE, CRUISE and CTREE. For example, ID3's splitting criterion is information gain, C4.5's is the gain ratio, and CART's is the Gini index [43].

Random forest

Decision tree bagging relies on the fact that single-tree methods can result in very different predictions depending on which observations are included. The idea is to conceptualize trees fit to different subsets of the data as if they were independent draws of a random variable. With the help of bagging, we can reduce the variance in single-tree estimates of the response by fitting many trees and combining them as the equation below shows, where M is the number of trees and Θ_m is the set of parameters that define each tree T_m.


f(X_i) = Σ_{m=1}^{M} T_m(X_i; Θ_m)

The random forest algorithm proceeds as follows. First, it bootstraps (i.e. samples with replacement) m times to obtain m new samples of size N. Then m decision trees are trained on these m samples. When a prediction is made with the random forest, each decision tree has one vote, and the result is based on majority voting. An example of a random forest is shown in Figure 4.6.

Figure 4.6: Random forest example from [44].

When bagging is applied to single decision trees, it typically results in higher accuracy if applied to unpruned trees [45]. The reason is that unpruned trees are overfitted to their training data and therefore have low bias and high variance. Since bagging is a variance-reduction technique, it can reduce this variance so that the final bagged model has both low bias and low variance, and thus a lower total prediction error.

Adaptive boosting decision tree

Similar to bagging decision trees to construct random forests, tree boosting approaches also solve the problem by creating multiple trees. There are many variants on the idea of boosting; the most widely used method is called adaptive boosting. However, unlike the random forest, which aggregates decision trees uniformly, the adaptive boosting decision tree is a non-uniform aggregation of decision trees [46]. Every tree has its own weight α in the final prediction. The sample weights are initially equal, and at step m the cases misclassified at step m − 1 have their weights increased, while those correctly classified have them decreased. Thus, as the iterations proceed, samples that are difficult to classify receive more and more attention.

f(X_i) = Σ_{n=1}^{N} α_n T_n(X_i; Θ_n)

Moreover, the difference between the random forest and the adaptive boosting decision tree lies not only in the different weights for the trees, but also in the different bootstrapping method. The random forest can bootstrap its m samples in parallel (independently), whereas each subsequent sample in adaptive boosting focuses on the problematic observations of the previous decision tree model (i.e. on observations that could not be solved by the previous decision tree). The bootstrapping, model training and model-weight calculation are performed sequentially, so that each new tree improves the predictive power of the boosted ensemble. Finally, the adaptive boosting decision tree aggregates the trees non-uniformly using the calculated model weights α_n.

4.3.4 Support vector machines (SVM)

Support vector machines are a blend of linear modeling and instance-based learning [41]. They use linear models to implement nonlinear class boundaries by transforming the sample space into a new space.

Figure 4.7: Hyperplane example from [41].

Support vector machines select a small number of critical boundary instances, called support vectors, from each class and build a linear discriminant function that separates them as widely as possible, i.e. they find the maximum-margin hyperplane. For example, for a two-class dataset whose classes are linearly separable (as shown in Figure 4.7), there is a hyperplane in the sample space that classifies all training samples correctly; the maximum-margin hyperplane is the one that gives the greatest separation between the classes.

The set of support vectors defines the maximum-margin hyperplane. The "plus-plane" and "minus-plane" are the planes on which the support vectors lie, and the distance between them is the margin M that we need to maximize. It is easy to see that

M = 2 / ||w||

Thus the problem can be transformed into


Figure 4.8: Decision boundary and margin.

min (1/2) ||w||²   s.t.   y_i (wᵀ x_i + b) ≥ 1,   i = 1, 2, ..., n

The above is an optimization problem with a convex quadratic objective and only linear constraints, which can be solved directly, and hence the optimal margin classifier is found.

However, when the data cannot be separated linearly, kernel methods are introduced to help with the classification; commonly used kernels include polynomial kernels, Gaussian kernels and sigmoid kernels.
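The two quantities above, the constraint y_i(wᵀx_i + b) ≥ 1 and the margin M = 2/||w||, can be checked numerically for a candidate hyperplane. In the sketch below the toy points and the pair (w, b) are chosen by hand for illustration; no quadratic program is actually solved.

```python
import math

def margin(w):
    """Margin between the plus- and minus-planes: M = 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

def satisfies_constraints(w, b, points):
    """Check y_i * (w . x_i + b) >= 1 for every training point."""
    return all(y * (sum(wi * xi for wi, xi in zip(w, x)) + b) >= 1
               for x, y in points)

# toy separable data; (2, 0) and (0, -2) lie exactly on the two planes
points = [((2, 0), +1), ((3, 1), +1), ((0, -2), -1), ((-1, -3), -1)]
w, b = (0.5, 0.5), 0.0          # a feasible (w, b), not necessarily optimal
print(satisfies_constraints(w, b, points), round(margin(w), 4))   # → True 2.8284
```

A smaller ||w|| that still satisfies all constraints would mean a wider margin, which is precisely what the minimization above searches for.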

4.3.5 Artificial neural networks

An artificial neural network (ANN), also called a neural network (NN), is a computational model inspired by the structure of the brain's biological nervous system, first proposed by McCulloch and Pitts [47] in 1943. Modern artificial neural networks are usually used to model complex relationships between inputs and outputs or to find patterns in data.

Artificial neural networks consist of a large number of simple processing elements, called neurons, which are interconnected via channels called connections. The connections between the neurons are also called links, and every link has a weight parameter associated with it. Since the performance of an NN does not degrade significantly if one of its links or neurons is faulty, the highly interconnected structure makes neural networks fault tolerant.

Each neuron receives stimuli from the neighboring neurons connected to it, processes the information received and produces an output. Neurons in the network are divided into different kinds:

• Input neurons: neurons that receive stimuli from outside the network called input neurons.

• Output neurons: neurons that give outputs outside the network.

• Hidden neurons: neurons that receive stimuli from other neurons and give outputs as stimuli for other neurons in the network.


Based on the link pattern, artificial neural network can be grouped into two types:

• Feedforward network: the network graph has no loops.

• Feedback network: loops occur because of feedback links.

Artificial neural networks come in different architectures, in which the neurons process information in different ways and the links are connected in different ways, namely multilayer perceptrons (MLP), radial basis function (RBF) networks, wavelet neural networks, recurrent networks and self-organizing maps (SOM). Here we discuss the most commonly used neural network, the multilayer perceptron (MLP), which is a feed-forward network trained by a back-propagation algorithm.

Neurons are grouped into layers in an MLP [48]. The first and last layers, which represent the inputs and outputs of the neural network, are called the input and output layers respectively; the remaining layers are called hidden layers. An MLP contains an input layer, one or more hidden layers, and an output layer. An example of an MLP is shown in Figure 4.9.

Figure 4.9: An example of MLP with three layers.

Suppose the total number of layers is L. The 1st layer is the input layer, the Lth layer is the output layer, and layers 2 to L − 1 are hidden layers. Let the number of neurons in the lth layer be N_l, l = 1, 2, ..., L. Let w^l_{ij} represent the weight of the link between the jth neuron of the (l − 1)th layer and the ith neuron of the lth layer (w^l_{i0} is the bias for the ith neuron of the lth layer). Let x_i represent the ith external input of the MLP, and z^l_i be the output of the ith neuron of the lth layer. A neuron in the network processes information in the following way: each input is first multiplied by the corresponding weight parameter, and the resulting products are added to form a weighted sum

γ = Σ_{j=1}^{N_{l−1}} w^l_{ij} z^{l−1}_j

This γ is passed through an activation function σ(·) (most commonly the sigmoid sig(γ) = 1 / (1 + e^{−γ})) to produce the output to the neurons in the next layer, as shown in Figure 4.10.

Figure 4.10: Information processing by the ith neuron of the lth layer

The computation of the feedforward MLP with inputs X = [x_1 x_2 ... x_n]ᵀ and outputs Y = [y_1 y_2 ... y_m]ᵀ is then

z^1_i = x_i,   i = 1, 2, ..., n
z^l_i = σ( Σ_{j=0}^{N_{l−1}} w^l_{ij} z^{l−1}_j ),   i = 1, 2, ..., N_l
y_i = z^L_i,   i = 1, 2, ..., m
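The feedforward equations above can be sketched as a forward pass through a tiny network; the 2-3-1 architecture and all weights below are invented for illustration, and training (back propagation) is not shown.

```python
import math

def sigmoid(g):
    """Activation sig(g) = 1 / (1 + e^(-g))."""
    return 1.0 / (1.0 + math.exp(-g))

def forward(x, layers):
    """One feedforward pass. `layers` is a list of weight matrices; each row
    is [bias, w_1, ..., w_n] for one neuron of that layer (bias = w_{i0})."""
    z = list(x)                              # z^1 = external inputs
    for layer in layers:
        z = [sigmoid(row[0] + sum(w * zj for w, zj in zip(row[1:], z)))
             for row in layer]               # z^l from z^(l-1)
    return z                                 # outputs y = z^L

# a tiny 2-3-1 network with illustrative weights
hidden = [[0.1, 0.4, -0.2], [0.0, 0.3, 0.8], [-0.1, -0.5, 0.6]]
output = [[0.2, 1.0, -1.0, 0.5]]
y = forward([1.0, 0.5], [hidden, output])
print(0.0 < y[0] < 1.0)   # → True (sigmoid outputs always lie in (0, 1))
```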

The MLP aims to find the optimal set of weight parameters that most closely represents the relation between input and output. The training method used is back propagation. The first step is to initialize the weight parameters (to small random values). During training, the output values are compared with the correct answers to compute the value of some predefined error function, and the error is then fed back through the network. Using this error information, the algorithm adjusts the weight of each connection in order to reduce the value of the error function. The propagation and weight-update steps are repeated until the performance of the network is good enough.
