Comparison of mortality rate models for current health diseases

Academic year: 2021


School of Education, Culture and Communication

Division of Applied Mathematics

MASTER THESIS IN MATHEMATICS / APPLIED MATHEMATICS

Comparison of mortality rate models for current health diseases

by

Iván Araque Cristóbal

Masterarbete i matematik / tillämpad matematik

DIVISION OF APPLIED MATHEMATICS

MÄLARDALEN UNIVERSITY


Master thesis in mathematics / applied mathematics

Date: 2019-06-07

Project name: Comparison of mortality rate models for current health diseases

Author: Iván Araque Cristóbal

Version: February 27, 2020

Supervisor(s): Milica Rancic and Karl Lundengård

Reviewer: Fredrik Jansson

Examiner: Masood Aryapoor

Comprising: 30 ECTS credits


Contents

1 Introduction

2 Methodology
2.1 Models
2.1.1 Gompertz model
2.1.2 The power exponential model
2.2 Non linear least squares method
2.3 Calculating the error

3 Numerical experiments
3.1 Influenza
3.1.1 Gompertz model
3.1.2 Power-exponential model
3.1.3 Correlation between pollution and respiratory diseases
3.2 Mental diseases and suicides
3.2.1 Mental diseases - mortality and correlation with self-harm
3.2.2 Self-harm

4 Conclusion and future work
4.1 Project summary
4.2 Future Work

5 Summary of reflection of objectives in the thesis
5.1 Objective 1: Knowledge and understanding
5.2 Objective 2: Methodological knowledge
5.3 Objective 3: Critically and Systematically Integrate Knowledge
5.4 Objective 4: Independently and Creatively Identify and Carry out Advanced Tasks
5.5 Objective 5: Present and Discuss Conclusions and Knowledge
5.6 Objective 6: Scientific, Social and Ethical Aspects


List of Figures

1.1 Distribution of deaths by age represented using the Heligman-Pollard HP4 model. There are "humps" between 20 and 30 years due to traffic accidents, which increase the mortality at those ages. [2]

1.2 Total deaths caused by influenza (left) and logarithm of the data (right) in United States in 2016. The hump can be observed for young ages (right).

1.3 Distribution of self-harm deaths during 2015.

2.1 Example of the influenza mortality rate curve given by the power-exponential model in the year 2008 with the parameters $c_1 = 12.7889$, $c_2 = 0.1135$, $a_1 = 0.3009$, $a_2 = 0.0743$, $a_3 = 2.1006$. We can see how the two terms of the model affect the curve. The second term models the "hump" and the first models the exponential growth at older ages.

2.2 Curves depending on the parameters $c_1$ and $c_2$ [3]. The parameter $c_1$ controls how the model fits the initial points. In figure (b) we can see that the parameter $c_2$ models the slope.

2.3 Curves depending on the parameters $a_1$, $a_2$ and $a_3$ [3].

3.1 Evolution of the total number of deaths caused by influenza from 1980 to 2016 in United States. There are two spikes in 1982 and 2016 when the disease spread on a larger scale.

3.2 Evolution of the total number of deaths caused by influenza per 100,000 inhabitants from 1980 to 2016 in United States.

3.3 Total number of deaths caused by influenza across different years and different ages in United States.

3.4 Data and corresponding model for different age ranges for the period 1980 to 2016 in United States.

3.5 Evolution of the Gompertz parameters over the years in United States. The black curve corresponds to the first parameter and the red one corresponds to the second parameter in Eq. 2.1.

3.6 Evolution of the Gompertz parameters with respect to the total deaths in United States.

3.7 Gompertz model and original data in United States. In Fig. (a) we can see the evolution of the total deaths from 1980 to 2016 and in Fig. (b) we can see the evolution in 2016 for different ages.

3.8 Evolution of the deaths caused by influenza, Gompertz model and Gompertz model with the first fixed parameter for different ages in United States in 2015.

3.9 Evolution of deaths caused by influenza, Gompertz model, Gompertz model with the fixed exponent and G-M model for different ages in United States in 2015.

3.10 Respiratory problems deaths in United States in 2015 and the two terms of the model given by Eq. 3.1 after calculating the parameters.

3.11 Respiratory problems deaths and exponential model depending on the age in United States. In (a) we can see the data and how the model fits in 2016, when the error is the biggest, and in (b) in 2015, when the error is the smallest.

3.12 Respiratory problems deaths in United States in 2015. We can see the original model and the model with the first parameter fixed.

3.13 Respiratory problems deaths in United States in 2015. We can see the original model, the model with the first parameter fixed and the model with the first and second parameters fixed.

3.14 Respiratory problems deaths in United States in 2015. We can see the original model, the model with the first parameter fixed, the model with the first and second parameters fixed and the model with the first term fixed.

3.15 Total CO2 emissions in United States from 1980 to 2008 in 1000 metric tons of C.

3.16 Respiratory problems deaths in United States from 1980 to 2008.

3.17 Evolution of deaths related with mental diseases from 1980 to 2016 in United States.

3.18 Evolution of mental diseases deaths in different age intervals from 1980 to 2016 in United States.

3.19 Evolution of deaths related with self-harm from 1980 to 2016

3.20 Model given by Eq. 3.5 in 2016 in United States.

3.21 Power-exponential term 2 model in 2016 in United States with the second parameter fixed.

3.22 Power-exponential term 2 model in 2016 in United States with the first

List of Tables

1.1 Different models for mortality rates [2].

3.1 Model relative errors depending on two and one parameters by year. We can see that the model with the fixed exponent is mostly worse than the one when calculating two parameters.

3.2 Model errors depending on three parameters by year.

3.3 Model errors depending on six parameters by year.

3.4 Mean and standard deviation for the parameters seen in Table 3.3.

3.5 Model errors depending on five parameters by year.

3.6 Mean and standard deviation for the parameters seen in Table 3.5.

3.7 Model errors depending on four parameters by year.

3.8 Mean and standard deviation for the parameters seen in Table 3.7.

3.9 Model errors depending on three parameters by year.

3.10 Power-exponential term 2 model from 1980 to 2016.

3.11 Model given by Eq. 3.5 without the teenager ages.

3.12 Model given by Eq. 3.5 parameters.

3.13 Evolution of the model given by Eq. 3.5 parameters and error from 2000 to 2016

3.14 Evolution of the power-exponential term 2 model parameter and error from 2000 to 2016


Chapter 1

Introduction

Human knowledge has grown rapidly throughout history. This development includes great discoveries and advances in technology and health. In particular, advances in medicine mean that certain diseases no longer have such a strong influence on the mortality rate, which has contributed to, among other things, a longer life expectancy. At the same time, other types of illness that are more difficult to cure, such as mental diseases, have been identified. [13]

As this progress has been made, models have been created to represent mortality rate data, and they have been used to predict mortality and to mitigate its effects.

The mortality rate is known to be very high for people less than one year old and to decrease rapidly afterwards [14]. From one year old to approximately twenty years old it grows slowly, and then the curve undergoes a jump known as the "accident hump". After that it continues to grow roughly exponentially with age [see Fig. 1.1].


Figure 1.1: Distribution of deaths by age represented using the Heligman-Pollard HP4 model. There are "humps" between 20 and 30 years due to traffic accidents, which increase the mortality at those ages. [2]

One would expect certain diseases to disappear along with these advances. However, what we observe is that the mortality caused by certain diseases is increasing. Some social movements, like the recent anti-vaccine movement, set us back and make us more susceptible. I. Araque showed the importance of vaccines in his project [15], where he modelled different diseases with and without vaccination.

The population is concerned about the effect of pollution on human beings and about how pollution affects the greenhouse effect. Pollution can negatively affect our health. The greenhouse effect also contributes to diseases moving to territories where they were not typically observed (for example, tropical diseases moving to previously colder parts of continents). The population of such areas is more susceptible to these "new" diseases, and as a consequence the mortality caused by them increases. [16]

Influenza follows an approximately periodic rhythm, although there are years, like 1982 or 2016, in which it was especially aggressive and deadly. However, as the National Center for Health Statistics points out [9], counts of deaths from influenza are not very precise, since many deaths are attributed to other causes, such as pneumonia, in which influenza had weakened the person's immune system.

Many types of models have been developed that represent the mortality of a population quite well [see Tab. 1.1], which makes it possible to predict what may happen in the near future. The main focus of this work is to see how these models fit certain diseases, such as influenza or mental diseases. Using these models to make predictions for the near future is left as further work.

The motivation for this research is that the author has not found any study that compares models of different kinds and different complexity to see how well they actually match disease mortality data. Such a comparison can be very helpful, because it provides information on how different diseases affect the population at different ages. The results obtained in this research can help when developing possible solutions for these diseases.

Discussing different models based on the same data is of vital importance because with them we can try to predict what may happen in the future. However, some models differ from the data more than others, so the distance between the data and the model prediction is bigger. Furthermore, how the parameters depend on the initial and boundary conditions is a very important point, because knowing this lets us predict much faster and at a lower computational cost. This means that it is sometimes better to build a model with fewer parameters and a bigger error than a model with many parameters and a small error, because the computational cost is essential.

We will start with a model that is quite simple and intuitive and that adapts to the mortality curve: the Gompertz model, introduced in 1825 [11], which is an exponential curve depending on two parameters. After analyzing the available data on deaths caused by influenza, we observed an exponential increase in deaths with age. Comparing Fig. 1.2 (right) with the mortality curve [see Fig. 1.1] indicates that the models used for the overall mortality rate can also be used to model the mortality caused by different diseases.

Figure 1.2: Total deaths caused by influenza (left) and logarithm of the data (right) in United States in 2016. The hump can be observed for young ages (right).

The Gompertz model cannot reproduce the jump. To model it and make our model fit more precisely, we could use the Heligman-Pollard HP4 model, as Belinda Straß demonstrated in her thesis [2]. In this thesis, we are going to use a simpler model. Given the complexity of the nine parameters of the Heligman-Pollard HP4 model, we take a simpler model instead, the power-exponential model [Lundengård et al., 2017 [1]]. This model has exponential growth and does not need as many parameters as the Heligman-Pollard HP4 model. In addition, this model suits our data: overall mortality shows a high rate of infant mortality that declines rapidly, but in the cases that we are going to study the mortality under one year of age is not so high, so the data are distributed exponentially with a jump between the ages of 10 and 25.

Apart from analysing different models, whose mortality curves fit better or worse depending on the data, we are going to analyse whether other causes can affect these diseases and be correlated with them. This analysis will help to focus attention and efforts, both economic and human, on reducing this mortality in the future.

In the third chapter of this work we will also study suicide and self-harm, which is not spread by a bacterium or virus but is common among humans and has kept increasing over the years. These problems are difficult to cure. In 2016 the American Foundation for Suicide Prevention published a study showing that suicide was the tenth leading cause of death among Americans and cost $69 billion in 2015 [12].

As we can see in Fig. 1.3, self-harm deaths do not increase exponentially with age, so many models used to predict the overall mortality rate are not valid here. Data for deaths from self-harm begin at five years of age. The distribution reaches its maximum between 35 and 55 years of age and from then on decreases gradually.

Figure 1.3: Distribution of self-harm deaths during 2015.

All the data used in this thesis have been taken from the World Health Organization website [10]. In this database we analysed almost all causes of death and identified those for which the number of deaths was increasing. Among these we made a further selection to finally arrive at influenza, mental illnesses and self-harm as the topics of our research.


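| Model | Mortality rate |
|---|---|
| Gompertz-Makeham | $y = a + b e^{cx}$ |
| Thiele | $y = a_1 e^{-b_1 x} + a_2 e^{-b_2 \frac{(x-c)^2}{2}} + a_3 e^{b_3 x}$ |
| Modified Perks | $y = \dfrac{a}{1 + e^{b - cx}} + d$ |
| Double Geometric | $y = a + b_1 b_2^{x} + c_1 c_2^{x}$ |
| Gompertz-inverse Gaussian | $y = \dfrac{e^{a - bx}}{\sqrt{1 + e^{-c + bx}}}$ |
| Weibull | $y = \dfrac{a}{b} \left(\dfrac{x}{b}\right)^{a-1}$ |
| Heligman-Pollard HP1 | $y = a_1^{(x + a_2)^{a_3}} + b_1 e^{-b_2 \ln\left(\frac{x}{b_3}\right)^2} + c_1 c_2^{x}$ |
| Heligman-Pollard HP2 | $y = a_1^{(x + a_2)^{a_3}} + b_1 e^{-b_2 \ln\left(\frac{x}{b_3}\right)^2} + \dfrac{c_1 c_2^{x}}{1 + c_1 c_2^{x}}$ |
| Heligman-Pollard HP3 | $y = a_1^{(x + a_2)^{a_3}} + b_1 e^{-b_2 \ln\left(\frac{x}{b_3}\right)^2} + \dfrac{c_1 c_2^{x}}{1 + c_3 c_1 c_2^{x}}$ |
| Heligman-Pollard HP4 | $y = a_1^{(x + a_2)^{a_3}} + b_1 e^{-b_2 \ln\left(\frac{x}{b_3}\right)^2} + \dfrac{c_1 c_2^{x^{c_3}}}{1 + c_1 c_2^{x^{c_3}}}$ |
| Hannerz | $y = \dfrac{g(x) e^{G(x)}}{1 + e^{G(x)}}$, with $g(x) = \dfrac{a_1}{x^2} + a_2 x + a_3 e^{cx}$ and $G(x) = a_0 - \dfrac{a_1}{x} + \dfrac{a_2 x^2}{2} + \dfrac{a_3}{c} e^{cx}$ |
| Logistic | $y = \dfrac{a e^{bx}}{1 + \frac{ac}{b}\left(e^{bx} - 1\right)}$ |
| Log-logistic | $y = \dfrac{a b x^{a-1}}{1 + b x^{a}}$ |
| Simple power-exponential | $y = \dfrac{c_1}{x e^{-c_2 x}} + a_1 (x e^{-a_2 x})^{a_3}$ |
| Simple power-exponential | $y = \dfrac{c_1}{x e^{-c_2 x}} + \sum_{a \in A} a_1 (x e^{-a_2 x})^{a_3}$ |

Table 1.1: Different models for mortality rates [2].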


Chapter 2

Methodology

In this chapter we define the models that we are going to fit to our data and explain how we measure how good the models are. Throughout this research, $x$ denotes the age of the people and $y$ the number of deaths. In the numerical experiments we work with the logarithm of the data because this transformation gives a better fit of the models.

In this thesis we work with total deaths instead of mortality rates because the data provided are given as total deaths. Moreover, the data grouped by age are given as total deaths and not per 100,000 inhabitants, so after converting to a rate the numbers obtained are very small, which could cause large errors in our modelling. For example, in 2012 the population of the United States was 314 million. In the age group under 1 year old 7 people died, and in the age group 75+ 1666 people died. The corresponding rates per 100,000 inhabitants are 0.002229 and 0.530 respectively. Some decimals are lost in this process, so our models would not be very accurate.
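The arithmetic behind these two rates can be reproduced in a few lines. This is only an illustrative sketch in Python (the thesis itself used MATLAB), and the function name `rate_per_100k` is an assumption introduced here.

```python
def rate_per_100k(deaths, population):
    """Return the number of deaths per 100,000 inhabitants."""
    return deaths / population * 100_000


# Approximate US population in 2012, as quoted in the text
population_2012 = 314_000_000

under_one = rate_per_100k(7, population_2012)    # age group under 1 year
over_75 = rate_per_100k(1666, population_2012)   # age group 75+

print(f"under 1 year: {under_one:.6f} per 100,000")
print(f"75+:          {over_75:.6f} per 100,000")
```

Both values are far below 1, which illustrates why rounding the rates to a few decimals discards a large share of the information in the youngest age groups.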

2.1 Models

Tab. 1.1 in the introduction lists several models that are used to model the mortality rate. In this research we work with the number of deaths, so some of those models may not work for modelling the deaths caused by certain diseases. The mathematics behind the models nevertheless lets us apply them to total deaths. We are going to choose two models that follow the dynamics of the data, although the other models could work for other diseases.

Some of the mortality rate models, like the Thiele and Heligman-Pollard models, model the high child mortality. The diseases that we are going to model do not have that pattern, so these models are not going to model them well.

On the other hand, other models, like the Weibull and Logistic models, have been discarded due to the dynamics of the function. Moreover, there are models that might fit the patterns in our data very well, but we discard them because of their complexity, since in this first analysis we want to keep the complexity of our models low. For this reason the Modified Perks, Double Geometric, Gompertz-inverse Gaussian and Hannerz models are discarded.

2.1.1 Gompertz model

The Gompertz model is the simplest model for studying the mortality rate. The initial Gompertz curve was

$$y = e^{ax}$$

where $x$ is the age of the person and $a$ is the growth rate of the mortality curve [11]. This curve does not fit the data well. It is modified by adding another parameter,

$$y = a e^{bx} \tag{2.1}$$

where the new parameter is another scalar: $a$ now scales the curve and $b$ is the growth rate.

This model is useful when we try to model the mortality rate, but it models neither the "hump" in the general mortality rate nor some diseases, as will be shown in chapter three of this project.

2.1.2 The power exponential model

We will now study an improvement on the Gompertz model. With it we try to model the hump that appears in adolescence and thus reduce the error introduced by the previous model. This should improve the prediction of mortality compared to the forecast that would be obtained with the Gompertz model. The power-exponential model is given by:

$$y = \frac{c_1}{x e^{-c_2 x}} + a_1 (x e^{-a_2 x})^{a_3}.$$

This model is based on the combination of two curves, the second of which represents the hump. For the hump to occur, the parameters $a_1$, $a_2$ and $a_3$ must be positive. The reader is referred to [3] [1] for further information. The two terms can be seen in Fig. 2.1.
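To make the roles of the two terms concrete, the model can be evaluated term by term. The following is a minimal Python sketch, not the MATLAB code used in the thesis; the function names are assumptions introduced here, and the parameter values are the ones quoted in the caption of Fig. 2.1.

```python
import math


def growth_term(x, c1, c2):
    """First term c1 / (x * exp(-c2 * x)); requires x > 0."""
    return c1 / (x * math.exp(-c2 * x))


def hump_term(x, a1, a2, a3):
    """Second term a1 * (x * exp(-a2 * x)) ** a3, which produces the hump."""
    return a1 * (x * math.exp(-a2 * x)) ** a3


def power_exponential(x, c1, c2, a1, a2, a3):
    """Power-exponential model: sum of the growth term and the hump term."""
    return growth_term(x, c1, c2) + hump_term(x, a1, a2, a3)


# Parameter values quoted in the caption of Fig. 2.1 (influenza, 2008)
params = dict(c1=12.7889, c2=0.1135, a1=0.3009, a2=0.0743, a3=2.1006)

for age in (1, 5, 15, 25, 50, 80):
    total = power_exponential(age, **params)
    hump = hump_term(age, params["a1"], params["a2"], params["a3"])
    print(f"age {age:2d}: model = {total:10.3f}, hump term = {hump:7.3f}")
```

Printing the hump term next to the total shows at which ages the second term contributes most and where the first term takes over.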


Figure 2.1: Example of the influenza mortality rate curve given by the power-exponential model in the year 2008 with the parameters $c_1 = 12.7889$, $c_2 = 0.1135$, $a_1 = 0.3009$, $a_2 = 0.0743$, $a_3 = 2.1006$. We can see how the two terms of the model affect the curve. The second term models the "hump" and the first models the exponential growth at older ages.

To understand better how the hump works we can make a study of how the model changes depending on the five parameters. The following graphs have been taken from [3].

(a) Illustrations of changes for varying c1 (b) Illustrations of changes for c2

Figure 2.2: Curves depending on the parameters $c_1$ and $c_2$ [3]. The parameter $c_1$ controls how the model fits the initial points. In figure (b) we can see that the parameter $c_2$ models the slope.


(a) Illustrations of changes for a1 (b) Illustrations of changes for a2

(c) Illustrations of changes for a3

Figure 2.3: Curves depending on the parameters $a_1$, $a_2$ and $a_3$ [3].

Again we are going to apply the non linear least squares method to calculate the five parameters. Since we have more parameters than in the Gompertz model, we have to choose the initial values carefully to obtain the curve that best fits our data. If we analyse the formula, we can observe that when $x \to 0$

$$y \to \frac{c_1}{x}.$$

Then $c_1$ should be close to the first value of our data. Another interesting calculation is to take the logarithm of our model, differentiate it and see what happens when $x \to \infty$:

$$\frac{d}{dx} \ln y = \frac{\dfrac{c_1 c_2 e^{c_2 x}}{x} - \dfrac{c_1 e^{c_2 x}}{x^2} + a_1 a_3 (x e^{-a_2 x})^{a_3 - 1} \left(e^{-a_2 x} - a_2 x e^{-a_2 x}\right)}{a_1 (x e^{-a_2 x})^{a_3} + \dfrac{c_1 e^{c_2 x}}{x}}$$

and

$$\lim_{x \to \infty} \frac{d}{dx} \ln y = c_2.$$

So the slope of the curve will be approximately $c_2$ when $x \to \infty$.
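The limit can be checked numerically with a finite-difference estimate of the slope of $\ln y$. This is a hedged Python sketch with function names chosen here for illustration; it uses the parameter values quoted for Fig. 2.1. For the first term alone the slope is $c_2 - 1/x$, so the estimate should approach $c_2$ from below as $x$ grows.

```python
import math


def log_model(x, c1, c2, a1, a2, a3):
    """ln y for y = c1 * exp(c2 * x) / x + a1 * (x * exp(-a2 * x)) ** a3."""
    growth = c1 * math.exp(c2 * x) / x
    hump = a1 * (x * math.exp(-a2 * x)) ** a3
    return math.log(growth + hump)


def log_slope(x, params, h=1e-4):
    """Central finite-difference estimate of d/dx ln y at x."""
    return (log_model(x + h, *params) - log_model(x - h, *params)) / (2 * h)


# c1, c2, a1, a2, a3 as quoted in the caption of Fig. 2.1
params = (12.7889, 0.1135, 0.3009, 0.0743, 2.1006)

for age in (20, 50, 100, 200, 400):
    print(f"x = {age:3d}: slope of ln y = {log_slope(age, params):.5f}")
```

The convergence is slow (of order $1/x$), so within the range of human ages the slope of the logarithmic data is only roughly equal to $c_2$, which is still enough to choose a starting value for the fit.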

To study how the other three parameters shape the hump, we look for the maximum point, which corresponds to the hump. The term of the model that produces the hump is the second term,

$$y = a_1 (x e^{-a_2 x})^{a_3}.$$

We now study this function analytically. Since the logarithm is increasing, $y$ and $\ln y$ have their maximum at the same point, so we differentiate $\ln y$:

$$(\ln y)' = \frac{1}{a_1 (x e^{-a_2 x})^{a_3}} \, a_1 a_3 (x e^{-a_2 x})^{a_3 - 1} \left(e^{-a_2 x} - a_2 x e^{-a_2 x}\right) = \frac{a_3}{x} (1 - a_2 x).$$

So the maximum point is at

$$x = \frac{1}{a_2}.$$

To check that this point satisfies the maximum condition, we calculate the second derivative, which must be negative:

$$(\ln y)'' = \frac{-a_2 a_3 x - a_3 (1 - a_2 x)}{x^2} = \frac{-a_3}{x^2}.$$

The parameter $a_3$ is positive, so the second derivative is negative and the maximum condition is satisfied.

This point indicates the age at which the hump occurs. We can also compute the maximum height of the hump:

$$y\left(\frac{1}{a_2}\right) = a_1 \left(\frac{1}{a_2} e^{-a_2 \frac{1}{a_2}}\right)^{a_3} = \frac{a_1}{a_2^{a_3} e^{a_3}}.$$
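The closed-form location and height of the hump can be compared with a brute-force search over a grid of ages. The sketch below is in Python, with names chosen here for illustration, and uses the values of $a_1$, $a_2$ and $a_3$ quoted for Fig. 2.1.

```python
import math


def hump_term(x, a1, a2, a3):
    """Hump term a1 * (x * exp(-a2 * x)) ** a3 of the power-exponential model."""
    return a1 * (x * math.exp(-a2 * x)) ** a3


a1, a2, a3 = 0.3009, 0.0743, 2.1006  # values quoted in the caption of Fig. 2.1

# Closed-form results from the derivation
x_peak = 1 / a2
height = a1 / (a2 ** a3 * math.exp(a3))

# Brute-force check on a grid of ages 0.01, 0.02, ..., 100.00
grid = [i / 100 for i in range(1, 10001)]
x_grid = max(grid, key=lambda x: hump_term(x, a1, a2, a3))

print(f"analytic peak at x = {x_peak:.3f}, grid peak at x = {x_grid:.2f}")
print(f"analytic height = {height:.4f}, model value at peak = {hump_term(x_peak, a1, a2, a3):.4f}")
```

Because the location depends only on $a_2$, the age at which the hump is visible in the data gives a direct starting value $a_2 \approx 1 / x_{\text{hump}}$ for the fit.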

2.2 Non linear least squares method

To see how accurate the applied models can be, we first need to find appropriate parameter values. The non linear least squares method solves an optimization problem whose goal is to minimize the sum of the squares of the residuals. A typical non linear least squares problem is

$$\text{minimize} \quad \sum_{i=1}^{m} (y_i - f(x_i))^2$$

where $f$ is a differentiable function. [17]

In the optimality condition below, $f(\hat{x})$ denotes the vector of the $m$ residuals $f_i$, regarded as functions of the $n$ parameters, and $\hat{x}$ is the solution of the minimization problem. The optimality condition is

$$\nabla \| f(\hat{x}) \|^2 = 2 Df(\hat{x})^T f(\hat{x}) = 0$$

where $Df(\hat{x})$ is the Jacobian matrix

$$Df(\hat{x})_{ij} = \frac{\partial f_i}{\partial x_j}(\hat{x}), \quad i = 1, \dots, m, \quad j = 1, \dots, n.$$

This condition is necessary but not sufficient: the fact that a point satisfies it does not imply that the point is optimal.

To apply this method in our research we use MATLAB (see the Appendix for some of the code) to calculate the parameters that fit our models. Due to the complexity of the method, the result is very sensitive to the initial conditions, so we first have to analyse the data.
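As an illustration of the method, the following Python sketch fits the two-parameter Gompertz curve of Eq. 2.1 with Gauss-Newton iterations and step halving. It is not the MATLAB code used in the thesis, and this section does not state which solver was used there, so the algorithm, the function names and the synthetic data are all assumptions made for the example. Each iteration solves the normal equations $(J^T J)\,\delta = J^T r$, where $r$ is the residual vector and $J$ is the Jacobian of the model with respect to $(a, b)$.

```python
import math


def sse(xs, ys, a, b):
    """Sum of squared residuals for the Gompertz curve y = a * exp(b * x)."""
    return sum((y - a * math.exp(b * x)) ** 2 for x, y in zip(xs, ys))


def log_linear_guess(xs, ys):
    """Initial guess from a straight-line fit of ln(y) against x (needs y > 0)."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    slope = (sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
             / sum((x - mean_x) ** 2 for x in xs))
    return math.exp(mean_l - slope * mean_x), slope


def fit_gompertz(xs, ys, a, b, iterations=100):
    """Gauss-Newton with step halving, started from (a, b); returns fitted (a, b)."""
    current = sse(xs, ys, a, b)
    for _ in range(iterations):
        # Accumulate the 2x2 normal equations (J^T J) delta = J^T r
        s_aa = s_ab = s_bb = g_a = g_b = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = y - a * e            # residual
            ja, jb = e, a * x * e    # partial derivatives of the model
            s_aa += ja * ja
            s_ab += ja * jb
            s_bb += jb * jb
            g_a += ja * r
            g_b += jb * r
        det = s_aa * s_bb - s_ab * s_ab
        if det == 0:
            break
        da = (s_bb * g_a - s_ab * g_b) / det
        db = (s_aa * g_b - s_ab * g_a) / det
        # Halve the step until the sum of squares decreases
        step = 1.0
        while step > 1e-8:
            trial = sse(xs, ys, a + step * da, b + step * db)
            if trial < current:
                a, b, current = a + step * da, b + step * db, trial
                break
            step /= 2
        else:
            break  # no improving step was found, so stop iterating
    return a, b


# Hypothetical noise-free data generated with a = 2 and b = 0.05 (not thesis data)
ages = [0, 10, 20, 30, 40, 50, 60, 70, 80]
deaths = [2.0 * math.exp(0.05 * x) for x in ages]

a0, b0 = log_linear_guess(ages, deaths)
a_fit, b_fit = fit_gompertz(ages, deaths, 1.0, 0.04)
print("log-linear initial guess:", a0, b0)
print("Gauss-Newton fit from (1.0, 0.04):", a_fit, b_fit)
```

The step halving is there because a full Gauss-Newton step can overshoot when the starting point is poor, which is one concrete way in which the sensitivity to initial conditions shows up. A log-linear fit such as `log_linear_guess` is a cheap way to obtain a reasonable starting point for exponential models.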

2.3 Calculating the error

The main objective of this research is to find a model that fits the available data. To know which model is better, we calculate the error after applying the non linear least squares method. The study is based on the relative error,

$$\varepsilon_r = \frac{|\text{real value} - \text{approximate value}|}{\text{real value}}.$$

For each year we calculate the mean of the relative errors over all ages to evaluate how good the models are.
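The error measure can be written as a short function. This is a minimal Python sketch; the function name, the optional percentage scaling and the sample numbers are assumptions introduced for illustration, and the real values are assumed to be positive, as in the formula above.

```python
def mean_relative_error(real, approx, percent=False):
    """Mean of |real - approx| / real over all ages; real values must be positive."""
    errors = [abs(r - a) / r for r, a in zip(real, approx)]
    mean = sum(errors) / len(errors)
    return 100 * mean if percent else mean


# Hypothetical values for three age groups (not thesis data)
real_values = [10.0, 20.0, 40.0]
model_values = [11.0, 18.0, 40.0]

print(mean_relative_error(real_values, model_values))
print(mean_relative_error(real_values, model_values, percent=True))
```

Because every age group contributes its own relative error, a poor fit in a group with few deaths weighs as much as a poor fit in a group with many deaths.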


Chapter 3

Numerical experiments

In this chapter we are going to present results after performing numerical experiments. First we are going to show the results of modelling mortality caused by influenza and second the results of modelling mortality caused by mental diseases and self-harm. All the data that we are going to use in this chapter is given by WHO. [10]

3.1 Influenza

Due to improvements in medicine and health care, one could expect the number of influenza deaths to decrease over time. However, in recent years there has been an increase in people dying from the flu, see Fig. 3.1. This increase may be due to stronger strains, a decrease in the vaccination that prevents this disease, or an increase in pollution making us more sensitive. An interesting and important aspect to investigate is whether this increase occurs across all age groups or whether some parts of the population suffer more than others.

Figure 3.1: Evolution of the total number of deaths caused by influenza from 1980 to 2016 in United States. There are two spikes in 1982 and 2016 when the disease spread on a larger scale.


Figure 3.2: Evolution of the total number of deaths caused by influenza per 100,000 inhabitants from 1980 to 2016 in United States.

Figure 3.3: Total number of deaths caused by influenza across different years and different ages in United States.

We can see in Fig. 3.3 that the older part of the population is more susceptible to flu, and this is reflected in the number of deaths. We can also see that the spikes in total deaths in certain years, for example 1982 and 2016, affected the whole population and not only a specific age group.

In order to forecast the future, it is necessary to build a series of models that match the historical data, so that solutions can be found to the growing problem of the increase in influenza deaths in recent years. The main problem with trying to predict influenza cycles is that they do not follow a fixed period. This is why the proposed model will not fit the data provided. [10]

We can try to model influenza cycles in order to anticipate and mitigate the years when influenza is strongest. A first analysis shows that influenza is cyclic, with the population more affected in some years than in others. Observing how the total number of deaths changes, see Fig. 3.2, we can start with the hypothesis that the deaths caused by influenza follow a linear combination of trigonometric functions. So the first model to analyze is the following:

$$y = a + b \sin(cx) + d \cos(ex),$$

where $a$, $b$, $c$, $d$ and $e$ are the parameters to be fitted.

Applying the non linear least squares method we can obtain the parameters of this model from 1980 to 2016. We will see that the model fits some age groups better than others. This is because the cycles of influenza can be observed better in certain age ranges than in others, and there are ages where no cycle can be observed at all. For this reason we attempt to model this mortality up to age 55, see Fig. 3.4.

In addition, anomalous data, whether high or low numbers of deaths, do not fit the proposed model. We could continue to increase the degree of the model, for example by taking the square of the trigonometric function, but that would not give a good model either.

(a) 5-14 years (b) 15-24 years

(c) 25-34 years (d) 35-55 years

Figure 3.4: Data and corresponding model for different age ranges for the period 1980 to 2016 in United States.

As we can see in Fig. 3.4, this trigonometric model does not fit well because, as we said before, influenza does not exactly follow a cycle and has periods in which it is much stronger than in other years. For this reason, instead of trying to make a prediction for each age across the years, we will analyze what happens across the ages within each year and propose different models that fit the data.

3.1.1 Gompertz model

Fitting the model given by Eq. 2.1 to all the data from 1980 to 2016 with the non linear least squares method, we obtain the parameters that fit the data describing deaths caused by influenza. As explained in the methodology, the two parameters are calculated for each year, so we can see how they change over the years.

Figure 3.5: Evolution of the Gompertz parameters over the years in United States. The black curve corresponds to the first parameter and the red one corresponds to the second parameter in Eq. 2.1.

Fig. 3.5 lets us see whether the parameters follow the same dynamics as the data representing the influenza deaths or whether they are independent.


(a) First Gompertz parameter (b) Second Gompertz parameter

Figure 3.6: Evolution of the Gompertz parameters with respect to the total deaths in United States.

(a) (b)

Figure 3.7: Gompertz model and original data in United States. In Fig. (a) we can see the evolution of the total deaths from 1980 to 2016 and in Fig. (b) we can see the evolution in 2016 for different ages.

We can make a deeper study of how the different parameters affect the fit of the model. Fig. 3.6 (b) shows that the variation of the second parameter is small, so we try fixing its value and building a model that depends only on the first parameter. The standard deviation of the second parameter is $\sigma_b = 0.002286391$ and the mean is $\mu_b = 0.016327027$. Taking the mean value, our new model is

$$y = a e^{0.016327027 \cdot x}.$$

The relative errors of the Gompertz model with two parameters and with the fixed exponent are

| Year | Error Gompertz | Error Gompertz fix exponent | Difference |
|---|---|---|---|
| 2016 | 6.7754 | 9.6061 | -2.8307 |
| 2015 | 11.4279 | 12.7006 | -1.2727 |
| 2014 | 13.8727 | 13.1417 | 0.731 |
| 2013 | 9.3932 | 9.6132 | -0.22 |
| 2012 | 11.5867 | 12.9824 | -1.3957 |
| 2011 | 6.7083 | 6.7467 | -0.0384 |
| 2010 | 18.1986 | 20.4155 | -2.2169 |
| 2009 | 14.9169 | 18.8522 | -3.9353 |
| 2008 | 5.5154 | 5.4432 | 0.0722 |
| 2007 | 10.1797 | 10.1711 | 0.0086 |
| 2006 | 12.4582 | 13.7623 | -1.3041 |
| 2005 | 28.5445 | 28.7499 | -0.2054 |
| 2004 | 15.2007 | 24.3986 | -9.1979 |
| 2003 | 17.8503 | 17.7013 | 0.149 |
| 2002 | 7.7480 | 13.7179 | -5.9699 |
| 2001 | 7.5136 | 9.0797 | -1.5661 |
| 2000 | 13.0986 | 13.6175 | -0.5189 |

Table 3.1: Relative errors of the model with two parameters and with one parameter, by year. We can see that the model with the fixed exponent is mostly worse than the one with two fitted parameters.
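Once the exponent is fixed, the fixed-exponent Gompertz model is linear in its single remaining parameter, so the least squares value of $a$ has the closed form $a = \sum_i y_i e^{b x_i} / \sum_i e^{2 b x_i}$. The Python sketch below illustrates this under stated assumptions: the function name and the synthetic data are hypothetical, and the thesis may have obtained $a$ with the same iterative MATLAB routine used for the other models.

```python
import math

B_FIXED = 0.016327027  # mean of the second Gompertz parameter, as given in the text


def fit_scale(xs, ys, b=B_FIXED):
    """Least squares value of a in y = a * exp(b * x) when b is held fixed."""
    basis = [math.exp(b * x) for x in xs]
    return sum(y * e for y, e in zip(ys, basis)) / sum(e * e for e in basis)


# Hypothetical noise-free data generated with a = 3 (not thesis data)
xs = [0, 10, 20, 30, 40, 50, 60, 70, 80]
ys = [3.0 * math.exp(B_FIXED * x) for x in xs]

a_hat = fit_scale(xs, ys)
print(a_hat)
```

This is the computational advantage of fixing a parameter: the fit needs no iterations and no initial guess, at the price of the larger errors visible in most rows of Table 3.1.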

Figure 3.8: Evolution of the deaths caused by influenza, Gompertz model and Gompertz model with the first fixed parameter for different ages in United States in 2015.

The results shown in Table 3.1 and in Fig. 3.8 lead us to conclude that this model does not fit the data describing deaths caused by influenza well. This is especially noticeable for the age range 5-15 years, where the aforementioned "hump" occurs.

To try to solve this problem at young ages, we add another parameter that shifts the curve and lets us adjust the Gompertz model much better. The new model is the Gompertz-Makeham (G-M) model

$$y = a e^{bx} + c.$$

Repeating the process above we obtain

| Year | Error Gompertz | Error G-M | Difference |
|---|---|---|---|
| 2016 | 6.7754 | 6.8982 | -0.1228 |
| 2015 | 11.4279 | 11.4154 | 0.0125 |
| 2014 | 13.8727 | 13.8727 | 0 |
| 2013 | 9.3932 | 9.0427 | 0.3505 |
| 2012 | 11.5867 | 11.5299 | 0.0568 |
| 2011 | 6.7083 | 6.6831 | 0.0252 |
| 2010 | 18.1986 | 18.0941 | 0.1045 |
| 2009 | 14.9169 | 15.2269 | -0.31 |
| 2008 | 5.5154 | 5.5154 | 0 |
| 2007 | 10.1797 | 9.8084 | 0.3713 |
| 2006 | 12.4582 | 7.9192 | 4.539 |
| 2005 | 28.5445 | 25.2435 | 3.301 |
| 2004 | 15.2007 | 11.6692 | 3.5315 |
| 2003 | 17.8503 | 17.2228 | 0.6275 |
| 2002 | 7.748 | 6.7622 | 0.9858 |
| 2001 | 7.5136 | 7.3251 | 0.1885 |
| 2000 | 13.0986 | 10.071 | 3.0276 |

Table 3.2: Model errors depending on three parameters by year.

Figure 3.9: Evolution of deaths caused by influenza, Gompertz model, Gompertz model with the fixed exponent and G-M model for different ages in United States in 2015.

We can see in Table 3.2 and Fig. 3.9 that in several years, for example 2004, 2005 and 2006, the error has been greatly reduced. As we have said before, the errors are most influential at early ages, and with this new model we have managed to reduce them. On the other hand, there are also years where the error has increased, such as 2009 and 2016, so we cannot rely too much on this model. We still have the problem of the curve between four and twenty years of age. This leads us to consider a more complex model with which we will try to capture that jump and obtain a much more precise model.

3.1.2 Power-exponential model

Other diseases related to influenza are respiratory problems. Influenza cannot be analysed without also analysing respiratory problems, because the deaths caused by these diseases are not always well separated. The deaths caused by respiratory problems have the same distribution as the deaths caused by influenza. To avoid repeating the same process for both influenza and respiratory problems, in this subsection we model the respiratory problems.

We are going to fit the data using the power-exponential model. The original model is as follows.

$$y = \frac{a_1}{x e^{-a_2 x}} + a_3\left(x e^{-a_4 x}\right)^{a_5}.$$

Because of the way the data is distributed, and after testing this model, I decided to modify it. The data grows continuously and shows two humps, so the new model focuses on those humps. For that reason our model is based on six parameters, three for each hump. The new model is

$$y = a_1\left(x e^{-a_2 x}\right)^{a_3} + a_4\left(x e^{-a_5 x}\right)^{a_6}. \tag{3.1}$$

This model is based on the "hump" term of the original power-exponential model. The data for this disease has two "humps", which is why we build the model from two copies of the term that creates the "hump". We can see how the two terms affect the model in the following figure.

Figure 3.10: Respiratory problems deaths in the United States in 2015 and the two terms of the model given by Eq. 3.1 after calculating the parameters.
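A short calculation, added here as a reading aid and not taken from the thesis, indicates where each term of Eq. 3.1 peaks. It writes a generic term as $h(x) = c\,(x e^{-ax})^{b}$, where $a$, $b$ and $c$ are placeholder symbols standing for $(a_2, a_3, a_1)$ or $(a_5, a_6, a_4)$ and are assumed to be positive.

```latex
\begin{aligned}
\ln h(x) &= \ln c + b\,(\ln x - a x), \\
\frac{h'(x)}{h(x)} &= b\left(\frac{1}{x} - a\right) = 0
  \quad\Longleftrightarrow\quad x^{*} = \frac{1}{a}, \\
h(x^{*}) &= c\,(a e)^{-b}.
\end{aligned}
```

Under these assumptions the first term of Eq. 3.1 has its maximum at $x = 1/a_2$ and the second at $x = 1/a_5$. The exponents $a_3$ and $a_6$ change the height and sharpness of each hump but not its location, which may help when choosing initial values for the fit.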

The parameters of the model and the corresponding errors are given in the next table.

Year    a1         a2        a3        a4        a5        a6        Error
2016    0.00001    0.012     3.759     3.1256    0.1034    0.3415    4.3632
2015    0.0003     0.0122    3.1191    3.03      0.1271    0.4117    1.5874
2014    0.0002     0.0124    3.2182    3.2669    0.1398    0.4058    2.268
2013    0.0003     0.0123    3.06      3.0854    0.1407    0.4037    2.1825
2012    0.0001     0.012     3.2833    3.4498    0.1356    0.2957    3.5446
2011    0.00001    0.012     3.6572    3.5641    0.1226    0.2295    3.4186
2010    0.0002     0.0121    3.2008    3.3984    0.1375    0.3277    2.2374
2009    0.0001     0.012     3.3199    3.4745    0.137     0.2749    3.2723
2008    0.00001    0.0119    3.6021    3.7408    0.1309    0.1852    3.0847
2007    0.00001    0.0119    3.6286    3.3498    0.1118    0.2408    3.5398
2006    0.0003     0.0122    3.0828    2.9411    0.134     0.4108    2.0962
2005    0.0001     0.0118    3.3029    3.0565    0.116     0.3354    3.5142
2004    0.0001     0.0118    3.3818    3.0245    0.1109    0.3425    2.4619
2003    0.0002     0.012     3.2595    3.2217    0.1244    0.3557    1.8846
2002    0.0001     0.0115    3.4795    3.2084    0.0981    0.2341    3.4233
2001    0.0001     0.0117    3.4505    3.1567    0.1077    0.3105    3.261
2000    0.00001    0.0116    3.6894    3.139     0.0917    0.2904    3.0237

Table 3.3: Model errors depending on six parameters by year.

Let’s see the graphs of how the curve fits with the data.

(a) Exponential model in 2016 (b) Exponential model in 2015

Figure 3.11: Respiratory problems deaths and exponential model depending on the age in the United States. In (a) we can see the data and how the model fits in 2016, when the error is the largest, and in (b) in 2015, when the error is the smallest.

The next important point to consider is whether we can simplify the model by fixing some of the parameters. To do this, we look at the mean and the standard deviation of the parameters obtained in Table 3.3.

        a1          a2          a3          a4          a5          a6
Mean    0.000126    0.011964    3.382035    3.249011    0.121717    0.317405
Sd      0.000102    0.000232    0.215932    0.211486    0.015095    0.067322

Table 3.4: Mean and standard deviation for the parameters seen in Table 3.3.

The results in Table 3.4 show that the two parameters that vary the least are the first two, so let's fix only the first one, a1.
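The screening step just described can be sketched in a few lines of MATLAB. This is only an illustration: the two vectors are the a2 and a5 columns copied from Table 3.3, the variable names are hypothetical, and the use of the sample standard deviation (MATLAB's default for `std`) is an assumption about how the Sd row was obtained.

```matlab
% a2 and a5 for the years 2016..2000, copied from Table 3.3
a2 = [0.012 0.0122 0.0124 0.0123 0.012 0.012 0.0121 0.012 0.0119 ...
      0.0119 0.0122 0.0118 0.0118 0.012 0.0115 0.0117 0.0116];
a5 = [0.1034 0.1271 0.1398 0.1407 0.1356 0.1226 0.1375 0.137 0.1309 ...
      0.1118 0.134 0.116 0.1109 0.1244 0.0981 0.1077 0.0917];

P = [a2(:), a5(:)];     % one column per parameter, one row per year
m = mean(P);            % column means
s = std(P);             % sample standard deviations (assumed convention)
relSpread = s ./ abs(m);  % spread relative to the size of the parameter

% Index of the parameter with the smallest absolute spread
[~, idxMinSd] = min(s);
```

Because the parameters differ by several orders of magnitude, comparing `relSpread` instead of `s` may be a fairer way to rank them; that alternative is a suggestion of this sketch and is not used in the thesis.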

Fix the first parameter

The new model is going to be

$$y = 0.000126471\left(x e^{-a_2 x}\right)^{a_3} + a_4\left(x e^{-a_5 x}\right)^{a_6}. \tag{3.2}$$

We will keep the same numbering for the parameters so that they are easier to follow.
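One way to fix a parameter without rewriting the model is to wrap the full six-parameter function in a residual that only exposes the free parameters. The sketch below assumes the age midpoints and bounds used in the appendix code; `model`, `resid`, `pTrue` and the synthetic data are hypothetical names and values introduced for illustration, not the thesis data.

```matlab
% Two-hump model of Eq. 3.1, p = [a1 a2 a3 a4 a5 a6]
model = @(p, x) p(1).*(x.*exp(-p(2).*x)).^p(3) + ...
                p(4).*(x.*exp(-p(5).*x)).^p(6);

xdata = [10 20 30 45 65 85];                      % age midpoints, as in the appendix
pTrue = [0.000126471 0.012 3.33 3.1 0.12 0.37];   % illustrative values only
ydata = model(pTrue, xdata);                      % synthetic data, NOT the thesis data

a1Fixed = 0.000126471;                            % mean of a1, as in Eq. 3.2
% Residual in the five remaining parameters q = [a2 a3 a4 a5 a6]
resid = @(q) model([a1Fixed, q], xdata) - ydata;

q0 = [0.012 3.3 3.2 0.12 0.32];                   % start near the Table 3.4 means
r0 = resid(q0);                                   % residual vector at the start

% lsqnonlin needs the Optimization Toolbox, so the call is guarded
if exist('lsqnonlin', 'file') == 2
    lb = [0.01 0 0.1 0.1 0.1];                    % bounds taken from the appendix
    ub = [10 5 6 6 6];
    qHat = lsqnonlin(resid, q0, lb, ub);
end
```

Fixing further parameters then only requires moving more entries from `q` into the constant part of the vector passed to `model`.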

Year    a2        a3        a4        a5        a6        Error     Difference wrt original model
2016    0.012     3.3259    3.1234    0.1191    0.3742    4.6436    -0.2804
2015    0.0122    3.3414    3.028     0.1181    0.392     1.4371     0.1503
2014    0.0124    3.3643    3.2661    0.1344    0.3923    2.1621     0.1059
2013    0.0123    3.3535    3.082     0.1281    0.3751    1.9636     0.2189
2012    0.012     3.3242    3.4488    0.134     0.2939    3.5304     0.0142
2011    0.012     3.3168    3.5696    0.1406    0.2553    3.6123    -0.1937
2010    0.0121    3.3316    3.3956    0.1316    0.317     2.1786     0.0588
2009    0.012     3.3202    3.4742    0.1373    0.276     3.2788    -0.0065
2008    0.0118    3.3005    3.7485    0.1498    0.2072    3.2565    -0.1718
2007    0.0119    3.3028    3.3542    0.1292    0.2655    3.7207    -0.1809
2006    0.0122    3.3425    2.9385    0.1229    0.3857    1.9063     0.1899
2005    0.0118    3.3037    3.0558    0.1164    0.3371    3.5242    -0.01
2004    0.0119    3.3066    3.0247    0.1143    0.3492    2.5041    -0.0422
2003    0.012     3.3172    3.2208    0.1222    0.3514    1.8538     0.0308
2002    0.0115    3.2671    3.2114    0.1109    0.2493    3.5694    -0.1461
2001    0.0117    3.2936    3.1571    0.115     0.3235    3.3811    -0.1201
2000    0.0116    3.2789    3.1411    0.1091    0.3177    3.3424    -0.3187

Table 3.5: Model errors depending on five parameters by year.

Let's see how the parameters change.

        a2             a3             a4             a5             a6
Mean    0.011964706    3.317105882    3.2494         0.125470588    0.321317647
Sd      0.000232483    0.024754809    0.213461298    0.011063547    0.054408771

Table 3.6: Mean and standard deviation for the parameters seen in Table 3.5.

Figure 3.12: Respiratory problems deaths in United States in 2015. We can see the original model and the model with the first parameter fixed.

On the other hand, if we fix the second parameter instead of the first one, the error does not decrease in any of the cases. Let's see what happens if we fix the first parameter using Table 3.4 and the second parameter using Table 3.6.

Fix the second parameter in Eq. 3.2

If we look at Table 3.6 we see that the parameter that varies the least is the second one, a2, which is why we fix this parameter. The new model is going to be

$$y = 0.000126471\left(x e^{-0.011964706x}\right)^{a_3} + a_4\left(x e^{-a_5 x}\right)^{a_6}. \tag{3.3}$$

Year    a3        a4        a5        a6        Error     Diff. wrt original model    Diff. wrt Eq. 3.2
2016    3.3227    3.1252    0.1187    0.3722    4.6387    -0.2755                      0.0049
2015    3.32      3.0425    0.1143    0.3723    1.2944     0.293                       0.1427
2014    3.3239    3.2961    0.1276    0.3511    1.9315     0.3365                      0.2306
2013    3.3206    3.1043    0.1219    0.3424    1.6978     0.4847                      0.2658
2012    3.3179    3.452     0.133     0.2893    3.4804     0.0642                      0.05
2011    3.3152    3.5699    0.1405    0.2549    3.6075    -0.1889                      0.0048
2010    3.317     3.4034    0.129     0.3049    1.9916     0.2458                      0.187
2009    3.3173    3.4753    0.137     0.2745    3.2623     0.01                        0.0165
2008    3.3107    3.7437    0.1522    0.2164    3.359     -0.2743                     -0.1025
2007    3.3121    3.3484    0.1315    0.2746    3.8165    -0.2767                     -0.0958
2006    3.3187    2.9543    0.1185    0.3624    1.6833     0.4129                      0.223
2005    3.3144    3.0477    0.1188    0.3482    3.6356    -0.1214                     -0.1114
2004    3.3152    3.0184    0.1162    0.3581    2.5934    -0.1315                     -0.0893
2003    3.316     3.2209    0.1222    0.3514    1.8533     0.0313                      0.0005
2002    3.3084    3.1894    0.1212    0.2835    3.8683    -0.445                      -0.2989
2001    3.3132    3.1433    0.1191    0.3419    3.5338    -0.2728                     -0.1527
2000    3.3124    3.1184    0.116     0.347     3.4273    -0.4036                     -0.0849

Table 3.7: Model errors depending on four parameters by year.

We can see in Table 3.7 that in most of the years from 2009 to 2015 (remember that the mortality rate is increasing), fixing the first two parameters gives a lower error than the original model. This leads us to consider fixing further parameters.

        a3             a4             a5             a6
Mean    3.316217647    3.250188235    0.125747059    0.3203
Sd      0.004079682    0.212556326    0.010010648    0.045226151

Table 3.8: Mean and standard deviation for the parameters seen in Table 3.7.

Figure 3.13: Respiratory problems deaths in the United States in 2015. We can see the original model, the model with the first parameter fixed and the model with the first and second parameters fixed.

We can see that the changes are small and the error is similar, but by fixing the second parameter we now have two fixed parameters. This reduces the computational cost and simplifies our model.

The parameter with the next smallest standard deviation is a3, so it is the next one to be fixed. If we are able to reduce the error by fixing the third parameter, the first term will contain no free parameters, reducing our six-parameter model to a three-parameter model.

Fix the third parameter

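Following the results shown in Table 3.8, we now fix a3 at its mean value, and the new model is going to be

$$y = 0.000126471\left(x e^{-0.011964706x}\right)^{3.316217647} + a_4\left(x e^{-a_5 x}\right)^{a_6}. \tag{3.4}$$

Now let's see how this model fits the available data.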

Year    a4        a5        a6        Error     Diff. wrt original model    Diff. wrt Eq. 3.3
2016    3.1689    0.1164    0.3404    4.6663    -0.3031                     -0.0227
2015    3.0647    0.1127    0.3547    1.2129     0.3745                      0.2242
2014    3.3411    0.1249    0.3144    2.2076     0.0604                     -0.0455
2013    3.1289    0.1202    0.3215    1.7308     0.4517                      0.2328
2012    3.4611    0.1326    0.2819    3.472      0.0726                      0.0584
2011    3.5645    0.1407    0.2593    3.6133    -0.1947                     -0.001
2010    3.4077    0.1287    0.3013    1.9884     0.249                       0.1902
2009    3.481     0.1367    0.2697    3.2569     0.0154                      0.0219
2008    3.7176    0.1526    0.2386    3.4768    -0.3921                     -0.2203
2007    3.3261    0.1328    0.2929    3.9101    -0.3703                     -0.1894
2006    2.9684    0.1175    0.3503    1.7894     0.3068                      0.1169
2005    3.0369    0.1195    0.3567    3.65      -0.1358                     -0.1258
2004    3.0122    0.1166    0.363     2.5985    -0.1366                     -0.0944
2003    3.2197    0.1222    0.3523    1.8543     0.0303                     -0.0005
2002    3.1464    0.1241    0.3175    4.0609    -0.6376                     -0.4915
2001    3.1251    0.1202    0.3557    3.6121    -0.3511                     -0.231
2000    3.0953    0.1174    0.3643    3.5258    -0.5021                     -0.1834

Table 3.9: Model errors depending on three parameters by year.

Figure 3.14: Respiratory problems deaths in United States in 2015. We can see the original model, the model with the first parameter fixed, the model with the first and second parameters fixed and the model with the first term fixed.

As we can see in Table 3.9, by fixing the first three parameters we find curves that in several years (for example 2015, 2013 and 2006) fit the data better, and at the same time we have reduced the number of variables. We could therefore propose a model in which the first term has fixed parameters and only the parameters of the second term in Eq. 3.4, responsible for modelling the part of the data corresponding to older ages, are calculated.

3.1.3 Correlation between pollution and respiratory diseases

This subsection doesn't deal with modelling; instead we investigate the correlation between pollution and respiratory problems or influenza. As we said in the introduction, if we find a correlation between these factors, efforts can be directed at mitigating pollution and reducing the deaths.

Although pollution has not yet been linked to influenza, it is possible to demonstrate the relationship between the increase in respiratory diseases and pollution [7]. As we said in the introduction, after suffering an influenza cycle we are more vulnerable to other types of respiratory diseases. That is why in this section we look at how pollution and respiratory diseases have changed over the years.

Figure 3.15: Total CO2 emissions in the United States from 1980 to 2008, in 1000 metric tons of C.

Figure 3.16: Respiratory problems deaths in the United States from 1980 to 2008.

So it seems that they are quite related. To show that deaths due to respiratory diseases increase as pollution increases, we calculate Pearson's correlation coefficient between the yearly emissions X and the yearly deaths Y,

$$\rho = \frac{\mathrm{Cov}(X,Y)}{\sigma_X \sigma_Y} = 0.9659.$$

We can therefore say that the two series are strongly correlated. Moreover, this relationship is positive, which means that as one increases, the other also increases.
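As a minimal sketch of how such a coefficient can be computed, the MATLAB lines below evaluate the formula directly and compare it with the built-in `corrcoef`. The two series are hypothetical placeholder values chosen for illustration; they are not the emissions or mortality data used in the thesis.

```matlab
% Hypothetical yearly series (NOT the thesis data)
emissions = [1.20 1.25 1.31 1.38 1.42 1.50];
deaths    = [30.1 31.0 33.2 34.8 35.1 37.9];

n  = numel(emissions);
dx = emissions - mean(emissions);        % deviations from the mean
dy = deaths - mean(deaths);

covXY = sum(dx .* dy) / (n - 1);         % sample covariance
rho   = covXY / (std(emissions) * std(deaths));   % Pearson coefficient

R = corrcoef(emissions, deaths);         % built-in: 2x2 correlation matrix
```

The off-diagonal entry `R(1,2)` should agree with `rho` up to rounding. A high value only indicates that the two series move together; it does not by itself establish a causal link.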

If we compute Pearson's correlation coefficient between influenza and respiratory problems or pneumonia, we find a low correlation (the highest value, for pneumonia, is approximately 0.25). However, as we have said before, there is no clear separation between influenza deaths and other death categories with similar symptoms. That is why it cannot be ruled out that deaths caused by respiratory problems are related to the increase in influenza deaths. Therefore, it also cannot be ruled out that the high levels of pollution, which are increasing year after year, contribute to a more deadly influenza linked to respiratory problems.

3.2 Mental diseases and suicides

Cases of mental diseases have increased over the years, and some of these cases lead to the death of the patient. But could there be some way, either through information or laws, to reduce the number of people affected? For example, in 2010 U.S. President Barack Obama approved a law under which the population receives tax credits to subsidize the payment of health insurance. As a result, a large part of the population could gain access to free medical support. This factor could have been decisive for the decrease in mental disease deaths, which had been rising but began to decline considerably in 2013, see Fig. 3.17.

Figure 3.17: Evolution of deaths related with mental diseases from 1980 to 2016 in United States.

In the case of mental diseases, the Gompertz model fits well because deaths increase progressively as the population ages. On the other hand, the data on self-harm deaths will not fit this model because its shape resembles the Gaussian bell curve. This is why we have to resort to another model.

3.2.1 Mental diseases - mortality and correlation with self-harm

Deaths from mental diseases, as we have seen in Fig. 3.17, have been progressively increasing. Although the rate per 100,000 inhabitants has decreased since 2010, the number of people affected has not stopped growing. Now we examine whether this increase has been gradual across different ages or whether it has been more pronounced at certain ages.

Figure 3.18: Evolution of mental diseases deaths in different age intervals from 1980 to 2016 in United States.

As we can see in Fig. 3.18, the number of deaths increases at all ages, although less so in the population aged 75 or older, where it has decreased in recent years. In addition, the distribution of deaths within a given year increases with age without any jump of the kind seen for influenza, so the Gompertz model would fit this data well.

Deaths from mental diseases and from self-harm are closely related. If we calculate the correlation coefficient between these data we obtain

$$r = \frac{\mathrm{Cov}(\text{Mental diseases}, \text{Self-harm})}{\sigma_{\text{Mental diseases}}\,\sigma_{\text{Self-harm}}} = 0.9548.$$

So in order to find a solution to one problem, we would have to find a solution to the other. Next, we study how the data on deaths due to self-harm are distributed.

3.2.2 Self-harm

As the years went by, deaths from self-harm had been reduced, but with the arrival of the new millennium this number has not stopped growing.


Figure 3.19: Evolution of deaths related to self-harm from 1980 to 2016.

In this section we do not study all ages; we start from the age of 5 years, from which people can start to harm themselves consciously. If we recall how the power-exponential model works, we can take one of its terms, the one that creates the hump, and see how it adapts to the self-harm data.

So in this case our model will be

$$y = a_1\left(x e^{-a_2 x}\right)^{a_3}. \tag{3.5}$$

The study carried out by A. Boulougari (see Fig. 2.3) is again very useful, both for seeing how the parameters affect the curve and for identifying appropriate initial conditions for the parameter estimation.

We can observe that the value for ages between 5 and 14 years is very low. Because the error is measured relative to the observed value, such a low value is very influential, so if the fitted curve moves too far away from this point, the error increases.

With this model we face a problem whose severity depends on which millennium the data comes from. This is because before 2000 the rate of children dying from self-harm was low, and it increased considerably from that year onwards. After applying the non-linear least squares method and calculating the parameters, we can judge how well the model fits from the error it gives.

Year    Error      Year    Error
2016    19.656     1997     16.2418
2015    19.382     1996     21.6593
2014    19.9201    1995     27.953
2013    19.3758    1994     26.5308
2012    19.1616    1993     25.6004
2011    20.3694    1992     35.3354
2010    21.3814    1991     40.6867
2009    20.4276    1990     49.4029
2008    19.264     1989     53.5149
2007    21.9224    1988     50.9086
2006    20.9372    1987     54.9348
2005    21.1399    1986     54.5765
2004    19.329     1985     61.74
2003    19.4499    1984     68.3176
2002    18.7155    1983     73.0717
2001    20.7365    1982     93.7904
2000    22.163     1981    116.4456
1999    18.2229    1980    107.4598
1998    18.3603

Table 3.10: Power-exponential term 2 model from 1980 to 2016.

We can see in Table 3.10 that for the years before 2000 the error increases as we go back to 1980. It should be remembered that the number of deaths was declining before 2000 and increased considerably afterwards. This is reflected in the data, and the graphical display is shown in Fig. 3.18. In the data prior to 2000 there is a low number of deaths from self-harm at young ages, while from the new millennium this number increases considerably, which makes the error of the model considerably lower. On the other hand, if we remove the young ages from our data, we see in Table 3.11 that the error is reduced a lot.

Year    Error      Year    Error
1999    22.2574    1989     7.8393
1998    21.6133    1988     9.8758
1997    18.5029    1987    11.9323
1996    15.6742    1986    13.4315
1995    14.8926    1985    15.8229
1994    11.7293    1984    13.7852
1993    10.5941    1983    16.3385
1992     8.2046    1982    16.5367
1991     8.1812    1981    18.0916
1990     8.8657    1980    18.8949

Table 3.11: Model given by Eq. 3.5 without the teenager ages.

On the other hand, we can see how the parameters change and try to approximate them.

                      a1             a2             a3             Error
Mean                  0.414218919    0.028364865    4.329927027     36.70499189
Mean before 2000      0.7587         0.03365        3.64707         50.73767
Mean after 2000       0.008947059    0.022147059    5.133288235     20.19595882
Minimum               0.001          0.022          3.0887          16.2418
Maximum               1.6821         0.23           5.8022         116.4456
Min before 2000       0.01           0.0223         3.0887          16.2418
Max before 2000       1.6821         0.23           4.9694         116.4456
Min after 2000        0.001          0.022          4.7755          18.7155
Max after 2000        0.02           0.023          5.8022          22.163
Standard deviation    0.563551264    0.033613504    0.877465166     26.07908032
Sd before 2000        0.573786818    0.045048957    0.576591423     28.78847783
Sd after 2000         0.006374039    0.000247618    0.301108212      1.00623868

Table 3.12: Model given by Eq. 3.5 parameters.

From this table we can draw two important conclusions. The first is that the parameters of the model after the year 2000 are concentrated, so we can try to fix these parameters and see what happens. The second is that the parameters for the years prior to 2000 are more dispersed, so the complexity of the modelling problem increases. Because the system is very sensitive to initial conditions and boundary conditions, we will see what happens if we fix the second parameter, which according to the data is the one that varies least and is most centred.

For this reason we repeat the fit taking a2 = 0.022147059. The new model is then

$$y = a_1\left(x e^{-0.022147059x}\right)^{a_3}. \tag{3.6}$$

As the reader can verify, we keep the notation a3 so that the notation does not change continuously and the evolution of this parameter is easier to follow.
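Once a2 is fixed, the single-term model y = a1 (x e^{-a2 x})^{a3} becomes linear in ln a1 and a3 after taking logarithms, since ln y = ln a1 + a3 (ln x - a2 x). The sketch below uses that observation to obtain starting values by ordinary linear least squares. It is a suggestion and not the procedure used in the thesis: the data are synthetic, the names are hypothetical, and a fit on the logarithmic scale minimises a different error from the percentage error reported in the tables, so its result is best treated as an initial guess for the non-linear fit.

```matlab
a2 = 0.022147059;                    % fixed value of a2 used in the text
x  = [10 20 30 45 65 85];            % age midpoints, as in the appendix
a1True = 0.0057;                     % illustrative values only,
a3True = 5.2;                        % not fitted results
y  = a1True .* (x .* exp(-a2 .* x)).^a3True;   % synthetic data

u = log(x) - a2 .* x;                % regressor ln(x * exp(-a2*x))
A = [ones(numel(x), 1), u(:)];       % design matrix for [ln(a1); a3]
c = A \ log(y(:));                   % linear least squares solution

a1Start = exp(c(1));                 % starting value for a1
a3Start = c(2);                      % starting value for a3
```

With noise-free synthetic data the two starting values reproduce `a1True` and `a3True`; with real data they would only be approximate and could be passed as the initial point of the non-linear least squares fit.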

Year    a1             a3             Error          Original error    Difference
2016    0.0082         5.1286         20.0725        19.656            -0.4165
2015    0.0041         5.3705         22.9365        19.382            -3.5545
2014    0.0023         5.5644         23.4409        19.9201           -3.5208
2013    0.006          5.2195         18.9438        19.3758            0.432
2012    0.004          5.3577         19.0836        19.1616            0.078
2011    0.004          5.3489         19.723         20.3694            0.6464
2010    0.004          5.3378         20.4865        21.3814            0.8949
2009    0.002          5.5799         19.251         20.4276            1.1766
2008    0.001          5.8163         18.6565        19.264             0.6075
2007    0.002          5.5494         20.7916        21.9224            1.1308
2006    0.007          5.0846         20.2222        20.9372            0.715
2005    0.008          5.0342         21.4942        21.1399           -0.3543
2004    0.005          5.1948         19.1567        19.329             0.1723
2003    0.007          5.0734         19.5701        19.4499           -0.1202
2002    0.009          4.9685         18.8976        18.7155           -0.1821
2001    0.016          4.7407         20.3733        20.7365            0.3632
2000    0.007          5.0375         19.0684        22.163             3.0946
Mean    0.005682353    5.259217647    20.12755294    20.19595882        0.068405882
Sd      0.003484628    0.262095313    1.347183785    1.00623868         1.533747391

Table 3.13: Evolution of the model given by Eq. 3.5 parameters and error from 2000 to 2016.

Figure 3.21: Power-exponential term 2 model in 2016 in United States with the second parameter fixed.

We now observe that the first parameter remains more or less stable, so the next step is to study the model with this parameter fixed.

To do this, we take the mean value of parameter a1, and the model becomes

$$y = 0.005682353\left(x e^{-0.022147059x}\right)^{a_3}.$$

The resulting error is

Year    a3             Error          Original error    Difference
2016    5.2624         21.0233        19.656            -1.3673
2015    5.2528         22.2313        19.382            -2.8493
2014    5.2404         21.5315        19.9201           -1.6114
2013    5.2392         18.9043        19.3758            0.4715
2012    5.2305         19.8164        19.1616           -0.6548
2011    5.2221         21.4352        20.3694           -1.0658
2010    5.2104         22.3655        21.3814           -0.9841
2009    5.2018         25.0884        20.4276           -4.6608
2008    5.187          28.0153        19.264            -8.7513
2007    5.1713         24.7453        21.9224           -2.8229
2006    5.16003        20.3983        20.9372            0.5389
2005    5.1584         22.4869        21.1399           -1.347
2004    5.1483         19.5056        19.329            -0.1766
2003    5.1491         20.0285        19.4499           -0.5786
2002    5.1354         20.7969        18.7155           -2.0814
2001    5.1167         25.4163        20.7365           -4.6798
2000    5.1135         19.3599        22.163             2.8031
Mean    5.188195882    21.94993529    20.19595882       -1.753976471
Sd      0.046772653    2.452455476    1.00623868         2.495627974

Table 3.14: Evolution of the power-exponential term 2 model parameter and error from 2000 to 2016.

As we can see, this new model, which depends on only one parameter, fits worse and increases the error considerably with respect to the original model. It is therefore also not better than the previous model with one parameter fixed.

If, on the other hand, we fix the a3 parameter at the mean obtained for the model with two free parameters, we do not improve the error either. Nor do we get better results if we use the mean value obtained for the model with one free parameter, so the best model we can have for this disease is the model given in Eq. 3.6.


Figure 3.22: Power-exponential term 2 model in 2016 in United States with the first and second parameters fixed.

We can see that the models with one and two parameters fixed are very similar, but the small differences make the model with one parameter fixed better than the other.

Chapter 4

Conclusion and future work

4.1 Project summary

In this work we have analysed different models that represent, with greater or lesser precision, mortality data for diseases that are very common and that take many lives each year. We have also analysed the available data in order to develop a model that can be used for reliable forecasting, which in turn can be used by authorities to allocate resources to preventing and reducing these deaths in different age groups.

The first model we considered is the Gompertz model, which follows an exponential curve. This model did not have the required complexity, hence the error it introduced was higher than for the other models studied later in this project. The largest error introduced by this model occurs at young ages, because it could not model the hump.

Some improvement was achieved by the G-M model. This model adds a new parameter to try to model the hump that we observed at young ages. However, the G-M model could not represent the hump either, so an alternative model was studied. The next model that we tested overcame the main disadvantage of the Gompertz model mentioned above. The power-exponential model introduces a new term that models the hump occurring between 10 and 20 years of age. This model depends on 5 parameters, 2 more than the G-M model, so the complexity increased. This meant that we had to analyse the initial conditions very carefully in order to find the parameters using the non-linear least squares method.

In this research we reached several interesting conclusions. For example, for influenza we observed that in recent years more people are dying from the disease. Also, the error of the Gompertz model does not improve when we fix parameters. Moreover, we showed that the power-exponential model fits the deaths caused by respiratory problems well and that we can fix the first term while keeping a good fit. The deaths caused by mental diseases have been increasing in recent years, and we showed how well the exponential model fits the available data.

These conclusions will help future research to determine which models fit each disease better and in which age groups the biggest problems occur.

4.2 Future Work

Throughout this work we have seen the weaknesses and strengths of these models in representing mortality data connected to diseases. A possible project for future work could be introducing more complex models or comparing different fitting methods. Different forecasting methods could also be analysed. The models could also be improved by treating different age groups separately.

In addition, another interesting focus for future work would be a more in-depth study of the correlation of diseases with their possible causes, as we have done for the correlation between pollution and deaths from respiratory problems. Moreover, we found a correlation between influenza and pneumonia in the most recent years, but our sample was not good enough, so a deeper analysis would be necessary as future work. Outside the scope of the modelling study, it would also be necessary to study possible ways to reduce the number of deaths, which is increasing each year, and to look for areas where such deaths could be reduced. This would not only be a matter of saving lives, because these deaths also involve considerable costs and demands on public services.

To conclude this section, another interesting topic that deserves study is mortality from tropical diseases in non-tropical areas, which is appearing as a consequence of climate change. This analysis has not been done in this thesis because no data was available: these deaths are very recent, although they are already causing great problems. These deaths and the spread of these diseases are connected to pollution and the greenhouse effect.

Chapter 5

Summary of reflection of objectives in the thesis

5.1 Objective 1: Knowledge and understanding

This research demonstrates deep knowledge and understanding of the different models used to model the diseases studied. In the chapter on numerical experiments we studied the models that best fit the diseases and applied numerical analysis to find the parameters of those models.

5.2 Objective 2: Methodological knowledge

First of all, we gave a complete description of the models and the numerical methods used in the research, to help the reader understand the analysis carried out in the numerical experiments. We also had to analyse the different models used in mortality modelling in order to model the mortality caused by different diseases.

5.3 Objective 3: Critically and Systematically Integrate Knowledge

Many models have been studied to analyse the mortality rate, but there were not many papers analysing these models for deaths from specific diseases. We therefore carried out a systematic analysis of these models to find which ones were the best candidates for our diseases.

5.4 Objective 4: Independently and Creatively Identify and Carry out Advanced Tasks

Much of the research on mortality rates focuses only on the mortality rate and not on the causes. This research makes an independent and creative analysis of some of the patterns of the mortality rate, focusing on disease-specific mortality.

5.5 Objective 5: Present and Discuss Conclusions and Knowledge

The structure of this thesis is intended for readers with different levels of mathematical background. First, the models and the numerical methods are explained to give the reader the basics needed to follow the thesis. In the numerical experiments we present the conclusions obtained after fitting our models.

5.6 Objective 6: Scientific, Social and Ethical Aspects

This research needed a lot of data to build the models. To obtain the data I had to check many databases in which data could be missing or corrupted.

Bibliography

[1] MODELLING MORTALITY RATES USING POWER-EXPONENTIAL FUNCTIONS, Karl Lundengård, Milica Rancic and Sergei Silvestrov, Division of Applied Mathematics, Mälardalen University, SE-721 23 Västerås, Sweden.

[2] COMPARISON OF PRICES OF LIFE INSURANCES USING DIFFERENT MORTALITY RATES MODELS, Belinda Straß, Master thesis in mathematics / applied mathematics, Division of Applied Mathematics, Mälardalen University, SE-721 23 Västerås, Sweden.

[3] APPLICATION OF A POWER-EXPONENTIAL FUNCTION BASED MODEL TO MORTALITY RATES FORECASTING, Andromachi Boulougari, Master thesis in mathematics / applied mathematics, Division of Applied Mathematics, Mälardalen University, SE-721 23 Västerås, Sweden.

[4] MODELLING MORTALITY WITH JUMPS: APPLICATIONS TO MORTALITY SECURITIZATION, Hua Chen and Samuel H. Cox, Journal of Risk & Insurance, September 2009.

[5] EXCESS OF MORTALITY IN ADULTS AND ELDERLY AND CIRCULATION OF SUBTYPES OF INFLUENZA VIRUS IN SOUTHERN BRAZIL, André Ricardo Ribas Freitas and Maria Rita Donalisio, Frontiers in Immunology, January 2018.

[6] REAL-TIME ESTIMATION OF THE INFLUENZA-ASSOCIATED EXCESS MORTALITY IN HONG KONG, Jessica Y. Wong, Edward Goldstein, Vicky J. Fang, Benjamin J. Cowling and Peng Wu, Epidemiology and Infection 147, May 2019.

[7] MULTIVARIATE ANALYSIS OF RESPIRATORY PROBLEMS AND THEIR CONNECTION WITH METEOROLOGICAL PARAMETERS AND THE MAIN BIOLOGICAL AND CHEMICAL AIR POLLUTANTS, István Matyasovszky, László Makra, Beatrix Bálint, Zoltán Guba and Zoltán Sümeghy, Atmospheric Environment, May 2011.

[8] POLLUTION DEATHS, Ivan Araque Cristobal, Project in Biomathematics, Mälardalen University, SE-721 23 Västerås, Sweden.

[9] INFLUENZA, https://www.cdc.gov/flu/about/burden/faq.htm, May 2019.

[10] WHO MORTALITY DATABASE, https://apps.who.int/healthinfo/statistics/mortality/whodpms/, May 2019.

[11] ON THE NATURE OF THE FUNCTION EXPRESSIVE OF THE LAW OF HUMAN MORTALITY, AND ON A NEW MODE OF DETERMINING THE VALUE OF LIFE CONTINGENCIES, B. Gompertz (1825), Philosophical Transactions of the Royal Society, 115: 513–585. doi:10.1098/rstl.1825.0026. JSTOR 107756.

[12] SUICIDE IN THE UNITED STATES, https://afsp.org/about-suicide/suicide-statistics/, May 2019.

[13] DISAPPEARING DISEASES: THE GLOBAL FIGHT AGAINST POLIO, GUINEA WORM AND MEASLES, https://www.theguardian.com/health-revolution/2016/apr/19/diseases-eradicate-polio-guinea-worm-measles-vaccination, May 2019.

[14] PRINCIPLES OF EPIDEMIOLOGY IN PUBLIC HEALTH PRACTICE, THIRD EDITION: AN INTRODUCTION TO APPLIED EPIDEMIOLOGY AND BIOSTATISTICS, https://www.cdc.gov/csels/dsepd/ss1978/lesson3/section3.html, May 2019.

[15] RE-EMERGING DISEASES: WHY SOME ARE MAKING A COMEBACK, https://www.verywellhealth.com/why-some-diseases-are-re-emerging-4151072, May 2019.

[16] AIR POLLUTION: EVERYTHING YOU SHOULD KNOW ABOUT A PUBLIC HEALTH EMERGENCY, https://www.theguardian.com/environment/2018/nov/05/air-pollution-everything-you-should-know-about-a-public-health-emergency, May 2019.

[17] LEAST SQUARES DATA FITTING WITH APPLICATIONS, Per Christian Hansen, Victor Pereyra and Godela Scherer, The Johns Hopkins University Press, 2013.

Appendix

Code Gompertz model

clear all
close all
filename = 'datathesis.xlsx';
sheet = 3;
xlRange = 'B3:AL10';
DATA = xlsread(filename,sheet,xlRange); % Taking the data from my database
ydata = log(DATA(:,17)');               % logarithm of column 17 of the range
xdata = [1 3 10 20 30 45 65 85];        % ages of the people
% non-linear least squares method
fun10 = @(x) x(3)+x(1)*exp(x(2)*xdata)-ydata;
x0 = [0,0,1];
x1 = lsqnonlin(fun10,x0,[0 0 0],[inf inf inf])
% Calculate the error (mean absolute percentage error)
err = abs(x1(3)+x1(1)*exp(x1(2)*xdata)-ydata)./ydata*100;
mean(err)

Code Power-exponential model

clear all
close all
filename = 'datathesis.xlsx';
sheet = 3;
xlRange10 = 'B3:AL10';
DATA = xlsread(filename,sheet,xlRange10);
ydata = DATA(:,1)'; % y values for the complete data (column 1)
xdata = [1 3 10 20 30 45 65 85]; % x values for the complete data (ages)
% Nonlinear least-squares method
fun = @(x) x(1)./(xdata.*exp(-x(2).*xdata))+x(3)*(xdata.*exp(-x(4).*xdata)).^(x(5))-ydata; % Residuals of the power-exponential model
x0 = [36,0.2,0.2,0.0498,0.6589]; % Initial guess
x = lsqnonlin(fun,x0)
% Calculate the error (mean relative error in percent)
err = abs(fun(x))./ydata*100;
mean(err)


Code Power-exponential model modified

clear all
close all
filename = 'datathesis.xlsx';
sheet = 4;
xlRange = 'B12:AL18';
DATA = xlsread(filename,sheet,xlRange);
ydata = DATA(:,17)'; % Must have as many entries as xdata (six age groups)
xdata = [10 20 30 45 65 85]; % Ages of the people
% Nonlinear least-squares method
fun10 = @(x) x(1).*(xdata.*exp(-x(2).*xdata)).^x(3)+x(4).*(xdata.*exp(-x(5).*xdata)).^x(6)-ydata; % Residuals of the modified model
x0 = [1 1 1 1 1 1]; % Assumption: the original listing does not define x0; this is an arbitrary start inside the bounds
x1 = lsqnonlin(fun10,x0,[0.001 0.01 0 0.1 0.1 0.1],[10 10 5 6 6 6])
% Calculate the error (mean relative error in percent, skipping the first age group)
err = abs(fun10(x1))./ydata*100;
mean(err(2:end))

