Mind the gap – save lives?


DEPARTMENT OF ECONOMICS
Uppsala University

Economics C: Thesis Work, 15.0 c
Author: Anna Tybrandt
Supervisor: Niklas Bengtsson
Fall Semester 2019


Abstract

The aim of this thesis is to study the relationship between economic inequality and violent crime, primarily on a global level but with a special focus on Latin America. Three regression models are used: a multiple linear regression model with cross-national data on a global level and, for Latin America, a multiple linear regression model and a fixed effects regression model, both with panel data. The data cover the period 2000-2015 and include 119 countries globally and 18 countries regionally. Latin America is of particular interest because the region has both the highest levels of economic inequality and the highest levels of violent crime in the world. Does this mean that the effect of inequality on crime is stronger in Latin America than in the rest of the world? In line with many previous studies on the same topic, the results show a statistically significant positive relationship between economic inequality and violent crime on a global level. The results for the multiple linear regression on Latin America show a similar effect, though weaker than on a global level and not statistically significant. The results of the fixed effects regression are more difficult to interpret. To summarize, the results indicate that economic inequality is a strong predictor of violent crime, and on a global level, where the null hypothesis can be rejected, they confirm it. This effect seems to be weaker in Latin America than on a global level. A possible explanation is that the slope is diminishing, i.e. once inequality reaches a certain level, its effect on violent crime starts to decrease, which shows up as a weaker relationship. However, this would need to be studied further in future research.

Keywords

Economic inequality, violent crime, income inequality, homicide, Latin America

Acknowledgments


List of tables

Table I: Definitions of variables
Table II: Descriptive statistics - global level
Table III: Descriptive statistics - Latin America
Table IV: Regression results – global level
Table V: Regression results – Latin America (multiple linear regression)
Table VI: Regression results – Latin America (fixed effects regression)
Table AI: Countries included – global level
Table AII: Countries included – Latin America

List of Abbreviations

CEPAL   Comisión Económica para América Latina y el Caribe (Spanish name of ECLAC)
ECLAC   Economic Commission for Latin America and the Caribbean
GDP     Gross Domestic Product
LAC     Latin America and the Caribbean
OVB     Omitted Variable Bias
WHO     World Health Organization


1. Introduction

Latin America has long been called the forgotten continent. It has been a region in the periphery, rarely making it to the headlines of the world’s main newspapers. But something is changing. During the last months of 2019, reports about discontent and protests across the region caught the attention of the rest of the world and sparked an interest in understanding the conditions that characterize Latin American countries. What started as a student protest against a raised metro fee in Santiago, Chile, quickly turned into nationwide demonstrations against inequality and political exclusion (The Guardian, 2019). In Bolivia, chaos surrounding the recent election and suspected election fraud by the administration of sitting president Evo Morales eventually led to his resignation, and people took to the streets both to support and to protest against these events (Washington Post, 2019). In Colombia, large demonstrations were held to show discontent with the current government, the insufficient implementation of the historic 2016 peace deal and inequality in general (The Guardian, 2019). Additionally, women’s movements in Latin America gained worldwide attention for their work on issues such as the fight for legal abortion and against violence against women (El País, 2019).

Naturally, many things differ from one Latin American country to another, but one thing that characterizes the region as a whole is economic inequality. Latin America is the most unequal region in the world. According to a report published by Oxfam in 2015, 32 people own the same amount of wealth as 50% of the region’s population. In 2014, the richest 10% had access to 71% of the wealth, and the richest 1% to 41%. If this trend continued, the richest 1% would possess over 50% of the total wealth by 2022, leaving the rest to the remaining 99% of the population in the region. The same report also states that the wealth of the 101 billionaires in Latin America would be enough to eradicate poverty in Ecuador, El Salvador, Nicaragua, Paraguay, Peru and the Dominican Republic (Oxfam, 2015).

The following quote is translated from ECLAC’s report on inequality, published in 2016:


societies in the region. It is characterized by a complex framework where the inequalities of socioeconomic origin crisscross inequalities based on gender, territory, ethnicity, race and generation” (ECLAC, 2016)

As the quote indicates, inequality in Latin America is not something that is just shown in income or wealth disparities. It is a structural inequality that affects all sectors of society, including education, healthcare and public transport. Many Latin American cities are extremely divided in the sense that many citizens with resources live in gated communities, send their children to private schools and would never use public transport. Since the tax system is often dysfunctional and the informal economy large, this contributes to a situation where there are not enough resources to improve public services and thereby reduce the disparities between the ones who can afford to choose the private options and the ones who cannot (ECLAC, 2018).

Another thing that characterizes Latin American countries is violent crime. When excluding countries that are battling wars or internal armed conflicts, Latin America is the most violent region in the world. It is home to 8% of the world’s population but 33% of all homicides, the measurement most often used to study violent crime. On the global list of countries with the highest homicide rate, 14 out of 20 are Latin American, and when ranking the cities with the most homicides, 43 out of 50 are located in Latin America. Of all the people murdered in the world, one out of four is from Brazil, Mexico, Colombia or Venezuela (Muggah and Aguirre, 2018). What is the reason behind these high levels of violent crime in Latin America? One often mentioned hypothesis is that there is a positive relationship between economic inequality and violent crime, i.e. a rise in inequality will lead to a rise in crime. Therefore, the main research question in this thesis is: is there a causal effect of economic inequality on violent crime?

There is a large body of previous literature on this topic. However, most of these studies use either data for a single country or only data on a global level. By adding a special focus on Latin America and comparing the global versus the regional relationship between


The main empirical model for the study will be a multiple linear regression on a global level, using data for 119 countries that cover the time period 2000-2015. Economic inequality (measured as the Gini coefficient of income inequality) will be the explanatory variable and violent crime (measured as homicide rate per 100 000 people) will be the dependent variable. In addition, a multiple linear regression as well as a fixed effects regression with data specifically for Latin America will be performed in order to compare the global and the regional level.


2. Theoretical background

This thesis belongs to the field of economics that is usually referred to as development economics. Research within this field includes themes such as economic growth, poverty, structural change, the role of institutions, inequality, education, health and agriculture. Perkins et al. (2012) highlight the important difference between economic growth and economic development, two terms that are often mixed up. Economic growth simply describes the increase in national or per capita income in a country, i.e. the increase in the production of goods and services. However, talking about economic growth does not reveal anything about the distribution of income among the population, nor about the quality of education or healthcare in the country. Economic development, on the other hand, also requires improvements in human welfare such as school enrollment, life expectancy and political freedom (Perkins et al., 2012).

This thesis contributes to the field of development economics since it focuses on economic inequality and discusses how it might impact the levels of violent crime, themes that are often examined within this field. This section provides a closer explanation of the main concepts of the thesis as well as a summary of previous research on similar questions.

2.1. Economic inequality

As mentioned before, economic growth does not automatically mean that the whole population benefits from it. If the distribution of income in the country is highly concentrated to just a few people, growth will probably not improve the living standard for the rest of the population. The distribution of income is the most commonly used dimension when speaking of inequality, but inequality can also refer to the distribution of wealth, consumption and assets such as land and human capital. It is a concept that has always been present in some form within the field of economics. Well-known economists such as David Ricardo, Karl Marx and Thomas Malthus discussed distribution back in the 19th century, but it was not until the work of Simon Kuznets in the mid-20th century that the study of inequality was really put in the spotlight. According


of inequality, in contrast to previous economists who had to base their research predominantly on theoretical assumptions (Piketty, 2018).

In "Economic Growth and Income Inequality" from 1955, Kuznets discussed the relationship between growth and inequality. The basic idea is that when a country develops from a traditional to a modern society, i.e. transforms from a predominantly rural, agricultural economy to an urban, industrialized economy, inequality will rise. This is because per capita income is usually higher among urban citizens, which means that when the urban part of the population increases, there will be larger income differences within the total population. Kuznets then presented what has become known as the Kuznets curve, an inverted-U curve demonstrating this relationship. According to the model, there is a positive slope between growth and inequality in the beginning, with a rise in growth leading to a rise in inequality. However, when growth reaches a certain point, the relationship changes and becomes negative, meaning that a rise in growth leads to a decline in inequality. In other words, Kuznets argued that countries in the transition from what we today call developing countries to developed countries would experience an increase in inequality during the development process, but once they reached a certain level of development this inequality would start to even out and eventually decrease (Anand & Kanbur, 1993).

In the decades that have passed since Kuznets published his work, many economists have rejected his theory of the inverted-U curve, claiming that there is no such relationship


2.2. Violent crime

Mire and Robertson (2011) present the following definition of violence and violent crime:

“Violence is the intentional use of physical force or power, threatened or actual, against oneself, another person, or against a group or community that either results in or has a high likelihood of resulting in injury, death, psychological harm, maldevelopment, or deprivation and is legally proscribed by law.”

The concept of violence is also divided into different sub-groups, including interpersonal violence, self-directed violence, collective violence and economic violence. This thesis focuses on interpersonal violence, which can be described as “behavior by persons against persons that intentionally threatens, attempts, or actually inflicts physical harm” (Reiss and Roth, 1993). According to Mire and Robertson, psychological and emotional harm can be added to the definition of interpersonal violence as well.

The cost of violence can be devastating, primarily for the persons directly involved but also for families, communities and society as a whole. In addition to physical damage, a violent act can also result in psychological and emotional costs that tend to linger for much longer, such as PTSD (post-traumatic stress disorder). The cost of violence can be divided into direct and indirect costs, with direct costs referring to the actual payments connected to a violent act and indirect costs referring mainly to the loss of resources and productivity that follows from the opportunities lost after a violent act (Mire and Robertson, 2011).

Many theories exist that try to explain the motivation behind committing a crime. One that has been very influential within the sociological branch of the literature is the concept of relative deprivation (inequality). When there is a social and economic gap between the rich and the poor in a society, it causes a feeling of injustice, resentment and hostility for those who feel deprived of their opportunities. It is also believed that the discrepancy between one’s ambitions and the lack of means to achieve them creates economic frustration and is a breeding ground for crime (Messner and Rosenfeld, 1994).


based on a cost-benefit analysis. If a person expects the utility gained by committing a crime to be higher than the utility gained by spending that time on legal activities, then committing the crime will be a rational decision. If punishments such as fines or imprisonment are introduced, it is possible that the person values the benefit of the crime as lower and therefore decides not to proceed with it. Translating this theory into policy making, the key is to find a punishment that meets the point where the person values the risk of committing the crime higher than the benefit (Enamorado et al., 2014).

There are of course many other theories that try to pinpoint the causes of violent crime. Some of the most studied ones that are assumed to correlate closely with violence are poverty, alcohol abuse, childhood characterized by neglect and abuse, and low levels of education (Mire and Robertson, 2011).

2.3. Latin American context

Why does Latin America suffer from such high rates of both inequality and violent crime? According to Robinson (2000), one thing that distinguishes Latin America from, for example, Western Europe is that the latter experienced processes of democratization that were accompanied by mass education and tax systems that redistributed income. Income distribution is mainly determined by the distribution of assets (land and human capital) and in Latin America, the distribution of both of these assets has always been extremely


progressive reforms) that repressed the left-leaning parties and unions, which led the demand for redistributive policies to fade away (Robinson, 2000).

Regarding the question of how Latin America has come to be the most violent region in the world, Bergman (2018) presents a theory based on the idea that a rise in crime rates results from a breakdown of a social equilibrium. In order to pinpoint the conditions that cause this breakdown, one has to study the social, economic, political and cultural context of the crime. Bergman especially highlights the increase of illegal economies in Latin America, and with them an increase in black markets, as one of the contributing factors, since it causes the demand for crime to rise. Additionally, many Latin American countries have weak law enforcement, high levels of corruption and unstable social institutions, which together form a breeding ground for crime since the probability of going unpunished is larger. All social equilibria have a tipping point where the situation, either because of a rise in the demand for illegal goods or a decline in law enforcement, could spiral out of control and cause crime rates to increase dramatically (Bergman, 2018).

In a report for Igarapé Institute examining citizen security in Latin America, Muggah and Aguirre (2018) mention that the principal paradox of the region is that crime rates have increased during the last decades even though several important indicators of well-being (such as reduction in poverty, decline in income inequality and expansion of the middle-class) have improved. In line with the work of Bergman, the report also highlights weak law enforcement and involvement in illegal (drug) markets as contributing factors to violent crime in Latin America. Other factors that are mentioned include youth unemployment, low rates of education, the number of single-headed female households and alcohol abuse (Muggah and Aguirre, 2018).

2.4. Previous research

The relationship between economic inequality and violent crime is something that has been studied extensively. Because of the large collection of literature, a few articles have been selected that are similar to this thesis, i.e. they use the same variables (income inequality and homicide rates) and examine cross-national data that vary over time (panel data).


correlated with homicide rates. In line with many previous studies they find a positive correlation between income inequality (measured by the Gini coefficient) and homicide rates, and since their sample is significantly larger than in other studies, they argue that their results are more trustworthy. Additionally, they conclude that population growth (which is used as a proxy for a large share of young people in the population) is another factor that correlates closely with homicide rates. They also run a correlation analysis showing that countries characterized by poverty, cultural diversity, low defense expenditure, less democracy and low school enrollment suffer from higher rates of homicide. Krahn et al. base their study on the theory of relative deprivation, which predicts that the correlation between income inequality and homicide rates is stronger in more democratic countries due to the combination of an egalitarian value system and high material inequality. They find some support for this theory. Income inequality seems to have a stronger effect on homicide rates in more democratic countries as well as in wealthier countries, more advanced capitalist countries and countries with larger internal security forces. Lastly, the results show that there is a strong relationship between income inequality and homicide rates in more densely populated countries (Krahn et al., 1986).

In a study from 1980, Braithwaite and Braithwaite also examine how income inequality affects homicide rates, using data for 31 countries that cover a time period of 20 years. They first map the correlation between different measurements of inequality and find that income inequality and homicide rates show a strong and statistically significant correlation. They then use multiple regression analysis and include protein grams per capita, political freedom and ethnic fractionalization as control variables to control for omitted variable bias. The results show that there is a positive (though not statistically significant) relationship between income inequality and homicide rates, which is in line with previous research (Braithwaite & Braithwaite, 1980).


instrumental variable methods. However, the original findings are still robust. They also find that economic growth leads to a decrease in homicide rates, whereas factors such as school enrollment, urbanization and the average level of income do not affect crime in a significant, robust or consistent way. Two analytical shortcomings in the study are also mentioned. Firstly, they cannot distinguish between the two theoretical approaches, i.e. they cannot say for certain if it is the economic or sociological theory that explains the relationship. Secondly, they have not identified the mechanisms of inequality that contribute to the increase in homicide rates, which is necessary to be able to suggest policy recommendations (Fajnzylber et al., 2002).

2.5. Hypotheses

On the basis of the literature and previous research presented, two hypotheses for the empirical analysis are formulated:

H1: There is a causal effect of economic inequality on violent crime.

H2: The effect of economic inequality on violent crime is stronger in Latin America than on a global level.


3. Data

The main variables in this thesis are the explanatory variable (income inequality measured by the Gini coefficient) and the dependent variable (homicide rate per 100 000 people). In addition to this, it is necessary to add several control variables to avoid omitted variable bias (OVB). OVB indicates a situation where there are factors that correlate with both the explanatory and the dependent variable, but that are not included in the regression model. This can cause an overestimation or an underestimation of the actual effect that income inequality has on homicides, and it is therefore important to identify and include these omitted factors in order to make correct estimations (Stock & Watson, 2015).
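
The direction of the bias can be made concrete with the textbook large-sample expression for a regression with a single regressor. It is given here only as a sketch in the notation of this thesis: $\rho_{Xu}$ (the correlation between the regressor and the error term), $\sigma_u$ and $\sigma_X$ are symbols introduced for this illustration and do not appear elsewhere in the text.

```latex
\[
\hat{b}_1 \;\xrightarrow{\;p\;}\; b_1 + \rho_{Xu}\,\frac{\sigma_u}{\sigma_X}
\]
```

If an omitted factor makes $\rho_{Xu}$ positive, the effect of income inequality tends to be overestimated; if it makes $\rho_{Xu}$ negative, the effect tends to be underestimated.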

3.1. Income inequality

As mentioned in section two, the term economic inequality can refer to different types of inequality such as income inequality, wealth inequality, land inequality etc. Whilst the term denotes a broader understanding of societal differences, for reasons of accessibility and reliability of different possible measurements this thesis will operationalize it as income inequality. The most used measurement of income inequality is the Gini coefficient, a value between 0 and 1 that demonstrates if income is evenly distributed among the population or if it is highly concentrated to a small group of people. The value 0 signifies a hypothetical scenario where the total income in a country is equally distributed between everyone, i.e. all citizens have the exact same income. The value 1 signifies the opposite hypothetical scenario where the total income is concentrated to one single person and the rest have nothing. Depending on where the country ends up on the range between 0 and 1, it gives an idea of the level of inequality.
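
The two extreme cases described above can be illustrated with a small calculation. The following is a minimal sketch, assuming a list of individual incomes is available; the function name `gini` and the rank-weighted formula are choices made for this illustration and are not necessarily how the World Bank computes its published figures.

```python
def gini(incomes):
    """Gini coefficient of a list of non-negative incomes (illustrative sketch)."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    # Rank-weighted formula: G = 2 * sum(rank * x) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


equal = gini([5, 5, 5, 5])          # everyone has the same income
concentrated = gini([0, 0, 0, 10])  # one person holds all income
```

With only four people, the fully concentrated case gives (n - 1)/n = 0.75 rather than exactly 1; the value approaches 1 as the population grows.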


more available values, the collapse command in Stata was used to calculate the mean for every country, meaning that the dataset contains cross-sectional data (varies across entities) but not time-series data (varies over time).
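
The averaging step can be sketched outside Stata as well. The following Python sketch assumes that missing values are simply skipped when the mean is taken; the function name `collapse_mean` and the sample rows are hypothetical and are not taken from the thesis data.

```python
from collections import defaultdict


def collapse_mean(rows):
    """Reduce (country, year, value) rows to one mean value per country."""
    grouped = defaultdict(list)
    for country, _year, value in rows:
        if value is not None:  # assumption: missing years are ignored
            grouped[country].append(value)
    return {country: sum(vals) / len(vals) for country, vals in grouped.items()}


# Hypothetical Gini observations for two made-up countries.
rows = [
    ("A", 2000, 40.0),
    ("A", 2005, None),
    ("A", 2010, 44.0),
    ("B", 2000, 30.0),
]
means = collapse_mean(rows)
```

After this step each country contributes exactly one observation, which is why the global dataset no longer varies over time.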

The second step is to zoom in on Latin America and examine the relationship between income inequality and homicides on a regional level. Data were again downloaded from the World Bank, but this time only for the countries located in Latin America. Cuba and Puerto Rico did not have any data for the chosen time period and were therefore dropped from the dataset, leaving 18 countries in total. Since the data for Latin America were more complete than on a global level, and in order to increase the number of observations for the regression analysis, the decision was taken not to calculate an average value for each country. Instead, all observations for the years 2000, 2005, 2010 and 2015 were kept, since those are the four years with available data for homicide rates (the dependent variable). By keeping four observations from each country, the dataset increases from 18 to 72 observations, which improves the precision and credibility of the regression results. Finally, the Gini data and the homicide data need to be synchronized so that the two datasets can be merged in Stata.
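
The synchronization described above amounts to keeping the same four years in both datasets and joining them on country and year. A minimal Python sketch of that logic follows; the names `KEEP_YEARS` and `merge_on_country_year` and the sample rows are hypothetical, and the thesis itself performs this step in Stata.

```python
KEEP_YEARS = {2000, 2005, 2010, 2015}  # the four years with homicide data


def merge_on_country_year(gini_rows, homicide_rows):
    """Inner join of (country, year, gini) and (country, year, homicide) rows."""
    gini = {(c, y): g for c, y, g in gini_rows if y in KEEP_YEARS}
    merged = []
    for c, y, h in homicide_rows:
        if (c, y) in gini:  # keep only country-years present in both datasets
            merged.append((c, y, gini[(c, y)], h))
    return merged


# Hypothetical rows for one made-up country.
gini_rows = [("X", 2000, 55.0), ("X", 2003, 54.0), ("X", 2005, 53.0)]
homicide_rows = [("X", 2000, 20.0), ("X", 2005, 18.0), ("X", 2016, 17.0)]
panel = merge_on_country_year(gini_rows, homicide_rows)
```

With 18 countries and four retained years, a complete panel of this kind has 18 × 4 = 72 rows.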

After dropping all observations except for the years 2000, 2005, 2010 and 2015, it was


3.2. Homicide rate

This variable was chosen as the measurement of violent crime for several reasons. Firstly, it is considered one of the best documented forms of crime data, since the authorities are always supposed to follow a routine when someone has been killed. In other words, it is easier to register dead bodies than, for example, armed robberies or sexual violence. Secondly, there is probably no other type of crime that is as violent and grave as intentionally ending the life of another person.

The literature mainly refers to two different sources for homicide data: the World Health Organization (WHO) and Interpol. The majority of previous studies have chosen the WHO data with the argument that, since all countries have to report within the same framework, these data are more likely to be synchronized and free from bias due to different national methods of collecting the data. The definition of homicide might vary between countries; some include attempted homicides, i.e. cases where someone survived a murder attempt. Interpol uses the registers of each country’s national police when collecting its data, and since the definitions might differ from country to country, many studies argue that the WHO data are more consistent and therefore a better option.

WHO provides data for five different years: 2000, 2005, 2010, 2015 and 2016. Since there are five-year intervals between the first four, the decision was taken to drop the data for 2016 and thereby create a time period of 15 years with new data for every five years. Observations were dropped for all countries that were not included in the Gini dataset, since it does not make sense to use observations that lack data for the explanatory variable. Fortunately, the homicide data were complete and did not contain any missing values, meaning that there was no need to either drop more observations or to interpolate the data.

Control variables

In order to avoid OVB, several control variables have been included in the regression model. According to previous research, one factor that is often mentioned as one of the main predictors of homicide is population growth. Messner (1982) finds that population growth together with income inequality were the primary causes of homicide in his study, with the theory that a rapid increase in population destabilizes society. He also mentions the possibility that high rates of population growth might indicate that a large part of the population is


correlate with an increased number of homicides since the age group 15-29 years tends to be overrepresented in violent crime (Messner, 1982). For these reasons, annual population growth is included as a control variable. Data are downloaded from the World Bank with the same motivation as for the Gini coefficient, i.e. that it is a well-known and reliable source that offers extensive access to data. The observations that do not match the Gini data are dropped and the remaining ones are merged into the existing dataset with the other variables. There are no missing values.

Another factor that is believed to affect violent crime, and is therefore chosen as a control variable, is GDP per capita. This variable is used as a measurement of economic development and social change, and the prediction is that it is negatively correlated with homicides, i.e. if GDP per capita increases the number of homicides should decrease (Krahn et al., 1986). Data are downloaded from the World Bank with the same motivation as for the Gini coefficient, i.e. that it is a well-known and reliable source that offers extensive access to data. The observations that do not match the Gini data are dropped and the remaining ones are merged into the existing dataset with the other variables. The observations are also divided by 1000 to obtain values that are easier to work with in Stata. There are no missing values.


Regarding control variables, there is a trade-off between including too few and too many. If important control variables are left out of the model, it will probably bias the estimations and make the predicted effect of income inequality on homicide rates less accurate. On the other hand, there is no point in including every possible variable just to be on the safe side, since this will eventually “overfit” the model. This could lead to a “Type I error”, meaning that a statistically significant effect is found even though there is in fact no causal relationship (Stock & Watson, 2015). The control variables that are chosen for this regression analysis are well established in previous literature as factors that are likely to be correlated with violent crime.

3.3. Natural logarithms

Both the explanatory variable and the dependent variable are transformed into natural logarithms (ln) before running the regressions. The main reason they are logged is to be able to express the relationship between the variables of interest in percentage terms. This means that a 1% change in the Gini coefficient is associated with a $b_1$% change in the homicide rate, i.e. $b_1$ is the elasticity of homicides with respect to income inequality. Since the explanatory


3.4. Definitions of variables and descriptive statistics

Table I

Definitions of variables used in the regression analysis

Variable      Description                                        Source

Explanatory variable
gini          Gini coefficient in %                              World Bank
lngini        Natural logarithm of the Gini coefficient          World Bank

Dependent variable
homicide      Homicide rate per 100 000 people                   WHO
lnhomicide    Natural logarithm of the homicide rate             WHO

Control variables
popgrowth     Annual population growth in %                      World Bank
gdp           GDP per capita, PPP (current international $);     World Bank
              observations divided by 1000
enrollment    School enrollment, secondary (% gross)             World Bank

Table II

Descriptive statistics for data used in regression analysis on a global level

Variable Observations Mean Std. Dev. Min Max


Table III

Descriptive statistics for data used in regression analysis on Latin America

Variable Observations Mean Std. Dev. Min Max


4. Empirical Method

4.1. Baseline model

To test the stated hypothesis that economic inequality has a causal effect on violent crime, a multiple linear regression is used as the baseline model. The dependent variable ($\text{lnhomicide}_i$) is the logarithm of the homicide rate, and $b_1$ is the slope coefficient showing the effect that the main explanatory variable ($\text{lngini}_i$) has on the dependent variable. $\text{lngini}_i$ is the logarithm of income inequality measured by the Gini coefficient. $b_2$, $b_3$ and $b_4$ are the slope coefficients for the control variables for population growth ($\text{popgrowth}_i$), GDP per capita ($\text{gdp}_i$) and school enrollment ($\text{enrollment}_i$) respectively. Lastly, $u_i$ denotes the error term which gathers everything that has an effect on the dependent variable ($\text{lnhomicide}_i$) but is not correlated with any of the explanatory variables.

$$\text{lnhomicide}_i = b_1\,\text{lngini}_i + b_2\,\text{popgrowth}_i + b_3\,\text{gdp}_i + b_4\,\text{enrollment}_i + u_i \qquad (1)$$

$i = 1, \dots, 119$ (number of entities, in this case 119 countries)
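
Because both the homicide rate and the Gini coefficient enter equation (1) in logarithms, the coefficient on the Gini term can be read as an elasticity. The short derivation below is a sketch that holds the control variables and the error term fixed; the value 1.5 in the last line is purely hypothetical and is not a result of this thesis.

```latex
\begin{align*}
\ln(\text{homicide}_i) &= b_1 \ln(\text{gini}_i) + \dots \\
b_1 &= \frac{\partial \ln(\text{homicide}_i)}{\partial \ln(\text{gini}_i)}
     = \frac{\partial \text{homicide}_i / \text{homicide}_i}{\partial \text{gini}_i / \text{gini}_i}
     \approx \frac{\%\Delta\, \text{homicide}_i}{\%\Delta\, \text{gini}_i} \\
\text{hypothetical } b_1 = 1.5 &: \quad \text{a } 1\% \text{ rise in gini} \;\Rightarrow\; \text{roughly a } 1.5\% \text{ rise in homicide}
\end{align*}
```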

4.2. Additional model I

The additional hypothesis is that the effect of economic inequality on violent crime is stronger in Latin America than on a global level. To test this, two different regression models will be formulated. Firstly, a multiple linear regression like the one presented for the baseline model will be performed. The difference is that the dataset for Latin America contains panel data which means that it varies both across entities and over time. When performing a regression with several observations from the same entity, it is important to use clustered standard errors that allow for correlation within the same “cluster”, i.e. the same entity. The reason behind this is that observations within the same entity cannot be considered independent, which is one of the conditions for internal validity.

$$\mathit{lnhomicide}_{i,t} = b_1\,\mathit{lngini}_{i,t} + b_2\,\mathit{popgrowth}_{i,t} + b_3\,\mathit{gdp}_{i,t} + b_4\,\mathit{enrollment}_{i,t} + u_{i,t} \qquad (2)$$

$i = 1, \dots, 18$ (number of entities, in this case 18 Latin American countries)

Just like in the first model, the dependent variable ($\mathit{lnhomicide}_{i,t}$) is the logarithm of the homicide rate, and $b_1$ is the slope coefficient showing the effect that the main explanatory variable ($\mathit{lngini}_{i,t}$) has on the dependent variable. $\mathit{lngini}_{i,t}$ is the logarithm of income inequality measured by the Gini coefficient. $b_2$, $b_3$ and $b_4$ are the slope coefficients for the control variables for population growth ($\mathit{popgrowth}_{i,t}$), GDP per capita ($\mathit{gdp}_{i,t}$) and school enrollment ($\mathit{enrollment}_{i,t}$) respectively. Lastly, $u_{i,t}$ denotes the error term, which gathers everything that has an effect on the dependent variable ($\mathit{lnhomicide}_{i,t}$) but is not correlated with any of the explanatory variables. In this model, the subscript $t$ is added to the variables to denote the four different time periods, since the data for Latin America vary both across entities and over time.
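To illustrate why clustering matters when one entity contributes several observations, the sketch below computes a conventional and a cluster-robust standard error for a single-regressor model. It is a simplified sketch under stated assumptions: the toy data are hypothetical, the cluster-robust formula is the basic version without a finite-sample correction (statistical software usually applies one), and the function name `slope_with_ses` is introduced here only for illustration.

```python
import math


def slope_with_ses(x, y, clusters):
    """Return the OLS slope of y on x (intercept included) with two standard errors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    xd = [v - mean_x for v in x]  # demeaning absorbs the intercept
    yd = [v - mean_y for v in y]
    sxx = sum(v * v for v in xd)
    b = sum(a * c for a, c in zip(xd, yd)) / sxx
    resid = [c - b * a for a, c in zip(xd, yd)]

    # Conventional SE: treats all n observations as independent.
    se_conventional = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)

    # Cluster-robust SE (no finite-sample correction): add up x*e within each
    # cluster first, so that error correlation inside a country is accounted for.
    score = {}
    for a, e, g in zip(xd, resid, clusters):
        score[g] = score.get(g, 0.0) + a * e
    se_clustered = math.sqrt(sum(s * s for s in score.values())) / sxx
    return b, se_conventional, se_clustered


# Hypothetical toy panel: three countries, two periods each (NOT the thesis data).
lngini = [3.6, 3.7, 3.9, 4.0, 3.8, 3.85]
lnhom = [1.0, 1.4, 2.6, 2.5, 1.9, 2.3]
country = ["A", "A", "B", "B", "C", "C"]

once = slope_with_ses(lngini, lnhom, country)
# Copy every observation four times without adding any new information.
copied = slope_with_ses(lngini * 4, lnhom * 4, country * 4)
print("original:", once)
print("copied:  ", copied)
```

Copying each country's observations adds no information, and the cluster-robust standard error stays the same. The conventional standard error shrinks, because it counts every copy as an independent observation. This is the kind of overstated precision that clustering by country is meant to prevent.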

4.3. Additional model II

Since the dataset for Latin America contains panel data, it is possible to also use a fixed effects model for the regression analysis. The benefit of the fixed effects model is that it controls for entity-specific (in this case country-specific) effects that are constant over time and that might affect the result of the regression. By comparing the values for different years within the same country, the fixed effects model holds constant the country-specific effects and isolates the (possible) change in homicides that is caused by a change in income inequality. However, the fixed effects model is mainly beneficial when the aim of the study is to examine effects in the short term, e.g. consequences of an economic crisis or a natural disaster (Stock & Watson, 2015). Since the focus of this thesis is not to study the effects of specific violent events but rather the general violence that is present on an everyday basis, it is more relevant to look at effects in the long term. For that reason, more emphasis is put on the multiple linear regression model, but it is still valuable to use the fixed effects model as an additional model with the aim of comparing the results in the short and the long term.

$$\mathit{lnhomicide}_{i,t} = b_1\,\mathit{lngini}_{i,t} + b_2\,\mathit{popgrowth}_{i,t} + b_3\,\mathit{gdp}_{i,t} + b_4\,\mathit{enrollment}_{i,t} + a_i + u_{i,t} \qquad (3)$$

The equation for the fixed effects model is similar to the ones previously presented, with the difference that this model includes the term $a_i$, which represents the country-specific effect. Apart from that, the remaining variables are identical to the ones that make up the equation for the multiple linear regression model. The dependent variable ($\mathit{lnhomicide}_{i,t}$) is the logarithm of the homicide rate, and $b_1$ is the slope coefficient showing the effect that the main explanatory variable ($\mathit{lngini}_{i,t}$) has on the dependent variable. $\mathit{lngini}_{i,t}$ is the logarithm of income inequality measured by the Gini coefficient. $b_2$, $b_3$ and $b_4$ are the slope coefficients for the control variables for population growth ($\mathit{popgrowth}_{i,t}$), GDP per capita ($\mathit{gdp}_{i,t}$) and school enrollment ($\mathit{enrollment}_{i,t}$) respectively. Lastly, $u_{i,t}$ denotes the error term, which gathers everything that has an effect on the dependent variable ($\mathit{lnhomicide}_{i,t}$) but is not correlated with any of the explanatory variables.
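The idea of holding country-specific effects constant can be sketched with the so-called within transformation: each variable is expressed as a deviation from its country mean before the regression is run. The example below is a minimal sketch on hypothetical data in which the country effects are deliberately correlated with the regressor; the helper names `slope` and `demean_by_group` are introduced here for illustration only.

```python
def slope(x, y):
    """OLS slope of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Subtract each group's mean from its own observations."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


# Hypothetical toy panel (NOT the thesis data): three countries, two periods.
country = ["A", "A", "B", "B", "C", "C"]
x = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
a = {"A": 1.0, "B": 3.0, "C": 5.0}  # country effects, correlated with x
y = [a[c] + 0.5 * xi for c, xi in zip(country, x)]  # true slope is 0.5

pooled = slope(x, y)
within = slope(demean_by_group(x, country), demean_by_group(y, country))
print(f"pooled OLS slope:    {pooled:.3f}")
print(f"fixed effects slope: {within:.3f}")
```

In this constructed example the pooled regression attributes the differences between countries to the regressor and returns a slope far above 0.5, while the within transformation recovers the slope of 0.5 that was used to generate the data. The two estimators can therefore give quite different answers on the same panel.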

5. Results

5.1. Main results

Table IV

Regression results: Multiple linear regression Global level

Dependent variable: lnhomicide.

| Variable | (1) | (2) | (3) |
|---|---|---|---|
| lngini | 3.666*** (0.349) | 2.781*** (0.381) | |
| popgrowth | | 0.022 (0.091) | 0.176 (0.108) |
| gdp | | -0.031 (0.007) | -0.048 (0.008) |
| enrollment | | 0.001 (0.004) | 0.003 (0.005) |
| Constant | -11.694*** (1.267) | -8.104*** (1.414) | 1.855*** (0.457) |
| N | 119 | 114 | 114 |
| $R^2$ | 0.486 | 0.603 | 0.409 |

Parentheses show default standard errors. * p < 0.10, ** p < 0.05, *** p < 0.01

The table above presents the main results of the thesis that will be used to answer the main research question: does economic inequality have a causal effect on violent crime? The principal value for each variable is the slope coefficient $b$ that represents the effect that the variable in question has on the dependent variable. The value in parentheses shows the standard error, N represents the number of observations for each regression and $R^2$ stands for R-squared, a measure of how much of the variation in the dependent variable (homicide rate) is explained by the variables included in the regression model.
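The definition of R-squared can be made concrete with a short calculation: it is one minus the ratio of the residual sum of squares to the total sum of squares. The numbers below are hypothetical and serve only to show the arithmetic.

```python
def r_squared(y, fitted):
    """Share of the variation in y around its mean that the fitted values explain."""
    mean_y = sum(y) / len(y)
    ss_total = sum((v - mean_y) ** 2 for v in y)
    ss_resid = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return 1.0 - ss_resid / ss_total


# Hypothetical numbers for illustration only (NOT the thesis data).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]

# Simple OLS fit of y on x with an intercept.
mx, my = sum(x) / len(x), sum(y) / len(y)
b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sum((a - mx) ** 2 for a in x)
a0 = my - b * mx
fitted = [a0 + b * v for v in x]

print(round(r_squared(y, fitted), 3))
```

A value close to one means that the fitted line leaves little of the variation in the outcome unexplained; it does not by itself say anything about whether the estimated relationship is causal.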

The first column (1) shows the result of the regression including only the Gini coefficient as the explanatory variable. The slope coefficient ($b_1$) is 3.666, which indicates that there is a positive relationship between income inequality and homicides: a 1% increase in income inequality is associated with a 3.666% increase in homicides. The result is statistically significant at the 1% level, meaning that it is safe to say that there in fact is a causal effect of income inequality on homicides. $R^2$ is 0.486, which means that 48.6% of the variation in the homicide rate can be explained by income inequality. However, since both $b_1$ and $R^2$ have quite large values, there is a risk of omitted variables that could affect the dependent variable through the explanatory variable and thereby lead to an overestimation of the actual effect that income inequality has on homicides. Therefore, a second regression is performed that includes three control variables that are assumed to also be predictors of homicides.

The second column (2) represents the result of the regression including both the main explanatory variable (income inequality) and the control variables (population growth, GDP per capita and school enrollment). The slope coefficient ($b_1$) for income inequality is now 2.781, which indicates that a 1% increase in income inequality will lead to a 2.781% increase in homicides. This result is still statistically significant at the 1% level, but the slope coefficient is lower than in the first regression, signaling that there were probably omitted factors causing an overestimation of the effect of income inequality on homicides in the first model. R-squared for the second regression is 0.603, which means that 60.3% of the variation in homicides can be explained by the four explanatory variables included in the model, an increase of 11.7 percentage points compared to the first regression.
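Because both the homicide rate and the Gini coefficient enter the model in logarithms, the slope coefficient can be read as an elasticity. The sketch below compares the exact percentage change implied by a log-log coefficient with the common rule-of-thumb reading, and also shows the arithmetic behind the change in R-squared between columns (1) and (2). The function name is hypothetical; the coefficient and R-squared values are taken from Table IV.

```python
def pct_change_in_outcome(elasticity, pct_change_in_x):
    """Exact percentage change in the outcome implied by a log-log coefficient."""
    return ((1 + pct_change_in_x / 100) ** elasticity - 1) * 100


b1 = 2.781  # coefficient on lngini, Table IV column (2)
for pct in (1, 5, 10):
    exact = pct_change_in_outcome(b1, pct)
    approx = b1 * pct  # rule-of-thumb reading: b1 percent per 1 percent
    print(f"{pct:>2}% higher Gini: exact {exact:.2f}%, approximation {approx:.2f}%")

# R-squared rises from 0.486 to 0.603 between columns (1) and (2).
print(f"difference: {(0.603 - 0.486) * 100:.1f} percentage points")
print(f"relative increase: {(0.603 / 0.486 - 1) * 100:.1f}%")
```

For a 1% change the rule of thumb is very close to the exact figure, but the two drift apart for larger changes, so the elasticity reading is best reserved for small changes in inequality.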

It seems like there is enough evidence to confirm the main hypothesis, that economic inequality has a causal effect on violent crime, but what happens if we exclude the main explanatory variable (income inequality) from the model? Column three (3) shows the result of a third regression that includes the control variables but not the variable for income inequality. It is evident that the slope coefficients for the control variables are smaller than the one for income inequality and none of them are statistically significant, i.e. it is not certain that they have a causal effect on homicides. However, R-squared is still high (0.409) which indicates that 40.9% of the variation in homicides can be explained by the control variables.

5.2. Additional results

Apart from the main regression model, two additional models focusing on Latin America have been included in this thesis with the aim of comparing the results on a global level to the ones on a regional level. As previously mentioned, Latin America has been chosen since it is the region with both the highest levels of economic inequality as well as the highest levels of violent crime.

Table V

Regression results: Multiple linear regression Latin America

| Variable | (1) | (2) | (3) |
|---|---|---|---|
| lngini | 2.919 (1.734) | 2.091 (2.115) | |
| popgrowth | | -0.388 (0.393) | -0.214 (0.336) |
| gdp | | -0.000 (0.000) | -0.002 (0.033) |
| enrollment | | -0.020** (0.009) | -0.023** (0.010) |
| Constant | -8.577 (6.798) | -3.151*** (8.571) | 4.994*** (1.058) |
| N | 68 | 64 | 68 |
| $R^2$ | 0.112 | 0.243 | 0.2 |

Parentheses show robust standard errors clustered by country. * p < 0.10, ** p < 0.05, *** p < 0.01

Table V presents the results of the multiple linear regressions for Latin America. Since this dataset contains panel data, the regressions are performed with clustered standard errors that allow for correlation between the observations that belong to the same entity (country).

The first column (1) shows that the slope coefficient for income inequality (b1) is 2.919,

regression and it is not statistically significant, meaning that the null hypothesis stating that there is no relationship between the variables of interest cannot be rejected. Another thing worth mentioning is that the R-squared value in column one (0.112) is notably lower than in the first regression, meaning that income inequality can only explain 11.2% of the variation in homicides in Latin America. This result implies that there are probably many other factors that contribute to the high levels of crime in Latin America, additional to the effect caused by income inequality.
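The significance stars in the tables follow from the ratio of each coefficient to its standard error. The sketch below reproduces that logic for a few reported values. It rests on assumptions that the text does not state: that the global regression can be judged against large-sample normal critical values, and that the clustered regressions for Latin America are judged against a t distribution with 18 - 1 = 17 degrees of freedom (a common convention with 18 clusters). The function name is hypothetical.

```python
def significance_stars(coef, se, critical=(1.645, 1.960, 2.576)):
    """Return '', '*', '**' or '***' for a two-sided test of coef = 0.

    `critical` holds the 10%, 5% and 1% two-sided critical values.
    The default uses the large-sample normal distribution.
    """
    t_stat = abs(coef / se)
    return "*" * sum(t_stat > c for c in critical)


# Assumption: with 18 clusters, inference uses a t distribution with
# 18 - 1 = 17 degrees of freedom, whose critical values are listed here.
T17 = (1.740, 2.110, 2.898)

print(repr(significance_stars(3.666, 0.349)))        # lngini, Table IV column (1)
print(repr(significance_stars(2.919, 1.734, T17)))   # lngini, Table V column (1)
print(repr(significance_stars(-0.020, 0.009, T17)))  # enrollment, Table V column (2)
```

The coefficient for Latin America is of a similar size to the global one, but its standard error is roughly five times larger, so the ratio falls short of even the 10% critical value. A large coefficient is therefore not enough for statistical significance when it is estimated imprecisely.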

Just like in the regressions on a global level, another regression including both the main explanatory variable (income inequality) and the control variables is performed. The result is shown in column two (2), where the slope coefficient for income inequality has decreased from 2.919 (column 1) to 2.091 when the control variables are added. This is in line with the results on a global level and implies that there were probably omitted factors in the first regression causing an overestimation of the effect of income inequality. R-squared is 0.243, which makes sense since it is higher than when only income inequality was included in the model (column 1), but still notably lower than the results on a global level.

Lastly, a third regression is performed that excludes the main explanatory variable (income inequality) from the model and only includes the control variables. The results (column 3) are not statistically significant but imply, just like on a global level, that the effects of the control variables on homicides are not as strong as the effect of income inequality. R-squared is 0.2, which means that 20% of the variation in homicides can be explained by the control variables. R-squared for the regression including both income inequality and the control variables (column 2) was only slightly higher (0.243), something that indicates that in Latin America, the variance in homicides does not seem to be explained to a large extent by income inequality.

Table VI

Regression results: Fixed effects regression Latin America

| Variable | (1) | (2) | (3) |
|---|---|---|---|
| lngini | -0.114 (0.672) | 0.150 (0.736) | |
| popgrowth | | -0.261 (0.275) | -0.183 (0.143) |
| gdp | | 0.011 (0.022) | 0.008 (0.018) |
| enrollment | | -0.003 (0.007) | -0.002 (0.007) |
| Constant | 3.288 (2.629) | 2.702 (2.757) | 3.147*** (0.629) |
| N | 68 | 64 | 68 |
| $R^2$ (overall) | 0.112 | 0.027 | 0.032 |

Parentheses show robust standard errors clustered by country. * p < 0.10, ** p < 0.05, *** p < 0.01

Finally, Table VI presents the results from the fixed effects regression model, which holds constant all country-specific factors that affect the dependent variable. Just like in the multiple linear regression for Latin America, the regressions are performed with clustered standard errors that allow for correlation between observations within the same entity (country).

These results are somewhat surprising since they show that the slope coefficient ($b_1$) has a negative value (-0.114) in the regression performed with only the main explanatory variable (income inequality) and the dependent variable (homicide rate). This indicates that the relationship between income inequality and homicides is negative, i.e. if income inequality increases by 1% it would lead to a 0.114% decrease in homicides. These results are not in line with the multiple linear regressions that showed a positive relationship (statistically

In column two (2), the control variables are included in the regression model. The slope coefficient for income inequality ($b_1$) has now changed to a positive value (0.150), which is more in line with previous results but still notably lower than in the multiple linear regressions. R-squared is 0.027, which is surprising since it is lower than in the previous regression (column 1) even though the only difference is that control variables have been added, i.e. it would make sense if the value was higher and not lower. The most probable explanation is that the number of observations is lower in the second regression, which could provide less precise values.

The last step of the regression analysis is to exclude the main explanatory variable (income inequality) from the fixed effects regression model and only keep the control variables. The results show low values for the slope coefficients, indicating that the relationships between the control variables and the dependent variable (homicides) are weak. Furthermore, the results are not statistically significant. R-squared is 0.032 which is higher than in the second regression (column 2), probably because this model includes 68 observations and the previous only 64 observations, but is still a very low value.

5.3. Internal and external validity

Internal validity refers to how well suited the model is for the stated purpose of the study, e.g. whether there might be omitted variable bias, measurement errors etc. Several decisions taken in this thesis can be questioned as to whether they were the best alternatives. Firstly, the decision to collapse all observations for the same country and use an average value in the dataset on a global level was not obvious. This was done because of the amount of missing values in the downloaded dataset for the Gini coefficient (measurement of income inequality), but another option would have been to interpolate those values to reach a balanced dataset. This would have allowed a much larger dataset (since there would be four observations per country instead of one), which can increase the precision and credibility of the results. On the other hand, interpolating values could also reduce internal validity since new values are created that were not part of the original dataset. Since the amount of missing values was quite large for the Gini coefficient, the decision was taken to not create any new values but instead use all of the existing observations with available data and collapse them into one average value for each country.
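The two options discussed above, collapsing to a country average and interpolating missing values, can be sketched as follows. This is only an illustration on hypothetical numbers: the thesis does not describe its exact data-handling steps, and linear interpolation is just one of several possible interpolation methods.

```python
def collapse_to_mean(series):
    """Average the available observations, ignoring missing ones (None)."""
    observed = [v for v in series if v is not None]
    return sum(observed) / len(observed) if observed else None


def interpolate_linear(series):
    """Fill interior gaps by linear interpolation; leading/trailing gaps stay None."""
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


# Hypothetical Gini values for four periods (NOT the thesis data).
gini = {"Country A": [40.0, None, None, 46.0], "Country B": [None, 30.0, 32.0, None]}

for name, values in gini.items():
    print(name, collapse_to_mean(values), interpolate_linear(values))
```

Collapsing keeps only values that were actually observed but leaves one observation per country. Interpolation keeps the panel structure but fills it with constructed values, and it cannot fill gaps at the start or end of a series without further assumptions.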

risk of either including too few or too many. There might be omitted factors that affect homicides and are instead captured by the estimated effect of income inequality, which would overestimate that effect. On the other hand, if too many control variables that correlate with income inequality are included in the model, it will be “over-fitted” and the control variables together will absorb the effect of income inequality that actually exists. In this thesis, choosing three control variables that have often been mentioned as strong predictors of homicides was thought to be the best solution to this trade-off and will hopefully increase internal validity.

External validity refers to how well the results for the observed population match the real population, i.e. whether the estimated relationship between economic inequality and violent crime can be assumed to be the same in reality. In this thesis, one threat to external validity might be that many countries had to be dropped from the dataset due to the lack of available data on income inequality. While browsing the dataset for the Gini coefficient, 98 countries had an insufficient number of observations and were therefore excluded. It is possible that the countries lacking data share several characteristics, e.g. they might be less democratic (and hence less willing to collect data or statistics), poorer etc. If those characteristics correlate with economic inequality or violent crime (as mentioned in the article by Krahn et al., 1986), then these countries might have different (most likely higher) levels of inequality or crime than the countries included in the dataset. Not having access to data from these countries will in that case cause a gap between the results of the study and the real situation. It is evident that dropping many countries from the dataset reduces external validity, but there is unfortunately not much that can be done about it, except reflecting upon the consequences that it might cause.

6. Discussion

The main purpose of this thesis is to answer the research question: is there a causal effect of economic inequality on violent crime? The results of the regression analysis on a global level show a strong positive relationship between the two that remains statistically significant even after several control variables have been added to the regression model. This result is in line with many previous studies (e.g. Braithwaite & Braithwaite, 1980; Fajnzylber et al., 2002) that confirm a statistically significant causal effect of income inequality on homicides (the two variables used as proxies for economic inequality and violent crime in this thesis). Since the results are statistically significant, the null hypothesis stating that there is no effect of economic inequality on violent crime can be rejected. However, there are still several aspects of the results that merit discussion.

At the starting point of this thesis it was assumed that there would be a causal relationship between economic inequality and violent crime, but it is still surprising that the effect is so strong (2.781, significant at the 1% level, when control variables are included). Even though the main explanatory variable (income inequality) is of most interest, it is also relevant to observe the slope coefficients of the control variables. In several studies (such as Krahn et al., 1986), population growth is, together with income inequality, concluded to be the main predictor of homicides. It was therefore noteworthy that the slope coefficient for population growth was only 0.022 and not statistically significant. The other two control variables also had values close to zero and were not statistically significant, but according to R-squared, the control variables still explained 40.9% of the variation in homicides. How could that be? One explanation would be that omitted factors are affecting the high R-squared value, but in that case it would make sense if the slope coefficients were higher too, which they are not.

The additional research question for the thesis is whether the effect of economic inequality on violent crime is stronger in Latin America than on a global level. Results show that when multiple

highlights the rise of illegal economies and weak law enforcement as contributors to violent crime in the region.

The hypothesis connected to the additional research question was that the effect of economic inequality on violent crime would be stronger in Latin America, since the region has both the highest levels of economic inequality and the highest levels of violent crime. Since the results were not statistically significant, it is not possible to reject the null hypothesis that the effect of economic inequality on violent crime is not stronger in Latin America. However, the implication from the regressions is that the effect is actually smaller than on a global level. When reflecting upon it, there is nothing that implies that the effect ought to be stronger in Latin America just because the levels of both economic inequality and violent crime are higher. Even if the slope coefficient for inequality were the same for Latin America as for the rest of the world, the region has more inequality than the global average, so it is logical that it also has more violence than the global average. As mentioned in section two, economic inequality is determined by a complex combination of factors such as economic growth and poverty, but also history, politics and natural resources (Perkins et al., 2012). There are many possible explanations for the high levels of economic inequality (see Robinson, 2000) as well as the high levels of crime (see Bergman, 2018) in Latin America, without them necessarily correlating strongly with each other.

7. Concluding remarks

The aim of this thesis has been to investigate the relationship between economic inequality and violent crime, with the main research question being: is there a causal effect of economic inequality on violent crime? As an additional focus, special emphasis has been put on Latin America since it is the region in the world with both the highest levels of economic inequality and the highest levels of violent crime. Therefore, the additional research question is formulated as: is the effect of economic inequality on violent crime stronger in Latin America than on a global level?

To answer these questions, literature on the theories that predict economic inequality and violent crime is presented together with an introduction to the Latin American context and a review of previous studies on the topic. Data are downloaded from The World Bank and WHO and merged together into a dataset including income inequality (proxy for economic inequality) as the main explanatory variable, homicide rate (proxy for violent crime) as the dependent variable, as well as population growth, GDP per capita and school enrollment as control variables. The data include 119 countries on a global level, 18 countries for Latin America and cover a time period of 2000-2015.

Three different regression models are used: multiple linear regression with cross-national data on a global level, and multiple linear regression with panel data as well as fixed effects regression with panel data for Latin America. The results for the multiple linear regressions show a strong effect of economic inequality on violent crime, which is statistically significant on a global level but not on a regional level. In the regression on a global level, the R-squared value is also notably high, indicating that a lot of the variation in violent crime can be explained by economic inequality, whereas it is considerably lower for Latin America. The results for the fixed effects model are more difficult to interpret and show a negative effect that is not statistically significant, something that is not in line with most of the previous research on the topic.

References

Anand, S. & Kanbur, S.M.R. (1993), The Kuznets process and the inequality – development relationship, Journal of Development Economics 40, 25-52.

Bergman, M. (2018), More Money, More Crime: Prosperity and Rising Crime in Latin America, Oxford University Press, New York.

Braithwaite, J. & Braithwaite, V. (1980), “The Effect of Income Inequality and Social Democracy on Homicide – A Cross-National Comparison”, British Journal of Criminology 20, 45-53.

CEPAL, “La matriz de la desigualdad social en América Latina” (2016),

https://www.cepal.org/sites/default/files/events/files/matriz_de_la_desigualdad.pdf (2019-10-09)

CEPAL, “Panorama Social de América Latina, 2019” (2019),

https://www.cepal.org/es/publicaciones/44969-panorama-social-america-latina-2019 (2019-12-09)

ECLAC, “The Inefficiency of Inequality” (2018),

https://www.cepal.org/en/publications/43443-inefficiency-inequality (2019-12-09).

El País, “Un himno feminista para toda América Latina” (2019-12-08),

https://elpais.com/sociedad/2019/12/07/actualidad/1575759576_174063.html (2020-01-14).

Enamorado, T., López-Calva, L., Rodríguez Castelán, C. & Winkler, H. (2014), “Income Inequality and Violent Crime: Evidence from Mexico's Drug War”, Policy Research Working Paper 6935, World Bank, Washington DC.

Fajnzylber, P., Lederman, D. & Loayza, N. (2002), “Inequality and Violent Crime”, The Journal of Law and Economics 45, 1-39.

Messner, S. (1982), “Societal Development, Social Equality, and Homicide: A Cross-National Test of a Durkheimian Model”, Social Forces 61, 225-240.

Messner, S. F. & Rosenfeld, R. (1994), Crime and the American dream, Wadsworth, Belmont, CA.

Mire, S., & Roberson, C. (2010), The study of violent crime: Its correlates and concerns, Routledge.

Muggah, R. & Aguirre, K. (2018), “Citizen security in Latin America: Facts and Figures”, https://igarape.org.br/wp-content/uploads/2018/04/Citizen-Security-in-Latin-America-Facts-and-Figures.pdf (2019-12-09).

Nettler, G. (1978), Explaining Crime, McGraw-Hill, New York.

Oxfam, “Privilegios que niegan derechos” (2015), https://oi-files-d8-prod.s3.eu-west-2.amazonaws.com/s3fs-public/file_attachments/reporte_iguales-oxfambr.pdf (2019-12-09).

Perkins, D. H., Radelet, S. C., Lindauer, D. L. & Block, S. A. (2012), Economics of Development, W. W. Norton & Company, New York & London.

Piketty, T. (2018), “Tony Atkinson: The birth and development of modern inequality studies”, The Economic and Labour Relations Review 29, 41-43.

Reiss, A. J., Jr. & Roth, J.A. (1993), Understanding and preventing violence, National Academy Press, Washington DC.

Stock, J.H. & Watson, M.W. (2015), Introduction to econometrics, Pearson Education.

The Guardian, “Colombia: thousands take to the streets in third national strike in two weeks” (2019-12-04), https://www.theguardian.com/world/2019/dec/04/colombia-protest-duque-bogota (2019-12-11).

The Washington Post, “The real story behind the Bolivia protests isn’t the one you’re

Appendix 1

Table AI

Countries included in the dataset on a global level

Albania Ghana Panama

Argentina Greece Paraguay

Armenia Guatemala Peru

Australia Guinea Philippines

Austria Honduras Poland

Azerbaijan Hungary Portugal

Bangladesh Iceland Romania

Belarus India Russia

Belgium Indonesia Rwanda

Benin Iran Samoa

Bhutan Ireland Senegal

Bolivia Israel Serbia

Bosnia and Herzegovina Italy Slovakia

Botswana Jamaica Slovenia

Brazil Jordan South Africa

Bulgaria Kazakhstan South Korea

Burkina Faso Kyrgyz Republic Spain

Cameroon Laos Sri Lanka

Canada Latvia Sweden

Chile Liberia Switzerland

China Lithuania Tajikistan

Colombia Luxembourg Tanzania

Costa Rica Madagascar Thailand

Croatia Malawi Timor-Leste

Cyprus Malaysia Togo

Czech Republic Mali Tonga

Côte d'Ivoire Malta Tunisia

Denmark Mauritania Turkey

Djibouti Mexico Uganda

Dominican Republic Moldova Ukraine

Ecuador Mongolia United Kingdom

Egypt Montenegro United States

El Salvador Morocco Uruguay

Estonia Mozambique Uzbekistan

Ethiopia Namibia Venezuela

Fiji Netherlands Vietnam

Finland Nicaragua Zambia

France Niger

Gambia North Macedonia

Georgia Norway

Table AII

Countries included in the dataset for Latin America

Argentina

Bolivia

Brazil

Chile

Colombia

Costa Rica

Dominican Republic

Ecuador

El Salvador

Guatemala

Honduras

Mexico

Nicaragua

Panama

Paraguay

Peru

Uruguay

Venezuela
