The Global Conflict Risk Index: A quantitative tool for policy support on conflict prevention

Matina Halkia a,⁎, Stefano Ferri a, Marie K. Schellens a,b,c, Michail Papazoglou d, Dimitrios Thomakos e

a European Commission, Joint Research Centre, Via Enrico Fermi 2749, 21027 Ispra, VA, Italy
b Stockholm University, Department of Physical Geography, Svante Arrhenius väg 8, 106 91 Stockholm, Sweden
c University of Iceland, Environment and Natural Resource Programme, Sæmundargata 2, 102 Reykjavik, Iceland
d Unisystems S.A., Via Michelangelo Buonarroti 39, 20145 Milano, Italy
e Unisystems S.A., Rue Edward Steichen 26, L-2540 Luxembourg

ARTICLE INFO

Article history:
Received 25 June 2019
Received in revised form 5 February 2020
Accepted 16 February 2020
Available online 19 March 2020

Keywords:
Conflict risk; Conflict prevention; Early warning system; GCRI; Regression; Validation

ABSTRACT

In an effort to bridge the gap between academic and governmental initiatives on quantitative conflict modelling, this article presents, validates and discusses the Global Conflict Risk Index (GCRI), the quantitative starting point of the European Union Conflict Early Warning System. Based on open-source data of five risk areas representing the structural conditions characterising a given country (political, economic, social, environmental and security areas), it evaluates the risk of violent conflict in the next one to four years. Using logistic regression, the GCRI calculates the probability of national and subnational conflict risk. Several model design decisions, including definition of the dependent variable, predictor variable selection, data imputation, and probability threshold definition, are tested and discussed in light of the model's direct application in EU policy support on conflict prevention. While the GCRI remains firmly rooted by its conception and development in the European conflict prevention policy agenda, it is validated as a scientifically robust and rigorous method for a baseline quantitative evaluation of armed conflict risk. Despite its standard, simple methodology, the model predicts better than six other published quantitative conflict early warning systems for ten out of twelve reported performance metrics. Thereby, this article aims to contribute to a cross-fertilisation of academic and governmental efforts in quantitative conflict risk modelling.

1. Introduction

Quantitative modelling of conflict risk has become standard practice during the last two decades in conflict and peace research. There are two general purposes in developing these models. On the one hand, the models are employed for explanatory analysis of conflict drivers [1,2]. On the other hand, modelling and simulation techniques are used to forecast the risk of future conflicts [3–6]. Focusing exclusively on intrastate violent conflict, a country's structural conditions, including social, economic, security, political and geographical/environmental factors, have been associated with the risk of conflict [7–9].

Concerning the models themselves, the dominant approach to forecast the risk of a conflict is based on logistic regression models, which are used to estimate the probability of a violent conflict event [7,10,11].

Some researchers have extended their methods beyond logit models' basic applicability. Hegre et al. [5], for example, created a dynamic multinomial logit model to forecast conflicts up to 2050. Goldstone et al. [2] used a conditional logit model and Ward and Beger [12] ensembles of logit models. Furthermore, neural network approaches [13], fixed-effects regression models [14], naïve Bayes classifiers [15] and random forest models [15,16] have been implemented. The transparency of simple models fosters dialogue about the development and use of a model, as well as trust with its end-users [2,17]. More complex models are better able to capture complex patterns in the input data to provide rigorous causal insights [18].

Besides their academic context, conflict risk models increasingly inform the work of governments, aid organizations, think tanks, the mass media and other actors. Many of them are embedded in early warning systems that facilitate conflict prevention policies. The Global Conflict Risk Index (GCRI) is a conflict risk model initiated in 2014 to support the design of the European Union's (EU) conflict prevention strategies [19]. It was and is continuously developed by the Joint Research Centre (JRC) of the European Commission (EC) in collaboration with an expert panel of researchers and policy-makers. The model was designed to be a robust, evidence-based risk management tool based on measurable structural factors. The GCRI is the main input of the EU Early Warning System. The annual process comprises an initial quantitative analysis phase, the GCRI, followed by iterations of qualitative analysis by country desks and geographical experts under the coordination of the European External Action Service.

Corresponding author at: European Commission - Joint Research Centre, Via Enrico Fermi, 2749, I - 21027 Ispra, VA, Italy.

E-mail addresses: Matina.HALKIA@ec.europa.eu (M. Halkia), Stefano.FERRI@ec.europa.eu (S. Ferri), marie.schellens@natgeo.su.se (M.K. Schellens), Michail.PAPAZOGLOU@ext.ec.europa.eu (M. Papazoglou), Dimitrios.THOMAKOS@ext.ec.europa.eu (D. Thomakos).

http://dx.doi.org/10.1016/j.pdisas.2020.100069

2590-0617/© 2020 Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).




Likewise, other global actors are interested in being risk informed and proactive. In 2002, O'Brien from the Center for Army Analysis of the U.S. Army developed a forecasting model as an early warning approach to instability and conflict based on country macrostructural factors [20]. As a follow-up to this model, O'Brien describes the efforts by the US military to develop an Integrated Crisis Early Warning System with a multi-model approach that integrates six different models within a Bayesian network to provide quarterly forecasts of six different types of conflicts, instabilities and political crises [21]. At the same time, the U.S. Central Intelligence Agency funded the Political Instability Task Force to provide a forecasting model for the onset of both violent civil wars and non-violent democratic reversals [2]. In 2012, this conflict early warning effort was moved to the Center for Systemic Peace and has since then been more difficult to follow [22].

Many EU member states have developed their own early warning systems based on qualitative and/or quantitative methods (personal communication). For example, The Hague Centre for Strategic Studies provides quantitative assessments of large-scale political violence worldwide to Dutch national security [18]. At the World Bank, four predictive modelling techniques were tested on three different datasets [17]. They also provided an example of how their model could inform policy frameworks for anticipating the outbreak of conflict based on ranking countries. Further, the entirely academic ViEWS project provides publicly available monthly predictions for the African continent of three conflict outcomes at a high geographical resolution and at country level, based on open-access data and the open-source code of their ensemble model [6]. Nevertheless, the existing conflict prediction models developed directly for policy-making [2,17,18,20,21] are not publicly available and it is not clearly described whether and how the above models are used as a conflict early warning system for policy support.

During the last decade, the bulk of scholarly debate has moved increasingly from the view that existing modelling techniques are not adequate to forecast conflicts [23,24], to the view that prediction is feasible and policy-relevant within a limited spatial and temporal scope [24,25].

However, the existing literature suggests that cross-fertilisation between the violent conflict models developed within academic contexts and the ones used for policy support is relatively scarce.

Within this article, we present, validate, and discuss the GCRI, developed at the crossroads of science and EU policy-making for conflict prevention. Thereby, the goal is to enrich the connection between research and policy-making by providing a transparent validation of the model. The following section describes the data and methods used, including variable selection, data management, model specification, and validation procedures of the GCRI. Subsequently, the 'Results and discussion' section consists of four parts: (1) a presentation of the results of various model design tests and decisions, (2) a discussion of these outcomes in light of their relevance for policy support in conflict prevention, (3) a presentation of the resulting model with a comparison to other conflict early warning systems, and (4) a discussion of the use and future developments of the resulting model in light of its policy support function.

2. Data and methods

The development of the GCRI is a continuous process, rooted within scientific literature, and agreed upon within a working group of academic experts and policy-making end-users. The methodological steps described here are the variable selection, data management, model specification, and validation.

2.1. Variable selection

The GCRI's response variable represents the incidence of conflict in the next one to four years based on a set of structural conditions prevailing in a given country. The incidence of conflict is defined and based on three of the Uppsala Conflict Data Program's (UCDP) datasets: Battle-Related Deaths (BRD), One-sided Violence (OSV) and Non-State Conflict (NSV) [26]. A conflict is defined in these datasets as 25 or more battle-related deaths in a year. The GCRI considers two dimensions of conflict: one indicating the risk for subnational conflicts (SN) and the other for conflicts over national power (NP). The BRD data are reclassified into an NP or SN conflict, while OSV and NSV conflicts are classified as subnational conflicts. Of the BRD data, the GCRI only considers internal and internationalized internal conflicts, but it does not include systemic and interstate conflicts. The GCRI uses historical data from 1989 to 2017 for 191 countries worldwide as country-year observations.
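For illustration only, the sketch below shows how a dependent variable of this kind could be derived from hypothetical UCDP-style country-year fatality counts in Python; the column names, the direct application of the 25-deaths rule to a single fatality column, and the look-ahead logic are simplifying assumptions rather than the exact GCRI reclassification rules.

```python
import pandas as pd

def label_conflict_incidence(df: pd.DataFrame, horizon: int = 4) -> pd.DataFrame:
    """Flag each country-year if >= 25 battle-related deaths occur in any of the
    following `horizon` years (illustrative only, not the exact GCRI rules)."""
    out = df.sort_values(["country", "year"]).copy()
    out["conflict_year"] = (out["battle_related_deaths"] >= 25).astype(int)
    # Look ahead 1..horizon years within each country and take the maximum flag.
    future_flags = [
        out.groupby("country")["conflict_year"].shift(-k) for k in range(1, horizon + 1)
    ]
    out["conflict_next_1_to_4"] = (
        pd.concat(future_flags, axis=1).max(axis=1).fillna(0).astype(int)
    )
    return out
```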

The GCRI includes 24 predictor variables which represent the structural conditions of a country in five risk areas, i.e. political, security, social, economic, and geographical/environmental [27]. Table 1 gives an overview of those predictor variables per theme, including information on data sources and basic descriptive statistics. All the variables in this analysis have been extensively used as explanatory or control variables in the conflict research literature, and have been agreed upon within the GCRI panel of advisors from academia and policy-making. The datasets used are all freely accessible on the internet and have been compiled by diverse international organizations such as the World Bank, the United Nations and academic institutions [27].

Many studies acknowledge that political factors have a major impact on the risk of a conflict [28]. 'Regime type' is included as a predictor in the GCRI because evidence suggests that it is a main factor explaining outbreaks of political violence [2,28,29]. For example, anocracies have been shown to be more susceptible to civil war than both pure democracies and pure dictatorships [9,29]. Inconsistent democratic institutions have been correlated with civil war onset [8], and therefore 'Lack of democracy' is included in the GCRI as an independent variable. In addition, 'Government effectiveness' is a measure of the quality of public and civil services, as well as the degree of their independence from political pressures, which has been shown to be crucial to a lowered probability of conflict onset [30,31]. 'Repression' has been found to be positively correlated with the onset of civil war [28,32]. Lastly, lack of 'Empowerment rights' has been shown to be a crucial underlying factor of violent conflict [33], but also to lower the likelihood of civil wars in highly fractionalised societies [34].

Concerning the security area, there is a scientific consensus on the existence of a 'conflict trap'. This entails that there is a high risk of recurrence when a country has already experienced a conflict earlier [9,35]. Moreover, a strong correlation exists between violent conflict and the conflict situation in neighbouring countries [8,32,36,37]. Sambanis found already in 2001 that 'living in a bad neighbourhood, with undemocratic neighbours or neighbours at war, significantly increases a country's risk of experiencing ethnic civil war' [36]. Accordingly, we have included the variables 'Recent internal conflict', 'Years since highly violent conflict', and 'Neighbours with highly violent conflict' in the GCRI.

The following social factors have been collected from the existing literature for having an impact on the risk of conflict. 'Ethnic power change' is included in the analysis since ethnic marginalisation and ethnic nationalism have been found to be a main factor in violent conflicts [33,38]. 'Ethnic compilation' is included as a predictor in the GCRI because ethnic fractionalisation has been shown to increase conflict risk [34], especially for lower-level armed conflicts [8]. 'Transnational ethnic bonds' have been shown to 'constitute a central mechanism of conflict contagion' [37]. 'Corruption' has been shown to lead to armed conflict when increasingly competitive forms of corruption turn violent [33,39]. 'Homicide rate' is understood as a proxy for a violent culture and a means for control over territories between organized criminal groups [40,41]. Lastly, 'Infant mortality' has also been linked to increased risk for armed conflicts [42].

Regarding the economic variables, 'GDP per capita' is consistently linked with conflict in the literature through low income levels and low rates of economic growth [8,9,28,33,43]. Moreover, 'Income inequality', more specifically between societal classes, has been shown to breed political violence [44,45]. Further, 'Economic openness' [43,46], 'Food security' [47] and 'Unemployment' [41,48] have also been found to have an effect on the risk of a conflict.


The GCRI considers geographical/environmental factors as well. 'Water stress' has been shown to increase the risk of armed conflict, but only in the absence of institutionalized agreements or good water governance [49,50]. 'Oil producer' indicates a country's fuel exports as a percentage of total exports, for which there is scientific consensus on a heightened risk for armed conflict [7,9,51]. 'Structural constraints', such as rough, mountainous terrain that limits access to regions, consistently increase the likelihood of violent conflict [1,8]. Similarly, there is broad consensus on 'Population size' contributing to a higher risk for conflict [1,7–9]. Lastly, 'Youth bulge', or the proportional size of young people within the entire population of a country, has been linked to violent uprisings [52].

In addition to the abovementioned variables, we use interaction terms among 'Regime type', 'Income inequality' and 'GDP per capita', as they have been shown to be strongly correlated. There is a large body of literature on the impacts of economic inequality on regime change and democracy (e.g. [52–54]). Further, it has been shown that good political institutions support consistent economic growth [56], but also that authoritarian regimes show faster economic growth rates [57]. Lastly, there is a positive association between equality and economic growth [58,59], as well as a strong correlation between income inequality and GDP [60].

We are aware that some of these predictor variables are also known to be an outcome of violent conflict and violence. For example, economic growth, poverty levels, child mortality and access to potable water have been significantly damaged by armed conflicts [14]. Partly, the effect of violent conflict on various variables is captured by the conflict history variables 'Recent internal conflict' and 'Years since highly violent conflict'. Modelling the full endogeneity of armed conflicts' effects on development is, however, beyond the scope of this paper.

2.2. Data management

Because most of the variables used in the models are not normally distributed, we conducted a distribution analysis of them. In that way, non-normally distributed variables and outliers were detected. Based on this analysis, we applied various transformations and winsorization to manage non-normal distributions and outliers [61,62]. We used a logarithmic transformation for the following variables: 'Infant mortality', 'Openness', 'Homicide rate', 'GDP per capita', and 'Unemployment rate'. For the variable 'Oil producer' we implemented a fifth-root (⁵√x) transformation, while we used the winsorization technique for the variable 'Corruption'. Please see Appendix A for more information about the outlier detection and the distributions of the variables that are not normally distributed.

Further, the interpretation and comparability of regression coefficients are sensitive to the scale of the input data [63]. Therefore, we rescaled all the variables of this analysis to a zero-to-ten scale. For a detailed description of the rescaling, we refer to the technical report of the GCRI [27]. This made the modelling results more comprehensible.
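As a minimal sketch of the transformation and rescaling steps just described (in Python, with hypothetical column names; the winsorization cut-offs are an assumption, as the exact limits are documented in the GCRI technical report [27]):

```python
import numpy as np
import pandas as pd
from scipy.stats.mstats import winsorize

def transform_and_rescale(df: pd.DataFrame) -> pd.DataFrame:
    """Transform skewed variables, winsorize outliers, then rescale everything to 0-10."""
    out = df.copy()
    # Logarithmic transformation for strongly skewed variables (log1p keeps zeros finite).
    for col in ["infant_mortality", "openness", "homicide_rate", "gdp_per_capita", "unemployment"]:
        out[col] = np.log1p(out[col])
    # Fifth-root transformation for oil production.
    out["oil_producer"] = np.power(out["oil_producer"], 1.0 / 5.0)
    # Winsorize corruption; the 5th/95th percentile limits are illustrative only.
    out["corruption"] = winsorize(out["corruption"].to_numpy(), limits=[0.05, 0.05])
    # Min-max rescaling of every variable to a zero-to-ten scale.
    for col in out.columns:
        cmin, cmax = out[col].min(), out[col].max()
        out[col] = 10.0 * (out[col] - cmin) / (cmax - cmin)
    return out
```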

Table 1. Overview of the GCRI's predictor variables per theme, with an explanation of the variables' data sources and descriptive statistics. The minimum and maximum values of each variable's distribution are not included as they are all 0 and 10 respectively because of rescaling. For an explanation of the abbreviations, see below the table.

| Theme | Variable name | Data source | 1st Qu. | Median | Mean | 3rd Qu. |
|---|---|---|---|---|---|---|
| Political | Regime Type | Openness of executive recruitment (EXREC) and competitiveness of political participation (PARCOMP) variables of the Polity IV Annual Time-Series, 1800–2015 dataset [68] | 1.00 | 3.91 | 5.51 | 3.98 |
| Political | Lack of Democracy | POLITY2 variable of the Polity IV Annual Time-Series, 1800–2015 dataset [68] | 0.00 | 1.50 | 3.12 | 6.00 |
| Political | Government Effectiveness | Government Effectiveness Estimate of the World Bank's Worldwide Governance Indicators [69] | 3.83 | 5.53 | 5.13 | 6.49 |
| Political | Level of Repression | Maximum value of the variables PTS_A (Amnesty International), PTS_H (Human Rights Watch) and PTS_S (US State Department) of the Political Terror Scale (PTS) [70] | 2.00 | 4.00 | 4.91 | 6.00 |
| Political | Empowerment Rights | Empowerment Rights index of the Cingranelli and Richards (CIRI) Human Rights Data Project [71] | 1.43 | 2.86 | 3.87 | 6.43 |
| Security | Recent internal conflict | Battle-related deaths, One-sided violence and Non-state conflict datasets provided by the Uppsala Conflict Data Programme [26] | 0.00 | 0.00 | 1.36 | 0.00 |
| Security | Years since highly violent conflict | Battle-related deaths, One-sided violence and Non-state conflict datasets provided by the Uppsala Conflict Data Programme [26] | 0.00 | 0.00 | 0.88 | 0.00 |
| Security | Neighbours with highly violent conflict | Battle-related deaths, One-sided violence and Non-state conflict datasets provided by the Uppsala Conflict Data Programme [26] | 0.00 | 0.00 | 3.75 | 8.00 |
| Social | Ethnic Power Change | Based on changes in the Status and Regional Autonomy (Reg_aut) variables of the Ethnic Power Relations (EPR) Core Dataset [72] | 0.00 | 0.00 | 0.124 | 0.00 |
| Social | Ethnic Compilation | Maximum value of the variable Status over all present ethnic groups from the Ethnic Power Relations (EPR) Core Dataset [72] | 1.00 | 1.00 | 3.346 | 5.00 |
| Social | Transnational Ethnic Bonds | Variable transnational dispersion (GC10) of the Minorities at Risk Dataset [73] | 0.00 | 3.33 | 4.03 | 6.67 |
| Social | Corruption | Control of Corruption series of the World Bank's Worldwide Governance Indicators [69] | 3.92 | 5.94 | 5.34 | 7.26 |
| Social | Homicide Rate | Intentional homicides variable of the World Bank's Worldwide Development Indicators [74] | 1.30 | 3.09 | 3.06 | 4.62 |
| Social | Infant Mortality | Under-five mortality rate (SH.DYN.MORT) variable of the World Bank's Worldwide Development Indicators [74] | 3.55 | 5.25 | 5.27 | 7.25 |
| Economic | GDP per capita | GDP per capita, PPP (constant 2011 international $) of the World Bank's Worldwide Development Indicators [74] | 3.08 | 4.52 | 4.60 | 6.19 |
| Economic | Income Inequality | Gini index of net income variable from the Standardized World Income Inequality Database (SWIID) [75] | 3.28 | 4.79 | 4.63 | 6.23 |
| Economic | Economic openness | A weighted mean of the following three World Bank Worldwide Development Indicators (after rescaling): Foreign direct investment, net inflows (BoP, current US$); Foreign direct investment, net inflows (% of GDP); and Exports of goods and services (% of GDP) [74] | 4.08 | 4.70 | 4.85 | 5.46 |
| Economic | Food Security | A weighted mean of the following four FAO Food Security Indicators (after rescaling): Dietary Energy Supply, domestic food price level index, Nourishment, and domestic food price volatility index [76] | 2.56 | 4.13 | 4.43 | 6.18 |
| Economic | Unemployment | Unemployment, total (% of total labour force), of the World Bank's Worldwide Development Indicators [74] | 3.42 | 4.93 | 4.68 | 6.31 |
| Geographical | Water Stress | Aqueduct Country and River Basin Rankings (raw country scores for 'tdefm') [77] | 2.96 | 4.07 | 4.33 | 6.34 |
| Geographical | Oil Production | Fuel exports (% of merchandise exports) of the World Bank's Worldwide Development Indicators [74] | 2.46 | 4.85 | 4.80 | 6.73 |
| Geographical | Structural Constraints | Variable Structural constraints of the Bertelsmann Stiftung's Transformation Index [78] | 1.00 | 5.00 | 4.44 | 7.00 |
| Geographical | Population Size | Total sum per country of Annual Population by Age-both Sexes data by UN DESA's World Population Prospects [79] | 5.89 | 6.94 | 6.62 | 7.84 |
| Geographical | Youth Bulge | Number of inhabitants between age 15 and 24 divided by the number of inhabitants older than 25, based on Annual Population by Age-both Sexes data by UN DESA's World Population Prospects [79] | 2.77 | 6.29 | 5.56 | 8.17 |

Abbreviations: 1st Qu. = first quartile; 3rd Qu. = third quartile.


Missing data constitutes a serious problem in statistical modelling, especially when the proportion of missing data for one variable is higher than 25% [64]. Yet, in the literature there is no established appropriate cut-off percentage of missing data [64]. The missing data per variable of the GCRI is reported in [27]. We implemented various imputation techniques and tested the predictions' sensitivity to them: Last Observation Carried Forward (LOCF), listwise deletion (also called complete cases), Multiple Imputation by Chained Equations (MICE) using predictive mean matching, and MICE using random forests [65–67].
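For illustration, a minimal sketch of the LOCF imputation on country-year panel data (column names are hypothetical); the MICE variants would typically be delegated to a dedicated package rather than hand-coded:

```python
import pandas as pd

def impute_locf(df: pd.DataFrame, value_cols: list[str]) -> pd.DataFrame:
    """Last Observation Carried Forward, applied separately within each country's series."""
    out = df.sort_values(["country", "year"]).copy()
    # Carry the last observed value forward per country so that values
    # never leak across country borders.
    out[value_cols] = out.groupby("country")[value_cols].ffill()
    return out
```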

2.3. Model definition and validation

Inspired by the existing quantitative conflict risk models, the GCRI uses a standard logistic regression to estimate the probability of conflict incidence [80]. Hence, the probability model is mathematically defined as:

P = e^γ / (1 + e^γ),  γ = β₀ + β₁x₁ + β₂x₂ + … + βₙxₙ

where P stands for the probability of conflict incidence, β₀, β₁, β₂, …, βₙ are the coefficients and x₁, x₂, …, xₙ are the variables we consider in the model. The GCRI encompasses two distinct equations in this analysis: (a) the probability of violent national power conflicts (NP); and (b) the probability of subnational violent conflicts (SN). Both equations are composed of the same variables, except for 'Ethnic Power Change' and 'Ethnic Compilation', which are included only in the national and subnational dimension respectively.
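As a sketch of how such a logistic regression could be fitted in Python (here with statsmodels and randomly generated placeholder data; the real GCRI inputs are the rescaled structural variables described in Section 2.1):

```python
import numpy as np
import statsmodels.api as sm

# Placeholder data: country-year rows of 24 rescaled structural variables (0-10 scale)
# and a binary label for violent conflict incidence in the next one to four years.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 24))
y = rng.integers(0, 2, size=500)

# gamma = beta_0 + beta_1*x_1 + ... + beta_n*x_n ;  P = exp(gamma) / (1 + exp(gamma))
X_const = sm.add_constant(X)                  # adds the intercept term beta_0
logit_fit = sm.Logit(y, X_const).fit(disp=False)
conflict_probability = logit_fit.predict(X_const)
```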

To evaluate the predictive performance of the GCRI, we applied a ten-fold cross-validation. Because a model's predictive performance will almost certainly be overestimated when it is tested on data also used for its derivation, cross-validation splits the available input data into a set for the model's derivation (training data) and a set for its evaluation (test data) [81–83]. Thereby, it ensures evaluation of a model's performance on a separate test data set, independently from the optimization of the model, and thus without overestimation [83,84]. A small decrease in model performance calculated on test data compared to the training data is to be expected. However, a large decrease indicates that the model is overfitted, meaning that it corresponds too strictly to the training data, which invalidates the use of the model outside the bounds of those data [83].

Among many cross-validation methods, Kohavi [85] and Arlot & Celisse [84] recommend ten-fold cross-validation for models based on real-world datasets with many predictors, such as the GCRI. This method splits the complete dataset of observations into ten parts of equal size. Over ten iterations, each part is used as the test dataset once, while the other nine parts are used to train the model. Potential issues have been flagged for k-fold cross-validation in time-series data because of the inherent serial correlation [86]. Alternatives might include leave-one-out cross-validation, non-dependent cross-validation, temporal partitioning of the training and test sets, or classical out-of-sample evaluation, where a block of data from the end of the series is used for evaluation [86]. Bergmeier et al., however, have shown theoretically and empirically that k-fold cross-validation is the preferred method for predictive model design, even in the case of serial autocorrelation [86].
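A minimal sketch of such a ten-fold cross-validation with scikit-learn, reporting the per-fold Brier and AUC scores used throughout the validation (function and variable names are our own, not part of the GCRI code):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import KFold

def ten_fold_cv(X: np.ndarray, y: np.ndarray, seed: int = 0) -> dict:
    """Ten-fold cross-validation of a logistic regression, returning per-fold scores."""
    brier, auc = [], []
    for train_idx, test_idx in KFold(n_splits=10, shuffle=True, random_state=seed).split(X):
        clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
        p = clf.predict_proba(X[test_idx])[:, 1]
        brier.append(brier_score_loss(y[test_idx], p))
        auc.append(roc_auc_score(y[test_idx], p))
    return {"brier_per_fold": brier, "auc_per_fold": auc,
            "median_brier": float(np.median(brier)), "median_auc": float(np.median(auc))}
```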

We compare the models' performance calculated from the training data and test data to identify overfitting [83]. We assess the internal stability of the model by analysing the models' performance over the ten iterations. 'If the variability is large, then the model's coefficients highly depend on the particular portion of the original data used to fit the model' [83]. If the model is both internally stable and not overfitting, we can fit it on the full original sample of observations (no distinction between test and training data) to use all available information according to the sufficiency principle [81,83]. The performance of this final model can be estimated by in-sample validation over the full sample of observations, or out-of-sample by taking the median over the performance metrics calculated over the ten iterations of test sets.

Performance metrics reported for the probability models include the Brier score, the area under the receiver operating characteristic (ROC) curve (AUC), and the precision-recall curve [82,87]. The Brier score is the average of the squared prediction error [82,87], which is similar to the Mean Squared Error (MSE) for linear regression. The AUC is commonly used to assess how well a model discriminates between high-risk and low-risk subjects [82,83]. The ROC curve plots the sensitivity against the fall-out rate over the whole range of possible thresholds (zero to one; see Appendix B for information on cross-validation and related accuracy metrics). The precision-recall curve plots the precision against the recall rate (or sensitivity) over the whole range of possible thresholds (also see Appendix B). These performance metrics are reported first because they are independent of a selected probability threshold, which indicates the probability above which risk for conflict is predicted.
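For reference, with pᵢ the predicted conflict probability and oᵢ ∈ {0, 1} the observed outcome for observation i, the Brier score over N observations is:

Brier = (1/N) · Σᵢ₌₁ᴺ (pᵢ − oᵢ)²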

Employing 24 predictor variables, we are aware of likely multicollinearity and overfitting problems [88]. We performed a standard variable selection analysis to estimate the effect of a reduced number of variables on the GCRI's predictions: the Least Absolute Shrinkage and Selection Operator (LASSO). LASSO is a regression analysis method for variable selection and regularization [88]. We acknowledge that there exist more variable reduction techniques, for example based on the Bayesian Information Criterion (BIC), the Akaike Information Criterion (AIC), or Least Angle Regression (LAR) [88].

A full variable selection study is, however, outside the scope of this study.

We rather focus on the effects of the LASSO-based variable reduction on the GCRI's predictive performance in light of the main goal of a predictive early warning model (rather than explanatory), and in light of interactions with the end-user for policy-making. The potential problem of overfitting due to multicollinearity is investigated with cross-validation as described above.
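As an illustrative sketch of an L1-penalized (LASSO-style) logistic regression in scikit-learn; the penalty strength below is arbitrary and does not correspond to the alpha value used for the GCRI:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# L1 penalty shrinks some coefficients exactly to zero, effectively selecting variables.
lasso_logit = LogisticRegression(penalty="l1", solver="liblinear", C=1.0, max_iter=1000)
lasso_logit.fit(X, y)                             # X, y as in the earlier sketches

coefs = lasso_logit.coef_[0]
kept = np.flatnonzero(coefs)                      # indices of the retained variables
ranking = kept[np.argsort(-np.abs(coefs[kept]))]  # ranked by absolute coefficient size
```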

Lastly, we needed to set the probability threshold above which events are declared as having conflict risk. This can be assessed visually from a double histogram presenting the distributions of predicted conflict probabilities of the actual positive (conflict) events, as well as of the actual negative (peace) observations [83]. Another common way to derive a suitable threshold is from the ROC plot. The use of the ROC curve to calculate a threshold is very common in the literature [5,23,89]. Selecting a threshold where the ROC curve starts bending would maximize sensitivity (minimizing the omission rate), while minimizing the fall-out rate (maximizing specificity) [83]. However, it can be more accurately visualized by plotting the sensitivity (omission rate) and specificity (fall-out rate) over all possible probability thresholds from zero to one. The intersection then indicates the probability threshold which maximizes (minimizes) both plotted metrics. In the particular case of conflict risk prediction, high sensitivity (low omission) is preferred over high specificity (low fall-out). This means we rather tolerate false positives (type I errors, falsely predicting conflict) than false negatives (type II errors, falsely predicting peace), considering the precautionary principle. This choice accurately reflects the conflict prevention policies which this evidence-based method supports. The model validation is compared with other existing conflict early warning systems.
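A minimal sketch of deriving such a threshold from the ROC curve, by locating the point where sensitivity and specificity are closest to each other (names are ours, not the GCRI implementation):

```python
import numpy as np
from sklearn.metrics import roc_curve

def balanced_threshold(y_true: np.ndarray, p_pred: np.ndarray) -> float:
    """Probability threshold where sensitivity and specificity intersect."""
    fpr, tpr, thresholds = roc_curve(y_true, p_pred)
    sensitivity = tpr
    specificity = 1.0 - fpr                    # fall-out = 1 - specificity
    idx = np.argmin(np.abs(sensitivity - specificity))
    return float(thresholds[idx])
```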

3. Results and discussion

3.1. Model design tests

As a point of reference, we first present the validation of the two standard conflict probability models (NP and SN), against which model design decisions are compared and discussed. The baseline model includes all 24 variables according to Table 1 above, imputed with the LOCF technique. Fig. 1 shows the Brier scores and AUC scores as boxplots of 10 scores, one for each fold of the cross-validation, as well as ROC curves and precision-recall curves for the NP and SN models. The median Brier scores are 0.0622 (NP) and 0.0782 (SN), well below 0.25, which is recommended as the maximum allowed Brier score [87]. The median AUC scores are 0.9386 (NP) and 0.9362 (SN).

3.1.1. Overfitting

The first step in the validation of the probability models is to test for overfitting by comparing performance metrics calculated on the training and test data sets. For both conflict probability models, the difference in performance between the test and training data is <0.01 for both the AUC and the Brier score.

Secondly, in Fig. 2 we plotted observed and predicted means to investigate the probability models' bias. The means of the predicted probabilities of the training data are identical to the observed means (plotted on the 1:1 line). This is an exclusive property of how logistic regression fits its model to given input data [90]. Hence, the deviation away from the 1:1 line (and the plotted training data points) by the test data's mean predicted and observed probabilities provides valuable model performance information, among others on overfitting. The SN model's predicted means diverge further from the 1:1 line than the NP model's means. In general, the predicted mean probabilities of both models stay within a maximum probability bias of 0.05 from the observed means.

Based on the AUC and Brier scores and Fig. 1, we can conclude that there is practically no overfitting in the conflict risk probability models. Therefore, from here onwards, we will plot performance metrics and graphics for the probability models based only on the test data sets of the ten-fold cross-validation.

3.1.2. Stability

Next, the internal stability of the models is tested by comparing the performance over the ten iterations/folds. As for the tests on overfitting, we only present performance metrics independent of a selected probability threshold. The AUC over the ten folds is very stable for both probability models, with a maximum difference of 0.04 between two out of ten folds of the NP and SN models (Fig. 1). The same observation results from the Brier score, which has a maximum difference of 0.03 between two of the ten folds of the NP model (Fig. 1). Further, Fig. 2 also holds interesting information on the stability of model performance. According to the comparison of the means of predicted and observed probabilities, there is more spread/variability in model performance over the ten folds of the SN model than of the NP model. There is no clear pattern of over- or underestimating the mean probabilities (Fig. 2). In general, the spread is low for both probability models and we conclude that they are internally stable because of their similar predictive performance over the ten iterations/folds.

3.1.3. Data imputation

Table 2 reports on the sensitivity of the predictions to different imputation techniques for handling missing data. Imputing the data with LOCF results in the highest predictive power according to the Brier score for the NP model, and according to both the Brier and AUC scores for the SN model. Listwise deletion delivers the lowest predictive performance. MICE with predictive mean matching results in a similar predictive power as MICE with random forests. In general, there is not a big difference in predictive power between the different imputation techniques according to the AUC, but a bigger difference is noticeable according to the Brier score.

3.1.4. Variable selection

Table 3 summarizes the results obtained by the variable selection technique (LASSO regression, alpha equal to 0.001). The remaining variables are ordered from most to least important for both the NP and SN models based on the assigned coefficient values.

Fig. 1. GCRI probability model predictive performance for the national power (NP, blue) model and subnational (SN, brown) model. Brier scores and AUC scores are shown as boxplots of 10 scores, one for each fold of the cross-validation, together with ROC curves and precision-recall curves for the NP and SN models, again one curve per cross-validation fold. The different colour scales indicate the different folds. Note that the scales for the Brier and AUC scores are between 0–0.3 and 0.7–1.0 respectively.


The number of variables was reduced to seven and five respectively. The out-of-sample predictive performance increased slightly according to the Brier score of the NP model (a 0.001 lower Brier score than the model with all variables), but the other three metrics show that the out-of-sample predictive performance decreased for both models: to an AUC of 0.754 (NP model), and to a Brier score of 0.115 and an AUC of 0.742 (SN model).

3.1.5. Probability threshold

A suitable probability threshold would maximize sensitivity (minimizing the omission rate) while minimizing the fall-out rate (maximizing specificity). Fig. 3 visualizes this trade-off over all probability thresholds from zero to one. The intersection then indicates the probability threshold which maximizes both sensitivity and specificity. Accordingly, the recommended probability thresholds for the highest combination of sensitivity and specificity are 0.15 and 0.21 respectively for NP and SN. The averaged recommended threshold would thus be 0.18. Other selection criteria for the threshold are possible too, for example based on the highest kappa index of agreement. In that case, the recommended threshold lies between 0.3 and 0.4.

Fig. 2. The mean predicted values vs. mean observed values of the training and test dataset for each of the ten iterations/folds of the NP (blue) and SN (brown) conflict risk probability models.

Table 2. Predictive performance of the GCRI probability models when implementing different data imputation techniques. The values in italics indicate the best predictive performance (lowest Brier, highest AUC). For an explanation of the abbreviations, see below the table.

| Imputation technique | National power model: Brier score | National power model: AUC | Subnational model: Brier score | Subnational model: AUC |
|---|---|---|---|---|
| LOCF | 0.0622 | 0.9386 | 0.0782 | 0.9362 |
| Listwise deletion (complete cases) | 0.1041 | 0.9253 | 0.1526 | 0.9325 |
| MICE using predictive mean matching | 0.0653 | 0.9449 | 0.0863 | 0.9357 |
| MICE using random forests | 0.0653 | 0.9446 | 0.0863 | 0.9359 |

Abbreviations: AUC = Area Under Curve; LOCF = Last Observation Carried Forward; MICE = Multiple Imputation by Chained Equations.

Table 3. Selected variables by the LASSO regression models, in order of importance, and the predictive performance of these models (Brier score and AUC based on the out-of-sample test set of the last year of available data).

| Ranked importance | National power model variables | Subnational model variables |
|---|---|---|
| 1 | Recent internal conflict | Recent internal conflict |
| 2 | Years since highly violent conflict | Population |
| 3 | Neighbours with highly violent conflict | Income inequalities |
| 4 | Unemployment Rate | Unemployment Rate |
| 5 | Income inequalities | Openness |
| 6 | Openness | / |
| 7 | Population | / |
| Brier Score | 0.061 | 0.115 |
| AUC | 0.754 | 0.742 |


Averaged over both probability models, 0.35 is the kappa-based recommended threshold.

Depending on the criteria against which the model is evaluated and on which models the focus lies, a different threshold can be selected as most suitable.

For illustration, we further present results based on the threshold that, on average over both probability models (SN and NP), gives the maximum combination of sensitivity and specificity: 0.18. Table 4 provides threshold-dependent accuracy metrics of the full model as described in Appendix B. This means it is an in-sample validation; however, we have already concluded that our model is internally stable and not overfitting. The accuracy, or proportion of correctly predicted events, is very high because of the unbalanced number of peace observations compared to conflict observations. The kappa index of agreement takes this imbalance into account and indicates moderate (>0.40) to substantial (>0.60) agreement of the predictions with the observations. The sensitivity indicates that around 86% of conflict events will be predicted, while the specificity illustrates that around 86% of all peace events will be predicted. 58% of predicted conflict events are actual conflicts, while 97% of peace predictions are actual peace events. With a higher threshold (e.g. 0.35 based on the highest kappa) the overall predictive performance increases (kappa of 0.66), mainly because many more peace events are being predicted (increased specificity of 93%). However, the sensitivity (to predict conflict events) decreases to 73%, which is the more important accuracy measure from a precautionary conflict prevention perspective.
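For illustration, the threshold-dependent metrics reported in Table 4 could be computed from a confusion matrix as follows (a sketch with our own naming, assuming binary observed outcomes and predicted probabilities):

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix

def threshold_metrics(y_true: np.ndarray, p_pred: np.ndarray, threshold: float = 0.18) -> dict:
    """Accuracy metrics for a given probability threshold (cf. Table 4 and Appendix B)."""
    y_hat = (p_pred >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_hat).ravel()
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "kappa": cohen_kappa_score(y_true, y_hat),
        "sensitivity": tp / (tp + fn),         # share of conflict events predicted
        "specificity": tn / (tn + fp),         # share of peace events predicted
        "fall_out": fp / (fp + tn),
        "omission": fn / (fn + tp),
        "precision": tp / (tp + fp),
        "negative_predictive_value": tn / (tn + fn),
    }
```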

Fig. 4 provides an overview of the false negative and false positive predictions of the GCRI. There are more false positives (falsely predicted conflict risk) than false negatives (falsely predicted peace), as mentioned before. The false predictions are evenly spread out through time. Further, there are more false predictions in Africa relative to other continents; these are, however, mainly false negative predictions. Countries with 5 or more false negative mispredictions over the whole training period (1998–2013) are Peru, Egypt, Djibouti, Rwanda, Mozambique, and Cambodia for the national power conflict risk; and Canada, Haiti, Guinea, Togo, Libya, Egypt, and DR Congo for the subnational conflict risk.

3.2. Model design decisions in light of policy support

The GCRI is a conflict risk model specifically developed at the crossroads of science and EU policy-making for conflict prevention. During the modelling stage, this combination creates certain considerations that differ from a purely scientific investigation context. Here, we discuss model design decisions and their consequent results, as presented in the previous section, in light of the GCRI's policy support purpose.

3.2.1. Outcome variables

First, the differentiation between an NP and an SN model informs policy-makers on the type of conflict to expect and thus directs them regarding the policy measures suitable for conflict prevention. According to Fig. 1, the SN probability model shows more variation in its performance over the ten folds as well as slightly lower overall performance (lower AUC and higher Brier scores) than the NP probability model. This could be ascribed to the fact that there is more variety in SN conflict events, ranging over autonomist, secessionist, and ethnic violence, than within NP conflict events, which only represent conflicts for control over the political system of a country [19].

There are conflict early warning models that predict conflict well on a more temporally and spatially disaggregated scale, e.g. every month or every three months, on sub-national political regions or even geo-located [6,91]. The obvious advantage, compared to the country-year unit that the GCRI applies, is that policy-makers are provided with more precise information on the location and timing of conflict events, as all conflicts play out locally and are not distributed evenly throughout a country [91]. Another advantage is that short-term events or contentious issues that can trigger conflicts are more accurately captured by the data, e.g. elections, natural hazards, or internal displacement of population groups [3,91].

Fig. 3. Out-of-sample sensitivity and specificity (averaged over 10 folds) of the NP and SN probability models over all probability thresholds (zero to one). The droplines indicate the threshold with the combination of maximum sensitivity and specificity.

Table 4. Threshold-dependent out-of-sample and in-sample accuracy metrics, based on a probability threshold of 0.18. Out-of-sample values represent the median over the ten folds; in-sample values are based on the full conflict prediction models trained on all available observations. For an explanation of the metrics, see Appendix B.

| Validation | Conflict dimension | Accuracy | Kappa | Sensitivity | Specificity | Fall out | Omission | Precision | Negative predictive value |
|---|---|---|---|---|---|---|---|---|---|
| Out-of-sample median over 10 folds | NP | 0.872 | 0.577 | 0.835 | 0.878 | 0.122 | 0.165 | 0.533 | 0.970 |
| | SN | 0.855 | 0.636 | 0.892 | 0.845 | 0.155 | 0.108 | 0.622 | 0.964 |
| | Average | 0.863 | 0.606 | 0.863 | 0.862 | 0.138 | 0.137 | 0.577 | 0.967 |
| In-sample full model | NP | 0.871 | 0.573 | 0.827 | 0.879 | 0.121 | 0.173 | 0.531 | 0.968 |
| | SN | 0.855 | 0.637 | 0.889 | 0.845 | 0.155 | 0.111 | 0.623 | 0.964 |
| | Average | 0.863 | 0.605 | 0.858 | 0.862 | 0.138 | 0.142 | 0.577 | 0.966 |



However, policy design for conflict prevention, such as that prescribed by the Instrument for Stability and Peace at EU level, involves longer-term processes and annual decision cycles, often on a national scale, e.g. supporting socio-economic development through aid programs and diplomacy. Such long-term conflict prevention policies are developed to impact the structural conditions a country exhibits, e.g. inequality, economic development, trade relations, corruption, food security, health (infant mortality), etc. Therefore, a conflict early warning tool at the level of years and countries, based on structural variables, is more useful to EU conflict early warning than spatially disaggregated monthly predictions based on contentious issues and conflict triggers. Other conflict prevention actors, such as relief organizations or UN peacekeepers in the field, likely benefit more from warning tools disaggregated in time and space.

Similarly, specifying the dependent variable as 'the incidence of conflict in the coming four years' simplifies and limits the understanding of conflict onset, duration and cessation [21]. Disaggregating the dependent variable explicitly into onset, duration and end of violent conflict would yield a more difficult predictive task, and such a model would certainly lose predictive performance [21]. On the other hand, it reduces serial autocorrelation in the input data and can provide valuable information on the different processes and variables driving the three phases within the conflict cycle.

3.2.2. Data imputation

Missing data constitutes a serious problem in statistical analyses, the early warning tools built on them and, consequently, the policy support provided. There is not much change in the predictive performance of the GCRI depending on the imputation method applied, and the data imputed with LOCF result in the most correct predictions (Table 2). Those facts support our decision to keep LOCF as a simple and transparent imputation method, even though holding values constant after the last observed value is not a realistic assumption for some of the cases.

Fig. 4. Overview of the number of false positive and false negative predictions for the NP and SN models, based on a probability threshold of 0.18, summed over the ten cross-validation test sets (out-of-sample). The top bar charts show the false predictions over time, while the bottom maps show the false predictions per country summed over all years.


The problem with listwise deletion (complete cases) is that too much information is lost, as the whole observation is deleted when one (or more) value of a variable is missing. This loss of information can severely reduce the power of the predictive model and introduce bias, especially when data are missing not at random [65,92]. MICE using predictive mean matching or random forests is a method that more realistically recreates missing values and holds good predictive power (Table 2).

Further, it is important to consider the reason for the missingness of data. For certain variables, e.g. 'Homicide rate', data are very likely missing not at random (MNAR) [67], but rather underreported or not reported at all in certain political or socio-economic contexts [65]. A complete analysis of the dependencies of missing data falls outside the scope of this paper, but will be part of future developments of the GCRI. MNAR variables can be either removed, replaced by another proxy, or replaced by a dummy variable indicating the reason for their missingness. In conflict zones or contexts with a high risk for conflict, data gathering is difficult and data are thus missing not at random. Observations for both dependent and predictor variables are quite uncertain or missing there, leading to very low correlations (0.3–0.5) between different datasets of civil war onsets [18].

3.2.3. Variable selection

After analysing the variable reduction based on LASSO, we decided to keep all 24 variables in the GCRI for a number of reasons. The main reason is the trade-off between predictive power and explanatory power. More variables improved the predictive performance, and thus the early warning capacity, of the GCRI (Table 3), though they impeded the interpretation of the coefficients and significance tests of the regression due to multicollinearity. The dominantly predictive purpose of the GCRI is not endangered by overfitting to 24 variables, as was extensively tested for. The other reason for keeping 24 variables is the collaborative development of the GCRI with an expert panel of researchers and policy-makers. End-users of the early warning tool advocate including variables from their field of work to better understand interventions that they can make in pursuit of conflict prevention [21]. Likewise, Usanov and Sweijs noticed that 'expert-based approaches to political violence forecasting still carry great weight in the deliberations of many policymakers' [21 p. 2]. Even with completely quantitative tools for conflict early warning, expert inputs are decisive when selecting and aggregating variables [18]. A full variable reduction study, including other techniques as well (e.g. BIC, AIC, or LAR), and focused on both the predictive and the explanatory power of the model, is advisable whenever any variable-related future development of the model occurs (e.g. when a variable is replaced or updated with an extra year of observations), though its results will always be deliberated with the expert panel.

3.2.4. Probability threshold

Regarding the threshold for the probability models, we mentioned two thresholds, i.e. 0.18 and 0.35, based respectively on the maximum combination of sensitivity and specificity, and on the highest kappa index of agreement. From a precautionary perspective for policy support, we prefer a threshold that lowers the number of false negative predictions. In other words, we try to minimize the number of peace predictions that are actually conflict events. Therefore, the lower threshold of 0.18 is preferred over the other. Table 4 shows that the SN model has a higher sensitivity (it predicts more true positives, fewer true negatives, and more false positives, for a similar number of false negatives) compared to the NP model. Therefore, although the NP model has a better overall performance, the SN model is the most useful model for policy support from a precautionary standpoint.

3.3. Model outcomes

Because the ten models based on partitioning of the data into ten folds were observed to be internally stable and not overfitting, we fitted two full models (NP and SN) to the complete dataset. The AUC value of the full model is 0.94 for both the SN and NP models. The Brier scores are 0.06 (NP) and 0.07 (SN), well below 0.25, which is recommended as the maximum allowed Brier score [87]. Overall, the in-sample validation of the full model compares well to the out-of-sample ten-fold cross-validation presented in Fig. 1 (see also Table 4).

The quantification of the resulting NP and SN models can be found in Appendix C, where their parameter estimates, standard errors and significance levels are given. These numbers can be used in simple spreadsheet software to reproduce the predictions found in this article or to create new predictions. Since the main goal of the GCRI is conflict risk prediction rather than causal explanation, we focus on the predictive performance rather than on the description and discussion of parameter estimates. To make relevant causal analyses, the input data would need to adhere to stringent statistical conditions, which cannot be assured for the input data currently used. Globally, the GCRI predicts more countries at risk for subnational conflicts than for national power conflicts. The African continent shows the most countries with a high risk for both national power and subnational conflicts; however, each continent has at least some countries at risk for violent conflict (probability > 0.18).
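As a sketch of how the published parameter estimates could be used to reproduce a prediction (the variable names and coefficient values below are placeholders, not the Appendix C estimates):

```python
import math

def conflict_probability(values: dict, coefficients: dict, intercept: float) -> float:
    """Compute P = exp(gamma) / (1 + exp(gamma)) from coefficients and rescaled inputs."""
    gamma = intercept + sum(coefficients[name] * value for name, value in values.items())
    return math.exp(gamma) / (1.0 + math.exp(gamma))

# Illustrative call with made-up numbers only:
p_np = conflict_probability(
    values={"recent_internal_conflict": 8.0, "gdp_per_capita": 3.1},
    coefficients={"recent_internal_conflict": 0.4, "gdp_per_capita": -0.2},
    intercept=-3.0,
)
```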

Table 5 compares the GCRI's predictive performance to other existing quantitative conflict early warning systems. Although these tools have different end-users, input variables, modelling approaches, and validation techniques, their overall predictive performance can be compared. According to the reported performance metrics, the GCRI shows better predictive performance for all reported metrics except for the precision of O'Brien's classification algorithm [20] and the AUC of the ViEWS project [6].

3.4. Use and future development of the GCRI model in light of policy support

The GCRI's predictive performance compares well to other existing, often more complex, conflict early warning systems (Table 5). Simplicity is an advantage when used in a policy context, as it facilitates transparency and trust in the model. For this reason, Goldstone et al. of the Political Instability Task Force [2] and Celiku and Kraay of the World Bank [17] also prefer the use of simple models or algorithms. A simple model fosters direct dialogue between model developers, experts and end-users for the development and use of the model. On the other hand, the other reviewed conflict early warning systems [6,18,21] use complex combinations (ensembles) of different forecasting techniques and predictor variables for a valid reason: 'Many of the most interesting, policy-relevant theoretical questions are also the most complex, nonlinear, and highly context-dependent. […] it is at best impractical and at worst impossible to apply standard regression techniques within the context of a Large N study, short of invoking unreasonable, oversimplifying assumptions' [18]. The dominant purpose of the GCRI is prediction and early warning. Therefore, any technique, simple or complex, is fit for purpose as long as the out-of-sample validation shows high predictive performance. Should the use of the GCRI be extended to investigating conflict causes and potential interventions, it would need to adhere more strictly to statistical assumptions for the input data (but lose predictive power that way, see Sections 3.1.4 and 3.2.3 on variable selection) or apply more complex modelling techniques, such as machine learning and ensemble methods [93].

The main limitations faced by the GCRI, as described in Section 3.2 ('Model design decisions in light of policy support'), are the rough resolution of country-year observations and predictions, the aggregated information in conflict incidence as the dependent variable (instead of onset, duration, and end), missing data and the ways of imputing or handling it, and the limited explanatory power due to the many variables and a simple model specification. Further, work is invested in modelling the intensity of conflict for early warning purposes. Those efforts, however, require a more rigorous definition of the outcome variable and better modelling techniques before robust scientific forecasting for policy support can be provided. Lastly, it would be very interesting to compare the GCRI's performance to qualitative conflict risk assessments. However, the necessary data on the performance of qualitative assessments is not available to us at this point.


Given the above-mentioned limitations, with this release we laid the first stone for a solid quantitative conflict risk analysis allowing more cross-fertilisation between academic and governmental initiatives. As the EU's scientific and technical service, we provide evidence-based support to policy-makers, preferably in close collaboration with our peers. Through formal academic debate (represented by specialized journals), opportunities arise for the academic community to partake in the process of policy advice. Future versions of the GCRI will be shaped by this scientific debate, giving increased weight to the scientific advice provided to policy. Scientists are welcome to critique and contribute to policy-making for conflict prevention.

4. Conclusion

This article attempted to bridge the gap between quantitative conflict risk models developed in an academic context and the ones used for policy support on conflict prevention. We presented, validated, and freely distributed the GCRI, a conflict risk model developed in collaboration with an expert panel of researchers and policy-makers. The GCRI directly supports the design of conflict prevention strategies of the EU as the main quantitative tool of the EU Conflict Early Warning System.

We showed that the development of a model for direct policy support brings certain considerations and model design decisions to support its specific application. Yet, we argued and demonstrated that these political-technical decisions are made within scientific bounds, safeguarding scientific rigour as well as objective policy support. The GCRI is validated as a stable model that does not overfit and that predicts the risk for conflict with a sensitivity for conflict events of 86%, a specificity for peace events of 86%, an AUC of 0.94 and a Brier score of 0.07. This performance supports the use of the GCRI as the main quantitative tool of the EU Conflict Early Warning System. Furthermore, by providing a transparent validation of the model, we encourage the use of the GCRI for education, analyses and discussions on model improvements in the context of conflict prevention, to the benefit of fundamental research as well as policy-making.

Future research will investigate possible improvements to the variables included in the model, dependent as well as predictor variables, the handling of missing data, increasing explanatory power by aligning to statistical preconditions or by applying alternative forecasting techniques, and the development of a conflict intensity model. This will be researched in continued collaboration with the expert panel and policy-makers, as well as with the wider academic community as a follow-up to this release.

CRediT authorship contribution statement

Matina Halkia: Conceptualization, Supervision, Funding acquisition, Project administration, Writing - original draft, Writing - review & editing. Stefano Ferri: Methodology, Software, Data curation, Validation, Resources. Marie K. Schellens: Conceptualization, Methodology, Formal analysis, Investigation, Writing - original draft, Writing - review & editing, Visualization, Project administration. Michail Papazoglou: Conceptualization, Methodology, Data curation, Formal analysis, Investigation, Writing - original draft, Writing - review & editing. Dimitrios Thomakos: Writing - original draft, Writing - review & editing, Validation.

Declaration of competing interest

None.

Acknowledgements

We would like to acknowledge and thank the Foreign Policy Instruments and the European External Action Service for supporting this research and for the fruitful collaboration and communication between the policy and research services of the European Commission. Further, we are grateful to former trainees and consultants who have contributed to the development of the GCRI in the past. The authors have also benefited from the excellent work of many anonymous reviewers, whom we want to thank for their precious time.

The results do not reflect the opinion of the European Commission or the European External Action Service on the risk and status of conflict in the countries. While every effort has been made to make the GCRI assessment reliable, the information is purely indicative and should not be used for any decision making without additional sources of information and analysis. The European Commission is not responsible for any impact, damage or loss resulting from the use of the information presented in this article.

This research was partly funded by the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Innovative Training Network grant agreement no. 675153.

Table 5
Comparison of the GCRI with other existing quantitative conflict early-warning models, specifying their modelling approach, spatial and temporal resolution, reported performance metrics (out-of-sample, averaged over test data divisions or model subparts if given), and the GCRI's performance (median of out-of-sample folds). The values in italics indicate the best predictive performance.

| Developer and/or model name | Year | Modelling approach | Spatial and temporal resolution | Reported performance measure | Model | GCRI |
|---|---|---|---|---|---|---|
| O'Brien from the Center for Army Analysis, U.S. Army [20] | 2002 | A pattern classification algorithm called fuzzy analysis of statistical evidence (FASE) | Country, year | Overall accuracy; Recall/sensitivity; Precision | 79%; 75%; 66% | 86%; 86%; 58% |
| Integrated Crisis Early Warning System, U.S. [21] | 2010 | Bayesian aggregation (ensemble) of six diverging models | Country, quarterly (4 times a year) | Brier score (of their Rebellion model) | 0.18 | 0.06 (NP); 0.08 (SN) |
| Political Instability Task Force, U.S. [2] | 2010 | Case-controlled conditional logistic regression | Country, year | Onsets correctly classified (sensitivity); Controls correctly classified (specificity) | 82.7%; 83.5% | 86.3%; 86.2% |
| Hague Centre for Strategic Studies, The Netherlands [18] | 2017 | Ensemble of four existing quantitative forecasting models | Country, year | AUC | 0.84 | 0.94 (SN and NP) |
| Celiku and Kraay, World Bank [17] | 2017 | Algorithm that chooses a set of thresholds for correlates of conflict, together with the number of breaches of thresholds that constitute a prediction of conflict | Country, year | Average of the false positive and false negative rate | 0.31 | 0.14 |
| ViEWS project, Uppsala University [6] | 2019 | Ensemble of thematic models and specific statistical/machine-learning approaches | Country and subnational level, monthly | AUC; Brier; Accuracy | 0.96; 0.09; 0.85 | 0.94; 0.07; 0.86 |

Appendix A. Variables' distribution and outlier study

Transformations are extensively used in statistics for variable rescaling. They can be described as 'Applying a deterministic mathematical function (e.g., log function, ln function) to each value to not only keep the outlying data point in the analysis and the relative ranking among data points, but also reduce the error variance and skew of the data points in the construct.' [61]. We have applied standard logarithmic transformations, square root transformations and winsorization. Ghosh and Vogt explain that 'winsorizing a distribution means setting the values at or more extreme than the τ quantile to that of the τ quantile on one tail and setting those values at or more extreme than the 1 − τ quantile to those of that quantile' [62] (p. 4).
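To make the quoted definition concrete, the following minimal sketch winsorizes a numeric vector at a symmetric quantile τ. The default τ = 0.05 is purely illustrative; the quantile actually used for the GCRI variables is not restated here.

```python
# Minimal sketch of the winsorization described above: values beyond the
# tau and 1 - tau quantiles are set to the values at those quantiles.
# tau = 0.05 is an illustrative choice, not the GCRI setting.
import numpy as np

def winsorize(x, tau=0.05):
    x = np.asarray(x, dtype=float)
    lower, upper = np.nanquantile(x, [tau, 1 - tau])
    return np.clip(x, lower, upper)

# Example: the extreme values -50 and 120 are pulled in to the 5th/95th quantiles.
print(winsorize([-50, 1, 2, 3, 4, 5, 6, 7, 8, 120]))
```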

In this appendix we present the distribution analysis of the variables that are not normally distributed and the transformations applied to them. We used a logarithmic transformation for the following variables: 'Infant Mortality', 'Openness', 'Homicide rate', 'GDP per capita' and 'Unemployment rate'. For the variable 'Oil producer' we implemented a fifth-root ($\sqrt[5]{x}$) transformation, while we used the winsorization technique for the variable 'Corruption'.
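A compact way to apply these transformations to a country-year table is sketched below. The column names, the use of log1p to guard against zero values, and the 5% winsorization limits are illustrative assumptions rather than the exact GCRI preprocessing.

```python
# Sketch, assuming a pandas DataFrame with illustrative column names that
# mirror the variables discussed in this appendix.
import numpy as np
import pandas as pd
from scipy.stats.mstats import winsorize

def transform_variables(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Log transformation for the strongly right-skewed variables; log1p is an
    # assumption used here to guard against zeros.
    for col in ["infant_mortality", "openness", "homicide_rate",
                "gdp_per_capita", "unemployment_rate"]:
        out[col] = np.log1p(out[col])
    # Fifth-root rescaling for 'Oil producer'.
    out["oil_producer"] = np.power(out["oil_producer"], 1 / 5)
    # Winsorization for 'Corruption'; 5% on each tail is an illustrative limit.
    out["corruption"] = np.asarray(
        winsorize(out["corruption"].to_numpy(), limits=(0.05, 0.05)))
    return out
```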

From the descriptive plots concerning the variable 'Infant Mortality' we can observe a strongly right-skewed distribution. Therefore, we applied a logarithmic transformation.

Fig. A1. Histogram, boxplot, density plot and Q-Q plot for 'Infant Mortality'.

Fig. A2. Cullen and Frey graph for 'Infant Mortality'.


Fig. A3. Histogram, boxplot, density plot and Q-Q plot for 'Economic Openness': FDI net inflow (BoP, current US$).

Fig. A4. Cullen and Frey graph for 'Economic Openness': FDI net inflow (BoP, current US$).

In order to build the variable 'Economic Openness' we have used three indicators: FDI net inflow (BoP, current US$), FDI net inflow as a percentage of the GDP, and exports of goods and services (as a percentage of the GDP). Based on their distributions we applied a logarithmic transformation.


Fig. A5. Histogram, boxplot, density plot and Q-Q plot for 'Economic Openness': FDI net inflow as a percentage of the GDP.

Fig. A6. Cullen and Frey graph for 'Economic Openness': FDI net inflow as a percentage of the GDP.


Fig. A7. Histogram, boxplot, density plot and Q-Q plot for 'Economic Openness': exports of goods and services (as a percentage of the GDP).

Fig. A8. Cullen and Frey graph for 'Economic Openness': exports of goods and services (as a percentage of the GDP).


Fig. A9. Histogram, boxplot, density plot and Q-Q plot for 'Homicide rate'.

Fig. A10. Cullen and Frey graph for 'Homicide rate'.

From the descriptive analysis of the variable 'Homicide rate' and the variable's distribution, we decided to use a logarithmic transformation.


From the descriptive analysis of the variable 'GDP per capita' and the variable's distribution, we decided to use a logarithmic transformation.

Fig. A11. Histogram, boxplot, density plot and Q-Q plot for 'GDP per capita'.

Fig. A12. Cullen and Frey graph for 'GDP per capita'.


Fig. A13. Histogram, boxplot, density plot and Q-Q plot for 'Unemployment'.

Fig. A14. Cullen and Frey graph for 'Unemployment'.

From the descriptive analysis of the variable 'Unemployment' we can observe a right-skewed distribution. Based on the variable's distribution, the most appropriate transformation is a logarithmic transformation.


From the descriptive analysis of the variable 'Oil producer' we implemented a fifth-root ($\sqrt[5]{x}$) rescaling. With the $\sqrt[5]{x}$ transformation, the variable is approximately normally distributed, which is ideal for using some form of the general linear model (e.g., t-test, ANOVA, regression).

Fig. A15. Histogram, boxplot, density plot and Q-Q plot for 'Oil producer'.

Fig. A16. Cullen and Frey graph for 'Oil producer'.
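The Cullen and Frey graphs shown in this appendix locate each variable by its skewness and kurtosis. The sketch below, using invented data, computes those two quantities before and after the fifth-root transformation so the move towards normality (skewness and excess kurtosis close to zero) can be checked numerically.

```python
# Sketch: comparing skewness and (excess) kurtosis, the quantities summarised
# in a Cullen and Frey graph, before and after a fifth-root transformation.
# The data below are invented purely for illustration.
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(0)
oil_production = rng.lognormal(mean=2.0, sigma=1.5, size=500)  # heavy right tail

for label, x in [("raw", oil_production),
                 ("fifth root", np.power(oil_production, 1 / 5))]:
    print(f"{label:>10}: skewness = {skew(x):+.2f}, "
          f"excess kurtosis = {kurtosis(x):+.2f}")
```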


Fig. A17. Histogram, boxplot, density plot and Q-Q plot for 'Corruption'.

Fig. A18. Cullen and Frey graph for 'Corruption'.

From the descriptive analysis of the variable 'Corruption' we can observe long tails on both ends of its distribution. Based on this, we used the winsorization technique.

References
