C Thesis Analysis Code

(1)

C Thesis Analysis Code

Lina Sigurdh January 11, 2021

1: Replication Reiter & Stam

#All right, I will begin by simply trying to replicate the results from Reiter and Stam's ana lysis. 'doublereduce' is the data they provided.

load("reiterstam.Rdata")

#Creating the logit model.

replication <- glm(sideaa ~ pdemdtar + pdemdin+ sdemin + sdemtar + mdemin + mdemtar + persona l + military + single + democ + contig + majpow + ally + loglsrat + advanced + dispyrs + dspl ine1 + dspline2 + dspline3, data = doublereduce, family = "binomial")

replication_f = glm(sideaa ~ pdemdtar + pdemdin+ sdemin + sdemtar + mdemin + mdemtar + person al + military + single + democ + contig + majpow + ally + loglsrat + advanced + dispyrs + dsp line1 + dspline2 + dspline3, data = doublereduce, family = binomial)

coeftest(replication_f, vcov. = vcovCL(replication_f, cluster = doublereduce$idyr, type = "HC 0"))

##

## z test of coefficients:

##

## Estimate Std. Error z value Pr(>|z|)

## (Intercept) -4.80545323 0.09889942 -48.5893 < 2.2e-16 ***

## pdemdtar 1.04494020 0.14021596 7.4524 9.168e-14 ***

## pdemdin 0.10529635 0.18999241 0.5542 0.579433

## sdemin -0.08143726 0.14207032 -0.5732 0.566497

## sdemtar 0.17889896 0.11394694 1.5700 0.116410

## mdemin -0.45500447 0.27555257 -1.6512 0.098689 .

## mdemtar 0.64215122 0.24440637 2.6274 0.008604 **

## personal 0.29558888 0.26506248 1.1152 0.264779

## military -0.30662639 0.57399481 -0.5342 0.593205

## single -0.65063790 0.15066330 -4.3185 1.571e-05 ***

## democ -1.04653039 0.21039016 -4.9742 6.551e-07 ***

## contig 2.91383090 0.09028561 32.2735 < 2.2e-16 ***

## majpow 2.16351725 0.10355229 20.8930 < 2.2e-16 ***

## ally 0.08479717 0.08700605 0.9746 0.329753

## loglsrat -0.31570044 0.02667998 -11.8329 < 2.2e-16 ***

## advanced -0.18713522 0.14153245 -1.3222 0.186099

## dispyrs -0.38071836 0.02265343 -16.8062 < 2.2e-16 ***

## dspline1 -0.00351297 0.00042412 -8.2829 < 2.2e-16 ***

## dspline2 0.00228290 0.00035586 6.4152 1.407e-10 ***

## dspline3 -0.00064137 0.00014120 -4.5423 5.564e-06 ***

## ---

## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(2)

file:///C:/Users/smulb/OneDrive/FREDS C/METHODS C MODULE 3/WIP/Lina Sigurdh_Source code bachelor thesis.html 2/8

#That worked! All right, on to working with the new IV.

2: Subsetting and reshaping data

#When working with SIPRI and the VDem data, I first need to do some clean-up, before I create my dummies and merge with the other data.

load("fullpopulation.Rdata")

apop <- population[which(population$`Additive polyarchy index` < 0.35),]

aypoplow <- apop[which(apop$Year > 1948),]

population2 <- aypoplow[which(aypoplow$Year < 1995),]

#What I have done here is sort the population of data. First, I removed all values equal to a nd above 0.35 on the API. Then, I removed all values equal to or lower than 1948. Last, I rem oved all values equal to or higher than 1995. Thus, my population should be countries with a score lower than 0.35 on the API and within the time period 1949-1994.

#I then did the same with the SIPRI data, and excluded all subjects except for those within t he period 1949-1994:

load("SIPRIuntreated.Rdata")

sipri1 <- sipri %>% select(-("1995":"2019"))

#I also need to transform the SIPRI data from wide to long.

load("milex.Rdata")

milex2 = reshape(milex, direction = "long", varying = list(names(milex)[2:47]), v.names = "Mi litary expenditure", idvar = c("Observation"), timevar = "year", times = 1949:1994)

#There we have it :)

3: Merging with V-Dem and SIPRI

load("milexp.Rdata")

load("vdempopulation.Rdata")

colnames(population) <- c("Country","country_id","country_text_id","year","api","COWcode","md i,")

colnames(population) <- c("Country","country_id","country_text_id","year","api","COWcode","md i")

merged <- merge(population, milexp)

#Now the VDem and the SIPRI data are combined in one dataset. On to removing data I do not ne ed and renaming columns.

colnames(merged)[colnames(merged) == "Military expenditure"] <- "milexp"

t_merged <- merged %>% select(-Observation,-country_text_id,-Country,-country_id)

#And that concludes the merging part.

(3)

4: Model 2: Creating the IV dummy

#My IV will be a dummy variable created from the MDI and milex.

#Model 1:

t_merged$dummy <- ifelse(t_merged$mdi >=0.5,1,0) & ifelse(t_merged$milex > 0.05,1,0)

#That worked!

dummyd <- t_merged %>% select(-mdi,-milexp) summary(dummyd$dummy)

## Mode FALSE TRUE NA's

## logical 3636 111 223

#All right, that wraps up making the dummy for model 2. Next, I will experiment some with hig her/lower cutoffs in order to make model 3 and 4.

#Model 3, lower cutoff for MDI:

t_merged$dummy <- ifelse(t_merged$mdi >=0.3,1,0) & ifelse(t_merged$milex > 0.05,1,0) d_md3 <- t_merged %>% select(-mdi,-milexp)

#Model 4, higher cutoff for MDI:

t_merged$dummy <- ifelse(t_merged$mdi >=0.7,1,0) & ifelse(t_merged$milex > 0.05,1,0) d_md4 <- t_merged %>% select(-mdi,-milexp)

5: Merging my IV with the data from Reiter &

Stam

(4)

load("dummyd.Rdata") load("reiterstam.Rdata")

#On to the crucial part, merging the data I have compiled with the data from Reiter & Stam. I intend to merge on the variables Year, and merge Side A with COWn.

#It is important to note that while the variables are called statea and stateb, the descripti on says the opposite (that statea is state B and vice versa). However, looking at the data, I conclude that the statea variable is State A.

colnames(dummyd)[colnames(dummyd) == "COWcode"] <- "statea"

colnames(dummyd)[colnames(dummyd) == "dummy"] <- "d_autocracy_mil"

d_model2 <- merge(dummyd,doublereduce)

#This was model 2. Now I do the same for model 3:

colnames(d_md3)[colnames(d_md3) == "COWcode"] <- "statea"

colnames(d_md3)[colnames(d_md3) == "dummy"] <- "d_autocracy_mil"

d_model3 <- merge(d_md3,doublereduce)

#...and model 4!

colnames(d_md4)[colnames(d_md4) == "COWcode"] <- "statea"

colnames(d_md4)[colnames(d_md4) == "dummy"] <- "d_autocracy_mil"

d_model4 <- merge(d_md4,doublereduce)

6: Regression

load("data_cthesis.Rdata")

#Model 2

model2 <- glm(sideaa ~ d_autocracy_mil+pdemdtar + pdemdin+ sdemin + sdemtar + mdemin + mdemta r + personal + military + single + democ + contig + majpow + ally + loglsrat + advanced + dis pyrs + dspline1 + dspline2 + dspline3, data = d_model2, family = "binomial")

model2_f = glm(sideaa ~ d_autocracy_mil + pdemdtar + pdemdin+ sdemin + sdemtar + mdemin + mde mtar + personal + military + single + democ + contig + majpow + ally + loglsrat + advanced + dispyrs + dspline1 + dspline2 + dspline3, data = d_model2, family = binomial)

coeftest(model2_f, vcov. = vcovCL(model2_f, cluster = model2_f$idyr, type = "HC0"))

(5)

##

## (Intercept) -4.72977909 0.10830238 -43.6720 < 2.2e-16 ***

## d_autocracy_milTRUE 0.72681302 0.13558314 5.3606 8.293e-08 ***

## pdemdtar 1.12845407 0.12809179 8.8097 < 2.2e-16 ***

## pdemdin -0.08967405 0.51023790 -0.1757 0.86049

## sdemin 0.44636389 0.19023903 2.3463 0.01896 *

## sdemtar -0.09855915 0.11294964 -0.8726 0.38288

## mdemin -1.19386573 1.01410765 -1.1773 0.23909

## mdemtar 0.29091842 0.21855423 1.3311 0.18315

## personal 0.26365341 0.25352657 1.0399 0.29837

## military -0.47791702 0.60208342 -0.7938 0.42733

## single -0.63386167 0.14890912 -4.2567 2.075e-05 ***

## democ 0.43937742 0.20130493 2.1826 0.02906 *

## contig 3.06969976 0.09036322 33.9707 < 2.2e-16 ***

## majpow 2.02282858 0.11227636 18.0165 < 2.2e-16 ***

## ally 0.05805295 0.09042070 0.6420 0.52085

## loglsrat -0.30833982 0.02979351 -10.3492 < 2.2e-16 ***

## advanced -0.07951295 0.13412956 -0.5928 0.55331

## dispyrs -0.41851692 0.02641836 -15.8419 < 2.2e-16 ***

## dspline1 -0.00419284 0.00050540 -8.2962 < 2.2e-16 ***

## dspline2 0.00282954 0.00042268 6.6943 2.167e-11 ***

## dspline3 -0.00084947 0.00016664 -5.0976 3.439e-07 ***

## ---

## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##Model 3: lower cutoff value of the MDI

(6)

##

## (Intercept) -4.75404440 0.11327483 -41.9691 < 2.2e-16 ***

## pdemdtar 1.05319018 0.13717269 7.6778 1.618e-14 ***

## pdemdin -0.06657868 0.51056949 -0.1304 0.8962493

## sdemin 0.47319713 0.19178835 2.4673 0.0136141 *

## sdemtar -0.04678661 0.11518372 -0.4062 0.6846021

## mdemin -1.18623975 1.01431448 -1.1695 0.2422026

## mdemtar 0.17954943 0.24195435 0.7421 0.4580390

## personal 0.29404877 0.25395238 1.1579 0.2469092

## military -0.42170126 0.60265511 -0.6997 0.4840903

## single -0.58100608 0.15210310 -3.8198 0.0001336 ***

## democ 0.44138548 0.20081565 2.1980 0.0279517 *

## contig 3.05465110 0.09387929 32.5381 < 2.2e-16 ***

## majpow 1.97575356 0.11430509 17.2849 < 2.2e-16 ***

## ally 0.04656265 0.09245623 0.5036 0.6145296

## loglsrat -0.29347525 0.03006291 -9.7620 < 2.2e-16 ***

## advanced -0.08031336 0.13405570 -0.5991 0.5491032

## dispyrs -0.41606505 0.02721889 -15.2859 < 2.2e-16 ***

## dspline1 -0.00415855 0.00051760 -8.0342 9.418e-16 ***

## dspline2 0.00280317 0.00043061 6.5098 7.524e-11 ***

## dspline3 -0.00083920 0.00016858 -4.9782 6.418e-07 ***

## ---

## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##Model 4: increase cutoff value of the MDI

(7)

##

## (Intercept) -4.62020965 0.10343778 -44.6666 < 2.2e-16 ***

## pdemdtar 1.10025027 0.12287284 8.9544 < 2.2e-16 ***

## pdemdin -0.11711293 0.50888427 -0.2301 0.8180

## sdemin 0.40885684 0.19032804 2.1482 0.0317 *

## sdemtar -0.01419075 0.10916970 -0.1300 0.8966

## mdemin -1.23833364 1.01265429 -1.2229 0.2214

## mdemtar 0.29730896 0.21306272 1.3954 0.1629

## personal 0.32696386 0.23457073 1.3939 0.1634

## military -0.48748382 0.60352287 -0.8077 0.4192

## single -0.68790505 0.14579332 -4.7184 2.378e-06 ***

## democ 0.39741243 0.20125414 1.9747 0.0483 *

## contig 2.96293216 0.08830950 33.5517 < 2.2e-16 ***

## majpow 2.16303527 0.10760508 20.1016 < 2.2e-16 ***

## ally 0.10222080 0.08805797 1.1608 0.2457

## loglsrat -0.33168357 0.02902249 -11.4285 < 2.2e-16 ***

## advanced -0.11492316 0.13536620 -0.8490 0.3959

## dispyrs -0.41554971 0.02569266 -16.1739 < 2.2e-16 ***

## dspline1 -0.00413727 0.00049279 -8.3956 < 2.2e-16 ***

## dspline2 0.00280936 0.00041416 6.7832 1.175e-11 ***

## dspline3 -0.00085503 0.00016445 -5.1994 2.000e-07 ***

## ---

## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

#And there we have it!

#Converting to odds-ratios for interpretation, model 2:

m2_oddsr <- exp(cbind(OR = coef(model2), confint(model2)))

7: Creating tables to present results

(8)

#To present my results in tables, I use the stargazer package (Hlavac, Marek (2018). stargaze r: Well-Formatted Regression and Summary Statistics Tables. R package version 5.2.2. https://

CRAN.R-project.org/package=stargazer).

#Table 1: Summary statistics

stargazer(replication,model2,model3,model4,type="html",out="thesis_table1.html",title="Table 1: Summary statistics",font.size = "small",covariate.labels = c("Military dictatorship","Per s/democ","Democ/pers","Democ/single","Single/democ","Democ/mil","Mil/democ","Personal","Milit ary","Single","Democ","Contiguous","Major power","Ally","Power ratio","Economically advanced"

,"Time since last dispute","Cubic spline 1","Cubic spline 2","Cubic spline 3"))

#Note that in table 1 in the thesis, the dependent variable is called "Directed Dyads" as opp osed to "sideaa" which it is in this output. I changed that in word.

#Table 2: Odds ratio of Table 2

rownames(m2_oddsr) <- c("Intercept","Military dictatorship","Pers/democ","Democ/pers","Democ/

single","Single/democ","Democ/mil","Mil/democ","Personal","Military","Single","Democ","Contig uous","Major power","Ally","Power ratio","Economically advanced","Time since last dispute","C ubic spline 1","Cubic spline 2","Cubic spline 3")

stargazer(m2_oddsr, type="html", title = "Table 2: Odds ratios of Model 2", out = "oddsratio.

html")