Political Competition and Party Incumbency: The Case of German Federal Elections

Karen Demski

Supervisor: Luca Repetto

Master Thesis in Economics, Spring 2017

Abstract

This paper investigates party incumbency advantage in German federal elections. I use a sharp regression discontinuity to exploit the first-past-the-post voting system on the first ballot electing district representatives to the Bundestag, using election results from all voting districts that existed in Germany between 1953 and 2013. I find that being the incumbent party leads to a 1.2 percent increase in vote shares in the following election. This analysis focuses on the most dominant political party during the span of the data, the CDU. Historical analysis reveals the effect to be largest when the political system is stable and lowest at a time where the political structure was experiencing a shift. I do not find evidence to argue that the party incumbency advantage is larger when political competition threatens the potential reelection chances of the incumbent.

1 Introduction

Incumbent politicians win elections at higher rates than challengers; they often enjoy an electoral margin known as the incumbency advantage (Lee, 2008). This effect is well established, and a vast literature works to identify the mechanisms behind it. Furthermore, heterogeneity in the effect exists across electoral settings, political tiers and time. However, it is still unclear why exactly this heterogeneity exists and how it is affected by certain aspects of the political process.

A great deal is already known about incumbency advantage in many different political settings. In the German setting, too, there is a positive incumbency advantage on all political tiers (Ade and Freier, 2013; Freier, 2015). The aim of this thesis is to verify to what extent there is evidence of a party incumbency advantage in German federal elections. I extend the current literature on incumbency advantage by studying how it has evolved over time and investigating how it is affected by political competition within the districts.

This paper starts by documenting the party incumbency advantage in the German federal elections. The first ballot to elect district representatives to parliament provides a convenient platform for using a sharp regression discontinuity design (RDD). Using the margin of victory, I am able to consistently estimate the treatment effect of incumbency by comparing close elections. This is an ideal setting to use the RDD, since marginal victories in individual voting districts for German federal elections are very common. About 20% of district representatives win with a lead of less than 5% in vote shares.[1] I start by estimating a baseline treatment effect of being the incumbent party on the next election's vote shares using election results from 1953 to 2013. I find that the most dominant political party in Germany experiences a 1.2% treatment effect of being the incumbent party on district seats in parliament, measured as the increase in vote shares in the following election.

I then provide a historical perspective and study heterogeneity in incumbency advantage over time. I divide my sample according to time periods presented by Alemann (2003). He splits the twentieth century into phases according to party dynamics and the development of the political structure in Germany, rather than using historical events as markers. I find evidence that the party incumbency advantage is smallest in the 1980s and 1990s, an era of change in the political structure. It is also clear that the most recent period, the 2000s, reports the largest incumbency advantage, up to 2%.

By conducting subgroup analysis to find out how political competition relates to the party incumbency effect, I gain insight into whether political parties in Germany take advantage of the benefits available to them in districts where they are the representative party. Through further subgroup analysis I investigate whether this heterogeneity is related to political competitiveness. I use several definitions to identify districts that are viewed as competitive by the incumbent party. I do not find overwhelming evidence that this effect is restricted to either subgroup; however, I find estimates larger than the baseline in the non-competitive districts that the incumbent considers safe for reelection.

[1] These calculations are based on data presented in Table 1 and confirmed in Hainmueller and Kern (2008).

I test the sensitivity of my estimates to bandwidth size and the underlying functional form. The estimates are consistent in magnitude and significance. I then run a series of placebo checks to verify that the effect I estimate is not just a result of the shape of the underlying function. I find no trace of a discontinuity at an alternative threshold and interpret this as evidence to support the robustness of my findings.

This thesis contributes to the literature by providing an extensive historical perspective on the evolution of party incumbency advantage. Many studies report finding larger effects in more recent time periods (Katz and Cox, 2002; Hirano and Snyder Jr, 2009). In the German setting this is often limited to pre- and post-reunification (Ade et al., 2014). These studies use relatively short sample periods compared to my dataset, which spans six decades. To my knowledge, I am the first to relate to the political science literature in this context while carrying out an extended analysis of incumbency advantage over time. Previous studies simply document the evolution of the heterogeneity; instead, I use specific time periods defined by Alemann (2003) according to the dynamics of the political system in Germany. The characteristics of the political system during these periods provide insight into why the incumbency advantage might differ over time. Relating this heterogeneity to the characteristics of the political structure gives my analysis extra external validity that can aid future research.

I also address a gap in the literature regarding the relationship between competitiveness and incumbency advantage. A few studies mention how the degree of competition affects incumbency advantage (Hirano and Snyder Jr, 2009; Ariga et al., 2016). I find that there is little consensus in the literature regarding this topic. I clarify this issue by directly estimating the direction of the heterogeneity caused by political competition within a district. By identifying which districts are viewed as competitive or secure by the incumbent, I am able to estimate a larger party incumbency advantage when there is less political competition challenging the incumbent party. This provides clarity on the direction of the relationship between the two concepts.

This paper is organized as follows. Theoretical background to the incumbency advantage is discussed in section 2, and section 3 provides an overview of the data and the electoral system in Germany. Section 4 presents my empirical analysis, beginning with the methodology behind the RDD, using it to estimate a general party incumbency advantage and testing the validity of the estimation. In section 5, I present the results of the subgroup analysis, followed by a final conclusion in section 6.

2 Theoretical Background

A key concept in political economic theory is accountability: through elections, voters can reward or punish their politicians and influence their behavior. With every election voters update their beliefs about the competence and honesty of each politician and political party. This indicates that incumbents have a signaling advantage, as the challenger cannot easily signal their competence to the voters, while the incumbent politician can easily influence their own reputation and probability of reelection. This difference in information should influence the popularity and vote share of the incumbent politician, because it reduces the amount of uncertainty surrounding the politician or his/her party (Besley, 2007).

Theory cannot clearly predict whether this is an advantage or a disadvantage for the incumbent, since much depends on the motivation or competence of the politician, but empirical tests can help answer this question. The following literature summary focuses on the main empirical approach used to study this effect, which is the RDD.

Empirical studies generally take the view that incumbency advantage is mainly tied to the incumbent taking advantage of the benefits available to them. Originally the focus of incumbency studies was on the US, as in Lee (2008), who famously applied the RDD to close elections. Using data on the US House of Representatives, he finds a 45% advantage in the probability of winning the seat for the incumbent legislator.

Since then, a large number of studies interested in isolating either the individual or party incumbency advantage have utilized the electoral RDD in other settings too, such as the German electoral setting. Hainmueller and Kern (2008) were the first to use the electoral RDD in the German mixed electoral setting, finding a sizable positive incumbency advantage of 1.5-1.9% for the party of a district candidate in the parliamentary elections. Ade et al. (2014) investigate the heterogeneity of incumbency advantage in Germany on the federal and state level, examining if and how it depends on the party that is in power. They report results for party incumbency advantage similar in magnitude to Hainmueller and Kern (2008). They also find that representatives from the two major political parties, the CDU (Christlich Demokratische Union) and the SPD (Sozialdemokratische Partei Deutschlands), both have an advantage when the SPD is in power and hypothesize that the SPD is driving the effect through increased competition from the left. More specifically, they argue that when the SPD district representatives face both the CDU and the Left party, they use the resources available to them as the governing party in order to boost their chances of winning. While Ade et al. enter a lengthy discussion and draw supportive evidence from their findings regarding heterogeneity, it is not something they directly estimate. I aim to expand on their hypothesis of competition driving incumbency advantage in German parliamentary elections.

Figure 1: First Ballot Votes for Main Parties

[Line plot: total first-ballot votes (in millions, 0 to 20) by election year, 1953 to 2013, for the CDU, SPD, FDP, the Green Party and the Left Party.]

Notes: This graph plots the total votes on the first ballot for each political party in each election. Only parties that won seats in the Bundestag are included, with the exception of the DP, BHE and DZP, which each held a very small number of seats in the second Bundestag in 1953.

3 Setting and Data

In the following section, I describe how the federal Bundestag elections in Germany work and introduce the main parties. In section 3.2, I briefly present the data I will be using and in 3.3, I outline how I plan to conduct subgroup analysis, both historical and competition based.

3.1 The German Electoral Setting

The German electoral system at the federal level is a mixed-member proportional system. The Bundestag (Parliament) is part of the legislative branch and is usually elected every 4 years. On the first ballot, voters elect the direct representative of their district (in a first-past-the-post system), and on the second ballot they elect a party for the proportional representation in Parliament. The second ballot determines the relative strength of the parties that are represented and accounts for about half of the seats. Direct candidates are always guaranteed a seat in Parliament. If a party is entitled to more seats through the second ballot than it has winning districts, the remaining seats are filled using the party lists; if it wins more districts than the seats it is entitled to, it keeps the extra seats as overhang seats. While it is clearly not a two-party system, the district candidates almost exclusively come from one of the two leading parties: the center-right CDU (Christlich Demokratische Union) and the center-left SPD (Sozialdemokratische Partei Deutschlands).[2] In some districts, these parties face significant opposition from at least one smaller party, usually the center-right FDP (Freie Demokratische Partei), the Left party (Die Linke) or the center-left Greens (Die Grünen).[3] Figure 1 graphs the first ballot votes in each election for all of the parties that held seats in the Bundestag, summed over all districts.

There are two aspects regarding the political parties in Germany that are important for my analysis. Firstly, the smaller parties usually receive more votes on the second ballot than on the first, giving me two ballots through which to measure political competition.[4] Solely looking at the direct candidate election in a district might not reveal enough about the competitiveness of the district. Secondly, a candidate is occasionally endorsed by multiple parties. Votes for this candidate will still be recorded under the party he/she identifies with most. For example, instead of nominating their own candidate the CDU might choose to support the candidate nominated by the FDP. In this case the votes are recorded under the FDP and the CDU totals zero votes. When I carry out my analysis I do not use any observations in which the CDU did not clearly nominate their own candidate. I assume that if the CDU has nominated a candidate their vote total will be non-zero. Since this is the most dominant political party, it is a reasonable assumption.

3.2 Data

The data I use to carry out this empirical analysis consists of publicly available election results. I have access to these online via the German Federal Statistical Office and the Federal Returning Office (Der Bundeswahlleiter), which is appointed with the responsibility of overseeing German federal elections. Each dataset that I use provides the number of votes received by every political party in each voting district for a particular election to the Bundestag. This includes votes on both ballots separately. I am able to exploit the first-past-the-post system on the first ballot to identify my treatment effect. The candidate who wins the most votes in a district is elected as its representative; this rule allows me to use a sharp RDD.

[2] In Bavaria the CDU is represented through the CSU; the CDU and CSU are politically aligned. They have an agreement not to compete in the same state, so when I refer to the CDU I include the CDU-CSU union.

[3] The Left party was founded in 2007 through a merger of the PDS and another small party. In this paper I refer to the Left and its predecessors under the same name. The Greens also refers to B'90/Grüne.

[4] This is a characteristic of the data that can be attributed to the quality of the individual district candidates of each party.

Table 1: Close Elections Sample Size

                Observations    CDU     SPD     Other
Total Sample    3636            2219    1355    62
10% margin      1396            718     678     0
5% margin       747             386     361     0
2% margin       313             157     136     0

Notes: This table reports the number of district-election observations for certain winning margin sizes. The last three columns report the number of observations in which the respective party won the election.

Germany has 16 states and currently 299 voting districts, each electing a direct representative to Parliament. The data spans the period from 1953 to 2013 with elections every three or four years. German voting districts often undergo redistricting, since the districts are allocated to each state by size of population. A general rule is that no voting district can deviate more than 15% from the average district population; if it does, there will be redistricting. This often shifts the whole numbering system identifying the districts. Additionally, districts are affected by regional reforms in the states, which can change their borders and/or names. In order to avoid any identification issues I corrected for these changes, so that if a district changed its boundaries it is classified as a whole new district, but if it changed its name without changing its boundaries it remains the same district. This greatly increased the number of districts that have ever existed since 1953, and it means that not all districts cover the same number of elections. In total only 9 districts existed without physical border changes from 1953 to 2013. The others have been subject to redistricting, dissolved into the surrounding districts or created due to regional reforms.

Table 1 provides sample sizes for a selection of common bandwidths. Not surprisingly, the sample size decreases quite rapidly as the margin gets smaller. I estimate my results using a number of different bandwidths starting at 0.05, as it is clear that a narrower margin will not contain enough observations. The RDD that I describe in the following section requires a larger sample size than a randomized experiment to get the same precision (van der Klaauw, 2008).
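As a quick check on these sample sizes, the share of district-election observations that falls inside each winning margin can be computed directly. The sketch below hard-codes the observation counts reported in Table 1; the variable names are illustrative only and are not part of the original analysis.

```python
# Observation counts copied from Table 1 (district-election observations).
total_sample = 3636
close_elections = {"10% margin": 1396, "5% margin": 747, "2% margin": 313}

# Share of all observations decided within each winning margin.
shares = {label: count / total_sample for label, count in close_elections.items()}

for label, share in shares.items():
    print(f"{label}: {share:.1%} of observations")
```

The 5% row works out to roughly one fifth of the sample, in line with the share of close elections quoted in the introduction.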

4 Empirical Analysis

In this section I carry out my main empirical analysis, which will be extended using subgroup analysis in section 5. In 4.1, I begin by introducing the general RDD model and show how I go on to apply it to the data I described in the previous section. In section 4.2, I test the validity of my design to see how well the identifying assumptions hold. I present my findings and test their robustness in sections 4.3 and 4.4 respectively.

4.1 The Setup

I will use a sharp RDD for my analysis. In the sharp RD, treatment, $D_{t,i}$, is a deterministic and discontinuous function of a single observed covariate called the assignment or running variable. The key identifying assumption in an RDD is the continuity assumption, which requires that all other factors are continuous with respect to the assignment variable.

In my analysis the running variable is the margin of victory of the CDU in the election at time $t$ in district $i$, denoted $MOV_{t,i}$. The dependent variable is $VS_{t+1,i}$, which is the vote share of the CDU in the election at time $t+1$ in the same district $i$. Treatment, $D_{t,i} = 1$, occurs when $MOV_{t,i}$ is positive and the CDU becomes the incumbent party in district $i$. Since $D_{t,i}$ is an indicator of treatment, it equals 1 if treated and zero otherwise; it is determined by a threshold at $MOV_{t,i} = 0$. The observation is treated only if $MOV_{t,i} > 0$, which means that there is no overlap between the treated and control group. We have the following regression model:

$$VS_{t+1,i} = \alpha + \beta MOV_{t,i} + \rho D_{t,i} + \pi MOV_{t,i} \cdot D_{t,i} + \gamma_{t+1} + \mu_i + \varepsilon_{t+1,i} \qquad (1)$$

where the effect of interest is $\rho$, which captures the discontinuous jump in the outcome variable due to treatment at the threshold. As is commonly done in the literature, I have added the interaction term to allow the slope of the underlying function of $MOV_{t,i}$ to differ on either side of the threshold. Additionally, I include time fixed effects, $\gamma_{t+1}$, and district fixed effects, $\mu_i$, to reduce the variance.
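To make the role of $\rho$ in Equation 1 concrete, the following is a minimal sketch on simulated data, not the thesis data. It fits the linear specification with separate slopes on each side of the threshold inside a chosen bandwidth. The fixed effects are omitted for brevity, and the data-generating values (`true_rho`, the slope, the noise level) as well as the helper name `rd_linear` are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (NOT the thesis data): margin of victory at t, vote share at t+1.
n = 20_000
true_rho = 0.012                       # assumed jump, chosen to mimic the reported size
mov = rng.uniform(-0.5, 0.5, size=n)   # running variable MOV_{t,i}
d = (mov > 0).astype(float)            # treatment indicator D_{t,i}
vs_next = 0.40 + 0.30 * mov + true_rho * d + rng.normal(0.0, 0.02, size=n)

def rd_linear(mov, d, y, bandwidth):
    """OLS of y on [1, MOV, D, MOV*D] using only |MOV| <= bandwidth; returns rho."""
    keep = np.abs(mov) <= bandwidth
    X = np.column_stack([np.ones(keep.sum()), mov[keep], d[keep], mov[keep] * d[keep]])
    coef, *_ = np.linalg.lstsq(X, y[keep], rcond=None)
    return coef[2]  # coefficient on D: the jump at the threshold

rho_hat = rd_linear(mov, d, vs_next, bandwidth=0.15)
print(f"estimated rho: {rho_hat:.4f}")
```

Because the simulated relationship is exactly linear, the estimate should land close to the assumed jump; with real election data the fixed effects and clustered standard errors described in the text would be added on top of this.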

In traditional parametric estimation, specifying the correct functional form of the underlying function of the running variable on either side of the threshold is vital to estimating the treatment effect $\rho$. Equation 1 represents a linear functional form; however, in my model I also allow other possible shapes. I show this in Equation 2, where the functional form is denoted $f(x)$.

$$VS_{t+1,i} = f(x) + \rho D_{t,i} + \gamma_{t+1} + \mu_i + \varepsilon_{t+1,i} \qquad (2)$$

The correct form of $f(x)$ is very important for finding consistent estimates of $\rho$; otherwise we cannot compare average outcomes of the treated with those of the non-treated. We could mistake a nonlinearity of the function for a discontinuity, and using the entire sample puts too much weight on the observations farthest from the threshold.
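The risk of mistaking a nonlinearity for a discontinuity can be illustrated with simulated data in which there is, by construction, no jump at all. In the sketch below the outcome is a smooth cubic function of the running variable (an assumption chosen for illustration); a linear fit over the whole sample is compared with the same fit restricted to a narrow window around the threshold. The function name `jump_at_zero` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data with a smooth cubic relationship and NO jump at zero.
n = 20_000
x = rng.uniform(-1.0, 1.0, size=n)
y = 0.40 + 0.50 * x**3 + rng.normal(0.0, 0.01, size=n)

def jump_at_zero(x, y, bandwidth):
    """Linear fit with separate slopes on each side of zero; returns the estimated jump."""
    keep = np.abs(x) <= bandwidth
    xs, ys = x[keep], y[keep]
    d = (xs > 0).astype(float)
    X = np.column_stack([np.ones(xs.size), xs, d, xs * d])
    coef, *_ = np.linalg.lstsq(X, ys, rcond=None)
    return coef[2]

global_jump = jump_at_zero(x, y, bandwidth=1.0)   # whole sample
local_jump = jump_at_zero(x, y, bandwidth=0.1)    # observations near the threshold only

print(f"global linear fit: {global_jump:.4f}")
print(f"local linear fit:  {local_jump:.4f}")
```

The global fit reports a sizable spurious jump that comes purely from curvature far from the threshold, while the local fit stays close to zero. This is the motivation for restricting attention to observations near the cutoff.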

One way around this restriction is to use a nonparametric estimation approach. Van der Klaauw (2008) shows that we can compare average outcomes of individual observations close to the threshold as long as the conditional potential outcomes, $E[VS^{1}_{t+1,i} \mid MOV_{t,i}]$ and $E[VS^{0}_{t+1,i} \mid MOV_{t,i}]$, are continuous at this threshold. This does not require any parametric functional form restriction in order to identify an average treatment effect. Lee and Lemieux (2010) point out that linear estimation close to the threshold does not solve all identification issues: assuming a linear functional form of $f(x)$ close to the threshold will still produce biased estimates if the functional form is not exactly linear. I use this as my main estimation method, and I complement it by also testing a functional form using a second order polynomial. It has become convention in the literature not to control for high order polynomials, especially global polynomials, as this leads to poor inference. Gelman and Imbens (2014) recommend using estimators based on quadratic polynomials. These approaches should be seen as complements, and they recommend not relying solely on one specification.

All standard errors are clustered at the district level because I expect correlation within the districts. In the interest of testing the stability of my estimates, I present my specifications using a variety of alternative bandwidths. A narrow margin around the threshold is preferred because it reduces the specification bias that comes from using observations too far from the threshold and the wrong functional form. However, this also comes with a loss in efficiency, and I will not be able to get precise estimates with smaller margins. So, in addition to trying a range of bandwidths, I use two different data-driven methods of selecting the optimal bandwidth size. First, I use IK's method (Imbens and Kalyanaraman, 2012), which selects bandwidths through minimization of the approximate mean squared error at the threshold. Second, I use CCT's method (Calonico et al., 2014), which improves on this method and allows for bias-corrected inference.
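For readers unfamiliar with clustering, the following sketch shows one common way to compute cluster-robust (sandwich) standard errors by hand. It is an illustration on simulated data under stated assumptions, not the estimation code used in this thesis: the district shock, the sample dimensions and the function name `ols_cluster_se` are hypothetical, no small-sample correction is applied, and fixed effects are left out.

```python
import numpy as np

def ols_cluster_se(X, y, clusters):
    """OLS coefficients with cluster-robust standard errors (no small-sample correction)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    bread = np.linalg.inv(X.T @ X)
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(clusters):
        rows = clusters == g
        score = X[rows].T @ resid[rows]   # sum of x_i * u_i within the cluster
        meat += np.outer(score, score)
    return beta, np.sqrt(np.diag(bread @ meat @ bread))

# Simulated panel: each district has a persistent shock shared by all its elections.
rng = np.random.default_rng(2)
n_districts, n_elections = 300, 8
district = np.repeat(np.arange(n_districts), n_elections)
district_shock = rng.normal(0.0, 0.03, size=n_districts)[district]
mov = rng.uniform(-0.15, 0.15, size=district.size)
d = (mov > 0).astype(float)
vs_next = (0.40 + 0.30 * mov + 0.012 * d + district_shock
           + rng.normal(0.0, 0.02, size=district.size))

X = np.column_stack([np.ones(mov.size), mov, d, mov * d])
beta, se = ols_cluster_se(X, vs_next, district)
print(f"rho = {beta[2]:.4f}, clustered t-statistic = {beta[2] / se[2]:.2f}")
```

Summing the scores within each district before forming the outer product is what allows errors from the same district to be correlated across elections.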

4.2 Identification and Validity

In this section, I clarify the identification assumptions and test the validity of my estimation. As de la Cuesta and Imai (2016) make clear, the identifying assumption of the RD is the continuity assumption, meaning that the only discontinuous change that occurs at the threshold is treatment status itself. As I stated in the previous section, this means that the potential outcomes conditional on the running variable, $E[VS^{1}_{t+1,i} \mid MOV_{t,i}]$ and $E[VS^{0}_{t+1,i} \mid MOV_{t,i}]$, must be continuous at the threshold. In the German electoral setting, this implies that an observation where the political party barely won is a valid counterfactual for an observation where that political party barely lost.

Figure 2: Density of the Assignment Variable

[Two histograms: density of the margin of victory at t (horizontal axis, −1 to 1) for the CDU (top) and the SPD (bottom).]

Note: Histogram of the margin of victory in the election at time t for the two largest parties, the CDU (top) and the SPD (bottom). Columns are split into 60 evenly sized bins.

A violation of this assumption can arise if one of the outcomes (treated or non-treated) is preferred, and the agents are able to precisely sort themselves in response. In this setting, we expect political parties to have a certain amount of influence on their vote share, but precise sorting would require extensive manipulation. For example, the party would not only need to know in advance that the election will be close, but also that they are just short of winning, and then be able to manipulate the vote into a win for them. If this were the case, although it is unlikely, I would find either a discontinuity in the density, with observations bunching just right of the threshold (since we can assume agents would prefer to win the election rather than lose it), or a discontinuity in a covariate, as it would not be random who chooses or is able to manipulate the vote share.
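The bunching implication can be illustrated with a back-of-the-envelope sign test, which is far cruder than a formal density test and is offered here only as a sketch. It uses the 5% margin row of Table 1 and assumes that whenever the SPD narrowly won, the CDU was the runner-up, so that the CDU's margin of victory lies just below zero. The function name and the normal approximation are choices made for this illustration.

```python
import math

# Counts of close elections within a 5% winning margin, taken from Table 1.
cdu_wins = 386   # CDU margin of victory just above zero
spd_wins = 361   # assumed: CDU was the runner-up, so its margin is just below zero

def sign_test_p_value(right, left):
    """Two-sided normal-approximation test that either side of the cutoff is equally likely."""
    n = right + left
    z = (right - n / 2) / math.sqrt(n * 0.25)
    return math.erfc(abs(z) / math.sqrt(2))

p_value = sign_test_p_value(cdu_wins, spd_wins)
print(f"p-value: {p_value:.3f}")
```

A large p-value here means the counts on the two sides of the threshold are compatible with an even split, which is what one would expect in the absence of precise sorting.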

I begin by testing for manipulation of the assignment variable, the margin of victory in period t. Bunching of observations on either side of the threshold can be an indication of self-selection into treatment, which would mean treatment is not randomly assigned. Van der Klaauw (2008) shows that for sorting to invalidate the identifying assumption of the RD, agents need to be able to precisely sort themselves around the cutoff. Some influence on the assignment variable is not enough to undermine the interpretation of the approach. If there is no evidence of precise sorting then the density of the sorting variable should be smooth near the threshold. McCrary (2008) proposes a local linear density estimator to check for a discontinuity in the density of the running variable around the threshold. Figure 2 shows the histograms of the running variable for both parties, fitted with a kernel density estimator. The corresponding McCrary test produced a p-value of 0.5127, indicating that the null hypothesis of no bunching cannot be rejected and manipulation is highly unlikely.[5]

Table 2: Balance of Covariates

                                      Linear                                   Polynomial
                                      0.05         0.1          0.15         0.05        0.1         0.15
Vote Share at t − 1                   0.0214       0.00512      0.00237      0.0289      0.00347     0.00615
                                      (1.48)       (0.69)       (0.44)       (1.09)      (0.27)      (0.62)
Observations                          587          1086         1507         587         1086        1507

Fraction of votes considered in t+1   -0.0000693   -0.0000851   -0.0000611   -0.00142    -0.000212   -0.000432
                                      (-0.08)      (-0.17)      (-0.14)      (-0.94)     (-0.28)     (-0.69)
Observations                          745          1397         1938         745         1397        1938

Eligible voting population t + 1      1.242        1.324        1.359        1.679       0.385       0.512
                                      (0.78)       (1.26)       (1.71)       (0.81)      (0.30)      (0.40)
Observations                          745          1397         1938         745         1397        1938

Turnout t + 1                         -0.00235     -0.00136     0.000172     -0.00251    -0.000346   -0.00143
                                      (-1.05)      (-1.05)      (0.14)       (-0.90)     (-0.18)     (-0.92)
Observations                          745          1397         1938         745         1397        1938

Turnout t                             0.00255      -0.000309    -0.000346    -0.00174    0.000702    -0.000244
                                      (1.27)       (-0.22)      (-0.29)      (-0.54)     (0.33)      (-0.15)
Observations                          745          1397         1938         745         1397        1938

* p < 0.1, ** p < 0.05, *** p < 0.01

Notes: The covariate listed in each row is used as the dependent variable; each specification reports the treatment effect of party incumbency status on that covariate. Other than differing in the dependent variable, these regressions follow the estimation strategy presented in section 4. Both linear and second order polynomial specifications include election and district fixed effects. All standard errors are clustered at the district level and t-statistics are in parentheses. Eligible voting population at t + 1 is measured in 1000s.

If the assumption that there is no manipulation of the assignment variable holds, then I should also not find a discontinuity in any pre-determined covariates. Unfortunately, due to the nature of the voting districts in this dataset, the available covariates are quite limited.[6] The district borders are quite dynamic and there is no easily accessible data that is tied to the population of the districts. I want to check that there is no discontinuous jump at the threshold that I might mistake for the treatment effect in my main findings. The covariates that I do have access to are vote share at t − 1, voter turnout at t + 1, voter eligibility at t + 1 and invalid votes at t + 1. Table 2 confirms that there are no statistically significant discontinuities in the baseline covariates around the threshold for a range of specifications. Although one specification reports a statistically significant estimate, this result is not consistent throughout the range of specifications I test. Therefore, I attribute it to the chance of finding false significance when running such a large number of tests.

[5] Also see Figure A.3 in the Appendix for the empirical density of the running variable, the margin of victory of both parties at time t.

[6] The districts are divided so that all districts include close to the same share of the population; a district's population should not deviate more than 15% from the average.

Figure 3: Graphical Representation of the Discontinuity

[Two plots of vote share at t + 1 (vertical axis, 0.3 to 0.5) against margin of victory at t (horizontal axis, −0.2 to 0.2): linear fit (left) and quadratic fit (right).]

Notes: This graph plots both the linear (left) and quadratic (right) estimation approaches using all the specifications described in section 4.1. Both use a margin of 0.2 around the threshold, which is marked by the solid vertical line. Each point on the graphs represents the average vote share in t + 1 in bins of 0.005 intervals of the running variable.

4.3 Main Empirical Results

In this section, I present my main findings. All of the specifications that I present have been carried out on both major parties. My main focus is on the CDU; the results for the SPD are only presented as a comparison to the CDU results and are left to the appendix. All tables in this section report only the effect of interest, which is the coefficient on the treatment indicator that estimates the discontinuous jump at the threshold.[7]

[7] See the Appendix for more detailed regression output; Table A2 reports coefficients of all variables for the specifications in my main output table.

Table 3: Effect of Incumbency Status on Party Vote Shares in Election at t + 1

                         Bandwidth Size
                         0.05         0.1          0.15         0.2
A. Linear
Treatment Indicator      0.0100**     0.0130***    0.0106***    0.00851***
                         (2.08)       (4.44)       (4.44)       (4.06)
Observations             744          1398         1939         2397

B. Polynomial
Treatment Indicator      0.0104       0.0101**     0.0133***    0.0118***
                         (1.47)       (2.29)       (3.71)       (3.75)
Observations             744          1398         1939         2397

C. Optimal Bandwidths
                         CCT          IK           CCT          IK
Treatment Indicator      0.0121***    0.0123***    0.0124***    0.0125***
                         (4.34)       (4.58)       (3.38)       (3.71)
Method                   Linear       Linear       Polynomial   Polynomial
Calculated BW            0.10423      0.11321      0.14004      0.18036
Observations             1445         1546         1843         2236

* p < 0.1, ** p < 0.05, *** p < 0.01

Notes: The dependent variable is the vote share for the CDU candidate in the following election within the same district. All specifications include election and district level fixed effects. Standard errors are clustered at the district level and t-statistics are reported in parentheses. These specifications use only observations where the running variable, margin of victory for the CDU at time t, is within the bandwidth specified. In panel C, these bandwidths are optimally calculated via two different data-driven methods, CCT and IK, and applied to both specifications using linear and quadratic functional form.
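Since Table 3 reports t-statistics rather than standard errors, it can be useful to translate them back into approximate p-values and implied standard errors. The sketch below does this for panel A using a standard normal approximation, which is an assumption of this illustration rather than the exact reference distribution behind the table; the helper names are hypothetical.

```python
import math

def two_sided_p(t_stat):
    """Two-sided p-value from a t-statistic using the standard normal approximation."""
    return math.erfc(abs(t_stat) / math.sqrt(2))

def stars(t_stat):
    """Significance stars following the table legend (10%, 5%, 1%)."""
    p = two_sided_p(t_stat)
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""

# (coefficient, t-statistic) pairs copied from panel A of Table 3, keyed by bandwidth.
panel_a = {0.05: (0.0100, 2.08), 0.10: (0.0130, 4.44), 0.15: (0.0106, 4.44), 0.20: (0.00851, 4.06)}

for bandwidth, (coef, t_stat) in panel_a.items():
    implied_se = coef / t_stat   # standard error implied by the reported t-statistic
    print(f"bw={bandwidth:.2f}  coef={coef:.5f}  implied SE={implied_se:.4f}  {stars(t_stat)}")
```

Under this approximation the stars assigned to the panel A t-statistics match those printed in the table, and the insignificant 0.05 bandwidth estimate in panel B (t-statistic 1.47) receives none.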

Figure 3 plots the CDU's vote share in the election at time t + 1 against their margin of victory in the election at time t around the threshold. A positive margin of victory implies the party won the district seat and is the incumbent for the election in period t + 1. Both plots are restricted to a 0.2 margin around the threshold in order to focus on the observations that are relevant while remaining wide enough to display the shape of the curve. This is also the largest bandwidth that I test. The plot on the left is fitted with a linear regression line matching the last column of panel A of Table 3, and the plot on the right with a second order polynomial matching the last column of panel B of Table 3.

Table 3 reports the results for my two estimation approaches, each estimated using several different bandwidths. For example, when the bandwidth is 0.05, the effect is estimated using only observations where the CDU had a margin of victory between -0.05 and 0.05 in the election at time t. Almost all estimates of party incumbency advantage for the CDU in Table 3 are statistically significant at the 10% level, and most at the 1% level. The sizes of the estimates are also quite consistent, although slightly smaller in the linear specifications in panel A. The discontinuity is consistently measured at around 1% to 1.3%.
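The bandwidth restriction described above can be illustrated with a minimal sketch. It is not the estimation code used for Table 3: it omits the election and district fixed effects and the clustered standard errors, and the function names and the synthetic data are assumptions made purely for illustration. It keeps only observations inside the bandwidth, fits a separate line on each side of the threshold, and reads the treatment effect off the gap between the two intercepts.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def rd_estimate(margins, next_shares, bandwidth):
    """Jump in next-election vote share at a margin of victory of zero.

    Keeps only observations with |margin| <= bandwidth (exact ties are
    dropped), fits a separate line on each side of the threshold, and
    returns the right-hand intercept minus the left-hand intercept.
    """
    pairs = list(zip(margins, next_shares))
    left = [(m, y) for m, y in pairs if -bandwidth <= m < 0]
    right = [(m, y) for m, y in pairs if 0 < m <= bandwidth]
    a_left, _ = fit_line(*zip(*left))
    a_right, _ = fit_line(*zip(*right))
    return a_right - a_left


# Hypothetical noiseless data: share = 0.40 + 0.5 * margin, plus 0.012 for winners.
margins = [i / 100 for i in range(-20, 21) if i != 0]
shares = [0.40 + 0.5 * m + (0.012 if m > 0 else 0.0) for m in margins]
estimate = rd_estimate(margins, shares, bandwidth=0.05)
```

Because the synthetic data are exactly linear on each side, the sketch recovers the built-in jump of 0.012 at any bandwidth; with real data the estimate would vary with the bandwidth, which is why several are reported.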

In panel C, I use optimally calculated bandwidths, as is suggested in the literature, for both the linear and quadratic approaches (Calonico et al., 2014).

All specifications in panel C are statistically significant at the 1% level and are consistent with the estimates in the other two panels. Note that both optimal bandwidth calculations produce margins between 0.1 and 0.2. This is also true for most of the SPD specifications in Table A1, so I will use a linear regression with a bandwidth of 0.15 as my base specification.

Table A1 reports the above-mentioned specifications for the SPD. Sample size, optimal bandwidths and consistency of the estimates are all very similar. Statistical significance is at the 1% level for all specifications except one. The estimates cover a larger range, from about 1% to 1.5%, and tend towards a larger treatment effect. As for the CDU, the optimal bandwidths for both linear and quadratic estimations produce quite consistent estimates of around 1.2%.

Overall, I find a party incumbency advantage of 1.2%, which is slightly lower than the 1.5-1.9% suggested by the literature. However, such small variation can be due to a difference in data range or estimation strategy.

4.4 Robustness

I follow a few standard procedures for checking the robustness of my estimate. These include sensitivity analysis of my estimates to specification changes and using placebo tests to verify whether the discontinuity I find can be attributed to the effect I am attempting to estimate.

In the previous section I have already shown the robustness of my findings by reporting a variety of specifications. Lee and Lemieux (2010) advise showing the sensitivity of the results to a range of bandwidths and orders of the polynomial, as I have done in Tables 3 and A1. In Figure 4, I reiterate this point by plotting the sensitivity of the linear estimates to bandwidth size. The graph plots the estimated treatment effect and its confidence interval for a range of bandwidths.

Figure 4 clearly depicts the main issue with bandwidth choice, the bias-variance trade-off (de la Cuesta and Imai, 2016). On the one hand, the 0.95 confidence interval clearly gets larger close to the threshold, showing that there is a loss of efficiency with narrower margins around the threshold. On the other hand, observations further away from the threshold are no longer comparable to those on the other side of the threshold, and the linear functional form is no longer appropriate, a problem known as specification bias.

Figure 4: Sensitivity Analysis

[Figure: two panels. Left: estimated coefficient plotted against bandwidth (0.05 to 0.3). Right: estimated coefficient plotted against threshold location (-0.2 to 0.2).]

Notes: The left panel plots the sensitivity of the linear specification to bandwidth size in intervals of 0.025. Each point represents the estimated treatment effect and its 0.95 confidence interval is marked using vertical lines. The graph on the right similarly plots estimates using the baseline specification at alternative thresholds along the assignment variable.

I also check whether the discontinuity I found for the CDU can be attributed to an effect other than the one I am attempting to identify. The placebo tests are quite straightforward: I test a number of specifications with alternative cutoff points, attempting to identify a discontinuity at a point where I should not expect to find one. If I clearly identify a discontinuity at a different value of my running variable, then I cannot claim that the discontinuity I have identified at the original threshold is the treatment effect that I am looking for. Table 4 reports estimates for thresholds at -0.2, 0.2, -0.15 and 0.15. I do find that a few specifications produce statistically significant estimates, but they follow no pattern: they are sporadically distributed and inconsistent in size. This indicates that this is unlikely to be evidence of a discontinuity at these points. To complement these estimates, I include a sensitivity analysis of these placebo tests to the cutoff point, shown in the right panel of Figure 4. At my original threshold, where the cutoff is 0, the graph marks my baseline estimate, using the local linear specification with a 0.15 bandwidth, and the 0.95 confidence interval around it. The same specification is then estimated at alternative thresholds on either side of the original in 0.05 intervals. The 0.95 confidence intervals show that these estimates are not statistically significant at any of the other thresholds. This is further evidence that the significant estimates found in Table 4 are not an indication of a true treatment effect at these cutoff points, and it supports the robustness of my main findings.8

8 Unfortunately, it is difficult to run the placebo tests very far from the threshold, as the number of observations decreases and the distribution is less consistent (as seen in Appendix Figure A.2).
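The logic of the placebo tests can be sketched in a few lines. This is only an illustration under simplifying assumptions (no fixed effects, no standard errors, synthetic noiseless data, hypothetical function names), not the code behind Table 4: the running variable is re-centred at each candidate cutoff and the same local linear jump is estimated there.

```python
from statistics import mean


def intercept_at_zero(points):
    """OLS intercept of y on x for a list of (x, y) pairs."""
    xs, ys = zip(*points)
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in points)
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar


def jump_at_cutoff(margins, next_shares, cutoff, bandwidth):
    """Local linear jump estimated at an arbitrary cutoff of the running variable."""
    # Re-centre the running variable so the candidate cutoff sits at zero.
    centred = [(m - cutoff, y) for m, y in zip(margins, next_shares)]
    left = [(x, y) for x, y in centred if -bandwidth <= x < 0]
    right = [(x, y) for x, y in centred if 0 < x <= bandwidth]
    return intercept_at_zero(right) - intercept_at_zero(left)


# Hypothetical data with a 0.012 jump only at the true threshold of zero.
# Multiples of 0.05 are left out so no observation sits exactly on a cutoff.
margins = [i / 200 for i in range(-80, 81) if i % 10 != 0]
shares = [0.40 + 0.5 * m + (0.012 if m > 0 else 0.0) for m in margins]
placebo = {c: jump_at_cutoff(margins, shares, c, bandwidth=0.05)
           for c in (-0.2, -0.15, 0.0, 0.15, 0.2)}
```

In this constructed example the estimated jump is 0.012 at the true threshold and essentially zero at the placebo cutoffs, which is the pattern the placebo tests look for. Note that a placebo window wide enough to include the true threshold would pick up part of the real discontinuity.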

Table 4: Placebo Tests

                                  Linear                              Polynomial
                      0.05        0.1         0.15        0.05        0.1         0.15

Threshold at -0.2     -0.0296     0.000214    -0.000266   0.923       0.0586      0.0248
                      (-1.00)     (0.01)      (-0.03)     (1.74)      (0.49)      (0.45)
Observations          316         633         1002        316         633         1002

Threshold at 0.2      0.000124    -0.0343**   -0.00315    -0.365      0.109       -0.0267
                      (0.00)      (-2.04)     (-0.27)     (-0.42)     (0.77)      (-0.42)
Observations          488         988         1467        488         988         1467

Threshold at -0.15    0.00838     -0.00602    -0.00750    0.350       0.0424      -0.0184
                      (0.32)      (-0.65)     (-1.13)     (1.14)      (0.68)      (-0.52)
Observations          424         873         1312        424         873         1312

Threshold at 0.15     -0.00331    0.0149      -0.00543    0.516       0.0190      0.0893**
                      (-0.10)     (1.30)      (-0.71)     (1.37)      (0.21)      (2.46)
Observations          574         1124        1706        574         1124        1706

* p < 0.1, ** p < 0.05, *** p < 0.01

Notes: This table reports estimates for treatment indicator using all the same specifications as in table 3, only at alternative cutoff points as specified in each row. All specifications include election and district level fixed effects. Standard errors are clustered at the district level and t-statistics are reported in parentheses.

5 Heterogeneity in the Party Incumbency Advantage

This paper raises the question of where the effect of party incumbency is concentrated. Research in other settings has investigated heterogeneity over time and the link to competition. The following sections attempt to address these questions in this setting by looking at subgroups within the data and using dummy variables to identify an effect for any of these groups. I will present my analysis using the results based on CDU data.

5.1 Historical Perspective

Previous studies clearly document an increase in incumbency advantage over time. Especially in the case of the US House of Representatives, Katz and Cox (2002) find the incumbency advantage increases after 1966. In the same setting, Hirano and Snyder Jr (2009) also report increases in the effect between elections in 1972 and 2001. Ade et al. (2014) identify this heterogeneity in the German setting and confirm that the effect is larger post-reunification. Despite the objective of these studies to find heterogeneity over time, they all use quite restricted datasets spanning not much more than three decades.

Table 5: Party Incumbency Effect for Historical Subgroups

                                  Linear                              Polynomial
                      0.1         0.15        0.2         0.1         0.15        0.2

A. Reunification
Pre-1990              0.0118**    0.00693**   0.00818***  0.00702     0.0109**    0.00774
                      (2.46)      (1.99)      (2.74)      (1.04)      (1.98)      (1.65)
Observations          622         868         1079        622         868         1079

Post-1990             0.0153***   0.0135***   0.00930***  0.0111      0.0155***   0.0162***
                      (3.69)      (3.92)      (3.04)      (1.68)      (2.92)      (3.62)
Observations          776         1071        1318        776         1071        1318

B. Split at 1980
Pre-1980              0.0152**    0.00755     0.00807     0.0101      0.0151**    0.00953
                      (2.29)      (1.51)      (1.78)      (1.06)      (2.02)      (1.47)
Observations          428         609         747         428         609         747

Post-1980             0.0127***   0.0115***   0.00884***  0.00911     0.0123***   0.0123***
                      (3.92)      (4.32)      (3.87)      (1.84)      (3.12)      (3.55)
Observations          1021        1391        1718        1021        1391        1718

C. Phase II
Period -1976          0.0162**    0.00910     0.00933**   0.00992     0.0168**    0.0124
                      (2.44)      (1.77)      (2.02)      (1.04)      (2.17)      (1.83)
Observations          377         548         679         377         548         679

D. Phase III
Period 1980-1987      0.00624     0.00406     0.00818***  0.00622     0.00200     0.00402
                      (1.21)      (1.10)      (2.82)      (1.07)      (0.44)      (0.89)
Observations          289         390         494         289         390         494

E. Phase IV
Period 1990-2002      0.00915     0.0114**    0.00989**   0.00516     0.0113      0.0118
                      (1.51)      (2.25)      (2.28)      (0.61)      (1.53)      (1.85)
Observations          420         583         701         420         583         701

F. Phase V
Period 2002-          0.0161***   0.0120***   0.00712     0.00884     0.0159**    0.0181***
                      (3.11)      (2.70)      (1.74)      (0.92)      (2.32)      (3.04)
Observations          423         573         717         423         573         717

* p < 0.1, ** p < 0.05, *** p < 0.01

Notes: The dependent variable is the vote share for the CDU candidate in the following election within the same district and only treatment effect is reported for each specified time period. All specifications include election and district level fixed effects. Standard errors are clustered at the district level and t-statistics are reported in parentheses.

Table 6: Party Incumbency Effect with Historical Dummy Variables

                                                      Bandwidth Size
                                      0.05        0.1         0.15        0.2

A. Linear
Treatment Indicator                   0.0202***   0.0182***   0.0132***   0.00942***
                                      (2.84)      (4.15)      (3.73)      (2.86)
Treatment · Phase II (1953 - 1976)    -0.00234    -0.00181    0.000715    0.00344
                                      (-0.29)     (-0.34)     (0.15)      (0.72)
Treatment · Phase III (1980 - 1987)   -0.0170***  -0.00895**  -0.00447    -0.00188
                                      (-2.62)     (-2.07)     (-1.22)     (-0.53)
Treatment · Phase IV (1990 - 1998)    -0.0144**   -0.00829    -0.00600    -0.00465
                                      (-2.12)     (-1.83)     (-1.54)     (-1.26)
Observations                          744         1398        1939        2397

B. Polynomial
Treatment Indicator                   0.0210**    0.0156***   0.0161***   0.0128***
                                      (2.41)      (2.88)      (3.67)      (3.16)
Treatment · Phase II (1953 - 1976)    -0.00237    -0.00169    0.000415    0.00328
                                      (-0.29)     (-0.32)     (0.09)      (0.69)
Treatment · Phase III (1980 - 1987)   -0.0169***  -0.00893**  -0.00470    -0.00225
                                      (-2.60)     (-2.07)     (-1.29)     (-0.64)
Treatment · Phase IV (1990 - 1998)    -0.0145**   -0.00807    -0.00613    -0.00468
                                      (-2.14)     (-1.79)     (-1.57)     (-1.26)
Observations                          744         1398        1939        2397

* p < 0.1, ** p < 0.05, *** p < 0.01

Notes: Treatment indicator for incumbency status is interacted with time period dummies. The most recent time period, Phase V, from election in 2002 is the reference period. All specifications include election and district level fixed effects. Standard errors are clustered at the district level and t-statistics are reported in parentheses.

In order to investigate the development of the treatment effect over time, I split my dataset into historical subgroups. For this I follow Alemann (2003), who has periodized the development of the German political system according to the evolution of party dynamics over time. He outlines four distinct phases. Phase I refers to the post-war years between 1945 and 1953 and is considered a formation phase, or Formierungsphase. The second, the Konzentrierungsphase, runs from 1953 through to the election in 1976 and was characterized by the dominance of three parties: CDU/CSU, SPD and FDP. It is often referred to as the era of the three-party system; I label it Phase II. Concentration of political power was at a high in the 1970s. The third, Phase III, captures the period leading up to reunification in 1990 and is considered a transformation phase, or Transformationsphase. It is characterized by a noticeable polarization and a loss in dominance of the three major parties. A new fourth party, the Greens, finally breaks the five-percent hurdle to win parliamentary seats. Lastly, Phase IV, the post-reunification period from 1990 to 2002, is the Zentripetalephase, when the political structures of east and west merge. Alemann notes that the major parties begin to orient themselves more centrally on the political spectrum. While his work ends there, in my analysis I refer to the elections in the period post-2002 as Phase V.

Table 5 presents the analysis split into these distinct phases. The formation phase is unfortunately left out because it is too short and the dataset begins with the 1953 election. Panels C, D, E, and F report the treatment effects for each of the aforementioned time periods. Of these, Phase V reports the largest estimates, up to 1.8%, with statistical significance at the 1% level. Phase II reports similar magnitudes, but estimates are slightly less significant. The middle time periods, the 1980s and 1990s, report the least significant estimates and are generally smaller in magnitude.

Overall, results are a bit clearer in panels A and B, which split the data at reunification and at the start of the transformation phase (Phase III) in 1980. In these groups statistical significance is high for most estimates. The largest estimates are evident for the post-1990 group, at roughly 1% to 1.6%. This is consistent with the party incumbency effect being highest in the most recent decades.

I test the results of the four phases relative to each other in Table 6, using all of the original specifications. I have created a dummy variable for each time period and interacted it with the treatment indicator. Phase V is the period of reference because it produced the strongest results. The base period reports estimates slightly larger than the main regressions using the whole sample, with some estimates as large as 2%, indicating that the effect is in fact stronger in more recent years. When interacted with dummies for Phase III and Phase IV, the treatment effect is consistently negative across all specifications. This indicates that the effect is likely to be lower in the 1980s and 1990s relative to the 2000s, while the lack of significance for the treatment interaction with Phase II may indicate that the effect was of a similar size during that time period.
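The interacted specification behind Table 6 can be summarized in a single equation. The following is a sketch in notation assumed here for illustration (the symbols $v$, $D$, $m$, $f$ and $\gamma_p$ are not taken from the thesis), and it leaves open details such as whether the polynomial terms are also interacted with the phase dummies:

```latex
v_{i,t+1} = \alpha + \beta D_{it}
  + \sum_{p \in \{\mathrm{II},\,\mathrm{III},\,\mathrm{IV}\}}
      \gamma_p \bigl( D_{it} \times \mathrm{Phase}^{p}_{t} \bigr)
  + f(m_{it}) + \delta_t + \mu_i + \varepsilon_{it},
\qquad D_{it} = \mathbf{1}[\, m_{it} > 0 \,]
```

Here $v_{i,t+1}$ is the CDU vote share in district $i$ at the following election, $m_{it}$ is the margin of victory at time $t$, $f$ is a linear or quadratic function of the margin, and $\delta_t$ and $\mu_i$ are election and district fixed effects. Under this reading, $\beta$ is the effect in the reference period (Phase V) and $\beta + \gamma_p$ is the effect in phase $p$. For example, in the linear specification with a bandwidth of 0.1, the implied Phase III effect is $0.0182 - 0.00895 = 0.00925$, roughly half the Phase V effect.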

To understand why this heterogeneity exists only in the 1980s and 1990s, I return to Alemann (2003). He describes the 1980s as a build-up to German reunification, which was the break that changed the political system. There was new movement in the until then rigid party system: power changed hands, the Greens entered parliament, and extreme right-wing parties were surging in elections at other political tiers. The keywords to describe this time period for political parties are polarization and fragmentation. In the 1990s, the east and west political systems were merging and parties were being redefined. Three small parties won seats in parliament; by the end of the 1990s the small parties had stabilized themselves and the large ones had polarized. Since then, a five-party system has existed throughout the 2000s. In Figure 1 in section 3, I show how the German political structure starts with a long period of the three-party system and ends in a five-party system, with the transition in the 1980s and 1990s. I find the lowest estimates of the party incumbency effect during this transitional period.

5.2 Political Competition

The link between political competition and party incumbency is not clearly established in the literature. I find a discrepancy in what different authors conclude regarding the influence of competition among parties on incumbency advantage.

Stein and Bickers (1994) argue that only incumbents that are vulnerable seek advantages from holding office, in the form of pork barrel spending, to boost their reelection chances. In a more recent paper, Hirano and Snyder Jr (2009) look at individual incumbency advantage for US state legislators and consider the possible underlying sources. In doing so, they find evidence that the competitiveness of a district is directly related to both the size of the officeholder's benefits and the incumbent's quality relative to his or her competitors. They conclude that incumbents exert more effort to use their officeholder benefits when they are running for reelection in competitive districts; in other words, they use these benefits to respond to electoral threats. This strongly suggests that in competitive districts the incumbent may exert more effort to utilize direct officeholder benefits, since they are more vulnerable.

More recent empirical papers are of the opposite view. Ariga et al. (2016) look at electoral benefits in Japan’s lower house and struggle to confirm previous reports of an incumbency advantage. They attribute this lack of an effect to their study's focus on competitive districts. Kendall and Rekkas (2012) attempt to separate individual and party incumbency effects in the Canadian parliament. They argue that they find a smaller party effect than individual effect because the individual, and not the party, has incentives to engage in pork-barrel type projects. They imply that the party does not have the incentives to respond to a threat of political competition.

The phases of party development in the previous section can already provide some hints as to the relationship between political competition and party incumbency advantage, since they are each characterized by a certain dynamic among the different parties. I find the treatment effect to be lower at times of instability in the political system. It is not clear whether this indicates a rise of political competition within districts. To get a clearer understanding of this relationship, I try different ways to define the competitiveness of individual districts.

Table 7: Party Incumbency Effect for Subgroups Based on Competitiveness

                                        Linear                              Polynomial
                            0.1         0.15        0.2         0.1         0.15        0.2

A. Second Ballot - Mean
Competitive Districts       0.0167***   0.00933**   0.00752     0.00764     0.0170**    0.0146**
                            (2.79)      (1.99)      (1.88)      (0.71)      (2.16)      (2.33)
Observations                572         782         968         572         782         968

Not Competitive Districts   0.00800***  0.00893***  0.00901***  0.00834     0.00954***  0.0108***
                            (2.63)      (3.55)      (4.24)      (1.92)      (2.72)      (3.38)
Observations                826         1157        1429        826         1157        1429

B. Flip Between Parties (11.5 percent)
Competitive Districts       0.00981**   0.00959**   0.00933**   0.00793     0.0112**    0.0100
                            (2.16)      (2.25)      (2.28)      (1.16)      (2.12)      (1.96)
Observations                550         659         705         550         659         705

Not Competitive Districts   0.0219**    0.0223**    0.0212**    0.0202      0.0259**    0.0252**
                            (2.04)      (2.11)      (2.07)      (1.53)      (2.43)      (2.45)
Observations                848         1280        1692        848         1280        1692

C. Average MOV (-0.070/0.076 percent)
Competitive Districts       0.00836**   0.00708**   0.00874***  0.00590     0.00765     0.00626
                            (2.35)      (2.27)      (2.61)      (1.10)      (1.71)      (1.47)
Observations                877         990         1013        877         990         1013

Not Competitive Districts   0.0225***   0.0161***   0.0157***   0.0165      0.0174**    0.0162***
                            (3.57)      (2.82)      (2.92)      (1.70)      (2.42)      (2.62)
Observations                521         949         1384        521         949         1384

* p < 0.1, ** p < 0.05, *** p < 0.01

Notes: The dependent variable remains the vote share for the CDU candidate in the following election within the same district and only treatment effect is reported for each specified subgroup. Specifications are the same, but run on the two subgroups separately. All specifications include election and district level fixed effects. Standard errors are clustered at the district level and t-statistics are reported in parentheses.

I try three different ways to divide the districts into two separate groups according to how safe or competitive the incumbent party might view each district. For the sake of consistency and transparency, Table 7 reports results of these different definitions for the same set of specifications as used in the previous historical analysis. First, I loosely follow the paper by Hirano and Snyder Jr (2009), in which a US voting district is considered competitive if the dominant party received less than 60 percent of the normal vote. I adapt this to the German federal election by classifying a district as competitive, or rather as having significant competition from small parties. I use the second ballot results for practical reasons; however, this also makes intuitive sense, as I am looking for a party incumbency effect and the second ballot essentially determines the relative strengths of the parties in parliament. If the combined vote share of the two main parties is below the mean of 0.78, then the smaller parties provide sufficient political competition for the incumbent party.9 It is important to note that I use the first election, at time t, for this measure, as this corresponds to the information that the parties would have when approaching the election at time t + 1.

9 To calculate this mean of 0.78 I use only district-year observations within the 0.15 margin.

My second method considers the number of times the district representative seat has flipped between parties. The aim is to clearly separate out those districts that can be considered a stronghold for the CDU. Since I am focusing my analysis on the CDU, I define the district representative seat as switched or flipped if the CDU has either lost a seat in a district where they were the incumbent party or won one in a district where they were not the incumbent party. Here I also use the mean as a splitting point. In 11.5% of elections in a district’s lifespan the district representative seat switches between parties. To calculate this measure for each observation, I only use prior election results.

Third, I divide the districts according to their average margin of victory. Since I focus my analysis on the CDU, I hope that this will separate those districts that the CDU might consider safe for reelection from those that they might consider unstable. As in the previous method, to calculate this measure for each unit of observation I only use previous election results. I use the 25th and 75th percentiles of the average margin of victory to group the observations; I calculate these as -0.07 and 0.076 respectively. I classify those within this percentile range as politically competitive.
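The three definitions above can be sketched as simple classification rules. The function names and argument layouts are hypothetical, the cutoffs 0.78, 0.115, -0.07 and 0.076 are the values quoted in the text, and two details are assumptions: that the flip share is measured over transitions between consecutive earlier elections, and that a district counts as competitive when its flip share lies above the mean.

```python
from statistics import mean


def competitive_by_second_ballot(cdu_share, spd_share, mean_combined=0.78):
    """Definition 1: smaller parties matter when the two main parties'
    combined second-ballot share at time t is below the sample mean."""
    return cdu_share + spd_share < mean_combined


def prior_flip_share(cdu_won_history):
    """Share of transitions between consecutive earlier elections in which
    the district seat changed hands to or from the CDU. The argument lists
    True/False outcomes for elections strictly before the current one."""
    changes = [a != b for a, b in zip(cdu_won_history, cdu_won_history[1:])]
    return sum(changes) / len(changes) if changes else 0.0


def competitive_by_flips(cdu_won_history, mean_share=0.115):
    """Definition 2 (assumed direction): flip share above the mean."""
    return prior_flip_share(cdu_won_history) > mean_share


def competitive_by_average_margin(prior_margins, lower=-0.07, upper=0.076):
    """Definition 3: the average CDU margin of victory over earlier
    elections lies between the 25th and 75th percentile cutoffs."""
    return lower <= mean(prior_margins) <= upper


# Hypothetical district: combined second-ballot share 0.74, two seat changes
# across three transitions, and an average prior margin of 0.02.
example = (
    competitive_by_second_ballot(0.36, 0.38),
    competitive_by_flips([True, True, False, True]),
    competitive_by_average_margin([0.02, -0.05, 0.09]),
)
```

In each rule only information available before the election at time t + 1 enters the classification, mirroring the restriction to prior results described above.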

Table 7 reports the estimates for these subgroups under the three definitions. Statistical significance is quite high for both groups under all definitions, and slightly higher for the noncompetitive groups. Almost all estimates are significant at the 5% level. All estimates are positive and reasonably consistent across bandwidths. I do not find a lack of an effect or negative estimates for any group, yet I find heterogeneity in the magnitudes. In panels B and C the estimates are much larger for the districts in the not competitive groups. They range from 1.5% to 2.5%, which is almost double my main estimate of 1.2% from Table 3, while the competitive groups report almost all estimates to be below 1%. This heterogeneity seems to be reversed for the first definition in panel A; however, the differences are not as large and the estimates are not consistent.

In order to achieve a better understanding of the heterogeneity, I repeat my initial specifications including a dummy variable to identify a district that I classify as competitive. The interaction term between this dummy variable and the initial treatment indicator will indicate whether the treatment effect is significantly different in these competitive groups. Table 8 reports the treatment indicator and interaction term for all of the specifications using all of the three definitions. Not all of the interaction terms are significant, implying that whether the effect differs for competitive district-year observations depends on how I define political competition. However, those that are statistically different from the noncompetitive estimates are all negative, implying the party incumbency advantage is smaller in those districts.

Table 8: Party Incumbency Effect with Competitive Dummy Variables

                                          Bandwidth Size
                          0.05        0.1         0.15        0.2

A. Second Ballot - Mean
Linear:
Treatment Indicator       0.00839     0.0121***   0.0109***   0.0108***
                          (1.71)      (3.96)      (4.26)      (4.66)
Treatment · Competitive   0.00693     0.00329     -0.000107   -0.00548
                          (1.26)      (1.00)      (-0.04)     (-1.76)
Observations              744         1398        1939        2397
Polynomial:
Treatment Indicator       0.00883     0.00986**   0.0138***   0.0147***
                          (1.30)      (2.22)      (3.67)      (4.45)
Treatment · Competitive   0.00686     0.00337     0.0000685   -0.00521
                          (1.23)      (1.04)      (0.02)      (-1.67)
Observations              744         1398        1939        2397

B. Flip Between Parties (7.4 percent)
Linear:
Treatment Indicator       0.00727     0.0131**    0.00923     0.00816
                          (0.82)      (2.35)      (1.89)      (1.85)
Treatment · Competitive   0.00404     -0.0000288  0.00192     0.000572
                          (0.44)      (-0.00)     (0.32)      (0.10)
Observations              744         1398        1939        2397
Polynomial:
Treatment Indicator       0.00787     0.0101      0.0120**    0.0116**
                          (0.79)      (1.49)      (2.13)      (2.23)
Treatment · Competitive   0.00414     0.00000307  0.00185     0.000246
                          (0.45)      (0.00)      (0.31)      (0.04)
Observations              744         1398        1939        2397

C. Average MOV (-0.080/0.058 percent)
Linear:
Treatment Indicator       0.0184***   0.0212***   0.0180***   0.0173***
                          (2.61)      (4.24)      (4.41)      (4.36)
Treatment · Competitive   -0.0108     -0.00962**  -0.00849**  -0.00991***
                          (-1.63)     (-2.14)     (-2.19)     (-2.59)
Observations              744         1398        1939        2397
Polynomial:
Treatment Indicator       0.0187**    0.0180***   0.0197***   0.0199***
                          (2.06)      (3.15)      (4.05)      (4.50)
Treatment · Competitive   -0.0108     -0.0105**   -0.00827**  -0.00973**
                          (-1.63)     (-2.32)     (-2.13)     (-2.55)
Observations              744         1398        1939        2397

* p < 0.1, ** p < 0.05, *** p < 0.01

Notes: Treatment indicator for incumbency status is interacted with dummy variables to indicate a historically competitive district under each definition. The noncompetitive or safe districts act as the reference group. All specifications include election and district level fixed effects. Standard errors are clustered at the district level and t-statistics are reported in parentheses.

It is challenging to characterize what makes a voting district politically competitive and therefore not a safe or guaranteed reelection for the incumbent party. From the various subgroups that I test, I find the presence of a discontinuity in politically competitive districts. However, non-competitive groups produce significant results too and suggest that the effect might be larger than when using the whole sample. The estimates of Tables 7 and 8 in this section all seem to indicate that a party incumbency effect for the CDU can be found, and is potentially quite large, in districts that can be considered safe for reelection, given the definitions that I tested. The results of the same specifications and grouping for the SPD in Appendix Table A8 seem to partially confirm these results.

6 Conclusions

In this paper, I take advantage of the first-past-the-post system on the first ballot and use an RDD to estimate the party incumbency effect in Germany at its highest political tier. I find the estimate of party incumbency advantage to be around 1.2%, measured by an increase in vote shares in the following election. This is broadly in line with the findings of previous research, in particular the 1.5-1.9% presented in Hainmueller and Kern (2008) and Ade et al. (2014).

I also investigate heterogeneity in these party incumbency effects. To my knowledge I am the first to take such an extensive historical perspective of incumbency advantage in this setting. I find the effect to be large in periods where the political system is most stable: specifically, before the 1980s, when a three-party system was in place, and again with the formation of the five-party system in the new millennium. The incumbency effect is smaller during the times of political transformation that characterized the end of the twentieth century. Specifically, the 1980s were a time of polarization and fragmentation of the party system. The 1990s then saw an increasing number of small parties in parliament and a polarization of the major parties, as a response to the merging of west and east German political structures. It seems that the lower treatment effects of the 1980s and 1990s might be attributed to the shift in political competition within the political structure.

Finally, I search for heterogeneity in the effect based on the long-term political competitiveness of the districts. Previous research led me to expect more evidence of a treatment effect in districts that are more unstable for reelection. I find that the party incumbency advantage clearly does exist in the districts that I consider to be viewed as safe by the incumbent CDU. In fact, some of the estimates are very large, over 2 percentage points for one of the definitions tested. It is not clear whether this is where the effect is concentrated or whether it is simply larger in this subset of districts. I do not find evidence of any particularly large effect in cases where the incumbent party would be expecting reelection to be uncertain. I take this as evidence against the view that the source of incumbency advantage is incumbents making use of officeholder benefits to boost their reelection chances.

A Appendix

A.1 Figures

Figure A.1: Linear and Quadratic Regressions (SPD)

[Figure: two panels, each plotting vote share at t+1 (0.3 to 0.55) against margin of victory at t (-0.2 to 0.2).]

Notes: This graph plots both the linear (left) and quadratic (right) estimation approach for a margin of 0.2 around the threshold, which is marked by the solid vertical line. Each point on the graphs represents average vote share in t + 1 in bins of 0.005 intervals of the running variable.

Figure A.2: CDU Data Across Entire Sample

Notes: This graph plots the margin of victory for the CDU in the election at time t against their vote share in the following election using the full sample. The plot is covered by a 0.95 confidence interval to describe the variance in the data.

Figure A.3: McCrary Density Test

[Figure: two panels. (a) CDU; (b) SPD.]

Notes: These graphs plot the empirical density of the running variable margin of victory at time t for the CDU (left) and the SPD (right). This is done according to McCrary (2008) to test for manipulation.

A.2 Tables

Table A1: Effect of Incumbency Status on Party Vote Shares in Election at t + 1 (SPD)

                                      Bandwidth Size
                      0.05        0.1         0.15        0.2

A. Linear
Treatment Indicator   0.0150***   0.0117***   0.0121***   0.00767***
                      (3.09)      (4.03)      (4.93)      (3.42)
Observations          753         1418        1970        2423

B. Polynomial
Treatment Indicator   0.0135      0.0125***   0.0125***   0.0136***
                      (1.83)      (2.97)      (3.65)      (4.39)
Observations          753         1418        1970        2423

C. Optimal Bandwidths
                      CCT         IK          CCT         IK
Treatment Indicator   0.0120***   0.0119***   0.0123***   0.0108***
                      (4.57)      (4.50)      (3.56)      (4.03)
Method                Linear      Linear      Polynomial  Polynomial
Calculated BW         0.12693     0.12645     0.14884     0.28906
Observations          1734        1730        1955        2990

* p < 0.1, ** p < 0.05, *** p < 0.01

Notes: The dependent variable is the vote share for the SPD candidate in the following election within the same district. All specifications include election and district level fixed effects. Standard errors are clustered at the district level and t-statistics are reported in parentheses. These specifications use only observations where the running variable, margin of victory for SPD at time t, is within the bandwidth specified. In panel C, these bandwidths are optimally calculated via two different data-driven methods, CCT and IK, and applied to both specifications using linear and quadratic functional form.

References
