
On the Specification of Local Models in a Global Vector Autoregression: A Comparison of Markov-Switching Alternatives


Academic year: 2021



Sebastian Andersson

Department of Statistics

Uppsala University

Supervisor: Rolf Larsson


Abstract

In this paper, focus is on the global vector autoregressive (GVAR) model. Its attractiveness stems from an ability to incorporate global interdependencies when modeling local economies. The model is based on a collection of local models, which in general are estimated as regular VAR models. This paper examines alternative specifications of the local models by estimating them as regime-switching VAR models, where transition probabilities between different states are studied using both constant and time-varying settings. The results show that regime-switching models are appealing as they yield inferences about the states of the economy, but these inferences are not guaranteed to be reasonable from an economic point of view. Furthermore, the global solution of the model is in some cases non-stationary when local models are switching. The conclusion is that the regime-switching alternatives, while theoretically reasonable, are sensitive to the exact specification used. At the same time, the issue of specifying the regime-switching models in such a way that they perform adequately speaks in favor of the simpler, yet functional, basic GVAR model.

Keywords: GVAR, VAR, time-varying, regime switching, macroeconometrics, global model, generalized impulse response


Contents

1 Introduction
2 Methodology
  2.1 Global VARs
    2.1.1 Local Models
    2.1.2 Solving the Global Model
  2.2 Modeling Switches in Regimes
    2.2.1 Description of Regime-Switches and Markov Chains
    2.2.2 Time-Varying Transition Probabilities
  2.3 Regime-Switches in the Global VAR Context
  2.4 Estimation
  2.5 Forecasting
  2.6 Impulse Response Analysis
3 Empirical Application
  3.1 General Setup
  3.2 Data
  3.3 Model Specifications
    3.3.1 TVTP-GVAR
    3.3.2 RS-GVAR
    3.3.3 AR-TVTP and AR-RS
    3.3.4 GVAR
    3.3.5 VAR and AR
  3.4 Regime Inference
  3.5 Impulse Response Analysis
  3.6 Eigenvalues
  3.7 Forecast Comparison
4 Discussion
  4.1 Further Research
5 Conclusion
References

1 Introduction

Vector autoregressive (VAR) models have been widely used in economic analyses since Sims (1980). Many extensions have been developed since then, providing a large class of models within the family of VARs. This paper is concerned with one of these extensions, namely the global VAR model.

The global VAR model (GVAR) is a fairly new concept, with defining contributions by Pesaran, Schuermann, and Weiner (2004) and Dees, di Mauro, Pesaran, and Smith (2007). The underlying idea is that the global nature of today’s society needs to be incorporated in some way when doing macroeconometric modeling. If anything, the financial turmoil of the last half decade has illustrated that there are very strong interdependencies among countries.1 As stated in Pesaran et al. (2004), applications of VAR models are often done for a single country at a time, with the possible inclusion of an exogenous indicator of some sort (creating what is often called a VARX model). That, however, comes with limitations, as the exogeneity of the exogenous variable means it cannot be included in an impulse response analysis. Furthermore, for forecasting purposes one must already have forecasts of the exogenous variable, since it, from the perspective of the model, is a deterministic process.

This global framework is based on a set of local models, in which foreign variables are weighted together to create an exogenous variable vector (often referred to as the foreign vector). The difference, in comparison with VARX models, is that the local models together create a global system. That is, while foreign variables are exogenous in the local models, they are endogenous to the system as a whole. Hence, it is possible to carry out impulse response analysis with rest-of-the-world shocks. It also allows for endogenous forecasting.

In Binder and Gross (2013), regime switches are added to the model. Essentially, what that means is that every local model is allowed to behave differently depending on what state (regime) the vector of endogenous variables (the local vector) is in. For example, a process with two possible states could be used to try to capture business cycle fluctuations such that one state is for “good times” and one is for “bad times”. The setup of the regimes in Binder and Gross (2013) is such that the probability of a local model going from one specific state to another only depends on the previous state. That is, at a certain point in time, the probability of going from a boom to a bust in the next period only depends on which of the two the current state is.

The objective of this paper is to compare a number of alternative specifications of the local model. It includes the previously mentioned regime-switching model, as well as an extended version where the transition probabilities also depend on the observable foreign vector; this type of model is known as regime-switching with time-varying transition probabilities. From a practitioner's point of view, it is a meaningful extension: it is reasonable to assume that the probability of Sweden being in the “bad times” state at time t + 1 depends not only on Sweden's state at t, but also on what is happening in the rest of the world at time t. The comparison also features the basic GVAR model, as well as the regime-switching models with a modified estimation approach.

1 While the GVAR model was initially developed for countries, it is by no means a necessary restriction. A [...]

The outline of the paper is as follows. Section 2 provides the necessary methodological background on global VAR models and regime switches, both as separate and combined entities. It introduces the regime-switching global VAR model with time-varying transition probabilities, as well as the concepts used in the comparison. Section 3 contains an empirical application in which the models are compared with respect to regime inference, impulse response analysis and forecasting. Section 4 discusses the results, and Section 5 concludes.

2 Methodology

2.1 Global VARs

The main idea behind the global VAR framework is to incorporate interlinkages between cross-sectional units in a viable way. Traditionally, what is often used in practice is the inclusion of a set of exogenous variables. In an economic application with a model for a small and open economy, this could, for instance, be the price of oil, U.S. GDP growth, or something else considered to be influencing the endogenous variables while not being influenced by them in return. While this historically might have constituted an adequate solution to incorporating global changes, it is often argued that it is unsatisfactory for the present situation.2

One alternative, which is readily available in a regular VAR model, is to simply make all variables endogenous right away. That is, however, not feasible. A model with 25 countries, four variables per country (such as GDP, interest rate, inflation and equity prices) and two lags would yield 100 equations with 200 parameters per equation, not counting deterministic terms.
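As a quick check of the dimensionality argument, the following sketch counts equations and slope coefficients for an unrestricted VAR, with one equation per endogenous variable. The function name is hypothetical and only illustrates the arithmetic.

```python
def unrestricted_var_size(n_units, vars_per_unit, lags):
    """Return (equations, slope parameters per equation) for a VAR
    in which every variable of every unit is endogenous."""
    k = n_units * vars_per_unit      # total number of endogenous variables
    return k, k * lags               # each equation holds k coefficients per lag

equations, per_equation = unrestricted_var_size(25, 4, 2)
print(equations, per_equation, equations * per_equation)
```

The total grows quadratically in the number of units, which is the motivation for the restricted local-model structure described next.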

2.1.1 Local Models

The following description of the setup of the basic model follows di Mauro and Pesaran (2013, chapter 2). Suppose that there are $N$ cross-sectional units, such that $i = 1, \ldots, N$. The units need not be countries, but can be other types of regions, banks, etc. The basic building blocks of the global model are called local models. There are $N$ local models, with one for each unit. Each of these local models is a VARX($p_i$, $q_i$) model, meaning that it is a VAR model with $p_i$ lags of the endogenous variables and $q_i$ lags of the set of exogenous variables. The local model for unit $i$ is:

\[
y_{i,t} = a_i + \sum_{j=1}^{p_i} \Phi_{i,j} y_{i,t-j} + \sum_{l=0}^{q_i} \Lambda_{i,l} x_{i,t-l} + u_{i,t}, \tag{2.1.1}
\]

where $y_{i,t}$ is a $k_i \times 1$ vector containing the endogenous variables, $x_{i,t}$ is a $k_i^* \times 1$ vector containing the foreign variables and $u_{i,t}$ is a vector of errors. The remaining terms $a_i$, $\Phi_{i,j}$ and $\Lambda_{i,l}$ are coefficient vectors and matrices of conformable sizes.

The first thing to note is the xi,t−l part of Equation (2.1.1) as this, more or less, is what defines the GVAR model. It is constructed as

\[
x_{i,t} = \sum_{j=1}^{N} w_{i,j} y_{j,t}, \tag{2.1.2}
\]

where there are three restrictions associated with the weights: first, $w_{i,i} = 0$; second, $\sum_{j=1}^{N} w_{i,j} = 1$; and third, $w_{i,j} \geq 0$. In other words, the non-negative weights sum to unity and the foreign variable vector for unit $i$ does not include $y_{i,t}$ itself.

The weights wi,j have in most cases been regarded as known constants obtained from, for example, direction of trade statistics. Recently, however, ways of estimating these just like any other parameter have been presented. Gross (2013) discusses how these can be estimated, and an application can be found in Gray, Gross, Paredes, and Sydow (2013). Estimating the weights is especially desirable in settings where no obvious weights exist (such as when banks constitute the cross-section). It is shown by Gross (2013) that pre-determined weights could potentially bias the results, but the idea of estimated weights is rather new and the majority of the literature does not employ any estimation methods. To comply with most of the previous research, the weights are assumed to be fixed.

Before the global model can be solved, the local models first need to be estimated individually. The local models are estimated by ordinary least squares, as if they were regular VAR models. The parameter vectors and matrices for unit $i$ are collected in the matrix $\Gamma_i$, which is equal to

\[
\Gamma_i = \begin{pmatrix} a_i & \Phi_{i,1} & \cdots & \Phi_{i,p_i} & \Lambda_{i,0} & \cdots & \Lambda_{i,q_i} \end{pmatrix}.
\]

By also stacking all the observations in $\tilde{y}_{i,t}$ as follows:

\[
\tilde{y}_{i,t}' = \begin{pmatrix} 1 & y_{i,t-1}' & \cdots & y_{i,t-p_i}' & x_{i,t}' & \cdots & x_{i,t-q_i}' \end{pmatrix},
\]

the local model in Equation (2.1.1) can be written as

\[
y_{i,t} = \Gamma_i \tilde{y}_{i,t} + u_{i,t}. \tag{2.1.3}
\]

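From Equation (2.1.3), the least squares estimator of $\Gamma_i$ can subsequently be shown to be

\[
\hat{\Gamma}_i = \begin{pmatrix} \hat{a}_i & \hat{\Phi}_{i,1} & \cdots & \hat{\Phi}_{i,p_i} & \hat{\Lambda}_{i,0} & \cdots & \hat{\Lambda}_{i,q_i} \end{pmatrix} = \left( \sum_{t=1}^{T} y_{i,t} \tilde{y}_{i,t}' \right) \left( \sum_{t=1}^{T} \tilde{y}_{i,t} \tilde{y}_{i,t}' \right)^{-1}. \tag{2.1.4}
\]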
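The least squares step can be illustrated with a minimal NumPy sketch. The function name `estimate_local_varx`, the simulated data and all coefficient values are assumptions made for illustration; they are not taken from the paper's application.

```python
import numpy as np

def estimate_local_varx(y, x, p, q):
    """OLS estimate of Gamma_i = [a, Phi_1..Phi_p, Lambda_0..Lambda_q].

    y is (T, k) domestic data, x is (T, k_star) foreign data.
    """
    start = max(p, q)                      # first t with all lags available
    rows = []
    for t in range(start, y.shape[0]):
        lagged_y = [y[t - j] for j in range(1, p + 1)]
        foreign = [x[t - l] for l in range(0, q + 1)]
        rows.append(np.concatenate([[1.0], *lagged_y, *foreign]))
    Z = np.array(rows)                     # each row is the stacked regressor vector
    Y = y[start:]
    # Least squares solves Z @ Gamma' ~ Y, which is equivalent to (2.1.4)
    gamma = np.linalg.lstsq(Z, Y, rcond=None)[0].T
    return gamma, Y - Z @ gamma.T

# Simulated check with hypothetical coefficients (k = 2, k_star = 1, p = q = 1)
rng = np.random.default_rng(0)
T = 5000
a = np.array([0.1, -0.2])
Phi = np.array([[0.5, 0.1], [0.0, 0.3]])
Lam0 = np.array([[0.2], [-0.4]])
Lam1 = np.array([[0.1], [0.3]])
x = rng.standard_normal((T, 1))
y = np.zeros((T, 2))
for t in range(1, T):
    noise = 0.1 * rng.standard_normal(2)
    y[t] = a + Phi @ y[t - 1] + Lam0 @ x[t] + Lam1 @ x[t - 1] + noise
gamma_true = np.hstack([a[:, None], Phi, Lam0, Lam1])
gamma_hat, resid = estimate_local_varx(y, x, p=1, q=1)
print(np.round(gamma_hat - gamma_true, 3))
```

Using `np.linalg.lstsq` rather than forming the inverse in (2.1.4) explicitly gives the same estimate and is numerically more stable.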

2.1.2 Solving the Global Model

Following the country-by-country estimation, the global model can be solved. By creating a global variable vector, collecting all of the entire system's $k = \sum_{i=1}^{N} k_i$ endogenous variables, a global solution can be reached based on local estimates from Equation (2.1.4).

The first step is to rewrite the local model in Equation (2.1.1) as a function of the global variable vector. To this end, define $z_{i,t} = \begin{pmatrix} y_{i,t}' & x_{i,t}' \end{pmatrix}'$. Without any loss of generality, it may be assumed that the lag length of the domestic vector is equal to the lag length of the foreign vector, i.e. that $p_i = q_i$, to make the notation more comprehensible. Notice that this allows for a more compact form of Equation (2.1.1):

\[
A_{i,0} z_{i,t} = a_i + \sum_{j=1}^{p_i} A_{i,j} z_{i,t-j} + u_{i,t}, \tag{2.1.5}
\]

where

\[
A_{i,0} = \begin{pmatrix} I & -\Lambda_{i,0} \end{pmatrix}, \qquad A_{i,j} = \begin{pmatrix} \Phi_{i,j} & \Lambda_{i,j} \end{pmatrix}. \tag{2.1.6}
\]

Before the model in Equation (2.1.5) is re-formulated in terms of the global vector, the global vector must be defined. This is done by stacking the local vectors:

\[
y_t = \begin{pmatrix} y_{1,t} \\ y_{2,t} \\ \vdots \\ y_{N,t} \end{pmatrix}. \tag{2.1.7}
\]

The resulting vector $y_t$ in (2.1.7) is of size $k \times 1$, where $k = \sum_{i=1}^{N} k_i$. Using a link matrix $W_i$, the local models' vectors can easily be constructed from the global vector in (2.1.7). The link matrix, in which the previously discussed weights come into play, requires the following definition. Let:

\[
W_i = \begin{pmatrix} 0_{k_i \times \sum_{j=1}^{i-1} k_j} & I_{k_i \times k_i} & 0_{k_i \times (k - \sum_{j=1}^{i} k_j)} \\ w_{i,1} \bar{I}_{k_i^* \times k_1} & \cdots & w_{i,N} \bar{I}_{k_i^* \times k_N} \end{pmatrix}, \tag{2.1.8}
\]

where $\bar{I}_{n \times m}$ is a possibly non-square zero matrix with top left square being the identity matrix.3 Essentially, the upper part of the link matrix $W_i$ consists of zeros and an identity matrix which moves to the right for increasing $i$, while the lower part contains the weights associated with unit $i$.

The local models’ vectors can be constructed from the global vector with the aid of the link matrix:

\[
z_{i,t} = W_i y_t. \tag{2.1.9}
\]
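A small sketch of how a link matrix maps the global vector to a local vector is given below. It makes the simplifying assumption, not required by the text, that every unit has the same $k$ variables in the same order and that the foreign vector also has $k$ elements, so that the link matrix can be written with Kronecker products. The function name and the weights are hypothetical.

```python
import numpy as np

def link_matrix(i, weights, k):
    """Build W_i for unit i under the assumption k_i = k_i^* = k for all units."""
    n_units = weights.shape[0]
    pick = np.zeros((1, n_units))
    pick[0, i] = 1.0
    upper = np.kron(pick, np.eye(k))               # selects y_{i,t}
    lower = np.kron(weights[i:i + 1], np.eye(k))   # weighted foreign variables
    return np.vstack([upper, lower])

# Hypothetical weights: zero diagonal, non-negative, rows summing to one
weights = np.array([[0.0, 0.6, 0.4],
                    [0.5, 0.0, 0.5],
                    [0.3, 0.7, 0.0]])
k = 2
y_global = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # stacked y_1, y_2, y_3
W0 = link_matrix(0, weights, k)
z0 = W0 @ y_global                                    # (y_1', x_1')'
print(W0.shape, z0)
```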

By substituting (2.1.9) into the model in (2.1.5), the model for unit $i$ can be written as

\[
A_{i,0} W_i y_t = a_i + \sum_{j=1}^{p_i} A_{i,j} W_i y_{t-j} + u_{i,t}, \tag{2.1.10}
\]

where the models corresponding to each unit in (2.1.10) can be stacked:

\[
\begin{pmatrix} A_{1,0} W_1 y_t \\ A_{2,0} W_2 y_t \\ \vdots \\ A_{N,0} W_N y_t \end{pmatrix} = \begin{pmatrix} a_1 + \sum_{j=1}^{p_1} A_{1,j} W_1 y_{t-j} + u_{1,t} \\ a_2 + \sum_{j=1}^{p_2} A_{2,j} W_2 y_{t-j} + u_{2,t} \\ \vdots \\ a_N + \sum_{j=1}^{p_N} A_{N,j} W_N y_{t-j} + u_{N,t} \end{pmatrix}. \tag{2.1.11}
\]

By letting all units' lag lengths be equal to $p$ for notational purposes4, the results are unaffected, but it allows for expressing (2.1.11) as

\[
G_0 y_t = a + \sum_{j=1}^{p} G_j y_{t-j} + u_t, \tag{2.1.12}
\]

where

\[
G_j = \begin{pmatrix} A_{1,j} W_1 \\ A_{2,j} W_2 \\ \vdots \\ A_{N,j} W_N \end{pmatrix}, \qquad a = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_N \end{pmatrix}, \qquad u_t = \begin{pmatrix} u_{1,t} \\ u_{2,t} \\ \vdots \\ u_{N,t} \end{pmatrix}.
\]

3 If $n = m$, it is the identity matrix. If $n \neq m$, then the $\min(n, m) \times \min(n, m)$ square at the top left of $\bar{I}_{n \times m}$ will be the identity matrix of dimension $\min(n, m)$, with all other elements equal to 0. Element $(i, j)$ is

\[
\bar{I}_{n \times m}(i, j) = \begin{cases} 1, & \text{if } i = j \\ 0, & \text{otherwise.} \end{cases}
\]

4 The lag restrictions $p_i = q_i = p$ imposed along the way are not restrictive, as stated in the text. If $p$ instead denotes the maximum number of lags of all local models, then all models can in fact be thought of as having $p$ lags, but with zero-restrictions on the lags greater than $p_i$. It is a simplification made to make the notation more accessible, but is in the end not of importance for the results.

To obtain the reduced form of the global model, both sides in (2.1.12) are pre-multiplied by $G_0^{-1}$.5 This yields the final form of the model:

\[
y_t = b + \sum_{j=1}^{p} F_j y_{t-j} + e_t, \tag{2.1.13}
\]

where $F_j = G_0^{-1} G_j$, $b = G_0^{-1} a$ and $e_t = G_0^{-1} u_t$.

All of the $W_i$ matrices are predetermined and thus known. The local models are estimated, so estimates for the $A_{i,j}$ matrices exist following (2.1.6). Therefore, obtaining the estimates of the parameters in (2.1.13) is just a matter of multiplication.
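The stacking and inversion steps can be sketched as follows for the one-lag case. This is an illustration under simplifying assumptions (every unit has the same number of variables $k$, and the foreign vector also has $k$ elements); the function names and the randomly drawn coefficients are hypothetical.

```python
import numpy as np

def link_matrix(i, weights, k):
    """W_i under the assumption that all units share the same k variables."""
    pick = np.zeros((1, weights.shape[0]))
    pick[0, i] = 1.0
    return np.vstack([np.kron(pick, np.eye(k)),
                      np.kron(weights[i:i + 1], np.eye(k))])

def solve_gvar(a, Phi, Lam0, Lam1, weights):
    """Stack local one-lag models into G0 y_t = a + G1 y_{t-1} + u_t
    and return the reduced-form intercept b and coefficient matrix F."""
    n_units, k = len(a), len(a[0])
    G0, G1 = [], []
    for i in range(n_units):
        W = link_matrix(i, weights, k)
        G0.append(np.hstack([np.eye(k), -Lam0[i]]) @ W)   # A_{i,0} W_i
        G1.append(np.hstack([Phi[i], Lam1[i]]) @ W)       # A_{i,1} W_i
    G0, G1 = np.vstack(G0), np.vstack(G1)
    b = np.linalg.solve(G0, np.concatenate(a))            # G0^{-1} a
    F = np.linalg.solve(G0, G1)                           # G0^{-1} G1
    return b, F

rng = np.random.default_rng(1)
N, k = 3, 2
weights = np.array([[0.0, 0.6, 0.4],
                    [0.5, 0.0, 0.5],
                    [0.3, 0.7, 0.0]])
a = [0.1 * rng.standard_normal(k) for _ in range(N)]
Phi = [0.2 * rng.standard_normal((k, k)) for _ in range(N)]
Lam0 = [0.2 * rng.standard_normal((k, k)) for _ in range(N)]
Lam1 = [0.2 * rng.standard_normal((k, k)) for _ in range(N)]
b, F = solve_gvar(a, Phi, Lam0, Lam1, weights)
print(np.abs(np.linalg.eigvals(F)).max())   # largest eigenvalue modulus of F
```

The largest eigenvalue modulus printed at the end is a convenient stability check on the solved system: values below one indicate a stationary global model.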

2.2 Modeling Switches in Regimes

2.2.1 Description of Regime-Switches and Markov Chains

To model regime switches, Markov chains are usually employed. Markov chains are stochastic processes in which future and past states are independent conditional on the present state. For a more formal definition, let $\{s_t\}_{t=1}^{T}$ be a sequence of random variables where, for each $t$, $s_t$ can take on values in the set $S = \{1, 2, \ldots, R\}$. Furthermore, if this stochastic process satisfies the Markov property

\[
P\{s_t = j \mid s_{t-1} = i, s_{t-2} = k, \ldots\} = P\{s_t = j \mid s_{t-1} = i\} = p_{ij},
\]

then it can be described as an $R$-state Markov chain, with its state space being $S$. The probability $p_{ij}$ gives the probability of the process going to state $j$, given that it is in state $i$. Since it must go to some state, it follows that $\sum_{j=1}^{R} p_{ij} = 1$. These probabilities are known as transition probabilities, and they are collected in a transition matrix $P$ such that

\[
P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1R} \\ p_{21} & p_{22} & \cdots & p_{2R} \\ \vdots & \vdots & \ddots & \vdots \\ p_{R1} & p_{R2} & \cdots & p_{RR} \end{pmatrix}, \tag{2.2.1}
\]

where one should note that the rows must sum to unity. To explain the method more closely, suppose that $y_t$ is a vector of endogenous and $x_t$ a vector of exogenous variables. Let $Y_t$ and $X_t$ denote the history up to, and including, time $t$ such that $Y_t = \begin{pmatrix} y_t' & \cdots & y_1' \end{pmatrix}'$ and $X_t = \begin{pmatrix} x_t' & \cdots & x_1' \end{pmatrix}'$. It is, for the moment, assumed that the state process is independent of past observations of $y_t$ and both current and past observations of $x_t$, such that

\[
P\{s_t = j \mid s_{t-1} = i, s_{t-2} = k, \ldots, Y_{t-1}, X_t\} = P\{s_t = j \mid s_{t-1} = i\} = p_{ij}. \tag{2.2.2}
\]

In addition, suppose that the model parameters $\Gamma$ are known constants. Assume that, at time $t$, the dependent vector is characterized by being in state $i$ such that its conditional density is

\[
f(y_t \mid s_t = i, Y_{t-1}, X_t; \Gamma). \tag{2.2.3}
\]

If there are $R$ regimes, there are $R$ conditional densities of this form. All of these are collected in a vector defined as

\[
\eta_t = \begin{pmatrix} f(y_t \mid s_t = 1, Y_{t-1}, X_t; \Gamma) \\ f(y_t \mid s_t = 2, Y_{t-1}, X_t; \Gamma) \\ \vdots \\ f(y_t \mid s_t = R, Y_{t-1}, X_t; \Gamma) \end{pmatrix}. \tag{2.2.4}
\]

5 G [...]

Suppose now that, in addition to the model parameters in $\Gamma$ being known, the probabilities in (2.2.1) are known with certainty as well. Let all of these parameters be collected in $\lambda$. Despite $\lambda$ being known, one cannot be certain what state governed the process at any point in time. Thus, one needs to turn the problem around and study the probabilities of the different states given the observed outcome. Denote this conditional probability by $P\{s_t = i \mid Y_t, X_t; \lambda\}$, where $i \in S$. Collect these in the $R \times 1$ vector $\hat{\xi}_{t|t}$, known as filtered probabilities. Closely related is the state forecast vector, $\hat{\xi}_{t+1|t}$, which contains the probabilities of the different states at $t + 1$ given observations up to, and including, time $t$. Thus, its elements describe $P\{s_{t+1} = i \mid Y_t, X_t; \lambda\}$.

As suggested by Hamilton (1994), the optimal inference for each time period $t$ involves iteration over the two equations:

\[
\hat{\xi}_{t|t} = \frac{\hat{\xi}_{t|t-1} \odot \eta_t}{\mathbf{1}'(\hat{\xi}_{t|t-1} \odot \eta_t)} \tag{2.2.5}
\]

\[
\hat{\xi}_{t+1|t} = P' \hat{\xi}_{t|t}, \tag{2.2.6}
\]

where $\odot$ means element-wise multiplication, $\mathbf{1}$ is an $R \times 1$ vector of ones, $P$ is the transition matrix from (2.2.1), and $\eta_t$ is the vector of conditional densities defined in (2.2.4). The iteration requires a starting value, $\hat{\xi}_{1|0}$, for which one can, for example, use the naïve option of $R^{-1}\mathbf{1}$, where $\mathbf{1}$ is $R \times 1$.
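The filter recursion can be sketched in a few lines of NumPy. The function name, the array layout (one row of conditional densities per period) and the numerical values are assumptions made for illustration; the rows of the transition matrix sum to one, as in (2.2.1).

```python
import numpy as np

def hamilton_filter(eta, P, xi_init=None):
    """Filtered and one-step-ahead state probabilities.

    eta: (T, R) conditional densities, one row per period.
    P:   (R, R) transition matrix whose rows sum to one.
    Returns filtered[t] = xi_{t|t} and predicted_next[t] = xi_{t+1|t}.
    """
    T, R = eta.shape
    xi_pred = np.full(R, 1.0 / R) if xi_init is None else np.asarray(xi_init, float)
    filtered = np.empty((T, R))
    predicted_next = np.empty((T, R))
    for t in range(T):
        joint = xi_pred * eta[t]                 # element-wise product
        filtered[t] = joint / joint.sum()        # update step (2.2.5)
        xi_pred = P.T @ filtered[t]              # prediction step (2.2.6)
        predicted_next[t] = xi_pred
    return filtered, predicted_next

# Hypothetical two-state example
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
eta = np.array([[0.5, 0.1],
                [0.4, 0.2],
                [0.1, 0.6],
                [0.05, 0.7]])
filtered, predicted_next = hamilton_filter(eta, P)
print(np.round(filtered, 3))
```

In practice the rows of `eta` would be regime-specific Gaussian densities evaluated at the observed data.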

Building on a procedure outlined in Kim (1994), a last step entails smoothing the probabilities by making use of all the information known and iterating backwards. The iterating equation is

\[
\hat{\xi}_{t|T} = \hat{\xi}_{t|t} \odot \left\{ P \left[ \hat{\xi}_{t+1|T} \oslash \hat{\xi}_{t+1|t} \right] \right\} \tag{2.2.7}
\]

for $t = 1, 2, \ldots, T - 1$, where $\oslash$ denotes element-wise division. The iteration is started with $\hat{\xi}_{T|T}$ taken from (2.2.5) with $t = T$. These probabilities, given all information the sample entails, are known as the smoothed probabilities and they constitute the optimal inference for the unobserved states.
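A minimal sketch of the backward recursion is shown below. It assumes filtered probabilities and one-step-ahead state forecasts are already available; here they are produced by a short inline loop on hypothetical densities. Function and variable names are illustrative.

```python
import numpy as np

def kim_smoother(filtered, predicted_next, P):
    """Backward smoothing of state probabilities.

    filtered[t]       = xi_{t|t}
    predicted_next[t] = xi_{t+1|t}
    P has rows summing to one.
    """
    T = filtered.shape[0]
    smoothed = np.empty_like(filtered)
    smoothed[-1] = filtered[-1]                        # start at xi_{T|T}
    for t in range(T - 2, -1, -1):                     # iterate backwards
        ratio = smoothed[t + 1] / predicted_next[t]    # element-wise division
        smoothed[t] = filtered[t] * (P @ ratio)
    return smoothed

# Hypothetical inputs generated by a forward filtering pass
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
eta = np.array([[0.5, 0.1],
                [0.4, 0.2],
                [0.1, 0.6],
                [0.05, 0.7]])
xi = np.array([0.5, 0.5])
filtered, predicted_next = [], []
for dens in eta:
    f = xi * dens / (xi * dens).sum()
    xi = P.T @ f
    filtered.append(f)
    predicted_next.append(xi)
filtered = np.array(filtered)
predicted_next = np.array(predicted_next)
smoothed = kim_smoother(filtered, predicted_next, P)
print(np.round(smoothed, 3))
```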

2.2.2 Time-Varying Transition Probabilities

The Markov-switching approach to modeling regimes can be, and has been, extended in a number of ways. One such extension of particular interest for a global VAR model is time-varying transition probabilities, as discussed in e.g. Diebold, Lee, and Weinbach (1994) and Krolzig (1997). In the traditional model, often referred to as the “Hamilton model”, the transition probabilities are constant over time. Allowing for time variation implies a different structure of the conditional probabilities of the state process. For a regular first-order, time-homogeneous Markov process, the elements of the transition matrix $P$ are described by $P\{s_t = j \mid s_{t-1} = i\}$. When $P$ varies over time, the transitions may be set to depend on some observable variable $x$ in addition to the previous state:

\[
P_t(i, j) = P\{s_t = j \mid s_{t-1} = i, x_{t-1}\}. \tag{2.2.8}
\]

The influencing variable $x$ can be either a scalar or a vector. A special case occurs when it is equal to the lagged endogenous variable; this model is said to have a feature known as endogenous switching.

To parametrize the dependence on $x_{t-1}$, one can choose any function mapping to the open interval (0, 1). Following Gray (1996), a probit specification is employed in this paper. The analysis here is constrained to the two-state case, but it is possible to extend it to more states. The motivation is that the two-state case is what is found in much of the literature6, but also that it is less complicated, providing a more comprehensible introduction for the uninitiated reader. Additionally, a two-state model is preferable to one with three (or more) states from a computational perspective, which is not to be neglected considering that the GVAR model is constructed as a collection of models requiring substantial computational power.

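With a probit specification, the $2 \times 2$ matrix $P_t$ is

\[
P_t = \begin{pmatrix} p_{1t} & 1 - p_{1t} \\ 1 - p_{2t} & p_{2t} \end{pmatrix} = \begin{pmatrix} \Phi(x_{t-1}'\beta_1) & 1 - \Phi(x_{t-1}'\beta_1) \\ 1 - \Phi(x_{t-1}'\beta_2) & \Phi(x_{t-1}'\beta_2) \end{pmatrix}, \tag{2.2.9}
\]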
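To make the probit mapping concrete, the sketch below builds a two-state transition matrix from a lagged covariate, with the standard normal distribution function computed from `math.erf`. The function names and the coefficient values are hypothetical; the intercept is handled by prepending a one to the covariate vector.

```python
import math
import numpy as np

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tvtp_matrix(x_lag, beta1, beta2):
    """Two-state transition matrix with probit staying probabilities.

    x_lag: covariates dated t-1 (without the constant).
    beta1, beta2: coefficient vectors including the intercept first.
    """
    x = np.concatenate([[1.0], np.atleast_1d(x_lag)])   # add unity
    p1 = norm_cdf(float(x @ beta1))                     # stay in state 1
    p2 = norm_cdf(float(x @ beta2))                     # stay in state 2
    return np.array([[p1, 1.0 - p1],
                     [1.0 - p2, p2]])

# Hypothetical coefficients and covariate value
P_t = tvtp_matrix(np.array([0.5]),
                  beta1=np.array([1.0, 0.8]),
                  beta2=np.array([0.5, -0.6]))
print(np.round(P_t, 4))
```

Setting the slope coefficients to zero recovers the constant-probability model as a special case.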

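where $x_{t-1}$ now also includes unity. Inevitably, this time dependence changes the equations that yield filtered and smoothed probabilities. For the filtered probabilities, the equations are

\[
\hat{\xi}_{t|t} = \frac{\hat{\xi}_{t|t-1} \odot \eta_t}{\mathbf{1}'(\hat{\xi}_{t|t-1} \odot \eta_t)} \tag{2.2.10}
\]

\[
\hat{\xi}_{t+1|t} = P_t' \hat{\xi}_{t|t}, \tag{2.2.11}
\]

with $P_t$ defined as in Equation (2.2.9) and $\eta_t$ being

\[
\eta_t = \begin{pmatrix} f(y_t \mid s_t = 1, X_t, Y_{t-1}; \Gamma) \\ f(y_t \mid s_t = 2, X_t, Y_{t-1}; \Gamma) \end{pmatrix}. \tag{2.2.12}
\]

The smoothed probabilities are obtained by iterating backwards over the equation

\[
\hat{\xi}_{t|T} = \hat{\xi}_{t|t} \odot \left\{ P_t \left[ \hat{\xi}_{t+1|T} \oslash \hat{\xi}_{t+1|t} \right] \right\}. \tag{2.2.13}
\]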

2.3 Regime-Switches in the Global VAR Context

The notion of regime switches has been added to the GVAR model by Binder and Gross (2013), in which the basic idea is the same as in the previous section. The models may be estimated on a country-by-country basis following a combination of the procedures in the previous sections.

Including regime switches means that the coefficient vectors and matrices in Section 2.1 depend on the prevailing regime, such that Equation (2.1.3) may be written as

\[
y_{i,t} = \Gamma_{i,s_{i,t}} \tilde{y}_{i,t} + u_{i,t}, \tag{2.3.1}
\]

where $s_{i,t}$ is the state for country $i$ at time $t$. Allowing for different numbers of regimes in each country, the country-specific state spaces are defined as $S_i = \{1, 2, \ldots, R_i\}$. Equation (2.3.1) implies that the model in Equation (2.1.10) needs to be reformulated as

\[
A_{i,0,s_{i,t}} W_i y_t = a_{i,s_{i,t}} + \sum_{j=1}^{p_i} A_{i,j,s_{i,t}} W_i y_{t-j} + u_{i,t}. \tag{2.3.2}
\]

Subsequently, the solution to the global model is state-dependent such that the regime-switching analogue of the model in Equation (2.1.12) is

\[
G_{0,S_t} y_t = a_{S_t} + \sum_{j=1}^{p} G_{j,S_t} y_{t-j} + u_t, \tag{2.3.3}
\]

where $S_t$ comprises all regimes at time $t$, i.e. $S_t = \{s_{1,t}, s_{2,t}, \ldots, s_{N,t}\}$. The reduced form thus naturally depends on the current regime constellation, meaning that Equation (2.1.13) is written as

\[
y_t = b_{S_t} + \sum_{j=1}^{p} F_{j,S_t} y_{t-j} + e_{t,S_t}, \tag{2.3.4}
\]

where $F_{j,S_t} = G_{0,S_t}^{-1} G_{j,S_t}$, $b_{S_t} = G_{0,S_t}^{-1} a_{S_t}$ and $e_{t,S_t} = G_{0,S_t}^{-1} u_t$.

2.4 Estimation

To estimate the parameters, the common choice is maximum likelihood estimation. The estimation is mostly carried out either through straightforward maximization using general-purpose optimization routines, or through the use of the Expectation-Maximization (EM) algorithm.7 Recently, Perlin (2010) released a MATLAB package for estimation of Markov-switching VAR models using built-in optimization routines. This package has subsequently been extended by Ding (2012) to also include the possibility of time-varying transition probabilities, a feature not present in the original package.

The time-varying estimation procedure in Ding (2012) is based on GAUSS code by Perez-Quiros and Timmermann (2000), and thus follows the specifications therein. The objective function in the maximization problem is

\[
L(\lambda) = f(Y_T \mid X_T; \lambda) = \prod_{t=1}^{T} f(y_t \mid Y_{t-1}, X_t; \lambda), \tag{2.4.1}
\]

where $\lambda$ contains all unknown parameters (model coefficients, transition parameters and the covariance matrix). It is, as is often the case, more convenient to maximize the log likelihood:

\[
\log L(\lambda) = \sum_{t=1}^{T} \log f(y_t \mid Y_{t-1}, X_t; \lambda). \tag{2.4.2}
\]

However, by Bayes' theorem and the law of total probability, (2.4.2) is

\[
\log L(\lambda) = \sum_{t=1}^{T} \log \left( \sum_{i=1}^{2} f(y_t \mid s_t = i, Y_{t-1}, X_t; \lambda) P\{s_t = i \mid Y_{t-1}, X_t; \lambda\} \right). \tag{2.4.3}
\]

If the sequence of states $\{s_t\}$ were observed, maximization would be greatly simplified. Since it is not, one can use the filtered probabilities described earlier. Notice that the first factor inside the inner sum of (2.4.3) is row $i$ of (2.2.12), and that the second is the one-step ahead forecast from (2.2.11). Thus, (2.4.3) is

\[
\log L(\lambda) = \sum_{t=1}^{T} \log \left( \mathbf{1}'(\hat{\xi}_{t|t-1} \odot \eta_t) \right). \tag{2.4.4}
\]

7 Gray (1996) and Diebold et al. (1994) include examples of the former and the latter, respectively. Bayesian estimation methods also provide an alternative, as discussed by Krolzig (1997).

By repeated computation of the filtered probabilities and maximization of the log-likelihood, maximum likelihood estimates of the model parameters and covariance matrix as well as the coefficients in the transition matrix can be obtained.
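The way the filter delivers the log-likelihood can be sketched for a deliberately simple case: a univariate two-state model in which only the mean switches and the transition probabilities are constant. This is a stand-in chosen for brevity, not the VAR specification used in the paper; all names and parameter values are hypothetical. In an estimation, a function like this would be handed to a numerical optimizer.

```python
import math
import numpy as np

def log_likelihood(y, mu, sigma, P):
    """Log-likelihood of a two-state switching-mean Gaussian model,
    accumulated from the filter's normalizing constants."""
    mu = np.asarray(mu, float)
    xi_pred = np.full(2, 0.5)                      # naive starting value
    total = 0.0
    for obs in y:
        # Regime-specific Gaussian densities (the vector eta_t)
        eta = np.exp(-0.5 * ((obs - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
        joint = xi_pred * eta
        total += math.log(joint.sum())             # contribution of period t
        xi_pred = P.T @ (joint / joint.sum())      # forecast for next period
    return total

# Simulated data from hypothetical parameters
rng = np.random.default_rng(0)
P = np.array([[0.95, 0.05],
              [0.10, 0.90]])
s = np.zeros(500, dtype=int)
for t in range(1, 500):
    s[t] = rng.choice(2, p=P[s[t - 1]])
y = np.where(s == 0, 1.0, -1.0) + 0.5 * rng.standard_normal(500)

ll_true = log_likelihood(y, mu=[1.0, -1.0], sigma=0.5, P=P)
ll_wrong = log_likelihood(y, mu=[5.0, -5.0], sigma=0.5, P=P)
print(ll_true > ll_wrong)
```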

2.5 Forecasting

The dependence on a specific constellation of regimes complicates forecasting somewhat, thus requiring some care. To overcome the regime dependence, one can weigh the state-specific coefficient matrices together by using the smoothed state probabilities and the transition probabilities. For a one-step-ahead forecast at time $T$, the weighting is given by $\Xi_i = P_i' \hat{\xi}_{i,T|T}$, which with two states is $2 \times 1$. Element $m$ is the probability of state $m$ governing at $T + 1$. The weighting of coefficient matrices is thus:

\[
\tilde{A}_{i,j} = \sum_{m=1}^{2} A_{i,j,s_i = m} \, \Xi_i(m). \tag{2.5.1}
\]

The rationale for using this is that the smoothed probability, $\hat{\xi}_{i,T|T}$, provides an estimate of the probabilities of each regime at time $T$. Multiplying this by $P_i'$ yields a forecast of regimes at $T + 1$, which is used to weigh the coefficient matrices together for the forecast. At every forecasting step, the coefficient matrices need to be reweighted as the multiplication by the transition matrix changes the probabilities at each step. The procedure is thus to weigh all local models' coefficient matrices together using (2.5.1) and solve the global model using these weighted matrices. One then makes forecasts one step ahead, and applies the weighting of foreign variables to obtain the exogenous variables for each local model. Following that, all coefficient matrices are reweighted by first multiplying the previous $\Xi_i$ with $P_i'$. The global model is then solved again and a new forecast is made using the previous ones in place of the lagged terms in the model. This procedure is repeated until the end of the forecast horizon is reached.
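The reweighting across forecast steps can be illustrated for a single local model with constant transition probabilities. The sketch covers only the probability propagation and the weighting of coefficient matrices; solving the global model at each step is omitted. The function name and all numerical values are hypothetical.

```python
import numpy as np

def weighted_coefficients(A_by_state, xi_T, P, horizon):
    """Probability-weighted coefficient matrices for steps T+1, ..., T+horizon.

    A_by_state: list with one coefficient matrix per regime.
    xi_T:       regime probabilities at time T (smoothed).
    P:          transition matrix with rows summing to one.
    """
    xi = np.asarray(xi_T, float)
    weighted = []
    for _ in range(horizon):
        xi = P.T @ xi                                   # regime forecast for the next step
        weighted.append(sum(w * A for w, A in zip(xi, A_by_state)))
    return weighted

# Hypothetical two-regime example
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
A_by_state = [1.0 * np.eye(2), 3.0 * np.eye(2)]
steps = weighted_coefficients(A_by_state, xi_T=[1.0, 0.0], P=P, horizon=2)
print(steps[0][0, 0], steps[1][0, 0])
```

With time-varying transition probabilities, the same loop would apply with `P` replaced by a matrix recomputed at each step from the forecasted covariates.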

Forecasting for a Markov-switching VAR model with time-varying transition probabilities is in spirit the same as for the model with constant probabilities. The smoothed probability, $\hat{\xi}_{T|T}$, of the state constellation at the end of the estimation period is multiplied by $P_{i,T+1}'$ to create regime probabilities for time $T + 1$. The weights for the different states' coefficient matrices are used to create the weighted coefficient matrix $\tilde{A}_{i,j}$, which is done for each local model. It is then possible to solve the global model and forecast the global vector at $T + 1$. With the forecasted values, one can then compute estimates of all exogenous variables in the local models as well as forecasted values for the variables in the transition probability equations. This procedure is repeated until a desired number of time periods have been forecasted.

Forecasting with the global VAR model without regime switches is done in the usual way. Consider the global model in Equation (2.1.13). With one lag, this model has the following forecasting equation for a prediction h steps ahead:

\[
\hat{y}_{t+h} = \sum_{\tau=0}^{h-1} F^{\tau} b + F^{h} y_t. \tag{2.5.2}
\]
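As a check on the forecasting equation, the sketch below evaluates the closed-form expression and compares it with iterating the one-lag model forward. The function name and the numerical values of $b$, $F$ and $y_t$ are hypothetical.

```python
import numpy as np

def forecast_closed_form(b, F, y_t, h):
    """h-step forecast of y = b + F y_{-1} + e using the closed form (2.5.2)."""
    acc = np.zeros(len(b))
    power = np.eye(len(b))          # holds F^tau, starting at F^0
    for _ in range(h):
        acc += power @ b            # accumulate F^tau b for tau = 0..h-1
        power = F @ power
    return acc + power @ y_t        # power is now F^h

# Hypothetical values
b = np.array([0.1, -0.2])
F = np.array([[0.5, 0.1],
              [0.0, 0.3]])
y_t = np.array([1.0, 2.0])

h = 4
closed = forecast_closed_form(b, F, y_t, h)
iterated = y_t.copy()
for _ in range(h):
    iterated = b + F @ iterated     # recursive one-step forecasts
print(np.allclose(closed, iterated))
```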

2.6 Impulse Response Analysis

To perform impulse response analysis for a global Markov-switching VAR model, there are two major obstacles that complicate traditional practice.

The first problem is the global nature of the model, as the global vector usually is fairly large. In impulse response analysis, it is common to apply a Cholesky decomposition to orthogonalize shocks. In doing so, one must pay attention to the ordering of the variables as that affects the decomposition. In a small-scale VAR model, the ordering can usually be motivated by, and often implicitly found in, economic theory. However, it is considerably more difficult to order the variables in a global VAR model, as the number of endogenous variables in the global model often ranges from 40 up to 100 and occasionally even exceeds that. A method that is invariant to the ordering is the generalized impulse response function (GIRF) developed by Koop, Pesaran, and Potter (1996) and Pesaran and Shin (1998), which has become the standard tool for analyzing impulse responses in global VAR models.

The second complication that arises is related to the regime-switching, as the different regimes have different coefficient and covariance matrices. Consequently, it follows that impulse responses are regime-dependent. The regime-dependent nature of the responses is utilized by Ehrmann, Ellison, and Valla (2003) to create impulse responses that are analyzed conditional on a specific regime prevailing. The downside is that it is, in a sense, unrealistic as the probability of staying in a specific regime goes to zero over time. However, keeping this limitation in mind, the impulse response analysis herein follows the regime-dependent approach, similar in spirit to Ehrmann et al. (2003), with generalized impulse responses.

Consider the global VAR model with one lag, solved conditional on a regime constellation $S$:

\[
y_t = b_S + F_S y_{t-1} + e_{t,S}. \tag{2.6.1}
\]

The generalized impulse response is then defined as

\[
GI(n, \delta, \Omega_{t-1}) = E(y_{t+n} \mid e_t = \delta, \Omega_{t-1}) - E(y_{t+n} \mid \Omega_{t-1}), \tag{2.6.2}
\]

where $\delta$ is the shock and $\Omega_{t-1}$ is the known information through time $t - 1$. It may then be shown (see Pesaran and Shin (1998) for details) that the scaled generalized impulse response function of the effect of a shock of the $j$th variable is

\[
\psi_{j,S}(n) = \sigma_{jj,S}^{-\frac{1}{2}} \Psi_{n,S} \Sigma_S \nu_j, \qquad n = 0, 1, 2, \ldots \tag{2.6.3}
\]

where $\sigma_{jj,S}$ is element $(j, j)$ from the given regime constellation's error covariance matrix $\Sigma_S$ and $\nu_j$ has its $j$th element equal to unity with the rest being zero. The matrix $\Psi_{n,S}$ is obtained recursively as

\[
\Psi_{n,S} = F_{1,S} \Psi_{n-1,S} + F_{2,S} \Psi_{n-2,S} + \cdots + F_{p,S} \Psi_{n-p,S}, \qquad n = 1, 2, \ldots, \tag{2.6.4}
\]

where $\Psi_{0,S} = I$, $\Psi_{n,S} = 0$ for $n < 0$ and $F_{j,S}$ as in (2.3.4). For the model given in (2.6.1), however, it is simply

\[
\Psi_{n,S} = F_S^{n}, \qquad n = 1, 2, \ldots. \tag{2.6.5}
\]
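The scaled generalized impulse response can be computed directly from the recursion. The sketch below is conditional on one fixed regime constellation, so the regime subscript is dropped; the function name and the example matrices are hypothetical.

```python
import numpy as np

def girf(F_list, Sigma, j, horizon):
    """Scaled generalized impulse responses to a shock in variable j.

    F_list:  reduced-form lag matrices [F_1, ..., F_p] for a given regime constellation.
    Sigma:   error covariance matrix for the same constellation.
    Returns an array of shape (horizon + 1, k).
    """
    k, p = Sigma.shape[0], len(F_list)
    Psi = [np.eye(k)]                                   # Psi_0 = I
    for n in range(1, horizon + 1):
        Psi_n = np.zeros((k, k))
        for lag in range(1, min(n, p) + 1):             # Psi_m = 0 for m < 0
            Psi_n += F_list[lag - 1] @ Psi[n - lag]
        Psi.append(Psi_n)
    scale = Sigma[j, j] ** -0.5
    # Sigma @ nu_j is simply column j of Sigma
    return np.array([scale * Psi_n @ Sigma[:, j] for Psi_n in Psi])

# Hypothetical one-lag example with two variables
F = np.array([[0.5, 0.1],
              [0.2, 0.3]])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 0.5]])
responses = girf([F], Sigma, j=0, horizon=3)
print(np.round(responses, 4))
```

On impact, the response of the shocked variable equals one standard deviation of its error, and the other variables respond according to their covariance with it, which is why no ordering of the variables is needed.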

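Following Binder and Gross (2013), to estimate the covariance matrix it is first partitioned into blocks:

\[
\Sigma_S = \begin{pmatrix} \Sigma_{11,s_1} & \Sigma_{12,s_1,s_2} & \cdots & \Sigma_{1N,s_1,s_N} \\ \Sigma_{21,s_2,s_1} & \Sigma_{22,s_2} & \cdots & \Sigma_{2N,s_2,s_N} \\ \vdots & \vdots & \ddots & \vdots \\ \Sigma_{N1,s_N,s_1} & \Sigma_{N2,s_N,s_2} & \cdots & \Sigma_{NN,s_N} \end{pmatrix}. \tag{2.6.6}
\]

Each diagonal block is estimated as

\[
\hat{\Sigma}_{ii,s_i} = \frac{\sum_{t=1}^{T} \hat{u}_{it,s_i} \hat{u}_{it,s_i}' \left( \hat{\xi}_{i,t|T}' \nu_{s_i} \right)}{\sum_{t=1}^{T} \hat{\xi}_{i,t|T}' \nu_{s_i}}, \tag{2.6.7}
\]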

where the idea is to generate residuals over the entire sample conditional on the selected regime of si. The residuals at each time point are then weighted and scaled by the sum of the probabilities of the regime at every point in time. If there were only one regime, the term in brackets would always be one and the denominator would be T ; thus in that case the estimator would be the usual covariance estimator.

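Similarly, the off-diagonal blocks are estimated by

\[
\hat{\Sigma}_{ij,s_i,s_j} = \frac{\sum_{t=1}^{T} \hat{u}_{it,s_i} \hat{u}_{jt,s_j}' \sqrt{\left( \hat{\xi}_{i,t|T}' \nu_{s_i} \right) \left( \hat{\xi}_{j,t|T}' \nu_{s_j} \right)}}{\sum_{t=1}^{T} \sqrt{\left( \hat{\xi}_{i,t|T}' \nu_{s_i} \right) \left( \hat{\xi}_{j,t|T}' \nu_{s_j} \right)}}, \tag{2.6.8}
\]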

which collapses into (2.6.7) if i = j. However, the off-diagonal blocks are not strictly speaking covariance matrices, but matrices containing covariances. Since local models are not restricted to have the same number of endogenous variables, these matrices may very well be non-square. Furthermore, even if they are square matrices, the diagonal does not consist of variances, but covariances as well. When local models i and j have the same types of endogenous variables, the diagonal elements will be covariances between the same type of variable (e.g. GDP growth in country i and j), and the off-diagonal elements will be covariances between different types of variables from the two countries (such as GDP growth in country i and inflation in country j).

A note of caution for the selection of regimes is in order here. If the selection is done such that the smoothed probabilities for the regimes are never simultaneously non-zero, (2.6.8) will not be defined.
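The probability-weighted covariance blocks, including the undefined case just mentioned, can be sketched as follows. The function name and the simulated residuals and probabilities are hypothetical; the inputs are the residuals of two local models under their selected regimes and the smoothed probabilities of those regimes.

```python
import numpy as np

def regime_cov_block(u_i, u_j, prob_i, prob_j):
    """Weighted covariance block between local models i and j.

    u_i:    (T, k_i) residuals of model i under its selected regime.
    u_j:    (T, k_j) residuals of model j under its selected regime.
    prob_i: (T,) smoothed probabilities of the selected regime of model i.
    prob_j: (T,) smoothed probabilities of the selected regime of model j.
    """
    w = np.sqrt(np.asarray(prob_i) * np.asarray(prob_j))
    if w.sum() == 0.0:
        # The regimes are never simultaneously probable: block undefined
        raise ValueError("regime probabilities are never jointly non-zero")
    return (u_i * w[:, None]).T @ u_j / w.sum()

# Hypothetical residuals and probabilities
rng = np.random.default_rng(2)
T = 200
u1 = rng.standard_normal((T, 3))
u2 = rng.standard_normal((T, 2))
p1 = rng.uniform(size=T)
p2 = rng.uniform(size=T)
diag_block = regime_cov_block(u1, u1, p1, p1)   # weights reduce to p1
off_block = regime_cov_block(u1, u2, p1, p2)    # may be non-square
print(diag_block.shape, off_block.shape)
```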

For the global VAR model without regime switches, impulse response analysis is carried out in the same way with the exception of the calculation of the covariance matrix. Without any switches, Equations (2.6.7)-(2.6.8) are instead

\[
\hat{\Sigma}_{ij} = \frac{\sum_{t=1}^{T} \hat{u}_{it} \hat{u}'_{jt}}{T}, \quad i, j = 1, \ldots, N. \tag{2.6.9}
\]

3 Empirical Application

To compare the models from various perspectives, this section presents an empirical application with particular attention to Sweden. A practitioner is likely to be interested in one particular country rather than equally interested in all of them, so this is a plausible scenario in a real-life setting.

3.1 General Setup

The regime-switching global VAR model is set up for a total of 13 countries and regions. The selection is made such that it includes major developed countries and the main trading partners of Sweden, with some countries grouped together into regions. The selection of countries/regions resembles that of Pesaran et al. (2004) and Dees et al. (2007). Included countries and regions are presented in Table 1.


Table 1: Included countries and regions

China            Finland          Germany
Euro area        Latin America    Southeast Asia
- Austria        - Argentina      - Indonesia
- Belgium        - Brazil         - Korea
- France         - Chile          - Malaysia
- Italy          - Mexico         - Philippines
- Netherlands    - Peru           - Singapore
- Spain                           - Thailand
India            Japan            Norway
Sweden           Switzerland      United Kingdom
United States

Note: Countries and regions listed without a dash (boldface in the original) are included in the model. Each region comprises the countries listed with a dash below its name.

the regions. The PPP-GDP of the individual countries is summed within each region, and each country is then given a weight equal to its share of the region’s PPP-GDP.
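As a small illustration of this weighting, the sketch below computes PPP-GDP shares and uses them to aggregate country series into a regional series. The aggregation by weighted average is an assumption made for the example, and the function names and numbers are hypothetical rather than taken from the data.

```python
def regional_weights(ppp_gdp):
    """Return each country's share of its region's total PPP-GDP."""
    total = sum(ppp_gdp.values())
    return {country: value / total for country, value in ppp_gdp.items()}


def aggregate(series_by_country, weights):
    """Weighted average of the country series, observation by observation.

    Assumption: the regional series is the weighted average of its members.
    """
    length = len(next(iter(series_by_country.values())))
    return [
        sum(weights[c] * series_by_country[c][t] for c in weights)
        for t in range(length)
    ]


# Made-up PPP-GDP figures and growth rates for a three-country region.
ppp = {"A": 300.0, "B": 100.0, "C": 100.0}
growth = {"A": [0.01, 0.02], "B": [0.00, 0.01], "C": [0.02, -0.01]}

weights = regional_weights(ppp)
regional_growth = aggregate(growth, weights)
print(weights)
print(regional_growth)
```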

The variables included in the model are real GDP (y), inflation (p) and the nominal short-term interest rate (r). Prior to entering the model, y and r are transformed into first differences, whereas an additional difference of p is taken. After doing so, the Augmented Dickey-Fuller test for a unit root rejects the null hypothesis of non-stationarity at the 5 % level for all variables for all countries and regions.⁸ The endogenous and exogenous vectors of every local model, as described by (2.1.1), are therefore

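\[
\mathbf{y}_{i,t} = \begin{bmatrix} y_{i,t} \\ p_{i,t} \\ r_{i,t} \end{bmatrix}, \qquad
\mathbf{x}_{i,t} = \begin{bmatrix} y^*_{i,t} \\ p^*_{i,t} \\ r^*_{i,t} \end{bmatrix}. \tag{3.1.1}
\]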

Because of the large number of parameters that are to be estimated, the model is kept parsimonious and the number of lags is set to 1 overall. That is, $p_i = q_i = 1$ for all $i$, such that endogenous and exogenous variables are included with only one lag in every local model.

3.2 Data

The model setup requires data for 27 countries, which makes the collection of data a major task in itself. Furthermore, the data that are accessible from various sources are often not directly comparable, but require a number of transformations such as seasonal adjustments, interpolations due to differing frequencies and so on.

⁸ The second difference of p is required to ensure stationarity for all countries. After the first difference, only half of the countries’ p series are stationary. By also differencing the already stationary series the problem of overdifferencing is introduced, but similar to Pesaran et al. (2004), this is deemed to be less serious than still having some unit roots left.

With the difficulties of collecting and preparing the necessary data in mind, one alternative is to use the ready-to-use data provided in conjunction with the GVAR toolbox for MATLAB by Smith and Galesi (2011). The data originally come from Dees et al. (2007), but have been extended since. The methods for extending the sample are described in Smith and Galesi (2011), and the methods for constructing the initial data set can be found in a supplement to Dees et al. (2007), which is available from the authors upon request. However, using these data comes at a cost: the sample period of the updated data ends at 2011Q2. Naturally, more recent data would be desirable, but nothing in this application requires it. For this reason, the data set of Smith and Galesi (2011) is used in this empirical application.

The data from Dees et al. (2007), updated and made available online by Smith and Galesi (2011), range from 1979Q1 to 2011Q2, yielding a total of 130 observations. They include the real gross domestic product in levels ($RGDP$), the consumer price index ($CPI$) and the short-term interest rate ($R^s$). The transformations that are made to create the data set used for model estimations are taking logarithms and differencing as follows:

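\begin{align}
y_{i,t} &= \ln\left( \frac{RGDP_{i,t}}{RGDP_{i,t-1}} \right) \tag{3.2.1} \\
p_{i,t} &= \ln\left( \frac{CPI_{i,t}}{CPI_{i,t-1}} \right) - \ln\left( \frac{CPI_{i,t-1}}{CPI_{i,t-2}} \right) \tag{3.2.2} \\
r_{i,t} &= \frac{1}{4} \left[ \ln\left( \frac{1 + R^s_{i,t}/100}{1 + R^s_{i,t-1}/100} \right) \right]. \tag{3.2.3}
\end{align}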

These transformations are the same as in Dees et al. (2007), except that these also include differencing of terms.
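The transformations (3.2.1)-(3.2.3) can be sketched as follows. The function name `transform` and the toy series are hypothetical, and the interest rate is assumed to be quoted in percent per annum, as the division by 100 and the factor 1/4 suggest.

```python
import math


def transform(rgdp, cpi, rs):
    """Apply (3.2.1)-(3.2.3) to level series of equal length.

    rgdp: real GDP in levels; cpi: consumer price index;
    rs: short-term interest rate (assumed to be in percent per annum).
    The first two observations are lost because p needs two lags of the CPI.
    """
    y, p, r = [], [], []
    for t in range(2, len(rgdp)):
        y.append(math.log(rgdp[t] / rgdp[t - 1]))
        inflation_now = math.log(cpi[t] / cpi[t - 1])
        inflation_prev = math.log(cpi[t - 1] / cpi[t - 2])
        p.append(inflation_now - inflation_prev)  # change in inflation
        r.append(0.25 * math.log((1 + rs[t] / 100) / (1 + rs[t - 1] / 100)))
    return y, p, r


# Made-up levels: 1 % quarterly growth, constant 2 % inflation, a rate rise from 4 to 5.
rgdp = [100.0, 101.0, 102.01]
cpi = [100.0, 102.0, 104.04]
rs = [4.0, 4.0, 5.0]

y, p, r = transform(rgdp, cpi, rs)
print(y, p, r)
```

Because inflation is constant in the toy series, the transformed inflation variable is zero up to rounding, which shows that p measures the change in inflation rather than inflation itself.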

3.3 Model Specifications

The comparison features a number of models, each of which is briefly described here in its empirical context. Note that all models, unless otherwise stated, use the same set of variables as previously presented and the number of lags is restricted to 1 as mentioned before. In the regime-switching alternatives, the number of states is set to 2. For methodological details, the reader is referred to Section 2.

3.3.1 TVTP-GVAR

The first model is the regime-switching global VAR with time-varying transition probabilities, abbreviated by TVTP-GVAR. As indicators in the transition probability matrix, specified in Equation (2.2.9), the lagged vector of exogenous variables is chosen, with the


each local model, a constant and the rest of the world’s lagged GDP, inflation and interest rate measures as explanatory variables. The number of states is chosen to be two, as discussed in the previous section. The transition probabilities are therefore

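\[
P\{s_{i,t} = l \mid s_{i,t-1} = m, \mathbf{x}_{t-1}\} =
\begin{cases}
\Phi\left( \beta_{m1}^{(i)} + \beta_{m2}^{(i)} y^*_{i,t-1} + \beta_{m3}^{(i)} p^*_{i,t-1} + \beta_{m4}^{(i)} r^*_{i,t-1} \right), & \text{if } l = m, \\
1 - \Phi\left( \beta_{m1}^{(i)} + \beta_{m2}^{(i)} y^*_{i,t-1} + \beta_{m3}^{(i)} p^*_{i,t-1} + \beta_{m4}^{(i)} r^*_{i,t-1} \right), & \text{if } l \neq m
\end{cases}
\tag{3.3.1}
\]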

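for $m \in \{1, 2\}$. The local models, conditional on the regime process $s_{i,t}$, are thus

\[
\mathbf{y}_{i,t} = a_{i,s_{i,t}} + \Phi_{i,s_{i,t}} \mathbf{y}_{i,t-1} + \Lambda_{i,0,s_{i,t}} \mathbf{x}_{i,t,s_{i,t}} + \Lambda_{i,1,s_{i,t}} \mathbf{x}_{i,t-1,s_{i,t}} + u_{i,t} \tag{3.3.2}
\]

for $i = 1, \ldots, 13$.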
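To make the transition probabilities in (3.3.1) concrete, the following sketch evaluates the two-state transition matrix for given values of the lagged foreign variables. It assumes that Φ in (3.3.1) denotes the standard normal distribution function (a probit link); the function name and the coefficient values are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def transition_matrix(beta, y_star, p_star, r_star):
    """Two-state transition matrix implied by (3.3.1) for one local model.

    beta[m] holds (beta_m1, beta_m2, beta_m3, beta_m4) for origin state m (0 or 1).
    y_star, p_star, r_star: lagged foreign GDP, inflation and interest rate measures.
    Row m contains P(s_t = 0 | s_{t-1} = m) and P(s_t = 1 | s_{t-1} = m).
    """
    cdf = NormalDist().cdf  # assumption: the link is the standard normal CDF
    rows = []
    for m in (0, 1):
        b1, b2, b3, b4 = beta[m]
        stay = cdf(b1 + b2 * y_star + b3 * p_star + b4 * r_star)
        row = [0.0, 0.0]
        row[m] = stay            # probability of remaining in state m
        row[1 - m] = 1.0 - stay  # probability of switching
        rows.append(row)
    return rows


# Hypothetical coefficients: only foreign GDP growth moves the probabilities.
beta = [(1.5, 20.0, 0.0, 0.0), (0.5, -20.0, 0.0, 0.0)]

good_times = transition_matrix(beta, y_star=0.01, p_star=0.0, r_star=0.0)
bad_times = transition_matrix(beta, y_star=-0.02, p_star=0.0, r_star=0.0)
print(good_times)
print(bad_times)
```

Setting all slope coefficients to zero makes the matrix independent of the indicators, which is the constant-probability special case.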

3.3.2 RS-GVAR

The second model is also a regime-switching global VAR model, but in which the transition probabilities are constant. The specification is in essence the same as in Binder and Gross (2013), with only a different set of variables. This model is referred to as the RS-GVAR model. The local models are the same as for the TVTP-GVAR, i.e.

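\[
\mathbf{y}_{i,t} = a_{i,s_t} + \Phi_{i,s_t} \mathbf{y}_{i,t-1} + \Lambda_{i,0,s_t} \mathbf{x}_{i,t,s_t} + \Lambda_{i,1,s_t} \mathbf{x}_{i,t-1,s_t} + u_{i,t}. \tag{3.3.3}
\]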

What is different is the transition probability matrix, which in this specification is

\[
P\{s_{i,t} = l \mid s_{i,t-1} = m\} =
\begin{cases}
p_{ll}, & \text{if } l = m, \\
1 - p_{mm}, & \text{if } l \neq m,
\end{cases}
\quad \text{for } m \in \{1, 2\}. \tag{3.3.4}
\]

3.3.3 AR-TVTP and AR-RS

Models three and four are similar and are essentially variations of the TVTP-GVAR and RS-GVAR models using ideas discussed in Binder and Gross (2013). What is different in these is that instead of estimating the transition probabilities (either constant or time-varying) simultaneously with the model coefficients, a more parsimonious model is used to make inference about the regimes. Conditional on this regime inference, i.e. taking the estimated smoothed regime probabilities as the true regimes, estimating each local model is a straightforward regression that resembles generalized least squares.

Elaborating on this idea, what it means is that initially a simple model is specified and estimated. In this case, a regime-switching AR(1) for the GDP growth variable is used:


For the time-varying probabilities case, the specification of the state process is

\[
P\{s_{i,t} = l \mid s_{i,t-1} = m, y^*_{i,t-1}\}, \quad l, m \in \{1, 2\}, \tag{3.3.6}
\]

where $y^*_{i,t-1}$ is the foreign GDP growth series for country $i$. In the case of constant probabilities, it is

\[
P\{s_{i,t} = l \mid s_{i,t-1} = m\}, \quad l, m \in \{1, 2\}. \tag{3.3.7}
\]

Let $\hat{\Xi}_{i,m} = \operatorname{diag}\left( \hat{\xi}_{i,m,1|T}, \hat{\xi}_{i,m,2|T}, \ldots, \hat{\xi}_{i,m,T|T} \right)$, i.e. a diagonal matrix of size $T \times T$ containing the smoothed probabilities of country $i$ being in state $m$ over the entire sample. These estimated probabilities are obtained in the regime-switching estimation of the AR(1) models. It is shown by Krolzig (1997) that, for a given regime inference, the parameters $\gamma_{i,m}$ for local model $i$ in state $m$ can be estimated by

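\[
\hat{\gamma}_{i,m} = \left[ \left( \left( z_i' \hat{\Xi}_{i,m} z_i \right)^{-1} z_i' \hat{\Xi}_{i,m} \right) \otimes I_k \right] \mathbf{y}_i, \quad m \in \{1, 2\}, \tag{3.3.8}
\]

where

\[
z_i = \begin{bmatrix}
1 & \mathbf{y}'_{i,0} & \mathbf{x}'_{i,1} & \mathbf{x}'_{i,0} \\
1 & \mathbf{y}'_{i,1} & \mathbf{x}'_{i,2} & \mathbf{x}'_{i,1} \\
\vdots & \vdots & \vdots & \vdots \\
1 & \mathbf{y}'_{i,T-1} & \mathbf{x}'_{i,T} & \mathbf{x}'_{i,T-1}
\end{bmatrix}, \qquad
\mathbf{y}_i = \begin{bmatrix} \mathbf{y}_{i,1} \\ \mathbf{y}_{i,2} \\ \vdots \\ \mathbf{y}_{i,T} \end{bmatrix}. \tag{3.3.9}
\]

The vector $\gamma_{i,m}$ is a vectorization of the coefficient matrices:

\[
\gamma_{i,m} = \begin{bmatrix}
a_{i,m} \\
\operatorname{vec}\left( \Phi'_{i,m} \right) \\
\operatorname{vec}\left( \Lambda'_{i,0,m} \right) \\
\operatorname{vec}\left( \Lambda'_{i,1,m} \right)
\end{bmatrix} \tag{3.3.10}
\]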

from the local models specified as for TVTP- and RS-GVAR. The rationale for using this approach is, according to Binder and Gross (2013), that the regimes are easier to interpret, since one can focus on economic growth alone rather than on a combination of growth, inflation and interest rates. To estimate regimes, it is even possible to use a variable not included in the main model. In the end, it depends on what the objective is and what one wants to achieve. These two models are referred to as AR-RS (constant probabilities) and AR-TVTP (time-varying probabilities).
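The estimator in (3.3.8) is a weighted least squares regression with the smoothed probabilities as weights. The sketch below computes the coefficient matrix $(z' \hat{\Xi} z)^{-1} z' \hat{\Xi} Y$ directly, which is the same estimator as the Kronecker form up to how the coefficients are stacked. The helper names, the small Gauss-Jordan solver and the toy data are hypothetical and only meant as an illustration.

```python
def solve(a, b):
    """Solve a x = b for a square matrix a and a matrix b by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(a[r]) + list(b[r]) for r in range(n)]
    for col in range(n):
        # Partial pivoting: move the row with the largest entry into place.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * pv for v, pv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def weighted_ls(z, y, weights):
    """Coefficients (Z' W Z)^{-1} Z' W Y with W = diag(weights).

    z: T rows of regressors; y: T rows of endogenous variables;
    weights: smoothed probabilities of the regime in question.
    Returns one row of coefficients per regressor.
    """
    n_reg, n_end = len(z[0]), len(y[0])
    zwz = [[sum(w * zt[a] * zt[b] for zt, w in zip(z, weights))
            for b in range(n_reg)] for a in range(n_reg)]
    zwy = [[sum(w * zt[a] * yt[b] for zt, yt, w in zip(z, y, weights))
            for b in range(n_end)] for a in range(n_reg)]
    return solve(zwz, zwy)


# Toy data: y = 1 + 2x in the first three periods, y = -1 + 0.5x in the last three.
z = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [[1.0], [3.0], [5.0], [-1.0], [-0.5], [0.0]]
prob_state1 = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
prob_state2 = [1.0 - q for q in prob_state1]

coef_state1 = weighted_ls(z, y, prob_state1)
coef_state2 = weighted_ls(z, y, prob_state2)
print(coef_state1)
print(coef_state2)
```

With probabilities of exactly zero or one the regression simply splits the sample by regime; with intermediate probabilities every observation contributes to both regimes in proportion to its weight.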


3.3.4 GVAR

The traditional model choice in this context is the global VAR model without regime switches. The local model is

\[
\mathbf{y}_{i,t} = a_i + \Phi_i \mathbf{y}_{i,t-1} + \Lambda_{i,0} \mathbf{x}_{i,t} + \Lambda_{i,1} \mathbf{x}_{i,t-1} + u_{i,t}. \tag{3.3.11}
\]

These models are estimated as VAR models with exogenous variables using ordinary least squares estimation. This model is referred to as the GVAR model.

3.3.5 VAR and AR

Lastly, two models are used for the forecast comparison: an AR(1) and a VAR(1) which only includes the Swedish variables. The VAR is thus specified as

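\[
\mathbf{y}_{\text{swe},t} = a_{\text{swe}} + \Phi_{\text{swe}} \mathbf{y}_{\text{swe},t-1} + u_{\text{swe},t} \tag{3.3.12}
\]

and the AR as

\[
y_{\text{swe},t} = a_{\text{swe}} + \phi_{\text{swe}} y_{\text{swe},t-1} + u_{\text{swe},t}. \tag{3.3.13}
\]

Table 2 contains an overview of the models described in this section.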

Table 2: Overview of models

Model        Description
TVTP-GVAR    Local models: Markov-switching VARs with time-varying transition probabilities. Regimes estimated in local models.
RS-GVAR      Local models: Markov-switching VARs with constant transition probabilities. Regimes estimated in local models.
AR-TVTP      Local models: Markov-switching VARs with time-varying transition probabilities. Regimes estimated in auxiliary AR(1) models.
AR-RS        Local models: Markov-switching VARs with constant transition probabilities. Regimes estimated in auxiliary AR(1) models.
GVAR         Local models: VARs.
VAR          VAR(1) model for Sweden with no exogenous variables.
AR           AR(1) for Swedish GDP growth.

Note: See text for model details.

3.4 Regime Inference

Four of the models are regime-switching and the estimation thus also involves inferences made about the prevailing regimes. These models’ regime inferences are presented visually in Figure 1.


Figure 1: Smoothed state probabilities (state 1) for: (a) TVTP-GVAR, (b) RS-GVAR, (c) AR-TVTP, and (d) AR-RS. Shaded areas in the GDP growth plots indicate that the smoothed probability exceeds 0.5.


The results are quite different for the models. For the TVTP-GVAR model in subfigure (a), there are barely any switches at all: the smoothed probability is essentially zero for the second state throughout the entire sample, except for two brief periods where the probability is close to one. The RS-GVAR model, subfigure (b), behaves quite differently, with a much more volatile regime probability that signifies far greater variation in which state is estimated to have governed. This behavior, however, only exists through 1996 in the sample; in the last 15 years, the probability of state two is estimated to be close to zero throughout.

As opposed to TVTP-GVAR and RS-GVAR, AR-TVTP and AR-RS in subfigures (c) and (d) respectively, in which the regime inference is solely based on GDP growth, are much more similar with respect to the regime inference. They both estimate two major periods of state two prevalence: the early 1990s and post-2008. The AR-RS also estimates state two to have been governing around the end of the 1990s and beginning of the 2000s. It should, however, be noted that this period of state two is estimated with a lower probability than the other two major periods, which does not show in the discretization of states.

The economic interpretation of the estimation is not obvious in the case of TVTP-GVAR and RS-GVAR. The TVTP-GVAR has estimated two very brief periods of state two, making it hard to identify a particular economic motivation for this state, as both crises and booms commonly last longer than just a couple of quarters. The large fluctuations in the probabilities estimated by RS-GVAR complicate the interpretations in this case too. Nevertheless, it seems to cover the financial crisis that hit Sweden in the early 1990s, and the period during which state two is estimated to have been governing is very similar to that of AR-TVTP and AR-RS. The preceding estimates of state two, i.e. in the 1980s, are harder to justify with respect to economic motivations.

Interpretations that are economic in nature are inherently easier for AR-TVTP and AR-RS as state two is estimated to cover two periods of known major financial distress: the financial crisis of the 1990s and the recent crisis that began following the collapse of Lehman Brothers. State two may thus be thought of as a “bust” state, and state one as a “boom” state. The additional period of state two estimated in subfigure (d) is around the time of the collapse of the dot-com bubble. AR-TVTP, subfigure (c), shows an increased probability of state two during this time, but the probability does not exceed 0.5.

3.5 Impulse Response Analysis

The impulse response analysis carried out here considers the scenario of a one standard deviation shock to U.S. GDP growth and the effect of this shock to Swedish GDP growth. This type of shock scenario is not possible to do with the local Swedish model alone, as


Figure 2: Cumulative generalized impulse response of Swedish GDP growth to a one standard deviation shock in U.S. GDP growth for: (a) TVTP-GVAR, (b) RS-GVAR, (c) AR-TVTP, (d) AR-RS, and (e) GVAR. Subfigures (a) through (d) are conditional on the estimated regime constellation at the end of the sample. Horizontal axes show time in quarters (0 to 16).


analysis of this kind is made possible by the GVAR model, as the American growth series is endogenous in the global model. One could therefore argue that this is one of the major advantages of the GVAR model, as it allows for this type of shock analysis. The generalized impulse responses have been criticized by Kim (2012) for not identifying structural shocks, a critique already noted by Pesaran et al. (2004). Pesaran et al. (2004) argue that while a structural economic interpretation may not be appropriate, it is still a very useful tool that can help in identifying regional shocks and how they transmit. The cumulative generalized impulse response functions for the five relevant models over a horizon of 16 time periods are presented in Figure 2.

The effect of the shock, which is not orthogonalized and thus allows for correlations between countries, is estimated to increase during the first two years until it converges at around 0.2 percentage points in the TVTP-GVAR model, subfigure (a). The RS-GVAR model, subfigure (b), exhibits a very similar behavior initially, but the cumulative effect then starts to tail off. Subfigure (e), the GVAR’s response, is also similar to the responses of TVTP-GVAR and RS-GVAR: the cumulative response increases over the first quarters and then the change ceases.

Subfigures (c) and (d) show something completely different. The AR-TVTP response rests steadily around -0.0005 percentage points, but the changes seem to increase with the horizon. The AR-RS response is essentially the same, but on the positive side of the axis. This kind of response is typical for non-stationary processes, and as one might suspect, the global model’s coefficient matrix $F$ has maximum eigenvalues exceeding 1 in modulus for the AR-TVTP and AR-RS models.⁹ Since the impulse response analysis involves $F^n$, as in (2.6.5), the responses are explosive.

It is peculiar that the global model is non-stationary and explosive when the local models are not, but this is likely connected to the weighting that occurs when conditioning on a regime-constellation and then forming the global coefficient matrix. It is, however, not an unknown issue; Binder and Gross (2013) noted that their regime-switching global VAR was stable in about 60 % of the cases.
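The link between eigenvalues and explosive cumulative responses can be illustrated with a toy calculation. The sketch below accumulates powers of a coefficient matrix applied to an impact vector; it deliberately leaves out the scaling and selection terms of the generalized impulse response in (2.6.5), and the 2 x 2 matrices are hypothetical rather than estimates from the models above.

```python
def mat_vec(f, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in f]


def cumulative_response(f, impact, horizon):
    """Cumulative sums of F^n applied to an impact vector for n = 0, ..., horizon."""
    response = list(impact)
    total = list(impact)
    path = [list(total)]
    for _ in range(horizon):
        response = mat_vec(f, response)  # now equals F^n times the impact vector
        total = [t + r for t, r in zip(total, response)]
        path.append(list(total))
    return path


# Hypothetical triangular matrices, so the eigenvalues are the diagonal entries.
stable = [[0.5, 0.1], [0.0, 0.4]]       # eigenvalues 0.5 and 0.4
explosive = [[1.05, 0.1], [0.0, 0.4]]   # eigenvalues 1.05 and 0.4
impact = [1.0, 0.0]

stable_path = cumulative_response(stable, impact, 16)
explosive_path = cumulative_response(explosive, impact, 16)
print(stable_path[-1][0])
print(explosive_path[-1][0])
```

In the stable case the cumulative response levels off, whereas in the explosive case each additional period adds a larger increment than the previous one, which is the pattern described for subfigures (c) and (d).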

3.6 Eigenvalues

Explosive behavior of the kind noted in the previous section will affect forecasts as well. The forecasts are done in a slightly different manner compared to the impulse responses, as they are based on what can be called weighted regimes, whereas the impulse responses are computed using discrete regimes. What this means is that the impulse response analysis is based on the estimated regime constellations as of 2011Q2, where the regime having the largest estimated smoothed probability is taken to be the one governing (i.e. discrete regimes). For the forecasts, the different states’ coefficient matrices are instead weighted together using the smoothed probabilities (weighted regimes). As the problem of explosiveness already appears to exist for the discrete regimes, it is wise to also consider the weighted regimes as this inevitably will influence the forecast comparison. For this reason, the maximum moduli of the models’ eigenvalues over the last 16 time periods are plotted in Figure 3.

⁹ For the AR-TVTP model, the maximum absolute eigenvalue is 1.0540, and for AR-RS it is 1.0620. The other three models are stable, with maximum absolute eigenvalues being 0.6433, 0.9542 and 0.5893 for the TVTP-GVAR, RS-GVAR and GVAR respectively.

Figure 3: Maximum moduli of eigenvalues for: (a) TVTP-GVAR, (b) RS-GVAR, (c) AR-TVTP, (d) AR-RS, and (e) GVAR. Note: A value above 1 means explosive behavior.

The TVTP-GVAR is for the most part stable, with two occasions of explosiveness. The RS-GVAR is balancing on the border of stability during all 16 time periods, with a notable peak in 2009. A similar pattern is found for the AR-TVTP, which jumps back and forth between stability and explosiveness. The eigenvalues of the later time periods are close to 1 for AR-RS, but none of them exceed 1. Fortunately, the GVAR does not have a problem with non-stationarity in any of the studied time periods and seems to be exempt from this issue.

The eigenvalues do not, however, tell the complete story. The impact is also related to where the non-stationarity occurs; more specifically, the impact of explosiveness on impulse responses and forecasts for Sweden will be greater if the non-stationarity is directly related to Sweden.
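The stability check behind Figure 3 amounts to computing the largest eigenvalue modulus of a coefficient matrix. As a hedged illustration, the sketch below does this for 2 x 2 matrices via the characteristic polynomial and applies it to a probability-weighted average of two regime-specific matrices. The matrices and function names are hypothetical, and the example is not a reconstruction of how the global matrix $F$ is formed; it only shows that weighting stable matrices together need not give a stable matrix.

```python
import cmath


def max_modulus_2x2(f):
    """Largest eigenvalue modulus of a 2 x 2 matrix via its characteristic polynomial."""
    trace = f[0][0] + f[1][1]
    det = f[0][0] * f[1][1] - f[0][1] * f[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)  # complex square root handles complex pairs
    return max(abs((trace + root) / 2), abs((trace - root) / 2))


def weight_regimes(f1, f2, prob1):
    """Probability-weighted average of two regime-specific coefficient matrices."""
    return [[prob1 * a + (1 - prob1) * b for a, b in zip(r1, r2)]
            for r1, r2 in zip(f1, f2)]


# Hypothetical regime matrices; each has both eigenvalues equal to 0.5.
f1 = [[0.5, 2.0], [0.0, 0.5]]
f2 = [[0.5, 0.0], [2.0, 0.5]]

moduli = {}
for prob1 in (1.0, 0.9, 0.5):
    weighted = weight_regimes(f1, f2, prob1)
    moduli[prob1] = max_modulus_2x2(weighted)
    print(prob1, round(moduli[prob1], 4))
```

In this toy case the maximum modulus rises from 0.5 to 1.1 and 1.5 as the weight on the second regime grows, even though both regime matrices are stable on their own.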

3.7 Forecast Comparison

For the forecast comparison the full set of models is estimated repeatedly starting at 2007Q1 and ending at 2011Q1. This yields 17 − h forecasts for horizons h = 1, 2, 3, 4. To accommodate the comparisons of forecasts, the AR(1) is taken as a benchmark model. The forecast performance is evaluated using root mean square error (RMSE), and the results are expressed relative to the AR(1) model. Hence, a value above 1 indicates inferior performance of that particular model, and a value below 1 means superior performance. The results are presented in Table 3.

Table 3: Root mean square errors (RMSE) for model forecasts, relative to AR(1)

h    TVTP-GVAR    RS-GVAR    AR-TVTP    AR-RS     GVAR      VAR
1    0.8341       0.8409     0.7899     0.7960    0.8962    1.0003
2    1.0214       1.0839     0.9511     1.0268    0.9943    1.0016
3    0.9844       1.0563     1.4273     1.0838    1.0564    1.0032
4    0.9858       1.1081     6.8298     1.1384    1.0418    1.0026

Note: A value below 1 indicates superior forecasting performance in terms of RMSE to the AR(1) benchmark model.
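The relative RMSE measure reported in Table 3 can be sketched as follows. The function names and the numbers are hypothetical and serve only to show how the ratio is formed and how it is read.

```python
import math


def rmse(actual, forecast):
    """Root mean square error of a forecast series."""
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_rmse(actual, forecast, benchmark):
    """RMSE of a model divided by the RMSE of the benchmark forecasts."""
    return rmse(actual, forecast) / rmse(actual, benchmark)


# Made-up outcomes and forecasts for four evaluation periods.
actual = [0.010, -0.020, 0.005, 0.000]
model = [0.008, -0.010, 0.004, 0.002]
benchmark = [0.006, 0.004, 0.006, 0.005]

ratio = relative_rmse(actual, model, benchmark)
print(ratio)  # below 1: the model beats the benchmark on this toy sample
```

Because the errors are squared before averaging, a single large miss can dominate the measure when the evaluation sample is as short as it is here.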

The table shows a well-known fact: it is often not very easy to beat an AR(1). The one-step ahead predictions are for all models, except the VAR, better than the AR, but from there on the results vary. When the horizon is two, i.e. h = 2, only AR-TVTP and GVAR manage to predict the Swedish GDP growth better than the AR. When the horizon is longer, h = 3 and h = 4, the only model that beats the AR is TVTP-GVAR. The ratio is close to one, however, indicating that the difference in performance is small.

What is particularly alarming is the terrible performance of AR-TVTP for h = 3 and h = 4. The reason for this relates to the discussion in the previous section, namely the instability of the model and especially how it is connected to Swedish GDP growth. As Figure 3 showed, eigenvalues larger than 1 occur frequently in a number of models, but none of these show as bad forecasting performance as the AR-TVTP.

The coefficient matrices whose eigenvalues constitute the peaks in subfigure (c) in Figure 3 have estimates that exceed one in absolute value for the impact of the lag of Swedish GDP growth on itself. Thus, the non-stationarity is directly related to Sweden, which produces subpar forecasts in the two explosive instances. Because of a small sample for the forecast evaluation, these two major deviations give rise to the very poor forecasting performance reflected by the RMSE measure. For the RS-GVAR, whose maximum eigenvalue in absolute terms at one point exceeds 4, the forecasts are less affected because the non-stationarity is not related to Sweden directly. Thus, even though the global model is explosive, this is not immediately reflected in the RMSE because the explosiveness is found in some other part of the model.

4 Discussion

The results are somewhat ambiguous as to which model is better. The regime inferences made by the applicable models were all quite different, and in two cases unsatisfactory: the estimated regime series for TVTP-GVAR and RS-GVAR lack a reasonable economic interpretation. Naturally, with a regime-switching model the inference made regarding regimes is an important aspect of the model, especially considering that the estimated regime trajectories make up the basis for the estimation of the model parameters.

However, it is hard to distinguish any particularly good or bad times by simply looking at the evolution of the Swedish GDP growth series (Figure 1) alone, and TVTP-GVAR and RS-GVAR also make use of the inflation and interest rate series. Naturally, the model is incapable of identifying switches if the input series barely seem to contain any. Thus, the regime-switching models should not be discarded completely based on only this analysis; rather, it would be interesting to perform a similar model comparison with a different set of data where possible switches are apparent even to the naked eye. It could be as simple as transforming the variables differently, or perhaps a different set of variables is required for a more reasonable state identification.


inference, with both models identifying two longer periods of the second state approximately covering the two financial crises that have hit Sweden during the sample period. It is thus more natural to award their estimates economic interpretations. However, these models are subject to another problem: non-stationarity. The global models’ largest eigenvalues exceed one in modulus, making impulse responses explosive and severely affecting the forecasts for the AR-TVTP model. In general, all regime-switching models have problems with explosiveness; it is only the extent of it that differs. The instability is a concern, and its presence was also noted by Binder and Gross (2013). Unless stability is ensured, one should be careful with interpretations and conclusions that may be based on an unstable model.

The basic GVAR model does not show any problems with instability; it produces reasonable impulse responses and its forecasting performance is decent. For obvious reasons, it does not give any information about possible regimes. That aside, it is still an attractive choice. The question is what the advantage of a regime-switching model is, if the estimated regimes make little sense to begin with. Thus, in many situations the basic GVAR model is a reasonable choice as it, in a sense, is more trustworthy. It appears that, if one wants to fully explore the benefits of a regime-switching model, one must be very careful in specifying the model such that estimated regimes are reasonable and the global model is stable.

Furthermore, there are also practical benefits to the basic GVAR model. The GVAR Toolbox by Smith and Galesi (2011), which is set to be updated from version 1.1 to 2 in mid-2014, is easy to use, customizable and produces most of the output one might need. A major downside of the regime-switching models is that they are computationally intensive, making it a cumbersome task to test various model specifications. For instance, estimation of a single local model takes between 5 and 20 minutes in the most parsimonious of specifications with three endogenous variables, two states and one lag. Thus, estimation of a global model with 13 countries takes up to 4 hours. Increasing the number of variables, states and/or lags leads to drastically more demanding computations that quickly exceed one hour per local model. It is therefore a time-consuming task to compare model setups, something that is avoided when using the basic GVAR.

4.1 Further Research

Many applications of regime-switching time series models are made using financial time series, where the number of observations is generally larger due to higher data frequencies and, in addition, greater fluctuations are often seen. An interesting alternative would thus be a similar model comparison on a monthly data set. Ang and Bekaert (2002) considered a regime-switching VAR with time-varying probabilities for interest rates and spreads with monthly data. A similar model specification, but in the GVAR framework, would be interesting in order to see if the TVTP-GVAR and RS-GVAR models


are possible. For example, a Bayesian approach to the estimation of the local models is used by Feldkircher, Huber, and Cuaresma (2014), which is another plausible method of estimation. It is also possible to extend the regime-switching models even further to include the cointegrated case for even richer model dynamics. Much of the literature is still very new, which allows for several interesting extensions that have not been considered before.

A more practical issue that may be worth exploring is the estimation. This paper utilized two existing MATLAB packages for Markov-switching models, one with (Ding, 2012) and one without (Perlin, 2010) time-varying transition probabilities. They both employ optimization routines to maximize the likelihoods. An alternative is to implement the expectation-maximization (EM) algorithm, which is the method advocated by e.g. Hamilton (1990). The author also notes that the EM algorithm is relatively insensitive to poor choices of starting values, which is especially desirable in this setting where there are multiple models, making it infeasible to manually test different sets of starting values for each model.

5 Conclusion

In conclusion, the specification of the local models in a global VAR as regime-switching is an appealing option, but as the empirical application here has illustrated it requires an appropriate context. The regime-switching models examined are associated with some undesirable features, such as explosiveness and identification of regimes that lack economic interpretations. However, this does not disprove the usefulness of regime-switching models, with or without time-varying switching probabilities; it merely shows that they did not perform adequately in the given setting. The basic global VAR model used in most previous applications does not, for natural reasons, make any inference regarding possible underlying states, but it is not subject to explosiveness and is decent with respect to forecasting. Thus, considering that previous studies have in fact shown that regime-switching models are useful in some contexts, the empirical application highlights one thing in particular: the aggregation of models that occurs in creating the global model requires the local models to be well-specified in the sense that they not only act appropriately on the local level, but also together on the global level. This may or may not be possible, and what this means in terms of altering the models in this paper with respect to variable selection, lags and number of states will require further investigation.


References

Ang, A. and Bekaert, G. Regime Switches in Interest Rates. Journal of Business & Economic Statistics, 20(2):163–182, 2002.

Binder, M. and Gross, M. Regime-Switching Global Vector Autoregressive Models. Working Paper Series 1569, European Central Bank, August 2013.

Dees, S., di Mauro, F., Pesaran, M. H., and Smith, L. V. Exploring the International Linkages of the Euro Area: A Global VAR Analysis. Journal of Applied Econometrics, 22(1):1–38, 2007.

di Mauro, F. and Pesaran, M. H., editors. The GVAR Handbook: Structure and Applications of a Macro Model of the Global Economy for Policy Analysis. Oxford University Press, 2013.

Diebold, F. X., Lee, J.-H., and Weinbach, G. C. Regime Switching with Time-Varying Transition Probabilities. In Hargreaves, C., editor, Non-stationary Time Series Analyses and Cointegration, pages 283–302. Oxford University Press, 1994.

Ding, Z. An Implementation of Markov Regime Switching Model with Time Varying Transition Probabilities in MATLAB, 2012.

Ehrmann, M., Ellison, M., and Valla, N. Regime-dependent Impulse Response Functions in a Markov-Switching Vector Autoregression Model. Economics Letters, 78(3):295–299, 2003.

Feldkircher, M., Huber, F., and Cuaresma, J. C. Forecasting with Bayesian Global Vector Autoregressive Models: A Comparison of Priors. Working Papers 189, Oesterreichische Nationalbank (Austrian Central Bank), March 2014.

Gray, D., Gross, M., Paredes, J., and Sydow, M. Modeling Banking, Sovereign, and Macro Risk in a CCA Global VAR. IMF Working Papers 13/218, International Monetary Fund, 2013.

Gray, S. F. Modeling the Conditional Distribution of Interest Rates as a Regime-Switching Process. Journal of Financial Economics, 42(1):27–62, 1996.

Gross, M. Estimating GVAR Weight Matrices. Working Paper Series 1523, European Central Bank, March 2013.

Gross, M. and Kok, C. Measuring Contagion Potential Among Sovereigns and Banks Using a Mixed-Cross-Section GVAR. Working Paper Series 1570, European Central Bank, August 2013.

Hamilton, J. D. Time Series Analysis. Princeton Univ. Press, Princeton, N.J., 1994.

Kim, C.-J. Dynamic Linear Models with Markov-Switching. Journal of Econometrics, 60(1-2):1–22, 1994.

Kim, H. Generalized Impulse Response Analysis: General or Extreme? Auburn Economics Working Paper Series auwp2012-04, Department of Economics, Auburn University, 2012.

Koop, G., Pesaran, M. H., and Potter, S. M. Impulse Response Analysis in Nonlinear Multivariate Models. Journal of Econometrics, 74(1):119–147, 1996.

Krolzig, H.-M. Markov-switching Vector Autoregressions: Modelling, Statistical Inference, and Application to Business Cycle Analysis, volume 454. Springer Berlin, 1997.

Perez-Quiros, G. and Timmermann, A. Firm Size and Cyclical Variations in Stock Returns. The Journal of Finance, 55(3):1229–1262, 2000.

Perlin, M. MS_Regress — The MATLAB Package for Markov Regime Switching Models, 2010.

Pesaran, M. H. and Shin, Y. Generalized Impulse Response Analysis in Linear Multivariate Models. Economics Letters, 58(1):17–29, 1998.

Pesaran, M. H., Schuermann, T., and Weiner, S. M. Modeling Regional Interdependencies Using a Global Error-Correcting Macroeconometric Model. Journal of Business & Economic Statistics, 22(2):129–162, April 2004.

Sims, C. A. Macroeconomics and Reality. Econometrica, 48(1):1–48, January 1980.

Smith, L. V. and Galesi, A. GVAR Toolbox 1.1, 2011. URL
